Again having some thoughts about the protein sequences all of us in #teammassspec use for #proteomics / #peptidomics experiments. I really want better tools that give more granular control over the peptide sequences used by search engines. #bioinformatics
On Thursday morning, I'll be guest lecturing on #DataVis at the May Institute for Computation and Statistics for #MassSpec and #Proteomics at Northeastern #BostonMA!
Our pre-print on the influence of hormonal contraceptives (and other medications) on the plasma proteome is online! Great collaboration with Markus Ralser's group at Charite Berlin.
Do you enjoy PastelBio's proteomics posts? Each morning (timeline, conference, fav-post, software-resource), then articles, preprints, blogs, jobs, re-posts, etc.
Full week* dedicated to writing down and planning improvements to our #teammassspec (#proteomics and #peptidomics) work. The #bioinformatics side of things at least. There are so many things I have wanted to do since I joined the company. So so many things...
*besides interruptions caused by surprise additional work...
OpenMS is by far the largest software project that I've ever worked on, and this is the first major release during my tenure here. Happy to answer any questions.
Some naive ways we could use gzip-based distance (they need to be tried, at least for fun):
protein/DNA sequences text files (sorted by abundance rank for example, or alphabetical order)
peptide sequences text files (same)
just 1 sequence
Other than plain text files (fasta, mzml, pepxml, mgf, etc.) it would be less interesting (scientifically) I guess because of the additional text. But worth trying, just to see how it behaves.
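For anyone who wants to try it, here is a minimal sketch of a gzip-based distance, assuming the usual normalized compression distance formula. The function names and the example peptides are made up for illustration, and the fixed gzip overhead will distort results on very short inputs such as a single sequence.

```python
import gzip


def compressed_size(text: str) -> int:
    """Number of bytes after gzip-compressing the UTF-8 encoded text."""
    return len(gzip.compress(text.encode("utf-8")))


def gzip_distance(a: str, b: str) -> float:
    """Normalized compression distance between two strings.

    Values near 0 suggest a lot of shared content, values near 1 suggest
    little shared content.
    """
    size_a = compressed_size(a)
    size_b = compressed_size(b)
    size_ab = compressed_size(a + "\n" + b)
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)


def sequences_to_text(sequences, sort_key=None):
    """Join sequences into one text, one per line, in a fixed order.

    With sort_key=None the order is alphabetical; pass a key function
    (for example an abundance-rank lookup) to sort differently.
    """
    return "\n".join(sorted(sequences, key=sort_key))


if __name__ == "__main__":
    # Hypothetical peptide lists, only here to show the call pattern.
    sample_a = ["PEPTIDEK", "ELVISLIVESK", "SAMPLER"]
    sample_b = ["PEPTIDEK", "ELVISLIVESK", "TESTINGK"]
    print(gzip_distance(sequences_to_text(sample_a), sequences_to_text(sample_b)))
```

Sorting before joining matters because the compressor only sees nearby repeats, so the same sequences in a different order can give a different distance.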
We are hiring a bioinformatician / analyst in the Proteomics and Analytical Sciences department (focus on peptidomics and mass spectrometry data) to assist research and product development activities.
The more I use Skyline for #proteomics / #peptidomics the more I want to build a simpler alternative. Mostly just as a finder/viewer, without the parts doing statistics. Something like that.
I also really want to build something to extract and store thousands of spectra from tons of DDA and peptide search results. 🤔
The more years I spend working in peptidomics, the more I am convinced that adapting proteomics software and tools is not enough. So many methods and implementations depend on the assumption that there are only non-overlapping tryptic peptides in an MS dataset.
Got into a combinatorial nightmare today doing a peptide search without any enzyme cleavage site and with a massive fasta file. Oops. I am going to need to find the sweet spot...
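A rough way to see the size of the problem before launching a search is to count candidate peptides per protein. This is only a sketch under assumptions of mine: the length window (7 to 30), the missed-cleavage limit, and the simple trypsin rule (cleave after K or R, not before P) are illustrative defaults, not the settings of any particular search engine, and the counts are positions in the protein, not unique sequences.

```python
import re


def count_unspecific_peptides(protein, min_len=7, max_len=30):
    """Count every substring within the length window (no enzyme)."""
    n = len(protein)
    return sum(n - length + 1 for length in range(min_len, min(max_len, n) + 1))


def count_tryptic_peptides(protein, min_len=7, max_len=30, missed_cleavages=2):
    """Count fully tryptic peptides within the length window.

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    # Zero-width split keeps the K/R at the end of each piece (Python 3.7+).
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    count = 0
    for start in range(len(pieces)):
        length = 0
        # Join up to missed_cleavages + 1 consecutive pieces.
        for end in range(start, min(start + missed_cleavages + 1, len(pieces))):
            length += len(pieces[end])
            if min_len <= length <= max_len:
                count += 1
    return count


if __name__ == "__main__":
    # Made-up toy sequence, only to show the call pattern.
    toy = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEK"
    print("no enzyme:", count_unspecific_peptides(toy))
    print("tryptic:  ", count_tryptic_peptides(toy))
```

Summing these counts over every entry in the fasta file, for a few different length windows, gives a quick estimate of how fast the search space grows before committing to a full run.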
Who is on the ethical hook? What other areas of proteomics are also high risk? (ion mobility? real-time search?) We should address this ASAP before AI takes over. Soon AI will make up the data, exaggerate the results, fake the reviews, etc., so maybe all this is moot? (3/3)
I'm a computational biologist/bioinformatician, professor at UCLouvain in Brussels. I work with various types of omics data, with a special interest in #MassSpectrometry and #proteomics. I (co-)develop and maintain several #rstats and #Bioconductor packages. In addition to R, I use #emacs quite a bit, and run.