Direct maximization of protein identifications from tandem mass spectra

Marina Spivak; Jason Weston; Daniela Tomazela; Michael J MacCoss; William Stafford Noble

doi:10.1074/mcp.M111.012161

Mol Cell Proteomics. 2012 Feb;11(2):M111.012161. doi: 10.1074/mcp.M111.012161. Epub 2011 Nov 3.

Authors

Marina Spivak¹, Jason Weston, Daniela Tomazela, Michael J MacCoss, William Stafford Noble

Affiliation

¹ Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.

Abstract

The goal of many shotgun proteomics experiments is to determine the protein complement of a complex biological mixture. For many mixtures, most methodological approaches fall significantly short of this goal. Existing solutions to this problem typically subdivide the task into two stages: first identifying a collection of peptides with a low false discovery rate, and then inferring from those peptides a corresponding set of proteins. In contrast, we formulate the protein identification problem as a single optimization problem, which we solve using machine learning methods. This approach is motivated by the observation that the peptide-level and protein-level tasks are cooperative, and the solution to each can be improved by using information about the solution to the other. The resulting algorithm directly controls the relevant error rate, can incorporate a wide variety of evidence and, for complex samples, provides 18-34% more protein identifications than current state-of-the-art approaches.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Algorithms
Amniotic Fluid / chemistry
Amniotic Fluid / metabolism
Artificial Intelligence*
Caenorhabditis elegans Proteins / metabolism
Complex Mixtures / analysis*
Databases, Protein
Humans
Laryngopharyngeal Reflux
Models, Statistical*
Peptide Fragments / analysis
Proteins / analysis*
Proteomics*
Saccharomyces cerevisiae Proteins / metabolism
Software
Tandem Mass Spectrometry / methods*

Substances

Caenorhabditis elegans Proteins
Complex Mixtures
Peptide Fragments
Proteins
Saccharomyces cerevisiae Proteins
