Extracting high confidence protein interactions from affinity purification data: at the crossroads

Shuye Pu; James Vlasblom; Andrei Turinsky; Edyta Marcon; Sadhna Phanse; Sandra Smiley Trimble; Jonathan Olsen; Jack Greenblatt; Andrew Emili; Shoshana J Wodak

doi:10.1016/j.jprot.2015.03.009

J Proteomics. 2015 Apr 6:118:63-80. doi: 10.1016/j.jprot.2015.03.009. Epub 2015 Mar 14.

Affiliations

¹ Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M4K 1X8, Canada. Electronic address: shuye2009@gmail.com.
² Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M4K 1X8, Canada; Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada.
³ Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M4K 1X8, Canada.
⁴ Banting and Best Department of Medical Research, University of Toronto, Donnelly Centre for Cellular and Biomolecular Research, 160 College Street, Toronto, Ontario M5S 3E1, Canada.
⁵ Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.
⁶ Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Banting and Best Department of Medical Research, University of Toronto, Donnelly Centre for Cellular and Biomolecular Research, 160 College Street, Toronto, Ontario M5S 3E1, Canada.
⁷ Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M4K 1X8, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada. Electronic address: shoshana@sickkids.ca.

PMID: 25782749

Abstract

Deriving protein-protein interactions from data generated by affinity-purification and mass spectrometry (AP-MS) techniques requires application of scoring methods to measure the reliability of detected putative interactions. Choosing the appropriate scoring method has become a major challenge. Here we apply six popular scoring methods to the same AP-MS dataset and compare their performance. The comparison was carried out for six distinct datasets from human, fly and yeast, which focus on different biological processes and differ in their coverage of the proteome. Results show that the performance of a given scoring method may vary substantially depending on the dataset. Disturbingly, we find that the high confidence (HC) PPI networks built by applying the six scoring methods to the same raw AP-MS dataset display very poor overlap, with only 1.7-4.1% of the HC interactions present in all the networks built, respectively, from the proteome-wide human, fly or yeast datasets. Various properties of the shared versus unique interactions in each network, including biases in protein abundance, suggest that current scoring methods are able to eliminate only the most obvious contaminants, but still fail to reliably single out specific interactions from the large body of spurious associations detected in the AP-MS experiments.
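The overlap figure quoted in the abstract can be illustrated with a small sketch. It assumes that "present in all the networks" means the interactions found in every HC network, taken as a fraction of the interactions found in at least one of them; the abstract does not state the denominator, so this definition is an assumption. The function names, method labels and protein identifiers below are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# An undirected interaction is stored as a frozenset so that
# (A, B) and (B, A) count as the same edge.
Edge = FrozenSet[str]


def to_edges(pairs: Iterable[Tuple[str, str]]) -> Set[Edge]:
    """Convert protein pairs to undirected edges, dropping self-pairs."""
    return {frozenset(p) for p in pairs if p[0] != p[1]}


def shared_fraction(networks: Dict[str, Set[Edge]]) -> float:
    """Fraction of all HC interactions (union over methods) that appear
    in every network (intersection over methods).

    Assumption: the denominator is the union of edges across methods.
    """
    edge_sets = list(networks.values())
    if not edge_sets:
        return 0.0
    union = set().union(*edge_sets)
    if not union:
        return 0.0
    shared = set.intersection(*edge_sets)
    return len(shared) / len(union)


# Toy HC networks produced by three hypothetical scoring methods.
networks = {
    "method_1": to_edges([("A", "B"), ("B", "C"), ("C", "D")]),
    "method_2": to_edges([("B", "A"), ("C", "D"), ("D", "E")]),
    "method_3": to_edges([("A", "B"), ("E", "F")]),
}

print(f"{shared_fraction(networks):.1%} of HC interactions are in all networks")
```

In this toy case only the A-B interaction is recovered by all three methods, out of five distinct interactions overall. Storing edges as frozensets keeps the comparison independent of bait/prey orientation, which is one reasonable choice when comparing undirected PPI networks.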

Biological significance: The fast progress in AP-MS techniques has prompted the development of a multitude of scoring methods, which are relied upon to remove contaminants and non-specific binders. Choosing the appropriate scoring scheme for a given AP-MS dataset has become a major challenge. The comparative analysis of six of the most popular scoring methods, presented here, reveals that overall these methods do not perform as expected. Evidence is provided that this is due to three closely related issues: the high 'noise' levels of the raw AP-MS data, the limited capacity of current scoring methods to deal with such high noise levels, and the biases introduced by using Gold Standard datasets to benchmark the scoring functions and threshold the networks. For the field to move forward, all three issues will have to be addressed. This article is part of a Special Issue entitled: Protein dynamics in health and disease. Guest Editors: Pierre Thibault and Anne-Claude Gingras.

Keywords: Affinity purification; Mass spectrometry; Protein–protein interaction; Scoring methods.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Databases, Protein*
Humans
Mass Spectrometry
Saccharomyces cerevisiae Proteins / chemistry*
Saccharomyces cerevisiae Proteins / isolation & purification*
Saccharomyces cerevisiae Proteins / metabolism
Saccharomyces cerevisiae*

Substances

Saccharomyces cerevisiae Proteins

Grants and funding

MOP #82940/Canadian Institutes of Health Research/Canada