A "seed-refine" algorithm for detecting protein complexes from protein interaction data

IEEE Trans Nanobioscience. 2007 Mar;6(1):43-50. doi: 10.1109/tnb.2007.891900.

Abstract

Recent advances in large-scale protein-protein interaction detection give researchers an initial view of proteins on a global scale. These massive data sets are a valuable resource for elucidating biomolecular mechanisms in the cell. In this paper, we investigate the problem of detecting protein complexes from noisy protein interaction data, i.e., finding subsets of proteins that are closely coupled via protein interactions. We identify the challenges and propose a "seed-refine" approach comprising a novel, statistically meaningful subgraph quality measure, a two-layer seeding heuristic for finding good seeds, and a novel subgraph refinement method that controls the overlap between subgraphs. Experiments show the desirable properties of our subgraph quality measure and the effectiveness of our "seed-refine" algorithm.
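The general shape of a seed-then-refine pipeline can be illustrated with a minimal sketch. The abstract does not specify the quality measure, the two-layer seeding heuristic, or the refinement rule, so everything below is a stand-in assumption rather than the authors' method: plain edge density replaces the statistical quality measure, a protein plus its neighbors serves as a seed, refinement greedily trims the least-connected member, and overlap is capped by a simple threshold. All function names and parameters (`build_graph`, `density`, `pick_seeds`, `refine`, `seed_refine`, `max_overlap`, `min_density`) are hypothetical.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {protein: set(neighbors)}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def density(graph, nodes):
    """Stand-in quality measure: edges present / edges possible."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(1 for u, v in combinations(nodes, 2) if v in graph[u])
    return 2.0 * present / (n * (n - 1))


def pick_seeds(graph, min_degree=2):
    """Assumed seeding rule: each well-connected protein plus its neighbors,
    visited from highest to lowest degree (ties broken by name)."""
    ranked = sorted(graph, key=lambda p: (-len(graph[p]), p))
    return [{p} | graph[p] for p in ranked if len(graph[p]) >= min_degree]


def refine(graph, seed, min_size=3):
    """Assumed refinement rule: drop the member with the fewest neighbors
    inside the subgraph for as long as doing so raises the density."""
    current = set(seed)
    while len(current) > min_size:
        worst = min(sorted(current), key=lambda p: len(graph[p] & current))
        trimmed = current - {worst}
        if density(graph, trimmed) > density(graph, current):
            current = trimmed
        else:
            break
    return current


def overlap(a, b):
    """Fraction of the smaller set that is shared with the other set."""
    return len(a & b) / min(len(a), len(b))


def seed_refine(graph, max_overlap=0.5, min_density=0.7):
    """Refine every seed; keep dense results that do not overlap too much
    with complexes that were already accepted."""
    complexes = []
    for seed in pick_seeds(graph):
        candidate = refine(graph, seed)
        if density(graph, candidate) < min_density:
            continue
        if all(overlap(candidate, c) <= max_overlap for c in complexes):
            complexes.append(candidate)
    return complexes


# Toy network: a 4-clique and a triangle joined by one noisy edge (D-X),
# plus a pendant protein E attached to A.
ppi = build_graph([
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("X", "Y"), ("X", "Z"), ("Y", "Z"),
    ("D", "X"), ("A", "E"),
])
complexes = seed_refine(ppi)
```

On this toy network the sketch recovers two groups, {A, B, C, D} and {X, Y, Z}: the pendant protein E and the noisy D-X edge are trimmed during refinement, and duplicate candidates grown from other seeds are rejected by the overlap check. A real implementation would replace `density` with a measure that accounts for noise in the interaction data, which is the part the abstract describes as statistically meaningful.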

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Computer Simulation
  • Models, Biological*
  • Models, Statistical
  • Protein Binding
  • Protein Interaction Mapping / methods*
  • Proteins / metabolism*
  • Signal Transduction / physiology*

Substances

  • Proteins