Mining biological networks for unknown pathways

Bioinformatics. 2007 Oct 15;23(20):2775-83. doi: 10.1093/bioinformatics/btm409. Epub 2007 Aug 30.

Abstract

Motivation: Biological pathways provide significant insights into the interaction mechanisms of molecules. Many essential pathways remain unknown or incomplete for newly sequenced organisms. Moreover, experimentally validating the enormous number of possible pathway candidates in a wet-lab environment is time- and effort-intensive. Thus, there is a need for comparative genomics tools that help scientists predict pathways in an organism's biological network.

Results: In this article, we propose a technique to discover unknown pathways in organisms. Our approach makes in-depth use of the Gene Ontology (GO)-based functionalities of enzymes involved in metabolic pathways, as follows: (i) model each pathway as a biological functionality graph of enzyme GO functions, which we call a pathway functionality template; (ii) locate frequent pathway functionality patterns so as to infer previously unknown pathways through pattern matching in the metabolic networks of organisms. We experimentally evaluated the accuracy of the technique on 30 bacterial organisms by predicting around 1500 organism-specific versions of 50 reference pathways. Using a cross-validation strategy on known pathways, we were able to infer pathways with 86% precision and 72% recall for enzymes (i.e. nodes). The accuracy of the predicted enzyme relationships (i.e. edges) was measured at 85% precision with 64% recall.
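As an illustration of how node-level and edge-level accuracy of this kind can be computed, the following is a minimal sketch and not the authors' code. It assumes a predicted pathway and a known pathway are each given as a set of enzyme identifiers (nodes) and a set of directed enzyme pairs (edges). The function name `precision_recall` and the toy EC numbers are hypothetical and are not taken from the article.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of hashable items."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical toy data: enzymes labeled by EC number (illustration only).
known_nodes = {"2.7.1.1", "5.3.1.9", "2.7.1.11", "4.1.2.13"}
predicted_nodes = {"2.7.1.1", "5.3.1.9", "2.7.1.11", "1.1.1.1"}

known_edges = {
    ("2.7.1.1", "5.3.1.9"),
    ("5.3.1.9", "2.7.1.11"),
    ("2.7.1.11", "4.1.2.13"),
}
predicted_edges = {
    ("2.7.1.1", "5.3.1.9"),
    ("5.3.1.9", "2.7.1.11"),
}

# Node-level accuracy: which enzymes were recovered.
node_precision, node_recall = precision_recall(predicted_nodes, known_nodes)
# Edge-level accuracy: which enzyme relationships were recovered.
edge_precision, edge_recall = precision_recall(predicted_edges, known_edges)

print(f"nodes: precision={node_precision:.2f} recall={node_recall:.2f}")
print(f"edges: precision={edge_precision:.2f} recall={edge_recall:.2f}")
```

In a cross-validation setting, one would hold out a known pathway, predict it from the remaining data, and average these per-pathway scores. How the article aggregates them is not stated in the abstract.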

Availability: Code is available upon request.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Bacterial Physiological Phenomena*
  • Bacterial Proteins / metabolism*
  • Computer Simulation
  • Information Storage and Retrieval / methods*
  • Models, Biological*
  • Signal Transduction / physiology*

Substances

  • Bacterial Proteins