Automatic detection of subsystem/pathway variants in genome analysis

Bioinformatics. 2005 Jun:21 Suppl 1:i478-86. doi: 10.1093/bioinformatics/bti1052.

Abstract

Motivation: Proteins work together in pathways and networks, collectively comprising the cellular machinery. A subsystem (a generalization of the pathway concept) is a group of related functional roles (such as enzymes) jointly involved in a specific aspect of the cellular machinery. Subsystems provide a natural framework for comparative genome analysis and functional annotation. A subsystem may be implemented in a number of different functional variants in individual species. To project functional assignments reliably across multiple genomes, we must be able to identify the variants implemented in each genome. The analysis of such variants across diverse species is an interesting problem in itself and may provide new evolutionary insights. However, no computational techniques are presently available for the automated detection and analysis of subsystem variants.

Results: Here we formulate subsystem variant detection as the problem of finding the minimum number of subgraphs of a subsystem, which is represented as a graph, and solve the resulting optimization problem using an integer programming approach. The performance of our method was tested on subsystems encoded in the SEED, a genomic integration platform developed by the Fellowship for Interpretation of Genomes as a component of a large-scale effort on comparative analysis and annotation of multiple diverse genomes. We illustrate the results obtained for two expert-encoded subsystems, the biosynthesis of Coenzyme A and of the FMN/FAD cofactors. Applications of variant detection, to support genomic annotations and to assess divergence of species, are briefly discussed in the context of these universally conserved and essential metabolic subsystems.
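
As a rough illustration of the "minimum number of subgraphs" idea, the sketch below treats each candidate variant as a set of functional roles and looks for the fewest candidates such that every genome fully contains at least one of them. This is only one possible reading of the abstract, not the formulation used in the paper: the role names, genome names, candidate variants, and the function min_variant_cover are all hypothetical, and an exhaustive search over 0/1 choices stands in for an integer programming solver.

```python
from itertools import combinations

def min_variant_cover(genomes, candidates):
    """Return the smallest list of candidate variant names such that every
    genome contains all roles of at least one chosen variant.

    Exhaustive search over subsets of increasing size; this mimics the
    binary "select variant or not" decisions of an integer program, but it
    is not an ILP solver and only scales to toy inputs.
    """
    if not genomes:
        return []
    names = sorted(candidates)
    # For each candidate, the genomes whose role set includes every role in it.
    explains = {
        n: {g for g, roles in genomes.items() if candidates[n] <= roles}
        for n in names
    }
    all_genomes = set(genomes)
    for k in range(1, len(names) + 1):
        for chosen in combinations(names, k):
            covered = set().union(*(explains[n] for n in chosen))
            if covered == all_genomes:
                return list(chosen)
    return None  # some genome is not explained by any candidate

# Hypothetical toy data: roles A-D, three genomes, three candidate variants.
genomes = {
    "g1": {"A", "B", "C"},
    "g2": {"A", "D"},
    "g3": {"A", "B", "C", "D"},
}
candidates = {
    "v1": {"A", "B", "C"},
    "v2": {"A", "D"},
    "v3": {"B", "C"},
}

result = min_variant_cover(genomes, candidates)
print(result)
```

In this toy case no single candidate explains all three genomes, so at least two variants are needed. A real implementation would presumably also have to generate the candidate subgraphs from the subsystem graph, a step this sketch takes as given.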

Supplementary information: The details of the variant detection results are available at http://ffas.burnham.org/svar/supp.html.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Automation
  • Computational Biology / methods*
  • Databases, Genetic
  • Genome*
  • Genomics / methods*
  • Models, Biological
  • Models, Statistical
  • Models, Theoretical
  • Programming Languages
  • Protein Interaction Mapping
  • Software
  • Systems Biology