Ten years of maintaining and expanding a microbial genome and metagenome analysis system

Trends Microbiol. 2015 Nov;23(11):730-741. doi: 10.1016/j.tim.2015.07.012. Epub 2015 Oct 14.

Abstract

Launched in March 2005, the Integrated Microbial Genomes (IMG) system is a comprehensive data management system that supports multidimensional comparative analysis of genomic data. At the core of the IMG system is a data warehouse that contains genome and metagenome datasets sequenced at the Joint Genome Institute or provided by scientific users, as well as public genome datasets available from the National Center for Biotechnology Information GenBank sequence data archive. Genome and metagenome datasets are processed using IMG's microbial genome and metagenome sequence data processing pipelines and are integrated into the data warehouse using IMG's data integration toolkits. Application-specific data marts and user interfaces for microbial genomes and metagenomes provide access to different subsets of IMG's data and analysis toolkits. This review article revisits IMG's original aims, highlights key milestones reached by the system during the past 10 years, and discusses the main challenges faced by a rapidly expanding system, in particular the complexity of maintaining such a system in an academic setting with limited budgets and limited computing and data management infrastructure.

Keywords: comparative genome analysis; metagenomics; microbial genomics.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Computational Biology
  • Database Management Systems
  • Databases, Genetic*
  • Genome, Microbial / genetics*
  • Genomics / methods
  • Metagenome / genetics*
  • Models, Molecular
  • Software
  • Statistics as Topic