Self-organizing approach for meta-genomes

Comput Biol Chem. 2014 Dec:53 Pt A:118-24. doi: 10.1016/j.compbiolchem.2014.08.016. Epub 2014 Aug 24.

Abstract

We extend the self-organizing approach for annotating a bacterial genome to the analysis of raw sequencing data from the human gut metagenome, without sequence assembly. The original approach divides the genomic sequence of a bacterium into non-overlapping segments of equal length and assigns each segment one of seven 'phases': one for noncoding regions, three for direct coding regions (indicating the three possible codon positions of the segment's starting site), and three for reverse coding regions. The noncoding phase and the six coding phases are described by two frequency tables over the 64 triplet types, or 'codon usages'. A set of codon usages can be used to update the phase assignment, and vice versa. Iterating these two updates after an initialization converges to a phase assignment that gives an annotation of the genome. To extend the approach to a metagenome, we consider a mixture model with a number of categories, each described by different codon usages. Illumina Genome Analyzer sequencing data of the total DNA from faecal samples are then examined to understand the diversity of the human gut microbiome.

Keywords: Codon usages; Human gut meta-genome; Self-organizing genome annotation.
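
The alternating update described in the abstract (codon usages update the phase assignment, and the phase assignment updates the codon usages) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the pseudocount, the per-triplet average log-likelihood score, and the choice to read noncoding segments in frame 0 are all assumptions made for illustration; the abstract does not specify them. Phase 0 stands for noncoding, phases 1 to 3 for direct coding at the three codon offsets, and phases 4 to 6 for reverse coding.

```python
import math
from collections import Counter
from itertools import product

BASES = "ACGT"
TRIPLETS = ["".join(p) for p in product(BASES, repeat=3)]  # the 64 triplet types
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def split_segments(genome, length):
    """Cut the sequence into non-overlapping segments of equal length."""
    return [genome[i:i + length] for i in range(0, len(genome) - length + 1, length)]


def framed_triplets(segment, phase):
    """Read triplets from a segment under one of the seven phases (0..6).

    Assumption: noncoding segments (phase 0) are read in frame 0.
    """
    if phase == 0:
        strand, offset = segment, 0
    elif phase <= 3:
        strand, offset = segment, phase - 1
    else:
        strand, offset = reverse_complement(segment), phase - 4
    return [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]


def estimate_usage(triplet_lists, pseudocount=1.0):
    """Build a frequency table over the 64 triplets (pseudocount is an assumption)."""
    counts = Counter(t for ts in triplet_lists for t in ts)
    total = sum(counts[t] for t in TRIPLETS) + pseudocount * len(TRIPLETS)
    return {t: (counts[t] + pseudocount) / total for t in TRIPLETS}


def mean_log_likelihood(triplets, usage):
    """Average log-probability per triplet, so frames of different size compare fairly."""
    known = [t for t in triplets if t in usage]
    return sum(math.log(usage[t]) for t in known) / max(len(known), 1)


def assign_phases(segments, noncoding, coding):
    """Give each segment the phase whose table and frame score highest."""
    phases = []
    for seg in segments:
        scores = [
            mean_log_likelihood(framed_triplets(seg, p), noncoding if p == 0 else coding)
            for p in range(7)
        ]
        phases.append(max(range(7), key=scores.__getitem__))
    return phases


def update_usages(segments, phases):
    """Re-estimate the noncoding and coding tables from the current assignment."""
    non = [framed_triplets(s, 0) for s, p in zip(segments, phases) if p == 0]
    cod = [framed_triplets(s, p) for s, p in zip(segments, phases) if p != 0]
    return estimate_usage(non), estimate_usage(cod)


def self_organize(genome, length, initial_phases, max_iter=50):
    """Alternate the two updates until the assignment stops changing."""
    segments = split_segments(genome, length)
    phases = list(initial_phases)
    for _ in range(max_iter):
        noncoding, coding = update_usages(segments, phases)
        new_phases = assign_phases(segments, noncoding, coding)
        if new_phases == phases:
            break
        phases = new_phases
    return phases
```

A call such as `self_organize(genome, 9, initial_phases)` returns one phase label per segment. For the metagenome extension mentioned in the abstract, one plausible adaptation (again an assumption) would keep a separate coding table per category and score each read against every category.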

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosome Mapping / methods
  • Chromosome Mapping / statistics & numerical data*
  • Codon*
  • Feces / chemistry
  • Feces / microbiology
  • Gastrointestinal Tract / microbiology
  • Genome, Bacterial*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Metagenome*
  • Microbiota / genetics
  • Molecular Sequence Annotation
  • Sequence Analysis, DNA / statistics & numerical data*

Substances

  • Codon