A new metagenome binning method based on gene uniqueness

Genes Genomics. 2020 Aug;42(8):883-892. doi: 10.1007/s13258-020-00956-2. Epub 2020 Jun 6.

Abstract

Background: The human gut microbiome contains millions of genes and many undetected bacterial species. Recovering bacterial genomes from large, complex metagenomes remains highly challenging, and current binning methods show insufficient recall rates.

Objective: This study aimed to propose a new metagenome binning method with a promising recall rate and accuracy.

Methods: Using strict BLAST parameters (identity > 90% and alignment length > 100 bp), we found that more than 85% of the genes could be aligned to only one bacterial species. We called this phenomenon "gene uniqueness"; it indicates that most bacterial genes are exclusive to a single species. Our new metagenome binning method clusters contigs by gene similarity via a graph model: contigs that share the same gene under the strict BLAST parameters are clustered into one bin.
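
The clustering rule described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the graph model in detail, so the sketch assumes that bins are the connected components of a graph whose nodes are contigs and whose edges join contigs sharing a gene hit. The function name `bin_contigs`, the input tuple layout (contig ID, gene ID, percent identity, alignment length) and the use of union-find are all assumptions made for illustration; only the two thresholds come from the abstract.

```python
from collections import defaultdict


def bin_contigs(hits, min_identity=90.0, min_length=100):
    """Group contigs into bins by shared gene hits (illustrative sketch).

    hits: iterable of (contig_id, gene_id, percent_identity, alignment_length).
    A hit is kept only if identity > min_identity and length > min_length,
    mirroring the strict thresholds quoted in the abstract.
    Contigs with no passing hit are not placed in any bin.
    """
    parent = {}  # union-find forest over contig IDs

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_contig_for_gene = {}
    for contig, gene, identity, length in hits:
        if identity <= min_identity or length <= min_length:
            continue  # hit fails the strict filter
        find(contig)  # register the contig even if it stays alone
        if gene in first_contig_for_gene:
            # Two contigs hit the same gene: put them in the same bin.
            union(first_contig_for_gene[gene], contig)
        else:
            first_contig_for_gene[gene] = contig

    bins = defaultdict(set)
    for contig in list(parent):
        bins[find(contig)].add(contig)
    return sorted(sorted(members) for members in bins.values())


if __name__ == "__main__":
    # Made-up hits, for demonstration only.
    example_hits = [
        ("c1", "gA", 95.0, 300),
        ("c2", "gA", 92.0, 150),
        ("c2", "gB", 99.0, 500),
        ("c3", "gB", 91.0, 120),
        ("c4", "gC", 97.0, 200),
        ("c5", "gA", 85.0, 400),  # identity too low, ignored
        ("c6", "gC", 99.0, 80),   # alignment too short, ignored
    ]
    print(bin_contigs(example_hits))
```

In this toy input, c1 and c2 share gene gA and c2 and c3 share gene gB, so all three fall into one bin even though c1 and c3 share no gene directly. Union-find keeps the grouping close to linear in the number of hits, which is one plausible way to obtain the low time complexity the abstract reports, though the abstract does not say how the authors achieved it.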

Results: We obtained 1,131 bins and reconstructed the genomes of 12 unknown species from the MetaHIT data. Our method exhibited a more promising recall rate, faster running speed, and lower time complexity than current methods.

Conclusions: The new metagenome binning method based on gene uniqueness had a high recall rate and a low error rate, and could be applied to efficiently assemble bacterial genomes from complex metagenomes.

Keywords: Bacteria assembly; Bacterial genome; Gene uniqueness; Metagenome; Metagenome binning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis
  • DNA Barcoding, Taxonomic / methods
  • Gastrointestinal Microbiome / genetics*
  • Genome, Bacterial / genetics*
  • Humans
  • Metagenome / genetics*
  • Metagenomics / methods*
  • Sequence Analysis, DNA