A proposal for a portal to make earth's microbial diversity easily accessible and searchable

Antonie Van Leeuwenhoek. 2017 Oct;110(10):1271-1279. doi: 10.1007/s10482-017-0849-z. Epub 2017 Mar 9.

Abstract

Estimates of the number of bacterial species range from 10^7 to 10^12. At the pace at which descriptions of new species are currently being published, the description of all bacterial species on earth will only be completed in thousands of years. However, even if one day all species were named and described, these names and descriptions would still be of little practical value unless they could be easily searched and accessed, so that novel strains could be easily identified as members of any of these species. To complicate the situation further, many of the currently known species contain significant genotypic and phenotypic diversity that would still be missed if description of microbial diversity were limited to species. The solution to this problem could be a database in which every bacterial species and every intra-specific group is anchored to a genome-similarity framework. This ideal database should be searchable using complete or partial genome sequences as well as phenotypes. Moreover, the database should include functions to easily add newly sequenced novel strains, automatically place them into the genome-similarity framework, identify them as members of an already named species, or tag them as members of yet to be described species or new intra-specific groups. Here, we propose the means to develop such a database by taking advantage of the concept of genome sequence similarity-based codes, called Life Identification Numbers or LINs.
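
As an illustration of how a similarity-based code could place a new strain into such a framework, the following is a minimal sketch, not the authors' implementation. It assumes that each LIN position corresponds to one average nucleotide identity (ANI) threshold, that the ANI between a new genome and its most similar genome in the database has already been computed elsewhere, and that named species or groups are registered as LIN prefixes. The class `LinDatabase`, its methods, and the threshold values are hypothetical and chosen only for the example.

```python
class LinDatabase:
    """Toy store of LIN-like codes (hypothetical design, for illustration only)."""

    def __init__(self, thresholds):
        # One ANI threshold (percent) per LIN position, in ascending order.
        # The values passed in are assumptions, not taken from the article.
        self.thresholds = sorted(thresholds)
        self.lins = {}          # strain name -> LIN as a tuple of ints
        self.named_groups = {}  # LIN prefix (tuple) -> species or group label

    def add_strain(self, name, nearest=None, ani=None):
        """Assign a code to a new strain from its most similar stored genome."""
        n = len(self.thresholds)
        if not self.lins:
            lin = (0,) * n  # first genome in the database: all positions zero
        else:
            ref = self.lins[nearest]
            # Number of leading positions shared with the nearest genome:
            # one for every threshold that the measured ANI reaches.
            k = sum(1 for t in self.thresholds if ani >= t)
            if k == n:
                lin = ref  # indistinguishable at every threshold
            else:
                prefix = ref[:k]
                # Take the next unused value at the first differing position.
                used = [l[k] for l in self.lins.values() if l[:k] == prefix]
                lin = prefix + (max(used) + 1,) + (0,) * (n - k - 1)
        self.lins[name] = lin
        return lin

    def name_group(self, prefix, label):
        """Attach a species or intra-specific group name to a LIN prefix."""
        self.named_groups[tuple(prefix)] = label

    def identify(self, strain):
        """Return labels of all named groups containing the strain, broadest first."""
        lin = self.lins[strain]
        hits = [p for p in self.named_groups if lin[:len(p)] == p]
        return [self.named_groups[p] for p in sorted(hits, key=len)]


# Usage with made-up strains and ANI values.
db = LinDatabase([70, 80, 90, 95, 99])
db.add_strain("strain_A")
db.add_strain("strain_B", nearest="strain_A", ani=96.0)
db.add_strain("strain_C", nearest="strain_A", ani=85.0)
db.name_group((0, 0, 0, 0), "species X")  # everything within 95% ANI of strain_A
print(db.identify("strain_B"), db.identify("strain_C"))
```

In this sketch, a strain whose code matches no registered prefix comes back with an empty list, which corresponds to tagging it as a member of a yet to be described species or group.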

Keywords: Average nucleotide identity; Bacterial species; Database; Genome sequences.

Publication types

  • Review

MeSH terms

  • Bacteria / classification*
  • Bacteria / genetics
  • Biodiversity*
  • Datasets as Topic*
  • Genome, Bacterial / genetics
  • Microbiology / standards
  • Microbiology / trends*
  • Phylogeny
  • Terminology as Topic