MicFunPred: A conserved approach to predict functional profiles from 16S rRNA gene sequence data

Genomics. 2021 Nov;113(6):3635-3643. doi: 10.1016/j.ygeno.2021.08.016. Epub 2021 Aug 24.

Abstract

16S rRNA gene amplicon sequencing is a popular technique that accurately characterizes microbial taxonomic abundances but provides no functional information. Several tools predict functional profiles from 16S rRNA gene sequence data, using different genome databases and approaches. Because the variable regions of a partially sequenced 16S rRNA gene cannot resolve taxonomy accurately beyond the genus level, these tools may give inflated results. Here, we developed 'MicFunPred', which uses a novel approach to derive imputed metagenomes from a set of core genes only, thereby minimizing false-positive predictions. On simulated datasets, MicFunPred showed the lowest false positive rate (FPR), with a mean Spearman's correlation of 0.89 (SD = 0.03), while on seven real datasets the mean correlation was 0.75 (SD = 0.08). MicFunPred was found to be faster, with low computational requirements, and performed better than or comparably to other tools.
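
The core-gene idea can be illustrated with a minimal sketch. It is not MicFunPred's actual implementation. The function names, the toy genomes, the 0.5 presence threshold, and the use of a mean copy number per genus are all assumptions made for illustration. The sketch keeps, for each genus, only the genes found in at least a chosen fraction of that genus's reference genomes, weights them by genus abundance, and computes the false positive rate against a known gene set.

```python
from collections import defaultdict


def core_genes(genomes, min_fraction=0.5):
    """Return {gene: mean copy number} for genes present in at least
    `min_fraction` of a genus's genomes (the threshold is an assumption)."""
    n = len(genomes)
    presence = defaultdict(int)
    total = defaultdict(float)
    for genome in genomes:
        for gene, copies in genome.items():
            if copies > 0:
                presence[gene] += 1
                total[gene] += copies
    # Mean is taken over all genomes of the genus, not only the carriers.
    return {g: total[g] / n for g in presence if presence[g] / n >= min_fraction}


def predict_profile(genus_abundance, genus_genomes, min_fraction=0.5):
    """Imputed gene abundances: sum over genera of abundance * core copy number."""
    profile = defaultdict(float)
    for genus, abundance in genus_abundance.items():
        genes = core_genes(genus_genomes[genus], min_fraction)
        for gene, copies in genes.items():
            profile[gene] += abundance * copies
    return dict(profile)


def false_positive_rate(predicted, truth, all_genes):
    """FPR = FP / (FP + TN), based on gene presence or absence."""
    fp = sum(1 for g in all_genes if predicted.get(g, 0) > 0 and truth.get(g, 0) == 0)
    tn = sum(1 for g in all_genes if predicted.get(g, 0) == 0 and truth.get(g, 0) == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


# Hypothetical toy data: gene copy numbers per reference genome, grouped by genus.
genus_genomes = {
    "GenusA": [
        {"K1": 1, "K2": 2},
        {"K1": 1, "K2": 2},
        {"K1": 1, "K3": 1},
        {"K1": 1, "K2": 2},
    ],
    "GenusB": [{"K1": 2, "K4": 1}, {"K1": 2}, {"K1": 2}],
}
genus_abundance = {"GenusA": 10, "GenusB": 5}
truth = {"K1": 18, "K2": 14}  # genes truly present in the toy sample
all_genes = ["K1", "K2", "K3", "K4"]

core_profile = predict_profile(genus_abundance, genus_genomes, min_fraction=0.5)
full_profile = predict_profile(genus_abundance, genus_genomes, min_fraction=0.0)

print("core-gene profile:", core_profile)
print("FPR (core genes): ", false_positive_rate(core_profile, truth, all_genes))
print("FPR (all genes):  ", false_positive_rate(full_profile, truth, all_genes))
```

In this toy case, the genes carried by only a minority of genomes in a genus (K3 and K4) are dropped from the core-gene prediction but kept when every gene is used. That is the kind of false positive the abstract refers to. The numbers are contrived and say nothing about the tool's real performance.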

Keywords: 16S rRNA gene; Amplicon sequencing; Imputed metagenomes; MicFunPred; Microbiome.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria* / genetics
  • Genes, rRNA
  • Metagenome*
  • Phylogeny
  • RNA, Ribosomal, 16S / genetics

Substances

  • RNA, Ribosomal, 16S