Hierarchical Hidden Markov models enable accurate and diverse detection of antimicrobial resistance sequences

Commun Biol. 2019 Aug 6;2:294. doi: 10.1038/s42003-019-0545-9. eCollection 2019.

Abstract

Characterizing antimicrobial resistance genes from high-throughput sequencing data has become foundational in public health research and regulation. Doing so requires mapping sequence reads to databases of known antimicrobial resistance genes to determine which genes are present in a sample. This mapping is traditionally accomplished using alignment. Alignment methods have high specificity but are limited in their ability to detect sequences that diverge from the reference database, which can result in a substantial false negative rate. We address this shortcoming with Meta-MARC, which detects diverse resistance sequences using hierarchical, DNA-based Hidden Markov Models. We first describe Meta-MARC and then demonstrate its efficacy on simulated and functional metagenomic datasets. Meta-MARC has higher sensitivity than competing methods, which allows it to detect sequences that are divergent from known antimicrobial resistance genes. This functionality is imperative for expanding existing antimicrobial resistance gene databases.
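
The contrast drawn above between alignment and probabilistic models can be illustrated with a toy sketch. It is not Meta-MARC's implementation: it uses a simplified, match-state-only position profile (no insert or delete states, no hierarchy), and the sequences, the 0.8 identity cutoff, and the function names build_profile, log_odds, and identity are all hypothetical choices made for illustration. The point is only that a read mixing variants from several family members can fall below an identity cutoff against every single reference while still scoring above background under a model of the whole family.

```python
import math

BASES = "ACGT"

def build_profile(seqs, pseudocount=1.0):
    """Per-column emission probabilities from equal-length, pre-aligned DNA sequences."""
    profile = []
    for i in range(len(seqs[0])):
        counts = {b: pseudocount for b in BASES}  # pseudocounts avoid zero probabilities
        for s in seqs:
            counts[s[i]] += 1
        total = sum(counts.values())
        profile.append({b: counts[b] / total for b in BASES})
    return profile

def log_odds(profile, read, background=0.25):
    """Log2-odds of the read under the profile versus a uniform background."""
    return sum(math.log2(col[base] / background) for col, base in zip(profile, read))

def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Hypothetical family of related reference sequences (illustrative only).
family = ["ACGTAC", "ACGTTC", "ACGAAC", "ACCTAC"]
profile = build_profile(family)

# A read that combines variants seen in different family members.
read = "ACCATC"
unrelated = "TGATCG"

IDENTITY_CUTOFF = 0.8  # assumed alignment-style threshold, not taken from the paper

best_identity = max(identity(read, ref) for ref in family)
alignment_hit = best_identity >= IDENTITY_CUTOFF
profile_hit = log_odds(profile, read) > 0
unrelated_hit = log_odds(profile, unrelated) > 0

print("alignment-style hit:", alignment_hit)
print("profile hit:", profile_hit)
print("unrelated read hit:", unrelated_hit)
```

In this toy setting the divergent read is missed by the identity cutoff but accepted by the profile, while an unrelated read is rejected by both. A full profile HMM would add insert and delete states and would be scored with the forward or Viterbi algorithm rather than a plain column-wise sum.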

Keywords: Antimicrobial resistance; Machine learning; Microbial communities; Microbial genetics.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Data Mining / methods*
  • Databases, Genetic*
  • Drug Resistance, Microbial / genetics*
  • High-Throughput Nucleotide Sequencing*
  • Machine Learning*
  • Markov Chains*
  • Metagenomics*
  • Reproducibility of Results
  • Sequence Analysis, DNA*