ARGs-OAP v2.0 with an expanded SARG database and Hidden Markov Models for enhancement characterization and quantification of antibiotic resistance genes in environmental metagenomes

Bioinformatics. 2018 Jul 1;34(13):2263-2270. doi: 10.1093/bioinformatics/bty053.

Abstract

Motivation: Antibiotic resistance has drawn much global attention, with efforts to monitor its emergence, accumulation and dissemination. For rapid characterization and quantification of antibiotic resistance genes (ARGs) in metagenomic datasets, an online analysis pipeline, ARGs-OAP, has been developed. It includes a database termed Structured Antibiotic Resistance Genes (SARG), which has a hierarchical structure (ARG type-subtype-reference sequence).
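The type-subtype-reference sequence hierarchy can be pictured as a lookup from each reference sequence to its subtype and type, so that per-sequence read counts roll up to the two higher levels. The following is a minimal sketch under that assumption, not the pipeline's actual code: the names `SargEntry` and `roll_up` and the reference identifiers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, NamedTuple, Tuple


class SargEntry(NamedTuple):
    """Position of one reference sequence in the hierarchy (hypothetical type)."""
    arg_type: str   # e.g. a resistance class
    subtype: str    # e.g. a gene name within that class


def roll_up(
    hits: Dict[str, int],
    hierarchy: Dict[str, SargEntry],
) -> Tuple[Dict[str, int], Dict[Tuple[str, str], int]]:
    """Sum per-reference-sequence hit counts up to subtype and type level."""
    by_type: Dict[str, int] = defaultdict(int)
    by_subtype: Dict[Tuple[str, str], int] = defaultdict(int)
    for ref_id, count in hits.items():
        entry = hierarchy[ref_id]  # raises KeyError for an unknown reference
        by_subtype[(entry.arg_type, entry.subtype)] += count
        by_type[entry.arg_type] += count
    return dict(by_type), dict(by_subtype)


# Illustrative placeholder data, not taken from the SARG database.
example_hierarchy = {
    "ref_a": SargEntry("tetracycline", "tetA"),
    "ref_b": SargEntry("tetracycline", "tetA"),
    "ref_c": SargEntry("tetracycline", "tetM"),
}
example_hits = {"ref_a": 3, "ref_b": 2, "ref_c": 5}
type_counts, subtype_counts = roll_up(example_hits, example_hierarchy)
```

Keying the subtype totals by `(type, subtype)` rather than by subtype alone keeps the parent level explicit, which avoids collisions if the same subtype label were ever to appear under two types.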

Results: The new release of the database, termed SARG version 2.0, contains sequences not only from the CARD and ARDB databases, but also carefully selected and curated sequences from the latest protein collection of the NCBI-NR database, to keep up to date with the increasing number of deposited ARG sequences. SARG v2.0 has tripled the number of sequences in the first version and demonstrated improved coverage of ARG detection in metagenomes from various environmental samples. In addition to annotating high-throughput raw reads using a similarity search strategy, ARGs-OAP v2.0 now provides model-based identification of assembled sequences using SARGfam, a high-quality profile Hidden Markov Model (HMM) database containing profiles of ARG subtypes. ARGs-OAP v2.0 also improves cell number quantification by using the average coverage of essential single-copy marker genes, offered as an option alongside the previous method based on the 16S rRNA gene.
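The marker-gene option for cell number quantification can be illustrated with a short sketch. It assumes that the coverage of a gene is approximated as (mapped reads × read length) / gene length, that the cell number is the mean of that coverage over the single-copy marker genes, and that ARG abundance is then expressed as coverage per cell. These formulas and all function names are assumptions made for illustration; the abstract does not spell out the exact computation.

```python
from statistics import mean
from typing import Dict, Tuple

# Each value is (number of mapped reads, gene length in nucleotides).
Hits = Dict[str, Tuple[int, int]]


def coverage(n_reads: int, read_len: int, gene_len: int) -> float:
    """Approximate per-base coverage of one gene (assumed formula)."""
    return n_reads * read_len / gene_len


def estimate_cell_number(marker_hits: Hits, read_len: int) -> float:
    """Average coverage across single-copy marker genes.

    Each such gene is expected once per genome, so the mean coverage
    serves as an estimate of the number of cells sampled.
    """
    return mean(coverage(n, read_len, length) for n, length in marker_hits.values())


def arg_copies_per_cell(arg_hits: Hits, marker_hits: Hits, read_len: int) -> Dict[str, float]:
    """Normalize each ARG's coverage by the estimated cell number."""
    cells = estimate_cell_number(marker_hits, read_len)
    if cells <= 0:
        raise ValueError("no marker gene coverage; cannot normalize")
    return {
        name: coverage(n, read_len, length) / cells
        for name, (n, length) in arg_hits.items()
    }
```

Averaging over several marker genes, rather than relying on a single gene, is meant to dampen the effect of any one gene being unevenly covered in a given sample.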

Availability and implementation: ARGs-OAP can be accessed through http://smile.hku.hk/SARGs. The database can be downloaded from the same site. Source code for this study can be downloaded from https://github.com/xiaole99/ARGs-OAP-v2.0.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Archaea / genetics
  • Archaea / physiology
  • Bacteria / genetics
  • Bacterial Physiological Phenomena
  • Databases, Factual*
  • Drug Resistance, Microbial / genetics*
  • Genome, Archaeal
  • Genome, Bacterial
  • Metagenome*
  • Metagenomics / methods
  • Sequence Analysis, DNA / methods
  • Software*