Discovery of thermophilic Bacillales using reduced-representation genotyping for identification

BMC Microbiol. 2020 May 13;20(1):114. doi: 10.1186/s12866-020-01800-z.

Abstract

Background: This study demonstrates the use of reduced-representation genotyping to provide preliminary identifications for thermophilic bacterial isolates. The approach combines restriction enzyme digestion and PCR with next-generation sequencing to provide thousands of short-read sequences from across the bacterial genome. Isolates were obtained from compost, hot water systems, and artesian bores of the Great Artesian Basin. Genomic DNA was double-digested with two combinations of restriction enzymes and then PCR-amplified by a commercial provider of DArTseq™, Diversity Arrays Technology Pty Ltd. (Canberra, Australia). The resulting fragments, which formed a reduced representation of approximately 2.3% of the genome, were sequenced. The sequence tags obtained were aligned against all available RefSeq bacterial genome assemblies by BLASTn to identify the nearest reference genome.
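The final step, choosing the nearest reference genome from BLASTn alignments of many sequence tags, can be illustrated with a minimal sketch. The abstract does not state the authors' scoring or tie-breaking rules, so the rule used here is an assumption: keep each tag's best hit by bit score, then count tags per reference. The function name `nearest_reference`, the reference labels `refA` and `refB`, and the sample rows are hypothetical. The input is assumed to be BLAST tabular output (`-outfmt 6`) with its default 12 columns.

```python
from collections import Counter


def nearest_reference(blast_rows):
    """Rank reference genomes by how many tags have their best hit on them.

    blast_rows: iterable of tab-separated lines in BLAST -outfmt 6 default
    column order (qseqid, sseqid, ..., evalue, bitscore).
    Assumption: the "nearest" genome is the one collecting the most best hits;
    this is an illustrative rule, not the published pipeline.
    """
    best = {}  # tag id -> (bit score, subject id) of the best hit seen so far
    for line in blast_rows:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        tag, subject, bitscore = fields[0], fields[1], float(fields[11])
        if tag not in best or bitscore > best[tag][0]:
            best[tag] = (bitscore, subject)
    votes = Counter(subject for _, subject in best.values())
    return votes.most_common()


# Hypothetical rows: three tags aligned against two made-up references.
rows = [
    "tag1\trefA\t100.0\t64\t0\t0\t1\t64\t10\t73\t1e-28\t119",
    "tag1\trefB\t95.3\t64\t3\t0\t1\t64\t500\t563\t1e-22\t102",
    "tag2\trefA\t98.4\t64\t1\t0\t1\t64\t900\t963\t1e-26\t113",
    "tag3\trefB\t100.0\t64\t0\t0\t1\t64\t40\t103\t1e-28\t119",
]
ranking = nearest_reference(rows)
print(ranking)
```

In practice the subject identifiers would be sequence accessions that still need mapping to genome assemblies and species names; that mapping is left out of this sketch.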

Results: A total of 99 bacterial isolates received preliminary identifications to species level, and 8 of these were selected for whole-genome sequencing to assess the identification results. Novel species and strains were discovered within this set of isolates. The preliminary identifications obtained by reduced-representation genotyping, as well as identifications obtained by BLASTn alignment of the 16S rRNA gene sequence, were compared with those derived from the whole-genome sequence data, using the same RefSeq sequence database for all three methods. Identifications obtained with reduced-representation sequencing agreed with those provided by whole-genome sequencing in 100% of cases. The identifications produced by BLASTn alignment of the 16S rRNA gene sequence against the same database differed from those provided by whole-genome sequencing in 37.5% of cases, and were ambiguous in 50% of cases.
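As a reading aid, the reported percentages can be converted back to isolate counts. This sketch assumes that every percentage is taken over the 8 isolates that were whole-genome sequenced; the abstract implies this but does not state it. The dictionary keys are labels invented for illustration.

```python
from fractions import Fraction

N_ISOLATES = 8  # assumption: the denominator is the 8 whole-genome-sequenced isolates

# Percentages as reported in the Results paragraph.
reported = {
    "reduced_representation_agree": Fraction(100),
    "16S_differ": Fraction(375, 10),  # 37.5%
    "16S_ambiguous": Fraction(50),
}


def implied_count(percent, n=N_ISOLATES):
    """Return the whole number of isolates that a percentage of n implies."""
    count = percent * n / 100
    if count.denominator != 1:
        raise ValueError(f"{float(percent)}% of {n} is not a whole number")
    return int(count)


counts = {label: implied_count(pct) for label, pct in reported.items()}
print(counts)
```

Under that assumption the figures correspond to 8, 3, and 4 of the 8 isolates respectively. The abstract does not say whether the differing and ambiguous 16S cases overlap.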

Conclusions: This method has previously been demonstrated for bacterial identification in medical microbiology. This study demonstrates the first successful use of DArTseq™ for preliminary identification of thermophilic bacterial isolates, providing results in complete agreement with those obtained from whole-genome sequencing of the same isolates. The growing database of bacterial genome sequences provides an excellent resource for alignment of reduced-representation sequence data for identification purposes, and as the number of available sequenced genomes continues to grow, the technique will become more effective.

Keywords: Bacterial identification; DArTseq; Genotyping-by-sequencing; Great Artesian Basin; Reduced-representation sequencing; Thermophiles.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacillales / classification*
  • Bacillales / genetics
  • Bacillales / isolation & purification
  • Composting
  • DNA, Bacterial / genetics*
  • Genotyping Techniques / methods*
  • High-Throughput Nucleotide Sequencing
  • Polymerase Chain Reaction
  • RNA, Ribosomal, 16S / genetics
  • Restriction Mapping
  • Sequence Analysis, DNA
  • Water Microbiology
  • Whole Genome Sequencing

Substances

  • DNA, Bacterial
  • RNA, Ribosomal, 16S