Identifying Taxonomic Units in Metagenomic DNA Streams on Mobile Devices

IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):1092-1103. doi: 10.1109/TCBB.2022.3172661. Epub 2023 Apr 3.

Abstract

With the emergence of portable DNA sequencers, such as the Oxford Nanopore Technologies MinION, metagenomic DNA sequencing can be performed in real time and directly in the field. However, because metagenomic DNA analysis tasks, such as classification and taxonomic unit assignment, are compute and memory intensive, and the available methods are designed for batch processing, current metagenomic tools are not well suited for mobile devices. In this work, we propose a new memory-efficient approach to identify Operational Taxonomic Units (OTUs) in metagenomic DNA streams on mobile devices. Our method is based on finding connected components in overlap graphs constructed over a real-time stream of long DNA reads as produced by the MinION platform. We propose an efficient algorithm to maintain connected components when an overlap graph is streamed, and we show how redundant information can be removed from the stream by transitive closures. We also describe how our algorithms can be integrated into a larger DNA analysis pipeline tailored for mobile computing. Through experiments on simulated and real-world metagenomic data, executed on an actual mobile device, we demonstrate that the resulting solution is able to recover OTUs with high precision. Our experiments also demonstrate the compounding benefits of introducing feedback loops in the DNA analysis pipeline.
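
The idea of maintaining connected components over a streamed overlap graph can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes overlap edges arrive one at a time as pairs of read identifiers, and it uses a standard union-find (disjoint-set) structure. The class name `StreamingComponents`, its methods, and the read IDs are hypothetical. Discarding an edge whose endpoints are already connected is a simplification standing in for the transitive-closure reduction mentioned in the abstract, whose details the abstract does not give.

```python
from collections import defaultdict


class StreamingComponents:
    """Hypothetical sketch: union-find over read IDs for a streamed overlap graph.

    Each edge is inspected once and never stored, so memory grows with the
    number of reads rather than the number of overlaps.
    """

    def __init__(self):
        self.parent = {}        # read ID -> parent read ID
        self.size = {}          # root read ID -> component size
        self.kept_edges = 0     # edges that merged two components
        self.dropped_edges = 0  # edges implied by earlier edges (assumed redundant)

    def find(self, read):
        """Return the representative of the component containing `read`."""
        if read not in self.parent:
            self.parent[read] = read
            self.size[read] = 1
            return read
        root = read
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[read] != root:
            self.parent[read], read = root, self.parent[read]
        return root

    def add_edge(self, u, v):
        """Process one overlap edge; return True if it merged two components."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            # Endpoints already connected: the edge adds no component information.
            self.dropped_edges += 1
            return False
        # Union by size keeps the trees shallow.
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size.pop(rv)
        self.kept_edges += 1
        return True

    def components(self):
        """Group all reads seen so far by component (candidate OTUs in this sketch)."""
        groups = defaultdict(list)
        for read in list(self.parent):
            groups[self.find(read)].append(read)
        return list(groups.values())


if __name__ == "__main__":
    sc = StreamingComponents()
    # Made-up stream of overlaps between reads, for illustration only.
    for u, v in [("r1", "r2"), ("r2", "r3"), ("r1", "r3"), ("r4", "r5")]:
        sc.add_edge(u, v)
    print(sc.components(), sc.kept_edges, sc.dropped_edges)
```

In this sketch the edge between `r1` and `r3` is dropped because both reads are already in one component, which hints at why pruning implied edges can reduce the volume of a stream on a memory-constrained device.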

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • DNA
  • Metagenome / genetics
  • Metagenomics* / methods
  • Sequence Analysis, DNA / methods

Substances

  • DNA