Pathogen detection in RNA-seq data with Pathonoia

BMC Bioinformatics. 2023 Feb 17;24(1):53. doi: 10.1186/s12859-023-05144-z.

Abstract

Background: Bacterial and viral infections may cause or exacerbate various human diseases, and RNA sequencing is one method of choice for detecting microbes in tissue. The detection of specific microbes using RNA sequencing offers good sensitivity and specificity, but untargeted approaches suffer from high false positive rates and a lack of sensitivity for low-abundance organisms.

Results: We introduce Pathonoia, an algorithm that detects viruses and bacteria in RNA sequencing data with high precision and recall. Pathonoia first applies an established k-mer based method for species identification and then aggregates this evidence over all reads in a sample. In addition, we provide an easy-to-use analysis framework that highlights potential microbe-host interactions by correlating microbial with host gene expression. Pathonoia outperforms state-of-the-art methods in microbial detection specificity on both in silico and real datasets.
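
The aggregation step described above can be illustrated with a minimal sketch. This is not Pathonoia's implementation: the function name `aggregate_kmer_evidence`, the `min_kmers` threshold, and the input layout (one list of per-k-mer taxon labels for each read, with `None` for unclassified k-mers) are all assumptions made for illustration. The sketch only shows the general idea of pooling k-mer level evidence across every read in a sample rather than classifying each read on its own.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional


def aggregate_kmer_evidence(
    reads_kmer_taxa: Iterable[List[Optional[str]]],
    min_kmers: int = 1,
) -> Dict[str, int]:
    """Pool k-mer hits per taxon over all reads of one sample.

    reads_kmer_taxa: hypothetical input, one list per read holding the
        taxon label assigned to each k-mer (None = unclassified k-mer).
    min_kmers: illustrative threshold; taxa supported by fewer k-mers
        across the whole sample are dropped.
    """
    totals: Counter = Counter()
    for kmer_taxa in reads_kmer_taxa:
        for taxon in kmer_taxa:
            if taxon is not None:  # skip unclassified k-mers
                totals[taxon] += 1
    return {taxon: n for taxon, n in totals.items() if n >= min_kmers}


# Toy sample with three reads (labels are made up for illustration).
sample = [
    ["E. coli", "E. coli", None, "S. aureus"],
    ["E. coli", None, None],
    ["HBV", "HBV", "HBV"],
]
counts = aggregate_kmer_evidence(sample, min_kmers=2)
```

In this toy input, a taxon supported by a single stray k-mer is filtered out at the sample level, while taxa with support spread over several reads are kept. The actual scoring used by Pathonoia may differ.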

Conclusion: Two case studies in human liver and brain show how Pathonoia can support novel hypotheses on microbial infection exacerbating disease. The Python package for Pathonoia sample analysis and a guided analysis Jupyter notebook for bulk RNA-seq datasets are available on GitHub.

Keywords: Metagenomics; Pathogen detection; RNA sequencing.

MeSH terms

  • Algorithms*
  • Bacteria* / genetics
  • Base Sequence
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Metagenomics / methods
  • RNA-Seq
  • Sequence Analysis, RNA / methods