Rhometa: Population recombination rate estimation from metagenomic read datasets

PLoS Genet. 2023 Mar 27;19(3):e1010683. doi: 10.1371/journal.pgen.1010683. eCollection 2023 Mar.

Abstract

Prokaryotic evolution is influenced by the exchange of genetic information between species through a process referred to as recombination. The rate of recombination is a useful measure of the adaptive capacity of a prokaryotic population. We introduce Rhometa (https://github.com/sid-krish/Rhometa), a new software package for determining population recombination rates from shotgun sequencing reads of metagenomes. Rhometa extends the composite likelihood approach used by conventional sequence-based population recombination rate estimators to modern aligned short-read metagenomic datasets with diverse sequencing depths, making these techniques and their high accuracy available to the field of metagenomics. We evaluated Rhometa over a broad range of sequencing depths and complexities, using simulated and real experimental short-read data aligned to external reference genomes. On simulated datasets the method performs well, and its accuracy improves as the number of genomes increases. We validated Rhometa on a real Streptococcus pneumoniae transformation experiment, where it obtains plausible estimates of the recombination rate. Finally, we ran the program on ocean surface water metagenomic datasets, demonstrating that it works on uncultured metagenomic datasets.
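
As a rough illustration of the composite likelihood idea mentioned in the abstract, the sketch below scores each candidate recombination rate by summing log-likelihoods over pairs of variant sites and keeps the best-scoring value from a grid. It is not Rhometa's code: the function names, the observation format, and in particular `toy_pair_loglik` are assumptions made up for this example, and the toy likelihood stands in for the coalescent-based pairwise likelihoods that real estimators use.

```python
import math
from typing import Callable, Iterable, Sequence, Tuple

# One pooled observation: (pair label, distance between the two sites, count).
# This format is hypothetical and chosen only for the example.
PairObs = Tuple[str, int, int]


def toy_pair_loglik(label: str, rho_d: float) -> float:
    """Toy stand-in likelihood (an assumption, NOT a coalescent model).

    A site pair is labelled "recombinant" with probability 1 - exp(-rho_d),
    where rho_d is the per-site rate multiplied by the distance.
    """
    p = 1.0 - math.exp(-rho_d)
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
    return math.log(p) if label == "recombinant" else math.log(1.0 - p)


def composite_log_likelihood(
    rho: float,
    observations: Iterable[PairObs],
    pair_loglik: Callable[[str, float], float] = toy_pair_loglik,
) -> float:
    """Sum pairwise log-likelihoods, treating site pairs as if independent."""
    return sum(
        count * pair_loglik(label, rho * dist)
        for label, dist, count in observations
    )


def estimate_rho(
    observations: Sequence[PairObs],
    rho_grid: Sequence[float],
    pair_loglik: Callable[[str, float], float] = toy_pair_loglik,
) -> float:
    """Return the grid value of rho with the highest composite log-likelihood."""
    return max(
        rho_grid,
        key=lambda rho: composite_log_likelihood(rho, observations, pair_loglik),
    )
```

Calling `estimate_rho(observations, rho_grid)` with pooled pair counts and a list of candidate rates returns the grid value that best explains the counts under the supplied pairwise likelihood. A grid search is used here for simplicity; swapping in a different `pair_loglik` leaves the rest of the sketch unchanged.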

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • High-Throughput Nucleotide Sequencing / methods
  • Likelihood Functions
  • Metagenome* / genetics
  • Metagenomics* / methods
  • Recombination, Genetic / genetics
  • Sequence Analysis, DNA / methods
  • Software

Grants and funding

This work was supported by an Australian Government Research Training Program Scholarship. This research was supported by the Australian Government through the Australian Research Council Discovery Projects funding scheme under the project DP180101506, http://purl.org/au-research/grants/arc/DP180101506 (to AED). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.