Challenges in estimating effective population sizes from metagenome-assembled genomes

Front Microbiol. 2024 Jan 5:14:1331583. doi: 10.3389/fmicb.2023.1331583. eCollection 2023.

Abstract

Effective population size (Ne) determines the relative efficacy of natural selection versus genetic drift and is therefore a cornerstone for understanding microbial ecological dynamics. Direct Ne estimation relies on neutral genetic diversity within closely related genomes, but such genomes are often unavailable because the vast majority of prokaryotic lineages are difficult to culture. Metagenome-assembled genomes (MAGs) offer a high-throughput alternative for genomic data acquisition, yet their accuracy in Ne estimation has not been fully verified. This study examines the genus Thermococcus, comprising 66 isolated strains and 29 MAGs, to evaluate the reliability of MAGs in Ne estimation. Despite their even distribution across the Thermococcus phylogeny and an internal average nucleotide identity (ANI) comparable to that of isolate populations, MAG populations yield consistently lower Ne estimates. The same trend of underestimation is observed in various MAG populations across three other bacterial genera. The underrepresentation of genetic variation in MAGs, including the loss of allele frequency data and of variable genomic segments, likely contributes to the underestimation of Ne. Our findings underscore the need for caution when employing MAGs in evolutionary studies, which often depend on high-quality genome assemblies and nucleotide-level diversity.
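As a rough illustration of how Ne can be derived from neutral genetic diversity, the sketch below computes pairwise nucleotide diversity (pi) from a set of aligned sequences and converts it to Ne using the standard neutral expectation for haploid organisms, pi = 2 * Ne * mu. The abstract does not specify the estimator or mutation rate used in the study, so the function names, the toy alignment, and the value of mu are all assumptions made for illustration only. In practice, diversity is usually measured at putatively neutral sites such as synonymous positions.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise per-site difference among equal-length aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    for that pair. (Hypothetical helper, not taken from the study.)
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    total = 0.0
    pairs = 0
    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in "ACGT" and y in "ACGT":
                compared += 1
                diffs += x != y
        if compared:
            total += diffs / compared
            pairs += 1
    if pairs == 0:
        raise ValueError("no comparable sites between any pair of sequences")
    return total / pairs


def effective_population_size(pi, mu):
    """Ne for a haploid population under neutrality: pi = 2 * Ne * mu.

    mu is the per-site, per-generation mutation rate (an assumed input).
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (2 * mu)


# Toy alignment and an assumed mutation rate, for illustration only.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
assumed_mu = 1e-9
pi = nucleotide_diversity(toy_alignment)
ne = effective_population_size(pi, assumed_mu)
print(f"pi = {pi:.4f}, Ne = {ne:.3e}")
```

Because Ne scales linearly with pi in this relation, any variation that is lost during assembly or binning lowers pi and therefore lowers the Ne estimate by the same proportion, which is consistent with the direction of bias reported for MAG populations.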

Keywords: effective population size; genetic drift; metagenomics; microbial evolution; natural selection.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This study was supported by the China Postdoctoral Science Foundation, grants 2022M722220 (XW) and 2022M712195 (XF), and by the Guangdong Basic and Applied Basic Research Foundation, grant 2023A1515012162 (XF).