Concordance and discordance of sequence survey methods for molecular epidemiology

PeerJ. 2015 Feb 17;3:e761. doi: 10.7717/peerj.761. eCollection 2015.

Abstract

The post-genomic era is characterized by the direct acquisition and analysis of genomic data with many applications, including the enhancement of the understanding of microbial epidemiology and pathology. However, there are a number of molecular approaches to survey pathogen diversity, and the impact of these different approaches on parameter estimation and inference is not entirely clear. We sequenced whole genomes of bacterial pathogens, Burkholderia pseudomallei, Yersinia pestis, and Brucella spp. (60 new genomes), and combined them with 55 genomes from GenBank to address how different molecular survey approaches (whole genomes, SNPs, and MLST) impact downstream inferences on molecular evolutionary parameters, evolutionary relationships, and trait character associations. We selected isolates for sequencing to represent temporal, geographic, and host-range variability. We found that substitution rate estimates vary widely among approaches, and that SNP and genomic datasets yielded different but strongly supported phylogenies. MLST yielded poorly supported phylogenies, especially in our low diversity dataset, i.e., Y. pestis. Trait associations showed that B. pseudomallei and Y. pestis phylogenies are significantly associated with geography, irrespective of the molecular survey approach used, while Brucella spp. phylogeny appears to be strongly associated with geography and host origin. We contrast inferences made among monomorphic (clonal) and non-monomorphic bacteria, and between intra- and inter-specific datasets. We also discuss our results in light of underlying assumptions of different approaches.
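
As a hedged illustration of why the survey approach can matter for distance-based estimates, the toy sketch below compares the proportion of differing sites (p-distance) computed on a full alignment with the same quantity computed on SNP columns only. The sequences, the function names (variable_columns, snp_only, p_distance), and the use of a plain p-distance are all assumptions made for this example; they are not the data or the methods of the study. Dropping invariant sites is only one possible reason for divergent estimates, and it is not claimed here to be the authors' explanation.

```python
from typing import List


def variable_columns(alignment: List[str]) -> List[int]:
    """Return indices of alignment columns where at least two sequences differ."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    return [i for i in range(length) if len({seq[i] for seq in alignment}) > 1]


def snp_only(alignment: List[str], columns: List[int]) -> List[str]:
    """Keep only the given columns, mimicking a SNP-only dataset."""
    return ["".join(seq[i] for i in columns) for seq in alignment]


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical toy alignment (20 sites, 3 isolates); not real pathogen data.
toy_alignment = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGAACGTACGT",
    "ACGTTCGTACGAACGTACGT",
]

snp_cols = variable_columns(toy_alignment)
snp_alignment = snp_only(toy_alignment, snp_cols)

# Same pair of isolates, two data types, two different per-site distances.
full_d = p_distance(toy_alignment[0], toy_alignment[2])
snp_d = p_distance(snp_alignment[0], snp_alignment[2])
print(f"variable columns: {snp_cols}")
print(f"full-alignment p-distance: {full_d:.2f}")
print(f"SNP-only p-distance: {snp_d:.2f}")
```

In this toy case the per-site distance between the first and third sequences is ten times larger on the SNP-only data, simply because the invariant sites no longer count in the denominator. Any per-site quantity estimated from SNP-only data would need to account for the sites that were discarded.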

Keywords: Biological weapons; Bioterrorism; Data type; Genomes; High-throughput sequencing; MLST; Molecular epidemiology; Phylogenomics; Phylogeography; SNP.

Grants and funding

The Department of Homeland Security provided funding under grant no. HSHQDC-10-C-00177. Eduardo Castro-Nallar was funded by “CONICYT + PAI/ CONCURSO NACIONAL APOYO AL RETORNO DE INVESTIGADORES/AS DESDE EL EXTRANJERO, CONVOCATORIA 2014 + FOLIO 82140008”. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.