A Phylogenomic Approach Based on PCR Target Enrichment and High Throughput Sequencing: Resolving the Diversity within the South American Species of Bartsia L. (Orobanchaceae)

PLoS One. 2016 Feb 1;11(2):e0148203. doi: 10.1371/journal.pone.0148203. eCollection 2016.

Abstract

Advances in high-throughput sequencing (HTS) have allowed researchers to obtain large amounts of biological sequence information at speeds and costs unimaginable only a decade ago. Phylogenetics, and the study of evolution in general, is quickly migrating towards using HTS to generate larger and more complex molecular datasets. In this paper, we present a method that utilizes microfluidic PCR and HTS to generate large amounts of sequence data suitable for phylogenetic analyses. The approach uses the Fluidigm Access Array System (Fluidigm, San Francisco, CA, USA) and two sets of PCR primers to simultaneously amplify 48 target regions across 48 samples, incorporating sample-specific barcodes and HTS adapters (2,304 unique amplicons per Access Array). The final product is a pooled set of amplicons ready to be sequenced, and thus, there is no need to construct separate, costly genomic libraries for each sample. Further, we present a bioinformatics pipeline to process the raw HTS reads to either generate consensus sequences (with or without ambiguities) for every locus in every sample or--more importantly--recover the separate alleles from heterozygous target regions in each sample. This is important because it adds allelic information that is well suited for coalescent-based phylogenetic analyses that are becoming very common in conservation and evolutionary biology. To test our approach and bioinformatics pipeline, we sequenced 576 samples across 96 target regions belonging to the South American clade of the genus Bartsia L. in the plant family Orobanchaceae. After sequencing cleanup and alignment, the experiment resulted in ~25,300 bp across 486 samples for a set of 48 primer pairs targeting the plastome, and ~13,500 bp for 363 samples for a set of primers targeting regions in the nuclear genome. Finally, we constructed a combined concatenated matrix from all 96 primer combinations, resulting in a combined aligned length of ~40,500 bp for 349 samples.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Cell Nucleus / genetics
  • DNA Primers / metabolism
  • DNA, Chloroplast / genetics
  • Flowers / genetics
  • Genetic Variation*
  • Genome, Plant*
  • Geography
  • High-Throughput Nucleotide Sequencing / methods*
  • Microfluidics
  • Orobanchaceae / genetics*
  • Phylogeny*
  • Polymerase Chain Reaction / methods*
  • Reproducibility of Results
  • Species Specificity

Substances

  • DNA Primers
  • DNA, Chloroplast