Quality and quantity of data recovered from massively parallel sequencing: Examples in Asparagales and Poaceae

Am J Bot. 2012 Feb;99(2):330-48. doi: 10.3732/ajb.1100491. Epub 2012 Jan 30.

Abstract

Premise of the study: Genome survey sequences (GSS) from massively parallel sequencing have the potential to provide large, cost-effective data sets for phylogenetic inference, to replace single-gene or spacer regions as DNA barcodes, and to provide a plethora of data for other comparative molecular evolution studies. Here we report on the application of this method to estimating the molecular phylogeny of core Asparagales, investigating plastid gene losses, assembling complete plastid genomes, and determining the type and quality of assembled genomic data attainable from Illumina 80-120-bp reads.

Methods: We sequenced total genomic DNA from samples in two lineages of monocotyledonous plants, Poaceae and Asparagales, on the Illumina platform in a multiplex arrangement. We compared reference-based assemblies to de novo contigs, evaluated the consistency of assemblies produced with different reference sequences, and assessed how well our methods recover sequence assemblies in nonmodel taxa.
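
As an illustration of the kind of comparison described in the Methods, the following minimal Python sketch scores agreement between a reference-based assembly and a de novo contig. It is not the authors' pipeline: the function name percent_identity is hypothetical, and the sketch assumes the two sequences have already been aligned to equal length, with "-" for gaps and "N" for ambiguous bases, both of which are skipped.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences carry a called base.

    Assumption (not from the source): inputs are pre-aligned, equal-length
    strings using '-' for gaps and 'N' for ambiguous bases.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    skipped = {"-", "N"}  # columns with a gap or ambiguous base are not scored
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a in skipped or b in skipped:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy sequences invented for illustration only.
reference_based = "ATGGCNTTAGC-A"
de_novo_contig = "ATGGCATTCGCTA"
identity = percent_identity(reference_based, de_novo_contig)
print(f"{identity:.2f}% identity over comparable positions")
```

Skipping gapped and ambiguous columns means low-coverage regions do not count as disagreements; a stricter comparison could penalize them instead.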

Key results: Our method returned reliable, robust organellar and nrDNA sequences in a variety of plant lineages. High-quality assemblies are not dependent on genome size, the amount of plastid DNA present in the total genomic DNA template, or the relatedness of available reference sequences for assembly. Phylogenetic results revealed familial and subfamilial relationships within Asparagales with high bootstrap support, although the monotypic genus Aphyllanthes was placed with only moderate confidence.

Conclusions: The well-supported molecular phylogeny provides evidence for delineation of subfamilies within core Asparagales. With advances in technology and bioinformatics tools, the use of massively parallel sequencing will continue to become easier and more affordable for phylogenomic and molecular evolutionary biology investigations.

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Cell Nucleus / genetics
  • Computational Biology / methods
  • DNA, Plant / genetics
  • DNA, Ribosomal / genetics
  • Databases, Genetic
  • Evolution, Molecular
  • Genome Size
  • Genome, Chloroplast*
  • Genome, Mitochondrial
  • Liliaceae / classification
  • Liliaceae / genetics*
  • Mitochondria / genetics
  • Molecular Sequence Annotation
  • Phylogeny
  • Plastids / genetics
  • Poaceae / classification
  • Poaceae / genetics*
  • Reference Standards
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods

Substances

  • DNA, Plant
  • DNA, Ribosomal