Small Genomes and Big Data: Adaptation of Plastid Genomics to the High-Throughput Era

Biomolecules. 2019 Jul 24;9(8):299. doi: 10.3390/biom9080299.

Abstract

Plastid genome sequences are becoming more readily available with the increase in high-throughput sequencing, and whole-organelle genetic data are now available for algae and plants from across the diversity of photosynthetic eukaryotes. This has provided new opportunities for studying species that may not be amenable to in vivo study or genetic manipulation, or that have not yet been cultured. Research into plastid genomes has pushed the limits of what can be deduced from genomic information, and in particular from genomic information obtained from public databases. In this Review, we discuss how research into plastid genomes has benefitted enormously from the explosion of publicly available genome sequences. We describe two case studies in which publicly available gene data have supported, across algal and plant diversity, hypotheses about plastid traits that were previously based on lineage-restricted experiments. We propose how this approach could be used across disciplines to infer functional and biological characteristics from genomic data, including through the integration of new computational and bioinformatic approaches such as machine learning. We argue that the techniques developed to gain the maximum possible insight from plastid genomes can be applied across the eukaryotic tree of life.

Keywords: bioinformatics; biotechnology; next-generation sequencing; plastid biology.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Big Data
  • Computational Biology / methods*
  • Evolution, Molecular
  • Genome Size
  • Genome, Plastid
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing
  • Machine Learning
  • Phylogeny
  • Plants / classification
  • Plants / genetics*
  • Plastids / classification
  • Plastids / genetics*