A two-tier bioinformatic pipeline to develop probes for target capture of nuclear loci with applications in Melastomataceae

Appl Plant Sci. 2020 May 9;8(5):e11345. doi: 10.1002/aps3.11345. eCollection 2020 May.

Abstract

Premise: Putatively single-copy nuclear (SCN) loci, which are identified using genomic resources of closely related species, are ideal for phylogenomic inference. However, suitable genomic resources are not available for many clades, including Melastomataceae. We introduce a versatile approach to identify SCN loci for clades with few genomic resources and use it to develop probes for target enrichment in the distantly related Memecylon and Tibouchina (Melastomataceae).

Methods: We present a two-tiered pipeline. In the first tier, we identified putatively SCN loci using MarkerMiner and transcriptomes from distantly related species in Melastomataceae, then added published loci and genes of functional significance (384 total loci). In the second tier, we used HybPiper to retrieve 689 homologous template sequences for these loci from genome-skimming data from within the focal clades.
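
The bookkeeping behind the two tiers can be illustrated with a minimal Python sketch. It is not the authors' code and does not call MarkerMiner or HybPiper. It only assumes that tier one yields lists of locus names from three sources and that tier two yields (locus, sample, sequence) records parsed from assembler output. The function names `tier_one` and `tier_two`, the locus names, and the sample names are all hypothetical.

```python
from collections import defaultdict


def tier_one(markerminer_loci, published_loci, functional_loci):
    """Merge candidate loci from the three sources, dropping duplicates
    while preserving first-seen order."""
    merged = []
    seen = set()
    for source in (markerminer_loci, published_loci, functional_loci):
        for locus in source:
            if locus not in seen:
                seen.add(locus)
                merged.append(locus)
    return merged


def tier_two(target_loci, recovered):
    """Group recovered template sequences by target locus.

    `recovered` is an iterable of (locus, sample, sequence) tuples; this
    record layout is an assumption made for the sketch.
    """
    targets = set(target_loci)
    templates = defaultdict(list)
    for locus, sample, seq in recovered:
        # Keep only non-empty sequences that belong to a tier-one locus.
        if locus in targets and seq:
            templates[locus].append((sample, seq))
    return dict(templates)


# Toy, made-up inputs (not data from the study).
loci = tier_one(["locusA", "locusB"], ["locusB", "locusC"], ["locusD"])
recovered = [
    ("locusA", "memecylon_skim", "ATGC"),
    ("locusA", "tibouchina_skim", "ATGG"),
    ("locusC", "tibouchina_skim", "TTGA"),
    ("locusX", "tibouchina_skim", "CCCC"),  # not a target locus, ignored
    ("locusD", "memecylon_skim", ""),       # nothing recovered, ignored
]
templates = tier_two(loci, recovered)
n_templates = sum(len(seqs) for seqs in templates.values())
```

In this layout a single locus can carry several template sequences, one per genome-skimming sample, which is how the number of templates can exceed the number of loci.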

Results: We sequenced 193 loci common to Memecylon and Tibouchina. Probes designed from 56 template sequences successfully targeted sequences in both clades. Probes designed from genome-skimming data within a focal clade were more successful than probes designed from other sources.
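
The two summaries reported here, loci common to both clades and probe success by template source, amount to a set intersection and a grouped proportion. The sketch below shows one way to compute them, assuming per-clade lists of sequenced loci and per-probe (source, succeeded) records. The function names, labels, and values are hypothetical and are not data from the study.

```python
def shared_loci(loci_by_clade):
    """Return the loci sequenced in every clade (set intersection)."""
    sets = [set(loci) for loci in loci_by_clade.values()]
    return set.intersection(*sets) if sets else set()


def success_by_source(probe_results):
    """Return the fraction of successful probes per template source.

    `probe_results` is an iterable of (template_source, succeeded) pairs;
    this record layout is an assumption made for the sketch.
    """
    totals = {}
    for source, ok in probe_results:
        hits, n = totals.get(source, (0, 0))
        totals[source] = (hits + int(ok), n + 1)
    return {source: hits / n for source, (hits, n) in totals.items()}


# Toy, made-up inputs for illustration only.
common = shared_loci({
    "Memecylon": ["l1", "l2", "l3"],
    "Tibouchina": ["l2", "l3", "l4"],
})
rates = success_by_source([
    ("focal_skim", True),
    ("focal_skim", True),
    ("transcriptome", True),
    ("transcriptome", False),
])
```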

Discussion: Our pipeline successfully identified and targeted SCN loci in Memecylon and Tibouchina, enabling phylogenomic studies in both clades and potentially across Melastomataceae. This pipeline could be easily applied to other clades with few genomic resources.

Keywords: HybPiper; MarkerMiner; Memecylon; Tibouchina; phylogenomics; target capture.

Associated data

  • Dryad/10.5061/dryad.8931zcrm2