Development and evaluation of a custom bait design based on 469 single-copy protein-coding genes for exon capture of isopods (Philosciidae: Haloniscus)

PLoS One. 2021 Sep 17;16(9):e0256861. doi: 10.1371/journal.pone.0256861. eCollection 2021.

Abstract

Transcriptome-based exon capture approaches, combined with next-generation sequencing, allow the rapid and cost-effective production of extensive and informative phylogenomic datasets from non-model organisms for phylogenetics and population genetics research. These approaches generally employ a reference genome to infer the intron-exon structure of targeted loci and preferentially select longer exons. In the absence of an existing, well-annotated genome, we applied this exon capture method directly, without first identifying intron-exon boundaries for bait design, to a group of highly diverse Haloniscus (Philosciidae), paraplatyarthrid and armadillid isopods, and examined the performance of our methods and bait design for phylogenetic inference. Here, we identified an isopod-specific set of single-copy protein-coding loci, developed a custom bait design to capture targeted regions from 469 genes, and analysed the resulting sequence data with a mapping approach and newly created post-processing scripts. We recovered a large and informative dataset comprising both short (<100 bp) and longer (>300 bp) exons, with high uniformity in sequencing depth. We also captured exon data from museum specimens up to 16 years old and from more distantly related outgroup taxa, and efficiently pooled multiple samples prior to capture. Our well-resolved phylogenies highlight the overall utility of this methodological approach and custom bait design, which offer enormous potential for application to future isopod, as well as broader crustacean, molecular studies.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Arthropod Proteins / classification
  • Arthropod Proteins / genetics*
  • Arthropod Proteins / metabolism
  • Datasets as Topic
  • Exons*
  • Gene Expression
  • Genetic Loci
  • Genetics, Population
  • Genome*
  • High-Throughput Nucleotide Sequencing
  • Introns
  • Isopoda / classification
  • Isopoda / genetics*
  • Open Reading Frames*
  • Phylogeny

Substances

  • Arthropod Proteins

Grants and funding

This project was funded by the Australian Research Council (LP0669062 and LP140100555) to ADA and SJBC with the following industry partners: the Department for Environment and Water (SA), the South Australian Museum, BHP Billiton, Nature Foundation (SA), Biota Environmental Sciences, the Western Australian Museum, Department of Biodiversity, Conservation and Attractions (WA), and Bennelongia. Additional funding was provided by an Australian Biological Resources Study Capacity Building grant to DNS (CT214-11), the Nature Conservancy, kindly supported by The Thomas Foundation, to DNS, and Bioplatforms Australia (Research Contract to ADA and SJBC) for sequencing costs. DNS acknowledges the support of an Australian Government Research Training Program Scholarship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.