REFMAKER: Make your own reference to target nuclear loci in low coverage genome skimming libraries. Phylogenomic application in Sapotaceae

Mol Phylogenet Evol. 2023 Sep:186:107826. doi: 10.1016/j.ympev.2023.107826. Epub 2023 May 29.

Abstract

The genome skimming approach is widely used in plant systematics to infer phylogenies, mostly from organelle genomes. However, organelle sequences represent only 10 % of the produced libraries, and the low coverage associated with these libraries (<3X) prevents the capture of nuclear sequences, which are not always available in non-model organisms or are limited to the ribosomal regions. We developed REFMAKER, a user-friendly pipeline, to create specific sets of nuclear loci that can be extracted directly from genome skimming libraries. To do so, a catalogue is built from the meta-assembly of the contigs from each library, and cleaned by selecting the nuclear regions and removing duplicates identified by clustering steps. Libraries are then mapped onto this catalogue, and consensus sequences are generated to produce a ready-to-use phylogenetic matrix, following different filtering parameters that aim to remove putative errors and paralogous sequences. REFMAKER allowed us to infer a well-resolved phylogeny in Capurodendron (Sapotaceae) based on 67 nuclear loci from low-coverage libraries (<1X). The resulting phylogeny is congruent with one previously inferred from 638 nuclear genes obtained from target enrichment libraries. Although this result remains preliminary because of the low sequencing depth, REFMAKER opens perspectives in phylogenomics by allowing nuclear phylogeny reconstructions from genome skimming datasets.
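The consensus and filtering stage described above can be illustrated with a minimal sketch. This is not REFMAKER's code: the function names (`call_consensus`, `keep_locus`), the input layout (per-position base counts taken from reads mapped onto one catalogue locus), and all threshold values are hypothetical and chosen only to show how depth, allele-frequency, and missing-data filters might combine.

```python
def call_consensus(pileup, min_depth=4, min_freq=0.7):
    """Call a consensus sequence from per-position base counts.

    pileup: list of dicts mapping base -> read count, one dict per position.
    Positions with depth below min_depth, or whose major allele frequency
    is below min_freq (a possible sign of errors or paralogy), become 'N'.
    Thresholds are illustrative assumptions, not REFMAKER defaults.
    """
    consensus = []
    for counts in pileup:
        depth = sum(counts.values())
        if depth < min_depth:
            consensus.append("N")
            continue
        base, n = max(counts.items(), key=lambda kv: kv[1])
        consensus.append(base if n / depth >= min_freq else "N")
    return "".join(consensus)


def keep_locus(sequences, max_missing=0.5, min_samples=2):
    """Filter one locus of the matrix.

    sequences: dict mapping sample name -> consensus string for this locus.
    Samples with more than max_missing masked positions are dropped; the
    locus is discarded (empty dict) if fewer than min_samples remain.
    """
    retained = {
        sample: seq
        for sample, seq in sequences.items()
        if seq and seq.count("N") / len(seq) <= max_missing
    }
    return retained if len(retained) >= min_samples else {}


if __name__ == "__main__":
    # Hypothetical pileup for a four-position locus in one library.
    pileup = [{"A": 5}, {"A": 3, "C": 3}, {"G": 2}, {"T": 9, "C": 1}]
    print(call_consensus(pileup))

    # Hypothetical consensus sequences for three libraries at one locus.
    locus = {"s1": "ACGT", "s2": "NNNT", "s3": "ACGN"}
    print(keep_locus(locus))
```

Masking ambiguous positions rather than discarding whole reads keeps as much signal as possible at low coverage, at the cost of more missing data in the final matrix.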

Keywords: Bioinformatics; Nuclear phylogeny; Sapotaceae; Shotgun sequencing; Systematics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Nucleus / genetics
  • Phylogeny
  • Sapotaceae*