De novo assembly of a chromosome-level reference genome for the California Scrub-Jay, Aphelocoma californica

J Hered. 2023 Nov 15;114(6):669-680. doi: 10.1093/jhered/esad047.

Abstract

We announce the assembly of the first de novo reference genome for the California Scrub-Jay (Aphelocoma californica). The genus Aphelocoma comprises four currently recognized species including many locally adapted populations across Mesoamerica and North America. Intensive study of Aphelocoma has revealed novel insights into the evolutionary mechanisms driving diversification in natural systems. Additional insights into the evolutionary history of this group will require continued development of high-quality, publicly available genomic resources. We extracted high molecular weight genomic DNA from a female California Scrub-Jay from northern California and generated PacBio HiFi long-read data and Omni-C chromatin conformation capture data. We used these data to generate a de novo partially phased diploid genome assembly, consisting of two pseudo-haplotypes, and scaffolded them using inferred physical proximity information from the Omni-C data. The more complete pseudo-haplotype assembly (arbitrarily designated "Haplotype 1") is 1.35 Gb in total length, highly contiguous (contig N50 = 11.53 Mb), and highly complete (BUSCO completeness score = 97%), with comparable scaffold sizes to chromosome-level avian reference genomes (scaffold N50 = 66.14 Mb). Our California Scrub-Jay assembly is highly syntenic with the New Caledonian Crow reference genome despite ~10 million years of divergence, highlighting the temporal stability of the avian genome. This high-quality reference genome represents a leap forward in publicly available genomic resources for Aphelocoma, and the family Corvidae more broadly. Future work using Aphelocoma as a model for understanding the evolutionary forces generating and maintaining biodiversity across phylogenetic scales can now benefit from a highly contiguous, in-group reference genome.
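The contiguity figures reported above (contig N50 = 11.53 Mb, scaffold N50 = 66.14 Mb) use the standard N50 statistic: the length L such that sequences of length L or longer together contain at least half of the total assembled bases. The following minimal Python sketch illustrates that definition only. It is not the authors' pipeline, the function name `n50` is hypothetical, and the toy lengths are invented for illustration rather than taken from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2             # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (arbitrary units), not data from the paper.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))  # total = 20, half = 10; 6 + 5 = 11 >= 10, so N50 = 5
```

Applied to contig lengths this gives the contig N50, and applied to scaffold lengths it gives the scaffold N50, which is why the two values can differ so widely for the same assembly.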

Keywords: CCGP; California Conservation Genomics Project; N50; Omni-C; PacBio; long-read sequencing.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • California
  • Chromosomes
  • Female
  • Genome*
  • Passeriformes*
  • Phylogeny