New T2T assembly of Cryptosporidium parvum IOWA annotated with reference genome gene identifiers

bioRxiv [Preprint]. 2023 Jun 13:2023.06.13.544219. doi: 10.1101/2023.06.13.544219.

Abstract

Cryptosporidium parvum is a significant pathogen that causes gastrointestinal infections in humans and animals and is spread through the ingestion of contaminated food and water. Despite its global impact on public health, generating a C. parvum genome sequence has always been challenging due to a lack of in vitro cultivation systems and complex sub-telomeric gene families. A gapless telomere-to-telomere genome assembly has been created for Cryptosporidium parvum IOWA obtained from Bunch Grass Farms, named here CpBGF. It comprises 8 chromosomes totaling 9,259,183 bp. The new hybrid assembly, generated from Illumina and Oxford Nanopore sequencing data, resolves complex sub-telomeric regions of chromosomes 1, 7 and 8. To facilitate ease of use and consistency with the literature, chromosomes have been oriented to match the current reference genome sequence generated in 2004 and, whenever possible, genes in this annotation have been given the same gene IDs used in that reference. The annotation of this assembly drew on considerable RNA expression evidence; as a result, untranslated regions, long noncoding RNAs and antisense RNAs are annotated. The CpBGF genome assembly serves as a valuable resource for understanding the biology, pathogenesis, and transmission of C. parvum, and it facilitates the development of diagnostics, drugs, and vaccines against cryptosporidiosis.

Publication types

  • Preprint