A standard workflow for community-driven manual curation of Strongyloides genome annotations

Philos Trans R Soc Lond B Biol Sci. 2024 Jan 15;379(1894):20220443. doi: 10.1098/rstb.2022.0443. Epub 2023 Nov 27.

Abstract

Advances in the functional genomics and bioinformatics toolkits for Strongyloides species have positioned these species as genetically tractable model systems for gastrointestinal parasitic nematodes. As community interest in mechanistic studies of Strongyloides species continues to grow, publicly accessible reference genomes and associated genome annotations are critical resources for researchers. Genome annotations for multiple Strongyloides species are broadly available via the WormBase and WormBase ParaSite online repositories. However, a recent phylogenetic analysis of the receptor-type guanylate cyclase (rGC) gene family in two Strongyloides species highlights the potential for errors in a large percentage of current Strongyloides gene models. Here, we present three examples of gene annotation updates within the Strongyloides rGC gene family; each example illustrates a type of error that may occur frequently within the annotation data for Strongyloides genomes. We also extend our analysis to 405 previously curated Strongyloides genes to confirm that gene model errors are found at high rates across gene families. Finally, we introduce a standard manual curation workflow for assessing gene annotation quality and generating corrections, and we discuss how it may be used to facilitate community-driven curation of parasitic nematode biodata. This article is part of the Theo Murphy meeting issue 'Strongyloides: omics to worm-free populations'.

Keywords: Strongyloides; WormBase; community curation; comparative genomics; genome annotation; nematodes.

MeSH terms

  • Animals
  • Databases, Genetic*
  • Genome
  • Molecular Sequence Annotation
  • Phylogeny
  • Strongyloides* / genetics
  • Workflow