Crowdsourcing and the feasibility of manual gene annotation: A pilot study in the nematode Pristionchus pacificus

Sci Rep. 2019 Dec 11;9(1):18789. doi: 10.1038/s41598-019-55359-5.

Abstract

Nematodes such as Caenorhabditis elegans are powerful systems for studying nearly all aspects of biology. Their species richness, together with the extensive genetic knowledge from C. elegans, facilitates the evolutionary study of biological functions using reverse genetics. However, the ability to identify orthologs of candidate genes in other species can be hampered by erroneous gene annotations. To improve gene annotation in the nematode model organism Pristionchus pacificus, we performed a genome-wide screen for C. elegans genes with potentially incorrectly annotated P. pacificus orthologs. We initiated a community-based project to manually inspect more than two thousand candidate loci and to propose new gene models based on recently generated Iso-seq and RNA-seq data. In most cases, misannotation of C. elegans orthologs was due to artificially fused gene predictions and completely missing gene models. The community-based curation raised the gene count from 25,517 to 28,036 and increased the single-copy ortholog completeness level from 86% to 97%. This pilot study demonstrates how even small-scale crowdsourcing can drastically improve gene annotations. In the future, similar approaches can be used for other species, gene sets, and even larger communities, thus making manual annotation of large parts of the genome feasible.
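
As a quick arithmetic check on the figures reported in the abstract, the following Python sketch computes the net change in gene count and the change in completeness. The function name `annotation_change` is hypothetical and is not part of the study. The net gain is only a difference of totals, so it does not by itself say how many gene models were split, added, or removed.

```python
def annotation_change(genes_before, genes_after, complete_before, complete_after):
    """Summarize the change between two annotation versions.

    Hypothetical helper for illustration only; not the study's pipeline.
    Completeness values are percentages, so their difference is given
    in percentage points.
    """
    net_gain = genes_after - genes_before
    relative_gain_pct = 100.0 * net_gain / genes_before
    completeness_gain_points = complete_after - complete_before
    return net_gain, relative_gain_pct, completeness_gain_points


# Figures taken from the abstract: 25,517 -> 28,036 genes, 86% -> 97% completeness.
net, rel, points = annotation_change(25_517, 28_036, 86, 97)
print(f"net gain: {net} gene models ({rel:.1f}% increase)")
print(f"completeness: +{points} percentage points")
```

With the abstract's numbers this gives a net gain of 2,519 gene models (roughly a 9.9% increase) and a completeness gain of 11 percentage points.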

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Caenorhabditis elegans / genetics*
  • Catalogs as Topic
  • Feasibility Studies
  • Genes, Helminth*
  • Genes, Synthetic
  • Genome, Helminth
  • Molecular Sequence Annotation* / standards
  • Pilot Projects
  • Rhabditida / genetics*
  • Species Specificity
  • Transcriptome