Improved strategy for the curation and classification of kinases, with broad applicability to other eukaryotic protein groups

Sci Rep. 2018 May 1;8(1):6808. doi: 10.1038/s41598-018-25020-8.

Abstract

Despite the substantial amount of genomic and transcriptomic data available for a wide range of eukaryotic organisms, most genomes are still in a draft state and can have inaccurate gene predictions. To gain a sound understanding of the biology of an organism, it is crucial that inferred protein sequences are accurately identified and annotated. However, this can be challenging to achieve, particularly for organisms such as parasitic worms (helminths), as most gene prediction approaches do not account for substantial phylogenetic divergence from model organisms with well-curated genomes, such as Caenorhabditis elegans and Drosophila melanogaster. In this paper, we describe a bioinformatic strategy for the curation of gene families and subsequent annotation of encoded proteins. This strategy relies on pairwise gene curation between at least two closely related species using genomic and transcriptomic data sets, and is built on recent work on kinase complements of parasitic worms. Here, we discuss salient technical aspects of this strategy and its implications for the curation of protein families more generally.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Caenorhabditis elegans / classification
  • Caenorhabditis elegans / enzymology
  • Caenorhabditis elegans / genetics
  • Computational Biology / methods
  • Data Curation / methods
  • Databases, Genetic
  • Female
  • Gene Ontology
  • Genome, Helminth*
  • Haemonchus / classification
  • Haemonchus / enzymology
  • Haemonchus / genetics*
  • Helminth Proteins / classification
  • Helminth Proteins / genetics*
  • Helminth Proteins / metabolism
  • Molecular Sequence Annotation / methods
  • Phylogeny
  • Protein Kinases / classification
  • Protein Kinases / genetics*
  • Protein Kinases / metabolism
  • Schistosoma / classification
  • Schistosoma / enzymology
  • Schistosoma / genetics*
  • Transcriptome
  • Trichinella / classification
  • Trichinella / enzymology
  • Trichinella / genetics*
  • Trichuris / classification
  • Trichuris / enzymology
  • Trichuris / genetics*

Substances

  • Helminth Proteins
  • Protein Kinases