Challenges in the annotation of pseudoenzymes in databases: the UniProtKB approach

FEBS J. 2020 Oct;287(19):4114-4127. doi: 10.1111/febs.15100. Epub 2019 Nov 3.

Abstract

The universal protein knowledgebase (UniProtKB) collects and centralises functional information on proteins across a wide range of species. In addition to the functional information added to all protein entries, for enzymes, which represent 20-40% of most proteomes, UniProtKB provides additional information about Enzyme Commission classification, catalytic activity, cofactors, enzyme regulation, kinetics and pathways, all based on critical assessment of published experimental data. Computer-based analysis and structural data are used to enrich the annotation of the sequence through the identification of active sites and binding sites. While the annotation of enzymes is well defined, the curation of pseudoenzymes in UniProtKB has highlighted some challenges: how to identify them, how to assess their lack of catalytic activity, how to annotate that lack of activity in a consistent way and how much can be inferred and propagated from experimental data obtained from other species. Through various examples, we illustrate some of these issues and discuss some of the changes we propose to enhance the annotation and discovery of pseudoenzymes. Ultimately, improving the curation of pseudoenzymes will provide the scientific community with a comprehensive resource on these proteins, which in turn will lead to a better understanding of their evolution, the aetiology of related diseases and the development of drugs.

Keywords: UniProtKB; curation; protein database; pseudoenzyme.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Databases, Protein*
  • Enzymes* / chemistry
  • Humans
  • Knowledge Bases*

Substances

  • Enzymes