Handling gene and protein names in the age of bioinformatics: the special challenge of secreted multimodular bacterial enzymes such as the cbhA/cbh9A gene of Clostridium thermocellum

World J Microbiol Biotechnol. 2018 Feb 26;34(3):42. doi: 10.1007/s11274-018-2424-9.

Abstract

A growing number of researchers in biology, biochemistry, biotechnology, bioengineering, bioinformatics and related fields work with biological molecules. Because the scientific backgrounds of these communities are more diverse than ever, the number of scientists unfamiliar with the rules for unambiguous designation of genetic elements is increasing. Yet as biological molecules gain importance through biotechnology, their functional and unambiguous designation is vital. Unfortunately, naming genes and proteins is not an easy task. In addition, the traditional concepts of bioinformatics are challenged by the appearance of proteins comprising different modules, each with its own function. This article highlights basic rules and novel solutions for designation recently used within the community of bacterial geneticists, and discusses the present-day handling of gene and protein designations. As an example, we use a recent mischaracterization of gene nomenclature. We make suggestions for better handling of names in future literature as well as in databases and annotation projects. Emphasis is placed on genes encoding hydrolytic, multimodular, extracellular proteins from bacteria.

Keywords: Database handling; Gene annotation; Gene naming; Gene sequencing; Multimodular protein; Nomenclature; Non-catalytic modules; Record tracking.

Publication types

  • Review

MeSH terms

  • Clostridium thermocellum / enzymology*
  • Clostridium thermocellum / genetics*
  • Computational Biology / methods*
  • Databases, Genetic
  • Databases, Protein
  • Genome, Bacterial
  • Genomics / methods
  • Glucosidases / genetics
  • Information Storage and Retrieval / methods
  • Molecular Sequence Annotation
  • Proteins / genetics*

Substances

  • Proteins
  • Glucosidases