The GH19 Engineering Database: Sequence diversity, substrate scope, and evolution in glycoside hydrolase family 19

PLoS One. 2021 Oct 26;16(10):e0256817. doi: 10.1371/journal.pone.0256817. eCollection 2021.

Abstract

Glycoside hydrolase family 19 (GH19) is a bifunctional family of chitinases and endolysins, which have been studied for the control of plant fungal pests, the recycling of chitin biomass, and the treatment of multi-drug resistant bacteria. The GH19 domain-containing sequences (22,461) were divided into a chitinase and an endolysin subfamily by analyzing sequence networks, guided by taxonomy and the substrate specificity of characterized enzymes. The chitinase subfamily was split into seventeen groups, thus extending the previous classification. The endolysin subfamily is more diverse and consists of thirty-four groups. Despite their sequence diversity, twenty-six residues are conserved in chitinases and endolysins, which can be distinguished by two specific sequence patterns at six and four positions, respectively. The location of these positions outside the catalytic cleft suggests a possible mechanism for substrate specificity that goes beyond the direct interaction with the substrate. The evolution of the GH19 catalytic domain was investigated by large-scale phylogeny. The inferred evolutionary history and putative horizontal gene transfer events differ from those reported in previous work. While no clear patterns were detected in endolysins, chitinases varied in sequence length by up to four loop insertions, giving rise to at least eight distinct presence/absence loop combinations. The annotated GH19 sequences and structures are accessible via the GH19 Engineering Database (GH19ED, https://gh19ed.biocatnet.de). The GH19ED has been developed to support the prediction of substrate specificity and the search for novel GH19 enzymes from neglected taxonomic groups or in regions of the sequence space where few sequences have been described so far.
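As a rough illustration of the loop arithmetic in the abstract: if each of the four loop insertions can be independently present or absent, there are at most 2^4 = 16 presence/absence combinations, of which the abstract reports at least eight as observed. The sketch below only enumerates that theoretical space. The loop labels are placeholders, since the abstract does not name the loops, and nothing here reproduces the authors' analysis.

```python
from itertools import product

# Placeholder labels (assumption): the abstract does not name the four loops.
LOOPS = ("loop_1", "loop_2", "loop_3", "loop_4")


def loop_combinations(loops=LOOPS):
    """Return every presence/absence pattern as a dict mapping loop -> bool."""
    return [
        dict(zip(loops, flags))
        for flags in product((False, True), repeat=len(loops))
    ]


def pattern_string(combo):
    """Encode one combination as a string of '+' (present) and '-' (absent)."""
    return "".join("+" if present else "-" for present in combo.values())


combos = loop_combinations()
print(len(combos))  # 16 theoretical combinations for four loops
print([pattern_string(c) for c in combos][:4])
```

Comparing the set of patterns observed in a sequence collection against this list would show which of the sixteen theoretical combinations are actually populated.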

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism
  • Catalytic Domain
  • Chitinases / chemistry
  • Chitinases / genetics*
  • Chitinases / metabolism
  • Databases, Protein
  • Endopeptidases / chemistry
  • Endopeptidases / genetics*
  • Endopeptidases / metabolism
  • Evolution, Molecular
  • Fungi / chemistry
  • Fungi / genetics
  • Fungi / metabolism
  • Humans
  • Models, Molecular
  • Phylogeny
  • Plant Proteins / chemistry
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Protein Conformation
  • Substrate Specificity

Substances

  • Bacterial Proteins
  • Plant Proteins
  • Chitinases
  • Endopeptidases
  • endolysin

Grants and funding

MO acknowledges a PhD fellowship from the University of Milano-Bicocca; PCFB acknowledges funding from the Bundesministerium für Bildung und Forschung (grant 031B0571A); JP acknowledges funding from the Deutsche Forschungsgemeinschaft (grant EXC2075). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.