Structural motif screening reveals a novel, conserved carbohydrate-binding surface in the pathogenesis-related protein PR-5d

BMC Struct Biol. 2010 Aug 3;10:23. doi: 10.1186/1472-6807-10-23.

Abstract

Background: Aromatic amino acids play a critical role in protein-glycan interactions. Clusters of surface aromatic residues and their features may therefore be useful in distinguishing glycan-binding sites as well as predicting novel glycan-binding proteins. In this work, a structural bioinformatics approach was used to screen the Protein Data Bank (PDB) for coplanar aromatic motifs similar to those found in known glycan-binding proteins.
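As an illustration of what a coplanarity test between two aromatic side chains could look like, the sketch below compares the planes of two six-membered rings. It is not the authors' screening method: the function names, the Newell-style normal estimate, and both thresholds (`max_angle_deg`, `max_offset`) are assumptions chosen for the example, and the coordinates are idealized hexagons rather than PDB atoms.

```python
import math


def hexagon(cx, cy, cz, r=1.4):
    """Idealized planar six-membered ring in the xy-plane (hypothetical test data)."""
    return [(cx + r * math.cos(k * math.pi / 3),
             cy + r * math.sin(k * math.pi / 3),
             cz) for k in range(6)]


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def ring_normal(points):
    """Unit normal of a roughly planar ring, estimated with Newell's method."""
    nx = ny = nz = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(points, points[1:] + points[:1]):
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)


def are_coplanar(ring_a, ring_b, max_angle_deg=30.0, max_offset=1.5):
    """True if the two ring planes are near-parallel AND ring_b lies near ring_a's plane.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    na, nb = ring_normal(ring_a), ring_normal(ring_b)
    # Normals may point either way, so compare the absolute cosine.
    cos_angle = abs(sum(a * b for a, b in zip(na, nb)))
    angle = math.degrees(math.acos(min(1.0, cos_angle)))
    ca, cb = centroid(ring_a), centroid(ring_b)
    # Distance of ring_b's centroid from the plane through ring_a.
    offset = abs(sum(n * (b - a) for n, a, b in zip(na, ca, cb)))
    return angle <= max_angle_deg and offset <= max_offset
```

Checking the out-of-plane offset as well as the inter-normal angle distinguishes side-by-side rings that form a flat surface from rings that are parallel but stacked face to face.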

Results: The proteins identified in the screen were significantly associated with carbohydrate-related functions according to gene ontology (GO) enrichment analysis, and predicted motifs were frequently found within novel folds and glycan-binding sites not included in the training set. In addition to numerous binding sites predicted in structural genomics proteins of unknown function, one novel prediction was a surface motif (W34/W36/W192) in the tobacco pathogenesis-related protein PR-5d. Phylogenetic analysis revealed that the surface motif is exclusive to a subfamily of PR-5 proteins from the Solanaceae family of plants and is completely absent in more distant homologs. To confirm PR-5d's insoluble-polysaccharide binding activity, a cellulose-pulldown assay of tobacco proteins was performed, and PR-5d was identified in the cellulose-binding fraction by mass spectrometry.
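The abstract does not state which statistical test underlies the GO enrichment analysis. One common choice is a one-sided hypergeometric test, sketched below under that assumption; the function name and its parameters are hypothetical, and no values from the study are used.

```python
from math import comb


def enrichment_p_value(population, annotated, hits, annotated_hits):
    """One-sided hypergeometric p-value, P(X >= annotated_hits).

    population     -- total proteins considered
    annotated      -- proteins in the population carrying the GO term
    hits           -- proteins returned by the screen
    annotated_hits -- screen hits that carry the GO term
    """
    upper = min(hits, annotated)
    # Count the ways to draw i annotated and (hits - i) unannotated proteins.
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, hits - i)
        for i in range(annotated_hits, upper + 1)
    )
    return favorable / comb(population, hits)
```

A small p-value indicates that the screen hits carry the term more often than expected from random sampling of the population. In practice the p-values for many GO terms would also need a multiple-testing correction.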

Conclusions: Based on the combined results, we propose that the putative binding site in PR-5d may be an evolutionary adaptation of Solanaceae plants, including potato, tomato, and tobacco, toward defense against cellulose-containing pathogens such as species of the deadly oomycete genus Phytophthora. More generally, the results demonstrate that coplanar aromatic clusters on protein surfaces are a structural signature of glycan-binding proteins and can be used to computationally predict novel glycan-binding proteins from 3D structure.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Amino Acids, Aromatic
  • Carbohydrate Metabolism*
  • Cellulose / chemistry
  • Cellulose / metabolism
  • Computational Biology / methods*
  • Conserved Sequence*
  • Databases, Protein
  • Discriminant Analysis
  • Models, Molecular
  • Molecular Sequence Data
  • Mutation
  • Nicotiana / classification
  • Phylogeny
  • Plant Proteins / chemistry*
  • Plant Proteins / genetics
  • Plant Proteins / metabolism*
  • Protein Binding
  • Protein Conformation
  • Solubility
  • Species Specificity

Substances

  • Amino Acids, Aromatic
  • Plant Proteins
  • pathogenesis-related proteins, plant
  • Cellulose