Toward a robust computational screening strategy for identifying glycosaminoglycan sequences that display high specificity for target proteins

Glycobiology. 2014 Dec;24(12):1323-33. doi: 10.1093/glycob/cwu077. Epub 2014 Jul 21.

Abstract

Glycosaminoglycans (GAGs) interact with many proteins to regulate processes such as hemostasis, cell adhesion, growth and differentiation, and viral infection. Yet the majority of these interactions remain poorly understood at the molecular level. A major reason is the enormous structural diversity of GAGs, which has precluded analysis of the specificity of their interactions. We previously presented a computational protocol for predicting "high-specificity" GAG sequences based on combinatorial virtual library screening (CVLS) technology. In this work, we improve the robustness of this technology through rigorous studies of the parameters affecting GAG recognition of proteins, especially antithrombin and thrombin. The CVLS approach involves automated construction of a virtual library of all possible oligosaccharide sequences (di- to octasaccharide), followed by a two-step selection strategy consisting of "affinity" (GOLD score) and "specificity" (consistency of binding) filters. We find that "specificity" features are optimally evaluated using 100 genetic algorithm experiments, 100,000 evolutions and a variable docking radius ranging from 10 Å (disaccharide) to 14 Å (hexasaccharide). The results highlight critical interactions in heparin/heparan sulfate (H/HS) oligosaccharides that govern specificity. Application of CVLS technology to the antithrombin-heparin system indicates that the minimal "specificity" element is the GlcAp(1 → 4)GlcNp2S3S disaccharide of heparin. The CVLS technology affords a simple, intuitive framework for the design of longer GAG sequences that can exhibit high "specificity" without resorting to exhaustive screening of millions of theoretical sequences.
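
The library construction and two-step selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the building-block labels, the function names (enumerate_library, two_step_filter), the cutoffs, and the convention that a lower consistency value means more consistent binding are all assumptions made for illustration. The scoring and consistency functions are supplied by the caller and stand in for a docking program; no real docking API is called here.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

# Hypothetical building-block labels chosen for illustration only;
# they are NOT the monosaccharide set used in the paper.
URONIC = ["GlcA", "IdoA", "IdoA2S"]
GLUCOSAMINE = ["GlcNAc", "GlcNS", "GlcNS6S", "GlcNS3S"]

Sequence_ = Tuple[str, ...]


def enumerate_library(n_disaccharides: int) -> List[Sequence_]:
    """Enumerate every sequence built from n uronic acid-glucosamine units."""
    units = [f"{u}-{g}" for u, g in product(URONIC, GLUCOSAMINE)]
    return list(product(units, repeat=n_disaccharides))


def two_step_filter(
    library: Iterable[Sequence_],
    score_fn: Callable[[Sequence_], float],
    consistency_fn: Callable[[Sequence_], float],
    score_cutoff: float,
    consistency_cutoff: float,
) -> List[Sequence_]:
    """Apply an 'affinity' filter, then a 'specificity' filter.

    score_fn: higher is better (stand-in for a docking score).
    consistency_fn: lower is better (assumed measure of how much
    repeated docking runs disagree on the binding pose).
    """
    affinity_hits = [s for s in library if score_fn(s) >= score_cutoff]
    return [s for s in affinity_hits if consistency_fn(s) <= consistency_cutoff]


if __name__ == "__main__":
    # Toy stand-ins for docking results; real values would come from docking.
    def toy_score(seq: Sequence_) -> float:
        return float("".join(seq).count("S"))

    def toy_consistency(seq: Sequence_) -> float:
        return 0.0 if "GlcA" in seq[0] else 5.0

    library = enumerate_library(1)
    print(len(library), "disaccharides in the toy library")
    print(two_step_filter(library, toy_score, toy_consistency, 2.0, 2.5))
```

With these toy building blocks the library grows as 12^n for n disaccharide units, which shows why filtering at short lengths and extending only the surviving sequences is attractive compared with exhaustively screening long sequences.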

Keywords: glycosaminoglycans; heparin/heparan sulfate; molecular docking; specificity; virtual screening.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Carbohydrate Conformation
  • Carbohydrate Sequence
  • Computational Biology*
  • Glycosaminoglycans / chemistry*
  • Glycosaminoglycans / metabolism*
  • Molecular Sequence Data
  • Proteins / chemistry*
  • Substrate Specificity

Substances

  • Glycosaminoglycans
  • Proteins