Development of a novel monosaccharide substitution matrix for improved comparison of glycan structures

Carbohydr Res. 2022 Jan:511:108496. doi: 10.1016/j.carres.2021.108496. Epub 2022 Jan 4.

Abstract

Unlike DNA and proteins, the structure and function of glycans can be inferred only to a limited extent from sequence alone. Because glycans are structurally flexible, understanding their 3D conformations is important for understanding their functions. Several tools are now available to aid in analyzing the 3D structures of glycans, but they are computationally intensive and not easily usable by non-experts. Thus, as a first step, we investigated the monosaccharides that are the building blocks of glycans and the similarities among them. We developed a method and software that takes the three-dimensional structures of monosaccharides and finds their commonalities through an efficient algorithm, which we call TouCom (tou = "sugar" in Japanese). We then created a similarity matrix representing the degree of similarity between pairs of monosaccharides, based on this information and the properties of their functional groups. We performed pairwise glycan alignment using this similarity matrix and confirmed that the resulting alignment scores improved compared with alignments made without the matrix. As a result, we propose the first monosaccharide substitution matrix developed on the basis of 3D atomic structure. In the future, we will apply this matrix to other glycan alignment tools so that glycan sequence analysis can make better use of this information. We expect that this monosaccharide substitution matrix can improve the analysis of glycan function based on glycan structural information.
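
To illustrate how a monosaccharide substitution matrix can change a pairwise alignment score, the following is a minimal sketch and not the authors' method. It assumes linear monosaccharide sequences (real glycans are usually branched), a simple global alignment with a linear gap penalty, and a hypothetical `SIMILARITY` table whose values are invented for illustration and are not taken from the paper or from TouCom.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical pairwise similarity values in [0, 1].
# These numbers are illustrative only; they are NOT the published matrix.
SIMILARITY: Dict[Tuple[str, str], float] = {
    ("Glc", "Gal"): 0.8,
    ("Glc", "Man"): 0.8,
    ("Gal", "Man"): 0.7,
    ("GlcNAc", "GalNAc"): 0.8,
    ("Glc", "GlcNAc"): 0.5,
    ("Gal", "GalNAc"): 0.5,
}


def identity_score(a: str, b: str) -> float:
    """Baseline scoring: 1 for identical residues, 0 otherwise."""
    return 1.0 if a == b else 0.0


def matrix_score(a: str, b: str) -> float:
    """Matrix scoring: identical residues score 1, others use the table."""
    if a == b:
        return 1.0
    # The table is symmetric, so look the pair up in either order.
    return SIMILARITY.get((a, b), SIMILARITY.get((b, a), 0.0))


def global_alignment_score(
    x: Sequence[str],
    y: Sequence[str],
    score: Callable[[str, str], float],
    gap: float = -0.5,
) -> float:
    """Best global alignment score of two linear residue sequences."""
    rows, cols = len(x) + 1, len(y) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(x[i - 1], y[j - 1]),  # align pair
                dp[i - 1][j] + gap,  # gap in y
                dp[i][j - 1] + gap,  # gap in x
            )
    return dp[-1][-1]


# Two made-up linear sequences that differ at the first and last residue.
glycan_a = ["Gal", "GlcNAc", "Man", "Man"]
glycan_b = ["Glc", "GlcNAc", "Man", "Gal"]

identity_total = global_alignment_score(glycan_a, glycan_b, identity_score)
matrix_total = global_alignment_score(glycan_a, glycan_b, matrix_score)
print(f"identity scoring: {identity_total:.2f}")
print(f"matrix scoring:   {matrix_total:.2f}")
```

With identity scoring, the two mismatched positions contribute nothing; with the matrix, structurally similar residues such as Gal and Glc receive partial credit, so the alignment score is higher. The scoring function is passed in as a parameter so that the same alignment routine can be run with or without a matrix.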

Keywords: Glycan alignment; Glycan search; Monosaccharide substitution matrix.

MeSH terms

  • Algorithms
  • Monosaccharides*
  • Polysaccharides* / chemistry
  • Proteins
  • Software

Substances

  • Monosaccharides
  • Polysaccharides
  • Proteins