Entropy Analysis of Protein Sequences Reveals a Hierarchical Organization

Entropy (Basel). 2021 Dec 7;23(12):1647. doi: 10.3390/e23121647.

Abstract

Background: In an earlier analysis of local sequence content in proteins, we found that amino acid residue frequencies differ depending on the distance between amino acid positions in the sequence, which suggests the existence of structural units.

Methods: We used the informational entropy of protein sequences to show that the structural unit of proteins is a block of adjacent amino acid residues, termed an "information unit". The ANIS (ANalysis of Informational Structure) method uses these information units to reveal hierarchically organized Elements of the Information Structure (ELIS) in amino acid sequences.
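To make the notion of informational entropy over blocks of adjacent residues concrete, the following is a minimal sketch that computes the Shannon entropy of overlapping blocks in a sequence. It is an illustration under stated assumptions, not the ANIS implementation: the function name `block_entropy`, the use of overlapping blocks, and the choice of Shannon entropy in bits are all assumptions made here.

```python
import math
from collections import Counter


def block_entropy(sequence: str, block_len: int = 1) -> float:
    """Shannon entropy (in bits) of the overlapping blocks of length block_len.

    Hypothetical helper for illustration only; the entropy measure used by
    the ANIS method may be defined differently.
    """
    # Collect every overlapping block of adjacent residues.
    blocks = [sequence[i:i + block_len]
              for i in range(len(sequence) - block_len + 1)]
    if not blocks:
        # Sequence shorter than the block length: nothing to measure.
        return 0.0
    counts = Counter(blocks)
    total = len(blocks)
    # H = -sum(p * log2(p)) over the observed block frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Calling `block_entropy(seq, 5)` on a one-letter amino acid string gives the entropy of its five-residue blocks; lower values indicate that a few blocks recur often.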

Results: The developed mathematical apparatus gives a stable description of the structural unit even when the parameters vary significantly. The optimal length of the information unit is five residues, with one substitution allowed. Examples are given of applying the method to the design of protein molecules, the analysis of intermolecular interactions, and the study of the mechanisms by which protein molecular machines function.
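The reported parameters (a unit length of five and one allowed substitution) can be illustrated with a small sketch that finds pairs of five-residue blocks differing in at most one position. This is only one plausible reading of "allowed substitutions", taken here as Hamming distance; the names `hamming` and `similar_blocks` are hypothetical, and the actual ANIS procedure may differ.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length blocks differ."""
    return sum(x != y for x, y in zip(a, b))


def similar_blocks(sequence: str, unit_len: int = 5, max_subst: int = 1):
    """Return start-position pairs (i, j), i < j, of blocks that match
    with at most max_subst substitutions.

    Assumption: a "substitution" is a single-position mismatch between
    two blocks of the same length (no insertions or deletions).
    """
    blocks = [sequence[i:i + unit_len]
              for i in range(len(sequence) - unit_len + 1)]
    pairs = []
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            if hamming(blocks[i], blocks[j]) <= max_subst:
                pairs.append((i, j))
    return pairs
```

For example, in the made-up sequence `"ACDEFGGACDEY"` the blocks starting at positions 0 (`ACDEF`) and 7 (`ACDEY`) differ in one residue, so the pair `(0, 7)` is reported with the default parameters but not with `max_subst=0`. The pairwise comparison is quadratic in sequence length, which is acceptable for a sketch but not for large-scale analysis.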

Conclusions: The ANIS method makes it possible not only to analyze native proteins but also to design artificial polypeptide chains with a given spatial organization and, possibly, function.

Keywords: ANIS method; HSP70; TNF; carboxypeptidase; foldon; hem-containing proteins; hierarchy; hydrolases; informational structure; interleukin 13; oligopeptidase B; peroxiredoxin; protein design; protein sequences; protein structure.