Computational basis of knowledge-based conformational probabilities derived from local- and long-range interactions in proteins

Lerzan Ormeci; Attila Gursoy; Guzin Tunca; Burak Erman

doi:10.1002/prot.21206

Proteins. 2007 Jan 1;66(1):29-40. doi: 10.1002/prot.21206.

Authors

Lerzan Ormeci¹, Attila Gursoy, Guzin Tunca, Burak Erman

Affiliation

¹ College of Engineering, Koc University, Rumelifeneri Yolu, 34450 Sariyer, Istanbul, Turkey.

PMID: 17039547
DOI: 10.1002/prot.21206

Abstract

The probabilities of the various basins in Ramachandran maps are examined critically. The theoretical basis of probability calculations, both from molecular computations and from protein libraries, is discussed. The well-defined basins of the Ramachandran maps are treated as rotational isomeric states. Statistical independence and dependence of the states of different residues along the peptide chain are discussed. The Flory isolated pair hypothesis, near-neighbor correlations, context effects, and long-range correlations are examined critically. A method of evaluating long-range correlations in helical and extended sequences is introduced in analogy with earlier polymer theory. Three different protein libraries are constructed, with data taken from residues in (i) the coiled regions, (ii) all regions, and (iii) only the helical and extended regions of proteins. Singlet and pairwise-dependent probabilities calculated from these libraries are used to predict whether a given sequence is helical or extended. Predictions using pairwise dependence were not better than those using singlet probabilities. Modeling of long-range correlations improved the predictions significantly. Removal of the chameleon sequences from the data set also improved the predictions, but to a lesser extent.
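
The idea of predicting helical versus extended sequences from singlet probabilities can be illustrated with a minimal sketch. It is not the authors' implementation: the toy library, the pseudocount, the log-odds scoring, and the function names `singlet_probabilities`, `log_odds_helix`, and `predict` are all assumptions made for illustration. Each residue is treated as statistically independent of its neighbors, and no pairwise or long-range correlations are modeled.

```python
import math
from collections import Counter

# Hypothetical toy library (not data from the paper): sequence fragments
# labeled "H" (helical) or "E" (extended).
TOY_LIBRARY = [
    ("AELLKA", "H"),
    ("LEELAK", "H"),
    ("VTVTVY", "E"),
    ("TVYVIT", "E"),
]


def singlet_probabilities(library, pseudocount=1.0):
    """Estimate P(state | residue type) from counts, with additive smoothing."""
    counts = {"H": Counter(), "E": Counter()}
    for seq, state in library:
        counts[state].update(seq)
    probs = {}
    for residue in set(counts["H"]) | set(counts["E"]):
        h = counts["H"][residue] + pseudocount
        e = counts["E"][residue] + pseudocount
        probs[residue] = {"H": h / (h + e), "E": e / (h + e)}
    return probs


def log_odds_helix(seq, probs):
    """Sum per-residue log[P(H|r) / P(E|r)], assuming independent residues."""
    score = 0.0
    for residue in seq:
        # Residue types absent from the library are treated as uninformative.
        p = probs.get(residue, {"H": 0.5, "E": 0.5})
        score += math.log(p["H"] / p["E"])
    return score


def predict(seq, probs):
    """Label a sequence helical if its summed log-odds is positive."""
    return "H" if log_odds_helix(seq, probs) > 0 else "E"


probs = singlet_probabilities(TOY_LIBRARY)
print(predict("ALEKAL", probs), predict("VTYVTV", probs))
```

A pairwise-dependent variant would replace the per-residue terms with probabilities conditioned on the neighboring residue, which requires a much larger library to estimate reliably.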

Publication types

Comparative Study
Evaluation Study

MeSH terms

Amino Acid Sequence
Amino Acids / chemistry
Amino Acids / metabolism
Computational Biology / methods*
Computer Simulation
Databases, Protein
Knowledge Bases
Models, Statistical*
Molecular Sequence Data
Probability
Protein Conformation*
Protein Folding
Proteins / chemistry
Proteins / metabolism

Substances

Amino Acids
Proteins