Bayesian models and algorithms for protein β-sheet prediction

Zafer Aydin; Yucel Altunbasak; Hakan Erdogan

doi:10.1109/TCBB.2008.140

IEEE/ACM Trans Comput Biol Bioinform. 2011 Mar-Apr;8(2):395-409. doi: 10.1109/TCBB.2008.140.

Authors

Zafer Aydin¹, Yucel Altunbasak, Hakan Erdogan

Affiliation

¹ Department of Genome Sciences, University of Washington, Genome Sciences, Box 357456, 1705 NE Pacific St., Seattle, WA 98195-5065, USA. zafer@u.washington.edu

PMID: 21233522
DOI: 10.1109/TCBB.2008.140

Abstract

Prediction of a protein's 3D structure benefits greatly from information on secondary structure, solvent accessibility, and the nonlocal contacts that stabilize the structure. We address the problem of β-sheet prediction, defined as the prediction of β-strand pairings, interaction types (parallel or antiparallel), and β-residue interactions (or contact maps). For proteins with six or fewer β-strands, we introduce a Bayesian approach that models the conformational features in a probabilistic framework by combining amino acid pairing potentials with a priori knowledge of β-strand organizations. To select the optimum β-sheet architecture, we significantly reduce the search space using heuristics that enforce the amino acid pairs with strong interaction potentials. In addition, we find the optimum pairwise alignment between β-strands using dynamic programming, allowing any number of gaps in an alignment to model β-bulges more effectively. For proteins with more than six β-strands, we first compute β-strand pairings using the BetaPro method. We then compute gapped alignments of the paired β-strands and choose the interaction types and β-residue pairings with maximum alignment scores. We performed a 10-fold cross-validation experiment on the BetaSheet916 set and obtained significant improvements in prediction accuracy.
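
The gapped strand alignment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring function `toy_pair_score`, the linear gap penalty, and the function names `align_strands` and `best_interaction` are all hypothetical, chosen only to show how dynamic programming with unrestricted gaps can score a strand pair in both orientations and pick the interaction type with the higher alignment score. The paper's pairing potentials and its treatment of β-bulges are likely more elaborate.

```python
from typing import Callable, Tuple

# Hypothetical stand-in for amino acid pairing potentials (assumption,
# not taken from the paper): rewards pairs of hydrophobic residues.
HYDROPHOBIC = set("VILFYWMC")


def toy_pair_score(a: str, b: str) -> float:
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return 2.0
    if a in HYDROPHOBIC or b in HYDROPHOBIC:
        return 0.5
    return -0.5


def align_strands(
    s1: str,
    s2: str,
    score: Callable[[str, str], float] = toy_pair_score,
    gap: float = -1.0,
) -> float:
    """Global alignment score of two strands by dynamic programming.

    Any number of gaps is allowed; each gap position costs `gap`
    (a simple linear penalty, assumed here for illustration).
    """
    n, m = len(s1), len(s2)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]),  # residue pair
                dp[i - 1][j] + gap,  # gap in s2 (unpaired residue in s1)
                dp[i][j - 1] + gap,  # gap in s1 (unpaired residue in s2)
            )
    return dp[n][m]


def best_interaction(s1: str, s2: str) -> Tuple[str, float]:
    """Score both orientations and return the higher-scoring one.

    The antiparallel case is modeled by reversing the second strand.
    Ties are resolved in favor of "parallel" (arbitrary choice).
    """
    parallel = align_strands(s1, s2)
    antiparallel = align_strands(s1, s2[::-1])
    if parallel >= antiparallel:
        return ("parallel", parallel)
    return ("antiparallel", antiparallel)


if __name__ == "__main__":
    print(best_interaction("VGG", "GGV"))
```

With the toy scores, the strands `VGG` and `GGV` align best when the second strand is reversed, so the sketch reports an antiparallel interaction. Recovering the residue pairings themselves would require a traceback through the `dp` table, which is omitted here for brevity.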

Publication types

Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms*
Amino Acid Sequence
Bayes Theorem*
Computational Biology / methods
Models, Molecular
Molecular Sequence Data
Protein Structure, Secondary*
Proteins / chemistry
Sequence Alignment

Substances

Proteins