Contribution to the prediction of the fold code: application to immunoglobulin and flavodoxin cases

Mateusz Banach; Nicolas Prudhomme; Mathilde Carpentier; Elodie Duprat; Nikolaos Papandreou; Barbara Kalinowska; Jacques Chomilier; Irena Roterman

doi:10.1371/journal.pone.0125098


PLoS One. 2015 Apr 27;10(4):e0125098. doi: 10.1371/journal.pone.0125098. eCollection 2015.

Authors

Mateusz Banach¹, Nicolas Prudhomme², Mathilde Carpentier³, Elodie Duprat³, Nikolaos Papandreou⁴, Barbara Kalinowska¹, Jacques Chomilier³, Irena Roterman¹

Affiliations

¹ Department of Bioinformatics and Telemedicine, Medical College, Jagiellonian University, Krakow, Poland.
² Protein Structure Prediction group, IMPMC, UPMC & CNRS, Paris, France.
³ Protein Structure Prediction group, IMPMC, UPMC & CNRS, Paris, France; RPBS, 35 rue Hélène Brion, 75013, Paris, France.
⁴ Genetics Department, Agricultural University of Athens, Iera Odos 75, Athens, Greece.

Abstract

Background: Formation of the folding nucleus of globular proteins starts with the mutual interaction of a group of hydrophobic amino acids whose close contacts allow subsequent formation and stability of the 3D structure. These early steps can be predicted by simulating the folding process with a coarse-grained Monte Carlo (MC) model in a discrete space. We previously defined MIRs (Most Interacting Residues) as the set of residues presenting a large number of non-covalent neighbour interactions during such a simulation. MIRs are good candidates to define the minimal number of residues giving rise to a given fold instead of another one, although their proportion is rather high, typically 15-20% of the sequence. Having in mind experiments with two sequences of very high sequence identity (up to 90%) but different folds, we combined the MIR method, which takes the sequence as its single input, with the "fuzzy oil drop" (FOD) model, which requires a 3D structure, in order to estimate the residues coding for the fold. FOD assumes that a globular protein follows an idealised 3D Gaussian distribution of hydrophobicity density, with the maximum in the centre and minima at the surface of the "drop". If the actual local density of hydrophobicity around a given amino acid is as high as the ideal one, then this amino acid is assigned to the core of the globular protein and is considered to follow the FOD model. One therefore obtains a distribution of the amino acids of a protein according to their agreement or disagreement with the FOD model.
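The FOD idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the geometric centre as the drop centre, the externally supplied widths `sigmas`, and the `tolerance` parameter are all assumptions made for illustration. The observed hydrophobicity profile is taken as an input here, since the abstract does not specify how it is computed.

```python
import math


def ideal_hydrophobicity(coords, sigmas):
    """Normalised 3D Gaussian evaluated at each residue position.

    coords: list of (x, y, z) tuples, one per residue (hypothetical input).
    sigmas: (sx, sy, sz) widths of the "drop" along each axis (assumed given).
    """
    n = len(coords)
    # Assumption: the drop is centred on the geometric centre of the residues.
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sx, sy, sz = sigmas
    raw = [
        math.exp(
            -(
                (x - cx) ** 2 / (2 * sx ** 2)
                + (y - cy) ** 2 / (2 * sy ** 2)
                + (z - cz) ** 2 / (2 * sz ** 2)
            )
        )
        for x, y, z in coords
    ]
    total = sum(raw)
    # Normalise so that the ideal profile sums to 1 over the protein.
    return [v / total for v in raw]


def accordant_residues(observed, ideal, tolerance=0.0):
    """Indices whose observed density is at least the ideal one.

    `tolerance` is a hypothetical slack term, not part of the source.
    """
    return [
        i for i, (o, t) in enumerate(zip(observed, ideal)) if o >= t - tolerance
    ]
```

Given an observed profile normalised in the same way, `accordant_residues(observed, ideal_hydrophobicity(coords, sigmas))` would return the positions treated as following the model in this simplified reading.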

Results: We compared and combined the MIR and FOD methods to define the minimal nucleus, or keystone, of two populated folds: immunoglobulin-like (Ig) and flavodoxins (Flav). The combination of these two approaches defines some positions that are both predicted as a MIR and assigned as accordant with the FOD model. It is shown here that for these two folds, the intersection of the predicted sets of residues significantly differs from random selection. It reduces the number of residues selected by each individual method and allows a reasonable agreement with experimentally determined key residues coding for the particular fold. In addition, the intersection of the two methods significantly increases the specificity of the prediction, providing a robust set of residues that constitute the folding nucleus.
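One way to check whether the overlap of two predicted residue sets differs from random selection is a hypergeometric tail probability. The sketch below is an assumption about how such a comparison could be done; the abstract does not state which statistical test the authors used, and the function name and arguments are hypothetical.

```python
from math import comb


def overlap_p_value(n_total, set_a, set_b):
    """Size of the overlap and P(overlap >= observed) under random selection.

    n_total: number of residues in the sequence.
    set_a, set_b: sets of residue positions, e.g. MIRs and FOD-accordant
    positions (hypothetical inputs).
    """
    k = len(set_a & set_b)
    a, b = len(set_a), len(set_b)
    denom = comb(n_total, b)
    # Sum the hypergeometric probabilities for overlaps of size k or larger.
    tail = sum(
        comb(a, i) * comb(n_total - a, b - i) for i in range(k, min(a, b) + 1)
    )
    return k, tail / denom


# Usage with made-up positions on a 10-residue sequence.
overlap, p = overlap_p_value(10, {1, 2, 3}, {2, 3, 4})
```

A small probability would indicate that the two methods agree more often than two random selections of the same sizes would.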

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Amino Acid Sequence
Animals
Bacteria / chemistry
Bacteria / metabolism
Binding Sites
Flavodoxin / chemistry*
Humans
Hydrophobic and Hydrophilic Interactions
Immunoglobulins / chemistry*
Models, Molecular*
Monte Carlo Method
Protein Folding
Protein Structure, Secondary

Substances

Flavodoxin
Immunoglobulins

Grants and funding

Funding came from the French-Polish bilateral collaborative grant number 27748NE. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.