Ensemble generation and the influence of protein flexibility on geometric tunnel prediction in cytochrome P450 enzymes

PLoS One. 2014 Jun 23;9(6):e99408. doi: 10.1371/journal.pone.0099408. eCollection 2014.

Abstract

Computational prediction of ligand entry and egress paths in proteins has become an emerging topic in computational biology and has proven useful in fields such as protein engineering and drug design. Geometric tunnel prediction programs, such as Caver3.0 and MolAxis, are computationally efficient methods to identify potential ligand entry and egress routes in proteins. Although many geometric tunnel programs are designed to accommodate a single input structure, the increasingly recognized importance of protein flexibility in tunnel formation and behavior has led to the more widespread use of protein ensembles in tunnel prediction. However, there has not yet been an attempt to directly investigate the influence of ensemble size and composition on geometric tunnel prediction. In this study, we compared tunnels found in a single crystal structure to ensembles of various sizes generated using different methods on both the apo and holo forms of cytochrome P450 enzymes CYP119, CYP2C9, and CYP3A4. Several protein structure clustering methods were tested in an attempt to generate smaller ensembles that were capable of reproducing the data from larger ensembles. Ultimately, we found that by including members from both the apo and holo data sets, we could produce ensembles containing less than 15 members that were comparable to apo or holo ensembles containing over 100 members. Furthermore, we found that, in the absence of either apo or holo crystal structure data, pseudo-apo or -holo ensembles (e.g. adding ligand to apo protein throughout MD simulations) could be used to resemble the structural ensembles of the corresponding apo and holo ensembles, respectively. Our findings not only further highlight the importance of including protein flexibility in geometric tunnel prediction, but also suggest that smaller ensembles can be as capable as larger ensembles at capturing many of the protein motions important for tunnel prediction at a lower computational cost.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Apoproteins / chemistry
  • Cluster Analysis
  • Computational Biology / methods*
  • Cytochrome P-450 Enzyme System / chemistry*
  • Hydrogen Bonding
  • Molecular Dynamics Simulation
  • Software*

Substances

  • Apoproteins
  • Cytochrome P-450 Enzyme System