Sampling of Organic Solutes in Aqueous and Heterogeneous Environments Using Oscillating Excess Chemical Potentials in Grand Canonical-like Monte Carlo-Molecular Dynamics Simulations

J Chem Theory Comput. 2014 Jun 10;10(6):2281-2290. doi: 10.1021/ct500201y. Epub 2014 May 6.

Abstract

Solute sampling of explicit bulk-phase aqueous environments in grand canonical (GC) ensemble simulations suffer from poor convergence due to low insertion probabilities of the solutes. To address this, we developed an iterative procedure involving Grand Canonical-like Monte Carlo (GCMC) and molecular dynamics (MD) simulations. Each iteration involves GCMC of both the solutes and water followed by MD, with the excess chemical potential (μex) of both the solute and the water oscillated to attain their target concentrations in the simulation system. By periodically varying the μex of the water and solutes over the GCMC-MD iterations, solute exchange probabilities and the spatial distributions of the solutes improved. The utility of the oscillating-μex GCMC-MD method is indicated by its ability to approximate the hydration free energy (HFE) of the individual solutes in aqueous solution as well as in dilute aqueous mixtures of multiple solutes. For seven organic solutes: benzene, propane, acetaldehyde, methanol, formamide, acetate, and methylammonium, the average μex of the solutes and the water converged close to their respective HFEs in both 1 M standard state and dilute aqueous mixture systems. The oscillating-μex GCMC methodology is also able to drive solute sampling in proteins in aqueous environments as shown using the occluded binding pocket of the T4 lysozyme L99A mutant as a model system. The approach was shown to satisfactorily reproduce the free energy of binding of benzene as well as sample the functional group requirements of the occluded pocket consistent with the crystal structures of known ligands bound to the L99A mutant as well as their relative binding affinities.