Determining gradient conditions for peptide purification in RPLC with machine-learning-based retention time predictions

J Chromatogr A. 2019 Aug 2:1598:92-100. doi: 10.1016/j.chroma.2019.03.043. Epub 2019 Mar 29.

Abstract

A strategy for determining a suitable solvent gradient in silico in preparative peptide separations is presented. The strategy utilizes a machine-learning-based method, called ELUDE, for peptide retention time predictions based on the amino acid sequences of the peptides. A suitable gradient is calculated according to linear solvent strength theory by predicting the retention times of the peptides being purified at three different gradient slopes. The advantage of this strategy is that fewer experiments are needed to develop a purification method, making it useful for labs conducting many separations but with limited resources for method development. Met-enkephalin and leu-enkephalin were used as model solutes in a preparative separation on two stationary phases: XBridge C18 and CSH C18. The ELUDE algorithm uses support vector regression and is pre-trained, meaning that only 10-50 peptides are needed to calibrate a model for a certain stationary phase and gradient. The calibration is done once and the model can then be used for new peptides similar in size to those in the calibration set. We found that the accuracy of the retention time predictions is good enough to usefully estimate a suitable gradient and that it was possible to compare the selectivity on different stationary phases in silico. The absolute relative errors in retention time for the predicted gradients were 4.2% and 3.7% for met-enkephalin and leu-enkephalin, respectively, on the XBridge C18 column and 2.0% and 2.8% on the CSH C18 column. The predicted retention times were also used as initial values for adsorption isotherm parameter determination, facilitating the numerical calculation of overloaded elution profiles. Changing the trifluoroacetic acid (TFA) concentration from 0.05% to 0.15% in the eluent did not seriously affect the error in the retention time predictions for the XBridge C18 column: an increase of 1.0 min (1.3 in retention factor). For the CSH C18 column the error was, on average, 2.6 times larger. This indicates that the model needs to be recalibrated when changing the TFA concentration for the CSH column. Studying possible scale-up complications from UHPLC to HPLC such as pressure, viscous heating (i.e., temperature gradients), and stationary-phase properties (e.g., packing heterogeneity and surface chemistry) revealed that the effects of all these factors were minor to negligible. Pressure had the largest effect, but increased retention by only 3%. In the presented case, method development can therefore proceed using UHPLC and then be robustly transferred to HPLC.
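The linear solvent strength (LSS) step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard LSS relation log10 k = log10 kw - S*phi, a linear gradient, negligible migration during the dwell time, and 2.303*k0*b >> 1 for the fit. The function names and all numerical inputs (column dead time, dwell time, gradient range, "true" parameters) are hypothetical and chosen only for illustration. Retention times at three gradient times are used to estimate log10 kw and S, which can then be used to evaluate the retention time for any other gradient.

```python
import math


def lss_retention_time(log_kw, S, phi0, phi_f, t_g, t0, t_d=0.0):
    """Retention time for a linear gradient under the LSS model.

    log10 k = log_kw - S * phi; phi runs from phi0 to phi_f over t_g.
    t0 is the column dead time and t_d the dwell time (same time unit).
    Assumes the solute does not migrate noticeably during the dwell time.
    """
    k0 = 10 ** (log_kw - S * phi0)        # retention factor at gradient start
    b = S * (phi_f - phi0) * t0 / t_g     # gradient steepness
    return (t0 / b) * math.log10(2.303 * k0 * b + 1.0) + t0 + t_d


def fit_lss(runs, phi0, phi_f, t0, t_d=0.0):
    """Estimate (log_kw, S) from (t_g, t_r) pairs at different gradient times.

    Uses the approximation 2.303 * k0 * b >> 1, under which
    (t_r - t0 - t_d) / t_g is linear in log10(t_g) with slope -1 / (S * d_phi).
    """
    d_phi = phi_f - phi0
    xs = [math.log10(t_g) for t_g, _ in runs]
    ys = [(t_r - t0 - t_d) / t_g for t_g, t_r in runs]
    n = len(runs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Ordinary least-squares line through the (x, y) points
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    inv_s_dphi = -slope                   # equals 1 / (S * d_phi)
    S = 1.0 / (inv_s_dphi * d_phi)
    log_kw = intercept / inv_s_dphi - math.log10(2.303 * t0 * d_phi * S) + S * phi0
    return log_kw, S


if __name__ == "__main__":
    # Hypothetical conditions: 5-50% organic modifier, t0 = 1.0 min, t_d = 0.5 min
    phi0, phi_f, t0, t_d = 0.05, 0.50, 1.0, 0.5
    # Synthetic retention times standing in for predictions at three gradient slopes
    runs = [
        (t_g, lss_retention_time(4.0, 25.0, phi0, phi_f, t_g, t0, t_d))
        for t_g in (10.0, 20.0, 40.0)
    ]
    log_kw, S = fit_lss(runs, phi0, phi_f, t0, t_d)
    print(f"log10 kw = {log_kw:.2f}, S = {S:.1f}")
    # Retention time expected for a new, untested gradient time
    print(f"t_R at t_G = 30 min: {lss_retention_time(log_kw, S, phi0, phi_f, 30.0, t0, t_d):.2f} min")
```

The closed-form linear fit keeps the sketch free of external dependencies. When the large-k0 approximation is doubtful, a nonlinear least-squares fit of the full retention equation would be the more careful choice.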

Keywords: In silico determination; Machine learning; Peptide; Preparative; Purification; Retention time prediction.

MeSH terms

  • Adsorption
  • Chemistry Techniques, Analytical / methods*
  • Chromatography, Reverse-Phase*
  • Machine Learning*
  • Peptides / chemistry
  • Peptides / isolation & purification*
  • Solvents / chemistry
  • Temperature
  • Time
  • Trifluoroacetic Acid / chemistry

Substances

  • Peptides
  • Solvents
  • Trifluoroacetic Acid