MultiConditionRT: Predicting liquid chromatography retention time for emerging contaminants for a wide range of eluent compositions and stationary phases

J Chromatogr A. 2022 Mar 15:1666:462867. doi: 10.1016/j.chroma.2022.462867. Epub 2022 Jan 31.

Abstract

Structural elucidation of compounds detected with liquid chromatography coupled to high resolution mass spectrometry is a challenging and time-consuming step in the workflow of non-targeted analysis and often requires manual validation of the results. Retention time, alongside exact mass, isotope pattern, fragmentation spectra, and collision cross-section, is valuable information for ruling out unlikely structures and increasing the confidence in others. Different approaches to predicting retention times have been used previously for reversed-phase chromatography and hydrophilic interaction liquid chromatography (HILIC), but their application has been limited to a small set of mobile phases and gradient profiles. Here, we expand the toolbox available for retention time prediction by developing a random forest regression model, MultiConditionRT, that predicts retention times for four column types and twenty mobile phase systems. MultiConditionRT was built using a dataset containing 78 compounds analyzed with C18 reversed-phase, mixed-mode, HILIC, and biphenyl columns. In addition, different eluent compositions were used: both methanol and acetonitrile were combined with different aqueous phases with pH from 2.1 to 10.0 (formic acid, acetic acid, trifluoroacetic acid, formate, acetate, bicarbonate, and ammonia). The root mean square error (RMSE) of the test set predictions was 1.55 min for the C18 reversed-phase column, 1.79 min for the mixed-mode column, 1.93 min for the HILIC column, and 1.56 min for the biphenyl column. Additionally, MultiConditionRT can be applied to different gradient profiles with a general additive model-based calibration approach. The approach of MultiConditionRT was validated externally and internally with 356 and 151 compounds, respectively, yielding RMSEs of 2.68 and 2.32 min, respectively. Of these compounds, 324 and 84, respectively, were not in the dataset used in the model development.
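The per-column RMSE values reported above can be read as the square root of the mean squared difference between observed and predicted retention times, computed separately for each column type. The following is a minimal sketch of that calculation, not the authors' code: the function names `rmse` and `rmse_by_column` are hypothetical, and the retention times in `records` are made-up illustrative numbers, not data from the study.

```python
from collections import defaultdict
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences (minutes)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def rmse_by_column(records):
    """Group (column, observed_rt, predicted_rt) records and compute RMSE per column."""
    grouped = defaultdict(lambda: ([], []))
    for column, observed_rt, predicted_rt in records:
        grouped[column][0].append(observed_rt)
        grouped[column][1].append(predicted_rt)
    return {column: rmse(obs, pred) for column, (obs, pred) in grouped.items()}


# Hypothetical (column type, observed RT in min, predicted RT in min) records.
# These numbers are illustrative only and do not come from the paper.
records = [
    ("C18", 5.0, 6.0),
    ("C18", 10.0, 8.0),
    ("HILIC", 3.0, 3.0),
    ("HILIC", 7.0, 11.0),
]

per_column = rmse_by_column(records)
for column, value in sorted(per_column.items()):
    print(f"{column}: RMSE = {value:.2f} min")
```

Reporting the error per column type, rather than as one pooled number, keeps a poorly predicted stationary phase from being masked by a well predicted one.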

Keywords: Gradient elution; High resolution mass spectrometry; Quantitative structure-retention relationship model; Random forest regression.

MeSH terms

  • Chromatography, Liquid / methods
  • Chromatography, Reverse-Phase* / methods
  • Hydrophobic and Hydrophilic Interactions
  • Indicators and Reagents
  • Methanol* / chemistry

Substances

  • Indicators and Reagents
  • Methanol