High-Throughput Non-targeted Chemical Structure Identification Using Gas-Phase Infrared Spectra

Anal Chem. 2021 Aug 3;93(30):10688-10696. doi: 10.1021/acs.analchem.1c02244. Epub 2021 Jul 21.

Abstract

The high-throughput identification of unknown metabolites in biological samples remains challenging. Most current non-targeted metabolomics studies rely on mass spectrometry, followed by computational methods that rank thousands of candidate structures based on how closely their predicted mass spectra match the experimental mass spectrum of an unknown. We reasoned that infrared (IR) spectra could be used in an analogous manner and could add orthogonal structure discrimination; however, this has never been evaluated on large data sets. Here, we present results of a high-throughput computational method for predicting IR spectra of candidate compounds obtained from the PubChem database. Predicted spectra were ranked based on their similarity to gas-phase experimental IR spectra of test compounds obtained from NIST. Our computational workflow (IRdentify) consists of a fast semiempirical quantum mechanical method for initial IR spectra prediction, ranking, and triaging, followed by a final IR spectra prediction and ranking using density functional theory. This approach resulted in the correct identification of 47% of 258 test compounds. On average, 2152 candidate structures were evaluated for each test compound, giving a total of approximately 555,200 candidate structures evaluated. We discuss several variables that influenced the identification accuracy and then demonstrate the potential application of this approach in three areas: (1) combining IR and mass spectra rankings into a single composite rank score, (2) identifying the precursor and fragment ions using cryogenic ion vibrational spectroscopy, and (3) incorporating a trimethylsilyl derivatization step to extend the method's compatibility to less-volatile compounds. Overall, our results suggest that matching computational with experimental IR spectra is a potentially powerful orthogonal option for adding significant high-throughput chemical structure discrimination when used with other non-targeted chemical structure identification methods.
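
The two-stage ranking described in the abstract can be illustrated with a minimal sketch. It is not the IRdentify implementation: the abstract does not state the similarity metric, the triage cutoff, or the rule for combining IR and mass spectra ranks. Cosine similarity, a keep-the-top-N cutoff, and a rank-sum composite are assumptions made here for illustration, and all function names, candidate names, and intensity values are hypothetical toy data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length intensity vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(experimental, predicted):
    """Map each candidate name to its rank; rank 1 is the most similar spectrum."""
    scores = {name: cosine_similarity(experimental, spec)
              for name, spec in predicted.items()}
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def triage(ranks, keep):
    """Return the names of the `keep` best-ranked candidates, best first."""
    ordered = sorted(ranks, key=lambda name: ranks[name])
    return ordered[:keep]


def composite_rank(ir_ranks, ms_ranks):
    """Re-rank candidates by the sum of their IR and MS ranks (assumed rule)."""
    totals = {name: ir_ranks[name] + ms_ranks[name]
              for name in ir_ranks if name in ms_ranks}
    ordered = sorted(totals, key=lambda name: (totals[name], name))
    return {name: position + 1 for position, name in enumerate(ordered)}


# Hypothetical experimental spectrum, binned onto a shared wavenumber grid.
experimental = [0.1, 0.9, 0.3, 0.0]

# Stage 1: fast, approximate predictions for every candidate (toy values).
fast_predictions = {
    "cand_a": [0.1, 0.8, 0.4, 0.0],
    "cand_b": [0.9, 0.1, 0.0, 0.3],
    "cand_c": [0.2, 0.7, 0.1, 0.2],
}
first_pass = rank_candidates(experimental, fast_predictions)
survivors = triage(first_pass, keep=2)

# Stage 2: more expensive predictions, computed only for the survivors.
refined_predictions = {
    "cand_a": [0.1, 0.9, 0.3, 0.0],
    "cand_c": [0.2, 0.8, 0.2, 0.1],
}
final_ranks = rank_candidates(
    experimental, {name: refined_predictions[name] for name in survivors}
)

# Hypothetical mass-spectrum ranks for the same survivors.
ms_ranks = {"cand_a": 1, "cand_c": 2}
combined = composite_rank(final_ranks, ms_ranks)
```

The point of the triage step is cost: the expensive prediction runs only on candidates that survive the cheap first pass, which matters when each unknown has thousands of candidates.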

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Factual
  • Ions
  • Mass Spectrometry
  • Metabolomics*

Substances

  • Ions