A Machine Learning Protocol for Predicting Protein Infrared Spectra

J Am Chem Soc. 2020 Nov 11;142(45):19071-19077. doi: 10.1021/jacs.0c06530. Epub 2020 Oct 30.

Abstract

Infrared (IR) absorption provides important chemical fingerprints of biomolecules. Determining protein secondary structure from IR spectra is tedious because the theoretical interpretation of those spectra requires repeated, expensive quantum-mechanical calculations in a fluctuating environment. Herein we present a novel machine learning protocol that uses a few key structural descriptors to rapidly predict amide I IR spectra of various proteins, with predictions that agree well with experiment. Its transferability enabled us to distinguish protein secondary structures, probe atomic structure variations with temperature, and monitor protein folding. This approach offers a cost-effective tool to model the relationship between protein spectra and their biological/chemical properties.
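
The abstract does not specify the model or the descriptors, so the following is only a minimal sketch of the general idea of mapping structural descriptors to a spectrum, not the authors' method. It assumes a hypothetical linear stand-in (`predict_frequency`) for a trained model, made-up weights and descriptor values, and a simple Lorentzian broadening step (`lorentzian_spectrum`) to turn per-oscillator frequencies into a line shape in the amide I region.

```python
import math


def predict_frequency(descriptors, weights, bias):
    """Hypothetical linear stand-in for a trained model.

    Maps a list of structural descriptors to one frequency in cm^-1.
    The real protocol's model and descriptors are not given in the abstract.
    """
    return bias + sum(w * d for w, d in zip(weights, descriptors))


def lorentzian_spectrum(freqs, intensities, grid, hwhm=10.0):
    """Sum area-normalized Lorentzian lines on a wavenumber grid.

    hwhm is the half width at half maximum in cm^-1 (an assumed value).
    """
    spectrum = []
    for x in grid:
        total = 0.0
        for f, a in zip(freqs, intensities):
            total += a * (hwhm / math.pi) / ((x - f) ** 2 + hwhm ** 2)
        spectrum.append(total)
    return spectrum


# Illustrative, made-up numbers (assumptions, not data from the paper).
weights = [-20.0, 5.0]
bias = 1650.0
residue_descriptors = [[0.5, 1.0], [1.0, 0.0], [0.0, 2.0]]

# One predicted frequency per oscillator, each with unit intensity.
freqs = [predict_frequency(d, weights, bias) for d in residue_descriptors]
intensities = [1.0, 1.0, 1.0]

# Wavenumber grid from 1600 to 1700 cm^-1 in 1 cm^-1 steps.
grid = [1600.0 + i for i in range(101)]
spectrum = lorentzian_spectrum(freqs, intensities, grid)

# Grid point where the simulated absorption is largest.
peak_position = grid[spectrum.index(max(spectrum))]
```

In a sketch like this, the expensive quantum-mechanical step is replaced by a cheap function evaluation per structure, which is what makes averaging over many snapshots of a fluctuating environment affordable.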

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amides / chemistry
  • Machine Learning*
  • Peptides / chemistry
  • Peptides / metabolism
  • Protein Folding
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Proteins / metabolism
  • Quantum Theory
  • Spectrophotometry, Infrared*
  • Temperature
  • Ubiquitin / chemistry
  • Ubiquitin / metabolism

Substances

  • Amides
  • Peptides
  • Proteins
  • Trp-cage peptide
  • Ubiquitin