Machine learning recognition of protein secondary structures based on two-dimensional spectroscopic descriptors

Proc Natl Acad Sci U S A. 2022 May 3;119(18):e2202713119. doi: 10.1073/pnas.2202713119. Epub 2022 Apr 27.

Abstract

Discriminating protein secondary structures is crucial for understanding the biological function of proteins. It is not generally possible to invert spectroscopic data to yield the structure. We present a machine learning protocol that uses two-dimensional UV (2DUV) spectra as pattern recognition descriptors, aiming at automated protein secondary structure determination from spectroscopic features. Accurate secondary structure recognition is obtained for homologous (97%) and nonhomologous (91%) protein segments, randomly selected from simulated model datasets. The advantage of 2DUV descriptors over one-dimensional linear absorption and circular dichroism spectra lies in the cross-peak information that reflects interactions between local regions of the protein. Thanks to their ultrafast (∼200 fs) nature, 2DUV measurements can be used in the future to probe conformational variations in the course of protein dynamics.

Keywords: biochemistry; physical chemistry; theoretical chemistry; ultrafast spectroscopy.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Machine Learning*
  • Neural Networks, Computer*
  • Proteins
  • Spectrum Analysis

Substances

  • Proteins