Predictive Modeling of NMR Chemical Shifts without Using Atomic-Level Annotations

J Chem Inf Model. 2020 Aug 24;60(8):3765-3769. doi: 10.1021/acs.jcim.0c00494. Epub 2020 Jul 31.

Abstract

Recently, machine learning has been successfully applied to the prediction of nuclear magnetic resonance (NMR) chemical shifts. To build a prediction model, the existing methods require a training data set that comprises molecules whose NMR-active atoms are annotated with their chemical shifts. However, the laborious task of atomic-level annotation must be manually conducted by chemists. Thus, it becomes difficult to perform large-scale annotation. To address this issue, we propose a weakly supervised learning method to enable the predictive modeling of NMR chemical shifts without requiring explicit atomic-level annotations in the training data set. For the training data set, the proposed method only requires the annotation of chemical shifts at the molecular level. As a prediction model, we build a message passing neural network (MPNN) that predicts the chemical shifts of individual NMR-active atoms in a molecule. Using a loss function that is invariant to the permutation of atoms in a molecule, the model is trained in a weakly supervised manner to minimize the molecular-level difference between a set of predicted chemical shifts and the corresponding set of actual chemical shifts across the training data set. Accordingly, during the training, the chemical shifts predicted by the model are approximately aligned with the actual chemical shifts in a data-driven fashion. The proposed method performs comparably to the existing fully supervised methods in terms of predicting the chemical shifts of ¹H and ¹³C NMR spectra for small molecules.
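
The idea of a loss that is invariant to the permutation of atoms can be illustrated with a minimal sketch. The abstract does not specify the exact loss, so the code below is an assumption chosen for illustration, not the authors' implementation: it compares the sorted predicted shifts with the sorted actual shifts, which for one-dimensional values under absolute error is equivalent to the lowest-cost one-to-one matching. The function names `set_loss` and `align_by_rank` are hypothetical, and the sketch assumes one actual shift per NMR-active atom.

```python
import numpy as np


def set_loss(predicted, actual):
    """Mean absolute error between two equal-sized sets of chemical shifts.

    Both inputs are sorted first, so the result does not depend on the
    order in which atoms (or peaks) are listed.
    """
    predicted = np.sort(np.asarray(predicted, dtype=float))
    actual = np.sort(np.asarray(actual, dtype=float))
    if predicted.shape != actual.shape:
        # Assumption of this sketch: one actual shift per NMR-active atom.
        raise ValueError("predicted and actual must have the same length")
    return float(np.mean(np.abs(predicted - actual)))


def align_by_rank(predicted, actual):
    """Give each atom the actual shift that has the same rank as its prediction."""
    predicted = np.asarray(predicted, dtype=float)
    order = np.argsort(predicted, kind="stable")
    assigned = np.empty_like(predicted)
    assigned[order] = np.sort(np.asarray(actual, dtype=float))
    return assigned


# Illustrative values only: per-atom predictions in atom order, and a
# molecular-level list of shifts with no atom assignment.
predicted_shifts = [128.1, 20.5, 77.0]
actual_shifts = [21.0, 77.4, 128.0]

loss = set_loss(predicted_shifts, actual_shifts)
assignment = align_by_rank(predicted_shifts, actual_shifts)
```

In this sketch, `align_by_rank` mirrors the data-driven alignment described above: as the predictions improve during training, the rank-based pairing between atoms and actual shifts becomes a usable atomic-level assignment.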

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Magnetic Resonance Imaging*
  • Magnetic Resonance Spectroscopy
  • Neural Networks, Computer*