Reproducing the invention of a named reaction: zero-shot prediction of unseen chemical reactions

Phys Chem Chem Phys. 2022 May 4;24(17):10280-10291. doi: 10.1039/d1cp05878a.

Abstract

While state-of-the-art models can predict reactions after transfer learning on thousands of samples of the same reaction types as the reactions to be predicted, how to prepare such models to predict "unseen" reactions remains an open question. We studied the Transformer model's ability to predict "unseen" reactions through "zero-shot reaction prediction (ZSRP)", a concept derived from zero-shot learning and zero-shot translation. We reproduced the human invention of the Chan-Lam coupling reaction, in which the inventor was inspired by the Suzuki reaction when improving Barton's bismuth arylation reaction. After being fine-tuned with samples from these two "existing" reactions, the USPTO-trained Transformer could predict "unseen" Chan-Lam coupling reactions with 55.7% top-1 accuracy. Our model could also mimic the later stage of this reaction's history, in which the initial case was generalized to more reactants and reagents, via "one-shot/few-shot reaction prediction (OSRP/FSRP)" approaches.

MeSH terms

  • Humans
  • Inventions*
  • Machine Learning*