Mining Insights on Metal-Organic Framework Synthesis from Scientific Literature Texts

J Chem Inf Model. 2022 Mar 14;62(5):1190-1198. doi: 10.1021/acs.jcim.1c01297. Epub 2022 Feb 23.

Abstract

Identifying optimal synthesis conditions for metal-organic frameworks (MOFs) is a major challenge that can serve as a bottleneck for new materials discovery and development. A trial-and-error approach that relies on a chemist's intuition and knowledge is inefficient because the MOF synthesis space is large. To address this, we used our in-house developed code to mine synthesis information for 46,701 MOFs from 28,565 MOF papers. The joint machine-learning/rule-based algorithm yields an average F1 score of 90.3% across different synthesis parameters (i.e., metal precursors, organic precursors, solvents, temperature, time, and composition). From this data set, a positive-unlabeled learning algorithm was developed to predict the synthesizability of a given MOF material using synthesis conditions as inputs, and this algorithm correctly predicted successful synthesis for 83.1% of the synthesized data in the test set. Finally, our model correctly assigned low synthesizability scores to three amorphous MOFs (with their representative experimental synthesis conditions), while the counterpart crystalline MOFs received high synthesizability scores. Our results show that big data extracted from the texts of MOF papers can be used to rationally predict synthesis conditions for these materials, which can accelerate the synthesis of new MOFs.
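As an illustration of the reported metric, the sketch below computes a per-parameter F1 score from extraction counts and then averages the scores across parameters. It is a minimal example, not the authors' evaluation code: the abstract does not state how the average was taken, so an unweighted mean is assumed here, and the function name `f1_score`, the parameter labels, and all counts are hypothetical toy values.

```python
from statistics import mean


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall, from raw counts."""
    if tp == 0:
        # No correct extractions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (true positive, false positive, false negative) counts per
# synthesis parameter. These are toy numbers, not data from the paper.
toy_counts = {
    "metal_precursor": (90, 10, 10),
    "solvent": (80, 20, 0),
    "temperature": (50, 0, 50),
}

# Score each parameter separately, then take the unweighted mean
# (an assumed averaging scheme).
per_parameter_f1 = {name: f1_score(*counts) for name, counts in toy_counts.items()}
average_f1 = mean(per_parameter_f1.values())

for name, score in per_parameter_f1.items():
    print(f"{name}: F1 = {score:.3f}")
print(f"average F1 = {average_f1:.3f}")
```

An unweighted mean treats every synthesis parameter equally regardless of how often it appears in the corpus; a count-weighted average would be an equally plausible reading of "average F1 score".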

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Metal-Organic Frameworks* / chemistry
  • Metals / chemistry
  • Solvents

Substances

  • Metal-Organic Frameworks
  • Metals
  • Solvents

Associated data

  • figshare/10.6084/m9.figshare.16902652.v3