Text Mining the Literature to Inform Experiments and Rationalize Impurity Phase Formation for BiFeO3

Chem Mater. 2023 Dec 29;36(2):772-785. doi: 10.1021/acs.chemmater.3c02203. eCollection 2024 Jan 23.

Abstract

We used data-driven methods to understand the formation of impurity phases in BiFeO3 thin-film synthesis through the sol-gel technique. Using a high-quality dataset of 331 synthesis procedures and outcomes extracted manually from 177 scientific articles, we trained decision tree models that reinforce important experimental heuristics for the avoidance of phase impurities but ultimately show limited predictive capability. We find that several important synthesis features, identified by our model, are often not reported in the literature. To test our ability to correctly impute missing synthesis parameters, we attempted to reproduce nine syntheses from the literature with varying degrees of "missingness". We demonstrate how a text-mined dataset can be made useful by informing new controlled experiments and forming a better understanding of impurity phase formation in this complex oxide system.
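
As an illustration of what "missingness" could mean in practice, the following minimal sketch computes the fraction of unreported parameters per feature and per synthesis record. It is not the authors' code: the feature names, the values, and the helper functions `feature_missingness` and `record_missingness` are all hypothetical placeholders, and the abstract does not state how missingness was actually quantified.

```python
from typing import Optional

# Hypothetical synthesis records. Feature names and values are illustrative
# placeholders, not entries from the paper's dataset. None marks a parameter
# that the source article did not report.
Record = dict[str, Optional[float]]

records: list[Record] = [
    {"anneal_temp_C": 550.0, "anneal_time_min": 30.0, "bi_excess_pct": 5.0},
    {"anneal_temp_C": 600.0, "anneal_time_min": None, "bi_excess_pct": None},
    {"anneal_temp_C": None, "anneal_time_min": 60.0, "bi_excess_pct": None},
    {"anneal_temp_C": 500.0, "anneal_time_min": 10.0, "bi_excess_pct": None},
]

# Union of all feature names seen in any record, in a stable order.
features: list[str] = sorted({key for row in records for key in row})


def feature_missingness(rows: list[Record]) -> dict[str, float]:
    """Fraction of records in which each feature is unreported."""
    names = sorted({key for row in rows for key in row})
    return {
        name: sum(row.get(name) is None for row in rows) / len(rows)
        for name in names
    }


def record_missingness(row: Record, names: list[str]) -> float:
    """Fraction of the listed features that one record leaves unreported."""
    return sum(row.get(name) is None for name in names) / len(names)


if __name__ == "__main__":
    for name, fraction in feature_missingness(records).items():
        print(f"{name}: {fraction:.2f} of records unreported")
    for index, row in enumerate(records):
        print(f"record {index}: {record_missingness(row, features):.2f} missing")
```

Ranking records by a score like this is one plausible way to pick literature syntheses with low, medium, and high missingness for reproduction attempts; the selection criterion used in the study itself may differ.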