Interpretable machine learning-assisted screening of perovskite oxides

Jie Zhao; Xiaoyan Wang; Haobo Li; Xiaoyong Xu

doi:10.1039/d3ra08591k

RSC Adv. 2024 Jan 26;14(6):3909-3922. doi: 10.1039/d3ra08591k. eCollection 2024 Jan 23.

Authors

Jie Zhao¹, Xiaoyan Wang², Haobo Li³, Xiaoyong Xu³

Affiliations

¹ College of Chemical Engineering, Nanjing Tech University, Nanjing, Jiangsu 211816, China. j.zhao1@njtech.edu.cn
² School of Computer Science, Nanjing Audit University, Nanjing, Jiangsu 211815, China. xywang@nau.edu.cn
³ School of Chemical Engineering, The University of Adelaide, Adelaide, SA 5005, Australia. xiaoyong.xu@adelaide.edu.au

Abstract

Perovskite oxides are extensively utilized in energy storage and conversion. However, they are conventionally screened via time-consuming and cost-intensive experimental approaches and density functional theory calculations. Herein, interpretable machine learning is applied to identify perovskite oxides from virtual perovskite-type combinations by constructing classification and regression models to predict their thermodynamic stability and energy above the convex hull (E_h), respectively, and by interpreting the models using SHapley Additive exPlanations. The highest occupied molecular orbital energy and the elastic modulus of the B-site elements of perovskite oxides are the top two features for stability prediction, whereas the stability label and features involving the elastic modulus and ionic radius are crucial for E_h regression. The classification model, which displays an accuracy of 0.919, precision of 0.937, F1-score of 0.932, and recall of 0.935, screens 682 143 stable perovskite oxides from 1 126 668 virtual perovskite-type combinations. The E_h values of the predicted stable perovskites are then predicted by a regression model with a coefficient of determination of 0.916 and a root mean square error of 24.2 meV atom⁻¹. Good agreement is observed between the E_h values predicted by the regression model and those calculated by density functional theory.
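For readers unfamiliar with the reported scores, the following is a minimal sketch of how accuracy, precision, recall, F1-score, the coefficient of determination, and the root mean square error are conventionally computed. It is not the authors' code: the function names `classification_metrics` and `regression_metrics` are hypothetical, the toy labels and E_h values are invented purely for illustration, and the standard binary definitions are assumed (the abstract does not state whether any averaging scheme was used).

```python
import math


def classification_metrics(y_true, y_pred):
    """Binary metrics, assuming label 1 = stable and 0 = unstable."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """Coefficient of determination (R^2) and RMSE, in the units of y."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return {"r2": 1.0 - ss_res / ss_tot, "rmse": math.sqrt(ss_res / n)}


# Toy data, invented for illustration only (not taken from the paper).
labels_true = [1, 1, 1, 0, 0, 1, 0, 1]
labels_pred = [1, 1, 0, 0, 1, 1, 0, 1]
eh_true = [0.0, 10.0, 20.0, 30.0]   # hypothetical reference E_h, meV per atom
eh_pred = [2.0, 8.0, 22.0, 28.0]    # hypothetical model output, meV per atom

clf_scores = classification_metrics(labels_true, labels_pred)
reg_scores = regression_metrics(eh_true, eh_pred)
print(clf_scores)
print(reg_scores)
```

In a two-stage screen of the kind described above, the classification scores would be evaluated on held-out stability labels, and the regression scores on held-out E_h values for the compounds carried forward.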