Composition Based Oxidation State Prediction of Materials Using Deep Learning Language Models

Adv Sci (Weinh). 2023 Oct;10(28):e2301011. doi: 10.1002/advs.202301011. Epub 2023 Aug 7.

Abstract

Oxidation states (OS) are the charges on atoms due to electrons gained or lost upon applying an ionic approximation to their bonds. As a fundamental property, OS has been widely used in charge-neutrality verification, crystal structure determination, and reaction estimation. Currently, only heuristic rules, which have many exceptions, exist for guessing the oxidation states of a given compound. Recent work has developed machine learning models based on heuristic structural features for predicting the oxidation states of metal ions. However, composition-based oxidation state prediction remains elusive, even though it has significant implications for the discovery of new materials whose structures have not been determined. This work proposes BERTOS, a novel deep learning BERT transformer language model for predicting the oxidation states of all elements in inorganic compounds given only their chemical composition. The model achieves 96.82% accuracy for all-element oxidation state prediction benchmarked on the cleaned ICSD dataset and 97.61% accuracy for oxide materials. It is also demonstrated how the model can be used to conduct large-scale screening of hypothetical material compositions for materials discovery.

Keywords: deep learning; language model; material discovery; material screening; neural networks; oxidation states; transformer.
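
Illustrative example: the abstract names charge-neutrality verification as one use of oxidation states. The following is a minimal sketch of such a check, not the BERTOS implementation. The function names (parse_formula, net_charge, is_charge_neutral) are hypothetical, and the sketch assumes a flat formula without parentheses and a single oxidation state per element, so mixed-valence compounds are out of scope.

```python
import re
from typing import Dict


def parse_formula(formula: str) -> Dict[str, float]:
    """Parse a flat formula such as 'Fe2O3' into element counts.

    Assumption: no parentheses or hydrates; a missing amount means 1.
    """
    counts: Dict[str, float] = {}
    # Each match is (element symbol, optional integer or decimal amount).
    for element, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[element] = counts.get(element, 0.0) + (float(amount) if amount else 1.0)
    return counts


def net_charge(formula: str, oxidation_states: Dict[str, float]) -> float:
    """Sum count * oxidation state over all elements in the formula.

    Assumption: one oxidation state per element (no mixed valence).
    """
    counts = parse_formula(formula)
    return sum(n * oxidation_states[el] for el, n in counts.items())


def is_charge_neutral(formula: str, oxidation_states: Dict[str, float],
                      tol: float = 1e-8) -> bool:
    """Return True when the net charge is zero within a small tolerance."""
    return abs(net_charge(formula, oxidation_states)) <= tol


if __name__ == "__main__":
    # Fe2O3 with Fe(+3) and O(-2): 2*3 + 3*(-2) = 0
    print(is_charge_neutral("Fe2O3", {"Fe": 3, "O": -2}))
    # FeO with Fe(+3) and O(-2): 3 - 2 = +1, so not neutral
    print(is_charge_neutral("FeO", {"Fe": 3, "O": -2}))
```

In a screening setting, a check of this kind could be applied to predicted oxidation states to discard hypothetical compositions that cannot be made charge neutral.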