Highly accurate machine learning prediction of crystal point groups for ternary materials from chemical formula

Sci Rep. 2022 Jan 28;12(1):1577. doi: 10.1038/s41598-022-05642-9.

Abstract

One of the most challenging problems in condensed matter physics is predicting a crystal structure from the chemical formula of the material alone. In this work, we present a robust machine learning (ML) predictor for the crystal point group of ternary materials (A_lB_mC_n), as a first step toward predicting the structure, using a very small set of fundamental ionic and positional features. From an ML perspective, the problem is difficult because it is multi-label and multi-class and the data are imbalanced. The resulting predictions are very reliable, as high balanced accuracies are obtained with different ML methods. Many similarity-based approaches achieve a balanced accuracy above 95%, indicating that the physics is well captured by the reduced set of features, namely the stoichiometry, ionic radii, ionization energies, and oxidation states of each of the three elements in the ternary compound. The accuracy is limited not by the approach but by the small number of data points, and we expect higher prediction accuracy as more reliable data become available.
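As an illustration of two ideas in the abstract, the sketch below assembles a 12-entry feature vector (four properties for each of three elements) and computes balanced accuracy as the mean of per-class recall. It is a minimal sketch, not the authors' code: the function names, the dictionary keys, the placeholder numbers, and the toy labels are all assumptions made for illustration, and the example treats the task as single-label for simplicity, whereas the abstract describes it as multi-label.

```python
from collections import defaultdict

# Hypothetical key names for the four per-element properties named in the abstract.
FEATURE_KEYS = ("stoichiometry", "ionic_radius", "ionization_energy", "oxidation_state")


def feature_vector(elements):
    """Flatten three per-element property dicts into one 12-entry list."""
    if len(elements) != 3:
        raise ValueError("a ternary compound needs exactly three elements")
    return [el[key] for el in elements for key in FEATURE_KEYS]


def plain_accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Placeholder numbers only; these are NOT real elemental data.
compound = [
    {"stoichiometry": 1, "ionic_radius": 1.0, "ionization_energy": 5.0, "oxidation_state": 2},
    {"stoichiometry": 1, "ionic_radius": 0.6, "ionization_energy": 7.0, "oxidation_state": 4},
    {"stoichiometry": 3, "ionic_radius": 1.4, "ionization_energy": 13.0, "oxidation_state": -2},
]
x = feature_vector(compound)

# Toy, made-up labels: an imbalanced set with 8 samples of one point group and 2 of another.
y_true = ["m-3m"] * 8 + ["4/mmm"] * 2
y_pred = ["m-3m"] * 10  # a trivial model that always predicts the majority class

acc = plain_accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(len(x), acc, bal)
```

On this toy set the majority-class predictor looks good under plain accuracy but scores no better than chance under balanced accuracy, which is why balanced accuracy is the more informative metric when classes are imbalanced.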