Genome Mining in Glass Chemistry Using Linear Component Analysis of Ion Conductivity Data

Adv Sci (Weinh). 2023 Jul;10(21):e2301435. doi: 10.1002/advs.202301435. Epub 2023 May 7.

Abstract

Understanding the multivariate origin of physical properties is particularly complex for polyionic glasses. As a concept, the term genome has been used to describe the entirety of structure-property relations in solid materials, based on functional genes acting as descriptors for a particular property, for example, as input for regression analysis or other machine-learning tools. Here, the genes of ionic conductivity in polyionic sodium-conducting glasses are presented as fictive chemical entities with a characteristic stoichiometry, derived from strong linear component analysis (SLCA) of a uniquely consistent dataset. SLCA is based on a twofold optimization problem that maximizes the quality of linear regression between a property (here: ionic conductivity) and champion candidates from all possible combinations of elements. Family trees and matrix rotation analysis are subsequently used to filter for essential elemental combinations, and the essential genes are then derived from the characteristic mean composition of these combinations. These genes reveal the intrinsic relationships within the multivariate input data. Although the genes do not require a structural representation in real space, it is finally demonstrated how possible structural interpretations agree with the intuitive understanding of structural entities known from spectroscopic experiments.
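The search over "all possible combinations of elements" can be illustrated with a small brute-force sketch. This is not the authors' SLCA implementation: it reads the twofold optimization, as an assumption, as an inner least-squares fit nested inside an outer search for the element subset whose summed content gives the best linear fit. The function names (`r_squared`, `champion_combination`), the element labels, and the synthetic data are all hypothetical and serve only to make the example runnable.

```python
from itertools import combinations

import numpy as np


def r_squared(x, y):
    """Coefficient of determination of the least-squares line y ~ a*x + b."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def champion_combination(contents, elements, prop):
    """Return (best R^2, element subset) over all non-empty element subsets.

    contents: array of shape (n_samples, n_elements), content of each element
    prop:     array of shape (n_samples,), the property to be explained
    """
    best_score, best_combo = -np.inf, ()
    for size in range(1, len(elements) + 1):
        for idx in combinations(range(len(elements)), size):
            # Descriptor: summed content of the elements in this subset.
            x = contents[:, list(idx)].sum(axis=1)
            if x.max() - x.min() < 1e-12:
                continue  # a constant descriptor cannot be regressed against
            score = r_squared(x, prop)
            if score > best_score:
                best_score = score
                best_combo = tuple(elements[i] for i in idx)
    return best_score, best_combo


# Synthetic toy data (assumption, not measured values): the property is built
# to depend linearly on the summed content of two of the four elements.
rng = np.random.default_rng(0)
elements = ["Na", "P", "Al", "O"]
contents = rng.random((40, len(elements)))
prop = 2.0 * (contents[:, 0] + contents[:, 2]) - 5.0 + rng.normal(0.0, 0.01, 40)

score, combo = champion_combination(contents, elements, prop)
print(combo, round(score, 4))
```

The exhaustive loop visits 2^n - 1 subsets, so it is only practical for a modest number of elements. The family-tree and matrix-rotation filtering described in the abstract is not reproduced here.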

Keywords: genome mining; glass; ionic conductivity; materials genome; property predictions.