Automatic determination of the spectrum-structure relationship by tree structure-based unsupervised and supervised learning

Ultramicroscopy. 2022 Mar:233:113438. doi: 10.1016/j.ultramic.2021.113438. Epub 2021 Dec 4.

Abstract

Spectroscopy is widely used to obtain chemical, vibrational, and bonding information. Spectral features have conventionally been interpreted by comparing the spectra of interest with reference spectra from experiments or simulations. However, interpretation by humans is not always straightforward, especially for spectra obtained from unknown or new materials. In the present study, we developed a method that uses machine learning techniques to obtain human-like interpretations automatically. We combined unsupervised and supervised learning methods and applied the combination to a spectrum database that includes more than 400 spectra of water and organic molecules containing various ligands and chemical bonds. The proposed method successfully found correlations between the spectral features and descriptors of the atoms, bonds, and ligands. We demonstrated that it enables the automatic determination of reasonable spectrum-structure relationships, such as that between the π* resonance in C-K edges and multiple bonds. These physically and chemically reasonable relationships are determined without arbitrariness and in a data-driven manner, which is considerably difficult with simulation or conventional machine learning techniques alone. Such relationships are useful for understanding which structural parameters cause changes in the spectrum, providing a way to better interpret spatially distributed or time-evolving data. Furthermore, although the present work focused on ELNES/XANES spectra from small organic molecules, the proposed method can be readily extended to other spectral data. It is expected to contribute to a better understanding of the spectrum-structure relationship in various spectroscopy applications.
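
The abstract does not name the specific algorithms, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the unsupervised step is an agglomerative (tree-like) clustering of spectra and the supervised step is a decision tree that explains the resulting clusters from structural descriptors. The spectra, the descriptor names (`has_multiple_bond`, `has_oxygen`), and the helper functions are all hypothetical and synthetic. They are not taken from the paper or its database.

```python
import math

# Synthetic toy data (NOT from the paper): each spectrum is a short intensity
# vector, and each molecule carries hypothetical binary structural descriptors.
spectra = [
    [0.90, 0.10, 0.30, 0.20],  # strong first peak (stand-in for a pi*-like feature)
    [0.80, 0.20, 0.30, 0.30],
    [0.85, 0.15, 0.20, 0.25],
    [0.10, 0.20, 0.70, 0.60],  # weak first peak
    [0.05, 0.30, 0.80, 0.50],
    [0.10, 0.25, 0.75, 0.55],
]
descriptors = [
    {"has_multiple_bond": 1, "has_oxygen": 0},
    {"has_multiple_bond": 1, "has_oxygen": 1},
    {"has_multiple_bond": 1, "has_oxygen": 0},
    {"has_multiple_bond": 0, "has_oxygen": 1},
    {"has_multiple_bond": 0, "has_oxygen": 0},
    {"has_multiple_bond": 0, "has_oxygen": 1},
]


def distance(a, b):
    """Euclidean distance between two spectra of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative_clusters(data, n_clusters):
    """Unsupervised step: average-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters until n_clusters remain,
    then returns one integer cluster label per sample.
    """
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(distance(data[p], data[q])
                            for p in clusters[i] for q in clusters[j])
                avg = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    labels = [0] * len(data)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(descs, labels):
    """Supervised step: a depth-1 decision tree over binary descriptors.

    Returns the descriptor whose 0/1 split gives the lowest weighted Gini
    impurity of the cluster labels, together with that impurity.
    """
    best_name, best_score = None, float("inf")
    for name in descs[0]:
        left = [l for d, l in zip(descs, labels) if d[name] == 0]
        right = [l for d, l in zip(descs, labels) if d[name] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_name, best_score = name, score
    return best_name, best_score


cluster_labels = agglomerative_clusters(spectra, n_clusters=2)
split_name, split_impurity = best_split(descriptors, cluster_labels)
print(cluster_labels, split_name, split_impurity)
```

On this toy data, the descriptor chosen by the split is the one that best separates the spectral clusters, which is the kind of spectrum-structure link the abstract describes. A real analysis would use a full decision tree and many more descriptors.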

Keywords: ELNES/XANES; Informatics; Machine learning.