POMFinder: identifying polyoxometallate cluster structures from pair distribution function data using explainable machine learning

J Appl Crystallogr. 2024 Feb 1;57(Pt 1):34-43. doi: 10.1107/S1600576723010014.

Abstract

Characterization of a material structure with pair distribution function (PDF) analysis typically involves refining a structure model against an experimental data set. However, finding or constructing a suitable atomic model for PDF modelling can be extremely labour intensive, requiring careful browsing through large numbers of possible models. Presented here is POMFinder, a machine learning (ML) classifier that rapidly screens a database of structures, here polyoxometallate (POM) clusters, to identify candidate structures for PDF data modelling. The approach is shown to identify suitable POMs from experimental data, including in situ data collected with fast acquisition times. This automated approach has significant potential for identifying suitable models for structure refinement to extract quantitative structural parameters in materials chemistry research. POMFinder is open source and user friendly, making it accessible to those without prior ML knowledge. It is also demonstrated that POMFinder offers a promising framework for the combined modelling of data from multiple scattering techniques.
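To make the idea of screening a structure database against PDF-like data concrete, the sketch below ranks a few candidate structures by how well a crude pair-distance histogram matches a target signal. It is not POMFinder's method or API: the abstract describes a trained ML classifier, whereas this illustration substitutes a plain Pearson-correlation ranking. The function names, the toy `database` and the histogram used as a stand-in for a PDF are all assumptions made for illustration.

```python
import math
from itertools import combinations


def distance_histogram(coords, r_max=10.0, n_bins=100):
    """Crude PDF proxy (assumption): normalised histogram of interatomic distances."""
    hist = [0.0] * n_bins
    width = r_max / n_bins
    for a, b in combinations(coords, 2):
        r = math.dist(a, b)
        if r < r_max:
            # Clamp the index to guard against floating-point rounding at the edge.
            hist[min(int(r / width), n_bins - 1)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_candidates(measured, database, top_n=3):
    """Score every candidate against the measured signal; return the best top_n."""
    scores = [
        (name, pearson(measured, distance_histogram(coords)))
        for name, coords in database.items()
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_n]


# Hypothetical toy "database" of atomic coordinates (not real POM clusters).
database = {
    "tetrahedron": [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)],
    "square": [(0, 0, 0), (2, 0, 0), (2, 2, 0), (0, 2, 0)],
    "chain": [(0, 0, 0), (2, 0, 0), (4, 0, 0), (6, 0, 0)],
}

# Pretend the "measurement" came from the square arrangement.
measured = distance_histogram(database["square"])
ranking = rank_candidates(measured, database)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting the structure that generated the signal ranks first. A real workflow would compare properly simulated PDFs with experimental data and would pass the top candidates on to structure refinement.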

Keywords: POMFinder; computational modelling; machine learning; polyoxometallate clusters.

Grants and funding

This work is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme (grant agreement No. 804066). We are grateful to the Villum Foundation for financial support through a Villum Young Investigator grant (VKR00015416). Funding from the Danish Ministry of Higher Education and Science through the SMART Lighthouse is gratefully acknowledged. We also thank DANSCATT (supported by the Danish Agency for Science and Higher Education) for support. A. S. Anker and M. Juelsholt acknowledge the Siemens Foundation for support for their thesis projects. We acknowledge the MAX IV Laboratory for time on Beamline DanMAX under proposal 20200731. Research conducted at MAX IV is supported by the Swedish Research Council under contract 2018-07152, the Swedish Governmental Agency for Innovation Systems under contract 2018-04969 and Formas under contract 2019-02496. DanMAX is funded by NUFI (grant No. 4059-00009B).