Towards a machine-readable literature: finding relevant papers based on an uploaded powder diffraction pattern

Acta Crystallogr A Found Adv. 2022 Sep 1;78(Pt 5):386-394. doi: 10.1107/S2053273322007483. Epub 2022 Aug 19.

Abstract

A prototype application for machine-readable literature is investigated. The program is called pyDataRecognition and serves as an example of a data-driven literature search, where the literature search query is an experimental data set provided by the user. The user uploads a powder pattern together with the radiation wavelength. The program compares the user data to a database of existing powder patterns associated with published papers and produces a list of database entries ranked by similarity score. The program returns the digital object identifier and full reference of the top-ranked papers together with a stack plot of the user data alongside the top five database entries. The paper describes the approach and explores successes and challenges.
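The compare-and-rank step described in the abstract can be illustrated with a minimal sketch. It is not the pyDataRecognition implementation. The abstract does not name the similarity metric, so this sketch assumes a Pearson correlation computed on a shared grid in Q = 4π sin(θ)/λ, which is one plausible reason the wavelength is requested. The function names, the dictionary layout of the database entries and the grid size are all hypothetical.

```python
import numpy as np


def two_theta_to_q(two_theta_deg, wavelength):
    """Convert 2-theta in degrees to Q = 4*pi*sin(theta)/lambda."""
    theta = np.radians(np.asarray(two_theta_deg, dtype=float) / 2.0)
    return 4.0 * np.pi * np.sin(theta) / wavelength


def similarity(q_user, i_user, q_ref, i_ref, n_points=500):
    """Assumed metric: Pearson correlation over the overlapping Q range."""
    q_min = max(q_user.min(), q_ref.min())
    q_max = min(q_user.max(), q_ref.max())
    if q_min >= q_max:
        return -1.0  # no overlap: treat as the worst possible score
    # Interpolate both patterns onto one common grid before comparing.
    grid = np.linspace(q_min, q_max, n_points)
    a = np.interp(grid, q_user, i_user)
    b = np.interp(grid, q_ref, i_ref)
    return float(np.corrcoef(a, b)[0, 1])


def rank_entries(q_user, i_user, database, top_n=5):
    """Return (score, doi) pairs for the top_n most similar entries.

    `database` is a hypothetical list of dicts with keys
    "q", "intensity" and "doi".
    """
    scored = [
        (similarity(q_user, i_user, entry["q"], entry["intensity"]), entry["doi"])
        for entry in database
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]
```

Converting to Q before comparison lets patterns measured at different wavelengths be compared directly. A real system would also need to handle background, peak broadening and intensity scaling, which this sketch ignores.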

Keywords: CIF; data similarity; data-driven literature search; machine-readable scientific literature; powder diffraction.

MeSH terms

  • Databases, Factual
  • Powder Diffraction
  • Powders
  • Publications*

Substances

  • Powders