Structural annotation of unknown molecules in a miniaturized mass spectrometer based on a transformer enabled fragment tree method

Commun Chem. 2024 May 13;7(1):109. doi: 10.1038/s42004-024-01189-0.

Abstract

Structural annotation of small molecules in tandem mass spectrometry has always been a central challenge in mass spectrometry analysis, especially using a miniaturized mass spectrometer for on-site testing. Here, we propose the Transformer enabled Fragment Tree (TeFT) method, which combines various types of fragmentation tree models and a deep learning Transformer module. It is aimed to generate the specific structure of molecules de novo solely from mass spectrometry spectra. The evaluation results on different open-source databases indicated that the proposed model achieved remarkable results in that the majority of molecular structures of compounds in the test can be successfully recognized. Also, the TeFT has been validated on a miniaturized mass spectrometer with low-resolution spectra for 16 flavonoid alcohols, achieving complete structure prediction for 8 substances. Finally, TeFT confirmed the structure of the compound contained in a Chinese medicine substance called the Anweiyang capsule. These results indicate that the TeFT method is suitable for annotating fragmentation peaks with clear fragmentation rules, particularly when applied to on-site mass spectrometry with lower mass resolution.