Investigation of the chemical compounds in Pheretima aspergillum (E. Perrier) using a combination of mass spectral molecular networking and unsupervised substructure annotation topic modeling together with in silico fragmentation prediction

J Pharm Biomed Anal. 2020 May 30:184:113197. doi: 10.1016/j.jpba.2020.113197. Epub 2020 Feb 20.

Abstract

Untargeted mass spectrometry analysis is one of the most challenging and important steps in the rapid structural elucidation of the highly complex and diverse constituents of traditional Chinese medicine (TCM). In particular, identifying unknown compounds is laborious and time-consuming. Herein, a workflow is proposed to expedite the annotation of chemical structures in Pheretima aspergillum (E. Perrier) (Di-Long, DL). First, ultra-high-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry (UHPLC-QTOFMS) was used to acquire untargeted mass spectral data. Then, the spectral data were uploaded to the Global Natural Products Social Molecular Networking (GNPS) platform to create a molecular network and to extract Mass2Motifs (co-occurring fragments and neutral losses) using unsupervised substructure annotation topic modeling (MS2LDA). Finally, structural analysis was performed by combining MS2LDA with mass spectral molecular networking and in silico fragmentation prediction. As a result, a total of 124 compounds from DL were characterized, of which 89 (7 furan sulfonic acids, 57 phospholipids and 25 carboxamides) were identified as potentially new compounds from DL. These results significantly improve the understanding of the chemical composition of DL and provide a solid scientific basis for future studies of its quality control, pharmacology and underlying mechanisms. Moreover, this is the first time the proposed workflow has been used to accelerate the annotation of unknown molecules from TCM. The workflow should also increase the efficiency of characterizing the 'unknown knowns' and elucidating the 'unknown unknowns' in TCM, which are crucial steps in discovering natural product drugs from TCM.
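
As a rough illustration of two ideas named in the abstract, the Python sketch below derives neutral losses from a precursor and its fragments (the kind of feature a Mass2Motif groups together) and scores two MS/MS spectra with a plain cosine similarity (the kind of score that links nodes in a molecular network). It is a minimal sketch under stated assumptions, not the GNPS or MS2LDA implementation: GNPS uses a modified cosine that also matches peaks shifted by the precursor mass difference, and MS2LDA fits a topic model rather than listing losses. The function names, the 0.02 m/z tolerance, the 1.0 Da minimum loss and the example peaks are all hypothetical.

```python
import math


def neutral_losses(precursor_mz, fragment_mzs, min_loss=1.0):
    """Return precursor-minus-fragment differences of at least min_loss.

    min_loss is an assumed cutoff that drops the unfragmented precursor peak.
    """
    losses = []
    for mz in fragment_mzs:
        loss = precursor_mz - mz
        if loss >= min_loss:
            losses.append(round(loss, 4))
    return losses


def cosine_score(spec_a, spec_b, tol=0.02):
    """Plain cosine similarity between two spectra.

    Each spectrum is a list of (mz, intensity) tuples. Peaks are paired
    greedily, highest intensity product first, when their m/z values differ
    by at most tol. Each peak is used at most once.
    """
    candidates = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= tol:
                candidates.append((int_a * int_b, i, j))
    candidates.sort(reverse=True)

    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            dot += product
            used_a.add(i)
            used_b.add(j)

    norm_a = math.sqrt(sum(inten ** 2 for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten ** 2 for _, inten in spec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical peaks, chosen only to exercise the functions.
    spectrum_1 = [(100.0, 1.0), (150.0, 2.0), (282.0, 0.5)]
    spectrum_2 = [(100.01, 1.0), (150.0, 2.0), (200.0, 0.5)]
    print(neutral_losses(300.0, [mz for mz, _ in spectrum_1]))
    print(cosine_score(spectrum_1, spectrum_2))
```

In a networking setting, pairs of spectra whose score exceeds a chosen threshold would be connected by an edge; the threshold itself is a user parameter and is not given in the abstract.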

Keywords: In silico prediction; MS2LDA; Mass spectral molecular networking; Pheretima aspergillum (E. Perrier); UHPLC-QTOFMS.

MeSH terms

  • Biological Factors / chemistry*
  • Chromatography, High Pressure Liquid / methods
  • Computer Simulation
  • Medicine, Chinese Traditional / methods
  • Quality Control
  • Tandem Mass Spectrometry / methods*
  • Workflow

Substances

  • Biological Factors