A post processing strategy to score and rank the annotation confidence of saponins in natural products by integrating MS2 spectral similarity and fragment interpretation

J Pharm Biomed Anal. 2021 Sep 10:204:114291. doi: 10.1016/j.jpba.2021.114291. Epub 2021 Jul 30.

Abstract

Annotating natural products from tandem mass spectrometry (MS2) spectra is challenging because structural characterization is often ambiguous, and an efficient method to score and rank annotation confidence is still lacking. Herein, we develop a novel approach to rank the annotation confidence of saponins. Annotations were made according to fragmentation patterns, and the corresponding diagnostic fragments and their abundances were recorded. The average abundances were taken as a reference spectrum, and the cosine similarity score (CSS) was calculated to measure how well each spectrum matched that reference. The CSS values provide a statistical description of the confidence levels. Next, the fragment interpretation score (FIS) was proposed to investigate the characteristic fragmentation of the deviators. FIS offsets the effect of the deviators' unique fragments. Suspicious annotations with low CSS and high FIS may derive from MS2 spectral background interference or co-elution. Annotations with both low CSS and low FIS are ranked as low confidence and need more attention. Using this method, novel saccharide sequences, specific fragmentation preferences, undistinguished precursors, and even new structures can also be traced. With the proposed scoring system, annotations can be ranked by confidence, significantly enhancing annotation reliability.
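The CSS step described above can be illustrated with a minimal sketch. It assumes that each annotation is represented by a vector of abundances over the same ordered set of diagnostic fragments and that the reference spectrum is the plain arithmetic mean of those vectors. The function names, annotation labels, and abundance values are hypothetical and are not taken from the paper. The FIS is not sketched because the abstract does not define how it is computed.

```python
import math


def mean_reference(spectra):
    """Average each diagnostic fragment's abundance across all spectra."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def cosine_similarity(a, b):
    """Cosine of the angle between two abundance vectors (0.0 if either is all zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical relative abundances of three diagnostic fragments per annotation.
spectra = {
    "annotation_1": [100.0, 40.0, 10.0],
    "annotation_2": [90.0, 50.0, 15.0],
    "annotation_3": [5.0, 20.0, 100.0],
}

# Reference spectrum: mean abundance of each fragment (assumed definition).
reference = mean_reference(list(spectra.values()))

# CSS of every annotation against the reference spectrum.
css = {name: cosine_similarity(vec, reference) for name, vec in spectra.items()}

# Lowest CSS first: these are the candidates that deviate most from the reference.
ranked = sorted(css, key=css.get)
for name in ranked:
    print(f"{name}: CSS = {css[name]:.3f}")
```

In this toy data set the third annotation has a very different abundance profile, so it receives the lowest CSS and is listed first as the strongest deviator. Whether such a deviator is suspicious or simply low confidence would then depend on its FIS, as the abstract describes.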

Keywords: Annotation confidence; Fragment interpretation; Mass spectrometry; Natural products; Spectral similarity.

MeSH terms

  • Biological Products*
  • Reproducibility of Results
  • Saponins*
  • Tandem Mass Spectrometry

Substances

  • Biological Products
  • Saponins