GIPMA: Global Intensity-Guided Peak Matching and Alignment for 2D 1H-13C HSQC-Based Metabolomics

Anal Chem. 2023 Feb 14;95(6):3195-3203. doi: 10.1021/acs.analchem.2c03323. Epub 2023 Feb 2.

Abstract

Two-dimensional (2D) 1H-13C heteronuclear single quantum coherence (HSQC) spectroscopy has been increasingly applied to metabolomics studies because it offers much greater resolving capability than one-dimensional (1D) 1H NMR. However, preprocessing methods for 2D NMR-based metabolomics, such as peak matching and alignment tools, have lagged behind their counterparts for 1D 1H NMR-based metabolomics. Correct matching and alignment of 2D NMR spectral features across multiple samples are particularly important for subsequent multivariate data analysis. To account for the different intensity dynamic ranges of the various metabolites and for chemical shift variation across the spectra of multiple samples, we developed an efficient peak matching and alignment algorithm for 2D 1H-13C HSQC-based metabolomics, called global intensity-guided peak matching and alignment (GIPMA). In GIPMA, the peaks identified in all spectra are pooled together and sorted by intensity. The chemical shift of a stronger peak is regarded as more accurate and reliable than that of a weaker peak. The strongest undesignated peak is chosen as the reference of a new peak cluster (PC) if it does not lie within the chemical shift tolerance of any existing PC; otherwise, it is matched to an existing PC, and the aligned chemical shift of that PC is updated to the intensity-weighted average of the chemical shifts of all peaks in the cluster. Setting an optimum chemical shift tolerance (Δδo) is critical for peak matching and alignment across multiple samples. GIPMA dynamically searches for and selects the Δδo that maximizes the number of valid peak clusters (vPC), that is, spectral features, among multiple samples.
With GIPMA, fully automatic peakwise matching and alignment require no spectrum as an initial reference, and the chemical shift of each PC is updated to the intensity-weighted average of the chemical shifts of all peaks in that PC, which is expected to be statistically more accurate. Accurate chemical shifts for each representative spectral feature facilitate subsequent peak assignment and are essential for correct metabolite identification and interpretation of results. The proposed method was demonstrated successfully on the spectra of six model mixtures consisting of seven typical metabolites, yielding correct matching of all known spectral features. The performance of GIPMA was also demonstrated on 2D 1H-13C HSQC spectra of 87 real extracts of 29 samples of five Dendrobium species. Hierarchical cluster analysis (HCA) and principal component analysis (PCA) of the 87 spectra matched and aligned by GIPMA correctly classified the 29 samples into five groups. In summary, GIPMA provides a practical peak matching and alignment method to facilitate 2D NMR-based metabolomics studies.
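
The matching procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the names Peak, Cluster, match_peaks, is_valid and select_tolerance are hypothetical; the tolerance is taken as a separate window in the 1H and 13C dimensions; a peak that falls within tolerance of several clusters is given to the nearest one; a cluster is treated as "valid" when it holds at most one peak per sample and covers at least min_samples samples; and the tolerance search is a plain scan over a user-supplied list of candidates.

```python
from dataclasses import dataclass, field


@dataclass
class Peak:
    sample: int       # index of the spectrum the peak came from
    h_ppm: float      # 1H chemical shift
    c_ppm: float      # 13C chemical shift
    intensity: float


@dataclass
class Cluster:
    h_ppm: float
    c_ppm: float
    peaks: list = field(default_factory=list)

    def add(self, peak):
        # Update the aligned shift as the intensity-weighted average
        # of all peaks currently in the cluster.
        self.peaks.append(peak)
        total = sum(p.intensity for p in self.peaks)
        self.h_ppm = sum(p.h_ppm * p.intensity for p in self.peaks) / total
        self.c_ppm = sum(p.c_ppm * p.intensity for p in self.peaks) / total


def match_peaks(peaks, tol_h, tol_c):
    """Greedy matching: pool all peaks and visit them strongest first."""
    clusters = []
    for peak in sorted(peaks, key=lambda p: p.intensity, reverse=True):
        near = [c for c in clusters
                if abs(peak.h_ppm - c.h_ppm) <= tol_h
                and abs(peak.c_ppm - c.c_ppm) <= tol_c]
        if near:
            # Assumption: ties between clusters go to the nearest one,
            # measured in tolerance-scaled units.
            best = min(near, key=lambda c:
                       ((peak.h_ppm - c.h_ppm) / tol_h) ** 2
                       + ((peak.c_ppm - c.c_ppm) / tol_c) ** 2)
            best.add(peak)
        else:
            # The strongest unmatched peak becomes the reference of a new cluster.
            new = Cluster(peak.h_ppm, peak.c_ppm)
            new.add(peak)
            clusters.append(new)
    return clusters


def is_valid(cluster, min_samples):
    # Assumed validity rule (not taken from the abstract):
    # no sample contributes twice, and enough samples are represented.
    samples = [p.sample for p in cluster.peaks]
    return len(set(samples)) == len(samples) and len(samples) >= min_samples


def select_tolerance(peaks, candidates, min_samples):
    """Scan candidate (tol_h, tol_c) pairs; keep the one with most valid clusters."""
    best_tol, best_clusters, best_count = None, [], -1
    for tol_h, tol_c in candidates:
        clusters = match_peaks(peaks, tol_h, tol_c)
        count = sum(is_valid(c, min_samples) for c in clusters)
        if count > best_count:
            best_tol, best_clusters, best_count = (tol_h, tol_c), clusters, count
    return best_tol, best_clusters, best_count
```

In this sketch, a tolerance that is too tight splits one spectral feature into several single-sample clusters, while a tolerance that is too loose merges distinct features so that one sample contributes more than one peak to a cluster. Both cases lower the valid-cluster count, which is why scanning for the maximum is a plausible way to pick the tolerance.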

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Magnetic Resonance Spectroscopy / methods
  • Metabolomics* / methods