Norm ISWSVR: A Data Integration and Normalization Approach for Large-Scale Metabolomics

Anal Chem. 2022 May 31;94(21):7500-7509. doi: 10.1021/acs.analchem.1c05502. Epub 2022 May 18.

Abstract

Large-scale, long-duration metabolomics studies are especially susceptible to systematic errors from many sources, which lead to nonreproducibility and poor data quality. A reliable and robust batch correction method removes unwanted systematic variation and improves the statistical power of metabolomics data, making batch correction an important issue in metabolomics quality control. This study proposes a novel data normalization and integration method, Norm ISWSVR, a two-step approach that combines best-performance internal standard correction with support vector regression normalization to comprehensively remove systematic and random errors and matrix effects. The method was tested on three untargeted lipidomics or metabolomics datasets, and its performance was systematically compared with that of 11 other normalization methods. Norm ISWSVR decreased the median cross-validated relative standard deviation (cvRSD) of the data, increased the correlation between quality control samples (QCs), improved the classification accuracy of biomarkers, and was compatible with quantitative data. More importantly, Norm ISWSVR tolerates a low frequency of QCs, which could significantly reduce the burden of a large-scale experiment. Overall, Norm ISWSVR improves the data quality of large-scale metabolomics studies.
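
The first of the two steps, internal standard (IS) correction, can be illustrated with a minimal sketch. The abstract does not define "best-performance", so the code below assumes it means choosing, for each feature, the IS whose ratio correction gives the lowest RSD across the QC injections. The function names (`rsd`, `is_correct`, `best_is_correction`), the IS labels, and all intensities are hypothetical and are not taken from the paper. The sketch reports a plain QC RSD rather than the cross-validated cvRSD, and it does not cover the support vector regression step.

```python
from statistics import mean, median, stdev


def rsd(values):
    """Relative standard deviation, in percent."""
    return 100.0 * stdev(values) / mean(values)


def is_correct(feature, standard):
    """Divide each injection by its IS signal, rescaled by the mean IS signal."""
    scale = mean(standard)
    return [f / s * scale for f, s in zip(feature, standard)]


def best_is_correction(feature, standards, qc_idx):
    """Return (IS name, corrected values, QC RSD) for the IS with the lowest QC RSD.

    Falls back to the uncorrected feature (name None) if no IS improves on it.
    This selection rule is an assumption, not the paper's stated procedure.
    """
    best = (None, list(feature), rsd([feature[i] for i in qc_idx]))
    for name, standard in standards.items():
        corrected = is_correct(feature, standard)
        qc_rsd = rsd([corrected[i] for i in qc_idx])
        if qc_rsd < best[2]:
            best = (name, corrected, qc_rsd)
    return best


# Toy run of six injections; QCs sit at positions 0, 2, 4 and 5 (made-up data).
qc_idx = [0, 2, 4, 5]
standards = {
    "IS_A": [1000, 900, 800, 700, 600, 500],  # tracks the signal drift
    "IS_B": [500, 520, 480, 510, 490, 505],   # unrelated to the drift
}
features = {
    "feature_1": [200, 270, 160, 175, 120, 100],
    "feature_2": [400, 450, 320, 350, 240, 200],
}

raw_rsds = [rsd([v[i] for i in qc_idx]) for v in features.values()]
results = {
    name: best_is_correction(v, standards, qc_idx) for name, v in features.items()
}
corrected_rsds = [r[2] for r in results.values()]

print("median QC RSD before:", round(median(raw_rsds), 1))
print("median QC RSD after: ", round(median(corrected_rsds), 1))
for name, (chosen, _, qc_rsd) in results.items():
    print(name, "->", chosen, round(qc_rsd, 1))
```

In this toy data the drift is purely multiplicative and one IS follows it exactly, so the ratio correction removes it completely. Real data would leave residual variation, which is presumably what the second, regression-based step targets.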

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers
  • Lipidomics*
  • Metabolomics* / methods
  • Quality Control

Substances

  • Biomarkers