Data-dependent normalization strategies for untargeted metabolomics - a case study

Anal Bioanal Chem. 2020 Sep;412(24):6391-6405. doi: 10.1007/s00216-020-02594-9. Epub 2020 Apr 14.

Abstract

Despite recent advances in the standardization of untargeted metabolomics workflows, specific data treatment strategies still receive too little attention. These strategies require deep knowledge of the biological problem and should be applied only after a well-thought-out process to understand their effect. One such strategy is data normalization. Data-driven assumptions are critical, especially when addressing unwanted variation present in the biological model, as can be the case in heterogeneous tissues, cells of different sizes, or biofluids of different concentrations. Chronic kidney disease (CKD) is a widespread disorder affecting kidney structure and function. Animal models are being developed to gain valuable insights into the etiopathogenesis of the condition and the effect of treatments. Moreover, diagnosis and disease staging still require the definition of appropriate biomarkers. Untargeted metabolomics has the potential to address these challenges. Renal fibrosis is one of the consequences of kidney injury, and it greatly affects the concentration of metabolites in the same quantity of sample. To overcome this challenge, several data normalization strategies were applied, following a multilevel normalization method, with the overall aim of focussing on the relevant biological information and reducing the influence of disturbing factors. A comprehensive evaluation of the performance of the normalization strategies is provided, covering both methods that assess intragroup variation and the impact on differential analysis. Finally, we present evidence of the importance of normalization methods guided by the biological model and discuss multiple criteria that need to be taken into consideration to obtain robust and reliable data. We raise particular concern about the misleading conclusions that can result from inappropriate data pre-treatment solutions applied to untargeted methods.

Keywords: Biomarker discovery; Capillary electrophoresis mass spectrometry; Data pre-treatment; Normalization; Tissue samples; Unwanted variation.

MeSH terms

  • Animals
  • Discriminant Analysis
  • Disease Models, Animal
  • Humans
  • Kidney / metabolism*
  • Least-Squares Analysis
  • Male
  • Metabolome
  • Metabolomics / methods*
  • Mice, Inbred C57BL
  • Mice, Transgenic
  • Renal Insufficiency, Chronic / metabolism*