Normalization methods for reducing interbatch effect without quality control samples in liquid chromatography-mass spectrometry-based studies

Alisa O Tokareva; Vitaliy V Chagovets; Alexey S Kononikhin; Natalia L Starodubtseva; Eugene N Nikolaev; Vladimir E Frankevich

doi:10.1007/s00216-021-03294-8

Anal Bioanal Chem. 2021 May;413(13):3479-3486. doi: 10.1007/s00216-021-03294-8. Epub 2021 Mar 24.

Authors

Alisa O Tokareva¹ ² ³, Vitaliy V Chagovets¹, Alexey S Kononikhin¹ ⁴, Natalia L Starodubtseva⁵ ⁶, Eugene N Nikolaev⁴, Vladimir E Frankevich⁷

Affiliations

¹ National Medical Research Center for Obstetrics, Gynecology and Perinatology named after Academician V.I. Kulakov of the Ministry of Healthcare of the Russian Federation, Moscow, 117997, Russia.
² V.L. Talrose Institute for Energy Problems of Chemical Physics, N.N. Semenov Federal Center of Chemical Physics, Russian Academy of Sciences, Moscow, 119334, Russia.
³ Moscow Institute of Physics and Technology, Moscow, 141701, Russia.
⁴ Skolkovo Institute of Science and Technology, Moscow, 121205, Russia.
⁵ National Medical Research Center for Obstetrics, Gynecology and Perinatology named after Academician V.I. Kulakov of the Ministry of Healthcare of the Russian Federation, Moscow, 117997, Russia. n_starodubtseva@oparina4.ru.
⁶ Moscow Institute of Physics and Technology, Moscow, 141701, Russia. n_starodubtseva@oparina4.ru.
⁷ National Medical Research Center for Obstetrics, Gynecology and Perinatology named after Academician V.I. Kulakov of the Ministry of Healthcare of the Russian Federation, Moscow, 117997, Russia. vfrankevich@gmail.com.

PMID: 33760933
DOI: 10.1007/s00216-021-03294-8

Abstract

Data normalization is an essential part of large-scale untargeted mass spectrometry metabolomics analysis. Autoscaling, Pareto scaling, range scaling, and level scaling of liquid chromatography-mass spectrometry data were compared with the most common normalization methods, including quantile normalization, probabilistic quotient normalization, and variance stabilizing normalization. The methods were tested on eight datasets from various clinical studies. Normalization efficiency was assessed in two ways: by the distances between batch clusters and between clinical-group clusters in principal component space, and by the numbers of features that differed significantly in pairwise comparisons between batches and between clinical groups. Autoscaling reduced interbatch variation most effectively and can be preferable to probabilistic quotient or quantile normalization for liquid chromatography-mass spectrometry data.
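The abstract names the four scaling methods but does not give their formulas. The sketch below uses the definitions common in the metabolomics literature: each feature is mean-centered and divided by its standard deviation (autoscaling), the square root of its standard deviation (Pareto), its range (range scaling), or its mean (level scaling). Applying the scaling separately within each batch is an assumption made here for illustration, not a procedure stated in the abstract. The function names `scale_features` and `scale_within_batches` are hypothetical.

```python
import numpy as np


def scale_features(X, method="auto"):
    """Center each feature (column) and divide by a method-specific factor.

    X is a samples-by-features intensity matrix. The formulas follow common
    metabolomics conventions; they are assumed, not quoted from the paper.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)  # sample standard deviation per feature
    factors = {
        "auto": sd,                               # autoscaling (unit variance)
        "pareto": np.sqrt(sd),                    # Pareto scaling
        "range": X.max(axis=0) - X.min(axis=0),   # range scaling
        "level": mean,                            # level scaling
    }
    if method not in factors:
        raise ValueError(f"unknown scaling method: {method!r}")
    # Leave features with a zero scaling factor centered but unscaled.
    factor = np.where(factors[method] == 0, 1.0, factors[method])
    return (X - mean) / factor


def scale_within_batches(X, batches, method="auto"):
    """Apply scale_features independently to the samples of each batch.

    Per-batch application is an assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    batches = np.asarray(batches)
    out = np.empty_like(X)
    for b in np.unique(batches):
        mask = batches == b
        out[mask] = scale_features(X[mask], method)
    return out


if __name__ == "__main__":
    # Synthetic data: two batches, the second with a 3x intensity shift.
    rng = np.random.default_rng(0)
    X = rng.lognormal(size=(20, 5))
    batches = np.repeat([0, 1], 10)
    X[batches == 1] *= 3.0
    Z = scale_within_batches(X, batches, method="auto")
    # Difference between batch means per feature after scaling.
    print(np.abs(Z[batches == 0].mean(axis=0) - Z[batches == 1].mean(axis=0)))
```

With per-batch autoscaling, every feature has zero mean and unit variance inside each batch, so a purely multiplicative or additive batch shift disappears by construction. Whether biological differences between clinical groups survive depends on how the groups are distributed across batches, which is what the cluster-distance and significance-count criteria in the abstract are meant to check.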

Keywords: Interbatch correction; Liquid chromatography-mass spectrometry; Normalization; Scaling.

Grants and funding

18-75-10097/Russian Science Foundation