Instrumental signals from samples of unknown overall concentration cannot be compared or analysed directly. Differences in overall concentration must first be removed in the data normalization step. The choice of normalization method has a profound effect on the final results of data analysis, and especially on biomarker identification. One possible way to deal with this 'size effect' is to work with size-independent (log) ratios instead of the original variables. In the present study, the performance of two log-ratio methods, pairwise log-ratio (plr) and centered log-ratio (clr), is discussed for real and simulated data sets with different characteristics. It was found that the clr method can spread local differences across the entire signal and, as such, should be avoided in all studies aiming to identify biomarkers.
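The contrast between the two transformations can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function names `clr` and `plr`, the toy signals and the scaling factors are assumptions made for illustration only. The sketch uses the standard definitions, in which clr subtracts the mean of the log-transformed variables of a sample and plr takes log(x_i / x_j) for every pair i < j.

```python
import numpy as np

def clr(x):
    """Centered log-ratio: log of each variable minus the sample's mean log."""
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)

def plr(x):
    """Pairwise log-ratios log(x_i / x_j) for all pairs i < j."""
    logx = np.log(x)
    i, j = np.triu_indices(x.shape[-1], k=1)
    return logx[..., i] - logx[..., j]

# Hypothetical 5-variable signal (illustrative values, not data from the study).
a = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

# Second sample: a single local difference (variable 2 is four times higher) ...
b = a.copy()
b[2] *= 4.0
# ... measured at a different, unknown overall concentration (the 'size effect').
b_scaled = 3.0 * b

# Both transformations remove the overall concentration factor.
size_removed_clr = np.allclose(clr(b_scaled), clr(b))
size_removed_plr = np.allclose(plr(b_scaled), plr(b))

# Differences between the two samples after each transformation.
clr_diff = clr(b_scaled) - clr(a)
plr_diff = plr(b_scaled) - plr(a)

# Count how many transformed variables differ between the samples.
n_changed_clr = np.count_nonzero(~np.isclose(clr_diff, 0.0))
n_changed_plr = np.count_nonzero(~np.isclose(plr_diff, 0.0))

print("clr variables changed:", n_changed_clr, "of", clr_diff.size)
print("plr ratios changed:", n_changed_plr, "of", plr_diff.size)
```

In this toy case the single altered variable shifts the mean log used by clr, so all five clr-transformed variables differ between the samples. Under plr, only the four ratios that involve the altered variable change, and the remaining six stay the same.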
Keywords: Biomarkers identification; Normalization; Pairwise log-ratio; Pre-processing; Size effect.
Copyright © 2019 Elsevier B.V. All rights reserved.