Accounting for biological variation with linear mixed-effects modelling improves the quality of clinical metabolomics data

Comput Struct Biotechnol J. 2019 Apr 22;17:611-618. doi: 10.1016/j.csbj.2019.04.009. eCollection 2019.

Abstract

Metabolite profiles from biological samples suffer from both technical variation and subject-specific variation. To improve the quality of metabolomics data, conventional data processing methods can be employed to remove technical variation. However, these methods do not treat sources of subject variation as factors separate from the biological factors of interest. This can be a significant issue when performing quantitative metabolomics in clinical trials or screening for a potential biomarker in early-stage disease, because changes in metabolism or in the signal of a target metabolite are small compared to the total metabolite signal. As a result, inter-individual variability can interfere with subsequent statistical analyses. Here, we propose an additional data processing step that uses linear mixed-effects modelling to readjust each individual metabolite signal prior to multivariate analyses. Published clinical metabolomics data were used to demonstrate and evaluate the proposed method. We observed a substantial reduction in the variation of each metabolite signal after model fitting. A comparison with other strategies showed that our proposed method contributed to improved classification accuracy, precision, sensitivity and specificity. Moreover, we highlight the importance of patient metadata, as it contains rich information on subject characteristics that can be used to model and normalize metabolite abundances. The proposed method is available as an R package, lmm2met.
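
As an illustration of the kind of adjustment described above, a linear mixed-effects model for a single metabolite can be written in a generic form. This is a sketch under stated assumptions, not necessarily the exact specification used by lmm2met: the symbols, the split of covariates into factors of interest and confounders, and the choice of a random effect per subject are all assumed here for illustration.

```latex
% Generic sketch (assumed notation, not taken from the paper).
% y_{ij}     : signal of metabolite j in sample i
% x_i        : fixed effects of interest (e.g. treatment or disease group)
% z_i        : confounding subject covariates taken from patient metadata
% u_{s(i),j} : random effect of the subject s(i) that sample i belongs to
\begin{align}
  y_{ij} &= \mathbf{x}_i^{\top}\boldsymbol{\beta}_j
          + \mathbf{z}_i^{\top}\boldsymbol{\gamma}_j
          + u_{s(i),j} + \varepsilon_{ij}, \\
  u_{s(i),j} &\sim \mathcal{N}\!\left(0, \sigma_{u,j}^{2}\right), \qquad
  \varepsilon_{ij} \sim \mathcal{N}\!\left(0, \sigma_{\varepsilon,j}^{2}\right), \\
  % One possible adjusted signal: subtract the estimated confounder and
  % subject-level contributions, keeping the effect of interest and the residual.
  \tilde{y}_{ij} &= y_{ij} - \mathbf{z}_i^{\top}\hat{\boldsymbol{\gamma}}_j
                  - \hat{u}_{s(i),j}.
\end{align}
```

Under this formulation, the model is fitted separately for each metabolite, and the adjusted values would then be passed on to the multivariate analyses in place of the original signals.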

Keywords: Confounding biological factors; Linear mixed-effects models; Metabolomics; Multivariate analysis; Subject metadata.