Matrix Linear Models for connecting metabolite composition to individual characteristics

bioRxiv [Preprint]. 2023 Dec 20:2023.12.19.572450. doi: 10.1101/2023.12.19.572450.

Abstract

High-throughput metabolomics data provide a detailed molecular window into biological processes. We consider the problem of assessing how the association of metabolite levels with individual (sample) characteristics, such as sex or treatment, may depend on metabolite characteristics such as pathway. Typically this is done in a two-step process: in the first step, the association of each metabolite with individual characteristics is assessed; in the second step, an enrichment analysis by metabolite characteristics is performed among the significant associations. We combine the two steps using a bilinear model based on the matrix linear model (MLM) framework we have previously developed for high-throughput genetic screens. Our framework can estimate relationships among metabolites that share known characteristics, whether categorical (such as type of lipid or pathway) or numerical (such as the number of double bonds in triglycerides). We demonstrate how MLM offers flexibility and interpretability by applying our method to three metabolomic studies. We show that our approach can separate the contributions of overlapping triglyceride characteristics, such as the number of double bonds and the number of carbon atoms. The proposed method has been implemented in the open-source Julia package MatrixLM. Data analysis scripts with example analyses are also available.
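
The bilinear idea can be illustrated with a minimal sketch. The abstract does not state the model equation, so the form Y = X B Zᵀ + E used here is an assumption (Y: samples by metabolites, X: sample characteristics, Z: metabolite characteristics, B: interaction coefficients), as is the plain least-squares estimator. The function name `fit_mlm` and the toy data are hypothetical, and this is not the MatrixLM API.

```python
import numpy as np

def fit_mlm(Y, X, Z):
    """Least-squares estimate of B in the assumed model Y = X B Z^T + E.

    Y: (n samples x m metabolites) response matrix
    X: (n x p) sample-characteristic design matrix
    Z: (m x q) metabolite-characteristic design matrix
    Returns B_hat with shape (p x q).
    """
    # B_hat = (X^T X)^{-1} X^T Y Z (Z^T Z)^{-1}
    left = np.linalg.solve(X.T @ X, X.T @ Y)           # (p x m)
    # Z^T Z is symmetric, so right-multiplying by its inverse
    # equals solving on the transposed product.
    B_hat = np.linalg.solve(Z.T @ Z, (left @ Z).T).T   # (p x q)
    return B_hat

# Toy data (hypothetical): 6 samples, 4 triglycerides.
# X columns: intercept, sex indicator.
X = np.array([[1, 0], [1, 1], [1, 0], [1, 1], [1, 0], [1, 1]], dtype=float)
# Z columns: intercept, number of double bonds.
Z = np.array([[1, 0], [1, 1], [1, 2], [1, 3]], dtype=float)

# B[1, 1] is the sex-by-double-bond interaction, i.e. how the
# sex effect changes with each additional double bond.
B_true = np.array([[1.0, 0.5], [2.0, -0.3]])
Y = X @ B_true @ Z.T  # noise-free, so the fit recovers B_true

B_hat = fit_mlm(Y, X, Z)
print(np.round(B_hat, 3))
```

In this sketch a single fit yields one coefficient per pair of sample and metabolite characteristics, which is what lets the association and enrichment steps be combined; with real data, Y would contain noise and the coefficients would need standard errors or permutation tests before interpretation.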

Keywords: Julia language; bilinear models; high-throughput data; lipidomics; metabolomics.

Publication types

  • Preprint