Multi-omic integration by machine learning (MIMaL)

Bioinformatics. 2022 Oct 31;38(21):4908-4918. doi: 10.1093/bioinformatics/btac631.

Abstract

Motivation: Cells respond to their environments by regulating gene expression to exploit resources optimally. Recent technological advances allow the abundances of RNA, proteins, lipids and metabolites to be measured. These highly complex datasets reflect the states of the different layers of a biological system. Multi-omics integrates these disparate methods and data to gain a clearer picture of the biological state. Multi-omic studies of the proteome and metabolome are becoming more common as mass spectrometry technology continues to be democratized. However, extracting knowledge by integrating these data remains challenging.

Results: Connections between molecules in different omic layers were discovered through a combination of machine learning and model interpretation. The discovered connections reflected protein control (ProC) over metabolites. Proteins discovered to control citrate were mapped onto known genetic and metabolic networks, revealing that these protein regulators are novel. Further, clustering the magnitudes of ProC over all metabolites enabled the prediction of five gene functions, each of which was validated experimentally. Two uncharacterized genes, YJR120W and YDL157C, were accurately predicted to modulate mitochondrial translation. Functions were also predicted and validated for three incompletely characterized genes: SDH9, ISC1 and FMP52. A website enables exploration of the results and MIMaL analysis of user-supplied multi-omic data.
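
The general idea of pairing a predictive model with model interpretation can be illustrated with a minimal sketch. The abstract does not specify the model or the interpretation method, so everything below is an assumption made for illustration: the data are synthetic, the model is a plain linear regression fitted by gradient descent, permutation importance stands in for the interpretation step, and the helper names (fit_linear, mse, permutation_importance) are hypothetical. It is not the MIMaL implementation, which is available at the repository listed under Availability and implementation.

```python
import random


def fit_linear(X, y, lr=0.05, epochs=500):
    """Fit y ~ X.w + b by batch gradient descent on mean squared error."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, target in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, row)) + b - target
            grad_b += err
            for j in range(p):
                grad_w[j] += err * row[j]
        w = [wj - lr * 2 * g / n for wj, g in zip(w, grad_w)]
        b -= lr * 2 * grad_b / n
    return w, b


def mse(X, y, w, b):
    """Mean squared error of the linear model on (X, y)."""
    total = 0.0
    for row, target in zip(X, y):
        pred = sum(wj * xj for wj, xj in zip(w, row)) + b
        total += (pred - target) ** 2
    return total / len(y)


def permutation_importance(X, y, w, b, rng, repeats=5):
    """Average increase in MSE when one feature column is shuffled."""
    base = mse(X, y, w, b)
    scores = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            total += mse(X_perm, y, w, b) - base
        scores.append(total / repeats)
    return scores


# Synthetic stand-in data (assumption): 200 samples, 5 "proteins".
# The "metabolite" depends only on protein 0 and protein 2.
rng = random.Random(0)
n_samples, n_proteins = 200, 5
proteins = [[rng.gauss(0, 1) for _ in range(n_proteins)]
            for _ in range(n_samples)]
metabolite = [2.0 * row[0] - 1.0 * row[2] + rng.gauss(0, 0.1)
              for row in proteins]

w, b = fit_linear(proteins, metabolite)
importance = permutation_importance(proteins, metabolite, w, b, rng)

# Rank proteins by how much shuffling them degrades the prediction.
ranking = sorted(range(n_proteins), key=lambda j: importance[j], reverse=True)
print("proteins ranked by importance:", ranking)
```

In this toy setting, proteins 0 and 2 should rank first because they are the only inputs that drive the simulated metabolite. Importance is computed on the training data to keep the sketch short; a real analysis would normally score held-out samples.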

Availability and implementation: The website for MIMaL is at https://mimal.app. Code for the website is at https://github.com/qdickinson/mimal-website. Code to implement MIMaL is at https://github.com/jessegmeyerlab/MIMaL.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Cluster Analysis
  • Machine Learning*
  • Metabolic Networks and Pathways*
  • Proteome

Substances

  • Proteome