Forecasting Chronic Diseases Using Data Fusion

J Proteome Res. 2017 Jul 7;16(7):2435-2444. doi: 10.1021/acs.jproteome.7b00039. Epub 2017 Jun 9.

Abstract

Data fusion, that is, extracting information through the fusion of complementary data sets, is a topic of great interest in metabolomics because the analytical platforms commonly used for chemical profiling of biofluids, such as liquid chromatography-mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR) spectroscopy, provide complementary information. In this study, with the goal of forecasting acute coronary syndrome (ACS), breast cancer, and colon cancer, we jointly analyzed LC-MS and NMR measurements of plasma samples together with metadata describing the lifestyle of the participants. We used supervised data fusion based on multiple kernel learning and exploited the linearity of the models to identify the metabolites/features that were significant for separating healthy referents from the cases developing a disease. We demonstrated that (i) fusing LC-MS, NMR, and metadata provided better separation of ACS cases and referents than any individual data set, (ii) NMR data performed best in forecasting breast cancer, while fusion degraded the performance, and (iii) neither the individual data sets nor their fusion performed well for colon cancer. Furthermore, we showed the strengths and limitations of the fusion models by discussing how well they captured known biomarkers for smoking and coffee consumption. While fusion may improve the separation of certain conditions by jointly analyzing metabolomics and metadata sets, it is not always the best approach, as the breast cancer case shows.
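The remark about exploiting the linearity of the models can be illustrated with a minimal sketch. It is not the authors' implementation: the data blocks, the kernel weights, and the dual coefficients `alpha` below are hypothetical fixed values, whereas a multiple kernel learning method would estimate the weights and the classifier from labeled data. The sketch only shows that, when every data set gets a linear kernel, a weighted sum of kernels can be mapped back to one coefficient per original feature, which is what makes feature-level interpretation possible.

```python
def linear_kernel(X):
    """Gram matrix K[i][j] = <x_i, x_j> for the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def fuse_kernels(kernels, weights):
    """Weighted sum of same-sized kernel matrices."""
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


def block_coefficients(X, alpha, weight):
    """Per-feature coefficients of one block: weight * X^T alpha."""
    n_features = len(X[0])
    return [
        weight * sum(alpha[i] * X[i][f] for i in range(len(X)))
        for f in range(n_features)
    ]


# Hypothetical toy data: 3 samples measured in three blocks.
blocks = {
    "lcms": [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]],
    "nmr": [[0.5, 1.5], [2.0, 0.0], [1.0, 1.0]],
    "meta": [[1.0], [0.0], [1.0]],
}
# Hypothetical kernel weights and dual coefficients (assumed, not learned here).
weights = {"lcms": 0.5, "nmr": 0.3, "meta": 0.2}
alpha = [1.0, -1.0, 0.5]

names = list(blocks)
n_samples = len(alpha)

# Fused kernel and the decision values computed in the dual (kernel) form.
K = fuse_kernels(
    [linear_kernel(blocks[n]) for n in names],
    [weights[n] for n in names],
)
dual = [sum(alpha[i] * K[i][j] for i in range(n_samples)) for j in range(n_samples)]

# The same decision values computed from per-feature coefficients.
coefs = {n: block_coefficients(blocks[n], alpha, weights[n]) for n in names}
primal = [
    sum(c * x for n in names for c, x in zip(coefs[n], blocks[n][j]))
    for j in range(n_samples)
]

for n in names:
    print(n, coefs[n])
```

Because the dual and per-feature forms give the same decision values, the magnitude of each entry in `coefs` indicates how strongly that feature contributes to the separation under the stated assumptions. A bias term and the training step are omitted for brevity.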

Keywords: acute coronary syndrome; cancer; data fusion; liquid chromatography-mass spectrometry; multiple kernel learning; nuclear magnetic resonance spectroscopy.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acute Coronary Syndrome / blood
  • Acute Coronary Syndrome / diagnosis*
  • Biomarkers / blood
  • Breast Neoplasms / blood
  • Breast Neoplasms / diagnosis*
  • Caffeine / adverse effects
  • Chromatography, Liquid
  • Chronic Disease
  • Coffee / chemistry
  • Colonic Neoplasms / blood
  • Colonic Neoplasms / diagnosis*
  • Female
  • Humans
  • Magnetic Resonance Spectroscopy
  • Male
  • Mass Spectrometry
  • Metabolome*
  • Models, Statistical*
  • Prognosis
  • Risk Factors
  • Smoking / physiopathology

Substances

  • Biomarkers
  • Coffee
  • Caffeine