ONION: Functional Approach for Integration of Lipidomics and Transcriptomics Data

PLoS One. 2015 Jun 8;10(6):e0128854. doi: 10.1371/journal.pone.0128854. eCollection 2015.

Abstract

To date, the massive quantity of data generated by high-throughput techniques has not been matched by the bioinformatics treatment required to make full use of it. This is partially due to a mismatch between experimental and analytical study design, but primarily due to a lack of adequate analytical approaches. When integrating multiple data types, e.g. transcriptomics and metabolomics, multidimensional statistical methods are currently the techniques of choice. Typical statistical approaches applied to find associations between metabolites and genes, such as canonical correlation analysis (CCA), fail because the number of observations (e.g. conditions, diets) is small in comparison to the data size (number of genes and metabolites). Modifications designed to cope with this issue are not ideal: they either require adding simulated data, which precludes p-value computation, or prune variables and thereby lose potentially valid information. Instead, our approach makes use of verified or putative molecular interactions or functional associations to guide the analysis. The workflow comprises dividing the data sets to reach the expected data structure, statistical analysis within groups, and interpretation of results. By applying pathway and network analysis, data obtained from various platforms are grouped with moderate stringency to avoid functional bias. As a consequence, CCA and other multivariate models can be applied to calculate robust statistics and to provide easy-to-interpret associations between metabolites and genes, improving understanding of the metabolic response. Effective integration of lipidomics and transcriptomics is demonstrated on publicly available murine nutrigenomics data sets. We demonstrate that our approach improves detection of genes related to lipid metabolism in comparison to applying statistics alone. This is measured by an increased percentage of explained variance (95% vs. 75-80%) and by the identification of new metabolite-gene associations related to lipid metabolism.
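
The idea of running CCA within functional groups rather than on the whole data set can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Python with NumPy, and the names `canonical_correlations`, `groupwise_cca` and the `groups` mapping are hypothetical. It also assumes that a group is only analysed when its combined number of genes and lipids does not exceed the number of observations minus one, since otherwise the sample canonical correlations are trivially equal to 1.

```python
import numpy as np


def _orthonormal_basis(M, tol=1e-10):
    """Orthonormal basis of the column space of the column-centered matrix M."""
    M = M - M.mean(axis=0)
    U, s, _ = np.linalg.svd(M, full_matrices=False)
    if s.size == 0 or s.max() == 0:
        return U[:, :0]
    rank = int((s > tol * s.max()).sum())
    return U[:, :rank]


def canonical_correlations(X, Y):
    """Sample canonical correlations between X and Y (rows are observations)."""
    Qx = _orthonormal_basis(np.asarray(X, dtype=float))
    Qy = _orthonormal_basis(np.asarray(Y, dtype=float))
    if Qx.shape[1] == 0 or Qy.shape[1] == 0:
        return np.array([])
    # Singular values of Qx^T Qy are the cosines of the principal angles
    # between the two column spaces, i.e. the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


def groupwise_cca(genes, lipids, groups):
    """Run CCA separately for each functional group.

    `groups` (hypothetical structure) maps a group name to a pair
    (gene_column_indices, lipid_column_indices).
    """
    n = genes.shape[0]
    results = {}
    for name, (gene_idx, lipid_idx) in groups.items():
        # Assumption of this sketch: skip groups that are too large for
        # the number of observations, where CCA would be degenerate.
        if len(gene_idx) + len(lipid_idx) > n - 1:
            continue
        results[name] = canonical_correlations(
            genes[:, gene_idx], lipids[:, lipid_idx]
        )
    return results
```

With, say, 12 observations, 50 genes and 30 lipids, calling `canonical_correlations` on the full matrices returns correlations that are all 1 regardless of the data, which is the small-sample failure the abstract describes. Restricting the analysis to a group of a few genes and lipids yields correlations that reflect the data.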

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Databases as Topic
  • Genomics*
  • Least-Squares Analysis
  • Lipid Metabolism*
  • Metabolomics*
  • Mice
  • Nutrigenomics
  • Software
  • Transcriptome / genetics*

Grants and funding

This research was supported in part by PL-Grid Infrastructure and Klaster LifeScience Kraków. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.