Inferring differentially expressed pathways using kernel maximum mean discrepancy-based test

BMC Bioinformatics. 2016 Jun 6;17 Suppl 5(Suppl 5):205. doi: 10.1186/s12859-016-1046-1.

Abstract

Background: Pathway expression is multivariate in nature. From a statistical perspective, detecting differentially expressed pathways between two conditions therefore requires methods for inferring differences between mean vectors. Maximum mean discrepancy (MMD) is a statistical test of whether two samples are drawn from the same distribution, and the kernel method greatly simplifies its implementation.
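
To make the idea of a kernel MMD two-sample test concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian kernel with a median-heuristic bandwidth, the biased estimator of the squared MMD, and a permutation null distribution; none of these choices is stated in the abstract, and the function names (`rbf_kernel`, `mmd2_biased`, `mmd_permutation_test`) are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def mmd2_biased(K, n):
    """Biased squared-MMD estimate from a pooled kernel matrix.

    The first n rows/columns of K belong to sample X, the rest to sample Y.
    """
    return K[:n, :n].mean() + K[n:, n:].mean() - 2.0 * K[:n, n:].mean()


def mmd_permutation_test(X, Y, sigma=None, n_perm=500, seed=0):
    """Return (observed statistic, permutation p-value)."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    n = len(X)
    if sigma is None:
        # Assumption: median heuristic, i.e. the median pairwise distance.
        sq = (Z ** 2).sum(1)[:, None] + (Z ** 2).sum(1)[None, :] - 2.0 * Z @ Z.T
        d = np.sqrt(np.maximum(sq, 0.0))
        sigma = np.median(d[np.triu_indices_from(d, k=1)])
    K = rbf_kernel(Z, Z, sigma)
    observed = mmd2_biased(K, n)
    exceed = 0
    for _ in range(n_perm):
        # Relabel the pooled samples by permuting rows and columns together.
        idx = rng.permutation(len(Z))
        if mmd2_biased(K[np.ix_(idx, idx)], n) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

Because the statistic depends on the data only through the pooled kernel matrix, whose size is set by the number of samples, the same code runs unchanged when the number of variables exceeds the sample size, for example two groups of 20 samples measured on 50 variables.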

Results: An MMD-based test successfully detected the differential expression between two conditions, specifically the expression of a set of genes involved in certain fatty acid metabolic pathways. Furthermore, we exploited the ability of the kernel method to integrate data and successfully added hepatic fatty acid levels to the test procedure.
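
The abstract does not say how the two data sources were combined. One common way to integrate data with kernels, shown here purely as an assumed illustration, is to compute one kernel matrix per source on the same samples, normalise each, and take a weighted average. A weighted average of positive semidefinite matrices with non-negative weights is again positive semidefinite, so the result is a valid kernel. The data below are synthetic, and the dimensions and function names are hypothetical.

```python
import numpy as np


def linear_kernel(X):
    """Linear kernel matrix between the rows (samples) of X."""
    return X @ X.T


def normalize_kernel(K):
    """Cosine-normalise K so that every diagonal entry equals 1."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)


def combine_kernels(kernels, weights=None):
    """Weighted average of normalised kernel matrices on the same samples."""
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return sum(w * normalize_kernel(K) for w, K in zip(weights, kernels))


# Synthetic stand-ins: 12 samples, 30 expression values, 8 fatty acid levels.
rng = np.random.default_rng(0)
expression = rng.normal(size=(12, 30))
fatty_acids = rng.normal(size=(12, 8))

K_combined = combine_kernels(
    [linear_kernel(expression), linear_kernel(fatty_acids)]
)
```

The combined matrix has one row and one column per sample, so it can be used wherever a single-source kernel matrix would be, including a kernel two-sample statistic.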

Conclusion: MMD is a non-parametric test that gains several advantages when combined with kernelization of the data: 1) the number of variables can exceed the sample size; 2) omics data can be integrated; 3) it can be applied not only to vectors but also to strings, sequences and other common structured data types arising in molecular biology.
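
As an illustration of the third point, a kernel can be defined directly on sequences. The sketch below uses the k-mer spectrum kernel, a standard string kernel chosen here as an example; the abstract does not name a specific string kernel, and the function names are hypothetical. The kernel value is the inner product of the k-mer count vectors of the two sequences.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s, t, k=3):
    """Inner product of the k-mer count vectors of sequences s and t."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    # Only k-mers present in both sequences contribute to the sum.
    return sum(cs[m] * ct[m] for m in cs.keys() & ct.keys())


def gram_matrix(seqs, k=3):
    """Kernel matrix (list of lists) for a collection of sequences."""
    return [[spectrum_kernel(a, b, k) for b in seqs] for a in seqs]
```

A kernel matrix built this way plays the same role as one computed from numeric vectors, which is why a kernel-based test can be applied to sequence data without an explicit vector representation.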

Keywords: Kernel maximum mean test; Kernel-based methods; Omics data integration.

MeSH terms

  • Algorithms*
  • Animals
  • Computational Biology / methods*
  • Diet
  • Fatty Acids / metabolism
  • Gene Expression*
  • Genomics
  • Liver / metabolism
  • Metabolomics
  • Mice
  • Mice, Knockout
  • Plant Oils / chemistry
  • Plant Oils / metabolism
  • Sunflower Oil

Substances

  • Fatty Acids
  • Plant Oils
  • Sunflower Oil