Kernel-based hierarchical structural component models for pathway analysis

Bioinformatics. 2022 May 26;38(11):3078-3086. doi: 10.1093/bioinformatics/btac276.

Abstract

Motivation: Pathway analyses have provided insight into the biological functions underlying a phenotype of interest across various types of omics data. Pathway-based statistical approaches have been actively developed, but most of them do not consider correlations among pathways. Because many biomarkers are known to be shared between pathways, ignoring these correlations may produce misleading results. In addition, most pathway-based approaches assume that biomarkers within a pathway have linear associations with the phenotype of interest, even though the actual relationships can be more complex.

Results: To model complex effects, including non-linear effects, we propose a new approach, Hierarchical structural CoMponent analysis using Kernel (HisCoM-Kernel). The proposed method models non-linear associations between biomarkers and the phenotype by extending kernel machine regression, and it analyzes all pathways simultaneously by using the biomarker-pathway hierarchical structure. HisCoM-Kernel is a flexible model that can be applied to various omics data, and it was successfully applied to three omics datasets generated by different technologies. Our simulation studies showed that HisCoM-Kernel provided higher statistical power than existing pathway-based methods in all datasets. In the applications to the three types of omics datasets, HisCoM-Kernel outperformed existing methods by identifying more biologically meaningful pathways, including those reported in previous studies.
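
To make the idea of kernel machine regression over a biomarker-pathway structure concrete, the following is a minimal sketch, not the HisCoM-Kernel estimator itself. It assumes a Gaussian kernel per pathway, sums the pathway kernels with equal weights, and fits an ordinary kernel ridge regression to a continuous phenotype. The function names (`gaussian_kernel`, `fit_pathway_kernel_ridge`), the `bandwidth` default, and the `ridge` penalty are all hypothetical choices made for illustration. Pathways are allowed to share biomarkers, as the abstract notes they often do.

```python
import numpy as np


def gaussian_kernel(X, bandwidth=None):
    """Return the n x n Gaussian kernel matrix for the rows of X."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    if bandwidth is None:
        # Assumption: scale distances by the number of biomarkers in the pathway.
        bandwidth = X.shape[1]
    return np.exp(-sq_dists / bandwidth)


def fit_pathway_kernel_ridge(X, y, pathways, ridge=1.0):
    """Kernel ridge fit with one Gaussian kernel per pathway (illustrative only).

    X        : (n, p) biomarker matrix
    y        : (n,) continuous phenotype
    pathways : dict mapping pathway name -> list of column indices (may overlap)
    ridge    : ridge penalty added to the diagonal of the summed kernel
    """
    n = X.shape[0]
    # One kernel per pathway, built only from that pathway's biomarkers.
    kernels = {name: gaussian_kernel(X[:, cols]) for name, cols in pathways.items()}
    K_total = sum(kernels.values())
    y_centered = y - y.mean()
    # Dual coefficients of kernel ridge regression on the summed kernel.
    alpha = np.linalg.solve(K_total + ridge * np.eye(n), y_centered)
    # Each pathway's fitted non-linear component evaluated at the samples.
    components = {name: K @ alpha for name, K in kernels.items()}
    return alpha, components
```

A call such as `fit_pathway_kernel_ridge(X, y, {"A": [0, 1, 2], "B": [2, 3, 4]})` returns the dual coefficients and one fitted component per pathway; here biomarker 2 belongs to both pathways. The relative size of each component gives only a rough descriptive view of pathway contribution under these assumptions and is not a substitute for the significance tests of the published method.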

Availability and implementation: The HisCoM-Kernel software is freely available at http://statgen.snu.ac.kr/software/HisCom-Kernel/. The RNA-seq data underlying this article are available at https://xena.ucsc.edu/, and the others will be shared on reasonable request to the corresponding author.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers
  • Computer Simulation
  • Phenotype
  • RNA-Seq
  • Software*

Substances

  • Biomarkers