Clustering gene expression regulators: new approach to disease subtyping

PLoS One. 2014 Jan 9;9(1):e84955. doi: 10.1371/journal.pone.0084955. eCollection 2014.

Abstract

One of the main challenges in modern medicine is to stratify patient groups according to the molecular mechanisms underlying their disease, so as to develop a more personalized approach to therapy. Here we propose a novel method for disease subtyping based on the analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on the Sub-Network Enrichment Analysis (SNEA) algorithm, which identifies gene subnetworks with significant concordant changes in expression between two conditions. Each subnetwork consists of a central regulator and its downstream genes, connected by relations drawn from a global, literature-extracted regulation database. Regulators found in each patient separately are clustered together and assigned activity scores, which are then used for the final grouping of patients. We show that our approach performs well compared to other related methods and, at the same time, gives researchers a complementary level of understanding of the pathway-level biology behind a disease by identifying significant expression regulators. We observed a reasonable grouping of neuromuscular disorders (triggered by structural damage vs. triggered by unknown mechanisms) that was not revealed by standard expression profile clustering. In another experiment, we were able to suggest clusters of regulators responsible for discriminating colorectal carcinoma from adenoma, and to identify frequently genetically altered regulators that could be of specific importance for the individual characteristics of cancer development. The proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes to dozens of clusters of regulators. The resulting clusters of regulators make it possible to generate valuable biological hypotheses about the molecular mechanisms related to the clinical outcome of an individual patient.
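
The per-sample workflow described in the abstract (score regulators in each patient, then group patients by those scores) can be illustrated with a minimal Python sketch. This is not the SNEA algorithm itself: the abstract does not specify the enrichment statistic or the clustering method, so the sketch assumes a simple mean-difference score and groups patients by their top-scoring regulator. The network, gene names, expression values, and the functions activity_scores and group_patients are all hypothetical and chosen for illustration only.

```python
from statistics import mean

# Hypothetical toy network: regulator -> downstream target genes.
# In the paper these relations come from a literature-extracted database.
NETWORK = {
    "REG_A": ["g1", "g2", "g3"],
    "REG_B": ["g4", "g5", "g6"],
}

# Hypothetical per-sample log-ratios versus a reference condition.
SAMPLES = {
    "patient_1": {"g1": 2.0, "g2": 1.5, "g3": 1.8, "g4": 0.1, "g5": -0.2, "g6": 0.0},
    "patient_2": {"g1": 1.7, "g2": 2.1, "g3": 1.6, "g4": -0.1, "g5": 0.2, "g6": 0.1},
    "patient_3": {"g1": 0.0, "g2": 0.1, "g3": -0.1, "g4": 1.9, "g5": 2.2, "g6": 1.6},
}


def activity_scores(expr, network):
    """Score each regulator in one sample.

    Assumed stand-in statistic (not SNEA's): mean expression change of the
    regulator's targets minus the mean change of all other measured genes.
    """
    scores = {}
    for regulator, targets in network.items():
        inside = [expr[g] for g in targets if g in expr]
        outside = [v for g, v in expr.items() if g not in targets]
        scores[regulator] = mean(inside) - mean(outside)
    return scores


def group_patients(samples, network):
    """Group patients by their highest-scoring regulator.

    A deliberately crude substitute for clustering on activity scores.
    """
    groups = {}
    for patient, expr in samples.items():
        scores = activity_scores(expr, network)
        top = max(scores, key=scores.get)
        groups.setdefault(top, []).append(patient)
    return groups


groups = group_patients(SAMPLES, NETWORK)
print(groups)
```

In this toy data, the grouping depends on only two regulator scores per patient rather than on all six genes, which mirrors the feature-reduction idea in the abstract. A realistic analysis would replace the mean-difference score with a proper enrichment test and the top-regulator rule with a clustering method applied to the full score matrix.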

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenoma / classification
  • Adenoma / diagnosis
  • Adenoma / genetics*
  • Algorithms*
  • Carcinoma / classification
  • Carcinoma / diagnosis
  • Carcinoma / genetics*
  • Cluster Analysis
  • Colorectal Neoplasms / classification
  • Colorectal Neoplasms / diagnosis
  • Colorectal Neoplasms / genetics*
  • Diagnosis, Differential
  • Gene Expression Profiling
  • Gene Expression Regulation
  • Gene Regulatory Networks
  • Humans
  • Multigene Family
  • Neuromuscular Diseases / classification
  • Neuromuscular Diseases / diagnosis
  • Neuromuscular Diseases / genetics*
  • Oligonucleotide Array Sequence Analysis
  • Precision Medicine

Grants and funding

This work was partly supported by the Russian Ministry of Education and Science, contract #14.512.11.0042. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. No additional external funding was received for this study.