Local network component analysis for quantifying transcription factor activities

Methods. 2017 Jul 15;124:25-35. doi: 10.1016/j.ymeth.2017.06.018. Epub 2017 Jul 12.

Abstract

Transcription factors (TFs) can regulate physiological transitions or determine stable phenotypic diversity. Accurate estimation of TF regulatory signals or functional activities is of great significance for guiding biological experiments and elucidating molecular mechanisms, but it remains challenging. Traditional methods identify TF regulatory signals at the population level, which masks heterogeneous regulation mechanisms in individuals or subgroups and thus results in inaccurate analyses. Here, we propose a novel computational framework, local network component analysis (LNCA), that exploits data heterogeneity and automatically and accurately quantifies transcription factor activity (TFA) in practice by integrating partitioned expression sets (i.e., local information) with prior TF-gene regulatory knowledge. Specifically, LNCA adopts an adaptive optimization strategy, which evaluates the local similarities of regulation controls and corrects biases during data integration, to construct the TFA landscape. We first numerically demonstrate the effectiveness of LNCA on simulated data sets, compared with traditional methods such as FastNCA, ROBNCA and NINCA. We then apply our model to two real data sets with implicit temporal or spatial regulation variations. The results show that LNCA not only recognizes the periodic mode along the S. cerevisiae cell cycle process, but also substantially outperforms other methods in terms of accuracy and consistency. In addition, the cross-validation study for glioblastoma multiforme (GBM) indicates that the TFAs identified by LNCA can better distinguish clinically distinct tumor groups than the expression values of the corresponding TFs, thus opening a new way to classify tumor subtypes and providing novel insight into cancer heterogeneity.
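
For orientation, the classical network component analysis (NCA) model that methods such as FastNCA and ROBNCA address can be sketched as follows. This is the standard NCA formulation, not the LNCA objective, which the abstract does not specify. The symbols E, A, P, \Gamma, Z_0 and the partition index k are notation assumed here for illustration. The per-partition fit in the last line is only one possible reading of "partitioned expression sets"; it omits the adaptive step that evaluates local similarities and corrects biases across partitions.

```latex
% Standard NCA decomposition (notation assumed for illustration):
%   E      : N x M matrix of expression values (N genes, M samples)
%   A      : N x L matrix of TF-gene control strengths
%   P      : L x M matrix of transcription factor activities (TFAs)
%   \Gamma : N x M residual (noise) matrix
E = A P + \Gamma,
\qquad
A_{ij} = 0 \ \text{whenever TF } j \text{ is not a known regulator of gene } i

% Least-squares estimate, with Z_0 the set of matrices that have
% the zero pattern given by the prior TF-gene network:
\min_{A,\,P} \ \lVert E - A P \rVert_F^2
\quad \text{subject to } A \in Z_0

% Hypothetical "local" reading: the same fit restricted to the
% samples E^{(k)} of partition k (an assumption, not the LNCA method):
\min_{A^{(k)},\,P^{(k)}} \ \lVert E^{(k)} - A^{(k)} P^{(k)} \rVert_F^2
\quad \text{subject to } A^{(k)} \in Z_0
```

In this notation, the rows of P (or of each P^{(k)}) are the estimated TFA profiles, and the prior regulatory knowledge enters only through the zero pattern imposed on A.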

Availability: LNCA was implemented as a Matlab package, which is available at http://sysbio.sibcb.ac.cn/cb/chenlab/software.htm/LNCApackage_0.1.rar.

Keywords: Adaptive optimization strategy; Data heterogeneity; Integrative analysis; Network component analysis; Transcription factor activities.

MeSH terms

  • Algorithms*
  • Brain Neoplasms / diagnosis
  • Brain Neoplasms / genetics*
  • Brain Neoplasms / metabolism
  • Brain Neoplasms / mortality
  • Cell Cycle / genetics
  • Databases, Genetic
  • Gene Expression Profiling
  • Gene Expression Regulation
  • Gene Regulatory Networks
  • Glioblastoma / diagnosis
  • Glioblastoma / genetics*
  • Glioblastoma / metabolism
  • Glioblastoma / mortality
  • Humans
  • Neoplasm Proteins / genetics*
  • Prognosis
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae / metabolism
  • Signal Transduction
  • Survival Analysis
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism
  • Transcription, Genetic*

Substances

  • Neoplasm Proteins
  • Transcription Factors