An Advanced Omic Approach to Identify Co-Regulated Clusters and Transcription Regulation Network with AGCT and SHOE Methods

Methods Mol Biol. 2017:1598:373-389. doi: 10.1007/978-1-4939-6952-4_19.

Abstract

To obtain a global picture of the genetic machinery from massive high-throughput gene expression data, novel data-driven unsupervised learning approaches are becoming essential. For this purpose, a basic analytic workflow has been established that comprises two steps: first, unsupervised clustering to identify genes that behave similarly upon exposure to a signal, and second, identification of the transcription factors regulating those genes. In this chapter, we describe an advanced tool, built on this two-step approach, for analyzing and characterizing large-scale time-series gene expression data. For the first step, we developed an original method, "A Geometric Clustering Tool" (AGCT), that unveils the complex architecture of large-scale time-series gene expression data in real time using cutting-edge techniques of low-dimensional manifold learning, data clustering, and visualization. For the second step, we established an original method, "Sequence Homology in Eukaryotes" (SHOE), which performs comparative genomic analysis on humans, mice, and rats.
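
The first step of the workflow, grouping genes whose expression changes similarly over time, can be illustrated with a minimal sketch. This is not AGCT itself: it swaps in plain z-score normalization followed by k-means as a generic stand-in for unsupervised clustering, and the gene names, expression values, and function names (`zscore`, `kmeans`, `cluster_genes`) are all hypothetical.

```python
import math


def zscore(profile):
    """Standardize one gene's time series to mean 0 and unit variance."""
    n = len(profile)
    mean = sum(profile) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in profile) / n)
    if std == 0:
        return [0.0] * n  # flat profile: no temporal pattern to compare
    return [(x - mean) / std for x in profile]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_genes(expression, k):
    """Map each gene name to a cluster label based on its profile shape."""
    names = list(expression)
    points = [zscore(expression[name]) for name in names]
    return dict(zip(names, kmeans(points, k)))


# Hypothetical time-series data: two rising and two falling genes.
expression = {
    "gene_a": [1.0, 2.0, 4.0, 8.0],
    "gene_b": [10.0, 21.0, 39.0, 82.0],
    "gene_c": [8.0, 4.0, 2.0, 1.0],
    "gene_d": [90.0, 40.0, 22.0, 9.0],
}
clusters = cluster_genes(expression, k=2)
```

Normalizing each profile first means genes are grouped by the shape of their response rather than by absolute expression level, so `gene_a` and `gene_b` fall into one cluster despite a tenfold difference in magnitude. In the second step of the workflow, each such cluster would be passed on to promoter analysis to search for shared transcription factors.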

Keywords: Dimension score; Gene expression; Geometrical clustering; Phylogenetic footprinting; Promoter analysis; Unsupervised learning.

MeSH terms

  • Algorithms
  • Animals
  • Cluster Analysis*
  • Computational Biology / methods*
  • Eukaryota / genetics*
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation
  • Genomics / methods*
  • Mice
  • Promoter Regions, Genetic
  • Sequence Homology*
  • User-Computer Interface
  • Workflow