Prediction of regulatory gene pairs using dynamic time warping and gene ontology

Int J Data Min Bioinform. 2014;10(2):121-45. doi: 10.1504/ijdmb.2014.064010.

Abstract

Selecting informative genes is the most important task in the analysis of microarray gene expression data. In this work, we aim to identify regulatory gene pairs from such data. However, microarray data often contain multiple missing expression values, so these values must be imputed before regulatory gene pairs can be identified. We develop a novel approach that first imputes missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After the missing values are imputed, we predict gene regulation using our proposed DTW-GO distance measure between gene pairs. Experimental results on real microarray data sets show that our approach is more accurate than existing missing value imputation methods. Furthermore, our approach discovers more regulatory gene pairs that are known in the literature than other methods do.
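The abstract does not spell out how KNN, DTW and GO are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the classic DTW recurrence with absolute difference as the local cost, and a plain KNN imputation that averages the k nearest complete expression profiles. The function names `dtw_distance` and `knn_impute` are hypothetical, and the GO-based term of the DTW-GO distance is left out because its definition is not given here.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again
                cost[i][j - 1],      # b[j-1] is matched again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def knn_impute(target, candidates, k=2):
    """Fill None entries of target with the mean of the k nearest profiles.

    Assumption: every candidate profile is complete and has the same
    length as target. Distances use only the observed time points.
    """
    observed = [t for t, v in enumerate(target) if v is not None]
    target_obs = [target[t] for t in observed]
    ranked = sorted(
        candidates,
        key=lambda c: dtw_distance(target_obs, [c[t] for t in observed]),
    )
    neighbours = ranked[:k]
    return [
        v if v is not None else sum(c[t] for c in neighbours) / len(neighbours)
        for t, v in enumerate(target)
    ]
```

In a DTW-GO style measure, one would presumably combine `dtw_distance` with a semantic similarity score derived from GO annotations before ranking neighbours; how the two are weighted is an open choice in this sketch.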

MeSH terms

  • Algorithms*
  • Data Mining / methods
  • Databases, Genetic*
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation / genetics
  • Gene Ontology*
  • Genes, Regulator / genetics*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Protein Interaction Mapping / methods*