Predicting gene function from gene expressions and ontologies

Pac Symp Biocomput. 2001:299-310. doi: 10.1142/9789814447362_0030.

Abstract

We introduce a methodology for inducing predictive rule models for functional classification of gene expressions from microarray hybridisation experiments. The basic learning method is the rough set framework for rule induction. The methodology differs from the commonly used unsupervised clustering approaches in that it exploits background knowledge of gene function in a supervised manner. Genes are annotated using Ashburner's Gene Ontology, and the functional classes used for learning are mined from these annotations. From the original expression data, we extract a set of biologically meaningful features that are used for learning. A rule model is induced from the data described in terms of these features. Its predictive quality is fine-tuned via cross-validation on subsets of the known genes prior to classification of unknown genes. The predictive and descriptive quality of such a rule model is demonstrated on the fibroblast serum response data previously analysed by Iyer et al. Our analysis shows that the rules are capable of representing the complex relationship between gene expressions and function, and that it is possible to put forward high-quality hypotheses about the function of unknown genes.
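The supervised workflow described in the abstract (extract features from expression profiles, induce rules, check them by cross-validation) can be illustrated with a minimal sketch. This is not the authors' rough set implementation: it swaps in a much simpler stand-in that groups genes with identical discretised feature patterns and assigns each pattern its majority class. The function names, the up/down/flat features, the 0.5 threshold, and the toy profiles and class labels are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def trend_features(profile, threshold=0.5):
    """Discretise changes between consecutive time points (assumed feature set)."""
    feats = []
    for a, b in zip(profile, profile[1:]):
        delta = b - a
        if delta > threshold:
            feats.append("up")
        elif delta < -threshold:
            feats.append("down")
        else:
            feats.append("flat")
    return tuple(feats)


def induce_rules(examples):
    """Map each feature pattern to the majority class of the genes sharing it."""
    votes = defaultdict(Counter)
    for feats, label in examples:
        votes[feats][label] += 1
    return {feats: counts.most_common(1)[0][0] for feats, counts in votes.items()}


def predict(rules, feats, default=None):
    """Return the class for a pattern, or `default` if no rule matches."""
    return rules.get(feats, default)


def cross_validate(examples, k=3):
    """k-fold accuracy; genes whose pattern matches no rule count as errors."""
    correct = 0
    for fold in range(k):
        train = [ex for i, ex in enumerate(examples) if i % k != fold]
        test = [ex for i, ex in enumerate(examples) if i % k == fold]
        rules = induce_rules(train)
        correct += sum(predict(rules, feats) == label for feats, label in test)
    return correct / len(examples)


if __name__ == "__main__":
    # Invented toy profiles and hypothetical class labels, not data from the paper.
    known = [
        ([0.0, 1.0, 2.0], "class_A"),
        ([0.1, 1.2, 2.4], "class_A"),
        ([0.0, 0.9, 1.8], "class_A"),
        ([2.0, 1.0, 0.0], "class_B"),
        ([2.2, 1.1, 0.1], "class_B"),
        ([1.9, 0.8, 0.0], "class_B"),
    ]
    examples = [(trend_features(p), label) for p, label in known]
    print("cross-validated accuracy:", cross_validate(examples, k=3))
    rules = induce_rules(examples)
    print("prediction for an unannotated gene:",
          predict(rules, trend_features([0.2, 1.1, 2.1])))
```

In this sketch the rules are first evaluated on held-out known genes and only then applied to an unannotated profile, mirroring the order of steps in the abstract; a real rough set learner would additionally search for reduced attribute sets and handle patterns with conflicting classes more carefully.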

MeSH terms

  • Algorithms
  • Cells, Cultured
  • Culture Media
  • Fibroblasts / metabolism
  • Gene Expression Profiling / statistics & numerical data*
  • Gene Expression*
  • Humans
  • Models, Genetic*
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism
  • Software

Substances

  • Culture Media
  • RNA, Messenger