Functional discrimination of gene expression patterns in terms of the gene ontology

Pac Symp Biocomput. 2003:565-76.

Abstract

The ever-growing amount of experimental data in molecular biology and genetics calls for automated analysis with sophisticated knowledge discovery tools. We use an Inductive Logic Programming (ILP) learner to induce functional discrimination rules between genes that were studied using microarrays and found to be differentially expressed in three recently discovered subtypes of adenocarcinoma of the lung. The discrimination rules involve functional annotations from the Proteome HumanPSD database, expressed in terms of the Gene Ontology, whose hierarchical structure is essential for this task. While most of the lower levels of gene expression data (pre)processing have been automated, our work can be seen as a step toward automating the higher-level functional analysis of the data. We view our application not just as a prototypical example of applying more sophisticated machine learning techniques to the functional analysis of genes, but also as an incentive for developing increasingly sophisticated functional annotations and ontologies that can be automatically processed by such learning algorithms.
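
To illustrate why the hierarchical structure of the ontology matters for discrimination rules, the following minimal Python sketch counts how many genes of each class a candidate rule "gene is annotated with term T or one of its descendants" covers. It is not an ILP learner and is not the method used in the paper. The term hierarchy, gene names, annotations, and subtype labels are all hypothetical toy data, and the function names (`ancestors`, `covers`, `rule_stats`) are assumptions made for this example.

```python
from typing import Dict, Set, Tuple

# Toy is_a hierarchy (child -> parents). Illustrative only; it is a simplified
# stand-in and is NOT copied from any Gene Ontology release.
PARENTS: Dict[str, Set[str]] = {
    "protein_kinase_activity": {"kinase_activity"},
    "kinase_activity": {"catalytic_activity"},
    "catalytic_activity": {"molecular_function"},
    "dna_binding": {"binding"},
    "binding": {"molecular_function"},
    "molecular_function": set(),
}

# Hypothetical genes with their most specific annotations and class labels.
ANNOTATIONS: Dict[str, Set[str]] = {
    "geneA": {"protein_kinase_activity"},
    "geneB": {"kinase_activity"},
    "geneC": {"dna_binding"},
    "geneD": {"binding"},
}
LABELS: Dict[str, str] = {
    "geneA": "subtype1",
    "geneB": "subtype1",
    "geneC": "subtype2",
    "geneD": "subtype2",
}


def ancestors(term: str) -> Set[str]:
    """Return the term together with all of its ancestors in the hierarchy."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def covers(term: str, gene: str) -> bool:
    """True if the gene is annotated with `term` or with any descendant of it."""
    return any(term in ancestors(a) for a in ANNOTATIONS[gene])


def rule_stats(term: str, target: str) -> Tuple[int, int]:
    """Count (covered genes in the target class, covered genes outside it)."""
    pos = sum(1 for g in ANNOTATIONS if covers(term, g) and LABELS[g] == target)
    neg = sum(1 for g in ANNOTATIONS if covers(term, g) and LABELS[g] != target)
    return pos, neg


if __name__ == "__main__":
    for term in ("protein_kinase_activity", "kinase_activity", "molecular_function"):
        print(term, rule_stats(term, "subtype1"))
```

In this toy data, a rule stated at the most specific term covers only one of the two "subtype1" genes, a rule at the root term covers every gene and so discriminates nothing, and a rule at an intermediate level covers both "subtype1" genes and no others. Without the hierarchy, annotations at different levels of specificity could not be related to one another in this way.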

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenocarcinoma / genetics
  • Artificial Intelligence
  • Computational Biology
  • Databases, Protein
  • Gene Expression Profiling / statistics & numerical data*
  • Genomics / statistics & numerical data*
  • Humans
  • Lung Neoplasms / genetics
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data
  • Proteomics / statistics & numerical data