Learning a predictive model for growth inhibition from the NCI DTP human tumor cell line screening data: does gene expression make a difference?

Pac Symp Biocomput. 2006:596-607.

Abstract

We address the problem of learning a predictive model for growth inhibition from the NCI DTP human tumor cell line screening data. Extending the classical Quantitative Structure-Activity Relationship (QSAR) paradigm, we investigate whether including gene expression data leads to a statistically significant improvement in prediction quality. Our analysis shows that the straightforward approach of including individual gene expression values as features does not necessarily improve performance and may, on the contrary, degrade it significantly. When gene expression information is aggregated, for instance into features representing the correlation with reference cell lines, performance can be improved significantly. Further improvements may be expected if the learning task is structured by grouping features and instances.
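
The aggregation idea mentioned in the abstract, replacing per-gene features with a cell line's correlation to a few reference cell lines, might look roughly like the sketch below. It is an illustration only: the abstract does not state which correlation measure or which reference cell lines were used, so Pearson correlation, the function name `correlation_features`, and the array names `expr` and `ref` are all assumptions.

```python
import numpy as np

def correlation_features(expr, ref):
    """Hypothetical helper: Pearson correlation of each cell line's
    expression profile with each reference profile.

    expr: array of shape (n_cell_lines, n_genes)
    ref:  array of shape (n_references, n_genes)
    Returns an array of shape (n_cell_lines, n_references).
    """
    expr = np.asarray(expr, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Center every profile across its genes.
    e = expr - expr.mean(axis=1, keepdims=True)
    r = ref - ref.mean(axis=1, keepdims=True)
    # Scale every centered profile to unit length, so that the dot
    # product of two profiles equals their Pearson correlation.
    e = e / np.linalg.norm(e, axis=1, keepdims=True)
    r = r / np.linalg.norm(r, axis=1, keepdims=True)
    return e @ r.T

# Synthetic data, for illustration only (not from the NCI DTP screen).
rng = np.random.default_rng(0)
ref = rng.normal(size=(3, 50))     # 3 assumed reference cell lines, 50 genes
expr = rng.normal(size=(5, 50))    # 5 cell lines to describe
features = correlation_features(expr, ref)
print(features.shape)
```

With this kind of aggregation, each cell line is described by one feature per reference cell line rather than one per gene, which keeps the feature count small compared with using individual gene expression values directly.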

MeSH terms

  • Cell Line, Tumor
  • Computational Biology
  • Databases, Genetic
  • Drug Screening Assays, Antitumor / statistics & numerical data*
  • Gene Expression
  • Humans
  • Models, Biological*
  • Pharmacogenetics / statistics & numerical data*