Metalearning approach for leukemia informative genes prioritization

J Integr Bioinform. 2020 May 8;17(1):20190069. doi: 10.1515/jib-2019-0069.

Abstract

The discovery of diagnostic or prognostic biomarkers is fundamental to optimizing therapeutics for patients. This work aims to optimize leukemia diagnosis by enhancing the interpretability of the prediction model while retaining high performance in the identification of informative genes. For this purpose, we used an optimally parameterized Kernel Logistic Regression classifier on leukemia microarray gene expression data, applying metalearners to select attributes and thereby reduce the data dimensionality before passing the data to the classifier. Pearson correlation and the chi-squared statistic were the attribute evaluators applied in the metalearners, with information gain as the single-attribute evaluator. The implemented models relied on 10-fold cross-validation. The metalearner approach identified 12 common genes, with a highest average merit of 0.999. The practical work was carried out using the public data mining software WEKA.
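
As a rough illustration of single-attribute evaluation by information gain, the following Python sketch ranks attributes by how much a binary split on each one reduces class entropy. It is not the WEKA implementation used in the study. The median split, the function names, and the toy expression values and gene names (`gene_a`, `gene_b`) are all assumptions made for this example and are not taken from the paper's data.

```python
import math
import statistics
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting samples at `value <= threshold`."""
    low = [y for x, y in zip(values, labels) if x <= threshold]
    high = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Weighted entropy of the two partitions (empty partitions are skipped).
    remainder = sum(len(part) / n * entropy(part)
                    for part in (low, high) if part)
    return entropy(labels) - remainder


def rank_attributes(columns, labels):
    """Rank attributes by information gain, highest first.

    Assumption: each numeric attribute is binarized at its median, a
    simplification chosen for this sketch only.
    """
    scores = {name: information_gain(values, labels, statistics.median(values))
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: six samples, two classes, two made-up genes.
toy_labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
toy_columns = {
    "gene_a": [1.0, 1.2, 0.9, 3.1, 2.8, 3.3],  # separates the classes
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # carries no class signal
}

ranking = rank_attributes(toy_columns, toy_labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy case the attribute that separates the two classes ranks first. In a pipeline like the one described above, only the top-ranked attributes would be passed on to the classifier.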

Keywords: informative genes; leukemia; machine learning; metalearning; microarray.

MeSH terms

  • Algorithms*
  • Gene Expression Profiling
  • Humans
  • Leukemia* / genetics