Logic Learning Machine and standard supervised methods for Hodgkin's lymphoma prognosis using gene expression data and clinical variables

Health Informatics J. 2018 Mar;24(1):54-65. doi: 10.1177/1460458216655188. Epub 2016 Jun 27.

Abstract

This study evaluates the performance of a set of machine learning techniques in predicting the prognosis of Hodgkin's lymphoma using clinical factors and gene expression data. Analysed samples from 130 Hodgkin's lymphoma patients included a small set of clinical variables and more than 54,000 gene features. Machine learning classifiers included three black-box algorithms (k-nearest neighbour, Artificial Neural Network, and Support Vector Machine) and two methods based on intelligible rules (Decision Tree and the innovative Logic Learning Machine method). Support Vector Machine clearly outperformed all of the other methods. Of the two rule-based algorithms, Logic Learning Machine performed better and identified a set of simple intelligible rules based on a combination of clinical variables and gene expressions. Decision Tree identified a non-coding gene (XIST) involved in the early phases of X chromosome inactivation that was overexpressed in females and in non-relapsed patients. XIST expression might be responsible for the better prognosis of female Hodgkin's lymphoma patients.
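
As a rough illustration of how one of the classifiers named above might be scored on data with few samples and many features, the following is a minimal sketch of k-nearest neighbour evaluated by leave-one-out cross-validation. It does not reproduce the study's pipeline: the synthetic data, the function names (`make_synthetic`, `knn_predict`, `leave_one_out_accuracy`), the choice of k = 3, the Euclidean distance, and the leave-one-out scheme are all assumptions made for illustration, since the abstract does not state how the classifiers were configured or validated.

```python
import math
import random
from collections import Counter


def make_synthetic(n_samples=40, n_features=200, n_informative=5, seed=0):
    """Toy stand-in for expression data: few samples, many features.

    Only the first `n_informative` features carry class signal.
    All sizes here are illustrative assumptions, not the study's values.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # balanced hypothetical classes: 0 = no relapse, 1 = relapse
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 2.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training rows closest in Euclidean distance."""
    dists = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(X, y, k=3):
    """Hold out each sample once, train on the rest, report the hit rate."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, X[i], k) == y[i]
    return hits / len(X)


if __name__ == "__main__":
    X, y = make_synthetic()
    print(f"leave-one-out accuracy (k=3): {leave_one_out_accuracy(X, y):.2f}")
```

Holding out each sample in turn keeps the test sample out of training, which matters when the number of features far exceeds the number of patients; any feature selection would also need to be repeated inside each fold to avoid optimistic estimates.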

Keywords: Decision Tree; Hodgkin’s lymphoma; Logic Learning Machine; Support Vector Machine; artificial neural network; cancer prognosis.

MeSH terms

  • Cluster Analysis
  • Decision Trees
  • Gene Expression / physiology*
  • Hodgkin Disease / classification*
  • Hodgkin Disease / diagnosis
  • Humans
  • Machine Learning / trends*
  • Prognosis*