Censoring weighted separate-and-conquer rule induction from survival data

Methods Inf Med. 2014;53(2):137-48. doi: 10.3414/ME13-01-0046. Epub 2014 Feb 27.

Abstract

Objectives: Rule induction is one of the major methods of machine learning. Rule-based models can be easily read and interpreted by humans, that makes them particularly useful in survival studies as they can help clinicians to better understand analysed data and make informed decisions about patient treatment. Although of such usefulness, there is still a little research on rule learning in survival analysis. In this paper we take a step towards rule-based analysis of survival data.

Methods: We investigate so-called covering or separate-and-conquer method of rule induction in combination with a weighting scheme for handling censored observations. We also focus on rule quality measures being one of the key elements differentiating particular implementations of separate-and-conquer rule induction algorithms. We examine 15 rule quality measures guiding rule induction process and reflecting a wide range of different rule learning heuristics.

Results: The algorithm is extensively tested on a collection of 20 real survival datasets and compared with the state-of-the-art survival trees and random survival forests algorithms. Most of the rule quality measures outperform Kaplan-Meier estimate and perform at least equally well as tree-based algorithms.

Conclusions: Separate-and-conquer rule induction in combination with weighting scheme is an effective technique for building rule-based models of survival data which, according to predictive accuracy, are competitive with tree-based representations.

Keywords: Survival prediction; rule induction; rule quality measures.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Datasets as Topic
  • Decision Support Techniques*
  • Electronic Data Processing
  • Humans
  • Survival Analysis*