Ensemble of Rotation Trees for Imbalanced Medical Datasets

J Healthc Eng. 2018 Apr 10:2018:8902981. doi: 10.1155/2018/8902981. eCollection 2018.

Abstract

Medical datasets are often predominately composed of "normal" examples with only a small percentage of "abnormal" ones and how to correctly recognize the abnormal examples is very meaningful. However, conventional classification learning methods try to pursue high accuracy by assuming that the number of any class examples is similar to each other, which lead to the fact that the abnormal class examples are usually ignored and misclassified to normal ones. In this paper, we propose a simple but effective ensemble method called ensemble of rotation trees (ERT) to handle this problem in imbalanced medical datasets. ERT learns an ensemble through the following four stages: (1) undersampling subsets from normal class, (2) obtaining new balanced training sets through combining each subset and abnormal class, (3) inducing a rotation matrix on randomly sampling subset of each new balanced set, and in each rotation matrix space, (4) learning a decision tree on each balanced training data. Here, the rotation matrix is mainly to improve the diversity between ensemble members, and undersampling technique aims to improve the performance of learned models on abnormal class. Experimental results show that, compared with other state-of-the-art methods, ERT shows significantly better performance for imbalanced medical datasets.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Databases, Factual / classification*
  • Decision Trees*
  • Humans
  • Machine Learning*