Visual Diagnosis of Tree Boosting Methods

IEEE Trans Vis Comput Graph. 2018 Jan;24(1):163-173. doi: 10.1109/TVCG.2017.2744378. Epub 2017 Aug 29.

Abstract

Tree boosting, which combines weak learners (typically decision trees) to generate a strong learner, is a highly effective and widely used machine learning method. However, the development of a high performance tree boosting model is a time-consuming process that requires numerous trial-and-error experiments. To tackle this issue, we have developed a visual diagnosis tool, BOOSTVis, to help experts quickly analyze and diagnose the training process of tree boosting. In particular, we have designed a temporal confusion matrix visualization, and combined it with a t-SNE projection and a tree visualization. These visualization components work together to provide a comprehensive overview of a tree boosting model, and enable an effective diagnosis of an unsatisfactory training process. Two case studies that were conducted on the Otto Group Product Classification Challenge dataset demonstrate that BOOSTVis can provide informative feedback and guidance to improve understanding and diagnosis of tree boosting algorithms.

Publication types

  • Research Support, Non-U.S. Gov't