TNT: An Interpretable Tree-Network-Tree Learning Framework using Knowledge Distillation

Jiawei Li; Yiming Li; Xingchun Xiang; Shu-Tao Xia; Siyi Dong; Yun Cai

doi:10.3390/e22111203

Entropy (Basel). 2020 Oct 24;22(11):1203. doi: 10.3390/e22111203.

Authors

Jiawei Li¹, Yiming Li¹, Xingchun Xiang¹, Shu-Tao Xia¹,², Siyi Dong³, Yun Cai³

Affiliations

¹ Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China.
² PCL Research Center of Networks and Communications, Peng Cheng Laboratory, Shenzhen 518055, China.
³ Ping An Life Insurance Company of China, Ltd., Shenzhen 518046, China.

Abstract

Deep Neural Networks (DNNs) usually work in an end-to-end manner. This makes trained DNNs easy to use, but their decision process remains opaque for every test case. However, interpretable decisions are crucial in some scenarios, such as medical or financial data mining and decision-making. In this paper, we propose a Tree-Network-Tree (TNT) learning framework for explainable decision-making, in which knowledge is alternately transferred between tree models and DNNs. Specifically, the proposed TNT learning framework exploits the advantages of different models at different stages: (1) a novel James-Stein Decision Tree (JSDT) is proposed to generate better knowledge representations for DNNs, especially when the input data are low-frequency or low-quality; (2) the DNNs output high-performing predictions from the knowledge-embedding inputs and act as a teacher model for the subsequent tree model; and (3) a novel distillable Gradient Boosted Decision Tree (dGBDT) is proposed to learn interpretable trees from the soft labels and make predictions comparable to those of the DNNs. Extensive experiments on various machine learning tasks demonstrate the effectiveness of the proposed method.

Keywords: James–Stein Decision Trees; deep neural networks; distillable gradient boosted decision tree; interpretable machine learning; knowledge distillation.
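The three stages named in the abstract (tree, network, tree) can be illustrated with a small dependency-free sketch. It is not the authors' JSDT or dGBDT: ordinary boosted regression stumps stand in for both tree models, and a single-layer logistic model stands in for the DNN. The synthetic data and all function names (`fit_stump`, `boost`, `train_logreg`) are hypothetical and exist only to show how knowledge might flow from a tree embedding to a teacher and then into a tree student fitted on soft labels.

```python
import math
import random

def fit_stump(X, target):
    """Return the best one-split regression stump (feature, threshold, left, right)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:  # keeps the right side non-empty
            left = [v for row, v in zip(X, target) if row[j] <= t]
            right = [v for row, v in zip(X, target) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def stump_out(stump, row):
    j, t, left, right = stump
    return left if row[j] <= t else right

def boost(X, target, rounds, lr=0.5):
    """Squared-error gradient boosting: each stump fits the current residuals."""
    stumps, pred = [], [0.0] * len(X)
    for _ in range(rounds):
        resid = [t - p for t, p in zip(target, pred)]
        j, thr, lm, rm = fit_stump(X, resid)
        stump = (j, thr, lr * lm, lr * rm)  # shrink each leaf by the learning rate
        stumps.append(stump)
        pred = [p + stump_out(stump, row) for p, row in zip(pred, X)]
    return stumps

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(E, y, epochs=200, lr=0.1):
    """Single-layer stand-in for the DNN, trained with plain SGD on log loss."""
    w, b = [0.0] * len(E[0]), 0.0
    for _ in range(epochs):
        for e, yi in zip(E, y):
            g = sigmoid(sum(wi * ei for wi, ei in zip(w, e)) + b) - yi
            w = [wi - lr * g * ei for wi, ei in zip(w, e)]
            b -= lr * g
    return w, b

# Synthetic data (an assumption for illustration): label is 1 when x0 + x1 > 1.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
y = [1.0 if x0 + x1 > 1.0 else 0.0 for x0, x1 in X]

# Stage 1 (tree): boosted stumps on hard labels; leaf indicators form the embedding.
stage1 = boost(X, y, rounds=8)
E = [[1.0 if row[j] <= t else 0.0 for j, t, _, _ in stage1] for row in X]

# Stage 2 (network): the teacher maps embeddings to soft labels (probabilities).
w, b = train_logreg(E, y)
soft = [sigmoid(sum(wi * ei for wi, ei in zip(w, e)) + b) for e in E]

# Stage 3 (tree): a boosted-stump student regresses the soft labels from raw inputs.
student = boost(X, soft, rounds=30)
student_prob = [sum(stump_out(s, row) for s in student) for row in X]

teacher_acc = sum((p > 0.5) == (yi > 0.5) for p, yi in zip(soft, y)) / len(y)
fidelity = sum((sp > 0.5) == (p > 0.5) for sp, p in zip(student_prob, soft)) / len(y)
print(f"teacher accuracy={teacher_acc:.2f}, student-teacher agreement={fidelity:.2f}")
```

The student is trained on the teacher's probabilities rather than on the original hard labels, which is the core of knowledge distillation. Each entry in `student` is a readable `(feature, threshold, left value, right value)` rule, so the distilled model can be inspected directly. Both scores are computed on the training data, so they say nothing about generalization.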
