Residual Sketch Learning for a Feature-Importance-Based and Linguistically Interpretable Ensemble Classifier

IEEE Trans Neural Netw Learn Syst. 2023 Feb 10:PP. doi: 10.1109/TNNLS.2023.3242049. Online ahead of print.

Abstract

Motivated by both the commonly used "from wholly coarse to locally fine" cognitive behavior and the recent finding that a simple yet interpretable linear regression model should be a basic component of a classifier, a novel hybrid ensemble classifier called the hybrid Takagi-Sugeno-Kang fuzzy classifier (H-TSK-FC) and its residual sketch learning (RSL) method are proposed. H-TSK-FC shares the virtues of both deep and wide interpretable fuzzy classifiers and offers both feature-importance-based and linguistic interpretability. The RSL method proceeds as follows: 1) a global linear regression subclassifier on all original features of all training samples is generated quickly by a sparse-representation-based linear regression training procedure, both to identify the importance of each feature and to partition the output residuals of the incorrectly classified training samples into several residual sketches; 2) using the enhanced soft subspace clustering (ESSC) method for the linguistically interpretable antecedents of the fuzzy rules and the least learning machine (LLM) for their consequents, several interpretable Takagi-Sugeno-Kang (TSK) fuzzy subclassifiers are generated on the residual sketches and stacked in parallel to achieve local refinements; and 3) the final prediction is made by applying a minimal-distance-based priority over all constructed subclassifiers, which decides which interpretable prediction route is used and further enhances H-TSK-FC's generalization capability. In contrast to existing deep or wide interpretable TSK fuzzy classifiers, H-TSK-FC, benefiting from its feature-importance-based interpretability, is shown experimentally to run faster and to have better linguistic interpretability (i.e., fewer rules and/or TSK fuzzy subclassifiers and lower model complexity) while retaining at least comparable generalization capability.
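
The three RSL steps can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every component is a stand-in chosen for brevity: closed-form ridge regression replaces the sparse-representation-based linear training, plain k-means on the misclassified samples replaces ESSC, and local ridge models fitted to the residuals replace the LLM-trained TSK fuzzy subclassifiers. The abstract does not define the minimal-distance-based priority, so the routing rule below (use a local correction only when a sample lies closer to a sketch center than to the mean of the correctly classified samples) is an assumption, as are all function and parameter names.

```python
import numpy as np

def fit_ridge(X, Y, lam=1e-2):
    """Closed-form ridge regression with a bias column (stand-in for the sparse linear step)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)

def predict_ridge(W, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W

def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means (stand-in for ESSC, which additionally learns per-cluster feature weights)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return centers, labels

def fit_sketch_ensemble(X, y, n_classes, n_sketches=2):
    Y = np.eye(n_classes)[y]                      # one-hot targets
    # Step 1: global linear subclassifier on all features and all samples.
    W_global = fit_ridge(X, Y)
    out = predict_ridge(W_global, X)
    wrong = out.argmax(1) != y
    model = {
        "W_global": W_global,
        # Feature importance read off the magnitude of the linear weights (bias row excluded).
        "importance": np.abs(W_global[:-1]).sum(1),
        "correct_center": X.mean(0) if wrong.all() else X[~wrong].mean(0),
        "locals": [],
    }
    # Step 2: group misclassified samples into sketches, fit one local model per sketch
    # to the residuals left by the global model.
    if wrong.sum() >= n_sketches:
        Xw, Rw = X[wrong], Y[wrong] - out[wrong]
        centers, labels = kmeans(Xw, n_sketches)
        for j in range(n_sketches):
            if np.any(labels == j):
                model["locals"].append((centers[j], fit_ridge(Xw[labels == j], Rw[labels == j])))
    return model

def predict(model, X):
    out = predict_ridge(model["W_global"], X)
    # Step 3 (assumed routing rule): apply the nearest sketch's correction only if the
    # sample is closer to that sketch center than to the correctly classified mean.
    if model["locals"]:
        centers = np.array([c for c, _ in model["locals"]])
        d_sketch = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        nearest = d_sketch.argmin(1)
        d_correct = np.linalg.norm(X - model["correct_center"], axis=1)
        use_local = d_sketch.min(1) < d_correct
        for j, (_, W_local) in enumerate(model["locals"]):
            mask = use_local & (nearest == j)
            if mask.any():
                out[mask] += predict_ridge(W_local, X[mask])
    return out.argmax(1)
```

Calling `fit_sketch_ensemble(X, y, n_classes)` on an integer-labeled dataset and then `predict(model, X_new)` returns class indices; `model["importance"]` holds one score per original feature. Samples far from every sketch keep the prediction of the global linear model, so the coarse route stays the default and the local models act only as refinements.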