SEGAL time series classification - Stable explanations using a generative model and an adaptive weighting method for LIME

Neural Netw. 2024 Apr 27:176:106345. doi: 10.1016/j.neunet.2024.106345. Online ahead of print.

Abstract

Local Interpretable Model-agnostic Explanations (LIME) is a well-known post-hoc technique for explaining black-box models. While very useful, recent research highlights challenges around the explanations it generates. In particular, there is a potential lack of stability, where the explanations provided vary over repeated runs of the algorithm, casting doubt on their reliability. This paper investigates the stability of LIME when applied to multivariate time series classification. We demonstrate that the traditional methods for generating neighbours used in LIME carry a high risk of creating 'fake' neighbours, which are out-of-distribution with respect to the trained model and far away from the input to be explained. This risk is particularly pronounced for time series data because of their substantial temporal dependencies. We discuss how these out-of-distribution neighbours contribute to unstable explanations. Furthermore, LIME weights neighbours based on user-defined hyperparameters which are problem-dependent and hard to tune. We show how unsuitable hyperparameters can impact the stability of explanations. We propose a two-fold approach to address these issues. First, a generative model is employed to approximate the distribution of the training data set, from which within-distribution samples and thus meaningful neighbours can be created for LIME. Second, an adaptive weighting method is designed in which the hyperparameters are easier to tune than those of the traditional method. Experiments on real-world data sets demonstrate the effectiveness of the proposed method in providing more stable explanations using the LIME framework. In addition, in-depth discussions are provided on the reasons behind these results.
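To illustrate why the user-defined weighting hyperparameter can be hard to tune, the following is a minimal sketch of distance-based neighbour weighting with an exponential kernel, of the kind commonly used in LIME-style methods. It is not the adaptive weighting method proposed in the paper, which the abstract does not specify. The function names, the synthetic Gaussian neighbours, and the kernel widths are all assumptions chosen for illustration.

```python
import numpy as np


def kernel_weights(distances, kernel_width):
    """Exponential kernel: weight decays with distance from the explained input.

    Assumption: a plain exp(-d^2 / width^2) form, used here only as an
    illustrative stand-in for a LIME-style weighting function.
    """
    distances = np.asarray(distances, dtype=float)
    return np.exp(-(distances ** 2) / (kernel_width ** 2))


def effective_sample_size(weights):
    """Kish's effective sample size: roughly how many neighbours
    actually contribute to the weighted surrogate fit."""
    weights = np.asarray(weights, dtype=float)
    return weights.sum() ** 2 / (weights ** 2).sum()


# Hypothetical setup: one input with 20 features and 500 perturbed neighbours.
rng = np.random.default_rng(0)
x = np.zeros(20)
neighbours = x + rng.normal(scale=1.0, size=(500, 20))
distances = np.linalg.norm(neighbours - x, axis=1)

# Sweep the kernel width and report how many neighbours effectively count.
for width in (0.5, 2.0, 10.0):
    w = kernel_weights(distances, width)
    print(f"kernel_width={width}: effective neighbours = {effective_sample_size(w):.1f}")
```

With a narrow kernel, only a handful of neighbours carry meaningful weight, so the surrogate model is fitted on very few effective samples and can change noticeably between runs. With a wide kernel, distant neighbours count almost as much as close ones. A suitable width depends on the scale and dimensionality of the data, which is one reason this hyperparameter is problem-dependent.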

Keywords: Explainable artificial intelligence; Feature importance; LIME; Multivariate time series classification; Stability.