GENDIS: Genetic Discovery of Shapelets

Sensors (Basel). 2021 Feb 4;21(4):1059. doi: 10.3390/s21041059.

Abstract

In the time series classification domain, shapelets are subsequences that are discriminative of a certain class. It has been shown that classifiers are able to achieve state-of-the-art results by taking the distances from the input time series to different discriminative shapelets as the input. Additionally, these shapelets can be visualized and thus possess an interpretable characteristic, making them appealing in critical domains, where longitudinal data are ubiquitous. In this study, a new paradigm for shapelet discovery is proposed, which is based on evolutionary computation. The advantages of the proposed approach are that: (i) it is gradient-free, which could allow escaping from local optima more easily and supports non-differentiable objectives; (ii) no brute-force search is required, making the algorithm scalable; (iii) the total amount of shapelets and the length of each of these shapelets are evolved jointly with the shapelets themselves, alleviating the need to specify this beforehand; (iv) entire sets are evaluated at once as opposed to single shapelets, which results in smaller final sets with fewer similar shapelets that result in similar predictive performances; and (v) the discovered shapelets do not need to be a subsequence of the input time series. We present the results of the experiments, which validate the enumerated advantages.

Keywords: data mining; explainable artificial intelligence (xAI); genetic algorithms; time series analysis; time series classification.