Probing machine-learning classifiers using noise, bubbles, and reverse correlation

J Neurosci Methods. 2021 Oct 1:362:109297. doi: 10.1016/j.jneumeth.2021.109297. Epub 2021 Jul 25.

Abstract

Background: Many scientific fields now use machine-learning tools to assist with complex classification tasks. In neuroscience, automatic classifiers may be useful to diagnose medical images, monitor electrophysiological signals, or decode perceptual and cognitive states from neural signals. However, such tools often remain black boxes: they lack interpretability. A lack of interpretability has obvious ethical implications for clinical applications, but it also limits the usefulness of these tools for formulating new theoretical hypotheses.

New method: We propose a simple and versatile method to help characterize the information used by a classifier to perform its task. Specifically, noisy versions of training samples or, when the training set is unavailable, custom-generated noisy samples, are fed to the classifier. Either multiplicative noise (so-called "bubbles") or additive noise is applied to the input representation. Reverse correlation techniques are then adapted to extract either the discriminative information, defined as the parts of the input dataset that have the most weight in the classification decision, or the represented information, which corresponds to the input features most representative of each category.
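
As an illustration of the bubbles variant described above, the following is a minimal sketch, not the authors' implementation. It assumes a two-dimensional input, Gaussian-shaped bubbles, and one common reverse-correlation estimate: the average mask over trials classified as the target label minus the average mask over the remaining trials. The names `bubble_mask`, `toy_classifier`, and `discriminative_map`, as well as all parameter values, are hypothetical and chosen only for this example; the toy classifier stands in for any black-box model.

```python
import numpy as np


def bubble_mask(shape, n_bubbles, sigma, rng):
    """Sum of Gaussian apertures at random positions, clipped to [0, 1]."""
    rows, cols = np.indices(shape)
    mask = np.zeros(shape)
    for _ in range(n_bubbles):
        r0 = rng.integers(0, shape[0])
        c0 = rng.integers(0, shape[1])
        mask += np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2) / (2 * sigma ** 2))
    return np.clip(mask, 0.0, 1.0)


def toy_classifier(x):
    """Hypothetical black box: class 1 if the top-left 4x4 patch is bright."""
    return int(x[:4, :4].mean() > 0.5)


def discriminative_map(classify, sample, label, n_trials=2000,
                       n_bubbles=3, sigma=2.0, seed=0):
    """Mean mask on trials classified as `label` minus mean mask on the rest."""
    rng = np.random.default_rng(seed)
    hits = np.zeros(sample.shape)
    misses = np.zeros(sample.shape)
    n_hits = 0
    n_misses = 0
    for _ in range(n_trials):
        mask = bubble_mask(sample.shape, n_bubbles, sigma, rng)
        # Multiplicative noise: only the regions under the bubbles survive.
        if classify(sample * mask) == label:
            hits += mask
            n_hits += 1
        else:
            misses += mask
            n_misses += 1
    # Guard against division by zero if one outcome never occurs.
    return hits / max(n_hits, 1) - misses / max(n_misses, 1)


# Toy sample: the only informative region is the bright top-left patch.
sample = np.zeros((16, 16))
sample[:4, :4] = 1.0

dmap = discriminative_map(toy_classifier, sample, label=1)
peak = np.unravel_index(np.argmax(dmap), dmap.shape)
print("peak of discriminative map (row, col):", peak)
```

In this toy setting the map should be largest over the top-left patch, the only region the classifier relies on. For a real classifier, `toy_classifier` would be replaced by a call to the trained model's prediction function, and the masks would be applied to actual training samples or custom-generated ones.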

Results: The method is illustrated for the classification of written numbers by a convolutional deep neural network; for the classification of speech versus music by a support vector machine; and for the classification of sleep stages from neurophysiological recordings by a random forest classifier. In all cases, the features extracted are readily interpretable.

Comparison with existing methods: Quantitative comparisons show that the present method can match state-of-the-art interpretation methods for convolutional neural networks. Moreover, our method uses reverse correlation, an intuitive framework that is well established in neuroscience. It is also generic: it can be applied to any kind of classifier and any kind of input data.

Conclusions: We suggest that the method could provide an intuitive and versatile interface between neuroscientists and machine-learning tools.

Keywords: Auditory models; Automatic classifiers; Data analysis; Deep neural networks; Interpretability; Reverse correlation; Sleep stages classification.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Machine Learning*
  • Neural Networks, Computer*
  • Support Vector Machine