Optimization of data pre-processing methods for time-series classification of electroencephalography data

Network. 2023 Feb-Nov;34(4):374-391. doi: 10.1080/0954898X.2023.2263083. Epub 2023 Nov 9.

Abstract

The performance of time-series classification of electroencephalographic data varies strongly across experimental paradigms and study participants. Reasons include task-dependent differences in neuronal processing and seemingly random variations between subjects. The effect of data pre-processing techniques in ameliorating these challenges has been studied relatively little. Here, the influence of spatial filter optimization methods and non-linear data transformation on time-series classification performance is analyzed using the example of high-frequency somatosensory evoked responses. This is a model paradigm for the analysis of high-frequency electroencephalography data at a very low signal-to-noise ratio, which emphasizes the differences between the explored methods. For the utilized data, it was found that the individual signal-to-noise ratio explained up to 74% of the performance differences between subjects. While data pre-processing was shown to increase average time-series classification performance, it could not fully compensate for the signal-to-noise ratio differences between the subjects. This study proposes an algorithm to prototype and benchmark pre-processing pipelines for a paradigm and data set at hand. Extreme learning machines, Random Forest, and Logistic Regression can be used to quickly compare a set of potentially suitable pipelines. For subsequent classification, however, machine learning models were shown to provide better accuracy.
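
The statement that the signal-to-noise ratio "explained up to 74%" of the performance differences between subjects can be read as a share of explained variance. The abstract does not say which statistic was used, so the following is only a minimal sketch that assumes the coefficient of determination (R²) of a simple linear fit of per-subject accuracy on per-subject signal-to-noise ratio. The function name `r_squared` and all numbers are hypothetical and are not taken from the study.

```python
def r_squared(x, y):
    """Coefficient of determination of a least-squares line y ~ a + b*x.

    For a single predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary across subjects")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-subject values, invented for illustration only.
snr_db = [-12.0, -10.5, -9.0, -8.0, -6.5, -5.0]
accuracy = [0.52, 0.55, 0.61, 0.58, 0.66, 0.71]

r2 = r_squared(snr_db, accuracy)
print(f"SNR explains {100 * r2:.0f}% of the between-subject accuracy variance")
```

Under this reading, a value of 0.74 would mean that a straight-line relationship with the signal-to-noise ratio accounts for 74% of the variance in classification performance across subjects, leaving the remainder to other factors such as the choice of pre-processing.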

Keywords: Computational neuroscience; Data pre-processing; Electroencephalography data; Evoked response; High-frequency somatosensory-evoked response; Time-series classification.

MeSH terms

  • Algorithms*
  • Electroencephalography* / methods
  • Humans
  • Random Forest
  • Signal Processing, Computer-Assisted
  • Signal-To-Noise Ratio
  • Upper Extremity