Unsupervised detection and removal of muscle artifacts from scalp EEG recordings using canonical correlation analysis, wavelets and random forests

Clin Neurophysiol. 2017 Sep;128(9):1755-1769. doi: 10.1016/j.clinph.2017.06.247. Epub 2017 Jul 8.

Abstract

Objective: This paper proposes supervised and unsupervised algorithms for automatic muscle artifact detection and removal from long-term EEG recordings, which combine canonical correlation analysis (CCA) and wavelets with random forests (RF).

Methods: The proposed algorithms first perform CCA and continuous wavelet transform of the canonical components to generate a number of features which include component autocorrelation values and wavelet coefficient magnitude values. A subset of the most important features is subsequently selected using RF and labelled observations (supervised case) or synthetic data constructed from the original observations (unsupervised case). The proposed algorithms are evaluated using realistic simulation data as well as 30min epochs of non-invasive EEG recordings obtained from ten patients with epilepsy.

Results: We assessed the performance of the proposed algorithms using classification performance and goodness-of-fit values for noisy and noise-free signal windows. In the simulation study, where the ground truth was known, the proposed algorithms yielded almost perfect performance. In the case of experimental data, where expert marking was performed, the results suggest that both the supervised and unsupervised algorithm versions were able to remove artifacts without affecting noise-free channels considerably, outperforming standard CCA, independent component analysis (ICA) and Lagged Auto-Mutual Information Clustering (LAMIC).

Conclusion: The proposed algorithms achieved excellent performance for both simulation and experimental data. Importantly, for the first time to our knowledge, we were able to perform entirely unsupervised artifact removal, i.e. without using already marked noisy data segments, achieving performance that is comparable to the supervised case.

Significance: Overall, the results suggest that the proposed algorithms yield significant future potential for improving EEG signal quality in research or clinical settings without the need for marking by expert neurophysiologists, EMG signal recording and user visual inspection.

Keywords: Blind source separation; Canonical correlation analysis; Continuous wavelet transform; EMG artifacts.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Artifacts*
  • Electroencephalography / methods*
  • Electroencephalography / standards
  • Humans
  • Random Allocation
  • Scalp / physiology*
  • Wavelet Analysis*