MaskSleepNet: A Cross-Modality Adaptation Neural Network for Heterogeneous Signals Processing in Sleep Staging

IEEE J Biomed Health Inform. 2023 May;27(5):2353-2364. doi: 10.1109/JBHI.2023.3253728. Epub 2023 May 4.

Abstract

Deep learning methods have become an important tool for automatic sleep staging in recent years. However, most existing deep learning-based approaches are sharply constrained by their input modalities: any insertion, substitution, or deletion of an input modality would render the model unusable or degrade its performance. To solve this modality heterogeneity problem, a novel network architecture named MaskSleepNet is proposed. It consists of a masking module, a multi-scale convolutional neural network (MSCNN), a squeeze-and-excitation (SE) block, and a multi-headed attention (MHA) module. The masking module implements a modality adaptation paradigm that can cope with modality discrepancy. The MSCNN extracts features at multiple scales, and the size of its feature concatenation layer is specially designed to prevent invalid or redundant features from zero-set channels. The SE block further optimizes the feature weights to improve the network's learning efficiency. The MHA module outputs the prediction results by learning the temporal information between the sleep features. The performance of the proposed model was validated on two publicly available datasets, Sleep-EDF Expanded (Sleep-EDFX) and the Montreal Archive of Sleep Studies (MASS), and a clinical dataset from Huashan Hospital, Fudan University (HSFU). The proposed MaskSleepNet achieves favorable performance under input modality discrepancy: on Sleep-EDFX, MASS, and HSFU respectively, it reaches accuracies of 83.8%, 83.4%, and 80.5% with single-channel EEG; 85.0%, 84.9%, and 81.9% with two-channel EEG+EOG; and 85.7%, 87.5%, and 81.1% with three-channel EEG+EOG+EMG. In contrast, the accuracies of state-of-the-art approaches fluctuated widely, between 69.0% and 89.4%. The experimental results demonstrate that the proposed model maintains superior performance and robustness in handling input modality discrepancy.
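The masking idea described above can be illustrated with a minimal sketch: channels belonging to absent modalities are zeroed so that a single network with a fixed channel layout can accept EEG-only, EEG+EOG, or EEG+EOG+EMG recordings. The names below (`MODALITY_SLOTS`, `mask_epoch`) and the fixed channel ordering are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of the modality-masking paradigm: zero-set the
# channels of any modality that is missing from the recording, so the
# downstream network always sees the same input shape.

MODALITY_SLOTS = ["EEG", "EOG", "EMG"]  # assumed fixed channel order

def mask_epoch(epoch, present_modalities):
    """Zero out channels whose modality is missing from the recording.

    epoch: list of per-channel sample lists, one list per slot in
           MODALITY_SLOTS.
    present_modalities: set of modality names actually recorded.
    """
    masked = []
    for name, channel in zip(MODALITY_SLOTS, epoch):
        if name in present_modalities:
            masked.append(list(channel))          # keep recorded modality
        else:
            masked.append([0.0] * len(channel))   # zero-set absent modality
    return masked

# Example: a recording that contains EEG and EOG but no EMG.
epoch = [[0.1, -0.2, 0.3],   # EEG channel
         [0.05, 0.0, -0.1],  # EOG channel
         [0.4, 0.4, 0.2]]    # EMG slot (not actually recorded)
masked = mask_epoch(epoch, {"EEG", "EOG"})
```

In this sketch the EMG slot comes back as all zeros while the EEG and EOG channels pass through unchanged, which is why the paper's MSCNN must be designed so that zero-set channels do not contribute invalid or redundant features downstream.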

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Electroencephalography* / methods
  • Humans
  • Neural Networks, Computer*
  • Polysomnography / methods
  • Sleep
  • Sleep Stages