A Spatial-Temporal Transformer Architecture Using Multi-Channel Signals for Sleep Stage Classification

IEEE Trans Neural Syst Rehabil Eng. 2023;31:3353-3362. doi: 10.1109/TNSRE.2023.3305201. Epub 2023 Aug 25.

Abstract

Sleep stage classification is a fundamental task in diagnosing and monitoring sleep diseases. Two challenges remain open: (1) because most methods rely on input from a single channel, the spatial-temporal relationship of sleep signals has not been fully explored; (2) the scarcity of sleep data makes models hard to train from scratch. Here, we propose a Vision Transformer-based architecture that processes multi-channel polysomnogram (PSG) signals. The method is an end-to-end framework consisting of a spatial encoder, a temporal encoder, and an MLP head classifier. The spatial encoder, built on a pre-trained Vision Transformer, captures spatial information across multiple PSG channels. The temporal encoder uses self-attention to model transitions between neighboring epochs. In addition, we introduce a tailored image-generation method that extracts features from multiple channels and reshapes them into images suitable for transfer learning. We validate our method on three datasets, where it outperforms state-of-the-art algorithms. Our method fully explores the spatial-temporal relationship among different brain regions and addresses the problem of data insufficiency in clinical environments. Because the problem is reformulated as image classification, the method could be applied to other 1D-signal problems in the future.
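The abstract describes a spatial encoder (pre-trained ViT over per-epoch multi-channel images), a temporal encoder (self-attention over neighboring epochs), and an MLP head. The following is a minimal PyTorch sketch of that arrangement, not the authors' released code; the context length, image size, layer widths, and the use of torchvision's vit_b_16 ImageNet weights are illustrative assumptions.

```python
# Minimal sketch of the spatial-temporal pipeline described in the abstract.
# Assumptions (not from the paper): 5 context epochs, 224x224 epoch images,
# torchvision vit_b_16 as the pre-trained spatial encoder, 2 temporal layers.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights


class SpatialTemporalSleepTransformer(nn.Module):
    def __init__(self, num_stages=5, context_epochs=5, d_model=768):
        super().__init__()
        # Spatial encoder: pre-trained ViT with its classification head removed,
        # so it returns a d_model-dimensional embedding per epoch image.
        self.spatial_encoder = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
        self.spatial_encoder.heads = nn.Identity()
        # Temporal encoder: self-attention over the sequence of epoch embeddings
        # to model transitions between neighboring epochs.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8,
                                           batch_first=True)
        self.temporal_encoder = nn.TransformerEncoder(layer, num_layers=2)
        # MLP head classifier for the center epoch of the context window.
        self.classifier = nn.Sequential(
            nn.Linear(d_model, 128), nn.ReLU(), nn.Linear(128, num_stages))

    def forward(self, epoch_images):
        # epoch_images: (batch, context_epochs, 3, 224, 224); each epoch's
        # multi-channel PSG signals are assumed to be rendered as one image.
        b, t = epoch_images.shape[:2]
        feats = self.spatial_encoder(epoch_images.flatten(0, 1))  # (b*t, d)
        feats = feats.view(b, t, -1)
        feats = self.temporal_encoder(feats)                      # (b, t, d)
        return self.classifier(feats[:, t // 2])                  # center epoch


# Usage sketch: a batch of 2 windows of 5 epoch images each.
model = SpatialTemporalSleepTransformer()
logits = model(torch.randn(2, 5, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 5]) -> one score per sleep stage
```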

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Electroencephalography* / methods
  • Humans
  • Polysomnography
  • Sleep Stages
  • Sleep*