Automated sleep stage scoring employing a reasoning mechanism and evaluation of its explainability

Kazumasa Horie; Leo Ota; Ryusuke Miyamoto; Takashi Abe; Yoko Suzuki; Fusae Kawana; Toshio Kokubo; Masashi Yanagisawa; Hiroyuki Kitagawa

doi:10.1038/s41598-022-16334-9

Sci Rep. 2022 Jul 27;12(1):12799. doi: 10.1038/s41598-022-16334-9.

Authors

Kazumasa Horie¹, Leo Ota², Ryusuke Miyamoto³, Takashi Abe⁴, Yoko Suzuki⁴, Fusae Kawana⁵ ⁶, Toshio Kokubo⁴ ⁷, Masashi Yanagisawa⁴ ⁸ ⁹ ¹⁰, Hiroyuki Kitagawa³ ⁴

Affiliations

¹ Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan. horie@cs.tsukuba.ac.jp.
² Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan. ota.leo@kde.cs.tsukuba.ac.jp.
³ Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan.
⁴ International Institute for Integrative Sleep Medicine (WPI-IIIS), University of Tsukuba, Tsukuba, Japan.
⁵ Yumino Heart Clinic, Toshima, Japan.
⁶ Juntendo University Graduate School of Medicine, Bunkyo, Japan.
⁷ S'UIMIN Inc., Shibuya, Japan.
⁸ R&D Center for Frontiers of Mirai in Policy and Technology, University of Tsukuba, Tsukuba, Japan.
⁹ Tsukuba Advanced Research Alliance (TARA), University of Tsukuba, Tsukuba, Japan.
¹⁰ Department of Molecular Genetics, University of Texas Southwestern Medical Center, Dallas, USA.

Abstract

Scoring sleep stages from biological signals is an essential but labor-intensive task in sleep diagnosis. Existing automated scoring methods achieve high accuracy but are not widely used in clinical practice. In our understanding, these methods have failed to earn the trust of sleep experts (e.g., physicians and clinical technologists) because they cannot explain the evidence or clues behind their scoring. In this study, we developed a deep-learning-based scoring model with a reasoning mechanism called class activation mapping (CAM) to solve this problem. This mechanism explicitly shows which portions of the signals support our model's sleep stage decision, and we verified that these portions overlap with the "characteristic waves," the evidence or clues used in the manual scoring process. In exchange for this explainability, employing CAM makes it difficult to follow some scoring rules. Although we were concerned about the negative effect of CAM on scoring accuracy, we found that the impact is limited. The evaluation experiment shows that the proposed model achieved a scoring accuracy of [Formula: see text]. This accuracy is superior to that of some existing methods and to the inter-rater reliability among sleep experts. These results suggest that the proposed model, Sleep-CAM, achieved both explainability and the scoring accuracy required for practical use.
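
To illustrate how class activation mapping can point to the signal portions that support a decision, the following is a minimal sketch of the generic CAM computation on a one-dimensional signal. It is not the authors' implementation: the abstract does not describe the network, so the sketch assumes a model whose last convolutional layer is followed by global average pooling over time and a single linear layer without bias. The function name, array sizes, and random values are all hypothetical.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Generic 1-D CAM (assumed architecture, not the paper's model).

    features:      (K, T) feature maps from the last convolutional layer
    class_weights: (C, K) weights of the linear layer that follows
                   global average pooling over time
    returns:       (C, T) per-class contribution of each time step
    """
    return class_weights @ features

# Hypothetical sizes: 8 feature channels, 30 time steps, 5 sleep stages.
rng = np.random.default_rng(0)
K, T, C = 8, 30, 5
features = rng.standard_normal((K, T))
class_weights = rng.standard_normal((C, K))

cam = class_activation_map(features, class_weights)

# Class scores as the assumed model computes them:
# global average pooling over time, then the linear layer (bias omitted).
scores = class_weights @ features.mean(axis=1)
predicted = int(np.argmax(scores))

# Time steps that contribute most to the predicted stage.
top_steps = np.argsort(cam[predicted])[::-1][:3]
print("predicted class index:", predicted)
print("most supportive time steps:", top_steps)
```

Under these assumptions, each class score equals the time average of that class's activation map, which is why the map can be read as a per-time-step breakdown of the decision. In a setting like the one the abstract describes, the high-valued time steps would then be compared with expert-annotated characteristic waves.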

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Data Collection
Electroencephalography / methods
Polysomnography / methods
Problem Solving*
Reproducibility of Results
Sleep
Sleep Stages* / physiology