Decoding selective auditory attention with EEG using a transformer model

Zihao Xu; Yanru Bai; Ran Zhao; Hongmei Hu; Guangjian Ni; Dong Ming

doi:10.1016/j.ymeth.2022.04.009

Methods. 2022 Aug:204:410-417. doi: 10.1016/j.ymeth.2022.04.009. Epub 2022 Apr 18.

Authors

Zihao Xu¹, Yanru Bai¹, Ran Zhao¹, Hongmei Hu², Guangjian Ni³, Dong Ming⁴

Affiliations

¹ Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China; Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072 China.
² Medizinische Physik, Carl von Ossietzky Universität Oldenburg and Cluster of Excellence "Hearing4all", Küpkersweg 74, 26129, Oldenburg, Germany.
³ Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China; Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072 China; Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072 China. Electronic address: niguangjian@tju.edu.cn.
⁴ Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072 China; Tianjin Key Laboratory of Brain Science and Neuroengineering, Tianjin 300072 China; Department of Biomedical Engineering, College of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072 China. Electronic address: richardming@tju.edu.cn.

PMID: 35447360
DOI: 10.1016/j.ymeth.2022.04.009

Abstract

The human auditory system extracts relevant information in noisy environments while ignoring distractions, relying primarily on auditory attention. Studies have shown that the cerebral cortex responds differently to different sound source locations and that auditory attention varies over time. In this work, we propose a data-driven encoder-decoder model for auditory attention detection (AAD), denoted AAD-transformer. The model contains temporal self-attention and channel attention modules and reconstructs the speech envelope from the electroencephalogram (EEG) by dynamically assigning weights through these two attention mechanisms. In addition, the model is data-driven and requires no additional preprocessing steps. The proposed model was validated on a binaural listening dataset in which the speech stimulus was Mandarin, and was compared with other models. The decoding accuracy of the AAD-transformer with a 0.15-second decoding window was 76.35%, substantially higher than that of a linear model based on the temporal response function with a 3-second decoding window (an increase of 16.27%). This work provides a novel auditory attention detection method, and its data-driven nature makes it convenient for neuro-steered hearing devices, especially for users who speak tonal languages.
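The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' AAD-transformer: the layer sizes, the random untrained weights, the channel-summary statistic, and every function name below are assumptions made for illustration. The final step, picking the speaker whose envelope correlates best with the reconstruction, is the common stimulus-reconstruction decision rule and is also assumed here rather than taken from the abstract.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(eeg, w_score):
    """Reweight EEG channels. eeg: (T, C); w_score: (C, C), hypothetical.

    Returns the reweighted EEG (T, C) and the channel weights (C,).
    """
    summary = np.abs(eeg).mean(axis=0)        # one statistic per channel
    weights = softmax(w_score @ summary)      # weights sum to 1 over channels
    return eeg * weights, weights

def temporal_self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over time. x: (T, C)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v       # each (T, D)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (T, T) similarity between time steps
    attn = softmax(scores, axis=-1)           # each row sums to 1
    return attn @ v, attn

def pearson(a, b):
    """Pearson correlation between two 1-D signals."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def decode_attention(reconstructed, env_a, env_b):
    """Return 0 if the reconstruction matches speaker A better, else 1."""
    return 0 if pearson(reconstructed, env_a) >= pearson(reconstructed, env_b) else 1

# Toy forward pass on random data (sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
T, C, D = 64, 16, 8                           # time samples, channels, feature size
eeg = rng.standard_normal((T, C))
w_score = 0.1 * rng.standard_normal((C, C))
w_q, w_k, w_v = (0.1 * rng.standard_normal((C, D)) for _ in range(3))
w_out = rng.standard_normal(D)                # linear readout to a 1-D envelope

weighted, ch_weights = channel_attention(eeg, w_score)
features, attn = temporal_self_attention(weighted, w_q, w_k, w_v)
reconstructed = features @ w_out              # (T,) envelope estimate
```

In a trained model the weight matrices would be learned so that `reconstructed` tracks the attended speech envelope; here they are random, so only the shapes and the flow of data are meaningful.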

Keywords: Attention-mechanism; Auditory attention decoding; EEG; Transformer.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Auditory Perception / physiology
Cerebral Cortex
Electroencephalography / methods
Humans
Speech
Speech Perception* / physiology