Vowel speech recognition from rat electroencephalography using long short-term memory neural network

PLoS One. 2022 Jun 23;17(6):e0270405. doi: 10.1371/journal.pone.0270405. eCollection 2022.

Abstract

Over the years, considerable research has been conducted to investigate the mechanisms of speech perception and recognition. Electroencephalography (EEG) is a powerful tool for identifying brain activity; therefore, it has been widely used to determine the neural basis of speech recognition. In particular, for the classification of speech recognition, deep learning-based approaches are in the spotlight because they can automatically learn and extract representative features through end-to-end learning. This study aimed to identify components potentially related to phoneme representation in the rat brain and to discriminate brain activity for each vowel stimulus on a single-trial basis using a bidirectional long short-term memory (BiLSTM) network and classical machine learning methods. Nineteen male Sprague-Dawley rats underwent microelectrode implantation surgery to record EEG signals from the bilateral anterior auditory fields. Five vowel speech stimuli, /a/, /e/, /i/, /o/, and /u/, were chosen for their distinctly different formant frequencies. EEG recorded during randomly presented vowel stimuli was minimally preprocessed and normalized by a z-score transformation before being used as input for classification. The BiLSTM network showed the best performance among the classifiers, achieving an overall accuracy of 75.18%, an F1-score of 0.75, and a Cohen's κ of 0.68 under 10-fold cross-validation. These results indicate that LSTM layers can effectively model sequential data such as EEG; hence, informative features can be derived through a BiLSTM trained end-to-end without any additional hand-crafted feature extraction.
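The abstract states that the EEG was minimally preprocessed and normalized by a z-score transformation before being fed to the classifiers. The paper does not give the implementation details, so the following is only a minimal sketch of per-trial, per-channel z-scoring over the time axis; the array shape `(trials, channels, samples)` and the function name `zscore_trials` are assumptions for illustration, not the authors' code.

```python
import numpy as np

def zscore_trials(eeg, axis=-1):
    """Z-score each trial and channel along the time axis: (x - mean) / std.

    Assumed input shape: (trials, channels, samples); axis=-1 normalizes
    over time so every channel of every trial has zero mean and unit variance.
    """
    mean = eeg.mean(axis=axis, keepdims=True)
    std = eeg.std(axis=axis, keepdims=True)
    return (eeg - mean) / std

# Toy example: 4 trials, 2 channels, 100 samples of synthetic "EEG"
rng = np.random.default_rng(0)
eeg = 5.0 + 2.0 * rng.standard_normal((4, 2, 100))
z = zscore_trials(eeg)
```

After this transformation each (trial, channel) time series is on a common scale, which is the usual motivation for z-scoring before sequence models such as a BiLSTM, since amplitude offsets between electrodes would otherwise dominate the learned features.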

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Electroencephalography / methods
  • Male
  • Memory, Short-Term
  • Neural Networks, Computer
  • Rats
  • Rats, Sprague-Dawley
  • Speech
  • Speech Perception*

Grants and funding

This work was supported by Basic Science Research program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (No. 2020R1A2B5B01002297). This work was also supported by GIST Research Institute (GRI) IIBR grant funded by the GIST in 2022. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.