Decoding and synthesizing tonal language speech from brain activity

Sci Adv. 2023 Jun 9;9(23):eadh0478. doi: 10.1126/sciadv.adh0478. Epub 2023 Jun 9.

Abstract

Recent studies have shown the feasibility of speech brain-computer interfaces (BCIs) as a clinically valid treatment for helping patients with communication disorders who speak nontonal languages restore their speech ability. However, speech BCIs for tonal languages are challenging because producing lexical tones requires additional precise control of laryngeal movements. Thus, the model should emphasize features from the tone-related cortex. Here, we designed a modularized multistream neural network that directly synthesizes tonal language speech from intracranial recordings. The network decoded lexical tones and base syllables independently via parallel streams of neural network modules inspired by neuroscience findings. The speech was synthesized by combining tonal syllable labels with nondiscriminant speech neural activity. Compared to commonly used baseline models, our proposed models achieved higher performance with modest training data and computational costs. These findings suggest a potential strategy for approaching tonal language speech restoration.
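
The parallel-stream idea can be illustrated with a minimal sketch. This is not the authors' network: each stream here is a toy linear layer with softmax standing in for a neural network module, and every name, weight, electrode index, and label set (decode_tonal_syllable, tone_idx, syll_idx, "ma"/"mi", tones 1 to 4) is a hypothetical assumption chosen for illustration. It shows only how a tone stream and a base-syllable stream might read different feature subsets and have their independent predictions combined into one tonal syllable label. The speech synthesis step is not covered.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_stream(features, weights):
    """One stream: a linear layer plus softmax (toy stand-in for a neural module)."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def decode_tonal_syllable(features, tone_idx, syll_idx,
                          tone_w, syll_w, tones, syllables):
    """Decode tone and base syllable independently, then combine the labels.

    tone_idx / syll_idx are hypothetical feature (electrode) subsets; the tone
    stream is assumed to read only from tone-related sites.
    """
    tone_feats = [features[i] for i in tone_idx]
    syll_feats = [features[i] for i in syll_idx]
    tone_probs = linear_stream(tone_feats, tone_w)   # stream 1: lexical tone
    syll_probs = linear_stream(syll_feats, syll_w)   # stream 2: base syllable
    tone = tones[argmax(tone_probs)]
    syll = syllables[argmax(syll_probs)]
    return f"{syll}{tone}"


# Made-up example: four neural features, two per stream.
features = [0.2, 1.5, -0.3, 0.9]
label = decode_tonal_syllable(
    features,
    tone_idx=[0, 1],
    syll_idx=[2, 3],
    tone_w=[[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0], [0.5, 0.5]],
    syll_w=[[1.0, 0.0], [0.0, 1.0]],
    tones=[1, 2, 3, 4],
    syllables=["ma", "mi"],
)
print(label)
```

Keeping the two streams separate means each can be trained or swapped independently, which is one plausible reading of "modularized" in the abstract.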

MeSH terms

  • Brain
  • Humans
  • Language*
  • Neural Networks, Computer
  • Speech*