Improved tactile speech robustness to background noise with a dual-path recurrent neural network noise-reduction method

Mark D Fletcher; Samuel W Perry; Iordanis Thoidis; Carl A Verschuur; Tobias Goehring

doi:10.1038/s41598-024-57312-7

Sci Rep. 2024 Mar 28;14(1):7357. doi: 10.1038/s41598-024-57312-7.

Authors

Mark D Fletcher¹ ², Samuel W Perry³ ⁴, Iordanis Thoidis⁵, Carl A Verschuur³, Tobias Goehring⁶

Affiliations

¹ University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK. M.D.Fletcher@soton.ac.uk.
² Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK. M.D.Fletcher@soton.ac.uk.
³ University of Southampton Auditory Implant Service, University of Southampton, University Road, Southampton, SO17 1BJ, UK.
⁴ Institute of Sound and Vibration Research, University of Southampton, University Road, Southampton, SO17 1BJ, UK.
⁵ School of Electrical and Computer Engineering, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece.
⁶ MRC Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge, CB2 7EF, UK.

Abstract

Many people with hearing loss struggle to understand speech in noisy environments, making noise robustness critical for hearing-assistive devices. Recently developed haptic hearing aids, which convert audio to vibration, can improve speech-in-noise performance for cochlear implant (CI) users and assist those unable to access hearing-assistive devices. They are typically body-worn rather than head-mounted, allowing additional space for batteries and microprocessors, and so can deploy more sophisticated noise-reduction techniques. The current study assessed whether a real-time-feasible dual-path recurrent neural network (DPRNN) can improve tactile speech-in-noise performance. Audio was converted to vibration on the wrist using a vocoder method, either with or without noise reduction. Performance was tested for speech in multi-talker noise (recorded at a party) at a signal-to-noise ratio of 2.5 dB. An objective assessment showed that the DPRNN improved the scale-invariant signal-to-distortion ratio by 8.6 dB and substantially outperformed a traditional noise-reduction method (log-MMSE). A behavioural assessment in 16 participants showed that the DPRNN improved tactile-only sentence identification in noise by 8.2%. This suggests that advanced techniques like the DPRNN could substantially improve outcomes with haptic hearing aids. Low-cost haptic devices could soon be an important supplement to hearing-assistive devices such as CIs or offer an alternative for people who cannot access CI technology.
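
The abstract reports two quantities that are easier to interpret with a small worked example: the 2.5-dB signal-to-noise ratio of the test material and the scale-invariant signal-to-distortion ratio (SI-SDR) used for the objective assessment. The sketch below is illustrative only and is not the authors' evaluation code. It uses the commonly used SI-SDR definition (projecting the estimate onto the zero-mean target), and the function names `mix_at_snr` and `si_sdr` and the toy sine and cosine signals are assumptions made for this example.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db`, then add it to `speech`.

    Hypothetical helper for illustration; returns (mixture, scaled_noise).
    """
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    # Gain that brings the noise power to speech_power / 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def si_sdr(estimate, target):
    """Scale-invariant signal-to-distortion ratio in dB (common definition)."""
    # Remove the mean of each signal, a usual convention for this metric.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]
    # Optimal scaling of the target that best matches the estimate.
    alpha = sum(e * t for e, t in zip(est, tgt)) / sum(t * t for t in tgt)
    projection = [alpha * t for t in tgt]
    residual = [e - p for e, p in zip(est, projection)]
    signal_energy = sum(p * p for p in projection)
    distortion_energy = sum(r * r for r in residual)
    return 10 * math.log10(signal_energy / distortion_energy)


# Toy signals (assumed for illustration): a "speech" sine and a "noise" cosine
# at a different frequency, so the two are uncorrelated over the window.
n = 1000
speech = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
noise = [math.cos(2 * math.pi * 7 * k / n) for k in range(n)]

mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=2.5)
print(round(si_sdr(mixture, speech), 2))
```

Because the toy noise is uncorrelated with the toy speech, the SI-SDR of the unprocessed mixture should come out close to the 2.5-dB mixing SNR. An improvement such as the 8.6 dB reported above would then correspond to the SI-SDR of a denoised output minus the SI-SDR of the unprocessed mixture, each measured against the clean speech.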

MeSH terms

Cochlear Implantation* / methods
Cochlear Implants*
Hearing Loss* / surgery
Humans
Neural Networks, Computer
Speech
Speech Perception*

Grants and funding

MR/T03095X/1/MRC_/Medical Research Council/United Kingdom