A Comparative Investigation of Automatic Speech Recognition Platforms for Aphasia Assessment Batteries

Seedahmed S Mahmoud; Raphael F Pallaud; Akshay Kumar; Serri Faisal; Yin Wang; Qiang Fang

doi:10.3390/s23020857

Sensors (Basel). 2023 Jan 11;23(2):857. doi: 10.3390/s23020857.

Authors

Seedahmed S Mahmoud¹, Raphael F Pallaud², Akshay Kumar¹, Serri Faisal², Yin Wang¹, Qiang Fang¹

Affiliations

¹ Department of Biomedical Engineering, Shantou University, Shantou 515063, China.
² Computer and Information Technology Department, IT Institute @ Phoenix College, Phoenix, AZ 85013, USA.

Abstract

The rehabilitation of people with aphasia (PWA) depends fundamentally on the assessment of their speech impairment. Because the number of stroke cases grows each year, developing methods for assessing speech impairment automatically is important. Traditionally, aphasia is assessed manually using one of the well-known assessment batteries, such as the Western Aphasia Battery (WAB), the Chinese Rehabilitation Research Center Aphasia Examination (CRRCAE), or the Boston Diagnostic Aphasia Examination (BDAE). In aphasia testing, a speech-language pathologist (SLP) administers multiple subtests to assess PWA. This traditional assessment is resource-intensive and requires the presence of an SLP. Thus, automating the assessment of aphasia is essential. This paper evaluates and compares custom machine learning (ML) speech recognition algorithms against off-the-shelf platforms, using healthy and aphasic speech datasets on the naming and repetition subtests of the aphasia battery. The custom ML algorithms are convolutional neural networks (CNN) and linear discriminant analysis (LDA); the off-the-shelf platforms are Microsoft Azure and Google speech recognition. The results show that the CNN-based speech recognition algorithms outperform both LDA and the off-the-shelf platforms. The ResNet-50 CNN architecture yielded an accuracy of 99.64 ± 0.26% on the healthy dataset. Even though Microsoft Azure was not trained on the same healthy dataset, it produced results comparable to those of LDA and superior to those of Google's speech recognition platform.
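
For readers unfamiliar with how a figure such as "accuracy ± spread" can be obtained for a naming or repetition subtest, the following is a minimal Python sketch. It is not the authors' pipeline: the function names (normalize, accuracy, summarize), the exact-match scoring rule, the averaging over repeated runs, and the use of the sample standard deviation are all assumptions made for illustration, and the abstract does not state how the reported spread was computed.

```python
from statistics import mean, stdev


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace (assumed scoring rule)."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def accuracy(targets: list[str], transcripts: list[str]) -> float:
    """Fraction of test items whose recognized transcript matches the target exactly."""
    if len(targets) != len(transcripts) or not targets:
        raise ValueError("targets and transcripts must be non-empty and equal in length")
    hits = sum(normalize(t) == normalize(h) for t, h in zip(targets, transcripts))
    return hits / len(targets)


def summarize(run_accuracies: list[float]) -> tuple[float, float]:
    """Mean and sample standard deviation, in percent, over repeated runs (hypothetical)."""
    percentages = [100.0 * a for a in run_accuracies]
    return mean(percentages), stdev(percentages)
```

Under these assumptions, each run's recognizer output would be scored with accuracy(), and summarize() would then reduce the per-run scores to a single "mean ± standard deviation" figure. The same scoring function could be applied to transcripts returned by any recognizer, custom or off-the-shelf, so that the platforms are compared on equal terms.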

Keywords: aphasia; deep learning; speech impairment assessment.

MeSH terms

Aphasia* / diagnosis
Aphasia* / rehabilitation
Humans
Language
Speech
Speech Disorders
Speech Perception*
Stroke*

Grants and funding

2020LKSFG04C/Li Ka Shing Foundation