Machine learning approach for detecting Covid-19 from speech signal using Mel frequency magnitude coefficient

Sudhansu Sekhar Nayak; Anand D Darji; Prashant K Shah

doi:10.1007/s11760-023-02537-8

Signal Image Video Process. 2023 Mar 25:1-8. doi: 10.1007/s11760-023-02537-8. Online ahead of print.

Authors

Sudhansu Sekhar Nayak¹, Anand D Darji¹, Prashant K Shah¹

Affiliation

¹ Sardar Vallabhbhai National Institute of Technology, Surat, Gujarat, India.

Abstract

The Covid-19 pandemic is one of the most significant global health concerns to have emerged in this decade. Intelligent healthcare technologies that combine speech signals with artificial intelligence make faster and more efficient detection of Covid-19 feasible. The main objective of our study is to design a noninvasive, low-cost, remote method for diagnosing Covid-19 from speech signals. In this study, we developed a system that detects Covid-19 from speech signals using Mel frequency magnitude coefficients (MFMC) and machine learning techniques. To capture higher-order spectral features, MFMC divides the spectrum into a larger number of subbands with narrower bandwidths, which leads to better frequency resolution and less overall noise. Because of this improved frequency resolution and the reduced noise in the extracted features, the higher-order MFMCs identify Covid-19 from speech signals with increased accuracy. Machine learning procedures are often less complicated than deep learning procedures and can commonly be run on regular computers, whereas deep learning systems need extensive computing power and data storage. In our study, twelve, twenty-four, thirty, and forty spectral coefficients are obtained using MFMC, and performance on these coefficients is assessed using machine learning classifiers such as random forests and K-nearest neighbor (KNN). KNN performed better than the other model, with an AUC score of 0.80.
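The abstract does not spell out the extraction steps, so the following is only a minimal sketch of how MFMC-style features might be computed. It assumes the common description of MFMC as the log of mel-filtered magnitude (not power) spectra with no final DCT. The function names, the 16 kHz sample rate, the 25 ms frame, the 10 ms hop, and the FFT size are all illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(
        np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    )
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # Triangle = lower of the two slopes, clipped at zero outside [lo, hi].
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def mfmc(signal, sample_rate=16000, n_coeffs=40,
         frame_len=400, hop=160, n_fft=512):
    """Hypothetical MFMC extractor: log mel-filtered magnitude spectrum.

    Assumed settings (not from the paper): Hamming window, 25 ms frames,
    10 ms hop at 16 kHz. More coefficients means narrower subbands.
    Returns an array of shape (n_frames, n_coeffs).
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Magnitude spectrum, not the squared (power) spectrum used for MFCC.
    magnitude = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    bank = mel_filterbank(n_coeffs, n_fft, sample_rate)
    # Small constant avoids log(0) on silent frames; no DCT is applied.
    return np.log(magnitude @ bank.T + 1e-10)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(16000)  # one second of synthetic noise
    for n in (12, 24, 30, 40):          # coefficient counts from the abstract
        print(n, mfmc(audio, n_coeffs=n).shape)
```

One plausible way to use such features is to average them over frames to get one vector per recording, then pass those vectors to a classifier such as scikit-learn's `KNeighborsClassifier` or `RandomForestClassifier` and score the predictions with `roc_auc_score`. Whether the authors pooled frames this way is not stated in the abstract.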

Keywords: Covid-19; Machine learning; Mel frequency magnitude coefficient; Speech feature; Speech signal.

© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.