A novel multimodal fusion framework for early diagnosis and accurate classification of COVID-19 patients using X-ray images and speech signal processing techniques

Santosh Kumar; Mithilesh Kumar Chaube; Saeed Hamood Alsamhi; Sachin Kumar Gupta; Mohsen Guizani; Raffaele Gravina; Giancarlo Fortino

doi:10.1016/j.cmpb.2022.107109

Comput Methods Programs Biomed. 2022 Nov:226:107109. doi: 10.1016/j.cmpb.2022.107109. Epub 2022 Sep 12.

Authors

Santosh Kumar¹, Mithilesh Kumar Chaube², Saeed Hamood Alsamhi³, Sachin Kumar Gupta⁴, Mohsen Guizani⁵, Raffaele Gravina⁶, Giancarlo Fortino⁷

Affiliations

¹ Department of Computer Science and Engineering, International Institute of Information Technology, Naya Raipur, Chhattisgarh, India. Electronic address: santosh@iiitnr.edu.in.
² Department of Mathematical Sciences, International Institute of Information Technology, Naya Raipur, Chhattisgarh, India. Electronic address: mithilesh@iiitnr.edu.in.
³ Insight Centre for Data Analytics, National University of Ireland, Galway, Ireland; Faculty of Engineering, IBB University, Ibb, Yemen. Electronic address: saeed.alsamhi@insight-centre.org.
⁴ School of Electronics and Communication Engineering, Shri Mata Vaishno Devi University, Katra, India. Electronic address: sachin.gupta@smvdu.ac.in.
⁵ Machine Learning Department, Mohamed Bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates. Electronic address: mguizani@ieee.org.
⁶ Department of Informatics, Modeling, Electronic, and System Engineering, University of Calabria, Rende 87036, Italy. Electronic address: r.gravina@dimes.unical.it.
⁷ Department of Informatics, Modeling, Electronic, and System Engineering, University of Calabria, Rende 87036, Italy. Electronic address: giancarlo.fortino@unical.it.

Abstract

Background and objective: The COVID-19 outbreak has become one of the most challenging problems facing humanity. COVID-19 is a communicable disease caused by a new coronavirus strain, which has already infected over 375 million people and caused almost 6 million deaths. This paper aims to design and develop a framework for early diagnosis and fast classification of COVID-19 symptoms using multimodal deep learning techniques.

Methods: We collected chest X-ray and cough sample data from open-source datasets (including the Cohen dataset) and from local hospitals. Features are extracted from the chest X-ray images in these datasets. We also used cough audio datasets from the Coswara project and local hospitals. The publicly available Coughvid, DetectNow, and Virufy datasets are used to evaluate COVID-19 detection based on speech, respiratory, and cough sounds. The collected audio data comprise slow and fast breathing, shallow and deep coughing, spoken digits, and phonation of sustained vowels. Gender, geographical location, age, preexisting medical conditions, and current health status (COVID-19 or non-COVID-19) are also recorded.

Results: The proposed framework uses a selection algorithm over pre-trained networks to determine the best fusion model, characterized by the pre-trained chest X-ray and cough models. Deep chest X-ray fusion by discriminant correlation analysis is then used to fuse discriminatory features from the two models. The proposed framework achieved recognition accuracy, specificity, and sensitivity of 98.91%, 96.25%, and 97.69%, respectively. With the fusion method, we obtained 94.99% accuracy.
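
To make the fusion step more concrete, the following Python sketch shows one common formulation of discriminant correlation analysis (DCA) for feature-level fusion of two modalities. It is an illustration only, not the authors' implementation: the function names (`between_class_whitener`, `dca_fuse`), the feature dimensions, and the synthetic data standing in for X-ray and cough embeddings are all assumptions made for this example.

```python
import numpy as np


def between_class_whitener(X, labels, r):
    """Return W (p x r) such that W.T @ S_b @ W = I,
    where S_b is the between-class scatter of X (n samples x p features)."""
    classes = np.unique(labels)
    overall_mean = X.mean(axis=0)
    # Columns of phi: sqrt(n_i) * (class mean - overall mean), shape (p, c)
    phi = np.stack(
        [np.sqrt((labels == c).sum()) * (X[labels == c].mean(axis=0) - overall_mean)
         for c in classes],
        axis=1,
    )
    # Work with the small c x c matrix phi.T @ phi instead of the p x p scatter
    eigvals, eigvecs = np.linalg.eigh(phi.T @ phi)
    order = np.argsort(eigvals)[::-1][:r]      # keep the r largest eigenvalues
    lam, Q = eigvals[order], eigvecs[:, order]
    # Columns of phi @ Q have squared norm lam; dividing by lam whitens S_b
    return (phi @ Q) / lam


def dca_fuse(X, Y, labels):
    """Fuse two feature sets (same samples, same labels) by concatenating
    their DCA-transformed versions. Returns (fused, X_star, Y_star)."""
    n_classes = np.unique(labels).size
    r = min(n_classes - 1, X.shape[1], Y.shape[1])
    # Step 1: project each modality so its between-class scatter becomes identity
    Xp = X @ between_class_whitener(X, labels, r)
    Yp = Y @ between_class_whitener(Y, labels, r)
    # Step 2: diagonalise the between-set covariance of the projected features
    U, s, Vt = np.linalg.svd(Xp.T @ Yp)
    X_star = Xp @ (U / np.sqrt(s))
    Y_star = Yp @ (Vt.T / np.sqrt(s))
    # Step 3: fusion by concatenation (summation is another common choice)
    return np.hstack([X_star, Y_star]), X_star, Y_star


# Synthetic stand-ins for per-sample embeddings from two pre-trained models
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 30)           # 3 hypothetical classes
xray_centers = rng.normal(size=(3, 16))
cough_centers = rng.normal(size=(3, 8))
xray_feats = xray_centers[labels] + 0.3 * rng.normal(size=(90, 16))
cough_feats = cough_centers[labels] + 0.3 * rng.normal(size=(90, 8))

fused, xs, cs = dca_fuse(xray_feats, cough_feats, labels)
print(fused.shape)
```

The fused matrix would then be passed to a downstream classifier. Because DCA keeps at most one fewer dimension than the number of classes per modality, a two-class problem such as COVID-19 versus non-COVID-19 yields a single fused dimension per modality under this formulation.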

Conclusion: This paper examines the effectiveness of well-known ML architectures on a joint collection of chest X-rays and cough samples for early classification of COVID-19. It shows that existing methods can be used effectively for diagnosis and suggests that the fusion learning paradigm could be a crucial asset in diagnosing future unknown illnesses. The proposed framework supports health informatics based on early diagnosis, clinical decision support, and accurate prediction.

Keywords: COVID-19; Deep learning; Early detection; Multimodal fusion; Speech processing; X-ray image classification.

MeSH terms

COVID-19* / diagnostic imaging
Cough / diagnostic imaging
Deep Learning*
Early Diagnosis
Humans
SARS-CoV-2
Speech
X-Rays