Transfer learning for the efficient detection of COVID-19 from smartphone audio data

Mattia Giovanni Campana; Franca Delmastro; Elena Pagani

doi:10.1016/j.pmcj.2023.101754

Pervasive Mob Comput. 2023 Feb:89:101754. doi: 10.1016/j.pmcj.2023.101754. Epub 2023 Jan 30.

Authors

Mattia Giovanni Campana¹, Franca Delmastro¹, Elena Pagani²

Affiliations

¹ Institute for Informatics and Telematics of the National Research Council of Italy (IIT-CNR), Pisa, Italy.
² Computer Science Department, University of Milano, Milan, Italy.

Abstract

Disease detection from smartphone data is an open research challenge in mobile health (m-health) systems. COVID-19 and its respiratory symptoms are an important case study in this area, and their early detection is a potentially valuable instrument for counteracting the pandemic. The efficacy of this solution depends mainly on the performance of the AI algorithms applied to the collected data and on the possibility of running them directly on users' mobile devices. Considering these issues, and the limited amount of available data, in this paper we present an experimental evaluation of 3 deep learning models, also compared with hand-crafted features, and of two main transfer learning approaches in the considered scenario: feature extraction and fine-tuning. Specifically, we considered VGGish, YAMNET, and L³-Net (including 12 different configurations), evaluated through user-independent experiments on 4 datasets (13,447 samples in total). Results clearly show the advantages of L³-Net in all the experimental settings: it outperforms the other solutions by 12.3% in terms of Precision-Recall AUC as a feature extractor, and by 10% when the model is fine-tuned. Moreover, we note that fine-tuning only the fully-connected layers of the pre-trained models generally leads to worse performance, with an average drop of 6.6% with respect to feature extraction. Finally, we evaluate the memory footprints of the different models for their possible deployment on commercial mobile devices.

Keywords: COVID-19; Deep audio embeddings; Deep learning; Transfer learning; m-health.
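
Illustrative sketch (not the authors' code): the abstract refers to user-independent experiments scored by Precision-Recall AUC. The Python below shows one way such a protocol could look, under stated assumptions. The function names, the 20% test fraction, and the use of average precision as the PR AUC estimator are all hypothetical choices made for this example; the paper may split and score differently.

```python
import random
from collections import defaultdict


def user_independent_split(user_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no user appears in both train and test.

    user_ids[i] is the (hypothetical) identifier of the user who recorded
    sample i. The test_fraction applies to users, not to samples.
    """
    by_user = defaultdict(list)
    for idx, user in enumerate(user_ids):
        by_user[user].append(idx)

    users = sorted(by_user)
    random.Random(seed).shuffle(users)

    # Hold out at least one user for testing.
    n_test = max(1, round(len(users) * test_fraction))
    test_users, train_users = users[:n_test], users[n_test:]

    train_idx = sorted(i for u in train_users for i in by_user[u])
    test_idx = sorted(i for u in test_users for i in by_user[u])
    return train_idx, test_idx


def average_precision(labels, scores):
    """Average precision, a step-wise estimate of the area under the PR curve.

    labels are 1 (positive) or 0 (negative); higher scores mean "more
    positive". Tied scores keep their input order, a simplification.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive label")

    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / n_pos


# Toy usage: 7 recordings from 5 users, then scoring 4 test predictions.
users = ["a", "a", "b", "c", "c", "d", "e"]
train_idx, test_idx = user_independent_split(users)
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

Splitting by user rather than by sample keeps recordings from the same person out of both sets, which is what makes the evaluation user-independent. A precision-recall metric is commonly preferred over accuracy when positive cases are rare.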