Triplet Loss-Based Models for COVID-19 Detection from Vocal Sounds

Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:998-1001. doi: 10.1109/EMBC48229.2022.9871125.

Abstract

This work focuses on the automatic detection of COVID-19 from the analysis of vocal sounds, including sustained vowels, coughs, and speech while reading a short text. Specifically, we use Mel-spectrogram representations of these acoustic signals to train neural network-based models for the task at hand. Deep representations are extracted from the Mel-spectrograms with Convolutional Neural Networks (CNNs). To guide the training of the embeddings towards more separable and robust inter-class representations, we explore the use of a triplet loss function. The experiments are conducted on the Your Voice Counts dataset, a new dataset of German speakers collected with smartphones. The results support the suitability of triplet loss-based models for detecting COVID-19 from vocal sounds. The best Unweighted Average Recall (UAR) of 66.5% is obtained with a triplet loss-based model exploiting vocal sounds recorded while reading.
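To make the described pipeline concrete, the following is a minimal sketch, assuming a PyTorch implementation: a small CNN maps Mel-spectrograms to embeddings, and the embeddings are trained with a triplet margin loss so that recordings of the same COVID-19 status are pulled together while those of different statuses are pushed apart. The network architecture, embedding size, margin, and all other hyperparameters below are illustrative assumptions, not the authors' configuration.

    # Hedged sketch of the abstract's approach: CNN embeddings of
    # Mel-spectrograms trained with a triplet loss. All sizes are assumptions.
    import torch
    import torch.nn as nn

    class EmbeddingCNN(nn.Module):
        """Maps a (1, n_mels, n_frames) Mel-spectrogram to an embedding."""
        def __init__(self, embedding_dim: int = 128):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.AdaptiveAvgPool2d(1),  # global pooling -> fixed-size vector
            )
            self.fc = nn.Linear(32, embedding_dim)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            z = self.features(x).flatten(1)
            return nn.functional.normalize(self.fc(z), dim=1)  # unit-norm embeddings

    model = EmbeddingCNN()
    triplet_loss = nn.TripletMarginLoss(margin=1.0)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    # One illustrative training step on random stand-in spectrograms
    # (batch of 8; 64 Mel bands x 256 frames). Anchor and positive share a
    # class (e.g. both COVID-positive); the negative comes from the other class.
    anchor   = torch.randn(8, 1, 64, 256)
    positive = torch.randn(8, 1, 64, 256)
    negative = torch.randn(8, 1, 64, 256)

    loss = triplet_loss(model(anchor), model(positive), model(negative))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

For inference under this setup, a common choice is to classify a test recording by comparing its embedding against labeled reference embeddings (e.g. a nearest-neighbor rule), since the triplet loss shapes the embedding space rather than producing class probabilities directly.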

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acoustics
  • COVID-19* / diagnosis
  • Humans
  • Neural Networks, Computer
  • Speech
  • Voice*