Classification of real-world pathological phonocardiograms through multi-instance learning

Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov;2021:771-774. doi: 10.1109/EMBC46164.2021.9630705.

Abstract

Heart auscultation is an inexpensive and fundamental technique for effectively diagnosing cardiovascular disease. However, due to relatively high human error rates even when auscultation is performed by an experienced physician, and due to the limited availability of qualified personnel, e.g. in developing countries, a large body of research is attempting to develop automated, computational tools for detecting abnormalities in heart sounds. The large heterogeneity of achievable data quality and devices, the variety of possible heart pathologies, and a generally poor signal-to-noise ratio make this problem extremely challenging. We present an accurate classification strategy for diagnosing heart sounds based on 1) automatic heart phase segmentation, 2) state-of-the-art filters drawn from the field of speech synthesis (mel-frequency cepstral representation), and 3) an ad-hoc multi-branch, multi-instance artificial neural network based on convolutional layers and fully connected neuronal ensembles, which learns from each heart phase separately, hence leveraging their different physiological significance. We demonstrate that it is possible to train our architecture to reach very high performance, e.g. an AUC of 0.87 or a sensitivity of 0.97. Our machine-learning-based tool could be employed for heart sound classification, especially as a screening tool in a variety of situations including telemedicine applications.
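The abstract does not give implementation details, so the following is only a minimal sketch of the kind of pipeline it describes: per-phase MFCC features feeding a multi-branch convolutional network with one branch per heart phase, merged into a single normal/abnormal output. The choice of four phases (S1, systole, S2, diastole), the use of librosa and Keras, and all layer sizes, segment lengths, and sampling parameters are assumptions for illustration, not the authors' configuration.

```python
# Hedged sketch (not the authors' implementation): each heart phase gets its own
# convolutional branch over an MFCC patch; branch outputs are concatenated and
# mapped to one abnormality probability. Hyperparameters are illustrative only.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, Model

PHASES = ["S1", "systole", "S2", "diastole"]  # assumed phase labels
N_MFCC = 13    # assumed number of cepstral coefficients
FRAMES = 32    # assumed fixed number of MFCC frames per phase segment

def phase_mfcc(segment: np.ndarray, sr: int = 2000) -> np.ndarray:
    """MFCC patch for one heart-phase segment, padded/cropped to FRAMES frames.
    Short FFT window and few mel bands are assumed to suit brief PCG segments."""
    mfcc = librosa.feature.mfcc(y=segment, sr=sr, n_mfcc=N_MFCC,
                                n_fft=256, hop_length=64, n_mels=32)
    mfcc = librosa.util.fix_length(mfcc, size=FRAMES, axis=1)
    return mfcc[..., np.newaxis]  # shape: (N_MFCC, FRAMES, 1)

def phase_branch(name: str):
    """One small convolutional branch per heart phase."""
    inp = layers.Input(shape=(N_MFCC, FRAMES, 1), name=f"{name}_mfcc")
    x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(32, activation="relu")(x)
    return inp, x

inputs, branches = zip(*(phase_branch(p) for p in PHASES))
merged = layers.Concatenate()(list(branches))
merged = layers.Dense(64, activation="relu")(merged)
output = layers.Dense(1, activation="sigmoid", name="abnormal_prob")(merged)

model = Model(inputs=list(inputs), outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])
model.summary()
```

Under these assumptions, training would pass a dict mapping each `"<phase>_mfcc"` input name to an array of per-phase MFCC patches (one patch per recording), together with binary normal/abnormal labels, to `model.fit`.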

MeSH terms

  • Heart Auscultation
  • Heart Sounds*
  • Humans
  • Machine Learning
  • Neural Networks, Computer
  • Signal-To-Noise Ratio