Robust and Interpretable Temporal Convolution Network for Event Detection in Lung Sound Recordings

IEEE J Biomed Health Inform. 2022 Jul;26(7):2898-2908. doi: 10.1109/JBHI.2022.3144314. Epub 2022 Jul 1.

Abstract

Objective: This paper proposes a novel framework for lung sound event detection that segments continuous lung sound recordings into discrete events and recognizes each event.

Methods: We propose the use of a multi-branch TCN architecture and exploit a novel fusion strategy to combine the resultant features from these branches. This not only allows the network to retain the most salient information across different temporal granularities while disregarding irrelevant information, but also allows it to process recordings of arbitrary length.
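To make the Methods concrete, the following is a minimal NumPy sketch of the general idea: several TCN branches, each a stack of dilated causal convolutions at a different dilation rate (i.e., a different temporal granularity), whose pooled outputs are fused by concatenation. Global pooling over the time axis is what lets the fused feature vector have a fixed size regardless of recording length. This is an illustrative reconstruction, not the authors' implementation; all function names, layer sizes, and the choice of max pooling are assumptions.

```python
import numpy as np

def dilated_causal_conv1d(x, w, dilation):
    """Dilated causal 1-D convolution with ReLU.
    x: (T, C_in) frame-level features; w: (K, C_in, C_out) kernel.
    Causal: output at time t sees only inputs at t, t-d, t-2d, ...
    """
    K, C_in, C_out = w.shape
    T = x.shape[0]
    y = np.zeros((T, C_out))
    for t in range(T):
        for k in range(K):
            idx = t - k * dilation
            if idx >= 0:
                y[t] += x[idx] @ w[k]
    return np.maximum(y, 0.0)  # ReLU nonlinearity

def tcn_branch(x, layer_weights, dilation):
    """One branch: a stack of dilated causal conv layers at a fixed dilation."""
    for w in layer_weights:
        x = dilated_causal_conv1d(x, w, dilation)
    return x

def multi_branch_tcn(x, branch_weights, dilations):
    """Run each branch at its own dilation (temporal granularity),
    pool over time, and fuse by feature concatenation."""
    feats = [tcn_branch(x, w, d).max(axis=0)  # global max pool over time
             for w, d in zip(branch_weights, dilations)]
    return np.concatenate(feats)  # concatenation fusion

# Usage: 50 frames of 8 spectral features, 3 branches of 2 layers each
# (hypothetical sizes chosen only for illustration).
rng = np.random.default_rng(0)
x = rng.standard_normal((50, 8))
branches = [[rng.standard_normal((3, 8, 16)) * 0.1,
             rng.standard_normal((3, 16, 16)) * 0.1] for _ in range(3)]
fused = multi_branch_tcn(x, branches, dilations=[1, 2, 4])
```

Note that `fused` has shape `(48,)` (3 branches x 16 channels) whether the input has 50 frames or 500, which is how pooling-then-concatenation accommodates recordings of arbitrary length.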

Results: The proposed method is evaluated on multiple public and in-house benchmarks, containing irregular and noisy recordings of the respiratory auscultation process for the identification of auscultation events including inhalation, crackles, and rhonchi. Moreover, we provide an end-to-end model interpretation pipeline.

Conclusion: Our analysis of different feature fusion strategies shows that the proposed feature concatenation method leads to better suppression of non-informative features, which drastically reduces the classifier overhead, resulting in a robust, lightweight network.

Significance: Lung sound event detection is a primary diagnostic step for numerous respiratory diseases. The proposed method offers a cost-effective and efficient alternative to exhaustive manual segmentation, and achieves more accurate segmentation than existing methods. The end-to-end model interpretability helps to build the required trust in the system for use in clinical settings.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Auscultation / methods
  • Humans
  • Lung
  • Respiratory Sounds*
  • Sound Recordings*