Unsupervised Multi-Latent Space RL Framework for Video Summarization in Ultrasound Imaging

IEEE J Biomed Health Inform. 2023 Jan;27(1):227-238. doi: 10.1109/JBHI.2022.3208779. Epub 2023 Jan 4.

Abstract

The COVID-19 pandemic has highlighted the need for a tool that speeds up triage in ultrasound scans and gives clinicians fast access to relevant information. To this end, we propose a new unsupervised reinforcement learning (RL) framework for summarizing ultrasound videos, with novel rewards that enable unsupervised learning and avoid tedious, impractical manual labelling. The proposed framework delivers video summaries with classification labels and segmentations of key landmarks, which enhances its utility as a triage tool in the emergency department (ED) and in telemedicine. Using an attention ensemble of encoders, each high-dimensional image is projected into a low-dimensional latent space that captures: a) the distance to a normal or abnormal class (classifier encoder), b) the topology of key landmarks (segmentation encoder), and c) a distance- and topology-agnostic representation (autoencoders). The summarization network is implemented as a bidirectional long short-term memory (Bi-LSTM) network that operates on the latent representations from the encoders. Validation is performed on lung ultrasound (LUS) videos, which represent typical use cases in telemedicine and ED triage, acquired from medical centers in different geographies (India and Spain). Trained and tested on 126 LUS videos, the proposed approach showed high agreement with the ground truth, with an average precision of over 80% and an average F1 score above 44 ± 1.7%. The approach reduced storage space by 77% on average, which can ease bandwidth and storage requirements in telemedicine.
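The pipeline described in the abstract (multiple per-frame encoders, attention-based fusion of their latent vectors, and a Bi-LSTM that scores frames for inclusion in the summary) can be sketched roughly as below. This is an illustrative PyTorch sketch, not the authors' implementation: the module names, dimensions, fusion scheme, and encoder backbones are assumptions made only to show the data flow.

# Illustrative sketch only; architecture details are assumed, not from the paper.
import torch
import torch.nn as nn

class FrameEncoder(nn.Module):
    """Stand-in for any of the three encoders (classifier, segmentation, autoencoder)."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, latent_dim),
        )

    def forward(self, x):               # x: (B*T, 1, H, W)
        return self.net(x)              # (B*T, latent_dim)

class AttentionFusion(nn.Module):
    """Attention weights over the encoder ensemble, applied per frame."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.score = nn.Linear(latent_dim, 1)

    def forward(self, latents):         # latents: (B*T, n_encoders, latent_dim)
        w = torch.softmax(self.score(latents), dim=1)
        return (w * latents).sum(dim=1)  # (B*T, latent_dim)

class Summarizer(nn.Module):
    """Bi-LSTM mapping fused per-frame latents to frame-selection scores."""
    def __init__(self, latent_dim=128, hidden=64):
        super().__init__()
        self.encoders = nn.ModuleList([FrameEncoder(latent_dim) for _ in range(3)])
        self.fusion = AttentionFusion(latent_dim)
        self.lstm = nn.LSTM(latent_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, frames):          # frames: (B, T, 1, H, W)
        B, T = frames.shape[:2]
        flat = frames.reshape(B * T, *frames.shape[2:])
        latents = torch.stack([enc(flat) for enc in self.encoders], dim=1)
        fused = self.fusion(latents).reshape(B, T, -1)
        h, _ = self.lstm(fused)
        return torch.sigmoid(self.head(h)).squeeze(-1)  # (B, T) keep-probabilities

if __name__ == "__main__":
    video = torch.randn(1, 20, 1, 64, 64)   # one video of 20 frames
    print(Summarizer()(video).shape)        # torch.Size([1, 20])

In the paper these per-frame keep-probabilities are trained with the proposed unsupervised RL rewards rather than with frame-level labels; frames with high scores form the summary, which is what yields the reported reduction in storage.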

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • COVID-19*
  • Humans
  • India
  • Lung / diagnostic imaging
  • Pandemics
  • Ultrasonography