Multimodal Continual Learning with Sonographer Eye-Tracking in Fetal Ultrasound

Simpl Med Ultrasound (2021). 2021 Sep 21;12967:14-24. doi: 10.1007/978-3-030-87583-1_2.

Abstract

Deep networks have been shown to achieve impressive accuracy on some medical image analysis tasks where large datasets and annotations are available. However, learning over new sets of classes that arrive over an extended period is a different and difficult challenge, because performance on old classes tends to degrade while the model adapts to new ones. Controlling this 'forgetting' is vital if deployed algorithms are to evolve incrementally as new data arrive. Incremental learning approaches usually rely on expert knowledge in the form of manual annotations or active feedback. In this paper, we explore the role that other forms of expert knowledge might play in making deep networks for medical image analysis immune to forgetting over extended periods. We introduce a novel framework for mitigating this forgetting effect in deep networks, considering the case of combining ultrasound video with the tracked point-of-gaze of expert sonographers during model training. This is used together with a novel weighted distillation strategy to reduce the propagation of effects due to class imbalance.

Keywords: Eye tracking; Fetal ultrasound; Incremental learning.