First Trimester Gaze Pattern Estimation Using Stochastic Augmentation Policy Search for Single Frame Saliency Prediction

Med Image Underst Anal (2021). 2021 Jul:2021:361-374. doi: 10.1007/978-3-030-80432-9_28. Epub 2021 Jul 6.

Abstract

While performing an ultrasound (US) scan, sonographers direct their gaze at regions of interest to verify that the correct plane is acquired and to interpret the acquisition frame. Predicting sonographer gaze on US videos is useful for identification of spatio-temporal patterns that are important for US scanning. This paper investigates utilizing sonographer gaze, in the form of gaze-tracking data, in a multimodal imaging deep learning framework to assist the analysis of the first trimester fetal ultrasound scan. Specifically, we propose an encoderdecoder convolutional neural network with skip connections to predict the visual gaze for each frame using 115 first trimester ultrasound videos; 29,250 video frames for training, 7,290 for validation and 9,126 for testing. We find that the dataset of our size benefits from automated data augmentation, which in turn, alleviates model overfitting and reduces structural variation imbalance of US anatomical views between the training and test datasets. Specifically, we employ a stochastic augmentation policy search method to improve segmentation performance. Using the learnt policies, our models outperform the baseline: KLD, SIM, NSS and CC (2.16, 0.27, 4.34 and 0.39 versus 3.17, 0.21, 2.92 and 0.28).

Keywords: Data augmentation; Fetal ultrasound; First trimester; Gaze tracking; Single frame saliency prediction; U-Net.