Towards Capturing Sonographic Experience: Cognition-Inspired Ultrasound Video Saliency Prediction

Abstract

For visual tasks like ultrasound (US) scanning, experts direct their gaze towards regions of task-relevant information. Therefore, learning to predict the gaze of sonographers on US videos captures the spatio-temporal patterns that are important for US scanning. The spatial distribution of gaze points on video frames can be represented through heat maps termed saliency maps. Here, we propose a temporally bidirectional model for video saliency prediction (BDS-Net), drawing inspiration from modern theories of human cognition. The model consists of a convolutional neural network (CNN) encoder followed by a bidirectional gated-recurrent-unit recurrent convolutional network (GRU-RCN) decoder. The temporal bidirectionality mimics human cognition, which simultaneously reacts to past and predicts future sensory inputs. We train the BDS-Net alongside spatial and temporally one-directional comparative models on the task of predicting saliency in videos of US abdominal circumference plane detection. The BDS-Net outperforms the comparative models on four out of five saliency metrics. We present a qualitative analysis on representative examples to explain the model’s superior performance.
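As a rough illustration of how gaze points can be turned into a saliency map, the sketch below places an isotropic Gaussian at each gaze point and normalizes the result so that it sums to one. This is not the authors' pipeline: the function name `gaze_to_saliency`, the grid size, the gaze coordinates, and the `sigma` value are all assumptions chosen for the example.

```python
import math


def gaze_to_saliency(gaze_points, height, width, sigma=2.0):
    """Convert (row, col) gaze points into a heat map that sums to 1.

    Hypothetical helper for illustration; sigma (in pixels) is an assumed value.
    """
    heat = [[0.0] * width for _ in range(height)]
    for gy, gx in gaze_points:
        for y in range(height):
            for x in range(width):
                # Squared distance from this pixel to the gaze point.
                d2 = (y - gy) ** 2 + (x - gx) ** 2
                heat[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    total = sum(sum(row) for row in heat)
    if total == 0.0:
        # No gaze points: return the all-zero map unchanged.
        return heat
    return [[v / total for v in row] for row in heat]


# Made-up gaze points on a 16x16 frame: two close together, one further away.
saliency = gaze_to_saliency([(4, 4), (4, 5), (10, 12)], height=16, width=16)
```

Normalizing to unit sum lets the map be read as a probability distribution over pixels, which is the form that distribution-based saliency metrics generally expect.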

Publication
Medical Image Understanding and Analysis (MIUA 2019). Best Paper Award

BibTeX

@inproceedings{droste_towards_2020,
 title = {Towards Capturing Sonographic Experience: Cognition-Inspired Ultrasound Video Saliency Prediction},
 author = {Droste, Richard and Cai, Yifan and Sharma, Harshita and Chatelain, Pierre and Papageorghiou, Aris T. and Noble, J. Alison},
 booktitle = {Medical Image Understanding and Analysis},
 doi = {10.1007/978-3-030-39343-4_15},
 editor = {Zheng, Yalin and Williams, Bryan M. and Chen, Ke},
 isbn = {978-3-030-39343-4},
 keywords = {Convolutional neural networks, Fetal ultrasound, Gaze tracking, Video saliency prediction},
 language = {en},
 pages = {174--186},
 publisher = {Springer International Publishing},
 address = {Cham},
 series = {Communications in Computer and Information Science},
 shorttitle = {Towards Capturing Sonographic Experience},
 year = {2020}
}
