Learning to Rank Proposals for Siamese Visual Tracking

IEEE Trans Image Process. 2021:30:8785-8796. doi: 10.1109/TIP.2021.3120305. Epub 2021 Oct 27.

Abstract

Recently, Siamese network based trackers with region proposal networks (RPN), which decompose the visual tracking task into classification and regression, have drawn much attention. However, previous Siamese trackers treat all training samples equally when learning the network, and use only the classification scores of proposals to locate the tracked target at the inference stage. To address these issues, we propose a simple yet effective strategy that ranks the importance of training samples and pays more attention to the important ones, which facilitates classification optimization. Moreover, we propose a lightweight ranking network to generate ranking scores for proposals. Higher scores are assigned to proposals whose Intersection over Union (IoU) with the ground truth is larger. The combination of classification and ranking scores serves as a new proposal selection criterion for online tracking and boosts tracking performance significantly. Our proposed method can be easily integrated into existing RPN-based Siamese networks in an end-to-end fashion. Extensive experiments are conducted on 10 tracking benchmarks: NFS, UAV123, OTB2015, Temple-Color, VOT2016, VOT2017, VOT2019, TrackingNet, GOT-10K and LaSOT. The proposed method achieves state-of-the-art tracking accuracy at real-time speed.
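The proposal selection idea can be illustrated with a minimal sketch. The abstract does not state how the classification and ranking scores are combined, so the weighted sum and the weight `alpha` below are assumptions made purely for illustration, as are the function names `iou` and `select_proposal`. The IoU values stand in for the output of the ranking network, which per the abstract is meant to score proposals higher when their IoU with the ground truth is larger.

```python
import numpy as np


def iou(boxes, gt):
    """IoU of each box in `boxes` (N, 4) with one box `gt` (4,).

    Boxes use the (x1, y1, x2, y2) corner format.
    """
    x1 = np.maximum(boxes[:, 0], gt[0])
    y1 = np.maximum(boxes[:, 1], gt[1])
    x2 = np.minimum(boxes[:, 2], gt[2])
    y2 = np.minimum(boxes[:, 3], gt[3])
    # Clip at zero so non-overlapping boxes get zero intersection.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_boxes = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / (area_boxes + area_gt - inter)


def select_proposal(cls_scores, rank_scores, alpha=0.5):
    """Pick the proposal with the highest combined score.

    ASSUMPTION: a weighted sum with a hypothetical weight `alpha`;
    the paper's actual combination rule is not given in the abstract.
    """
    combined = alpha * cls_scores + (1.0 - alpha) * rank_scores
    return int(np.argmax(combined)), combined


gt = np.array([0.0, 0.0, 10.0, 10.0])
proposals = np.array([
    [0.0, 0.0, 10.0, 10.0],    # perfectly aligned with the target
    [5.0, 5.0, 15.0, 15.0],    # partial overlap
    [20.0, 20.0, 30.0, 30.0],  # no overlap
])
cls_scores = np.array([0.6, 0.9, 0.1])  # made-up classification scores
rank_scores = iou(proposals, gt)        # stand-in for ranking-network output

best_cls_only = int(np.argmax(cls_scores))
best_combined, combined = select_proposal(cls_scores, rank_scores)
print(best_cls_only, best_combined)
```

In this toy case the classification score alone prefers the partially overlapping proposal, while the combined score prefers the well-aligned one, which is the kind of correction the ranking score is intended to provide.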