DP-Siam: Dynamic Policy Siamese Network for Robust Object Tracking

IEEE Trans Image Process. 2019 Sep 25. doi: 10.1109/TIP.2019.2942506. Online ahead of print.

Abstract

Balancing the trade-off between real-time performance and accuracy is a major challenge in object tracking. In this paper, a novel dynamic policy-gradient Agent-Environment architecture with a Siamese network (DP-Siam) is proposed to train the tracker to increase accuracy and expected average overlap while running in real time. DP-Siam is trained offline with reinforcement learning to produce a continuous action that predicts the optimal object location. DP-Siam has a novel architecture consisting of three networks: an Agent network that predicts the optimal state (bounding box) of the tracked object, an Environment network that estimates the Q-value during offline training so that the training loss can be minimized, and a Siamese network that produces a similarity heat-map. During online tracking, the Environment network acts as a verifier of the Agent network's action. Extensive experiments are performed on six widely used benchmarks: OTB2013, OTB50, OTB100, VOT2015, VOT2016, and VOT2018. The results show that DP-Siam significantly outperforms current state-of-the-art trackers.
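To make the three-network layout concrete, the following is a minimal PyTorch sketch of the structure the abstract describes: a shared-weight Siamese branch producing a heat-map via cross-correlation, an Agent head emitting a continuous bounding-box action, and an Environment head scoring that action with a Q-value. All layer sizes, class names (SiameseBranch, AgentNet, EnvironmentNet), and the feature backbone are illustrative assumptions, not the authors' implementation.

```python
# Sketch of the DP-Siam three-network layout; architecture details are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SiameseBranch(nn.Module):
    """Shared-weight feature extractor; correlates template and search
    features into a similarity heat-map."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(  # assumed small conv backbone
            nn.Conv2d(3, 64, 7, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, 5, stride=2), nn.ReLU(),
            nn.Conv2d(128, 256, 3), nn.ReLU(),
        )

    def forward(self, template, search):
        z = self.features(template)            # (B, C, hz, wz)
        x = self.features(search)              # (B, C, hx, wx)
        # Cross-correlate each search feature map with its own template
        # kernel using a grouped convolution.
        batch, channels = z.size(0), z.size(1)
        x = x.reshape(1, batch * channels, *x.shape[-2:])
        heatmap = F.conv2d(x, z.reshape(batch * channels, 1, *z.shape[-2:]),
                           groups=batch * channels)
        heatmap = heatmap.reshape(batch, channels, *heatmap.shape[-2:])
        return heatmap.sum(dim=1, keepdim=True)  # (B, 1, H, W) heat-map


class AgentNet(nn.Module):
    """Maps the heat-map to a continuous action: a bounding-box offset
    (dx, dy, dw, dh) relative to the previous state."""
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 4), nn.Tanh(),      # bounded continuous action
        )

    def forward(self, heatmap):
        return self.head(heatmap)


class EnvironmentNet(nn.Module):
    """Scores a (heat-map, action) pair with a Q-value; the critic during
    offline training and the verifier during online tracking."""
    def __init__(self):
        super().__init__()
        self.pool = nn.Sequential(nn.AdaptiveAvgPool2d(8), nn.Flatten())
        self.head = nn.Sequential(
            nn.Linear(64 + 4, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, heatmap, action):
        return self.head(torch.cat([self.pool(heatmap), action], dim=1))


if __name__ == "__main__":
    siamese, agent, env = SiameseBranch(), AgentNet(), EnvironmentNet()
    template = torch.randn(2, 3, 127, 127)     # exemplar crop
    search = torch.randn(2, 3, 255, 255)       # search-region crop
    hm = siamese(template, search)
    action = agent(hm)                          # predicted bbox offset
    q_value = env(hm, action)                   # critic's score of the action
    print(action.shape, q_value.shape)          # (2, 4) and (2, 1)
```

In this reading, the Q-value head serves double duty: during offline training it acts as a critic for the policy-gradient update, and during online tracking a low Q-value can flag an unreliable Agent prediction for correction.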