Deep Learning in Visual Tracking: A Review

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5497-5516. doi: 10.1109/TNNLS.2021.3136907. Epub 2023 Sep 1.

Abstract

Deep learning (DL) has made breakthroughs in many computer vision tasks, including visual tracking. Since the earliest research on automatically learning highly abstract feature representations, DL has reached into all aspects of tracking, such as similarity metrics, data association, and bounding box estimation. Moreover, after sustained research by the community, purely DL-based trackers have achieved state-of-the-art performance. We believe that it is time to comprehensively review the development of DL research in visual tracking. In this article, we review the critical improvements that DL has brought to the field: deep feature representations, network architectures, and four crucial issues in visual tracking (spatiotemporal information integration, target-specific classification, target information update, and bounding box estimation). For the first time, the survey of DL-based tracking covers both primary subtasks: single-object tracking and multiple-object tracking. We also analyze the performance of DL-based approaches and draw conclusions from that analysis. Finally, we outline several promising directions and tasks in visual tracking and related fields.