A practical evaluation of correlation filter-based object trackers with new features

PLoS One. 2022 Aug 25;17(8):e0273022. doi: 10.1371/journal.pone.0273022. eCollection 2022.

Abstract

Visual object tracking is a critical problem in the field of computer vision. Visual object trackers can be divided into Correlation Filter (CF)-based trackers and non-CF trackers. The main advantage of CF-based trackers is that they offer acceptable real-time tracking performance. In this article, we focus on CF-based trackers, due to their key role in online applications such as Unmanned Aerial Vehicles (UAVs), and make two contributions. First, we propose a set of new video sequences that address two issues not covered by the existing standard datasets. The first issue is addressed by two video sequences, showing the movement of an amoeba under the microscope, that are difficult for a human to track; these sequences include a new feature that combines the background clutter and occlusion features in a unique way, which we call hard-to-follow-by-human. The second issue is addressed by increasing the difficulty of the existing sequences through a larger displacement of the tracked object. Second, we present a thorough, practical evaluation of eight top-performing CF-based trackers on existing sequence features such as out-of-view, background clutters, and fast motion. The evaluation used the well-known OTB-2013 dataset as well as the proposed video sequences. The overall assessment of the eight trackers on the standard evaluation metrics, e.g., precision and success rates, revealed that the Large Displacement Estimation of Similarity transformation (LDES) tracker is the best CF-based tracker among those compared. In contrast, a deeper analysis of the results on the proposed video sequences shows only average performance for the LDES tracker relative to the other trackers. 
The eight trackers failed to capture the moving objects in every frame of the proposed Amoeba movement video sequences, whereas the same trackers managed to capture the object in almost every frame of the sequences of the standard dataset. These results highlight the need to improve CF-based object trackers so that they can process sequences with the proposed feature (i.e., hard-to-follow-by-human).
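
The precision and success rates mentioned above can be illustrated with a minimal sketch. It assumes bounding boxes given as (x, y, width, height) tuples, a 20-pixel center-error threshold for precision, and a 0.5 overlap threshold for success; these thresholds are conventional benchmark choices assumed here, not values stated in this abstract, and the function names are hypothetical.

```python
import math


def center_error(a, b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = a[0] + a[2] / 2.0, a[1] + a[3] / 2.0
    bx, by = b[0] + b[2] / 2.0, b[1] + b[3] / 2.0
    return math.hypot(ax - bx, ay - by)


def overlap(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # boxes do not intersect
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union


def precision_rate(predicted, ground_truth, threshold=20.0):
    """Fraction of frames whose center error is within `threshold` pixels.

    The 20-pixel default is an assumed, conventional value.
    """
    errors = [center_error(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(e <= threshold for e in errors) / len(errors)


def success_rate(predicted, ground_truth, threshold=0.5):
    """Fraction of frames whose overlap exceeds `threshold`.

    The 0.5 default is an assumed, conventional value.
    """
    overlaps = [overlap(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(o > threshold for o in overlaps) / len(overlaps)


# Toy three-frame example: exact hit, small drift, complete loss.
ground_truth = [(10, 10, 20, 20)] * 3
predicted = [(10, 10, 20, 20), (15, 10, 20, 20), (100, 100, 20, 20)]
print(precision_rate(predicted, ground_truth))
print(success_rate(predicted, ground_truth))
```

In this toy example the tracker stays on target in two of the three frames, so both rates come out to two thirds. Benchmarks typically sweep the thresholds and report curves rather than a single value.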

MeSH terms

  • Algorithms*
  • Computers*
  • Humans
  • Video Recording / methods

Grants and funding

The author(s) received no specific funding for this work.