High-Temporal-Resolution Object Detection and Tracking Using Images and Events

J Imaging. 2022 Jul 27;8(8):210. doi: 10.3390/jimaging8080210.

Abstract

Event-based vision is an emerging field of computer vision in which sensors generate data asynchronously, at high temporal resolution, and only in response to brightness changes. Combined with frame-based vision, these properties can enable robust object detection and tracking at high temporal resolution. In this paper, we present a hybrid, high-temporal-resolution object detection and tracking approach that combines learned and classical methods using synchronized images and event data. Off-the-shelf frame-based object detectors are used for initial object detection and classification. Event masks, generated per detection, then enable inter-frame tracking at varying temporal resolutions using the event data. Detections are associated across time using a simple, low-cost association metric. Moreover, we collect and label a traffic dataset using the hybrid sensor DAVIS 240c, providing ground truth bounding boxes and object IDs for each vehicle annotation. This dataset is used for quantitative evaluation with state-of-the-art detection and tracking metrics. Further, we generate high-temporal-resolution ground truth data to analyze tracking performance at different temporal rates. Our approach shows promising results, with minimal performance deterioration at higher temporal resolutions (48 to 384 Hz) compared with the baseline frame-based performance at 24 Hz.
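
As a rough illustration of the inter-frame step described above, the following Python sketch splits the interval between two 24 Hz frames into equal sub-windows at a higher target rate and, in each sub-window, shifts a detection box toward the events that fall inside a mask around it. This is a minimal sketch under stated assumptions, not the method of the paper: the function names, the (t, x, y) event layout, the `margin` and `min_events` parameters, and the median-based update rule are all hypothetical choices made for illustration.

```python
import numpy as np


def window_edges(t_start, frame_rate_hz, target_rate_hz):
    """Edges of the equal sub-windows that split one inter-frame interval."""
    if target_rate_hz % frame_rate_hz != 0:
        raise ValueError("target rate must be a multiple of the frame rate")
    n = target_rate_hz // frame_rate_hz  # e.g. 96 Hz / 24 Hz -> 4 sub-windows
    return t_start + np.arange(n + 1) / target_rate_hz


def shift_box(box, events_xy, margin=5.0, min_events=10):
    """Shift a box (x0, y0, x1, y1) so its center follows the masked events.

    Assumption: the event mask is the box grown by `margin` pixels, and the
    update is a pure translation to the median event position.
    """
    x0, y0, x1, y1 = box
    x, y = events_xy[:, 0], events_xy[:, 1]
    inside = (
        (x >= x0 - margin) & (x <= x1 + margin)
        & (y >= y0 - margin) & (y <= y1 + margin)
    )
    if np.count_nonzero(inside) < min_events:
        return box  # too few events to trust: keep the previous box
    dx = np.median(x[inside]) - (x0 + x1) / 2.0
    dy = np.median(y[inside]) - (y0 + y1) / 2.0
    return (x0 + dx, y0 + dy, x1 + dx, y1 + dy)


def track_between_frames(box, events, t_start,
                         frame_rate_hz=24, target_rate_hz=96):
    """Return one box per sub-window; `events` has columns (t, x, y)."""
    edges = window_edges(t_start, frame_rate_hz, target_rate_hz)
    boxes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_window = (events[:, 0] >= lo) & (events[:, 0] < hi)
        box = shift_box(box, events[in_window, 1:3])
        boxes.append(box)
    return boxes
```

With the defaults, one frame-based detection yields four intermediate boxes per inter-frame interval (96 Hz output from 24 Hz frames). Associating the resulting boxes with object IDs across time is a separate step that this sketch does not cover.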

Keywords: event-based vision; frame-based vision; high-temporal-resolution tracking; hybrid approach; object detection and tracking.

Grants and funding

This research received no external funding.