Adopting the YOLOv4 Architecture for Low-Latency Multispectral Pedestrian Detection in Autonomous Driving

Sensors (Basel). 2022 Jan 30;22(3):1082. doi: 10.3390/s22031082.

Abstract

Detecting pedestrians in autonomous driving is a safety-critical task, and the decision to avoid a a person has to be made with minimal latency. Multispectral approaches that combine RGB and thermal images are researched extensively, as they make it possible to gain robustness under varying illumination and weather conditions. State-of-the-art solutions employing deep neural networks offer high accuracy of pedestrian detection. However, the literature is short of works that evaluate multispectral pedestrian detection with respect to its feasibility in obstacle avoidance scenarios, taking into account the motion of the vehicle. Therefore, we investigated the real-time neural network detector architecture You Only Look Once, the latest version (YOLOv4), and demonstrate that this detector can be adapted to multispectral pedestrian detection. It can achieve accuracy on par with the state-of-the-art while being highly computationally efficient, thereby supporting low-latency decision making. The results achieved on the KAIST dataset were evaluated from the perspective of automotive applications, where low latency and a low number of false negatives are critical parameters. The middle fusion approach to YOLOv4 in its Tiny variant achieved the best accuracy to computational efficiency trade-off among the evaluated architectures.

Keywords: You Only Look Once; deep learning; multispectral fusion; pedestrian detection; real-time.

MeSH terms

  • Automobile Driving*
  • Humans
  • Lighting
  • Neural Networks, Computer
  • Pedestrians*
  • Weather