High-Profile VRU Detection on Resource-Constrained Hardware Using YOLOv3/v4 on BDD100K

J Imaging. 2020 Dec 19;6(12):142. doi: 10.3390/jimaging6120142.

Abstract

Vulnerable Road User (VRU) detection is a major application of object detection, aimed at reducing accidents through advanced driver-assistance systems and at enabling the development of autonomous vehicles. Because of the intrinsic complexity of computer vision and the limits on processing capacity and bandwidth, this task remains only partially solved. For these reasons, the well-established YOLOv3 network and the newer YOLOv4 are assessed by training them on a large, recent on-road image dataset (BDD100K), both for VRU classes only and for the full set of on-road classes. The resulting models show a large improvement in detection quality over the corresponding generic MS-COCO-trained models released by the YOLO authors, at a negligible cost in forward-pass time. In addition, some models were retrained after replacing the Leaky ReLU convolutional activation functions of the original YOLO implementation with two recent activation functions: the self-regularized non-monotonic function (Mish) and its self-gated counterpart (Swish). Both yielded significant improvements in detection performance over the original activation function. Finally, some trials combined recent data augmentation techniques (mosaic and CutMix) with several grid size configurations, giving cumulative improvements over the previous results and a range of performance-throughput trade-offs.
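For reference, the three activation functions compared above can be written in a few lines. This is a minimal scalar sketch and not the authors' implementation: the function names, the Leaky ReLU negative slope of 0.1, and the Swish parameter beta = 1 are assumptions chosen for illustration.

```python
import math


def leaky_relu(x, slope=0.1):
    """Leaky ReLU: identity for x > 0, small linear slope otherwise.

    The slope value 0.1 is an assumption made for this sketch.
    """
    return x if x > 0 else slope * x


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish: x * tanh(softplus(x)); smooth and non-monotonic."""
    return x * math.tanh(softplus(x))


def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x); beta = 1 is assumed here."""
    z = beta * x
    if z >= 0:
        sigmoid = 1.0 / (1.0 + math.exp(-z))
    else:
        # Equivalent form that avoids overflow for very negative z.
        e = math.exp(z)
        sigmoid = e / (1.0 + e)
    return x * sigmoid


if __name__ == "__main__":
    # Print the three functions side by side on a few sample inputs.
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(value, leaky_relu(value), mish(value), swish(value))
```

Unlike Leaky ReLU, both Mish and Swish are smooth and dip slightly below zero for moderately negative inputs before returning toward zero, which is the non-monotonic behavior referred to above.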

Keywords: advanced driver-assistance systems; artificial intelligence; convolutional neural networks; machine learning; on-road detection; one-stage detectors; resource-constrained hardware; vulnerable road users.

Grants and funding