Joint Object Detection and Re-Identification for 3D Obstacle Multi-Camera Systems

Irene Cortés; Jorge Beltrán; Arturo de la Escalera; Fernando García

doi:10.3390/s23239395

Sensors (Basel). 2023 Nov 25;23(23):9395. doi: 10.3390/s23239395.

Authors

Irene Cortés¹, Jorge Beltrán², Arturo de la Escalera¹, Fernando García¹

Affiliations

¹ Department of Systems Engineering and Automation, Universidad Carlos III de Madrid (UC3M), 28911 Madrid, Spain.
² Department of Signal Theory, Telematics, and Computer Science, Rey Juan Carlos University (URJC), 28922 Madrid, Spain.

Abstract

Growing on-board processing capabilities have led to more complex sensor configurations, enabling autonomous car prototypes to expand their operational scope. The joint use of LiDAR data and multiple cameras is now almost standard, and it poses new challenges for existing multi-modal perception pipelines, such as dealing with contradictory or redundant detections caused by inference on overlapping images. In this paper, we address this last issue in the context of sequential schemes like F-PointNets, where object candidates are obtained in the image space and the final 3D bounding box is then inferred from point cloud information. To this end, we propose adding a re-identification branch to the 2D detector, i.e., Faster R-CNN, so that objects seen from adjacent cameras can be handled before the 3D box estimation takes place, removing duplicates and completing the object's point cloud. Extensive experimental evaluations covering both the 2D and 3D domains confirm the effectiveness of the proposed methodology. The findings indicate that our approach outperforms conventional Non-Maximum Suppression (NMS) methods. In particular, we observed a significant gain of over 5% in terms of accuracy for cars in camera overlap regions. These results highlight the potential of our upgraded detection and re-identification system in practical scenarios for autonomous driving.
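
The abstract does not detail how detections from adjacent cameras are matched, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes each detection carries a re-identification embedding, a confidence score, and the set of LiDAR point indices falling inside its frustum. The function name merge_cross_camera, the dictionary keys, the cosine-similarity measure, the greedy matching, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def merge_cross_camera(dets_a, dets_b, sim_threshold=0.8):
    """Greedily merge detections of the same object seen by two cameras.

    Each detection is a dict with hypothetical keys:
      "score"     -> detector confidence (float)
      "embedding" -> re-identification feature (list of floats)
      "points"    -> set of LiDAR point indices inside the detection frustum
    """
    merged, used_b = [], set()
    for det_a in dets_a:
        # Similarity of det_a to every still-unmatched detection in camera B.
        candidates = [
            (cosine(det_a["embedding"], det_b["embedding"]), j)
            for j, det_b in enumerate(dets_b)
            if j not in used_b
        ]
        best_sim, best_j = max(candidates, default=(-1.0, -1))
        if best_sim >= sim_threshold:
            det_b = dets_b[best_j]
            used_b.add(best_j)
            # Keep the more confident detection, but pool the points from
            # both views so the merged object has a more complete cloud.
            keep = det_a if det_a["score"] >= det_b["score"] else det_b
            merged.append({**keep, "points": det_a["points"] | det_b["points"]})
        else:
            merged.append(det_a)
    # Detections seen only by camera B pass through unchanged.
    merged.extend(d for j, d in enumerate(dets_b) if j not in used_b)
    return merged


# Toy example: one object visible in both cameras, one only in the right one.
left = [{"score": 0.9, "embedding": [1.0, 0.0], "points": {1, 2, 3}}]
right = [
    {"score": 0.7, "embedding": [0.9, 0.1], "points": {3, 4}},
    {"score": 0.8, "embedding": [0.0, 1.0], "points": {10, 11}},
]
result = merge_cross_camera(left, right)
```

In this toy example the first left and right detections are treated as the same object and their point sets are pooled, while the unmatched right detection is kept as a separate object. Unlike overlap-based NMS, the match here depends on appearance embeddings rather than on box intersection, which is the contrast the abstract draws.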

Keywords: 3D object detection; Siamese network; multi-camera setup; non-maxima suppression.
