MDS-Net: Multi-Scale Depth Stratification 3D Object Detection from Monocular Images

Sensors (Basel). 2022 Aug 18;22(16):6197. doi: 10.3390/s22166197.

Abstract

Monocular 3D object detection is very challenging in autonomous driving due to the lack of depth information. This paper proposes a one-stage monocular 3D object detection network (MDS-Net), which uses an anchor-free method to detect 3D objects through per-pixel prediction. First, a novel depth-based stratification structure is developed to improve the network's depth prediction; it exploits the mathematical relationship, given by the pinhole model, between an object's depth and its size in the image. Second, a new angle loss function is developed to improve both the accuracy of angle prediction and the convergence speed of training. Finally, an optimized Soft-NMS is applied in the post-processing stage to adjust the confidence scores of the candidate boxes. Experimental results on the KITTI benchmark demonstrate that the proposed MDS-Net outperforms existing monocular 3D detection methods in both the 3D detection and BEV detection tasks while fulfilling real-time requirements.
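The size-depth relationship that the abstract attributes to the pinhole model can be illustrated with a minimal sketch. It is not the network's stratification structure, only the underlying geometry: an object of physical height H at depth Z projects to roughly h = f * H / Z pixels for a focal length f given in pixels. The function names and the numeric values below are illustrative assumptions, not taken from the paper.

```python
def pixel_height_at_depth(focal_px: float, real_height_m: float, depth_m: float) -> float:
    """Projected height in pixels under the ideal pinhole model: h = f * H / Z."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * real_height_m / depth_m


def depth_from_pixel_height(focal_px: float, real_height_m: float, pixel_height: float) -> float:
    """Inverse relation: Z = f * H / h."""
    if pixel_height <= 0:
        raise ValueError("pixel height must be positive")
    return focal_px * real_height_m / pixel_height


# Hypothetical values: focal length of 720 px, object 1.5 m tall.
FOCAL_PX = 720.0
OBJECT_HEIGHT_M = 1.5

near = pixel_height_at_depth(FOCAL_PX, OBJECT_HEIGHT_M, 20.0)  # 54.0 px
far = pixel_height_at_depth(FOCAL_PX, OBJECT_HEIGHT_M, 40.0)   # 27.0 px
recovered = depth_from_pixel_height(FOCAL_PX, OBJECT_HEIGHT_M, near)  # 20.0 m
```

Doubling the depth halves the projected size, which is why objects at different depths tend to occupy different image scales.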

Keywords: 3D object detection; autonomous driving; computer vision; monocular image.

Grants and funding

This work is supported in part by the Leading Innovative and Entrepreneur Team Introduction Program of Zhejiang under Grant 2018R01001, and in part by the Fundamental Research Funds for the Central Universities under Grant 226202200096.