MonoDCN: Monocular 3D object detection based on dynamic convolution

PLoS One. 2022 Oct 4;17(10):e0275438. doi: 10.1371/journal.pone.0275438. eCollection 2022.

Abstract

3D object detection is vital to environment perception in autonomous driving. Current monocular 3D object detection methods mainly take either RGB images or pseudo-radar point clouds as input. Methods that take RGB images as input must learn under geometric constraints and ignore the depth information in the image, which makes them overly complicated and inefficient. Although some image-based methods use depth-map information for post-calibration and correction, they usually require a high-precision depth estimation network. Methods that take pseudo-radar point clouds as input easily introduce noise when converting depth information into the point cloud, which causes large deviations in detection, and they also ignore semantic information. We introduce dynamic convolution guided by the depth map into the feature extraction network; the convolution kernels of the dynamic convolution are learned automatically from the depth map of the image. This allows depth information and semantic information to be used simultaneously and improves the accuracy of monocular 3D object detection. MonoDCN significantly improves performance on both the monocular 3D object detection and Bird's Eye View tasks of the KITTI urban autonomous driving dataset.
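
The abstract does not specify the network architecture, so the following is only a minimal NumPy sketch of the general idea of depth-guided dynamic convolution: a filter is generated per pixel from the local depth patch and then applied to the image features at that pixel. The function name `depth_guided_dynamic_conv`, the linear `kernel_generator` matrix (a stand-in for a learned sub-network), and the softmax normalization of the generated weights are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def depth_guided_dynamic_conv(features, depth, kernel_generator, k=3):
    """Filter each feature channel with a per-pixel k x k kernel derived from depth.

    features:         (C, H, W) array of image features.
    depth:            (H, W) depth map aligned with the features.
    kernel_generator: (k*k, k*k) matrix mapping a flattened depth patch to
                      filter logits (hypothetical stand-in for a learned network).
    """
    C, H, W = features.shape
    pad = k // 2
    # Replicate border values so every pixel has a full k x k neighbourhood.
    feat_p = np.pad(features, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
    depth_p = np.pad(depth, pad, mode="edge")
    out = np.empty_like(features, dtype=float)

    for i in range(H):
        for j in range(W):
            # 1. Generate the filter for this pixel from its depth patch.
            d_patch = depth_p[i:i + k, j:j + k].reshape(-1)
            logits = kernel_generator @ d_patch
            # Softmax (an assumption) keeps the weights positive and summing to 1.
            w = np.exp(logits - logits.max())
            w /= w.sum()
            # 2. Apply the same generated filter to every feature channel.
            f_patch = feat_p[:, i:i + k, j:j + k].reshape(C, -1)
            out[:, i, j] = f_patch @ w
    return out


# Usage with random stand-in data (not KITTI data).
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 6, 8))          # C=4 feature channels
depth = rng.uniform(1.0, 50.0, size=(6, 8))    # depth values in arbitrary units
generator = rng.normal(scale=0.01, size=(9, 9))
fused = depth_guided_dynamic_conv(features, depth, generator)
print(fused.shape)
```

Unlike an ordinary convolution, whose kernel is shared across all positions, the kernel here changes from pixel to pixel according to the depth map, which is how depth cues can steer the extraction of semantic features from the RGB branch. A practical implementation would vectorize the loops and train the kernel generator end to end.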

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Learning
  • Radar
  • Semantics*

Grants and funding

This work was supported in part by the Henan Science and Technology Development Plan Project under Grants 212102210538 and 222102210101, in part by the Henan Key Science and Technology Project under Grant 201300210400, in part by the Postgraduate Education Innovation and Quality Improvement Project of Henan University under Grant SYL20040121, and in part by the FDCT postdoctoral fund (0003/2021/APD). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.