Distance Transform Pooling Neural Network for LiDAR Depth Completion

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5580-5589. doi: 10.1109/TNNLS.2021.3129801. Epub 2023 Sep 1.

Abstract

Recovering dense depth maps from sparse depth sensors, such as LiDAR, is a recently proposed task with many computer vision and robotics applications. Previous works have identified input sparsity as the key challenge of this task. To address the sparsity challenge, we propose a recurrent distance transform pooling (DTP) module that aggregates multi-level nearby information in front of the backbone neural network. The intuition behind this module originates from the observation that most pixels within the receptive field of the network are zero. Consequently, a deep and heavy network structure must be used to enlarge the receptive field so as to capture enough useful information, since most of the processed signals are uninformative zeros. Our recurrent DTP module fills empty pixels with the nearest value in a local patch and recurrently applies the distance transform to reach farther nearest points. The output of the proposed DTP module is a collection of multi-level semi-dense depth maps, ranging from the original sparse map to an almost full one. Processing this collection of semi-dense depth maps relieves the network of the input sparsity, which helps a lightweight, simplified ResNet-18 with 1M parameters achieve state-of-the-art performance on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) depth completion benchmark with LiDAR only. Besides sparsity, the input LiDAR map also contains some incorrect values due to sensor error. Thus, we further enhance the DTP with an error correction (EC) module to prevent the propagation of incorrect input values. Finally, we discuss the benefit of using only LiDAR for nighttime driving and potential extensions of the proposed method to sensor fusion and indoor scenarios. The code has been released online at https://github.com/placeforyiming/DistanceTransform-DepthCompletion.
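
To make the recurrent fill-and-grow behavior concrete, the following is a minimal NumPy/SciPy sketch of the idea described in the abstract. The function name dtp_levels, the four-level default, and the 3x3 patch radius are illustrative assumptions, not the authors' released implementation (see the linked repository for that); the sketch reproduces only the multi-level densification, whereas the paper realizes it as a pooling module in front of the network.

    import numpy as np
    from scipy import ndimage

    def dtp_levels(sparse_depth, num_levels=4, patch_radius=1.5):
        # Hypothetical sketch: recurrently fill empty pixels with the value
        # of the nearest valid pixel inside a small local patch, collecting
        # one progressively denser map per level.
        levels = [sparse_depth]
        depth = sparse_depth.copy()
        for _ in range(num_levels):
            empty = depth == 0
            # The Euclidean distance transform gives, for every empty pixel,
            # the distance to and the indices of its nearest valid pixel.
            dist, inds = ndimage.distance_transform_edt(empty, return_indices=True)
            iy, ix = inds
            nearest = depth[iy, ix]
            # Only fill pixels whose nearest valid neighbor lies within the
            # local patch (radius 1.5 covers a 3x3 neighborhood), so each
            # recurrence extends the reach by roughly one pixel.
            fill = empty & (dist <= patch_radius)
            depth = depth.copy()
            depth[fill] = nearest[fill]
            levels.append(depth)
        return levels

Applied to a KITTI-style sparse map of shape (H, W), this returns num_levels + 1 maps, from the raw sparse input to progressively denser fills, mirroring the "recurrently transform distance to reach farther nearest points" description; in practice the number of levels would be chosen so that the last map is almost full.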