Nested DWT-Based CNN Architecture for Monocular Depth Estimation

Sandip Paul; Deepak Mishra; Senthil Kumar Marimuthu

doi:10.3390/s23063066

Sensors (Basel). 2023 Mar 13;23(6):3066. doi: 10.3390/s23063066.

Authors

Sandip Paul¹,², Deepak Mishra¹, Senthil Kumar Marimuthu²

Affiliations

¹ Indian Institute of Space Science and Technology, Trivandrum 695547, Kerala, India.
² Space Applications Centre, Ahmedabad 380016, Gujarat, India.

Abstract

Applications such as medical diagnosis, navigation, and robotics require 3D images. Recently, deep learning networks have been extensively applied to estimate depth. Depth prediction from 2D images is a problem that is both ill-posed and non-linear. Such networks are costly in both computation and time because they have dense configurations. Further, network performance depends on the trained model configuration, the loss functions used, and the dataset used for training. We propose a moderately dense encoder-decoder network based on discrete wavelet decomposition and trainable coefficients (LL, LH, HL, HH). Our Nested Wavelet-Net (NDWTN) preserves the high-frequency information that is otherwise lost during downsampling in the encoder. Furthermore, we study the effect of activation functions, batch normalization, convolution layers, skip connections, etc., in our models. The network is trained on the NYU datasets. Our network trains faster with good results.
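To illustrate the four sub-bands named in the abstract, the following is a minimal sketch of a single-level 2D discrete wavelet decomposition and its inverse. It is not the authors' implementation: the Haar wavelet, the fixed (non-trainable) filters, the function names haar_dwt2 and haar_idwt2, and the LH/HL naming convention are all assumptions made for illustration. It shows that each sub-band has half the resolution of the input and that, taken together, the four sub-bands allow exact reconstruction, which is the sense in which high-frequency detail survives downsampling.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT (illustrative; wavelet choice is an assumption).

    Returns four half-resolution sub-bands (LL, LH, HL, HH).
    LH/HL naming conventions differ between libraries.
    """
    x = np.asarray(x, dtype=float)
    if x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("height and width must be even")
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average (coarse approximation)
    lh = (a + b - c - d) / 2.0  # difference between top and bottom rows
    hl = (a - b + c - d) / 2.0  # difference between left and right columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


if __name__ == "__main__":
    image = np.random.default_rng(0).random((8, 8))
    ll, lh, hl, hh = haar_dwt2(image)
    restored = haar_idwt2(ll, lh, hl, hh)
    print(ll.shape, np.allclose(restored, image))
```

Keeping only LL would behave like an average-pooling downsample (up to a scale factor); carrying LH, HL, and HH alongside it is what makes the downsampling step invertible in this sketch.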

Keywords: depth map; discrete wavelets; evaluation; loss function; nested wavelet net; training.

Grants and funding

This research received no external funding.