Optimizing 3D Convolution Kernels on Stereo Matching for Resource Efficient Computations

Sensors (Basel). 2021 Oct 13;21(20):6808. doi: 10.3390/s21206808.

Abstract

Although recent stereo matching algorithms achieve strong results on public benchmarks, their heavy computational cost remains an open problem. Most works address it by designing new architectures with lower computational complexity; we instead optimize the 3D convolution kernels of the Pyramid Stereo Matching Network (PSMNet). In this paper, we design a series of comparative experiments exploring the performance of well-known convolution kernels on PSMNet. Our model reduces the computational cost from 256.66 G MAdd (multiply-add operations) to 69.03 G MAdd (from 198.47 G MAdd to 10.84 G MAdd when counting only the 3D convolutional networks) without losing accuracy. On the Scene Flow and KITTI 2015 datasets, our model achieves results comparable to the state of the art at a low computational cost.
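To make the MAdd figures above concrete, the following minimal sketch counts multiply-adds for a single 3D convolution layer and for a depthwise-separable replacement, then derives the relative savings implied by the reported totals. The abstract does not say which kernels were selected, so the depthwise-separable kernel, the layer shape, and all function names here are illustrative assumptions, not the authors' configuration.

```python
def conv3d_madds(c_in, c_out, k, d, h, w):
    """MAdds of a standard 3D convolution (stride 1, 'same' padding, no bias)."""
    return c_in * c_out * k**3 * d * h * w


def separable_conv3d_madds(c_in, c_out, k, d, h, w):
    """MAdds of a depthwise k*k*k convolution followed by a 1*1*1 pointwise one."""
    depthwise = c_in * k**3 * d * h * w
    pointwise = c_in * c_out * d * h * w
    return depthwise + pointwise


def reduction(before, after):
    """Fraction of the cost removed when going from `before` to `after`."""
    return 1.0 - after / before


# Hypothetical cost-volume layer: 32 channels, 3x3x3 kernel, 48 x 64 x 128 volume.
C_IN, C_OUT, K = 32, 32, 3
D, H, W = 48, 64, 128

standard = conv3d_madds(C_IN, C_OUT, K, D, H, W)
separable = separable_conv3d_madds(C_IN, C_OUT, K, D, H, W)
ratio = separable / standard  # equals 1/C_OUT + 1/K**3 for this layer shape

# Totals reported in the abstract, in G MAdd.
total_saving = reduction(256.66, 69.03)   # whole network
conv3d_saving = reduction(198.47, 10.84)  # 3D convolutional part only

print(f"separable / standard MAdds for the hypothetical layer: {ratio:.4f}")
print(f"whole-network saving: {total_saving:.1%}")
print(f"3D-convolution saving: {conv3d_saving:.1%}")
```

Both reported pairs differ by the same 187.63 G MAdd, which is consistent with the savings coming entirely from the 3D convolutional part while the remaining 58.19 G MAdd of the network is unchanged.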

Keywords: 3D channel-wise attention; 3D vision; lightweight 3D kernels; network design; stereo matching.

MeSH terms

  • Algorithms*
  • Benchmarking
  • Image Processing, Computer-Assisted*
  • Neural Networks, Computer