Is Context-Aware CNN Ready for the Surroundings? Panoramic Semantic Segmentation in the Wild

IEEE Trans Image Process. 2021:30:1866-1881. doi: 10.1109/TIP.2020.3048682. Epub 2021 Jan 18.

Abstract

Semantic segmentation, unifying most navigational perception tasks at the pixel level has catalyzed striking progress in the field of autonomous transportation. Modern Convolution Neural Networks (CNNs) are able to perform semantic segmentation both efficiently and accurately, particularly owing to their exploitation of wide context information. However, most segmentation CNNs are benchmarked against pinhole images with limited Field of View (FoV). Despite the growing popularity of panoramic cameras to sense the surroundings, semantic segmenters have not been comprehensively evaluated on omnidirectional wide-FoV data, which features rich and distinct contextual information. In this paper, we propose a concurrent horizontal and vertical attention module to leverage width-wise and height-wise contextual priors markedly available in the panoramas. To yield semantic segmenters suitable for wide-FoV images, we present a multi-source omni-supervised learning scheme with panoramic domain covered in the training via data distillation. To facilitate the evaluation of contemporary CNNs in panoramic imagery, we put forward the Wild PAnoramic Semantic Segmentation (WildPASS) dataset, comprising images from all around the globe, as well as adverse and unconstrained scenes, which further reflects perception challenges of navigation applications in the real world. A comprehensive variety of experiments demonstrates that the proposed methods enable our high-efficiency architecture to attain significant accuracy gains, outperforming the state of the art in panoramic imagery domains.

MeSH terms

  • Image Processing, Computer-Assisted / methods*
  • Neural Networks, Computer*
  • Semantics*