Real-Time Semantic Segmentation with Dual Encoder and Self-Attention Mechanism for Autonomous Driving

Sensors (Basel). 2021 Dec 2;21(23):8072. doi: 10.3390/s21238072.

Abstract

As the techniques of autonomous driving become increasingly valued and universal, real-time semantic segmentation has become very popular and challenging in the field of deep learning and computer vision in recent years. However, in order to apply the deep learning model to edge devices accompanying sensors on vehicles, we need to design a structure that has the best trade-off between accuracy and inference time. In previous works, several methods sacrificed accuracy to obtain a faster inference time, while others aimed to find the best accuracy under the condition of real time. Nevertheless, the accuracies of previous real-time semantic segmentation methods still have a large gap compared to general semantic segmentation methods. As a result, we propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieved a 78.6% mIoU with a speed of 39.4 FPS with a 1024 × 2048 resolution on a Cityscapes test submission.

Keywords: autonomous driving; convolution neural network; deep learning; edge devices; image recognition; real-time semantic segmentation.

MeSH terms

  • Automobile Driving*
  • Image Processing, Computer-Assisted
  • Neural Networks, Computer*
  • Semantics