EDSSA: An Encoder-Decoder Semantic Segmentation Networks Accelerator on OpenCL-Based FPGA Platform

Hongzhi Huang; Yakun Wu; Mengqi Yu; Xuesong Shi; Fei Qiao; Li Luo; Qi Wei; Xinjun Liu

doi:10.3390/s20143969

Sensors (Basel). 2020 Jul 17;20(14):3969. doi: 10.3390/s20143969.

Authors

Hongzhi Huang¹, Yakun Wu¹, Mengqi Yu², Xuesong Shi³, Fei Qiao², Li Luo¹, Qi Wei⁴, Xinjun Liu⁵

Affiliations

¹ School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China.
² Department of Electronic Engineering and BNRist, Tsinghua University, Beijing 100084, China.
³ Intel Labs China, Beijing 100090, China.
⁴ Department of Precision Instrument, Tsinghua University, Beijing 100084, China.
⁵ Department of Mechanical Engineering, Tsinghua University, Beijing 100084, China.

Abstract

Visual semantic segmentation, typically performed by semantic segmentation networks, is widely used in fields such as intelligent robots, security, and autonomous driving. However, these Convolutional Neural Network (CNN)-based networks place high demands on the computing resources and programmability of hardware platforms. For embedded platforms and terminal devices in particular, Graphics Processing Unit (GPU)-based computing platforms cannot meet these requirements in terms of size and power consumption. In contrast, a Field Programmable Gate Array (FPGA)-based hardware system not only offers flexible programmability and high embeddability, but can also meet lower power consumption requirements, which makes it an appropriate solution for semantic segmentation on terminal devices. In this paper, we present EDSSA, an Encoder-Decoder semantic segmentation network accelerator architecture that can be implemented with flexible parameter configurations and hardware resources on FPGA platforms that support Open Computing Language (OpenCL) development. Using the Encoder-Decoder semantic segmentation network SegNet as an example, we introduce the related technologies, architecture design, algorithm optimization, and hardware implementation, and carry out a performance evaluation. Evaluated on an Intel Arria-10 GX1150 platform, our work achieves a throughput higher than 432.8 GOP/s with a power consumption of about 20 W, which is a 1.2× improvement in energy-efficiency ratio compared to a high-performance GPU.
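
The energy-efficiency figures quoted above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the energy-efficiency ratio is throughput divided by power, treats the 432.8 GOP/s lower bound and the approximate 20 W as exact values, and back-calculates a GPU baseline that the abstract itself does not report. All variable names are hypothetical.

```python
# Figures quoted in the abstract. Throughput is a stated lower bound and
# power is approximate, so the results below are rough estimates.
throughput_gops = 432.8   # GOP/s on the Intel Arria-10 GX1150
power_w = 20.0            # W
improvement = 1.2         # reported energy-efficiency gain over the GPU

# Assumption: energy-efficiency ratio = throughput / power.
fpga_eff = throughput_gops / power_w

# Back-calculated GPU baseline; implied by the 1.2x claim, not a reported number.
implied_gpu_eff = fpga_eff / improvement

print(f"FPGA energy efficiency: {fpga_eff:.2f} GOP/s/W")
print(f"Implied GPU baseline:   {implied_gpu_eff:.2f} GOP/s/W")
```

Under these assumptions the accelerator delivers roughly 21.6 GOP/s/W, which would put the implied GPU baseline at about 18 GOP/s/W.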

Keywords: FPGA; OpenCL; framework; semantic segmentation.