DetectFormer: Category-Assisted Transformer for Traffic Scene Object Detection

Tianjiao Liang; Hong Bao; Weiguo Pan; Xinyue Fan; Han Li

doi:10.3390/s22134833

Sensors (Basel). 2022 Jun 26;22(13):4833. doi: 10.3390/s22134833.

Authors

Tianjiao Liang^{1,2}, Hong Bao^{1,2}, Weiguo Pan^{1,2}, Xinyue Fan^{1,2}, Han Li^{1,2}

Affiliations

¹ Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China.
² College of Robotics, Beijing Union University, Beijing 100101, China.

Abstract

Object detection plays a vital role in autonomous driving systems, and the accurate detection of surrounding objects helps ensure that vehicles drive safely. This paper proposes DetectFormer, a category-assisted transformer object detector for autonomous driving, which achieves better accuracy than the baseline. Specifically, the ClassDecoder is assisted by proposal categories and by global information from the Global Extract Encoder (GEE) to improve category sensitivity and detection performance. This design fits the distribution of object categories in specific scene backgrounds and the connection between objects and the image context. Data augmentation is used to improve robustness, and an attention mechanism is added to the backbone network to extract channel-wise spatial features and direction information. Benchmark experiments show that the proposed method achieves higher real-time detection performance in traffic scenes than RetinaNet and FCOS. The proposed method achieved 97.6% AP50 and 91.4% AP75 on the BCTSDB dataset.
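
For readers unfamiliar with the AP50 and AP75 figures quoted above: under the common COCO-style convention, these are average precision values computed with intersection-over-union (IoU) thresholds of 0.5 and 0.75 when matching a predicted box to a ground-truth box. The sketch below is illustrative only and is not taken from the paper. The function names `iou` and `is_true_positive`, the `(x1, y1, x2, y2)` box layout, and the example boxes are all assumptions made for this example. It shows how one prediction can count as a correct match at the looser threshold yet fail the stricter one.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so that disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold):
    """Hypothetical helper: a prediction matches if its IoU meets the threshold."""
    return iou(pred_box, gt_box) >= threshold


# Made-up example boxes: the prediction is shifted 2 units to the right.
gt = (0, 0, 10, 10)
pred = (2, 0, 12, 10)

match_at_50 = is_true_positive(pred, gt, 0.50)  # threshold used for AP50
match_at_75 = is_true_positive(pred, gt, 0.75)  # threshold used for AP75
```

Here the overlap is 80 square units against a union of 120, so the IoU is about 0.67: the prediction is accepted at the 0.5 threshold and rejected at 0.75. This is why AP75 is the stricter localization measure and is typically lower than AP50, as in the reported results.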

Keywords: autonomous driving; deep learning; object detection; transformer.

Grants and funding

61802019, 61932012, 61871039, 61906017, 62006020/National Natural Science Foundation of China