Traffic Signal Control Using Hybrid Action Space Deep Reinforcement Learning

Salah Bouktif; Abderraouf Cheniki; Ali Ouni

doi:10.3390/s21072302

Sensors (Basel). 2021 Mar 25;21(7):2302. doi: 10.3390/s21072302.

Authors

Salah Bouktif¹, Abderraouf Cheniki², Ali Ouni³

Affiliations

¹ Department of Computer Science and Software Engineering, University of United Arab Emirates, Al Ain 15551, Abu Dhabi, United Arab Emirates.
² Department of Electrical Engineering, University of Boumerdes, Boumerdès 35000, Algeria.
³ École de Technologie Supérieure, University of Quebec, Montreal, QC H3C 1K3, Canada.

Abstract

Recent research on intelligent traffic signal control (TSC) has mainly focused on leveraging deep reinforcement learning (DRL) because of its proven capability and performance. DRL-based traffic signal control frameworks use either discrete or continuous control. In discrete control, the DRL agent selects the appropriate traffic light phase from a finite set of phases, whereas in continuous control, the agent decides the appropriate duration for each signal phase within a predetermined sequence of phases. Among existing works, no prior approach proposes a flexible framework that combines discrete and continuous DRL approaches for controlling traffic signals. Our objective in this paper is therefore to propose an approach capable of simultaneously deciding the proper phase and its associated duration. Our contribution resides in adapting a hybrid deep reinforcement learning method that considers discrete and continuous decisions at the same time. Specifically, we customize a Parameterized Deep Q-Network (P-DQN) architecture that permits a hierarchical decision-making process that first decides the next traffic light phase and then specifies its associated timing. The evaluation of our approach using Simulation of Urban MObility (SUMO) shows that it outperforms the benchmarks. The proposed framework reduces the average queue length of vehicles and the average travel time by 22.20% and 5.78%, respectively, compared with alternative DRL-based TSC systems.

Keywords: P-DQN; hybrid action space; parameterized deep reinforcement learning; traffic optimization; traffic signal control.
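
The hybrid decision described in the abstract (a discrete phase plus a continuous duration) can be illustrated with a minimal sketch of action selection in a generic P-DQN setup. This is not the authors' implementation: the number of phases, the state size, the duration bounds, and the function names are all hypothetical, and untrained random linear layers stand in for the two networks. In the generic P-DQN formulation, a parameter network proposes one continuous parameter per discrete action, and a Q-network scores each discrete action given those parameters; the sketch assumes that arrangement.

```python
import numpy as np

# All constants below are hypothetical and chosen only for illustration.
N_PHASES = 4                       # number of candidate signal phases
STATE_DIM = 8                      # e.g., one queue length per incoming lane
MIN_GREEN, MAX_GREEN = 5.0, 60.0   # duration bounds in seconds

rng = np.random.default_rng(0)
# Untrained random linear layers stand in for the two networks.
W_param = rng.normal(size=(STATE_DIM, N_PHASES))
W_q = rng.normal(size=(STATE_DIM + N_PHASES, N_PHASES))


def propose_durations(state):
    """Parameter network: one candidate green duration per phase."""
    squashed = 1.0 / (1.0 + np.exp(-(state @ W_param)))  # sigmoid output
    return MIN_GREEN + squashed * (MAX_GREEN - MIN_GREEN)


def q_values(state, durations):
    """Q-network: one value per phase, given the state and all durations."""
    scaled = (durations - MIN_GREEN) / (MAX_GREEN - MIN_GREEN)
    return np.concatenate([state, scaled]) @ W_q


def select_action(state, epsilon=0.0):
    """Return (phase index, duration in seconds) as one hybrid action."""
    durations = propose_durations(state)
    if rng.random() < epsilon:
        phase = int(rng.integers(N_PHASES))                 # explore
    else:
        phase = int(np.argmax(q_values(state, durations)))  # exploit
    return phase, float(durations[phase])


state = rng.random(STATE_DIM)
phase, duration = select_action(state)
print(f"next phase: {phase}, green time: {duration:.1f} s")
```

In a trained agent, the two weight matrices would be replaced by learned networks, and the returned pair would be applied to the simulated intersection before the next state is observed.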
