Adaptive Cruise Control Based on Safe Deep Reinforcement Learning

Sensors (Basel). 2024 Apr 22;24(8):2657. doi: 10.3390/s24082657.

Abstract

Adaptive cruise control (ACC) enables efficient, safe, and intelligent vehicle control by autonomously adjusting speed and maintaining a safe following distance from the preceding vehicle. This paper proposes a novel adaptive cruise system, the Safety-First Reinforcement Learning Adaptive Cruise Control (SFRL-ACC). The system leverages the model-free nature and high real-time inference efficiency of Deep Reinforcement Learning (DRL) to overcome the modeling difficulties and limited computational efficiency of current optimization-based ACC methods, while preserving their safety advantages and optimizing ride comfort. Firstly, we cast the ACC problem as a safe DRL formulation, a Constrained Markov Decision Process (CMDP), by carefully designing the state, action, reward, and cost functions. Subsequently, we propose SFRL-ACC, an ACC algorithm based on Projected Constrained Policy Optimization (PCPO) and specifically tailored to solve this CMDP. PCPO incorporates safety constraints that further restrict the trust region formed by the Kullback-Leibler (KL) divergence, enabling DRL policy updates that maximize performance while keeping safety costs within their prescribed bounds. Finally, we train an SFRL-ACC policy and compare its computation time, traffic efficiency, ride comfort, and safety with state-of-the-art MPC-based ACC control methods. The experimental results demonstrate the superiority of the proposed method in the aforementioned performance aspects.
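For orientation, the sketch below states the generic CMDP objective and the two-step PCPO update referred to above (a reward-improvement step inside a KL-divergence trust region, followed by a projection onto the linearized cost constraint). The notation is generic rather than the paper's own: g and a denote the gradients of the reward and cost objectives, H the Fisher information matrix approximating the KL divergence, delta the trust-region radius, and d the cost limit.

% CMDP: maximize the expected discounted return subject to a bound d on the
% expected discounted safety cost.
\begin{aligned}
\max_{\pi_\theta}\ & J_R(\pi_\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\Big[\textstyle\sum_{t=0}^{\infty} \gamma^t\, r(s_t, a_t)\Big] \\
\text{s.t.}\ & J_C(\pi_\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\Big[\textstyle\sum_{t=0}^{\infty} \gamma^t\, c(s_t, a_t)\Big] \le d
\end{aligned}

% PCPO step 1: reward improvement within the KL trust region of radius delta.
\theta^{k+\frac{1}{2}} = \arg\max_{\theta}\ g^{\top}(\theta - \theta^{k})
\quad \text{s.t.}\quad \tfrac{1}{2}\,(\theta - \theta^{k})^{\top} H\, (\theta - \theta^{k}) \le \delta

% PCPO step 2: project the intermediate policy onto the linearized cost-constraint
% set, with distance measured in the KL-induced metric H.
\theta^{k+1} = \arg\min_{\theta}\ \tfrac{1}{2}\,(\theta - \theta^{k+\frac{1}{2}})^{\top} H\, (\theta - \theta^{k+\frac{1}{2}})
\quad \text{s.t.}\quad a^{\top}(\theta - \theta^{k}) + \big(J_C(\pi_{\theta^{k}}) - d\big) \le 0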

Keywords: adaptive cruise control; autonomous driving; deep reinforcement learning; projected constrained policy optimization; safety aware.