Reinforcement Learning-Based Approach for Minimizing Energy Loss of Driving Platoon Decisions

Zhiru Gu; Zhongwei Liu; Qi Wang; Qiyun Mao; Zhikang Shuai; Ziji Ma

doi:10.3390/s23084176

Sensors (Basel). 2023 Apr 21;23(8):4176. doi: 10.3390/s23084176.

Authors

Zhiru Gu¹, Zhongwei Liu¹, Qi Wang², Qiyun Mao¹, Zhikang Shuai², Ziji Ma²

Affiliations

¹ College of Railway Transportation, Hunan University of Technology, Zhuzhou 412007, China.
² College of Electrical and Information Engineering, Hunan University, Changsha 410082, China.

Abstract

Reinforcement learning (RL) methods aimed at energy saving and greener operation have recently appeared in the field of autonomous driving. In inter-vehicle communication (IVC), a feasible and increasingly popular research direction for RL is to obtain the optimal action decisions of agents in a specific environment. This paper presents an application of reinforcement learning in the vehicle communication simulation framework (Veins). We explore the use of reinforcement learning algorithms in a green cooperative adaptive cruise control (CACC) platoon, with the aim of training member vehicles to react appropriately when the leading vehicle is involved in a severe collision. We seek to reduce collision damage and optimize energy consumption by encouraging behavior that conforms to the platoon's environmentally friendly aim. The study provides insight into the potential benefits of using reinforcement learning algorithms to improve the safety and efficiency of CACC platoons while promoting sustainable transportation. The policy gradient algorithm used in this paper converges well both when computing the minimum energy consumption and when solving for the optimal vehicle behavior. With respect to energy consumption metrics, this is the first use of the policy gradient algorithm in the IVC field for training on the proposed platoon problem. It is a feasible decision-planning training algorithm for minimizing the energy consumed by decision making in platoon avoidance behavior.

Keywords: carbon emissions; green driving; green eco; platoon; policy gradient; reinforcement learning.
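
Illustrative sketch (not from the paper): the abstract names the policy gradient algorithm but does not give its form. The Python below is a minimal REINFORCE-style example on a one-step toy problem, assuming a follower vehicle chooses one of three avoidance actions and receives the negative of an energy cost as reward. The action names, cost values, learning rate, and baseline update are hypothetical choices made for illustration only; they are not the authors' Veins setup or results.

```python
import math
import random

# Hypothetical energy cost (arbitrary units) of each avoidance action.
# These values are illustrative assumptions, not data from the paper.
ENERGY_COST = {"hard_brake": 5.0, "moderate_brake": 2.0, "lane_change": 3.5}
ACTIONS = list(ENERGY_COST)


def softmax(prefs):
    """Convert action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=3000, lr=0.1, seed=0):
    """REINFORCE on a one-step problem with a running-average baseline."""
    rng = random.Random(seed)
    prefs = [0.0] * len(ACTIONS)  # policy parameters, one per action
    baseline = 0.0                # running estimate of the average reward
    for _ in range(episodes):
        probs = softmax(prefs)
        a = rng.choices(range(len(ACTIONS)), weights=probs)[0]
        # Reward: negative energy cost plus a little observation noise.
        reward = -ENERGY_COST[ACTIONS[a]] + rng.gauss(0.0, 0.1)
        advantage = reward - baseline
        # For a softmax policy, d log pi(a) / d prefs[i] = 1[i == a] - probs[i].
        for i in range(len(prefs)):
            grad_log = (1.0 if i == a else 0.0) - probs[i]
            prefs[i] += lr * advantage * grad_log
        baseline += 0.05 * (reward - baseline)
    return softmax(prefs)


if __name__ == "__main__":
    for name, p in zip(ACTIONS, train()):
        print(f"{name}: {p:.3f}")
```

In this toy setting the learned policy should concentrate on the action with the lowest assumed energy cost. A platoon simulation would replace the fixed cost table with rewards computed from simulated vehicle dynamics over multi-step episodes.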
