Learning-Based Adaptive Optimal Control for Connected Vehicles in Mixed Traffic: Robustness to Driver Reaction Time

Mengzhe Huang; Zhong-Ping Jiang; Kaan Ozbay

doi:10.1109/TCYB.2020.3029077

IEEE Trans Cybern. 2022 Jun;52(6):5267-5277. doi: 10.1109/TCYB.2020.3029077. Epub 2022 Jun 16.

PMID: 33170792

Abstract

Through vehicle-to-vehicle (V2V) communication, both human-driven and autonomous vehicles can actively exchange data, such as velocities and bumper-to-bumper distances. Using the shared data, control laws with improved performance can be designed for connected and autonomous vehicles (CAVs). In this article, taking into account human-vehicle interaction and heterogeneous driver behavior, an adaptive optimal control design method is proposed for a mixed platoon consisting of multiple preceding human-driven vehicles and one CAV at the tail. It is shown that, by using reinforcement learning and adaptive dynamic programming techniques, a near-optimal controller can be learned from real-time data for the CAV with V2V communications, without precise knowledge of the car-following parameters of any driver in the platoon. The proposed method allows the CAV controller to adapt to different platoon dynamics caused by the unknown and heterogeneous driver-dependent parameters. To improve safety during the learning process, our off-policy learning algorithm can leverage both historical data and data collected in real time, which considerably reduces the learning time. The effectiveness and efficiency of the proposed method are demonstrated by rigorous proofs and microscopic traffic simulations.

MeSH terms

Accidents, Traffic / prevention & control
Algorithms
Automobile Driving*
Humans
Reaction Time
Safety