Task Offloading Decision-Making Algorithm for Vehicular Edge Computing: A Deep-Reinforcement-Learning-Based Approach

Wei Shi; Long Chen; Xia Zhu

doi:10.3390/s23177595

Task Offloading Decision-Making Algorithm for Vehicular Edge Computing: A Deep-Reinforcement-Learning-Based Approach

Sensors (Basel). 2023 Sep 1;23(17):7595. doi: 10.3390/s23177595.

Authors

Wei Shi^{1

2}, Long Chen^{1

2}, Xia Zhu^{1

2}

Affiliations

¹ School of Computer Science and Engineering, Southeast University, Nanjing 211189, China.
² The Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 211189, China.

Abstract

Efficient task offloading decision is a crucial technology in vehicular edge computing, which aims to fulfill the computational performance demands of complex vehicular tasks with respect to delay and energy consumption while minimizing network resource competition and consumption. Conventional distributed task offloading decisions rely solely on the local state of the vehicle, failing to optimize the utilization of the server's resources to its fullest potential. In addition, the mobility aspect of vehicles is often neglected in these decisions. In this paper, a cloud-edge-vehicle three-tier vehicular edge computing (VEC) system is proposed, where vehicles partially offload their computing tasks to edge or cloud servers while keeping the remaining tasks local to the vehicle terminals. Under the restrictions of vehicle mobility and discrete variables, task scheduling and task offloading proportion are jointly optimized with the objective of minimizing the total system cost. Considering the non-convexity, high-dimensional complex state and continuous action space requirements of the optimization problem, we propose a task offloading decision-making algorithm based on deep deterministic policy gradient (TODM_DDPG). TODM_DDPG algorithm adopts the actor-critic framework in which the actor network outputs floating point numbers to represent deterministic policy, while the critic network evaluates the action output by the actor network, and adjusts the network evaluation policy according to the rewards with the environment to maximize the long-term reward. To explore the algorithm performance, this conduct parameter setting experiments to correct the algorithm core hyper-parameters and select the optimal combination of parameters. In addition, in order to verify algorithm performance, we also carry out a series of comparative experiments with baseline algorithms. The results demonstrate that in terms of reducing system costs, the proposed algorithm outperforms the compared baseline algorithm, such as the deep Q network (DQN) and the actor-critic (AC), and the performance is improved by about 13% on average.

Keywords: computation offloading; deep reinforcement learning; vehicular edge computing.

Abstract

Grants and funding