Deep Reinforcement Learning for Joint Trajectory Planning, Transmission Scheduling, and Access Control in UAV-Assisted Wireless Sensor Networks

Sensors (Basel). 2023 May 12;23(10):4691. doi: 10.3390/s23104691.

Abstract

Unmanned aerial vehicles (UAVs) can be used to relay sensing information and computational workloads from ground users (GUs) to a remote base station (RBS) for further processing. In this paper, we employ multiple UAVs to assist with the collection of sensing information in a terrestrial wireless sensor network. All of the information collected by the UAVs can be forwarded to the RBS. We aim to improve the energy efficiency of sensing-data collection and transmission by optimizing the UAV trajectory, scheduling, and access-control strategies. We consider a time-slotted frame structure in which each time slot is divided into UAV flight, sensing, and information-forwarding sub-slots. This motivates a study of the trade-off between UAV access control and trajectory planning: admitting more sensing data in one time slot occupies more UAV buffer space and requires a longer transmission time for information forwarding. We solve this problem with a multi-agent deep reinforcement learning approach that accounts for a dynamic network environment with uncertain information about the GU spatial distribution and traffic demands. We further devise a hierarchical learning framework with reduced action and state spaces that improves learning efficiency by exploiting the distributed structure of the UAV-assisted wireless sensor network. Simulation results show that UAV trajectory planning with access control can significantly improve UAV energy efficiency. The hierarchical learning method is also more stable during training and achieves higher sensing performance.
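The multi-agent idea sketched in the abstract, where each UAV agent jointly picks a trajectory action and an access-control action under a buffer constraint, can be illustrated in miniature. The sketch below is a toy under invented assumptions (a small grid, an assumed buffer capacity, a synthetic GU demand model, and independent tabular Q-learning in place of the paper's deep networks); none of these parameters come from the paper.

```python
# Toy sketch only: independent tabular Q-learning per UAV agent on a grid,
# with a per-slot access-control action bounded by buffer space. All
# environment parameters (grid size, buffer capacity, reward weights) are
# invented for illustration and are NOT taken from the paper.
import random

random.seed(0)

GRID = 5            # toy service area: GRID x GRID cells (assumed)
BUFFER_CAP = 3      # max sensing packets admitted per slot (assumed)
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # trajectory sub-actions
ADMIT = range(BUFFER_CAP + 1)                       # access-control sub-actions

def demand(cell):
    """Synthetic GU traffic demand at a grid cell (packets available)."""
    x, y = cell
    return (x + y) % (BUFFER_CAP + 1)

class UAVAgent:
    """One agent with a factored action: (move, packets admitted)."""
    def __init__(self, start, alpha=0.5, gamma=0.9, eps=0.1):
        self.pos, self.alpha, self.gamma, self.eps = start, alpha, gamma, eps
        self.q = {}  # (state, (move, admit)) -> value

    def actions(self):
        return [(m, a) for m in MOVES for a in ADMIT]

    def act(self):
        if random.random() < self.eps:
            return random.choice(self.actions())
        return max(self.actions(),
                   key=lambda ma: self.q.get((self.pos, ma), 0.0))

    def step(self):
        move, admit = self.act()
        nxt = (min(GRID - 1, max(0, self.pos[0] + move[0])),
               min(GRID - 1, max(0, self.pos[1] + move[1])))
        # Reward: packets actually sensed, minus a forwarding-time cost that
        # grows with the amount admitted (the buffer/transmission trade-off
        # described in the abstract).
        sensed = min(admit, demand(nxt))
        reward = sensed - 0.3 * admit
        key = (self.pos, (move, admit))
        best_next = max(self.q.get((nxt, ma), 0.0) for ma in self.actions())
        self.q[key] = self.q.get(key, 0.0) + self.alpha * (
            reward + self.gamma * best_next - self.q.get(key, 0.0))
        self.pos = nxt
        return reward

# Two independent UAV agents, each learning over its own local region --
# a crude stand-in for the hierarchical/distributed structure in the paper.
agents = [UAVAgent((0, 0)), UAVAgent((GRID - 1, GRID - 1))]
totals = [0.0, 0.0]
for slot in range(2000):
    for i, ag in enumerate(agents):
        totals[i] += ag.step()
print([round(t, 1) for t in totals])
```

Each agent's action space stays small (5 moves x 4 admission levels), which mirrors, very loosely, the reduced action and state spaces of the hierarchical framework; the paper's actual method replaces the Q-table with deep networks and a far richer environment model.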

Keywords: UAV; access control; multi-agent deep reinforcement learning; trajectory planning.

Grants and funding

This research received no external funding.