Spectrum-efficient user grouping and resource allocation based on deep reinforcement learning for mmWave massive MIMO-NOMA systems

Minghao Wang; Xin Liu; Fang Wang; Yang Liu; Tianshuang Qiu; Minglu Jin

doi:10.1038/s41598-024-59241-x

Sci Rep. 2024 Apr 17;14(1):8884. doi: 10.1038/s41598-024-59241-x.

Authors

Minghao Wang^#¹, Xin Liu^#¹, Fang Wang¹, Yang Liu², Tianshuang Qiu³, Minglu Jin³

Affiliations

¹ College of Electronic Information Engineering, Inner Mongolia University, Hohhot, 010021, China.
² College of Electronic Information Engineering, Inner Mongolia University, Hohhot, 010021, China. yangliu@imu.edu.cn.
³ Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, 116024, China.

^# Contributed equally.

PMID: 38632323
DOI: 10.1038/s41598-024-59241-x

Abstract

Millimeter-wave (mmWave) massive multiple-input multiple-output non-orthogonal multiple access (MIMO-NOMA) is proven to be a primary technique for sixth-generation (6G) wireless communication networks. However, the large increase in the number of users and antennas makes interference suppression and resource allocation challenging in mmWave massive MIMO-NOMA systems. This study proposes a spectrum-efficient, fast-converging deep reinforcement learning (DRL)-based resource allocation framework that jointly optimizes user grouping, subchannel allocation, and power allocation. First, an enhanced K-means grouping algorithm is proposed to reduce multi-user interference and accelerate convergence. Then, a dueling deep Q-network (DQN) structure is proposed to perform subchannel allocation, which further improves the convergence speed. Moreover, a deep deterministic policy gradient (DDPG)-based power allocation algorithm is designed to avoid the performance loss caused by power quantization and to improve the system's achievable sum-rate. The simulation results demonstrate that the proposed scheme converges faster than other neural network-based algorithms and achieves higher system capacity than the greedy algorithm, the random algorithm, the RNN algorithm, and the DoubleDQN algorithm.
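
As a rough illustration of the dueling DQN idea mentioned above, the sketch below shows the standard mean-subtracted dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)), followed by a greedy pick over candidate subchannels. The abstract does not specify the authors' network architecture, so the function names, the three-subchannel action set, and the numeric head outputs are all hypothetical; this is not the paper's implementation.

```python
from statistics import fmean


def dueling_q_values(state_value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Uses the common mean-subtracted form of the dueling aggregation:
    Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, a).
    """
    mean_advantage = fmean(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_subchannel(state_value, advantages):
    """Return the index of the action (subchannel) with the largest Q-value."""
    q = dueling_q_values(state_value, advantages)
    return max(range(len(q)), key=q.__getitem__)


# Hypothetical outputs of the value and advantage heads for one state
# with three candidate subchannels (assumed numbers, for illustration only).
q_values = dueling_q_values(2.0, [1.0, 3.0, -1.0])
best = greedy_subchannel(2.0, [1.0, 3.0, -1.0])
print(q_values, best)
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, which is the usual motivation given for this form of the aggregation.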
