A Multiarmed Bandit Approach for LTE-U/Wi-Fi Coexistence in a Multicell Scenario

Sensors (Basel). 2023 Jul 27;23(15):6718. doi: 10.3390/s23156718.

Abstract

Recent studies and literature reviews have shown promising results for 3GPP system solutions in unlicensed bands when coexisting with Wi-Fi, either through the duty cycle (DC) approach or through licensed-assisted access (LAA). However, it is widely known that overall performance in these coexistence scenarios depends on the traffic and on how the duty cycle is adjusted. Most DC solutions configure their parameters statically, which can result in performance losses when the scenario experiences changes in the offered load. In our previous works, we demonstrated that reinforcement learning (RL) techniques can be used to adjust DC parameters. We showed that a Q-learning (QL) solution that adapts the LTE DC ratio to the transmitted data rate can maximize the aggregate Wi-Fi/LTE-Unlicensed (LTE-U) throughput. In this paper, we extend our previous solution by implementing a simpler and more efficient algorithm based on multiarmed bandit (MAB) theory. We evaluate its performance and compare it with the previous solution in different traffic scenarios. The results demonstrate that our new solution achieves a better throughput balance, providing similar throughput to LTE and Wi-Fi while still delivering a substantial system gain. Moreover, in one of the scenarios, our solution outperforms the previous approach by 6% in system throughput. In terms of user throughput, it achieves a gain of more than 100% for users at the 10th percentile of performance, whereas the previous solution achieves only a 10% gain.

Keywords: LTE-U; Q-learning; Wi-Fi; coexistence; multiarmed bandit; reinforcement learning.
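
The abstract does not specify which bandit variant the paper implements; purely as an illustration of the general idea, the sketch below applies a standard UCB1 bandit to select an LTE-U duty-cycle ratio from a discretized set, treating the observed Wi-Fi/LTE-U aggregate throughput as the reward. The arm values, the reward model in observe_aggregate_throughput, and all parameters are hypothetical placeholders, not taken from the paper.

```python
import math
import random

# Candidate LTE-U duty-cycle ratios (fraction of a DC period in which
# LTE-U transmits). Illustrative values only, not from the paper.
DC_RATIOS = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]


def observe_aggregate_throughput(dc_ratio: float) -> float:
    """Stand-in for a measured Wi-Fi + LTE-U aggregate throughput (Mbps).

    A real system would measure this over the air after applying the
    chosen duty cycle; here a noisy concave curve peaking near 0.5 is
    assumed purely for demonstration.
    """
    peak = 100.0
    mean = peak * (1.0 - 4.0 * (dc_ratio - 0.5) ** 2)
    return max(0.0, random.gauss(mean, 5.0))


def ucb1_duty_cycle(rounds: int = 2000) -> float:
    """UCB1 bandit returning the DC ratio with the best average reward."""
    counts = [0] * len(DC_RATIOS)   # times each arm was played
    means = [0.0] * len(DC_RATIOS)  # running mean reward per arm

    for t in range(1, rounds + 1):
        if t <= len(DC_RATIOS):
            arm = t - 1  # play every arm once to initialize the estimates
        else:
            # Upper confidence bound: favor arms with high mean reward,
            # but keep exploring rarely played arms.
            arm = max(
                range(len(DC_RATIOS)),
                key=lambda a: means[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = observe_aggregate_throughput(DC_RATIOS[arm])
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]

    return DC_RATIOS[max(range(len(DC_RATIOS)), key=lambda a: means[a])]


if __name__ == "__main__":
    print(f"Selected duty-cycle ratio: {ucb1_duty_cycle():.1f}")
```

In an actual deployment, the reward would come from throughput measurements collected over each DC period rather than the simulated curve used here, and the set of candidate ratios would be chosen to match the scenario's traffic load.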