Noah: Reinforcement-Learning-Based Rate Limiter for Microservices in Large-Scale E-Commerce Services

Zhao Li; Haifeng Sun; Zheng Xiong; Qun Huang; Zehong Hu; Ding Li; Shasha Ruan; Hai Hong; Jie Gui; Jintao He; Zebin Xu; Yang Fang

doi:10.1109/TNNLS.2023.3264038

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5403-5417. doi: 10.1109/TNNLS.2023.3264038. Epub 2023 Sep 1.

PMID: 37040246

Abstract

Modern large-scale online service providers typically deploy microservices in containers to achieve flexible service management. One critical problem in such container-based microservice architectures is controlling the arrival rate of requests to the containers to prevent them from being overloaded. In this article, we present our experience with rate limiting for containers at Alibaba, one of the largest e-commerce services in the world. Given the highly diverse characteristics of containers at Alibaba, we point out that existing rate-limiting mechanisms cannot meet our demands. Thus, we design Noah, a dynamic rate limiter that automatically adapts to the specific characteristics of each container without human effort. The key idea of Noah is to use deep reinforcement learning (DRL) to automatically infer the most suitable configuration for each container. To fully exploit the advantages of DRL in our context, Noah addresses two technical challenges. First, Noah uses a lightweight system monitoring mechanism to collect container status. In this way, it minimizes monitoring overhead while ensuring a timely reaction to changes in system load. Second, Noah injects synthetic extreme data when training its models. Its models thus gain knowledge of unseen special events and remain highly available in extreme scenarios. To guarantee model convergence with the injected training data, Noah adopts task-specific curriculum learning, training the model gradually from normal data to extreme data. Noah has been deployed in production at Alibaba for two years, serving more than 50,000 containers and around 300 types of microservice applications. Experimental results show that Noah adapts well to three common scenarios in the production environment. It achieves better system availability and shorter request response times than four state-of-the-art rate limiters.
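
The abstract describes training "from normal data to extreme data gradually" but does not give the schedule. The following is only an illustrative sketch of one way such a curriculum could be arranged, assuming a linear ramp in the share of synthetic extreme samples per batch. The function names, the `max_fraction` parameter, and the linear ramp itself are hypothetical and are not taken from Noah.

```python
import random


def extreme_fraction(epoch, total_epochs, max_fraction=0.5):
    """Share of synthetic extreme samples at a given epoch.

    Assumption: a linear ramp from 0 at the first epoch to max_fraction
    at the last epoch. The source does not specify Noah's schedule.
    """
    if total_epochs <= 1:
        return max_fraction
    return max_fraction * epoch / (total_epochs - 1)


def sample_batch(normal, extreme, epoch, total_epochs, batch_size, rng):
    """Draw a training batch that mixes normal and extreme samples.

    Early epochs contain mostly normal data; later epochs contain a
    growing share of extreme data, up to max_fraction.
    """
    n_extreme = round(batch_size * extreme_fraction(epoch, total_epochs))
    batch = rng.choices(extreme, k=n_extreme)
    batch += rng.choices(normal, k=batch_size - n_extreme)
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical placeholders standing in for recorded and synthetic traces.
    normal = [("normal", i) for i in range(100)]
    extreme = [("extreme", i) for i in range(100)]
    for epoch in range(5):
        batch = sample_batch(normal, extreme, epoch, 5, 8, rng)
        n_extreme = sum(1 for kind, _ in batch if kind == "extreme")
        print(epoch, n_extreme)
```

With these assumed settings, the first epoch draws only normal samples and the final epoch draws half of each batch from the extreme set. A real deployment would replace the placeholder lists with recorded container traces and synthetic overload traces.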