Supervisor: George Parisis (G.Parisis@sussex.ac.uk)
Multiple users accessing a network must share the available resources - bandwidth and buffers. Network congestion is a network state characterised by increased delay and packet loss, arising when traffic traverses one or more bottleneck links where the required bandwidth exceeds the available one. Network congestion severely degrades users' quality of experience and must therefore be controlled. Congestion control involves end-hosts, and potentially in-network devices, and aims to maximise resource utilisation while allocating resources fairly among all users. It is commonly performed on an end-to-end basis by regulating senders' transmission rates. Recently, a learning-based congestion control paradigm has gained traction; the key argument is that congestion signals and control actions are too complex for humans to interpret, and that machine-generated algorithms can provide policies superior to human-derived ones. An objective function then guides the learning of the control strategy. Early work in this thread included offline optimisation of a fixed rule table [1] and online gradient-ascent optimisation [2], with later work adopting sequential decision-making optimisation via reinforcement learning (RL) algorithms [3, 4].
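To make the role of the objective function concrete, the minimal sketch below shows the general shape of a reward used by RL-based congestion control agents: throughput is rewarded, while queueing delay and loss are penalised. The specific terms, names and weights here are illustrative assumptions and do not reproduce the objective of any particular proposal in [1-4].

```python
# Illustrative reward for an RL congestion-control agent (not taken from any
# specific proposal): reward throughput, penalise queueing delay and loss.
def reward(throughput_mbps: float, rtt_ms: float, base_rtt_ms: float,
           loss_rate: float, a: float = 0.1, b: float = 10.0) -> float:
    # Queueing delay is approximated as the RTT inflation over the base RTT.
    queueing_delay_ms = max(rtt_ms - base_rtt_ms, 0.0)
    # a and b are placeholder weights trading off delay and loss penalties.
    return throughput_mbps - a * queueing_delay_ms - b * loss_rate
```

An RL algorithm then learns a policy mapping observed congestion signals (e.g., RTT samples, loss events, delivery rate) to rate adjustments that maximise the cumulative value of such a reward.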
RL-based congestion control is still in its infancy, and substantial research is required to yield deployable algorithms and respective RL policies. In [5] we have shown that existing approaches fall short when it comes to fairness, a fundamental requirement of congestion control. In this project, we will further explore the concept of fairness in RL-based congestion control through experimentation in both emulated [5] and simulated networks [6]. We will specifically experiment with RayNet [6], a simulation framework that we have developed; RayNet integrates state-of-the-art software for packet-level simulation (OMNeT++) and for unified computing and RL (Ray/RLlib). We will consider novel approaches to fairness, such as integrating it as an explicit component of the reward during training, e.g., by employing centralised training, or even during regular operation, by generating fairness signals through in-network telemetry.
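As one possible illustration of folding fairness into the reward, the sketch below combines a per-flow utility with Jain's fairness index computed over the rates of all competing flows. The per-flow rates are assumed to be visible either to a centralised trainer or, at run time, through in-network telemetry; the utility term and the weight `w` are placeholder assumptions, not a design decided by this project.

```python
from typing import Sequence

def jain_index(rates: Sequence[float]) -> float:
    """Jain's fairness index: equals 1.0 when all flows receive equal rates."""
    n = len(rates)
    total = sum(rates)
    if n == 0 or total == 0.0:
        return 1.0
    return total * total / (n * sum(r * r for r in rates))

# Illustrative fairness-aware reward: a flow's own utility plus a shared
# fairness bonus. Per-flow rates are assumed observable via centralised
# training or in-network telemetry; the weights are placeholders.
def fairness_aware_reward(own_rate_mbps: float, all_rates_mbps: Sequence[float],
                          queueing_delay_ms: float, loss_rate: float,
                          w: float = 1.0) -> float:
    utility = own_rate_mbps - 0.1 * queueing_delay_ms - 10.0 * loss_rate
    return utility + w * jain_index(all_rates_mbps)
```

Because the fairness bonus is shared across flows, each agent is incentivised not only to use the bottleneck efficiently but also to leave capacity for competing flows, which is precisely the behaviour found lacking in existing approaches [5].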
[1] Keith Winstein and Hari Balakrishnan. 2013. TCP ex Machina: Computer-generated congestion control. ACM SIGCOMM Computer Communication Review 43, 4 (2013), 123–134.
[2] Mo Dong, Tong Meng, Doron Zarchy, Engin Arslan, Yossi Gilad, Brighten Godfrey, and Michael Schapira. 2018. PCC Vivace: Online-Learning Congestion Control. In Proceedings of USENIX NSDI. 343–356.
[3] Soheil Abbasloo, Chen-Yu Yen, and H Jonathan Chao. 2020. Classic meets modern: A pragmatic learning-based congestion control for the internet. In Proceedings of ACM SIGCOMM. 632–647.
[4] Nathan Jay, Noga Rotman, Brighten Godfrey, Michael Schapira, and Aviv Tamar. 2019. A deep reinforcement learning perspective on Internet congestion control. In Proceedings of ICML. 3050–3059.
[5] Luca Giacomoni and George Parisis. 2024. Reinforcement Learning-based Congestion Control: A Systematic Evaluation of Fairness, Efficiency and Responsiveness. In Proceedings of IEEE INFOCOM (accepted).
[6] Luca Giacomoni, Basil Benny, and George Parisis. 2023. RayNet: A simulation platform for developing reinforcement learning-driven network protocols. CoRR abs/2302.04519 (2023).