Deep Reinforcement Learning with Symmetric Prior for Predictive Power Allocation to Mobile Users

10 Feb 2021  ·  Jianyu Zhao, Chenyang Yang ·

Deep reinforcement learning has been applied for a variety of wireless tasks, which is however known with high training and inference complexity. In this paper, we resort to deep deterministic policy gradient (DDPG) algorithm to optimize predictive power allocation among K mobile users requesting video streaming, which minimizes the energy consumption of the network under the no-stalling constraint of each user. To reduce the sampling complexity and model size of the DDPG, we exploit a kind of symmetric prior inherent in the actor and critic networks: permutation invariant and equivariant properties, to design the neural networks. Our analysis shows that the free model parameters of the DDPG can be compressed by 2/K^2. Simulation results demonstrate that the episodes required by the learning model with the symmetric prior to achieve the same performance as the vanilla policy reduces by about one third when K = 10.

PDF Abstract
No code implementations yet. Submit your code now


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.