no code implementations • 19 Jul 2022 • Saeid Sadeghi Vilni, Mohammad Moltafet, Markus Leinonen, Marian Codreanu
Finally, we consider unknown environment and devise a learning-based transmission policy by relaxing the CMDP problem into an MDP problem using the DPP method and then adopting the deep Q-learning algorithm.