no code implementations • 17 Mar 2024 • Jingcheng Jiang, Haiyin Piao, Yu Fu, Yihang Hao, Chuanlu Jiang, Ziqi Wei, Xin Yang
Furthermore, we construct a dogfight scenario for aerial agents to demonstrate the practicality of the PDO algorithm.
Multi-Armed Bandits • reinforcement-learning