10 Oct 2018 • Zheng Tian, Shihao Zou, Ian Davies, Tim Warr, Lisheng Wu, Haitham Bou Ammar, Jun Wang
The auxiliary reward for communication is integrated into the learning of the policy module.
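The excerpt does not say how the auxiliary reward enters policy learning. One common way, shown here only as a minimal sketch, is to add a weighted auxiliary term to the environment reward at each step and compute the policy's discounted returns from the combined signal. The names `combine_rewards`, `discounted_returns`, `aux_weight`, and `gamma` are hypothetical and the weighted-sum form is an assumption, not the authors' stated method.

```python
from typing import List, Sequence


def combine_rewards(env_rewards: Sequence[float],
                    aux_rewards: Sequence[float],
                    aux_weight: float = 0.5) -> List[float]:
    """Per-step combined reward: environment reward plus weighted auxiliary reward.

    Assumption: a simple weighted sum; the source does not specify the form.
    """
    if len(env_rewards) != len(aux_rewards):
        raise ValueError("reward sequences must have the same length")
    return [r + aux_weight * a for r, a in zip(env_rewards, aux_rewards)]


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Discounted return G_t = r_t + gamma * G_{t+1}, computed backwards in time."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical two-step episode: task reward and communication reward per step.
shaped = combine_rewards([1.0, 0.0], [0.0, 2.0], aux_weight=0.5)
targets = discounted_returns(shaped, gamma=0.5)
```

In a policy-gradient setting, `targets` would weight the log-probabilities of the actions taken, so the communication signal influences the policy update through the same path as the task reward.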