Sequential Communication in Multi-Agent Reinforcement Learning

29 Sep 2021  ·  Ziluo Ding, Weixin Hong, Liwen Zhu, Tiejun Huang, Zongqing Lu

Coordination is one of the essential problems in multi-agent reinforcement learning. Communication provides a way for agents to obtain information about others so that better coordinated behavior can be learned. Some existing work has agents communicate their predicted future trajectories, hoping to give others clues about what they will do for better coordination. However, when agents are treated equally, circular dependencies inevitably occur, making it impossible to coordinate decision-making. In this paper, we propose a novel communication scheme, Sequential Communication (SeqComm). SeqComm treats agents unequally (upper-level agents make decisions before lower-level ones) and has two communication phases. In the negotiation phase, agents share observations with one another and obtain their intentions by modeling the environment dynamics; the priority of decision-making is then determined by comparing the values of these intentions. In the launching phase, upper-level agents take the lead in making decisions and share their actions with lower-level agents. Empirically, we show that SeqComm improves performance in a variety of multi-agent cooperative scenarios compared to existing methods.
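To make the two-phase scheme concrete, below is a minimal sketch (not the paper's implementation) of one SeqComm decision step. The functions `world_model`, `intention_value`, and `policy` are hypothetical placeholders standing in for the learned dynamics model, the critic that scores an imagined rollout, and the action policy conditioned on upper-level agents' actions; only the ordering logic (rank by intention value, then act sequentially while sharing actions downward) reflects the scheme described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, OBS_DIM, N_ACTIONS = 3, 4, 5

def world_model(obs_all, agent_id):
    """Placeholder dynamics model: produces the agent's 'intention'
    (an imagined future state) from the shared observations."""
    return rng.normal(size=OBS_DIM)

def intention_value(intention):
    """Placeholder critic scoring an imagined rollout."""
    return float(intention.sum())

def policy(obs_all, upper_actions, agent_id):
    """Placeholder policy conditioned on the shared observations and
    the actions already chosen by upper-level agents."""
    logits = rng.normal(size=N_ACTIONS) + 0.1 * sum(upper_actions.values())
    return int(np.argmax(logits))

def seqcomm_step(observations):
    # Negotiation phase: share observations, form intentions via the
    # dynamics model, and rank agents by the value of their intentions.
    intentions = {i: world_model(observations, i) for i in range(N_AGENTS)}
    values = {i: intention_value(intentions[i]) for i in range(N_AGENTS)}
    priority = sorted(values, key=values.get, reverse=True)  # upper level first

    # Launching phase: act in priority order; each agent receives the
    # actions of all higher-priority (upper-level) agents.
    actions = {}
    for agent_id in priority:
        actions[agent_id] = policy(observations, dict(actions), agent_id)
    return priority, actions

obs = {i: rng.normal(size=OBS_DIM) for i in range(N_AGENTS)}
print(seqcomm_step(obs))
```

Because actions are taken in a fixed priority order within each step, the circular dependency that arises when all agents condition on each other's predicted behavior simultaneously cannot occur.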
