Learning multi-agent communication with double attentional deep reinforcement learning

Authors: Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong, Yan Ni

Abstract

Communication is a critical factor for the big multi-agent world to stay organized and productive. Recently, Deep Reinforcement Learning (DRL) has been adopted to learn the communication among multiple intelligent agents. However, in terms of the DRL setting, the increasing number of communication messages introduces two problems: (1) there are usually some redundant messages; (2) even in the case that all messages are necessary, how to process a large number of messages in an efficient way remains a big challenge. In this paper, we propose a DRL method named Double Attentional Actor-Critic Message Processor (DAACMP) to jointly address these two problems. Specifically, DAACMP adopts two attention mechanisms. The first one is embedded in the actor part, such that it can select the important messages from all communication messages adaptively. The other one is embedded in the critic part so that all important messages can be processed efficiently. We evaluate DAACMP on three multi-agent tasks with seven different settings. Results show that DAACMP not only outperforms several state-of-the-art methods but also achieves better scalability in all tasks. Furthermore, we conduct experiments to reveal some insights about the proposed attention mechanisms and the learned policies.

Keywords: Learning to communicate, Multi-agent reinforcement learning, Attentional deep reinforcement learning, Large-scale communication

Paper URL: https://doi.org/10.1007/s10458-020-09455-w