Performance Competitions as Research Infrastructure: Large Scale Comparative Studies of Multi-Agent Teams

Authors: Gal A. Kaminka, Ian Frank, Katsuto Arai, Kumiko Tanaka-Ishii

Abstract

Performance competitions (events that pit many different programs against each other on a standardized task) provide a way for a research community to promote research progress towards challenging goals. In this paper, we argue that for maximum research benefit, any such competition must involve comparative studies under closely controlled, varying conditions. We demonstrate the critical role of comparative studies in the context of one well-known and growing performance competition: the annual Robotic Soccer World Cup (RoboCup) Championship. Specifically, over the past three years, we have carried out annual large-scale comparative evaluations—distinct from the competition itself—of the multi-agent teams taking part in the largest RoboCup league. Our study, which involved 30 different teams of agents produced by dozens of different research groups, focused on robustness. We show that (i) multi-agent teams exhibit a clear performance-robustness tradeoff; (ii) teams tend to over-specialize, so that they cannot handle beneficial changes we make to their operating environment; and (iii) teams improve in performance more than in robustness from one year to the next, despite the emphasis by RoboCup organizers on robustness as a key challenge. These results demonstrate the potential of large-scale comparative studies for producing important results otherwise difficult to discover, and are significant both in the lessons they raise for designers of multi-agent teams, and in understanding the place of performance competitions within the multi-agent research infrastructure.
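
To make the abstract's notion of a performance-robustness tradeoff concrete, the sketch below shows one simple way such a tradeoff could be quantified with linear regression (one of the paper's keywords): each team's score is regressed against the severity of a controlled change to its operating environment, so the intercept approximates baseline performance and the slope approximates robustness. The team names and numbers are purely illustrative assumptions, not data from the study.

```python
# A minimal sketch, assuming hypothetical match data: regress team score
# against the severity of a controlled environmental change, so the fitted
# intercept serves as baseline performance and the slope as a robustness
# proxy. All names and values are illustrative, not results from the paper.
import numpy as np

# Severity of the induced change (0.0 = unmodified environment).
severity = np.array([0.0, 0.25, 0.5, 0.75, 1.0])

# Hypothetical mean goal difference per team at each severity level.
teams = {
    "team_a": np.array([3.1, 2.8, 2.0, 1.1, 0.4]),  # strong baseline, steep decline
    "team_b": np.array([1.9, 1.8, 1.7, 1.5, 1.4]),  # weaker baseline, flat decline
}

for name, scores in teams.items():
    # np.polyfit with deg=1 returns (slope, intercept).
    slope, intercept = np.polyfit(severity, scores, deg=1)
    print(f"{name}: baseline ~ {intercept:.2f}, robustness slope ~ {slope:.2f}")
```

Under this kind of analysis, a team with a higher intercept but a steeper negative slope would exemplify the pattern the abstract reports: better raw performance bought at the cost of robustness to changed conditions.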

Keywords: performance competitions, comparative studies, teamwork, robustness, measurement, linear regression

Paper URL: https://doi.org/10.1023/A:1024180921782