Russian-Armenian University
University of Oxford
University of Oxford
University of Oxford
University of Oxford
University of Oxford
University of Oxford
University of Oxford
Facebook AI Research
University of Oxford
Abstract
In the last few years, deep multi-agent reinforcement learning (RL) has become a highly active area of research. A particularly challenging class of problems in this area is partially observable, cooperative, multi-agent learning, in which teams of agents must learn to coordinate their behaviour while conditioning only on their private observations. This is an attractive research area since such problems are relevant to a large number of real-world systems and are also more amenable to evaluation than general-sum problems. Standardised environments such as the ALE and MuJoCo have allowed single-agent RL to move beyond toy domains, such as grid worlds. However, there is no comparable benchmark for cooperative multi-agent RL. As a result, most papers in this field use one-off toy problems, making it difficult to measure real progress. In this paper, we propose the StarCraft Multi-Agent Challenge (SMAC) as a benchmark problem to fill this gap. SMAC is based on the popular real-time strategy game StarCraft II and focuses on micromanagement challenges where each unit is controlled by an independent agent that must act based on local observations. We offer a diverse set of challenge scenarios and recommendations for best practices in benchmarking and evaluations. We also open-source a deep multi-agent RL learning framework including state-of-the-art algorithms. We believe that SMAC can provide a standard benchmark environment for years to come. Videos of our best agents for several SMAC scenarios are available at https://youtu.be/VZ7zmQ_obZ0.
Citation
@inproceedings{samvelyan2019starcraft,
title={The StarCraft Multi-Agent Challenge},
author={Samvelyan, Mikayel and Rashid, Tabish and Schroeder de Witt, Christian and Farquhar, Gregory and Nardelli, Nantas and Rudner, Tim GJ and Hung, Chia-Man and Torr, Philip HS and Foerster, Jakob and Whiteson, Shimon},
booktitle={Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems},
pages={2186--2188},
year={2019}
}
Acknowledgement
The authors would like to thank Davit Hayrapetyan for his helpful suggestions about the StarCraft II scenarios. We also thank Phil Bates and Jon Russell (Oracle Corporation) for guidance deploying and executing SMAC on Oracle’s public Cloud Infrastructure and the SC2LE teams at DeepMind and Blizzard for their work on the interface.
The work is supported by the European Union’s Horizon 2020 research and innovation programme (grant agreement number 637713), the National Institutes of Health (grant agreement number R01GM114311) and EPSRC/MURI grant EP/N019474/1. It is also supported by the an EPSRC grant (EP/M508111/1, EP/N509711/1), the Oxford-Google DeepMind Graduate Scholarship, Microsoft and Free the Drones (FreeD) project under the Innovation Fund Denmark. The experiments were made possible by generous cloud credit grants from Oracle’s Cloud Innovation Accelerator and NVIDIA.