An Empirical Study on Google Research Football Multi-agent Scenarios

Abstract: Few multi-agent reinforcement learning (MARL) studies on Google Research Football (GRF) [1] focus on the 11-vs-11 multi-agent full-game scenario, and to the best of our knowledge, no open benchmark on this scenario has been released to the public. In this work, we fill the gap by providing a population-based MARL training pipeline and hyperparameter settings on the multi-agent football scenario that outperforms the bot with difficulty 1.0 from scratch within 2 million steps. Our experiments serve as a reference for the expected performance of independent proximal policy optimization (IPPO) [2], a state-of-the-art multi-agent reinforcement learning algorithm in which each agent optimizes its own policy independently, across various training configurations. Meanwhile, we release our training framework Light-MALib, which extends the MALib [3] codebase with a distributed and asynchronous implementation and additional analytical tools for football games. Finally, we provide guidance for building strong football AI with population-based training [4] and release diverse pretrained policies for benchmarking. The goal is to give the community a head start for anyone experimenting on GRF, along with a simple-to-use population-based training framework for further improving their agents through self-play. The implementation is available at https://github.com/Shanghai-Digital-Brain-Laboratory/DB-Football.

Keywords:

Multi-agent reinforcement learning (RL), distributed RL system, population-based training, reward shaping, game theory

Authors:

Yan Song, He Jiang, Zheng Tian, Haifeng Zhang, Yingping Zhang, Jiangcheng Zhu, Zonghong Dai, Weinan Zhang, Jun Wang

Affiliations:

Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Digital Brain Lab, Shanghai 200001, China

ShanghaiTech University, Shanghai 200001, China

Huawei Cloud, Guiyang 550003, China

Shanghai Jiao Tong University, Shanghai 200001, China

University College London, London WC1E 6PT, UK

Funding:

National Natural Science Foundation of China

Grant number:

62206289

Publication year:

2024

DOI:

10.1007/s11633-023-1426-8

Machine Intelligence Research

Institute of Automation, Chinese Academy of Sciences

CSTPCD, EI

Impact factor: 0.49

ISSN: 2731-538X

Year, volume (issue): 2024, 21(3)

Number of references: 43