Research Progress in Multi-Agent Reinforcement Learning from the Perspective of Competition and Cooperation


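Abstract (translated from the Chinese): With the considerable progress made in deep learning and reinforcement learning research, multi-agent reinforcement learning has become a general approach to large-scale, complex sequential decision-making problems. To advance the field, this paper collects and summarizes recent research results from the perspective of competition and cooperation. It first introduces single-agent reinforcement learning. It then presents the basic theoretical frameworks of multi-agent reinforcement learning, namely Markov games and extensive-form games, and focuses on the classic algorithms and recent advances in three settings: competitive, cooperative, and mixed. Finally, it discusses the core challenge of multi-agent reinforcement learning, the non-stationarity of the environment, and uses an example to summarize existing approaches to it and their outlook.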

English title: RECENT PROCESS AND PROSPECT OF MULTI-AGENT REINFORCEMENT LEARNING UNDER THE PERSPECTIVE OF COMPETITION AND COOPERATION

English abstract: With the rapid development of deep learning and reinforcement learning, multi-agent reinforcement learning (MARL) has become a common approach to solving large-scale, complex sequential decision-making problems. To promote the development of this field, this paper collects and reviews recent research results from the perspective of competition and cooperation. It introduces deep reinforcement learning and the basic theoretical frameworks of MARL, namely the Markov game and the extensive-form game, with particular emphasis on reinforcement learning algorithms developed recently for three scenarios: competitive, cooperative, and mixed. It then discusses the core challenge of MARL, the non-stationarity of the environment, and uses an example to summarize existing solutions and their prospects.

English keywords:

Deep learning; Reinforcement learning; Multi-agent reinforcement learning; Non-stationarity of the environment

Authors:

田小禾、李伟、许铮、刘天星、戚骁亚、甘中学


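Author affiliations:

Academy for Engineering and Technology, Fudan University, Shanghai 200433

Shanghai Engineering Research Center of AI & Robotics, Shanghai 200433

Engineering Research Center of AI & Robotics, Ministry of Education, Shanghai 200433

Ji Hua Laboratory, Foshan, Guangdong 528000

北京深度奇点科技有限公司, Beijing 100089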


Keywords:

Deep learning; Reinforcement learning; Multi-agent reinforcement learning; Non-stationarity of the environment

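Funding:

Ji Hua Laboratory Fund Project (Guangdong Province); Science and Technology Commission of Shanghai Municipality project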

Project numbers:

X190021TB1901951113200

Publication year:

2024

DOI:

10.3969/j.issn.1000-386x.2024.04.001

Journal: Computer Applications and Software (计算机应用与软件)

Published by: Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology


Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.615

ISSN: 1000-386X

Year, volume (issue): 2024, 41(4)

Number of references: 71