A survey of multi-agent reinforcement learning methods

Chen Renlong¹, Chen Jiali¹, Li Shanqi¹, Tan Ying²

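Author information

1. Key Laboratory of Machine Perception and Intelligence (Ministry of Education), Peking University, Beijing 100871; School of Intelligence Science and Technology, Peking University, Beijing 100871
2. Key Laboratory of Machine Perception and Intelligence (Ministry of Education), Peking University, Beijing 100871; School of Intelligence Science and Technology, Peking University, Beijing 100871; Institute for Artificial Intelligence, Peking University, Beijing 100871; National Key Laboratory of Cross-Media General Artificial Intelligence, Peking University, Beijing 100871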

Abstract

In real-world sequential decision-making problems such as autonomous driving and team-based cooperative games, multi-agent reinforcement learning has shown strong potential. However, it faces challenges including the curse of dimensionality, instability, multi-objectivity, and partial observability. This article gives an overview of the concepts and methods of multi-agent reinforcement learning and summarizes the main trends and directions of current research. The research trends comprise the centralized training with decentralized execution (CTDE) paradigm, agents equipped with recurrent neural units, and training techniques. The main research directions cover hybrid learning methods, cooperative and competitive learning, communication and knowledge sharing, adaptability and robustness, hierarchical and modular learning, game-theoretic approaches, and interpretability. Future research directions include addressing the curse of dimensionality, solving large-scale combinatorial optimization problems, and analyzing the global convergence of multi-agent reinforcement learning algorithms. Pursuing these directions will help drive further breakthroughs in the practical application of multi-agent reinforcement learning.

Key words

multi-agent reinforcement learning/reinforcement learning/multi-agent system/swarm collaboration/curse of dimensionality

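Funding

National Key Research and Development Program of China (2018AAA0102301)

National Natural Science Foundation of China (62250037)

National Natural Science Foundation of China (62276008)

National Natural Science Foundation of China (62076010)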

Publication year

2024

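Information Countermeasure Technology (信息对抗技术)

College of Electronic Countermeasures, National University of Defense Technology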

CSCD

ISSN: 2097-163X

Citations: 2

References: 71
