Survey of Multi-agent Deep Reinforcement Learning Based on Value Function Factorization

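Abstract (translated from Chinese): Multi-agent deep reinforcement learning extends deep reinforcement learning to multi-agent problems. Within this family, methods based on value function factorization have performed well and are currently a focus of research and application. This paper introduces the main principles and framework of multi-agent deep reinforcement learning based on value function factorization. Drawing on recent work, it identifies three research hotspots: improving the fitting ability of the mixing network, improving convergence, and improving algorithm scalability. It analyzes the causes of these three problems in terms of algorithm constraints, environmental complexity, and neural network limitations. Existing studies are classified by the problem they target and the method they use; the common features of similar methods are summarized, and the strengths and weaknesses of different methods are analyzed. Finally, the paper describes applications of these methods in two active areas: network node control and unmanned formation control.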

English title: Survey of Multi-agent Deep Reinforcement Learning Based on Value Function Factorization

English abstract: Multi-agent deep reinforcement learning is an extension of deep reinforcement learning to multi-agent problems. Within it, multi-agent deep reinforcement learning based on value function factorization has achieved better performance and is currently a hotspot for research and application. This paper introduces the main principles and framework of multi-agent deep reinforcement learning based on value function factorization. Based on recent related research, three research hotspots are summarized: improving the fitting ability of the mixing network, improving the convergence effect, and improving the scalability of algorithms. The reasons for the three hotspot problems are analyzed in terms of algorithm constraints, environmental complexity, and neural network limitations. The existing research is classified according to the problems to be solved and the methods used; the common points of similar methods are summarized, and the advantages and disadvantages of different methods are analyzed. The application of multi-agent deep reinforcement learning based on value function factorization in two hot fields, network node control and unmanned formation control, is expounded.

English keywords:

Multi-agent deep reinforcement learning; Value function factorization; Fitting ability; Convergence effect; Scalability

Authors:

Gao Yuzhao (高玉钊), Nie Yiming (聂一鸣)

Author affiliation:

National Innovation Institute of Defense Technology, Academy of Military Sciences, Beijing 100071, China

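Keywords (translated from Chinese):

multi-agent deep reinforcement learning; value function factorization; fitting ability; convergence effect; scalability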

Publication year:

2024

DOI:

10.11896/jsjkx.230300170

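Computer Science (计算机科学)

Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)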

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.944

ISSN: 1002-137X

Year, volume (issue): 2024, 51(z1)

Number of references: 67