A UAV COOPERATIVE PLANNING METHOD BASED ON MULTI-AGENT DEEP REINFORCEMENT LEARNING

Wang Na¹, Ma Limin², Jiang Yunchun¹, Zong Chengguo¹

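Author information

1. School of Intelligent Manufacturing, Qingdao Huanghai University, Qingdao 266427, Shandong, China
2. College of Electrical and Information Engineering, Hunan University, Changsha 410082, Hunan, China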

Abstract

Human-machine cooperative control is an important approach to multi-UAV task planning. To meet the requirements of cooperative interpretation of the multi-UAV task environment and consistency of policy control, we propose a UAV cooperative planning method based on multi-agent deep reinforcement learning. Based on task knowledge and behavior states, a task planner built on a hierarchical task-allocation agent generates the interdependencies of human-machine interaction. A deep reinforcement learning method is designed to obtain the optimal policy and cooperative control for group behavior, and a mixed-initiative behavior selection mechanism is used to evaluate the learned policy. Experimental results show that, as an instance of human-machine interaction, the proposed method achieves good global joint-action performance for the group through deep reinforcement learning, with learning speed and stability better than those of the deterministic policy gradient method. In a comparison of the following, autonomous, and mixed-initiative modes, the method controls UAV flight paths and tasks well, providing a basis for intelligent decision-making in UAV swarm task execution.

Key words

Multi-agent planning/Deep reinforcement learning/UAV cooperative planning/Mixed-initiative behavior

Funding

Natural Science Foundation of Hunan Province (2018JJ1002)

Qingdao Huanghai University Internal Doctoral Project (2017boshi02)

Publication year

2024

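Computer Applications and Software

Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology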

Indexed in: CSTPCD / Peking University Core Journals

Impact factor: 0.615

ISSN: 1000-386X
