Multi-UAV collaborative path planning based on the multi-agent soft actor critic (MASAC) reinforcement learning algorithm
Fang Chengliang (方城亮) 1, Yang Feisheng (杨飞生) 2, Pan Quan (潘泉) 1
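Author information
- 1. School of Automation, Northwestern Polytechnical University, Xi'an 710129
- 2. School of Automation, Northwestern Polytechnical University, Xi'an 710129; Chongqing Innovation Center, Northwestern Polytechnical University, Chongqing 401151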
Abstract
This paper proposes a novel multi-agent deep reinforcement learning algorithm for collaborative path planning of heterogeneous unmanned aerial vehicles (UAVs) in a dynamic, uncertain environment. First, a reinforcement learning environment is developed in which multiple UAVs fly to a target location in an airspace scenario; the environment incorporates the UAV dynamics equations and accounts for UAV heterogeneity and the need for safe obstacle avoidance. Second, performance metrics including task completion rate, formation maintenance rate, flight time, flight trajectory, and energy consumption are designed to evaluate the algorithm. Third, the multi-UAV collaborative path planning problem is modeled as a partially observable Markov decision process, and a multi-agent soft actor critic (MASAC) algorithm is proposed to find an approximately optimal policy for it. Finally, simulation experiments demonstrate the effectiveness and superiority of the proposed algorithm.
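As background for the soft actor critic family named in the abstract, the sketch below shows the entropy-regularized temporal-difference target that a generic SAC-style critic is usually trained against. It is not taken from the paper: the function name `soft_q_target`, the twin-critic minimum, and the default values of `gamma` and `alpha` are illustrative assumptions, and the abstract does not say how MASAC shares or centralizes critics across UAVs.

```python
def soft_q_target(reward, done, next_q1, next_q2, next_log_prob,
                  gamma=0.99, alpha=0.2):
    """Entropy-regularized TD target for one transition (generic SAC form).

    All argument names and defaults are hypothetical; they are not
    taken from the paper.
    """
    # Use the more pessimistic of two target-critic estimates
    # (the clipped double-Q trick common in SAC implementations).
    min_next_q = min(next_q1, next_q2)
    # Soft value of the next state: Q minus alpha * log pi(a'|o').
    # A more random policy has lower log-probability, so it earns a bonus.
    soft_value = min_next_q - alpha * next_log_prob
    # No bootstrapping past a terminal step.
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * soft_value


# One non-terminal and one terminal transition with made-up numbers.
target_running = soft_q_target(1.0, False, 2.0, 3.0, -1.0)
target_terminal = soft_q_target(0.0, True, 5.0, 4.0, -0.5)
```

In a multi-agent variant, the critic inputs behind `next_q1` and `next_q2` would typically depend on the observations and actions of several agents, but that design choice is an assumption here rather than something the abstract states.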
Keywords
multi-UAV/path planning/multi-agent deep reinforcement learning/partially observable Markov decision process/multi-agent soft actor critic (MASAC) algorithm
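Funding
National Natural Science Foundation of China (62073269)
Natural Science Foundation of Chongqing, General Program (CSTB2022NSCQ-MSX0963)
Aeronautical Science Foundation of China (2020Z034053002)
Guangdong Basic and Applied Basic Research Foundation, Natural Science Foundation General Program (2023A1515011220)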
Publication year
2024