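Research on Dispersion Strategy for Multiple Unmanned Ground Vehicles Based on Auction Multi-Agent Deep Deterministic Policy Gradient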

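Abstract (translated from the Chinese): Dispersion of multiple unmanned ground vehicles (multi-UGV) is widely used in military combat missions, but existing methods are complex, need long planning times, and have limited applicability. To address this, the paper proposes a multi-UGV dispersion strategy based on the auction multi-agent deep deterministic policy gradient (AU-MADDPG) algorithm. Building on a single-UGV model, it establishes a multi-UGV dispersion model based on deep reinforcement learning. The MADDPG structure is optimized: an auction algorithm computes the dispersion point assigned to each UGV such that the total path length is shortest, which reduces the randomness of dispersion-point allocation, and paths are then planned with the MADDPG algorithm, improving both training efficiency and running efficiency. The reward function is also optimized to cover two stages, during the training process and at its end, so that the constraints are considered comprehensively; the multi-constraint problem is thereby converted into a reward function design problem in which the reward is maximized. Simulation results show that, compared with the traditional MADDPG algorithm, the proposed algorithm shortens training time by 3.96% and reduces total path length by 14.50%, solves the dispersion problem more effectively, and can serve as a general solution for problems of this kind.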

English title: Research on Dispersion Strategy for Multiple Unmanned Ground Vehicles Based on Auction Multi-Agent Deep Deterministic Policy Gradient

English abstract: Dispersion of multiple Unmanned Ground Vehicles (multi-UGV) is commonly used in military combat missions. Existing dispersion methods are complex, time-consuming, and of limited applicability. To address these problems, a multi-UGV dispersion strategy based on the AUction Multi-Agent Deep Deterministic Policy Gradient (AU-MADDPG) algorithm is proposed. Building on the single unmanned vehicle model, a multi-UGV dispersion model is established based on deep reinforcement learning. The MADDPG structure is then optimized: the auction algorithm is used to calculate the dispersion point corresponding to each unmanned vehicle when the total path is shortest, which reduces the randomness of dispersion-point allocation, and paths are planned with the MADDPG algorithm to improve training efficiency and running efficiency. The reward function is optimized by taking into account both the training process and its end, so that the constraints are considered comprehensively. The multi-constraint problem is thus converted into a reward function design problem in which the reward function is maximized. The simulation results show that, compared with the traditional MADDPG algorithm, the proposed algorithm achieves a 3.96% reduction in training time and a 14.50% reduction in total path length, is more effective in solving the dispersion problem, and can be used as a general solution for dispersion problems.
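The abstract states that an auction algorithm assigns each vehicle a dispersion point so that the total path length is shortest, but it gives no implementation details. As a rough illustration only, the sketch below uses a generic Bertsekas-style forward auction, which is not necessarily the variant used in the paper. The function names `auction_assign` and `total_length`, the bid increment `eps`, the 2D coordinates, and the use of straight-line Euclidean distance as path length are all assumptions made for this example.

```python
import math


def auction_assign(vehicles, points, eps=1e-3):
    """Assign each vehicle one distinct point with a forward auction.

    Hypothetical helper, not taken from the paper. Returns a list whose
    entry i is the index of the point assigned to vehicle i. The total
    distance is within len(vehicles) * eps of the minimum.
    """
    n = len(vehicles)
    if n != len(points):
        raise ValueError("this sketch assumes equally many vehicles and points")
    # Benefit is negative distance, so maximizing benefit minimizes length.
    benefit = [[-math.dist(v, p) for p in points] for v in vehicles]
    prices = [0.0] * n          # current price of each point
    point_of = [None] * n       # point currently held by each vehicle
    vehicle_of = [None] * n     # vehicle currently holding each point
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        # Net value of every point for bidder i at current prices.
        values = [benefit[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=values.__getitem__)
        second = max((values[j] for j in range(n) if j != best),
                     default=values[best])
        # Raise the price by the bidder's margin plus eps.
        prices[best] += values[best] - second + eps
        previous = vehicle_of[best]
        if previous is not None:
            # The outbid vehicle goes back into the bidding queue.
            point_of[previous] = None
            unassigned.append(previous)
        vehicle_of[best] = i
        point_of[i] = best
    return point_of


def total_length(vehicles, points, assignment):
    """Sum of straight-line distances under the given assignment."""
    return sum(math.dist(vehicles[i], points[j])
               for i, j in enumerate(assignment))


# Illustrative coordinates, invented for this example.
vehicles = [(0, 0), (10, 0), (0, 10)]
points = [(9, 1), (1, 9), (1, 1)]
assignment = auction_assign(vehicles, points)
print(assignment, total_length(vehicles, points, assignment))
```

Fixing the vehicle-to-point assignment before training is what the abstract credits with reducing allocation randomness; in a setup like this, the assigned point would presumably serve as each agent's goal in the MADDPG path-planning stage.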

English keywords:

Path planning; Deep reinforcement learning; Multi-UGVs; Dispersion strategy; Auction algorithm

Authors:

郭宏达, 娄静涛, 杨珍珍, 徐友春

Affiliation:

Army Military Transportation University (陆军军事交通学院), Tianjin 300161

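Keywords (translated from the Chinese):

Path planning; deep reinforcement learning; multiple unmanned ground vehicles; dispersion strategy; auction algorithm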

Publication year:

2024

DOI:

10.11999/JEIT221582

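Journal of Electronics & Information Technology (电子与信息学报)

Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China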

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.302

ISSN: 1009-5896

Year, volume (issue): 2024, 46(1)

Number of references: 13