Research on a Path Planning Algorithm for Amphibious Unmanned Platforms Based on Proximal Policy Optimization

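Abstract (translated from the Chinese): To address path planning for amphibious unmanned platforms in complex environments, where traditional methods struggle with dynamic obstacles and changing conditions, a path planning algorithm based on proximal policy optimization (PPO) is proposed. It comprises four perception-information input schemes and a speed-enhanced reward function, and it adapts to both dynamic and static environments. Batch function regularization, an added policy entropy term, and an adaptive clipping factor significantly improve the algorithm's convergence speed and stability. The study uses the ROS simulation platform with the Flatland physics engine and the PedSim plugin to simulate a variety of complex scenarios containing dynamic obstacles. Experimental results show that an amphibious unmanned platform using the BEV+V state-space input structure and a discrete action space achieves a high success rate and a low timeout rate in path planning, outperforming traditional methods and the other schemes. Simulation and comparison experiments show that a state-space data structure combining the bird's-eye view (BEV) with velocity (V), paired with the speed-enhanced reward function, improves performance: convergence speed increases by 25.58%, the path planning success rate increases by 25.54%, and the timeout rate decreases by 13.73%.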
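The abstract names three modifications to PPO (an entropy term, a clipping factor that adapts during training, and regularization) without giving formulas. As a rough illustration only, the sketch below shows the standard PPO clipped surrogate objective with an entropy bonus, plus a simple linear decay of the clipping factor. The function names, the default coefficients, and the linear schedule are all assumptions made for this example; the paper's actual adaptive rule may well differ.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, entropies,
                       clip_eps=0.2, entropy_coef=0.01):
    """Mean clipped surrogate objective plus entropy bonus (to be maximized).

    Standard PPO form; clip_eps and entropy_coef defaults are illustrative
    assumptions, not values taken from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv, ent in zip(new_logp, old_logp, advantages, entropies):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the unclipped and clipped terms.
        surrogate = min(ratio * adv, clipped * adv)
        total += surrogate + entropy_coef * ent
    return total / len(advantages)


def linear_clip_schedule(progress, eps_start=0.2, eps_end=0.05):
    """Hypothetical adaptive clipping factor: linear decay over training.

    progress is the fraction of training completed, clamped to [0, 1].
    """
    progress = max(0.0, min(progress, 1.0))
    return eps_start + (eps_end - eps_start) * progress
```

In a training loop one would typically recompute `clip_eps = linear_clip_schedule(step / total_steps)` before each policy update and pass it to `ppo_clip_objective`; a larger clipping factor early on allows bigger policy changes, while a smaller one later favors stability.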

English title (as published): Path Planning Algorithm of Amphibious Unmanned Platform Based on Proximal Policy Optimization

English abstract (as published): To address the training speed and stability problems in local path planning for an amphibious unmanned platform, a proximal policy optimization (PPO) algorithm was improved, establishing a foundation for multi-sensory information input on the amphibious platform. Specifically, four perceptual information input schemes and a speed-enhanced reward function were proposed to adapt to dynamic and static environments. The experimental results show that the amphibious unmanned platform with the BEV+V state-space input structure and a discrete action space achieves a high success rate and a low timeout rate in path planning, and is superior to traditional methods and the other schemes. Simulation and comparative experiments show that the state-space data structure combining the bird's-eye view and speed, together with the speed-enhanced reward function, improves algorithm performance, increasing convergence speed by up to 25.58% and the path planning success rate by up to 25.54%, and reducing the timeout rate by 13.73%.

English keywords (as published):

path planning; amphibious unmanned platform; proximal policy optimization (PPO)

Authors:

左哲、覃卫、徐梓洋、李寓安、陈泰然

Author affiliation:

School of Mechanical Engineering (机械与车辆学院), Beijing Institute of Technology, Beijing 100081

Keywords (translated from the Chinese):

path planning; amphibious unmanned platform; proximal policy optimization (PPO)

Publication year:

2025

DOI:

10.15918/j.tbit1001-0645.2024.025

Transactions of Beijing Institute of Technology (北京理工大学学报)

Beijing Institute of Technology

Peking University Core Journal (北大核心)

Impact factor: 0.609

ISSN: 1001-0645

Year, volume (issue): 2025, 45(1)