Research on a Path Planning Algorithm for Amphibious Unmanned Platforms Based on Proximal Policy Optimization

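To address path planning for amphibious unmanned platforms in complex environments, where traditional methods struggle with dynamic obstacles and changing surroundings, a path planning algorithm based on proximal policy optimization (PPO) is proposed. It comprises four perceptual-information input schemes and a speed-enhanced reward function, and adapts to both dynamic and static environments. Batch function regularization, the introduction of policy entropy, and an adaptive clipping factor significantly improve the algorithm's convergence speed and stability. The study uses the ROS simulation platform with the Flatland physics engine and the PedSim plugin to simulate a variety of complex scenarios containing dynamic obstacles. Experimental results show that the amphibious unmanned platform using the BEV+V state-space input structure and a discrete action space achieves a high success rate and a low timeout rate in path planning, outperforming traditional methods and the other schemes. Simulation and comparative experiments show that combining a state-space data structure built from the bird's-eye view (BEV) and velocity with the speed-enhanced reward function improves performance: convergence speed increases by 25.58%, the path planning success rate rises by 25.54%, and the timeout rate falls by 13.73%.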
Path Planning Algorithm of Amphibious Unmanned Platform Based on Proximal Policy Optimization
To address the training speed and stability problems in local path planning for amphibious unmanned platforms, a proximal policy optimization (PPO) algorithm was improved, laying a foundation for multi-sensory information input on the amphibious platform. Four perceptual information input schemes and a speed-enhanced reward function were proposed to adapt to dynamic and static environments. The experimental results show that the amphibious unmanned platform with the BEV+V state-space input structure and a discrete action space achieves a high success rate and a low timeout rate in path planning, outperforming traditional methods and the other schemes. Simulation and comparative experiments show that a state-space data structure combining the bird's-eye view (BEV) with velocity, used together with the speed-enhanced reward function, improves algorithm performance, increasing convergence speed by up to 25.58% and the path planning success rate by up to 25.54%, and reducing the timeout rate by 13.73%.
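
The abstract builds on PPO but gives no formulas. For orientation only, the following is a minimal sketch of the standard PPO clipped surrogate objective with an entropy bonus, not the authors' implementation. The function name `ppo_clip_objective`, the default values of `clip_eps` and `entropy_coef`, and the use of a fixed (rather than adaptive) clipping factor are all assumptions made for illustration.

```python
from typing import Sequence


def ppo_clip_objective(
    ratios: Sequence[float],      # pi_new(a|s) / pi_old(a|s) per sample
    advantages: Sequence[float],  # advantage estimate per sample
    entropies: Sequence[float],   # policy entropy per sample
    clip_eps: float = 0.2,        # assumed value; not given in the abstract
    entropy_coef: float = 0.01,   # assumed value; not given in the abstract
) -> float:
    """Mean clipped surrogate plus entropy bonus (a quantity to maximize)."""
    if not ratios or not (len(ratios) == len(advantages) == len(entropies)):
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for r, a, h in zip(ratios, advantages, entropies):
        # Clip the probability ratio to [1 - eps, 1 + eps].
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        surrogate = min(r * a, clipped_r * a)
        # The entropy bonus rewards more exploratory policies.
        total += surrogate + entropy_coef * h
    return total / len(ratios)
```

An adaptive scheme would vary `clip_eps` during training; how the paper does so is not stated in the abstract, so the sketch leaves it as a plain argument.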

Keywords: path planning; amphibious; unmanned platform; proximal policy optimization (PPO)

左哲、覃卫、徐梓洋、李寓安、陈泰然

School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China


2025

Transactions of Beijing Institute of Technology
Beijing Institute of Technology


Peking University Core Journal
Impact factor: 0.609
ISSN: 1001-0645
Year, volume (issue): 2025, 45(1)