
Path Planning of Desert Robot Based on Deep Reinforcement Learning

Desert environments are complex and changeable, so obstacle avoidance and path planning are key to the efficient operation of mobile robots. To address the poor search efficiency and slow convergence of deep reinforcement learning algorithms in complex environments, an improved deep reinforcement learning path planning algorithm is proposed. The exploration factor is adjusted dynamically according to the degree of convergence of the algorithm, decreasing as the agent learns more about the environment, which speeds up convergence. To improve search efficiency, a dynamic reward function built on a quadratic function is designed, so that different actions yield different reward values. Simulation results show that, compared with the original algorithm, the improved algorithm reduces path length, number of iterations, and planning time by 11.9%, 32.6%, and 17.4%, respectively, and adapts better to complex environments.

Keywords: path planning; robot; deep reinforcement learning; exploration factor; reward function
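The abstract names two mechanisms, an exploration factor that falls as the algorithm converges and a reward built on a quadratic function, but gives no formulas for either. The Python sketch below is only one plausible reading, not the authors' method. Every name and constant in it (`dynamic_epsilon`, `quadratic_reward`, `select_action`, the use of a recent success rate as the convergence measure, the distance-to-goal progress term, `eps_max`, `eps_min`, `scale`) is an assumption made for illustration.

```python
import math
import random


def dynamic_epsilon(success_rate, eps_max=1.0, eps_min=0.05):
    """Exploration factor that falls as a convergence proxy rises.

    Assumption: `success_rate` in [0, 1] is the fraction of recent episodes
    in which the agent reached the goal; the abstract does not say which
    convergence measure the authors actually use.
    """
    success_rate = min(max(success_rate, 0.0), 1.0)  # clamp to [0, 1]
    return eps_min + (eps_max - eps_min) * (1.0 - success_rate)


def quadratic_reward(prev_dist, new_dist, scale=1.0):
    """Signed quadratic reward based on progress toward the goal.

    Assumption: progress is the reduction in distance to the goal caused by
    the chosen action. Steps toward the goal earn a reward that grows with
    the square of the progress; steps away are penalized symmetrically.
    """
    progress = prev_dist - new_dist
    return scale * math.copysign(progress ** 2, progress)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over a list of action values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: random action
    return max(range(len(q_values)), key=q_values.__getitem__)  # exploit


if __name__ == "__main__":
    # Early in training (few successes) the agent explores almost always;
    # late in training it mostly exploits its learned action values.
    for rate in (0.0, 0.5, 1.0):
        print(rate, dynamic_epsilon(rate))
    # A diagonal move that closes more distance earns a larger reward
    # than a straight move that closes less.
    print(quadratic_reward(5.0, 3.6), quadratic_reward(5.0, 4.0))
```

Tying the exploration factor to a measured signal rather than to a fixed step schedule lets exploration shrink only when the agent is actually succeeding, which is the behavior the abstract describes in words. The quadratic term makes the reward differ from action to action instead of being a single constant per step.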

Li Ming (李明), Ye Wangzhong (叶汪忠), Yan Jiehua (燕洁华)


College of Energy and Transportation Engineering, Inner Mongolia Agricultural University, Hohhot 010018, Inner Mongolia, China


2024

Journal of System Simulation (系统仿真学报)
Beijing Simulation Center; Chinese Association for System Simulation


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.551
ISSN: 1004-731X
Year, volume (issue): 2024, 36(12)