Path Planning of Desert Robot Based on Deep Reinforcement Learning

Li Ming (李明)¹, Ye Wangzhong (叶汪忠)¹, Yan Jiehua (燕洁华)¹

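Author information

1. College of Energy and Transportation Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia 010018, China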

Abstract

Because desert environments are complex and variable, obstacle avoidance and path planning are key to the efficient operation of a mobile robot. To address the poor search efficiency and slow convergence of deep reinforcement learning algorithms in complex environments, an improved deep reinforcement learning path planning algorithm is proposed. The exploration factor is adjusted dynamically according to the degree of convergence of the algorithm, so that it decreases as the agent learns more about the environment, which speeds up convergence. To improve search efficiency, a dynamic reward function based on a quadratic function is designed, so that different actions yield different reward values. Simulation results show that, compared with the original algorithm, the improved algorithm reduces path length, number of iterations, and planning time by 11.9%, 32.6%, and 17.4%, respectively, and adapts better to complex environments.
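The abstract does not give the formulas for the exploration factor or the reward function, so the following Python is only an illustrative sketch of the two ideas, not the authors' method. Every name and constant in it is an assumption: `success_rate` stands in for the "degree of convergence", the linear schedule and the bounds `eps_max` and `eps_min` are hypothetical, and the signed-square reward on the change in distance to the goal is just one possible way to use a quadratic function.

```python
import math


def exploration_factor(success_rate, eps_max=1.0, eps_min=0.05):
    """Return epsilon for an epsilon-greedy policy (hypothetical schedule).

    success_rate is an assumed proxy for the degree of convergence: the
    fraction of recent episodes (0.0 to 1.0) in which the agent reached
    the goal. Epsilon falls linearly from eps_max to eps_min as it rises.
    """
    rate = min(max(success_rate, 0.0), 1.0)  # clamp to [0, 1]
    return eps_max - (eps_max - eps_min) * rate


def quadratic_reward(dist_before, dist_after, scale=1.0):
    """Return a reward that grows quadratically with progress (assumed form).

    The sign follows the direction of movement: positive when the action
    reduces the distance to the goal, negative when it increases it.
    """
    delta = dist_before - dist_after
    return scale * math.copysign(delta * delta, delta)


if __name__ == "__main__":
    for rate in (0.0, 0.5, 1.0):
        eps = exploration_factor(rate)
        print(f"success rate {rate:.1f} -> epsilon {eps:.3f}")
    print(quadratic_reward(5.0, 4.0))  # action moved the robot closer
    print(quadratic_reward(4.0, 5.0))  # action moved the robot away
```

Under these assumptions, the agent explores almost at random early in training and relies mostly on its learned policy once it reaches the goal reliably, while larger steps toward the goal earn disproportionately larger rewards than small ones.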

Key words

path planning/robot/deep reinforcement learning/exploration factor/reward function

Publication year

2024

Journal of System Simulation (系统仿真学报)

Beijing Simulation Center; Chinese Association for System Simulation

Indexed in: CSTPCD, CSCD, Peking University Core Journals

Impact factor: 0.551

ISSN: 1004-731X