Path planning algorithm of mobile robot based on improved Q-Learning

As mobile robots are applied ever more deeply in production and daily life, their path planning must develop toward both rapidity and environmental adaptability. Existing reinforcement-learning-based path planning for mobile robots tends to fall into local optima and to search the same area repeatedly in the early stage of exploration, and suffers from a low convergence rate and slow convergence in the later stage; to address these problems, this study proposes an improved Q-Learning algorithm. The algorithm improves the Q-matrix initialization so that early-iteration exploration is directed and collisions are reduced; it improves the Q-matrix update rule so that updates are forward-looking and repeated exploration within a small area is avoided; and it improves the random exploration strategy so that environmental information is fully used in early iterations while later iterations move toward the goal point. Simulation results on different grid maps show that, built on the basic Q-Learning algorithm, these improvements shorten the path length during exploration, reduce oscillation, and accelerate convergence, giving the proposed algorithm higher computational efficiency.
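For readers unfamiliar with the baseline, the sketch below shows a minimal tabular Q-Learning loop on a small grid map with an ε-decreasing exploration schedule, i.e. the standard setup that the abstract's improvements target. It is an illustrative sketch only: the grid layout, reward values, hyperparameters, and helper names (`step`, `train`, `greedy_path`) are assumptions made here, and it does not implement the paper's improved Q-matrix initialization, forward-looking update, or modified exploration strategy.

```python
import numpy as np

# Minimal tabular Q-Learning sketch on a small grid map with an
# epsilon-decreasing exploration strategy. This is the standard baseline
# the paper builds on, NOT the authors' improved Q-matrix initialization,
# update rule, or exploration strategy; the grid, rewards and
# hyperparameters below are illustrative assumptions.

GRID = np.array([                                    # 0 = free cell, 1 = obstacle
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
])
START, GOAL = (0, 0), (4, 4)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]         # up, down, left, right

def step(state, action):
    """Apply an action; off-grid or obstacle moves keep the state and are
    penalised, reaching the goal is rewarded."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < GRID.shape[0] and 0 <= c < GRID.shape[1]) or GRID[r, c] == 1:
        return state, -5.0, False                    # collision / boundary penalty
    if (r, c) == GOAL:
        return (r, c), 100.0, True                   # goal reward
    return (r, c), -1.0, False                       # small step cost

def train(episodes=500, alpha=0.1, gamma=0.9, eps_start=0.9, eps_end=0.05):
    q = np.zeros((*GRID.shape, len(ACTIONS)))        # Q table: state x action
    for ep in range(episodes):
        # epsilon-decreasing: explore widely early, exploit more later
        eps = eps_end + (eps_start - eps_end) * (1 - ep / episodes)
        state, done = START, False
        for _ in range(200):                         # cap episode length
            if np.random.rand() < eps:
                a = np.random.randint(len(ACTIONS))  # random exploration
            else:
                a = int(np.argmax(q[state]))         # greedy exploitation
            nxt, reward, done = step(state, ACTIONS[a])
            # standard one-step Q-Learning update
            q[state][a] += alpha * (reward + gamma * np.max(q[nxt]) - q[state][a])
            state = nxt
            if done:
                break
    return q

def greedy_path(q, max_len=50):
    """Follow the learned greedy policy from START toward GOAL."""
    path, state = [START], START
    while state != GOAL and len(path) < max_len:
        state, _, _ = step(state, ACTIONS[int(np.argmax(q[state]))])
        path.append(state)
    return path

if __name__ == "__main__":
    q_table = train()
    print(greedy_path(q_table))
```

In this structure, the improvements described in the abstract would correspond to replacing the uniform zero initialization of `q`, the one-step update inside the training loop, and the fixed random action choice with the paper's directed initialization, forward-looking update, and stage-dependent exploration strategy.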

path planning; reinforcement learning; mobile robot; Q-Learning algorithm; ε-decreasing strategy

王立勇、王弘轩、苏清华、王绅同、张鹏博

Key Laboratory of Modern Measurement and Control Technology, Ministry of Education, Beijing Information Science and Technology University, Beijing 100192, China

Foundation Strengthening Program (2021JCJQJJ0022); National Natural Science Foundation of China (52175074)

2024

Electronic Measurement Technology (电子测量技术)
Beijing Radio Technology Research Institute (北京无线电技术研究所)

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.166
ISSN: 1002-7300
Year, Volume (Issue): 2024, 47(9)