Adaptive Q-learning path planning algorithm based on virtual target guidance


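Abstract: When classical reinforcement learning algorithms are applied to robot path planning in unknown environments, they suffer from low exploration efficiency, slow convergence, a tendency to fall into terrain traps, and blind exploration caused by the lack of intermediate states in the learning process. To address these problems, a dual memory mechanism, a virtual target guidance method, and an adaptive greedy factor were designed, and an adaptive Q-learning algorithm based on virtual target guidance was proposed. Four environment maps were designed, comparative simulation experiments were run against other improved algorithms, and the algorithm's performance was verified in a virtual simulation experiment with a four-wheel-drive Mecanum wheel robot. The experimental results show that the new algorithm significantly reduces the number of iterations, speeds up the convergence of reinforcement learning, and is robust to complex environments. It effectively avoids terrain traps, improves the performance of the mobile robot navigation system, and provides a reference for autonomous path planning of mobile robots.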

English title: Adaptive Q-learning path planning algorithm based on virtual target guidance

English abstract: When classical reinforcement learning algorithms are used for robot path planning in unknown environments, they suffer from low exploration efficiency, slow convergence, a tendency to fall into terrain traps, and a lack of intermediate states in the learning process, which makes exploration blind. To address these problems, a dual memory mechanism, a virtual target guidance method, and an adaptive greedy factor were designed, and an adaptive Q-learning algorithm based on virtual target guidance (VTGA-Q-Learning) was proposed. To verify the performance of the new algorithm, four kinds of environment maps were designed and simulation experiments were run against other improved algorithms. Furthermore, a virtual simulation experiment with a four-wheel-drive Mecanum wheel robot was carried out to simulate a real environment and verify the algorithm's performance. The experimental results showed that the proposed algorithm significantly reduced the number of iterations, improved the convergence speed of reinforcement learning, and was robust to complex environments. It effectively avoided terrain traps and improved the performance of the mobile robot navigation system, providing a reference for autonomous path planning of mobile robots.
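The abstract names an adaptive greedy factor but does not state its update rule. Purely to illustrate the general idea of an exploration rate that shrinks as learning proceeds, the following minimal Python sketch runs plain tabular Q-learning on a small grid map. The linear decay schedule, the reward values, the example map, and all function and parameter names are assumptions made for this example. The dual memory mechanism and the virtual target guidance are not reproduced, so this is not the VTGA-Q-Learning algorithm itself.

```python
import random

# Four grid moves: up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def step(grid, goal, state, action):
    """Apply one move. Rewards are assumed values chosen for this sketch."""
    r = state[0] + ACTIONS[action][0]
    c = state[1] + ACTIONS[action][1]
    blocked = not (0 <= r < len(grid) and 0 <= c < len(grid[0])) or grid[r][c] == 1
    if blocked:
        return state, -1.0, False      # stay in place, collision penalty
    if (r, c) == goal:
        return (r, c), 10.0, True      # reached the goal
    return (r, c), -0.1, False         # small cost per move


def train(grid, start, goal, episodes=500, alpha=0.5, gamma=0.9,
          eps_max=0.9, eps_min=0.05, seed=0):
    """Tabular Q-learning with an exploration rate that decays over episodes."""
    rng = random.Random(seed)
    rows, cols = len(grid), len(grid[0])
    q = {(r, c): [0.0] * 4 for r in range(rows) for c in range(cols)}
    for ep in range(episodes):
        # Assumed schedule: linear decay from eps_max down to eps_min.
        eps = max(eps_min, eps_max * (1 - ep / episodes))
        state = start
        for _ in range(4 * rows * cols):   # cap on steps per episode
            if rng.random() < eps:
                action = rng.randrange(4)  # explore
            else:
                action = max(range(4), key=lambda a: q[state][a])  # exploit
            nxt, reward, done = step(grid, goal, state, action)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, grid, start, goal, max_steps=100):
    """Follow the highest-valued action from start until goal or step limit."""
    path, state = [start], start
    for _ in range(max_steps):
        if state == goal:
            break
        action = max(range(4), key=lambda a: q[state][a])
        state, _, _ = step(grid, goal, state, action)
        path.append(state)
    return path


if __name__ == "__main__":
    # Hypothetical 4x4 map: 0 is free space, 1 is an obstacle.
    demo_grid = [[0, 0, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 1, 0, 0]]
    q_table = train(demo_grid, (0, 0), (3, 3))
    print(greedy_path(q_table, demo_grid, (0, 0), (3, 3)))
```

A high exploration rate early on lets the agent find the goal at all, while the later low rate lets the learned values settle. An adaptive scheme like the one the abstract mentions would presumably tie the rate to learning progress rather than to the episode count alone.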

English keywords: Q-learning; path planning; reinforcement learning; mobile robots

Authors: 李子怡, 胡祥涛, 张勇乐, 许建军

Affiliation: School of Electrical Engineering and Automation, Anhui University, Hefei 230601, Anhui, China

Keywords: Q-learning; path planning; reinforcement learning; mobile robots

Funding: National Natural Science Foundation of China

Project number: 52175210

Year of publication: 2024

DOI: 10.13196/j.cims.2022.0733

Computer Integrated Manufacturing Systems (计算机集成制造系统)

No. 210 Research Institute of China North Industries Group Corporation

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.092

ISSN: 1006-5911

Year, volume (issue): 2024, 30(2)

Number of references: 40