Path planning for mobile robots based on self-adaptive exploration DDQN
To address issues such as the imbalanced allocation of exploration and exploitation, as well as insufficient data utilization, in the traditional double deep Q-network (DDQN) algorithm for path planning, an improved DDQN path planning algorithm is proposed. First, the concept of exploration success rate is introduced into the adaptive exploration strategy, dividing the training process into an exploration phase and an exploitation phase so that exploration and exploitation are allocated sensibly. Second, a double-experience-pool mixed sampling mechanism partitions and samples experience data by reward size, maximizing the utilization of beneficial data. Finally, a reward function based on the artificial potential field is designed so that the robot receives more single-step rewards, effectively alleviating the sparse-reward problem. Experimental results show that, compared with the traditional DDQN algorithm and a DDQN algorithm based on experience partitioning and multi-step guidance, the proposed algorithm achieves higher reward values, a higher success rate, and shorter planning times and step counts, demonstrating superior overall performance.
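The double-experience-pool mixed sampling mechanism summarized above can be sketched as follows. This is an illustrative Python sketch, not the paper's code: the class name `DualReplayBuffer`, the reward threshold, and the mixing ratio `high_ratio` are assumptions made for demonstration.

```python
import random
from collections import deque

class DualReplayBuffer:
    """Illustrative sketch of reward-partitioned mixed sampling:
    transitions are split into two pools by reward size, and each
    training batch draws a fixed fraction from the high-reward pool."""

    def __init__(self, capacity=10000, reward_threshold=0.0, high_ratio=0.6):
        self.high = deque(maxlen=capacity)  # transitions with reward above threshold
        self.low = deque(maxlen=capacity)   # remaining transitions
        self.threshold = reward_threshold   # assumed split criterion
        self.high_ratio = high_ratio        # fraction of each batch from the high pool

    def push(self, state, action, reward, next_state, done):
        # Partition the experience by reward size at insertion time.
        pool = self.high if reward > self.threshold else self.low
        pool.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Mixed sampling: draw from both pools, biased toward high-reward data.
        n_high = min(int(batch_size * self.high_ratio), len(self.high))
        n_low = min(batch_size - n_high, len(self.low))
        return random.sample(list(self.high), n_high) + random.sample(list(self.low), n_low)
```

Compared with a single uniform replay buffer, this scheme replays beneficial (high-reward) transitions more often, which is the intent of the mechanism described in the abstract.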

path planning; DDQN; self-adaptive exploration; double experience pool; artificial potential field

LENG Zhongtao, ZHANG Lieping, PENG Jiansheng, WANG Yilin, ZHANG Cui


Guangxi Universities Key Laboratory of Advanced Manufacturing and Automation Technology, Guilin University of Technology, Guilin 541006, China

Guangxi Key Laboratory of Special Engineering Equipment and Control, Guilin University of Aerospace Technology, Guilin 541004, China

Guangxi Universities Key Laboratory of Artificial Intelligence and Information Processing, Hechi University, Hechi 546300, China

Guilin Mingfu Robot Technology Co., Ltd., Guilin 541004, China

School of Information Engineering, Nanning College of Technology, Guilin 541006, China



2024

Electronic Measurement Technology
Beijing Radio Technology Research Institute


Indexed by: CSTPCD; Peking University Core Journals
Impact factor: 1.166
ISSN: 1002-7300
Year, Volume (Issue): 2024, 47(22)