Dynamic obstacle avoidance path planning based on improved DQN

To address the problem that, in path planning with dynamic obstacles, mobile robots driven by a traditional deep Q-learning network (DQN) collide frequently during exploration and struggle to reach the target point, an improved DQN algorithm is proposed that modifies both the exploration strategy and the experience replay mechanism. In the exploration strategy, the rapidly-exploring random tree (RRT) algorithm automatically generates static prior knowledge to guide action selection, replacing the random actions of the ε-greedy strategy and raising the rate at which the agent reaches its target. In experience utilization, a clustered experience replay mechanism is designed with the K-means algorithm: stored experiences are clustered by the position information of the dynamic obstacles, and those most similar to the agent's current state are preferentially sampled for replay, so the agent learns to avoid collisions with dynamic obstacles more effectively. Simulation experiments in a two-dimensional grid environment show that, in a dynamic environment, the algorithm avoids both static and dynamic obstacles and successfully reaches the target point, verifying its feasibility for dynamic obstacle avoidance path planning.
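As a rough illustration of the exploration change, the sketch below replaces the random branch of ε-greedy with a step along an RRT path planned over the static obstacle map. This is a minimal sketch under stated assumptions, not the paper's implementation: the 4-connected action set, the binary occupancy grid, and the helper names plan_rrt and select_action are assumptions, and q_net is assumed to map a state to a vector of Q-values.

import random
import numpy as np

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def plan_rrt(grid, start, goal, max_iters=5000):
    # Grow a rapidly-exploring random tree over free cells, one cell per
    # extension, then walk back through the parent map to recover a path.
    parent = {start: None}
    for _ in range(max_iters):
        sample = goal if random.random() < 0.1 else (
            random.randrange(grid.shape[0]), random.randrange(grid.shape[1]))
        near = min(parent, key=lambda n: abs(n[0] - sample[0]) + abs(n[1] - sample[1]))
        dx, dy = sample[0] - near[0], sample[1] - near[1]
        step = (int(np.sign(dx)), 0) if abs(dx) >= abs(dy) else (0, int(np.sign(dy)))
        new = (near[0] + step[0], near[1] + step[1])
        if (new in parent
                or not (0 <= new[0] < grid.shape[0] and 0 <= new[1] < grid.shape[1])
                or grid[new] == 1):          # cell occupied by a static obstacle
            continue
        parent[new] = near
        if new == goal:
            path = [new]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
    return None

def select_action(q_net, state, pos, prior_path, epsilon):
    # Epsilon-greedy whose explore branch follows the static RRT prior.
    if random.random() < epsilon:
        if prior_path and pos in prior_path[:-1]:
            nxt = prior_path[prior_path.index(pos) + 1]
            return ACTIONS.index((nxt[0] - pos[0], nxt[1] - pos[1]))
        return random.randrange(len(ACTIONS))  # off the prior: plain random
    return int(np.argmax(q_net(state)))        # exploit the learned Q-values

Since the prior knowledge is static, prior_path = plan_rrt(static_grid, start, goal) would be computed once per map rather than per step, keeping exploration cheap.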
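The clustered replay mechanism can be sketched in the same spirit. The version below assumes each stored transition is tagged with a flattened vector of dynamic-obstacle coordinates; a K-means model (here scikit-learn's) is refit periodically, and most of each batch is drawn from the cluster whose centroid is nearest to the current obstacle layout. The cluster count, refit period, and 80/20 sampling split are illustrative guesses, not values from the paper.

import random
import numpy as np
from sklearn.cluster import KMeans

class ClusteredReplayBuffer:
    # Replay buffer that clusters transitions by dynamic-obstacle positions
    # and biases sampling toward the cluster matching the current layout.

    def __init__(self, capacity=50000, n_clusters=4, refit_every=1000):
        self.transitions = []   # (state, action, reward, next_state, done)
        self.features = []      # flattened dynamic-obstacle coordinates
        self.capacity, self.n_clusters = capacity, n_clusters
        self.refit_every, self.kmeans, self.t = refit_every, None, 0

    def push(self, transition, obstacle_vec):
        if len(self.transitions) >= self.capacity:   # drop the oldest entry
            self.transitions.pop(0)
            self.features.pop(0)
        self.transitions.append(transition)
        self.features.append(np.asarray(obstacle_vec, dtype=float))
        self.t += 1
        # Periodically re-cluster the buffer on the obstacle features.
        if self.t % self.refit_every == 0 and len(self.features) >= self.n_clusters:
            self.kmeans = KMeans(n_clusters=self.n_clusters, n_init=10).fit(
                np.stack(self.features))

    def sample(self, batch_size, current_obstacle_vec, similar_frac=0.8):
        if self.kmeans is None:                      # cold start: uniform
            return random.choices(self.transitions, k=batch_size)
        labels = self.kmeans.predict(np.stack(self.features))
        cur = self.kmeans.predict(
            np.asarray(current_obstacle_vec, dtype=float)[None, :])[0]
        pool = np.flatnonzero(labels == cur).tolist()
        n_sim = min(int(batch_size * similar_frac), len(pool))
        idx = random.sample(pool, n_sim)             # similar experiences
        idx += random.choices(range(len(self.transitions)),
                              k=batch_size - n_sim)  # rest drawn uniformly
        return [self.transitions[i] for i in idx]

A DQN training loop would then call push with the obstacle coordinates observed at each step and call sample with the current ones before each gradient update, so updates emphasize experiences collected under similar obstacle configurations.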