Mapless navigation based on deep reinforcement learning for mobile robots

户高铭¹, 蔡克卫², 王芳¹, 康玉伟¹, 张家旭¹, 金兆一¹, 林远山¹

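Author information

1. College of Information Engineering, Dalian Ocean University, Dalian 116023, Liaoning, China; Key Laboratory of Facility Fisheries, Ministry of Education, Dalian Ocean University, Dalian 116023, Liaoning, China; Liaoning Provincial Key Laboratory of Marine Information Technology, Dalian 116023, Liaoning, China
2. College of Electromechanical Engineering, Dalian Minzu University, Dalian 116023, Liaoning, China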

Abstract

Traditional navigation methods depend on map accuracy and adapt poorly to dynamic, complex scenes. To address this, a mapless autonomous navigation algorithm based on deep reinforcement learning with curriculum learning is proposed. To overcome the difficulty an agent has in learning under sparse rewards, a curriculum-guided deep reinforcement learning training method based on a circle of competence is proposed, drawing on the idea of curriculum learning. In addition, to make better use of the robot's current collision information when making action decisions, the concept of collision probability is introduced: the obstacle information currently perceived by the robot is represented in a high-level semantic form and encoded into the robot's current observation as part of the input to the navigation policy. This simplifies the mapping from observation to action and further reduces the difficulty of learning. Experimental results show that the proposed curriculum-guided training and collision probability markedly accelerate the convergence of the navigation policy, that the learned policy achieves a success rate above 90% in spatially larger scenes, and that travel time is reduced by 53.5% to 73.1%. The method can provide reliable navigation for unmanned operations in unstructured, unknown environments.
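The two ideas named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration and not the authors' method: the class `CompetenceCircle`, its radius-growth rule and thresholds, the distance-based heuristic in `collision_probability`, and the observation layout in `build_observation` are all hypothetical.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class CompetenceCircle:
    """Hypothetical curriculum: goals are sampled inside a circle whose
    radius grows once the agent succeeds often enough (all values assumed)."""
    radius: float = 1.0        # current goal-sampling radius, metres
    max_radius: float = 10.0   # upper bound on the radius
    growth: float = 1.2        # multiplicative growth factor
    threshold: float = 0.8     # success rate required to expand
    window: int = 50           # episodes per evaluation window
    _results: list = field(default_factory=list, init=False, repr=False)

    def sample_goal(self, start, rng=random):
        """Sample a goal uniformly (by area) within the circle around start."""
        r = self.radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        return (start[0] + r * math.cos(theta), start[1] + r * math.sin(theta))

    def report(self, success):
        """Record an episode outcome; expand the circle after a good window."""
        self._results.append(bool(success))
        if len(self._results) >= self.window:
            rate = sum(self._results) / len(self._results)
            if rate >= self.threshold:
                self.radius = min(self.radius * self.growth, self.max_radius)
            self._results.clear()


def collision_probability(ranges, safe_dist=0.3, sense_dist=2.0):
    """Assumed heuristic stand-in: map the nearest range reading to [0, 1],
    1.0 at or below safe_dist and 0.0 at or beyond sense_dist."""
    d = min(ranges)
    if d <= safe_dist:
        return 1.0
    if d >= sense_dist:
        return 0.0
    return (sense_dist - d) / (sense_dist - safe_dist)


def build_observation(ranges, goal_dist, goal_angle):
    """Append the collision probability to the raw observation (assumed layout)."""
    return list(ranges) + [goal_dist, goal_angle, collision_probability(ranges)]
```

In a training loop under these assumptions, each episode would call `sample_goal` to place the target, pass `build_observation(...)` to the policy, and call `report(success)` at the end, so that sparse goal-reaching rewards are first encountered at short range before the task is made harder.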

Key words

mobile robot/autonomous navigation/mapless navigation/deep reinforcement learning/curriculum learning

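Funding

National Natural Science Foundation of China (61603067)

Natural Science Foundation of Liaoning Province (2020-KF-12-09)

Dalian High-Level Talent Innovation Support Program (2017RQ053)

Key Research and Development Program of Liaoning Province (2020JH2/10100043)

Project of the Education Department of Liaoning Province (LJKZ0730)

Project of the Education Department of Liaoning Province (QL202016)

Project of the Education Department of Liaoning Province (JL202015)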

Publication year

2024

Control and Decision (控制与决策)

Northeastern University

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 1.227

ISSN: 1001-0920

Number of references: 25
