Unmanned driving path planning based on deep reinforcement learning (基于深度强化学习的无人驾驶路径规划研究)


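Chinese abstract (translated): The Deep Deterministic Policy Gradient (DDPG) algorithm suffers from unstable convergence and low learning efficiency when training neural networks. To address this, a reward-guided variant, Reward Guidance DDPG (RG_DDPG), is proposed. Within each episode the algorithm builds a set of excellent experiences, which helps the intelligent car make full use of effective past information and arrive at a stable control policy. A reward-based prioritized experience replay mechanism breaks the correlation between samples, raises data utilization, reduces the blindness of the search, and improves convergence stability. The algorithm was validated on the Robot Operating System (ROS): an intelligent car model and an obstacle environment were built in the Gazebo modeling software, and the decision-making algorithm was used to plan a safe driving path for the car. The results confirm that RG_DDPG is effective for path planning tasks. Compared with DDPG, the speed of the intelligent car can be increased by 60.5%, the reward obtained more than doubles, and convergence is more stable. Finally, real-vehicle experiments confirm the practicality of the algorithm.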

English title: Unmanned driving path planning based on deep reinforcement learning

English abstract: To address the unstable convergence and low learning efficiency of the Deep Deterministic Policy Gradient (DDPG) algorithm when training neural networks, a Reward Guidance DDPG (RG_DDPG) algorithm is proposed. The algorithm creates a set of excellent experiences within each episode, which guides the intelligent car to make full use of effective past information and obtain a stable control strategy. A reward-based prioritized experience replay mechanism is adopted to break the correlation between data, improve data utilization, reduce the blindness of the search process, and improve the convergence stability of the algorithm. The algorithm is verified on the Robot Operating System (ROS). In the Gazebo modeling software, an intelligent car model and an obstacle environment are designed, and the decision-making algorithm is used to plan safe driving paths for the intelligent car. The results verify the effectiveness of the RG_DDPG algorithm in handling path planning tasks. Compared with the DDPG algorithm, the speed of the improved intelligent car can be increased by 60.5%, the reward obtained is more than doubled, and the convergence stability of the algorithm is better. Finally, the feasibility of the algorithm is verified by real-vehicle experiments.
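The abstract names two mechanisms, a reward-based prioritized experience replay and a per-episode set of excellent experiences, without giving their formulas. The following is only a minimal sketch of how such mechanisms might look, not the authors' implementation. The class `RewardPrioritizedReplay`, the helper `excellent_transitions`, the shift-by-minimum priority rule, and the `eps` and `keep` parameters are all hypothetical choices made for illustration.

```python
import random
from collections import deque


class RewardPrioritizedReplay:
    """Replay buffer that samples transitions with probability increasing in reward.

    Hypothetical sketch: the priority rule below is an assumption, not the
    rule used by RG_DDPG.
    """

    def __init__(self, capacity, eps=1e-3, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped first
        self.eps = eps                        # keeps the lowest-reward sample reachable
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def priorities(self):
        # Shift rewards so the smallest one maps to eps; all weights stay positive.
        rewards = [t[2] for t in self.buffer]
        low = min(rewards)
        return [r - low + self.eps for r in rewards]

    def sample(self, batch_size):
        if not self.buffer:
            raise ValueError("cannot sample from an empty buffer")
        # Weighted sampling with replacement; high-reward transitions are drawn more often.
        return self.rng.choices(list(self.buffer), weights=self.priorities(), k=batch_size)


def excellent_transitions(episode, keep):
    """Return the `keep` highest-reward transitions of one episode (assumed selection rule)."""
    return sorted(episode, key=lambda t: t[2], reverse=True)[:keep]
```

In a DDPG-style training loop, one plausible use would be to call `add` after every environment step, feed `sample` batches to the actor and critic updates, and re-insert the output of `excellent_transitions` at the end of each episode. Whether RG_DDPG combines the two mechanisms this way is not stated in the abstract.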

English keywords: intelligent car; unmanned driving; path planning; deep deterministic policy gradient; reward guidance

Authors: Zhao Tianliang (赵天亮), Zhang Xiaojun (张小俊), Zhang Minglu (张明路), Chen Jianwen (陈建文)


Affiliation: School of Mechanical Engineering, Hebei University of Technology, Tianjin 300401

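Keywords (original Chinese): 智能汽车 (intelligent car); 无人驾驶 (unmanned driving); 路径规划 (path planning); 深度确定性策略梯度 (deep deterministic policy gradient); 奖励指导 (reward guidance)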

Funding: Tianjin New-Generation Artificial Intelligence Science and Technology Major Project

Project number: 18ZXZNGX00230

Publication year: 2024

DOI: 10.14081/j.cnki.hgdxb.2024.04.002

Journal: Journal of Hebei University of Technology (河北工业大学学报)

Hebei University of Technology

CSTPCD

Impact factor: 0.344

ISSN: 1007-2373

Year, volume (issue): 2024, 53(4)