Research on Unmanned Driving Path Planning Based on Deep Reinforcement Learning (基于深度强化学习的无人驾驶路径规划研究)

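To address the unstable convergence and low learning efficiency of the Deep Deterministic Policy Gradient (DDPG) algorithm when training neural networks, a reward-guided DDPG algorithm (Reward Guidance DDPG, RG_DDPG) is proposed. The algorithm builds a set of excellent experiences within each episode, which guides the intelligent car to make full use of effective past information and obtain a stable control policy. It adopts a reward-based prioritized experience replay mechanism that breaks the correlation between data samples, raises data utilization, reduces the blindness of the search process, and improves convergence stability. The algorithm was verified on the Robot Operating System (ROS). An intelligent car model and an obstacle environment were designed in the Gazebo modeling software, and the decision-making algorithm was used to plan a safe driving path for the intelligent car. The results verify the effectiveness of RG_DDPG on the path planning task: compared with DDPG, the speed of the intelligent car increases by 60.5%, the reward obtained more than doubles, and convergence is more stable. Finally, real-vehicle experiments verify the practicality of the algorithm.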
Unmanned driving path planning based on deep reinforcement learning
To address the unstable convergence and low learning efficiency of the Deep Deterministic Policy Gradient (DDPG) algorithm when training neural networks, a Reward Guidance DDPG (RG_DDPG) algorithm is proposed. The algorithm creates a set of excellent experiences within each episode, which guides the intelligent car to make full use of effective past information and obtain a stable control strategy. A reward-based prioritized experience replay mechanism is adopted to break the correlation between data samples, improve data utilization, reduce the blindness of the search process, and improve the convergence stability of the algorithm. The algorithm is verified on the Robot Operating System (ROS). In the Gazebo modeling software, an intelligent car model and an obstacle environment are designed, and the decision-making algorithm is used to plan safe driving paths for the intelligent car. The results verify the effectiveness of the RG_DDPG algorithm in handling path planning tasks. Compared with the DDPG algorithm, the speed of the improved intelligent car increases by 60.5%, the reward obtained more than doubles, and the convergence stability of the algorithm is better. Finally, the feasibility of the algorithm is verified by real-vehicle experiments.
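
The abstract names two mechanisms, an in-episode set of excellent experiences and reward-based prioritized experience replay, but does not give their exact rules. The following is a minimal sketch under stated assumptions, not the RG_DDPG implementation: the class `RewardPrioritizedReplay`, the helper `best_of_episode`, the shifted-reward priority rule, and the `keep_ratio` parameter are all hypothetical and chosen only for illustration.

```python
import random
from collections import deque


class RewardPrioritizedReplay:
    """Replay buffer that samples transitions with probability tied to reward.

    Hypothetical illustration: the priority rule below is an assumption,
    not the definition used by RG_DDPG.
    """

    def __init__(self, capacity=10000, eps=1e-3, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first
        self.eps = eps                        # keeps every priority strictly positive
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def priorities(self):
        # Assumed rule: shift rewards so the lowest stored reward gets priority eps.
        rewards = [t[2] for t in self.buffer]
        low = min(rewards)
        return [r - low + self.eps for r in rewards]

    def sample(self, batch_size):
        # Weighted sampling with replacement; drawing at random breaks the
        # temporal correlation between consecutive transitions.
        return self.rng.choices(
            list(self.buffer), weights=self.priorities(), k=batch_size
        )

    def __len__(self):
        return len(self.buffer)


def best_of_episode(episode, keep_ratio=0.2):
    """Return the highest-reward transitions of one episode.

    Stands in for the 'excellent experience' set; the ratio is an assumption.
    """
    k = max(1, int(len(episode) * keep_ratio))
    return sorted(episode, key=lambda t: t[2], reverse=True)[:k]
```

In a DDPG training loop, a buffer like this would take the place of the uniform replay buffer, and the transitions returned by `best_of_episode` could be re-added at the end of each episode so that high-reward behavior is revisited more often. How RG_DDPG actually combines the two is not stated in the abstract.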

intelligent car; unmanned driving; path planning; deep deterministic policy gradient; reward guidance

Zhao Tianliang (赵天亮), Zhang Xiaojun (张小俊), Zhang Minglu (张明路), Chen Jianwen (陈建文)

School of Mechanical Engineering, Hebei University of Technology, Tianjin 300401, China

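intelligent car (智能汽车); unmanned driving (无人驾驶); path planning (路径规划); deep deterministic policy gradient (深度确定性策略梯度); reward guidance (奖励指导)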

Funded by the Tianjin New-Generation Artificial Intelligence Science and Technology Major Project

18ZXZNGX00230

2024

Journal of Hebei University of Technology
Hebei University of Technology

CSTPCD
Impact factor: 0.344
ISSN: 1007-2373
Year, volume (issue): 2024, 53(4)