Research on Intelligent Vehicle Behavior Decision-Making Based on Deep Reinforcement Learning

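Abstract (translated from the Chinese): The decision-making system of an autonomous vehicle directly affects the vehicle's overall driving performance and is one of the key problems to be solved in realizing autonomous driving. Based on the deep reinforcement learning algorithm DDPG (deep deterministic policy gradient), an end-to-end driving behavior decision-making model is proposed for this problem. First, drawing on a driver model, state-space information totaling 64 dimensions, covering the ego vehicle, the road, and interfering vehicles, is selected as the input data set for training the decision model, which outputs reasonable driving behaviors and control quantities. To address abrupt changes in the reward and the control quantities during training and testing, the DDPG decision model is improved to optimize the decision and control performance, and simulation experiments are carried out on the TORCS (The Open Racing Car Simulator) platform for validation. The results show that the proposed decision model can output reasonable driving behaviors and control quantities based on the real-time state of the vehicle and the environment. Compared with the DDPG model, the improved model has better control accuracy, significantly reduces the vehicle's lateral speed, and clearly improves ride comfort and vehicle stability.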

English title: Intelligent Vehicles Behavior Decision-making Based on Deep Reinforcement Learning

English abstract: The decision-making system of an autonomous vehicle has a direct influence on driving performance. It is one of the key challenges to be addressed to realize fully autonomous driving. To solve this problem, a driving decision-making system based on the deep reinforcement learning algorithm deep deterministic policy gradient (DDPG) was proposed. First, a total of 64 dimensions of state-space information, such as ego vehicle information, road information, and obstacle vehicle information, were selected on the basis of a driver model as input variables of the constructed model. Then the decision-making model was trained to output reasonable driving behaviors and control variable values. Finally, to address abrupt changes in the reward value and the control variable values, the DDPG decision model was improved to optimize the decision control effect. To verify the performance of the proposed decision-making model, simulation experiments were conducted on The Open Racing Car Simulator (TORCS) platform. The results show that the proposed decision-making model can output reasonable driving behaviors and accurate control quantities based on real-time state information of the vehicle and the environment. Compared with the DDPG model, the improved decision-making model has better control accuracy, significantly reduces vehicle lateral speed, and improves vehicle comfort and stability.
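As a rough illustration of the kind of DDPG pipeline the abstract describes, the following minimal sketch shows a deterministic actor mapping a 64-dimensional state to bounded control values, together with the standard DDPG critic target and soft target-network update. Only the 64-dimensional input comes from the abstract. The three-action output (steering, acceleration, brake), the hidden width, the hyperparameters, and all function names are assumptions made for illustration. The paper's actual network structure and its improvement to DDPG are not given here and are not reproduced.

```python
import math
import random

STATE_DIM = 64   # from the abstract: 64-dimensional state input
ACTION_DIM = 3   # assumption: steering, acceleration, brake (TORCS-style)
HIDDEN = 16      # assumption: hidden width chosen only for illustration


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def linear(layer, x):
    """Affine map W x + b for one layer."""
    w, b = layer
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def actor(params, state):
    """Deterministic policy: state -> (steer, accel, brake)."""
    h = [max(0.0, v) for v in linear(params[0], state)]  # ReLU hidden layer
    out = linear(params[1], h)
    steer = math.tanh(out[0])  # bounded to [-1, 1]
    accel = sigmoid(out[1])    # bounded to [0, 1]
    brake = sigmoid(out[2])    # bounded to [0, 1]
    return steer, accel, brake


def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target: y = r + gamma * Q'(s', mu'(s')), or r at episode end."""
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target, online, tau=0.001):
    """Polyak averaging of target-network parameters, in place."""
    for (tw, tb), (ow, ob) in zip(target, online):
        for t_row, o_row in zip(tw, ow):
            for j, o in enumerate(o_row):
                t_row[j] = (1.0 - tau) * t_row[j] + tau * o
        for j, o in enumerate(ob):
            tb[j] = (1.0 - tau) * tb[j] + tau * o


rng = random.Random(0)
online = [init_layer(STATE_DIM, HIDDEN, rng), init_layer(HIDDEN, ACTION_DIM, rng)]
target = [init_layer(STATE_DIM, HIDDEN, rng), init_layer(HIDDEN, ACTION_DIM, rng)]
state = [rng.uniform(-1.0, 1.0) for _ in range(STATE_DIM)]  # placeholder state
steer, accel, brake = actor(online, state)
soft_update(target, online)
```

Bounding the outputs with `tanh` and `sigmoid` keeps every control value inside a fixed range, but it does not by itself prevent abrupt step-to-step changes of the kind the abstract reports. Addressing those is the role of the paper's modified model.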

English keywords:

autonomous driving; behavior decision-making; deep reinforcement learning; deep deterministic policy gradient

Authors:

Zhou Hengheng (周恒恒), Gao Song (高松), Wang Pengwei (王鹏伟), Cui Kaichen (崔凯晨), Zhang Yulong (张宇龙)

Author affiliation:

School of Transportation and Vehicle Engineering, Shandong University of Technology, Zibo 255000

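Keywords (translated from the Chinese):

autonomous driving; behavior decision-making; deep reinforcement learning; deep deterministic policy gradient algorithm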

Funding:

National Natural Science Foundation of China

Project number:

52102465

Publication year:

2024

DOI:

10.12404/j.issn.1671-1815.2303193

Science Technology and Engineering (科学技术与工程)

Chinese Society of Technology Economics (中国技术经济学会)

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.338

ISSN: 1671-1815

Year, volume (issue): 2024, 24(12)

Number of references: 20