
A Joint Eco-driving Optimization Method for Connected Fuel Cell Hybrid Vehicles via Deep Reinforcement Learning

With the rapid development of new technologies such as the Internet of Things (IoT) and autonomous driving, the connected driving environment has opened new research directions for the eco-driving and energy-management optimization of hybrid vehicles. For fuel cell hybrid vehicles driving on urban roads with multiple signalized intersections, this paper proposes a hierarchical multi-objective joint optimization method for speed planning and energy management that combines the deep deterministic policy gradient algorithm with dynamic programming (DDPG-DP). In the upper layer, the DDPG algorithm performs energy-saving speed planning; a multi-objective reward function is designed and a prioritized experience replay mechanism is added, which improve the algorithm's convergence speed and stability while jointly optimizing energy saving, driving comfort, and traffic efficiency. In the lower layer, a dynamic programming (DP) algorithm achieves optimal energy-saving control of the hybrid powertrain with the objective of minimizing hydrogen consumption. The results show that, in the two scenarios considered, the DDPG-DP algorithm improves traffic efficiency over the IDM-DP algorithm by 15.25% and 20.18% and reduces hydrogen consumption by 25.66% and 17.86%, respectively. Compared with the globally optimal algorithm (DP-DP) in the same two scenarios, DDPG-DP differs by only about 5 s in travel time, and its hydrogen consumption differs from the optimum by only 2.84% and 4.7%. In terms of driving smoothness, DDPG-DP exhibits smaller speed fluctuations than the other two algorithms (IDM-DP and DP-DP) and avoids harsh acceleration and deceleration, thus better ensuring ride comfort. Through the two-layer active architecture of speed planning and energy management, the proposed method realizes proactive energy-saving optimization of hybrid vehicles, offers greater energy-saving potential for the daily driving of hybrid vehicles, and lays a research foundation for the multi-objective eco-driving optimization of connected fuel cell hybrid vehicles.
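The abstract describes the upper-layer reward design only at a high level. The sketch below is a minimal, illustrative Python rendering of such a multi-objective reward for a DDPG speed planner, not the authors' implementation: the weights `W_ENERGY`, `W_COMFORT`, `W_TIME`, the reference speed, and the per-step signals are all assumptions.

```python
# Illustrative multi-objective reward for an upper-layer DDPG speed planner.
# Weights and signal names are hypothetical; the paper's exact reward terms
# and coefficients are not given in the abstract.

W_ENERGY, W_COMFORT, W_TIME = 1.0, 0.2, 0.5   # assumed trade-off weights

def step_reward(h2_consumed_g: float, accel_mps2: float,
                speed_mps: float, v_ref_mps: float = 13.9) -> float:
    """Per-step reward combining energy, comfort, and efficiency terms."""
    r_energy = -h2_consumed_g              # penalize hydrogen used this step
    r_comfort = -accel_mps2 ** 2           # penalize harsh accel/decel
    r_time = -abs(speed_mps - v_ref_mps)   # penalize deviation from target pace
    return W_ENERGY * r_energy + W_COMFORT * r_comfort + W_TIME * r_time
```

A scalarized weighted sum of this kind is a common way to fold several objectives into the single scalar reward that DDPG requires; the relative weights then encode the energy/comfort/efficiency trade-off.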
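For the lower layer, the abstract specifies dynamic programming with hydrogen consumption as the cost to be minimized. The skeleton below is a schematic backward recursion under assumed interfaces: the SOC grid, the candidate power-split actions, and the functions `h2_cost` and `soc_next` are placeholders, since the abstract gives no powertrain model.

```python
import numpy as np

# Schematic DP backward recursion for lower-layer energy management.
# All model details (SOC dynamics, fuel-cell map) are illustrative placeholders.

def dp_energy_management(power_demand, soc_grid, actions, h2_cost, soc_next):
    """Return the cost-to-go table J of minimal total H2 consumption.

    power_demand : sequence of wheel power demands, one per time step
    soc_grid     : discretized battery state-of-charge values (sorted)
    actions      : candidate fuel-cell power levels (power-split choices)
    h2_cost(p)   : hydrogen mass consumed at fuel-cell power p for one step
    soc_next(s, p, p_dem) : battery SOC after one step (simplified model)
    """
    T = len(power_demand)
    J = np.zeros((T + 1, len(soc_grid)))        # terminal cost-to-go is zero
    for t in range(T - 1, -1, -1):              # sweep backward in time
        for i, soc in enumerate(soc_grid):
            best = np.inf
            for p_fc in actions:
                s2 = soc_next(soc, p_fc, power_demand[t])
                if not (soc_grid[0] <= s2 <= soc_grid[-1]):
                    continue                    # infeasible SOC transition
                j = min(np.searchsorted(soc_grid, s2), len(soc_grid) - 1)
                best = min(best, h2_cost(p_fc) + J[t + 1, j])
            J[t, i] = best
    return J
```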

energy management; fuel cell; hybrid vehicle; deep reinforcement learning; co-optimization; connected and autonomous vehicles

田晟 (TIAN Sheng), 陈东 (CHEN Dong)


School of Civil Engineering and Transportation, South China University of Technology, Guangzhou 510640, Guangdong, China



Journal of Guangxi Normal University (Natural Science Edition)
Publisher: Guangxi Normal University

Indexed in: CSTPCD; Peking University Core Journal List (北大核心)
Impact factor: 0.448
ISSN: 1001-6600
Year, volume (issue): 2024, 42(6)