Hybrid Power Energy Management Strategy Based on Preferring Reinforcement Learning
To improve the economy of a hybrid power system while maintaining battery state of charge (SOC) balance and satisfying power performance constraints, a hybrid power energy management strategy based on preference-based ("preferring") reinforcement learning is proposed. The strategy models the energy management problem as a Markov decision process and uses a deep neural network to learn the mapping from input states to optimal control actions. Unlike traditional reinforcement learning algorithms, preference-based reinforcement learning does not require a reward function to be specified. The network is trained to convergence using only preference judgments among multiple candidate actions, which avoids the difficulty of designing the weighting and normalization of a reward function. The effectiveness and feasibility of the proposed strategy were verified through simulation experiments and hardware-in-the-loop tests. The results show that, compared with traditional reinforcement learning energy management strategies, the proposed strategy improves economy by 4.6% to 10.6% while satisfying the SOC balance and power performance constraints of the hybrid vehicle.
Keywords: hybrid electric vehicle; energy management; preferring reinforcement learning; optimal control; SOC; control strategy
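The central idea of the abstract, training a policy from pairwise preference judgments rather than from a weighted reward, can be illustrated with a minimal sketch. This is not the authors' implementation. The action set, battery limit, SOC threshold, preference rules, linear scoring model (standing in for the deep neural network), and the pairwise logistic (Bradley-Terry) loss are all assumptions chosen for illustration.

```python
import math
import random

ACTIONS_KW = [0.0, 10.0, 20.0, 30.0]   # hypothetical engine power levels
BATTERY_LIMIT_KW = 15.0                # hypothetical battery power limit
SOC_LOW = 0.4                          # hypothetical lower SOC threshold


def features(soc, demand_kw):
    """State vector: bias, SOC, and power demand scaled to [0, 1]."""
    return [1.0, soc, demand_kw / 30.0]


def scores(weights, state):
    """One linear score per action (a stand-in for a deep network)."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


def prefer(soc, demand_kw, a, b):
    """Return whichever of actions a, b is preferred, using ordered rules
    instead of a weighted reward: power feasibility, then SOC, then economy."""
    pa, pb = ACTIONS_KW[a], ACTIONS_KW[b]
    ok_a = abs(demand_kw - pa) <= BATTERY_LIMIT_KW
    ok_b = abs(demand_kw - pb) <= BATTERY_LIMIT_KW
    if ok_a != ok_b:                    # 1. battery must be able to cover the gap
        return a if ok_a else b
    if soc < SOC_LOW:                   # 2. low SOC: more engine power to recharge
        return a if pa >= pb else b
    return a if pa <= pb else b         # 3. otherwise: less engine power


def update(weights, state, winner, loser, lr=0.1):
    """One gradient step on the pairwise logistic (Bradley-Terry) loss
    -log sigmoid(score[winner] - score[loser]); modifies weights in place."""
    s = scores(weights, state)
    margin = max(-30.0, min(30.0, s[winner] - s[loser]))  # avoid exp overflow
    p = 1.0 / (1.0 + math.exp(-margin))
    for i, x in enumerate(state):
        weights[winner][i] += lr * (1.0 - p) * x
        weights[loser][i] -= lr * (1.0 - p) * x


def train(steps=5000, seed=0):
    """Sample random states and action pairs, query the preference judge,
    and push the preferred action's score above the other one."""
    rng = random.Random(seed)
    weights = [[0.0] * 3 for _ in ACTIONS_KW]
    for _ in range(steps):
        soc, demand = rng.uniform(0.2, 0.9), rng.uniform(0.0, 30.0)
        a, b = rng.sample(range(len(ACTIONS_KW)), 2)
        winner = prefer(soc, demand, a, b)
        loser = b if winner == a else a
        update(weights, features(soc, demand), winner, loser)
    return weights


def policy(weights, soc, demand_kw):
    """Greedy action: index of the highest-scoring engine power level."""
    s = scores(weights, features(soc, demand_kw))
    return max(range(len(s)), key=s.__getitem__)
```

In this sketch the judge applies its rules in a fixed order, so no weights between SOC balance, power performance, and economy have to be tuned. A linear model cannot represent every preference pattern these rules produce, which is one reason a deep network would be the more natural choice in practice.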