
Variable-parameter MPC Multi-objective Control for Intelligent Vehicle Path Tracking Based on Reinforcement Learning

To address the loss of tracking accuracy and stability that intelligent vehicles suffer when driving conditions change, this study proposes a multi-objective control strategy based on variable-parameter model predictive control (MPC) tuned by reinforcement learning, which adaptively tunes the parameters of the path tracking control system. A linear time-varying MPC controller is designed from a vehicle dynamics model to obtain the optimal front-wheel steering angle and additional yaw moment. Based on the Actor-Critic reinforcement learning architecture, Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3) agents are designed to tune the control parameters, and a reward function is constructed with tracking accuracy and stability as its objectives. Two typical simulation scenarios, a docking road and a variable-curvature road, are built to verify the algorithm. On the docking road, the prediction horizon and weight matrix of the controller are adjusted promptly as the road adhesion coefficient changes; on the variable-curvature road, they are adjusted promptly as the road curvature changes. In a co-simulation using MATLAB/Simulink, CarSim, and Python, the reinforcement-learning-tuned MPC is compared with a fixed-parameter MPC and a fuzzy-tuned MPC. The results show that, while ensuring vehicle safety, the reinforcement learning method improves the path tracking accuracy of the intelligent vehicle under different road conditions more than either baseline. On the docking road, compared with the fixed-parameter MPC and the fuzzy-tuned MPC, the reinforcement-learning-tuned MPC reduces the average lateral deviation by 99.8% and 97.6%, respectively, and the average front-wheel angle change rate by 99.7% and 77.0%, respectively. On the variable-curvature road, the average lateral deviation is reduced by 79.6% and 90.8%, respectively, and the average front-wheel angle change rate by 40.6% and 2.6%, respectively. These results indicate that the proposed reinforcement-learning-based variable-parameter MPC multi-objective control can handle both stability and tracking accuracy in path tracking under changing driving conditions, and it offers one approach to path tracking control in complex scenarios.
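The tuning loop described in the abstract (an agent outputs an action, the action sets the MPC prediction horizon and weights, and a reward scores tracking accuracy and stability) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the parameter ranges, the two-element action layout, the use of sideslip angle and steering rate as stability terms, and the reward weights.

```python
# Hypothetical bounds; the abstract does not state the actual tuning ranges.
HORIZON_RANGE = (5, 30)        # prediction horizon, in control steps
WEIGHT_RANGE = (1.0, 1000.0)   # weight on lateral deviation in the MPC cost


def scale(a, lo, hi):
    """Map an actor output a, clipped to [-1, 1], linearly onto [lo, hi]."""
    a = max(-1.0, min(1.0, a))
    return lo + 0.5 * (a + 1.0) * (hi - lo)


def action_to_mpc_params(action):
    """Turn a 2-element agent action into (prediction horizon, lateral weight)."""
    a_horizon, a_weight = action
    horizon = int(round(scale(a_horizon, *HORIZON_RANGE)))
    q_lateral = scale(a_weight, *WEIGHT_RANGE)
    return horizon, q_lateral


def reward(lateral_dev, heading_err, sideslip, steer_rate,
           weights=(1.0, 0.5, 0.5, 0.1)):
    """Negative weighted quadratic penalty (assumed form, not the paper's).

    lateral_dev and heading_err stand for tracking accuracy; sideslip and
    steer_rate stand for stability. Zero error gives the maximum reward, 0.
    """
    w_y, w_psi, w_beta, w_rate = weights
    cost = (w_y * lateral_dev ** 2 + w_psi * heading_err ** 2
            + w_beta * sideslip ** 2 + w_rate * steer_rate ** 2)
    return -cost
```

In a setup like this, a DDPG or TD3 actor with a tanh output layer would produce the action at each tuning step, `action_to_mpc_params` would reconfigure the controller before the next MPC solve, and `reward` would be evaluated on the resulting vehicle response. Bounding the action and rescaling it keeps the horizon and weight inside ranges for which the MPC problem remains well posed.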

Keywords: automotive engineering; path tracking; model predictive control; reinforcement learning; control parameter tuning; additional yaw moment control

Wang Hongbo (汪洪波), Wang Chunyang (王春阳), Zhao Linfeng (赵林峰), Hu Yanping (胡延平)


School of Automotive and Transportation Engineering, Hefei University of Technology, Hefei 230009, Anhui, China


Funding: National Natural Science Foundation of China (U22A20246, 52372382); Natural Science Foundation of Hefei (2022008)

2024

China Journal of Highway and Transport (中国公路学报)
China Highway and Transportation Society


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.607
ISSN: 1001-7372
Year, volume (issue): 2024, 37(3)