Distributed Collaborative Path Planning Algorithm for Multiple Autonomous Vehicles Based on Digital Twin

Path planning for multiple Autonomous Vehicles (AVs) suffers from three problems: cooperation between vehicles is difficult, cooperatively trained models are of low quality, and results perform poorly when applied directly to physical vehicles. To address these problems, this paper proposes a distributed collaborative path planning algorithm for multiple AVs based on Digital Twin (DT), which uses Credibility-Weighted Decentralized Federated Reinforcement Learning (CWDFRL) to plan paths for multiple AVs. First, the path planning problem of a single AV is modeled as minimizing the average task completion time under driving-behavior constraints; it is then transformed into a Markov Decision Process (MDP) and solved with the Deep Deterministic Policy Gradient (DDPG) algorithm. Next, Federated Learning (FL) is used to ensure cooperation between vehicles. To address the low quality of global model updates in centralized FL, a decentralized FL training method with credibility-based dynamic node selection is used to improve the quality of global model aggregation. Finally, DT is used to assist the training of the Decentralized Federated Reinforcement Learning (DFRL) model: because the twin can learn from the DT environment, the trained model can be deployed quickly and directly to real-world AVs. Simulation results show that, compared with existing methods, the proposed training framework obtains a higher reward, improves the vehicles' utilization of their own speed, and reduces both the average task completion time and the collision probability of the vehicle group.
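The credibility-weighted, decentralized aggregation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the credibility metric, the node-selection rule, or the aggregation formula, so everything below is an assumption made for illustration: the function names `select_neighbors` and `aggregate`, the fixed `threshold`, and the plain weighted average are hypothetical and should not be read as the paper's actual method.

```python
from typing import Dict, List


def select_neighbors(credibility: Dict[str, float], threshold: float) -> List[str]:
    """Keep only peers whose (hypothetical) credibility score reaches the threshold."""
    return [peer for peer, score in credibility.items() if score >= threshold]


def aggregate(
    local_params: List[float],
    peer_params: Dict[str, List[float]],
    credibility: Dict[str, float],
    threshold: float = 0.5,
    self_weight: float = 1.0,
) -> List[float]:
    """Merge a vehicle's own parameters with those of selected peers.

    Each selected peer contributes in proportion to its credibility score;
    the vehicle's own parameters contribute with weight `self_weight`.
    No central server is involved: every vehicle can run this locally.
    """
    # Assumed selection rule: a simple credibility threshold.
    selected = [p for p in select_neighbors(credibility, threshold) if p in peer_params]

    total_weight = self_weight + sum(credibility[p] for p in selected)
    merged = [self_weight * w for w in local_params]
    for peer in selected:
        for i, w in enumerate(peer_params[peer]):
            merged[i] += credibility[peer] * w
    return [w / total_weight for w in merged]


# Example: peer "b" has low credibility and is excluded from aggregation.
local = [1.0, 2.0]
peers = {"a": [3.0, 4.0], "b": [100.0, 100.0]}
scores = {"a": 1.0, "b": 0.1}
new_params = aggregate(local, peers, scores, threshold=0.5)
```

In this sketch a low-credibility peer is simply dropped, so one poor model update cannot degrade the merged parameters. A real system would apply the same idea to the actor and critic network weights of each vehicle's DDPG agent.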

Digital Twin (DT); Autonomous driving; Decentralized Federated Reinforcement Learning (DFRL); Path planning

TANG Lun (唐伦), DAI Jun (戴军), CHENG Zhangchao (成章超), ZHANG Hongpeng (张鸿鹏), CHEN Qianbin (陈前斌)

School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065

Chongqing Key Laboratory of Mobile Communications Technology, Chongqing 400065

Digital twin; Autonomous driving; Decentralized federated reinforcement learning; Path planning

National Natural Science Foundation of China (62071078); Sichuan-Chongqing Joint Key Research and Development Project (2021YFQ0053)

2024

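Journal of Electronics & Information Technology (电子与信息学报)
Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China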

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.302
ISSN: 1009-5896
Year, Volume (Issue): 2024, 46(6)