
An Intelligent Driving Strategy Optimization Algorithm Assisted by Directed Acyclic Graph Blockchain and Deep Reinforcement Learning

Deep Reinforcement Learning (DRL) is increasingly used for intelligent driving decision-making because continuous interaction with the environment improves the decision-making capability of the driving system. In practice, however, DRL suffers from low learning efficiency and poor data-sharing security. To address these issues, this paper proposes a Directed Acyclic Graph (DAG) blockchain-assisted deep reinforcement learning Intelligent Driving Strategy Optimization (D-IDSO) algorithm. First, a dual-layer secure data-sharing architecture based on a DAG blockchain is constructed to ensure the efficiency and security of model data sharing. Second, a DRL-based intelligent driving decision model is designed, with a multi-objective reward function that jointly considers safety, comfort, and efficiency to optimize driving decisions. In addition, an Improved Prioritized Experience Replay Twin Delayed Deep Deterministic policy gradient (IPER-TD3) method is proposed to improve training efficiency. Finally, Connected and Automated Vehicles (CAVs) are trained in braking and lane-changing scenarios on the CARLA simulation platform. Experimental results show that the proposed algorithm significantly improves model training efficiency in intelligent driving scenarios and, while ensuring secure sharing of model data, effectively improves the safety, comfort, and efficiency of intelligent driving.
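The abstract names two components without giving their form: a multi-objective reward over safety, comfort, and efficiency, and a prioritized replay buffer feeding TD3. The sketch below is only an illustration of the general idea, not the paper's method. The reward is assumed to be a simple weighted sum with made-up weights and terms, and the buffer is the standard proportional prioritized experience replay rather than the improved IPER variant, whose details the abstract does not state. All names (`multi_objective_reward`, `PrioritizedReplay`, the weight parameters) are hypothetical.

```python
import random


def multi_objective_reward(collided, jerk, speed, target_speed,
                           w_safety=1.0, w_comfort=0.1, w_efficiency=0.5):
    """Weighted sum of safety, comfort, and efficiency terms.

    The terms and weights are illustrative assumptions, not values from the paper.
    """
    r_safety = -1.0 if collided else 0.0   # penalize collisions
    r_comfort = -abs(jerk)                 # penalize abrupt acceleration changes
    r_efficiency = -abs(speed - target_speed) / max(target_speed, 1e-6)
    return w_safety * r_safety + w_comfort * r_comfort + w_efficiency * r_efficiency


class PrioritizedReplay:
    """Standard proportional prioritized replay (not the paper's IPER variant)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities = [], []
        self.pos = 0  # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions take the current maximum priority so they are
        # likely to be sampled soon after insertion.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update(self, idxs, td_errors):
        # Priority tracks the magnitude of the latest TD error.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a TD3 training loop, the critic loss for each sampled transition would be scaled by the returned importance-sampling weight, and `update` would be called with the new TD errors after each critic step.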

Intelligent driving; Data sharing; Deep Reinforcement Learning (DRL); Directed Acyclic Graph (DAG)

HUANG Xiaoge, LI Chunlei, LI Wenjing, LIANG Chengchao, CHEN Qianbin


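School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China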


2024

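Journal of Electronics & Information Technology (电子与信息学报)
Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China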


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.302
ISSN:1009-5896
Year, volume (issue): 2024, 46(12)