
A Distributed Reinforcement Learning Method for Traffic Signals Based on Multi-dimensional Spatio-temporal Hierarchy

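Signal control is an important component of intelligent transportation systems, and signal optimization that incorporates new technologies such as artificial intelligence has become a research hotspot. Control strategies fall into two categories: centralized and distributed. The lightweight state space of distributed control effectively avoids the curse of dimensionality in deep reinforcement learning and has drawn growing attention in recent years. Existing distributed coordinated control strategies mostly rely on graph convolutional networks or graph attention networks to mine the coupling between intersections, but they give insufficient consideration to how the spatio-temporal correlation between intersection states changes dynamically with time-varying traffic flow. To address this, a method for extracting time-varying traffic flow features is first established based on a gated recurrent neural network to determine the spatio-temporal correlation among multiple intersections. Second, a graph attention mechanism is used to build a hierarchical fusion algorithm for regional spatio-temporal features, and the state space is reconstructed using intersection importance as the indicator. Third, a fully connected design is adopted to construct a right-of-way switching decision model for an adaptive phase and phase-sequence structure. Finally, the control performance of the model is tested in simulation on a real road network. The results show that, compared with traditional distributed reinforcement learning algorithms, the model reduces the average vehicle queue length by at least 13.74%, 5.03%, and 6.30% under low, medium, and high traffic volumes, respectively, indicating the potential application value of the new method.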
Traffic Signal Decentralized Reinforcement Learning Method Based on a Multi-perspective Spatio-temporal Hierarchical Structure
Signal control is a key component of intelligent transportation systems, and traffic signal optimization that integrates artificial intelligence and other new technologies has become an active research topic. Signal control strategies can be divided into two categories: centralized and decentralized. Decentralized control methods use lightweight state spaces that effectively avoid the curse of dimensionality in deep reinforcement learning, and these methods have received increasing attention in recent years. Existing multi-intersection decentralized coordinated strategies mostly use graph convolutional networks or graph attention networks to learn the spatial relationships between intersections. However, they do not give sufficient attention to how the spatio-temporal correlations between intersection states change with time-varying traffic flows. Accordingly, this study constructed a time-varying traffic flow feature extraction method based on gated recurrent neural networks and calculated the spatio-temporal correlation of multiple intersections. A regional spatio-temporal feature fusion method was then developed using a graph attention mechanism, which achieves a deep integration of traffic flow characteristics through a multi-perspective spatio-temporal hierarchical structure. The study also proposed a state-space reconstruction method based on the importance of intersections. An intersection right-of-way switching decision model with adaptive phase-sequence structures, based on a fully connected concept, was then developed. Finally, the model was tested in simulation on a real-world road network. The results show that, compared with the traditional decentralized reinforcement learning method, the developed model reduces the average vehicle queue length by at least 13.74%, 5.03%, and 6.30% for low, medium, and high traffic flows, respectively, indicating the potential application value of the proposed method.
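To make the graph-attention fusion step mentioned in the abstract more concrete, the following is a minimal sketch of how one intersection might weight and combine the features of its neighbors. It is not the authors' model: the feature vectors are assumed to have already been produced by a recurrent encoder, the scoring vector `a` is a fixed hypothetical value rather than a learned parameter, no linear projection is applied, and all names (`attention_weights`, `fuse`, the intersection IDs) are illustrative.

```python
import math

def leaky_relu(x, slope=0.2):
    """LeakyReLU as commonly used in graph attention scoring."""
    return x if x > 0 else slope * x

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(features, neighbors, a, node):
    """Weights of `node` over itself and its neighbors.

    Score: e_ij = LeakyReLU(a . [h_i || h_j]), then softmax over j.
    `a` is a hypothetical fixed vector of length 2 * feature_dim.
    """
    targets = [node] + neighbors[node]
    h_i = features[node]
    scores = []
    for j in targets:
        concat = h_i + features[j]  # list concatenation [h_i || h_j]
        scores.append(leaky_relu(sum(w * x for w, x in zip(a, concat))))
    return dict(zip(targets, softmax(scores)))

def fuse(features, neighbors, a, node):
    """Attention-weighted average of the node's own and neighbor features."""
    weights = attention_weights(features, neighbors, a, node)
    dim = len(features[node])
    return [sum(w * features[j][k] for j, w in weights.items())
            for k in range(dim)]

# Hypothetical encoded states for three intersections (made-up numbers).
features = {"A": [0.8, 0.1], "B": [0.3, 0.6], "C": [0.5, 0.5]}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
a = [0.5, -0.2, 0.3, 0.1]

fused_A = fuse(features, neighbors, a, "A")
```

In this sketch the weights returned by `attention_weights` could also serve as a rough stand-in for an intersection-importance score when deciding which neighbor states to keep, although the abstract does not specify how importance is computed.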

traffic engineering; intelligent transportation; deep reinforcement learning; traffic signal control; multi-perspective spatio-temporal learning; hierarchical learning

WANG Fujian, FAN Chengrui, ZHOU Bin, FENG Chunfang, MA Dongfang


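College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, Zhejiang, China

Polytechnic Institute, Zhejiang University, Hangzhou 310015, Zhejiang, China

Traffic Management Research Institute of the Ministry of Public Security, Wuxi 214151, Jiangsu, China

Ocean College, Zhejiang University, Zhoushan 316021, Zhejiang, China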


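traffic engineering; intelligent transportation; deep reinforcement learning; signal control; multi-perspective spatio-temporal learning; hierarchical learning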

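National Natural Science Foundation of China (52172334); Open Project of the Zhejiang Intelligent Transportation Engineering Technology Research Center (2023ERCITZJ-KF09); Scientific Research Project of the Zhejiang Provincial Department of Education (Y202353473)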

2024

China Journal of Highway and Transport
China Highway and Transportation Society


CSTPCD; Peking University Core Journals
Impact factor: 1.607
ISSN: 1001-7372
Year, volume (issue): 2024, 37(7)