Coordinated Sequential Optimization for Network-wide Traffic Signal Control Based on Heterogeneous Multi-agent Transformer
To address the complex task of traffic signal control in an urban network, this study proposes a coordinated sequential optimization method based on a Heterogeneous Multi-Agent Transformer (HMATLight) that optimizes network-wide traffic signals and improves the signal control policy at intersections within the network. First, to account for the spatial correlation of traffic flow across multiple intersections, a value encoder based on a self-attention mechanism is designed to learn representations of traffic observations and realize network-level communication. Second, to cope with the non-stationary environment that arises during multi-agent policy updates, a policy decoder based on multi-agent advantage decomposition is constructed; it sequentially outputs each agent's optimal responsive action conditioned on the joint actions of the preceding agents. In addition, an action-masking mechanism based on effective driving vehicles, which adapts the decision frequency within the time-adequate interval, and a spatio-temporal pressure reward function that considers waiting fairness are constructed to further enhance policy performance and practicality. A series of experiments is carried out on Hangzhou network datasets to validate the effectiveness of the proposed method. The results show that HMATLight outperforms all baselines on two datasets across five metrics. Compared with the best-performing baseline, HMATLight reduces the average travel time by 10.89%, the average queue length by 18.84%, and the average waiting time by 22.21%. Furthermore, HMATLight generalizes markedly better and significantly reduces instances of long vehicle waiting times.
intelligent transportation; deep reinforcement learning; network-wide traffic signal control; heterogeneous multi-agent; spatio-temporal pressure reward