A Resistance Strategy for Bus Service Disruption in Depot Based on Reinforcement Learning
To alleviate bus service disruption at the depot, this paper proposes a dynamic departure control strategy based on reinforcement learning. The strategy uses a long short-term memory (LSTM) model to predict bus travel time, so that the agent can perceive the headway status of both the depot vehicles and the running vehicles and can better evaluate the long-term impact of its decisions. For the scenario in which no bus is waiting at the depot, a state-dependent differentiable function masks invalid actions when the action probability distribution is calculated, which prevents the agent from issuing invalid commands. The model is trained with proximal policy optimization (PPO), and its reward function penalizes large departure intervals. The experimental results show that, compared with the traditional method, the proposed method not only effectively avoids bus service disruption at the depot but also yields a more balanced passenger load ratio, shorter passenger waiting times, and higher vehicle utilization.
Bus service disruption; Real-time control; Reinforcement learning; Proximal policy optimization; Invalid action masking
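As an illustration of the invalid action masking named above, the following minimal sketch computes an action probability distribution in which invalid actions receive zero probability. It is not the paper's implementation: the function name `masked_action_probs`, the two-action set (hold, dispatch), and the logit values are hypothetical, and the paper's state-dependent differentiable masking function may take a different form.

```python
import math


def masked_action_probs(logits, valid_mask):
    """Softmax over policy logits with invalid actions masked out.

    logits:     raw policy scores, one per action (hypothetical values).
    valid_mask: booleans derived from the state; False marks an invalid action.
    """
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Replace the logits of invalid actions with -inf so that exp() maps them to 0.
    masked = [score if ok else -math.inf for score, ok in zip(logits, valid_mask)]
    # Subtract the largest valid logit for numerical stability.
    peak = max(masked)
    weights = [math.exp(score - peak) for score in masked]
    total = sum(weights)
    return [w / total for w in weights]


# Assumed action order: index 0 = hold, index 1 = dispatch.
# With no bus waiting at the depot, "dispatch" is treated as invalid.
probs_empty_depot = masked_action_probs([0.2, 1.5], [True, False])
probs_bus_waiting = masked_action_probs([0.2, 1.5], [True, True])
```

Because masked actions get probability zero, the agent in this sketch can never sample a dispatch command when the depot is empty, while the remaining probabilities still sum to one.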