
Intelligent rescheduling optimization method of high-speed railway based on deep reinforcement learning DDDQN

In the daily operation of high-speed railway systems, trains are often disturbed by unexpected events that cause delays and seriously affect passengers' travel experience. To produce a train rescheduling plan in a short time and reduce train delay time as much as possible, an intelligent train rescheduling method (DDDQN) that combines deep reinforcement learning with an integer programming model is proposed. First, the line is divided into multiple connected track sections and, based on the job-shop scheduling problem, an integer programming model describing the train operation process is constructed with the objective of minimizing the total delay time of all trains. Then, each train is regarded as an agent; the states, actions, and reward function of the multiple agents are defined according to actual operational requirements, and two deep neural networks are constructed to approximate the value function. Finally, a training method for DDDQN is designed in combination with the integer programming model: the agents first explore in a simulation environment to find feasible solutions, and the neural network parameters are updated through a "mutual feed" mechanism between the two networks. Solving the integer programming model on this basis yields the optimal solution in a short time. Simulation experiments use actual line data and operation data from the Beijing-Zhangjiakou high-speed railway. Comparing the total train delay time and the solution time obtained by three different solution methods under 10 different emergency scenarios verifies that the proposed DDDQN model can obtain the optimal solution in a short time, reducing train delay time by up to 30.43% and solution time by up to 68.33%. DDDQN provides an intelligent method and reference for improving the emergency handling capability and transportation organization efficiency of high-speed railway systems under emergencies.
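The abstract mentions two deep neural networks that approximate the value function and update each other through a "mutual feed" mechanism, but gives no formulas. The sketch below is only an illustration under stated assumptions: it assumes DDDQN denotes a dueling double deep Q-network, that the two networks play the usual online and target roles, and that the reward is a negative delay. The function names `dueling_q` and `double_dqn_target` and all numbers are hypothetical and do not come from the paper.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    q_online_next: List[float],
    q_target_next: List[float],
    done: bool,
) -> float:
    """Double DQN target: the online network picks the next action,
    the target network evaluates it."""
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Hypothetical numbers: a 3-minute delay is encoded as reward -3.0 (assumption).
    q_online = dueling_q(1.0, [0.5, -0.5, 0.0])  # [1.5, 0.5, 1.0]
    q_target = [2.0, 4.0, 1.0]
    # The online network prefers action 0, so the target network's value 2.0 is used.
    print(double_dqn_target(-3.0, 0.5, q_online, q_target, done=False))  # -2.0
```

Separating action selection from action evaluation is the standard way double Q-learning reduces overestimation of action values. Whether the paper's "mutual feed" mechanism matches this exactly cannot be confirmed from the abstract alone.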

intelligent train rescheduling; train delay time; deep reinforcement learning; integer programming; neural network

WU Wei, YIN Jiateng, CHEN Zhaosen, TANG Tao


Beijing HollySys System Engineering Co., Ltd., Beijing 100176

State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044


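National Natural Science Foundation of China, Basic Science Center Program (72288101); National Natural Science Foundation of China, Excellent Young Scientists Fund (72322022); State Key Laboratory of Advanced Rail Autonomous Operation project (RAO2023ZZ001)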

2024

Journal of Railway Science and Engineering
Central South University; China Railway Society


Indexed in: CSTPCD; Peking University Core Journals; EI
Impact factor: 0.837
ISSN:1672-7029
Year, volume (issue): 2024, 21(4)