
Study on Deep Reinforcement Learning Model for Railway Vertical Alignment Design

Traditional intelligent algorithms typically require the number of decision variables to stay fixed during computation, whereas in the intelligent design of a railway vertical alignment the number of grade change points must be determined adaptively according to the terrain. Because reinforcement learning can learn optimal strategies by interacting with environmental data such as ground elevations and the alignment generated so far, this paper applies deep reinforcement learning to the intelligent design of vertical alignments and studies how an agent can decide the grade change points in sequence from front to back. A grade change point decision-making model is proposed for railway vertical alignment design, and the forms of the states, actions and rewards in the model are defined. Because vertical alignment design is subject to many constraints, an action masking mechanism is introduced to handle them, which accelerates convergence and improves model performance. In addition, by introducing the computation period into the state of the model, a single-network multi-strategy method for multi-objective problems is proposed, in which a single network generates multiple multi-objective strategies. The correctness and effectiveness of the model for single-objective and multi-objective profile problems are verified through practical engineering cases.
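The action masking idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' model. The discrete set of candidate grades, the two constraints (a maximum grade and a maximum algebraic difference between adjacent grades), and the names `feasible_mask` and `masked_softmax` are all assumptions made for illustration. The sketch shows only the general technique: infeasible actions receive zero probability before the agent samples.

```python
import math

def feasible_mask(prev_grade, candidate_grades, max_grade, max_grade_diff):
    """Flag each candidate grade that satisfies both (hypothetical) constraints."""
    return [
        abs(g) <= max_grade and abs(g - prev_grade) <= max_grade_diff
        for g in candidate_grades
    ]

def masked_softmax(logits, mask):
    """Softmax restricted to feasible actions; masked actions get probability 0."""
    if not any(mask):
        raise ValueError("no feasible action in this state")
    # Subtract the largest feasible logit for numerical stability.
    peak = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values only: grades in per mille, logits as if from a policy network.
candidate_grades = [-12.0, -6.0, 0.0, 6.0, 12.0]
logits = [0.5, 1.0, 0.2, 2.0, 3.0]

mask = feasible_mask(prev_grade=-6.0, candidate_grades=candidate_grades,
                     max_grade=10.0, max_grade_diff=12.0)
probs = masked_softmax(logits, mask)
```

In this example the two steepest grades are masked out, so the action with the highest raw logit (12 per mille) can never be sampled and the policy mass is redistributed over the feasible grades.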

railway; vertical alignment design; deep reinforcement learning; safe reinforcement learning; action mask

Miao Kun (缪鹍), Dai Yanlin (戴炎林), Gao Hongjian (高鸿剑)


School of Civil Engineering, Central South University, Changsha, Hunan 410075, China


2024

Journal of the China Railway Society
China Railway Society


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.9
ISSN: 1001-8360
Year, volume (issue): 2024, 46(9)