Using the Proximal Policy Optimization Algorithm with Continuous Actions to Solve the Capacitated Lot-Sizing Problem

Original Chinese title: 使用连续动作的近端策略优化算法求解有限产能批量问题


Abstract: This paper studies the capacitated lot-sizing problem for a multi-product, single-machine system. The objective is to minimize the total production cost, which comprises production, inventory holding, machine setup, and backorder (stock-out backlog) costs. The problem is formulated as a Markov decision process and solved with a deep reinforcement learning algorithm based on proximal policy optimization (PPO). Because deep reinforcement learning with a discrete action space is difficult to scale to large problems, a mapping function is added to the policy network so that deep reinforcement learning with continuous actions can be applied. Experiments show that the proposed algorithm requires less training time, yields results close to the optimal solutions obtained directly with CPLEX, and has an advantage in solution speed.
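
The abstract does not give the form of the mapping function or of the per-period cost, so the following is only a minimal sketch of how such pieces might look, not the authors' method. It assumes a sigmoid squashing followed by capacity normalization for the mapping, and a per-period cost built from the four cost terms named in the abstract. All names (`map_action_to_lots`, `period_cost`, `threshold`, `unit_time`) are hypothetical.

```python
import math


def map_action_to_lots(raw_action, capacity, unit_time, threshold=0.05):
    """Map an unbounded continuous action vector to integer lot sizes that
    fit within one machine's capacity (hypothetical mapping, an assumption)."""
    # Squash each component into (0, 1); tanh form of the sigmoid is overflow-safe.
    fractions = [0.5 * (1.0 + math.tanh(a / 2.0)) for a in raw_action]
    # Assumption: very small shares mean "no production", which avoids a setup.
    fractions = [f if f >= threshold else 0.0 for f in fractions]
    # If the requested shares exceed the capacity, scale them down proportionally.
    total = sum(fractions)
    if total > 1.0:
        fractions = [f / total for f in fractions]
    # Convert each capacity share into whole units of product (rounded down).
    return [int(f * capacity // t) for f, t in zip(fractions, unit_time)]


def period_cost(lots, inventory, demand, c_prod, c_hold, c_setup, c_back):
    """Cost of one period: production + setup + holding + backorder.
    A negative net inventory is interpreted as a backlog."""
    cost = 0.0
    next_inventory = []
    for q, inv, d, cp, ch, cs, cb in zip(
        lots, inventory, demand, c_prod, c_hold, c_setup, c_back
    ):
        net = inv + q - d
        cost += cp * q                      # production cost
        cost += cs if q > 0 else 0.0        # setup cost only when producing
        cost += ch * max(net, 0)            # holding cost on positive stock
        cost += cb * max(-net, 0)           # backorder cost on unmet demand
        next_inventory.append(net)
    return cost, next_inventory


if __name__ == "__main__":
    lots = map_action_to_lots([0.0, 0.0], capacity=100, unit_time=[1, 2])
    cost, inv = period_cost(
        lots, [0, 0], [40, 30], [1, 1], [0.5, 0.5], [20, 20], [2, 2]
    )
    # In an MDP formulation, the reward could be the negative of this cost.
    print(lots, cost, inv)
```

Under these assumptions the policy network can output any real-valued vector, and feasibility with respect to capacity is enforced by the mapping rather than by enumerating discrete lot sizes.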

Keywords:

capacitated lot-sizing problem; deep reinforcement learning; Markov decision process; continuous action space; proximal policy optimization

Authors:

ZHANG Tianji (章天吉), LIN Wenwen (林文文), ZHANG Yuejun (张岳君), XIANG Wei (项薇), ZHAN Taoyang (战韬阳)


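Author affiliations:

Faculty of Mechanical Engineering and Mechanics, Ningbo University, Ningbo, Zhejiang 315211

AUX Group Co., Ltd., Ningbo, Zhejiang 315191

School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240

School of Mechanical and Electrical Engineering, Zhejiang Business Technology Institute, Ningbo, Zhejiang 315699

Yangming College, Ningbo University, Ningbo, Zhejiang 315211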



Publication year:

2024

Machine Design and Research (机械设计与研究)

Shanghai Jiao Tong University

CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.531

ISSN: 1006-2343

Year, volume (issue): 2024, 40(1)

Number of references: 15