Using Proximal Policy Optimization With Continuous Actions to Solve the Capacitated Lot-Sizing Problem
This paper studies the capacitated lot-sizing problem for a multi-product, single-machine system. The objective is to minimize the total cost, which comprises production cost, inventory cost, machine setup cost, and out-of-stock backlog cost. The problem is formulated as a Markov decision process and solved with a deep reinforcement learning algorithm based on proximal policy optimization. Because deep reinforcement learning with a discrete action space is difficult to scale to large problems, a mapping function is added to the policy network so that deep reinforcement learning with continuous actions can be applied to this problem. Experiments show that the proposed algorithm requires less training time, that its results are close to the optimal solutions obtained directly with CPLEX, and that it solves instances faster.
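To illustrate the idea of a mapping function between continuous policy outputs and production decisions, the following is a minimal sketch and not the mapping used in this paper. The function name `map_action_to_lots`, its parameters, the logistic squashing, the setup threshold, and the proportional sharing of capacity are all assumptions introduced for illustration only.

```python
import math


def map_action_to_lots(raw_action, unit_time, setup_time, capacity, threshold=0.1):
    """Map unbounded continuous policy outputs to feasible integer lot sizes.

    Hypothetical illustration: every name and rule here is an assumption,
    not the mapping function defined in the paper.
    """
    # Squash each raw output into [0, 1]; the tanh form of the logistic
    # function avoids overflow for very large or very small inputs.
    fractions = [0.5 * (1.0 + math.tanh(a / 2.0)) for a in raw_action]

    # Outputs below the threshold are read as "no setup for this product".
    fractions = [f if f >= threshold else 0.0 for f in fractions]

    # Every product that is selected consumes its setup time first.
    setup_used = sum(s for f, s in zip(fractions, setup_time) if f > 0.0)
    remaining = max(capacity - setup_used, 0.0)

    # If the requested shares exceed the remaining capacity, scale them down
    # proportionally; otherwise keep them unchanged.
    scale = 1.0 / max(1.0, sum(fractions))

    lots = []
    for f, t in zip(fractions, unit_time):
        time_share = f * scale * remaining
        # Rounding down keeps the total processing time within capacity.
        lots.append(int(time_share // t))
    return lots


if __name__ == "__main__":
    # Three products sharing one machine with 100 time units per period.
    print(map_action_to_lots([0.0, 2.0, -5.0], [1.0, 2.0, 1.0], [5.0, 5.0, 5.0], 100.0))
```

Under these assumptions the output dimension of the policy grows linearly with the number of products, whereas enumerating every combination of lot sizes as a discrete action grows combinatorially, which is the scaling difficulty the abstract refers to.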