Deep reinforcement learning algorithm for the type Ⅰ two-sided assembly line balancing problem
Traditional optimization algorithms cannot effectively exploit historical solving experience and have difficulty obtaining the optimal solution to the type Ⅰ two-sided assembly line balancing problem. To address this problem, a deep reinforcement learning algorithm named Proximal Policy Optimization with Convolutional Neural Networks (CNN-PPO) was proposed. The agent structure of CNN-PPO was designed by introducing a Convolutional Neural Network (CNN) into Proximal Policy Optimization (PPO) to enhance the agent's feature extraction capability. Based on the characteristics of two-sided assembly line balancing, a state matrix was proposed to describe the problem, and a mask layer was introduced to assist the agent in task decision-making. A reward function was designed according to the optimization objective, and the online Actor-Critic mechanism of reinforcement learning was used to select the best combined action strategy at each decision step. The effectiveness and stability of the algorithm were verified on multiple test instances. The experimental results showed that the proposed algorithm outperformed existing algorithms, reaching the lower bound in 57 of the 59 test cases.
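The abstract does not say how the mask layer is implemented. One common way to realize such a layer is to give infeasible tasks (for example, tasks whose predecessors are not yet assigned) zero probability before the policy samples an action. The following is a minimal sketch under that assumption, not the authors' implementation; the function name `masked_action_probs` and the example logits and feasibility flags are hypothetical.

```python
import math

def masked_action_probs(logits, feasible):
    """Softmax over task logits, restricted to feasible tasks.

    logits   -- raw policy scores, one per task (hypothetical values)
    feasible -- booleans, True if the task may be assigned now
    """
    if not any(feasible):
        raise ValueError("at least one task must be feasible")
    # Subtract the largest feasible logit for numerical stability.
    top = max(l for l, ok in zip(logits, feasible) if ok)
    # Infeasible tasks get weight 0, so they can never be sampled.
    weights = [math.exp(l - top) if ok else 0.0
               for l, ok in zip(logits, feasible)]
    total = sum(weights)
    return [w / total for w in weights]

# Tasks 1 and 3 are assumed infeasible at this decision step.
probs = masked_action_probs([1.0, 2.0, 0.5, 3.0],
                            [True, False, True, False])
```

Masking before normalization keeps the remaining probabilities a valid distribution, so the agent chooses only among tasks that can currently be assigned.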