Decoupling control method based on multi-agent deep reinforcement learning
[Objective] Modern industrial processes are typically characterized by nonlinear dynamics with multiple inputs and multiple outputs (MIMO). To keep process control both optimal and simple, all key process variables must be effectively decoupled in control. However, because such processes are difficult to model and analyze, achieving this decoupling in MIMO systems remains challenging.

[Methods] Intelligent optimization based on multi-agent reinforcement learning (MARL) can learn and optimize control policies without relying on any process knowledge. In addition, the control objectives can be set so that each agent is optimized independently yet collaboratively. These features of MARL can be applied to the decoupling control of complex nonlinear MIMO processes. This paper proposes a decoupling control system design method based on the MARL algorithm. In this method, the classical loop gain maximization principle is used to pair the process inputs and outputs into the corresponding decoupled control loops. Each control loop is controlled and optimized by an agent with an independent control objective and its own state feedback information. Based on the objective of each control loop, a reward function is designed for the corresponding agent. Finally, the multi-agent deep deterministic policy gradient (MADDPG) algorithm is developed to train the agents, yielding the optimal decoupling control strategy; a minimal illustrative sketch of these ingredients is given below.
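The following Python snippet is a minimal sketch, not the paper's code, of how these ingredients can be laid out: a relative gain array (RGA) computation as one classical realization of loop-gain-based input-output pairing (an assumption about what "loop gain maximization" denotes here), one deterministic actor per control loop, a centralized critic per agent that sees the joint observations and actions during training (the core MADDPG idea), and a separate setpoint-tracking reward per loop. All names, dimensions, network sizes, and reward shapes are illustrative assumptions.

```python
# Illustrative sketch only: dimensions, names, and reward shapes are assumed.
import numpy as np
import torch
import torch.nn as nn

def rga(K):
    """Relative gain array of a steady-state gain matrix K. Pairing each
    output with the input whose RGA entry is largest is a classical way
    to form decoupled loops (one reading of 'loop gain maximization')."""
    K = np.asarray(K, dtype=float)
    return K * np.linalg.inv(K).T

OBS_DIM, ACT_DIM, N_AGENTS = 3, 1, 2  # per-agent sizes (assumed)

class Actor(nn.Module):
    """Deterministic policy: maps one loop's local observation to its action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACT_DIM), nn.Tanh())
    def forward(self, obs):
        return self.net(obs)  # action scaled to [-1, 1]

class CentralCritic(nn.Module):
    """Centralized Q-function: during training it sees the joint observations
    and joint actions of all agents."""
    def __init__(self):
        super().__init__()
        joint_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
        self.net = nn.Sequential(nn.Linear(joint_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

# One actor per decoupled loop; one centralized critic per agent.
actors = [Actor() for _ in range(N_AGENTS)]
critics = [CentralCritic() for _ in range(N_AGENTS)]

# Loop-wise rewards: each agent tracks only its own setpoint (assumed form).
def reward_temperature(T, T_set):
    return -abs(T - T_set)        # temperature loop (manipulates coolant flow)

def reward_residence(tau, tau_set):
    return -abs(tau - tau_set)    # residence-time loop (manipulates feed flow)

# Decentralized execution: each actor acts on its own measurements only.
obs = [torch.zeros(OBS_DIM) for _ in range(N_AGENTS)]
acts = [actor(o) for actor, o in zip(actors, obs)]
q0 = critics[0](torch.cat(obs), torch.cat(acts))  # joint view, training only
```

At run time each actor uses only its own loop's measurements, so the two loops are adjusted independently; the cross-coupling between loops is accounted for solely through the centralized critics during training.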
[Results] To demonstrate the effectiveness of the proposed design scheme, the reaction process in a continuous stirred tank reactor (CSTR) is modeled on the Aspen software platform, taking the hydrogenation of aniline to cyclohexylamine as an example. For this process, guaranteeing product quality requires regulating the residence time of the reactants in the reactor and the reaction temperature according to the optimal production conditions; in other words, two decoupled control loops need to be designed. One controls the reaction temperature in the reactor by adjusting the coolant flow in the jacket, and the other controls the residence time of the reactants by adjusting the feed flow rate. Because both the residence time and the reaction temperature directly affect product quality, this reaction process requires solving a decoupling control problem for a two-input two-output nonlinear system. Based on the Aspen model of the process, controllers are designed under three schemes, namely a PID control scheme, a single-agent reinforcement learning scheme, and the MARL scheme, and the resulting control systems are simulated on the same Aspen model. Experimental results show that, under the same control objectives and number of training runs, the average control error of the reaction temperature under MARL control is reduced by 79% and 53%, and the average control error in the molar concentration of the product is reduced by 22% and 76%, compared with single-agent reinforcement learning and PID control, respectively. These simulation results illustrate that the proposed MARL design scheme can independently control and adjust the reaction temperature and the reaction residence time, resulting in high control performance of the production process.

[Conclusions] In this paper, the decoupling control of complex nonlinear multi-input multi-output systems is reformulated as a multi-agent self-optimization problem, and the MADDPG algorithm is developed to design the decoupling control system. Compared with other decoupling control methods, the proposed method does not depend on a process model and offers more design flexibility, thus ensuring better control performance and competently serving as an advanced decoupling control scheme based on artificial intelligence technology.
[Keywords] multi-agent reinforcement learning; decoupling control; deep deterministic policy gradient; continuous stirred tank reactor; nonlinear multi-input multi-output systems