Multi-device Cooperative Reactive Power Optimization Control Strategy of Intelligent Distribution Network Based on Self-attention PPO Algorithm
To address the need for fast optimization across the diverse scenarios created by controllable reactive power resources in intelligent distribution networks, this paper proposes a multi-device cooperative reactive power optimization control method based on a multi-head self-attention proximal policy optimization (PPO) algorithm. First, the reactive power optimization problem is modeled as a Markov decision process. Then, within a deep reinforcement learning framework, the PPO algorithm improved with multi-head self-attention is used to train and optimize the policy network. The algorithm uses a multi-head self-attention network to extract the real-time state features of the distribution network, and dynamically limits the update magnitude of the policy network through the clipped policy gradient method. Finally, simulations are carried out on a modified IEEE 69-node system. The results show that the proposed algorithm achieves better control performance than existing advanced reinforcement learning algorithms.
distribution network; distributed photovoltaic; voltage and reactive power control; multi-head self-attention; proximal policy optimization algorithm
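The clipping step that limits the update magnitude of the policy network can be illustrated with the standard PPO clipped surrogate objective. The sketch below is a minimal illustration under stated assumptions, not the implementation used in this paper: the function name `clipped_surrogate`, its argument names, and the clipping range `eps=0.2` are hypothetical choices made for the example, and the self-attention feature extractor is omitted.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to be maximized).

    new_logp, old_logp: log-probabilities of the taken actions under the
    current and the previous policy (hypothetical inputs for illustration).
    advantages: advantage estimates for the same actions.
    eps: clipping range; 0.2 is an assumed value, not one from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the previous policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Taking the minimum removes any gain from moving the ratio
        # outside the clipping interval in the favorable direction.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# One action whose probability doubled, with a positive advantage:
# the contribution is capped at (1 + eps) * advantage.
value = clipped_surrogate([math.log(2.0)], [0.0], [1.0])
```

With a positive advantage, the objective stops growing once the ratio exceeds 1 + eps, so a single update has no incentive to push the policy far from the one that collected the data.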