Look-ahead Dispatch Method via Deep Reinforcement Learning Embedded With Domain Knowledge
Reinforcement learning, with its strong self-learning and self-optimization abilities, has gradually emerged in the field of look-ahead power dispatch. However, existing reinforcement-learning-based dispatch methods suffer from low exploration efficiency and poor convergence when searching for the optimal policy. To adapt to large-scale power grids, this paper embeds domain knowledge, such as historical generation data, power balance, renewable energy utilization rate, and line loading rate, into regularization terms of the reinforcement learning policy network to guide the training of the dispatch agent. In the early stage of training, the method learns dispatcher experience from expert-corrected historical unit output trajectories, so that the policy network parameters quickly converge to an effective initial solution. In the middle and later stages of training, loss-function regularization terms such as power balance are introduced to guide the agent to satisfy prior dispatch knowledge, effectively preventing blind actions by the agent and improving the quality of dispatch decisions. Finally, the effectiveness of the proposed algorithm is verified on the IEEE 118-bus system.
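The two-phase training scheme the abstract describes can be sketched as a staged loss function: early in training, a behavior-cloning term pulls the policy toward expert-corrected historical trajectories; later, domain-knowledge penalties such as power balance regularize the policy loss. The following is a minimal illustrative sketch, not the paper's implementation; all function names, weights, and the warmup schedule are assumptions.

```python
# Hypothetical sketch of the knowledge-regularized training loss described in
# the abstract. Names, weights, and the warmup schedule are illustrative.

def behavior_cloning_loss(actions, expert_actions):
    """Mean squared error between agent actions and expert-corrected
    historical dispatch actions (imitation term for early training)."""
    return sum((a - e) ** 2 for a, e in zip(actions, expert_actions)) / len(actions)

def power_balance_penalty(generation, load):
    """Domain-knowledge term: penalize mismatch between total generation
    and total load (squared imbalance)."""
    return (sum(generation) - sum(load)) ** 2

def total_loss(policy_loss, actions, expert_actions, generation, load,
               step, warmup_steps=1000, lam_bc=1.0, lam_pb=0.1):
    """Early stage: imitate expert trajectories to reach a good initial
    policy. Middle/late stage: enforce prior dispatch knowledge instead."""
    if step < warmup_steps:
        return policy_loss + lam_bc * behavior_cloning_loss(actions, expert_actions)
    return policy_loss + lam_pb * power_balance_penalty(generation, load)
```

In practice the regularization weights (here `lam_bc`, `lam_pb`) and the switch point between the imitation phase and the knowledge-regularization phase would be tuned, and further penalties (e.g. line loading rate, renewable utilization) would be added in the same additive fashion.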

look-ahead dispatch; reinforcement learning; domain knowledge; dispatch knowledge regularization

成梁成, 严嘉豪, 姚建国, 杨胜春, 李亚平


China Electric Power Research Institute Co., Ltd., Nanjing 210003, Jiangsu Province, China


Supported by the National Natural Science Foundation of China (U2066212, 52307150)

2024

Power System Technology (电网技术)
State Grid Corporation of China

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 2.821
ISSN: 1000-3673
Year, Vol. (Issue): 2024, 48(8)