Machinery & Electronics (机械与电子), 2024, Vol. 42, Issue 10: 54-60, 68.

UAV Power Inspection Planning Based on Transformer Improved Reinforcement Learning

Yang Jiyang (杨继阳), Ouyang Quan (欧阳权), Cong Yuhua (丛玉华), Wang Ruiqun (王瑞群), Wang Zhisheng (王志胜)

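Author information

  • 1. College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China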

Abstract

To achieve fully autonomous decision-making in drone power inspection, and to address the slow convergence and susceptibility to local optima of traditional reinforcement learning trajectory planning, this paper improves deep reinforcement learning with the Transformer model and designs a drone charging and inspection decision-making algorithm under a battery capacity constraint. First, an energy consumption model and a Markov decision model are established for the power inspection task scenario. Then, a static encoder based on a graph neural network (GNN) and a dynamic encoder based on gated recurrent units (GRU) are designed to extract different types of environmental data, and a decoder based on a multi-head attention mechanism (a multi-head pointer network) is designed to output a variable-length global charging and inspection strategy sequence and to predict future rewards. Finally, the converged inference model is validated in a power inspection simulation environment. Simulation results show that, compared with traditional reinforcement learning, the proposed algorithm extracts deep state features of the map, reduces path energy consumption by 26.61%, and achieves better convergence.
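The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind a pointer-style attention decoder under a battery constraint, not the authors' algorithm. It assumes a single attention head (the paper uses multiple heads), Euclidean flight distance, and a linear energy cost. The names `feasible_mask`, `pointer_step`, and `cost_per_unit` are hypothetical and chosen for illustration.

```python
import math

def feasible_mask(position, battery, nodes, chargers, cost_per_unit=1.0):
    """Mark a node feasible if the drone can fly to it and still reach
    the nearest charging station afterwards (assumed linear energy cost)."""
    mask = []
    for node in nodes:
        to_node = math.dist(position, node)
        to_charger = min(math.dist(node, c) for c in chargers)
        mask.append((to_node + to_charger) * cost_per_unit <= battery)
    return mask

def pointer_step(query, keys, mask):
    """One single-head pointer step: scaled dot-product scores over the
    candidate nodes, infeasible nodes masked out, then a softmax."""
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale if ok else -math.inf
        for key, ok in zip(keys, mask)
    ]
    top = max(scores)
    if top == -math.inf:
        raise ValueError("no feasible node under the current battery level")
    exps = [math.exp(s - top) for s in scores]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

# Toy scene: one charger at the origin, three inspection nodes.
chargers = [(0.0, 0.0)]
nodes = [(1.0, 0.0), (5.0, 0.0), (0.0, 2.0)]
mask = feasible_mask((0.0, 0.0), battery=5.0, nodes=nodes, chargers=chargers)

# Hypothetical decoder query and node embeddings (illustrative values only).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [3.0, 0.0], [0.0, 1.0]]
probs = pointer_step(query, keys, mask)
next_node = max(range(len(probs)), key=probs.__getitem__)  # greedy choice
```

In this toy scene the second node is masked out because a round trip to it would exceed the remaining battery, so the pointer distribution is spread over the two reachable nodes only. Repeating the step until every node is visited, with charging stations as extra candidates, would produce a variable-length visiting sequence of the kind the abstract describes.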

Key words

drone/power inspection/trajectory planning/Transformer/reinforcement learning

Funding

National Natural Science Foundation of China (61473144)

Publication year

2024
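Machinery & Electronics (机械与电子)
Science and Technology Department, China Machinery Industry Federation; Machinery & Electronics Magazine Publishing House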

CSTPCD
Impact factor: 0.243
ISSN:1001-2257