Unmanned Systems Technology, 2024, Vol. 7, Issue 4: 36-46. DOI: 10.19942/j.issn.2096-5915.2024.04.36

Deep Reinforcement Learning-based Trajectory Planning and HIL Simulation for Solar-powered UAVs

Liu Zirong (刘子荣)¹, Zhang Xiaohui (张晓辉)¹, Li Yanyu (李焱宇)¹, Cheng Aozu (程奥祖)¹, Hu Xinyuan (胡馨元)¹

Author information

  • 1. School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081

Abstract

This paper addresses energy-efficient trajectory planning and experimental validation for solar-powered unmanned aerial vehicles (UAVs). It proposes a trajectory planning method based on deep reinforcement learning and builds a ground test platform for hardware-in-the-loop (HIL) simulation. First, kinematic and dynamic models and energy acquisition and consumption models of the solar-powered UAV are established, and the coupling between flight states (such as time and attitude angles) and the energy state is analyzed. Second, the energy-optimal trajectory planning problem is formulated, an agent is trained with the twin delayed deep deterministic policy gradient algorithm, and the effectiveness of the resulting policy is analyzed in simulation. Finally, a ground HIL test platform is built; after the platform's own performance has been tested, the reinforcement learning controller is deployed online and HIL simulation tests are carried out. The results show that, compared with a traditional flight strategy, the policy trained by the proposed method increases acquired energy by 2.9%, reduces consumed energy by 30.46%, and increases accumulated energy by 11.36% over a 200 s flight. The proposed strategy makes fuller use of solar energy and lowers the power required for flight, thereby improving energy efficiency, and can serve as a reference for energy-efficient flight of solar-powered UAVs.

Key words

Solar-powered UAVs/High Energy Efficiency Flight/Deep Reinforcement Learning/Twin Delayed Deep Deterministic Policy Gradient/Trajectory Planning/HIL Simulation

Publication year

2024
Unmanned Systems Technology

CSCD
ISSN: 2096-5915
Number of references: 30