
UAV Autonomous Navigation via Reinforcement Learning Combined with Brain-Inspired Navigation

To address the low training efficiency and the poor generalization and transferability of the end-to-end reinforcement learning methods commonly used for UAV autonomous navigation, a brain-inspired navigation model is introduced. A brain-inspired cell navigation model is built on a long short-term memory (LSTM) neural network: by integrating and encoding the UAV agent's self-motion information, it realizes grid-cell and head-direction-cell encodings, which are then supplied as a supplementary state representation to the deep reinforcement learning algorithm D3QN. Experiments in the AirSim simulation environment show that introducing the brain-inspired navigation model effectively improves the algorithm's trainability and the UAV agent's navigation performance. Compared with the original D3QN algorithm, with the target initially fixed, the success rate of reaching the target rises by 2.54% to 97.11%; when training continues after the target is moved, the success rate reaches 99.45%, whereas D3QN achieves only 11.46% and fails to find the new target point, showing that the algorithm's generalization ability is effectively improved.
UAV Autonomous Navigation via Reinforcement Learning Combined with Brain-Inspired Navigation
In response to the low training efficiency and the poor generalization ability and universality of the end-to-end reinforcement learning methods widely used for autonomous navigation of UAVs, a brain-inspired navigation model is introduced. Based on the long short-term memory (LSTM) neural network, a brain-inspired cell navigation model is constructed: the self-motion information of the UAV agent is integrated to encode grid cells and head direction cells, and this information is further supplied as a supplementary state representation for the deep reinforcement learning algorithm D3QN. Experiments in the AirSim simulation environment show that introducing the brain-inspired navigation model effectively improves the training ability of the algorithm and the navigation performance of the UAV agent. Compared with the original D3QN algorithm, with the target initially fixed, the success rate of reaching the target increases by 2.54% to 97.11%; when training continues after the target is changed, the success rate reaches 99.45%, while D3QN achieves only 11.46% and fails to find the new target point. This indicates that the generalization ability of the algorithm is effectively improved.
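As a rough illustration of the pipeline the abstract describes — an LSTM integrating the agent's self-motion into grid-cell-like and head-direction-like codes that are concatenated onto the observation to form the D3QN state — here is a minimal NumPy sketch. All layer sizes, input features, and readout heads (`W_grid`, `W_hd`, `N_GRID`, `N_HD`) are illustrative assumptions rather than the paper's actual architecture, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, b):
    """One LSTM step; W maps [x; h] to the four gate pre-activations."""
    z = W @ np.concatenate([x, h]) + b
    H = h.size
    i, f, g, o = z[:H], z[H:2*H], z[2*H:3*H], z[3*H:]
    i, f, o = 1/(1+np.exp(-i)), 1/(1+np.exp(-f)), 1/(1+np.exp(-o))
    c = f * c + i * np.tanh(g)          # cell state: forget old, write new
    h = o * np.tanh(c)                  # hidden state exposed to readouts
    return h, c

H, X = 32, 3            # hidden size; self-motion input, e.g. (speed, sin(dθ), cos(dθ))
N_GRID, N_HD = 16, 8    # hypothetical numbers of grid / head-direction units
W = rng.normal(0, 0.1, (4*H, X+H)); b = np.zeros(4*H)
W_grid = rng.normal(0, 0.1, (N_GRID, H))
W_hd = rng.normal(0, 0.1, (N_HD, H))

def encode_path(self_motion):
    """Integrate a self-motion sequence into grid / head-direction codes."""
    h = np.zeros(H); c = np.zeros(H)
    for x in self_motion:
        h, c = lstm_step(x, h, c, W, b)
    grid = np.tanh(W_grid @ h)                  # grid-cell-like activations
    hd = np.exp(W_hd @ h); hd /= hd.sum()       # softmax over heading bins
    return np.concatenate([grid, hd])

obs = rng.normal(size=64)          # stand-in for the agent's observation features
motion = rng.normal(size=(10, X))  # 10 steps of self-motion
state = np.concatenate([obs, encode_path(motion)])  # augmented D3QN state
```

In this sketch the D3QN itself is untouched; the brain-inspired module only enlarges its state vector, which matches the abstract's description of the codes as a "supplementary state representation."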

UAV; deep reinforcement learning; brain-inspired navigation; D3QN; autonomous navigation

Wu Yong, Peng Hui, Xiong Fengyue


School of Software Engineering, Chengdu University of Information Technology, Chengdu 610228, China


Supported by the Sichuan Science and Technology Program

2019YJ0356

2024

Computer Measurement & Control
China Computer Automated Measurement and Control Technology Association


CSTPCD
Impact factor: 0.546
ISSN:1671-4598
Year, volume (issue): 2024, 32(7)