Semantic communication aware reinforcement learning for communication fault-tolerant UAV collaborative control

Zhang Yang¹, Gu Hongyu¹, Feng Bohao², Wang Ran¹

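Author information

1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China
2. School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430070, Hubei, China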

Abstract

Unmanned aerial vehicle (UAV) swarms have been widely deployed in military and civilian applications in recent years. To raise the mission success rate, communication and collaboration within the swarm have become a key research direction. In environments with uncertain communication, however, interference from subjective or objective factors can prevent UAVs from sending and receiving information correctly, causing collaborative missions to fail. To address this problem, a communication fault-tolerant UAV collaboration method based on reinforcement learning and semantic communication is proposed for leader-follower escort flight in communication-constrained environments. Building on a reinforcement-learning-based follower tracking policy, the method introduces a semantic communication mechanism and a leader behavior prediction algorithm based on Proximal Policy Optimization (PPO). Under normal communication, the follower UAV receives the leader's messages and executes the corresponding commands. When communication is constrained, the follower extracts semantic information from historical flight and communication data and matches it against its own semantic communication model to infer the leader's future target; combined with the learned prediction model of the leader's behavior patterns, this determines the follower's heading. Without carrying additional anti-interference equipment, the scheme resists communication interference to a certain extent and maintains uninterrupted following of the leader, improving collaboration efficiency in communication-constrained environments. Experiments show that, compared with conventional methods, the proposed method adapts better to complex environments, achieves better following performance, and effectively improves the mission success rate under intermittent communication, offering a feasible solution for collaborative UAV applications in communication-constrained environments.

Key words

UAV anti-jamming/communication jamming/semantic communication/reinforcement learning

Funding

National Natural Science Foundation of China (62071343)

Open Fund of the State Key Laboratory of Public Big Data (PBD2023-12)

Publication year

2024

Chinese Journal of Network and Information Security

Posts & Telecom Press

CSTPCD

ISSN: 2096-109X

Number of references: 17
