Research on UAV Path Planning Algorithm for Fairness Data Collection and Energy Supplement
Gao Sihua (高思华) 1, Li Junhui (李军辉) 1, Li Jianfu (李建伏) 1, Liu Baoyu (刘宝煜) 1
Author information
1. School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
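Abstract (translated from the Chinese)
UAV (Unmanned Aerial Vehicle)-assisted data collection and energy supplement in WSNs (Wireless Sensor Networks) suffer from single-source data and unbalanced energy supplement. To address this, the paper first formulates the fairness problem for data collection and energy supplement and models it mathematically. It then designs a reinforcement learning algorithm, DPDQN (Double Parametrized Deep Q-Networks), that plans the UAV's flight route and hover positions to improve data collection and energy supplement. DPDQN learns an action-selection policy that mixes discrete actions with several kinds of continuous actions, and its network model has two parts: a discrete action network and a continuous action network. The former plans the order in which the UAV visits the data-collection nodes; the latter optimizes the UAV's hover position around each data-collection node. Simulation results show that the proposed algorithm outperforms three existing baseline algorithms in data collection fairness, energy supplement fairness, flight distance, and comparisons across four factors that affect fairness, and that it is robust and stable.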
Abstract
UAV (Unmanned Aerial Vehicle)-assisted WSNs (Wireless Sensor Networks) suffer from single-source data collection and uneven energy supplement. In this article, we first formulate the fairness problem for data collection and energy supplement and develop a mathematical model for it. Then, a novel deep reinforcement learning algorithm, named DPDQN (Double Parametrized Deep Q-Networks), is designed to solve the proposed problem. The DPDQN algorithm incorporates a hybrid discrete-continuous action strategy, and its network model consists of two components, namely a discrete action network and a continuous action network. The former schedules the order in which the UAV visits the sensors in the WSN, and the latter optimizes the UAV's hover position around each visited sensor. Numerical results demonstrate that the DPDQN algorithm outperforms three existing solutions in data collection fairness, energy supplement fairness, flying distance, and comparisons across four factors that influence fairness. Furthermore, the results indicate that the algorithm is robust and stable.
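Keywords (translated from the Chinese)
fair data collection and energy supplement / UAV path planning / deep reinforcement learning / wireless sensor networks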
Keywords
fairness data collection and energy supplement / unmanned aerial vehicle path planning / deep reinforcement learning / wireless sensor networks
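
The abstract describes a hybrid action: a discrete choice of which sensor to visit next and a continuous choice of where to hover near it. The following is a minimal sketch of that selection step only, not the authors' implementation. It assumes a parametrized-action layout in which the continuous branch proposes one bounded hover offset per sensor and the discrete branch scores each sensor given that offset. The function names, the linear "networks", the feature layout, and `max_offset` are all hypothetical, and training, the fairness reward, and whatever "Double" refers to in DPDQN are omitted.

```python
import numpy as np

def hover_offsets(state, weights, max_offset):
    """Continuous branch (hypothetical linear stand-in for a network).

    state:   (d,) feature vector describing the UAV and the WSN
    weights: (n_sensors, 2, d) one 2 x d map per sensor
    Returns an (n_sensors, 2) array of (dx, dy) offsets, each component
    bounded to [-max_offset, max_offset] by tanh.
    """
    return max_offset * np.tanh(weights @ state)

def q_values(state, offsets, q_weights):
    """Discrete branch (hypothetical): score each sensor from the state
    concatenated with that sensor's proposed hover offset.

    q_weights: (n_sensors, d + 2)
    """
    n = offsets.shape[0]
    feats = np.hstack([np.tile(state, (n, 1)), offsets])
    return np.einsum("ij,ij->i", feats, q_weights)

def select_action(state, sensor_xy, weights, q_weights, visited,
                  max_offset=10.0, epsilon=0.0, rng=None):
    """Pick the next sensor (discrete) and its hover position (continuous).

    visited: boolean mask; already-visited sensors are never selected.
    With probability epsilon (and an rng supplied) a random unvisited
    sensor is chosen instead of the greedy one.
    """
    if visited.all():
        raise ValueError("no unvisited sensors left")
    offsets = hover_offsets(state, weights, max_offset)
    q = np.where(visited, -np.inf, q_values(state, offsets, q_weights))
    if rng is not None and rng.random() < epsilon:
        k = int(rng.choice(np.flatnonzero(~visited)))
    else:
        k = int(np.argmax(q))
    return k, sensor_xy[k] + offsets[k]
```

Scoring each sensor jointly with its own proposed offset is what ties the two branches together: the visiting order can depend on how favourable a hover position is available near each sensor.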