Proximal Policy Optimization Algorithm for UAV-assisted MEC Vehicle Task Offloading and Power Control

Mobile Edge Computing (MEC) assisted by Unmanned Aerial Vehicles (UAVs) is an effective architecture for flexibly handling computation-intensive, delay-sensitive vehicular tasks. However, achieving the best balance between task delay and energy consumption has long been a challenging problem in such Internet of Vehicles applications. To address it, this paper builds on the UAV-assisted MEC architecture and, taking into account dynamic characteristics such as time-varying wireless channels and high vehicle mobility, formulates a vehicular task offloading and power control optimization problem based on Non-Orthogonal Multiple Access (NOMA). The problem is then modeled as a Markov decision process, and a distributed deep reinforcement learning algorithm based on Proximal Policy Optimization (PPO) is proposed, so that each vehicle can autonomously decide its task offloading amount and the associated transmit power using only its own locally observed information, thereby balancing delay against energy consumption. Simulation results show that, compared with existing methods, the proposed PPO-based task offloading and power control scheme not only achieves markedly better delay and energy performance, improving the average system cost by at least 13%, but also provides a way to balance performance: by adjusting the user preference weight factor, the trade-off between system delay and energy consumption can be tuned to the desired balance.
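The abstract mentions two ingredients without giving formulas: a system cost that trades delay against energy through a preference weight, and a PPO-based policy update. The sketch below is only an illustration of how these might look. The weighted-sum form of the cost, the function names, and all parameter values are assumptions, not the paper's definitions; the clipped surrogate is the standard PPO objective, with `clip_eps=0.2` as a common default.

```python
import math


def system_cost(delay_s, energy_j, weight):
    """Hypothetical weighted-sum cost; `weight` in [0, 1] is the preference factor.

    weight -> 1 emphasizes delay, weight -> 0 emphasizes energy (assumed form).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * delay_s + (1.0 - weight) * energy_j


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Batch mean of the standard PPO clipped surrogate objective.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped * adv)  # pessimistic (lower) bound
    return total / len(advantages)


# A per-step reward for a vehicle agent could then be the negative cost.
reward = -system_cost(delay_s=2.0, energy_j=4.0, weight=0.5)
```

In a setup like this, sweeping `weight` between 0 and 1 is what would shift a learned policy between energy-saving and delay-minimizing behavior.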

Keywords: Unmanned Aerial Vehicle (UAV)-assisted computing; Mobile Edge Computing (MEC); Proximal Policy Optimization (PPO); Deep reinforcement learning; Power control and task offloading

Tan Guoping, Yi Wenxiong, Zhou Siyuan, Hu Hexuan

College of Computer and Information, Hohai University, Nanjing 211100

Funding: National Natural Science Foundation of China (61832005, U21B2016)

2024

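Journal of Electronics & Information Technology
Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China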

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.302
ISSN: 1009-5896
Year, volume (issue): 2024, 46(6)