Control Theory & Applications, 2024, Vol. 41, Issue 6: 1056-1066. DOI: 10.7641/CTA.2023.20950

PPO multi-objective task allocation method for heterogeneous crowd sensing

Yang Xiao (杨潇) 1, Guo Yinan (郭一楠) 2, Ji Jianjiao (吉建娇) 3, Liu Xu (刘旭) 1

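Author information

  • 1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116
  • 2. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116; School of Mechanical and Electrical Engineering, China University of Mining and Technology (Beijing), Beijing 100083
  • 3. Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, Jiangsu 221008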

Abstract

Task allocation in existing mobile crowd sensing systems mainly targets a single type of mobile user, while task allocation for heterogeneous crowd sensing, in which multiple types of mobile users coexist, has received comparatively little attention. To address this, we define the area accessibility of heterogeneous mobile users and classify the sensing sub-regions by type. Taking into account the time-varying number of sensing tasks and the time-varying size of the mobile user population, we then construct a multi-objective constrained optimization model for task allocation in dynamic heterogeneous crowd sensing systems. The model maximizes sensing quality and minimizes sensing cost, subject to constraints such as the maximum number of tasks each user can perform and the limited working time of unmanned aerial vehicles (UAVs). To solve this optimization problem, we propose a multi-objective evolutionary optimization algorithm based on proximal policy optimization (PPO). According to the current evolutionary state of the population, PPO selects the evolutionary operator with the highest reward value to generate the offspring population. Comparative experiments against several algorithms on different heterogeneous crowd sensing instances show that the Pareto-optimal solution set obtained by the proposed algorithm has the best convergence and distribution, and that the operator selection strategy effectively improves adaptability to time-varying factors and thereby algorithm performance.
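As a rough illustration of the operator-selection idea summarized in the abstract, the sketch below shows the standard PPO clipped surrogate term and a greedy choice over a small operator pool. It is only a minimal sketch under stated assumptions: the operator names, the logits, and the helper functions are hypothetical, and the abstract does not specify the paper's state encoding, reward definition, or network structure.

```python
import math

# Hypothetical operator pool; the abstract does not list the operators used.
OPERATORS = ["sbx_crossover", "de_rand_1", "polynomial_mutation"]


def softmax(logits):
    """Convert raw policy scores into a probability distribution."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_operator(logits):
    """Return the index of the operator the policy currently rates highest."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def clipped_surrogate(new_prob, old_prob, advantage, eps=0.2):
    """Standard PPO clipped objective term for a single sampled action.

    ratio = pi_new(a|s) / pi_old(a|s); clipping the ratio to
    [1 - eps, 1 + eps] limits how far one update can move the policy.
    """
    ratio = new_prob / old_prob
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    # Assumed policy scores for one population state (illustrative values).
    logits = [0.1, 2.0, -1.0]
    print("chosen operator:", OPERATORS[select_operator(logits)])
    print("surrogate term:", clipped_surrogate(0.6, 0.4, advantage=1.0))
```

In a full algorithm the logits would come from a policy network fed with features of the current population, and the advantage would be estimated from the reward observed after applying the chosen operator; both are left out here because the abstract does not describe them.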

Key words

heterogeneous crowd sensing/multi-objective optimization/reinforcement learning/proximal policy optimization


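Funding

National Natural Science Foundation of China (61973305)

National Natural Science Foundation of China (U23A20340)

National Natural Science Foundation of China (52121003)

National Key Research and Development Program of China (2022YFB4703700)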

Publication year

2024
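Control Theory & Applications
South China University of Technology; Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 1.076
ISSN: 1000-8152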