Projected Reward for Multi-robot Formation and Obstacle Avoidance

To address excessive centralization, low system robustness, and poor formation stability in multi-robot cooperative formation tasks, this paper proposes the projected reward for multi-robot formation and obstacle avoidance (PRMFO) model, in which each robot makes decentralized decisions based on a unified state representation. First, a unified state representation method is designed so that every robot processes its interactions with the external environment in a consistent way. Second, a projection-based reward mechanism built on this representation vectorizes the reward along two dimensions, distance and direction, giving each robot a richer basis for its decisions. Third, to mitigate excessive centralization, an autonomous decision layer integrates the unified state representation and the projected reward mechanism into the soft actor-critic (SAC) algorithm to carry out cooperative formation and obstacle avoidance. Simulation experiments in the robot operating system (ROS) environment show that PRMFO improves the single-robot average return, success rate, and time metrics by 42%, 8%, and 9%, respectively, and keeps the multi-robot formation error within the range of 0 to 0.06, achieving high-precision multi-robot formation.
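The abstract does not give the reward formula, so the following is only a minimal sketch of what a projection-based reward along distance and direction could look like, not the authors' definition. The function name `projected_reward`, the weights `w_dist` and `w_dir`, and the choice of scalar projection plus cosine alignment are all assumptions made for illustration.

```python
import math


def projected_reward(prev_pos, pos, goal, w_dist=1.0, w_dir=0.5):
    """Hypothetical projection-based reward for one 2-D robot step.

    Assumption: the distance term is the scalar projection of the step
    onto the direction of the goal, and the direction term is the cosine
    of the angle between the step and that direction.
    """
    # Vector from the previous position to the goal, and the step taken.
    to_goal = (goal[0] - prev_pos[0], goal[1] - prev_pos[1])
    step = (pos[0] - prev_pos[0], pos[1] - prev_pos[1])

    goal_dist = math.hypot(*to_goal)
    step_len = math.hypot(*step)
    if goal_dist == 0.0 or step_len == 0.0:
        # Already at the goal, or no movement: nothing to project.
        return 0.0

    dot = to_goal[0] * step[0] + to_goal[1] * step[1]
    progress = dot / goal_dist                  # signed distance gained toward the goal
    alignment = dot / (goal_dist * step_len)    # cosine in [-1, 1]
    return w_dist * progress + w_dir * alignment


# A unit step straight toward the goal, then one straight away from it.
print(projected_reward((0, 0), (1, 0), (10, 0)))   # 1.5
print(projected_reward((0, 0), (-1, 0), (10, 0)))  # -1.5
```

In this sketch a step perpendicular to the goal direction earns zero reward, which illustrates how a vectorized reward can separate "moved far" from "moved in the right direction". In a SAC training loop the returned value would be the per-step reward of each robot's environment.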

Keywords: deep reinforcement learning; cooperative multi-robot; formation and obstacle avoidance; projected reward

GE Xing, QIN Li, SHA Ying

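College of Informatics, Huazhong Agricultural University, Wuhan 430070, Hubei, China

Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan 430070, Hubei, China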

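Funding: National Natural Science Foundation of China (62272188); National Social Science Fund of China, General Program (19BSH022); Fundamental Research Funds for the Central Universities (2662022XXYJ001, 2662022JC004, 2662021JC008, 2662023XXPY005)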

2024

Journal of Applied Sciences
Shanghai University; Shanghai Institute of Technical Physics, Chinese Academy of Sciences

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.594
ISSN: 0255-8297
Year, volume (issue): 2024, 42(1)