
Event-triggered multi-agent hierarchical safe reinforcement learning for motion planning

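To address the action-safety problem that arises in the sequential decision-making process of deep reinforcement learning, this paper studies an event-triggered multi-agent hierarchical safe reinforcement learning method for motion planning. First, based on the constrained Markov decision model, a multi-agent deep deterministic policy gradient framework with safety constraints is constructed; the framework learns motion policies hierarchically over different state spaces in an event-triggered manner. Then, a Lyapunov critic network is introduced to establish a target-action selection mechanism with conditional constraints, and the Lagrange multiplier method is used to overcome the difficulty of solving under multi-objective constraints, ensuring the safety of the robot's internal decisions. Finally, the proposed method is tested in multi-robot reinforcement learning scenarios. The experimental results show that the event-triggered multi-agent hierarchical safe reinforcement learning method lets the robot's state trajectory recover quickly from dangerous states to the safe space, improving both policy safety and multi-robot cooperative motion planning.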
Multi-agent event triggered hierarchical security reinforcement learning
To address the security issues that may arise in the sequential decision-making process of deep reinforcement learning, this paper studies a motion planning method based on multi-agent event-triggered hierarchical security reinforcement learning (MEHSRL). First, the method constructs a multi-agent twin delayed deep deterministic policy gradient algorithm based on the constrained Markov decision model. The model uses state security events as trigger conditions to implement hierarchical reinforcement learning in different state spaces. Then, by introducing a Lyapunov evaluation network, additional safety constraint rules are constructed for the reinforcement learning network, and the safety of the robot's decisions is ensured through multi-constraint objective optimization. Finally, the proposed method is tested in a security reinforcement learning scenario. The results show that the proposed method restores the state trajectory from the dangerous state to the safe space within a limited time, improves the security of the strategy, and achieves better motion planning than the comparison methods.
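
The abstract names two mechanisms without detail: a safety event that triggers a switch between policy levels, and a Lagrange multiplier that trades reward against a safety constraint. The Python sketch below illustrates these two ideas only in their simplest generic form; it is not the authors' implementation. It assumes a single scalar safety cost, a distance-based trigger, and two policy labels. Every name in it (`LagrangeMultiplier`, `is_unsafe`, `select_policy`, `safe_radius`, and so on) is hypothetical, and the Lyapunov network described in the abstract is not modeled.

```python
from dataclasses import dataclass


@dataclass
class LagrangeMultiplier:
    """Dual variable for one safety constraint of the form E[cost] <= limit.

    Hypothetical helper for illustration; not taken from the paper.
    """
    value: float = 0.0
    lr: float = 0.05

    def update(self, avg_cost: float, limit: float) -> float:
        # Projected dual ascent: the multiplier grows while the constraint
        # is violated, shrinks otherwise, and is clipped at zero.
        self.value = max(0.0, self.value + self.lr * (avg_cost - limit))
        return self.value


def penalized_objective(reward: float, cost: float, lam: float, limit: float) -> float:
    """Lagrangian of the constrained problem: reward - lam * (cost - limit)."""
    return reward - lam * (cost - limit)


def is_unsafe(distance_to_obstacle: float, safe_radius: float) -> bool:
    """Assumed safety event: the robot is closer to an obstacle than safe_radius."""
    return distance_to_obstacle < safe_radius


def select_policy(distance_to_obstacle: float, safe_radius: float) -> str:
    """Event-triggered switch between two policy levels (labels are assumptions)."""
    if is_unsafe(distance_to_obstacle, safe_radius):
        return "recovery"  # safety-oriented policy acts in the dangerous region
    return "task"          # goal-directed policy acts in the safe region
```

In this sketch the switch is evaluated at every step, so control returns to the task policy as soon as the state leaves the dangerous region, while the multiplier is updated once per batch from the average observed cost.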

reinforcement learning; security constraint; motion planning; multi-agent; event triggered

Sun Huihui (孙辉辉), Hu Chunhe (胡春鹤), Zhang Junguo (张军国)


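School of Technology, Beijing Forestry University, Beijing 100083, China

School of Mechanical and Electrical Engineering, Huainan Normal University, Huainan 232038, Anhui, China

State Key Laboratory of Efficient Production of Forest Resources, Beijing 100083, China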

reinforcement learning; safety constraint; motion planning; multi-agent; event-triggered

2024

Control and Decision (控制与决策)
Northeastern University


CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.227
ISSN: 1001-0920
Year, volume (issue): 2024, 39(11)