Science China Technological Sciences (English edition), 2024, Vol. 67, Issue 1: 172-182. DOI: 10.1007/s11431-023-2435-3

Robust reinforcement learning with UUB guarantee for safe motion control of autonomous robots

ZHANG RuiXian¹, HAN YiNing², SU Man³, LIN ZeFeng¹, LI HaoWei¹, ZHANG LiXian¹

Author information

  • 1. School of Astronautics, Harbin Institute of Technology, Harbin 150001, China
  • 2. School of Management, Harbin Institute of Technology, Harbin 150001, China
  • 3. Beijing Institute of Tracking and Telecommunication Technology, Beijing 100094, China

Abstract

This paper addresses the issue of safety in reinforcement learning (RL) with disturbances and its application in the safety-constrained motion control of autonomous robots. To tackle this problem, a robust Lyapunov value function (rLVF) is proposed. The rLVF is obtained by introducing a data-based LVF under the worst-case disturbance of the observed state. Using the rLVF, a uniformly ultimate boundedness (UUB) criterion is established. This criterion is intended to ensure that the cost function, which serves as a safety criterion, ultimately converges to a bounded range under the policy to be designed. Moreover, to mitigate the drastic variation of the rLVF caused by differences in states, a smoothing regularization of the rLVF is introduced. To train policies with safety guarantees under the worst-case disturbances of the observed states, an off-policy robust RL algorithm is proposed. The proposed algorithm is applied to motion control tasks of an autonomous vehicle and a cartpole, which involve external disturbances and variations of the model parameters, respectively. The experimental results demonstrate the effectiveness of the theoretical findings and the advantages of the proposed algorithm in terms of robustness and safety.
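
For readers unfamiliar with the term, the following is a sketch of what a uniformly ultimate boundedness statement for a safety cost typically looks like. It adapts the standard textbook definition of UUB to a cost signal and is not the criterion derived in the paper. All symbols are assumptions introduced here for illustration: $s_t$ is the state at step $t$, $c(s_t) \ge 0$ the safety cost, $\pi$ the policy, $\eta$ the ultimate bound, and $\bar{a}$ the size of the admissible initial-cost range.

```latex
% Illustrative only: the standard UUB notion applied to an expected cost.
% Symbols s_t, c, \pi, \eta, \bar{a}, T are assumed, not taken from the abstract.
\exists\, \eta > 0,\ \bar{a} > 0 \ \text{such that}\
\forall\, a \in (0, \bar{a}),\ \exists\, T(a, \eta) \ge 0:
\qquad
c(s_0) \le a
\;\Longrightarrow\;
\mathbb{E}_{\pi}\!\left[ c(s_t) \right] \le \eta
\quad \forall\, t \ge T(a, \eta).
```

Read this way, the cost need not converge to zero; it only has to enter, and remain in, a bounded range after a finite time that does not depend on when the trajectory starts.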

Key words

motion control/reinforcement learning/robustness/stability


Funding

National Natural Science Foundation of China (62225305)

National Natural Science Foundation of China (12072088)

Fundamental Research Funds for the Central Universities, China (HIT.BRET.2022004)

Fundamental Research Funds for the Central Universities, China (HIT.OCEF.2022047)

Fundamental Research Funds for the Central Universities, China (HIT.DZIJ.2023049)

State Key Laboratory of Robotics and System (HIT) (JCKY2022603C016)

Heilongjiang Touyan Team()

Publication year

2024
Science China Technological Sciences (English edition)
Chinese Academy of Sciences

Indexed in: CSTPCD, EI
Impact factor: 1.056
ISSN: 1674-7321
Number of references: 31