Learning robot manipulation skills with deep reinforcement learning has become a research hotspot. However, because rewards in manipulation tasks are sparse, learning efficiency is low. This paper proposes a meta-learning-based double experience replay buffer adaptive soft hindsight experience replay (DAS-HER) algorithm and applies it to manipulation skills learning with sparse rewards. Firstly, building on the soft hindsight experience replay (SHER) algorithm, a simplified value function that improves the efficiency of the algorithm is derived, and a temperature adaptive adjustment strategy is introduced that dynamically adjusts the temperature parameters to suit different task environments. Secondly, combined with meta-learning, the experience replay buffer is segmented, and the ratio of real sampled data to constructed virtual data is adjusted dynamically during training, which yields the DAS-HER algorithm. Thirdly, a generalized framework for robot manipulation skills learning in sparse-reward environments is constructed, and the DAS-HER algorithm is applied within it. Finally, comparative experiments on eight tasks are conducted in the Fetch and Hand environments under MuJoCo, and the results show that the proposed algorithms outperform the other algorithms in training efficiency and success rate.
Key words
robot manipulation skills learning/reinforcement learning/sparse reward/maximum entropy methods/adaptive temperature parameters/meta-learning
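The two ideas the abstract combines, hindsight relabeling under a sparse reward and a batch that mixes real with constructed (virtual) data at an adjustable ratio, can be illustrated with a minimal sketch. This is not the DAS-HER implementation: the relabeling rule is the generic "final goal" hindsight strategy, the ratio is passed in as a fixed number rather than adjusted by meta-learning, and every name here (`Transition`, `sparse_reward`, `relabel`, `sample_batch`, `real_ratio`, `tol`) is hypothetical.

```python
import random
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    achieved: Tuple[float, ...]  # goal-space position actually reached after the step
    goal: Tuple[float, ...]      # goal the agent was asked to reach
    reward: float


def sparse_reward(achieved, goal, tol=0.05):
    """Return 0 on success and -1 otherwise (tol is an assumed threshold)."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel(episode: List[Transition]) -> List[Transition]:
    """Generic hindsight relabeling: treat the last achieved state as the goal."""
    new_goal = episode[-1].achieved
    return [
        Transition(t.achieved, new_goal, sparse_reward(t.achieved, new_goal))
        for t in episode
    ]


def sample_batch(real, virtual, batch_size, real_ratio, rng):
    """Draw a batch from two buffers; real_ratio is the fraction of real data.

    In the paper the ratio is adjusted during training; here it is simply
    an argument (an assumption made for illustration).
    """
    n_real = max(0, min(batch_size, round(batch_size * real_ratio)))
    batch = [rng.choice(real) for _ in range(n_real)]
    batch += [rng.choice(virtual) for _ in range(batch_size - n_real)]
    return batch


# A failed episode: the agent never gets near the requested goal (1.0, 1.0).
target = (1.0, 1.0)
path = [(0.1, 0.0), (0.2, 0.1), (0.3, 0.2)]
real_buffer = [Transition(p, target, sparse_reward(p, target)) for p in path]
virtual_buffer = relabel(real_buffer)

batch = sample_batch(real_buffer, virtual_buffer, batch_size=8,
                     real_ratio=0.25, rng=random.Random(0))
```

Every real transition in this episode carries the failure reward, while the relabeled copy of the last step counts as a success. This is why the share of virtual data in each batch matters when rewards are sparse.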