Model-agnostic Meta Reinforcement Learning Based on Similarity Weighting
Reinforcement learning has achieved excellent performance in fields such as video games and robotic control. To further improve training efficiency, meta-learning has been extended to reinforcement learning, and the resulting meta-reinforcement learning has become a research hotspot in the field. The quality of the meta-knowledge is the key factor determining the effectiveness of meta-reinforcement learning; gradient-based meta-reinforcement learning treats the model's initial parameters as meta-knowledge that guides subsequent learning. To improve the quality of this meta-knowledge, we propose a general meta-reinforcement learning method that, through weighting, makes explicit the contribution of each subtask to the training process. The proposed method uses the similarity between the gradient update vector obtained from each subtask and the gradient update vector obtained from the overall task set as the update weight. This improves the gradient update process and the quality of the meta-knowledge encoded in the initial parameters, so that the trained model starts from a good initial point when solving a new task. The proposed method can be incorporated into gradient-based meta-reinforcement learning to solve new tasks quickly from a small number of samples. In experiments on 2D navigation tasks and locomotion tasks, the proposed method outperforms the benchmark algorithms, demonstrating the soundness of the weighting mechanism.
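The core weighting idea described above can be sketched as follows. This is a minimal, hypothetical illustration (not the paper's implementation): each subtask's meta-gradient is weighted by its cosine similarity to the aggregate gradient over the whole task batch, and the function name, the learning rate, and the choice to clip negative similarities to zero are all assumptions made for the sketch.

```python
import numpy as np

def similarity_weighted_update(theta, task_grads, lr=0.1):
    """Update initial parameters theta using per-task meta-gradients,
    weighting each task's gradient by its cosine similarity to the
    mean gradient over the whole task batch (hypothetical sketch)."""
    mean_grad = np.mean(task_grads, axis=0)
    weights = []
    for g in task_grads:
        denom = np.linalg.norm(g) * np.linalg.norm(mean_grad)
        # Cosine similarity between this task's gradient and the batch mean;
        # tasks whose gradients conflict with the consensus get low weight.
        sim = float(g @ mean_grad / denom) if denom > 0 else 0.0
        weights.append(max(sim, 0.0))  # clip negatives (an assumption)
    w = np.array(weights)
    if w.sum() > 0:
        w = w / w.sum()  # normalize so the weights sum to 1
    else:
        w = np.full(len(task_grads), 1.0 / len(task_grads))
    # Weighted aggregate replaces the usual uniform average of MAML-style updates.
    weighted_grad = np.tensordot(w, np.stack(task_grads), axes=1)
    return theta - lr * weighted_grad
```

With two identical task gradients the weights are uniform and the update reduces to an ordinary gradient step, whereas a task whose gradient opposes the batch consensus would be down-weighted toward zero.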