Research on Autonomous Driving Algorithm Based on Meta-Reinforcement Learning
Autonomous driving models based on deep reinforcement learning have a weak "learning to learn" ability: they must be trained from scratch for each new driving task, they train slowly, and they generalize poorly. To address these problems, this paper proposes MPPO (Meta-PPO), an autonomous driving model based on meta-reinforcement learning. The MPPO model combines meta-learning with reinforcement learning. In the meta-training stage, a meta-learning algorithm trains a set of good parameters for the autonomous driving model, so that when the model faces a new driving task it can converge quickly after fine-tuning on a small number of samples, starting from these parameters. The experimental results show that, on the navigation scenario task, compared with the baseline autonomous driving model based on reinforcement learning, the convergence speed of the MPPO model increases by 2.52 times, the reward value increases by 7.50%, the offset decreases by 7.27%, and the generalization performance also improves to a certain extent.
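The idea of meta-training a shared set of parameters and then fine-tuning on a few samples can be illustrated with a minimal sketch. The abstract does not state which meta-learning algorithm, network, or simulator MPPO uses, so everything below is an assumption made for illustration: it substitutes a Reptile-style first-order meta-update for the paper's method, and it replaces PPO and the driving tasks with toy quadratic losses whose optimum differs per task. The names `sample_task`, `fine_tune`, and `meta_train` are hypothetical.

```python
import random


def loss(theta, target):
    """Toy task loss: squared distance between parameters and the task optimum."""
    return sum((p - t) ** 2 for p, t in zip(theta, target))


def grad(theta, target):
    """Gradient of the toy loss with respect to theta."""
    return [2.0 * (p - t) for p, t in zip(theta, target)]


def fine_tune(theta, target, steps, lr):
    """Run a few gradient steps on one task; stands in for few-sample fine-tuning."""
    theta = list(theta)
    for _ in range(steps):
        g = grad(theta, target)
        theta = [p - lr * gi for p, gi in zip(theta, g)]
    return theta


def sample_task(rng):
    """Hypothetical task distribution: optima scattered around (3.0, -2.0)."""
    return [3.0 + rng.uniform(-0.5, 0.5), -2.0 + rng.uniform(-0.5, 0.5)]


def meta_train(rng, meta_iters=200, inner_steps=3, inner_lr=0.1, meta_lr=0.2):
    """Reptile-style meta-training (an assumption, not the paper's algorithm).

    Each iteration adapts to one sampled task, then moves the shared
    parameters a small step toward the adapted parameters.
    """
    theta = [0.0, 0.0]
    for _ in range(meta_iters):
        task = sample_task(rng)
        adapted = fine_tune(theta, task, inner_steps, inner_lr)
        theta = [p + meta_lr * (a - p) for p, a in zip(theta, adapted)]
    return theta


rng = random.Random(0)
meta_init = meta_train(rng)

# A new task that was not seen during meta-training.
new_task = [3.2, -1.8]
loss_meta = loss(fine_tune(meta_init, new_task, steps=3, lr=0.1), new_task)
loss_scratch = loss(fine_tune([0.0, 0.0], new_task, steps=3, lr=0.1), new_task)

print("loss after 3 steps from meta-trained parameters:", loss_meta)
print("loss after 3 steps from scratch:", loss_scratch)
```

With the same three fine-tuning steps, the run that starts from the meta-trained parameters ends with a lower loss than the run that starts from scratch. This is a toy analogue of the faster convergence the abstract reports and is not a reproduction of its results.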