PROXIMAL POLICY OPTIMIZATION ALGORITHM COMBINING WITH ATTENTION MECHANISM AND CURIOSITY-DRIVEN
In most real-world problems, external rewards are very sparse, and the agent lacks an effective mechanism for updating its policy function because it receives little feedback. Relying on an intrinsic curiosity mechanism alone to drive exploration may cause the task to fail, because useless or harmful curiosity can mislead the agent. This paper proposes a proximal policy optimization algorithm that combines an attention mechanism with curiosity-driven exploration. Curiosity drives the agent to explore the unknown environment. Meanwhile, the attention mechanism keeps that curiosity rational: it effectively controls the abnormal exploration caused by harmful curiosity, so that the proximal policy optimization algorithm runs faster and updates its policy more stably. Experiments show that the agent performs better and obtains a higher average reward.
Deep reinforcement learning; Attention mechanism; Proximal policy optimization; Curiosity mechanism
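
The ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not say how the attention mechanism modulates curiosity, so the softmax gate below is purely an assumption, as are all function names, the mixing coefficient `beta`, and the clipping range `eps`. The curiosity term follows the common forward-model prediction-error formulation, and the loss is the standard PPO clipped surrogate.

```python
import numpy as np

def curiosity_reward(pred_next_feat, next_feat):
    """Intrinsic reward as forward-model prediction error.

    Assumption: curiosity is measured as half the squared error between
    predicted and observed next-state features, one value per time step.
    """
    return 0.5 * np.sum((pred_next_feat - next_feat) ** 2, axis=-1)

def attention_gate(scores):
    """Hypothetical attention gate over per-step relevance scores.

    Softmax weights are rescaled so that their mean is 1; steps judged
    irrelevant then have their curiosity reward damped, not removed.
    """
    z = scores - np.max(scores)          # subtract max for numerical stability
    w = np.exp(z) / np.sum(np.exp(z))
    return w * len(scores)

def combined_reward(r_ext, r_int, gate, beta=0.1):
    """Extrinsic reward plus gated intrinsic reward (beta is an assumed weight)."""
    return r_ext + beta * gate * r_int

def ppo_clip_loss(new_logp, old_logp, adv, eps=0.2):
    """Standard PPO clipped surrogate objective, negated for minimization."""
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * adv
    return -np.mean(np.minimum(unclipped, clipped))
```

In this sketch the gated reward would feed the advantage estimate passed to `ppo_clip_loss`, so a step with a large prediction error but a low attention score contributes little to the policy update.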