Deep reinforcement learning, a key technology behind breakthrough systems such as AlphaGo and ChatGPT, has become a research hotspot in frontier science. In practice, it serves as an important intelligent decision-making technology and is widely used in planning and decision-making tasks such as obstacle avoidance in visual scenes, optimal generation of virtual scenes, robotic arm control, digital design and manufacturing, and industrial design decision-making. However, deep reinforcement learning suffers from low sample efficiency, which greatly limits its effectiveness in practical applications. To improve sample efficiency, this paper proposes an efficient exploration method based on large model guidance, which combines a large model with mainstream exploration techniques. Specifically, we use the semantic extraction capability of a large language model to obtain semantic information about states, which is then used to guide the exploration behavior of agents. We then introduce this semantic information into classical methods for single-policy exploration and for population-based exploration, respectively. By using the large model to guide the exploration behavior of deep reinforcement learning agents, our method achieves significant performance improvements in popular environments. This research not only demonstrates the potential of large model techniques for exploration problems in deep reinforcement learning, but also offers a new approach to alleviating low sample efficiency in practical applications.
deep reinforcement learning; large language model; efficient exploration