A survey on model-based reinforcement learning

Reinforcement learning (RL) interacts with the environment to solve sequential decision-making problems via a trial-and-error approach. Errors are always undesirable in real-world applications, even though RL excels at playing complex video games that permit several trial-and-error attempts. To improve sample efficiency and thus reduce errors, model-based reinforcement learning (MBRL) is believed to be a promising direction, as it constructs environment models in which trial and error can occur without incurring actual costs. In this survey, we investigate MBRL with a particular focus on the recent advancements in deep RL. There is a generalization error between the learned model of a non-tabular environment and the actual environment. Consequently, it is crucial to analyze the disparity between policy training in the environment model and that in the actual environment, guiding algorithm design for improved model learning, model utilization, and policy training. In addition, we discuss the recent developments of model-based techniques in other forms of RL, such as offline RL, goal-conditioned RL, multi-agent RL, and meta-RL. Furthermore, we discuss the applicability and benefits of MBRL for real-world tasks. Finally, this survey concludes with a discussion of the promising future development prospects for MBRL. We believe that MBRL has great unrealized potential and benefits in real-world applications, and we hope this survey will encourage additional research on MBRL.

reinforcement learning; model-based reinforcement learning; planning; model learning; model learning with reduced error; model usage

Fan-Ming LUO, Tian XU, Hang LAI, Xiong-Hui CHEN, Weinan ZHANG, Yang YU

National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China

Polixir.ai, Nanjing 211106, China

Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

National Key Research and Development Program of China (2020AAA0107200); National Natural Science Foundation of China (61876077, 62076161)

2024

Science China Information Sciences
Chinese Academy of Sciences

Indexed in: CSTPCD, EI
Impact factor: 0.715
ISSN: 1674-733X
Year, volume (issue): 2024, 67(2)
  • 201