Acta Automatica Sinica (自动化学报), 2024, Vol. 50, Issue 9: 1704-1723. DOI: 10.16383/j.aas.c230081

A Survey of Inverse Reinforcement Learning Algorithms, Theory and Applications

Song Li (宋莉)¹, Li Dazi (李大字)¹, Xu Xin (徐昕)²

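Author information

  • 1. College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029
  • 2. College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073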

Abstract

With improvements in high-dimensional feature representation and function approximation, reinforcement learning (RL) has made significant progress in real-world problems such as game playing, optimization and decision-making, and intelligent driving. However, manually designing a reward function for the interaction between an agent and its environment is difficult, which motivated researchers to propose inverse reinforcement learning (IRL). How to learn reward functions from expert demonstrations and carry out policy optimization is an important research topic in artificial intelligence. This paper gives a comprehensive overview of recent progress in IRL algorithms. It first introduces new theoretical advances in IRL, then analyzes the challenges IRL faces and its future development trends, and finally discusses the progress and prospects of IRL applications.

Key words

Reinforcement learning (RL) / inverse reinforcement learning (IRL) / linear inverse reinforcement learning / deep inverse reinforcement learning / adversarial inverse reinforcement learning

Funding

National Natural Science Foundation of China (62273026)

Publication year

2024
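Acta Automatica Sinica (自动化学报)
Sponsors: Chinese Association of Automation; Institute of Automation, Chinese Academy of Sciences

Indexed in: CSTPCD, Peking University Core Journals (北大核心)
Impact factor: 1.762
ISSN: 0254-4156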