Robotics & Machine Learning Daily News, 2024, Issue (Jun. 4): 91-91.

New Findings in Robotics and Automation Described from Budapest University of Technology and Economics (Adaptive Curriculum Learning With Successor Features for Imbalanced Compositional Reward Functions)

Abstract

By a News Reporter-Staff News Editor at Robotics & Machine Learning Daily News – New research on Robotics - Robotics and Automation is the subject of a report. According to news reporting out of Budapest, Hungary, by NewsRx editors, research stated, “This work addresses the challenge of reinforcement learning with reward functions that feature highly imbalanced components in terms of importance and scale. Reinforcement learning algorithms generally struggle to handle such imbalanced reward functions effectively.” Financial support for this research came from the Ministry of Culture and Innovation of Hungary, from the National Research, Development, and Innovation Fund. Our news journalists obtained a quote from the research from the Budapest University of Technology and Economics, “Consequently, they often converge to suboptimal policies that favor only the dominant reward component. For example, agents might adopt passive strategies, avoiding any action to evade potentially unsafe outcomes entirely. To mitigate the adverse effects of imbalanced reward functions, we introduce a curriculum learning approach based on the successor features representation.” According to the news editors, the research concluded: “This novel approach enables our learning system to acquire policies that take into account all reward components, allowing for a more balanced and versatile decision-making process.”

Key words

Budapest/Hungary/Europe/Robotics and Automation/Robotics/Emerging Technologies/Machine Learning/Reinforcement Learning/Budapest University of Technology and Economics

Publication year

2024