Parallel reinforcement learning with eligibility traces

Yang Xudong¹, Liu Quan¹, Li Jin¹

Author information

1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China

Abstract

Reinforcement learning is an important machine learning method, but slow convergence is one of its main shortcomings in practical applications. To improve learning efficiency, a parallel reinforcement learning algorithm based on eligibility traces is proposed, together with a framework model for its implementation and several feasible optimizations. Because algorithms that use eligibility traces are inherently parallel, multiple computing nodes can share the work of updating the value function table and the eligibility trace table, which improves the learning efficiency of the whole system. Experimental results show that the proposed method has certain advantages over the two main existing parallel reinforcement learning algorithms.
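The inherent parallelism mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: it assumes tabular Sarsa(λ) with accumulating traces, a flat index per state-action pair, and contiguous shards of the two tables assigned to each node. The names `make_shards`, `update_shard`, and `sarsa_lambda_step` are hypothetical. Once the TD error is known, every table entry is updated independently of the others, so each shard can be handled by a different node; the optional thread pool here only stands in for those nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def make_shards(n_entries, n_nodes):
    """Split indices 0..n_entries-1 into contiguous (start, stop) ranges, one per node."""
    base, extra = divmod(n_entries, n_nodes)
    shards, start = [], 0
    for i in range(n_nodes):
        stop = start + base + (1 if i < extra else 0)
        shards.append((start, stop))
        start = stop
    return shards


def update_shard(q, e, shard, delta, alpha, gamma, lam):
    """Sweep one node's slice of the value table q and the trace table e."""
    start, stop = shard
    for i in range(start, stop):
        q[i] += alpha * delta * e[i]  # value update scaled by the trace
        e[i] *= gamma * lam           # decay the trace


def sarsa_lambda_step(q, e, idx, next_idx, reward, shards,
                      alpha=0.1, gamma=0.9, lam=0.8, pool=None):
    """One Sarsa(lambda) step; next_idx is None on a terminal transition."""
    next_q = 0.0 if next_idx is None else q[next_idx]
    delta = reward + gamma * next_q - q[idx]  # TD error, computed once
    e[idx] += 1.0                             # accumulating trace (assumption)
    if pool is None:
        for shard in shards:
            update_shard(q, e, shard, delta, alpha, gamma, lam)
    else:
        # Shards are disjoint, so the workers never write to the same entry.
        list(pool.map(
            lambda s: update_shard(q, e, s, delta, alpha, gamma, lam),
            shards,
        ))
    return delta
```

Because the shards are disjoint and the TD error is shared, the sharded sweep produces the same tables as a single-node sweep. In a real distributed setting the cost of broadcasting the TD error and the visited index to every node would have to be weighed against the saved update time.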

Key words

parallel algorithms / reinforcement learning / Sarsa(λ) learning / Tic-tac-toe

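Funding

National Natural Science Foundation of China (60873116)

National Natural Science Foundation of China (61070223)

Natural Science Foundation of Jiangsu Province (BK2008161)

Natural Science Foundation of the Jiangsu Higher Education Institutions (09KJA520002)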

Publication year

2012

Journal of Soochow University (Natural Science Edition)

Soochow University

Impact factor: 0.237

ISSN: 1000-2073

Citations: 1

References: 1
