Reinforcement Learning with Sparse Representation via Sparse Overlapping Group Lasso

Owing to its strong data efficiency and fast learning, reinforcement learning has begun to be applied to many real-world problems to learn complex policies. However, reinforcement learning in high-dimensional environments is often limited by the curse of dimensionality or by catastrophic interference, which leads to poor performance or even learning failure. Focusing on representation learning, this paper proposes a sparse convolutional deep reinforcement learning method based on Lasso-type optimization. First, the theory and advantages of sparse representation are reviewed, and sparse convolution is introduced into deep reinforcement learning to obtain a new sparse representation method. Second, the differentiable optimization layer defined by sparse convolutional coding is derived mathematically, and an optimization algorithm for it is given. To verify the effectiveness of the new sparse representation method, it is tested in benchmark environments commonly used in the related literature. The experimental results show that the algorithm using sparse convolutional coding has better performance and robustness, achieving comparable or even superior performance while reducing model overhead by more than 50%. In addition, the impact of the degree of sparsity on algorithm performance is investigated, and the results show that an appropriate level of sparsity yields better performance.
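To make the Lasso-type sparse coding mentioned in the abstract concrete, the following is a minimal sketch and not the paper's method. It applies the standard iterative shrinkage-thresholding algorithm (ISTA) to a plain Lasso problem, minimizing 0.5 * ||x - D z||^2 + lam * ||z||_1 over the code z. The function names `soft_threshold`, `lasso_objective`, and `ista_lasso`, the random dictionary `D`, and all parameter values are illustrative assumptions; the paper's sparse convolutional coding layer and its overlapping group penalty are not reproduced here.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def lasso_objective(D, x, z, lam):
    """Value of 0.5 * ||x - D z||^2 + lam * ||z||_1."""
    residual = x - D @ z
    return 0.5 * residual @ residual + lam * np.sum(np.abs(z))


def ista_lasso(D, x, lam, n_iter=200):
    """Approximately solve the Lasso problem with ISTA, starting from z = 0."""
    # Step size 1/L, where L is the Lipschitz constant of the smooth term's
    # gradient (the squared spectral norm of D).
    L = np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)                   # gradient of 0.5 * ||x - D z||^2
        z = soft_threshold(z - grad / L, lam / L)  # gradient step, then shrinkage
    return z


# Hypothetical data: a random dictionary and a signal built from three atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
z_true = np.zeros(50)
z_true[[3, 17, 41]] = [1.5, -2.0, 1.0]
x = D @ z_true

for lam in (0.1, 5.0):
    z_hat = ista_lasso(D, x, lam)
    print(f"lam={lam}: {np.count_nonzero(z_hat)} nonzero coefficients")
```

In this kind of formulation, a larger `lam` typically drives more coefficients to exactly zero, which corresponds to the degree of sparsity whose effect on performance the abstract reports studying.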

reinforcement learning; catastrophic interference; sparse representation; implicit layer; Lasso optimization

Cai Linyi (蔡林逸), Feng Xiang (冯翔), Yu Huiqun (虞慧群)

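Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China

Shanghai Engineering Research Center of Smart Energy, Shanghai 200237, China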

2024

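Journal of East China University of Science and Technology (Natural Science Edition)
East China University of Science and Technology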

Indexed in: CHSSCD; Peking University Core Journals
Impact factor: 0.289
ISSN: 1006-3080
Year, volume (issue): 2024, 50(6)