Journal of Army Engineering University of PLA, 2024, Vol. 3, Issue (2): 57-62. DOI: 10.12018/j.issn.2097-0730.20230928001

Reinforcement Learning Approach with Environment-Adaptive Gaussian Noise Augmentation

朱乐乾 潘志松

Author information

  • 1. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, Jiangsu, China

Abstract

Reinforcement learning from state-vector inputs is a fundamental research direction with broad application prospects. However, the low data efficiency of current reinforcement learning methods leads to long training times, which makes them difficult to apply in real-world environments. To address this problem, an environment-adaptive Gaussian noise augmentation (EAGNA) method is proposed and inserted as a module into the soft actor-critic (SAC) and proximal policy optimization (PPO) methods. Based on the distribution range of each element of the state vector in the task environment, Gaussian noise with a different mean and standard deviation is added to each element to augment the data. On three state-vector-based control tasks from the OpenAI Gym benchmark, EAGNA achieved higher average returns than the original algorithms, improving their data efficiency. Notably, on the Lunar Lander control task, which has complex state inputs, EAGNA achieved average returns 30.52 and 26.09 higher than those of SAC and PPO, respectively.
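The abstract states only that each state element receives Gaussian noise whose mean and standard deviation depend on that element's distribution range; it does not give the exact rule. The following is a minimal sketch of that idea under stated assumptions: the range is estimated as max minus min over a batch of observed states, and the noise mean and standard deviation are simple multiples of that range. The function names `estimate_ranges` and `augment_states` and the hyperparameters `std_scale` and `mean_scale` are hypothetical and do not come from the paper.

```python
import numpy as np


def estimate_ranges(states: np.ndarray) -> np.ndarray:
    """Per-element spread (max - min) over a batch of states: (n, d) -> (d,)."""
    return states.max(axis=0) - states.min(axis=0)


def augment_states(states, ranges, std_scale=0.05, mean_scale=0.0, rng=None):
    """Add element-wise Gaussian noise scaled by each element's range.

    std_scale and mean_scale are hypothetical hyperparameters (assumptions),
    not values reported in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    mean = mean_scale * ranges  # per-element noise mean, shape (d,)
    std = std_scale * ranges    # per-element noise std, shape (d,)
    # loc and scale broadcast across the batch dimension of `states`.
    noise = rng.normal(loc=mean, scale=std, size=states.shape)
    return states + noise


# Usage: three state elements with very different ranges.
rng = np.random.default_rng(0)
batch = rng.uniform(low=[-1.0, -10.0, 0.0], high=[1.0, 10.0, 100.0], size=(256, 3))
ranges = estimate_ranges(batch)
augmented = augment_states(batch, ranges, rng=rng)
```

In an SAC or PPO training loop, a step like this would presumably be applied to the sampled states before they are passed to the policy and value networks. Scaling by range keeps the perturbation proportionate for elements with small and large magnitudes alike, whereas a single shared standard deviation would swamp small-range elements or barely affect large-range ones.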

Key words

reinforcement learning/data augmentation/Gaussian noise/state vector input/environment adaptation

Funding

National Natural Science Foundation of China (62076251)

Publication year

2024
Journal of Army Engineering University of PLA
Scientific Research Department, PLA University of Science and Technology

Impact factor: 0.556
ISSN: 2097-0730
Number of references: 21