Research on Differential Privacy Protection of Two-player Games Based on Reinforcement Learning

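Abstract: This work addresses the two-player game problem. Building on the Q-learning algorithm, it updates the state-value function through neural network parameter approximation, uses an adaptive gradient optimization algorithm for the parameter updates, and regulates the behavior of the two agents with the idea of Nash equilibrium. To strengthen the protection offered by the model, differential privacy is added to the results, securing the data exchanged during the agents' game. Experimental results verify the usability of the algorithm: after multiple rounds of training, both agents reach their respective target points stably.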

English title: Research on Differential Privacy Protection of Two-player Games Based on Reinforcement Learning

English abstract: For the two-player game problem, building on the Q-learning algorithm, the state-value function is updated by neural network parameter approximation, an adaptive gradient optimization algorithm is selected for parameter updating, and the behaviors of the two agents are regulated by the Nash equilibrium idea. In addition, to improve the protection effect of the model, differential privacy protection is added to the results to ensure the security of the data during the two-player game. Finally, the experimental results verify the usability of the algorithm, which is able to train two agents to reach their respective target points stably after multiple rounds.
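
The abstract does not say which differential privacy mechanism is used or where the noise enters. As a rough illustration only, the sketch below assumes the Laplace mechanism applied to an agent's Q-value outputs before the greedy action is chosen. The function names, the `sensitivity` bound, and the `epsilon` budget are all hypothetical and are not taken from the paper.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    # Two i.i.d. exponentials with mean `scale` differ by a Laplace(0, scale) variable.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_q_values(q_values, epsilon, sensitivity, rng):
    """Add Laplace noise with scale sensitivity/epsilon to each Q-value.

    Assumption: `sensitivity` bounds how much one record can change a Q-value.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = sensitivity / epsilon
    return [q + laplace_noise(scale, rng) for q in q_values]


def greedy_action(q_values):
    """Return the index of the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.1, 0.9, 0.3]  # hypothetical Q-values for three actions
    noisy_q = privatize_q_values(q, epsilon=1.0, sensitivity=0.1, rng=rng)
    print("noisy Q-values:", noisy_q)
    print("chosen action:", greedy_action(noisy_q))
```

A smaller `epsilon` gives stronger privacy but noisier Q-values, so the greedy choice departs from the noise-free one more often. This is the usual trade-off between privacy and utility.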

English keywords:

reinforcement learning; differential privacy; two-player games

Authors:

Ma Mingyang, Yang Hongyong, Liu Fei

Affiliation:

School of Information and Electrical Engineering, Ludong University, Yantai, Shandong 264025, China

Keywords:

reinforcement learning; differential privacy; two-player games

Publication year:

2024

DOI:

10.13306/j.1672-3813.2024.04.016

Complex Systems and Complexity Science

Qingdao University

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.798

ISSN: 1672-3813

Year, volume (issue): 2024, 21(4)