Research on Differential Privacy Protection of Two-player Games Based on Reinforcement Learning

For the two-player game problem, this work builds on the Q-learning algorithm: the state-value function is updated through neural-network parameter approximation, an adaptive gradient optimization algorithm is used for the parameter updates, and the behaviors of the two agents are regulated using the idea of Nash equilibrium. To improve the protection offered by the model, differential privacy is applied to the results, ensuring the security of the agents' data during the game. Finally, experimental results verify the usability of the algorithm, which trains the two agents to reach their respective target points stably after multiple rounds.
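The abstract does not say which differential privacy mechanism is used or where exactly the noise is injected, so the following is only a minimal sketch under stated assumptions: a tabular Q-learning update (a simplification of the neural-network approximation described above) followed by Laplace noise added to the values before they are released. The function names, the `sensitivity` and `epsilon` parameters, and the choice of the Laplace mechanism are illustrative assumptions, not the authors' implementation, and the Nash-equilibrium step is omitted.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    # Standard tabular Q-learning step (assumed stand-in for the
    # neural-network update mentioned in the abstract).
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def privatize(values, sensitivity, epsilon, rng):
    # Laplace mechanism: noise scale = sensitivity / epsilon.
    # Both parameters are hypothetical; the paper's values are not given here.
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


rng = random.Random(0)
q_table = [[0.0, 0.0], [0.0, 0.0]]  # 2 states x 2 actions
q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
noisy_row = privatize(q_table[0], sensitivity=1.0, epsilon=0.5, rng=rng)
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of noisier values, which is the usual trade-off when perturbing learned outputs.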

Keywords: reinforcement learning; differential privacy; two-player games

Authors: Ma Mingyang (马明扬), Yang Hongyong (杨洪勇), Liu Fei (刘飞)

School of Information and Electrical Engineering, Ludong University, Yantai, Shandong 264025, China

2024

Complex Systems and Complexity Science
Qingdao University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.798
ISSN: 1672-3813
Year, volume (issue): 2024, 21(4)