Multi-Stage Game-based Topology Deception Method Using Deep Reinforcement Learning

He Weizhen (何威振)¹, Tan Jinglei (谭晶磊)¹, Zhang Shuai (张帅)¹, Cheng Guozhen (程国振)², Zhang Fan (张帆)¹, Guo Yunfei (郭云飞)¹


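Author information

1. Institute of Information Technology, Information Engineering University, Zhengzhou 450001
2. Institute of Information Technology, Information Engineering University, Zhengzhou 450001; Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou 450001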

Abstract

Current network topology deception methods make decisions only in the spatial dimension and do not consider how to perform spatio-temporal, multi-dimensional topology deception in cloud-native network environments. To address this problem, this paper proposes a multi-stage Flipit game topology deception defense method based on deep reinforcement learning to obfuscate reconnaissance attacks in cloud-native networks. First, the topology deception attack-defense model in cloud-native network environments is analyzed. Then, by introducing a discount factor and transition probabilities, a multi-stage game model for network topology deception defense is constructed on the basis of Flipit. Next, building on an analysis of the attack and defense strategies in the game, a topology deception generation method based on deep reinforcement learning is developed to solve for the topology deception defense strategy of the multi-stage game model. Finally, experiments in a purpose-built environment show that the proposed method can effectively model and analyze topology deception attack-defense scenarios in cloud-native networks, and that the proposed algorithm has clear advantages over the other algorithms compared.

Key words

Cloud-native network/Topology deception/Multi-stage Flipit game/Deep reinforcement learning/Deep deterministic policy gradient


Publication year

2024

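Journal of Electronics & Information Technology

Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China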


CSTPCD / CSCD / Peking University Core Journals (北大核心)

Impact factor: 1.302

ISSN: 1009-5896
