Journal of Shenyang Ligong University (沈阳理工大学学报), 2024, Vol. 43, Issue 5: 7-13. DOI: 10.3969/j.issn.1003-1251.2024.05.002

Intrusion Detection Technology Based on PER-PPO2

HUANG Yingchun (黄迎春), REN Guojie (任国杰)

Author information

  • 1. School of Information Science and Engineering, Shenyang Ligong University, Shenyang 110159, China

Abstract

With the rapid spread of informatization and intelligent technology, the scope of network attacks keeps expanding. Traditional intrusion detection algorithms, such as principal component analysis (PCA) combined with random forests or K-nearest neighbors, extract features poorly from high-volume network data and therefore classify with low accuracy. To address these problems, a new intrusion detection technique is proposed: the prioritized experience replay-proximal policy optimization clip (PER-PPO2) algorithm, which performs wrapper-style feature selection through reinforcement learning. A reward function built on the classifier's confusion matrix lets the deep reinforcement learning agent choose better features for the classifier from reward feedback, and prioritized experience replay optimizes the algorithm's training samples, improving its stability and convergence. The lightweight gradient boosting machine (LightGBM), which performs well, serves as the classifier. The model is evaluated on the NSL-KDD dataset; the results show that when the 41 features of the dataset are reduced to 8, the classification F1 score reaches 0.8713, which can meet the requirements of intrusion detection.
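The three ingredients named in the abstract (a confusion-matrix reward, prioritized experience replay, and the PPO clipped objective) can be illustrated with a minimal sketch. The abstract does not state the exact reward formula or any hyperparameters, so everything specific here is an assumption: the reward is taken to be the binary F1 score, the priority exponent `alpha=0.6` and clip range `eps=0.2` are common defaults rather than values from the paper, and all function names are hypothetical.

```python
import random


def f1_reward(tp, fp, fn):
    """Reward from binary confusion-matrix counts.

    Assumption: the reward is the F1 score; the paper's actual reward
    function is only described as 'based on the confusion matrix'.
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def per_probabilities(priorities, alpha=0.6):
    """Prioritized experience replay: P(i) = p_i^alpha / sum_j p_j^alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_indices(priorities, batch_size, alpha=0.6, rng=random):
    """Draw transition indices (with replacement) according to their priority."""
    probs = per_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)


def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate for one sample:
    min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)
```

In a wrapper-style setup, each action of the agent would toggle or pick a feature, the classifier would be retrained or re-scored on the selected subset, and the resulting confusion-matrix counts would be passed to something like `f1_reward`. Transitions with larger priority (for example, larger absolute advantage) would then be replayed more often through `sample_indices`.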

Key words

proximal policy optimization clip/prioritized experience replay/intrusion detection/deep reinforcement learning/lightweight gradient boosting machine (LightGBM)


Funding

National Natural Science Foundation of China (61971291)

Publication year

2024
Journal of Shenyang Ligong University
Shenyang Ligong University

Impact factor: 0.223
ISSN: 1003-1251