An Intrusion Detection Method Combining SMOTE-Tomek Link with an Ensemble Model

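With the rapid expansion of the global Internet and the growing complexity of network security threats, developing efficient and stable intrusion detection systems has become an important research task in network security. This paper addresses two problems common to intrusion detection datasets: class imbalance, caused by the difference in the number of normal and abnormal network behavior samples, and high dimensionality, caused by an excess of redundant and irrelevant features. Drawing on ensemble learning, it combines the SMOTE-Tomek Link combined sampling algorithm with three homogeneous models and proposes a weighted-voting ensemble model for imbalanced datasets. The imbalanced data are first preprocessed with SMOTE-Tomek Link; effective features are then selected with the random forest permutation importance measure, reducing both the detection error rate and the computational cost. The proposed ensemble model is compared experimentally with several machine learning models. The results show that the ensemble model reaches an accuracy of 97.84%, 1 to 4 percentage points higher than the single models; it substantially improves accuracy, precision, recall, and F1 score on attack classes with few samples, and it trains more efficiently and is more stable.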
An Intrusion Detection Approach Incorporating SMOTE-Tomek Link with Integrated Modeling
With the rapid expansion of the global Internet and the increasing complexity of network security threats, the development of efficient and stable intrusion detection systems has become an important research task in the field of network security. This paper focuses on two common problems in intrusion detection datasets: the first is the class imbalance caused by the difference in the number of normal and abnormal network behavior samples; the second is the high dimensionality caused by too many redundant and invalid features in the dataset. To this end, following the idea of ensemble learning, the SMOTE-Tomek Link combined sampling algorithm is fused with three homogeneous models, and a weighted-voting ensemble model for imbalanced datasets is proposed. The SMOTE-Tomek Link algorithm first preprocesses the imbalanced data, and the random forest permutation importance measure then selects effective features, reducing the detection error rate and the computational overhead. Comparative experiments against multiple machine learning models show that the ensemble model achieves an accuracy of 97.84%, outperforming single models by 1 to 4 percentage points. Notably, it significantly improves accuracy, precision, recall, and F1 score in classifying attacks with few samples, while offering higher training efficiency and greater model stability.
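The resampling and voting steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`smote`, `remove_tomek_links`, `weighted_vote`), the toy data, the choice to remove both members of each Tomek link, and the use of soft (probability-averaging) voting are all assumptions made for illustration, and the feature-selection step is omitted. In practice a library such as imbalanced-learn, which provides `SMOTETomek`, would normally be used instead.

```python
import math
import random

def nearest(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: math.dist(points[i], points[j]))

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points, each interpolated between a
    random minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        others = sorted((j for j in range(len(minority)) if j != i),
                        key=lambda j: math.dist(minority[i], minority[j]))
        j = rng.choice(others[:k])
        t = rng.random()  # position along the segment between the two points
        synthetic.append(tuple(a + t * (b - a)
                               for a, b in zip(minority[i], minority[j])))
    return synthetic

def remove_tomek_links(X, y):
    """Drop both members of every Tomek link, i.e. every pair of mutual
    nearest neighbours that carry different labels (assumed policy)."""
    drop = set()
    for i in range(len(X)):
        j = nearest(i, X)
        if y[i] != y[j] and nearest(j, X) == i:
            drop.update((i, j))
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]

def weighted_vote(probas, weights):
    """Weighted soft vote: average each model's class probabilities using
    the given weights and return the index of the highest-scoring class."""
    total = sum(weights)
    n_classes = len(probas[0])
    scores = [sum(w * p[c] for p, w in zip(probas, weights)) / total
              for c in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)

# Hypothetical toy data: label 0 = normal traffic, label 1 = attack.
majority = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (2, 1)]
minority = [(5, 5), (6, 5), (1.2, 1.1)]
X = majority + minority
y = [0] * len(majority) + [1] * len(minority)

synthetic = smote(minority, n_new=4)        # oversample the minority class
X_clean, y_clean = remove_tomek_links(X, y)  # clean overlapping boundary pairs

# Three hypothetical models' class probabilities for one sample.
probas = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
label_equal = weighted_vote(probas, [1, 1, 1])
label_skewed = weighted_vote(probas, [6, 1, 1])
```

With equal weights the two models favouring class 1 win the vote, while a large weight on the first model flips the decision to class 0, which is the effect a weighted-voting ensemble relies on when some base models are more trustworthy than others.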

intrusion detection; sampling algorithms; feature selection; ensemble learning; cyber security

Li Runjie, Zhang Xiaoqing, Liu Changhua

School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430048, Hubei, China

intrusion detection; sampling algorithm; feature selection; ensemble learning; network security

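Hubei Provincial Education Science and Technology Project; Wuhan Polytechnic University university-level research project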

B2020063; 2023Y44

2024

Computer Technology and Development
Shaanxi Computer Society

CSTPCD
Impact factor: 0.621
ISSN: 1673-629X
Year, volume (issue): 2024, 34(7)