Computer Engineering and Design (计算机工程与设计), 2024, Vol. 45, Issue 8: 2272-2280. DOI: 10.16208/j.issn1000-7024.2024.08.005

Malware detection based on multi-byte frequency domain visualization and deep learning

(Original Chinese title: 基于多字节频率域可视化和深度学习的恶意软件检测)

Sun Shimiao (孙世淼)¹, Liu Yashu (刘亚姝)¹, Yan Hanbing (严寒冰)²

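Author information

  • 1. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102616
  • 2. Operations Department, National Computer Network Emergency Response Technical Team/Coordination Center of China (CNCERT/CC), Beijing 100029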

Abstract

As the number and variety of malware samples grow, malware visualization research has hit a bottleneck in improving detection efficiency. To improve accuracy, this paper proposes, from a frequency-domain perspective, a malware visualization method based on improved multi-order Markov probability. The visualization process takes into account the correlation between adjacent bytes and the byte distribution of assembly instructions of different lengths. Markov probabilities of different orders are calculated according to instruction length, and the resulting multi-order Markov images are used to expand the sample size. Combined with deep learning, the IM-CNN (image of multi-order Markov-CNN) detection framework is built for classification and detection. The results show that IM-CNN reaches an accuracy of up to 99% on both the CNCERT and BIG2015 datasets and is only slightly affected by how balanced the malware dataset is.
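To make the idea of a Markov byte image concrete, the following is a minimal sketch of the first-order base case only: it counts how often each byte value is immediately followed by each other byte value, normalizes each row into transition probabilities, and scales them to grayscale pixel values. The abstract does not specify how the paper's improved multi-order scheme selects orders from instruction lengths, so that part is not reproduced here. The function name `markov_byte_image` and the 0 to 255 scaling are assumptions made for illustration, not the authors' implementation.

```python
from typing import List


def markov_byte_image(data: bytes) -> List[List[int]]:
    """Return a 256x256 grayscale image of first-order byte transition probabilities.

    Hypothetical helper for illustration; not the method from the paper.
    """
    # counts[a][b] = number of times byte b immediately follows byte a
    counts = [[0] * 256 for _ in range(256)]
    for a, b in zip(data, data[1:]):
        counts[a][b] += 1

    image: List[List[int]] = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # This byte value never appears as a predecessor: leave the row black.
            image.append([0] * 256)
        else:
            # P(b | a) = counts[a][b] / total, scaled to the 0..255 pixel range
            # (the scaling is an assumption of this sketch).
            image.append([round(255 * c / total) for c in row])
    return image


if __name__ == "__main__":
    # Six example bytes; 0x55 is always followed by 0x8B in this sample.
    sample = bytes([0x55, 0x8B, 0xEC, 0x55, 0x8B, 0x5D])
    img = markov_byte_image(sample)
    print(img[0x55][0x8B])
```

Each row of the result is a probability distribution over the next byte, so the image size stays fixed at 256x256 regardless of file size, which is what makes such images convenient as CNN input.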

Key words

cybersecurity / malware / visualization / Markov / deep learning / convolutional neural network (CNN) / classification detection

Funding

National Key Research and Development Program of China (2018YFB0803604)

Publication year

2024
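Journal: Computer Engineering and Design (计算机工程与设计)
Publisher: Institute 706, Second Academy of China Aerospace Science and Industry Corporation (中国航天科工集团二院706所)
Indexed in: CSTPCD, Peking University Core Journals (北大核心)
Impact factor: 0.617
ISSN: 1000-7024
Number of references: 5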