Multi-feature fusion malware detection method based on attention and gating mechanisms

Chen Zhongyuan¹, Zhang Jianbiao¹

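Author information

1. College of Computer Science, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China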

Abstract

With the rapid development of network technology, the number of malware samples and their variants keeps growing, making malware detection a major challenge in network security. Existing single-feature detection methods represent sample information inadequately, while multi-feature methods are limited in how they fuse features and fail to learn the complex relationships within and between features; both problems degrade detection performance. To address these issues, a malware detection method based on multimodal feature fusion, MFAGM, is proposed. By processing the .asm and .bytes files of the dataset, three key features of two types (opcode statistics sequences, API sequences, and grayscale image features) are extracted, so that each sample is characterized from multiple perspectives. To fuse these multimodal features more effectively, a feature fusion module called SA-JGmu is designed. The module uses a self-attention mechanism to capture the internal dependencies among features, uses a gating mechanism to strengthen the interaction between different features, and introduces weighted skip connections to further improve the representational capability of the model. Experimental results on the Microsoft Malware Classification Challenge (MMCC) dataset show that MFAGM achieves higher accuracy and F1 scores than other methods on the malware detection task.
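As an illustration of the grayscale image feature mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the .bytes files are hex dumps in which each line holds an address followed by hexadecimal byte values, with `??` marking unreadable bytes. The function names, the image width, and the choice to map `??` to 0 are assumptions made here for illustration only.

```python
def parse_bytes_dump(text):
    """Parse a hex dump into a flat list of byte values (0-255).

    Assumed layout: the first token on each line is an address,
    the remaining tokens are two-digit hex bytes or '??'.
    """
    values = []
    for line in text.splitlines():
        tokens = line.split()
        for token in tokens[1:]:  # skip the address column
            if token == "??":
                values.append(0)  # assumption: unknown bytes become 0
            else:
                values.append(int(token, 16))
    return values


def to_grayscale_rows(values, width=16):
    """Reshape byte values into rows of `width` pixels, zero-padding the last row."""
    if width <= 0:
        raise ValueError("width must be positive")
    padded = values + [0] * (-len(values) % width)
    return [padded[i:i + width] for i in range(0, len(padded), width)]


# Hypothetical three-line dump used only to exercise the functions.
sample = """00401000 56 8D 44 24
00401004 08 50 ?? FF
00401008 E8 1C"""

image = to_grayscale_rows(parse_bytes_dump(sample), width=4)
for row in image:
    print(row)
```

Each inner list is one row of pixel intensities. In practice the rows would typically be resized to a fixed shape before being passed to an image model, and the width would be chosen to suit the file sizes in the dataset.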

Key words

malware detection/deep learning/feature fusion/multimodal learning/static analysis

Funding

Beijing Natural Science Foundation (M21039)

Publication year

2024

Chinese Journal of Network and Information Security

Posts & Telecom Press

CSTPCD

ISSN: 2096-109X

Number of references: 35
