Classification of malicious code based on Transformer and CNN
Existing CNN-based malware classification methods suffer from high training costs and low accuracy on minority classes. To overcome these limitations, this paper proposes a classification method based on an improved MobileViT, which combines the characteristics of CNNs and Transformers. First, a preprocessing method for visualized malicious code samples is adopted to accelerate model convergence. Then, combining a CNN with a self-attention mechanism, a cost-sensitive MobileViT model is designed: the Transformer encoder structure is improved and Focal Loss is introduced, which reduces the training cost of the model while enhancing its ability to represent malicious code samples and ensuring attention to minority classes. Experimental results demonstrate that the improved MobileViT model maintains an advantage in accuracy while significantly reducing the number of network layers and parameters. On the Microsoft malware classification dataset, the improved model reaches an accuracy of 98.88%, with improvements of 1.7%, 2.0%, and 2.1% in precision, recall, and F1 score, respectively, over the unmodified model. The model achieves over 99% accuracy for large malware families and up to a 17% improvement for small malware families.
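To illustrate how Focal Loss keeps attention on minority classes, the following is a minimal sketch of the standard multi-class formulation, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), written in plain Python. It is not the paper's implementation: the function names, the default gamma = 2.0, and the optional per-class weights alpha are assumptions chosen for illustration, since the abstract does not state the hyperparameters used.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample.

    logits: raw scores, one per malware family (hypothetical input).
    target: index of the true family.
    gamma:  focusing parameter (assumed value; gamma = 0 gives cross-entropy).
    alpha:  optional list of per-class weights (assumed, e.g. larger for
            small families); None means every class has weight 1.
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t) ** gamma shrinks the loss of well-classified samples
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, i.e. focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


if __name__ == "__main__":
    logits = [5.0, 0.0]  # the model strongly favors class 0
    for target, label in [(0, "easy"), (1, "hard")]:
        ce = cross_entropy(logits, target)
        fl = focal_loss(logits, target)
        print(f"{label}: CE={ce:.4f} FL={fl:.4f} ratio={fl / ce:.4f}")
```

Under these assumptions, a confidently correct sample contributes almost nothing to the loss, while a misclassified sample keeps nearly its full cross-entropy value, so gradient updates are dominated by hard examples, which tend to come from small families.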