With the rapid development of the information society, the number of malware variants is increasing, posing challenges to existing detection methods. To improve the accuracy and efficiency of malware variant detection, this paper proposes a new hybrid architecture called FasterMalViT. The architecture enhances the Vision Transformer (ViT) by integrating partial convolutional structures, significantly improving its performance in malware detection. To offset the increase in parameter count caused by the added convolutional operations, the paper replaces traditional multi-head attention with a separable self-attention mechanism, reducing both the number of parameters and the computational cost. To address the imbalanced sample distribution in malware datasets, the paper introduces a class-balanced focal loss function that guides the model to pay more attention to categories with fewer samples during training, thereby improving performance on hard-to-classify categories. Experimental results on the Microsoft BIG, Malimg, and MalwareBazaar datasets demonstrate that FasterMalViT exhibits good detection performance and generalization ability.
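To make the class-balanced focal loss idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes the commonly used formulation in which each class is weighted by the inverse of its "effective number" of samples, (1 - beta) / (1 - beta^n), and the focal term (1 - p_t)^gamma is applied to softmax probabilities. The function names, the values of `beta` and `gamma`, and the per-class sample counts are all illustrative assumptions; the abstract does not specify them.

```python
import math


def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights from the inverse effective number of samples.

    Assumed formulation: w_c = (1 - beta) / (1 - beta ** n_c),
    rescaled so the weights sum to the number of classes.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cb_focal_loss(logits, label, weights, gamma=2.0):
    """Class-balanced focal loss for one sample (softmax variant, assumed).

    The focal factor (1 - p_t) ** gamma down-weights easy samples;
    weights[label] up-weights classes with few training samples.
    """
    p_t = softmax(logits)[label]
    return -weights[label] * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical class sizes: one large family and two rarer ones.
counts = [1000, 100, 10]
weights = class_balanced_weights(counts)

# Same prediction confidence, different true class: the rare class costs more.
loss_frequent = cb_focal_loss([2.0, 0.0, 0.0], 0, weights)
loss_rare = cb_focal_loss([0.0, 0.0, 2.0], 2, weights)
```

With `gamma = 0` and uniform weights the expression reduces to ordinary cross-entropy, which is a convenient sanity check when substituting a loss of this kind into an existing training loop.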