Research on Android Malware Detection Methods Based on Pre-trained Language Models (基于预训练语言模型的安卓恶意软件检测方法研究)

Abstract: In recent years, pre-trained learning algorithms for the Android system have advanced considerably. However, malicious code samples are difficult to obtain and labeled datasets are usually small, which limits the generalization performance of the trained models. This project therefore proposes to study program malware detection based on a pre-trained language model. The model is trained in an unsupervised manner on large-scale unlabeled APK data, extracting rich and complex semantic associations from the massive set of unlabeled APKs in order to improve its generalization performance. The language model is then optimized on labeled malicious code samples so that it detects malicious code more efficiently, thereby safeguarding the secure operation of the Android system.

Keywords: pre-trained language model; Android malware; detection methods

Authors: 蔡荟荃 (Cai Huiquan), 董满 (Dong Man)

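Affiliation: China Software Testing Center (Software and Integrated Circuit Promotion Center, Ministry of Industry and Information Technology), Beijing 100048, China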

Chinese keywords: 预训练语言模型; 安卓恶意软件; 检测方法

Publication year: 2024

Journal: 数码设计

ISSN: 1672-9129

Year, volume (issue): 2024, (13)