Application of ACFG Based on N-gram Improved Features in GCC Compiler Version Identification

This article explores an attributed control flow graph (ACFG) with N-gram-improved features, combined with an optimized LightGBM classifier, to identify GCC compiler versions precisely. The research focuses on two problems: extracting key features and constructing the discriminant function. To identify the key features of compiled output, an N-gram association model is constructed that correlates the statistical features of registers and opcodes, so that local features within each code block are fully preserved. In addition, building on the improved ACFG framework, graph features aggregated through the N-gram association capture the contextual information between the code blocks of an instruction sequence. For the discriminant function, experiments verify the significant advantage of the LightGBM classifier in handling complex features, and a Bayesian algorithm is used for hyperparameter optimization. The article concludes with suggestions for further improving model performance through strategies such as optimization with generative adversarial networks (GANs).
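
The abstract does not give the feature construction in detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each basic block is a list of hypothetical `(opcode, registers)` pairs, counts N-grams over the interleaved opcode and register tokens of each block, and then adds a down-weighted copy of each successor block's counts as a simple one-hop graph aggregation. The function names, the toy control flow graph, and the `neighbor_weight` parameter are all illustrative assumptions.

```python
from collections import Counter


def block_tokens(instructions):
    """Flatten (opcode, registers) pairs into one token stream so that
    n-grams can link each opcode with the registers around it."""
    tokens = []
    for opcode, registers in instructions:
        tokens.append(opcode)
        tokens.extend(registers)
    return tokens


def ngram_counts(tokens, n=2):
    """Count contiguous n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def aggregate_features(blocks, edges, n=2, neighbor_weight=0.5):
    """Per-block n-gram counts plus a weighted sum over successor blocks.

    blocks: dict mapping block name -> list of (opcode, registers)
    edges:  dict mapping block name -> list of successor block names
    neighbor_weight is an assumed damping factor, not a value from the article.
    """
    local = {name: ngram_counts(block_tokens(instrs), n)
             for name, instrs in blocks.items()}
    aggregated = {}
    for name, counts in local.items():
        total = Counter(counts)  # start from the block's own local features
        for succ in edges.get(name, []):
            for gram, count in local[succ].items():
                total[gram] += neighbor_weight * count
        aggregated[name] = total
    return aggregated


# Hypothetical two-block control flow graph for illustration only.
blocks = {
    "entry": [("push", ["rbp"]), ("mov", ["rbp", "rsp"])],
    "exit": [("pop", ["rbp"]), ("ret", [])],
}
edges = {"entry": ["exit"]}
features = aggregate_features(blocks, edges)
```

In a full pipeline, the aggregated counts would be turned into fixed-length vectors and passed to a classifier such as LightGBM; that step is omitted here because the abstract does not specify it.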

Keywords: GCC compiler version identification; N-gram; ACFG; LightGBM

Chen Shu (陈舒), Dong Chenyang (董晨洋), Ye Huibin (叶慧斌), Han Quan (韩铨), Zhong Xiuyi (钟秀艺)

Department of Computer and Information Security Management, Fujian Police College, Fuzhou, Fujian 350007, China

2024

Mathematical Modeling and Its Applications (数学建模及其应用)

Impact factor: 0.215
ISSN:
Year, volume (issue): 2024, 13(4)