网络安全与数据治理 (Cyber Security and Data Governance), 2024, Vol. 43, Issue 4: 24-27. DOI: 10.19358/j.issn.2097-1788.2024.04.004

Research on WebShell file detection based on BERT-LSTM model

Deng Quancai (邓全才) 1, Xu Huaibin (徐怀彬) 1

Author information

  • 1. School of Information Engineering, Hebei University of Architecture, Zhangjiakou, Hebei 075000, China

Abstract

Rule-based detection of WebShell files is difficult, so this work treats detection as a text classification problem and designs a WebShell detection method based on a BERT-LSTM model. First, publicly available benign and malicious PHP files are cleaned and compiled to obtain their opcode instruction sequences. Next, Bidirectional Encoder Representations from Transformers (BERT) converts the opcodes into feature vectors. Finally, a long short-term memory (LSTM) network builds the classification model by detecting features from the perspective of the text sequence. Experimental results show that the detection model achieves an accuracy of 98.95%, a recall of 99.45%, and an F1 score of 99.09%, outperforming the other models compared.
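The abstract reports recall and F1 but not precision. As a rough worked example, the precision implied by those two figures can be recovered by rearranging the standard definition F1 = 2PR / (P + R). The sketch below assumes the paper uses that standard definition and that the rounded percentages are exact; the function names are illustrative and the resulting value is a derivation, not a figure reported by the authors.

```python
def precision_from_f1(recall: float, f1: float) -> float:
    """Solve F1 = 2*P*R / (P + R) for precision P, given recall R and F1."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision exists for this recall/F1 pair")
    return f1 * recall / denom


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, expressed as fractions.
RECALL = 0.9945
F1 = 0.9909

# Assumption: standard F1 definition and exact (unrounded) inputs.
implied_precision = precision_from_f1(RECALL, F1)
print(f"implied precision: {implied_precision:.4f}")

# Sanity check: plugging the result back in reproduces the reported F1.
print(f"round-trip F1:     {f1_score(implied_precision, RECALL):.4f}")
```

Under these assumptions the implied precision comes out slightly below the recall, i.e. the model would miss very few WebShells at the cost of a few more false alarms. Because the inputs are rounded to two decimals, the last digit of the derived value should not be over-interpreted.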

Key words

BERT/LSTM/WebShell/PyTorch

Publication year

2024
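网络安全与数据治理 (Cyber Security and Data Governance)
华北计算机系统工程研究所 (National Computer System Engineering Research Institute of China; the Sixth Research Institute of China Electronics Corporation)

Impact factor: 0.348
ISSN: 2097-1788
Number of references: 11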