Network Security Named Entity Recognition Method Based on Deep Learning

To address the lack of public datasets and effective named entity recognition (NER) methods for Chinese network security, a network security NER method that fuses multi-source information about Chinese characters is proposed. Radical and character-frequency vector tables are built for all characters in the dataset to strengthen the feature representation of the Chinese character vectors. These vectors are embedded in an improved vocabulary fusion model, which fuses character vectors with word vectors, and the result is passed to a conditional random field (CRF) for decoding. Experimental results show that, while maintaining fast decoding and low computing-resource usage, the method achieves precision, recall, and F1 of 0.8649, 0.8402, and 0.8523 on the network security dataset, all better than existing models, and can support the subsequent construction of network security knowledge graphs.
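The abstract does not say how the radical and character-frequency tables are built, so the following is only a minimal sketch of one plausible construction. The `RADICAL_OF` lookup, the log-scale frequency buckets, and the function names `build_tables` and `char_features` are all assumptions made for illustration, not the authors' implementation. In a full model, the two integer ids would presumably index embedding tables whose outputs are combined with the character vector.

```python
from collections import Counter
import math

# Illustrative radical lookup (assumption): a real system would load a
# complete radical dictionary instead of this five-entry sample.
RADICAL_OF = {"漏": "氵", "洞": "氵", "络": "纟", "攻": "攵", "病": "疒"}


def build_tables(sentences):
    """Return (radical -> id, char -> frequency bucket) tables for a corpus."""
    counts = Counter(ch for s in sentences for ch in s)
    radical_ids = {"<UNK>": 0}
    for ch in counts:
        rad = RADICAL_OF.get(ch, "<UNK>")
        radical_ids.setdefault(rad, len(radical_ids))
    # Log-scale buckets (assumption) keep the number of frequency classes small.
    freq_bucket = {ch: int(math.log2(n)) for ch, n in counts.items()}
    return radical_ids, freq_bucket


def char_features(ch, radical_ids, freq_bucket):
    """Map one character to (radical id, frequency bucket).

    Unknown radicals map to id 0; characters absent from the corpus get bucket 0.
    """
    rad = RADICAL_OF.get(ch, "<UNK>")
    return radical_ids.get(rad, 0), freq_bucket.get(ch, 0)


corpus = ["漏洞攻击", "网络攻击", "病毒"]
radical_ids, freq_bucket = build_tables(corpus)
features = [char_features(ch, radical_ids, freq_bucket) for ch in "漏洞攻击"]
```

Here 漏 and 洞 share a radical and therefore a radical id, while 击 falls back to the unknown id because it is missing from the sample lookup.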

Keywords: network security; Chinese NER; pre-training model; word vector fusion; CRF

Li Daling (李大岭), Zhang Haojun (张浩军), Wang Jiahui (王家慧), Li Shilong (李世龙)

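College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, Henan, China

Henan International Joint Laboratory of Grain Information Processing, Zhengzhou 450001, Henan, China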

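Funding: National Natural Science Foundation of China, General Program (62073123); Henan Province Major Public Welfare Special Project (201300311200); Henan Province Science and Technology Research Project (212102210086)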

2024

Radio Engineering (无线电工程)
The 54th Research Institute of China Electronics Technology Group Corporation

Impact factor: 0.667
ISSN: 1003-3106
Year, volume (issue): 2024, 54(3)