Network Security Named Entity Recognition Method Based on Deep Learning

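Abstract (translated from Chinese): To address the lack of public datasets and effective Named Entity Recognition (NER) methods in the Chinese network security domain, a network security NER method that fuses multi-source information about Chinese characters is proposed. Radical and character-frequency vector tables are built for all characters in the dataset, which strengthens the feature representation of the Chinese character vectors. These vectors are embedded in an improved lexicon fusion model that fuses character vectors with word vectors, and the result is fed to a Conditional Random Field (CRF) for decoding. Experimental results show that, while keeping decoding fast and computing resource usage low, the method achieves precision, recall and F1 of 0.8649, 0.8402 and 0.8523, respectively, on the network security dataset. All three are better than those of existing models, and the method can support the subsequent construction of network security knowledge graphs.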

English title: Network Security Named Entity Recognition Method Based on Deep Learning

English abstract: To address the lack of public datasets and effective Named Entity Recognition (NER) methods in the field of Chinese network security, a network security NER method based on multi-source information of Chinese characters is proposed. Radical and character-frequency vector tables are constructed for all characters in the dataset to enhance the feature expression ability of the Chinese character vectors. These vectors are embedded in an improved vocabulary fusion model that fuses character vectors and word vectors, and the result is finally input to Conditional Random Fields (CRF) for decoding. Experimental results show that the method achieves precision, recall and F1 values of 0.8649, 0.8402 and 0.8523, respectively, on the network security dataset, outperforming existing models while maintaining a fast decoding speed and low computing resource usage. The method can therefore support the subsequent construction of network security knowledge graphs.
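The CRF decoding step mentioned in the abstract is commonly implemented with the Viterbi algorithm. The following is a minimal sketch of that general technique, not the authors' implementation: the tag set (`O`, `B-SW`, `I-SW`) and all emission and transition scores are made-up values chosen only for illustration, and in the described method the emission scores would come from the fused character and word representations.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence of a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])   # best score of any path ending in each tag
    backpointers = []            # best previous tag for each position and tag
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set and scores, for illustration only.
TAGS = ["O", "B-SW", "I-SW"]
emissions = [[0.1, 2.0, 0.0],
             [0.5, 0.0, 1.5],
             [2.0, 0.0, 0.3]]
transitions = [[0.5, 0.0, -5.0],   # O -> I-SW is heavily penalized
               [0.0, -1.0, 1.0],
               [0.0, -1.0, 0.5]]
print([TAGS[i] for i in viterbi_decode(emissions, transitions)])
# prints ['B-SW', 'I-SW', 'O']
```

The transition matrix is what lets a CRF rule out invalid label sequences, such as an inside tag that directly follows `O`, which per-character classification alone cannot enforce.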

English keywords:

network security; Chinese NER; pre-training model; word vector fusion; CRF

Authors:

李大岭、张浩军、王家慧、李世龙

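Author affiliations:

School of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan 450001, China

Henan International Joint Laboratory of Grain Information Processing, Zhengzhou, Henan 450001, China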

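Keywords (translated from Chinese):

network security; Chinese named entity recognition; pre-trained model; word vector fusion; conditional random field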

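Funding:

National Natural Science Foundation of China (General Program); Henan Province Major Public Welfare Special Project; Henan Province Science and Technology Research Project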

Project numbers:

62073123; 201300311200; 212102210086

Publication year:

2024

DOI:

10.3969/j.issn.1003-3106.2024.03.016

Radio Engineering (无线电工程)

The 54th Research Institute of China Electronics Technology Group Corporation

Impact factor: 0.667

ISSN: 1003-3106

Year, volume (issue): 2024, 54(3)

Number of references: 19