Journal on Communications (通信学报), 2024, Vol. 45, Issue 10: 1-16. DOI: 10.11959/j.issn.1000-436x.2024183

Research on named entity recognition method in cybersecurity based on soft prompt tuning and reinforcement learning

Tian Zeshu (田泽庶), Liu Chunyu (刘春雨), Zhang Yunting (张云婷), Zhang Jiayu (张嘉宇), Meng Chao (孟超), Zhang Hongli (张宏莉)

Author information

  • 1. Faculty of Computing, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China

Abstract

With the rapid development of network technology, new cybersecurity threats keep emerging, and cybersecurity named entity recognition (NER) is becoming increasingly important. To address the poor recognition accuracy of existing large language model-based NER methods in the cybersecurity domain, a cybersecurity NER method that combines soft prompt tuning and reinforcement learning is proposed. Soft prompt tuning is used to fine-tune the recognition capability of the large language model for the complexity of the cybersecurity domain, improving recognition accuracy for cybersecurity named entities while also improving training efficiency. In addition, a reinforcement learning-based instance filter is proposed that can effectively remove low-quality annotations from the training set, further improving recognition accuracy. The proposed method is evaluated on two open-source benchmark cybersecurity NER datasets, and the experimental results show that its F1 score exceeds that of the best existing cybersecurity NER methods.

Key words

cybersecurity named entity recognition/soft prompt tuning/reinforcement learning/large-scale pre-trained models

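Funding

National Key Research and Development Program of China (2016QY03D0501)

National Key Research and Development Program of China (2017YFB0803304)

Natural Science Foundation of Heilongjiang Province (LH2023F018)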

Publication year

2024
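Journal on Communications (通信学报)
China Institute of Communications

Indexing: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 1.265
ISSN: 1000-436X
Number of references: 31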