An earthquake disaster mitigation entity recognition method that integrates attention and word boundary

To address insufficient feature information and low recognition efficiency in named entity recognition for earthquake prevention and disaster reduction, this study proposes an entity recognition model for this domain that integrates self-attention with MarkBERT. The model introduces MarkBERT at the pre-training stage and proceeds in four steps: (1) MarkBERT produces a sequence that carries word boundary information; (2) a BiLSTM captures character position information; (3) a self-attention mechanism further captures relationships within the sequence and assigns feature weights; (4) a conditional random field (CRF) outputs the optimal sequence labeling result. Experiments on the "BIO annotation data of earthquake prevention and control related questions" gave an F1 value of 96.18%, and a comparison with three groups of similar models verified the advantage of the algorithm. The experimental results show that the model identifies earthquake prevention and disaster reduction entities in text efficiently and accurately.
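Two ideas in the abstract can be made concrete with a small sketch: marking word boundaries inside a character sequence, and scoring BIO-tagged output with an entity-level F1. The code below is only an illustration under stated assumptions, not the authors' implementation. The marker token `[unused1]`, the placement of markers between words only, the entity labels `LOC` and `EVT`, and all function names are hypothetical; the abstract specifies none of them.

```python
from typing import List, Set, Tuple

MARKER = "[unused1]"  # assumed marker token; the abstract does not name one


def insert_boundary_markers(words: List[str], marker: str = MARKER) -> List[str]:
    """Split segmented words into characters, with a marker between words."""
    tokens: List[str] = []
    for i, word in enumerate(words):
        if i > 0:
            tokens.append(marker)  # boundary between the previous word and this one
        tokens.extend(word)        # one token per character
    return tokens


def bio_spans(tags: List[str]) -> Set[Tuple[int, int, str]]:
    """Collect (start, end_exclusive, label) entity spans from BIO tags."""
    spans: Set[Tuple[int, int, str]] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: a predicted span counts only on an exact match."""
    gold_spans, pred_spans = bio_spans(gold), bio_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_spans)
    recall = true_positives / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(insert_boundary_markers(["地震", "预警"]))
    gold = ["B-LOC", "I-LOC", "O", "B-EVT", "I-EVT"]
    pred = ["B-LOC", "I-LOC", "O", "B-EVT", "O"]
    print(entity_f1(gold, pred))  # one of two entities matches exactly: 0.5
```

Exact-match span scoring is stricter than per-token accuracy: in the example the second predicted entity is one character short, so it counts as both a false positive and a false negative.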

named entity recognition; natural language processing; earthquake prevention and disaster reduction; MarkBERT; self-attention mechanism; BiLSTM; CRF

Xu Jing (徐婧), Liu Jiping (刘纪平), Wang Liang (王亮), Wang Yan (王岩)

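Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou 730070

Chinese Academy of Surveying and Mapping, Beijing 100036

National-Local Joint Engineering Research Center of Technologies and Applications for National Geographic State Monitoring, Lanzhou 730070

Gansu Provincial Engineering Laboratory for National Geographic State Monitoring, Lanzhou 730070

School of Geomatics, Liaoning Technical University, Fuxin, Liaoning 123000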

National Key Research and Development Program of China

2022YFC3003604

2024

Science of Surveying and Mapping (测绘科学)
Chinese Academy of Surveying and Mapping

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.774
ISSN: 1009-2307
Year, volume (issue): 2024, 49(1)