Named Entity Recognition of Aviation Unsafe Events Embedded with Fusion Domain Dictionary

This work addresses named entity recognition (NER) for aviation unsafe events. Using the aviation safety information weekly report as the data source, an NER dataset and a domain dictionary for aviation unsafe events were analyzed and constructed. Traditional NER models perform poorly at capturing the boundaries of domain entities. To address this, a method was proposed that builds on the BERT (bidirectional encoder representations from transformers) pre-trained language model and enhances domain semantic information by fusing domain dictionary embeddings. Multiple comparative experiments on the self-built dataset show that the proposed method further improves the recognition rate of entity boundaries, and that performance improves by about 5% over the traditional bi-directional long short-term memory-conditional random field (BiLSTM-CRF) NER model.
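The abstract does not say how the domain dictionary is matched against the text or how its embedding is combined with the BERT output. The following is only a minimal sketch of one common way to do this, under stated assumptions: forward maximum matching against the lexicon, a B/M/E/S/O boundary tag per character, and concatenation of a one-hot tag vector onto each token vector. The function names, the sample lexicon entries, and the sample sentence are hypothetical, and the plain lists of floats stand in for real BERT hidden states.

```python
from typing import List, Set

# Boundary tags derived from the dictionary: Begin, Middle, End, Single, Outside.
TAG_TO_ID = {"O": 0, "B": 1, "M": 2, "E": 3, "S": 4}


def dictionary_boundary_tags(text: str, lexicon: Set[str], max_len: int = 8) -> List[str]:
    """Tag each character via forward maximum matching against the lexicon.

    Assumption: this matching scheme is illustrative; the paper's abstract
    does not specify which one is used.
    """
    tags = ["O"] * len(text)
    i = 0
    while i < len(text):
        matched = 0
        # Try the longest candidate first so longer domain terms win.
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in lexicon:
                matched = length
                break
        if matched == 0:
            i += 1
            continue
        if matched == 1:
            tags[i] = "S"
        else:
            tags[i] = "B"
            for j in range(i + 1, i + matched - 1):
                tags[j] = "M"
            tags[i + matched - 1] = "E"
        i += matched
    return tags


def fuse_with_token_vectors(token_vectors: List[List[float]], tags: List[str]) -> List[List[float]]:
    """Concatenate a one-hot dictionary-tag vector onto each token vector.

    Assumption: a trained model would more likely use a learned tag embedding;
    one-hot concatenation keeps the sketch dependency-free.
    """
    fused = []
    for vector, tag in zip(token_vectors, tags):
        one_hot = [0.0] * len(TAG_TO_ID)
        one_hot[TAG_TO_ID[tag]] = 1.0
        fused.append(list(vector) + one_hot)
    return fused


if __name__ == "__main__":
    lexicon = {"发动机", "鸟击"}          # hypothetical domain dictionary entries
    sentence = "起飞后发动机遭鸟击"        # hypothetical sentence
    tags = dictionary_boundary_tags(sentence, lexicon)
    placeholder_vectors = [[0.0, 0.0] for _ in sentence]  # stand-in for BERT outputs
    fused = fuse_with_token_vectors(placeholder_vectors, tags)
    print(tags)
    print(len(fused), len(fused[0]))
```

In a full model the fused vectors would feed a sequence labeling layer such as a CRF, so that the dictionary's boundary hints can influence where the predicted entity spans begin and end.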

Keywords: aviation unsafe events; domain dictionary; named entity recognition; pre-trained language model

Xu Yaxi, Meng Tianyu, Wang Xin, Liu Bingnan

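School of Economics and Management, Civil Aviation Flight University of China, Guanghan 618307

Sichuan Tengden Technology Co., Ltd., Chengdu 610037

School of Computer Science, Civil Aviation Flight University of China, Guanghan 618307

Air China Limited, Chongqing 401120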

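Funding: Key Program of the Joint Fund of the National Natural Science Foundation of China and the Civil Aviation Administration of China (U2033213); Fundamental Research Funds for the Central Universities (J2022-048); Independent Research Project of the Key Laboratory of Civil Aviation Flight Technology and Flight Safety (FZ2022ZZ01)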

2024

Science Technology and Engineering
Chinese Society of Technology Economics

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.338
ISSN: 1671-1815
Year, volume (issue): 2024, 24(8)