Element Entity Recognition in Airport-Class Chinese NOTAMs Based on the BERT-Bi-LSTM-CRF Model

Notices to airmen (NOTAMs) are important information material in the field of civil aviation information. To address the many specialized terms, inconsistent formats, and complex semantics of Chinese NOTAMs, an entity recognition model based on BERT-Bi-LSTM-CRF is proposed to extract event element entities from Item E of NOTAMs. First, the bidirectional encoder representations from transformers (BERT) model produces pre-trained vector representations of the processed text, capturing rich semantic features. These are then passed to a bidirectional long short-term memory (Bi-LSTM) network, which extracts contextual features. Finally, a conditional random field (CRF) model predicts and outputs the best sequence of entity labels. A raw corpus of airport-class NOTAMs was collected and organized; after text annotation and data preprocessing, it yielded training, validation, and evaluation sets usable for entity recognition experiments. In comparison experiments against other entity recognition models on this data, the BERT-Bi-LSTM-CRF model achieved a precision of 89.68%, a recall of 81.77%, and an F1 score of 85.54%. Its F1 score improves on that of the existing models, which supports the model's effectiveness for element entity recognition in airport-class NOTAMs.
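The three reported figures are mutually consistent if the 89.68% value is read as precision and F1 is taken as the harmonic mean of precision and recall. The short check below is only a sketch under that assumption; the function name `f1_score` is illustrative and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was predicted or matched.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, converted from percentages.
# Assumption: the 89.68% figure is precision, not overall accuracy.
precision, recall = 0.8968, 0.8177
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # F1 = 85.54%
```

The recomputed value matches the reported 85.54%, which is why the first metric is best read as precision.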

Hao Kuangong (郝宽公), Dong Bing (董兵), Wu Yue (吴悦), Peng Zichen (彭自琛), Luo Chuang (罗创)

College of Air Traffic Management, Civil Aviation Flight University of China, Guanghan 618307

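airport-class NOTAM; element entity recognition; bidirectional encoder representations from transformers (BERT); bidirectional long short-term memory (Bi-LSTM) network; text information extraction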

Key Research Project of Civil Aviation Flight University of China (ZJ2021-09); Fundamental Research Funds for the Central Universities (J2023-050)

2024

Science Technology and Engineering
Chinese Society of Technology Economics

CSTPCD; Peking University Core Journals
Impact factor: 0.338
ISSN:1671-1815
Year, volume (issue): 2024, 24(10)