Text Data Mining of Air Traffic Control Hazard Sources Based on the BERT Model

In civil aviation safety management, hazard sources and safety hazards are easily confused conceptually and recorded inconsistently, and the regulations on the dual prevention mechanism require the two to be distinguished. The control list of air traffic control (ATC) hazard sources collected from the ASIS system is taken as the research object, and text data mining is carried out on it. A text classification model is built according to the characteristics of hazard sources and safety hazards: first, the ATC hazard source control list is preprocessed by text cleaning, stop-word removal, and Jieba word segmentation; next, word vectors are generated with the BERT model, using the BERT-Base-Chinese pre-trained model with its hyperparameters fine-tuned; finally, a Softmax classifier produces the classification results.
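
The abstract gives no implementation details, so the following is only a minimal sketch of the two ends of the pipeline it describes: preprocessing (cleaning and stop-word removal) and the final Softmax decision over two classes. Everything named here is an assumption for illustration: the stop-word list, the function names, the label order, and the character-level tokenizer, which stands in for Jieba so that the example runs with the standard library alone. The BERT encoder itself is omitted; `predict_label` simply assumes that some fine-tuned model has already produced two logits.

```python
import math
import re
from typing import Callable, List, Sequence, Set

# Hypothetical stop-word list; the source does not say which list was used.
STOP_WORDS: Set[str] = {"的", "了", "和", "在"}

# Hypothetical label order for the two classes named in the abstract.
LABELS = ("hazard source", "safety hazard")


def clean_text(text: str) -> str:
    """Text cleaning: keep only CJK characters, ASCII letters, and digits."""
    return re.sub(r"[^\u4e00-\u9fffA-Za-z0-9]+", "", text)


def char_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer: one token per character.

    With the third-party jieba package installed, jieba.lcut could be
    passed to preprocess() instead to get word-level segmentation.
    """
    return list(text)


def preprocess(
    text: str,
    tokenize: Callable[[str], List[str]] = char_tokenize,
    stop_words: Set[str] = STOP_WORDS,
) -> List[str]:
    """Clean the text, split it into tokens, and drop stop words."""
    return [tok for tok in tokenize(clean_text(text)) if tok not in stop_words]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits: Sequence[float]) -> str:
    """Map classifier logits to the label with the highest probability."""
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))]
```

For example, `preprocess("跑道的灯")` drops the stop word 的 and returns the remaining characters, and `predict_label([0.1, 1.5])` picks the second label because its logit is larger. In a fuller version, the token list would be fed to a BERT tokenizer and encoder rather than used directly.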

Keywords: text categorization; data mining; BERT model; hazard sources; safety hazards

Yang Changqi (杨昌其), Jiang Meicen (姜美岑), Lin Ling (林灵)

Civil Aviation Flight University of China, Guanghan, Sichuan 618000, China

Funding: contract research project of the Air Traffic Management Bureau, Civil Aviation Administration of China (H2023-100)

2024

Aeronautical Computing Technique (航空计算技术)
AVIC Xi'an Aeronautics Computing Technique Research Institute

CSTPCD
Impact factor: 0.316
ISSN: 1671-654X
Year, volume (issue): 2024, 54(4)