Textual Data Mining of Air Traffic Control Hazard Sources Based on BERT Modeling
Hazard sources and safety hazards are easily confused in civil aviation safety management, both conceptually and in record keeping, so the two must be distinguished in accordance with the management regulations of the dual prevention mechanism. This paper takes the ATC hazard source control list collected from the ASIS system as its research object and carries out text data mining on it. A text classification model is constructed according to the characteristics of hazard sources and safety hazards: first, the ATC hazard source control list is preprocessed by text cleaning, de-duplication, and Jieba word segmentation; next, word vectors are generated with the BERT model, using the pre-trained BERT-Base-Chinese model with its hyperparameters fine-tuned; finally, a Softmax classifier produces the classification results.
text categorization; data mining; BERT model; hazard sources; safety hazards
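The preprocessing and classification stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`clean`, `preprocess`, `classify`), the cleaning rule, and the sample records are assumptions made for illustration. The BERT encoder is deliberately left out, and the logits passed to `classify` are placeholders for what a fine-tuned BERT-Base-Chinese model with a linear output layer might produce. If the third-party `jieba` package is not installed, the sketch falls back to character-level splitting.

```python
import math
import re

try:
    import jieba  # third-party Chinese word segmenter named in the abstract
except ImportError:  # fall back so the sketch still runs without it
    jieba = None

# Hypothetical label set for the two categories discussed in the paper.
LABELS = ["hazard source", "safety hazard"]


def clean(text):
    """Drop everything except CJK characters, ASCII letters and digits (assumed rule)."""
    return re.sub(r"[^\u4e00-\u9fffA-Za-z0-9]", "", text)


def preprocess(records):
    """Clean, de-duplicate (keeping first occurrences) and segment raw records."""
    seen = set()
    result = []
    for raw in records:
        text = clean(raw)
        if not text or text in seen:
            continue  # skip empty and duplicate entries
        seen.add(text)
        tokens = jieba.lcut(text) if jieba else list(text)
        result.append((text, tokens))
    return result


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Map placeholder encoder logits to a label and its class probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


if __name__ == "__main__":
    samples = ["跑道 侵入!", "跑道侵入", "  ", "管制员疲劳"]  # made-up records
    for text, tokens in preprocess(samples):
        print(text, tokens)
    print(classify([2.0, 0.5]))  # placeholder logits, not model output
```

In a full pipeline the placeholder logits would be replaced by the output of the fine-tuned encoder; the softmax step would stay the same.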