Textual knowledge entity extraction of hidden dangers in coal mine accidents based on probabilistic fusion algorithm

Text data on hidden dangers in coal mine accidents is unstructured, and extracting the latent knowledge it contains is crucial for constructing a knowledge graph of these hidden dangers. Working from a dataset of coal mine hidden danger texts, this study analyzes the characteristics and latent information of hidden danger descriptions and, drawing on the propagation patterns of hidden dangers, designs knowledge entity annotation types suited to these texts. The texts were annotated with the Brat annotation tool to construct a dataset for the knowledge entity extraction model. We propose a BERT-IDCNN-CRF model based on dynamic weight fusion and introduce a probabilistic fusion algorithm based on Newton's law of cooling. The results indicate that, with the probabilistic fusion algorithm incorporated, the dynamically weighted BERT-IDCNN-CRF model achieved the best performance on knowledge entity extraction from hidden danger texts: its precision, recall, and F1-score improved by 8.93%, 5.28%, and 7.51%, respectively. This significantly enhanced the model's prediction accuracy and stability, and the model showed good adaptability.
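
The abstract names a probabilistic fusion algorithm based on Newton's law of cooling but does not give its formula. The sketch below is only one plausible reading, not the authors' method: it assumes each of several fold models (for example, from K-fold cross-validation) emits a label probability distribution for a token, and that a model's weight decays exponentially with the gap between its validation score and the best score, mirroring the exponential decay in Newton's law of cooling. The function names `cooling_weights` and `fuse_probabilities`, the decay constant `k`, and all numbers are hypothetical.

```python
import math


def cooling_weights(scores, k=1.0):
    """Exponential-decay weights, loosely modelled on Newton's law of cooling.

    ASSUMPTION: a model's weight is exp(-k * gap), where gap is how far its
    validation score falls below the best score. The paper's actual
    weighting rule is not stated in the abstract.
    """
    best = max(scores)
    raw = [math.exp(-k * (best - s)) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]  # normalise so the weights sum to 1


def fuse_probabilities(fold_probs, weights):
    """Weighted average of the per-model label distributions for one token."""
    n_labels = len(fold_probs[0])
    return [
        sum(w * probs[j] for w, probs in zip(weights, fold_probs))
        for j in range(n_labels)
    ]


# Hypothetical inputs: three fold models, three candidate labels for one token.
fold_f1 = [0.80, 0.75, 0.60]          # made-up validation F1 per fold model
fold_probs = [
    [0.7, 0.2, 0.1],                  # label distribution from fold model 1
    [0.6, 0.3, 0.1],                  # fold model 2
    [0.2, 0.5, 0.3],                  # fold model 3
]

weights = cooling_weights(fold_f1, k=5.0)
fused = fuse_probabilities(fold_probs, weights)
predicted = max(range(len(fused)), key=fused.__getitem__)  # index of the top label
```

In this reading, a larger `k` concentrates the weight on the best-scoring fold model, while `k = 0` reduces to a plain average of the fold outputs.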

hidden dangers in coal mine accidents; knowledge entity extraction; K-fold cross-validation; probabilistic fusion

Li Jing (李靖), Li Zequan (李泽荃), Shi Futai (石福泰), Hao Qiang (郝强)

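School of National Security and Emergency Management, Qinghai Normal University, Xining, Qinghai 810016

School of Energy and Mining Engineering, China University of Mining and Technology (Beijing), Beijing 100083

North China Institute of Science and Technology, Langfang, Hebei 065201

Huating Coal Industry Group Co., Ltd., Pingliang, Gansu 744100

Huaneng Coal Technology Research Co., Ltd., Beijing 100070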

2024

Journal of Mining Science and Technology
China University of Mining and Technology (Beijing)

Indexed in: CSTPCD; Peking University Core Journals
ISSN: 2096-2193
Year, volume (issue): 2024, 9(6)