Bulletin of Geological Science and Technology, 2024, Vol. 43, Issue (4): 224-234. DOI: 10.19509/j.cnki.dzkq.tb20230543

Recognition and application of geological entities related to ore-forming conditions in the Kaiyang phosphate mine based on the XLNET model

Peng Bin (彭彬) 1, Tian Yiping (田宜平) 2, Zeng Bin (曾斌) 1, Wu Xuechao (吴雪超) 3, Wu Wenming (吴文明) 4

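Author information

  • 1. School of Computer Science, China University of Geosciences (Wuhan), Wuhan 430078
  • 2. School of Computer Science, China University of Geosciences (Wuhan), Wuhan 430078; State Key Laboratory of Biogeology and Environmental Geology, China University of Geosciences (Wuhan), Wuhan 430078; Hubei Key Laboratory of Intelligent Geo-Information Processing, China University of Geosciences (Wuhan), Wuhan 430078; Engineering Technology Innovation Center of Mineral Resources Exploration in Bedrock Areas, Ministry of Natural Resources, Guiyang 550081; Guizhou Key Laboratory of Intelligent Exploration of Strategic Minerals, Guiyang 550081
  • 3. School of Computer Science, China University of Geosciences (Wuhan), Wuhan 430078; Guizhou Key Laboratory of Intelligent Exploration of Strategic Minerals, Guiyang 550081; Wuhan Dida Kundi Technology Co., Ltd. (武汉地大坤迪科技有限公司), Wuhan 430200
  • 4. Engineering Technology Innovation Center of Mineral Resources Exploration in Bedrock Areas, Ministry of Natural Resources, Guiyang 550081; Guizhou Key Laboratory of Intelligent Exploration of Strategic Minerals, Guiyang 550081; 105 Geological Team, Guizhou Bureau of Geology and Mineral Exploration and Development, Guiyang 550018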

Abstract

[Objective] As phosphate ore prospecting becomes more difficult, the number of geological exploration reports keeps growing. Manually identifying geological information related to phosphate mineralization in massive document collections is time-consuming and inefficient, and it cannot meet the needs of knowledge sharing and dissemination or of intelligent management of geological reports. [Methods] To quickly extract the ore-forming geological knowledge hidden in phosphate ore reports, this work establishes an automatic recognition method for ore-forming geological entities based on the XLNET model. First, entities were annotated with BIO labels to build a geological entity dictionary, and XLNET was used as the underlying pre-trained model to learn the bidirectional semantics of sentences. Then, a BILSTM-Attention-CRF model (bidirectional long short-term memory (BILSTM), self-attention layer (Attention), conditional random field (CRF)) was used to classify the text into multiple labels. Finally, the ore-forming conditions and ore-forming model of phosphate ore were roughly inferred by locating where phosphate ore entities are distributed in the reports. [Results] Compared with three other models, the precision (P), recall (R), and F1 value of this model are all close to 90%, which are 2%, 5%, and 6% higher, respectively, than those of the three comparison models. [Conclusion] This study provides geological researchers working on the Kaiyang phosphate mine with a more efficient method for automatic geological entity recognition.
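The BIO labelling and the P, R, and F1 metrics mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the example sentence, the label names `ORE` and `ROCK`, and the helper functions `bio_to_spans` and `precision_recall_f1` are hypothetical, and the sketch assumes that the metrics are computed at the entity level with exact span matching, which the abstract does not state.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence into a list of entity spans."""
    spans: List[Span] = []
    start, etype = None, None
    # A trailing "O" sentinel flushes an entity that ends the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.append((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def precision_recall_f1(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 with exact span matching."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(pred)
    true_positives = len(gold_set & pred_set)
    p = true_positives / len(pred_set) if pred_set else 0.0
    r = true_positives / len(gold_set) if gold_set else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1


# Hypothetical example sentence and labels, for illustration only.
tokens = ["Kaiyang", "phosphate", "mine", "contains", "phosphorite", "beds"]
gold_tags = ["B-ORE", "I-ORE", "I-ORE", "O", "B-ROCK", "O"]
pred_tags = ["B-ORE", "I-ORE", "I-ORE", "O", "O", "O"]

gold_spans = bio_to_spans(gold_tags)
pred_spans = bio_to_spans(pred_tags)
p, r, f1 = precision_recall_f1(gold_spans, pred_spans)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")  # P=1.00 R=0.50 F1=0.67
```

In this toy case the prediction finds one of the two gold entities and produces no false positives, so precision is 1.0 and recall is 0.5. The start and end indices of each decoded span also give the position of an entity within the text, which is the kind of location information the final step of the method relies on.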

Key words

geological entity recognition / XLNET-BILSTM-Attention-CRF (XLNET, bidirectional long short-term memory (BILSTM), self-attention layer (Attention), conditional random field (CRF)) / metallogenic model of phosphate ore / pre-trained model / sequence labeling


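Funding

Central Government Guided Local Science and Technology Development Fund Project (Qiankezhongyindi [2021] 4027)

Research and Demonstration Project on Metallogenic Regularities and Rapid, Efficient Intelligent Exploration Technology for Guizhou's Advantageous Phosphorus, Manganese, and Aluminum Resources (Qiankehe Strategic Prospecting [2022] ZD003)

Research and Demonstration Project on Metallogenic Regularities and Rapid, Efficient Intelligent Exploration Technology for Guizhou's Advantageous Phosphorus, Manganese, and Aluminum Resources ([2022] ZD004)

2022 Open Research Project of the Hubei Key Laboratory of Intelligent Geo-Information Processing (KLIGIP-2022-B05)

2021 Independent Research Project of the State Key Laboratory of Biogeology and Environmental Geology (128-GKZ21Y647)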

Publication year

2024
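Bulletin of Geological Science and Technology
China University of Geosciences (Wuhan)

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 1.018
ISSN: 2096-8523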