Journal of Dalian Ocean University, 2024, Vol. 39, Issue 1: 153-161. DOI: 10.16535/j.cnki.dlhyxb.2023-201

Complex relation extraction from health aquaculture standards based on an improved BiRTE model

Song Qishu¹, Yu Hong¹, Qiao Shihan¹, Luo Xuan¹, Li Guangyu¹, Shao Liming¹, Zhang Sijia¹
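Author information

  • 1. College of Information Engineering, Dalian Ocean University, Dalian 116023, Liaoning; Dalian Key Laboratory of Smart Fisheries, Dalian 116023, Liaoning; Key Laboratory of Facility Fisheries, Ministry of Education (Dalian Ocean University), Dalian 116023, Liaoning; Liaoning Provincial Key Laboratory of Marine Information Technology, Dalian 116023, Liaoning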

Abstract

Relation extraction from fishery health aquaculture standard texts suffers from low accuracy because the texts are highly domain-specific and semantically complex. To address this, a complex relation extraction method for health aquaculture standards based on an improved BiRTE model is proposed. The BiRTE model, which reduces error propagation through bidirectional extraction and exhibits strong relation extraction capability, was adopted as the base model. To model entities and their semantic associations, RoBERTa was used as the encoder, with whole-word masking and dynamic masking applied to enrich the word-vector feature representation of domain-specific terms in fishery standard texts. On this basis, a self-attention mechanism (Self-Attention, SelfATT) was integrated to combine and focus entity features and relation features, strengthening the connection between entity extraction and relation prediction and thereby improving extraction accuracy. The proposed model (RoBERTa-BiRTE-SelfATT) achieved a precision of 95.9%, a recall of 95.4%, and an F1 score of 95.7% on complex relation extraction from fishery standards, improvements of 4.2%, 3.1%, and 3.8%, respectively, over the original BiRTE model. These results indicate that RoBERTa-BiRTE-SelfATT can effectively address inaccurate recognition of domain-specific terms and the difficulty of extracting entity relations caused by semantic complexity, and that it is an effective method for complex relation extraction from fishery standards.
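The precision, recall, and F1 figures in the abstract can be made concrete with a minimal Python sketch. It assumes exact-match scoring over (subject, relation, object) triples, which is a common convention for relation extraction but is not stated in the abstract. The function name `triple_prf`, the helper `harmonic_f1`, and the example triples are all hypothetical and invented for illustration; they are not taken from the paper or its dataset. The example also shows an overlapping relation, where one entity takes part in more than one triple.

```python
from typing import Iterable, Set, Tuple

# A relation triple: (subject entity, relation label, object entity).
Triple = Tuple[str, str, str]


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def triple_prf(gold: Iterable[Triple], pred: Iterable[Triple]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over sets of triples.

    Assumption: a predicted triple counts as correct only if subject,
    relation, and object all match a gold triple exactly.
    """
    gold_set: Set[Triple] = set(gold)
    pred_set: Set[Triple] = set(pred)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    return precision, recall, harmonic_f1(precision, recall)


# Hypothetical triples. "species A" appears in three triples, so its
# relations overlap on a shared entity.
gold_triples = [
    ("species A", "suitable_temperature", "range X"),
    ("species A", "suitable_salinity", "range Y"),
    ("pond B", "stocks", "species A"),
]
pred_triples = [
    ("species A", "suitable_temperature", "range X"),
    ("species A", "suitable_salinity", "range Y"),
    ("pond B", "stocks", "species C"),  # wrong object, counted as an error
]

p, r, f1 = triple_prf(gold_triples, pred_triples)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

# Harmonic mean of the precision and recall reported in the abstract.
reported_f1 = harmonic_f1(0.959, 0.954)
print(f"harmonic mean of reported P and R: {reported_f1:.4f}")
```

Applied to the reported precision (95.9%) and recall (95.4%), the harmonic mean comes out at roughly 0.956, close to the reported F1 of 95.7%; the small gap is presumably due to rounding of the published values.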

Key words

fishery standard/relation extraction/overlapping relation/complex relation/Self-Attention

Funding

Key Research and Development Program of Liaoning Province (2023JH26/10200015)

Year of publication

2024
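Journal of Dalian Ocean University (published by Dalian Ocean University)

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.913
ISSN: 2095-1388
Number of references: 21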