
Dual-Channel Text Relation Extraction Based on Cross-Attention

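[Objective] Existing text relation extraction models capture only part of the available text features. To address this, the paper builds a dual-channel text relation extraction model based on cross-attention, aiming to make relation extraction more comprehensive and accurate and to achieve high-performance relation extraction on domain datasets. [Methods] The paper proposes DCCAM (Dual Channel Cross Attention Model), a dual-channel text relation extraction model based on cross-attention. It designs a dual-channel structure that fuses a sequence channel and a graph channel, and builds a cross-attention mechanism from self-attention and gated attention to promote deep fusion of text features and to mine latent associative information in the text more thoroughly. Experiments are run on public datasets and on two constructed policing-domain datasets. [Results] On the public datasets NYT and WebNLG, the F1 score of DCCAM improves by 3 and 4 percentage points, respectively, compared with the SAPCNN and GraphRel 2p models. Ablation experiments further confirm that each module contributes to extraction performance. On the policing-domain telecom fraud dataset and the dataset on aiding information network crime, DCCAM improves text relation extraction, raising F1 by 8.8 and 11.8 percentage points, respectively, compared with the GraphRel model. [Limitations] The study does not explore text relation extraction from the perspective of large language models. [Conclusions] DCCAM can significantly improve text relation extraction and offers a solution for text association analysis in police work.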
Dual Channel Text Relation Extraction Based on Cross Attention
[Objective] This paper constructs a dual-channel text relation extraction model based on cross-attention to address the problem that existing models capture only part of the text features. The new model aims to improve the comprehensiveness and accuracy of text relation extraction and to achieve high-performance relation extraction on domain-specific datasets. [Methods] We propose DCCAM (Dual Channel Cross Attention Model), a dual-channel text relation extraction model based on cross-attention. The model uses a dual-channel structure that integrates a sequence channel and a graph channel. We then construct a cross-attention mechanism from self-attention and gated attention to promote deep fusion of text features and to mine latent associative information in the text. Finally, we conduct experiments on public datasets and on two constructed policing datasets. [Results] On the public NYT and WebNLG datasets, the F1 score of DCCAM improved by 3 and 4 percentage points, respectively, compared with the SAPCNN and GraphRel 2p models. Ablation experiments confirmed that each module contributes to extraction performance. On the telecom fraud dataset and the dataset on aiding information network crime in the policing domain, DCCAM improved text relation extraction, with F1 scores 8.8 and 11.8 percentage points higher, respectively, than the GraphRel model. [Limitations] We did not explore text relation extraction from the perspective of large language models. [Conclusions] DCCAM can significantly improve text relation extraction and offers a solution for text association analysis in police work.
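The abstract names a sequence channel, a graph channel, and a cross-attention mechanism built from self-attention and gated attention, but it gives no formulas. The sketch below is therefore only a generic illustration of how two feature channels might attend to each other and be merged through a sigmoid gate. It is not the DCCAM architecture: every function name, shape, and weight here (`cross_attention`, `gated_fusion`, `w_gate`, `b_gate`, the random inputs) is a hypothetical assumption made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    """Scaled dot-product attention in which one channel queries the other."""
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)   # (n_tokens, n_tokens)
    return softmax(scores, axis=-1) @ context   # (n_tokens, dim)

def gated_fusion(seq_feats, graph_feats, w_gate, b_gate):
    """Blend the two cross-attended channels with a sigmoid gate (assumed design)."""
    seq_ctx = cross_attention(seq_feats, graph_feats)    # sequence attends to graph
    graph_ctx = cross_attention(graph_feats, seq_feats)  # graph attends to sequence
    gate_in = np.concatenate([seq_ctx, graph_ctx], axis=-1)    # (n_tokens, 2 * dim)
    gate = 1.0 / (1.0 + np.exp(-(gate_in @ w_gate + b_gate)))  # values in (0, 1)
    return gate * seq_ctx + (1.0 - gate) * graph_ctx

# Toy inputs standing in for the outputs of a sequence encoder and a graph encoder.
rng = np.random.default_rng(0)
n_tokens, dim = 5, 8
seq_feats = rng.normal(size=(n_tokens, dim))
graph_feats = rng.normal(size=(n_tokens, dim))
w_gate = rng.normal(size=(2 * dim, dim)) * 0.1  # hypothetical gate weights
b_gate = np.zeros(dim)

fused = gated_fusion(seq_feats, graph_feats, w_gate, b_gate)
print(fused.shape)  # (5, 8)
```

Because the gate lies strictly between 0 and 1, each fused feature is a convex combination of the two cross-attended channels, which is one simple way to let a model weight sequence and graph evidence per token.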

Text Relation Extraction; Dual Channel Mechanisms; Cross Attention

Ye Naifu (叶乃夫), Yuan Deyu (袁得嵛), Zhang Zhi (张郅), Hou Xiaolong (侯晓龙)


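School of Information Network Security, People's Public Security University of China, Beijing 100038

Key Laboratory of Security Prevention and Risk Assessment, Ministry of Public Security, Beijing 102623

School of Investigation, People's Public Security University of China, Beijing 100038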

Text relation extraction; dual-channel mechanism; cross-attention mechanism

2024

Data Analysis and Knowledge Discovery
National Science Library, Chinese Academy of Sciences


Indexed in: CSTPCD; CSSCI; CHSSCD; PKU Core (北大核心); EI
Impact factor: 1.452
ISSN: 2096-3467
Year, volume (issue): 2024, 8(11)