Chinese character relation extraction based on relation trigger and multi-feature
Leng Gen (冷根) 1, Zhou Yunsheng (周允升) 1, Yu Dunhui (余敦辉) 2, Sun Bin (孙斌) 1
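Author information
- 1. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, Hubei, China
- 2. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, Hubei, China; Hubei Provincial Engineering Technology Research Center for Education Informatization, Wuhan 430062, Hubei, China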
Abstract
Current mainstream methods for Chinese character relation extraction make insufficient use of core words and struggle to capture deep textual information in Chinese. To address these problems, a Chinese character relation extraction method based on relation trigger words and multiple features is proposed. Word semantics are fused with position, part-of-speech, dependency syntax, and semantic role features, and the original text is encoded by a Transformer encoder, which has a concise structure but stronger feature extraction ability. Character relation trigger words are extracted based on Tongyici Cilin (a Chinese synonym thesaurus) and word vectors, and are introduced into the attention mechanism as an attention guide, improving the model's ability to learn important textual information. Experimental results show that the method achieves an F1 value of 89.7%, an average improvement of 9.6 percentage points over models such as CNN, BiLSTM-ATT, and R-BERT, which verifies its effectiveness.
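The trigger-word idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a Transformer encoder with thesaurus-based trigger extraction, whereas the code below substitutes hand-made toy vectors, picks the trigger by cosine similarity to a seed relation word, and uses plain scaled dot-product attention with the trigger as the query. The function names `pick_trigger` and `trigger_guided_attention`, the example sentence, and all vector values are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_trigger(vectors, seed_vectors):
    """Return the index of the token most similar to any seed relation word.

    Hypothetical stand-in for thesaurus-plus-word-vector trigger extraction.
    """
    best_idx, best_sim = -1, -1.0
    for i, vec in enumerate(vectors):
        sim = max(cosine(vec, seed) for seed in seed_vectors)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    return best_idx


def trigger_guided_attention(vectors, trigger_vec):
    """Scaled dot-product attention over tokens, using the trigger as query."""
    d = len(trigger_vec)
    scores = [
        sum(a * b for a, b in zip(vec, trigger_vec)) / math.sqrt(d)
        for vec in vectors
    ]
    # Numerically stable softmax over the token scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of token vectors gives a sentence representation.
    pooled = [sum(w * vec[k] for w, vec in zip(weights, vectors)) for k in range(d)]
    return weights, pooled


# Toy sentence "张三 的 妻子 是 李四" with invented 3-dimensional vectors.
tokens = ["张三", "的", "妻子", "是", "李四"]
vectors = [
    [1.0, 0.0, 0.0],
    [0.1, 0.1, 0.1],
    [0.0, 1.0, 0.2],
    [0.1, 0.0, 0.1],
    [0.9, 0.1, 0.0],
]
seed_vectors = [[0.0, 0.9, 0.1]]  # invented vector for a seed word such as "配偶"

trigger_idx = pick_trigger(vectors, seed_vectors)
weights, pooled = trigger_guided_attention(vectors, vectors[trigger_idx])
print(tokens[trigger_idx], [round(w, 3) for w in weights])
```

In this toy setting the relation word receives the largest attention weight, which is the intended effect of using the trigger as an attention guide. A real system would replace the toy vectors with encoder outputs that already fuse the position, part-of-speech, dependency, and semantic role features.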
Key words
character relation extraction/transformer/relation trigger/attention mechanism/multi-feature/Chinese text/two-channel
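Funding
National Natural Science Foundation of China (61977021)
National Natural Science Foundation of China (62102136)
Hubei Province Technology Innovation Special Project (Major Project) (2020AEA008)
Year of publication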
2024