Integrating Contextual and Structural Information of Target Words for Frame Identification

Original Chinese title: 融合目标词上下文序列与结构信息的框架识别方法

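Abstract (translated from the Chinese): Frame identification is an important prerequisite for frame semantic role labeling; the task is to find a frame that can be evoked by the target word in a given sentence. Frame identification is usually treated as a classification problem over the target word, and sequence modeling is generally used to learn a context-aware representation of the target word. This approach ignores the structural information of the context in which the target word occurs, and it does not account for the syntactic and semantic structural differences among target words of different parts of speech. To address these shortcomings, this paper proposes a frame identification method that integrates the contextual sequence and structural information of the target word. The method uses BERT and GCN to model, respectively, the contextual information of target words of different parts of speech and the target word fused with PropBank semantic role or dependency syntactic structure information, and then obtains a target word representation that fuses sequence and structural information. In addition, the paper analyzes the structural differences in dependency information among target words of different parts of speech and adopts an ensemble learning method to overcome the limitations of a single model in this respect. Finally, experimental results on the FN1.7 and CFN datasets show that the proposed method outperforms the current best models.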

English abstract: Frame Identification (FI), which aims to find the proper frame to activate for a target word in a given sentence, is an important prerequisite for frame semantic role labeling. Generally, FI is regarded as a classification task in which sequence modeling is applied to learn the contextual representation of target words. To further capture the structural information of the target words themselves, this paper proposes a model that fuses the contextual and structural information of target words. Specifically, BERT and GCN are used to model, respectively, the contextual information of target words with different parts of speech and the structural information of target words derived from PropBank roles or dependency syntax. This paper also analyzes the structural differences in the dependency information of target words with different parts of speech and employs an ensemble learning approach to account for these differences. Experiments on the FN1.7 and CFN datasets show that our model outperforms the state-of-the-art models.
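The abstract names the building blocks (a contextual encoder, a GCN over a syntactic or semantic-role graph, and a fused target-word representation used for classification) but gives no formulas. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. It assumes a standard symmetric-normalized graph convolution, plain concatenation as the fusion step, and a dot-product classifier; all function names, the toy vectors standing in for BERT outputs, and the weights are hypothetical.

```python
import math


def normalized_adjacency(edges, n):
    """Build D^-1/2 (A + I) D^-1/2 for an undirected graph with n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]


def gcn_layer(adj_norm, h, w):
    """One graph convolution step: ReLU(adj_norm @ h @ w)."""
    out = matmul(matmul(adj_norm, h), w)
    return [[max(0.0, v) for v in row] for row in out]


def fuse_target(context_vecs, struct_vecs, target_idx):
    """Assumed fusion: concatenate the target word's two vectors."""
    return context_vecs[target_idx] + struct_vecs[target_idx]


def score_frames(fused, frame_weights):
    """Return the candidate frame with the highest dot-product score."""
    scores = {frame: sum(a * b for a, b in zip(fused, w))
              for frame, w in frame_weights.items()}
    return max(scores, key=scores.get)


# Toy sentence "She runs a shop" with target word "runs" (index 1).
# Edges are an assumed dependency parse: runs-She, runs-shop, shop-a.
edges = [(1, 0), (1, 3), (3, 2)]
# Placeholder 2-d vectors standing in for contextual encoder outputs.
context = [[0.2, 0.1], [0.9, 0.4], [0.1, 0.0], [0.7, 0.8]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical GCN weight matrix

structural = gcn_layer(normalized_adjacency(edges, 4), context, identity)
fused = fuse_target(context, structural, target_idx=1)

# Hypothetical classifier weights for two candidate frames of "run".
frame_weights = {
    "Operating_a_system": [0.5, 0.5, 0.5, 0.5],
    "Self_motion": [0.5, -0.5, 0.5, -0.5],
}
predicted = score_frames(fused, frame_weights)
```

In a real system the placeholder vectors would come from a pretrained encoder and the weights would be learned; the ensemble over part-of-speech-specific models mentioned in the abstract is not covered by this sketch.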

English keywords: frame identification; semantic roles; dependency syntax; BERT; GCN

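Authors:

Yan Zhichao (闫智超), Li Ru (李茹), Su Xuefeng (苏雪峰), Li Xinjie (李欣杰), Chai Qinghua (柴清华), Han Xiaoqi (韩孝奇), Zhao Yunxiao (赵云肖)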

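Author affiliations:

School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006

Key Laboratory of Computational Intelligence and Chinese Information Processing of the Ministry of Education, Shanxi University, Taiyuan, Shanxi 030006

School of Modern Logistics, Shanxi Vocational University of Engineering Science and Technology (山西工程科技职业大学), Jinzhong, Shanxi 030609

Global Tone Communication Technology Co., Ltd. (中译语通科技股份有限公司), Beijing 100043

School of Foreign Languages, Shanxi University, Taiyuan, Shanxi 030006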

Chinese keywords: 框架识别 (frame identification); 语义角色 (semantic roles); 依存句法 (dependency syntax); BERT; GCN

Funding: National Natural Science Foundation of China; Shanxi Province Basic Research Program

Project numbers: 61936012; 202203021211286

Publication year: 2024

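Journal: 中文信息学报 (Journal of Chinese Information Processing)

Sponsors: Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences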

Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)

Impact factor: 0.8

ISSN: 1003-0077

Year, volume (issue): 2024, 38(1)

Number of references: 32