An AMR Parsing and Alignment Method Based on Concept and Relation Prediction

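Abstract (translated from Chinese): Abstract Meaning Representation (AMR) is a deep, sentence-level semantic representation that abstracts the semantic information of a sentence into a directed acyclic graph of concept nodes and relations. Compared with shallower semantic representations such as semantic role labeling and semantic dependency parsing, AMR captures deep semantic information well and is therefore widely used in downstream tasks such as information extraction, question answering, and dialogue systems. AMR parsing converts natural language into an AMR graph. Although most concept nodes and relations in an AMR graph have fairly clear alignments with words in the sentence, the original English AMR corpus provides no explicit alignment information. To overcome the obstacles that this lack of alignment information poses to AMR parsing and to the use of AMR in downstream tasks, Li et al. [14] proposed and annotated a Chinese AMR corpus with concept and relation alignments. However, existing AMR parsing methods cannot make good use of alignment information, or generate it, during parsing. This paper therefore proposes, for the first time, an AMR parsing method that can both use and generate alignment information, consisting of two stages: concept prediction and relation prediction. The method is highly flexible and extensible. Experiments show that it achieves Align Smatch scores of 77.6 (+10.6) and 70.7 (+8.5) on the public CAMR 2.0 dataset and the CAMRP 2022 blind test set, respectively, surpassing previous methods based on sequence-to-sequence models. The paper also analyzes parsing performance and fine-grained metrics in detail and discusses directions for improvement. The code and model parameters are open-sourced at https://github.com/pkunlp-icler/Two-Stage-CAMRP for reproduction and reference.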

English abstract: Abstract Meaning Representation (AMR) is a semantic representation that captures sentence-level meaning through a directed acyclic graph of concept nodes and relations. It captures deeper semantics than shallower representations such as semantic role labeling and semantic dependency parsing, which makes it suitable for downstream tasks including information extraction, question answering, and dialogue systems. AMR parsing, the process of converting natural language into an AMR graph, is hindered by the lack of alignment information in the original English AMR corpus. In this paper, we present a novel AMR parsing method that leverages and generates alignment information, comprising two stages: concept prediction and relation prediction. Our approach outperforms previous sequence-to-sequence methods, achieving Align Smatch scores of 77.6 (+10.6) and 70.7 (+8.5) on the publicly available CAMR 2.0 dataset and the CAMRP 2022 blind test set, respectively. We provide a detailed analysis of both overall performance and fine-grained metrics of AMR parsing and discuss the potential for improvement. The code and model parameters are available at https://github.com/pkunlp-icler/Two-Stage-CAMRP.
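To make the objects in the abstract concrete, the following is a minimal sketch of an AMR graph whose concept nodes are aligned to token positions, together with a simplified triple-level F1. It is not the authors' implementation and not the official Align Smatch scorer: the names `AlignedAMR`, `triple_f1`, and the `"instance"` / `"anchor"` triple labels are hypothetical, the example sentence is English for readability, and the score assumes the predicted and gold graphs already share node ids, so the variable-mapping search that Smatch performs is skipped. In a two-stage parser of the kind described, the first stage would fill `concepts` (with alignments) and the second would fill `relations`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AlignedAMR:
    """Toy AMR graph whose concept nodes are aligned to token positions."""
    tokens: list[str]
    # node id -> (concept label, aligned token index, or None if unaligned)
    concepts: dict[str, tuple[str, int | None]] = field(default_factory=dict)
    # (head node id, relation label, tail node id)
    relations: set[tuple[str, str, str]] = field(default_factory=set)

    def triples(self) -> set[tuple[str, str, str]]:
        """Flatten the graph into concept, alignment, and relation triples."""
        out = set()
        for node, (label, tok) in self.concepts.items():
            out.add(("instance", node, label))
            if tok is not None:
                out.add(("anchor", node, str(tok)))
        for head, rel, tail in self.relations:
            out.add((rel, head, tail))
        return out

    def is_acyclic(self) -> bool:
        """Depth-first check that the relation edges contain no directed cycle."""
        children: dict[str, list[str]] = {}
        for head, _, tail in self.relations:
            children.setdefault(head, []).append(tail)
        state: dict[str, int] = {}  # 1 = on the DFS stack, 2 = finished

        def visit(node: str) -> bool:
            if state.get(node) == 1:
                return False  # back edge: a cycle
            if state.get(node) == 2:
                return True
            state[node] = 1
            ok = all(visit(child) for child in children.get(node, []))
            state[node] = 2
            return ok

        return all(visit(node) for node in self.concepts)


def triple_f1(pred: AlignedAMR, gold: AlignedAMR) -> float:
    """F1 over triples, assuming both graphs already share node ids."""
    p, g = pred.triples(), gold.triples()
    matched = len(p & g)
    if matched == 0:
        return 0.0
    precision, recall = matched / len(p), matched / len(g)
    return 2 * precision * recall / (precision + recall)


tokens = ["The", "boy", "wants", "to", "go"]
gold = AlignedAMR(
    tokens,
    {"w": ("want-01", 2), "b": ("boy", 1), "g": ("go-01", 4)},
    {("w", ":ARG0", "b"), ("w", ":ARG1", "g"), ("g", ":ARG0", "b")},
)
# A prediction that gets every concept and alignment right but misses one edge.
pred = AlignedAMR(
    tokens,
    dict(gold.concepts),
    {("w", ":ARG0", "b"), ("w", ":ARG1", "g")},
)
score = triple_f1(pred, gold)
print(gold.is_acyclic(), round(score, 3))  # prints: True 0.941
```

The node `b` has two incoming edges (a reentrancy), which is why AMR is a graph and not a tree, yet the structure remains acyclic. The prediction matches 8 of the 9 gold triples with perfect precision, giving F1 = 16/17.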

Keywords:

semantic parsing; abstract meaning representation; Chinese natural language processing

Authors:

Chen Liang (陈亮), Gao Bofei (高博飞), Chang Baobao (常宝宝), Zhang Yichi (张亦驰)

Affiliation:

National Key Laboratory for Multimedia Information Processing, Peking University, Beijing 100871, China

Funding:

National Natural Science Foundation of China

Grant number:

61936012

Publication year:

2024

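Journal of Chinese Information Processing (中文信息学报)

Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences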

Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)

Impact factor: 0.8

ISSN: 1003-0077

Year, volume (issue): 2024, 38(7)