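Two-stage Document Filtering and Asynchronous Multi-granularity Graph Multi-hop Question Answering (两阶段文档筛选和异步多粒度图多跳问答)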

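Abstract (translated from the Chinese): Multi-hop question answering aims to predict the answer to a question, together with the supporting facts for that answer, by reasoning over the content of multiple documents. In the document filtering task, however, current multi-hop question answering methods try to find every document related to the question without considering whether each of those documents actually helps in finding the answer. This paper therefore proposes a two-stage document filtering method. The first stage scores the documents and applies a small threshold so as to retrieve as many question-related documents as possible, which ensures high document recall. The second stage models the reasoning path to the answer and selects documents again from the first-stage results, which ensures high document precision. In addition, for the multi-granularity graph built from the documents, a novel asynchronous update mechanism is proposed for answer prediction and supporting-fact prediction. The mechanism splits the multi-granularity graph into a heterogeneous graph and a homogeneous graph and updates them asynchronously to better support multi-hop reasoning. The method outperforms current mainstream multi-hop question answering methods, which verifies its effectiveness.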

English title: Two-stage Document Filtering and Asynchronous Multi-granularity Graph Multi-hop Question Answering

English abstract: Multi-hop question answering aims to predict the answer to a question and the supporting facts for the answer by reasoning over the content of multiple documents. However, in the document filtering task, current multi-hop question answering methods aim to find all documents related to the question, without considering whether all of these documents are useful for finding the answer. Therefore, we propose a two-stage document filtering approach. In the first stage, the documents are scored and a small threshold is set to obtain as many relevant documents as possible, ensuring high document recall. In the second stage, the reasoning path to the answer is modeled, and documents are selected again from the first-stage results to ensure high precision. In addition, we propose a novel asynchronous update mechanism for answer prediction and supporting fact prediction on the multi-granularity graph composed of documents. The proposed mechanism divides the multi-granularity graph into heterogeneous and homogeneous graphs and updates them asynchronously for better multi-hop inference. The proposed method outperforms current mainstream multi-hop question answering methods, which verifies its effectiveness.
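The two-stage filtering idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the scorer or the path model, so a toy token-overlap score stands in for both, and the names `overlap`, `stage_one`, `stage_two`, the threshold value, and the sample documents are hypothetical. It is not the authors' implementation.

```python
from itertools import permutations


def tokens(text):
    """Lowercased whitespace tokens as a set (toy tokenizer, an assumption)."""
    return set(text.lower().split())


def overlap(source, target):
    """Fraction of source tokens found in target (toy stand-in for a learned scorer)."""
    src = tokens(source)
    return len(src & tokens(target)) / len(src) if src else 0.0


def stage_one(question, docs, threshold=0.1):
    """Stage 1: score every document and keep those above a deliberately low
    threshold, so that recall stays high."""
    return [name for name, text in docs.items()
            if overlap(question, text) >= threshold]


def stage_two(question, docs, candidates, hops=2):
    """Stage 2: score ordered document paths (question -> doc 1 -> doc 2 ...)
    and keep only the documents on the best path, trading recall for precision."""
    if len(candidates) <= hops:
        return list(candidates)
    best_path, best_score = (), float("-inf")
    for path in permutations(candidates, hops):
        # The first hop is scored against the question, each later hop
        # against the document that precedes it on the path.
        score = overlap(question, docs[path[0]])
        score += sum(overlap(docs[a], docs[b]) for a, b in zip(path, path[1:]))
        if score > best_score:
            best_path, best_score = path, score
    return list(best_path)


# Hypothetical sample data for the sketch.
QUESTION = "which city hosts the university where Ada Smith teaches"
DOCS = {
    "a": "Ada Smith teaches physics at Northfield University",
    "b": "Northfield University is located in the city of Larkham",
    "c": "the city council debates which parks to close",
    "d": "bananas are rich in potassium",
}

candidates = stage_one(QUESTION, DOCS)
selected = stage_two(QUESTION, DOCS, candidates)
print(candidates)
print(selected)
```

On this toy data, stage one keeps documents a, b, and c (c shares surface words with the question but does not help answer it), and stage two narrows the set to the path a, b. The exhaustive path enumeration is only practical for a handful of candidates; a real system would presumably replace it with a learned path scorer.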

English keywords:

multi-hop question answering; document filtering; multi-granularity graph; asynchronous update; answer prediction

Authors:

Zhang Xuesong (张雪松), Li Guanjun (李冠君), Nie Shijia (聂士佳), Zhang Dawei (张大伟), Lü Zhao (吕钊), Tao Jianhua (陶建华)

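Author affiliations:

School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190

Department of Automation, Tsinghua University, Beijing 100084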

Chinese keywords:

多跳问答; 文档筛选; 多粒度图; 异步更新; 答案预测

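Funding:

National Key Research and Development Program of China; Open Research Projects of Zhejiang Lab; Beijing Municipal Science and Technology Commission and Zhongguancun Administrative Committee program

Project numbers:

2020AAA0140003; 2021KH0AB06; Z211100004821013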

Publication year:

2024

DOI:

10.3969/j.issn.1673-629X.2024.01.018

Computer Technology and Development (计算机技术与发展)

Shaanxi Computer Society (陕西省计算机学会)

CSTPCD

Impact factor: 0.621

ISSN: 1673-629X

Year, volume (issue): 2024, 34(1)

Number of references: 6