
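Machine Reading Comprehension Based on Shared Structure Information between Naming and Telling (original title: 基于话头话体共享结构信息的机器阅读理解研究)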

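The machine reading comprehension (MRC) task tests a machine's ability to understand natural language by having it answer questions about a given context. Neural MRC models based on large-scale pretrained language models have made significant progress, but answer-extraction accuracy still has room to improve when the answer elements, clue elements, and question elements are associated across punctuation sentences or over long distances. This paper analyzes the naming-telling structure within a discourse to establish long-distance associations between punctuation sentences and to fill in the shared components that are missing, thereby assisting answer extraction in machine reading comprehension. It designs and implements an MRC model that integrates naming-telling structure information. Experiments on the public CMRC2018 dataset show that, relative to the baseline model, the model improves F1 by 2.4% and EM by 6%.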
Machine Reading Comprehension Based on Shared Structure Information between Naming and Telling
The machine reading comprehension (MRC) task challenges a machine's ability to understand natural language by asking it to answer questions about a given context. To improve the accuracy of answer extraction when answer elements, clue elements, and question elements are related across punctuation sentences and over long distances, this paper proposes to model the long-distance relationships between punctuation sentences and to complement the missing components through shared structure. A machine reading comprehension model is implemented by integrating Naming-Telling structure information. Experimental results on the public dataset CMRC2018 show that the proposed method achieves an increase of 2.4% in F1 and 6% in EM compared with the baseline model.
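The two reported metrics, EM (exact match) and F1, can be illustrated with a minimal sketch for extractive answers. This is a simplified character-level version in the style commonly used for span-extraction tasks; it is not the official CMRC2018 evaluation script and not the authors' code, and the function names `exact_match` and `char_f1` are hypothetical.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the predicted span equals the reference span, else 0.0."""
    return float(prediction.strip() == reference.strip())


def char_f1(prediction: str, reference: str) -> float:
    """Character-overlap F1 between a predicted and a reference answer span.

    Assumption: each character is treated as one token, which is a common
    simplification for Chinese text. Official scripts may tokenize differently.
    """
    pred_chars = list(prediction.strip())
    ref_chars = list(reference.strip())
    if not pred_chars or not ref_chars:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_chars == ref_chars)
    # Multiset intersection counts the characters shared by both spans.
    overlap = sum((Counter(pred_chars) & Counter(ref_chars)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(ref_chars)
    return 2 * precision * recall / (precision + recall)
```

Under this sketch, a prediction that contains the reference answer plus extra characters scores 0 on EM but receives partial credit on F1, which is why the two metrics can move by different amounts.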

machine reading comprehension; naming-telling structure; attention; pretraining language model

Han Yujiao (韩玉蛟), Luo Zhiyong (罗智勇), Zhang Mingming (张明明), Zhao Zhilin (赵志琳), Zhang Qing (张青)


School of Information Science, Beijing Language and Culture University, Beijing 100083, China

machine reading comprehension; naming-telling structure analysis; attention mechanism; pretrained language model

National Natural Science Foundation of China (62076037)

2024

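Journal of Chinese Information Processing (中文信息学报)
Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences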


Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)
Impact factor: 0.8
ISSN: 1003-0077
Year, volume (issue): 2024, 38(5)