
An Overview of Research on Multi-Document Summarization

[Objective] This paper reviews the literature on multi-document summarization and summarizes its research frameworks and mainstream models. [Coverage] We searched the AI Open Index, Papers with Code, and CNKI databases with the queries "Multi-Document Summarization" and "多文档摘要", and selected a total of 76 articles. [Methods] We summarized the mainstream frameworks for implementing multi-document summarization, categorized and outlined recent models and algorithms by their key techniques, and offered prospects for future research. [Results] We compared the strengths and weaknesses of the latest multi-document summarization models with those of traditional methods, and summarized high-quality multi-document summarization datasets and current evaluation metrics. [Limitations] In the experimental comparison, we discussed only the evaluation results of some widely used models on datasets such as Multi-News; a comparison of all models on the same dataset is lacking. [Conclusions] Many problems in multi-document summarization remain to be solved, such as the low factual accuracy of generated summaries and the poor generality of summarization models.

Multi-Document Summarization; Text Summarization; Content Selection; Transformer Model; Pre-Training Model

Bao Ritong (宝日彤), Sun Haichun (孙海春)


School of Information and Cyber Security, People's Public Security University of China, Beijing 100038

Key Laboratory of Security Prevention Technology and Risk Assessment, Ministry of Public Security, Beijing 100026


Technology Research Program of the Ministry of Public Security; Beijing Natural Science Foundation

2020JSYJC22

2024

Data Analysis and Knowledge Discovery
National Science Library, Chinese Academy of Sciences


CSTPCD; CSSCI; CHSSCD; Peking University Core Journals; EI
Impact factor: 1.452
ISSN: 2096-3467
Year, Volume (Issue): 2024, 8(2)