Research on Text Abstract Generation Based on T5 PEGASUS and DeepKE
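Zhang Qi (张琪) 1, Wang Ling (王玲) 1, Shen Jie (申杰) 2
Author information
- 1. Henan Vocational College of Water Conservancy and Environment, Zhengzhou, Henan 450008
- 2. North China University of Water Resources and Electric Power, Zhengzhou, Henan 450045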
Abstract
To reduce fabricated information and repetition in the summaries generated by the T5 PEGASUS model, a text summarization model based on T5 PEGASUS and DeepKE, named T5 PEGASUS-DK, is proposed. The model integrates the T5 PEGASUS model with the DeepKE framework. First, the Pkuseg segmentation method is used to improve word segmentation. Then, the DeepKE framework is used to extract triples from the text. Finally, the set of word vectors of the triples is concatenated with the representation vector of the text. By establishing a mapping between the text and its triples, the model can capture factual knowledge and therefore select information that is more consistent with the original content as the summary. The experimental results show that the T5 PEGASUS-DK model achieves the highest ROUGE values, and that the generated summaries are more faithful, coherent, and consistent with the original content.
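The fusion step described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not confirm: that concatenation happens along the sequence axis, that each triple contributes one vector per element (head, relation, tail), and that word vectors come from a simple lookup table. The names `embed`, `fuse_text_and_triples`, and the toy data are hypothetical, and neither Pkuseg nor DeepKE is called here; the triples are supplied by hand.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Triple = Tuple[str, str, str]  # (head, relation, tail)


def embed(token: str, table: Dict[str, Vector], dim: int) -> Vector:
    """Look up a toy word vector; unknown tokens map to the zero vector."""
    return table.get(token, [0.0] * dim)


def fuse_text_and_triples(
    text_repr: List[Vector],
    triples: List[Triple],
    table: Dict[str, Vector],
    dim: int,
) -> List[Vector]:
    """Append the word vectors of all triple elements to the text representation.

    Assumption: concatenation is along the sequence axis, so the result has
    len(text_repr) + 3 * len(triples) rows, each of width `dim`.
    """
    triple_vectors = [
        embed(element, table, dim) for triple in triples for element in triple
    ]
    return text_repr + triple_vectors


# Toy data (hypothetical): a 2-token text representation and one hand-written triple.
table = {"Alice": [1.0, 0.0], "works_at": [0.0, 1.0], "Acme": [1.0, 1.0]}
text_repr = [[0.5, 0.5], [0.2, 0.8]]
triples = [("Alice", "works_at", "Acme")]

fused = fuse_text_and_triples(text_repr, triples, table, dim=2)
print(len(fused), "vectors of width", len(fused[0]))
```

In this sketch the original text vectors stay in place and the triple vectors follow them, so a downstream encoder would see both the text and the extracted facts in one sequence. Whether the actual model concatenates along the sequence axis or the feature axis is not stated in the abstract.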
Key words
text summarization/T5 PEGASUS/DeepKE/triple/ROUGE
Publication year
2024