
Abstractive Text Summarization Method Incorporating Convolutional Shrinkage Gating

Driven by deep learning, Sequence-to-Sequence (Seq2Seq) models that combine an encoder-decoder architecture with an attention mechanism have become among the most widely used models in text summarization research, achieving notable results on abstractive summarization tasks. However, existing models built on Recurrent Neural Networks (RNN) suffer from limited parallelism and low time efficiency, fail to fully capture useful information, ignore the relationships between words and sentences, and tend to produce redundant, repetitive, or semantically irrelevant summaries. To address these problems, a text summarization method based on the Transformer and convolutional shrinkage gating is proposed. BERT is used as the encoder to extract text representations at different levels and obtain contextual encodings. A convolutional shrinkage gating unit then adjusts the encoding weights, strengthening global relevance and removing the interference of useless information, and the filtered result serves as the final encoder output. Three decoders are designed: a basic Transformer decoding module, a decoding module that shares the encoder, and a decoding module based on the Generative Pre-trained Transformer (GPT). These decoders strengthen the association between the encoder and decoder and are used to explore model structures capable of generating high-quality summaries. Experimental results on the LCSTS and CNNDM datasets show that, compared with mainstream baseline models, the proposed TCSG, ES-TCSG, and GPT-TCSG models improve the evaluation scores by no less than 1.0, verifying the effectiveness and feasibility of the method.
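
To make the encoding step concrete, the following is a minimal PyTorch sketch of how a convolutional shrinkage gating unit could filter BERT's contextual encodings before decoding. It is an illustrative assumption of the mechanism (a 1-D convolution over the token dimension, a learned soft-threshold "shrinkage" step, and a sigmoid gate); the module and parameter names are hypothetical, and the exact formulation is the one defined in the paper's method section.

```python
# Illustrative sketch only: not the authors' exact design.
import torch
import torch.nn as nn
from transformers import BertModel

class ConvShrinkageGate(nn.Module):
    """Hypothetical convolutional shrinkage gating unit over BERT encodings."""
    def __init__(self, hidden_size: int, kernel_size: int = 3):
        super().__init__()
        # 1-D convolution over the token dimension to capture local context
        self.conv = nn.Conv1d(hidden_size, hidden_size, kernel_size,
                              padding=kernel_size // 2)
        # Bottleneck that predicts a per-channel shrinkage coefficient in (0, 1)
        self.threshold_net = nn.Sequential(
            nn.Linear(hidden_size, hidden_size // 4), nn.ReLU(),
            nn.Linear(hidden_size // 4, hidden_size), nn.Sigmoid(),
        )
        # Sigmoid gate that mixes the filtered features with the raw encoding
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, hidden) contextual encodings from BERT
        c = self.conv(h.transpose(1, 2)).transpose(1, 2)      # local conv features
        avg = c.abs().mean(dim=1, keepdim=True)               # (batch, 1, hidden)
        tau = self.threshold_net(avg) * avg                   # per-channel thresholds
        # Soft-thresholding ("shrinkage") suppresses low-magnitude, noisy activations
        shrunk = torch.sign(c) * torch.relu(c.abs() - tau)
        g = torch.sigmoid(self.gate(torch.cat([h, shrunk], dim=-1)))
        return g * shrunk + (1.0 - g) * h                     # filtered encoder output

# Usage: encode the source text with BERT, filter it, then feed any of the decoders.
bert = BertModel.from_pretrained("bert-base-chinese")
gate = ConvShrinkageGate(bert.config.hidden_size)
ids = torch.randint(100, 5000, (2, 32))                       # dummy token ids
enc = bert(input_ids=ids).last_hidden_state                   # (2, 32, 768)
out = gate(enc)                                               # same shape, re-weighted
```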

abstractive text summarization; Sequence-to-Sequence (Seq2Seq) model; Transformer model; BERT encoder; convolutional shrinkage gating unit; decoder

GAN Chenmin, TANG Hong, YANG Haolan, LIU Xiaojie, LIU Jie


School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Chongqing Key Laboratory of Mobile Communication Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China


Program for Changjiang Scholars and Innovative Research Team in University (IRT_16R72)

2024

Computer Engineering (计算机工程)
East China Institute of Computing Technology; Shanghai Computer Society

CSTPCD; Peking University Core Journals
Impact factor: 0.581
ISSN: 1000-3428
Year, Volume (Issue): 2024, 50(2)