中国传媒大学学报(自然科学版) (Journal of Communication University of China, Natural Sciences), 2023, Vol. 30, Issue 3: 24-30. DOI: 10.3969/j.issn.1673-4793.2023.03.005

A Generative Chinese Summarization Model for News Text

韩珊珊 (Han Shanshan), 王升辉 (Wang Shenghui), 万丽莉 (Wan Lili)
Author information

1. School of Computer and Information Processing, Beijing Jiaotong University, Beijing 100091, China
Abstract

Chinese text summarization aims to relieve the information overload and redundancy caused by massive volumes of Chinese text, improving the efficiency of information dissemination and making it easier for readers to obtain information. Building on the sequence-to-sequence (Seq2Seq) deep learning architecture, this article proposes a Chinese summarization model named SimCLCTS (Simple Model for Contrastive Learning of Chinese Text Summarization). SimCLCTS adds an unsupervised evaluation module built around a contrastive loss function, mitigating the exposure bias that arises in Seq2Seq models from the mismatch between the training objective and the evaluation metric. Comparative experiments show that the model reduces exposure bias and performs well on Chinese news summarization.
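The abstract describes a contrastive objective used as an unsupervised evaluation signal to close the gap between training and evaluation. The paper's exact formulation is not reproduced here; as a rough illustration only (the function name, candidate-scoring setup, and margin value are all assumptions), a margin-based contrastive ranking loss over candidate summaries can be sketched as:

```python
def contrastive_ranking_loss(scores, margin=0.01):
    """Margin-based contrastive ranking loss over candidate summaries.

    `scores` holds the model's scores for candidate summaries, ordered
    from highest to lowest reference quality (e.g. by ROUGE against the
    gold summary). Each better candidate should outscore each worse one
    by a margin that grows with their rank distance; violations are
    accumulated as hinge losses.
    """
    loss = 0.0
    n = len(scores)
    for i in range(n):             # i indexes the better candidate
        for j in range(i + 1, n):  # j indexes the worse candidate
            loss += max(0.0, scores[j] - scores[i] + margin * (j - i))
    return loss

# A well-ordered scoring incurs no loss; an inverted one is penalized.
print(round(contrastive_ranking_loss([0.9, 0.5, 0.1]), 6))  # 0.0
print(round(contrastive_ranking_loss([0.1, 0.5, 0.9]), 6))  # 1.64
```

Because the loss compares candidates against each other rather than against token-level teacher forcing, minimizing it pushes the model's scoring to agree with the evaluation-time quality ranking, which is the general mechanism by which such contrastive modules reduce exposure bias.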

Keywords

abstractive summarization / Chinese text / sequence-to-sequence model / contrastive learning


Publication year: 2023
Journal: 中国传媒大学学报(自然科学版), Communication University of China
Indexed in: CHSSCD
Impact factor: 0.514
ISSN: 1673-4793
Number of references: 31