Long-Text Summarization: A Study of Segmentation Strategies for Summarizing STM32 Papers with the Pegasus Model

龙川¹, 张芹¹, 谢亮生¹, 潘琛¹, 文瑜¹, 杨俊锋¹

Author information

1. School of Measuring and Optical Engineering, Nanchang Hangkong University, Nanchang, Jiangxi 330000

Abstract

This study examines how different text segmentation methods affect summary quality when a pre-trained Pegasus model is used to summarize long texts. A total of 200 academic papers on STM32 microcontrollers were collected from CNKI as the experimental corpus, and four segmentation methods were compared: sliding window, sentence segmentation, paragraph segmentation, and sliding window combined with sentence segmentation. The generated summaries were evaluated with ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics, and the results were analyzed in detail. In terms of summary quality, paragraph segmentation performed well, with ROUGE-1, ROUGE-2, and ROUGE-L scores of 30.85, 7.60, and 20.15, respectively, slightly higher than those of sentence segmentation and significantly better than those of sentence segmentation combined with sliding window. The study aims to offer researchers and developers practical experience and insights into long-text summarization.
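The four segmentation methods compared in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the function names, the punctuation set used to detect sentence ends, the blank-line paragraph rule, the window size and stride, and the choice to slide the combined method over sentences rather than tokens. The `summarize` callable is a hypothetical stand-in for whatever wraps the Pegasus model.

```python
import re
from typing import Callable, List, Sequence


def split_sentences(text: str) -> List[str]:
    """Sentence segmentation (assumed rule): cut after CJK or ASCII
    end punctuation; an ASCII period must be followed by whitespace,
    so decimals such as 1.2 are left intact."""
    parts = re.split(r"(?<=[。！？!?])\s*|(?<=\.)\s+", text)
    return [p.strip() for p in parts if p.strip()]


def split_paragraphs(text: str) -> List[str]:
    """Paragraph segmentation (assumed rule): cut on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def sliding_window(units: Sequence, size: int, stride: int) -> List[list]:
    """Sliding window over any sequence of units (characters, tokens,
    or sentences). Consecutive windows overlap by size - stride units."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if not units:
        return []
    windows, start = [], 0
    while True:
        windows.append(list(units[start:start + size]))
        if start + size >= len(units):  # last window reached the end
            break
        start += stride
    return windows


def sentence_windows(text: str, size: int, stride: int, sep: str = "") -> List[str]:
    """Sliding window plus sentence segmentation (assumed reading):
    slide over whole sentences so no sentence is cut in half.
    Use sep="" for Chinese text and sep=" " for English text."""
    sentences = split_sentences(text)
    return [sep.join(w) for w in sliding_window(sentences, size, stride)]


def summarize_long(chunks: List[str], summarize: Callable[[str], str], sep: str = " ") -> str:
    """Summarize each chunk independently and concatenate the results.
    `summarize` is a hypothetical wrapper around the summarization model."""
    return sep.join(summarize(chunk) for chunk in chunks)
```

Each function returns a list of chunks that could be passed to `summarize_long`, for example `summarize_long(split_paragraphs(paper_text), summarize)`. Because the chunks are summarized independently, the segmentation choice determines how much context the model sees at once, which is the variable the study compares.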

Keywords

long-text summarization / segmentation strategy / Pegasus model / STM32 academic paper abstracts

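Funding

Jiangxi Province Innovative Leading Talent Long-Term Project (S2020LQCQ0889)

Natural Science Foundation of Jiangxi Province (20212BAB201022)

Ministry of Education Industry-University-Research Collaborative Education Project (202002032008)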

Publication year

2024

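Computer and Information Technology (电脑与信息技术)

Chinese Institute of Electronics; Hunan Provincial Electronics Research Institute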

Impact factor: 0.256

ISSN: 1005-1228
