Generating summaries of long Chinese texts remains a challenge in automatic summarization within natural language processing. Because Chinese has a rich grammatical structure, polysemous vocabulary, and word order that affects sentence meaning, automatic summarization is particularly difficult. To address this challenge, a hybrid summarization model is proposed that first vectorizes the text, then uses an extractive summarization model to select key information, and finally uses a generative summarization model to produce the summary. The model uses a vocabulary and tokenizer better suited to Chinese in order to improve the accuracy of summary sentences. Experimental results show that the extractive-generative hybrid model performs well on long Chinese texts, producing summaries that are more fluent and coherent and that are easier to read and understand.
Keywords: Chinese long text summarization; Hybrid model; BERT; DGCNN; T5-PEGASUS
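The three-stage flow described in the abstract (vectorize, extract, generate) can be illustrated with a minimal sketch. Everything in it is a stand-in chosen for illustration, not the authors' implementation: `embed` uses character counts where a real system would use a pretrained encoder such as BERT, `extract` ranks sentences by similarity to the document centroid where the paper uses a trained extractive model, and `generate` merely joins and truncates where a generative model such as T5-PEGASUS would rewrite the text. All function names and parameters are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Split after Chinese sentence-ending punctuation, keeping the punctuation.
    parts = re.split(r"(?<=[。！？])", text)
    return [p.strip() for p in parts if p.strip()]


def embed(sentence):
    # Stand-in vectorizer (assumption): a bag of character counts.
    return Counter(ch for ch in sentence if not ch.isspace())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract(sentences, k):
    # Stand-in extractive step (assumption): keep the k sentences closest
    # to the document centroid, restored to their original order.
    vectors = [embed(s) for s in sentences]
    centroid = Counter()
    for v in vectors:
        centroid.update(v)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine(vectors[i], centroid),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]


def generate(extracted, max_chars):
    # Stand-in generative step (assumption): join and truncate.
    # A real generative model would paraphrase and fuse the sentences.
    return "".join(extracted)[:max_chars]


def summarize(text, k=2, max_chars=60):
    # Full pipeline: split, extract key sentences, then generate.
    return generate(extract(split_sentences(text), k), max_chars)
```

The point of the two-stage design is that the extractive step shortens a long document to a length the generative step can handle, so the generator only rewrites content already judged to be important.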