Large Language Model-based Idempotent Summarization Method for Educational Text
Large Language Models (LLMs) are developing rapidly in the field of Natural Language Processing (NLP); however, significant challenges remain in their application to educational digitization. To address the scarcity of domain-specific data and the instability of summarization, which leads to information loss or redundancy, this study introduces a lightweight idempotent model framework, the Idempotent Generative Language Model (IGLM), for educational text summarization. The model first employs multisource training with adaptive augmentation to enhance data diversity. Various fine-tuning procedures are then applied to the downstream text summarization task. Concurrently, an idempotent summary generation strategy is designed to mitigate the impact of text length: it brings summaries closer to an idempotent form, constrains the model, reduces biases arising from uneven language corpora, and is combined with quantization techniques to generate more precise and fluent summaries under low-resource conditions. The experiments used Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores as the evaluation metric and validated the model on the publicly available Chinese text summarization datasets Large-scale Chinese Short Text Summarization (LCSTS), EDUCATION, and Natural Language Processing and Chinese Computing (NLPCC). The results show clear gains in precision and coherence under this framework. Compared with the baseline model, the ROUGE-1/2/L scores improved by 7.9, 7.4, and 8.7 percentage points on the LCSTS dataset, by 12.9, 15.4, and 15.7 percentage points on the EDUCATION dataset, and by 12.2, 11.7, and 12.7 percentage points on the NLPCC dataset, confirming the model's effectiveness.
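As an illustration of the idempotent generation strategy described above, the following minimal Python sketch re-applies a summarizer to its own output until the result stabilizes, approximating the idempotence property summarize(summarize(x)) = summarize(x). The stopping rule, round limit, and the `summarize` callable are assumptions made for illustration; they are not taken from the IGLM implementation.

```python
# Minimal sketch of an idempotent summarization loop (illustrative only; the
# paper's exact IGLM strategy is not specified here, so the fixed-point test
# and the round limit below are assumptions).
from typing import Callable

def idempotent_summarize(text: str,
                         summarize: Callable[[str], str],
                         max_rounds: int = 3) -> str:
    """Re-apply a summarizer until its output stops changing (a fixed point)."""
    summary = summarize(text)
    for _ in range(max_rounds - 1):
        next_summary = summarize(summary)
        if next_summary == summary:  # fixed point reached: another pass changes nothing
            break
        summary = next_summary
    return summary

# Hypothetical usage with any text-generation backend, e.g. a fine-tuned LLM:
# summary = idempotent_summarize(document, summarize=my_llm_summarizer)
```

Driving the output toward a fixed point makes the summary length-insensitive: once the summarizer no longer compresses its own output, further passes neither drop information nor add redundancy.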
Keywords: educational digitalization; text summarization; Large Language Model (LLM); low-resource scenarios; idempotent; augmentation
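For reference, the ROUGE-1/2/L metrics reported in the abstract can be computed with the open-source rouge_score package; the sample strings and the character-level preprocessing for Chinese below are illustrative assumptions, not details from the paper.

```python
# Illustrative ROUGE-1/2/L computation (pip install rouge-score).
# The default scorer tokenizes on whitespace, so Chinese text is commonly
# scored over space-separated characters; this preprocessing is an assumption.
from rouge_score import rouge_scorer

def to_char_tokens(text: str) -> str:
    """Insert spaces between characters so whitespace tokenization works for Chinese."""
    return " ".join(text.replace(" ", ""))

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"])

reference = to_char_tokens("示例参考摘要")  # hypothetical reference summary
candidate = to_char_tokens("示例生成摘要")  # hypothetical model output

scores = scorer.score(reference, candidate)
for name, result in scores.items():
    print(f"{name}: F1 = {result.fmeasure:.4f}")
```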