Tibetan summarization algorithm combining extractive and abstractive methods
To advance text summarization technology for the Tibetan language, this study employs a two-stage fine-tuning approach to develop a Tibetan summarization model that integrates extractive and abstractive techniques, ensuring both fluency and semantic consistency in summaries. An extractive Tibetan summarization model, BERT-ext, was trained first, followed by a second fine-tuning stage to create the abstractive model, BERT-ext-abs. Comparative experiments were conducted in terms of model structure and dataset size. Results indicate that, compared to the purely abstractive Tibetan summarization model, BERT-abs, the BERT-ext-abs model achieves a 3.23% improvement in ROUGE-1 score and a 0.95% increase in BERTScore. Additionally, the BERT-ext-abs model requires fewer parameters and less training data than BERT-abs, enabling it to generate fluent and semantically consistent summaries more efficiently.
Keywords: extractive summarization; abstractive summarization; pre-training; bidirectional encoder representations from transformers (BERT); Tibetan language
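Since the abstract describes a two-stage fine-tuning pipeline (an extractive model trained first, whose encoder then warm-starts an abstractive model), a minimal sketch of that idea follows, assuming a PyTorch/Hugging Face setup. The checkpoint name `tibetan-bert-base`, the per-sentence scoring head, and the encoder warm-start step are illustrative assumptions, not the authors' released code.

```python
# Sketch of the two-stage fine-tuning idea behind BERT-ext / BERT-ext-abs.
import torch
import torch.nn as nn
from transformers import BertModel, EncoderDecoderModel

CKPT = "tibetan-bert-base"  # hypothetical Tibetan BERT checkpoint name

# Stage 1: extractive model (BERT-ext) -- score each sentence for inclusion.
class BertExt(nn.Module):
    def __init__(self, ckpt: str):
        super().__init__()
        self.encoder = BertModel.from_pretrained(ckpt)
        # Assumed head: one linear scorer over per-sentence [CLS] vectors.
        self.scorer = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask, cls_positions):
        # Encode the document, then gather the hidden state at each
        # sentence's [CLS] token (cls_positions: [batch, num_sentences]).
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        batch_idx = torch.arange(hidden.size(0)).unsqueeze(1)
        sent_vecs = hidden[batch_idx, cls_positions]
        return self.scorer(sent_vecs).squeeze(-1)  # extraction logit per sentence

# Stage 1 training (not shown): fit BertExt with BCE loss against oracle
# sentence labels, then save the encoder, e.g.
#   model.encoder.save_pretrained("bert-ext")

# Stage 2: abstractive model (BERT-ext-abs) -- warm-start an encoder-decoder
# from the stage-1 encoder and fine-tune it to generate summaries with
# token-level cross-entropy on (document, reference summary) pairs.
abs_model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-ext", "bert-ext")
```

The point of the warm start is that the abstractive stage inherits the content-selection signal learned extractively, which is consistent with the reported gains over training the abstractive model BERT-abs from scratch.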