Cross-lingual Vietnamese-Chinese news text summarization method with image fusion
[Objective] Vietnamese-Chinese cross-lingual news summarization aims to convert Vietnamese news into concise, accurate, and readable Chinese summaries. Existing work on this task focuses mainly on summarizing and extracting textual information. Although this improves the accuracy of the generated summaries to some extent, it ignores the importance of images in news reports.
[Methods] This paper therefore proposes a Vietnamese-Chinese cross-lingual news text summarization method that integrates image information, and explores how image information can be used effectively to improve summarization. Because no image-text cross-lingual summarization dataset was available, this paper constructs a real-world dataset of 142 000 news sample pairs and 235 770 news images collected from multiple Vietnamese news websites. First, the Vietnamese news text and images are represented using a text encoder and an image encoder. Second, an image-text contrastive loss is used to enhance the consistency of the image and text representations, forcing the Vietnamese representation space to approach the language-independent image representation space. Third, an image-text fuser is used to fuse images and text, enhancing the ability to extract key information from the news text. Finally, a summary decoder generates the Chinese summary.
[Results] To demonstrate the effectiveness of the proposed method, this paper compares its performance with that of six baseline methods on the constructed dataset. First, the experimental results show that the model improves significantly over traditional cross-lingual summarization models. Second, comparisons with multiple end-to-end cross-lingual summarization models, including NCLS, indicate that integrating image information effectively improves cross-lingual summarization performance. This paper also reports ablation experiments. Model performance dropped significantly after removing the image encoding module and the image-text fusion module, and it also dropped after removing the image-text contrastive loss module. Randomly selecting an image and replacing it with an image synthesized from Gaussian noise likewise reduced model performance. In addition, a hyper-parameter analysis explores how the ratio between the number of text encoding layers and the number of image-text encoding layers affects overall model performance. The ROUGE score is highest when three layers serve as text encoder layers and three layers serve as image-text encoder layers. Finally, a manual evaluation is added to assess the quality of the summaries generated by the model. The informativeness, conciseness, and fluency scores of the proposed MH-CLS model are higher than those of the Sum-Trans, Trans-Sum, and MCLAS models, further suggesting the effectiveness of the method.
[Conclusions] The proposed Vietnamese-Chinese cross-lingual news text summarization method that fuses image information achieves significant improvements over existing cross-lingual summarization methods. Analysis of the experimental results shows that adding image information and the image-text contrastive module to guide summary generation plays an important role in improving the quality of cross-lingual news summaries. The synergy between images and text is fully exploited in image-text fusion and key information extraction, so the method extracts key information better and achieves satisfactory results in terms of summary information content, accuracy, and information richness. These advantages demonstrate the vital role of images in cross-lingual summarization and show that the approach can effectively use image information to improve both the quality and the understandability of summaries.
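The image-text contrastive loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the symmetric InfoNCE form below, the function name image_text_contrastive_loss, and the temperature value 0.07 are all assumptions made for illustration, not the authors' implementation. The sketch takes one pooled vector per news text and one per paired image, and rewards matched pairs for being more similar than mismatched pairs within the batch.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def image_text_contrastive_loss(text_vecs, image_vecs, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired text/image vectors.

    Assumption: text_vecs[i] and image_vecs[i] come from the same news item.
    The temperature value is a hypothetical default, not taken from the paper.
    """
    if not text_vecs or len(text_vecs) != len(image_vecs):
        raise ValueError("need a non-empty batch of paired vectors")

    t = [l2_normalize(v) for v in text_vecs]
    im = [l2_normalize(v) for v in image_vecs]
    n = len(t)

    # Cosine similarity of every text with every image, scaled by temperature.
    sim = [
        [sum(a * b for a, b in zip(t[i], im[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    # Text-to-image direction: each text should pick out its own image.
    t2i = -sum(log_softmax(sim[i])[i] for i in range(n)) / n

    # Image-to-text direction: each image should pick out its own text.
    sim_t = [list(col) for col in zip(*sim)]
    i2t = -sum(log_softmax(sim_t[j])[j] for j in range(n)) / n

    return 0.5 * (t2i + i2t)


# Toy batch: two texts, once with correctly paired images, once with swapped images.
texts = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = image_text_contrastive_loss(texts, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = image_text_contrastive_loss(texts, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the correctly paired images yield a much lower loss than the swapped ones. Minimizing such a loss pulls each text representation toward its own image, which is the kind of alignment pressure the abstract describes.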
cross-lingual summarization; Vietnamese-Chinese cross-lingual news summarization; text-image fusion; text-image contrastive loss