Chinese positive sentiment style transfer based on dialogues
[Objective] Several studies show that negative-sentiment dialogues within the family significantly affect individuals' mental and physical well-being. Conversely, positive-sentiment dialogues offer constructive feedback that motivates learning and personal growth. Such dialogues help build self-confidence and a positive attitude, enabling individuals to cope better with life's challenges. Text style transfer is an effective tool for converting negative-sentiment dialogues into positive-sentiment ones. Its goal is to retain the content of a text while giving the generated text specific attributes. Sentiment style transfer is an important research direction in natural language processing, and sentiment style transfer in family dialogues has practical value. However, the literature on sentiment style transfer has focused mainly on English datasets, and research on Chinese remains relatively limited.
[Methods] We constructed a dialogue-based Chinese sentiment text dataset in this study. The initial data were extracted from dialogues in the TV series "Home with Kids", in which the dialogues between the characters Liu Mei and Liu Xing differ considerably in sentiment from those between Liu Mei and Xia Xue. Interactions between Liu Mei and Liu Xing were primarily critical, whereas interactions between Liu Mei and Xia Xue were characterized by encouragement and respect. The dataset was preprocessed in three steps: (1) Data cleaning, filtering, and format conversion were performed to ensure data quality and consistency. (2) A recurrent modeling annotation method was used to annotate the data and identify key information and features. Six iterations were performed, and in each iteration the classifier was fine-tuned on the data updated in the previous iteration. (3) The data were then reviewed and labeled manually to further improve accuracy and reliability. The final dataset comprises 30 836 sentences, including 11 562 sentences with positive sentiment and 19 274 sentences with negative sentiment.
[Results] In this dialogue dataset, most texts explicitly contain sentiment-related words. Based on this characteristic, we studied dialogue-based Chinese positive sentiment style transfer using four editing-based models: delete-retrieve-generate (DRG), tagger and generator (TAG), conditional BERT (CondBERT), and tagging without rewriting (TWR). In addition, an improved TWR Transformer model (TWR*) was introduced. The original TWR model used a multilayer perceptron to train its style classifier. To identify specific styles more accurately, we instead trained a style classifier based on the RoBERTa-Large-Chinese model to distinguish different text styles. The experiments showed that the pretrained language model RoBERTa-Large-Chinese improved the classification results, which was attributed to the close relationship between the attention weights of the penultimate layer of the Transformer model and the words commonly associated with positive and negative sentiment. The RoBERTa-Large-Chinese model achieved higher accuracy in recognizing the words that carry the sentiment style attribute of a text.
[Conclusions] Experimental results confirm that the style classifier trained on our dataset can effectively identify negative content in text. In both automatic and manual evaluations, the TWR* model outperforms the baseline models in identifying textual sentiment attributes and in achieving positive sentiment style transfer, which verifies the effectiveness of the model improvements and the validity of the dataset.
natural language processing; text generation; sentiment style transfer; recurrent model; editing-based model; family dialogue
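
The iterative annotation step described in the Methods (six rounds, with the classifier retrained each time on the data updated in the previous round) can be illustrated with a minimal sketch. This is one plausible reading of that step, not the authors' implementation: the token-count classifier stands in for the fine-tuned model, and the names `train`, `predict`, `iterative_annotation`, and the `min_margin` threshold are all hypothetical. The toy sentences are English placeholders for the Chinese dialogue data.

```python
from collections import Counter


def train(labeled):
    """Toy stand-in for fine-tuning: count token occurrences per label."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in labeled:
        counts[label].update(text.split())
    return counts


def predict(model, text):
    """Return (label, margin); the margin is a crude confidence score."""
    score = sum(model["pos"][tok] - model["neg"][tok] for tok in text.split())
    label = "pos" if score > 0 else "neg"
    return label, abs(score)


def iterative_annotation(seed, unlabeled, rounds=6, min_margin=1):
    """Retrain on the updated labeled set each round (assumed procedure).

    Sentences the classifier labels with enough margin join the labeled
    set; the rest stay in the pool, e.g. for later manual annotation.
    """
    labeled = list(seed)
    pool = list(unlabeled)
    for _ in range(rounds):
        model = train(labeled)  # retrain on data updated in the last round
        remaining = []
        for text in pool:
            label, margin = predict(model, text)
            if margin >= min_margin:
                labeled.append((text, label))
            else:
                remaining.append(text)
        pool = remaining
    return labeled, pool


seed = [("great job", "pos"), ("you are lazy", "neg")]
unlabeled = ["great effort", "lazy again", "effort wonderful", "table chair"]
labeled, pool = iterative_annotation(seed, unlabeled)
print(labeled)
print(pool)
```

In this toy run, "effort wonderful" cannot be labeled in the first round but is picked up in the second, once "effort" has been learned from a newly labeled sentence. "table chair" never reaches the margin and remains in the pool, which corresponds to the portion of the data that would be left for manual review.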