Multimodal Sentiment Analysis Model Integrating Multi-features and Attention Mechanism
[Objective] This paper proposes a multimodal sentiment analysis model that integrates multiple features and attention mechanisms, addressing the insufficient extraction of multimodal features and the inadequate interaction of intra-modal and inter-modal information in existing models. [Methods] In multimodal feature extraction, we enhanced the video modality with features describing the body movements, gender, and age of the individuals in the video, and for the text modality we integrated BERT-based character-level and word-level semantic vectors, thereby enriching the low-level features of the multimodal data. We then applied self-attention and cross-modal attention mechanisms to integrate intra-modal and inter-modal information, concatenated the modal features, and employed a soft attention mechanism to allocate an attention weight to each feature. Finally, we generated the sentiment classification results through fully connected layers. [Results] We evaluated the proposed model on the public CH-SIMS dataset and the Hot Public Opinion Comments Videos (HPOC) dataset constructed in this paper. Compared with the Self-MM model, our model improved the binary classification accuracy, tri-class classification accuracy, and F1 value by 1.83%, 1.74%, and 0.69% on the CH-SIMS dataset, and by 1.03%, 0.94%, and 0.79% on the HPOC dataset. [Limitations] The scene around a person in a video may change constantly, and different scenes may carry different emotional information; our model does not incorporate this scene information. [Conclusions] The proposed model enriches the low-level features of multimodal data and improves the effectiveness of sentiment analysis.
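To make the fusion stage described in [Methods] concrete, the sketch below illustrates intra-modal self-attention, cross-modal attention, concatenation, and soft-attention weighting before fully connected classification. It is a minimal PyTorch illustration under assumed settings (feature dimension, number of heads, text as the query modality, mean pooling over time), not the authors' released implementation.

    # Minimal sketch (assumptions noted above), not the paper's actual code.
    import torch
    import torch.nn as nn

    class CrossModalFusion(nn.Module):
        def __init__(self, dim=128, num_heads=4, num_classes=3):
            super().__init__()
            # intra-modal self-attention for text (t), audio (a), video (v)
            self.self_attn = nn.ModuleDict({
                m: nn.MultiheadAttention(dim, num_heads, batch_first=True)
                for m in ("t", "a", "v")
            })
            # cross-modal attention: text queries attend to audio and video features
            self.cross_attn_ta = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.cross_attn_tv = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            # soft attention over the concatenated feature vector
            self.soft_attn = nn.Sequential(
                nn.Linear(3 * dim, 3 * dim), nn.Tanh(), nn.Linear(3 * dim, 3 * dim))
            # fully connected layers producing the sentiment classification result
            self.classifier = nn.Sequential(
                nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, num_classes))

        def forward(self, t, a, v):
            # t, a, v: (batch, seq_len, dim) low-level features of each modality
            t, _ = self.self_attn["t"](t, t, t)
            a, _ = self.self_attn["a"](a, a, a)
            v, _ = self.self_attn["v"](v, v, v)
            # inter-modal interaction: queries from text, keys/values from audio/video
            ta, _ = self.cross_attn_ta(t, a, a)
            tv, _ = self.cross_attn_tv(t, v, v)
            # pool over time and concatenate the modality representations
            fused = torch.cat([t.mean(1), ta.mean(1), tv.mean(1)], dim=-1)
            # soft attention assigns a weight to each feature before classification
            weights = torch.softmax(self.soft_attn(fused), dim=-1)
            return self.classifier(fused * weights)

    # usage with random tensors standing in for BERT text vectors and the
    # enhanced audio/video features
    model = CrossModalFusion()
    logits = model(torch.randn(2, 20, 128), torch.randn(2, 50, 128), torch.randn(2, 30, 128))

The choice of text as the query modality for cross-modal attention is one common design; the same pattern can be repeated with audio or video as queries if all pairwise interactions are needed.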