In traditional image-text cross-modal sentiment analysis algorithms, the lack of attention to the spatial and channel dimensions of visual features often causes the loss of key information in local features, so that key information cannot be well represented during the feature fusion stage. This paper therefore proposes an image-text sentiment analysis model based on a reconstructed dual attention network (Images-Text Sentiment Analysis Based on Reconstructed Dual Attention Networks Fusion, IRDA). The model uses ResNet50 to extract visual features and introduces a spatial and channel reconstruction convolution module, which reconstructs the spatial and channel position information of the visual features and fuses key information from different positions, strengthening visual feature extraction. For text feature extraction, the BERT model is used to obtain text representations, and a bidirectional gated recurrent unit (Bi-GRU) captures the contextual relationships between low-level words, enhancing the semantic features of the text. An interactive attention mechanism attends to the feature interactions between modalities and fuses the visual and textual features to complete the sentiment classification task. Experiments on the MVSA multimodal datasets show that the model outperforms current mainstream models, confirming its effectiveness.
Images-Text Sentiment Analysis Based on Reconstructed Dual Attention Networks
In traditional multimodal sentiment analysis algorithms integrating images and text, the lack of attention to the spatial and channel dimensions of visual features often causes the loss of critical information in local features. This deficiency leads to an inadequate representation of key information during the feature fusion stage. To address this, we introduce an image-text sentiment analysis model based on Reconstructed Dual Attention Networks Fusion (IRDA). The model employs ResNet50 to extract visual features and incorporates a spatial and channel reconstruction convolution module. This module reconstructs the spatial and channel position information of the visual features, enabling the fusion of key information at different positions and thereby enhancing visual feature extraction. For textual feature extraction, the BERT model is used to obtain textual representations, and a bidirectional gated recurrent unit (Bi-GRU) focuses on the contextual relationships between lower-level words, thus enhancing the semantic features of the text. Additionally, an interactive attention mechanism attends to the interactions between modalities and fuses the visual and textual features, culminating in the sentiment classification task. The efficacy of the model is demonstrated through experimental validation on the MVSA multimodal datasets, with results indicating superior performance compared with current mainstream models, thereby confirming the effectiveness of the proposed model.
deep learning; multimodal; interactive attention; BERT; reconstructing unit convolutional module; convolutional neural networks; sentiment analysis
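To make the described pipeline concrete, the following is a minimal sketch in PyTorch, assuming torchvision and HuggingFace transformers with "bert-base-uncased". The SpatialChannelRecon block, the projection sizes, the mean pooling, and the use of nn.MultiheadAttention as the interactive attention are simplifying assumptions for illustration, not the paper's exact modules.

# Minimal IRDA-style sketch (assumptions noted above; not the authors' implementation).
import torch
import torch.nn as nn
from torchvision.models import resnet50
from transformers import BertModel

class SpatialChannelRecon(nn.Module):
    """Simplified stand-in for the spatial and channel reconstruction convolution."""
    def __init__(self, channels: int):
        super().__init__()
        # Channel branch: squeeze spatial dims, re-weight channels.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 16, 1), nn.ReLU(),
            nn.Conv2d(channels // 16, channels, 1), nn.Sigmoid(),
        )
        # Spatial branch: re-weight spatial positions.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)   # reconstruct channel information
        x = x * self.spatial_gate(x)   # reconstruct spatial information
        return x

class IRDA(nn.Module):
    def __init__(self, num_classes: int = 3, hidden: int = 768):
        super().__init__()
        backbone = resnet50(weights="IMAGENET1K_V2")
        self.visual = nn.Sequential(*list(backbone.children())[:-2])  # keep feature map
        self.recon = SpatialChannelRecon(2048)
        self.visual_proj = nn.Linear(2048, hidden)

        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.bigru = nn.GRU(hidden, hidden // 2, bidirectional=True, batch_first=True)

        # Interactive (cross-modal) attention in both directions.
        self.txt2img = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        self.img2txt = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        self.classifier = nn.Linear(hidden * 2, num_classes)

    def forward(self, images, input_ids, attention_mask):
        # Visual branch: ResNet50 feature map -> reconstruction -> token sequence.
        fmap = self.recon(self.visual(images))            # (B, 2048, H, W)
        vis = fmap.flatten(2).transpose(1, 2)             # (B, H*W, 2048)
        vis = self.visual_proj(vis)                       # (B, H*W, hidden)

        # Text branch: BERT token embeddings refined by a Bi-GRU.
        txt = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        txt, _ = self.bigru(txt)                          # (B, L, hidden)

        # Interactive attention: each modality attends to the other, then pool and fuse.
        t_attn, _ = self.txt2img(txt, vis, vis)
        v_attn, _ = self.img2txt(vis, txt, txt)
        fused = torch.cat([t_attn.mean(1), v_attn.mean(1)], dim=-1)
        return self.classifier(fused)

The three output classes correspond to the positive, neutral, and negative labels of the MVSA datasets; the bidirectional GRU uses a hidden size of hidden // 2 per direction so its concatenated output matches the BERT dimension.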