Research on a Multimodal Sentiment Analysis Model Combined with the Momentum Distillation Method
This paper investigates multimodal sentiment analysis and proposes a sentiment analysis model that combines a momentum distillation method with a multimodal co-attention mechanism. The model is designed to address two problems: the insufficient extraction of information from individual modalities and the heterogeneity among the representations of different modalities. Its multimodal inputs are text, audio, and visual data. In the unimodal feature encoding module for each modality, an attention mechanism highlights key information, and a dynamic weight adjustment module adaptively reweights the feature dimensions at each time step of the sequence. In the trimodal setting, momentum distillation is introduced, with sentiment polarity serving as a supervisory signal for knowledge distillation. In the multimodal interaction module, a multi-head attention mechanism fuses the features of the three modalities into a text-guided fused representation used for the final sentiment prediction. Finally, comparative experiments are conducted on the publicly available CMU-MOSI and CMU-MOSEI datasets.
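The abstract does not specify how the dynamic weight adjustment module is built. A minimal sketch of one plausible construction, assuming a per-time-step sigmoid gate over feature dimensions (the class name `DynamicWeight` and the hidden size `d_model` are illustrative, not from the paper):

```python
import torch
import torch.nn as nn

class DynamicWeight(nn.Module):
    """Rescales each feature dimension at every time step with a learned gate."""

    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) unimodal sequence features.
        # The sigmoid gate yields a weight in (0, 1) per dimension per step.
        return x * torch.sigmoid(self.gate(x))
```

Because the gate is computed from the features themselves, the weights vary across time steps, matching the abstract's description of dynamic, per-step reweighting.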
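The momentum distillation step can be read along the lines of ALBEF-style distillation, where an exponential-moving-average (EMA) teacher supplies soft pseudo-targets alongside the ground-truth sentiment polarity. A sketch under that assumption; the function names `ema_update` and `distillation_loss`, the momentum `m=0.995`, and the mixing weight `alpha` are assumptions, and polarity is treated here as a class label:

```python
import torch
import torch.nn.functional as F

def ema_update(student: torch.nn.Module, teacher: torch.nn.Module, m: float = 0.995) -> None:
    # Teacher parameters track the student via an exponential moving average.
    with torch.no_grad():
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.data.mul_(m).add_(ps.data, alpha=1.0 - m)

def distillation_loss(student, teacher, fused_feats, polarity_labels, alpha: float = 0.4):
    # Hard loss against ground-truth polarity plus a soft loss against
    # the momentum teacher's predictions (pseudo-targets).
    logits_s = student(fused_feats)
    with torch.no_grad():
        logits_t = teacher(fused_feats)
    hard = F.cross_entropy(logits_s, polarity_labels)
    soft = F.kl_div(F.log_softmax(logits_s, dim=-1),
                    F.softmax(logits_t, dim=-1), reduction="batchmean")
    return (1.0 - alpha) * hard + alpha * soft
```

In training, `ema_update` would be called after each optimizer step so the teacher lags behind the student and provides smoothed targets.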
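One plausible reading of the text-guided fusion is cross-attention in which text features act as queries over the audio and visual streams. A sketch assuming all three modalities have already been projected to a shared dimension `d_model`; the class name `TextGuidedFusion` and the head count are hypothetical:

```python
import torch
import torch.nn as nn

class TextGuidedFusion(nn.Module):
    """Text queries audio and visual features via multi-head cross-attention."""

    def __init__(self, d_model: int = 128, n_heads: int = 4):
        super().__init__()
        self.attn_ta = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.attn_tv = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.proj = nn.Linear(3 * d_model, d_model)

    def forward(self, text, audio, visual):
        # text/audio/visual: (batch, seq_len, d_model) unimodal encodings.
        ta, _ = self.attn_ta(text, audio, audio)    # text attends to audio
        tv, _ = self.attn_tv(text, visual, visual)  # text attends to visual
        fused = torch.cat([text, ta, tv], dim=-1)   # text-guided concatenation
        return self.proj(fused).mean(dim=1)         # pooled fused representation
```

The pooled output would then feed a small regression or classification head for the final sentiment prediction on CMU-MOSI/CMU-MOSEI.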