Due to the parallel structure of the Transformer, indirect fusion models in the field of multimodal sentiment analysis struggle to model semantic relationships along the time dimension and cannot effectively control the information output according to the importance of the different modalities. To address this, this paper proposes the AGRU-Transfusion-MGN fusion algorithm. The algorithm adds a soft-attention mechanism to the Gated Recurrent Unit to extract time-series emotional information, constructs a reverse transformation between the encoder and decoder of the Transformer, uses the mean absolute error to bridge the fusion loss between the decoded features and the corresponding target features, and sets a gating function and builds a multimodal gating mechanism to comprehensively judge the importance of the different modalities. To verify the performance of the algorithm, experiments were carried out on the multimodal emotion dataset CMU-MOSEI, with weighted accuracy, mean absolute error, and symbol detection used as evaluation indicators.
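The following is a minimal sketch of two of the mechanisms the abstract describes: a soft-attention layer over GRU hidden states for extracting time-series emotional features, and a gated fusion that weights each modality by a learned importance score. It assumes a PyTorch implementation; the module names, dimensions, and number of modalities are illustrative and are not taken from the original AGRU-Transfusion-MGN code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGRU(nn.Module):
    """GRU whose hidden states are pooled with soft attention (illustrative)."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.gru = nn.GRU(in_dim, hid_dim, batch_first=True)
        self.attn = nn.Linear(hid_dim, 1)

    def forward(self, x):                       # x: (batch, time, in_dim)
        h, _ = self.gru(x)                       # h: (batch, time, hid_dim)
        scores = self.attn(h).squeeze(-1)        # (batch, time)
        weights = F.softmax(scores, dim=-1)      # soft attention over time steps
        return torch.bmm(weights.unsqueeze(1), h).squeeze(1)  # (batch, hid_dim)

class ModalityGate(nn.Module):
    """Gated fusion that scores the importance of each modality feature."""
    def __init__(self, hid_dim, n_modalities=3):
        super().__init__()
        self.gate = nn.Linear(hid_dim * n_modalities, n_modalities)

    def forward(self, feats):                    # feats: list of (batch, hid_dim)
        g = torch.sigmoid(self.gate(torch.cat(feats, dim=-1)))   # (batch, n_modalities)
        stacked = torch.stack(feats, dim=1)      # (batch, n_modalities, hid_dim)
        return (g.unsqueeze(-1) * stacked).sum(dim=1)            # weighted sum of modalities
```

In the same spirit, the mean-absolute-error term that the abstract says bridges decoded and target features could be expressed as `F.l1_loss(decoded, target)`; how this term is weighted against the task loss is a detail of the original method not shown here.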