Multimodal Optimization Method Based on Lightweight DialogueRNN
Emotion recognition in conversation (ERC) is an active research field in natural language processing, and the classification of utterance-level emotion is widely used in human-computer interaction. At present, most research focuses on modeling speaker and context information, exploiting multimodal information through simple feature concatenation while ignoring the dependencies between modalities. To solve these problems, a lightweight DialogueRNN multimodal optimization method, MMRNN (multimodal RNN), is proposed, in which dynamic feature fusion based on an attention network amplifies the contribution of high-quality, information-rich modalities. Firstly, in the multimodal fusion process, attention scores are computed so that the model focuses on the more important modalities. Secondly, the emotion GRU in DialogueRNN is optimized. Finally, the model is stacked to increase its depth, and an attention mechanism is added to the output of each layer to obtain the final emotion output. Simulation results on two public datasets show that the proposed multimodal optimization method based on lightweight DialogueRNN achieves excellent performance.
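The attention-based dynamic fusion described above can be sketched as follows. This is a minimal illustration, not the authors' exact implementation: the scoring vector `w` stands in for whatever learned scoring parameters the paper uses, and the three rows of `feats` are assumed to represent text, audio, and video features of one utterance.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_fuse(modality_feats, w):
    """Fuse per-modality feature vectors by attention weights.

    modality_feats: (M, D) array, one row per modality (e.g. text, audio, video).
    w: (D,) scoring vector -- a hypothetical learned parameter.
    Returns the fused (D,) feature and the (M,) attention distribution.
    """
    scores = modality_feats @ w      # one unnormalized score per modality
    alpha = softmax(scores)          # attention weights over modalities
    fused = alpha @ modality_feats   # attention-weighted sum of modality features
    return fused, alpha

rng = np.random.default_rng(0)
feats = rng.standard_normal((3, 4))  # 3 modalities, 4-dim features (toy sizes)
w = rng.standard_normal(4)
fused, alpha = attention_fuse(feats, w)
```

Because the weights come from a softmax, higher-scoring (higher-quality) modalities dominate the fused vector instead of being averaged in by plain concatenation.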
multimodal fusion; emotion recognition in conversation; attention mechanism; scene modeling; model stacking