Hierarchical heterogeneous graph network based multimodal emotion recognition in conversation
Springer Nature
Emotion Recognition in Conversation (ERC) is a crucial subtask in developing dialogue systems with emotional understanding capabilities. Multimodal ERC draws on several modalities, including textual, visual, and acoustic information, which together compensate for the limitations of single-modality approaches. Recently, Graph Neural Networks have been widely applied to multimodal ERC because of their strength in relational modeling. However, existing methods either fuse multimodal information directly, which loses interaction information between modalities, or fail to capture long-distance contextual dependencies effectively. In this paper, we propose a novel multimodal ERC approach called Hierarchical Heterogeneous Graph Network (HHGN), which models dialogues as both directed and undirected heterogeneous graphs to facilitate hierarchical learning. The directed graph captures contextual dependency information in dialogues, while the undirected graphs learn cross-modal interaction information. Extensive experiments were conducted on two public benchmark datasets, and the results demonstrate that our model outperforms other competitive methods.
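To make the two graph types concrete, the following is a minimal sketch of one way such graphs could be laid out. It is not the HHGN construction itself: the abstract does not specify edge rules, so the node layout (one node per utterance and modality), the modality names, the `window` parameter, and the `build_graphs` helper are all assumptions made for illustration.

```python
from itertools import combinations

# Assumed modality names; the abstract only mentions text, vision, and acoustic data.
MODALITIES = ("text", "audio", "vision")


def build_graphs(num_utterances, window=2):
    """Return (directed_edges, undirected_edges) over nodes (utterance_idx, modality).

    Hypothetical edge rules, chosen only for illustration:
    - directed edges run from an utterance to later utterances of the same
      modality, at most `window` steps ahead (contextual dependency);
    - undirected edges link every pair of modalities of the same utterance
      (cross-modal interaction).
    """
    directed = []
    for m in MODALITIES:
        for i in range(num_utterances):
            last = min(i + window, num_utterances - 1)
            for j in range(i + 1, last + 1):
                directed.append(((i, m), (j, m)))

    undirected = set()
    for i in range(num_utterances):
        for a, b in combinations(MODALITIES, 2):
            # frozenset makes the edge orderless, i.e. undirected.
            undirected.add(frozenset({(i, a), (i, b)}))

    return directed, undirected


directed, undirected = build_graphs(num_utterances=3, window=2)
```

In this sketch a larger `window` lets the directed graph reach more distant context, while the undirected edges stay local to a single utterance. A real model would attach feature vectors to the nodes and run graph neural network layers over both edge sets.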
Emotion recognition in conversation; Graph neural networks; Heterogeneous graph network; Cross-modal interaction
Junyin Peng, Hong Tang, Wenbin Zheng
College of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, Sichuan, China
College of Engineering, Sichuan Normal University, Chengdu 610068, Sichuan, China