Multimodal Recommendation Method Integrating Latent Structures and Semantic Information
Multimodal recommender systems aim to improve recommendation performance by exploiting multimodal information such as text and images. However, existing systems usually either integrate multimodal semantic information into item representations or use multimodal features to mine latent structures, without fully exploiting the correlation between the two. Therefore, a multimodal recommendation method integrating latent structures and semantic information is proposed. Based on users' historical behaviors and multimodal features, user-user and item-item graphs are constructed to mine latent structures, and a user-item bipartite graph is built to model users' historical behaviors. Graph convolutional networks are used to learn the topological structures of the different graphs. To better integrate latent structures and semantic information, contrastive learning is employed to align the learned latent-structure representations of items with their original multimodal features. Finally, experiments on three datasets demonstrate the effectiveness of the proposed method.
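The contrastive alignment step described above can be sketched as an InfoNCE-style objective between an item's latent-structure embedding and its multimodal feature vector. The following is a minimal NumPy sketch under assumed details (symmetric InfoNCE with cosine similarity and diagonal positives, temperature 0.2); the paper's exact loss, projection layers, and hyperparameters are not specified here.

```python
import numpy as np

def info_nce_alignment(struct_emb, modal_emb, temperature=0.2):
    """Hypothetical contrastive loss aligning item latent-structure
    embeddings (N, d) with the same items' multimodal features (N, d).
    Row i of each matrix refers to the same item, so matched rows are
    positive pairs and all other rows in the batch are negatives."""
    # Normalize rows so the dot product becomes cosine similarity.
    s = struct_emb / np.linalg.norm(struct_emb, axis=1, keepdims=True)
    m = modal_emb / np.linalg.norm(modal_emb, axis=1, keepdims=True)
    logits = s @ m.T / temperature  # (N, N); positives on the diagonal

    def xent_diag(l):
        # Cross-entropy with the diagonal as the target class,
        # computed via a numerically stable log-softmax.
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    # Symmetric loss: structure -> modality and modality -> structure.
    return (xent_diag(logits) + xent_diag(logits.T)) / 2

# Usage: 4 items with 64-dimensional embeddings.
rng = np.random.default_rng(0)
loss = info_nce_alignment(rng.standard_normal((4, 64)),
                          rng.standard_normal((4, 64)))
```

When the two views of each item agree (high diagonal similarity), the loss approaches zero; mismatched views are pushed apart, which is one common way to couple structural and semantic item representations.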