Graph Neural Network Recommendation Algorithm Based on Multimodal Fusion
Many existing Graph Neural Network (GNN) recommendation algorithms train only on the node ID information of the user-item interaction graph, learning the high-order connectivity between user and item nodes to enrich their representations. However, they ignore user preferences for different modal information: item modalities such as images and text go unused, and when modal features are fused, they are simply summed without distinguishing users' preferences for different types of modal information. A multimodal fusion GNN recommendation model is proposed to address this problem. First, for each single modality, a unimodal graph network is constructed by combining the user-item interaction bipartite graph, and the user's preference for that modality is learned within the unimodal graph: a Graph ATtention (GAT) network aggregates neighbor information to enrich the local node representations, and a Gated Recurrent Unit (GRU) decides whether to aggregate the neighbor information, providing a denoising effect. Finally, the user and item representations learned from each modal graph are fused by an attention mechanism to obtain the final representations, which are fed into the prediction module. Experimental results on the MovieLens-20M and H&M datasets show that the multimodal information and the attention fusion mechanism effectively improve recommendation accuracy, and the proposed model significantly outperforms the best baseline on all three indicators: Precision@K, Recall@K, and NDCG@K. With the evaluation cutoff K = 10, Precision@10, Recall@10, and NDCG@10 improve by 4.67%, 2.42%, and 2.03% on MovieLens-20M and by 2.49%, 5.24%, and 2.05% on H&M, respectively.
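The sketch below illustrates the pipeline the abstract describes: per-modality GAT-style neighbor aggregation, a GRU cell whose gates control how much neighbor information enters the node state (the denoising step), and attention fusion across modalities. It is a minimal illustration only; all layer names, sizes, the single-layer structure, and the dense adjacency are assumptions, not the authors' exact architecture.

```python
# Minimal sketch (assumed architecture, not the paper's exact model):
# per-modality GAT aggregation + GRU gating, then attention fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATGRULayer(nn.Module):
    """One propagation step on a unimodal graph: attention over neighbors,
    then a GRU cell gating how much of the aggregated message is kept."""
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.Linear(2 * dim, 1)  # scores a (node, neighbor) pair
        self.gru = nn.GRUCell(dim, dim)    # input: message, hidden: node state

    def forward(self, x, adj):
        # x:   (N, dim) node embeddings for one modality
        # adj: (N, N) binary adjacency of the user-item bipartite graph
        N, dim = x.shape
        src = x.unsqueeze(1).expand(N, N, dim)  # center node i at [i, j]
        dst = x.unsqueeze(0).expand(N, N, dim)  # neighbor node j at [i, j]
        scores = self.attn(torch.cat([src, dst], dim=-1)).squeeze(-1)  # (N, N)
        scores = F.leaky_relu(scores).masked_fill(adj == 0, float('-inf'))
        alpha = torch.softmax(scores, dim=-1)
        alpha = torch.nan_to_num(alpha)  # isolated nodes get a zero message
        msg = alpha @ x                  # attention-weighted neighbor sum
        # The GRU update gate decides how much neighbor information to
        # absorb versus keeping the current state -- the denoising effect.
        return self.gru(msg, x)

class ModalFusion(nn.Module):
    """Attention fusion of per-modality representations into one vector."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, modal_reps):
        # modal_reps: (M, N, dim), one representation per modality per node
        w = torch.softmax(self.score(modal_reps), dim=0)  # per-node modality weights
        return (w * modal_reps).sum(dim=0)                # (N, dim)

# Toy usage: 6 nodes (users + items), 2 modalities (e.g. image, text).
N, dim = 6, 16
adj = (torch.rand(N, N) > 0.5).float()
layer = GATGRULayer(dim)
reps = torch.stack([layer(torch.randn(N, dim), adj) for _ in range(2)])
fused = ModalFusion(dim)(reps)   # (N, dim) final representations
score = fused[0] @ fused[3]      # inner-product score for a (user, item) pair
```

A real implementation would use sparse message passing (e.g. edge lists) instead of the dense `(N, N)` attention shown here, which is chosen only to keep the sketch short and self-contained.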