Journal of Beijing Institute of Fashion Technology (Natural Science Edition), 2024, Vol. 44, Issue 2: 70-78. DOI: 10.16454/j.cnki.issn.1001-0564.2024.02.010

Research on Multimodal Correlation of Fashion Clothing Compatibility Prediction Based on Hybrid Graph Neural Network


陈燕 1, 吕梓民 1, 李云 2, 陆星宇 3, 井佩光 4

Author Information

  • 1. School of Computer and Electronic Information, Guangxi University, Nanning 530004, Guangxi, China
  • 2. School of Big Data and Artificial Intelligence, Guangxi University of Finance and Economics, Nanning 530003, Guangxi, China
  • 3. School of Science and Technology, Xiangsihu College of Guangxi Minzu University, Nanning 541004, Guangxi, China
  • 4. School of Automation and Information Engineering, Tianjin University, Tianjin 300072, China


Abstract

In recent years, significant progress has been made in multimodal fusion research. Multimodal data provides richer information than unimodal data; however, the category co-occurrence frequency bias that arises during multimodal fusion makes clothing compatibility prediction challenging. We therefore propose a multimodal correlation fashion clothing compatibility prediction model based on a hybrid graph neural network. The model deeply exploits the correlation between the textual and visual modalities, and uses the hybrid graph neural network to resolve the inaccurate compatibility predictions caused by category co-occurrence frequency bias during multimodal fusion, thereby improving the accuracy of clothing compatibility prediction. The model was evaluated on the Polyvore Outfits and Polyvore Outfits-D open-source datasets on the compatibility prediction and fill-in-the-blank tasks. The results show that the model achieved AUC values of 0.928 and 0.878 on the clothing compatibility task for the two datasets, and accuracies of 62.41% and 56.83% on the fill-in-the-blank task, surpassing the baseline models used for comparison.
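The paper's model is not reproduced here; as a toy sketch only (the concatenation fusion, the uniform mean aggregation, and all function names are illustrative assumptions, not the authors' method), the two evaluated tasks, compatibility scoring and fill-in-the-blank, can be illustrated with multimodal fusion followed by one round of message passing on a fully connected outfit graph:

```python
import numpy as np

def fuse(visual, text):
    """Fuse the two modalities by simple concatenation (one of many options)."""
    return np.concatenate([visual, text], axis=-1)

def message_pass(x):
    """One mean-aggregation round over a fully connected outfit graph."""
    n = x.shape[0]
    neighbour_mean = (x.sum(axis=0, keepdims=True) - x) / max(n - 1, 1)
    return 0.5 * (x + neighbour_mean)  # blend self and neighbour features

def compatibility(items):
    """Outfit score: mean pairwise cosine similarity, squashed to (0, 1)."""
    h = message_pass(items)
    h = h / np.linalg.norm(h, axis=1, keepdims=True)
    sim = h @ h.T
    n = h.shape[0]
    mean_pairwise = (sim.sum() - np.trace(sim)) / (n * (n - 1))
    return 1.0 / (1.0 + np.exp(-mean_pairwise))

def fill_in_blank(partial_outfit, candidates):
    """Pick the candidate whose addition maximises outfit compatibility."""
    scores = [compatibility(np.vstack([partial_outfit, c])) for c in candidates]
    return int(np.argmax(scores))
```

In the actual model, the item features would come from learned visual and text encoders, and the hybrid graph structure (rather than this uniform fully connected aggregation) is what counteracts the category co-occurrence frequency bias.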


Key words

multimodality/dynamic graph neural network/co-occurrence frequency bias/clothing compatibility


Funding

National Natural Science Foundation of China (71862003)

National Natural Science Foundation of China (61861014)

National Natural Science Foundation of China (6236010200)

Doctoral Start-up Fund (BS2021025)

Guangxi Natural Science Foundation (2020GXNSFAA159090)

Guangxi Science Research and Technology Development Program (AA20302002-3)

Publication Year

2024
Journal of Beijing Institute of Fashion Technology (Natural Science Edition)
Published by Beijing Institute of Fashion Technology

Impact Factor: 0.17
ISSN: 1001-0564