A Study on an Intelligent Cross-media Artwork Recommendation Application Based on Image Caption

Liu Bin (刘斌)¹, Yu Xiaodong (于晓东)²

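Author information

1. School of Information and Artificial Intelligence, Anhui Business College (安徽商贸职业技术学院), Wuhu, Anhui 241002
2. Wuhu Googol Automation Technology Co., Ltd. (芜湖固高自动化技术有限公司), Wuhu, Anhui 241000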

Abstract

The poetry matching model based on image description and cross-modal attention is a deep learning model for intelligently matching images with poems. The model first uses a ResNet-based Faster R-CNN to extract visual features from the input image. It then combines these with contextual features of the poem text extracted by BERT, and applies a cross-modal attention mechanism together with a softmax function to output the poem that best matches the image. Experimental results show that the model outperforms the other baseline models.
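
The matching step summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the image region features and poem token features have already been extracted and projected to a shared dimension, and it substitutes random arrays for real Faster R-CNN and BERT outputs. The function names (`cross_modal_score`, `match_poems`) and the cosine-based scoring are hypothetical choices made only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def cross_modal_score(regions, tokens):
    """Score one poem against one image (illustrative scoring rule).

    regions: (R, d) image region features, assumed already extracted.
    tokens:  (T, d) poem token features, assumed already extracted and
             projected to the same dimension d.
    """
    d = regions.shape[1]
    # Each poem token attends over the image regions (scaled dot product).
    attention = softmax(tokens @ regions.T / np.sqrt(d), axis=1)  # (T, R)
    attended = attention @ regions                                 # (T, d)
    # Cosine similarity between each token and its attended visual context.
    dots = np.sum(tokens * attended, axis=1)
    norms = np.linalg.norm(tokens, axis=1) * np.linalg.norm(attended, axis=1)
    return float(np.mean(dots / (norms + 1e-8)))


def match_poems(regions, poems):
    """Return (index of best poem, softmax distribution over candidates)."""
    scores = np.array([cross_modal_score(regions, p) for p in poems])
    probs = softmax(scores)
    return int(np.argmax(probs)), probs


# Random placeholders stand in for real Faster R-CNN and BERT features.
rng = np.random.default_rng(0)
image_regions = rng.normal(size=(5, 16))
candidate_poems = [rng.normal(size=(t, 16)) for t in (7, 4, 9)]
best, probs = match_poems(image_regions, candidate_poems)
print(best, probs)
```

In this sketch the softmax turns per-poem scores into a distribution over the candidate poems, and the poem with the highest probability is returned as the match. A trained model would learn the projections and scoring function rather than use raw cosine similarity.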

Key words

AoA/image caption/BERT/cross-modal attention/multi-modal fusion

Funding

Key Natural Science Research Project of the Anhui Provincial Department of Education (2020) (KJ2020A1082)

Publication year

2024

Journal of Wuhu Institute of Technology (芜湖职业技术学院学报)

Wuhu Institute of Technology (芜湖职业技术学院)

Impact factor: 0.274

ISSN: 1009-1114

Number of references: 11