Transformer-Based Line Drawing Image Retrieval

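Chinese abstract (translated): Image retrieval is crucial in computer vision and is widely applied in many fields. In patents, however, figures usually take the form of line drawings. Because line drawings carry no color or texture information, retrieving them remains a major challenge. The Transformer-based line drawing retrieval model makes full use of the Transformer's strength in modeling long-range dependencies to extract global features of line drawings effectively. The model splits the input line drawing into n patches, extracts features across patches with a self-attention mechanism, processes these features into a 100-dimensional enhanced feature, and finally performs retrieval according to the cosine similarity of the image features. Experiments show that, compared with the convolutional-neural-network-based GoogleNet and ResNet50, the Transformer-based model achieves better results.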

English title: Transformer-based Line Drawing Image Retrieval

English abstract: Image retrieval is crucial in computer vision and has widespread applications in various fields. However, in patents, images are typically presented in the form of line drawings. Since line drawings lack color and texture information, retrieving them still poses significant challenges. This work proposes a Transformer-based line drawing retrieval model that fully leverages the advantage of the Transformer's long-range dependency modeling to effectively extract global features from line drawings. The model divides the input line drawing into n patches and extracts features among patches through a self-attention mechanism. These features are then processed to obtain 100-dimensional enhanced features, and finally, retrieval is performed based on the cosine similarity of image features. Experimental results demonstrate that, compared to GoogleNet and ResNet50, which are based on convolutional neural networks, the Transformer-based model achieves better performance.
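The two ends of the pipeline described in the abstract, patch splitting and cosine-similarity retrieval, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the patch size, and the toy feature vectors are all assumptions, and the self-attention encoder that would turn patches into the 100-dimensional feature is omitted entirely.

```python
import math


def split_into_patches(image, patch):
    """Split a 2-D grid (list of rows) into non-overlapping patch x patch blocks.

    Each block is flattened row by row; blocks are returned in row-major order.
    The patch size is an assumption; the abstract only says "n patches".
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            block = [image[r][c]
                     for r in range(top, top + patch)
                     for c in range(left, left + patch)]
            patches.append(block)
    return patches


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, gallery, top_k=3):
    """Return (index, score) pairs for the top_k gallery features closest to query."""
    scored = [(i, cosine_similarity(query, feat)) for i, feat in enumerate(gallery)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In a full system, each gallery entry would be the feature vector produced by the encoder for one line drawing, and `retrieve` would rank the gallery against the feature of the query drawing.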

English keywords:

Transformer; Image Retrieval; Line Drawing; Computer Vision

Authors:

岳杰 (Yue Jie), 彭炳鑫 (Peng Bingxin)

Author affiliation:

河北建筑工程学院 (Hebei University of Architecture), Zhangjiakou, Hebei 075000

Keywords:

Transformer; image retrieval; line drawing; computer vision

Publication year:

2024

DOI:

10.3969/j.issn.1008-4185.2024.01.034

河北建筑工程学院学报 (Journal of Hebei Institute of Architecture and Civil Engineering)

河北建筑工程学院 (Hebei University of Architecture)

Impact factor: 0.502

ISSN: 1008-4185

Year, volume (issue): 2024, 42(1)

Number of references: 9