Transformer-based Line Drawing Image Retrieval

Image retrieval is a core task in computer vision with applications in many fields. In patents, however, images usually appear as line drawings. Because line drawings carry no color or texture information, retrieving them remains highly challenging. This work proposes a Transformer-based line drawing retrieval model that exploits the Transformer's ability to model long-range dependencies in order to extract global features from line drawings effectively. The model splits the input line drawing into n patches and extracts features across patches through a self-attention mechanism. These features are then processed into a 100-dimensional enhanced feature, and retrieval is performed using the cosine similarity between image features. Experiments show that the Transformer-based model achieves better results than GoogleNet and ResNet50, which are based on convolutional neural networks.
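The pipeline in the abstract (patch splitting, self-attention across patches, a fixed-length feature, cosine-similarity ranking) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names are hypothetical, the attention uses identity query/key/value projections instead of learned weights, and the feature is a simple mean over patch tokens, not the 100-dimensional enhanced feature the paper describes.

```python
import math

def split_into_patches(image, patch):
    """Split an H x W grid (list of rows) into flattened patch x patch blocks."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product attention.

    Assumption: identity Q/K/V projections (no learned weights).
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens))
                    for j in range(d)])
    return out

def embed(image, patch=2):
    """Mean-pool the attended patch tokens into one feature vector."""
    attended = self_attention(split_into_patches(image, patch))
    n = len(attended)
    return [sum(tok[j] for tok in attended) / n
            for j in range(len(attended[0]))]

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, gallery, top_k=3):
    """Return gallery indices sorted by descending cosine similarity to query."""
    scored = [(cosine_similarity(query, g), i) for i, g in enumerate(gallery)]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [i for _, i in scored[:top_k]]
```

In use, each gallery drawing would be passed through `embed` once and the resulting vectors stored, so that a query only requires one embedding plus a call to `retrieve`. A trained model would replace the identity projections and mean pooling with learned layers.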

Transformer; Image Retrieval; Line Drawing; Computer Vision

Yue Jie (岳杰), Peng Bingxin (彭炳鑫)

Hebei University of Architecture (河北建筑工程学院), Zhangjiakou, Hebei 075000, China

2024

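Journal of Hebei Institute of Architecture and Civil Engineering (河北建筑工程学院学报)
Hebei University of Architecture (河北建筑工程学院)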

Impact factor: 0.502
ISSN: 1008-4185
Year, volume (issue): 2024, 42(1)