Image Descriptor with Soft Assignment of Expanded Codewords for Image Retrieval

The retrieval accuracy of the Vector of Locally Aggregated Descriptors (VLAD) can be improved by increasing the number of cluster centroids in the codebook, but doing so raises the vector dimension and the memory needed to store image descriptors. To address this, IM-VLAD is proposed: it combines soft assignment based on codeword expansion with a two-layer hierarchical codebook, improving retrieval accuracy while keeping the vector dimension unchanged. In the codebook training stage, the K-means algorithm is used to train the first-layer visual codebook on local image features; the second-layer codebook is then trained from the features that belong to each first-layer centroid. In the descriptor computation stage, a soft assignment method based on codeword expansion is designed: new codewords are generated from the nearest codewords of each local feature in the second-layer codebook, the weights of the new codewords are distributed to those nearest codewords, and the residual vectors of the local feature are computed and accumulated. The residual vectors are then aggregated layer by layer, from the second-layer codewords to the corresponding first-layer codewords, into sub-vectors, which are concatenated to form IM-VLAD. Image retrieval experiments are performed on three public datasets. With a first-layer codebook of size 64, IM-VLAD improves on VLAD from 0.526 to 0.628 in mean average precision on the Holidays dataset, from 3.17 to 3.50 in Recall@4 on the UKBench dataset, and from 0.513 to 0.604 in mean average precision on the Holidays_Flickr1M dataset. IM-VLAD also shows higher retrieval accuracy than other improved methods.
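
The two stages described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation. The abstract does not specify how expanded codewords are generated or weighted, so the sketch substitutes a Gaussian-kernel soft assignment over the `r` nearest second-layer codewords. All function names and the parameters `r` and `sigma` are hypothetical. The baseline `vlad` function is included only to show that both descriptors have the same dimension, `k1 * d`.

```python
import numpy as np


def l2_normalize(v):
    n = np.linalg.norm(v)
    return v / n if n > 0 else v


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    # Sample with replacement only if there are fewer points than centroids.
    centers = x[rng.choice(len(x), size=k, replace=len(x) < k)].copy()
    for _ in range(iters):
        # Squared distances from every point to every centroid: shape (n, k).
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers


def train_two_layer(x, k1, k2):
    """First-layer codebook on all features, then one second-layer
    codebook per first-layer cell (assumes every cell is non-empty)."""
    layer1 = kmeans(x, k1)
    d2 = ((x[:, None, :] - layer1[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    layer2 = [kmeans(x[labels == j], k2, seed=j + 1) for j in range(k1)]
    return layer1, np.stack(layer2)  # shapes (k1, d) and (k1, k2, d)


def vlad(features, codebook):
    """Baseline VLAD: hard-assign each feature, sum residuals per codeword."""
    out = np.zeros(codebook.shape)
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    for f, j in zip(features, d2.argmin(axis=1)):
        out[j] += f - codebook[j]
    return l2_normalize(out.ravel())


def two_layer_soft_vlad(features, layer1, layer2, r=2, sigma=1.0):
    """Soft-assign each feature to its r nearest second-layer codewords
    (ASSUMED Gaussian weights), then accumulate the weighted residuals
    in the slot of the parent first-layer codeword."""
    k1, k2, d = layer2.shape
    out = np.zeros((k1, d))
    for f in features:
        i = ((layer1 - f) ** 2).sum(axis=1).argmin()  # first-layer cell
        d2 = ((layer2[i] - f) ** 2).sum(axis=1)
        near = np.argsort(d2)[:r]
        # Subtract the minimum distance for numerical stability.
        w = np.exp(-(d2[near] - d2[near].min()) / (2 * sigma ** 2))
        w /= w.sum()
        out[i] += (w[:, None] * (f - layer2[i][near])).sum(axis=0)
    return l2_normalize(out.ravel())  # dimension stays k1 * d
```

Because the second-layer residuals are summed into the slot of their parent first-layer codeword, the output length depends only on the first-layer codebook size and the feature dimension, which is the property the abstract attributes to IM-VLAD.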

image retrieval; image descriptor; codeword expansion; feature assignment

Tao Yong, Ai Liefu

School of Computer and Information, Anqing Normal University, Anqing, Anhui 246133

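National Natural Science Foundation of China (61801006); Anhui Provincial Natural Science Foundation (1608085MF144, 1908085MF194); Key Natural Science Research Project of Anhui Higher Education Institutions (KJ2020A0498)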

2024

Journal of Anqing Normal University (Natural Science Edition)
Anqing Normal College


Impact factor: 0.252
ISSN: 1007-4260
Year, volume (issue): 2024, 30(2)