Image Descriptor with Soft Assignment of Expanded Codewords for Image Retrieval
The image retrieval accuracy of the Vector of Locally Aggregated Descriptors (VLAD) can be improved by increasing the number of cluster centroids. However, this raises the descriptor dimension and the memory required to store image descriptors. This paper proposes IM-VLAD, which combines soft assignment based on expanded codewords with a two-layer hierarchical codebook. During codebook training, the K-means algorithm is used to train the 1st-layer visual codebook; the 2nd-layer codebook is then trained on the features assigned to each centroid of the 1st-layer codebook. When computing an image descriptor, an expanded-codebook-based soft assignment method allocates each local feature of the image to its neighboring codewords with corresponding weights, where new codewords generated in the 2nd-layer codebook expand the original codewords. The residual vector of each local feature is then computed and aggregated. IM-VLAD is thus the concatenation of all sub-vectors, each computed by aggregating the residual vectors of the local features from the 2nd layer to the 1st layer. Image retrieval experiments are performed on three public datasets. With the size of the 1st-layer codebook set to 64, IM-VLAD improves on VLAD from 0.526 to 0.628 in mean average precision on the Holidays dataset, from 3.17 to 3.50 in Recall@4 on the UKBench dataset, and from 0.513 to 0.604 in mean average precision on Holidays_Flickr1M. IM-VLAD also achieves higher image retrieval accuracy than other improved methods.
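
The soft-assignment residual aggregation described above can be illustrated with a minimal single-layer sketch. It is not the IM-VLAD method itself: the two-layer codebook and the generation of expanded codewords are omitted, and the weighting scheme (a Gaussian kernel over squared distances, normalized per feature) is an assumption made only for illustration. The names `soft_vlad`, `sq_dist`, `k`, and `sigma` are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def soft_vlad(features, codebook, k=2, sigma=1.0):
    """Aggregate weighted residuals of local features over their k nearest codewords.

    features: list of local feature vectors (lists of floats)
    codebook: list of codeword vectors with the same dimension
    k:        number of neighboring codewords per feature (assumed parameter)
    sigma:    bandwidth of the assumed Gaussian weighting
    """
    dim = len(codebook[0])
    # One sub-vector of accumulated residuals per codeword.
    sub_vectors = [[0.0] * dim for _ in codebook]

    for f in features:
        dists = [sq_dist(f, c) for c in codebook]
        # Indices of the k nearest codewords, closest first.
        nearest = sorted(range(len(codebook)), key=dists.__getitem__)[:k]
        d_min = dists[nearest[0]]
        # Assumed weighting: Gaussian kernel on distance, shifted by the
        # smallest distance for numerical stability, then normalized.
        weights = [math.exp(-(dists[j] - d_min) / (2 * sigma ** 2)) for j in nearest]
        total = sum(weights)
        for j, w in zip(nearest, weights):
            for t in range(dim):
                # Weighted residual between the feature and the codeword.
                sub_vectors[j][t] += (w / total) * (f[t] - codebook[j][t])

    # Concatenate the sub-vectors and L2-normalize the result.
    flat = [x for sv in sub_vectors for x in sv]
    norm = math.sqrt(sum(x * x for x in flat))
    return [x / norm for x in flat] if norm > 0 else flat
```

With `k=1` the sketch reduces to hard assignment, as in standard VLAD. With a codebook of 64 codewords and 128-dimensional local features, the returned descriptor has 64 × 128 = 8192 components.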