Transductive Zero-Shot Image Classification Based on Self-Supervised Enhanced Features


Abstract: The visual features of images play a crucial role in zero-shot image classification. Although deep features extracted by networks such as VGG, GoogLeNet, and ResNet are widely used in image classification, their performance on zero-shot image classification is not ideal and leaves considerable room for improvement. In addition, because the training and test sets are disjoint in the zero-shot learning setting, the classification network inevitably suffers from domain shift. Therefore, a transductive zero-shot image classification framework based on self-supervised enhanced features is proposed. First, pseudo-labels are constructed through an auxiliary task, and self-supervised learning is used to obtain self-supervised features of the images, which are then fused with unsupervised deep features. Next, the fused features are embedded into the semantic space for zero-shot image classification, yielding initial predicted labels for the unseen classes. Finally, the features and predicted labels of the unseen classes are used to iteratively optimize the visual-semantic mapping. The components of the framework are interchangeable; here the self-supervised network, backbone network, and dimensionality-reduction network are CFN, VGG16, and PCA, respectively. Experiments on the CUB, SUN, and AwA2 datasets show that the proposed network enhances the discriminative power of the features and performs well on zero-shot image classification.
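The three stages described in the abstract (feature fusion, initial prediction in the semantic space, iterative refinement with pseudo-labels) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption for illustration, not the authors' implementation: fusion is taken to be L2-normalised concatenation, the visual-semantic mapping is taken to be closed-form ridge regression, classification uses cosine similarity to class attribute vectors, and the refinement step refits on seen data plus pseudo-labelled unseen data. The function names `fuse_features`, `fit_ridge`, `predict`, and `transductive_zsl` are hypothetical, and the CFN, VGG16, and PCA stages are assumed to have already produced the input feature matrices.

```python
import numpy as np

def fuse_features(deep_feats, ssl_feats):
    """Assumed fusion: L2-normalise each feature set, then concatenate."""
    def l2n(x):
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)
    return np.concatenate([l2n(deep_feats), l2n(ssl_feats)], axis=1)

def fit_ridge(X, S, lam=1.0):
    """Assumed mapping: closed-form ridge regression from features X to semantics S."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ S)

def predict(X, W, prototypes):
    """Label each sample with the class prototype of highest cosine similarity."""
    P = X @ W
    P = P / (np.linalg.norm(P, axis=1, keepdims=True) + 1e-12)
    A = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-12)
    return np.argmax(P @ A.T, axis=1)

def transductive_zsl(X_seen, y_seen, A_seen, X_unseen, A_unseen, n_iter=5, lam=1.0):
    """Fit on seen classes, then refine the mapping with unseen pseudo-labels."""
    # Inductive step: learn the mapping from labelled seen-class data only.
    W = fit_ridge(X_seen, A_seen[y_seen], lam)
    pseudo = predict(X_unseen, W, A_unseen)
    for _ in range(n_iter):
        # Transductive step: refit using unseen features and their current labels.
        X_all = np.vstack([X_seen, X_unseen])
        S_all = np.vstack([A_seen[y_seen], A_unseen[pseudo]])
        W = fit_ridge(X_all, S_all, lam)
        new_pseudo = predict(X_unseen, W, A_unseen)
        if np.array_equal(new_pseudo, pseudo):
            break  # pseudo-labels have stabilised
        pseudo = new_pseudo
    return pseudo, W
```

In this sketch `A_seen` and `A_unseen` hold one attribute vector per class, `y_seen` holds integer class indices into `A_seen`, and the returned `pseudo` indexes into `A_unseen`. The early exit when the pseudo-labels stop changing is a convenience of the sketch; the abstract does not state a stopping rule.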

Keywords:

zero-shot learning; self-supervised learning; transductive; visual-semantic mapping; feature fusion; image classification

Authors:

WANG Haoyu, ZHANG Xinran, WANG Xuesong, CHENG Yuhu


Affiliation:

School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China


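Funding:

National Natural Science Foundation of China; National Natural Science Foundation of China; Natural Science Foundation of Jiangsu Province; Jiangsu Funding Program for Excellent Postdoctoral Talent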

Grant numbers:

62176259; 61976215; BK20221116; 2022ZB530

Publication year:

2024

DOI:

10.13195/j.kzyjc.2022.1317

Control and Decision (控制与决策)

Northeastern University

CSTPCD; Peking University Core Journals

Impact factor: 1.227

ISSN: 1001-0920

Year, volume (issue): 2024, 39(5)

Number of references: 23