Few-Shot Image Classification Algorithm of Graph Neural Network Based on Swin Transformer
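Wang Kai (王凯) 1, Ren Jie (任劼) 1, Zhang Weichuan (章为川) 2
Author information
- 1. School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710048, Shaanxi, China
- 2. Institute for Integrated and Intelligent Systems, Griffith University, Brisbane 4702, Australia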
Abstract
In few-shot image classification, feature extraction modules based on convolutional neural networks struggle to capture long-range semantic information, and edge-feature similarity is typically measured with a single metric. To address both problems, this study proposes a few-shot image classification algorithm that uses a graph neural network based on Swin Transformer. First, a Swin Transformer network extracts image features, which are fed into the graph neural network as node features. Next, the edge-feature similarity measurement module is improved by adding an additional metric, forming a dual-metric module that computes the similarity between node features; the resulting similarity is fed into the graph neural network as edge features. Finally, the node and edge features are updated alternately to predict image class labels. On the Stanford Dogs, Stanford Cars, and CUB-200-2011 datasets, the proposed method reaches classification accuracies of 85.21%, 91.10%, and 91.08%, respectively, on the 5-way 1-shot task, which demonstrates its effectiveness for few-shot image classification.
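The pipeline summarized in the abstract (node features, dual-metric edge features, alternating updates) can be illustrated with a minimal sketch. The abstract does not say which two metrics the dual-metric module combines, nor how the updates are parameterized, so everything below is an assumption made for illustration: the names `dual_metric_edges`, `update_nodes`, and `classify_query` are hypothetical, cosine similarity and a Euclidean-distance score stand in for the two metrics, the mixing weight `alpha` is invented, and the fixed formulas replace what would be learned layers in a real model. Plain lists stand in for features that a Swin Transformer backbone would produce.

```python
import math

def cosine_similarity(u, v):
    # Angle-based similarity in [-1, 1]; the epsilon guards against zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def distance_similarity(u, v):
    # Distance-based similarity in (0, 1]: identical vectors score 1.
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return math.exp(-dist)

def dual_metric_edges(nodes, alpha=0.5):
    # ASSUMPTION: the two metrics are mixed with a fixed weight alpha.
    # The paper's module is presumably learned; this is only a stand-in.
    n = len(nodes)
    edges = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            cos01 = 0.5 * (cosine_similarity(nodes[i], nodes[j]) + 1.0)  # rescale to [0, 1]
            edges[i][j] = alpha * cos01 + (1.0 - alpha) * distance_similarity(nodes[i], nodes[j])
    return edges

def update_nodes(nodes, edges):
    # Each node moves halfway toward the edge-weighted mean of all nodes.
    updated = []
    for i, row in enumerate(edges):
        total = sum(row)
        dim = len(nodes[i])
        agg = [sum(row[j] * nodes[j][d] for j in range(len(nodes))) / total
               for d in range(dim)]
        updated.append([0.5 * nodes[i][d] + 0.5 * agg[d] for d in range(dim)])
    return updated

def classify_query(edges, support_labels, query_index):
    # Supports occupy indices 0..len(support_labels)-1; the query takes the
    # label of the support node it shares the strongest edge with.
    best = max(range(len(support_labels)), key=lambda j: edges[query_index][j])
    return support_labels[best]

# Toy 2-way 1-shot episode: two support nodes followed by one query node.
nodes = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
support_labels = ["class_a", "class_b"]
for _ in range(2):  # alternate edge and node updates
    edges = dual_metric_edges(nodes)
    nodes = update_nodes(nodes, edges)
edges = dual_metric_edges(nodes)
print(classify_query(edges, support_labels, query_index=2))
```

Combining an angle-based score with a distance-based one is one plausible reading of "dual metric": the first ignores feature magnitude while the second is sensitive to it, so the two can disagree on pairs that a single metric would treat alike.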
Key words
graph neural network/few-shot learning/image classification/Swin Transformer/dual metric learning
Funding
Natural Science Basic Research Program of Shaanxi Province (2022JM-394)
Publication year
2024