Few-Shot Image Classification Algorithm of Graph Neural Network Based on Swin Transformer

Abstract: In few-shot image classification, feature extraction modules based on convolutional neural networks struggle to capture long-range semantic information, and edge-feature similarity is usually measured with a single metric. To address these two problems, this paper proposes a few-shot image classification algorithm that uses a graph neural network based on Swin Transformer. First, a Swin Transformer network extracts image features, which are fed to the graph neural network as node features. Next, the edge-feature similarity measurement module is improved by adding an extra metric, forming a dual-metric module that computes the similarity between node features; the resulting similarity is fed to the graph neural network as edge features. Finally, the node and edge features are updated alternately to predict the image class labels. On the Stanford Dogs, Stanford Cars, and CUB-200-2011 datasets, the proposed method reaches classification accuracies of 85.21%, 91.10%, and 91.08%, respectively, on the 5-way 1-shot task, which the authors report as a significant result for few-shot image classification.
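The pipeline in the abstract (node features, dual-metric edge features, alternating node and edge updates) can be illustrated with a small toy sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not name the two metrics, so cosine similarity and a distance-based similarity are used as stand-ins; the update rules have no learned parameters; and the hypothetical helper `make_toy_episode` fabricates synthetic vectors where a real system would use Swin Transformer features.

```python
import math
import random


def cosine_sim(u, v):
    """Cosine similarity in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def distance_sim(u, v):
    """Similarity in (0, 1] derived from the mean squared distance."""
    mean_sq = sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)
    return math.exp(-mean_sq)


def dual_metric_edges(nodes, alpha=0.5):
    """Edge features as a blend of two similarity measures (both ASSUMED)."""
    n = len(nodes)
    edges = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            cos01 = 0.5 * (cosine_sim(nodes[i], nodes[j]) + 1.0)  # rescale to [0, 1]
            dist = distance_sim(nodes[i], nodes[j])
            edges[i][j] = alpha * cos01 + (1.0 - alpha) * dist
    return edges


def update_nodes(nodes, edges):
    """Mix each node with the edge-weighted average of the other nodes."""
    n, dim = len(nodes), len(nodes[0])
    updated = []
    for i in range(n):
        total = sum(edges[i][j] for j in range(n) if j != i) + 1e-12
        agg = [
            sum(edges[i][j] * nodes[j][d] for j in range(n) if j != i) / total
            for d in range(dim)
        ]
        updated.append([0.5 * x + 0.5 * a for x, a in zip(nodes[i], agg)])
    return updated


def classify_query(support, support_labels, query, rounds=2, alpha=0.5):
    """Build a graph of support nodes plus one query node and label the query."""
    nodes = [list(f) for f in support] + [list(query)]
    edges = dual_metric_edges(nodes, alpha)
    for _ in range(rounds):
        nodes = update_nodes(nodes, edges)       # node update uses current edges
        edges = dual_metric_edges(nodes, alpha)  # edge update uses the new nodes
    q = len(nodes) - 1
    scores = {}
    for s, label in enumerate(support_labels):
        scores[label] = scores.get(label, 0.0) + edges[q][s]
    return max(scores, key=scores.get)


def make_toy_episode(query_class=3, n_way=5, dim=16, seed=0):
    """Hypothetical 5-way 1-shot episode built from noisy synthetic prototypes."""
    rng = random.Random(seed)

    def sample(c):
        proto = [1.0 if c * 3 <= d < c * 3 + 3 else 0.0 for d in range(dim)]
        return [x + rng.gauss(0.0, 0.05) for x in proto]

    support = [sample(c) for c in range(n_way)]
    return support, list(range(n_way)), sample(query_class)


support, labels, query = make_toy_episode(query_class=3)
print("predicted class:", classify_query(support, labels, query))
```

The query is labeled by the class whose support nodes it is most strongly connected to after the final edge update. The blend weight `alpha` and the number of `rounds` are illustrative knobs, not values taken from the paper.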

Keywords:

graph neural network; few-shot learning; image classification; Swin Transformer; dual metric learning

Authors:

Wang Kai (王凯), Ren Jie (任劼), Zhang Weichuan (章为川)

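Author affiliations:

School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710048, Shaanxi, China

Institute for Integrated and Intelligent Systems, Griffith University, Brisbane 4702, Australia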

Funding:

Natural Science Basic Research Program of Shaanxi Province

Project number:

2022JM-394

Publication year:

2024

DOI:

10.3788/LOP231596

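Journal: Laser & Optoelectronics Progress (激光与光电子学进展)

Sponsor: Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 1.153

ISSN: 1006-4125

Year, volume (issue): 2024, 61(12)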