The core of fine-grained visual classification is locating discriminative regions in an image. Existing studies have strengthened the long-range dependencies of discriminative-region features by applying and improving the vision Transformer, but most methods are limited to enhancing attention on the most salient discriminative region, ignoring feature information that can be jointly extracted from sub-salient discriminative regions. This makes it difficult to distinguish categories with similar local features and lowers classification accuracy. Therefore, this paper proposes a joint discriminative region extraction method. Firstly, candidate discriminative regions of the feature map are partitioned at the front end of the self-attention module, guiding the model to extract discriminative-region features at different levels of saliency. Secondly, a bilinear fusion self-attention module extracts joint features from multiple discriminative regions of different saliency, yielding more comprehensive discriminative-region feature information. Experimental results show that the vision Transformer equipped with the joint discriminative region method achieves 92.7% accuracy on the CUB-200-2011 dataset, 2.4 percentage points higher than the standard vision Transformer, and surpasses current state-of-the-art fine-grained visual classification methods on the other benchmark datasets.
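The two steps summarized above can be illustrated with a minimal sketch. Note that the function names, the top-k region partition, and the outer-product fusion with signed square-root normalization are illustrative assumptions, not the paper's exact implementation: patch tokens are split into a salient and a sub-salient candidate region by their attention scores, each region is pooled, and the two pooled vectors are bilinearly fused into one joint descriptor.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def joint_region_features(tokens, attn_scores, k_salient=4, k_sub=4):
    """Illustrative sketch: partition patch tokens into salient and
    sub-salient candidate regions by attention score, pool each region,
    and bilinearly fuse the two pooled vectors (hypothetical helper,
    not the paper's exact module)."""
    order = np.argsort(attn_scores)[::-1]                 # most-attended first
    salient = tokens[order[:k_salient]].mean(axis=0)      # salient region pooling
    sub = tokens[order[k_salient:k_salient + k_sub]].mean(axis=0)  # sub-salient pooling
    fused = np.outer(salient, sub).reshape(-1)            # bilinear (outer-product) fusion
    fused = np.sign(fused) * np.sqrt(np.abs(fused))       # signed square-root normalization
    return fused / (np.linalg.norm(fused) + 1e-12)        # L2-normalize the joint descriptor

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))    # 16 patch tokens of dimension 8
scores = softmax(rng.standard_normal(16))
feat = joint_region_features(tokens, scores)
print(feat.shape)  # (64,)
```

The bilinear (outer-product) fusion captures pairwise interactions between the salient and sub-salient descriptors, which is one common way to combine complementary region features in fine-grained classification.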
Key words
fine-grained visual classification/discriminative region/vision Transformer/self-attention mechanism