高技术通讯 (Chinese High Technology Letters), 2024, Vol. 34, Issue 12: 1266-1278. DOI: 10.3772/j.issn.1002-0470.2024.12.003

多视角解耦增强整合的细粒度分类算法

Multi-perspective decoupling enhancement and integration for fine-grained classification

孟月波 (Meng Yuebo), 王博 (Wang Bo), 刘光辉 (Liu Guanghui)

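Author information

  • 1. School of Information and Control Engineering, Xi'an University of Architecture and Technology (西安建筑科技大学), Xi'an 710055; Xi'an Key Laboratory of Intelligent Technology for Building Manufacturing (西安市建筑制造智能化技术重点实验室), Xi'an 710055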

Abstract

To address the marked increase in intra-class variation caused by external factors such as background, lighting conditions, sample pose, and shooting angle in fine-grained image classification, this paper proposes a fine-grained classification algorithm based on multi-perspective decoupling, enhancement, and integration. First, to reduce interference from external factors in images, a multi-perspective attention (MPA) module is designed. The module decomposes the model into several perspectives and forces each perspective to attend to a different scale, thereby decoupling the interfering factors; self-attention modeling of the features then guides each perspective to mine key features further. Second, a progressive dynamic weighted fusion (PDWF) strategy is proposed to integrate the decoupled multi-perspective information effectively. The strategy dynamically adjusts the fusion coefficients according to the channel and spatial relationships obtained under different perspectives, achieving high-order fusion of multi-scale information. Finally, a progressive training method is adopted to promote interaction among perspectives and to further capture and integrate the complementary semantic information of multi-scale features. Experiments on three public datasets, CUB-200-2011, Stanford-Cars, and FGVC-Aircraft, show that the proposed method achieves classification accuracies of 90.5%, 95.5%, and 94.2%, respectively, outperforming current mainstream methods for fine-grained image classification.
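The idea of adjusting fusion coefficients dynamically from channel and spatial relationships can be illustrated with a small sketch. The abstract does not give the PDWF formulas, so everything below is an assumption made for illustration: the function names, the scoring rule (peak channel mean plus peak spatial mean), and the softmax normalisation are hypothetical, and the progressive (stage-by-stage) part of the strategy is omitted. Each perspective's feature map is represented as a plain C x N nested list (channels by spatial positions).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def perspective_score(feat):
    """Hypothetical scalar score for one perspective's C x N feature map.

    Assumption (not from the paper): combine the strongest channel-wise
    mean with the strongest position-wise mean.
    """
    channel_desc = [sum(row) / len(row) for row in feat]        # mean over positions, per channel
    spatial_desc = [sum(col) / len(col) for col in zip(*feat)]  # mean over channels, per position
    return max(channel_desc) + max(spatial_desc)


def dynamic_weighted_fusion(feats):
    """Fuse same-shaped feature maps with input-dependent coefficients.

    Returns (fused C x N map, list of fusion coefficients that sum to 1).
    """
    weights = softmax([perspective_score(f) for f in feats])
    n_channels, n_positions = len(feats[0]), len(feats[0][0])
    fused = [
        [sum(w * f[c][n] for w, f in zip(weights, feats)) for n in range(n_positions)]
        for c in range(n_channels)
    ]
    return fused, weights
```

Calling `dynamic_weighted_fusion` on two perspectives with equal scores yields equal coefficients, while a perspective with a stronger response receives a larger coefficient. The point of the sketch is only that the coefficients depend on the input features rather than being fixed hyperparameters.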

Key words

fine-grained / multi-perspective attention (MPA) / progressive dynamic weighted fusion (PDWF) / image classification

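Publication year

2024
高技术通讯 (Chinese High Technology Letters)
中国科学技术信息研究所 (Institute of Scientific and Technical Information of China)

CSTPCD / 北大核心 (Peking University Core Journals)
Impact factor: 0.19
ISSN: 1002-0470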