Journal of Systems Engineering and Electronics, 2024, Vol. 35, Issue 6: 1337-1356. DOI: 10.23919/JSEE.2022.000155

A survey of fine-grained visual categorization based on deep learning

XIE Yuxiang (1), GONG Quanzhi (1), LUAN Xidao (2), YAN Jie (1), ZHANG Jiahui (1)

Author information

  • 1. College of System Engineering, National University of Defense Technology, Changsha 410000, China
  • 2. College of Computer Engineering and Applied Mathematics, Changsha University, Changsha 410003, China

Abstract

Deep learning has achieved excellent results in various computer vision tasks, especially in fine-grained visual categorization. Fine-grained visual categorization aims to distinguish the subordinate categories of the label-level categories. Because of high intra-class variance and high inter-class similarity, it is extremely challenging. This paper first briefly introduces and analyzes the related public datasets. After that, some of the latest methods are reviewed. Based on the feature types, the feature processing methods, and the overall structure used in the model, we divide them into three types: methods based on a general convolutional neural network (CNN) and strong supervision of parts, methods based on single feature processing, and methods based on multiple feature processing. Most methods of the first type have a relatively simple structure, as they are the result of the initial research. The methods of the other two types include models with special structures and training processes that help to obtain discriminative features. We conduct a specific analysis of several methods with high accuracy on public datasets. In addition, we argue that future research should focus on reducing the demand of existing methods for large amounts of data and computing power. In terms of technology, extracting subtle feature information with the burgeoning vision transformer (ViT) network is also an important research direction.

Key words

deep learning; fine-grained visual categorization; convolutional neural network (CNN); visual attention

Publication year

2024
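Journal of Systems Engineering and Electronics
Defense Technology Academy of China Aerospace Science and Industry Corporation; Chinese Society of Astronautics; Systems Engineering Society of China; China Simulation Federation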

CSTPCD
Impact factor: 0.64
ISSN: 1004-4132