Journal of Harbin Engineering University, 2024, Vol. 45, Issue 8: 1616-1623. DOI: 10.11990/jheu.202312026

Synthetic aperture radar image ship classification based on a ViT-CNN hybrid network

SHAO Ran 1, BI Xiaojun 2

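Author information

  • 1. College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, Heilongjiang, China; School of Electronic and Information Engineering, Harbin Vocational & Technical College, Harbin 150001, Heilongjiang, China
  • 2. Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of the Ministry of Education, Minzu University of China, Beijing 100081, China; School of Information Engineering, Minzu University of China, Beijing 100081, China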

Abstract

In recent years, the vision transformer (ViT) has achieved significant breakthroughs in image classification. However, it adapts poorly to synthetic aperture radar (SAR) image ship classification because it lacks the ability to capture multiscale and local features. To address this, this paper proposes a hybrid network model for SAR image ship classification. A staged downsampling network structure is designed so that the model can capture the multiscale features that ViT cannot. By incorporating convolutional structures into the three core modules of the ViT model, three modules are designed: convolutional token embedding, convolutional parameter-sharing attention, and a local feed-forward network. These enable the network to capture both global and local features of ship images and further strengthen its inductive bias and feature extraction ability. Experimental results show that, on two widely used SAR ship image datasets, OpenSARShip and FUSAR-Ship, the proposed model improves classification accuracy by 2.96% and 4.18%, respectively, over the best existing method, effectively improving SAR image ship classification performance.
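The staged downsampling idea in the abstract can be illustrated with a small calculation of how the token grid shrinks from stage to stage when each stage begins with a strided convolutional token embedding. This is only a sketch under stated assumptions: the abstract does not give the input size, the number of stages, or the kernel, stride, and padding values, so every number below (and the names `conv_output_size`, `staged_token_grid`, and `HYPOTHETICAL_STAGES`) is hypothetical and chosen for illustration.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial size after a strided convolution (standard output-size formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def staged_token_grid(image_size, stages):
    """Return a list of (side_length, token_count) tuples, one per stage.

    `stages` is a list of (kernel, stride, padding) tuples, one for each
    convolutional token embedding layer. Square inputs are assumed.
    """
    grids = []
    size = image_size
    for kernel, stride, padding in stages:
        size = conv_output_size(size, kernel, stride, padding)
        grids.append((size, size * size))
    return grids


# Hypothetical settings, NOT taken from the paper.
HYPOTHETICAL_STAGES = [(7, 4, 2), (3, 2, 1), (3, 2, 1)]

for stage, (side, tokens) in enumerate(
    staged_token_grid(224, HYPOTHETICAL_STAGES), start=1
):
    print(f"stage {stage}: {side}x{side} grid, {tokens} tokens")
```

With these assumed settings the grid goes from 56x56 to 28x28 to 14x14, so later stages attend over fewer, coarser tokens while earlier stages keep fine spatial detail. A plain ViT, by contrast, keeps a single token resolution throughout, which is the limitation the abstract refers to.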

Key words

vision transformer/convolutional neural network/synthetic aperture radar image/deep learning/parameter sharing/local feature/global feature/ship image


Funding

Major Program of the National Social Science Fund of China (20&ZD279)

Publication year

2024
Journal of Harbin Engineering University
Harbin Engineering University

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.655
ISSN: 1006-7043