遥感学报, 2024, Vol. 28, Issue 7: 1789-1801. DOI: 10.11834/jrs.20243249

从光学到SAR:基于多级跨模态对齐的SAR图像舰船检测算法

From optical to SAR: A SAR ship detection algorithm based on multi-level cross-modality alignment

He Jiayue (何佳月) 1, Su Nan (宿南) 1, Xu Cong'an (徐从安) 2, Yin Lu (尹璐) 3, Liao Yanping (廖艳苹) 1, Yan Yiming (闫奕名) 1

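Author information

  • 1. College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001
  • 2. Institute of Information Fusion, Naval Aviation University, Yantai 264001
  • 3. Beijing Institute of Remote Sensing Information, Beijing 100192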

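Abstract (translated from the Chinese)

Synthetic aperture radar (SAR) ship detection has been a research hotspot in recent years. However, unlike optical imaging, SAR imaging produces unintuitive feature representations. In addition, because SAR image data are scarce, existing methods that rely on large numbers of labeled SAR images may struggle to achieve good detection performance. To address these problems, this paper proposes MCMA-Net (Multi-level Cross-Modality Alignment Network), a SAR ship detection algorithm based on multi-level cross-modality alignment, which enhances the feature representation of SAR images by transferring the rich knowledge of the optical modality to the SAR modality. First, the algorithm introduces a feature interaction network based on neighborhood-global attention, NGAN (Neighborhood-Global Attention Network): a neighborhood attention mechanism performs local interaction on the shallow features of the backbone, and a global self-attention mechanism performs global context interaction on the deep features. This improves the encoding of local features while retaining global context modeling, lets the network attend to the appropriate information at each level, and thereby facilitates the subsequent multi-level modality alignment. Second, the paper introduces a multi-level modality alignment module, MLMA (Multi-level Modality Alignment), which aligns the features of the two modalities in their different latent spaces, proceeding from the local level to the global level and then to the instance level. This helps the model learn modality-invariant features, narrows the modality gap between optical and SAR images, and transfers knowledge from the optical modality to the SAR modality. Extensive experiments show that the algorithm outperforms current detection algorithms and achieves the best experimental results.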

Abstract

In recent years, interest in Synthetic Aperture Radar (SAR) ship detection has grown considerably, and SAR's distinctive strengths make it pivotal in many fields of research. However, the inherent characteristics of SAR images present a range of challenges. For instance, in contrast to optical images, SAR images have counterintuitive feature representations. Additionally, because SAR image data are scarce, existing methods that depend on a large number of annotated SAR images may struggle to achieve satisfactory results. How to train a high-performance SAR ship detection network effectively with a limited quantity of SAR images therefore needs investigation. Given the inherent limitations of single-modality SAR detection algorithms, other modalities that can assist the SAR modality are needed. In SAR image target detection, optical images can serve as a supplementary data source: a knowledge-rich model can be developed by training on a large volume of optical data together with SAR data. Hence, training approaches that make effective use of images from both the SAR and optical modalities should be explored. To address these challenges, this paper proposes MCMA-Net, a SAR ship detection algorithm based on multilevel cross-modality alignment. MCMA-Net enriches SAR feature representation by incorporating valuable knowledge from the optical modality. First, we propose a neighborhood-global attention-based feature interaction network (NGAN), which applies a neighborhood attention mechanism for local interaction among low-level features and a global self-attention mechanism to capture global context from high-level features. While retaining the ability to model global context, NGAN improves the encoding of local features, enables the network to focus on the appropriate information at each level, and facilitates the subsequent multilevel modality alignment. Second, we propose a multilevel modality alignment module (MLMA), which aligns features in the different hidden spaces of the two modalities at three levels (local, global, and instance). MLMA helps the model acquire modality-invariant features, bridging the modality gap and enabling knowledge transfer from the optical modality. Valuable information from the optical modality can compensate for certain deficiencies in SAR images. With these two modules, MCMA-Net incorporates the strengths of optical information while retaining SAR's inherent advantages, improving performance on SAR detection tasks. The algorithm outperforms current detection algorithms. Notably, on both public SAR image datasets and our own SAR image dataset, MCMA-Net consistently achieves the best detection results, which indicates stable performance and robustness. The visualization results indicate that MCMA-Net achieves excellent detection capability in complex scenarios. The ablation experiments show that, compared with the baseline model, the algorithm achieves a 2.7% increase in mAP on the SSDD dataset. The experimental results consistently validate the design of MCMA-Net.
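
The difference between the two attention types that NGAN combines can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no architectural details, so the function names, the window size `k`, the use of identity query/key/value projections (no learned weights), and the clipping of windows at image borders are all assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_self_attention(feat):
    # feat: (H, W, C). Every position attends to every other position.
    # Assumption: identity Q/K/V projections, single head.
    h, w, c = feat.shape
    tokens = feat.reshape(h * w, c)
    scores = tokens @ tokens.T / np.sqrt(c)      # (HW, HW) similarity matrix
    out = softmax(scores, axis=-1) @ tokens      # weighted sum over all positions
    return out.reshape(h, w, c)

def neighborhood_attention(feat, k=3):
    # feat: (H, W, C). Each position attends only to the k x k window
    # centred on it. Assumption: windows are clipped at the borders.
    h, w, c = feat.shape
    r = k // 2
    out = np.empty_like(feat)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - r), min(h, i + r + 1)
            j0, j1 = max(0, j - r), min(w, j + r + 1)
            neigh = feat[i0:i1, j0:j1].reshape(-1, c)   # local keys/values
            scores = neigh @ feat[i, j] / np.sqrt(c)    # query = centre pixel
            out[i, j] = softmax(scores) @ neigh
    return out

# Hypothetical feature maps: a larger "shallow" map and a smaller "deep" map.
rng = np.random.default_rng(0)
shallow = rng.standard_normal((8, 8, 16))
deep = rng.standard_normal((4, 4, 32))

local_out = neighborhood_attention(shallow, k=3)   # local interaction
global_out = global_self_attention(deep)           # global context interaction
```

In this sketch the cost of global attention grows with the square of the number of positions, while the neighborhood variant grows linearly for a fixed window, which is one plausible reason to reserve global attention for the smaller deep feature maps.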

Key words

remote sensing / SAR / target detection / cross-modality / feature alignment / attention mechanism

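Funding

National Natural Science Foundation of China (62271159)

National Natural Science Foundation of China (62071136)

National Natural Science Foundation of China (62002083)

National Natural Science Foundation of China (61971153)

Excellent Youth Foundation of Heilongjiang Province (YQ2022F002)

Heilongjiang Postdoctoral Fund (LBH-Q20085)

Heilongjiang Postdoctoral Fund (LBH-Z20051)

Fundamental Research Funds for the Central Universities (3072022QBZ0805)

Fundamental Research Funds for the Central Universities (3072021CFT0801)

Fundamental Research Funds for the Central Universities (3072022CF0808)

High-Resolution Earth Observation (Gaofen) Special Project: Demonstration Project for the Industrialization of National Security Monitoring and Comprehensive Services in the China-Russia Border Region (72-Y50G11-9001-22/23)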

Publication year

2024
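遥感学报
Environmental Remote Sensing Branch of the Geographical Society of China; Institute of Remote Sensing Applications, Chinese Academy of Sciences

CSTPCD; 北大核心 (Peking University Core Journals)
Impact factor: 2.921
ISSN: 1007-4619
Number of references: 6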