Laser & Optoelectronics Progress (激光与光电子学进展), 2024, Vol. 61, Issue 24: 364-373. DOI: 10.3788/LOP240958

多尺度与跨空间信息聚合网络的水下目标检测

Underwater Object Detection Using a Multiscale and Cross-Spatial Information Aggregation Network

Yang Jihai (杨继海) 1, Pei Xiaofang (裴晓芳) 2

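Author information

  • 1. School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu
  • 2. School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu; School of Electronic and Information Engineering, Wuxi University, Wuxi 214105, Jiangsu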

Abstract

An underwater object detection algorithm based on a multiscale and cross-spatial information aggregation network is proposed. First, a deformable layer aggregation module is used in the backbone network for feature extraction, improving the network's localization accuracy. Second, the Conv2former module is adopted to strengthen global information extraction in the neck, reducing missed detections caused by mutual occlusion among underwater targets. Finally, a multiscale attention parallel enhancement module is proposed: parallel convolution blocks extract deeper features, an efficient multiscale attention module filters out interference from the background and from image distortion, and multiple cross-level connections fuse low-level local features with high-level, strongly semantic information, thereby improving detection accuracy. Ablation experiments are conducted on the URPC dataset. Compared with the original model, the improved model's accuracy rate, recall rate, mean average precision (mAP)@0.5, and mAP@0.5:0.95 increase by 3.6, 2.6, 3.5, and 3.3 percentage points, respectively. Tests on the RUOD dataset across different scenarios indicate that the proposed model offers notable advantages over several current mainstream models.
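The abstract reports results at two IoU settings, mAP@0.5 and mAP@0.5:0.95, without defining them. The sketch below is not the authors' evaluation code; it only illustrates the intersection-over-union (IoU) test behind those labels, assuming the common COCO-style convention in which mAP@0.5:0.95 averages AP over the thresholds 0.50, 0.55, ..., 0.95. The function names `iou` and `thresholds_passed` and the example boxes are hypothetical.

```python
# Illustrative sketch only; assumes boxes in (x1, y1, x2, y2) pixel format.

def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95 (ten values).
IOU_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def thresholds_passed(pred_box, gt_box):
    """Thresholds at which this prediction would count as a true positive."""
    value = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if value >= t]


if __name__ == "__main__":
    gt = (0, 0, 10, 10)      # hypothetical ground-truth box
    pred = (2, 0, 12, 10)    # hypothetical prediction, shifted 2 px right
    print(iou(pred, gt))
    print(thresholds_passed(pred, gt))
```

Under this convention a prediction that overlaps its target only moderately still counts at the 0.5 threshold but fails the stricter ones, which is why mAP@0.5:0.95 is the more demanding of the two figures.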

Key words

global information / target occlusion / multiscale and cross-spatial information aggregation network / cross-level connection / underwater object detection


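Publication year

2024
Laser & Optoelectronics Progress (激光与光电子学进展)
Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences

Indexed in: CSTPCD, CSCD, Peking University Core
Impact factor: 1.153
ISSN: 1006-4125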