Object Detection Algorithm Based on Semantic Multi-scale Convolution and Attention Mechanism

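Abstract (translated from the Chinese): How to fuse multi-scale features effectively remains a challenge in object detection. This work proposes a component that fuses multi-scale features at a fine-grained level, called semantic multi-scale feature fusion (SMSFF). First, multi-scale convolution kernels generate the multi-scale semantic information that the detection network requires; a novel multi-scale feature fusion method then fuses it thoroughly. Finally, the cross-channel weighted attention of SE (squeeze-and-excitation) recalibrates the multi-scale features, which effectively strengthens the network's multi-scale information and in turn improves its feature representation ability. SMSFF can therefore effectively improve detection accuracy, and the model is more robust to object instances of different scales. With the YOLOX object detector, the proposed method achieves an mAP of 48.6% on the benchmark dataset COCO 2017 test and 87.6% on Pascal VOC.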

English title: Object Detection Based on Semantic Multi-scale Convolution and Attention Mechanism

English abstract: In object detection, the scale variation of objects is one of the most challenging problems, so fusing multi-scale features effectively is particularly important. This paper proposes a component that fuses multi-scale features at a fine-grained level, called SMSFF (semantic multi-scale feature fusion). First, multi-scale convolution kernels generate the multi-scale semantic information required by the object detection network, which is then fully fused using a novel multi-scale feature fusion method. In addition, the cross-channel weighted attention of SE (squeeze-and-excitation) is used to recalibrate the multi-scale features, which effectively strengthens the multi-scale information of the network and thereby improves its feature representation ability. Therefore, SMSFF can effectively improve detection accuracy, and the model is more robust to object instances of different scales. With the YOLOX object detector, the proposed method achieves an mAP of 48.6% on COCO 2017 test and 87.6% on Pascal VOC.
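The pipeline the abstract describes (multi-scale kernels, fusion, SE recalibration) can be sketched roughly as follows. This is a minimal NumPy illustration and not the authors' implementation: the abstract does not specify the fusion method, so plain channel stacking is used as a stand-in, the kernels and the two SE weight matrices are random placeholders for learned parameters, and all function names (`conv2d_same`, `multi_scale_branches`, `se_recalibrate`) are hypothetical.

```python
import numpy as np


def conv2d_same(x, kernel):
    """Naive 2D cross-correlation with zero padding; output size equals input size."""
    k = kernel.shape[0]  # assumes a square kernel with odd side length
    pad = k // 2
    xp = np.pad(x, pad)
    out = np.zeros(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * kernel)
    return out


def multi_scale_branches(x, kernel_sizes=(1, 3, 5), seed=0):
    """Apply one kernel per scale and stack the results as channels (C, H, W).

    Assumption: random kernels stand in for learned weights, and stacking is
    a placeholder for the paper's fusion step, which the abstract omits.
    """
    rng = np.random.default_rng(seed)
    branches = [conv2d_same(x, rng.standard_normal((k, k)) / (k * k))
                for k in kernel_sizes]
    return np.stack(branches)


def se_recalibrate(features, w1, w2):
    """Squeeze-and-excitation style channel reweighting of a (C, H, W) tensor."""
    z = features.mean(axis=(1, 2))                # squeeze: global average pool, shape (C,)
    hidden = np.maximum(w1 @ z, 0.0)              # excitation, first layer with ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # second layer with sigmoid, shape (C,)
    return features * scale[:, None, None], scale


# Usage: a 6x6 single-channel input, three scales, SE bottleneck of width 2.
x = np.random.default_rng(0).random((6, 6))
feats = multi_scale_branches(x)
rng = np.random.default_rng(1)
w1 = rng.standard_normal((2, feats.shape[0]))
w2 = rng.standard_normal((feats.shape[0], 2))
recalibrated, scale = se_recalibrate(feats, w1, w2)
print(recalibrated.shape, scale)
```

The SE step leaves the spatial layout untouched and only rescales each channel by a factor between 0 and 1, which is why it can act as a weighting over the scales produced by the different kernel sizes.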

English keywords:

object detection; multi-scale feature fusion; attention mechanism; computer vision; image classification

Authors:

Zhang Hao (张浩), Wang Huiru (王慧薷), Wang Chuanxu (王传旭)

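Author affiliation:

College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China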

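Keywords (Chinese):

object detection (目标检测); multi-scale feature fusion (多尺度特征融合); attention mechanism (注意力机制); computer vision (计算机视觉); image classification (图像分类)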

Funding:

National Natural Science Foundation of China

Grant number:

61672305

Publication year:

2024

DOI：

10.16351/j.1672-6987.2024.03.019

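Journal of Qingdao University of Science and Technology (Natural Science Edition)

Qingdao University of Science and Technology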

CSTPCD

Impact factor: 0.297

ISSN: 1672-6987

Year, volume (issue): 2024, 45(3)

Number of references: 18