
Object Detection Based on Semantic Multi-scale Convolution and Attention Mechanism

Effectively fusing multi-scale features remains a challenge in object detection, where the scale variation of objects is one of the hardest problems. This paper proposes a component that fuses multi-scale features at a fine-grained level, called semantic multi-scale feature fusion (SMSFF). First, multi-scale convolution kernels generate the multi-scale semantic information required by the object detection network, which is then fully fused by a novel multi-scale feature fusion method. Finally, the cross-channel weighted attention of SE (squeeze-and-excitation) re-calibrates the multi-scale features, which strengthens the multi-scale information of the network and thereby improves its feature representation ability. As a result, SMSFF improves detection accuracy and makes the model more robust to object instances of different scales. With the YOLOX object detector, the proposed method reaches an mAP of 48.6% on the COCO 2017 test benchmark and 87.6% on Pascal VOC.
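The SE re-calibration step mentioned in the abstract can be illustrated with a minimal sketch. It follows the generic squeeze-and-excitation recipe (global average pooling, two fully connected layers, sigmoid gating) and is not the authors' SMSFF implementation, whose layer sizes and fusion details the abstract does not give. The function name `se_recalibrate`, the weight matrices `w1` and `w2`, and the omission of bias terms are assumptions made for illustration only.

```python
import math


def se_recalibrate(features, w1, w2):
    """Generic SE channel re-calibration (illustrative, biases omitted).

    features: list of C channels, each an H x W nested list of floats.
    w1: hypothetical reduction weights, shape (C // r) x C.
    w2: hypothetical expansion weights, shape C x (C // r).
    Returns the re-scaled features and the per-channel gates.
    """
    # Squeeze: global average pooling yields one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features
    ]
    # Excitation, part 1: fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, part 2: fully connected layer followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(features, gates)]
    return scaled, gates


# Toy input: 4 channels of size 2 x 2, standing in for fused multi-scale features.
features = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
w1 = [[0.1, -0.2, 0.3, 0.4], [0.5, 0.1, -0.3, 0.2]]  # reduction ratio r = 2
w2 = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1], [0.0, 0.6]]
scaled, gates = se_recalibrate(features, w1, w2)
print(gates)
```

Each gate lies strictly between 0 and 1, so a channel is attenuated in proportion to how little the excitation branch weights it; in a trained network `w1` and `w2` would be learned rather than fixed by hand.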

Keywords: object detection; multi-scale feature fusion; attention mechanism; computer vision; image classification

张浩、王慧薷、王传旭


College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China


National Natural Science Foundation of China (Grant No. 61672305)

2024

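Journal of Qingdao University of Science and Technology (Natural Science Edition)
Qingdao University of Science and Technology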


CSTPCD
Impact factor: 0.297
ISSN: 1672-6987
Year, volume (issue): 2024, 45(3)