Wearable Object Fast Detection Method Based on Fine-Grained Feature Extraction

Han Xiaowei (韩晓微)¹, Wu Haoming (吴浩铭)², Zhou Yuzhu (周育竹)², Xie Yinghong (谢英红)², Jia Xu (贾旭)³

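Author information

1. Institute of Innovative Science and Technology, Shenyang University, Shenyang, Liaoning 110041
2. School of Information Engineering, Shenyang University, Shenyang, Liaoning 110041
3. School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou, Liaoning 121001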

Abstract

Visual detection of wearable objects on the human body suffers from slow recognition, poor robustness to interference, and false and missed detections. These problems are caused by scale variation, lighting changes, partial occlusion and, in particular, the difficulty of distinguishing similar objects. To address them, a fast wearable object detection method based on fine-grained feature purification (fast fine-grained feature with vision transformer, F3ViT) is proposed. Skip connections are added to the CBAM structure to obtain feature maps with both spatial and channel characteristics while retaining richer original information. A self-attention mechanism is fused with a convolutional neural network to improve the backbone network's perception of global information. A feature pyramid network suited to multi-scale object detection is designed to extract shallow positional information and deep semantic information at the same time, which substantially improves detection accuracy. Ablation experiments on the MS COCO dataset verify the influence of each module on the network, and comparative experiments demonstrate the effectiveness and competitiveness of the proposed method. On the MS COCO 2017 dataset, the method reaches an AP50 of 60.5 and an AP of 35.0, with a detection speed of 5.7 ms. Compared with YOLOv5s, at similar accuracy, the detection speed is 18.6% higher, the computing power requirement is 33.3% lower, and the number of parameters is 16.7% lower. On a high-altitude safety belt dataset, the method reaches an AP of 62.5, outperforming mainstream deep learning object detection methods.
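The abstract states that skip connections are added to the CBAM structure but does not say where they are placed. The following is a minimal NumPy sketch of one plausible reading, in which the input feature map is added back to the output of the standard channel-then-spatial CBAM refinement. The function names, the weight shapes, the reduction ratio, the 7x7 spatial kernel, and the placement of the skip connection are all assumptions made for illustration; this is not the authors' F3ViT implementation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x, w1, w2):
    """Per-channel weights in (0, 1) for a feature map x of shape (C, H, W).

    w1 (C//r, C) and w2 (C, C//r) form the shared two-layer MLP that
    standard CBAM applies to the average-pooled and max-pooled descriptors.
    """
    avg = x.mean(axis=(1, 2))  # (C,)
    mx = x.max(axis=(1, 2))    # (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    return sigmoid(mlp(avg) + mlp(mx))


def spatial_attention(x, kernel):
    """Per-pixel weights in (0, 1); kernel has shape (2, k, k) with odd k."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    # Same-padded cross-correlation, accumulated one kernel offset at a time.
    for i in range(k):
        for j in range(k):
            window = padded[:, i:i + h, j:j + w]
            out += np.einsum("c,chw->hw", kernel[:, i, j], window)
    return sigmoid(out)


def cbam_with_skip(x, w1, w2, kernel):
    """CBAM refinement followed by an additive skip connection.

    ASSUMPTION: the skip wraps the whole block (output = x + refined).
    The source does not specify the placement.
    """
    refined = x * channel_attention(x, w1, w2)[:, None, None]
    refined = refined * spatial_attention(refined, kernel)[None, :, :]
    return x + refined
```

With this placement the output keeps the shape of the input, and the unmodified input always contributes to the result, which is one way to read the abstract's remark about retaining richer original information. In the degenerate case where all weights are zero, both attention maps equal 0.5 everywhere, so the block returns 1.25 times its input.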

Key words

deep learning/machine vision/attention mechanism/fine-grained object detection/wearable object detection

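Funding

General Project of the Education Department of Liaoning Province (LJKMZ20221827)

Liaoning Province Applied Basic Research Program (2022JH2)

Liaoning Province Applied Basic Research Program (101300279)

Liaoning Province Doctoral Research Start-up Fund Program (2020-BS-263)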

Publication year

2024

Journal of Shenyang University (Natural Science)

Shenyang University

CSTPCD

Impact factor: 0.475

ISSN: 2095-5456
