Abstract
In computer vision, counting people who appear as dense small targets is of practical importance in indoor scenarios such as crowd behavior analysis, optimized resource allocation, and modern security. Existing counting methods for dense small targets suffer from missed detections caused by mutual occlusion, false detections caused by target density, and insufficient facial feature extraction because the targets are small. To address missed detections, false detections, and insufficient features for dense small targets in indoor scenes, we propose STO-YOLO, a people-counting model based on the YOLOv5 framework. The method first adds a detection module for dense small targets to the YOLOv5 backbone network to improve feature extraction, and then adds a small-target detection module to the feature-fusion Neck network to strengthen feature fusion, thereby reducing false detections of dense small targets far from the surveillance camera. Second, it introduces the OTA mechanism, which treats label assignment as an optimal transport problem and incorporates contextual information to reduce the number of ambiguous boxes, thereby effectively reducing the errors caused by target occlusion. We build a dataset in a real teaching scenario and validate the proposed method on it. The experimental results show that, compared with the SOTA method YOLOv5, STO-YOLO significantly improves both precision and recall; compared with the latest YOLOv8, it also improves metrics such as recall and mAP, demonstrating the effectiveness of the proposed STO-YOLO method.
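The idea of treating label assignment as an optimal transport problem can be illustrated with a minimal sketch. It is not the STO-YOLO implementation: the function names `sinkhorn` and `assign_labels`, the fixed per-target supply `k`, the regularization strength `eps`, and all cost values are assumptions made for illustration. Each ground-truth box supplies `k` positive labels, a background row supplies the remainder, every anchor demands exactly one label, and an entropy-regularized transport plan is computed with Sinkhorn-Knopp iterations.

```python
import math


def sinkhorn(cost, supply, demand, eps=0.1, iters=200):
    """Entropy-regularized optimal transport via Sinkhorn-Knopp iterations.

    cost[i][j] is the cost of sending one unit from supplier i to demander j.
    Returns the transport plan as a list of rows.
    """
    m, n = len(cost), len(cost[0])
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(n)] for i in range(m)]
    u = [1.0] * m
    v = [1.0] * n
    for _ in range(iters):
        # Rescale rows to match the supplies, then columns to match the demands.
        u = [supply[i] / sum(kernel[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = [demand[j] / sum(kernel[i][j] * u[i] for i in range(m)) for j in range(n)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(n)] for i in range(m)]


def assign_labels(gt_cost, bg_cost, k):
    """Assign each anchor to a ground-truth index, or -1 for background.

    gt_cost[i][j]: hypothetical matching cost between ground truth i and anchor j.
    bg_cost[j]:    hypothetical cost of labelling anchor j as background.
    k:             number of positive labels each ground truth supplies (assumed fixed).
    """
    num_gt, num_anchors = len(gt_cost), len(gt_cost[0])
    cost = [list(row) for row in gt_cost] + [list(bg_cost)]
    # Background supplies whatever the ground truths do not, so supply == demand.
    supply = [float(k)] * num_gt + [float(num_anchors - k * num_gt)]
    demand = [1.0] * num_anchors
    plan = sinkhorn(cost, supply, demand)
    labels = []
    for j in range(num_anchors):
        # Each anchor takes the label of the supplier that ships it the most mass.
        best = max(range(num_gt + 1), key=lambda i: plan[i][j])
        labels.append(best if best < num_gt else -1)
    return labels


# Illustrative costs only: two targets, six anchors, two positives per target.
example_gt_cost = [
    [0.1, 0.2, 0.9, 0.9, 0.9, 0.9],
    [0.9, 0.9, 0.9, 0.9, 0.2, 0.1],
]
example_bg_cost = [0.5] * 6
print(assign_labels(example_gt_cost, example_bg_cost, k=2))
```

Because the plan is solved jointly over all anchors, an anchor that lies between two occluding targets is given to whichever assignment lowers the total cost, rather than being decided by a per-anchor threshold.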