Self-attention Guidance Based Crowd Localization and Counting
Most existing studies on crowd analysis are limited to counting and cannot provide the exact location of individuals. This paper proposes a self-attention guidance based crowd localization and counting network (SA-CLCN), which locates and counts crowds simultaneously. We frame the task as object detection and use the original point annotations of crowd datasets as supervision to train the network. The network ultimately predicts the center point coordinates of each head as well as the total number of people. Specifically, to cope with the spatial and positional variations of the crowd, the proposed method introduces a transformer that, together with a convolutional structure, forms a global-local feature extractor (GLFE). It establishes near-to-far dependencies between elements so that the global context and the local detail features of the crowd image can be extracted simultaneously. This paper then designs a pyramid feature fusion module (PFFM) that fuses the global and local information from high level to low level to obtain a multiscale feature representation. For the downstream tasks, candidate point offsets and confidence scores are predicted by a simple regression head and a classification head. In addition, the Hungarian algorithm is used to match the predicted point set with the labelled point set so that the losses can be computed. The proposed network avoids the errors and higher costs associated with traditional density maps or bounding box annotations. Extensive experiments on several crowd datasets show that the proposed method produces competitive results in both counting and localization.
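The matching step mentioned in the abstract can be illustrated with a minimal sketch. It assigns each labelled head point to exactly one predicted candidate point by minimizing a total cost with the Hungarian algorithm. The abstract does not state the cost function, so the one used here (weighted Euclidean distance minus confidence score) is an assumption, as are the names `hungarian`, `match_points`, and `dist_weight`; this is not the authors' implementation.

```python
import math


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list `assign` where assign[i] is the column matched to row i.
    Classic O(n^2 * m) potentials-based formulation of the Hungarian algorithm.
    """
    n, m = len(cost), len(cost[0])
    assert n <= m, "need at least as many columns (predictions) as rows (labels)"
    INF = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (m + 1)   # column potentials (1-based)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = free)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = INF
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:      # reached a free column: augmenting path found
                break
        while j0:               # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assign = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assign[p[j] - 1] = j - 1
    return assign


def match_points(gt_points, pred_points, pred_scores, dist_weight=0.05):
    """Match every labelled point to one predicted point.

    ASSUMED cost (not given in the abstract):
    dist_weight * Euclidean distance - confidence score.
    """
    cost = [
        [dist_weight * math.dist(g, p) - s for p, s in zip(pred_points, pred_scores)]
        for g in gt_points
    ]
    return hungarian(cost)


# Toy data: two labelled heads, four candidate predictions.
gt = [(10.0, 10.0), (50.0, 40.0)]
pred = [(12.0, 9.0), (48.0, 42.0), (80.0, 80.0), (11.0, 30.0)]
scores = [0.9, 0.8, 0.3, 0.2]
matches = match_points(gt, pred, scores)
print(matches)  # [0, 1]
```

In this sketch, matched predictions would serve as positives for the regression and classification losses, and the unmatched candidates (indices 2 and 3 here) would be treated as negatives. Matching on point sets directly is what lets the approach train on point annotations without density maps or bounding boxes.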