A crowd counting network based on multi-scale pyramid Transformer

To address the low counting accuracy in dense crowd scenes caused by complex backgrounds and large variations in target scale, this paper proposes a crowd counting network based on a multi-scale pyramid Transformer (MSPT-Net). In the feature extraction stage, a pyramid Transformer backbone built on depthwise separable self-attention is designed to capture both local and global image information, which mitigates the loss of counting accuracy caused by complex backgrounds. A feature pyramid fusion module and a regression head with multi-scale receptive fields are designed to fuse shallow detail features with deep semantic features efficiently, strengthening the network's ability to capture targets of different scales. The model is trained with deep supervision and validated on three public datasets. Experimental results show that, under both fully supervised and weakly supervised learning strategies, MSPT-Net achieves higher counting accuracy than mainstream crowd counting methods on dense crowd images with complex backgrounds and large scale variations, while using fewer parameters and less computation.
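
The abstract does not state the loss function, so the following is only a minimal sketch of the two general ideas it names: counting by summing a density map, and deep supervision, where predictions at several scales each contribute to the loss. The function names, the per-scale mean squared error, the 2x sum pooling of the ground truth, and the weights are all assumptions made for illustration, not the MSPT-Net implementation.

```python
from typing import List, Sequence

Map = List[List[float]]  # a 2-D density map stored as rows of floats


def count_from_density(density: Map) -> float:
    """The estimated crowd count is the sum of all density values."""
    return sum(sum(row) for row in density)


def sum_pool_2x(density: Map) -> Map:
    """Halve the resolution by summing 2x2 blocks (assumes even height and width).

    Summing instead of averaging keeps the total count unchanged.
    """
    h, w = len(density), len(density[0])
    return [
        [
            density[i][j] + density[i][j + 1]
            + density[i + 1][j] + density[i + 1][j + 1]
            for j in range(0, w, 2)
        ]
        for i in range(0, h, 2)
    ]


def mse(pred: Map, target: Map) -> float:
    """Mean squared error between two maps of the same shape."""
    n = len(target) * len(target[0])
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ) / n


def deep_supervision_loss(
    preds: Sequence[Map], target: Map, weights: Sequence[float]
) -> float:
    """Weighted sum of per-scale losses (hypothetical formulation).

    preds is ordered from finest to coarsest; each prediction is half the
    resolution of the previous one, and the ground truth is sum-pooled to match.
    """
    total = 0.0
    scaled_target = target
    for index, (pred, weight) in enumerate(zip(preds, weights)):
        total += weight * mse(pred, scaled_target)
        if index + 1 < len(preds):
            scaled_target = sum_pool_2x(scaled_target)
    return total
```

Pooling the ground truth by summation is one common way to supervise coarser outputs without changing the count they should predict; a real training pipeline would use tensors and a framework-provided loss rather than nested lists.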

Keywords: dense crowd; crowd counting; multi-scale; pyramid; Transformer; self-attention; density map; deep supervision

Authors: 张少乐, 雷涛, 王营博, 周强, 薛明园, 赵伟强

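School of Electrical and Control Engineering, Shaanxi University of Science and Technology, Xi'an 710021, Shaanxi, China

School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an 710021, Shaanxi, China

Shaanxi Joint Laboratory of Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an 710021, Shaanxi, China

Xi'an Branch, CETC Northwest Group Co., Ltd., Xi'an 710065, Shaanxi, China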

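Funding: National Natural Science Foundation of China (62271296); National Natural Science Foundation of China (62201334); Key Research and Development Program of Shaanxi Province (2021ZDLGY08-07); Shaanxi Science Fund for Distinguished Young Scholars (2021JC-47)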

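Journal: 智能系统学报 (CAAI Transactions on Intelligent Systems)
Sponsors: Chinese Association for Artificial Intelligence; Harbin Engineering University
Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.672
ISSN: 1673-4785
Year, volume (issue): 2024, 19(1)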