Behavior recognition in infrared video based on global bilinear attention
欧阳楠楠 1, 况立群 2, 谢剑斌 1, 韩慧妍 2, 曹亚明 2, 王飞 2
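Author information
- 1. School of Computer Science and Technology, North University of China, Taiyuan, Shanxi 030051
- 2. School of Computer Science and Technology, North University of China, Taiyuan, Shanxi 030051; Shanxi Key Laboratory of Machine Vision and Virtual Reality, Taiyuan, Shanxi 030051; Shanxi Province's Vision Information Processing and Intelligent Robot Engineering Research Center, Taiyuan, Shanxi 030051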
Abstract
Infrared video lacks texture detail, which makes it difficult to balance computational complexity and recognition accuracy in human behavior recognition. To address this problem, this paper proposes an infrared video behavior recognition method based on global bilinear attention. First, to process human behavior in infrared video efficiently, a joint extraction module based on a two-stage detection network is designed to obtain human joint information, and the resulting 3D heat maps of the joints are used, as a novel choice, as the input features of the infrared video human behavior recognition network. Second, to further improve recognition accuracy while keeping computation lightweight, a 3D convolutional network with global bilinear attention is proposed, which strengthens the modeling capability of attention in both the spatial and channel dimensions and captures global structural information. Experimental results on the InfAR and IITR-IAR datasets demonstrate the effectiveness of the method for infrared video behavior recognition.
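As an illustration of the "3D heat map of joints" input mentioned in the abstract, the following is a minimal sketch of one common way to build such a volume: each detected joint is rendered as a 2D Gaussian in each frame, and the per-frame maps are stacked along time. The function name `joints_to_heatmap_volume`, the `(x, y, confidence)` joint layout, the Gaussian rendering, and the `sigma` parameter are all assumptions made for illustration; the abstract does not specify how the authors construct their heat maps.

```python
import numpy as np


def joints_to_heatmap_volume(joints, height, width, sigma=1.0):
    """Render joints as a stack of Gaussian heat maps (hypothetical helper).

    joints: array of shape (T, K, 3) holding (x, y, confidence) for each of
            K joints in each of T frames (assumed layout).
    Returns a float32 array of shape (K, T, height, width).
    """
    joints = np.asarray(joints, dtype=np.float32)
    num_frames, num_joints, _ = joints.shape

    # Pixel coordinate grids, broadcast to (height, width) inside the loop.
    ys = np.arange(height, dtype=np.float32)[:, None]
    xs = np.arange(width, dtype=np.float32)[None, :]

    volume = np.zeros((num_joints, num_frames, height, width), dtype=np.float32)
    for t in range(num_frames):
        for k in range(num_joints):
            x, y, conf = joints[t, k]
            if conf <= 0:
                continue  # skip joints that were not detected
            # Gaussian centred on the joint, scaled by detection confidence.
            gauss = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
            volume[k, t] = conf * gauss
    return volume


# Toy input: 2 frames, 1 joint moving one pixel to the right.
joints = np.array([[[3.0, 2.0, 0.9]], [[4.0, 2.0, 0.8]]])
vol = joints_to_heatmap_volume(joints, height=8, width=8)
print(vol.shape)
```

A volume of this shape can be fed to a 3D convolutional network with the joint index acting as the channel dimension, which is much smaller than a raw video tensor and does not depend on texture.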
Key words
infrared video/attention/joint points/behavior recognition
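Funding
National Natural Science Foundation of China (62272426)
National Natural Science Foundation of China (62106238)
Shanxi Province Science and Technology Major Special Project, "揭榜挂帅" (open bidding for selecting the best candidates) program (202201150401021)
Shanxi Province Science and Technology Achievements Transformation Guidance Special Project (202104021301055)
Research Project for Returned Overseas Scholars of Shanxi Province (2020-113)
Fundamental Research Program of Shanxi Province (202203021222027)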
Publication year
2024