Weakly supervised video anomaly detection with local-global temporal dependency
Weakly Supervised Video Anomaly Detection (WS-VAD) is of great significance to the field of intelligent security. Current WS-VAD methods face the following problems: existing methods focus on discriminating individual video snippets while ignoring the local and global temporal dependencies among snippets; the temporal structure of anomalous events is ignored when the loss function is designed; and anomalous videos contain a large amount of normal-snippet noise, which interferes with training convergence. To address these problems, a WS-VAD method based on a Local-Global Temporal Dependency (LGTD) network was proposed. In this method, the LGTD network used a Multi-scale Temporal Feature Fusion (MTFF) module to capture the local temporal correlations of snippets over different time spans. At the same time, a Multi-Head Self-Attention (MHSA) module was employed to integrate the information of all snippets in a video and model the temporal correlation of the whole sequence. After that, a Squeeze-and-Excitation (SE) module was used to re-weight the internal features of each snippet, so that the temporal and spatial features of the snippets were captured more accurately and detection performance was significantly improved. In addition, the existing loss function was improved by introducing a complementary K-maxmin inner-bag loss and a Top-K outer-bag loss, which increase the probability that anomalous snippets are selected from anomalous videos for optimization training. Experimental results show that the proposed method achieves average Area Under the Curve (AUC) values of 83.18% and 95.41% on the UCF-Crime and ShanghaiTech datasets respectively, improvements of 0.08 and 7.21 percentage points over the Collaborative Normality Learning (CNL) method, demonstrating that the proposed method can effectively improve detection performance.
Key words: Video Anomaly Detection (VAD); weakly supervised learning; Multiple Instance Learning (MIL); multi-scale feature fusion; Multi-Head Self-Attention (MHSA) mechanism
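To make the pipeline concrete, the following is a minimal PyTorch sketch of the LGTD architecture as described above: an MTFF module using parallel 1-D convolutions with different kernel sizes to capture local temporal correlations over different time spans, an MHSA module for global dependency across all snippets, and an SE module that re-weights snippet feature channels before scoring. The feature dimension (2048, typical of I3D snippet features), the kernel sizes, head count, and reduction ratio are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

class MTFF(nn.Module):
    # Multi-scale Temporal Feature Fusion: parallel 1-D convolutions with
    # different kernel sizes model local temporal correlations over
    # different time spans; the branches are then fused by a 1x1 conv.
    def __init__(self, dim, kernels=(1, 3, 5)):   # kernel sizes assumed
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(dim, dim, k, padding=k // 2) for k in kernels)
        self.fuse = nn.Conv1d(dim * len(kernels), dim, 1)

    def forward(self, x):                # x: (B, T, D) snippet features
        x = x.transpose(1, 2)            # (B, D, T) for temporal convolution
        x = torch.cat([b(x) for b in self.branches], dim=1)
        return self.fuse(x).transpose(1, 2)   # back to (B, T, D)

class SE(nn.Module):
    # Squeeze-and-Excitation: re-weights the feature channels of each snippet.
    def __init__(self, dim, reduction=16):        # reduction ratio assumed
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(dim, dim // reduction), nn.ReLU(),
            nn.Linear(dim // reduction, dim), nn.Sigmoid())

    def forward(self, x):                # x: (B, T, D)
        return x * self.gate(x)          # per-snippet channel gating

class LGTD(nn.Module):
    def __init__(self, dim=2048, heads=8):        # dim/heads assumed
        super().__init__()
        self.mtff = MTFF(dim)                     # local temporal dependency
        self.mhsa = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.se = SE(dim)                         # channel re-weighting
        self.scorer = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())

    def forward(self, x):                # x: (B, T, D) I3D snippet features
        x = self.mtff(x)                 # local correlation across time spans
        x = x + self.mhsa(x, x, x)[0]    # global dependency over all snippets
        x = self.se(x)
        return self.scorer(x).squeeze(-1)   # (B, T) per-snippet anomaly scores

# Example: scores for 2 videos of 32 snippets each
scores = LGTD()(torch.randn(2, 32, 2048))   # -> shape (2, 32)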
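The improved objective can be read as a Top-K outer-bag ranking term between anomalous and normal videos, plus a complementary K-maxmin inner-bag term on the anomalous video that suppresses normal-snippet noise. The sketch below is one plausible formulation under that reading; the hinge margin, the value of k, the use of binary cross-entropy, and the way the two terms combine are assumptions rather than the paper's exact definitions.

import torch
import torch.nn.functional as F

def topk_outer_bag_loss(abn_scores, nor_scores, k=3, margin=1.0):
    # Rank the mean of the k highest anomalous-bag scores above the mean
    # of the k highest normal-bag scores by a margin (hinge loss); using
    # top-k instead of a single max raises the probability that true
    # anomalous snippets are selected for optimization.
    # abn_scores, nor_scores: (T,) per-snippet sigmoid scores of one video each.
    abn_top = abn_scores.topk(k).values.mean()
    nor_top = nor_scores.topk(k).values.mean()
    return torch.clamp(margin - abn_top + nor_top, min=0.0)

def k_maxmin_inner_bag_loss(abn_scores, k=3):
    # Complementary selection inside the anomalous bag: push the k highest
    # scores toward 1 (likely anomalous snippets) and the k lowest toward 0
    # (likely normal background), reducing the influence of normal-snippet
    # noise in anomalous videos.
    k_max = abn_scores.topk(k).values
    k_min = abn_scores.topk(k, largest=False).values
    return (F.binary_cross_entropy(k_max, torch.ones_like(k_max))
            + F.binary_cross_entropy(k_min, torch.zeros_like(k_min)))

# Example usage: combine the two terms for one anomalous/normal video pair
abn = torch.rand(32)   # scores of an anomalous video's snippets
nor = torch.rand(32)   # scores of a normal video's snippets
loss = topk_outer_bag_loss(abn, nor) + k_maxmin_inner_bag_loss(abn)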