A Method for Private Video Action Recognition Based on Visual Audio Complementary and Semantic Clarity

Protecting privacy in video is a major societal challenge, and blurring is an important means of safeguarding people's privacy rights. Because blurred videos inherently lack visual information, mainstream video action recognition algorithms do not achieve satisfactory results on them. As a multimodal medium, however, a blurred video carries not only visual information but also rich audio information, and from the standpoint of human cognition audio is itself an important source of information. This paper therefore proposes a privacy video action recognition method based on multimodal fusion, which recognizes human actions without infringing on users' privacy. Specifically, an audio-visual feature fusion module integrates audio feature maps into the visual modality, fully fusing the deep semantic information of the two modalities. In addition, the model introduces clear video frames as labels that supervise the parameter updates of the action recognition network during training, providing clear semantic information to the private video action recognition network. Extensive ablation and comparison experiments on multiple private behavior datasets verify the effectiveness of the proposed method.

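Keywords: audio-visual feature fusion; semantic clarity; privacy preserving

Li Zechao, Fu Xiaode, Pan Liyong, Yan Rui, Tang Jinhui

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094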

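Funding: National Natural Science Foundation of China (U20B2064, U21B2043); Science and Technology Innovation 2030 Major Project on New Generation Artificial Intelligence (2022ZD0118802)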

2024

Acta Electronica Sinica
Chinese Institute of Electronics

Indexed in: CSTPCD; Peking University Core Journal list
Impact factor: 1.237
ISSN: 0372-2112
Year, volume (issue): 2024, 52(7)