Facial Forgery Detection Based on Key Frames and Fused Spatial-Temporal Features

Deep learning-based facial forgery detection is commonly approached as a binary classification problem. The accuracy of a trained model depends not only on the quality and quantity of the training data but also on the training strategy and the network architecture design. Building on optical flow, this paper proposes a facial forgery detection method based on key frames and fused spatial-temporal features. First, a weighted optical flow energy threshold analysis selects the high-energy key frames in a video. The optical flow and LBP texture features of the key frames are then fused into feature maps with both temporal and spatial characteristics. After augmentation, the feature maps are fed into a CNN model for training. Evaluations on the FaceForensics++ and Celeb-df datasets show that the proposed method achieves detection accuracy superior or comparable to that of traditional algorithms. In cross-dataset experiments, the proposed method with the Efficientnet-V2 structure achieves the best cross-dataset detection performance on the FaceForensics++ dataset, with an accuracy of 90.1%. With the XceptionNet structure, its overall performance surpasses that of other methods, with accuracy above 80% in every case, demonstrating the strong generalization performance of the proposed method.
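The pipeline described in the abstract (energy-based key-frame selection, then fusion of optical flow with LBP texture) can be illustrated with a minimal pure-Python sketch. The abstract does not give the weighting scheme, the threshold rule, or the fusion layout, so everything below is an assumption: the per-pixel `weights`, the "energy above `scale` times the mean energy" rule, the basic 3x3 LBP variant, and the per-pixel `(u, v, lbp)` stacking are illustrative choices, and all function names are hypothetical. Dense flow fields are assumed to be precomputed by some optical flow estimator.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]


def weighted_flow_energy(flow_u: Grid, flow_v: Grid, weights: Grid) -> float:
    """Weighted mean of squared flow magnitude over one frame (assumed energy definition)."""
    total = 0.0
    weight_sum = 0.0
    for row_u, row_v, row_w in zip(flow_u, flow_v, weights):
        for u, v, w in zip(row_u, row_v, row_w):
            total += w * (u * u + v * v)
            weight_sum += w
    return total / weight_sum if weight_sum else 0.0


def select_key_frames(energies: Sequence[float], scale: float = 1.0) -> List[int]:
    """Indices of frames whose energy exceeds scale * mean energy (assumed threshold rule)."""
    if not energies:
        return []
    threshold = scale * sum(energies) / len(energies)
    return [i for i, e in enumerate(energies) if e > threshold]


def lbp_map(gray: Grid) -> List[List[int]]:
    """Basic 3x3 LBP code for every interior pixel; border pixels are skipped."""
    # Neighbours visited clockwise from the top-left; neighbour k sets bit k.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(gray), len(gray[0])
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            center = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if gray[y + dy][x + dx] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def fuse(flow_u: Grid, flow_v: Grid, lbp: List[List[int]]) -> List[List[Tuple[float, float, int]]]:
    """Stack (u, v, lbp) per pixel, cropping the flow fields to the LBP interior."""
    return [
        [(flow_u[y + 1][x + 1], flow_v[y + 1][x + 1], lbp[y][x]) for x in range(len(lbp[0]))]
        for y in range(len(lbp))
    ]
```

In this sketch the temporal cue (flow components) and the spatial cue (LBP code) end up as channels of one feature map, which is one plausible reading of "fused feature maps"; the paper's actual fusion and augmentation steps may differ.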

Keywords: optical flow; key frames; LBP texture; CNN model

CHENG Yan

Department of Information Science and Technology, East China University of Political Science and Law, Shanghai 201620, China

Funding: General Project of Humanities and Social Sciences, Ministry of Education (23YJA820015)

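Journal: Computer Science
Publisher: Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(11)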