Abstract

With the rapid advancement of deepfake technology, deepfake videos now appear highly realistic in every individual frame, and existing detection methods struggle to identify them effectively. To address this problem, this paper proposes, for the first time, a deepfake detection method based on a video flow spectrum feature space. Building on flow spectrum theory, the method constructs a video flow spectrum feature space and uses a video flow spectrum basis model to map the video stream from the video feature latent space into that space, precisely characterizing the inconsistency information in the video stream and yielding flow spectrum inconsistency features with higher separability, which enables the detection of deepfake videos. Specifically, the paper first proposes a method for constructing the video flow spectrum feature space: by basis-mapping the video feature latent space, it obtains an approximately isomorphic flow spectrum feature description space, in which high-dimensional representations of the video stream from different views are fused to characterize and analyze the video stream precisely. The paper then designs a video inconsistency flow spectrum mapping model: using a video flow spectrum transform operator, it aggregates the spatial information of the video stream and maps it into the video flow spectrum feature space from the temporal perspective, modeling the inconsistency information of deepfake videos and constructing video representations with higher data separability. Experimental results show that the proposed method achieves 99.23% accuracy on the Celeb-DF dataset and 95.24% accuracy on the DFDC dataset.
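The abstract does not specify how the basis mapping is implemented. Purely as an illustrative sketch (the module name, dimensions, learned-basis formulation, and orthogonality regularizer below are assumptions, not the authors' method), projecting per-frame latent features onto a learned flow spectrum basis could look like this in PyTorch:

```python
import torch
import torch.nn as nn

class FlowSpectrumBasis(nn.Module):
    """Hypothetical sketch: project per-frame latent features onto a
    learned set of basis vectors to obtain flow spectrum coordinates.
    All dimensions and the orthogonality heuristic are assumptions."""

    def __init__(self, latent_dim: int = 512, num_basis: int = 128):
        super().__init__()
        # Learned basis B of shape (latent_dim, num_basis); its columns
        # span the flow spectrum feature space.
        self.basis = nn.Parameter(torch.randn(latent_dim, num_basis) * 0.02)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, frames, latent_dim) -- per-frame latent features
        # produced by any backbone encoder.
        # Returns coordinates in the basis: (batch, frames, num_basis).
        return z @ self.basis

    def orthogonality_penalty(self) -> torch.Tensor:
        # Optional regularizer encouraging a well-conditioned
        # (approximately isomorphic) basis: ||B^T B - I||_F^2.
        btb = self.basis.T @ self.basis
        eye = torch.eye(btb.shape[0], device=btb.device)
        return ((btb - eye) ** 2).sum()
```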
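Similarly, the video flow spectrum transform operator is described only at a high level. A minimal sketch of one possible reading, assuming first-order temporal differences of the flow spectrum coordinates as the inconsistency signal and a GRU as the temporal aggregator (both assumptions, not from the paper), is:

```python
import torch
import torch.nn as nn

class InconsistencyFlowSpectrumMap(nn.Module):
    """Hypothetical sketch of a temporal transform operator: model
    frame-to-frame changes of flow spectrum coordinates and pool them
    into a clip-level representation for real/fake classification.
    The difference operator and GRU aggregator are assumptions."""

    def __init__(self, num_basis: int = 128, hidden_dim: int = 256):
        super().__init__()
        self.temporal = nn.GRU(num_basis, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 2)  # real vs. fake logits

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        # s: (batch, frames, num_basis) -- flow spectrum coordinates.
        # First-order temporal differences highlight inconsistencies
        # between adjacent frames.
        ds = s[:, 1:, :] - s[:, :-1, :]
        _, h = self.temporal(ds)         # h: (1, batch, hidden_dim)
        return self.head(h.squeeze(0))   # (batch, 2)

# Illustrative usage: s would be the FlowSpectrumBasis output.
model = InconsistencyFlowSpectrumMap()
s = torch.randn(4, 16, 128)  # dummy coordinates: 4 clips, 16 frames
logits = model(s)            # -> shape (4, 2)
```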