Micro-Expression Detection Method Based on Multi-Scale Spatiotemporal Attention Network

Micro-expressions can reveal genuine emotions that people attempt to hide, providing potentially useful information for criminal investigation, psychological counseling, and similar applications. Existing micro-expression detection methods mainly extract temporal features on top of previously obtained spatial features to construct spatiotemporal features. This approach tends to distort the temporal features, and the spatial processing disrupts the original temporal relationships, which reduces the discriminative power of the resulting spatiotemporal features. To address this issue, a micro-expression detection method based on a multi-scale spatiotemporal attention network is proposed. Micro-expression sequences are processed with a 3-Dimensional Convolutional Neural Network (3DCNN), which captures temporal and spatial relationships jointly, to obtain robust features that account for both the temporal and spatial domains. Multi-scale temporal input sequences are constructed so that multi-dimensional temporal features can be extracted from image sequences of different temporal lengths, and a lightweight 3DCNN extracts the multi-scale spatiotemporal features. A Global Spatiotemporal Attention Module (GSAM) then strengthens the global spatiotemporal correlations of these features: its spatiotemporal restructuring module strengthens the connectivity between image frames at different time steps, and its global information attention module builds spatial correlation information within each single frame. Finally, weights are assigned to the features at different time steps to highlight key temporal information and complete the detection. Experimental results show that the method accurately detects micro-expression sequence segments, reaching accuracies of 92.32%, 95.04%, and 89.56% on the public CASME, CASME Ⅱ, and SAMM datasets, respectively. Compared with LGAttNet, the best-performing existing deep learning method, the proposed method improves accuracy by 3.84 percentage points on CASME Ⅱ and 4.96 percentage points on SAMM.
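
Two steps named in the abstract, building multi-scale temporal input sequences and weighting features from different time steps, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the window lengths in `scales`, the fixed clip length `out_len`, the edge clamping, and the use of a softmax for the weights are all assumptions made for illustration, since the abstract does not specify them. In the actual network the per-frame scores would presumably be learned; here they are passed in by hand.

```python
import math


def multi_scale_clips(frames, center, scales=(5, 9, 13), out_len=5):
    """Build one clip per temporal scale, centered on index `center`.

    Each clip covers `scale` consecutive frames and is uniformly
    subsampled to `out_len` frames (out_len >= 2), so clips from all
    scales share one length. Window sizes are illustrative assumptions.
    """
    last = len(frames) - 1
    clips = []
    for scale in scales:
        half = scale // 2
        # Evenly spaced offsets running from -half to +half.
        step = (2 * half) / (out_len - 1)
        idxs = [center - half + round(i * step) for i in range(out_len)]
        # Clamp to the valid range; edge frames repeat near the boundaries.
        idxs = [min(max(j, 0), last) for j in idxs]
        clips.append([frames[j] for j in idxs])
    return clips


def temporal_attention_pool(frame_feats, scores):
    """Weight per-frame feature vectors with a softmax over `scores`.

    Returns (weights, pooled), where `pooled` is the weighted sum of the
    feature vectors. Softmax weighting is an assumed mechanism.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_feats[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, frame_feats))
        for d in range(dim)
    ]
    return weights, pooled
```

For example, `multi_scale_clips(list(range(20)), center=10)` samples the same five-frame budget from windows of 5, 9, and 13 frames, so longer windows are covered with a coarser stride. A frame with a higher score passed to `temporal_attention_pool` contributes more to the pooled feature.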

micro-expression detection; 3-Dimensional Convolutional Neural Network (3DCNN); spatiotemporal features; multi-scale features; correlation

Yu Yang, Sun Fangfang, Lü Hua, Li Yang, Wang Xiaomin

School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401

Institute of Information, Tianjin Academy of Agricultural Sciences, Tianjin 300192

National Natural Science Foundation of China (62276088, 62102129)

2024

Computer Engineering
East China Institute of Computing Technology; Shanghai Computer Society

CSTPCD; Peking University Core Journals
Impact factor: 0.581
ISSN:1000-3428
Year, volume (issue): 2024, 50(6)