Facial Expression Recognition with a Swin Transformer Embedding a Hybrid Attention Mechanism

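Chinese abstract (translated): Facial expression recognition is an important research direction in psychology, with applications in transportation, medical care, security, and criminal investigation. To address the limitations of convolutional neural networks (CNN) in extracting global features of facial expressions, the paper proposes a facial expression recognition method based on a Swin Transformer with an embedded hybrid attention mechanism. The Swin Transformer serves as the backbone network, and a hybrid attention module is embedded in the fusion layer (Patch Merging) of Stage 3, so that the method can extract both global and local features of facial expressions. First, the hierarchical Swin Transformer model captures deep global feature information. Second, the embedded hybrid attention module combines channel and spatial attention mechanisms and extracts features along the channel and spatial dimensions, which lets the model better capture feature information at local positions. In addition, transfer learning is used to initialize the network weights, which improves the accuracy and generalization ability of the model. The proposed method reaches recognition accuracies of 73.63%, 87.01%, and 98.28% on the three public datasets FER2013, RAF-DB, and JAFFE, respectively.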

English title: Facial expression recognition in Swin Transformer by embedding hybrid attention mechanism

English abstract: Facial expression recognition is an important research domain in psychology that can be applied to many fields such as transportation, medical care, security, and criminal investigation. Given the limitations of convolutional neural networks (CNN) in extracting global features of facial expressions, this paper proposes a Swin Transformer method embedded with a hybrid attention mechanism for facial expression recognition. Using the Swin Transformer as the backbone network, a hybrid attention module is embedded in the fusion layer (Patch Merging) of Stage 3 of the model, which can effectively extract global and local features from facial expressions. Firstly, the hierarchical Swin Transformer model can effectively obtain deep global features. Secondly, the embedded hybrid attention module combines channel and spatial attention mechanisms to extract features in the channel dimension and the spatial dimension, which yields better local features. In addition, the paper uses transfer learning to initialize the network weights, thereby improving recognition performance and generalization ability. The proposed method achieved recognition accuracies of 73.63%, 87.01%, and 98.28% on three public datasets (FER2013, RAF-DB, and JAFFE), respectively.
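
The abstract states only that the hybrid attention module combines channel attention and spatial attention; it does not give the exact layer layout. The NumPy sketch below is therefore an illustration under assumptions, not the authors' implementation: it assumes a CBAM-style arrangement (channel attention first, then spatial attention), and the function names, the reduction ratio `r`, the kernel size, and the random weights are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """x: (C, H, W). Returns per-channel weights of shape (C, 1, 1)."""
    avg = x.mean(axis=(1, 2))  # global average pooling over space -> (C,)
    mx = x.max(axis=(1, 2))    # global max pooling over space -> (C,)

    def mlp(v):
        # Shared two-layer MLP with a ReLU bottleneck (assumed design).
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]


def spatial_attention(x, kernel):
    """x: (C, H, W); kernel: (2, k, k) with odd k. Returns weights (1, H, W)."""
    # Pool across channels, then convolve the 2-channel map with one kernel.
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = x.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)[None]


def hybrid_attention(x, w1, w2, kernel):
    """Reweight channels first, then spatial positions (assumed order)."""
    x = x * channel_attention(x, w1, w2)
    return x * spatial_attention(x, kernel)


# Toy feature map standing in for a Stage 3 feature tensor (sizes are made up).
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 3, 3)) * 0.1
refined = hybrid_attention(features, w1, w2, kernel)
```

Because both attention maps pass through a sigmoid, every weight lies in (0, 1): the module only rescales the input features and leaves the tensor shape unchanged, so it can be placed inside an existing layer without altering the surrounding dimensions.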

English keywords:

expression recognition; Transformer; attention mechanism; transfer learning

Authors:

WANG Kunxia (王坤侠), YU Wancheng (余万成), HU Yuxia (胡玉霞)

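Author affiliations:

School of Electronic and Information Engineering, Anhui Jianzhu University, Hefei 230601, Anhui, China

Anhui International Joint Research Center for Intelligent Sensing and High-Dimensional Modeling of Ancient Architecture, Hefei 230601, Anhui, China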

Keywords:

expression recognition; Transformer; attention mechanism; transfer learning

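Funding:

National Natural Science Foundation of China, Young Scientists Fund; Anhui Province Housing and Urban-Rural Development Science and Technology Program; Anhui Province Housing and Urban-Rural Development Science and Technology Program; Open Fund of the Anhui Province Key Laboratory of Intelligent Building and Building Energy Saving, Anhui Jianzhu University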

Project numbers:

62105002; 2023-YF113; 2023-YF004; IBES2022ZR02

Publication year:

2024

DOI:

10.16152/j.cnki.xdxbzr.2024-02-003

Journal of Northwest University (Natural Science Edition)

Northwest University

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.35

ISSN: 1000-274X

Year, volume (issue): 2024, 54(2)

Number of references: 31