A Survey on Interpretability of Facial Expression Recognition
In recent years, Facial Expression Recognition (FER) has been widely applied in medicine, social robotics, communication, security, and many other fields. A growing number of researchers have shown interest in FER and have proposed effective algorithms. At the same time, the study of FER interpretability has attracted increasing attention, as it can deepen researchers' understanding of the models and help ensure fairness, privacy preservation, and robustness. In this paper, we survey interpretability work in the field of FER under a three-part classification: result interpretability, mechanism interpretability, and model interpretability.

Result interpretability indicates the extent to which people with relevant experience can consistently understand the outputs of a model. Result-interpretable FER mainly includes methods based on textual description and on the basic structure of the face; the latter comprise approaches based on facial Action Units (AUs), topological modeling, caricature images, and interference analysis.

Mechanism interpretability focuses on explaining the internal mechanisms of the models, including attention mechanisms in FER as well as interpretability methods based on feature decoupling and concept learning.

For model interpretability, researchers typically try to uncover the decision principles or rules of the models. This paper illustrates interpretable classification methods in FER that belong to this category, including approaches based on the Multi-Kernel Support Vector Machine (MKSVM) and approaches based on decision trees and deep forests.

We then compare and analyze the surveyed FER interpretability work and identify open problems in this area, including the lack of evaluation metrics for FER interpretability analysis, the difficulty of balancing the accuracy and interpretability of FER models, and the scarcity of interpretability-oriented data for expression recognition.

Finally, we discuss future directions. The first concerns the interpretability of complex expression recognition, focusing on compound expressions and finer-grained expressions. The second is the interpretability of multi-modal emotion recognition: multi-modal models achieve better performance by exploiting the complementary information of each modality, and their interpretability analysis is an important direction worth exploring. Third, we believe the interpretability of expression and emotion recognition with large models is another significant direction, covering Large Vision Models, Vision-Language Models, and Multi-modal Large Models; interpretability studies can help improve the safety and reliability of large models. Last, we address the enhancement of generalization ability based on interpretability. When a model learns "correlation" rather than "causality", it is prone to making wrong judgments when encountering new data or when affected by other factors; that is, the model does not generalize well. Interpretability analysis helps deepen our understanding of the nature of the models and explain the causal relationship between input and output, and can therefore improve generalization performance.

This paper intends to provide interested researchers with a comprehensive review and analysis of the current state of research on the interpretability of facial expression recognition, thereby promoting further advancements in this field.
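
To make the model-interpretability category above concrete, the following is a minimal sketch, not drawn from any specific surveyed paper, of a rule-based expression classifier: a shallow decision tree trained on facial Action Unit (AU) activations whose learned if-then rules can be printed and inspected. The AU feature set, the synthetic data, and the toy labeling rule (strong AU6 + AU12 suggesting happiness, strong AU4 + AU15 suggesting sadness) are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not from the surveyed papers): an
# interpretable expression classifier that maps facial Action Unit (AU)
# activations to expression labels with a shallow decision tree, so the
# learned decision rules can be read directly.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical AU intensity features: rows are faces, columns are AUs
# (e.g., AU6 "cheek raiser" and AU12 "lip corner puller" for happiness).
au_names = ["AU1", "AU4", "AU6", "AU12", "AU15"]
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(200, len(au_names)))
# Toy labeling rule for illustration only.
y = np.where(X[:, 2] + X[:, 3] > 6.0, "happy",
             np.where(X[:, 1] + X[:, 4] > 6.0, "sad", "neutral"))

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The tree reads as if-then rules over named AUs, which is the sense of
# "model interpretability" used in this survey.
print(export_text(tree, feature_names=au_names))
```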
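
For the mechanism-interpretability category, a common device is a spatial attention layer whose weights can be read out as a heat map over facial regions. The sketch below is an assumed, generic architecture, not the design of any particular surveyed model; the module name and feature shapes are hypothetical.

```python
# Minimal sketch (assumed architecture): a spatial-attention layer whose
# weights form an inspectable map over a CNN feature grid, illustrating
# how attention supports mechanism interpretability in FER models.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Produces a per-location attention map over a backbone feature grid."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # 1x1 scoring conv

    def forward(self, feats: torch.Tensor):
        # feats: (batch, channels, H, W) feature map from a face backbone.
        attn = torch.softmax(self.score(feats).flatten(2), dim=-1)
        attn = attn.view(feats.size(0), 1, *feats.shape[2:])   # (B, 1, H, W)
        pooled = (feats * attn).sum(dim=(2, 3))  # attention-weighted pooling
        return pooled, attn  # attn can be upsampled and overlaid on the face

# Usage: inspect which facial regions the model attends to.
layer = SpatialAttention(channels=64)
feats = torch.randn(1, 64, 7, 7)   # stand-in backbone features
pooled, attn = layer(feats)
print(pooled.shape, attn.shape)    # (1, 64) and (1, 1, 7, 7)
```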
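
For the MKSVM-based methods, one simplified reading is that the weight assigned to each feature-specific kernel indicates how much that facial cue contributes to the decision. The sketch below assumes fixed kernel weights rather than the learned multiple-kernel optimization used in actual MKSVM work, and the geometric/texture feature views and their weights are hypothetical.

```python
# Minimal sketch (assumption: fixed kernel weights, not learned MKL): an
# SVM over a weighted combination of per-view kernels, in the spirit of
# MKSVM-based interpretable FER, where the per-kernel weight reflects the
# contribution of each facial cue.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel, linear_kernel

rng = np.random.default_rng(0)
X_geo = rng.normal(size=(120, 10))   # hypothetical geometric (landmark) view
X_tex = rng.normal(size=(120, 32))   # hypothetical texture (appearance) view
y = rng.integers(0, 2, size=120)     # toy binary expression labels

# One kernel per feature view; the weights are the interpretable part.
w_geo, w_tex = 0.6, 0.4
K = w_geo * rbf_kernel(X_geo) + w_tex * linear_kernel(X_tex)

clf = SVC(kernel="precomputed").fit(K, y)
print("train accuracy:", clf.score(K, y))
```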