Counterfactuals and causability in explainable artificial intelligence: Theory, algorithms, and applications
Full text: NSTL | Elsevier
Deep learning models have achieved high performance across domains such as medical decision making, autonomous vehicles, and decision support systems. Despite this success, these models are opaque: their internal representations are too complex for a human to understand, which makes it hard to explain how or why they arrive at their predictions. There has been growing interest in model-agnostic methods that make deep learning models more transparent and explainable to humans. Some researchers have recently argued that for a machine to achieve human-level explainability, it must provide explanations that a human can understand causally, a property known as causability. Counterfactuals are a specific class of algorithms with the potential to provide causability. This paper presents an in-depth systematic review of the diverse existing literature on counterfactuals and causability for explainable artificial intelligence (AI). We performed a Latent Dirichlet Allocation (LDA) topic modelling analysis under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to identify the most relevant articles. This analysis yielded a novel taxonomy that considers the grounding theories of the surveyed algorithms, together with their underlying properties and applications to real-world data. Our research suggests that current model-agnostic counterfactual algorithms for explainable AI are not grounded in a causal theoretical formalism and, consequently, cannot promote causability for a human decision-maker. Furthermore, our findings suggest that the explanations derived from popular algorithms in the literature reflect spurious correlations rather than cause-effect relationships, leading to sub-optimal, erroneous, or even biased explanations. Thus, this paper also advances the literature with new directions and challenges for promoting causability in model-agnostic approaches to explainable AI.
Keywords: Deep learning; Explainable AI; Causability; Counterfactuals; Causality; Black-box; Explanations; Models; Interpretability; Predictions
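For intuition about the kind of model-agnostic counterfactual explanation the abstract discusses, a common formulation (in the spirit of Wachter et al.'s counterfactual objective) perturbs an input until a black-box classifier changes its prediction, while keeping the perturbed instance close to the original. The sketch below is a hypothetical, minimal illustration of that idea, not one of the specific algorithms surveyed in the paper; the `model.predict_proba` interface, the finite-difference gradient estimate, and all parameter names are assumptions made for the example.

```python
# Minimal sketch of a model-agnostic counterfactual search.
# Assumes `model` is any black-box classifier exposing a
# scikit-learn-style predict_proba(X) method; illustrative only.
import numpy as np

def find_counterfactual(model, x, target_class, lam=0.1,
                        lr=0.05, n_steps=500, eps=1e-3):
    """Perturb x until the black-box model predicts `target_class`,
    while keeping the counterfactual close to the original instance."""
    x = np.asarray(x, dtype=float)
    x_cf = x.copy()

    def loss(z):
        # (1 - p_target)^2 pushes the prediction towards the target class;
        # the L1 term keeps the counterfactual close (and sparse) w.r.t. x.
        p_target = model.predict_proba(z.reshape(1, -1))[0, target_class]
        return (1.0 - p_target) ** 2 + lam * np.sum(np.abs(z - x))

    for _ in range(n_steps):
        # Finite-difference gradient estimate: the model is treated as a
        # black box, so no analytic gradients are assumed.
        grad = np.zeros_like(x_cf)
        for j in range(x_cf.size):
            step = np.zeros_like(x_cf)
            step[j] = eps
            grad[j] = (loss(x_cf + step) - loss(x_cf - step)) / (2 * eps)
        x_cf -= lr * grad
        # Stop as soon as the model's predicted class flips to the target.
        if model.predict_proba(x_cf.reshape(1, -1))[0].argmax() == target_class:
            break
    return x_cf
```

With any fitted scikit-learn-style classifier `clf` and an instance `x0`, `find_counterfactual(clf, x0, target_class=1)` returns a nearby perturbed instance pushed toward class 1; `lam` trades off validity against proximity to the original. Note that such a search is purely correlational over the model's decision surface, which is precisely the gap between counterfactual explanations and causability that the paper examines.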