Counterfactuals and causability in explainable artificial intelligence: Theory, algorithms, and applications
Full text: NSTL | Elsevier
Deep learning models have achieved high performance across domains such as medical decision making, autonomous vehicles, and decision support systems. Despite this success, these models are opaque: their internal representations are too complex for a human to understand, which makes it hard to explain how or why they arrive at their predictions. There has been growing interest in model-agnostic methods that make deep learning models more transparent and explainable to humans. Some researchers have recently argued that for a machine to achieve human-level explainability, it must provide explanations that a human can understand causally, a property known as causability. Counterfactuals are a specific class of algorithms with the potential to provide causability. This paper presents an in-depth systematic review of the diverse existing literature on counterfactuals and causability for explainable artificial intelligence (AI). We performed a Latent Dirichlet Allocation (LDA) topic modelling analysis under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to identify the most relevant articles. This analysis yielded a novel taxonomy that considers the grounding theories of the surveyed algorithms, together with their underlying properties and applications to real-world data. Our research suggests that current model-agnostic counterfactual algorithms for explainable AI are not grounded in a causal theoretical formalism and, consequently, cannot promote causability for a human decision-maker. Furthermore, our findings suggest that the explanations derived from popular algorithms in the literature reflect spurious correlations rather than cause-effect relationships, leading to sub-optimal, erroneous, or even biased explanations. Thus, this paper also advances the literature with new directions and challenges for promoting causability in model-agnostic approaches to explainable AI.
Keywords: Deep learning; Explainable AI; Causability; Counterfactuals; Causality; Black-box; Explanations; Models; Interpretability; Predictions
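For intuition about the kind of model-agnostic counterfactual explanation the abstract discusses, a common formulation (in the spirit of Wachter et al.'s counterfactual objective) perturbs an input until a black-box classifier changes its prediction, while keeping the perturbed instance close to the original. The sketch below is a hypothetical, minimal illustration of that idea, not one of the specific algorithms surveyed in the paper; the `model.predict_proba` interface, the finite-difference gradient estimate, and all parameter names are assumptions made for the example.

```python
# Minimal sketch of a model-agnostic counterfactual search.
# Assumes `model` is any black-box classifier exposing a
# scikit-learn-style predict_proba(X) method; illustrative only.
import numpy as np

def find_counterfactual(model, x, target_class, lam=0.1,
                        lr=0.05, n_steps=500, eps=1e-3):
    """Perturb x until the black-box model predicts `target_class`,
    while keeping the counterfactual close to the original instance."""
    x = np.asarray(x, dtype=float)
    x_cf = x.copy()

    def loss(z):
        # (1 - p_target)^2 pushes the prediction towards the target class;
        # the L1 term keeps the counterfactual close (and sparse) w.r.t. x.
        p_target = model.predict_proba(z.reshape(1, -1))[0, target_class]
        return (1.0 - p_target) ** 2 + lam * np.sum(np.abs(z - x))

    for _ in range(n_steps):
        # Finite-difference gradient estimate: the model is treated as a
        # black box, so no analytic gradients are assumed.
        grad = np.zeros_like(x_cf)
        for j in range(x_cf.size):
            step = np.zeros_like(x_cf)
            step[j] = eps
            grad[j] = (loss(x_cf + step) - loss(x_cf - step)) / (2 * eps)
        x_cf -= lr * grad
        # Stop as soon as the model's predicted class flips to the target.
        if model.predict_proba(x_cf.reshape(1, -1))[0].argmax() == target_class:
            break
    return x_cf
```

With any fitted scikit-learn-style classifier `clf` and an instance `x0`, `find_counterfactual(clf, x0, target_class=1)` returns a nearby perturbed instance pushed toward class 1; `lam` trades off validity against proximity to the original. Note that such a search is purely correlational over the model's decision surface, which is precisely the gap between counterfactual explanations and causability that the paper examines.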