Diffusion Model-Based Unconditional Counterfactual Explanation Generation
Counterfactual explanations reveal the key factors behind a model's decisions by making minimal, interpretable modifications to the input that alter the model's output. Existing counterfactual explanation methods based on diffusion models rely on conditional generation and therefore require additional classification-related semantic information; ensuring the quality of this semantic information is difficult, and it increases computational cost. To address these issues, an unconditional counterfactual explanation generation method based on the denoising diffusion implicit model (DDIM) is proposed. By exploiting the consistency that DDIM exhibits during its reverse denoising process, noisy images are treated as latent variables that control the generated outputs, making the diffusion model suitable for unconditional counterfactual explanation generation workflows. The method then fully exploits DDIM's ability to filter out high-frequency noise and out-of-distribution perturbations, reconstructing the unconditional counterfactual explanation workflow so that the generated modifications are semantically interpretable. Extensive experiments on different datasets demonstrate that the proposed method achieves superior results across multiple metrics.
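The consistency property mentioned above comes from DDIM's deterministic reverse update (the η = 0 sampler): given the same initial noise image, the reverse process always produces the same output, so the noise image can serve as a latent code. A minimal sketch of one such reverse step is shown below; the noise-prediction network is replaced by a plain `eps` array, and the function name and signature are illustrative, not taken from the paper.

```python
import numpy as np

def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM reverse step (eta = 0).

    x_t            : noisy image at timestep t
    eps            : predicted noise (in practice the output of a trained network)
    alpha_bar_t    : cumulative noise schedule product at timestep t
    alpha_bar_prev : cumulative product at the previous (less noisy) timestep

    Because no fresh randomness is injected, the same starting noise
    always maps to the same sample, which is what lets noisy images
    act as latent variables controlling the generated output.
    """
    # Estimate the clean image x_0 from the noisy input and the noise estimate.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_bar_t)
    # Move deterministically to the previous timestep along the DDIM trajectory.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps
```

Running the full reverse chain with this step twice from the same noise image yields identical outputs, which is the consistency the proposed method relies on.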
Deep Learning; Interpretability; Counterfactual Explanation; Diffusion Model; Adversarial Attack