Causal Identification of Sufficient, Contrastive and Complete Feature Sets
in Image Classification

David A. Kelly, Hana Chockler
Abstract

Existing algorithms for explaining the outputs of image classifiers are based on a variety of approaches and produce explanations that lack formal rigour. On the other hand, logic-based explanations are formally and rigorously defined, but their computability relies on strict assumptions about the model that do not hold for image classifiers.

In this paper, we show that causal explanations, in addition to being formally and rigorously defined, enjoy the same formal properties as logic-based ones, while still lending themselves to black-box algorithms and being a natural fit for image classifiers. We prove formal properties of causal explanations and introduce contrastive causal explanations for image classifiers. Moreover, we augment the definition of explanation with confidence awareness and introduce complete causal explanations: explanations that are classified with exactly the same confidence as the original image.

We implement our definitions, and our experimental results demonstrate that different models have different patterns of sufficiency, contrastiveness, and completeness. Our algorithms are efficiently computable, taking on average 6s per image on a ResNet50 model to compute all types of explanations, and are entirely black-box: they need no knowledge of the model, no access to model internals, no access to gradients, and no properties of the model, such as monotonicity.

1 Introduction

(a) Ladybug
(b) Causal explanation
(c) Multiple Causal Explanations
(d) Contrastive explanation
(e) Complete explanation
Figure 1: 4 types of explanation for “ladybug” with 0.46 confidence on a ResNet50. Figure 1(b) shows pixels sufficient to obtain class “ladybug” with more than the original confidence (0.5). Figure 1(c) shows multiple explanations. In this paper we introduce ‘contrastive’ explanations, which are subsets of pixels that are by themselves sufficient for “ladybug”, but removing these pixels results in “leaf beetle” (Figure 1(d)), and ‘complete’ explanations (Figure 1(e)), which are subsets of pixels required to arrive at the original confidence of 0.46.

Recent progress in artificial intelligence and the ever-increasing deployment of AI systems have highlighted the need to better understand why such systems make the decisions they do. For example, one may need to know why a classifier decides that an MRI scan shows evidence of a tumor (Blake et al. 2023). Answering such questions is the province of causality. A causal explanation for an image classification is a special case of explanations in actual causality (Halpern 2019) and identifies a minimal set of pixels which, by themselves, are sufficient to re-create the original decision (Chockler and Halpern 2024).

Logic-based approaches provide formal guarantees for explanations, but their framework assumes that the model is given explicitly as a function. Their formal abductive explanations, or prime implicants (PI), are defined as sets of features that, if they take the given values, always lead to the same decision (Shih, Choi, and Darwiche 2018). Logic-based methods can also compute contrastive explanations, that is, the features which, if altered, change the original decision. These abductive and contrastive explanations require the model to be monotonic or linear in order to be effectively computable (Marques-Silva et al. 2021), and are therefore not suitable for image classifiers.

In this paper, we show that causal explanations enjoy all the formal properties of logic-based explanations, while not putting any restrictions on the model and being efficiently computable for black-box image classifiers. We prove equivalence between abductive explanations and causal explanations and introduce a variation of causal explanations equivalent to contrastive explanations. Furthermore, we augment the actual causality framework with the model’s confidence in the classification, introducing $\delta$-confident explanations, and use these to produce more fine-grained results. We also introduce a causal version of the completeness property for explanations, following (Srinivas and Fleuret 2019). The differences between explanations of the classification of the same image reveal important information about the decision process of the model. In particular, we call the difference between a complete explanation and a sufficient one the adjustment pixels.

We study the relationship between contrastive and adjustment pixel sets by examining the semantic distance between the original classification and the classifications of these pixel sets, as illustrated in Figure 1. Our approach allows us to formally subdivide an image into its sufficient pixels, contrastive pixels, and adjustment pixels. We prove complexity results for our definitions, justifying the use of efficient approximation algorithms.

Our algorithms are based on rex (Chockler et al. 2024). In Section 4 we introduce black-box approximation algorithms to compute contrastive, complete, and $\delta$-confident causal explanations for image classifiers. Our algorithms require no knowledge of the model, no access to the model internals, and no specific properties of the model. We implemented our algorithms and present experimental results on three state-of-the-art models and three standard benchmark datasets.

We use the word ‘explanation’ for the remainder of the paper as a convenience. We do not intend, nor seek to imply, that causal explanations are more or less interpretable than other forms of explanation. Instead, we are interested in the formal partitioning of pixels in an image into different functional sets. These functional sets reveal important information about the inner workings of the model.

Due to the lack of space, some theoretical and evaluation results are deferred to the supplementary material.

Related Work

Broadly speaking, the field of XAI can be split between formal and informal methods (Izza et al. 2024). The majority of methods belong in the informal camp, including well-known model-agnostic methods (Lundberg and Lee 2017a; Ribeiro, Singh, and Guestrin 2016) and saliency methods (Selvaraju et al. 2017; Bach et al. 2015). Formal explanation work has been dominated by logic-based methods (Shih, Choi, and Darwiche 2018; Ignatiev, Narodytska, and Marques-Silva 2019). Logic-based explanations use abductive reasoning to find the simplest or most likely explanation ‘a’ for a (set of) observations ‘b’. Logic-based explanations provide formal guarantees of feature sufficiency (Definitions 5 and 6), but usually require strong assumptions of monotonicity or linearity for reasons of computability. The scalability of such an approach is open to question. Logic-based methods are black-box XAI methods, in that they do not require access to a model’s internals, or even its gradient.

Constraint-driven black-box explanations (Shrotri et al. 2022) build on the LIME (Ribeiro, Singh, and Guestrin 2016) framework but include user-provided boolean constraints over the search space. In particular, for image explanations, these constraints could dictate the nature of the perturbations the XAI tool generates. Of course, knowing which constraints to use is a hard problem and assumes at least some knowledge of how the model works and what the explanation should be. While such methods are technically black-box, this is because model-dependent information has been separately encoded by the user.

Causal explanations (Chockler and Halpern 2024) belong in the camp of formal XAI, as they provide mathematically rigorous guarantees in much the same manner as logic-based explanations. rex (Chockler et al. 2024) is a black-box causal explainability tool which makes no assumptions about the classifier. It computes an approximation to minimal, sufficient pixel sets against a baseline value.

2 Background

There are many different definitions of explanation in the literature: some are saliency-based (Selvaraju et al. 2017), some are gradient-based (Srinivas and Fleuret 2019), some are Shapley-based (Lundberg and Lee 2017b), and some train locally interpretable models (Ribeiro, Singh, and Guestrin 2016). Logic-based explanations are rather different in having a mathematically precise definition of explanation. Causal explanations, as defined in (Chockler and Halpern 2024) (and their approximations computed by rex (Chockler et al. 2024)), have much more in common with the rigorous definitions of logic-based explanations.

(a) Wash basin/sink (0.6 confidence)
(b) Sufficient explanation (0.04 confidence)
(c) Contrastive explanation (0.757 confidence)
(d) Pixels for contrastive explanation
(e) Complete explanation (adjustment pixels in red)
Figure 2: A ‘washbasin’ partitioned into sufficient, contrastive, and adjustment pixel sets. Completeness required 82% of the image for a ResNet50 model. The causal explanation, Figure 2(b), is very small, with low confidence. The contrastive explanation (Figure 2(c)) has higher confidence than the original image. Masking out Figure 2(c) to get Figure 2(d), ResNet50 gives a classification of ‘toilet seat’. Interestingly, the adjustment pixels (Figure 2(e)) reduce model confidence from 0.75 to 0.6, even though they are also classified as ‘wash basin’.

2.1 Actual causality

In what follows, we briefly introduce the relevant definitions from the theory of actual causality. The reader is referred to (Halpern 2019) for further reading. We assume that the world is described in terms of variables and their values. Some variables may have a causal influence on others. This influence is modeled by a set of structural equations. It is conceptually useful to split the variables into two sets: the exogenous variables $\mathcal{U}$, whose values are determined by factors outside the model, and the endogenous variables $\mathcal{V}$, whose values are ultimately determined by the exogenous variables. The structural equations $\mathcal{F}$ describe how these values are determined. A causal model $M$ is described by its variables and the structural equations. We restrict the discussion to acyclic (recursive) causal models. A context $\vec{u}$ is a setting for the exogenous variables $\mathcal{U}$, which then determines the values of all other variables. We call a pair $(M,\vec{u})$ consisting of a causal model $M$ and a context $\vec{u}$ a (causal) setting. An intervention is defined as setting the value of some variable $X$ to $x$, and essentially amounts to replacing the equation for $X$ in $\mathcal{F}$ by $X=x$. A probabilistic causal model is a pair $(M,\Pr)$, where $\Pr$ is a probability distribution on contexts.

A causal formula $\psi$ is true or false in a setting. We write $(M,\vec{u})\models\psi$ if the causal formula $\psi$ is true in the setting $(M,\vec{u})$. Finally, $(M,\vec{u})\models[\vec{Y}\leftarrow\vec{y}]\varphi$ if $(M_{\vec{Y}\leftarrow\vec{y}},\vec{u})\models\varphi$, where $M_{\vec{Y}\leftarrow\vec{y}}$ is the causal model that is identical to $M$, except that each variable $Y\in\vec{Y}$ is set to its corresponding value $y\in\vec{y}$.

A standard use of causal models is to define actual causation: that is, what it means for some particular event that occurred to cause another particular event. There have been a number of definitions of actual causation given for acyclic models (e.g., (Beckers 2021; Glymour and Wimberly 2007; Hall 2007; Halpern and Pearl 2005; Halpern 2019; Hitchcock 2001, 2007; Weslake 2015; Woodward 2003)). In this paper, we focus on what has become known as the modified Halpern–Pearl definition and some related definitions introduced in (Halpern 2019). We briefly review the relevant definitions below. The events that can be causes are arbitrary conjunctions of primitive events.

Definition 1 (Actual cause).

$\vec{X}=\vec{x}$ is an actual cause of $\varphi$ in $(M,\vec{u})$ if the following three conditions hold:

AC1.

$(M,\vec{u})\models(\vec{X}=\vec{x})$ and $(M,\vec{u})\models\varphi$.

AC2.

There is a setting $\vec{x}'$ of the variables in $\vec{X}$, a (possibly empty) set $\vec{W}$ of variables in $\mathcal{V}\setminus\vec{X}$, and a setting $\vec{w}$ of the variables in $\vec{W}$ such that $(M,\vec{u})\models\vec{W}=\vec{w}$ and $(M,\vec{u})\models[\vec{X}\leftarrow\vec{x}',\vec{W}\leftarrow\vec{w}]\neg\varphi$, and moreover

AC3.

$\vec{X}$ is minimal; there is no strict subset $\vec{X}'$ of $\vec{X}$ such that $\vec{X}'=\vec{x}''$ can replace $\vec{X}=\vec{x}'$ in AC2, where $\vec{x}''$ is the restriction of $\vec{x}'$ to the variables in $\vec{X}'$.

In the special case that $\vec{W}=\emptyset$, we get the but-for definition. A variable $X$ in an actual cause $\vec{X}$ is called a part of a cause. In what follows, we adopt the convention of Halpern and state that a part of a cause is a cause.

The notion of explanation taken from Halpern (2019) is relative to a set of contexts.

Definition 2 (Explanation).

$\vec{X}=\vec{x}$ is an explanation of $\varphi$ relative to a set $\mathcal{K}$ of contexts in a causal model $M$ if the following conditions hold:

EX1a.

If $\vec{u}\in\mathcal{K}$ and $(M,\vec{u})\models(\vec{X}=\vec{x})\wedge\varphi$, then there exists a conjunct $X=x$ of $\vec{X}=\vec{x}$ and a (possibly empty) conjunction $\vec{Y}=\vec{y}$ such that $X=x\wedge\vec{Y}=\vec{y}$ is an actual cause of $\varphi$ in $(M,\vec{u})$.

EX1b.

$(M,\vec{u}')\models[\vec{X}=\vec{x}]\varphi$ for all contexts $\vec{u}'\in\mathcal{K}$.

EX2.

$\vec{X}$ is minimal; there is no strict subset $\vec{X}'$ of $\vec{X}$ such that $\vec{X}'=\vec{x}'$ satisfies EX1, where $\vec{x}'$ is the restriction of $\vec{x}$ to the variables in $\vec{X}'$.

EX3.

$(M,\vec{u})\models\vec{X}=\vec{x}\wedge\varphi$ for some $\vec{u}\in\mathcal{K}$.

2.2 Actual causality in image classification

The material here is taken from (Chockler and Halpern 2024), and the reader is referred to that paper for a complete overview of causal models for black-box image classifiers. We view an image classifier (a neural network) as a probabilistic causal model, with the set of endogenous variables being the set $\vec{V}$ of pixels that the image classifier gets as input, together with an output variable that we call $O$. The variable $V_i\in\vec{V}$ describes the color and intensity of pixel $i$; its value is determined by the exogenous variables. The equation for $O$ determines the output of the neural network as a function of the pixel values. Thus, the causal network has depth 2, with the exogenous variables determining the feature variables, and the feature variables determining the output variable.

We assume causal independence between the feature variables $\vec{V}$. While feature independence does not hold in general, pixel independence is a common assumption in explainability tools. We argue that it is, in fact, a reasonable and accurate assumption for images. Pixels are not concepts: a 2D image is a projection of a scene in the 3D real world onto a plane; concepts that are present in the object in the real world can be invisible in this projection, hence the pixel independence. Moreover, for each setting $\vec{v}$ of the feature variables, there is a setting of the exogenous variables such that $\vec{V}=\vec{v}$. That is, the variables in $\vec{V}$ are causally independent and determined by the context. Moreover, all the parents of the output variable $O$ are contained in $\vec{V}$. Given these assumptions, the probability on contexts directly corresponds to the probability of seeing various images (which the neural network presumably learns during training).
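To make the depth-2 model concrete, the following sketch (our illustration, not part of the cited formalism) treats a context as an image that fixes all pixel variables and implements an intervention $[\vec{X}\leftarrow\vec{x}]$ as overriding a chosen subset of pixels before querying the output equation, i.e. the classifier; the toy classifier below is a placeholder.

```python
import numpy as np

def intervene(context, intervention):
    """Depth-2 causal model: the context u fixes all pixel variables V;
    an intervention [X <- x] overrides the pixels in X with the values x."""
    pixels = context.copy()                      # V is determined by u
    for idx, value in intervention.items():      # apply [X <- x]
        pixels[idx] = value
    return pixels

def output(model, pixels):
    """The structural equation for O: the classifier's output on the pixels."""
    return model(pixels)

# Toy usage with a stand-in "classifier" (an assumption, for illustration only).
toy_model = lambda img: int(img.mean() > 0.5)    # placeholder for N
u = np.random.default_rng(0).random((4, 4))      # a context: one image
x_hat = {(0, 0): 0.0, (1, 1): 0.0}               # an intervention on two pixels
print(output(toy_model, intervene(u, x_hat)))    # O under [X <- x]
```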

The following definition is proved in (Chockler and Halpern 2024) to be equivalent to Definition 2 (their proof is for a partial explanation, which is a generalization of the notion of explanation).

Definition 3 (Explanation for image classifiers).

For a depth-2 causal model $M$ corresponding to an image classifier $\mathcal{N}$, $\vec{X}=\vec{x}$ is an explanation of $O=o$ relative to a set of contexts $\mathcal{K}$, if the following conditions hold:

EXIC1.

$(M,\vec{u})\models[\vec{X}=\vec{x}](O=o)$ for all $\vec{u}\in\mathcal{K}$.

EXIC2.

$\vec{X}$ is minimal; there is no strict subset $\vec{X}'$ of $\vec{X}$ such that $\vec{X}'=\vec{x}'$ satisfies EXIC1, where $\vec{x}'$ is the restriction of $\vec{x}$ to the variables in $\vec{X}'$.

EXIC3.

There exists a context $\vec{u}''\in\mathcal{K}$ and a setting $\vec{x}''$ of $\vec{X}$ such that $(M,\vec{u}'')\models(\vec{X}=\vec{x})\land(O=o)$ and $(M,\vec{u}'')\models[\vec{X}=\vec{x}''](O\neq o)$.

2.3 Logic-based Explanations

We now briefly review some relevant definitions from the world of logic-based explanations.

A classification problem is characterized by a set of features $\mathcal{F}=\{1,\dots,m\}$ and a set of classes $K=\{c_1,\dots,c_k\}$. Each feature $i\in\mathcal{F}$ has a domain $D_i$, resulting in a feature space $\mathbb{F}=D_1\times D_2\times\dots\times D_m$. The classifier $\mathcal{N}$ cannot be a constant function: there must be at least two different points in the feature space that have different classifications. The most important assumption underlying the computability of logic-based explanations is monotonicity.

Definition 4.

[Monotonic Classifier (Marques-Silva et al. 2021)] Given feature domains and a set of classes assumed to be totally ordered, a classifier $\mathcal{N}$ is fully monotonic if $a\leq b\Rightarrow\mathcal{N}(a)\leq\mathcal{N}(b)$ (where, given two feature vectors $a$ and $b$, we say that $a\leq b$ if $a_i\leq b_i$ for $i=1,\dots,m$).

Definition 5.

[Abductive Explanation (Marques-Silva et al. 2021)] An abductive, or prime-implicant (PI), explanation is a subset-minimal set of features $\mathcal{X}\subseteq\mathcal{F}$ which, if assigned the values $v$ dictated by the instance $(v,c)$, is sufficient for the prediction $c$:

$\forall(x\in\mathbb{F})\ \big[\bigwedge_{i\in\mathcal{X}}(x_i=v_i)\big]\rightarrow(\mathcal{N}(x)=c).$   (1)

The other common definition in logic-based explanations relevant to our discussion is contrastive explanations. A contrastive explanation answers the question “why did this happen, and not that?” (Miller 2019).

Definition 6.

[Contrastive Explanation (Ignatiev et al. 2020)] A contrastive explanation is a subset-minimal set $\mathcal{Y}\subseteq\mathcal{F}$ such that, if the features in $\mathcal{F}\setminus\mathcal{Y}$ are assigned the values dictated by the instance $(v,c)$, then there is an assignment to the features in $\mathcal{Y}$ that changes the prediction:

$\exists(x\in\mathbb{F})\ \big[\bigwedge_{i\in\mathcal{F}\setminus\mathcal{Y}}(x_i=v_i)\big]\land(\mathcal{N}(x)\neq c).$   (2)
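For intuition, the following toy sketch (our illustration; the three-feature classifier and domains are invented) enumerates subsets of a tiny feature space and checks the universally and existentially quantified conditions of Definitions 5 and 6 by brute force. It is not intended to scale; it only makes the two definitions concrete.

```python
from itertools import combinations, product

FEATURES = [0, 1, 2]                 # toy feature set F
DOMAINS = [[0, 1]] * 3               # binary domains D_i

def classifier(x):                   # toy N: majority vote (invented example)
    return int(sum(x) >= 2)

def is_abductive(X, v, c):
    """Definition 5: fixing the features in X to their values in v forces prediction c."""
    free = [i for i in FEATURES if i not in X]
    for values in product(*[DOMAINS[i] for i in free]):
        x = list(v)
        for i, val in zip(free, values):
            x[i] = val
        if classifier(x) != c:
            return False
    return True

def is_contrastive(Y, v, c):
    """Definition 6: some assignment to Y (others fixed to v) flips the prediction."""
    for values in product(*[DOMAINS[i] for i in Y]):
        x = list(v)
        for i, val in zip(Y, values):
            x[i] = val
        if classifier(x) != c:
            return True
    return False

def minimal(check, v, c):
    for k in range(len(FEATURES) + 1):           # smallest sets first
        for S in combinations(FEATURES, k):
            if check(list(S), v, c):
                return list(S)

v, c = [1, 1, 0], classifier([1, 1, 0])
print("abductive:", minimal(is_abductive, v, c))      # [0, 1]
print("contrastive:", minimal(is_contrastive, v, c))  # [0]
```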

3 Theoretical Results

In this section we prove equivalences between definitions and properties satisfied by causal explanations. We start with a formalization of logic-based explanations in the actual causality framework. For a given classification problem as defined in Section 2.3, we define a depth-2 causal model $M$ as follows. The set of endogenous input variables is the set of features $\mathcal{F}$, with each variable $i\in\mathcal{F}$ having a domain $D_i$. The output of the classifier is the output variable $O$ of the model, with the domain $K$. An instance $(v,c)$ corresponds to a context $\vec{u}$ for $M$ that assigns to $\mathcal{F}$ the values defined by $v$, and the output is $c$. The set $\mathcal{K}$ of contexts is defined as the feature space $\mathbb{F}=D_1\times D_2\times\dots\times D_m$. As the classifier is not constant, there exist two inputs $v$ and $v'$ such that $(M,v)\models O=c$, $(M,v')\models O=c'$, and $c\neq c'$. It is easy to see that $M$ is a depth-2 causal model with all input variables being causally independent.

3.1 Equivalences between definitions

Lemma 1.

For image classifiers, causal explanations (Definition 3) are abductive (Definition 5).

The following is an easy corollary from Lemma 1 when we observe that the proof does not use any unique characteristics of image classifiers.

Corollary 3.1.

Explanations (Definition 2) in general causal models of depth 2 with all input variables being independent are abductive.

An abductive explanation is defined relative to the set of all possible assignments, whereas a causal explanation is always relative to a set $\mathcal{K}$ of contexts. Monotonicity does not hold in general for image classifiers, where increasing or decreasing the values of some pixel set can change the classification. The definition of an abductive explanation is essentially the definition of causal explanation given in Definition 2, as long as we assume that the instance $(v,c)$ is an ‘actual’ instance (AC1); we need only account for $\mathcal{K}$ in this definition.

Lemma 2.

Contrastive explanations (Definition 6) always exist and are equivalent to actual causes (Definition 1) in the same setting.

3.2 Contrastive and complete explanations with confidence

In this section we extend the definition of causal explanation to explicitly include model confidence and apply this extension to contrastive and complete explanations.

Causal models with confidence

Logic-based explanations do not consider model confidence. While causal explanations are general and can, in principle, include model confidence as part of the model’s output, (Chockler and Halpern 2024) consider only the output label of the classifier. However, while a pixel set may be sufficient for the classification, the model confidence may be very low, leading to low-quality explanations, as shown in (Kelly, Chanchal, and Blake 2025). A more useful definition of an explanation should therefore take model confidence into account as well, so that the pixel set is sufficient to obtain the required classification with at least the required degree of confidence.

Definition 7.

[$\delta$-confident explanation] For a depth-2 causal model $M$ corresponding to an image classifier $\mathcal{N}$ and $0\leq\delta\leq 1$, $\vec{X}=\vec{x}$ is a $\delta$-confident explanation of $O=o$ with a confidence $C=c$ relative to a set of contexts $\mathcal{K}$, if it is an explanation of $O=o$ relative to the set of contexts $\mathcal{K}$ according to Definition 3 and, for all $\vec{u}\in\mathcal{K}$, $(M,\vec{u})\models[\vec{X}=\vec{x}](O=o)$ with confidence at least $\delta\times c$. If the confidence is exactly $\delta\times c$, we call it a $\delta$-exact explanation. We call $1$-exact explanations complete explanations.
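Operationally, checking the confidence clause of Definition 7 for a candidate pixel set amounts to occluding everything outside the set with a baseline and comparing the classifier’s label and softmax score with $\delta$ times the original confidence. The sketch below is illustrative only; it assumes a PyTorch-style `model` returning logits and a `baseline` tensor of the same shape as the image (e.g. all zeros), neither of which is prescribed by the definition.

```python
import torch

def is_delta_confident(model, image, pixel_mask, baseline, delta):
    """Check the operational core of a delta-confident explanation:
    the retained pixels alone must reproduce the original label with
    confidence >= delta * original confidence. `model`, `baseline`
    and the tensor shapes are assumptions for this sketch."""
    with torch.no_grad():
        probs = torch.softmax(model(image.unsqueeze(0)), dim=1)[0]
        label = int(probs.argmax())
        original_conf = float(probs[label])

        # keep the pixels in the candidate set, occlude the rest
        masked = torch.where(pixel_mask.bool(), image, baseline)
        probs_m = torch.softmax(model(masked.unsqueeze(0)), dim=1)[0]

    return (int(probs_m.argmax()) == label
            and float(probs_m[label]) >= delta * original_conf)
```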

Given that, in the worst case, the entire image already has the required confidence, it follows that there is always a complete explanation.

Lemma 3.

A complete explanation always exists.

Contrastive explanations

Multiple sufficient explanations are common in images (Chockler, Kelly, and Kroening 2025). Detecting them depends on the set of contexts $\mathcal{K}$. We have seen that a logic-based contrastive explanation (Definition 6) corresponds to an actual cause. The actual causality framework allows us to introduce and compute different versions of contrastive explanations for different sets of contexts, as it is not limited to a set of contexts $\mathcal{K}$ that includes all possible combinations of variable values.

Completeness and adjustment pixels

Completeness is a property found in work on saliency methods (Srinivas and Fleuret 2019). There, the intuition is that if the saliency map $S(x)$ completely encodes the computation performed by $\mathcal{N}$, then it is possible to recover $\mathcal{N}(x)$ from $S(x)$ and $x$ using some function $\phi$. In effect, this means that, in addition to recovering the original model decision, we also need to recover the model’s confidence in its decision. As a causal explanation is not a saliency map, but rather a set of pixels, we introduce a property of complete explanations (Definition 7). Informally, a complete explanation for an image $I$ is a subset-minimal set of pixels which has both the same class and the same confidence as $\mathcal{N}(I)$ for some model $\mathcal{N}$.

A complete explanation for an image classification can be broken down into 3 sets of pixels: pixels sufficient for the classification, pixels which are contrastive for the classification, and the (possibly empty) set of pixels which bring the confidence to the desired level. We call this last set of pixels the adjustment pixels.

3.3 Input invariance

Finally, we draw attention to a useful property of causal explanations in general. Input invariance (Kindermans et al. 2022) is a property first defined for saliency-based methods. It is a stronger form of the property introduced by Srinivas and Fleuret (2019) as weak dependence. Simply stated, given two models $\mathcal{N}_1$ and $\mathcal{N}_2$ that are identical except that $\mathcal{N}_2$ has had its first layer, before the non-linearity, altered in a manner that does not affect the gradient (e.g., mean-shifting), there should be no difference in the saliency maps, i.e. $S(\mathcal{N}_1(x))=S(\mathcal{N}_2(x))$.

Some methods, such as LRP (Bach et al. 2015), do not satisfy input invariance (Kindermans et al. 2022). Other methods, notably LIME (Ribeiro, Singh, and Guestrin 2016) train an additional model on local data perturbations: it is difficult to make a general statement regarding LIME and input invariance due to local model variability.

As a causal explanation is independent of the exact value of $x$ and depends only on the output of $\mathcal{N}$, it is invariant in the face of such alterations. The only subtlety is that the baseline value over which an explanation is extracted also needs to be mean-shifted.
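As a toy illustration of why the baseline must be shifted along with the input (a sketch of ours, not from the paper): any occlusion-style query that composes image and baseline pixels returns exactly the same value under the shifted model, so a black-box explainer that only sees these outputs produces the same explanation.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))           # toy "first layer" weights

def n1(x):                            # original model
    return np.tanh(W @ x).sum()

shift = 3.0
def n2(x):                            # mean-shifted variant: n2(x + shift) == n1(x)
    return n1(x - shift)

x = rng.normal(size=4)                # an "image"
baseline = np.zeros(4)                # occlusion value

mask = np.array([1.0, 0.0, 1.0, 0.0])                             # keep pixels 0 and 2
composite_1 = mask * x + (1 - mask) * baseline                    # query to n1
composite_2 = mask * (x + shift) + (1 - mask) * (baseline + shift)  # shifted query to n2

assert np.isclose(n1(composite_1), n2(composite_2))   # identical black-box outputs
```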

The following lemma follows from the observation that Definition 3 depends only on the properties of $x$ and not on its value.

Lemma 4.

A causal explanation is input invariant.

4 Algorithm

(a) Original image, misclassified as ox (0.2919 confidence)
(b) Sufficient explanation (0.18 confidence)
(c) Contrastive explanation (0.298 confidence)
(d) Complete explanation (same confidence as Figure 3(a))
Figure 3: This image has the most extreme contrast classification for MobileNet on a PascalVOC image. MobileNet incorrectly classifies it as ox with a relatively low confidence (Figure 3(a)). The contrastive explanation, which has a higher confidence than Figure 3(a), has the contrast classification of moped (confidence 0.133). Finally, the adjustment pixels are classified as ‘picket fence’. The unusual behavior may be a result of the original misclassification.

We use rex (Chockler et al. 2024) as a basis for computing the definitions provided in this paper. rex uses an approximation of causal responsibility to rank features. Responsibility is a quantitative measure of causality and, roughly speaking, measures the amount of causal influence on the classification (Chockler and Halpern 2004). Algorithm 1 details the greedy algorithm for approximation of $\delta$-sufficient-contrastive explanations, that is, explanations which are both sufficient and contrastive. Unfortunately, the precise computation of contrastive explanations is intractable, as follows from the intractability of actual causes in depth-2 causal models with independent inputs (Chockler et al. 2024).

We use a responsibility map created by rex (Algorithm 1, line 4). A full description of rex can be found in Chockler et al. (2024). rex approximates the causal responsibility of pixels in an image $I$ by performing an initial partition of $I$. Intuitively, a part of $I$ has the highest causal responsibility if it, by itself, is sufficient for the classification. A part of $I$ has lower causal responsibility if it is not sufficient by itself, but is sufficient together with several other parts. A part is completely irrelevant if it is neither. rex starts from large parts and then refines the important parts iteratively.

We use the resulting responsibility map to rank pixels by their responsibility for the desired classification. We use 2 different sets of contexts, $\mathcal{K}^+$ and $\mathcal{K}^-$ (Algorithm 1, line 5), to discover explanations. $\mathcal{K}^+$ is created by inserting pixels into an image created from the baseline occlusion value, as dictated by their approximate causal responsibility. In the case of a typical ImageNet model, which accepts an input tensor of size $224\times 224$, the total size of $\mathcal{K}^+$ is 50176. In order to calculate the contrast of this, we also consider the opposite: the context set $\mathcal{K}^-$, which replaces pixels from the original image with the baseline value, in accordance with their ranking by responsibility. The procedure is similar to the one used for insertion/deletion curves (Petsiuk, Das, and Saenko 2018). $\mathcal{K}^+$ ($\mathcal{K}^-$) is the set of contexts of all images produced by inserting (occluding) pixels, based on their pixel ranks, over a baseline value. In our experiments, we use the same baseline value for both $\mathcal{K}^+$ and $\mathcal{K}^-$. Our algorithm computes explanations which are both sufficient and contrastive.

Algorithm 1 $\delta$-Sufficient-Contrastive Explanation$(I,\delta,\mathcal{N})$

INPUT:   an image $I$, a confidence scalar $\delta$ between 0 and 1, and a model $\mathcal{N}$

OUTPUT:   a sufficient explanation $s$, a contrastive explanation $c$, a sorted responsibility ranking $\mathcal{R}$

1: $s,c\leftarrow\emptyset$
2: $l\leftarrow\mathcal{N}(I)$
3: $\tau\leftarrow\sigma(\mathcal{N}(I))\times\delta$ (model confidence)
4: $\mathcal{R}\leftarrow\mathit{pixel\_ranking}(I,\mathcal{N})$, sorted high to low
5: $\mathcal{K}^+,\mathcal{K}^-\leftarrow$ initialize
6: for $p\in\mathcal{R}$ do
7:   $\mathcal{K}^+,\mathcal{K}^-\leftarrow$ update with $p$
8:   if $\mathcal{N}(\mathcal{K}^+)=l$ and $\mathcal{N}(\mathcal{K}^-)\neq l$ and $\sigma(\mathcal{N}(\mathcal{K}^+))\geq\tau$ then
9:     $s\leftarrow\mathcal{K}^+$
10:    $c\leftarrow\mathcal{K}^-$
11:    return $s,c,\mathcal{R}$
12:   end if
13: end for
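The following Python sketch (ours, not the authors’ implementation) mirrors the greedy loop of Algorithm 1. It assumes a `model` callable returning a `(label, confidence)` pair, a precomputed responsibility `ranking` of pixel indices, and a scalar `baseline` occlusion value; all of these stand in for the corresponding rex machinery.

```python
import numpy as np

def sufficient_contrastive(image, ranking, model, baseline, delta):
    """Greedy search over the insertion (K+) and deletion (K-) contexts.
    `model(img)` is assumed to return (label, confidence); `ranking` is a
    list of pixel indices sorted by responsibility, high to low."""
    label, conf = model(image)
    tau = conf * delta

    k_plus = np.full_like(image, baseline)   # start from the baseline image
    k_minus = image.copy()                   # start from the original image

    for p in ranking:
        k_plus[p] = image[p]                 # insert pixel into K+
        k_minus[p] = baseline                # occlude pixel in K-
        lp, cp = model(k_plus)
        lm, _ = model(k_minus)
        if lp == label and lm != label and cp >= tau:
            return k_plus.copy(), k_minus.copy(), ranking
    return None                              # no such explanation found
```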
Algorithm 2 Adjustment Discovery$(I,\mathcal{N},\mathcal{R},e)$

INPUT:   an image $I$, a model $\mathcal{N}$, a responsibility landscape $\mathcal{R}$, a contrastive explanation $e$

OUTPUT:   a set of adjustment pixels $a$

1: $c\leftarrow\sigma(\mathcal{N}(I))$ (model confidence)
2: $\mathit{pixel\_ranking}\leftarrow$ pixels from $\mathcal{R}$ ordered from low to high
3: $a\leftarrow\emptyset$
4: for each pixel $p_i\in\mathit{pixel\_ranking}$ do
5:   $a\leftarrow a\cup\{p_i\}$
6:   $e\leftarrow e\cup a$
7:   $I'\leftarrow$ mask the pixels of $I$ that are not in $e$
8:   if $\mathit{confidence}(\mathcal{N}(I'))=c$ then
9:     return $a$
10:  end if
11: end for

Algorithm 2 details the procedure for extracting the adjustment pixels, given an image and a $\delta$-sufficient-contrastive explanation. In practice, the equality check on line 8 is difficult to achieve exactly. Our implementation allows for a user-provided degree of precision; in our experiments, we set this to 4 decimal places.
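Under the same assumptions as the previous sketch, the adjustment-pixel search of Algorithm 2 can be approximated as follows; `atol` plays the role of the user-provided precision mentioned above.

```python
import numpy as np

def adjustment_pixels(image, ranking, model, baseline, explanation_mask, atol=1e-4):
    """Add the least-responsible pixels back until the masked image reaches
    the original confidence (up to `atol`). `explanation_mask` is a boolean
    array marking the pixels of the sufficient-contrastive explanation."""
    _, target_conf = model(image)
    keep = explanation_mask.copy()
    adjustment = []

    for p in reversed(ranking):              # lowest responsibility first
        if keep[p]:
            continue
        keep[p] = True
        adjustment.append(p)
        masked = np.where(keep, image, baseline)
        _, conf = model(masked)
        if abs(conf - target_conf) <= atol:
            return adjustment
    return adjustment                        # worst case: all pixels added
```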

[Figure 4: distributions of Sufficient, Contrastive, and Adjustment Pixels (0–40000) for MobileNet, ResNet50, and Swin_t on ImageNet-1K.]
Figure 4: Results on ImageNet for $\delta=1.0$. Outliers have been removed for clarity. Both Swin_t and ResNet50 have very low requirements for sufficiency compared to MobileNet. ResNet50 also has the lowest requirement for contrast with confidence at least as great as the original image.
Figure 5: A contrastive explanation with a low ImageNet distance. This colobus monkey has guenon monkey as its contrast classification. The model is clearly relying on the muzzle or snout to refine its classification to colobus.

5 Experimental Results

[Figure 6: number of images vs. shortest path (ImageNet-1K, PascalVOC, ECSSD) for MobileNet, Swin_t, and ResNet50.]
Figure 6: Shortest path between the original classification and its inverse in the ImageNet hierarchy over 3 different datasets. There is remarkable similarity across all three models, a similarity which is consistent over the different datasets.

In this section, we present an analysis of various models and datasets viewed through the lens of contrast and completeness. To the best of our knowledge, there are no other tools that compute sufficient, contrastive and complete explanations for image classifiers. As shown in Section 3.2, these causal explanations have the same formal properties as logic-based explanations, but are also efficiently computable. Our algorithms have been included in the publicly available XAI tool rex (https://github.com/ReX-XAI/ReX). rex was used with default parameters for all experiments; in particular, the random seed was 0, the baseline value 0, and the number of iterations 20.

To the best of our knowledge, no one has previously investigated the relationship of original classifications to contrastive explanations. Due to the hierarchical nature of the ImageNet dataset (it forms a tree), we can calculate the shortest path between the original classification and its contrastive class. We do the same with the completeness requirements, isolating the adjustment pixels, classifying them, and calculating the shortest path to the original classification. For reasons of space, we include only a representative selection of results here. Complete results may be found in the supplementary material.
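One way to reproduce this measurement (a sketch under our own assumptions; the paper does not specify its exact tooling) is to map each ImageNet-1K class to its WordNet synset via its wnid and use NLTK’s shortest-path distance. The `wnid_for_class` lookup and its two example entries are assumptions for illustration, and recent NLTK with the WordNet corpus downloaded is required.

```python
from nltk.corpus import wordnet as wn   # requires nltk.download('wordnet')

# Assumed lookup from ImageNet-1K class index to WordNet ID (wnid);
# the two entries below are illustrative examples, not the paper's data.
wnid_for_class = {301: "n02165456", 304: "n02169497"}   # ladybug, leaf beetle

def synset(wnid):
    """WordNet synset for an ImageNet wnid such as 'n02165456'."""
    return wn.synset_from_pos_and_offset(wnid[0], int(wnid[1:]))

def class_distance(class_a, class_b):
    """Shortest path between two ImageNet classes in the WordNet hierarchy."""
    a = synset(wnid_for_class[class_a])
    b = synset(wnid_for_class[class_b])
    return a.shortest_path_distance(b)

print(class_distance(301, 304))   # small distance: both classes are beetles
```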

We examine 3 models, all from TorchVision: a ResNet50, a MobileNet, and a Swin_t. All models were used with their default preprocessing procedures as provided by TorchVision. We also used 3 different datasets: the ImageNet-1K validation set (approximately 4000 images) (Russakovsky et al. 2015), PascalVOC 2012 (Everingham et al. 2012), and a dataset of ‘complex’ images, ECSSD (Shi et al. 2016). We chose these datasets because they are publicly available and well-studied. On an NVIDIA A100 GPU running Ubuntu LTS 22.04, we found that runtime varied greatly depending on the model under test. The ResNet50 and MobileNet were both very quick, taking approximately 6 seconds per image. The Swin_t model was slower, taking approximately 16 seconds per image. Our implementation makes extensive use of batched inference for efficiency, so models which do not accept batches will likely be slower.

Figure 4 shows the relative sizes of the sufficient, contrastive and adjustment pixel sets for the 3 different models on ImageNet. These are $\delta$-explanations with $\delta=1$. See the supplementary material for other models and settings of $\delta$. In general, ResNet50 requires the fewest pixels for both sufficiency and contrast, and also has a very small adjustment pixel set. MobileNet and Swin_t appear to be much more similar in their behavior, though Swin_t has slightly greater contrastive requirements in general.

Figure 6 shows the shortest path between the original classification and its contrast class, according to the ImageNet hierarchy. In general, across all models, the distance between the two classes is not great, given that the maximum distance is 24. This is not always the case, however: Figure 3(a) shows an example of a (mis-)classification where the contrast classification is ‘moped’. The adjustment pixels are classified as ‘picket fence’. It is worth noting that the initial confidence was already low on this image. At the other extreme, Figure 6 reveals a few cases where the distance between the original classification and its contrast was very small. Manual inspection reveals that these cases represent refinements inside a larger class. Figure 5, for example, reveals that the ResNet50 model required the highlighted pixels to refine the classification to colobus monkey. Without them, the classification is still monkey, but a different subclass – guenon.

6 Conclusions

We have demonstrated the concordances between logic-based and causal explanations. We have also introduced definitions for causal contrastive explanations and causal complete explanations. We have created algorithms for approximating these definitions and incorporated them into the tool rex. We have computed contrastive and complete explanations on 3 different datasets with 3 different models. We find that different models have different sufficiency, contrastive and adjustment requirements. Finally, we have examined the relationship between the original, contrastive and adjustment pixel predictions.

References

  • Bach et al. (2015) Bach, S.; Binder, A.; Montavon, G.; Klauschen, F.; Müller, K.-R.; and Samek, W. 2015. On Pixel-wise Explanations for Non-linear Classifier Decisions by Layer-wise Relevance Propagation. PLOS One, 10(7).
  • Beckers (2021) Beckers, S. 2021. Causal sufficiency and actual causation. Journal of Philosophical Logic, 50: 1341–1374.
  • Blake et al. (2023) Blake, N.; Chockler, H.; Kelly, D. A.; Pena, S. C.; and Chanchal, A. 2023. MRxaI: Black-Box Explainability for Image Classifiers in a Medical Setting. arXiv preprint arXiv:2311.14471.
  • Chockler and Halpern (2004) Chockler, H.; and Halpern, J. Y. 2004. Responsibility and Blame: A Structural-Model Approach. J. Artif. Intell. Res., 22: 93–115.
  • Chockler and Halpern (2024) Chockler, H.; and Halpern, J. Y. 2024. Explaining Image Classifiers. In Proceedings of the 21st International Conference on Principles of Knowledge Representation and Reasoning, KR.
  • Chockler, Kelly, and Kroening (2025) Chockler, H.; Kelly, D. A.; and Kroening, D. 2025. Multiple Different Explanations for Image Classifiers. In ECAI European Conference on Artificial Intelligence.
  • Chockler et al. (2024) Chockler, H.; Kelly, D. A.; Kroening, D.; and Sun, Y. 2024. Causal Explanations for Image Classifiers. arXiv preprint arXiv:2411.08875.
  • Everingham et al. (2012) Everingham, M.; Van Gool, L.; Williams, C. K. I.; Winn, J.; and Zisserman, A. 2012. The PASCAL Visual Object Classes Challenge 2012 (VOC2012) Results. http://www.pascal-network.org/challenges/VOC/voc2012/workshop/index.html.
  • Glymour and Wimberly (2007) Glymour, C.; and Wimberly, F. 2007. Actual causes and thought experiments. In Campbell, J.; O’Rourke, M.; and Silverstein, H., eds., Causation and Explanation, 43–67. Cambridge, MA: MIT Press.
  • Griffin, Holub, and Perona (2022) Griffin, G.; Holub, A.; and Perona, P. 2022. Caltech 256.
  • Hall (2007) Hall, N. 2007. Structural equations and causation. Philosophical Studies, 132: 109–136.
  • Halpern (2019) Halpern, J. Y. 2019. Actual Causality. The MIT Press.
  • Halpern and Pearl (2005) Halpern, J. Y.; and Pearl, J. 2005. Causes and explanations: a structural-model approach. Part I: causes. British Journal for Philosophy of Science, 56(4): 843–887.
  • Hitchcock (2001) Hitchcock, C. 2001. The intransitivity of causation revealed in equations and graphs. Journal of Philosophy, XCVIII(6): 273–299.
  • Hitchcock (2007) Hitchcock, C. 2007. Prevention, preemption, and the principle of sufficient reason. Philosophical Review, 116: 495–532.
  • Ignatiev et al. (2020) Ignatiev, A.; Narodytska, N.; Asher, N.; and Marques-Silva, J. 2020. On Relating ’Why?’ and ’Why Not?’ Explanations. CoRR, abs/2012.11067.
  • Ignatiev, Narodytska, and Marques-Silva (2019) Ignatiev, A.; Narodytska, N.; and Marques-Silva, J. 2019. Abduction-based explanations for Machine Learning models. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, AAAI’19/IAAI’19/EAAI’19. AAAI Press. ISBN 978-1-57735-809-1.
  • Izza et al. (2024) Izza, Y.; Ignatiev, A.; Stuckey, P. J.; and Marques-Silva, J. 2024. Delivering Inflated Explanations. Proceedings of the AAAI Conference on Artificial Intelligence, 38(11): 12744–12753.
  • Kelly, Chanchal, and Blake (2025) Kelly, D. A.; Chanchal, A.; and Blake, N. 2025. I Am Big, You Are Little; I Am Right, You Are Wrong. In IEEE/CVF International Conference on Computer Vision, ICCV. IEEE.
  • Kindermans et al. (2022) Kindermans, P.-J.; Hooker, S.; Adebayo, J.; Alber, M.; Schütt, K. T.; Dähne, S.; Erhan, D.; and Kim, B. 2022. The (Un)reliability of Saliency Methods, 267–280. Berlin, Heidelberg: Springer-Verlag. ISBN 978-3-030-28953-9.
  • Lundberg and Lee (2017a) Lundberg, S. M.; and Lee, S.-I. 2017a. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems (NeurIPS), volume 30, 4765–4774.
  • Lundberg and Lee (2017b) Lundberg, S. M.; and Lee, S.-I. 2017b. A Unified Approach to Interpreting Model Predictions. In Guyon, I.; Luxburg, U. V.; Bengio, S.; Wallach, H.; Fergus, R.; Vishwanathan, S.; and Garnett, R., eds., Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc.
  • Marques-Silva et al. (2021) Marques-Silva, J.; Gerspacher, T.; Cooper, M. C.; Ignatiev, A.; and Narodytska, N. 2021. Explanations for Monotonic Classifiers. In Meila, M.; and Zhang, T., eds., Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, 7469–7479. PMLR.
  • Miller (2019) Miller, T. 2019. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267: 1–38.
  • Papadimitriou and Yannakakis (1982) Papadimitriou, C. H.; and Yannakakis, M. 1982. The complexity of facets (and some facets of complexity). JCSS, 28(2): 244–259.
  • Petsiuk, Das, and Saenko (2018) Petsiuk, V.; Das, A.; and Saenko, K. 2018. RISE: Randomized Input Sampling for Explanation of Black-box Models. In British Machine Vision Conference (BMVC). BMVA Press.
  • Ribeiro, Singh, and Guestrin (2016) Ribeiro, M. T.; Singh, S.; and Guestrin, C. 2016. “Why Should I Trust You?” Explaining the Predictions of Any Classifier. In Knowledge Discovery and Data Mining (KDD), 1135–1144. ACM.
  • Russakovsky et al. (2015) Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; Berg, A. C.; and Fei-Fei, L. 2015. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3): 211–252.
  • Selvaraju et al. (2017) Selvaraju, R. R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; and Batra, D. 2017. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. In International Conference on Computer Vision (ICCV), 618–626. IEEE.
  • Shi et al. (2016) Shi, J.; Yan, Q.; Xu, L.; and Jia, J. 2016. Hierarchical Image Saliency Detection on Extended CSSD. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(4): 717–729.
  • Shih, Choi, and Darwiche (2018) Shih, A.; Choi, A.; and Darwiche, A. 2018. A symbolic approach to explaining Bayesian network classifiers. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI’18, 5103–5111. AAAI Press. ISBN 9780999241127.
  • Shrotri et al. (2022) Shrotri, A. A.; Narodytska, N.; Ignatiev, A.; Meel, K. S.; Marques-Silva, J.; and Vardi, M. Y. 2022. Constraint-Driven Explanations for Black-Box ML Models. Proceedings of the AAAI Conference on Artificial Intelligence, 36(8): 8304–8314.
  • Srinivas and Fleuret (2019) Srinivas, S.; and Fleuret, F. 2019. Full-gradient representation for neural network visualization. In Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook, NY, USA: Curran Associates Inc.
  • Weslake (2015) Weslake, B. 2015. A partial theory of actual causation. British Journal for the Philosophy of Science, To appear.
  • Woodward (2003) Woodward, J. 2003. Making Things Happen: A Theory of Causal Explanation. Oxford, U.K.: Oxford University Press.

Appendix A Code and Data

Our algorithms have been included in the open source XAI tool rex. rex is available at https://github.com/ReX-XAI/ReX. Complete data and analysis code can be found at the following anonymous link https://figshare.com/s/7d822f952abcbe54ca93.

Appendix B Proofs

All definition references are to the main paper, unless explicitly stated otherwise.

Lemma 5.

For image classifiers, causal explanations (Definition 3) are abductive (Definition 5).

Proof.

For an instance $(v,c)$, a set of variables $\mathcal{X}\subseteq\mathcal{F}$ and their values as defined by $v$ satisfy Definition 5 iff they satisfy EXIC1 in Definition 3. Moreover, $\mathcal{X}$ is subset-minimal in $\mathcal{F}$ iff it satisfies EXIC2 in Definition 3. Note that EXIC3 does not have an equivalent clause in the definition of abductive explanations, hence the other direction does not necessarily hold. ∎

Lemma 6.

Contrastive explanations (Definition 6) are equivalent to actual causes (Definition 1) in the same setting.

Proof.

Recall that the framework of logic-based explanations translates to causal models of depth 2 with all input variables being independent. An instance $(v,c)$ in Definition 6 exists iff, for the context $\vec{u}$ defined by $v$, $(M,\vec{u})\models(\mathcal{Y}=\vec{y})\wedge(O=c)$ (AC1), where $\vec{y}$ is the restriction of $v$ to the variables in $\mathcal{Y}$. In this type of causal model, $\vec{W}$ in AC2 is empty, hence Definition 6 holds for $\mathcal{Y}=\vec{y}$ iff AC2 holds. Finally, $\mathcal{Y}$ is minimal iff AC3 holds. ∎

Lemma 7.

A contrastive explanation always exists.

Proof.

Complete replacement of the pixels in an image $I$ is a contrastive explanation if no smaller subset is a contrastive explanation. ∎

This introduces a small complexity in the practical computation of contrastive explanations. It requires that the chosen baseline value does not have the same classification as the image $I$; otherwise the baseline value does not introduce a contrastive element. rex checks for this problem automatically.

The following result follows from Lemma 6 and the DP-completeness of actual causes in depth-2 causal models with independent inputs (Chockler et al. 2024). The class DP consists of those languages $L$ for which there exist a language $L_1$ in NP and a language $L_2$ in co-NP such that $L=L_1\cap L_2$ (Papadimitriou and Yannakakis 1982).

Lemma 8.

Given an input image, the decision problem of a contrastive explanation is DP-complete.

Proof.

The proof follows Chockler, Kelly, and Kroening (2025). ∎

Appendix C Experimental Results

[Figure 7: distributions of Sufficient, Necessary, and Complete Pixels (0–40000) for MobileNet, ResNet50, and Swin_t on ImageNet, PascalVOC, and ECSSD.]
Figure 7: Results on all 3 datasets for confidence threshold $\geq 1.0$. Outliers have been removed for clarity. Both Swin_t and ResNet50 have very low requirements for sufficiency compared to MobileNet. ResNet50 also has the lowest requirement for contrast with confidence at least as great as the original image.

Figure 7 shows the results across all 3 datasets and models with $\delta=1.0$. In general, all models follow a fairly similar pattern. It is interesting to note that MobileNet requires more pixels in general for a causal explanation, but also the lowest number of pixels for adjustment. This suggests that, for MobileNet at least, nearly all of the completeness requirement is already encoded in the contrastive explanation.

[Figure 8: number of images vs. shortest path between the original classification and the classification of the adjustment pixels (ImageNet-1K, PascalVOC, ECSSD) for MobileNet, Swin_t, and ResNet50.]
Figure 8: Shortest path between the original classification and its adjustment pixels in the ImageNet hierarchy over 3 different datasets. The distance between adjustment and target class on ImageNet-1K is obviously different from both PascalVOC and ECSSD.

Figure 8 shows the length of the shortest path between the original classification and the adjustment pixels.

Appendix D Different δ\delta

[Figure 9: distributions of Sufficient, Contrastive, and Adjustment Pixels (0–40000) for MobileNet, ResNet50, and Swin_t on ImageNet, PascalVOC, and ECSSD.]
Figure 9: Results on all 3 datasets for confidence threshold $\geq 0.5$. Outliers have been removed for clarity. Both Swin_t and ResNet50 have very low requirements for sufficiency compared to MobileNet. ResNet50 also has the lowest requirement for contrast with confidence at least as great as the original image.

Figure 9 shows results for all datasets when the $\delta$-confident explanation threshold is 0.5. If minimality is taken as a quality indicator, this setting of $\delta$ sees a general deterioration of quality. The adjustment pixel sets are larger in general across all models and datasets. This suggests that there is a trade-off between contrast and completeness: forcing the contrastive explanation to have a higher confidence reduces the size of the adjustment pixel set, whereas a lower $\delta$ leads to a smaller contrastive explanation and a larger adjustment set. Users should bear this in mind when deciding which aspect of a model’s behavior they wish to explore.

[Figure 10: number of images vs. shortest path (ImageNet-1K, PascalVOC, ECSSD) for MobileNet, Swin_t, and ResNet50, at $\delta=0.5$.]
Figure 10: Shortest path between the original classification and its inverse in the ImageNet hierarchy over 3 different datasets, with $\delta=0.5$. There is remarkable similarity across all three models, a similarity which is consistent over the different datasets.
[Figure 11: number of images vs. shortest path between the original classification and the classification of the adjustment pixels (ImageNet-1K, PascalVOC, ECSSD) for MobileNet, Swin_t, and ResNet50, at $\delta=0.5$.]
Figure 11: Shortest path between the original classification and its adjustment pixels in the ImageNet hierarchy over 3 different datasets, with $\delta=0.5$. The distance between adjustment and target class on ImageNet-1K is obviously different from both PascalVOC and ECSSD.

Figure 11 shows the distance in the ImageNet hierarchy across 3 datasets and 3 models, given $\delta=0.5$ for contrastive explanations, i.e., the contrastive explanation must contain at least 50% of the original confidence.

Appendix E Caltech-256

[Figure 12: distributions of Sufficient, Contrastive, and Adjustment Pixels (0–40000) for MobileNet, ResNet50, and Swin_t on 3 classes from Caltech-256.]
Figure 12: Results on 3 classes from Caltech-256. The general pattern here is the same as for the other datasets.

Figure 12 shows a small study on 3 different classes from Caltech-256 (Griffin, Holub, and Perona 2022). These are, in general, simple images compared to ImageNet. The general pattern seen in Figure 7 does not change.

[Figure 13: number of images vs. shortest path for MobileNet, Swin_t, and ResNet50 on a subset of Caltech-256.]
(a) Shortest path between the original classification and its contrast in the ImageNet hierarchy over a subset of Caltech-256. There is remarkable similarity across all three models, a similarity which is consistent over the different datasets.
(b) Shortest path between the original classification and its adjustment pixels in the ImageNet hierarchy over a subset of Caltech-256. There is a lot of similarity across all three models, a similarity which is consistent over the different datasets.
Figure 13: Contrast and adjustment distance on Caltech-256.

Figure 13 shows the contrast and adjustment distance in the ImageNet hierarchy on a small study of 3 different classes from Caltech-256. This follows a similar pattern to Figure 6 in the main paper.