
Rationally Inattentive Utility Maximization for Interpretable Deep Image Classification

Kunal Pattanayak and Vikram Krishnamurthy. V. Krishnamurthy and K. Pattanayak are with the School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853 USA. E-mail: vikramk@cornell.edu, kp87@cornell.edu.
Abstract

Are deep convolutional neural networks (CNNs) for image classification explainable by utility maximization with information acquisition costs? We demonstrate that deep CNNs behave equivalently (in terms of necessary and sufficient conditions) to rationally inattentive utility maximizers, a generative model used extensively in economics for human decision making. Our claim is based on extensive experiments on 200 deep CNNs from 5 popular architectures. The parameters of our interpretable model are computed efficiently via convex feasibility algorithms. As an application, we show that our economics-based interpretable model can predict the classification performance of deep CNNs trained with arbitrary parameters with accuracy exceeding 94%. This eliminates the need to re-train the deep CNNs for image classification. The theoretical foundation of our approach lies in Bayesian revealed preference, studied in micro-economics. All our results are on GitHub and completely reproducible.

Index Terms:
Interpretable Machine Learning, Bayesian Revealed preference, Rational Inattention, Deep Neural Networks, Image Classification

1 Introduction

This paper considers interpretable deep image classification. (This paper builds substantially on our existing arXiv preprint https://arxiv.org/abs/2102.04594, uploaded in January 2021.) We show that image classification using deep Convolutional Neural Networks (CNNs) can be interpreted as a generative human decision-making model developed in microeconomics.

In micro- and behavioral economics, a fundamental question relating to human decision making is: how to model attention spans in humans (agents)? (Micro-economics models the interaction of individual agents pursuing their private interests; behavioral economics models human decision making in terms of subjective probabilities via prospect theory and framing. In the rest of this paper, we use the term ‘agent’ to denote a human decision-maker.) The area of rational inattention [1, 2], pioneered by Nobel laureate Christopher Sims, models human attention in information-theoretic terms. The key hypothesis is that agents are “boundedly rational”: their perception of the environment is modeled as a Shannon capacity-limited channel. In simple terms, rational inattention assigns a mutual information cost to human attention spans.

Building on the rational inattention model, the next key concept is that of a Bayesian agent with rational inattention that maximizes its expected utility. Such models are studied extensively in [3, 4, 5]. The intuition is this: more attentive decisions yield a higher expected utility at the expense of a larger attention cost. Hence, the Bayesian agent optimally trades off between minimizing its sensing cost and maximizing its expected utility. An important question is: How to test for rationally inattentive utility maximization given the decisions of a Bayesian agent? In the last decade, necessary and sufficient conditions have been developed in the area of Bayesian revealed preference [6, 7] to test if the decisions of a Bayesian agent are consistent with rationally inattentive utility maximization.

Summary of Results.

The question we address is: Can the decisions of deep CNNs in image classification be explained by a rationally inattentive Bayesian utility maximizer?

This paper uses a data-driven Bayesian revealed preference approach to interpretable deep CNN image classification. Bayesian revealed preference performs a post-hoc analysis of agent decisions. It constructs a generative explanatory model for the agent decisions, parameterized by a utility function and an information acquisition cost. (A generative model is image-independent and hence provides a global explanation for deep image classification. In contrast, local approximation models for deep image classification are image-specific; they approximate model decisions via tractable functionals in a \delta-neighborhood of every input.) Our approach draws important parallels between human decision making and deep neural networks, namely that deep neural networks satisfy economics-based rationality.

By its very nature, Bayesian revealed preference reconstructs a set of feasible utility functions and information acquisition costs. Every element in the feasible set explains the deep CNN decisions equally well. The computed utility function induces a preference ordering on the set of image classes, that is, how much a deep CNN prioritizes accurate classification over inaccurate classification. The information acquisition cost abstracts the penalty incurred by the deep CNN to ‘learn’ an accurate latent feature representation. The key results in this paper are:

[Figure 1: schematic showing Input Image → Deep CNN → Predicted Image Class, equivalent via Theorem 1 to a Rationally Inattentive Utility Maximizer.]
Figure 1: Schematic illustration of rationally inattentive Bayesian utility maximization based interpretable image classification by deep CNNs. Theorem 1 establishes equivalence between the image classification behavior of a deep CNN and the decisions of a rationally inattentive maximizer. Hence, the deep CNN’s image classification behavior can be parsimoniously represented by a utility function and an information acquisition cost.
1. We show that the image classification decisions of deep CNNs satisfy the necessary and sufficient conditions for rationally inattentive utility maximization by a large margin. The margin by which the decisions satisfy these conditions is displayed in Table I. Hence, we establish that rationally inattentive utility maximization is a robust fit to deep image classification. This result is shown schematically in Fig. 1.

2. To aid visualization of our interpretable model, we provide a sparsity-enhanced decision test that computes the sparsest utility function and information acquisition cost which rationalize deep CNN decisions. The sparsest solution yields a parsimonious representation of the hundreds of thousands of layer weights of the deep CNNs in terms of a few hundred parameters. The utility function of the sparsest interpretable model also induces a useful preference ordering amongst the set of hypotheses (image labels) considered by the CNN; for example, how much additional priority is allocated to the classification of a cat as a cat compared to a cat as a dog. In classical deep learning, this preference ordering is not explicitly generated. The sparsity results for various deep CNN architectures are displayed in Table II and Fig. 2.

3. Our final result demonstrates the usefulness of our interpretable model. We show that the interpretable model computed from CNN decisions can predict the classification accuracy of a CNN trained with arbitrary parameters with accuracy exceeding 94%. This bypasses the need to re-train the deep CNN when its accuracy is observed for only a finite number of training parameters. The prediction results are displayed in Table III.

The above results are backed by experiments performed on several deep CNN architectures using the benchmark image dataset, namely, CIFAR-10 [8]. The first two results use deep CNN decisions aggregated over varying training epochs. The third (prediction) result uses deep CNN decisions trained on noisy image datasets parameterized by the noise variance.

Related Works

Since we study interpretable deep learning using behavioral and micro-economics, we briefly discuss related works in these areas.

Bayesian revealed preference and rational inattention. Estimating utility functions given a finite sequence of decisions and budget constraints is the central theme of revealed preference in micro-economics. The seminal works [9, 10] (see also [11]) give necessary and sufficient conditions for the existence of a utility function that rationalizes a finite time series of consumption bundles of a decision-maker. Rationally inattentive models for Bayesian decision making have been studied extensively in [3, 4, 5]. In the last decade, the area of Bayesian revealed preference [6, 7] has developed necessary and sufficient conditions to test for rationally inattentive Bayesian utility maximization.

Interpretable ML. Providing transparent models for de-obfuscating ‘black-box’ ML algorithms under the area of interpretable machine learning is a subject of extensive research [12, 13, 14]. Interpretable machine learning is defined in [15] as “the use of machine-learning models for the extraction of relevant knowledge about domain relationships contained in data”.

Since the literature is enormous, we only discuss a subset of works pertaining to interpretability of deep neural networks for image classification [16, 17, 18]. One prominent approach, namely, saliency maps, reconstructs the most preferred or typical image pertaining to each image class the deep neural network has learned [19, 20]. Related work includes creating hierarchical models for determining the importance of image features that determine its label [21]. In this paper, this feature importance is encoded into the utility function that parameterizes our interpretable model. Another approach seeks to provide local approximations to the trained model, local w.r.t the input image [22, 23]. In contrast, our generative interpretable model provides a global black-box approximation for deep image classification. A third approach approximates the decisions of the deep neural networks by a linear function of simplified individual image features [24, 25, 26, 23]. In contrast, our interpretable model fits a stochastic non-linear map that relates the true and predicted image labels. The parameters of the map are obtained by solving a convex feasibility problem parameterized by the deep CNN decisions. Finally, deep neural networks have also been modeled by Bayesian inference frameworks using probabilistic graphical methods [27].

To the best of our knowledge, an economics-based approach to the post-hoc analysis of deep neural networks has not been explored in the literature. However, we note that behavioral economics based interpretable models have been applied to domains outside interpretable machine learning, for example, in online finance platforms for efficient advertising [28, 29], in training neural networks [30] and, more recently, in YouTube to rationalize user commenting behavior [31]. Finally, due to our recent equivalence result [32], our behavioral economics approach to interpretable deep image classification can be related to classical revealed preference methods [9, 10] in microeconomics.

2 Bayesian Revealed preference with Rational Inattention

This section describes the key ideas behind Bayesian revealed preference. Despite the abstract formulation below, the reader should keep in mind the deep learning context. In Sec. 3, we will use Bayesian revealed preference theory to construct an interpretable deep learning representation by showing that deep CNNs are equivalent to rationally inattentive Bayesian utility maximizers.

2.1 Utility Maximization with Rational Inattention (UMRI)

Bayesian revealed preference aims to determine if the decisions of a Bayesian agent are consistent with expected utility maximization subject to a rational inattention sensing cost. We start by describing the utility maximization model with rational inattention (henceforth called UMRI) for a collection of Bayesian decision makers/agents.

Abstractly, the UMRI model is parameterized by the tuple

\Theta=(\mathcal{K},\mathcal{X},\mathcal{Y},\mathcal{A},\pi_{0},C,\{\alpha_{k},u_{k},k\in\mathcal{K}\}).   (1)

With respect to the abstract parametrization of the UMRI model for a collection of Bayesian agents, the following elements constitute the tuple \Theta defined in (1).
Agents: \mathcal{K}=\{1,2,\ldots,K\}~(K\geq 2) indexes the finite set of Bayesian agents.
State: \mathcal{X} is the finite set of ground truths with prior probability distribution \pi_{0}. In our image classification context, \mathcal{X}=\{1,2,\ldots,10\} is the set of image classes in the CIFAR-10 dataset and \pi_{0} is the empirical probability distribution of the image classes in the test dataset of CIFAR-10.
Observation and attention strategy: Agent k\in\mathcal{K} chooses attention strategy \alpha_{k}:\mathcal{X}\rightarrow\Delta(\mathcal{Y}), a stochastic mapping from \mathcal{X} to a finite set of observations \mathcal{Y}. Given state x and attention strategy \alpha_{k}, the agent samples observation y with probability \alpha_{k}(y|x). The agent then computes the posterior probability distribution p(x|y) via Bayes formula as

p(x|y)=\frac{\pi_{0}(x)\alpha_{k}(y|x)}{\sum_{x^{\prime}\in\mathcal{X}}\pi_{0}(x^{\prime})\alpha_{k}(y|x^{\prime})}.   (2)

The observation and attention strategy are latent variables that abstractly represent the learned feature representations in the deep image classification context. Bayesian revealed preference theory tests their existence via the convex feasibility test in Theorem 1 below.
Action: Agent k\in\mathcal{K} chooses action a from a finite set of actions \mathcal{A} after computing the posterior probability distribution p(x|y). In the image classification context, a is the image class predicted by the neural network; hence \mathcal{A}=\mathcal{X}.
Utility function: Agent k\in\mathcal{K} has a utility function u_{k}(x,a)\in\mathbb{R}^{+}, x\in\mathcal{X}, a\in\mathcal{A}, and aims to maximize its expected value, with the expectation taken w.r.t. the random state x and random observation y. A key feature of our approach is to show that the utility function rationalizes the decisions of the deep CNNs (made precise in Definition 1).
Information acquisition cost: The information acquisition cost C(\alpha,\pi_{0})\in\mathbb{R}^{+} depends on attention strategy \alpha and prior pmf \pi_{0}. It is the sensing cost the agent incurs in order to estimate the underlying state (2). In the machine learning context, C(\cdot) abstractly captures the ‘learning’ cost incurred while training the deep neural networks. In rational inattention theory from behavioral economics, a higher information acquisition cost is incurred for more accurate attention strategies (equivalently, more accurate state estimates (2) given observation y). We refer the reader to the influential work of [1, 2]. A toy numerical sketch of the posterior update (2) and of one canonical sensing cost follows these parameter descriptions.
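To make the abstract quantities above concrete, here is a minimal Python/NumPy sketch of the posterior update (2) together with one canonical rational-inattention sensing cost, namely scaled mutual information between state and observation [1, 2]. The toy prior, attention strategy and scaling lam are our own illustrative choices, not values from the paper.

```python
import numpy as np

# Hypothetical toy example: 3 states, 2 observations.
pi0 = np.array([0.5, 0.3, 0.2])          # prior pi_0(x)
alpha = np.array([[0.9, 0.1],            # attention strategy alpha(y|x), rows = x
                  [0.4, 0.6],
                  [0.2, 0.8]])

def posterior(pi0, alpha, y):
    """Posterior p(x|y) via Bayes rule (2)."""
    joint = pi0 * alpha[:, y]            # pi_0(x) * alpha(y|x)
    return joint / joint.sum()

def mutual_information_cost(pi0, alpha, lam=1.0):
    """One canonical choice of C(alpha, pi_0): lam * I(x; y), i.e. scaled mutual
    information -- a standard rational-inattention cost; the paper leaves C abstract."""
    joint = pi0[:, None] * alpha                      # p(x, y)
    p_y = joint.sum(axis=0)                           # marginal p(y)
    ratio = np.where(joint > 0, joint / (pi0[:, None] * p_y[None, :]), 1.0)
    return lam * np.sum(joint * np.log(ratio))

print(posterior(pi0, alpha, y=0))        # approx [0.74, 0.20, 0.07]
print(mutual_information_cost(pi0, alpha))
```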

Each Bayesian agent k\in\mathcal{K} aims to maximize its expected utility while minimizing its cost of information acquisition. Hence, the action a given observation y, and the attention strategy \alpha_{k}, are chosen as follows:

Definition 1 (Rationally Inattentive Utility Maximization).

Consider a collection of Bayesian agents \mathcal{K} parameterized by \Theta in (1) under the UMRI model. Then:
(a) Expected Utility Maximization: Given posterior probability distribution p(x|y), every agent k\in\mathcal{K} chooses action a that maximizes its expected utility. That is, with \mathbb{E} denoting mathematical expectation, the action a satisfies

a\in\operatorname*{argmax}_{a^{\prime}\in\mathcal{A}}~\mathbb{E}_{x}\{u_{k}(x,a^{\prime})|y\}=\sum_{x\in\mathcal{X}}p(x|y)u_{k}(x,a^{\prime}).   (3)

(b) Attention Strategy Rationality: For agent k, the attention strategy \alpha_{k} optimally trades off between maximizing the expected utility and minimizing the information acquisition cost:

\alpha_{k}\in\operatorname*{argmax}_{\alpha^{\prime}}\ \mathbb{E}_{y}\Big\{\max_{a\in\mathcal{A}}\mathbb{E}_{x}\{u_{k}(x,a)|y\}\Big\}-C(\alpha^{\prime},\pi_{0}).   (4)

Equations (3) and (4) in Definition 1 constitute a nested optimization problem. The lower-level optimization task is to choose the ‘best’ action for any observation y based on the computed posterior belief of the state. The upper-level optimization task is to sample the observations optimally by choosing the ‘best’ attention strategy.
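As a concrete illustration of the lower-level problem (3), the following sketch picks the expected-utility-maximizing action for a given posterior; the posterior and utility matrix below are hypothetical placeholders.

```python
import numpy as np

def best_action(posterior, u):
    """Solve (3): argmax_a sum_x p(x|y) u(x, a).

    posterior: length-|X| pmf p(.|y);  u: |X| x |A| utility matrix.
    """
    expected_utility = posterior @ u        # E_x{u(x, a) | y} for every action a
    return int(np.argmax(expected_utility))

# Toy usage with made-up numbers: 3 states, 2 actions.
p_xy = np.array([0.7, 0.2, 0.1])
u = np.array([[1.0, 0.1],
              [0.2, 0.9],
              [0.1, 0.8]])
print(best_action(p_xy, u))                 # action 0 is optimal here
```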

Remark. The multiple Bayesian agents in \Theta have the same state space \mathcal{X}, observation space \mathcal{Y}, action space \mathcal{A}, prior \pi_{0} and cost of information acquisition C, and differ only in their utility functions. Bayesian revealed preference theory relies on this crucial constraint on the optimization variables in (3), (4) for detecting optimal behavior in a finite number of agents.

2.2 Bayesian Revealed Preference (BRP) Test for Rationally Inattentive Utility Maximization

Having described the UMRI model (a collection of rationally inattentive utility maximizers), we are now ready to state our key result. Theorem 1 below says that the decisions of a collection of Bayesian agents are rationalized by a UMRI tuple \Theta if and only if a set of convex inequalities has a feasible solution. These inequalities comprise our Bayesian Revealed Preference (henceforth called BRP) test for rationally inattentive utility maximization.

For notational convenience, the decisions of the Bayesian agents in the UMRI model are collected into the dataset \mathbb{D} defined as:

\mathbb{D}=\{\pi_{0},\ p_{k}(a|x),\ x\in\mathcal{X},\ a\in\mathcal{A},\ k\in\mathcal{K}\}.   (5)

In (5), \pi_{0}\in\Delta^{|\mathcal{X}|-1} denotes the prior pmf over the set of states \mathcal{X} in \Theta (1). The variable p_{k}(a|x) is the conditional probability that agent k\in\mathcal{K}=\{1,2,\ldots,K\} takes action a given state x. \mathbb{D} characterizes the input-output behavior of the collection of Bayesian agents and serves as the input to the BRP feasibility test described below.
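In the deep image classification experiments of Sec. 3, \mathbb{D} is assembled from each CNN's confusion matrix on the test set. A minimal sketch of this construction is given below; the function and variable names are our own, not those of the paper's repository.

```python
import numpy as np

def build_dataset(confusion_matrices):
    """Build the dataset D in (5) from K confusion matrices.

    confusion_matrices: list of K integer arrays; entry [x, a] counts test
    images of true class x that CNN k classified as class a.
    Returns the empirical prior pi_0 and the conditionals p_k(a|x).
    """
    counts = confusion_matrices[0].sum(axis=1)       # test images per true class
    pi0 = counts / counts.sum()                      # empirical prior pi_0(x)
    p_cond = [cm / cm.sum(axis=1, keepdims=True)     # rows of p_k(a|x) sum to 1
              for cm in confusion_matrices]
    return pi0, p_cond
```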

Theorem 1 (BRP Test for Rationally Inattentive Utility Maximization [7]).

Consider the dataset \mathbb{D} (5) obtained from a collection of Bayesian agents \mathcal{K}. Then:
1. Existence: There exists a UMRI tuple \Theta(\mathbb{D}) (1) that rationalizes dataset \mathbb{D} if and only if there exists a feasible solution to the set of convex inequalities

\text{BRP}(\mathbb{D},\{u_{k},c_{k}\}_{k=1}^{K})\leq\mathbf{0},\quad u_{k}\in\mathbb{R}_{+}^{|\mathcal{X}|\times|\mathcal{A}|},~c_{k}>0.   (6)

In (6), BRP(\cdot) denotes a set of inequalities, convex in the variables \{u_{k},c_{k}\}_{k=1}^{K}, stated in Algorithm 1.
2. Reconstruction: Given a feasible solution \{u_{k},c_{k}\}_{k=1}^{K} to \text{BRP}(\mathbb{D},\cdot), u_{k} is the k^{\text{th}} Bayesian agent's utility function in the feasible model tuple \Theta(\mathbb{D}). The set of observations is \mathcal{Y}=\mathcal{A}, the set of actions in \mathbb{D}. The feasible cost of information acquisition C in \Theta(\mathbb{D}) is defined in terms of c_{k} as:

C(\alpha)=\max_{k\in\mathcal{K}}\ c_{k}+\sum_{a}\max_{b\in\mathcal{A}}\sum_{x}p(x,a)u_{k}(x,b)-\sum_{x,a}p_{k}(x,a)u_{k}(x,a),\quad\alpha=\{p(a|x)\}.   (7)

The proof of Theorem 1 is in Appendix A.1. Before launching into a detailed discussion, we stress the “iff” in Theorem 1. Put simply: if the inequalities in (6) are not feasible, then the Bayesian agents that generate the dataset \mathbb{D} are not rationally inattentive utility maximizers. If (6) has a feasible solution, then there exists a reconstructable family of viable utility functions and information acquisition costs that rationalize \mathbb{D}. (In terms of interpretable deep learning, of all parameters in the UMRI tuple, we are interested only in the utility functions of the agents and the cost of information acquisition, since the remaining parameters are deduced immediately from the decision dataset \mathbb{D}.) A key feature of Theorem 1 is that the estimated utilities (and information costs) are set-valued; every utility and cost function in the feasible set explains \mathbb{D} equally well. The estimated UMRI model parameters are set-valued due to the finite number of Bayesian agents whose decisions constitute the dataset \mathbb{D}. The estimated parameter set converges to a point if and only if inequality (6) holds as |\mathcal{K}|\rightarrow\infty.
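Once a feasible solution \{u_{k},c_{k}\} is in hand, the reconstructed information acquisition cost (7) can be evaluated directly for any candidate attention strategy \alpha=\{p(a|x)\}. A sketch follows; the helper below is our own and assumes the dataset conventions of (5).

```python
import numpy as np

def reconstructed_cost(alpha, pi0, p_cond, u_feas, c_feas):
    """Evaluate C(alpha) in (7) from a feasible BRP solution.

    alpha: candidate attention strategy p(a|x), shape (X, A);
    p_cond: the K agents' conditionals p_k(a|x);
    u_feas, c_feas: a feasible solution {u_k, c_k} of Algorithm 1.
    """
    joint = pi0[:, None] * alpha                       # p(x, a) under alpha
    A = alpha.shape[1]
    values = []
    for p_k, u_k, c_k in zip(p_cond, u_feas, c_feas):
        joint_k = pi0[:, None] * p_k                   # p_k(x, a)
        gain = sum((joint[:, a] @ u_k).max() for a in range(A))   # sum_a max_b term
        values.append(c_k + gain - np.sum(joint_k * u_k))
    return max(values)                                 # max over agents k
```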

Computational Aspects of the BRP Test. Suppose the dataset \mathbb{D} is obtained from K Bayesian agents. Then BRP(\mathbb{D}) comprises a feasibility test with K(|\mathcal{X}||\mathcal{A}|+1) free variables and K^{2}+K(|\mathcal{A}|^{2}-|\mathcal{A}|-1) convex inequalities. Thus, the number of free variables and inequalities in the BRP feasibility test scale linearly and quadratically, respectively, with the number of observed Bayesian agents.
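For instance, in the experimental setup of Sec. 3 with K=20 CNNs and |\mathcal{X}|=|\mathcal{A}|=10 CIFAR-10 classes, the BRP test has 20\times(10\times 10+1)=2020 free variables and 20^{2}+20\times(10^{2}-10-1)=400+1780=2180 convex inequalities.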

Input: Dataset \mathbb{D}=\{\pi_{0},p_{k}(a|x),\ x\in\mathcal{X},\ a\in\mathcal{A},\ k\in\mathcal{K}\} from a collection of Bayesian agents \mathcal{K}.
Find: Positive reals c_{k} and u_{k}(x,a)\in(0,1] for all x\in\mathcal{X}, a\in\mathcal{A}, k\in\mathcal{K} that satisfy the following inequalities:

NIAS: \sum_{x}p_{k}(x|a)\,(u_{k}(x,b)-u_{k}(x,a))\leq 0,\quad\forall a,b\in\mathcal{A},\ k\in\mathcal{K},   (8)

NIAC: \sum_{a}\Big(\max_{b}\sum_{x}p_{j}(x,a)u_{k}(x,b)\Big)-c_{j}-\sum_{x,a}p_{k}(x,a)u_{k}(x,a)+c_{k}\leq 0,\quad\forall j,k\in\mathcal{K},   (9)

where p_{k}(x,a)=\pi_{0}(x)p_{k}(a|x) and p_{k}(x|a)=p_{k}(x,a)/\sum_{x^{\prime}}p_{k}(x^{\prime},a).
Return: Set of feasible utility functions u_{k} and information acquisition costs c_{k} incurred by agents k\in\mathcal{K}.
Algorithm 1 BRP Convex Feasibility Test of Theorem 1
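For readers who want to run Algorithm 1, the following is a minimal sketch of the NIAS/NIAC feasibility test as a CVXPY program. It is our own illustrative implementation of (8), (9), not the authors' released code; it assumes every agent takes every action with positive probability so that the posteriors p_k(x|a) are well defined.

```python
import cvxpy as cp
import numpy as np

def brp_test(pi0, p_cond):
    """NIAS/NIAC convex feasibility test of Algorithm 1.

    pi0: (X,) prior; p_cond: list of K arrays p_k(a|x) of shape (X, A).
    Returns a feasible ({u_k}, {c_k}) if the dataset passes, else None.
    """
    K = len(p_cond)
    X, A = p_cond[0].shape
    u = [cp.Variable((X, A), nonneg=True) for _ in range(K)]   # utilities in (0, 1], relaxed to [0, 1]
    c = cp.Variable(K, nonneg=True)                            # information acquisition costs
    constraints = [uk <= 1 for uk in u]

    joint = [pi0[:, None] * p for p in p_cond]                 # p_k(x, a)
    post = [j / j.sum(axis=0, keepdims=True) for j in joint]   # p_k(x | a)

    # NIAS (8): no agent gains by switching its action after seeing the posterior.
    for k in range(K):
        for a in range(A):
            for b in range(A):
                constraints.append(
                    post[k][:, a] @ (u[k][:, b] - u[k][:, a]) <= 0)

    # NIAC (9): no agent gains by swapping to another agent's attention strategy.
    for k in range(K):
        own = cp.sum(cp.multiply(joint[k], u[k]))
        for j in range(K):
            swapped = cp.sum(cp.hstack(
                [cp.max(joint[j][:, a] @ u[k]) for a in range(A)]))
            constraints.append(swapped - c[j] - own + c[k] <= 0)

    problem = cp.Problem(cp.Minimize(0), constraints)          # pure feasibility
    problem.solve()
    if problem.status in ("optimal", "optimal_inaccurate"):
        return [uk.value for uk in u], c.value
    return None
```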

2.3 Relating UMRI and BRP test to Interpretable Deep Image Classification

We now discuss how the above BRP test relates to interpretable image classification using deep CNNs. The BRP convex feasibility test in Theorem 1 comprises two sets of inequalities, namely, the NIAS (No-Improving-Action-Switches) inequalities (8) and the NIAC (No-Improving-Action-Cycles) inequalities (9) in Algorithm 1. NIAS ensures that each agent takes the best action given a posterior pmf. NIAC ensures that every agent chooses the best attention strategy. The BRP test checks if there exist K utility functions and K positive reals that, together with \mathbb{D}, satisfy the NIAS and NIAC inequalities.

Toy Example with 2 CNNs

The following discussion gives additional insight into our approach. Consider the simplest case involving two trained deep CNNs N_{1} and N_{2}; so \mathcal{K}=\{1,2\} in the above notation. Assume N_{1} and N_{2} have the same network architecture. Suppose an analyst observes that N_{1} makes accurate decisions on a rich input image dataset while N_{2} makes less accurate decisions on the same dataset.

Our UMRI model first abstracts the accuracy of the feature representations of the input image data learned by N_{1} and N_{2} via attention strategies \alpha_{1} and \alpha_{2} in (4). Second, the information acquisition cost function C(\cdot) abstracts the computational resources expended in learning the representations. The rationale is that learning an accurate latent feature representation is costly, and this cost is abstracted by the information acquisition cost.

Let the training costs incurred by N_{1} and N_{2} be C(\alpha_{1}) and C(\alpha_{2}), respectively. If the decisions of N_{1} and N_{2} can be explained by the UMRI model (Theorem 1 gives necessary and sufficient conditions for this), then there exist utility functions u_{1} and u_{2} for N_{1} and N_{2} that satisfy:

\mathbb{E}_{\alpha_{i}}\{u_{i}\}-C(\alpha_{i})\geq\mathbb{E}_{\alpha_{j}}\{u_{i}\}-C(\alpha_{j}),\quad i,j\in\{1,2\}.   (10)

The above inequality says that CNNs N_{1} and N_{2} would be worse off (in an expected utility sense) if they made decisions by swapping each other's learned representations. That is, both N_{1} and N_{2} learn the ‘best’ feature representation of the input images given their training parameters.

Discussion

(i) Parsimonious interpretable representation of deep CNNs. In the deep image classification context, due to the UMRI model's parsimonious parametrization in (1), the decisions of K CNNs can be rationalized by just K utility functions and an information acquisition cost function, thereby bypassing the need for several million parameters to describe the deep CNNs.
(ii) Identifiability. The BRP feasibility test requires the dataset \mathbb{D} to be generated from K>2 Bayesian agents. If K=1, then (6) holds trivially since any information acquisition cost satisfies the convex inequalities of BRP. Another intuitive way of motivating a collection of agents for the BRP test is as follows. Reconstructing a feasible UMRI model tuple \Theta that rationalizes the decisions of the deep CNNs is analogous to fitting a line to a finite number of points. One can fit infinitely many lines through a single point; the task becomes non-trivial only when the number of points exceeds 2. In the Bayesian revealed preference context, the points correspond to the decisions of each Bayesian agent, and the slope and intercept of the fitted line correspond to the utility function and cost of information acquisition that rationalize the agent decisions.
(iii) Relative optimality implies global optimality. In the setting involving K>2 deep CNNs (agents), the NIAS and NIAC inequalities of the BRP test check for relative optimality: given utility function u_{k}, does deep CNN k perform at least as well as any other observed deep CNN in \mathcal{K}\backslash\{k\}? Clearly, testing for relative optimality is weaker than testing for global optimality (4), which ideally requires access to decisions from an infinite number of deep CNNs. Setting the cost of information acquisition as a free variable bridges this gap. The proof of Theorem 1 shows that if the deep CNN decisions satisfy relative optimality, then there exists a cost of information acquisition such that the decisions are globally optimal. That is, Theorem 1 ensures relative optimality is sufficient for global optimality.
(iv) Generalization of [7]. Theorem 1 generalizes [7, Theorem 1] in two ways. First, in [7] the utilities u_{k} in the UMRI model tuple \Theta are assumed known and only the information acquisition costs c_{k} are estimated, whereas Theorem 1 estimates both. Second, the expression for the reconstructed model tuple \Theta(\mathbb{D}) is novel; the discussion in [7] is confined to the existence of such a tuple.
(v) Single-utility UMRI (S-UMRI). In Appendix A.2, we propose a sparse version of UMRI, namely, the S-UMRI model in (21). The key distinction of this model is that all agents share the same utility function u and can thus be represented with substantially fewer parameters. In complete analogy to Theorem 1, Theorem 3 states necessary and sufficient conditions for agent decisions to be consistent with the S-UMRI model of rationally inattentive utility maximization. We discuss this sparse parametrization in the appendix so as not to interrupt the flow of the main text.
(vi) Degenerate solution to the BRP and S-BRP tests. The degenerate utility function of all zeros and cost of information acquisition C=0 trivially satisfy the BRP and S-BRP tests and lie at the boundary of the feasible set of parameters.

Summary

This section formulated an economics-based decision-making model. Since this model may not be familiar to a machine learning reader, we summarize the main ideas. We introduced the rationally inattentive utility maximization model, namely, the UMRI model for a collection of Bayesian agents (decision makers). Our main result Theorem 1 outlines a decision test BRP for rationally inattentive utility maximization given decisions from a collection of agents. This BRP test comprises a set of convex inequalities that have a feasible solution if and only if the collection of agents are rationally inattentive utility maximizers. Theorem 1 also provides an explicit reconstruction of the feasible UMRI model parameters that rationalize input agent decisions. The set of feasible utility functions and information acquisition costs thus parsimoniously explain the decisions generated by the Bayesian agents. In Appendix A.2, we propose a single utility version of the UMRI model with fewer parameters. Due to fewer parameters, the decision test for this sparse model, given in Theorem 3, is computationally less expensive yet more restrictive than the BRP test for rationality in Theorem 1.

The rest of the paper focuses on computing interpretable UMRI models that rationalize deep CNN decisions. We will investigate through extensive experiments how well the UMRI model fits the deep CNN decisions via robustness tests. We will also investigate how well the computed interpretable models, namely UMRI and S-UMRI, predict the deep CNNs' decisions when the training parameters are varied.

3 Bayesian Revealed Preference explains CIFAR-10 Image Classification by Deep CNNs

The experimental results in this section are divided into two parts: First, we show that the deep CNN decisions pass the BRP and S-BRP tests formulated in Theorems 1 and 3 by a large margin. This implies that the rationally inattentive utility maximization model is a robust fit to the deep CNN decisions.

Our second result demonstrates an application of the reconstructed interpretable model. Training datasets are often noisy. We show that in such a noisy setting, the reconstructed interpretable model from Theorem 1 can accurately predict (with accuracy exceeding 94%) the image classification performance of the deep CNNs. This bypasses the need to train the deep CNN for various noise variances that corrupt the training dataset.

Experimental Setup: Deep CNN Architectures, Training Parameters and Construction of Dataset

Image Dataset. In our experiments, we trained and validated the deep CNNs using the CIFAR-10 benchmark image dataset [8]. This public dataset consists of 60000 32x32 colour images in 10 distinct classes (for example, airplane, automobile, ship, cat, dog), with 6000 images per class. There are 50000 training images and 10000 test images. We use the terms image classes and image labels interchangeably. (Our experiments are confined to the CIFAR-10 dataset for clarity of exposition. Our approach to interpretable deep learning extends readily to richer benchmark image datasets like ImageNet and CIFAR-100, which comprise over 100 image labels.)

Network Architecture and Training Parameters. We use 5 well-known deep CNN architectures in our experiments: 1. LeNet [33], 2. AlexNet [34], 3. VGG16 [35], 4. ResNet-50 [18], 5. Network-in-Network (NiN) [36]. The deep CNNs are trained and validated on the CIFAR-10 image dataset using 3 learning rate schedules, namely, L.R. 1, L.R. 2 and L.R. 3. All 3 schedules use the RMSprop optimizer [37] with the decay parameter set to 10^{-6}, the maximum number of training epochs (full passes of the training dataset) set to 200, and the initial step size set to 0.01. The step size is halved every 20, 30 and 40 epochs for L.R. 1, 2 and 3, respectively.

Relation to Bayesian revealed preference. We now relate the deep CNN setup to the Bayesian revealed preference framework in Sec. 2. For each CNN architecture, we use the decisions of K=20 CNNs, i.e., 20 Bayesian agents in the terminology of Sec. 2, for our BRP and S-BRP decision tests. The decisions of the K CNNs on the CIFAR-10 test image dataset are aggregated into dataset \mathbb{D} (5). The results of the decision tests are discussed below. In the deep image classification context, the parameter p_{k}(a|x) in (5) is the probability that the k^{\text{th}} deep CNN classifies an image from category x into category a in the CIFAR-10 test image dataset. The prior \pi_{0} in \mathbb{D} (5) is the empirical pmf over the set of image categories in the CIFAR-10 test dataset. Constructing \mathbb{D} from raw CNN decisions is discussed in Appendix A.3.

3.1 BRP and S-BRP Tests for deep CNN datasets. Results and Insights

Network Architecture | Learning Rate | \mathcal{R}_{\text{BRP}} (\times 10^{-4}) | \mathcal{R}_{\text{S-BRP}} (\times 10^{-4})
LeNet | L.R. 1 | 30.34 | 4.72
LeNet | L.R. 2 | 35.14 | 4.65
LeNet | L.R. 3 | 37.97 | 5.11
AlexNet | L.R. 1 | 32.10 | 3.21
AlexNet | L.R. 2 | 34.98 | 3.91
AlexNet | L.R. 3 | 40.60 | 4.62
VGG16 | L.R. 1 | 96.36 | 4.09
VGG16 | L.R. 2 | 107.4 | 4.02
VGG16 | L.R. 3 | 119.8 | 4.44
ResNet-50 | L.R. 1 | 126.2 | 2.82
ResNet-50 | L.R. 2 | 129.2 | 3.45
ResNet-50 | L.R. 3 | 132.3 | 3.83
Network-In-Network (NiN) | L.R. 1 | 108.3 | 3.59
Network-In-Network (NiN) | L.R. 2 | 132.1 | 3.36
Network-In-Network (NiN) | L.R. 3 | 149.1 | 5.57
TABLE I: How does increasing the number of degrees of freedom of the interpretable model improve the robustness of fit to the CNN decisions? We see that \mathcal{R}_{\text{BRP}} (11) is substantially higher (by an order of magnitude) than \mathcal{R}_{\text{S-BRP}} (12) for all CNN architectures. We conclude that the UMRI model fits CNN decisions substantially better than the S-UMRI model, but with larger computing cost for evaluating the parameters of the interpretable model. Thus, if there are no computational constraints, we recommend using the UMRI model for interpreting CNN decisions.

A. Robustness Results on Deep CNN datasets

Our first key result is that the image classifications of all 5 deep CNN architectures listed in Sec. 3 pass the BRP and S-BRP tests by a large margin. The results are tabulated in Table I. The robustness values \mathcal{R}_{\text{BRP}} and \mathcal{R}_{\text{S-BRP}} in Table I are defined in Definition 2 below, which formalizes the notion of margin for the decision tests.

Network Architecture | Learning Rate (L.R.) | airplane | auto | bird | cat | deer | dog | frog | horse | ship | truck
LeNet | L.R. 1 | 17.61 | 3.55 | 20.06 | 1.88 | 17.19 | 21.42 | 42.00 | 27.79 | 1.91 | 9.55
LeNet | L.R. 2 | 4.13 | 5.20 | 7.82 | 1.90 | 13.18 | 18.66 | 23.84 | 8.16 | 2.48 | 2.47
LeNet | L.R. 3 | 10.79 | 8.27 | 18.62 | 22.67 | 19.91 | 25.01 | 47.71 | 73.52 | 2.65 | 1.01
AlexNet | L.R. 1 | 210.78 | 41.84 | 49.77 | 59.71 | 51.24 | 68.31 | 83.94 | 211.61 | 60.43 | 125.73
AlexNet | L.R. 2 | 85.51 | 47.89 | 17.38 | 1.00 | 25.34 | 202.78 | 21.30 | 35.01 | 533.62 | 248.57
AlexNet | L.R. 3 | 18.00 | 49.55 | 58.25 | 28.31 | 135.54 | 29.24 | 224.91 | 214.51 | 8.29 | 264.20
VGG16 | L.R. 1 | 164.48 | 154.77 | 15.42 | 33.67 | 6.28 | 123.89 | 62.83 | 26.21 | 1.43 | 170.69
VGG16 | L.R. 2 | 88.73 | 154.10 | 45.63 | 297.61 | 131.08 | 136.52 | 57.34 | 229.80 | 145.99 | 11.90
VGG16 | L.R. 3 | 24.33 | 10.78 | 93.90 | 11.11 | 91.96 | 56.64 | 77.30 | 110.60 | 20.28 | 17.09
ResNet-50 | L.R. 1 | 50.83 | 17.55 | 16.09 | 4.66 | 17.92 | 3.67 | 4.92 | 3.95 | 15.46 | 4.88
ResNet-50 | L.R. 2 | 7.51 | 8.40 | 72.70 | 30.72 | 32.43 | 83.65 | 221.27 | 74.59 | 99.04 | 20.51
ResNet-50 | L.R. 3 | 14.61 | 367.59 | 31.61 | 9.20 | 16.35 | 11.58 | 41.44 | 243.95 | 222.67 | 483.91
Network-in-Network | L.R. 1 | 5.02 | 30.95 | 9.91 | 71.38 | 63.69 | 45.88 | 31.39 | 67.86 | 17.03 | 21.41
Network-in-Network | L.R. 2 | 40.17 | 60.32 | 4.40 | 55.67 | 95.02 | 88.72 | 91.15 | 15.98 | 176.75 | 10.27
Network-in-Network | L.R. 3 | 10.47 | 75.32 | 55.97 | 24.17 | 17.41 | 8.94 | 23.02 | 71.27 | 29.94 | 80.91
TABLE II: The utility function of the sparsest interpretable model is a diagonal matrix. The diagonal elements yield a natural preference ordering amongst the set of image classes (classification hypotheses). For example, consider the VGG16 architecture trained using learning rate 1 (the VGG16, L.R. 1 row of the table). The maximum utility is for trucks (170.69, last column) and the minimum is for ships (1.43, second-last column). This shows the sparsest interpretable model induces the following preference ordering for the VGG16 architecture: classifying trucks correctly is prioritized 100 times more than classifying ships. Such a preference ordering is not explicitly generated by a CNN.
Definition 2 (Robustness (Goodness-of-fit) of BRP and S-BRP Tests).

Given dataset \mathbb{D} (5) aggregated from a collection of Bayesian agents, \mathcal{R}_{\text{BRP}}(\mathbb{D}) and \mathcal{R}_{\text{S-BRP}}(\mathbb{D}) measure the largest perturbation so that \mathbb{D} passes the BRP and S-BRP decision tests:

\mathcal{R}_{\text{BRP}}(\mathbb{D})=\max_{\varepsilon>0}\frac{\varepsilon K}{\sum_{k=1}^{K}\|u_{k}\|_{2}^{2}},\quad\text{BRP}(\mathbb{D},\{u_{k},c_{k}\}_{k=1}^{K})\leq-\varepsilon.   (11)
\mathcal{R}_{\text{S-BRP}}(\mathbb{D})=\max_{\varepsilon>0}\frac{\varepsilon}{\|u\|_{2}^{2}},\quad\text{S-BRP}(\mathbb{D},u,\{c_{k},\lambda_{k}\}_{k=1}^{K})\leq-\varepsilon.   (12)

In Definition 2, the robustness values \mathcal{R}_{\text{BRP}} and \mathcal{R}_{\text{S-BRP}} measure, respectively, the smallest perturbation needed for \mathbb{D} to fail the BRP and S-BRP decision tests. Both \mathcal{R}_{\text{BRP}} and \mathcal{R}_{\text{S-BRP}} are normalized w.r.t. the row-wise \mathcal{L}_{2} norm of the feasible utility functions. Higher robustness values imply a better fit of the UMRI and S-UMRI models to the decision dataset. (The robustness value for the non-informative dataset of uniformly distributed pmfs is 0. Hence, the robustness value measures the informativeness of the attention strategies in \mathbb{D} relative to the uniform probability distribution.)
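One way to compute a robustness value of the form (11) is to maximize the common slack \varepsilon in the NIAS/NIAC inequalities and then normalize by the squared norms of the recovered utilities. The sketch below assumes a hypothetical helper brp_constraints(pi0, p_cond, u, c, eps) that returns the Algorithm 1 inequalities tightened to \leq-\varepsilon (with the trivial a=b and j=k cases skipped); the exact normalization used by the authors may differ.

```python
import cvxpy as cp
import numpy as np

def brp_robustness(pi0, p_cond, brp_constraints):
    """Heuristic estimate of R_BRP in (11): maximize the slack eps, then
    normalize by the squared L2 norms of the feasible utilities."""
    K = len(p_cond)
    X, A = p_cond[0].shape
    u = [cp.Variable((X, A), nonneg=True) for _ in range(K)]
    c = cp.Variable(K, nonneg=True)
    eps = cp.Variable(nonneg=True)
    # brp_constraints is a hypothetical helper encoding (8), (9) with slack eps.
    constraints = [uk <= 1 for uk in u] + brp_constraints(pi0, p_cond, u, c, eps)
    cp.Problem(cp.Maximize(eps), constraints).solve()
    norms = sum(float(np.sum(uk.value ** 2)) for uk in u)
    return float(eps.value) * K / norms
```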

Discussion and Insights. Robustness Results of Table I

(i) Deep CNN dataset: The deep CNN datasets used for the robustness tests (11), (12) comprise decisions of K=20 deep CNNs for every network architecture, where CNN k was trained for 10k training epochs, k=1,2,\ldots,K.
(ii) Comparison between \mathcal{R}_{\text{BRP}} and \mathcal{R}_{\text{S-BRP}} values for deep CNN datasets: The average value of \mathcal{R}_{\text{S-BRP}} (12) over all 3 learning rate schedules and 5 network architectures was found to be 4.09\times 10^{-4}. In contrast, the average value of \mathcal{R}_{\text{BRP}} (11) was found to be 87.45\times 10^{-4}, almost 20 times the average value of \mathcal{R}_{\text{S-BRP}}. This shows that the UMRI model fits deep CNN decisions substantially better than the S-UMRI model. The result is expected since S-UMRI is parameterized by far fewer variables than UMRI; hence the S-BRP test is more restrictive than the BRP test.
(iii) Sensitivity of \mathcal{R}_{\text{BRP}}, \mathcal{R}_{\text{S-BRP}} to network architecture: The average value of \mathcal{R}_{\text{BRP}} is 122.29\times 10^{-4} for the VGG16, ResNet-50 and NiN architectures, approximately 3.5 times the average value of 35.18\times 10^{-4} for the LeNet and AlexNet architectures (see Table I). The variation of \mathcal{R}_{\text{S-BRP}} with network architecture is negligible compared to that of \mathcal{R}_{\text{BRP}}. This shows the robustness test for the UMRI model is more sensitive to network architecture than that for the S-UMRI model.
(iv) Computational aspects of \mathcal{R}_{\text{BRP}} and \mathcal{R}_{\text{S-BRP}}: The computation time for \mathcal{R}_{\text{BRP}} is almost 30 times that for \mathcal{R}_{\text{S-BRP}}. This is expected since the UMRI model is parameterized by K utility functions compared to a single utility function in S-UMRI.

B. Sparsity-enhanced Interpretable Model

Our next task is to determine the sparsest possible interpretable model that satisfies the decision tests BRP and S-BRP. The motivation is threefold:

1. The sparsest interpretable model explains the deep CNN decisions using the fewest parameters.

2. The sparsest interpretable model induces a useful preference ordering amongst the set of hypotheses (image labels) considered by the CNN; for example, how much additional priority is allocated to the classification of a cat as a cat compared to a cat as a dog. In classical deep learning, this preference ordering is not explicitly generated.

3. The sparsest solution is a point-valued estimate. Recall that the BRP and S-BRP decision tests yield a set-valued estimate of feasible utility functions and costs of information acquisition that explain the deep CNN datasets. While every element in the set explains the dataset equally well, it is useful to have a single representative point.

Theorem 2 below computes the sparsest utility function out of all feasible utility functions.

Theorem 2 (Sparsity Enhanced BRP and S-BRP Tests for Deep CNN datasets).

Consider dataset \mathbb{D} (5) from a collection of K Bayesian agents. The sparsest solutions to the BRP and S-BRP tests minimize the sum of row-wise \mathcal{L}_{1} norms of the feasible utility functions of the K agents that generate \mathbb{D}:

(u_{1:K})^{\ast}=\operatorname*{argmin}_{u_{1:K}}\sum_{k=1}^{K}\lVert u_{k}\rVert_{1},\quad\text{BRP}(\mathbb{D},\cdot)\leq\mathbf{0},~\sum_{k=1}^{K}\|u_{k}\|_{2}^{2}=K,
u^{\ast}=\operatorname*{argmin}_{u}\lVert u\rVert_{1},\quad\text{S-BRP}(\mathbb{D},\cdot)\leq\mathbf{0},~\|u\|_{2}^{2}=1,   (13)

where \lVert\cdot\rVert_{1} denotes the row-wise \mathcal{L}_{1} norm.
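A sketch of the L1-sparsity objective in Theorem 2 is given below, again using the hypothetical brp_constraints helper from the robustness sketch. The norm-equality normalization in (13) is replaced here by a small fixed slack on the inequalities (our own workaround to exclude the degenerate all-zero utility noted in the Discussion of Sec. 2.3); the resulting utilities can be rescaled afterwards if desired.

```python
import cvxpy as cp

def sparsest_utilities(pi0, p_cond, brp_constraints, slack=1e-4):
    """L1-sparsest utilities consistent with the BRP inequalities (Theorem 2 sketch)."""
    K = len(p_cond)
    X, A = p_cond[0].shape
    u = [cp.Variable((X, A), nonneg=True) for _ in range(K)]
    c = cp.Variable(K, nonneg=True)
    # brp_constraints is a hypothetical helper encoding (8), (9) with fixed slack.
    constraints = [uk <= 1 for uk in u] + brp_constraints(pi0, p_cond, u, c, slack)
    # Since the utilities are nonnegative, their L1 norm is just the sum of entries.
    objective = cp.Minimize(sum(cp.sum(uk) for uk in u))
    cp.Problem(objective, constraints).solve()
    return [uk.value for uk in u], c.value
```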

Results and Discussion. Sparsity Test for deep CNN datasets

The sparsest utility functions from the S-BRP test are tabulated in Table II for all 5 deep CNN architectures. The corresponding information acquisition costs for all 5 architectures, averaged over learning rates 1, 2 and 3, are shown in Fig. 2. Together, the sparsest utility and information cost constitute the sparsest S-UMRI interpretable model for the deep CNN decisions. (For brevity, we include only the sparsity results for the S-UMRI model. The sparsest utility functions of the UMRI model that explain the deep CNN decisions are included in our public GitHub repository, which contains all test results and code.)
(i) The sparsest S-UMRI model comprises K(|\mathcal{X}||\mathcal{A}|+1) variables.
(ii) Preference ordering induced by the sparsest utility. The sparsest utility function for the S-UMRI model induces a useful preference ordering among the predicted image classes. That is, it measures how the deep CNN's priority for accurate classification varies across image classes. For instance, consider the VGG16 architecture trained using learning rate schedule 1. Of all image categories, the maximum utility is observed for trucks (170.69) and the minimum for ships (1.43). This shows the VGG16 architecture prioritizes classifying trucks correctly about 100 times more than classifying ships.
(iii) Penalty for learning image features accurately. The computed information acquisition costs in Fig. 2 can be understood as the training cost the CNN incurs to learn latent image features accurately. The interpretable model cannot explain the variation in CNN classification accuracy with training parameters without an information acquisition cost. From Fig. 2, we conclude that learning accurate image features is most costly for the AlexNet architecture and least costly for the ResNet architecture.

Figure 2: The figure illustrates an important property of our approach to interpretable deep learning: in addition to the utility function (Table II), we also need a rational inattention term (cost of learning latent image features) to explain CNN decisions. Put differently, we cannot explain the variation in CNN classification accuracy versus variation in training parameters without an information acquisition cost. The figure displays the information acquisition cost C (7) evaluated for the sparsest interpretable model. We also observe that learning accurate image features is most expensive for AlexNet, and least expensive for ResNet architectures.
Network Architecture | airplane | auto | bird | cat | deer | dog | frog | horse | ship | truck
LeNet | 0.042 | 0.042 | 0.041 | 0.027 | 0.046 | 0.025 | 0.049 | 0.034 | 0.040 | 0.042
AlexNet | 0.025 | 0.031 | 0.034 | 0.021 | 0.046 | 0.032 | 0.049 | 0.039 | 0.045 | 0.036
VGG16 | 0.033 | 0.035 | 0.043 | 0.041 | 0.048 | 0.048 | 0.035 | 0.046 | 0.037 | 0.048
ResNet-50 | 0.030 | 0.031 | 0.027 | 0.031 | 0.020 | 0.027 | 0.040 | 0.015 | 0.023 | 0.024
Network-in-Network | 0.051 | 0.029 | 0.025 | 0.028 | 0.056 | 0.059 | 0.030 | 0.058 | 0.045 | 0.036
TABLE III: How well does our interpretable model predict CNN classification accuracy? The table displays the prediction error \delta_{\eta}(x) defined in (14). Recall \delta_{\eta}(x) is the error between the true CNN performance and the predicted performance using the interpretable model with Algorithm 2. The maximum error across all image classes and architectures was found to be 5.9%. Hence, our interpretable model predicts CNN classification performance with accuracy exceeding 94%.

3.2 Predicting deep CNN classification accuracy using our Interpretable Models

Training datasets are often noisy; for example, [38] considers noisy datasets for hand-written character recognition. We now exploit the proposed interpretable model to predict how the deep CNN will perform with a noisy training dataset without actually implementing the deep CNN.

Our predictive procedure is as follows. We first train the CNNs on noisy datasets generated by adding simulated Gaussian noise with variances chosen from a finite set. (Injecting artificial noise into training datasets is also used in variational autoencoders for robust feature learning [39, 40].) Then, given the CNN decisions, we compute our interpretable model over this finite set of noise variances. Finally, to predict how the CNN will perform for a noise variance not in the set, we interpolate the utility function of the interpretable model at this noise variance. Given the interpolated utility function and the information acquisition cost from our interpretable model, the predicted classification performance is computed by solving the convex optimization problem (4). The above procedure is formalized in Algorithm 2. Hence, our interpretable model serves as a computationally efficient method for predicting the performance of a CNN without implementing the CNN. The interpretable model can be viewed as a low-dimensional projection of the high-dimensional CNN with predictive accuracy exceeding 94%.

Remark. An alternative procedure is to directly interpolate the performance over the space of CNN weights (several hundred thousand). Due to the high dimensionality, this interpolation is intractable. In comparison, interpolation over the utility functions in our interpretable model is over a few hundred variables.

Prediction Results of Algorithm 2 on Deep CNN Performance

Table III displays the prediction errors (difference between the true and predicted classification accuracy) of the deep CNNs for all 5 architectures and all image classes in CIFAR-10. For a fixed CNN architecture and noise variance \eta>0, the prediction error \delta_{\eta}(x) for image class x is defined as:

\delta_{\eta}(x)=|\hat{p}(x|x)-p_{\text{CNN}}(x|x)|.   (14)

In (14), \hat{p}(\cdot|\cdot) is the predicted CNN performance generated by Algorithm 2 and p_{\text{CNN}}(\cdot|\cdot) is the true CNN performance obtained by implementing the CNN. Recall that p(x|x) is the probability that the CNN correctly classifies an image belonging to class x.

Input: Dataset \mathbb{D} (26) from K deep CNNs of a fixed network architecture. The k^{\text{th}} CNN is trained on a noisy dataset with added Gaussian noise of variance \eta_{k}=1+0.1\times(k-1).
Step 1: Constructing the interpretable model. The most robust utility functions \{u_{k}^{\ast}\}_{k=1}^{K} and information acquisition cost C^{\ast} are computed by solving the following convex optimization problem:

\{u_{k}^{\ast},c^{\ast}_{k}\}_{k=1}^{K}=\operatorname*{argmax}_{u_{1:K}}\frac{\varepsilon K}{\sum_{k=1}^{K}\|u_{k}\|_{2}^{2}},\quad\text{BRP}(\mathbb{D},\cdot)\leq-\varepsilon,
C^{\ast}(p(a|x))=\max_{k\in\mathcal{K}}\ c_{k}^{\ast}+\sum_{x,a}\pi_{0}(x)(p(a|x)-p_{k}(a|x))u^{\ast}(x,a).

Step 2: Predicting classification accuracy. For an arbitrary noise variance \eta_{1}\leq\eta\leq\eta_{K}, obtain the index g, 1\leq g<K, such that \eta\in[\eta_{g},\eta_{g+1}]. Then the predicted classification accuracy \hat{p}(a|x) for noise variance \eta is computed as:

\hat{p}(a|x)=\operatorname*{argmax}_{p(a|x)}\sum_{a}\max_{b}\sum_{x}\pi_{0}(x)p(a|x)\hat{u}(x,a)-C^{\ast}(p),
\hat{u}=10\times\{(\eta_{g+1}-\eta)u^{\ast}_{g}+(\eta-\eta_{g})u^{\ast}_{g+1}\}.   (15)

Return: Predicted CNN performance \hat{p}(a|x) for noise variance \eta.
Algorithm 2 Predicting Deep CNN Classification Accuracy via the S-UMRI model using Theorem 2.
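The interpolation step (15) of Algorithm 2 is straightforward to implement. Below is a minimal sketch (variable names are ours); for the 0.1-spaced noise grid used in Algorithm 2, the normalized linear interpolation below coincides with the 10x weighting in (15).

```python
import numpy as np

def interpolate_utility(eta, etas, u_star):
    """Linearly interpolate the most-robust utilities, as in (15).

    etas: sorted grid of training noise variances eta_1 <= ... <= eta_K;
    u_star: list of the corresponding utilities u_k^* (each an |X| x |A| array).
    """
    g = int(np.clip(np.searchsorted(etas, eta) - 1, 0, len(etas) - 2))
    w = (eta - etas[g]) / (etas[g + 1] - etas[g])   # in [0, 1] for eta in [eta_g, eta_{g+1}]
    # For a grid with spacing 0.1 this equals
    # 10 * ((eta_{g+1} - eta) * u*_g + (eta - eta_g) * u*_{g+1}), i.e. (15).
    return (1 - w) * u_star[g] + w * u_star[g + 1]
```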

Discussion and Insights

(i) Our interpretable model can predict CNN classification performance with accuracy exceeding 94% (see below).
(ii) The interpretable model (utility functions and information acquisition cost) for our predictive procedure (Algorithm 2) is evaluated on the set of noise variances G_{1}=\{1+0.1\times(k-1),\ k=1,2,\ldots,11\}. The predictive procedure of Algorithm 2 is then applied to the set of noise variances G_{2}=\{1.05+0.1\times(k-1),\ k=1,2,\ldots,10\}. Table III displays the prediction errors \delta_{\eta}(x) averaged over all \eta\in G_{2}.
(iii) From Table III, the prediction errors \delta_{\eta}(x) averaged over all image classes x for the 5 CNN architectures are:

1. LeNet - 0.038
2. AlexNet - 0.036
3. VGG16 - 0.041
4. ResNet-50 - 0.027
5. NiN - 0.035

Thus, the lowest prediction accuracy is 95.9% and the highest is 97.3%.
(iv) Averaged over the network architectures, the prediction accuracy was highest for image class ‘cat’ (98.1%) and lowest for image class ‘deer’ (95.7%).
(v) Statistical similarity between deep CNNs and the interpretable model. We computed the Kullback-Leibler (KL) divergence between the true and predicted classification performances p_{\text{CNN}}(a|x) and \hat{p}(a|x). Recall that \hat{p}(a|x) is computed from the interpretable model via Algorithm 2 and p_{\text{CNN}}(a|x) is obtained from the CNN. The KL divergence values for the 5 CNN architectures are listed below (a sketch of the computation follows the list):

1. LeNet - 0.015
2. AlexNet - 0.012
3. VGG16 - 0.016
4. ResNet-50 - 0.006
5. NiN - 0.018

Thus the decisions made by the deep CNNs are statistically similar to decisions generated by our interpretable model.
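For completeness, here is a sketch of how such a KL divergence can be computed from the conditionals. The prior-weighted aggregation across true classes is one plausible convention (the paper does not spell out the weighting), and the clipping constant is our own numerical safeguard.

```python
import numpy as np

def avg_kl_divergence(p_cnn, p_hat, pi0, eps=1e-12):
    """Prior-weighted KL divergence between the CNN's p_CNN(a|x) (rows = x)
    and the interpretable model's prediction p_hat(a|x)."""
    p_cnn = np.clip(p_cnn, eps, None)
    p_hat = np.clip(p_hat, eps, None)
    kl_per_class = np.sum(p_cnn * np.log(p_cnn / p_hat), axis=1)  # KL for each true class x
    return float(pi0 @ kl_per_class)
```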

4 Conclusions and Extensions

This paper has developed an interpretable model for deep image classification using micro- and behavioral economics theory. By extensive analysis of the decisions of 200 CNNs over 5 popular CNN architectures, we showed that deep CNNs can be explained by rationally inattentive Bayesian utility maximization.

Our main results were the following:
1. Using the theory of Bayesian revealed preference, Theorem 1 gave a necessary and sufficient condition for the actions of a collection of decision makers to be consistent with rationally inattentive Bayesian utility maximization. We showed that deep CNNs operating on the CIFAR-10 dataset satisfy these necessary and sufficient conditions.
2. Next, we studied the robustness margin by which the deep CNNs satisfy Theorem 1; we found that the margins were sufficiently large, implying robustness of the results. Our robustness results are summarized in Table I.
3. In Theorem 2, we constructed the sparsest interpretable model from the feasible set generated using Theorem 1. The sparsest interpretable model explains deep CNN decisions using the fewest parameters. It also introduces a useful preference ordering amongst the set of hypotheses (image labels) considered by the deep neural network; for example, how much additional priority is allocated to the classification of a cat as a cat compared to a cat as a dog. In classical deep learning, this preference ordering is not explicitly generated.
4. Finally, we showed that our interpretable model can predict CNN performance with accuracy exceeding 94%, and the decisions generated by our interpretable model are statistically similar to that of a deep CNN. At a more conceptual level, our results suggest that deep CNNs for image classification are equivalent to an economics-based constrained Bayesian decision system (used in micro-economics to model human decision making).

Extensions. An immediate extension of this work is to construct appropriately designed image features to replace the image class label as the state. This would result in a richer descriptive model of the CNN due to more degrees of freedom in the utility function.

Our proposed interpretable model generates a concave utility function by design. This is an important feature of the revealed preference framework, even though the actual deep learner's utility may not be concave. To quote Varian [11]: “If data can be rationalized by any non-trivial utility function, then it can be rationalized by a nice utility function. Violations of concavity cannot be detected with only a finite number of observations.” A more speculative extension is to investigate the asymptotic behavior of the BRP and S-BRP decision tests for rationally inattentive utility maximization: do the tests pass when the number of deep CNNs tends to infinity? Recent results [41] show that an infinite dataset can at best be rationalized by a quasi-concave utility function.

Reproducibility: The computer programs and deep image classification datasets needed to reproduce all the results in this paper can be obtained from the public GitHub repository https://github.com/KunalP117/DL_RI.

Acknowledgement

This research was supported in part by the Army Research Office under grants W911NF-21-1-0093 and W911NF-19-1-0365, and the National Science Foundation under grant CCF-2112457.

References

  • [1] C. Sims. Implications of rational inattention. Journal of monetary Economics, 50(3):665–690, 2003.
  • [2] C. Sims. Rational inattention and monetary economics. Handbook of Monetary Economics, 3:155–181, 2010.
  • [3] M. Woodford. Inattentive valuation and reference-dependent choice. 2012.
  • [4] F. Matějka and A. McKay. Rational inattention to discrete choices: A new foundation for the multinomial logit model. American Economic Review, 105(1):272–98, 2015.
  • [5] H. De Oliveira, T. Denti, M. Mihm, and K. Ozbek. Rationally inattentive preferences and hidden information costs. Theoretical Economics, 12(2):621–654, 2017.
  • [6] A. Caplin and D. Martin. A testable theory of imperfect perception. The Economic Journal, 125(582):184–202, 2015.
  • [7] A. Caplin and M. Dean. Revealed preference, rational inattention, and costly information acquisition. The American Economic Review, 105(7):2183–2203, 2015.
  • [8] A. Krizhevsky, G. Hinton, et al. Learning multiple layers of features from tiny images. 2009.
  • [9] S. N. Afriat. The construction of utility functions from expenditure data. International Economic Review, 8(1):67–77, 1967.
  • [10] W. E. Diewert. Afriat and revealed preference theory. The Review of Economic Studies, 40(3):419–425, 1973.
  • [11] H. R. Varian. The nonparametric approach to demand analysis. Econometrica: Journal of the Econometric Society, pages 945–973, 1982.
  • [12] S. Chakraborty, R. Tomsett, R. Raghavendra, D. Harborne, M. Alzantot, F. Cerutti, M. Srivastava, A. Preece, S. Julier, R. M. Rao, et al. Interpretability of deep learning models: a survey of results. In 2017 IEEE smartworld, ubiquitous intelligence & computing, advanced & trusted computed, scalable computing & communications, cloud & big data computing, Internet of people and smart city innovation (smartworld/SCALCOM/UIC/ATC/CBDcom/IOP/SCI), pages 1–6. IEEE, 2017.
  • [13] F. Doshi-Velez and B. Kim. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608, 2017.
  • [14] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, and D. Pedreschi. A survey of methods for explaining black box models. ACM Computing Surveys (CSUR), 51(5):1–42, 2018.
  • [15] W. J. Murdoch, C. Singh, K. Kumbier, R. Abbasi-Asl, and B. Yu. Interpretable machine learning: definitions, methods, and applications. arXiv preprint arXiv:1901.04592, 2019.
  • [16] D. Ciregan, U. Meier, and J. Schmidhuber. Multi-column deep neural networks for image classification. In 2012 IEEE conference on computer vision and pattern recognition, pages 3642–3649. IEEE, 2012.
  • [17] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. ImageNet large scale visual recognition challenge. International journal of computer vision, 115(3):211–252, 2015.
  • [18] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
  • [19] K. Simonyan, A. Vedaldi, and A. Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
  • [20] A. Nguyen, A. Dosovitskiy, J. Yosinski, T. Brox, and J. Clune. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. arXiv preprint arXiv:1605.09304, 2016.
  • [21] P. Hase, C. Chen, O. Li, and C. Rudin. Interpretable image recognition with hierarchical prototypes. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, volume 7-1, pages 32–40, 2019.
  • [22] T. Lei, R. Barzilay, and T. Jaakkola. Rationalizing neural predictions. arXiv preprint arXiv:1606.04155, 2016.
  • [23] S. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions. arXiv preprint arXiv:1705.07874, 2017.
  • [24] S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one, 10(7):e0130140, 2015.
  • [25] M. T. Ribeiro, S. Singh, and C. Guestrin. Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386, 2016.
  • [26] A. Shrikumar, P. Greenside, A. Shcherbina, and A. Kundaje. Not just a black box: Learning important features through propagating activation differences. arXiv preprint arXiv:1605.01713, 2016.
  • [27] H. Wang and D.-Y. Yeung. Towards Bayesian deep learning: A framework and some existing methods. IEEE Transactions on Knowledge and Data Engineering, 28(12):3395–3408, 2016.
  • [28] P. Milgrom. Good news and bad news: Representation theorems and applications. Bell Journal of Economics, 12(2):380–391, 1981.
  • [29] L. Huang and H. Liu. Rational inattention and portfolio selection. The Journal of Finance, 62(4):1999–2040, 2007.
  • [30] S. E. Mirsadeghi, A. Royat, and H. Rezatofighi. Unsupervised image segmentation by mutual information maximization and adversarial regularization. IEEE Robotics and Automation Letters, 2021.
  • [31] W. Hoiles, V. Krishnamurthy, and K. Pattanayak. Rationally Inattentive Inverse Reinforcement Learning Explains YouTube commenting behavior. The Journal of Machine Learning Research, 21(170):1–39, 2020.
  • [32] K. Pattanayak and V. Krishnamurthy. Unifying classical and Bayesian revealed preference, 2021.
  • [33] Y. LeCun, B. Boser, J. Denker, D. Henderson, R. Howard, W. Hubbard, and L. Jackel. Handwritten digit recognition with a back-propagation network. Advances in neural information processing systems, 2, 1989.
  • [34] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. Advances in neural information processing systems, 25:1097–1105, 2012.
  • [35] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • [36] M. Lin, Q. Chen, and S. Yan. Network in network. arXiv preprint arXiv:1312.4400, 2013.
  • [37] G. Hinton, N. Srivastava, and K. Swersky. Neural networks for machine learning, lecture 6a: Overview of mini-batch gradient descent. Lecture notes, 2012.
  • [38] R. Anand, T. Shanthi, R. Sabeenian, and S. Veni. Real time noisy dataset implementation of optical character identification using cnn. International Journal of Intelligent Enterprise, 7(1-3):67–80, 2020.
  • [39] Y. Bengio. Learning deep architectures for AI. Now Publishers Inc, 2009.
  • [40] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th international conference on Machine learning, pages 1096–1103, 2008.
  • [41] P. J. Reny. A characterization of rationalizable consumer behavior. Econometrica, 83(1):175–192, 2015.
  • [42] D. Blackwell. Equivalent comparisons of experiments. The Annals of Mathematical Statistics, pages 265–272, 1953.

Appendix

A.1 Proof of Theorem 1

Proof of necessity of NIAS and NIAC:

  1. NIAS (8): For agent $k\in\mathcal{K}$, define the subset $\mathcal{Y}_a\subseteq\mathcal{Y}$ such that for any observation $y\in\mathcal{Y}_a$, given the posterior pmf $p_k(x|y)$, the optimal action in (3) is $a$. We define the revealed posterior pmf given action $a$ as $p_k(x|a)$. The revealed posterior pmf is a stochastically garbled version of the actual posterior pmf $p_k(x|y)$, that is,

$$p_k(x|a)=\sum_{y\in\mathcal{Y}}\frac{p_k(x,y,a)}{p_k(a)}=\sum_{y\in\mathcal{Y}}p_k(y|a)\,p_k(x|y) \qquad (16)$$

Since the optimal action is $a$ for all $y\in\mathcal{Y}_a$, (3) implies that for every $y\in\mathcal{Y}_a$ and every action $b\in\mathcal{A}$:

$$
\begin{aligned}
&\sum_{x\in\mathcal{X}}p_k(x|y)\,\big(u_k(x,b)-u_k(x,a)\big)\leq 0\\
\implies\;&\sum_{y\in\mathcal{Y}_a}p_k(y|a)\sum_{x\in\mathcal{X}}p_k(x|y)\,\big(u_k(x,b)-u_k(x,a)\big)\leq 0\\
\implies\;&\sum_{y\in\mathcal{Y}}p_k(y|a)\sum_{x\in\mathcal{X}}p_k(x|y)\,\big(u_k(x,b)-u_k(x,a)\big)\leq 0
\quad\big(\text{since }p_k(y|a)=0~\forall y\in\mathcal{Y}\backslash\mathcal{Y}_a\big)\\
\implies\;&\sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{Y}}p_k(y|a)\,p_k(x|y)\,\big(u_k(x,b)-u_k(x,a)\big)\leq 0\\
\implies\;&\sum_{x\in\mathcal{X}}p_k(x|a)\,\big(u_k(x,b)-u_k(x,a)\big)\leq 0 \quad\big(\text{from }(16)\big)
\end{aligned}
$$

This is precisely the NIAS inequality (8).

  2. NIAC (9): Let $c_k=C(\alpha_k)>0$, where $C(\cdot)$ denotes the information acquisition cost of the collection of agents $\mathcal{K}$. Also, let $J(\alpha_k,u_k)$ denote the expected utility of the $k^{\text{th}}$ agent given attention strategy $\alpha_k$ (the first term on the RHS of (4)); the expectation is taken with respect to both the state $x$ and the observation $y$. It can be verified that $J(\cdot,u_k)$ is convex in its first argument. Finally, for the $k^{\text{th}}$ agent, define the revealed attention strategy $\alpha_k'$ over the set of actions $\mathcal{A}$ as

$$\alpha_k'(a|x)=p_k(a|x),\quad\forall a\in\mathcal{A},$$

where $p_k(a|x)$ is obtained from the dataset $\mathbb{D}$. Clearly, the revealed attention strategy is a stochastically garbled version of the true attention strategy, since

$$\alpha_k'(a|x)=p_k(a|x)=\sum_{y\in\mathcal{Y}}p_k(a|y)\,\alpha_k(y|x) \qquad (17)$$

Since $\alpha_k$ Blackwell dominates $\alpha_k'$ by (17), Blackwell dominance [42] together with the convexity of the expected utility functional $J(\cdot,u_j)$ implies

$$J(\alpha_k',u_j)\leq J(\alpha_k,u_j), \qquad (18)$$

with equality when $k=j$ (this is due to NIAS (8)). We now turn to condition (4) for optimality of the attention strategy. The following inequalities hold for any pair of agents $j\neq k$:

$$J(\alpha_k',u_k)-c_k\overset{(18)}{=}J(\alpha_k,u_k)-c_k\overset{(4)}{\geq}J(\alpha_j,u_k)-c_j\overset{(18)}{\geq}J(\alpha_j',u_k)-c_j. \qquad (19)$$

This is precisely the NIAC inequality (9).

Proof of sufficiency of NIAS and NIAC: Let $\{u_k,c_k\}_{k=1}^K$ denote a feasible solution to the NIAS and NIAC inequalities of Theorem 1. To prove sufficiency, we construct a UMRI tuple, as a function of the dataset $\mathbb{D}$ and the feasible solution, that satisfies the optimality conditions (3), (4) of Definition 1.

Consider the following UMRI model tuple:

$$
\begin{aligned}
&\Theta=\big(\mathcal{K},\mathcal{X},\mathcal{Y}=\mathcal{A},\mathcal{A},\pi_0,C,\{p_k(a|x),u_k,\,k\in\mathcal{K}\}\big),\text{ where}\\
&C(p(a|x))=\max_{k\in\mathcal{K}}\;c_k+J(p(a|x),u_k)-J(p_k(a|x),u_k). \qquad (20)
\end{aligned}
$$

In (20), $C(\cdot)$ is a convex cost since it is a point-wise maximum of monotone convex functions. Further, since NIAC is satisfied, each term $c_j+J(p_k(a|x),u_j)-J(p_j(a|x),u_j)$ in (20) is at most $c_k$, with equality at $j=k$; hence (20) implies $C(\alpha_k)=c_k$. It only remains to show that inequalities (3) and (4) of Definition 1 are satisfied for all agents in $\mathcal{K}$.

  1. NIAS implies (3) holds. This is straightforward to show since the observation and action sets are identical.

  2. The information acquisition cost (20) implies (4) holds. Fix agent $j\in\mathcal{K}$. Then, for any attention strategy $p(a|x)$, the following inequalities hold:

$$
\begin{aligned}
C(p(a|x))&=\max_{k\in\mathcal{K}}\;c_k+J(p(a|x),u_k)-J(p_k(a|x),u_k)\\
&\geq c_j+J(p(a|x),u_j)-J(p_j(a|x),u_j)\\
\implies\; J(p_j(a|x),u_j)-c_j&\geq J(p(a|x),u_j)-C(p(a|x)),\quad\forall\,p(a|x).
\end{aligned}
$$

Since $C(p_j(a|x))=c_j$, the last inequality states that $p_j(a|x)\in\operatorname*{argmax}_{p(a|x)}\,J(p(a|x),u_j)-C(p(a|x))$, which is precisely (4).
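To make the above feasibility conditions concrete, the following is a minimal sketch (not the authors' released code) of the NIAS/NIAC test as a linear feasibility program, written here with the cvxpy package. The function name brp_feasibility, the strict-positivity margin eps, and the normalization $u_k(x,a)\in(0,1]$ are illustrative assumptions; the GitHub repository may implement the test differently.

```python
# Sketch only: linear feasibility check of the NIAS and NIAC inequalities of Theorem 1.
import numpy as np
import cvxpy as cp

def brp_feasibility(pi0, p_a_given_x, eps=1e-4):
    """pi0: prior over states, shape (X,). p_a_given_x[k, x, a] = p_k(a|x), shape (K, X, A).
    Returns feasible utilities {u_k} and costs {c_k}, or None if NIAS/NIAC is infeasible."""
    K, X, A = p_a_given_x.shape
    p_xa = pi0[None, :, None] * p_a_given_x                       # joint pmfs p_k(x, a)
    p_x_given_a = p_xa / np.maximum(p_xa.sum(axis=1, keepdims=True), 1e-12)  # posteriors p_k(x|a)

    U = [cp.Variable((X, A)) for _ in range(K)]                   # utilities u_k(x, a)
    c = cp.Variable(K)                                            # information costs c_k
    cons = [c >= eps] + [Uk >= eps for Uk in U] + [Uk <= 1 for Uk in U]
    for k in range(K):
        # NIAS: action a maximizes expected utility under the revealed posterior p_k(x|a).
        for a in range(A):
            for b in range(A):
                cons.append(cp.sum(cp.multiply(p_x_given_a[k, :, a],
                                               U[k][:, b] - U[k][:, a])) <= 0)
        # NIAC: agent k's revealed strategy is optimal net of cost among all revealed strategies.
        J_kk = cp.sum(cp.multiply(p_xa[k], U[k]))
        for j in range(K):
            if j != k:
                cons.append(cp.sum(cp.multiply(p_xa[j], U[k])) - c[j] <= J_kk - c[k])
    prob = cp.Problem(cp.Minimize(0), cons)                       # pure feasibility problem
    prob.solve()
    if prob.status in ("optimal", "optimal_inaccurate"):
        return [Uk.value for Uk in U], c.value
    return None
```

Any feasible point returned by such a program is one element of the set-valued estimate of the UMRI tuple; the sparsest element is then selected as described in the main text.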

A.2 S-UMRI (Sparse UMRI) Model for Rationally Inattentive Bayesian Utility Maximization

In Sec. 2.1, we outlined the UMRI model for rationally inattentive utility maximization of $K$ Bayesian agents, parameterized by $K$ utility functions and a cost of information acquisition. This section proposes a sparse version of the UMRI model, namely the S-UMRI model, which is parameterized by a single utility function that rationalizes the decisions of all $K$ Bayesian agents. Abstractly, the S-UMRI model is described by the tuple

$$\Theta=\big(\mathcal{K},\mathcal{X},\mathcal{Y},\mathcal{A},\pi_0,C,u,\{\alpha_k,\lambda_k,\,k\in\mathcal{K}\}\big). \qquad (21)$$

All parameters in (21) are identical to those in (1) except for the additional parameter $\lambda_k\in\mathbb{R}_+$, which can be interpreted as the $k^{\text{th}}$ agent's sensitivity to the cost of information acquisition. We discuss the significance of $\lambda_k$ in more detail below. In complete analogy to Definition 1, Definition 3 below specifies the optimal action and attention strategy of the Bayesian agents in $\mathcal{K}$.

Definition 3 (Rationally Inattentive Utility Maximization for S-UMRI).

Consider a collection of Bayesian agents $\mathcal{K}$ parameterized by $\Theta$ in (21) under the S-UMRI model. Then,
(a) Expected Utility Maximization: Given posterior pmf $p(x|y)$, agent $k\in\mathcal{K}$ chooses an action $a$ that maximizes its expected utility:

$$a\in\operatorname*{argmax}_{a'\in\mathcal{A}}\;\mathbb{E}_x\{u(x,a')\,|\,y\}=\sum_{x\in\mathcal{X}}p(x|y)\,u(x,a') \qquad (22)$$

(b) Attention Strategy Rationality: Agent $k$ chooses the attention strategy $\alpha_k$ that optimally trades off expected utility against the cost of information acquisition:

$$\alpha_k\in\operatorname*{argmax}_{\alpha'}\;\mathbb{E}_y\Big\{\max_{a\in\mathcal{A}}\mathbb{E}_x\{u(x,a)\,|\,y\}\Big\}-\lambda_k\,C(\alpha',\pi_0) \qquad (23)$$

Remarks. 1. Role of $\lambda_k$: In (23), $\lambda_k$ is the differentiating parameter across agents. Even though all agents share the same utility function $u$, different values of $\lambda_k$ result in different optimal attention strategies $\alpha_k$ in (23).
2. Sparsity of S-UMRI: The UMRI model tuple for $K$ Bayesian agents is parameterized by $K(|\mathcal{X}||\mathcal{A}|+1)$ variables, whereas the S-UMRI tuple is parameterized by only $|\mathcal{X}||\mathcal{A}|+K$ variables; the saving of $(K-1)|\mathcal{X}||\mathcal{A}|$ variables grows linearly in $K$. A worked count is given below.
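For concreteness (the agent count here is purely illustrative): with the $|\mathcal{X}|=|\mathcal{A}|=10$ CIFAR-10 image classes of (26) and, say, $K=100$ agents, the UMRI model requires $K(|\mathcal{X}||\mathcal{A}|+1)=100\times 101=10100$ parameters, whereas the S-UMRI model requires only $|\mathcal{X}||\mathcal{A}|+K=100+100=200$ parameters.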

Finally, in complete analogy to Theorem 1, we now state Theorem 3, which gives necessary and sufficient conditions for the decisions of a collection of Bayesian agents to be rationalized by the S-UMRI model.

Theorem 3 (S-BRP Test for Rationally Inattentive Utility Maximization).

Given the dataset $\mathbb{D}$ (5) obtained from a collection of Bayesian agents $\mathcal{K}$:
1. Existence: There exists an S-UMRI tuple $\Theta(\mathbb{D})$ (21) that rationalizes dataset $\mathbb{D}$ if and only if there exists a feasible solution to the set of inequalities

$$\text{S-BRP}(\mathbb{D})\leq\mathbf{0}. \qquad (24)$$

In (24), S-BRP$(\cdot)$ corresponds to the set of inequalities stated in Algorithm 3 below. The set-valued estimate of $\Theta$ that rationalizes $\mathbb{D}$ is the set of all feasible solutions to (24).
2. Reconstruction: Given a feasible solution $\{u,\lambda_k,c_k\}_{k=1}^K$ to S-BRP$(\mathbb{D})$, $u$ is the utility function of every Bayesian agent $k=1,2,\ldots,K$. The set of observations is $\mathcal{Y}=\mathcal{A}$, the set of actions in $\mathbb{D}$. The feasible cost of information acquisition $C$ in $\Theta(\mathbb{D})$ is defined in terms of the feasible variables $c_k,\lambda_k$ as:

$$C(\alpha)=\max_{k\in\mathcal{K}}\;c_k+\lambda_k\sum_{x,a}\big(p(x,a)-p_k(x,a)\big)\,u(x,a),\quad\text{where }\alpha=\{p(a|x),\,a\in\mathcal{A},\,x\in\mathcal{X}\}. \qquad (25)$$

The proof of Theorem 3 closely follows the proof of Theorem 1 and is hence omitted. In comparison to the BRP test of Theorem 1, the S-BRP test has the same number of inequalities but fewer decision variables. Hence, the set of feasible parameters generated by Algorithm 3 is smaller than that generated by Algorithm 1.

Input: Dataset $\mathbb{D}=\{\pi_0,\,p_k(a|x),\;x\in\mathcal{X},\,a\in\mathcal{A},\,k\in\mathcal{K}\}$ from a collection of Bayesian agents $\mathcal{K}$.
Find: Positive reals $c_k,\lambda_k$ and $u(x,a)\in(0,1]$, for all $x\in\mathcal{X}$, $a\in\mathcal{A}$, $k\in\mathcal{K}$, that satisfy the following inequalities:

$$1.~\sum_{x}p_k(x|a)\,\big(u(x,b)-u(x,a)\big)\leq 0,\quad\forall a,b,k,$$
$$2.~\sum_{x,a}\big(p_j(x,a)-p_k(x,a)\big)\,u(x,a)+\lambda_k(c_k-c_j)\leq 0,\quad\forall j,k,$$

where $p_k(x,a)=\pi_0(x)\,p_k(a|x)$ and $p_k(x|a)=\frac{p_k(x,a)}{\sum_{x'}p_k(x',a)}$.
Return: The set of feasible utility functions $u$, scalars $\lambda_k$, and information acquisition costs $c_k$ incurred by agents $k\in\mathcal{K}$.

Algorithm 3: S-BRP Convex Feasibility Test of Theorem 3
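For illustration, the sketch below implements the S-BRP inequalities of Algorithm 3. Inequality 2 contains the product $\lambda_k(c_k-c_j)$, which is bilinear in the unknowns; the workaround assumed here, purely for illustration and practical only for a small number of agents, is to fix the sensitivities $\lambda_k$ on a grid so that the remaining problem is linear in $(u,c)$. The function name sbrp_feasibility and the grid values are hypothetical; the released code may search the feasible set differently.

```python
# Sketch only: S-BRP feasibility with the sensitivities lambda_k fixed on a grid.
import itertools
import numpy as np
import cvxpy as cp

def sbrp_feasibility(pi0, p_a_given_x, lam_grid=(0.5, 1.0, 2.0), eps=1e-4):
    """Single shared utility u(x,a); returns (u, c, lambda) if some grid point is feasible."""
    K, X, A = p_a_given_x.shape
    p_xa = pi0[None, :, None] * p_a_given_x                       # joint pmfs p_k(x, a)
    p_x_given_a = p_xa / np.maximum(p_xa.sum(axis=1, keepdims=True), 1e-12)
    for lam in itertools.product(lam_grid, repeat=K):             # candidate sensitivities
        u = cp.Variable((X, A))                                   # shared utility u(x, a)
        c = cp.Variable(K)                                        # costs c_k
        cons = [u >= eps, u <= 1, c >= eps]
        for k in range(K):
            for a in range(A):                                    # inequality 1 of Algorithm 3
                for b in range(A):
                    cons.append(cp.sum(cp.multiply(p_x_given_a[k, :, a],
                                                   u[:, b] - u[:, a])) <= 0)
            for j in range(K):                                    # inequality 2 of Algorithm 3
                if j != k:
                    cons.append(cp.sum(cp.multiply(p_xa[j] - p_xa[k], u))
                                + lam[k] * (c[k] - c[j]) <= 0)
        prob = cp.Problem(cp.Minimize(0), cons)
        prob.solve()
        if prob.status in ("optimal", "optimal_inaccurate"):
            return u.value, c.value, np.array(lam)
    return None
```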

A.3 Construction of Deep CNN Dataset

We now explain how the decisions of the deep CNNs are incorporated into our main results, Theorems 1 and 3. Suppose $K$ deep CNNs, indexed by $k=1,\ldots,K$ and trained with different training parameters, are trained on the CIFAR-10 dataset. For every trained deep CNN $k$, given test image $i$ from the CIFAR-10 test dataset with image class $s_i$, let the vector $f_{i,k}\in\Delta^9$ denote the corresponding softmax output of the deep CNN. The vector $f_{i,k}$ is a $10$-dimensional probability vector whose component $f_{i,k}(j)$ is the probability that deep CNN $k$ classifies test image $i$ into image class $j$.

The decisions of all $K$ deep CNNs on the CIFAR-10 test dataset are aggregated into the dataset $\mathbb{D}$ for compatibility with Theorems 1 and 3 as follows:

$$
\begin{aligned}
&\mathbb{D}=\{\pi_0,\,p_k(a|x),\;x,a\in\mathcal{X},\,k\in\{1,2,\ldots,K\}\},\text{ where}\\
&\pi_0(x)=\frac{1}{N}\sum_{i=1}^{N}\mathbbm{1}\{s_i=x\},\qquad
p_k(a|x)=\frac{\sum_{i=1}^{N}\mathbbm{1}\{s_i=x\}\,f_{i,k}(a)}{\sum_{i=1}^{N}\mathbbm{1}\{s_i=x\}},\\
&N=10000,\qquad\mathcal{X}=\mathcal{A}=\{1,2,\ldots,10\}. \qquad (26)
\end{aligned}
$$

Here $\pi_0(x)$ is the empirical probability that the image class of a test image in the CIFAR-10 test dataset is $x$. Since the output of the CNN is a probability vector, we compute $p_k(a|x)$ for the $k^{\text{th}}$ CNN by averaging the $a^{\text{th}}$ component of the softmax output over all test images in image class $x$. Finally, $N$ is the number of test images in the CIFAR-10 test dataset, and the sets of true and predicted image classes are identical, i.e., $\mathcal{X}=\mathcal{A}$. Although implicit in the above description, our Bayesian revealed preference approach to interpretable deep image classification assumes that the deep CNN's (agent's) ground truth is the true image label, and its decision $a$ is the predicted image label.
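The aggregation step (26) is straightforward to implement. The sketch below, assuming the softmax outputs and true labels are already loaded as in-memory arrays (the function name build_dataset and the array layout are illustrative assumptions, not the released pipeline), computes the prior $\pi_0$ and the conditionals $p_k(a|x)$.

```python
# Sketch only: aggregating softmax outputs of K CNNs on the CIFAR-10 test set into (26).
import numpy as np

def build_dataset(softmax_outputs, true_labels, num_classes=10):
    """
    softmax_outputs: array of shape (K, N, num_classes); softmax_outputs[k, i] = f_{i,k}.
    true_labels:     array of shape (N,); true_labels[i] = s_i in {0, ..., num_classes-1}.
    Returns the empirical prior pi0(x) and the conditionals p_k(a|x) of equation (26).
    """
    K, N, A = softmax_outputs.shape
    pi0 = np.bincount(true_labels, minlength=num_classes) / N     # empirical prior pi0(x)
    p_a_given_x = np.zeros((K, num_classes, A))
    for x in range(num_classes):
        mask = (true_labels == x)                                 # test images of class x
        # Average the softmax vectors of class-x images, per CNN k, to obtain p_k(a|x).
        p_a_given_x[:, x, :] = softmax_outputs[:, mask, :].mean(axis=1)
    return pi0, p_a_given_x
```

The resulting pair (pi0, p_a_given_x) is exactly the dataset format consumed by the feasibility sketches given after the proofs of Theorem 1 and Algorithm 3 above.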