
On the Probability of Immunity

Jose M. Peña, Linköping University, Sweden. jose.m.pena@liu.se
Abstract.

This work is devoted to the study of the probability of immunity, i.e. the effect occurs whether exposed or not. We derive necessary and sufficient conditions for non-immunity and $\epsilon$-bounded immunity, i.e. the probability of immunity is zero and $\epsilon$-bounded, respectively. The former allows us to estimate the probability of benefit (i.e., the effect occurs if and only if exposed) from a randomized controlled trial, and the latter allows us to produce bounds of the probability of benefit that are tighter than the existing ones. We also introduce the concept of indirect immunity (i.e., through a mediator) and repeat our previous analysis for it. Finally, we propose a method for sensitivity analysis of the probability of immunity under unmeasured confounding.

1. Introduction

Let $X$ and $Y$ denote an exposure and its outcome, respectively. Let $X$ and $Y$ be binary, taking values in $\{x,x'\}$ and $\{y,y'\}$. Let $Y_{x}$ and $Y_{x'}$ denote the counterfactual outcome when the exposure is set to level $X=x$ and $X=x'$. Let $y_{x}$, $y'_{x}$, $y_{x'}$ and $y'_{x'}$ denote the events $Y_{x}=y$, $Y_{x}=y'$, $Y_{x'}=y$ and $Y_{x'}=y'$. For instance, let $X$ represent whether a patient gets treated or not for a deadly disease, and $Y$ represent whether she survives it or not. Individual patients can be classified into immune (they survive whether they are treated or not, i.e. $y_{x}\land y_{x'}$), doomed (they die whether they are treated or not, i.e. $y'_{x}\land y'_{x'}$), benefited (they survive if and only if treated, i.e. $y_{x}\land y'_{x'}$), and harmed (they die if and only if treated, i.e. $y'_{x}\land y_{x'}$).

In general, the average treatment effect (ATE) estimated from a randomized controlled trial (RCT) does not inform about the probability of benefit (or of any of the other response types, i.e. harm, immunity, and doom). However, it may do so under certain conditions. For instance,

$$
\begin{aligned}
ATE = p(y_{x}) - p(y_{x'}) &= p(y_{x},y_{x'}) + p(y_{x},y'_{x'}) - [p(y_{x},y_{x'}) + p(y'_{x},y_{x'})] \\
&= p(\text{benefit}) - p(\text{harm})
\end{aligned}
\tag{1}
$$

and thus $p(\text{benefit})=ATE$ if $p(\text{harm})=0$ (a.k.a. monotonicity). Mueller and Pearl [1] derive necessary and sufficient conditions to determine from observational and experimental data whether monotonicity holds. In this work, we derive similar conditions for non-immunity, i.e. $p(\text{immunity})=p(y_{x},y_{x'})=0$. These are interesting because, under non-monotonicity, they make an RCT informative about the probabilities of benefit and harm. To see this, consider

$$
ATE = p(y_{x}) - p(y_{x'})
$$

where the terms on the right-hand side of the equation are estimated from an RCT. Moreover,

\begin{align}
p(y_{x}) &= p(y_{x},y_{x'}) + p(y_{x},y'_{x'}) = p(\text{immunity}) + p(\text{benefit}) \tag{2} \\
p(y_{x'}) &= p(y_{x},y_{x'}) + p(y'_{x},y_{x'}) = p(\text{immunity}) + p(\text{harm}) \tag{3}
\end{align}

and thus $p(\text{benefit})=p(y_{x})$ and $p(\text{harm})=p(y_{x'})$ if $p(\text{immunity})=0$.

In some cases, non-immunity is assured, for instance when evaluating the effect of advertising on the purchase of a new product: the control group, not being exposed to the ad, has no way of purchasing the product, i.e. $p(y_{x'})=0$ and thus $p(y_{x},y_{x'})=0$. In other cases, non-immunity cannot be assured, for instance when evaluating the effect of a drug: an individual may carry a gene variant that makes her recover from the disease regardless of whether she takes the drug or not, i.e. $p(y_{x},y_{x'})\geq 0$. However, it may still be bounded as $p(y_{x},y_{x'})\leq\epsilon$ from expert knowledge. We show that our necessary and sufficient conditions for non-immunity can trivially be adapted to $\epsilon$-bounded immunity. Moreover, we show that the knowledge of $\epsilon$-bounded immunity may tighten the bounds of the probabilities of benefit and harm by Tian and Pearl [2]. We also introduce the concepts of indirect benefit and harm (i.e., through a mediator) and repeat our previous analysis for them. Finally, we propose a method for sensitivity analysis of immunity under unmeasured confounding. We illustrate our results with concrete examples.

2. Conditions for Non-Immunity

Consider the bounds of $p(\text{benefit})$ derived by Tian and Pearl [2]:

$$
\max\left\{\begin{array}{c}
0,\\
p(y_{x})-p(y_{x'}),\\
p(y)-p(y_{x'}),\\
p(y_{x})-p(y)
\end{array}\right\}
\leq p(\text{benefit}) \leq
\min\left\{\begin{array}{c}
p(y_{x}),\\
p(y'_{x'}),\\
p(x,y)+p(x',y'),\\
p(y_{x})-p(y_{x'})+p(x,y')+p(x',y)
\end{array}\right\}.
\tag{4}
$$

Then, combining Equations 2 or 3 with 4 gives

$$
\max\left\{\begin{array}{c}
0,\\
p(y_{x})-p(y'_{x'}),\\
p(y_{x})-p(x,y)-p(x',y'),\\
p(y_{x'})-p(x,y')-p(x',y)
\end{array}\right\}
\leq p(\text{immunity}) \leq
\min\left\{\begin{array}{c}
p(y_{x}),\\
p(y_{x'}),\\
p(y_{x})-p(y)+p(y_{x'}),\\
p(y)
\end{array}\right\}.
\tag{5}
$$

A sufficient condition for $p(\text{immunity})=0$ to hold is that some argument to the min function in Equation 5 is equal to 0, that is

$$
p(y_{x})=0 \text{ or } p(y_{x'})=0 \text{ or } p(y_{x})+p(y_{x'})=p(y) \text{ or } p(y)=0.
\tag{6}
$$

Likewise, a necessary condition for $p(\text{immunity})=0$ to hold is that all the arguments to the max function are non-positive, that is

$$
\begin{aligned}
& p(y_{x})+p(y_{x'})\leq 1 \text{ and } \\
& p(y_{x})\leq p(x,y)+p(x',y') \text{ and } \\
& p(y_{x'})\leq p(x,y')+p(x',y).
\end{aligned}
\tag{7}
$$
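As a rough illustration of Equations 4 to 7, the following Python sketch computes both pairs of bounds and checks the two conditions. The function names and the input numbers are hypothetical: the numbers are chosen only so that the experimental and observational quantities are mutually compatible, and they are not taken from the examples in this paper. The sketch relies on the observation that, for compatible data, Equation 6 amounts to the upper bound in Equation 5 being zero and Equation 7 amounts to the lower bound in Equation 5 being zero.

```python
def benefit_bounds(p_yx, p_yxp, p_xy, p_xyp, p_xpy, p_xpyp):
    """Bounds on p(benefit) as in Equation 4.

    p_yx, p_yxp: experimental p(y_x) and p(y_x').
    p_xy, p_xyp, p_xpy, p_xpyp: observational p(x,y), p(x,y'), p(x',y), p(x',y').
    """
    p_y = p_xy + p_xpy
    lower = max(0.0, p_yx - p_yxp, p_y - p_yxp, p_yx - p_y)
    upper = min(p_yx, 1.0 - p_yxp, p_xy + p_xpyp,
                p_yx - p_yxp + p_xyp + p_xpy)
    return lower, upper


def immunity_bounds(p_yx, p_yxp, p_xy, p_xyp, p_xpy, p_xpyp):
    """Bounds on p(immunity) as in Equation 5."""
    p_y = p_xy + p_xpy
    lower = max(0.0, p_yx - (1.0 - p_yxp), p_yx - p_xy - p_xpyp,
                p_yxp - p_xyp - p_xpy)
    upper = min(p_yx, p_yxp, p_yx - p_y + p_yxp, p_y)
    return lower, upper


def sufficient_for_non_immunity(tol=1e-12, **probs):
    """Equation 6: some argument of the min in Equation 5 is zero."""
    return immunity_bounds(**probs)[1] <= tol


def necessary_for_non_immunity(tol=1e-12, **probs):
    """Equation 7: no argument of the max in Equation 5 is positive."""
    return immunity_bounds(**probs)[0] <= tol


# Hypothetical inputs, for illustration only.
data = dict(p_yx=0.5, p_yxp=0.3, p_xy=0.3, p_xyp=0.2, p_xpy=0.1, p_xpyp=0.4)

b_lo, b_hi = benefit_bounds(**data)
i_lo, i_hi = immunity_bounds(**data)
print("benefit bounds: ", round(b_lo, 3), round(b_hi, 3))
print("immunity bounds:", round(i_lo, 3), round(i_hi, 3))
print("sufficient condition holds:", sufficient_for_non_immunity(**data))
print("necessary condition holds: ", necessary_for_non_immunity(**data))
```

For these inputs the necessary condition holds while the sufficient one does not, so non-immunity can be neither confirmed nor refuted. The two pairs of bounds are also linked by Equation 2: subtracting the benefit bounds from $p(y_{x})$ gives the immunity bounds in reverse order.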

2.1. Conditions for $\epsilon$-Bounded Immunity

The conditions in the previous section can be relaxed to allow a certain degree of immunity (e.g., based on expert knowledge), making them more applicable in practice. Specifically, a sufficient condition for $p(\text{immunity})\leq\epsilon$ to hold is

$$
p(y_{x})\leq\epsilon \text{ or } p(y_{x'})\leq\epsilon \text{ or } p(y_{x})+p(y_{x'})\leq p(y)+\epsilon \text{ or } p(y)\leq\epsilon.
$$

Likewise, a necessary condition for $p(\text{immunity})\leq\epsilon$ to hold is

$$
\begin{aligned}
& p(y_{x})+p(y_{x'})\leq 1+\epsilon \text{ and } \\
& p(y_{x})\leq p(x,y)+p(x',y')+\epsilon \text{ and } \\
& p(y_{x'})\leq p(x,y')+p(x',y)+\epsilon.
\end{aligned}
\tag{8}
$$

2.2. $\epsilon$-Bounds on Benefit and Harm

Assuming $\epsilon$-bounded immunity (e.g., based on expert knowledge) can help narrow the bounds on $p(\text{benefit})$ and $p(\text{harm})$. Specifically, if $p(\text{immunity})\leq\epsilon$ then Equation 2 gives

$$
p(y_{x})-\epsilon \leq p(\text{benefit}) \leq p(y_{x}).
$$

Incorporating this into Equation 4 gives

$$
\max\left\{\begin{array}{c}
0,\\
p(y_{x})-p(y_{x'}),\\
p(y)-p(y_{x'}),\\
p(y_{x})-p(y),\\
p(y_{x})-\epsilon
\end{array}\right\}
\leq p(\text{benefit}) \leq
\min\left\{\begin{array}{c}
p(y_{x}),\\
p(y'_{x'}),\\
p(x,y)+p(x',y'),\\
p(y_{x})-p(y_{x'})+p(x,y')+p(x',y)
\end{array}\right\}
\tag{9}
$$

which can potentially return a tighter lower bound than Equation 4, i.e. if $\epsilon < \min(p(y_{x'}),p(y))$. Although the value of $\epsilon$ is typically determined from expert knowledge and not from data, the experimental and observational data available do restrict the values that are valid, as indicated by Equation 8. In short, $\epsilon$ can take any value as long as the lower bound is not greater than the upper bound in Equation 9. Moreover, $p(\text{harm})$ can likewise be bounded by simply swapping $x$ and $x'$ in Equation 9.
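The restriction that the data place on $\epsilon$ can be made concrete with a short Python sketch. It assumes that the experimental and observational inputs are mutually compatible, in which case the lower bound in Equation 9 stays below the upper bound exactly when $\epsilon$ is at least $p(y_{x})$ minus the upper bound in Equation 4. The function names and the input numbers are hypothetical and serve only as an illustration.

```python
def benefit_bounds(p_yx, p_yxp, p_xy, p_xyp, p_xpy, p_xpyp):
    """Equation 4: bounds on p(benefit) without any assumption on immunity."""
    p_y = p_xy + p_xpy
    lower = max(0.0, p_yx - p_yxp, p_y - p_yxp, p_yx - p_y)
    upper = min(p_yx, 1.0 - p_yxp, p_xy + p_xpyp,
                p_yx - p_yxp + p_xyp + p_xpy)
    return lower, upper


def smallest_compatible_epsilon(**probs):
    """Smallest epsilon keeping the lower bound of Equation 9 below its upper bound."""
    _, upper = benefit_bounds(**probs)
    return max(0.0, probs["p_yx"] - upper)


def epsilon_benefit_bounds(epsilon, tol=1e-12, **probs):
    """Equation 9: Equation 4 plus the extra lower bound p(y_x) - epsilon."""
    lower, upper = benefit_bounds(**probs)
    lower = max(lower, probs["p_yx"] - epsilon)
    if lower > upper + tol:
        raise ValueError("epsilon is incompatible with the data")
    return lower, upper


# Hypothetical inputs, for illustration only.
data = dict(p_yx=0.7, p_yxp=0.5, p_xy=0.3, p_xyp=0.1, p_xpy=0.3, p_xpyp=0.3)

eps_min = smallest_compatible_epsilon(**data)
plain = benefit_bounds(**data)
tight = epsilon_benefit_bounds(0.3, **data)
print("smallest compatible epsilon:", round(eps_min, 3))
print("Equation 4 bounds:", [round(b, 3) for b in plain])
print("Equation 9 bounds (epsilon = 0.3):", [round(b, 3) for b in tight])
```

With these inputs, any $\epsilon$ below 0.2 is rejected, and $\epsilon=0.3$ raises the lower bound on the probability of benefit from 0.2 to 0.4 while leaving the upper bound at 0.5.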

2.3. Examples

This section illustrates the results above with two concrete examples. (R code for the examples can be found at https://tinyurl.com/2s3bxmyu.)

2.3.1. Example 1

A pharmaceutical company wants to market their drug to cure a disease by claiming that no one is immune. The RCT they conducted for the drug approval yielded the following:

$$
p(y_{x}) = 0.76, \qquad p(y_{x'}) = 0.31
$$

which correspond to the following unknown data generation model:

$$
\begin{array}{lll}
p(u)=0.3 & p(x|u)=0.2 & p(y|x,u)=0.9 \\
 & & p(y|x,u')=0.7 \\
 & p(x|u')=0.9 & p(y|x',u)=0.8 \\
 & & p(y|x',u')=0.1.
\end{array}
$$

Therefore, the necessary condition for non-immunity in Equation 7 does not hold (since $p(y_{x})+p(y_{x'})=1.07>1$), and thus the company is not entitled to make the claim they intended to make. The company changes strategy and now wishes to market their drug as having a minimum of 50 % efficacy, i.e. benefit. To do so, they first conduct an observational study that yields the following:

$$
\begin{array}{ll}
p(x,y)=0.5 & p(x,y')=0.2 \\
p(x',y)=0.2 & p(x',y')=0.1.
\end{array}
$$

Then, they apply Equation 4 to the RCT and observational results to conclude that $0.45\leq p(\text{benefit})\leq 0.61$. Again, the company cannot proceed with their marketing strategy. A few months later, a research publication reports that no more than 25 % of the population is immune. The company realizes that this value is compatible with their RCT and observational results, by checking the necessary condition for $\epsilon$-bounded immunity in Equation 8. More importantly, the company realizes that Equation 9 with $\epsilon=0.25$ allows them to conclude that $0.51\leq p(\text{benefit})\leq 0.61$, and thus they can resume their marketing strategy.
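The figures in this example can be reproduced, up to rounding, with the following Python sketch. It is not the R code referenced above. It assumes that $U$ is the only confounder of $X$ and $Y$, so that $p(y_{x})=\sum_{u}p(y|x,u)p(u)$, and it works with the unrounded probabilities implied by the data generation model rather than with the rounded values reported in the text. All names in the code are hypothetical.

```python
# Data generation model of Example 1 (unknown to the company; used here only
# to recover the unrounded RCT and observational quantities).
P_U = 0.3                                   # p(u)
P_X_GIVEN_U = {1: 0.2, 0: 0.9}              # p(x|u), p(x|u')
P_Y_GIVEN_XU = {(1, 1): 0.9, (1, 0): 0.7,   # p(y|x,u), p(y|x,u')
                (0, 1): 0.8, (0, 0): 0.1}   # p(y|x',u), p(y|x',u')


def pr(p, value):
    """Probability of a binary event (value = 1) or of its complement (value = 0)."""
    return p if value else 1.0 - p


def p_y_do(x):
    """p(y_x) obtained by adjusting for U: sum_u p(y|x,u) p(u)."""
    return sum(P_Y_GIVEN_XU[(x, u)] * pr(P_U, u) for u in (1, 0))


def p_joint(x, y):
    """Observational p(X=x, Y=y) = sum_u p(u) p(x|u) p(y|x,u)."""
    return sum(pr(P_U, u) * pr(P_X_GIVEN_U[u], x) * pr(P_Y_GIVEN_XU[(x, u)], y)
               for u in (1, 0))


p_yx, p_yxp = p_y_do(1), p_y_do(0)
p_xy, p_xyp = p_joint(1, 1), p_joint(1, 0)
p_xpy, p_xpyp = p_joint(0, 1), p_joint(0, 0)
p_y = p_xy + p_xpy


def necessary_condition(epsilon=0.0):
    """Equation 7 (epsilon = 0) or Equation 8 (epsilon > 0)."""
    return (p_yx + p_yxp <= 1.0 + epsilon
            and p_yx <= p_xy + p_xpyp + epsilon
            and p_yxp <= p_xyp + p_xpy + epsilon)


def benefit_bounds(epsilon=None):
    """Equation 4, or Equation 9 when epsilon is given."""
    lower = max(0.0, p_yx - p_yxp, p_y - p_yxp, p_yx - p_y)
    upper = min(p_yx, 1.0 - p_yxp, p_xy + p_xpyp,
                p_yx - p_yxp + p_xyp + p_xpy)
    if epsilon is not None:
        lower = max(lower, p_yx - epsilon)
    return lower, upper


print("Equation 7 holds:", necessary_condition())
print("Equation 8 holds for epsilon = 0.25:", necessary_condition(0.25))
print("Equation 4 bounds:", [round(b, 2) for b in benefit_bounds()])
print("Equation 9 bounds:", [round(b, 2) for b in benefit_bounds(0.25)])
```

The upper bound of 0.61 comes from the unrounded value of $p(x,y)+p(x',y')$, which is about 0.606. Plugging the rounded observational probabilities into Equation 4 would give 0.6 instead.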

2.3.2. Example 2

The previous example has shown that expert knowledge on immunity may complement experimental and observational data. While data alone rarely provide precise information on immunity (or on any other response type, for that matter), there are cases where data alone provide enough actionable information. The following example illustrates this.

A pharmaceutical company is concerned by the poor sales of a drug to cure a disease. The RCT conducted for the drug approval and a subsequent observational study yielded the following:

$$
p(y_{x}) = 0.48, \qquad p(y_{x'}) = 0.36
$$

and

$$
\begin{array}{ll}
p(x,y)=0.08 & p(x,y')=0.2 \\
p(x',y)=0.25 & p(x',y')=0.47
\end{array}
$$

which correspond to the following unknown data generation model:

$$
\begin{array}{lll}
p(u)=0.4 & p(x|u)=0.1 & p(y|x,u)=0.9 \\
 & & p(y|x,u')=0.2 \\
 & p(x|u')=0.4 & p(y|x',u)=0.3 \\
 & & p(y|x',u')=0.4.
\end{array}
$$

The fact that 36 % of the untreated recover from the disease makes the company suspect that the low sales are due to a large part of the population being immune. Equation 5 allows us to conclude that $0\leq p(\text{immunity})\leq 0.34$, which suggests that the explanation offered by the company is rather unlikely. A more plausible explanation for the low sales may be that the efficacy or benefit of the drug is not very high, as $0.14\leq p(\text{benefit})\leq 0.48$ by Equation 4.

3. Indirect Benefit and Harm

In the previous sections, the causal graph of the domain under study was unknown. In this section, we assume that the graph is available (e.g., from expert knowledge) and discuss two advantages that follow with it. Specifically, suppose that the domain under study corresponds to the following causal graph:

[Figure: causal graph with edges $X\rightarrow Y$, $X\rightarrow Z$ and $Z\rightarrow Y$.]

and thus $p(y_{x})=p(y|x)$ and $p(y_{x'})=p(y|x')$. Then, $p(y_{x})$ and $p(y_{x'})$ can be estimated from observational data and thus, unlike in the previous sections, no RCT is required. A further advantage is that we can now compute the probabilities of benefit and harm mediated by $Z$. We elaborate on this below.

The effect of $X$ on $Y$ mediated by $Z$ (a.k.a. indirect effect) corresponds to the effect due to the indirect path $X\rightarrow Z\rightarrow Y$, i.e. after deactivating the direct path $X\rightarrow Y$. Different ways of deactivating the direct path have resulted in different indirect effect measures in the literature. Pearl [3] proposes deactivating the direct path by setting $X$ to non-exposure and comparing the expected outcome when $Z$ takes the value it would under exposure and non-exposure:

$$
NIE = E[Y_{x',Z_{x}}] - E[Y_{x'}]
$$

which is known as the average natural (or pure) indirect effect. Geneletti [4] also proposes deactivating the direct path by setting $X$ to non-exposure but, instead, she proposes comparing the expected outcome when $Z$ is drawn from the distributions $\mathcal{Z}_{x}$ and $\mathcal{Z}_{x'}$ of $Z_{x}$ and $Z_{x'}$:

$$
IIE = E[Y_{x',\mathcal{Z}_{x}}] - E[Y_{x',\mathcal{Z}_{x'}}]
$$

which is known as the interventional indirect effect. Although $NIE$ and $IIE$ do not coincide in general, they coincide for the causal graph above [5]. Finally, Fulcher et al. [6] propose deactivating the direct path by setting $X$ to its natural (observed) value and comparing the expected outcome when $Z$ takes its natural value and the value it would under no exposure:

$$
PIIE = E[Y_{X,Z_{X}}] - E[Y_{X,Z_{x'}}]
$$

which is also known as the population intervention indirect effect. This measure is suitable when the exposure is harmful (e.g., smoking), and thus one may be more interested in elucidating the effect (e.g., disease prevalence) of eliminating the exposure rather than in contrasting the effects of exposure and non-exposure.

We propose an alternative way of deactivating the direct path $X\rightarrow Y$ and measuring the indirect effect of $X$ on $Y$ through $Z$. Specifically, we assume that the direct path $X\rightarrow Y$ is actually mediated by an unmeasured random variable $U$ that is left unmodelled. This arguably holds in most domains. The identity of $U$ is irrelevant. Let $G$ denote the causal graph below, i.e. the original causal graph refined with the addition of $U$.

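[Figure: causal graph $G$ with edges $X\rightarrow U$, $U\rightarrow Y$, $X\rightarrow Z$ and $Z\rightarrow Y$.]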

Now, deactivating the direct path $X\rightarrow Y$ in the original causal graph can be achieved by adjusting for $U$ in $G$, i.e. $\sum_{u}E[Y|x,u]p(u)$. Unfortunately, $U$ is unmeasured. Instead, we propose the following way of deactivating $X\rightarrow Y$. Let $H$ denote the causal graph below, i.e. the result of reversing the edge $X\rightarrow U$ in $G$.

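[Figure: causal graph $H$ with edges $U\rightarrow X$, $U\rightarrow Y$, $X\rightarrow Z$ and $Z\rightarrow Y$.]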

The average total effect of $X$ on $Y$ in $H$ can be computed by the front-door criterion [7]:

$$
\begin{aligned}
TE &= E[Y_{x}] - E[Y_{x'}] \\
&= \sum_{z}p(z|x)\sum_{\dot{x}}E[Y|\dot{x},z]p(\dot{x}) - \sum_{z}p(z|x')\sum_{\dot{x}}E[Y|\dot{x},z]p(\dot{x}).
\end{aligned}
$$

Note that $G$ and $H$ are distribution equivalent, i.e. every probability distribution that is representable by $G$ is representable by $H$ and vice versa [7]. Then, evaluating the second line of the equation above in $G$ or $H$ gives the same result. If we evaluate it in $H$, then it corresponds to the part of association between $X$ and $Y$ that is attributable to the path $X\rightarrow Z\rightarrow Y$. If we evaluate it in $G$, then it corresponds to the part of $TE$ in $G$ that is attributable to the path $X\rightarrow Z\rightarrow Y$, because $TE$ in $G$ equals the association between $X$ and $Y$, since $G$ has only directed paths from $X$ to $Y$. Therefore, the second line in the equation above corresponds to the part of $TE$ in the original causal graph that is attributable to the path $X\rightarrow Z\rightarrow Y$, thereby deactivating the direct path $X\rightarrow Y$. We propose to use the second line in the equation above as a measure of the indirect effect of $X$ on $Y$ in the original causal graph.

The reasoning above can be extended to the probabilities of benefit and harm, and thereby measure the benefit and harm mediated by $Z$. As mentioned above, the causal graphs $G$ and $H$ represent different data generation mechanisms but the same probability distribution over $X$, $Y$ and $Z$. Therefore, the mechanisms agree on observational probabilities but may disagree on counterfactual probabilities. We use $p()$ to denote observational probabilities obtained from either mechanism, and $q()$ to denote counterfactual probabilities obtained from the mechanism corresponding to $H$. The probabilities of benefit and harm of $X$ on $Y$ mediated by $Z$ in $G$ and thus in the original causal graph (henceforth indirect benefit and harm, or $IB$ and $IH$) can be computed by applying Equation 2 to $H$. That is,

$$
IB = q(\text{benefit}) = q(y_{x}) = \sum_{z}p(z|x)\sum_{\dot{x}}p(y|\dot{x},z)p(\dot{x})
$$

where the second equality holds if $q(\text{immunity})=0$, and the third is due to the front-door criterion on $H$. Likewise for $IH$, simply replacing $x$ by $x'$. Applying Equation 5 to $H$ yields necessary and sufficient conditions for $q(\text{immunity})=0$. That is,

$$
\begin{aligned}
& \sum_{z}p(z|x)\sum_{\dot{x}}p(y|\dot{x},z)p(\dot{x})=0 \text{ or } \\
& \sum_{z}p(z|x')\sum_{\dot{x}}p(y|\dot{x},z)p(\dot{x})=0 \text{ or } \\
& \sum_{z}[p(z|x)+p(z|x')]\sum_{\dot{x}}p(y|\dot{x},z)p(\dot{x})=p(y) \text{ or } \\
& p(y)=0
\end{aligned}
\tag{10}
$$

is a sufficient condition, whereas

$$
\begin{aligned}
& \sum_{z}[p(z|x)+p(z|x')]\sum_{\dot{x}}p(y|\dot{x},z)p(\dot{x})\leq 1 \text{ and } \\
& \sum_{z}p(z|x)\sum_{\dot{x}}p(y|\dot{x},z)p(\dot{x})\leq p(x,y)+p(x',y') \text{ and } \\
& \sum_{z}p(z|x')\sum_{\dot{x}}p(y|\dot{x},z)p(\dot{x})\leq p(x,y')+p(x',y)
\end{aligned}
\tag{11}
$$

is a necessary condition. Necessary and sufficient conditions for $\epsilon$-bounded immunity on $H$ (i.e., $q(\text{immunity})\leq\epsilon$) can be obtained much like in Section 2.1. That is, it suffices to add $\epsilon$ to the right-hand sides of the conditions above and replace $=$ with $\leq$. Finally, we can adapt accordingly the equations in Section 2.2 to obtain $\epsilon$-bounds on $IB$ and $IH$. Note that the analysis of indirect benefit and harm presented here does not require an RCT, i.e. all the expressions involved can be estimated from just observational data.

3.1. Example

This section illustrates the results above with a concrete example borrowed from Pearl [8]. It concerns the following causal graph:

[Figure: causal graph with edges $X\rightarrow Y$, $X\rightarrow Z$ and $Z\rightarrow Y$.]

where $X$ represents a drug treatment, $Z$ the presence of a certain enzyme in a patient's blood, and $Y$ recovery. Moreover, we have that

$$
\begin{array}{ll}
p(z|x)=0.75 & p(y|x,z)=0.8 \\
 & p(y|x,z')=0.4 \\
p(z|x')=0.4 & p(y|x',z)=0.3 \\
 & p(y|x',z')=0.2.
\end{array}
$$

Since $p(x)$ is not given in the original example, we take $p(x)=0.6$.

Pearl imagines a scenario where the pharmaceutical company plans to develop a cheaper drug that is equal to the existing one except for the lack of direct effect on recovery, i.e. it just stimulates enzyme production as much as the existing drug. Therefore, the probability of benefit of the planned drug is the probability of benefit of the existing drug that is mediated by the enzyme. The company wants to market their drugs by claiming that no one is immune. The sufficient conditions for non-immunity in Equations 6 and 10 do not hold for the drugs. However, while the existing drug satisfies the necessary condition for non-immunity in Equation 7, the planned drug does not satisfy the corresponding condition in Equation 11. Therefore, the company should either abandon their marketing strategy or abandon the plan to develop the new drug and instead focus on trying to confirm non-immunity for the existing drug.
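The checks in this example can be retraced with the following Python sketch. It uses the conditional probabilities listed above together with $p(x)=0.6$, takes $p(y_{x})=p(y|x)$ for the existing drug, and evaluates the front-door expression for the planned drug. The function names are hypothetical, and the equality tests in the sufficient condition use a small numerical tolerance.

```python
import math

P_X = 0.6                                    # p(x), as chosen in the text
P_Z_GIVEN_X = {1: 0.75, 0: 0.4}              # p(z|x), p(z|x')
P_Y_GIVEN_XZ = {(1, 1): 0.8, (1, 0): 0.4,    # p(y|x,z), p(y|x,z')
                (0, 1): 0.3, (0, 0): 0.2}    # p(y|x',z), p(y|x',z')


def pr(p, value):
    """Probability of a binary event (value = 1) or of its complement (value = 0)."""
    return p if value else 1.0 - p


def p_y_given_x(x):
    """p(y|x) = sum_z p(y|x,z) p(z|x), which equals p(y_x) in the original graph."""
    return sum(P_Y_GIVEN_XZ[(x, z)] * pr(P_Z_GIVEN_X[x], z) for z in (1, 0))


def q_y_do(x):
    """Front-door expression sum_z p(z|x) sum_xd p(y|xd,z) p(xd), i.e. q(y_x)."""
    return sum(pr(P_Z_GIVEN_X[x], z)
               * sum(P_Y_GIVEN_XZ[(xd, z)] * pr(P_X, xd) for xd in (1, 0))
               for z in (1, 0))


# Observational joint distribution of X and Y.
p_xy = P_X * p_y_given_x(1)
p_xyp = P_X * (1.0 - p_y_given_x(1))
p_xpy = (1.0 - P_X) * p_y_given_x(0)
p_xpyp = (1.0 - P_X) * (1.0 - p_y_given_x(0))
p_y = p_xy + p_xpy


def sufficient(a, b, tol=1e-12):
    """Equation 6 (or 10); a and b stand for p(y_x), p(y_x') (or q(y_x), q(y_x'))."""
    return (math.isclose(a, 0.0, abs_tol=tol) or math.isclose(b, 0.0, abs_tol=tol)
            or math.isclose(a + b, p_y, abs_tol=tol)
            or math.isclose(p_y, 0.0, abs_tol=tol))


def necessary(a, b):
    """Equation 7 (or 11)."""
    return a + b <= 1.0 and a <= p_xy + p_xpyp and b <= p_xyp + p_xpy


existing = (p_y_given_x(1), p_y_given_x(0))   # existing drug
planned = (q_y_do(1), q_y_do(0))              # planned drug (indirect path only)

print("existing drug:", [round(v, 3) for v in existing],
      sufficient(*existing), necessary(*existing))
print("planned drug: ", [round(v, 3) for v in planned],
      sufficient(*planned), necessary(*planned))
```

Under these numbers the planned drug fails the last inequality in Equation 11, since $q(y_{x'})=0.432$ exceeds $p(x,y')+p(x',y)=0.276$.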

4. Sensitivity Analysis of Immunity

In this section, like in the previous section, we assume that the causal graph of the domain under study is available, e.g. from expert knowledge. We also assume that we only have access to observational data, i.e. no RCT is available. Specifically, consider the following causal graph:

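[Figure: causal graph with edges $X\rightarrow Z$ and $Z\rightarrow Y$, plus potential unmeasured confounding between $X$ and $Y$.]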

which includes potential unmeasured exposure-outcome confounding. Since $p(y_{x})=\sum_{z}p(z|x)\sum_{\dot{x}}E[Y|\dot{x},z]p(\dot{x})$ by the front-door criterion, we can proceed as in the previous section to derive necessary and sufficient conditions for non-immunity. Suppose now that $Z$ is unmeasured or that the effect of $X$ on $Y$ is direct rather than mediated by $Z$. Then, $p(y_{x})$ is unidentifiable from observational data [7], and thus we cannot proceed as in the previous section. We therefore take an alternative approach to inform the analyst about the probability of immunity and thereby help her in decision making. In particular, we propose a sensitivity analysis method to bound the probability of immunity as a function of the observed data distribution and some intuitive sensitivity parameters. Our method is a straightforward adaptation of the method by Peña [9], originally developed to bound the probabilities of benefit and harm.

Let $U$ denote the unmeasured exposure-outcome confounders. For simplicity, we assume that all these confounders are categorical, but our results also hold for ordinal and continuous confounders. (If $U$ is continuous then sums/maxima/minima over $u$ should be replaced by integrals/suprema/infima.) For simplicity, we treat $U$ as a categorical random variable whose levels are the Cartesian product of the levels of the elements in the original $U$.

Note that

$$
p(y_{x}) = p(y_{x}|x)p(x) + p(y_{x}|x')p(x') = p(y|x)p(x) + p(y_{x}|x')p(x')
$$

where the second equality follows from counterfactual consistency, i.e. $X=x\Rightarrow Y_{x}=Y$. Moreover,

$$
p(y_{x}|x') = \sum_{u}p(y_{x}|x',u)p(u|x') = \sum_{u}p(y|x,u)p(u|x') \leq \max_{u}p(y|x,u)
$$

where the second equality follows from $Y_{x}\perp X\,|\,U$ for all $x$, and counterfactual consistency. Likewise,

$$
p(y_{x}|x') \geq \min_{u}p(y|x,u).
$$

Now, let us define

$$
M_{x} = \max_{u}p(y|x,u)
$$

and

$$
m_{x} = \min_{u}p(y|x,u)
$$

and likewise $M_{x'}$ and $m_{x'}$. Then,

$$
p(x,y)+p(x')m_{x} \leq p(y_{x}) \leq p(x,y)+p(x')M_{x}
$$

and likewise

$$
p(x',y)+p(x)m_{x'} \leq p(y_{x'}) \leq p(x',y)+p(x)M_{x'}.
$$

These equations together with Equation 5 give

$$
\max\left\{\begin{array}{c}
0,\\
p(x')m_{x}+p(x)m_{x'}-p(y'),\\
p(x')m_{x}-p(x',y'),\\
p(x)m_{x'}-p(x,y')
\end{array}\right\}
\leq p(\text{immunity}) \leq
\min\left\{\begin{array}{c}
p(x,y)+p(x')M_{x},\\
p(x',y)+p(x)M_{x'},\\
p(x')M_{x}+p(x)M_{x'},\\
p(y)
\end{array}\right\}
\tag{12}
$$

where $m_{x}$, $M_{x}$, $m_{x'}$ and $M_{x'}$ are sensitivity parameters. The possible regions for $m_{x}$ and $M_{x}$ are

$$
0 \leq m_{x} \leq p(y|x) \leq M_{x} \leq 1
\tag{13}
$$

and likewise for $m_{x'}$ and $M_{x'}$.

Our lower bound in Equation 12 is informative if and only if

$$
0 < p(x')m_{x}-p(x',y')
$$

or

$$
0 < p(x)m_{x'}-p(x,y')
$$

(note that the second row in the maximum equals the third plus the fourth rows). Then, the informative regions for $m_{x}$ and $m_{x'}$ are

$$
p(y'|x') < m_{x} \leq p(y|x)
$$

and

$$
p(y'|x) < m_{x'} \leq p(y|x').
$$

On the other hand, our upper bound in Equation 12 is informative (recall that we already know that $p(\text{immunity})\leq p(y)$ by Equation 5) if and only if

$$
p(x,y)+p(x')M_{x} < p(y)
$$

or

$$
p(x',y)+p(x)M_{x'} < p(y)
$$

(note that the third row in the minimum equals the first plus the second minus the fourth rows), which occurs if and only if $p(y|x) < p(y|x')$ or $p(y|x') < p(y|x)$. To see it, rewrite $p(y)=p(x,y)+p(x',y)$ and recall Equation 13. Therefore, our upper bound is always informative, and thus the informative regions for $M_{x}$ and $M_{x'}$ coincide with their possible regions.

Figure 1. Lower and upper bounds of $p(\text{immunity})$ in the example in Section 4.1 as functions of the sensitivity parameters $m_{x}$, $m_{x'}$, $M_{x}$ and $M_{x'}$.

4.1. Example

We illustrate our method for sensitivity analysis of $p(\text{immunity})$ with the following fictitious epidemiological example. Consider a population consisting of a majority and a minority group. Let the binary random variable $U$ represent the group an individual belongs to. Let $X$ represent whether the individual gets treated or not for a certain disease. Let $Y$ represent whether the individual survives the disease. Assume that the scientific community agrees that $U$ is a confounder for $X$ and $Y$. Assume also that it is illegal to store the values of $U$, to avoid discrimination complaints. In other words, the identity of the confounder is known but its values are not. More specifically, consider the following unknown data generation model:

$$
\begin{array}{lll}
p(u)=0.2 & p(x|u)=0.4 & p(y|x,u)=0.9 \\
 & & p(y|x,u')=0.8 \\
 & p(x|u')=0.2 & p(y|x',u)=0.2 \\
 & & p(y|x',u')=0.7.
\end{array}
$$

Since this model does not specify the functional forms of the causal mechanisms, we cannot compute the true $p(\text{immunity})$ [7]. However, we can bound it by Equation 5 and the fact that $p(y_{x})=\sum_{u}p(y|x,u)p(u)$ [7], which yields $p(\text{immunity})\in[0.42,0.6]$. Note that these bounds cannot be computed in practice because $U$ is unmeasured.

Figure 1 (top) shows the lower bound of $p(\text{immunity})$ in Equation 12 as a function of the sensitivity parameters $m_{x}$ and $m_{x'}$. The axes span the possible regions of the parameters. The dashed lines indicate the informative regions of the parameters. Specifically, the bottom left quadrant corresponds to the non-informative region, i.e. the lower bound is zero. In the data generation model considered, $m_{x}=0.8$ and $m_{x'}=0.2$. These values are unknown to the epidemiologist, because $U$ is unobserved. However, the figure reveals that the epidemiologist only needs to have some rough idea of these values to confidently conclude that $p(\text{immunity})$ is lower bounded by 0.2. Figure 1 (bottom) shows our upper bound of $p(\text{immunity})$ in Equation 12 as a function of the sensitivity parameters $M_{x}$ and $M_{x'}$. Likewise, having some rough idea of the unknown values $M_{x}=0.9$ and $M_{x'}=0.7$ enables the epidemiologist to confidently conclude that $p(\text{immunity})$ is upper bounded by 0.65. Applying Equation 5 with just observational data produces looser bounds, namely 0 and 0.67. Recall that $p(\text{immunity})\in[0.42,0.6]$ in truth.
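To make the role of the sensitivity parameters concrete, the following Python sketch evaluates Equation 12 for this example. It derives the observational distribution from the data generation model and then plugs in either the parameter values implied by the model or the trivial values $m=0$ and $M=1$. The function names are hypothetical, and the sketch is not the code used to produce Figure 1.

```python
# Data generation model of the example (U is unmeasured in practice).
P_U = 0.2                                   # p(u)
P_X_GIVEN_U = {1: 0.4, 0: 0.2}              # p(x|u), p(x|u')
P_Y_GIVEN_XU = {(1, 1): 0.9, (1, 0): 0.8,   # p(y|x,u), p(y|x,u')
                (0, 1): 0.2, (0, 0): 0.7}   # p(y|x',u), p(y|x',u')


def pr(p, value):
    """Probability of a binary event (value = 1) or of its complement (value = 0)."""
    return p if value else 1.0 - p


def p_joint(x, y):
    """Observational p(X=x, Y=y) = sum_u p(u) p(x|u) p(y|x,u)."""
    return sum(pr(P_U, u) * pr(P_X_GIVEN_U[u], x) * pr(P_Y_GIVEN_XU[(x, u)], y)
               for u in (1, 0))


p_xy, p_xyp = p_joint(1, 1), p_joint(1, 0)
p_xpy, p_xpyp = p_joint(0, 1), p_joint(0, 0)
p_x, p_xp = p_xy + p_xyp, p_xpy + p_xpyp
p_y, p_yp = p_xy + p_xpy, p_xyp + p_xpyp


def immunity_lower(m_x, m_xp):
    """Lower bound in Equation 12."""
    return max(0.0,
               p_xp * m_x + p_x * m_xp - p_yp,
               p_xp * m_x - p_xpyp,
               p_x * m_xp - p_xyp)


def immunity_upper(M_x, M_xp):
    """Upper bound in Equation 12."""
    return min(p_xy + p_xp * M_x,
               p_xpy + p_x * M_xp,
               p_xp * M_x + p_x * M_xp,
               p_y)


# Sensitivity parameters implied by the model (unknown to the analyst).
m_x = min(P_Y_GIVEN_XU[(1, u)] for u in (1, 0))
M_x = max(P_Y_GIVEN_XU[(1, u)] for u in (1, 0))
m_xp = min(P_Y_GIVEN_XU[(0, u)] for u in (1, 0))
M_xp = max(P_Y_GIVEN_XU[(0, u)] for u in (1, 0))

print("bounds at the model's parameters:",
      round(immunity_lower(m_x, m_xp), 3), round(immunity_upper(M_x, M_xp), 3))
print("bounds with trivial parameters:  ",
      round(immunity_lower(0.0, 0.0), 3), round(immunity_upper(1.0, 1.0), 3))
```

At the parameter values implied by the model, the sketch gives a lower bound of about 0.33 and an upper bound of 0.64, which is in line with the more cautious figures of 0.2 and 0.65 quoted above for an analyst who only has a rough idea of the parameters. With the trivial parameter values it recovers the observational bounds 0 and 0.67.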

5. Discussion

The analysis in this work can be repeated for $p(\text{doom})$ instead of $p(\text{immunity})$ by simply swapping $y$ and $y'$, and $p(\text{benefit})$ and $p(\text{harm})$. Additionally, the analysis of indirect benefit can be repeated for $p(\text{harm})$ instead of $p(\text{immunity})$ due to Equation 1, and thereby extend the analysis by Mueller and Pearl [1].

References

  [1] S. Mueller and J. Pearl. Monotonicity: Detection, Refutation, and Ramification. UCLA Cognitive Systems Laboratory, Technical Report (R-529), 2023.
  [2] J. Tian and J. Pearl. Probabilities of Causation: Bounds and Identification. Annals of Mathematics and Artificial Intelligence, 28:287–313, 2000.
  [3] J. Pearl. Direct and Indirect Effects. In Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence, pages 411–420, 2001.
  [4] S. Geneletti. Identifying Direct and Indirect Effects in a Non-Counterfactual Framework. Journal of the Royal Statistical Society Series B, 69:199–215, 2007.
  [5] T. J. VanderWeele, S. Vansteelandt, and J. M. Robins. Effect Decomposition in the Presence of an Exposure-Induced Mediator-Outcome Confounder. Epidemiology, 25:300–306, 2014.
  [6] I. R. Fulcher, I. Shpitser, S. Marealle, and E. J. Tchetgen Tchetgen. Robust Inference on Population Indirect Causal Effects: The Generalized Front Door Criterion. Journal of the Royal Statistical Society Series B, 82:199–214, 2020.
  [7] J. Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2009.
  [8] J. Pearl. The Causal Mediation Formula - A Guide to the Assessment of Pathways and Mechanisms. Prevention Science, 13:426–436, 2012.
  [9] J. M. Peña. Bounding the Probabilities of Benefit and Harm Through Sensitivity Parameters and Proxies. Journal of Causal Inference, 11:20230012, 2023.