Identification of Strong Edges in AMP Chain Graphs

Jose M. Peña
Department of Computer and Information Science
Linköping University
58183 Linköping, Sweden
Abstract

The essential graph is a distinguished member of a Markov equivalence class of AMP chain graphs. However, the directed edges in the essential graph are not necessarily strong or invariant, i.e. they may not be shared by every member of the equivalence class. Likewise for the undirected edges. In this paper, we develop a procedure for identifying which edges in an essential graph are strong. We also show how this makes it possible to bound some causal effects when the true chain graph is unknown.

1 INTRODUCTION

In most practical applications, the available data consists of observations. Therefore, it can rarely single out the true causal model. At best, it identifies the Markov equivalence class that contains the true causal model. In this paper, we represent causal models with the help of AMP chain graphs (Andersson et al., 2001). As argued by Peña (2016), these graphs are suitable for representing causal linear models with additive Gaussian noise. Intuitively, the directed subgraph of a chain graph represents the causal relations in the domain, and the undirected subgraph represents the dependence structure of the noise terms. Additive noise is a rather common assumption in causal discovery (Peters et al., 2017), mainly because it produces tractable models which are useful for gaining insight into the system under study. Note also that linear structural equation models, which have been studied extensively for causal effect identification (Pearl, 2009), are additive noise models.

In order to represent the equivalence class of chain graphs identified from the observations at hand, we typically use a distinguished member of it. In the literature, there are two distinguished members: The essential graph (Andersson and Perlman, 2006), and the largest deflagged graph (Roverato and Studený, 2006). In general, they do not coincide: The essential graph is a deflagged graph (Andersson and Perlman, 2006, Lemma 3.2) but not necessarily the largest in the equivalence class (Andersson et al., 2001, p. 57). Unfortunately, the directed edges in either of the two representatives are not necessarily strong (the terms invariant and essential are also used in the literature), i.e. they may not be shared by every member of the equivalence class. Likewise for the undirected edges. In this paper, we use essential graphs to represent equivalence classes of chain graphs, and we develop a procedure for identifying which edges in an essential graph are strong. Note that while we assume that the true chain graph is unknown, its corresponding essential graph can be obtained from observational data as follows. First, learn a chain graph as shown by Peña (2014, 2016) and Peña and Gómez-Olmedo (2016) and, then, transform it into an essential graph as shown by Sonntag and Peña (2015, Section 3).

Identifying the strong edges in an essential graph is important because it makes it possible to identify causal paths from data even though the data may not be able to single out the true chain graph: Simply output every directed path in the essential graph that consists of only strong edges. Of course, the true chain graph may have additional causal paths. Identifying the strong edges in an essential graph is also important because it makes it possible to efficiently bound some causal effects of the form $p(y|do(x))$ where $X$ and $Y$ are singletons. The simplest way to bound such a causal effect consists in enumerating all the chain graphs that are equivalent to the essential graph and, then, computing the causal effect for each of them from the observational data by adjusting for the appropriate variables. Although we know how to enumerate the equivalent chain graphs (Sonntag and Peña, 2015, Theorem 3), this method may be inefficient for all but small domains. Instead, we show in this paper how the knowledge of the strong edges in an essential graph makes it possible to enumerate the adjusting sets without enumerating the equivalent chain graphs explicitly.

The rest of the paper is organized as follows. Section 2 introduces some preliminaries. Section 3 presents our algorithm to identify strong edges in an essential graph. Section 4 presents our procedure to bound causal effects when the true chain graph is unknown but its corresponding essential graph is known. Section 5 closes the paper with some discussion and lines of future research.

2 PRELIMINARIES

All the graphs and probability distributions in this paper are defined over a finite set $V$ unless otherwise stated. All the graphs contain at most one edge between a pair of nodes. The elements of $V$ are not distinguished from singletons.

The parents of a set of nodes $X$ of a graph $G$ is the set $Pa_{G}(X)=\{A|A\rightarrow B$ is in $G$ with $B\in X\}$. The children of $X$ is the set $Ch_{G}(X)=\{A|B\rightarrow A$ is in $G$ with $B\in X\}$. The neighbors of $X$ is the set $Ne_{G}(X)=\{A|A-B$ is in $G$ with $B\in X\}$. The adjacents of $X$ is the set $Ad_{G}(X)=\{A|A\rightarrow B$, $B\rightarrow A$ or $A-B$ is in $G$ with $B\in X\}$. The descendants of $X$ is the set $De_{G}(X)=\{A|B\rightarrow\cdots\rightarrow A$ is in $G$ with $B\in X\}$. A route from a node $V_{1}$ to a node $V_{n}$ in $G$ is a sequence of (not necessarily distinct) nodes $V_{1},\ldots,V_{n}$ such that $V_{i}\in Ad_{G}(V_{i+1})$ for all $1\leq i<n$. A route is called a cycle if $V_{n}=V_{1}$. A cycle has a chord if two non-consecutive nodes of the cycle are adjacent in $G$. A cycle is called semidirected if it is of the form $V_{1}\rightarrow V_{2}\multimap\cdots\multimap V_{n}$ where $\multimap$ is short for $\rightarrow$ or $-$. A chain graph (CG) is a graph with (possibly) directed and undirected edges, and without semidirected cycles. A set of nodes of a CG $G$ is connected if there exists a route in $G$ between every pair of nodes in the set and such that all the edges in the route are undirected. A chain component of $G$ is a maximal connected set. Note that the chain components of $G$ can be sorted topologically, i.e. for every edge $A\rightarrow B$ in $G$, the component containing $A$ precedes the component containing $B$. A set of nodes of $G$ is complete if there is an undirected edge between every pair of nodes in the set. Moreover, a node is called simplicial if its neighbors are a complete set.
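The set operators defined above can be made concrete with a small data structure. The following is a minimal sketch, not taken from the paper: the class name `CG` and its method names are hypothetical, directed edges are assumed to be stored as ordered pairs and undirected edges as two-element frozensets.

```python
class CG:
    """Hypothetical container for a graph with directed and undirected edges.

    `directed` holds pairs (A, B) meaning A -> B;
    `undirected` holds frozenset({A, B}) meaning A - B.
    """

    def __init__(self, nodes, directed=(), undirected=()):
        self.nodes = set(nodes)
        self.directed = {tuple(e) for e in directed}
        self.undirected = {frozenset(e) for e in undirected}

    def pa(self, X):
        """Parents of the node set X."""
        return {a for (a, b) in self.directed if b in X}

    def ch(self, X):
        """Children of the node set X."""
        return {b for (a, b) in self.directed if a in X}

    def ne(self, X):
        """Neighbors of the node set X (via undirected edges)."""
        X = set(X)
        return {a for e in self.undirected for a in e if (e - {a}) & X}

    def ad(self, X):
        """Adjacents of the node set X."""
        return self.pa(X) | self.ch(X) | self.ne(X)

    def de(self, X):
        """Descendants of X: nodes reached by a directed path of length >= 1."""
        seen, stack = set(), list(X)
        while stack:
            v = stack.pop()
            for c in self.ch({v}) - seen:
                seen.add(c)
                stack.append(c)
        return seen

    def chain_components(self):
        """Maximal sets of nodes connected by undirected routes."""
        comps, left = [], set(self.nodes)
        while left:
            start = left.pop()
            comp, stack = {start}, [start]
            while stack:
                v = stack.pop()
                for w in self.ne({v}) - comp:
                    comp.add(w)
                    stack.append(w)
            left -= comp
            comps.append(frozenset(comp))
        return comps

    def has_semidirected_cycle(self):
        """True if some A -> B can be closed into a cycle by following
        directed edges forwards and undirected edges in either direction."""
        for (a, b) in self.directed:
            seen, stack = {b}, [b]
            while stack:
                v = stack.pop()
                if v == a:
                    return True
                for w in (self.ch({v}) | self.ne({v})) - seen:
                    seen.add(w)
                    stack.append(w)
        return False
```

For instance, `CG("ABCD", directed=[("A", "B"), ("C", "B")], undirected=[("C", "D")])` encodes $A\rightarrow B\leftarrow C-D$, and a graph qualifies as a CG under this sketch exactly when `has_semidirected_cycle()` returns `False`.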

We now recall the interpretation of CGs due to Andersson et al. (2001), also known as AMP CGs. (Andersson et al. (2001) interpret CGs via the so-called augmentation criterion. Levitz et al. (2001, Theorem 4.1) introduce the so-called p-separation criterion and prove its equivalence to the augmentation criterion. Peña (2016, Theorem 2) introduces the route-based criterion that we use in this paper and proves its equivalence to the p-separation criterion.) A node $B$ in a route $\rho$ in a CG $G$ is called a triplex node in $\rho$ if $A\rightarrow B\leftarrow C$, $A\rightarrow B-C$, or $A-B\leftarrow C$ is a subroute of $\rho$. Moreover, $\rho$ is said to be $Z$-open with $Z\subseteq V$ when (i) every triplex node in $\rho$ is in $Z$, and (ii) every non-triplex node in $\rho$ is outside $Z$. Let $X$, $Y$ and $Z$ denote three disjoint subsets of $V$. When there is no $Z$-open route in $G$ between a node in $X$ and a node in $Y$, we say that $X$ is separated from $Y$ given $Z$ in $G$ and denote it as $X\!\perp\!_{G}Y|Z$. The statistical independences represented by $G$ are the separations $X\!\perp\!_{G}Y|Z$. A probability distribution $p$ is Markovian with respect to $G$ if the independences represented by $G$ are a subset of those in $p$. If the two sets of independences coincide, then $p$ is faithful to $G$. Two CGs are Markov equivalent if the sets of distributions that are Markovian with respect to each CG are the same. If a CG has an induced subgraph of the form $A\rightarrow B\leftarrow C$, $A\rightarrow B-C$ or $A-B\leftarrow C$, then we say that the CG has a triplex $(A,B,C)$. Two CGs are Markov equivalent if and only if they have the same adjacencies and triplexes (Andersson et al., 2001, Theorem 5).
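The characterization of Markov equivalence by adjacencies and triplexes lends itself to a direct check. The sketch below is illustrative only: the function names and the edge encoding (ordered pairs for directed edges, two-element collections for undirected edges) are assumptions, and a triplex $(A,B,C)$ is stored as the unordered end pair together with the middle node.

```python
from itertools import combinations


def adjacencies(directed, undirected):
    """Unordered pairs of adjacent nodes."""
    return {frozenset(e) for e in directed} | {frozenset(e) for e in undirected}


def triplexes(directed, undirected):
    """All triplexes, each encoded as (frozenset({A, C}), B)."""
    directed = {tuple(e) for e in directed}
    undirected = {frozenset(e) for e in undirected}
    adj = adjacencies(directed, undirected)
    nodes = {v for e in adj for v in e}
    found = set()
    for b in nodes:
        around = [v for v in nodes if frozenset((v, b)) in adj]
        for a, c in combinations(around, 2):
            if frozenset((a, c)) in adj:
                continue  # A and C adjacent: not an induced subgraph of the form
            a_in, c_in = (a, b) in directed, (c, b) in directed
            a_un = frozenset((a, b)) in undirected
            c_un = frozenset((c, b)) in undirected
            # A -> B <- C, A -> B - C, or A - B <- C
            if (a_in and c_in) or (a_in and c_un) or (a_un and c_in):
                found.add((frozenset((a, c)), b))
    return found


def markov_equivalent(g, h):
    """g and h are (directed, undirected) pairs assumed to be valid CGs."""
    return (adjacencies(*g) == adjacencies(*h)
            and triplexes(*g) == triplexes(*h))
```

Under this encoding, $A\rightarrow B\leftarrow C$ and $A\rightarrow B-C$ compare as equivalent, whereas $A\rightarrow B\rightarrow C$ and $A\rightarrow B\leftarrow C$ do not. The sketch does not verify that its inputs are free of semidirected cycles.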

Lemma 1.

Two CGs $G$ and $H$ are Markov equivalent if and only if they represent the same independences.

Proof.

The if part is trivial. To see the only if part, note that Levitz et al. (2001, Theorem 6.1) prove that there are Gaussian distributions $p$ and $q$ that are faithful to $G$ and $H$, respectively. Moreover, $p$ is Markovian with respect to $H$, because $G$ and $H$ are Markov equivalent. Likewise for $q$ and $G$. Therefore, $G$ and $H$ must represent the same independences. ∎

2.1 ESSENTIAL GRAPHS

The essential graph (EG) $G^{*}$ is a distinguished member of a class of equivalent CGs. Specifically, an edge $A\rightarrow B$ is in $G^{*}$ if and only if $A\rightarrow B$ is in some member of the class and $A\leftarrow B$ is in no member of the class. An algorithm (without proof of correctness) for constructing the EG from any other member of the equivalence class has been developed by Andersson and Perlman (2004, Section 7). An alternative algorithm with proof of correctness has been developed by Sonntag and Peña (2015, Section 3). The latter algorithm can be seen in Tables 1 and 2. A perpendicular line at the end of an edge such as in $\leftfootline$ or $\leftfootline\!\!\!\!\!\rightfootline$ represents a block, and it means that the edge cannot be oriented in that direction. Note that the ends of some of the edges in the rules in Table 2 are labeled with a circle such as in $\leftfootline\!\!\!\!\!\multimap$ or $\mathrel{\reflectbox{$\multimap$}}\!\!\!\!\!\multimap$. The circle represents an unspecified end, i.e. a block or nothing. The modifications in the consequents of the rules consist in adding some blocks. Note that only the blocks that appear in the consequents are added, i.e. the circled ends do not get modified. In line 2 of Table 1, any such set $S$ will do. For instance, if $B\notin De_{G}(A)$, then let $S=Ne_{G}(A)\cup Pa_{G}(A\cup Ne_{G}(A))$, otherwise let $S=Ne_{G}(B)\cup Pa_{G}(B\cup Ne_{G}(B))$. In line 5, that the cycle has no blocks means that the ends of the edges in the cycle have no blocks. Note that the rule R1 is not used in line 6, because it will never fire after its repeated application in line 4. Finally, note that $G^{*}$ may have edges without blocks after line 6.
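The choice of separator suggested for line 2 of Table 1 can be transcribed almost literally. The following is a rough sketch under assumed conventions (directed edges as ordered pairs, undirected edges as two-element collections, hypothetical function names); it only evaluates the stated formula and does not itself verify the separation of $A$ and $B$ given $S$.

```python
def pa(directed, X):
    """Parents of the node set X."""
    return {a for (a, b) in directed if b in X}


def ne(undirected, X):
    """Neighbors of the node set X."""
    return {a for e in undirected for a in e if (set(e) - {a}) & X}


def de(directed, X):
    """Descendants of X via directed paths of length >= 1."""
    seen, stack = set(), list(X)
    while stack:
        v = stack.pop()
        for (a, b) in directed:
            if a == v and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def separator(directed, undirected, a, b):
    """S for the non-adjacent pair (a, b), following the rule in the text:
    use a's neighborhood if b is not a descendant of a, else b's."""
    base = a if b not in de(directed, {a}) else b
    nbrs = ne(undirected, {base})
    return nbrs | pa(directed, {base} | nbrs)
```

As a usage example, in $A\rightarrow B\leftarrow C-D$ the call `separator({("A", "B"), ("C", "B")}, {("C", "D")}, "B", "D")` evaluates $Ne_{G}(B)\cup Pa_{G}(B\cup Ne_{G}(B))$, which is the parent set of $B$.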

Table 1: Algorithm for constructing the EG.
In: A CG $G$.
Out: The EG $G^{*}$ in the equivalence class of $G$.
1 For each ordered pair of non-adjacent nodes $A$ and $B$ in $G$
2     Set $S_{AB}=S_{BA}=S$ such that $A\!\perp\!_{G}B|S$
3 Let $G^{*}$ denote the undirected graph that has the same adjacencies as $G$
4 Apply the rules R1-R4 to $G^{*}$ while possible
5 Replace every edge $A-B$ in every cycle in $G^{*}$ that is of length greater than three, chordless, and without blocks with $A\leftfootline\!\!\!\!\!\rightfootline B$
6 Apply the rules R2-R4 to $G^{*}$ while possible
7 Replace every edge $A\leftfootline B$ and $A\leftfootline\!\!\!\!\!\rightfootline B$ in $G^{*}$ with $A\rightarrow B$ and $A-B$, respectively
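Table 2: Rules in the algorithm in Table 1. The antecedents represent induced subgraphs.
R1: [diagram: $A$, $B$, $C$] $\Rightarrow$ [diagram: $A$, $B$, $C$] and $B\notin S_{AC}$
R2: [diagram: $A$, $B$, $C$] $\Rightarrow$ [diagram: $A$, $B$, $C$] and $B\in S_{AC}$
R3: [diagram: $A$, $\ldots$, $B$] $\Rightarrow$ [diagram: $A$, $\ldots$, $B$]
R4: [diagram: $A$, $B$, $C$, $D$] $\Rightarrow$ [diagram: $A$, $B$, $C$, $D$] and $A\in S_{CD}$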

3 STRONG EDGES

We say that a directed edge in a CG is strong if it appears in every equivalent CG. Likewise for undirected edges. Therefore, strong edges are features of a class of equivalent CGs. Clearly, strong directed edges correspond to directed edges in the EG of the equivalence class. However, the converse is not true. Likewise for strong undirected edges. For an example, consider the EG $A\rightarrow B\leftarrow C-D$. The naive way to detect which edges in an EG are strong consists in generating all the CGs in the equivalence class and, then, recording the shared edges. Since there may be many CGs in the equivalence class, enumerating them in an efficient manner is paramount, but challenging. In fact, it suffices to enumerate what we call the minimally oriented CGs in order to identify the strong directed edges and, then, find one maximally oriented CG to identify the strong undirected edges. We prove these claims in Section 3.1. Although there are typically considerably fewer minimally oriented CGs, enumerating them in an efficient manner seems challenging too. That is why we present in Section 3.2 an algorithm that does not rely on enumerating CGs or minimally oriented CGs.
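The naive approach can be illustrated by brute force on very small graphs. The sketch below is not the enumeration method of Sonntag and Peña (2015): it simply tries every marking of every adjacency, keeps the markings that are CGs with the same triplexes as the input, and intersects their edge sets. All function names and the edge encoding are assumptions made for illustration, and the cost is exponential in the number of edges.

```python
from itertools import combinations, product


def has_semidirected_cycle(directed, undirected):
    """True if some A -> B closes into a cycle along forward moves."""
    step = {}
    for a, b in directed:
        step.setdefault(a, set()).add(b)
    for e in undirected:
        a, b = tuple(e)
        step.setdefault(a, set()).add(b)
        step.setdefault(b, set()).add(a)
    for a, b in directed:
        seen, stack = {b}, [b]
        while stack:
            v = stack.pop()
            if v == a:
                return True
            for w in step.get(v, set()) - seen:
                seen.add(w)
                stack.append(w)
    return False


def triplexes(directed, undirected):
    """Triplexes encoded as (frozenset({A, C}), B)."""
    adj = {frozenset(e) for e in directed} | set(undirected)
    nodes = {v for e in adj for v in e}
    found = set()
    for b in nodes:
        around = [v for v in nodes if frozenset((v, b)) in adj]
        for a, c in combinations(around, 2):
            if frozenset((a, c)) in adj:
                continue
            a_in, c_in = (a, b) in directed, (c, b) in directed
            a_un = frozenset((a, b)) in undirected
            c_un = frozenset((c, b)) in undirected
            if (a_in and c_in) or (a_in and c_un) or (a_un and c_in):
                found.add((frozenset((a, c)), b))
    return found


def equivalence_class(directed, undirected):
    """All CGs with the same adjacencies and triplexes as the input CG."""
    directed = {tuple(e) for e in directed}
    undirected = {frozenset(e) for e in undirected}
    skeleton = sorted(tuple(sorted(e)) for e in
                      {frozenset(e) for e in directed} | undirected)
    target = triplexes(directed, undirected)
    members = []
    for marks in product(("->", "<-", "--"), repeat=len(skeleton)):
        d = {(a, b) if m == "->" else (b, a)
             for (a, b), m in zip(skeleton, marks) if m != "--"}
        u = {frozenset(e) for e, m in zip(skeleton, marks) if m == "--"}
        if not has_semidirected_cycle(d, u) and triplexes(d, u) == target:
            members.append((d, u))
    return members


def strong_edges(directed, undirected):
    """Edges shared by every member of the equivalence class."""
    members = equivalence_class(directed, undirected)
    strong_directed = set.intersection(*(d for d, _ in members))
    strong_undirected = set.intersection(*(u for _, u in members))
    return strong_directed, strong_undirected
```

Applied to the EG $A\rightarrow B\leftarrow C-D$ from the text, this sketch reports no strong edge at all, since for example $A-B\leftarrow C-D$ and $A\rightarrow B\leftarrow C\rightarrow D$ are both equivalent to it. Adding a node with $B\rightarrow D$ to $A\rightarrow B\leftarrow C$ instead yields a class in which $B\rightarrow D$ is the only strong edge.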

3.1 MINIMALLY AND MAXIMALLY ORIENTED CGs

Given a CG $G$, merging two of its chain components $U$ and $L$ implies replacing the edge $A\rightarrow B$ with $A-B$ for all $A\in U$ and $B\in L$. We say that a merging is feasible when

  1. $L\subseteq Ch_{G}(X)$ for all $X\in Pa_{G}(L)\cap U$,

  2. $Pa_{G}(L)\cap U$ is a complete set,

  3. $Pa_{G}(Pa_{G}(L)\cap U)\subseteq Pa_{G}(Y)$ for all $Y\in L$, and

  4. $De_{G}(U)\cap Pa_{G}(L)=\emptyset$.

A feasible merging of two chain components of a CG results in an equivalent CG (Sonntag and Peña, 2015, Lemma 2). If a CG does not admit any feasible merging, then we call it minimally oriented. Note that several equivalent minimally oriented CGs may exist, e.g. $A\rightarrow B-C$ and $A-B\leftarrow C$. Note also that an EG is not necessarily a minimally oriented CG, e.g. $A\rightarrow B\leftarrow C$. If the directed edges of a CG are a subset of the directed edges of a second CG (with the same orientation), then we say that the former is larger than the latter.
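The four feasibility conditions translate directly into set operations. The following is a minimal sketch with hypothetical function names, assuming that directed edges are ordered pairs, undirected edges are two-element frozensets, and the caller passes two chain components $U$ and $L$ with at least one edge from $U$ to $L$.

```python
from itertools import combinations


def pa(directed, X):
    """Parents of the node set X."""
    return {a for (a, b) in directed if b in X}


def ch(directed, X):
    """Children of the node set X."""
    return {b for (a, b) in directed if a in X}


def de(directed, X):
    """Descendants of X via directed paths of length >= 1."""
    seen, stack = set(), list(X)
    while stack:
        v = stack.pop()
        for c in ch(directed, {v}) - seen:
            seen.add(c)
            stack.append(c)
    return seen


def feasible_merge(directed, undirected, U, L):
    """Check conditions 1-4 for merging chain components U and L."""
    P = pa(directed, L) & U
    cond1 = all(L <= ch(directed, {x}) for x in P)
    cond2 = all(frozenset(p) in undirected for p in combinations(P, 2))
    cond3 = all(pa(directed, P) <= pa(directed, {y}) for y in L)
    cond4 = not (de(directed, U) & pa(directed, L))
    return cond1 and cond2 and cond3 and cond4


def merge(directed, undirected, U, L):
    """Replace every A -> B with A - B for A in U and B in L."""
    moved = {(a, b) for (a, b) in directed if a in U and b in L}
    return directed - moved, undirected | {frozenset(e) for e in moved}
```

For the EG $A\rightarrow B\leftarrow C$, merging $U=\{A\}$ and $L=\{B\}$ passes all four checks and produces $A-B\leftarrow C$, which is why that EG is not minimally oriented. For $A\rightarrow B-C$ with $U=\{A\}$ and $L=\{B,C\}$, condition 1 fails because $C$ is not a child of $A$.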

Lemma 2.

The minimally oriented CGs in an equivalence class are the maximally large CGs in the class, and vice versa.

Proof.

Clearly, a maximally large CG must be minimally oriented because, otherwise, it admits a feasible merging which results in a larger CG, which is a contradiction. On the other hand, let $G$ be a minimally oriented CG, and assume to the contrary that there is a CG $H$ that is equivalent to but larger than $G$. Specifically, let $G$ have an edge $A\rightarrow B$ whereas $H$ has an edge $A-B$. Consider a topological ordering of the chain components of $G$. We say that an edge $X\rightarrow Y$ precedes an edge $Z\rightarrow W$ in $G$ if the chain component of $X$ precedes the chain component of $Z$ in the ordering, or if both chain components coincide and the chain component of $Y$ precedes the chain component of $W$ in the ordering. Assume without loss of generality that no other edge that is directed in $G$ but undirected in $H$ precedes the edge $A\rightarrow B$ in $G$. Let $U$ and $L$ denote the chain components of $A$ and $B$, respectively. Clearly, all the directed edges from $U$ to $L$ in $G$ must be undirected in $H$ because, otherwise, $H$ has a semidirected cycle. However, this implies a contradiction. To see it, recall that $G$ is a minimally oriented CG and, thus, merging $U$ and $L$ in $G$ is not feasible. If condition 1 fails, then $G$ has an induced subgraph $X\rightarrow Y-Z$ where $X\in U$ and $Y,Z\in L$, whereas $H$ has an induced subgraph $X-Y-Z$. However, this implies that $G$ and $H$ are not equivalent, since $G$ has a triplex $(X,Y,Z)$ that $H$ does not have.

If condition 2 fails but condition 1 holds, then $G$ has an induced subgraph $X\rightarrow Y\leftarrow Z$ where $X,Z\in U$ and $Y\in L$, whereas $H$ has an induced subgraph $X-Y-Z$. However, this implies that $G$ and $H$ are not equivalent, since $G$ has a triplex $(X,Y,Z)$ that $H$ does not have.

If condition 3 fails but condition 1 holds, then $G$ has an induced subgraph $Z\rightarrow X\rightarrow Y$ where $X\in U$, $Y\in L$ and $Z\in V\setminus(U\cup L)$, whereas $H$ has an induced subgraph $Z\rightarrow X-Y$. However, this implies that $G$ and $H$ are not equivalent, since $H$ has a triplex $(Z,X,Y)$ that $G$ does not have. Note that $Z\rightarrow X$ is in $H$ because $Z\rightarrow X$ precedes $X\rightarrow Y$ and thus $A\rightarrow B$ in $G$.

Finally, if condition 4 fails but condition 1 holds, then $G$ has a subgraph of the form $X\rightarrow Y\leftarrow\cdots\leftarrow Z\leftarrow X^{\prime}-\cdots-X$ where $X,X^{\prime}\in U$, $Y\in L$ and $Z\in V\setminus(U\cup L)$, whereas $H$ has a subgraph of the form $X-Y-\cdots-Z-X$. To see it, note that any other option results in a semidirected cycle because, recall, $H$ is larger than $G$. However, this is a contradiction because $X^{\prime}\rightarrow Z$ precedes $X\rightarrow Y$ and thus $A\rightarrow B$ in $G$. ∎

The following result follows from the previous lemma.

Theorem 1.

A directed edge is strong if and only if it is in every minimally oriented CG in the equivalence class.

Finally, one may think that an undirected edge that is in every minimally oriented CG in the equivalence class is strong. But this is not true. For an example, consider the equivalence class represented by the EG $A-B$. Instead, an undirected edge is strong if and only if it is in any maximally oriented CG in the equivalence class (Sonntag and Peña, 2015, Theorems 4 and 5). Formally, a maximally oriented CG is a CG that does not admit any feasible split, which is the inverse operation of the feasible merge operation described before. Alternatively, we can say that if the minimally oriented CGs are the maximally large CGs in an equivalence class, then the maximally oriented CGs are the minimally large (Sonntag and Peña, 2015, Lemma 13). Note that several equivalent maximally oriented CGs may exist (e.g., $A\rightarrow B$ and $A\leftarrow B$) but all of them have the same undirected edges (Sonntag and Peña, 2015, Theorems 4 and 5). Note also that an EG is not necessarily a maximally oriented CG, e.g. $A-B$.

Table 3: Algorithm to label strong edges in an EG. It replaces line 7 of the algorithm in Table 1.
7 Label every edge $X\leftfootline\!\!\!\!\!\rightfootline Y$ as strong in $G^{*}$
8 For each edge $X\leftfootline Y$ in $G^{*}$
9     Set $H=G^{*}$
10     Replace $X\leftfootline Y$ in $H$ with $X\leftfootline\!\!\!\!\!\rightfootline Y$
11     Apply the rules R2-R3 to $H$ while possible
12     If $G^{*}$ has an induced subgraph $A\leftfootline B\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}C$ whereas $H$ has $A\leftfootline\!\!\!\!\!\rightfootline B\leftfootline\!\!\!\!\!\rightfootline C$ then
13       Label $X\leftfootline Y$ as strong in $G^{*}$
14 Replace every edge $X\leftfootline Y$ and $X\leftfootline\!\!\!\!\!\rightfootline Y$ in $G^{*}$ with $X\rightarrow Y$ and $X-Y$, respectively

3.2 ENUMERATION-FREE ALGORITHM

Although the minimally and maximally oriented CGs in an equivalence class can be obtained by repeatedly performing feasible splits and merges (Sonntag and Peña, 2015, Theorem 3), the approach outlined above for identifying strong edges via enumeration may be inefficient for all but small domains. Hence, Table 3 presents an alternative algorithm that does not rely on enumerating the CGs or the minimally oriented CGs in the equivalence class. The new algorithm replaces line 7 in Table 1. In other words, the new algorithm postpones orienting edges until line 14, and in lines 7-13 it identifies which of the future directed and undirected edges are strong. Line 7 identifies the strong undirected edges, whereas lines 8-13 identify the strong directed edges. To do the latter, the algorithm tries to build a CG $H$ that is equivalent to $G^{*}$ and contains an edge $X-Y$. If this fails, then $X\rightarrow Y$ is strong. Specifically, line 10 forces the edge between $X$ and $Y$ to be undirected in $H$ by blocking the end at $Y$. Line 11 computes other blocks that follow from the new block at $Y$. After line 11, $H$ can be oriented as indicated in line 14 without creating a semidirected cycle or a triplex that is not in $G^{*}$. Finally, line 12 checks if every triplex in $G^{*}$ is in $H$. If not, $X-Y$ is incompatible with some triplex in $G^{*}$, which implies that $X\rightarrow Y$ is strong in $G^{*}$. We prove the correctness of the algorithm below.

Lemma 3.

After line 11, $H$ does not have any induced subgraph of the form [diagram: $A$, $B$, $C$].

Proof.

The proof is an adaptation of the proof of Lemma 5 by Peña (2014). Assume to the contrary that the lemma does not hold. We interpret the execution of lines 10-11 as a sequence of block additions and, for the rest of the proof, one particular sequence of these block additions is fixed. Fixing this sequence is a crucial point upon which some important later steps of the proof are based. Since there may be several induced subgraphs of $H$ of the form under study after lines 10-11, let us consider any of the induced subgraphs [diagram: $A$, $B$, $C$] that appear first during the execution of lines 10-11 and fix it for the rest of the proof. Note that $H$ has no such induced subgraph after line 9 (Sonntag and Peña, 2015, Lemma 9). Now, consider the following cases.

Case 1

Assume that $A\leftfootline\!\!\!\!\!\multimap B$ is in $H$ due to line 10. However, this implies that $H$ had an induced subgraph [diagram: $A$, $B$, $C$] before line 10, which is a contradiction (Sonntag and Peña, 2015, Lemma 9).

Case 2

Assume that $A\leftfootline\!\!\!\!\!\multimap B$ is in $H$ due to R2 in line 11. Then, after line 11, $H$ has an induced subgraph of one of the following forms:

[Four diagrams over $A$, $B$, $C$, $D$, labeled case 2.1, case 2.2, case 2.3 and case 2.4.]
Case 2.1

If $A\notin S_{CD}$ then $A\rightfootline C$ is in $H$ by R1 in line 4 of Table 1, else $A\leftfootline C$ is in $H$ by R2. Either case is a contradiction.

Case 2.2

Note that [diagram: $D$, $A$, $C$] cannot be an induced subgraph of $H$ after line 11 because, otherwise, it would contradict the assumption that [diagram: $A$, $B$, $C$] is one of the first induced subgraphs of that form that appeared during the execution of lines 10-11. So, this case is impossible.

Case 2.3

Note that $A\rightfootline C$ is in $H$ by R3, which is a contradiction.

Case 2.4

If $C\notin S_{BD}$ then $B\leftfootline C$ is in $H$ by R1 in line 4 of Table 1, else $B\rightfootline C$ is in $H$ by R2. Either case is a contradiction.

Case 3

Assume that $A\leftfootline\!\!\!\!\!\multimap B$ is in $H$ due to R3 in line 11. Then, after line 11, $H$ has a subgraph of one of the following forms, where possible additional edges between $C$ and internal nodes of the route $A\leftfootline\!\!\!\!\!\multimap\cdots\leftfootline\!\!\!\!\!\multimap D$ are not shown:

[Four diagrams over $A$, $B$, $C$, $D$, $\ldots$, labeled case 3.1, case 3.2, case 3.3 and case 3.4.]

Note that $C$ cannot belong to the route $A\leftfootline\!\!\!\!\!\multimap\cdots\leftfootline\!\!\!\!\!\multimap D$ because, otherwise, R3 could not have been applied since the cycle $A\leftfootline\!\!\!\!\!\multimap\cdots\leftfootline\!\!\!\!\!\multimap D\leftfootline\!\!\!\!\!\multimap B\multimap A$ would not have been chordless.

Case 3.1

If $B\notin S_{CD}$ then $B\rightfootline C$ is in $H$ by R1 in line 4 of Table 1, else $B\leftfootline C$ is in $H$ by R2. Either case is a contradiction.

Case 3.2

Note that [diagram: $D$, $B$, $C$] cannot be an induced subgraph of $H$ after line 11 because, otherwise, it would contradict the assumption that [diagram: $A$, $B$, $C$] is one of the first induced subgraphs of that form that appeared during the execution of lines 10-11. So, this case is impossible.

Case 3.3

Note that $B\rightfootline C$ is in $H$ by R3, which is a contradiction.

Case 3.4

Note that $C$ cannot be adjacent to any node of the route $A\leftfootline\!\!\!\!\!\multimap\cdots\leftfootline\!\!\!\!\!\multimap D$ besides $A$ and $D$ and, thus, $A\leftfootline C$ is in $H$ by R3. To see it, assume to the contrary that $C$ is adjacent to some nodes $E_{1},\ldots,E_{n}\neq A,D$ of the route $A\leftfootline\!\!\!\!\!\multimap\cdots\leftfootline\!\!\!\!\!\multimap D$. Assume without loss of generality that $E_{i}$ is closer to $A$ in the route than $E_{i+1}$ for all $1\leq i<n$. Now, note that $E_{n}\leftfootline\!\!\!\!\!\multimap C$ must be in $H$ by R3. This implies that $E_{n-1}\leftfootline\!\!\!\!\!\multimap C$ must be in $H$ by R3. By repeated application of this argument, we can conclude that $E_{1}\leftfootline\!\!\!\!\!\multimap C$ must be in $H$ and, thus, $A\leftfootline C$ must be in $H$ by R3, which is a contradiction.

Lemma 4.

After line 11, every chordless cycle $\rho:V_{1},\ldots,V_{n}=V_{1}$ in $H$ that has an edge $V_{i}\leftfootline V_{i+1}$ also has an edge $V_{j}\rightfootline V_{j+1}$.

Proof.

The proof is an adaptation of the proof of Lemma 6 by Peña (2014). Assume for a contradiction that $\rho$ is of length three such that $V_{1}\leftfootline V_{2}$ occurs and neither $V_{2}\rightfootline V_{3}$ nor $V_{1}\leftfootline V_{3}$ occurs. Note that $V_{2}\leftfootline\!\!\!\!\!\rightfootline V_{3}$ cannot occur either because, otherwise, $V_{1}\leftfootline V_{3}$ or $V_{1}\leftfootline\!\!\!\!\!\rightfootline V_{3}$ must occur by R3. Since the former contradicts the assumption, the latter must occur. However, this implies that $V_{1}\leftfootline\!\!\!\!\!\rightfootline V_{2}$ must occur by R3, which contradicts the assumption. Similarly, $V_{1}\leftfootline\!\!\!\!\!\rightfootline V_{3}$ cannot occur either. Then, $\rho$ is of one of the following forms:

[Three diagrams over $V_{1}$, $V_{2}$, $V_{3}$.]

The first form is impossible by Lemma 3. The second form is impossible because, otherwise, $V_{2}\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}V_{3}$ would occur by R3. The third form is impossible because, otherwise, $V_{1}\leftfootline V_{3}$ would occur by R3. Thus, the lemma holds for cycles of length three.

Assume for a contradiction that $\rho$ is of length greater than three and has an edge $V_{i}\leftfootline V_{i+1}$ but no edge $V_{j}\rightfootline V_{j+1}$. Note that if $V_{l}\leftfootline\!\!\!\!\!\multimap V_{l+1}\mathrel{\reflectbox{$\multimap$}}\!\!\!\!\!\multimap V_{l+2}$ is a subroute of $\rho$, then either $V_{l+1}\leftfootline\!\!\!\!\!\multimap V_{l+2}$ or $V_{l+1}\rightfootline V_{l+2}$ is in $\rho$ by R1 and R2. Since $\rho$ has no edge $V_{j}\rightfootline V_{j+1}$, $V_{l+1}\leftfootline\!\!\!\!\!\multimap V_{l+2}$ is in $\rho$. By repeated application of this reasoning together with the fact that $\rho$ has an edge $V_{i}\leftfootline V_{i+1}$, we can conclude that every edge in $\rho$ is $V_{k}\leftfootline\!\!\!\!\!\multimap V_{k+1}$. Then, by repeated application of R3, observe that every edge in $\rho$ is $V_{k}\leftfootline\!\!\!\!\!\rightfootline V_{k+1}$, which contradicts the assumption. ∎

Lemma 5.

After line 11, $H$ can be oriented as indicated in line 14 without creating a semidirected cycle.

Proof.

Assume to the contrary that the orientation produces a semidirected cycle $\rho:V_{1},\ldots,V_{n}$. Note that $\rho$ must have a chord because, otherwise, $\rho$ is impossible by Lemma 4. Specifically, let the chord be between $V_{i}$ and $V_{j}$ with $i<j$. Then, divide $\rho$ into the cycles $\rho_{L}:V_{1},\ldots,V_{i},V_{j},\ldots,V_{n}=V_{1}$ and $\rho_{R}:V_{i},\ldots,V_{j},V_{i}$. Note that $\rho_{L}$ or $\rho_{R}$ is a semidirected cycle but shorter than $\rho$. By repeated application of this reasoning, we can conclude that the orientation produces a chordless semidirected cycle, which contradicts Lemma 4. ∎

Lemma 6.

After line 11, $H$ can be oriented as indicated in line 14 without creating a triplex that is not in $G^{*}$.

Proof.

We call pretriplex an induced subgraph of $G^{*}$ or $H$ that results in a triplex when $G^{*}$ or $H$ is oriented as indicated in line 14. Note that $G^{*}$ and $H$ have the same pretriplexes after line 9. Assume to the contrary that after line 11 $H$ has a pretriplex that is not in $G^{*}$. Assume that the spurious pretriplex is created in line 10 when $A\leftfootline B$ becomes $A\leftfootline\!\!\!\!\!\rightfootline B$. Then, after line 11 $H$ has a pretriplex (1) $A\leftfootline\!\!\!\!\!\rightfootline B\rightfootline C$ or (2) $C\leftfootline A\leftfootline\!\!\!\!\!\rightfootline B$. Case (1) implies that $H$ actually has an induced subgraph $A\leftfootline\!\!\!\!\!\rightfootline B\leftfootline\!\!\!\!\!\rightfootline C$ by R2, which is a contradiction. To see that R2 is applicable, note that $B\in S_{AC}$ because $G^{*}$ does not have a triplex $(A,B,C)$. Case (2) implies that $H$ actually has an induced subgraph $C\leftfootline\!\!\!\!\!\rightfootline A\leftfootline\!\!\!\!\!\rightfootline B$ by R2, which again is a contradiction. As before, R2 is clearly applicable. Finally, assume that the spurious pretriplex is created in line 11. Then, after line 11 $H$ has an induced subgraph (1) $A\leftfootline B\rightfootline C$, (2) $A\leftfootline B-C$ or (3) $A\leftfootline B\leftfootline\!\!\!\!\!\rightfootline C$. However, this implies that $H$ actually has an induced subgraph $A\leftfootline\!\!\!\!\!\rightfootline B\leftfootline\!\!\!\!\!\rightfootline C$ or $A\leftfootline B\leftfootline C$ by R2, which again is a contradiction. As before, R2 is clearly applicable. ∎

Lemma 7.

After line 14, the undirected edges in $G^{*}$ that had no blocks after line 7 are not strong.

Proof.

The proof is an adaptation of the proof of Theorem 11 by Sonntag and Peña (2015). Let $F$ denote the graph that contains all and only the edges of $G^{*}$ resulting from the replacements in line 14, and let $U$ denote the graph that contains the rest of the edges of $G^{*}$ after line 14. Note that all the edges in $U$ are undirected and they had no blocks when line 14 was to be executed. Therefore, $U$ has no cycle of length greater than three that is chordless by line 5. In other words, $U$ is chordal. Then, we can orient all the edges in $U$ without creating triplexes or directed cycles by using, for instance, the maximum cardinality search (MCS) algorithm (Koller and Friedman, 2009, p. 312). Consider any such orientation of the edges in $U$ and denote it $D$. Now, add all the edges in $D$ to $F$. As we show below, this last step does not create any triplex or semidirected cycle in $F$:

  • It does not create a triplex $(A,B,C)$ in $F$ because, otherwise, $A-B\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}C$ must exist in $G^{*}$ when line 14 was to be executed, which implies that $A\leftfootline\!\!\!\!\!\multimap B$ or $A\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}B$ was in $G^{*}$ by R1 or R2 when line 14 was to be executed, which contradicts that $A-B$ is in $U$.

  • Assume to the contrary that it does create a semidirected cycle $\rho$ in $F$. We can assume without loss of generality that $\rho$ is chordless. To see this, suppose that $\rho: V_{1},\ldots,V_{n}=V_{1}$ has a chord between $V_{i}$ and $V_{j}$ with $i<j$. Then, divide $\rho$ into the cycles $\rho_{L}:V_{1},\ldots,V_{i},V_{j},\ldots,V_{n}=V_{1}$ and $\rho_{R}:V_{i},\ldots,V_{j},V_{i}$. Note that $\rho_{L}$ or $\rho_{R}$ is a semidirected cycle that is shorter than $\rho$. By repeated application of this reasoning, we can conclude that $F$ has a chordless semidirected cycle.

    Since $D$ has no directed cycles, $\rho$ must have a $\leftfootline$ or $\leftfootline\!\!\!\!\!\rightfootline$ edge when line 14 was to be executed. The former case is impossible (Sonntag and Peña, 2015, Lemma 10). The latter case implies that $A-B\leftfootline\!\!\!\!\!\rightfootline C$ must exist in $G^{*}$ when line 14 was to be executed, which implies that $A$ and $C$ are adjacent in $G^{*}$ because, otherwise, $A\leftfootline\!\!\!\!\!\multimap B$ or $A\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}B$ was in $G^{*}$ by R1 or R2 when line 14 was to be executed, which contradicts that $A-B$ is in $U$. Then, $A\leftfootline\!\!\!\!\!\multimap C$ or $A\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}C$ exists in $G^{*}$ when line 14 was to be executed (Sonntag and Peña, 2015, Lemma 9), which implies that $A\leftfootline\!\!\!\!\!\multimap B$ or $A\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}B$ was in $G^{*}$ by R3 when line 14 was to be executed, which contradicts that $A-B$ is in $U$.

Consequently, $F$ is a CG that is Markov equivalent to $G$. Finally, let us recall how the MCS algorithm works. It first unmarks all the nodes in $U$ and then iterates through the following step until all the nodes are marked: select any of the unmarked nodes with the largest number of marked neighbors and mark it. Finally, the algorithm orients every edge in $U$ away from the endpoint that was marked earlier. Clearly, any node may get marked first by the algorithm because there is a tie among all the nodes in the first iteration, which implies that every edge may get oriented in either of the two directions in $D$ and thus in $F$. Therefore, either orientation of every edge of $U$ occurs in some CG $F$ that is Markov equivalent to $G$. Hence, no edge of $U$ is shared by every member of the equivalence class, i.e. no edge of $U$ is a strong undirected edge in $G^{*}$. ∎
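The MCS step used in the proof can be illustrated with a minimal Python sketch. The function names, the edge-list representation and the optional `first` argument (which forces the node marked first, mimicking the tie in the first iteration) are assumptions made for illustration only; they are not part of the algorithm in Table 3.

```python
from collections import defaultdict


def mcs_orientation(nodes, edges, first=None):
    """Orient the edges of an undirected chordal graph by maximum cardinality search.

    `first` (hypothetical parameter) forces which node is marked first; any node
    is a legitimate choice because all nodes tie in the first iteration.
    """
    nbrs = defaultdict(set)
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    marked, order = set(), []
    unmarked = list(nodes)
    while unmarked:
        if first is not None and not order:
            pick = first
        else:
            # Unmarked node with the largest number of marked neighbours.
            pick = max(unmarked, key=lambda v: len(nbrs[v] & marked))
        order.append(pick)
        marked.add(pick)
        unmarked.remove(pick)
    rank = {v: i for i, v in enumerate(order)}
    # Every edge points away from the endpoint that was marked earlier.
    return [(a, b) if rank[a] < rank[b] else (b, a) for a, b in edges]


def unshielded_colliders(directed):
    """Return the induced subgraphs A -> B <- C with A and C non-adjacent."""
    arcs = set(directed)
    found = set()
    for a, b in arcs:
        for c, d in arcs:
            if d == b and c != a and (a, c) not in arcs and (c, a) not in arcs:
                found.add((frozenset((a, c)), b))
    return found


# A small chordal graph: the triangle A-B-C plus the pendant edge C-D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
from_a = mcs_orientation(nodes, edges, first="A")
from_d = mcs_orientation(nodes, edges, first="D")
```

In this toy graph the edge between C and D is oriented as C → D when A is marked first and as D → C when D is marked first, and neither orientation contains an unshielded collider, which is the behaviour the proof relies on.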

Theorem 2.

Table 3 identifies all and only the strong edges in $G^{*}$.

Proof.

By definition of EG, the edges in $G^{*}$ with blocks on both ends in line 7 correspond to strong undirected edges in $G^{*}$ after line 14. Moreover, the edges in $G^{*}$ with no blocks in line 7 correspond to non-strong undirected edges in $G^{*}$ after line 14, by Lemma 7.

After line 11, $H$ can be oriented as indicated in line 14 without creating semidirected cycles by Lemma 5, and without creating a triplex that is not in $G^{*}$ by Lemma 6. Therefore, if $H$ can be oriented as indicated in line 14 without destroying any of the triplexes in $G^{*}$, then the algorithm has found a CG that is Markov equivalent to $G^{*}$ and such that $X\rightarrow Y$ is in $G^{*}$ but $X-Y$ is in the CG found and, thus, $X\rightarrow Y$ is non-strong in $G^{*}$. Otherwise, $X\rightarrow Y$ is strong in $G^{*}$. This is checked in line 12. ∎

The algorithm in Table 3 may be sped up with the help of the rules in Table 4. S1-3 should be run while possible before line 8, and S4-6 should be run while possible after line 8 to propagate the labellings due to line 13 in the previous iteration.

Table 4: Rules for accelerating the search for strong directed edges in an EG. The antecedents represent induced subgraphs.
S1: [induced subgraph over $A$, $B$, $C$, $D$] $\Rightarrow$ $C\leftfootline D$ is strong
S2: [induced subgraph over $A$, $B$, $C$] $\Rightarrow$ $A\leftfootline B$ is strong
S3: [induced subgraph over $A$, $B$, $C$, $D$, $\ldots$] $\Rightarrow$ $A\leftfootline B$ is strong
S4: $A\leftfootline B\leftfootline C$ and $A\leftfootline B$ is strong $\Rightarrow$ $B\leftfootline C$ is strong
S5: [induced subgraph over $A$, $B$, $C$] and $C\leftfootline B$ is strong $\Rightarrow$ $A\leftfootline B$ is strong
S6: [induced subgraph over $A$, $B$, $C$] and $A\leftfootline C$ is strong $\Rightarrow$ $A\leftfootline B$ is strong
Corollary 1.

Applying the rules in Table 4 to an EG $G^{*}$ correctly identifies strong directed edges in $G^{*}$.

Proof.

Consider any member $G$ of the equivalence class of $G^{*}$. Consider the rule S1. Since $G^{*}$ has a triplex $(A,C,B)$ after line 14, $G$ must have an edge $A\rightarrow C$ or $B\rightarrow C$. In either case $G$ must also have an edge $C\rightarrow D$, since $G^{*}$ does not have a triplex $(A,C,D)$ or $(B,C,D)$.

Consider the rule S2. Since $G^{*}$ has a triplex $(A,B,C)$ after line 14 and $G$ has an edge $B-C$ due to the blocks at $B$ and $C$, $G$ must also have an edge $A\rightarrow B$.

Consider the rule S3. Assume to the contrary that $G$ has an edge $A-B$. Then, $G$ must have an edge $D\rightarrow B$ since $G^{*}$ has a triplex $(A,B,D)$ after line 14. However, this implies that $G$ has a semidirected cycle due to the blocks in the antecedent of the rule, which is a contradiction.

Consider the rule S4. Since $G^{*}$ does not have a triplex $(A,B,C)$ after line 14 and $G$ has an edge $A\rightarrow B$ because it is strong, $G$ must also have an edge $B\rightarrow C$.

Consider the rule S5. Since $G$ has an edge $C\rightarrow B$ because it is strong, $G$ must also have an edge $A\rightarrow B$ to avoid having a semidirected cycle, because either $A\rightarrow C$ or $A-C$ is in $G$ due to the blocks in the antecedent of the rule. The rule S6 can be proven similarly. ∎

The rules in Table 4 are by no means complete, i.e. there may be strong edges that the rules alone do not detect. Thus, additional rules can be created. We doubt though that a complete set of concise rules can be produced. The difficulty lies in the disjunctive nature of some labellings. For instance, let an EG $G^{*}$ have induced subgraphs $A\rightarrow C\leftarrow B$, $A\rightarrow C\rightarrow\cdots\rightarrow D\rightarrow E$ and $B\rightarrow C\rightarrow\cdots\rightarrow D\rightarrow E$. Since $G^{*}$ has no triplex in $A\rightarrow C\rightarrow\cdots\rightarrow D\rightarrow E$, if a member $G$ of the equivalence class of $G^{*}$ has an edge $A\rightarrow C$ then it has an edge $D\rightarrow E$. Similarly, if $G$ has an edge $B\rightarrow C$ then it has an edge $D\rightarrow E$. Then, $G$ has an edge $D\rightarrow E$ because it has an edge $A\rightarrow C$ or $B\rightarrow C$, since $G^{*}$ and thus $G$ has a triplex $(A,C,B)$. Therefore, $D\rightarrow E$ is strong. Although it is easy to produce a rule for this example, many more such disjunctive examples exist and we do not see any way to produce concise rules for all of them.

4 CAUSAL EFFECT BOUNDS

When the true CG is unknown, a causal effect of the form $p(y|do(x))$ with $X,Y\in V$ cannot be computed, but it can be bounded as follows:

  1. Obtain all the CGs that are Markov equivalent to the true one by running the learning algorithm developed by Peña (2014, 2016) or Peña and Gómez-Olmedo (2016).

  2. Compute the causal effect for each CG obtained as follows. As in a Bayesian network, any causal effect in a CG $G$ is uniquely computable from observed quantities (i.e. it is identifiable) by adjusting for the appropriate variables. Specifically,

    $$p(y|do(x))=\int p(y|x,z)p(z)dz$$

    where $Z=Ne_{G}(X)\cup Pa_{G}(X\cup Ne_{G}(X))$ and $Y\notin Z$. The role of $Z$ is to block every non-causal path in $G$ between $X$ and $Y$. We call $Z$ the adjusting set in $G$.
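As a small illustration of the adjusting set $Z=Ne_{G}(X)\cup Pa_{G}(X\cup Ne_{G}(X))$, the following Python sketch computes it for a toy CG. The representation (directed edges as ordered pairs, undirected edges as two-element frozensets) and the function name are assumptions made for this example only.

```python
def adjusting_set(x, directed, undirected):
    """Return Ne_G(x) union Pa_G({x} union Ne_G(x)) for a chain graph G.

    directed:   set of pairs (a, b) meaning a -> b
    undirected: set of frozensets {a, b} meaning a - b
    """
    # Neighbours of x: nodes joined to x by an undirected edge.
    ne = {b for e in undirected if x in e for b in e if b != x}
    block = {x} | ne
    # Parents of any node in {x} union Ne(x), excluding those nodes themselves.
    pa = {a for a, b in directed if b in block} - block
    return ne | pa


# Toy CG (hypothetical): A -> X, X - B, C -> B, X -> Y.
directed = {("A", "X"), ("C", "B"), ("X", "Y")}
undirected = {frozenset(("X", "B"))}
z = adjusting_set("X", directed, undirected)
```

Here $Ne_{G}(X)=\{B\}$ and $Pa_{G}(\{X,B\})=\{A,C\}$, so the adjusting set is $\{A,B,C\}$, which does not contain $Y$ as required.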

Unfortunately, the learning algorithm in step 1 may be too time consuming for all but small domains. At least, this is the conclusion that follows from the experimental results reported by Sonntag et al., (2015) for a similar algorithm for learning Lauritzen-Wermuth-Frydenberg CGs. Instead, we propose the following alternative approach:

  1’. Learn the EG $G^{*}$ corresponding to the true CG from data as follows. First, learn a CG from data as shown by Peña (2014, 2016) and Peña and Gómez-Olmedo (2016) and, then, transform it into an EG as shown by Sonntag and Peña (2015, Section 3).

  2’. Enumerate all the CGs that are Markov equivalent to $G^{*}$ as shown by Sonntag and Peña (2015, Theorem 3).

  3’. Compute the causal effect for each CG enumerated as shown above.

This approach has been applied successfully when the causal models are represented by graphical models other than CGs (Hyttinen et al., 2015; Malinsky and Spirtes, 2016; Maathuis et al., 2009). The experimental results reported by Peña and Gómez-Olmedo (2016) indicate that the learning algorithm in step 1’ scales to medium sized domains. However, the enumeration in step 2’ may be too time consuming for all but small domains. Alternatively, we may try to enumerate the adjusting sets in the equivalent CGs without enumerating these explicitly. Specifically, we know that all the adjusting sets are subsets of $Ad_{G^{*}}(X)\cup Ad_{G^{*}}(Ad_{G^{*}}(X))$, because all the equivalent CGs have the same adjacencies as $G^{*}$. Therefore, we can adjust for every subset of $Ad_{G^{*}}(X)\cup Ad_{G^{*}}(Ad_{G^{*}}(X))$ to obtain bounds for the causal effect of interest. Admittedly, some of these subsets are not valid adjusting sets in the sense that they do not correspond to any of the equivalent CGs. However, this does not make the bounds invalid, just looser. The rest of the section studies a case where all and only the valid adjusting sets can be enumerated efficiently.
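The brute-force alternative just described can be sketched in Python as follows. The adjacency-dictionary representation and the function name are assumptions made for illustration. The sketch also removes $X$ and $Y$ from the candidate pool, on the assumption that neither is ever adjusted for (recall that $Y\notin Z$). Note that the number of candidate sets grows exponentially with the size of the pool.

```python
from itertools import combinations


def candidate_adjusting_sets(x, y, adjacency):
    """Enumerate every subset of Ad(x) union Ad(Ad(x)), excluding x and y.

    adjacency: dict mapping each node to the set of nodes adjacent to it in G*.
    """
    ad_x = set(adjacency[x])
    # Nodes adjacent to some node that is adjacent to x.
    ad_ad_x = set().union(*(adjacency[a] for a in ad_x))
    pool = sorted((ad_x | ad_ad_x) - {x, y})
    return [set(c) for r in range(len(pool) + 1) for c in combinations(pool, r)]


# Toy adjacency structure (hypothetical): X - A - B - Y as a path of adjacencies.
adjacency = {"X": {"A"}, "A": {"X", "B"}, "B": {"A", "Y"}, "Y": {"B"}}
candidates = candidate_adjusting_sets("X", "Y", adjacency)
```

In this example the pool is $\{A,B\}$, so four candidate sets are produced. Some of them may not correspond to any equivalent CG, which is why the resulting bounds can be looser than necessary.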

Assume that we believe a priori that the dependencies in the domain at hand are due to causal rather than non-causal relationships. Then, we believe a posteriori that the true CG is a maximally oriented CG, because such CGs have the fewest undirected edges in the equivalence class of the EG $G^{*}$ learned from the data in step 1’. Moreover, recall from Section 3.1 that all of them have the same undirected edges. Therefore, we can bound the causal effect $p(y|do(x))$ by modifying the latter framework above so that only maximally oriented CGs are enumerated in step 2’. A maximally oriented CG that is equivalent to $G^{*}$ can be obtained from $G^{*}$ by repeatedly performing feasible splits (Sonntag and Peña, 2015, Theorem 3). Unfortunately, this enumeration method may be inefficient for all but small domains. Instead, we show below how to enumerate the adjusting sets in the maximally oriented CGs that are equivalent to $G^{*}$ without enumerating these explicitly.

Given a node $X\in V$, we define $St_{G^{*}}(X)=\{A \mid A-X \text{ is a strong edge in } G^{*}\}$ and $Nst_{G^{*}}(X)=\{A \mid A-X \text{ is a non-strong edge in } G^{*}\}$. Given a set $S\subseteq Nst_{G^{*}}(X)$, we let $G^{*}_{S\rightarrow X}$ denote the graph that is obtained from $G^{*}$ by replacing the edge $A-X$ with $A\rightarrow X$ for all $A\in S$, and replacing the edge $A-X$ with $A\leftarrow X$ for all $A\in Nst_{G^{*}}(X)\setminus S$. Moreover, we say that $G^{*}_{S\rightarrow X}$ is locally valid if $G^{*}_{S\rightarrow X}$ does not have any triplex $(A,X,B)$ that is not in $G^{*}$. The next theorem proves that producing the adjusting sets in the equivalent maximally oriented CGs reduces to producing locally valid sets.

Theorem 3.

$G^{*}_{S\rightarrow X}$ is locally valid if and only if there is a maximally oriented CG $G$ that is equivalent to $G^{*}$ and such that $Ne_{G}(X)=St_{G^{*}}(X)$ and $Pa_{G}(X)=Pa_{G^{*}}(X)\cup S$, which implies that the adjusting set in $G$ is $St_{G^{*}}(X)\cup Pa_{G^{*}}(X\cup St_{G^{*}}(X))\cup S$.

Proof.

The proof is an adaptation of the proof of Lemma 3.1 by Maathuis et al. (2009). The if part is trivial. To prove the only if part, note first that $S\cup X$ is a complete set because, otherwise, $G^{*}_{S\rightarrow X}$ would not be locally valid.

Let $G$ denote the graph that contains all and only the non-strong undirected edges in $G^{*}$. Recall from Lemma 7 that these edges had no blocks when line 14 in Table 3 was to be executed. Therefore, $G$ is chordal by line 5 in Table 1. We now show that we can orient the edges of $G$ without creating triplexes or directed cycles and such that $Pa_{G}(X)=S$. Specifically, we show that there is a perfect elimination sequence that ends with $X$ followed by the nodes in $S$. Orienting the edges of $G$ according to this sequence produces the desired graph. If $G$ is complete, then the sequence clearly exists. If $G$ is not complete, then note that $G$ has at least two non-adjacent simplicial nodes (Jensen and Nielsen, 2007, Theorem 4.1). Note that one of them is outside of $S\cup X$ because, as shown above, the latter is a complete set. Take that node as the first node in the sequence. Note moreover that the subgraph of $G$ induced by the rest of the nodes is chordal. Therefore, we can repeat the previous step to select the next node in the sequence until we obtain the desired perfect elimination sequence.

Finally, consider the oriented $G$ obtained in the previous paragraph, and add to it all the directed edges and strong undirected edges in $G^{*}$. We now prove that $G$ is the desired CG in the theorem. First, note that $G$ is maximally oriented because all the undirected edges in it are strong in $G^{*}$. Second, note that if $G^{*}$ has a triplex $(A,B,C)$ then $A\leftfootline B\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}C$ must be in $G^{*}$ when line 14 was to be executed, which implies that neither of the edges in the triplex is non-strong undirected in $G^{*}$, which implies that $G$ has a triplex $(A,B,C)$. Third, note that $G$ does not have a triplex $(A,B,C)$ that is not in $G^{*}$ because, otherwise, the triplex should have been created as a product of the perfect elimination sequence above. This is possible only if $A-B-C$ or $A-B\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}C$ exists in $G^{*}$ when line 14 was to be executed. The former case is impossible by definition of perfect elimination sequence. The latter case implies that $A\leftfootline\!\!\!\!\!\multimap B$ or $A\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}B$ was in $G^{*}$ by R1 or R2 when line 14 was to be executed, which contradicts that $A-B$ was a non-strong undirected edge in $G^{*}$. Fourth, assume to the contrary that $G$ has a semidirected cycle $\rho:V_{1},\ldots,V_{n}$. We can assume without loss of generality that $\rho$ is chordless. To see this, suppose that $\rho$ has a chord between $V_{i}$ and $V_{j}$ with $i<j$. Then, divide $\rho$ into the cycles $\rho_{L}:V_{1},\ldots,V_{i},V_{j},\ldots,V_{n}=V_{1}$ and $\rho_{R}:V_{i},\ldots,V_{j},V_{i}$. Note that $\rho_{L}$ or $\rho_{R}$ is a semidirected cycle that is shorter than $\rho$. By repeated application of this reasoning, we can conclude that $G$ has a chordless semidirected cycle.
Note that it follows from the paragraph above that $\rho$ cannot consist of just non-strong undirected edges in $G^{*}$. Then, it includes some edge that was $A\leftfootline B$ or $A\leftfootline\!\!\!\!\!\rightfootline B$ when line 14 was to be executed. The former alternative is impossible (Sonntag and Peña, 2015, Lemma 10). The latter alternative implies that $A\leftfootline\!\!\!\!\!\rightfootline B-C$ must exist in $G^{*}$ when line 14 was to be executed, which implies that $A$ and $C$ are adjacent in $G^{*}$ because, otherwise, $B\leftfootline\!\!\!\!\!\multimap C$ or $B\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}C$ was in $G^{*}$ by R1 or R2 when line 14 was to be executed, which contradicts that $B-C$ is a non-strong undirected edge in $G^{*}$. Then, $A\leftfootline\!\!\!\!\!\multimap C$ or $A\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}C$ exists in $G^{*}$ when line 14 was to be executed (Sonntag and Peña, 2015, Lemma 9), which implies that $B\leftfootline\!\!\!\!\!\multimap C$ or $B\mathrel{\reflectbox{$\leftfootline\!\!\!\!\!\multimap$}}C$ was in $G^{*}$ by R3 when line 14 was to be executed, which contradicts that $B-C$ is a non-strong undirected edge in $G^{*}$. ∎
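To make the use of Theorem 3 concrete, here is a minimal Python sketch that enumerates the sets $S\subseteq Nst_{G^{*}}(X)$ for which $G^{*}_{S\rightarrow X}$ is locally valid and returns the corresponding adjusting sets. It assumes that the strong undirected edges have already been identified (e.g. by the algorithm in Table 3) and are passed in as `strong`. It also assumes the AMP definition of a triplex $(A,X,B)$ as an induced subgraph $A\rightarrow X\leftarrow B$, $A\rightarrow X-B$ or $A-X\leftarrow B$. All function names and the graph representation are hypothetical.

```python
from itertools import combinations


def triplexes_at(x, directed, undirected):
    """Pairs {a, b} such that (a, x, b) is a triplex: a and b are non-adjacent,
    at least one of them is a parent of x, and the other is a parent or neighbour."""
    pa = {a for a, b in directed if b == x}
    ne = {b for e in undirected if x in e for b in e if b != x}

    def adjacent(a, b):
        return (a, b) in directed or (b, a) in directed or frozenset((a, b)) in undirected

    return {frozenset((a, b)) for a in pa for b in (pa | ne) - {a} if not adjacent(a, b)}


def locally_valid_adjusting_sets(x, directed, undirected, strong):
    """Adjusting sets St(x) | Pa(x | St(x)) | S for every locally valid S."""
    ne = {b for e in undirected if x in e for b in e if b != x}
    st = {a for a in ne if frozenset((a, x)) in strong}
    nst = ne - st
    base = triplexes_at(x, directed, undirected)
    block = {x} | st
    pa = {a for a, b in directed if b in block} - block
    results = []
    for r in range(len(nst) + 1):
        for chosen in combinations(sorted(nst), r):
            s = set(chosen)
            # Build G*_{S -> x}: orient the non-strong edges at x.
            new_dir = set(directed) | {(a, x) for a in s} | {(x, a) for a in nst - s}
            new_und = {e for e in undirected if not (x in e and (e - {x}) & nst)}
            # Locally valid: no triplex (A, x, B) that is absent from G*.
            if triplexes_at(x, new_dir, new_und) <= base:
                results.append(st | pa | s)
    return results


# Toy EG (hypothetical): the undirected path A - X - B with no strong edges.
undirected = {frozenset(("A", "X")), frozenset(("X", "B"))}
sets_for_x = locally_valid_adjusting_sets("X", set(), undirected, strong=set())
```

In this toy example the choices $S=\emptyset$, $S=\{A\}$ and $S=\{B\}$ are locally valid, whereas $S=\{A,B\}$ is not because it would create the triplex $A\rightarrow X\leftarrow B$. This agrees with the observation in the proof that $S\cup X$ must be a complete set.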

The procedure outlined above can be simplified as follows.

Corollary 2.

$St_{G^{*}}(X)=\emptyset$ or $Nst_{G^{*}}(X)=\emptyset$.

Proof.

Assume the contrary. Then, $G^{*}$ has a subgraph $A\leftfootline\!\!\!\!\!\rightfootline X-B$ when line 14 in Table 3 is to be executed. Then, $A$ and $B$ are adjacent in $G^{*}$ because, otherwise, the edge $X-B$ would have some block by R1 or R2. However, this implies that the edge $A-B$ has some block by Lemma 3, which implies that $X-B$ has some block by R3. This is a contradiction. ∎

5 DISCUSSION

In this paper, we have presented an algorithm to identify the strong edges in an EG. We have also shown how this makes it possible to compute bounds of causal effects under the assumption that the true CG is unknown but maximally oriented. In the future, we would like to derive a similar result for minimally oriented CGs. Moreover, as mentioned in the introduction, an EG is a deflagged graph but not necessarily the largest in the equivalence class. Therefore, an EG may contain a directed edge where the largest deflagged graph has an undirected edge. Then, the algorithm in Table 3 may be improved by consulting the largest deflagged graph before trying to label a directed edge as strong. An algorithm for constructing this graph exists (Roverato and Studený, 2006).

References

  • Andersson et al., (2001) Andersson, S. A., Madigan, D. and Perlman, M. D. Alternative Markov Properties for Chain Graphs. Scandinavian Journal of Statistics, 28:33-85, 2001.
  • Andersson and Perlman, (2004) Andersson, S. A. and Perlman, M. D. Characterizing Markov Equivalent Classes for AMP Chain Graph Models. Technical Report 453, University of Washington, 2004. Available at http://www.stat.washington.edu/www/research/reports/2004/tr453.pdf.
  • Andersson and Perlman, (2006) Andersson, S. A. and Perlman, M. D. Characterizing Markov Equivalent Classes for AMP Chain Graph Models. The Annals of Statistics, 34:939-972, 2006.
  • Koller and Friedman, (2009) Koller, D. and Friedman, N. Probabilistic Graphical Models. MIT Press, 2009.
  • Hyttinen et al., (2015) Hyttinen, A., Eberhardt, F. and Järvisalo, M. Do-calculus when the True Graph is Unknown. In Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence, 395-404, 2015.
  • Jensen and Nielsen, (2007) Jensen, F. V. and Nielsen, T. D. Bayesian Networks and Decision Graphs. Springer Verlag, 2007.
  • Levitz et al., (2001) Levitz, M., Perlman, M. D. and Madigan, D. Separation and Completeness Properties for AMP Chain Graph Markov Models. The Annals of Statistics, 29:1751-1784, 2001.
  • Maathuis et al., (2009) Maathuis, M. H., Kalisch, M. and Bühlmann, P. Estimating High-Dimensional Intervention Effects from Observational Data. The Annals of Statistics, 37:3133-3164, 2009.
  • Malinsky and Spirtes, (2016) Malinsky, D. and Spirtes, P. Estimating Causal Effects with Ancestral Graph Markov Models. In Proceedings of the 8th International Conference on Probabilistic Graphical Models, 299-309, 2016.
  • Pearl, (2009) Pearl, J. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2009.
  • Peña, (2014) Peña, J. M. Learning AMP Chain Graphs and some Marginal Models Thereof under Faithfulness. International Journal of Approximate Reasoning, 55:1011-1021, 2014.
  • Peña, (2016) Peña, J. M. Alternative Markov and Causal Properties for Acyclic Directed Mixed Graphs. In Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence, 577-586, 2016.
  • Peña and Gómez-Olmedo, (2016) Peña, J. M. and Gómez-Olmedo, M. Learning Marginal AMP Chain Graphs under Faithfulness Revisited. International Journal of Approximate Reasoning, 68:108-126, 2016.
  • Peters et al., (2017) Peters, J., Janzing, D. and Schölkopf, B. Elements of Causal Inference: Foundations and Learning Algorithms. The MIT Press, 2017.
  • Roverato and Studený, (2006) Roverato, A. and Studený, M. A Graphical Representation of Equivalence Classes of AMP Chain Graphs. Journal of Machine Learning Research, 7:1045-1078, 2006.
  • Sonntag and Peña, (2015) Sonntag, D. and Peña, J. M. Chain Graph Interpretations and their Relations Revisited. International Journal of Approximate Reasoning, 58:39-56, 2015.
  • Sonntag et al., (2015) Sonntag, D., Järvisalo, M., Peña, J. M. and Hyttinen, A. Learning Optimal Chain Graphs with Answer Set Programming. In Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence, 822-831, 2015.