Fast Isomorphism Testing of Graphs with Regularly-Connected Components
Abstract
The Graph Isomorphism problem has both theoretical and practical interest. In this paper we present an algorithm, called conauto-1.2, that efficiently tests whether two graphs are isomorphic, and finds an isomorphism if they are. This algorithm is an improved version of the algorithm conauto, which has been shown to be very fast for random graphs and several families of hard graphs [9]. In this paper we establish a new theorem that allows, at very low cost, the easy discovery of many automorphisms. This result is especially suited for graphs with regularly connected components, and can be applied in any isomorphism testing and canonical labeling algorithm to drastically improve its performance. In particular, algorithm conauto-1.2 is obtained by the application of this result to conauto. The resulting algorithm preserves all the nice features of conauto, but drastically improves the testing of graphs with regularly connected components. We have run extensive experiments, which show that the most popular algorithms (namely, nauty [10, 11] and bliss [8]) cannot compete with conauto-1.2 for these graph families.
1 Introduction
The Graph Isomorphism problem (GI) is of both theoretical and practical interest. GI tests whether there is a one-to-one mapping between the vertices of two graphs that preserves the arcs. This problem has applications in many fields, like pattern recognition and computer vision [3], data mining [17], VLSI layout validation [1], and chemistry [5, 15]. At the theoretical level, its main interest is that it is not known whether GI is in P or whether it is NP-complete.
Related Work
It would be nice to find a complete graph invariant (a function on a graph that gives the same result for isomorphic graphs, and different results for non-isomorphic graphs) computable in polynomial time, which would allow testing graphs for isomorphism in polynomial time. However, no such invariant is known, and it is unlikely to exist. Note, though, that there are many simple instances of GI, and that many families of graphs can be tested for isomorphism in polynomial time: trees [2], planar graphs [7], graphs of bounded degree [6], etc. For a review of the theoretical results related to GI see [9, 13].
The most interesting practical approaches to the GI problem are (1) the direct approach, which uses backtracking to find a match between the graphs, with techniques to prune the search tree, and (2) computing a certificate (a canonical labeling) of each of the graphs to test, and then comparing the certificates directly. The direct approach can be used for both graph and subgraph isomorphism (e.g. vf2 [4] and Ullmann's [16] algorithms), but has problems when dealing with highly regular graphs with a relatively small automorphism group. In this case, even the use of heuristics to prune the search space frequently does not prevent the proposed algorithms from exploring paths equivalent to those already tested. To avoid this, it is necessary to keep track of discovered automorphisms, and use this information to aggressively prune the search space. In the certificate approach, on the other hand, since two isomorphic graphs have the same canonical labeling, their certificates can be compared directly. This is the approach used by the well-known algorithm nauty [10, 11] and by the algorithm bliss [8] (which has better performance than nauty for some graph families). This approach requires computing the full automorphism group of the graph (or at least a set of generators). In most cases, these algorithms are faster than the ones that use the direct approach.
Algorithm conauto [9] uses a new approach to graph isomorphism (a preliminary version of conauto has been included in the LEDA C++ class library of algorithms [14]). It combines the use of discovered automorphisms with a backtracking algorithm that tries to find a match between the graphs without the need of generating a canonical form. To test graphs with $n$ vertices, conauto needs an amount of memory polynomial in $n$. Additionally, it runs in polynomial time (in $n$) with high probability for random graphs. In practical experiments, for several families of interesting hard graphs, conauto is faster than nauty and vf2, as shown in [9]. For example, Miyazaki's graphs [12] are very hard for vf2, nauty, and bliss, but conauto handles them efficiently. However, it was found in [9] that some families of graphs built from regularly connected components (in particular, from strongly regular graphs) are not handled efficiently by any of the algorithms evaluated. While conauto runs fast when the tested graphs are isomorphic, it is very slow when the graphs are not isomorphic.
Contributions
In this paper we establish a new theorem that allows, at very low cost, the easy discovery of many automorphisms. This result is especially suited for graphs with regularly connected components, and can be applied in any direct isomorphism testing or canonical labeling algorithm to drastically improve its performance.
Then, a new algorithm, called conauto-1.2, is proposed. This algorithm is obtained by improving conauto with techniques derived from the above mentioned theorem. In particular, conauto-1.2 reduces the backtracking needed to explore every plausible path in the search space with respect to conauto. The resulting algorithm preserves all the nice features of conauto, but drastically improves the testing of some graphs, like those with regularly connected components.
We have carried out experiments to compare the practical performance of conauto-1.2, nauty, and bliss, with different families of graphs built by regularly connecting copies of small components. The experiments show that, for this type of construction, conauto-1.2 not only is the fastest, but also has a very regular behavior.
Structure
In Section 2, we define the basic theoretical concepts used in algorithm conauto-1.2 and present the theorems on which its correctness relies. Next, in Section 3 we describe the algorithm itself. Then, Section 4 describes the graph families used for the tests, and shows the practical performance of conauto-1.2 compared with conauto, nauty, and bliss for these families. Finally, we put forward our conclusions and propose new ways to improve conauto-1.2.
2 Theoretical Foundation
2.1 Basic Definitions
A directed graph $G = (V, R)$ consists of a finite non-empty set of vertices $V$ and a binary relation $R$ on $V$, i.e. a subset $R \subseteq V \times V$. The elements of $R$ are called arcs. An arc $(u, v)$ is considered to be oriented from $u$ to $v$. An undirected graph is a graph whose arc set is symmetric, i.e. $(u, v) \in R$ iff $(v, u) \in R$. From now on, we will use the term graph to refer to a directed graph.
Definition 1
An isomorphism of graphs $G = (V, R)$ and $G' = (V', R')$ is a bijection between the vertex sets of $G$ and $G'$, $\varphi : V \to V'$, such that $(u, v) \in R$ if and only if $(\varphi(u), \varphi(v)) \in R'$. Graphs $G$ and $G'$ are called isomorphic, written $G \simeq G'$, if there is at least one isomorphism between them. An automorphism of $G$ is an isomorphism of $G$ and itself.
Given a graph $G = (V, R)$, $G$ can be represented by an adjacency matrix of size $|V| \times |V|$ that records, for each ordered pair of vertices $(u, v)$, which of the arcs $(u, v)$ and $(v, u)$ are in $R$, thus distinguishing four types of adjacency: none, arc from $u$ to $v$ only, arc from $v$ to $u$ only, and arcs in both directions.
Let $G = (V, R)$ be a graph and let $Adj$ be its adjacency matrix. Let $v \in V$ and $S \subseteq V$. The available degree of $v$ in $S$ under $Adj$ is the degree of $v$ with respect to the vertices of $S$, i.e., the 3-tuple that counts, for each of the three non-empty types of adjacency, the number of vertices of $S$ adjacent to $v$ with that type. The associated predicate says whether $v$ has any neighbor in $S$, i.e., whether its available degree in $S$ is different from $(0, 0, 0)$. Extending the notation, when all the vertices of a set $W \subseteq V$ have the same available degree in $S$, we speak of the available degree of $W$ in $S$; the predicate for sets is defined similarly.
We will order 3-tuples lexicographically, saying that one 3-tuple precedes another when it comes first in lexicographic order. This order will be used to compare the available degrees of vertices and sets.
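To make these definitions concrete, the following Python sketch (illustrative only, with hypothetical names such as adjacency, available_degree and has_links; the numeric encoding of the adjacency types and the ordering inside the 3-tuple are assumptions, not necessarily those used by conauto) computes the available degree of a vertex with respect to a set of vertices.

```python
# Minimal sketch, assuming arcs are stored as a set of ordered pairs and that
# adjacency types are encoded as 0 (none), 1 (arc u->v only), 2 (arc v->u
# only) and 3 (arcs in both directions).  Names are illustrative.

def adjacency(arcs, u, v):
    """Type of adjacency between u and v: 0, 1, 2 or 3 (see above)."""
    return (1 if (u, v) in arcs else 0) + (2 if (v, u) in arcs else 0)

def available_degree(arcs, v, cell):
    """3-tuple counting, for each non-zero adjacency type, the vertices of
    `cell` (other than v itself) adjacent to v with that type."""
    counts = [0, 0, 0]
    for u in cell:
        if u != v:
            a = adjacency(arcs, v, u)
            if a:
                counts[a - 1] += 1
    return tuple(counts)

def has_links(arcs, v, cell):
    """Predicate: does v have any neighbour (in either direction) in `cell`?"""
    return available_degree(arcs, v, cell) != (0, 0, 0)

# Example: a directed triangle 0->1->2->0 with the extra arc 1->0.
arcs = {(0, 1), (1, 0), (1, 2), (2, 0)}
print(available_degree(arcs, 0, {0, 1, 2}))  # (0, 1, 1): one type-2 and one type-3 neighbour
```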
2.2 Specific Notation and Definitions for the Algorithms
It will be necessary to introduce some specific notation to be used in the specification of our algorithms. Like other isomorphism testing algorithms, ours relies on vertex classification. Let us start defining what a partition is, and the partition concatenation operation.
A partition of a set $S$ is a sequence of disjoint nonempty subsets of $S$ whose union is $S$. These subsets are called the cells of the partition. The partition with no cells is the empty partition.
Definition 2
Let $\mathcal{P}$ and $\mathcal{Q}$ be partitions of two disjoint sets $S$ and $T$, respectively. The concatenation of $\mathcal{P}$ and $\mathcal{Q}$ is the partition of $S \cup T$ formed by the cells of $\mathcal{P}$ followed by the cells of $\mathcal{Q}$. Clearly, the concatenation of a partition with the empty partition is the partition itself.
Let $G = (V, R)$ be a graph, $S \subseteq V$, and $v \in V$. The vertex partition of $S$ by $v$ is the partition of $S$ in which two vertices belong to the same cell only if they have the same type of adjacency with $v$. Let $W \subseteq V$. The set partition of $S$ by $W$ is the partition of $S$ in which two vertices belong to the same cell only if they have the same available degree in $W$.
Definition 3
Let $G = (V, R)$ be a graph, and $\mathcal{P}$ a partition of a subset of $V$. Let $v$ be a vertex of some cell of $\mathcal{P}$. The vertex refinement of $\mathcal{P}$ by $v$ is the partition obtained by concatenating, for each cell of $\mathcal{P}$, the empty partition if its vertices have no remaining links, and the vertex partition of that cell by $v$ otherwise. The cell of $\mathcal{P}$ that contains $v$ is the pivot set and $v$ is the pivot vertex.
Definition 4
Let $G = (V, R)$ be a graph, and $\mathcal{P}$ a partition of a subset of $V$. Let a cell $W$ of $\mathcal{P}$ be the given pivot set. The set refinement of $\mathcal{P}$ by $W$ is the partition obtained by concatenating, for each cell of $\mathcal{P}$, the empty partition if its vertices have no remaining links, and the set partition of that cell by $W$ otherwise.
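The two refinements of Definitions 3 and 4 can be sketched as follows (hypothetical Python, reusing adjacency and available_degree from the sketch above; for simplicity it keeps vertices without remaining links instead of discarding them, and it orders the resulting cells arbitrarily rather than as conauto does).

```python
def vertex_refinement(arcs, partition, pivot_vertex):
    """Split every cell by the type of adjacency with the pivot vertex,
    which is itself removed from the partition."""
    refined = []
    for cell in partition:
        groups = {}
        for u in sorted(cell):
            if u != pivot_vertex:
                groups.setdefault(adjacency(arcs, u, pivot_vertex), []).append(u)
        refined.extend(groups[k] for k in sorted(groups))
    return refined

def set_refinement(arcs, partition, pivot_cell):
    """Split every cell by the available degree with respect to the pivot cell."""
    refined = []
    for cell in partition:
        groups = {}
        for u in sorted(cell):
            groups.setdefault(available_degree(arcs, u, pivot_cell), []).append(u)
        refined.extend(groups[k] for k in sorted(groups, reverse=True))
    return refined
```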
Once we have presented the refinements that may be applied to partitions, we can build sequences of partitions in which an initial partition (for example the one with one cell containing all the vertices of a graph) is iteratively refined using the two previously defined refinements. Refinement steps are tagged according to their type: vertex refinements whose pivot set has only one vertex, set refinements (when a set refinement is possible with some pivot set), and vertex refinements whose pivot set has more than one vertex (the latter are the backtracking points).
Definition 5
Let be a graph. A sequence of partitions for graph is a tuple , where , are the partitions themselves, indicate the type of refinement applied at each step, and choose the pivot set used for each refinement step, such that all the following statements hold:
1. For all , , and .
2. For all , let , . Then:
   (a) implies .
   (b) implies for some .
3. Let , , then for all , or .
For convenience, for all , by level we refer to the tuple in a sequence of partitions. Level is identified by , since and are not defined.
We will now introduce the concept of compatibility among partitions, and then define compatibility of sequences of partitions. Let be a partition of the set of vertices of a graph , and let be a partition of the set of vertices of a graph . and are said to be compatible under and respectively if (i.e. ), and for all , and .
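As a rough illustration only (hypothetical Python, reusing available_degree from the earlier sketch; it compares cell counts, cell sizes and the available degrees of one representative vertex per cell, which is a simplification of the compatibility conditions above), compatibility of two partitions could be tested as follows.

```python
def compatible_partitions(arcs1, part1, arcs2, part2):
    """Sketchy compatibility test: same number of cells, corresponding cells
    of equal size, and equal available degrees of corresponding cells with
    respect to the remaining vertices of each graph."""
    if len(part1) != len(part2):
        return False
    rest1 = set().union(*map(set, part1)) if part1 else set()
    rest2 = set().union(*map(set, part2)) if part2 else set()
    for c1, c2 in zip(part1, part2):
        if len(c1) != len(c2):
            return False
        v1, v2 = next(iter(c1)), next(iter(c2))   # representatives of each cell
        if available_degree(arcs1, v1, rest1) != available_degree(arcs2, v2, rest2):
            return False
    return True
```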
Definition 6
Let and be two graphs. Let , and be two sequences of partitions for graphs and respectively. and are said to be compatible sequences of partitions if:
1. , .
2. Let , , , , , . For all , , , and and are compatible under and respectively.
3. Let , , then for all , .
The following theorem shows that having compatible sequences of partitions is equivalent to being isomorphic.
Theorem 1 ([9])
Two graphs and are isomorphic if and only if there are two compatible sequences of partitions and for graphs and respectively.
In order to properly handle automorphisms, sequences of partitions will be extended with vertex equivalence information. Two vertices of a graph are equivalent, denoted , if there is an automorphism of such that . A vertex is fixed by if . When two vertices are equivalent, they are said to belong to the same orbit. The set of all the orbits of a graph is called the orbit partition. Our algorithm performs a partial computation of the orbit partition. The orbit partition will be computed incrementally, starting from the singleton partition. Since our algorithm performs a limited search for automorphisms, it is possible that it stops before the orbit partition is really found. Therefore, we will introduce the notion of semiorbit partition, and extend the sequence of partitions to include a semiorbit partition.
Definition 7
Let be a graph. A semiorbit partition of is any partition of , such that , implies that .
Definition 8
An extended sequence of partitions for a graph is a tuple , where is a sequence of partitions, denoted as , and is a semiorbit partition of , denoted as .
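For instance, a semiorbit partition can be maintained incrementally with a union-find structure, merging the classes of a vertex and its image whenever an automorphism is discovered; since classes are only ever merged, the result is always a semiorbit partition. The following sketch is hypothetical (conauto's actual data structures differ) but shows the idea.

```python
class SemiorbitPartition:
    """Union-find over the vertices; starts as the singleton partition and
    only ever merges classes, so it always remains a semiorbit partition."""

    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]   # path halving
            v = self.parent[v]
        return v

    def record_automorphism(self, automorphism):
        """Merge the class of every vertex with the class of its image."""
        for v, w in automorphism.items():
            rv, rw = self.find(v), self.find(w)
            if rv != rw:
                self.parent[rw] = rv

# Example: the 4-cycle 0-1-2-3 has an automorphism rotating every vertex.
orbits = SemiorbitPartition(range(4))
orbits.record_automorphism({0: 1, 1: 2, 2: 3, 3: 0})
assert len({orbits.find(v) for v in range(4)}) == 1   # all vertices end up in one class
```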
Finally, we introduce a notation for the number of vertex refinements tagged as backtracking points, since it will be used to choose the target sequence of partitions to be reproduced. Let be a sequence of partitions, and let . Then, .
2.3 Components Theorem
It was observed in [9] that conauto is very efficient at finding isomorphisms between unions of strongly regular graphs, but it is inefficient at detecting that two such unions are not isomorphic. Exploring the behavior of conauto on graphs that are the disjoint union of connected components, we observed that it was not able to identify cases in which components of both graphs had already been matched. This led to many redundant attempts at matching components.
Note that, once a component of a graph has been found isomorphic to a component of another graph, it is of no use trying to match it to yet another component of the latter. Besides, if a component cannot be matched to any component of the other graph, it is of no use trying to match the remaining components, since, in the end, the graphs cannot be isomorphic. After a thorough study of the behavior of conauto for these graphs, we have concluded that its performance can be drastically improved in these cases by directly applying the following theorem (whose proof can be found in the Appendix):
Theorem 2
During the search for a sequence of partitions compatible with the target, backtracking from a level $l$ to a level $l' < l$, such that each cell of level $l$ is contained in a different cell of level $l'$, cannot provide a compatible partition.
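In algorithmic terms, Theorem 2 lets the matching procedure jump over useless backtracking points. A sketch of how the target backtrack level could be computed is given below (hypothetical Python; levels holds the partitions of the levels already generated, oldest first, and cells are represented as sets).

```python
def backtrack_level(levels, current_cells):
    """Nearest earlier level at which at least two of the current cells are
    contained in the same cell.  By Theorem 2, backtracking to any level that
    fails this test cannot yield a compatible partition, so such levels are
    skipped; -1 means there is no useful level left (the graphs cannot be
    isomorphic)."""
    for lvl in range(len(levels) - 1, -1, -1):
        ancestors = []
        for cell in current_cells:
            # refinements only split cells, so each current cell lies inside
            # exactly one cell of every earlier level
            ancestors.append(next(i for i, c in enumerate(levels[lvl]) if cell <= c))
        if len(set(ancestors)) < len(ancestors):
            return lvl          # two current cells share an ancestor cell
    return -1
```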
3 Conauto-1.2
In this section we propose a new algorithm, conauto-1.2 (described in Algorithm 1), which is based on algorithm conauto [9] and uses the result of Theorem 2 to drastically reduce backtracking. It starts by generating a sequence of partitions for each of the graphs being tested, and by performing a limited search for automorphisms, just like conauto. The difference with conauto is that, during the search for the compatible sequence of partitions, the algorithm does not always backtrack to the previous recursive call (the previous level in the sequence of partitions). Instead, it may backtrack directly to a much higher level, or even stop the search, concluding that the graphs are not isomorphic, skipping intermediate backtracking points.
The function that generates the sequence of partitions is the same one used by conauto (see [9] for the details). It is worth mentioning that it builds the sequence with the following criteria (a simplified sketch of this refinement loop is given after the list):
1. It starts with the degree partition, and ends when it gets a partition in which no non-singleton cell has remaining links.
2. The pivot cell used for a refinement must always have remaining links (the more, the better).
3. At each level, a vertex refinement with a singleton pivot cell is the preferred choice.
4. The second best choice is to perform a set refinement, preferring small cells over big ones.
5. If the previous refinements cannot be used, then a vertex is chosen from the pivot cell (the smallest cell with links), a vertex refinement is performed with that pivot vertex, and a backtracking point arises.
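A simplified and hypothetical sketch of the refinement loop described by these criteria (reusing adjacency, has_links, vertex_refinement and set_refinement from the earlier sketches; it starts from the unit partition instead of the degree partition, and the tag strings are illustrative) is the following.

```python
def cell_has_links(arcs, cell, partition):
    """True if some vertex of `cell` still has a neighbour among the remaining vertices."""
    rest = set().union(*map(set, partition))
    return any(has_links(arcs, v, rest) for v in cell)

def generate_sequence(arcs, vertices):
    """Iteratively refine the partition until no non-singleton cell has
    remaining links (criterion 1), recording the kind of each step."""
    partition = [set(vertices)]                 # the real algorithm starts from the degree partition
    sequence = [(list(partition), "INITIAL")]
    while any(len(c) > 1 and cell_has_links(arcs, c, partition) for c in partition):
        linked = [c for c in partition if cell_has_links(arcs, c, partition)]   # criterion 2
        singletons = [c for c in linked if len(c) == 1]
        if singletons:                                     # criterion 3: singleton pivot
            partition, kind = vertex_refinement(arcs, partition, next(iter(singletons[0]))), "VERTEX"
        else:
            for pivot in sorted(linked, key=len):          # criterion 4: prefer small pivot cells
                refined = set_refinement(arcs, partition, pivot)
                if len(refined) > len(partition):          # the set refinement splits something
                    partition, kind = refined, "SET"
                    break
            else:                                          # criterion 5: backtracking point
                pivot = min(linked, key=len)               # smallest cell with links
                partition, kind = vertex_refinement(arcs, partition, next(iter(pivot))), "BACKTRACK"
        sequence.append((list(partition), kind))
    return sequence
```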
The function that searches for automorphisms is also the same one used by conauto (see [9] for the details). It takes as input a sequence of partitions for a graph, and generates an extended sequence of partitions. In the process, it tries to eliminate backtracking points, and builds a semiorbit partition of the vertices with the information on vertex equivalences it gathers. Recall that two vertices are equivalent if there is an automorphism that permutes them, i.e., if there are two equivalent sequences of partitions in which one vertex takes the place of the other.
Algorithm 1: conauto-1.2. Generate a sequence of partitions for each input graph, perform the limited search for automorphisms on both, and then try to find a sequence of partitions compatible with the target one; if such a sequence is found, return that the graphs are isomorphic, and otherwise return that they are not.
Function (Algorithm 2) uses backtracking attempting to find a sequence of partitions for graph that is compatible with the one for graph . At backtracking points, it tries every feasible vertex in the pivot cell, so that no possible solution is missed.
Note that, unlike in conauto, the matching function of conauto-1.2 does not return a boolean, but an integer (a level). If it signals that a mismatch has been found at some level such that there is no previous level at which a cell contains (at least) two cells of the partition of that level, then, from Theorem 2, there is no other feasible alternative in the search space that can yield an isomorphism of the graphs. If it returns a value that is higher than the current level, then a match has been found, the graphs are isomorphic, and there is no need to continue the search; in this case the call immediately returns with this value. If it returns a value that is lower than the current level, then it is necessary to backtrack to that level, since trying another option at the current level is meaningless according to Theorem 2; hence the algorithm also returns immediately with that value. If a call at some level returns that same level, then another alternative at this level should be tried if possible. In any other case, the function applies Theorem 2 directly, and returns the closest (previous) level at which two cells of the current level belong to the same cell; if no such previous level exists, it signals that the graphs cannot be isomorphic.
Algorithm 2: the matching function.
1:  if the current partition is the last one and the adjacencies in both partitions match then
2:      return success (a value higher than any level)
3:  else if the level is a vertex-refinement level and a compatible vertex refinement exists then
4:      refine and recurse on the next level
5:      if the result differs from the current level then return it
6:  else if the level is a set-refinement level and a compatible set refinement exists then
7:      refine and recurse on the next level
8:      if the result differs from the current level then return it
9:  else if the level is a backtracking level then
10:     for each vertex in the pivot cell, while there is no success, do
11:         if the vertex may not be discarded according to the semiorbit partition, and its vertex refinement is compatible, then
12:             refine and recurse on the next level
13:             if the result differs from the current level then return it
14:         end if
15:     end for
16: end if
17: return the nearest previous level at which the condition of Theorem 2 holds
4 Performance Evaluation
In this section we compare the practical performance of conauto-1.2 with nauty and bliss, two well-known algorithms that are considered the fastest for isomorphism testing and canonical labeling. In the performance evaluation experiments, we have run these programs with instances (pairs of graphs) that belong to specific families. We also use conauto to show the improvement achieved by conauto-1.2 for these graph families. Undirected and directed (when possible) graphs of different sizes (number of nodes) have been considered. The experiments include instances of isomorphic and non-isomorphic pairs of graphs.
4.1 Graph Families
For the evaluation, we have built some families of graphs with regularly-connected components. The general construction technique of these graphs consists of combining small components of different types by either (1) connecting every vertex of each component to all the vertices of the other components, (2) connecting only some vertices in each component to some vertices in all the other components, or (3) applying the latter construction in two levels. The use of these techniques guarantees that the resulting graph is connected, which is convenient for evaluating algorithms that require connectivity (e.g., vf2 [4]). Using the disjoint union of connected components yields similar experimental results.
Next, we describe each family of graphs used. In fact, as the reader will easily infer, the key point in all these constructions is that the components are either disconnected, or connected via complete $k$-partite graphs. Hence, multiple other constructions could be used that would yield similar results. In each graph family, one hundred pairs of isomorphic and non-isomorphic graphs have been generated for each graph size (up to approximately vertices).
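The following sketch (hypothetical Python with illustrative components; the actual families below use specific strongly regular, tripartite and Hypo-Hamiltonian components) shows construction technique (2): a designated subset of "port" vertices of every copy is connected to the ports of all the other copies, so the added connections form a complete k-partite graph.

```python
def connect_components(components, ports):
    """Combine copies of small components: `components[i]` is the arc set of
    copy i (over its own vertex numbering) and `ports[i]` lists the vertices
    of copy i to be connected to the ports of every other copy.  The arcs
    added between ports are placed in both directions, so those connections
    are undirected and form a complete k-partite graph."""
    arcs = set()
    for i, comp in enumerate(components):
        arcs |= {((i, u), (i, v)) for (u, v) in comp}     # internal arcs, relabelled
    for i, pi in enumerate(ports):
        for j, pj in enumerate(ports):
            if i != j:
                arcs |= {((i, u), (j, v)) for u in pi for v in pj}
    return arcs

# Example: three copies of an (undirected) triangle, each exposing vertex 0 as its port.
triangle = {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}
big = connect_components([triangle] * 3, [[0]] * 3)
```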
Unions of Strongly Regular Graphs
This graph family is built from a set of strongly regular graphs with parameters as components. The components are interconnected so that each vertex in one component is connected to every vertex in the other components. This is equivalent to inverting the components, then applying the disjoint union, and finally inverting the result. Graphs up to vertices have only one copy of each component, and bigger ones may have more than one copy of each component.
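The equivalence mentioned above (connecting every vertex of each component to every vertex of the other components is the same as complementing the components, taking their disjoint union, and complementing the result) can be checked on small undirected examples with a sketch like the following (hypothetical code using plain edge-set representations).

```python
from itertools import combinations

def complement(vertices, edges):
    """Complement of an undirected graph given as a set of frozenset edges."""
    return {frozenset(e) for e in combinations(vertices, 2)} - edges

def join(parts):
    """All-to-all connection of components; `parts` is a list of
    (vertices, edges) pairs over disjoint vertex sets."""
    vertices = set().union(*(v for v, _ in parts))
    edges = set().union(*(e for _, e in parts))
    for (v1, _), (v2, _) in combinations(parts, 2):
        edges |= {frozenset((a, b)) for a in v1 for b in v2}
    return vertices, edges

# Two disjoint triangles as components.
t1 = ({1, 2, 3}, {frozenset(p) for p in [(1, 2), (2, 3), (1, 3)]})
t2 = ({4, 5, 6}, {frozenset(p) for p in [(4, 5), (5, 6), (4, 6)]})
union_of_complements = (t1[0] | t2[0], complement(*t1) | complement(*t2))
# The join equals the complement of the disjoint union of the complements.
assert join([t1, t2])[1] == complement(*union_of_complements)
```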
Unions of Tripartite Graphs
For this family, we use the digraphs in Figure 4.1 as the basic components. For the positive tests (isomorphic graphs) we use the same number of components of each type, while for the negative tests we use one graph with the same number of components of each type, and another graph in which one component has been replaced by one of the other type.
The connections between components have been done in the following way. The vertices in the subset of each component are connected to all the vertices in the subsets of the other components. See Figure 4.1 to locate these subsets. The arcs are directed from the vertices in the subsets, to the vertices in the subsets. From the previously described graphs, we have obtained an undirected version by transforming every (directed) arc into an (undirected) edge.
Hypo-Hamiltonian Graphs 2-level-connected
For this family we use two non-isomorphic Hypo-Hamiltonian graphs with 22 vertices. Both graphs have four orbits, of sizes one, three, six, and twelve. These basic components are interconnected at two levels. Let us call the vertices in the orbits of size one the 1-orbit vertices, and the vertices in the orbits of size three the 3-orbit vertices. In the first level, we connect basic components, to form a first-level component, by connecting all the -orbit vertices in each basic component to all the -orbit vertices of the other basic components. In this construction, the -orbit vertices, along with the new edges added to interconnect the basic components, form a complete -partite graph. Then, in the second level, first-level components are interconnected by adding edges that connect the -orbit vertices of each first-level component with all the -orbit vertices of the other first-level components. Again, the -orbit vertices, along with the edges connecting them, form a complete -partite graph. Since we use two Hypo-Hamiltonian graphs as basic components, to generate negative isomorphism cases, a component of one type is replaced with one of the other type.
4.2 Evaluation Results
The performance of the four programs has been evaluated in terms of their execution time with multiple instances of graphs from the previously defined families. The execution times have been measured on a Pentium III at 1.0 GHz with 256 MB of main memory, under Linux RedHat . The same compiler (GNU gcc) and the same optimization flag (-O) have been used to compile all the programs. The time measured is the real execution time (not only CPU time) of the programs. This time does not include the time to load the graphs from disk into memory. A time limit of seconds has been set for each execution. When the execution of a program with graphs of size reaches this limit, all the execution data of that program for graphs of the same family with size no smaller than are discarded.
Average Execution Time
The results of the experiments are first presented, in Figure 4.2, as curves that represent execution time as a function of graph size. In these curves, each point is the average execution time of the corresponding program on all the instances of the corresponding size.
It was previously known that nauty requires exponential time to process graphs that are unions of strongly regular graphs [12]. From our results, we conjecture that bliss has the same problem. That does not apply to conauto-1.2, though. While the original conauto had problems with non-isomorphic pairs of graphs, conauto-1.2 overcomes this problem.
With the family of unions of tripartite graphs, we have run both positive and negative experiments with directed and undirected versions of the graphs. In all cases, conauto-1.2 has a very low execution time. (Again, the improvement of conauto-1.2 over conauto is apparent in the case of negative tests.) Observe that there are no significant differences in the execution times of bliss and conauto-1.2 between the directed and the undirected cases. However, nauty is slower with directed graphs, even using the adjacencies invariant specifically designed for directed graphs.
Our last graph family, Cubic Hypohamiltonian 2-level-connected graphs, has a more complex structure than the other families, having two levels of interconnection. However, the results do not differ significantly from the previous ones. It seems that these graphs are a bit easier to process (compared with the other graph families) for bliss, but not for nauty. Like in the previous cases, conauto-1.2 is fast and consistent with the graphs in this family. It clearly improves the results of conauto for the non-isomorphic pairs of graphs.
Standard Deviation
In addition to the average behavior for each graph size, we have also evaluated the regularity of the behavior of the programs. By regular behavior we mean that the time required to process any pair of graphs of the same family and size is very similar. We have observed that conauto-1.2 is not only fast for all these families of graphs, but also has a very regular behavior. However, that does not hold for nauty or bliss. This is illustrated with the plots of the normalized standard deviation (NSD), obtained by dividing the standard deviation of the sample by the mean, shown in Figure 4.2. Algorithm conauto-1.2 has an NSD that remains almost constant, and very close to zero, for all graph sizes, and even decreases for larger graphs. However, nauty and bliss have a much more erratic behavior. In the case of conauto, we see that its problems arise when it faces negative tests, where the NSD grows rapidly.
5 Conclusions and Future Work
We have presented a result (the Components Theorem, Theorem 2) that can be applied in GI algorithms to efficiently find automorphisms. Then, we have applied this result to transform the algorithm conauto into conauto-1.2. Algorithm conauto-1.2 has been shown to be fast and consistent in performance for a variety of graph families. However, the algorithm conauto-1.2 can still be improved in several ways: (1) by adding the capability of computing a complete set of generators for the automorphism group, (2) by making extensive use of discovered automorphisms during the match process, and (3) by computing canonical forms of graphs. In all these possible improvements, the Components Theorem will surely help. Additionally, the Components Theorem might also be used by nauty and bliss to improve their performance for the graph families considered, at low cost.
References
- [1] Magdy S. Abadir and Jack Ferguson. An improved layout verification algorithm (LAVA). In EURO-DAC ’90: Proceedings of the conference on European design automation, pages 391–395, Los Alamitos, CA, USA, 1990. IEEE Computer Society Press.
- [2] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley series in computer science and information processing. Addison-Wesley Publishing Company, Boston, MA, USA, 1974.
- [3] Donatello Conte, Pasquale Foggia, Carlo Sansone, and Mario Vento. Graph matching applications in pattern recognition and image processing. In IEEE International Conference on Image Processing, volume 2, pages 21–24, Barcelona, Spain, September 2003. IEEE Computer Society Press.
- [4] L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. An improved algorithm for matching large graphs. In Proceedings of the 3rd IAPR-TC-15 International Workshop on Graph-based Representations, pages 149–159, Ischia, Italy, May 2001.
- [5] Jean-Loup Faulon. Isomorphism, automorphism partitioning, and canonical labeling can be solved in polynomial–time for molecular graphs. Journal of chemical information and computer science, 38:432–444, 1998.
- [6] I. S. Filotti and Jack N. Mayer. A polynomial–time algorithm for determining the isomorphism of graphs of fixed genus. In STOC ’80: Proceedings of the twelfth annual ACM symposium on Theory of computing, pages 236–243, New York, NY, USA, 1980. ACM Press.
- [7] J. E. Hopcroft and J. K. Wong. Linear time algorithm for isomorphism of planar graphs (preliminary report). In STOC ’74: Proceedings of the sixth annual ACM symposium on Theory of computing, pages 172–184, New York, NY, USA, 1974. ACM Press.
- [8] Tommi A. Junttila and Petteri Kaski. Engineering an efficient canonical labeling tool for large and sparse graphs. In ALENEX. SIAM, 2007.
- [9] José Luis López-Presa and Antonio Fernández Anta. Fast algorithm for graph isomorphism testing. In Jan Vahrenhold, editor, SEA, volume 5526 of Lecture Notes in Computer Science, pages 221–232. Springer, 2009.
- [10] Brendan D. McKay. Practical graph isomorphism. Congressus Numerantium, 30:45–87, 1981.
- [11] Brendan D. McKay. The nauty page. Computer Science Department, Australian National University, 2004. http://cs.anu.edu.au/bdm/nauty/.
- [12] Takunari Miyazaki. The complexity of McKay’s canonical labeling algorithm. In Larry Finkelstein and William M. Kantor, editors, Groups and Computation II, volume 28 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 239–256. American Mathematical Society, Providence, Rhode Island, USA, 1997.
- [13] José Luis López Presa. Efficient Algorithms for Graph Isomorphism Testing. Doctoral thesis, Escuela Técnica Superior de Ingeniería de Telecomunicación, Universidad Rey Juan Carlos, Madrid, Spain, March 2009. Available at http://www.diatel.upm.es/jllopez/tesis/thesis.pdf.
- [14] Johannes Singler. Graph isomorphism implementation in LEDA 5.1. Technical report, Algorithmic Solutions Software GmbH, Dec. 2005.
- [15] Gottfried Tinhofer and Mikhail Klin. Algebraic combinatorics in mathematical chemistry. Methods and algorithms III. Graph invariants and stabilization methods. Technical Report TUM-M9902, Technische Universität München, March 1999. http://www-lit.ma.tum.de/veroeff/quel/990.05005.pdf.
- [16] J. R. Ullmann. An algorithm for subgraph isomorphism. Journal of the ACM, 23(1):31–42, 1976.
- [17] Takashi Washio and Hiroshi Motoda. State of the art of graph-based data mining. ACM SIGKDD Explorations Newsletter, 5(1):59–68, 2003.
Appendix A Proof of the Components Theorem (Theorem 2)
The following definition will be needed in the proof.
Definition 9
Let $G = (V, R)$ be a graph. Let $W \subseteq V$. Then the subgraph induced by $G$ on $W$, denoted $G[W]$, is the graph $(W, R \cap (W \times W))$.
A backtracking point arises when a partition does not have singleton cells (suitable for a vertex refinement) and it is not possible to refine such a partition by means of a set refinement. Let us introduce a new concept that will be useful in the following discussion.
Definition 10
Let $G = (V, R)$ be a graph, and let $\mathcal{P}$ be a partition of a subset of $V$. $\mathcal{P}$ is said to be equitable (with respect to $G$) if, for every pair of cells $W_i$ and $W_j$ of $\mathcal{P}$, all the vertices of $W_i$ have the same available degree in $W_j$.
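In code, Definition 10 amounts to checking that no set refinement can distinguish two vertices of the same cell; a hypothetical sketch, reusing available_degree from the sketch in Section 2.1, is the following.

```python
def is_equitable(arcs, partition):
    """A partition is equitable when, within every cell, all vertices have the
    same available degree with respect to every cell (so no set refinement can
    distinguish two vertices of the same cell)."""
    for cell in partition:
        for other in partition:
            degrees = {available_degree(arcs, v, other) for v in cell}
            if len(degrees) > 1:
                return False
    return True
```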
Observation 1
The partition at a backtracking point is equitable.
Proof:
Assume otherwise. Then, there exists some such that there are two vertices in some
, such that . Therefore,
it would be possible to perform a set refinement on the partition, using as
the pivot cell, and vertices and would be distinguished by this refinement,
and cell would be split. This is not possible since, at a backtracking point,
no set refinement has succeeded.
Observation 2
Let be a backtracking level. Let be the partition at that level. Then, for all , is regular.
Proof:
From Observation 1, is equitable. Fix , then,
from Definition 10, for all , .
Therefore, is regular, for all .
Let be a sequence of partitions for graph where , , and . For all let , and . We consider two backtracking levels and that satisfy the preconditions of Theorem 2, i.e., and each cell of is contained in a different cell of .
Let be the pivot vertex used for the vertex refinement at level . Assume there is a vertex that satisfies the following. is a partition that is compatible with . Let , . For all , let be compatible with , where , if , and for some if . This generates an alternative sequence of partitions that is compatible with the original one up to level .
Under these premises, we show in the rest of the section that and are isomorphic, and there is an isomorphism of them that matches the vertices in to the vertices in for all .
To simplify the notation, let us assume . Note that in this case, for all , . In case this correspondence is not trivial. However, we can safely assume that there may be some that are empty, and develop our argument considering this possibility, although we know that in the real sequence of partitions, these empty cells would have been discarded.
For all , let , be the vertices discarded in the refinements from to and respectively, let be the vertices discarded in both alternative refinements, the vertices discarded only in the refinement from to , the vertices discarded only in the refinement from to , and the vertices remaining in both alternative partitions at level . Let , , , , , and . Clearly, , and . Observe that , and hence for all .
…
Observation 3
is isomorphic to , and there is an isomorphism of them that matches the vertices in to those in , for all .
Proof:
Direct from the construction of the sequences of partitions.
Lemma 1
Let . It is satisfied that:
• For each , for all , for all , and .
• For each , for all , for all , and .
Proof: Since none of the vertices in has been able to distinguish among the vertices in cell , each of the discarded vertices has the same type of adjacency with all the vertices in . Otherwise, consider vertex . Assume has at least two different types of adjacency with the vertices in . Since it was discarded during the refinements from to , that had to be for one of the following reasons:
1. It was discarded for having no links (i.e. links of type ), which is impossible since it has two different types of adjacencies with the vertices in .
2. It was used as the pivot set in a vertex refinement, which is impossible since it would have been able to split cell .
The same argument applies to the vertices in with respect to the vertices in each cell .
Consider the adjacency between vertex and vertex is for some . Then, we will denote the adjacency between and () as . Note that if , , if , , if , , and if , .
Lemma 2
For each , there is some such that for all , , , , , and , and .
Proof: Let us take any and any . Since and , from Lemma 1, for each , for all , for some . Let us take any such . Then, for those particular and . Besides, since and , from Lemma 1, for all , for some . Since we already know that for that particular pair of vertices, then we conclude that for all , , and , for some .
and . Since for all , , and , then from Lemma 1, for all , , (clearly, the same ) and .
and . Since for all , , and , then from Lemma 1, for all , , and (clearly, the same ).
Furthermore, all the vertices in have the same number of adjacent vertices of each type in . Otherwise, they would have been distinguished in the refinement process from to . Likewise, all the vertices in have the same number of adjacent vertices of each type in . Otherwise, they would have been distinguished in the refinement process from to . Hence, the vertices of must have the same number of adjacent vertices of each type in and . Hence, since for all , and for all , and , then for all , and for all , and too.
A similar argument may be used to prove that for all , and for all , and . Then, from Lemma 1, since , for all , for all . We already know that for all , for all , and . Hence, for all , too, and .
Putting together all the partial results obtained, we get the assertion stated in the lemma.
Corollary 1
Let . For each , it is satisfied that for all , , where .
Proof:
From Lemma 2, for the case , we get that for all ,
, , and
. Hence, it must hold that , so .
Let us define two families of partitions of for :
Note that, since the vertices of are unable to distinguish among the vertices of , then, if for some or some , then for all and all . Hence, each pair of sets and defines a partition of . Note also that, since each vertex in has the same type of adjacency with all the vertices in (from Lemma 1), then for all , , , , , , and , (from Lemma 2).
Lemma 3
For all , let , and let . Then, any isomorphism of and that maps to , maps the vertices in among themselves.
Proof: From Observation 1, partition is equitable. Hence, for each , for all , . Thus, for all , , , , , .
Let us take any pair of values of and . From Lemma 2, all the vertices of have the same type of adjacency with all the vertices of . Assume this type of adjacency is . From the definition of , all the vertices of have adjacency with all the vertices of . Hence, for , , . Since and , then (note that , , and ).
However, from the definition of , for , . Hence, since , .
Since any isomorphism must match vertices with the same degree, every isomorphism of and that maps to , maps the vertices in among themselves.
Applying this argument over all possible values of , we get that
any isomorphism of and that maps to ,
maps the vertices in among themselves, for all .
Let us focus on any isomorphism of and that maps to for all (there is at least one from Observation 3).
Lemma 4
is isomorphic to , and there is an isomorphism of them that matches the vertices in to those in , for all .
Proof:
Let us analyze the adjacencies between the vertices in , , , , , and for some values of and . From Corollary 1, for all , , , where . From the definition of , for all , .
From Lemma 3, the vertices of are mapped among themselves in any isomorphism of and that maps to . Hence, the vertices of must be mapped to the vertices of . If , then , , and are disconnected. Hence, and must be isomorphic. In the case , taking the inverses of the graphs leads to the same result.
From Lemma 2, for each , there is some such that for all , , , , and . From the definition of , for all , for all , , , , .
Putting all this together, we come to a picture of the adjacencies among , , , , , and as shown in Figure A. The connections between the vertices of and the vertices of , and between the vertices of and the vertices of are all-to-all (all the same) of value or . Similarly, the adjacencies between the vertices of and the vertices of , and the adjacencies between the vertices of and the vertices of are all the same, all-to-all or (not necessarily equal to those of and or ). The adjacencies between and are all the same, all-to-all of any value in the set . This also applies to the adjacencies between and .
If is not isomorphic to , the discrepancy must be in the adjacencies between vertices of and with respect to the adjacencies between vertices of and . In such a case, in the isomorphism between and (recall that from Observation 3 there is an isomorphism of and that maps the vertices of to the vertices in for all ) some vertices of should be mapped to vertices of , and some of the vertices of should be mapped to vertices of . However, due to the adjacencies among , , , , , and , shown in Figure A, that would imply that the adjacencies between the vertices of and had to match adjacencies between the vertices of and . But, in that case, the same adjacency pattern must exist between the vertices of and , to match the corresponding subgraph of . Hence, the adjacencies between and could have been matched to the adjacencies between and .
Since this applies for all values of and , we conclude that is isomorphic to ,
and there is an isomorphism of them that matches the vertices in to those in , for all
, completing the proof.
Lemma 5
and are isomorphic, and there is an isomorphism of them that maps the vertices in to the vertices of for all .
Proof: From Lemma 2, we know that for each , there is some such that for all , , , , , and , and .
Note also that, from Corollary 1, for all , , , , where . This adjacency pattern is graphically shown in Figure A.
From Lemma 4, we know that is isomorphic to , and there is an isomorphism of them that matches the vertices in to those in , for all .
From the fact that is isomorphic to itself, and the previous considerations on the adjacency
pattern between the vertices in , , , , , and for all ,
shown in Figure A, it is easy to see that the isomorphism of and
obtained from Lemma 4, together with the trivial automorphism of yields an
isomorphism of and , which completes the proof.
We have shown that if two alternative sequences of partitions lead to compatible partitions, where all their cells are subcells of different cells of a previous common level, then the remaining subgraphs are isomorphic, and the vertices in each cell of one partition may be mapped to the vertices in its corresponding cell of the other partition by one such isomorphism. Thus, if during the search for a sequence of partitions compatible with the target we have found an incompatibility at some point beyond that level, and we have to backtrack from one level to another level in which all the cells are different supersets of the cells at the current backtracking point, then any compatible path we try will reach the same dead end. Hence, it is of no use to try another path from such a level, and it will be necessary to backtrack to some point where at least two cells of the current backtracking point are subsets of the same cell of the previous backtracking point. This proves Theorem 2.