Fast Isomorphism Testing of Graphs with Regularly-Connected Components
Abstract
The Graph Isomorphism problem has both theoretical and practical interest. In this paper we present an algorithm, called conauto-1.2, that efficiently tests whether two graphs are isomorphic, and finds an isomorphism if they are. This algorithm is an improved version of the algorithm conauto, which has been shown to be very fast for random graphs and several families of hard graphs [9]. In this paper we establish a new theorem that allows, at very low cost, the easy discovery of many automorphisms. This result is especially suited for graphs with regularly connected components, and can be applied in any isomorphism testing and canonical labeling algorithm to drastically improve its performance. In particular, algorithm conauto-1.2 is obtained by the application of this result to conauto. The resulting algorithm preserves all the nice features of conauto, but drastically improves the testing of graphs with regularly connected components. We have run extensive experiments, which show that the most popular algorithms (namely, nauty [10, 11] and bliss [8]) cannot compete with conauto-1.2 for these graph families.
1 Introduction
The Graph Isomorphism problem (GI) is of both theoretical and practical interest. GI tests whether there is a one-to-one mapping between the vertices of two graphs that preserves the arcs. This problem has applications in many fields, like pattern recognition and computer vision [3], data mining [17], VLSI layout validation [1], and chemistry [5, 15]. At the theoretical level, its main interest is that it is not known whether GI is in P or whether it is NP-complete.
Related Work
It would be nice to find a complete graph invariant (a function on a graph that gives the same result for isomorphic graphs, and different results for non-isomorphic graphs) computable in polynomial time, which would allow testing graphs for isomorphism in polynomial time. However, no such invariant is known, and it is unlikely to exist. Note, though, that there are many simple instances of GI, and that many families of graphs can be tested for isomorphism in polynomial time: trees [2], planar graphs [7], graphs of bounded degree [6], etc. For a review of the theoretical results related to GI see [9, 13].
The most interesting practical approaches to the GI problem are (1) the direct approach, which uses backtracking to find a match between the graphs, with techniques to prune the search tree, and (2) computing a certificate (a canonical labeling) of each of the graphs to test, and then comparing the certificates directly. The direct approach can be used for both graph and subgraph isomorphism (e.g. vf2 [4] and Ullmann's [16] algorithms), but has problems when dealing with highly regular graphs with a relatively small automorphism group. In this case, even the use of heuristics to prune the search space frequently does not prevent the proposed algorithms from exploring paths equivalent to those already tested. To avoid this, it is necessary to keep track of discovered automorphisms, and use this information to aggressively prune the search space. In the certificate approach, on the other hand, since two isomorphic graphs have the same canonical labeling, their certificates can be compared directly. This is the approach used by the well-known algorithm nauty [10, 11] and by the algorithm bliss [8] (which has better performance than nauty for some graph families). This approach requires computing the full automorphism group of the graph (or at least a set of generators). In most cases, these algorithms are faster than the ones that use the direct approach.
Algorithm conauto [9] uses a new approach to graph isomorphism (a preliminary version of conauto has been included in the LEDA C++ class library of algorithms [14]). It combines the use of discovered automorphisms with a backtracking algorithm that tries to find a match between the graphs without the need of generating a canonical form. To test graphs with $n$ vertices, conauto needs an amount of memory polynomial in $n$. Additionally, it runs in polynomial time (in $n$) with high probability for random graphs. In practical experiments, for several families of interesting hard graphs, conauto is faster than nauty and vf2, as shown in [9]. For example, Miyazaki's graphs [12] are very hard for vf2, nauty, and bliss, but conauto handles them efficiently. However, it was found in [9] that some families of graphs built from regularly connected components (in particular, from strongly regular graphs) are not handled efficiently by any of the algorithms evaluated. While conauto runs fast when the tested graphs are isomorphic, it is very slow when the graphs are not isomorphic.
Contributions
In this paper we establish a new theorem that allows, at very low cost, the easy discovery of many automorphisms. This result is especially suited for graphs with regularly connected components, and can be applied in any direct isomorphism testing or canonical labeling algorithm to drastically improve its performance.
Then, a new algorithm, called conauto-1.2, is proposed. This algorithm is obtained by improving conauto with techniques derived from the above mentioned theorem. In particular, conauto-1.2 reduces the backtracking needed to explore every plausible path in the search space with respect to conauto. The resulting algorithm preserves all the nice features of conauto, but drastically improves the testing of some graphs, like those with regularly connected components.
We have carried out experiments to compare the practical performance of conauto-1.2, nauty, and bliss, with different families of graphs built by regularly connecting copies of small components. The experiments show that, for this type of construction, conauto-1.2 not only is the fastest, but also has a very regular behavior.
Structure
In Section 2, we define the basic theoretical concepts used in algorithm conauto-1.2 and present the theorems on which its correctness relies. Next, in Section 3 we describe the algorithm itself. Then, Section 4 describes the graph families used for the tests, and shows the practical performance of conauto-1.2 compared with conauto, nauty, and bliss for these families. Finally, we put forward our conclusions and propose new ways to improve conauto-1.2.
2 Theoretical Foundation
2.1 Basic Definitions
A directed graph $G = (V, R)$ consists of a finite non-empty set of vertices $V$ and a binary relation $R$ on $V$, i.e. a subset $R \subseteq V \times V$. The elements of $R$ are called arcs. An arc $(u, v)$ is considered to be oriented from $u$ to $v$. An undirected graph is a graph whose arc set is symmetric, i.e. $(u, v) \in R$ iff $(v, u) \in R$. From now on, we will use the term graph to refer to a directed graph.
Definition 1
An isomorphism of graphs $G = (V, R)$ and $G' = (V', R')$ is a bijection between the vertex sets of $G$ and $G'$, $\varphi : V \to V'$, such that $(u, v) \in R$ if and only if $(\varphi(u), \varphi(v)) \in R'$. Graphs $G$ and $G'$ are called isomorphic, written $G \simeq G'$, if there is at least one isomorphism between them. An automorphism of $G$ is an isomorphism of $G$ and itself.
Given a graph $G = (V, R)$, $G$ can be represented by an adjacency matrix of size $|V| \times |V|$ that records, for each ordered pair of vertices $(u, v)$, which of the arcs $(u, v)$ and $(v, u)$ are in $R$, thus distinguishing four types of adjacency: none, arc from $u$ to $v$ only, arc from $v$ to $u$ only, and arcs in both directions.
Let $G = (V, R)$ be a graph and let $Adj$ be its adjacency matrix. Let $v \in V$ and $S \subseteq V$. The available degree of $v$ in $S$ under $Adj$ is the degree of $v$ with respect to the vertices of $S$, i.e., the 3-tuple that counts, for each of the three non-empty types of adjacency, the number of vertices of $S$ adjacent to $v$ with that type. The associated predicate says whether $v$ has any neighbor in $S$, i.e., whether its available degree in $S$ is different from $(0, 0, 0)$. Extending the notation, when all the vertices of a set $W \subseteq V$ have the same available degree in $S$, we speak of the available degree of $W$ in $S$; the predicate for sets is defined similarly.
We will order 3-tuples lexicographically, saying that one 3-tuple precedes another when it comes first in lexicographic order. This order will be used to compare the available degrees of vertices and sets.
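To make these definitions concrete, the following Python sketch (illustrative only, with hypothetical names such as adjacency, available_degree and has_links; the numeric encoding of the adjacency types and the ordering inside the 3-tuple are assumptions, not necessarily those used by conauto) computes the available degree of a vertex with respect to a set of vertices.

```python
# Minimal sketch, assuming arcs are stored as a set of ordered pairs and that
# adjacency types are encoded as 0 (none), 1 (arc u->v only), 2 (arc v->u
# only) and 3 (arcs in both directions).  Names are illustrative.

def adjacency(arcs, u, v):
    """Type of adjacency between u and v: 0, 1, 2 or 3 (see above)."""
    return (1 if (u, v) in arcs else 0) + (2 if (v, u) in arcs else 0)

def available_degree(arcs, v, cell):
    """3-tuple counting, for each non-zero adjacency type, the vertices of
    `cell` (other than v itself) adjacent to v with that type."""
    counts = [0, 0, 0]
    for u in cell:
        if u != v:
            a = adjacency(arcs, v, u)
            if a:
                counts[a - 1] += 1
    return tuple(counts)

def has_links(arcs, v, cell):
    """Predicate: does v have any neighbour (in either direction) in `cell`?"""
    return available_degree(arcs, v, cell) != (0, 0, 0)

# Example: a directed triangle 0->1->2->0 with the extra arc 1->0.
arcs = {(0, 1), (1, 0), (1, 2), (2, 0)}
print(available_degree(arcs, 0, {0, 1, 2}))  # (0, 1, 1): one type-2 and one type-3 neighbour
```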
2.2 Specific Notation and Definitions for the Algorithms
It will be necessary to introduce some specific notation to be used in the specification of our algorithms. Like other isomorphism testing algorithms, ours relies on vertex classification. Let us start defining what a partition is, and the partition concatenation operation.
A partition of a set $S$ is a sequence of disjoint nonempty subsets of $S$ whose union is $S$. These subsets are called the cells of the partition. The partition with no cells is the empty partition.
Definition 2
Let $\mathcal{P}$ and $\mathcal{Q}$ be partitions of two disjoint sets $S$ and $T$, respectively. The concatenation of $\mathcal{P}$ and $\mathcal{Q}$ is the partition of $S \cup T$ formed by the cells of $\mathcal{P}$ followed by the cells of $\mathcal{Q}$. Clearly, the concatenation of a partition with the empty partition is the partition itself.
Let $G = (V, R)$ be a graph, $S \subseteq V$, and $v \in V$. The vertex partition of $S$ by $v$ is the partition of $S$ in which two vertices belong to the same cell only if they have the same type of adjacency with $v$. Let $W \subseteq V$. The set partition of $S$ by $W$ is the partition of $S$ in which two vertices belong to the same cell only if they have the same available degree in $W$.
Definition 3
Let $G = (V, R)$ be a graph, and $\mathcal{P}$ a partition of a subset of $V$. Let $v$ be a vertex of some cell of $\mathcal{P}$. The vertex refinement of $\mathcal{P}$ by $v$ is the partition obtained by concatenating, for each cell of $\mathcal{P}$, the empty partition if its vertices have no remaining links, and the vertex partition of that cell by $v$ otherwise. The cell of $\mathcal{P}$ that contains $v$ is the pivot set and $v$ is the pivot vertex.
Definition 4
Let $G = (V, R)$ be a graph, and $\mathcal{P}$ a partition of a subset of $V$. Let a cell $W$ of $\mathcal{P}$ be the given pivot set. The set refinement of $\mathcal{P}$ by $W$ is the partition obtained by concatenating, for each cell of $\mathcal{P}$, the empty partition if its vertices have no remaining links, and the set partition of that cell by $W$ otherwise.
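The two refinements of Definitions 3 and 4 can be sketched as follows (hypothetical Python, reusing adjacency and available_degree from the sketch above; for simplicity it keeps vertices without remaining links instead of discarding them, and it orders the resulting cells arbitrarily rather than as conauto does).

```python
def vertex_refinement(arcs, partition, pivot_vertex):
    """Split every cell by the type of adjacency with the pivot vertex,
    which is itself removed from the partition."""
    refined = []
    for cell in partition:
        groups = {}
        for u in sorted(cell):
            if u != pivot_vertex:
                groups.setdefault(adjacency(arcs, u, pivot_vertex), []).append(u)
        refined.extend(groups[k] for k in sorted(groups))
    return refined

def set_refinement(arcs, partition, pivot_cell):
    """Split every cell by the available degree with respect to the pivot cell."""
    refined = []
    for cell in partition:
        groups = {}
        for u in sorted(cell):
            groups.setdefault(available_degree(arcs, u, pivot_cell), []).append(u)
        refined.extend(groups[k] for k in sorted(groups, reverse=True))
    return refined
```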
Once we have presented the refinements that may be applied to partitions, we can build sequences of partitions in which an initial partition (for example the one with one cell containing all the vertices of a graph) is iteratively refined using the two previously defined refinements. Refinement steps are tagged according to their type: vertex refinements whose pivot set has only one vertex, set refinements (when a set refinement is possible with some pivot set), and vertex refinements whose pivot set has more than one vertex (the latter are the backtracking points).
Definition 5
Let be a graph. A sequence of partitions for graph is a tuple , where , are the partitions themselves, indicate the type of refinement applied at each step, and choose the pivot set used for each refinement step, such that all the following statements hold:
1. For all , , and .
2. For all , let , . Then:
   (a) implies .
   (b) implies for some .
3. Let , , then for all , or .
For convenience, for all , by level we refer to the tuple in a sequence of partitions. Level is identified by , since and are not defined.
We will now introduce the concept of compatibility among partitions, and then define compatibility of sequences of partitions. Let be a partition of the set of vertices of a graph , and let be a partition of the set of vertices of a graph . and are said to be compatible under and respectively if (i.e. ), and for all , and .
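As a rough illustration only (hypothetical Python, reusing available_degree from the earlier sketch; it compares cell counts, cell sizes and the available degrees of one representative vertex per cell, which is a simplification of the compatibility conditions above), compatibility of two partitions could be tested as follows.

```python
def compatible_partitions(arcs1, part1, arcs2, part2):
    """Sketchy compatibility test: same number of cells, corresponding cells
    of equal size, and equal available degrees of corresponding cells with
    respect to the remaining vertices of each graph."""
    if len(part1) != len(part2):
        return False
    rest1 = set().union(*map(set, part1)) if part1 else set()
    rest2 = set().union(*map(set, part2)) if part2 else set()
    for c1, c2 in zip(part1, part2):
        if len(c1) != len(c2):
            return False
        v1, v2 = next(iter(c1)), next(iter(c2))   # representatives of each cell
        if available_degree(arcs1, v1, rest1) != available_degree(arcs2, v2, rest2):
            return False
    return True
```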
Definition 6
Let and be two graphs. Let , and be two sequences of partitions for graphs and respectively. and are said to be compatible sequences of partitions if:
1. , .
2. Let , , , , , . For all , , , and and are compatible under and respectively.
3. Let , , then for all , .
The following theorem shows that having compatible sequences of partitions is equivalent to being isomorphic.
Theorem 1 ([9])
Two graphs and are isomorphic if and only if there are two compatible sequences of partitions and for graphs and respectively.
In order to properly handle automorphisms, sequences of partitions will be extended with vertex equivalence information. Two vertices of a graph are equivalent, denoted , if there is an automorphism of such that . A vertex is fixed by if . When two vertices are equivalent, they are said to belong to the same orbit. The set of all the orbits of a graph is called the orbit partition. Our algorithm performs a partial computation of the orbit partition. The orbit partition will be computed incrementally, starting from the singleton partition. Since our algorithm performs a limited search for automorphisms, it is possible that it stops before the orbit partition is really found. Therefore, we will introduce the notion of semiorbit partition, and extend the sequence of partitions to include a semiorbit partition.
Definition 7
Let be a graph. A semiorbit partition of is any partition of , such that , implies that .
Definition 8
An extended sequence of partitions for a graph is a tuple , where is a sequence of partitions, denoted as , and is a semiorbit partition of , denoted as .
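For instance, a semiorbit partition can be maintained incrementally with a union-find structure, merging the classes of a vertex and its image whenever an automorphism is discovered; since classes are only ever merged, the result is always a semiorbit partition. The following sketch is hypothetical (conauto's actual data structures differ) but shows the idea.

```python
class SemiorbitPartition:
    """Union-find over the vertices; starts as the singleton partition and
    only ever merges classes, so it always remains a semiorbit partition."""

    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]   # path halving
            v = self.parent[v]
        return v

    def record_automorphism(self, automorphism):
        """Merge the class of every vertex with the class of its image."""
        for v, w in automorphism.items():
            rv, rw = self.find(v), self.find(w)
            if rv != rw:
                self.parent[rw] = rv

# Example: the 4-cycle 0-1-2-3 has an automorphism rotating every vertex.
orbits = SemiorbitPartition(range(4))
orbits.record_automorphism({0: 1, 1: 2, 2: 3, 3: 0})
assert len({orbits.find(v) for v in range(4)}) == 1   # all vertices end up in one class
```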
Finally, we introduce a notation for the number of vertex refinements tagged as backtracking points, since it will be used to choose the target sequence of partitions to be reproduced. Let be a sequence of partitions, and let . Then, .
2.3 Components Theorem
It was observed in [9] that conauto is very efficient at finding isomorphisms between unions of strongly regular graphs, but it is inefficient at detecting that two such unions are not isomorphic. Exploring the behavior of conauto on graphs that are the disjoint union of connected components, we observed that it was not able to identify cases in which components of both graphs had already been matched. This led to many redundant attempts at matching components.
Note that, once a component of a graph has been found isomorphic to a component of another graph, it is of no use trying to match it to yet another component of the latter. Besides, if a component cannot be matched to any component of the other graph, it is of no use trying to match the remaining components, since, in the end, the graphs cannot be isomorphic. After a thorough study of the behavior of conauto for these graphs, we have concluded that its performance can be drastically improved in these cases by directly applying the following theorem (whose proof can be found in the Appendix):
Theorem 2
During the search for a sequence of partitions compatible with the target, backtracking from a level $l$ to a level $l' < l$, such that each cell of level $l$ is contained in a different cell of level $l'$, cannot provide a compatible partition.
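In algorithmic terms, Theorem 2 lets the matching procedure jump over useless backtracking points. A sketch of how the target backtrack level could be computed is given below (hypothetical Python; levels holds the partitions of the levels already generated, oldest first, and cells are represented as sets).

```python
def backtrack_level(levels, current_cells):
    """Nearest earlier level at which at least two of the current cells are
    contained in the same cell.  By Theorem 2, backtracking to any level that
    fails this test cannot yield a compatible partition, so such levels are
    skipped; -1 means there is no useful level left (the graphs cannot be
    isomorphic)."""
    for lvl in range(len(levels) - 1, -1, -1):
        ancestors = []
        for cell in current_cells:
            # refinements only split cells, so each current cell lies inside
            # exactly one cell of every earlier level
            ancestors.append(next(i for i, c in enumerate(levels[lvl]) if cell <= c))
        if len(set(ancestors)) < len(ancestors):
            return lvl          # two current cells share an ancestor cell
    return -1
```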
3 Conauto-1.2
In this section we propose a new algorithm, conauto-1.2 (described in Algorithm 1), which is based on algorithm conauto [9] and uses the result of Theorem 2 to drastically reduce backtracking. It starts by generating a sequence of partitions for each of the graphs being tested, and by performing a limited search for automorphisms, just like conauto. The difference with conauto is that, during the search for the compatible sequence of partitions, the algorithm does not always backtrack to the previous recursive call (the previous level in the sequence of partitions). Instead, it may backtrack directly to a much higher level, or even stop the search, concluding that the graphs are not isomorphic, skipping intermediate backtracking points.
The function that generates the sequence of partitions is the same one used by conauto (see [9] for the details). It is worth mentioning that it builds the sequence with the following criteria (a simplified sketch of this refinement loop is given after the list):
1. It starts with the degree partition, and ends when it gets a partition in which no non-singleton cell has remaining links.
2. The pivot cell used for a refinement must always have remaining links (the more, the better).
3. At each level, a vertex refinement with a singleton pivot cell is the preferred choice.
4. The second best choice is to perform a set refinement, preferring small cells over big ones.
5. If the previous refinements cannot be used, then a vertex is chosen from the pivot cell (the smallest cell with links), a vertex refinement is performed with that pivot vertex, and a backtracking point arises.
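A simplified and hypothetical sketch of the refinement loop described by these criteria (reusing adjacency, has_links, vertex_refinement and set_refinement from the earlier sketches; it starts from the unit partition instead of the degree partition, and the tag strings are illustrative) is the following.

```python
def cell_has_links(arcs, cell, partition):
    """True if some vertex of `cell` still has a neighbour among the remaining vertices."""
    rest = set().union(*map(set, partition))
    return any(has_links(arcs, v, rest) for v in cell)

def generate_sequence(arcs, vertices):
    """Iteratively refine the partition until no non-singleton cell has
    remaining links (criterion 1), recording the kind of each step."""
    partition = [set(vertices)]                 # the real algorithm starts from the degree partition
    sequence = [(list(partition), "INITIAL")]
    while any(len(c) > 1 and cell_has_links(arcs, c, partition) for c in partition):
        linked = [c for c in partition if cell_has_links(arcs, c, partition)]   # criterion 2
        singletons = [c for c in linked if len(c) == 1]
        if singletons:                                     # criterion 3: singleton pivot
            partition, kind = vertex_refinement(arcs, partition, next(iter(singletons[0]))), "VERTEX"
        else:
            for pivot in sorted(linked, key=len):          # criterion 4: prefer small pivot cells
                refined = set_refinement(arcs, partition, pivot)
                if len(refined) > len(partition):          # the set refinement splits something
                    partition, kind = refined, "SET"
                    break
            else:                                          # criterion 5: backtracking point
                pivot = min(linked, key=len)               # smallest cell with links
                partition, kind = vertex_refinement(arcs, partition, next(iter(pivot))), "BACKTRACK"
        sequence.append((list(partition), kind))
    return sequence
```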
The function that searches for automorphisms is also the same one used by conauto (see [9] for the details). It takes as input a sequence of partitions for a graph, and generates an extended sequence of partitions. In the process, it tries to eliminate backtracking points, and builds a semiorbit partition of the vertices with the information on vertex equivalences it gathers. Recall that two vertices are equivalent if there is an automorphism that permutes them, i.e., if there are two equivalent sequences of partitions in which one vertex takes the place of the other.
Algorithm 1: conauto-1.2. Generate a sequence of partitions for each input graph, perform the limited search for automorphisms on both, and then try to find a sequence of partitions compatible with the target one; if such a sequence is found, return that the graphs are isomorphic, and otherwise return that they are not.
Function (Algorithm 2) uses backtracking attempting to find a sequence of partitions for graph that is compatible with the one for graph . At backtracking points, it tries every feasible vertex in the pivot cell, so that no possible solution is missed.
Note that, unlike in conauto, the matching function of conauto-1.2 does not return a boolean, but an integer (a level). If it signals that a mismatch has been found at some level such that there is no previous level at which a cell contains (at least) two cells of the partition of that level, then, from Theorem 2, there is no other feasible alternative in the search space that can yield an isomorphism of the graphs. If it returns a value that is higher than the current level, then a match has been found, the graphs are isomorphic, and there is no need to continue the search; in this case the call immediately returns with this value. If it returns a value that is lower than the current level, then it is necessary to backtrack to that level, since trying another option at the current level is meaningless according to Theorem 2; hence the algorithm also returns immediately with that value. If a call at some level returns that same level, then another alternative at this level should be tried if possible. In any other case, the function applies Theorem 2 directly, and returns the closest (previous) level at which two cells of the current level belong to the same cell; if no such previous level exists, it signals that the graphs cannot be isomorphic.
Algorithm 2: the matching function.
1:  if the current partition is the last one and the adjacencies in both partitions match then
2:      return success (a value higher than any level)
3:  else if the level is a vertex-refinement level and a compatible vertex refinement exists then
4:      refine and recurse on the next level
5:      if the result differs from the current level then return it
6:  else if the level is a set-refinement level and a compatible set refinement exists then
7:      refine and recurse on the next level
8:      if the result differs from the current level then return it
9:  else if the level is a backtracking level then
10:     for each vertex in the pivot cell, while there is no success, do
11:         if the vertex may not be discarded according to the semiorbit partition, and its vertex refinement is compatible, then
12:             refine and recurse on the next level
13:             if the result differs from the current level then return it
14:         end if
15:     end for
16: end if
17: return the nearest previous level at which the condition of Theorem 2 holds
4 Performance Evaluation
In this section we compare the practical performance of conauto-1.2 with nauty and bliss, two well-known algorithms that are considered the fastest for isomorphism testing and canonical labeling. In the performance evaluation experiments, we have run these programs with instances (pairs of graphs) that belong to specific families. We also use conauto to show the improvement achieved by conauto-1.2 for these graph families. Undirected and directed (when possible) graphs of different sizes (number of nodes) have been considered. The experiments include instances of isomorphic and non-isomorphic pairs of graphs.
4.1 Graph Families
For the evaluation, we have built some families of graphs with regularly-connected components. The general construction technique of these graphs consists of combining small components of different types by either (1) connecting every vertex of each component to all the vertices of the other components, (2) connecting only some vertices in each component to some vertices in all the other components, or (3) applying the latter construction in two levels. The use of these techniques guarantees that the resulting graph is connected, which is convenient for evaluating algorithms that require connectivity (e.g., vf2 [4]). Using the disjoint union of connected components yields similar experimental results.
Next, we describe each family of graphs used. In fact, as the reader will easily infer, the key point in all these constructions is that the components are either disconnected, or connected via complete $k$-partite graphs. Hence, multiple other constructions could be used that would yield similar results. In each graph family, one hundred pairs of isomorphic and non-isomorphic graphs have been generated for each graph size (up to approximately vertices).
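The following sketch (hypothetical Python with illustrative components; the actual families below use specific strongly regular, tripartite and Hypo-Hamiltonian components) shows construction technique (2): a designated subset of "port" vertices of every copy is connected to the ports of all the other copies, so the added connections form a complete k-partite graph.

```python
def connect_components(components, ports):
    """Combine copies of small components: `components[i]` is the arc set of
    copy i (over its own vertex numbering) and `ports[i]` lists the vertices
    of copy i to be connected to the ports of every other copy.  The arcs
    added between ports are placed in both directions, so those connections
    are undirected and form a complete k-partite graph."""
    arcs = set()
    for i, comp in enumerate(components):
        arcs |= {((i, u), (i, v)) for (u, v) in comp}     # internal arcs, relabelled
    for i, pi in enumerate(ports):
        for j, pj in enumerate(ports):
            if i != j:
                arcs |= {((i, u), (j, v)) for u in pi for v in pj}
    return arcs

# Example: three copies of an (undirected) triangle, each exposing vertex 0 as its port.
triangle = {(0, 1), (1, 0), (1, 2), (2, 1), (0, 2), (2, 0)}
big = connect_components([triangle] * 3, [[0]] * 3)
```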
Unions of Strongly Regular Graphs
This graph family is built from a set of strongly regular graphs with parameters as components. The components are interconnected so that each vertex in one component is connected to every vertex in the other components. This is equivalent to inverting the components, then applying the disjoint union, and finally inverting the result. Graphs up to vertices have only one copy of each component, and bigger ones may have more than one copy of each component.
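The equivalence mentioned above (connecting every vertex of each component to every vertex of the other components is the same as complementing the components, taking their disjoint union, and complementing the result) can be checked on small undirected examples with a sketch like the following (hypothetical code using plain edge-set representations).

```python
from itertools import combinations

def complement(vertices, edges):
    """Complement of an undirected graph given as a set of frozenset edges."""
    return {frozenset(e) for e in combinations(vertices, 2)} - edges

def join(parts):
    """All-to-all connection of components; `parts` is a list of
    (vertices, edges) pairs over disjoint vertex sets."""
    vertices = set().union(*(v for v, _ in parts))
    edges = set().union(*(e for _, e in parts))
    for (v1, _), (v2, _) in combinations(parts, 2):
        edges |= {frozenset((a, b)) for a in v1 for b in v2}
    return vertices, edges

# Two disjoint triangles as components.
t1 = ({1, 2, 3}, {frozenset(p) for p in [(1, 2), (2, 3), (1, 3)]})
t2 = ({4, 5, 6}, {frozenset(p) for p in [(4, 5), (5, 6), (4, 6)]})
union_of_complements = (t1[0] | t2[0], complement(*t1) | complement(*t2))
# The join equals the complement of the disjoint union of the complements.
assert join([t1, t2])[1] == complement(*union_of_complements)
```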
Unions of Tripartite Graphs
For this family, we use the digraphs in Figure 4.1 as the basic components. For the positive tests (isomorphic graphs) we use the same number of components of each type, while for the negative tests we use one graph with the same number of components of each type, and another graph in which one component has been replaced by one of the other type.
The connections between components have been done in the following way. The vertices in the subset of each component are connected to all the vertices in the subsets of the other components. See Figure 4.1 to locate these subsets. The arcs are directed from the vertices in the subsets, to the vertices in the subsets. From the previously described graphs, we have obtained an undirected version by transforming every (directed) arc into an (undirected) edge.
Hypo-Hamiltonian Graphs 2-level-connected
For this family we use two non-isomorphic Hypo-Hamiltonian graphs with 22 vertices. Both graphs have four orbits, of sizes one, three, six, and twelve. These basic components are interconnected at two levels. Let us call the vertices in the orbits of size one the 1-orbit vertices, and the vertices in the orbits of size three the 3-orbit vertices. In the first level, we connect basic components, to form a first-level component, by connecting all the -orbit vertices in each basic component to all the -orbit vertices of the other basic components. In this construction, the -orbit vertices, along with the new edges added to interconnect the basic components, form a complete -partite graph. Then, in the second level, first-level components are interconnected by adding edges that connect the -orbit vertices of each first-level component with all the -orbit vertices of the other first-level components. Again, the -orbit vertices, along with the edges connecting them, form a complete -partite graph. Since we use two Hypo-Hamiltonian graphs as basic components, to generate negative isomorphism cases, a component of one type is replaced with one of the other type.
4.2 Evaluation Results
The performance of the four programs has been evaluated in terms of their execution time with multiple instances of graphs from the previously defined families. The execution times have been measured on a Pentium III at 1.0 GHz with 256 MB of main memory, under Linux RedHat . The same compiler (GNU gcc) and the same optimization flag (-O) have been used to compile all the programs. The time measured is the real execution time (not only CPU time) of the programs. This time does not include the time to load the graphs from disk into memory. A time limit of seconds has been set for each execution. When the execution of a program with graphs of size reaches this limit, all the execution data of that program for graphs of the same family with size no smaller than are discarded.
Average Execution Time
The results of the experiments are first presented, in Figure 4.2, as curves that represent execution time as a function of graph size. In these curves, each point is the average execution time of the corresponding program on all the instances of the corresponding size.
It was previously known that nauty requires exponential time to process graphs that are unions of strongly regular graphs [12]. From our results, we conjecture that bliss has the same problem. That does not apply to conauto-1.2, though. While the original conauto had problems with non-isomorphic pairs of graphs, conauto-1.2 overcomes this problem.
With the family of unions of tripartite graphs, we have run both positive and negative experiments with directed and undirected versions of the graphs. In all cases, conauto-1.2 has a very low execution time. (Again, the improvement of conauto-1.2 over conauto is apparent in the case of negative tests.) Observe that there are no significant differences in the execution times of bliss and conauto-1.2 between the directed and the undirected cases. However, nauty is slower with directed graphs, even using the adjacencies invariant specifically designed for directed graphs.
Our last graph family, Cubic Hypohamiltonian 2-level-connected graphs, has a more complex structure than the other families, having two levels of interconnection. However, the results do not differ significantly from the previous ones. It seems that these graphs are a bit easier to process (compared with the other graph families) for bliss, but not for nauty. Like in the previous cases, conauto-1.2 is fast and consistent with the graphs in this family. It clearly improves the results of conauto for the non-isomorphic pairs of graphs.
Standard Deviation
In addition to the average behavior for each graph size, we have also evaluated the regularity of the behavior of the programs. By regular behavior we mean that the time required to process any pair of graphs of the same family and size is very similar. We have observed that conauto-1.2 is not only fast for all these families of graphs, but also has a very regular behavior. However, that does not hold for nauty or bliss. This is illustrated with the plots of the normalized standard deviation (NSD), obtained by dividing the standard deviation of the sample by the mean, shown in Figure 4.2. Algorithm conauto-1.2 has an NSD that remains almost constant, and very close to zero, for all graph sizes, and even decreases for larger graphs. However, nauty and bliss have a much more erratic behavior. In the case of conauto, we see that its problems arise when it faces negative tests, where the NSD grows rapidly.
5 Conclusions and Future Work
We have presented a result (the Components Theorem, Theorem 2) that can be applied in GI algorithms to efficiently find automorphisms. Then, we have applied this result to transform the algorithm conauto into conauto-1.2. Algorithm conauto-1.2 has been shown to be fast and consistent in performance for a variety of graph families. However, the algorithm conauto-1.2 can still be improved in several ways: (1) by adding the capability of computing a complete set of generators for the automorphism group, (2) by making extensive use of discovered automorphisms during the match process, and (3) by computing canonical forms of graphs. In all these possible improvements, the Components Theorem will surely help. Additionally, the Components Theorem might also be used by nauty and bliss to improve their performance for the graph families considered, at low cost.
References
- [1] Magdy S. Abadir and Jack Ferguson. An improved layout verification algorithm (LAVA). In EURO-DAC ’90: Proceedings of the conference on European design automation, pages 391–395, Los Alamitos, CA, USA, 1990. IEEE Computer Society Press.
- [2] Alfred V. Aho, John E. Hopcroft, and Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley series in computer science and information processing. Addison-Wesley Publishing Company, Boston, MA, USA, 1974.
- [3] Donatello Conte, Pasquale Foggia, Carlo Sansone, and Mario Vento. Graph matching applications in pattern recognition and image processing. In IEEE International Conference on Image Processing, volume 2, pages 21–24, Barcelona, Spain, September 2003. IEEE Computer Society Press.
- [4] L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. An improved algorithm for matching large graphs. In Proceedings of the 3rd IAPR-TC-15 International Workshop on Graph-based Representations, pages 149–159, Ischia, Italy, May 2001.
- [5] Jean-Loup Faulon. Isomorphism, automorphism partitioning, and canonical labeling can be solved in polynomial–time for molecular graphs. Journal of chemical information and computer science, 38:432–444, 1998.
- [6] I. S. Filotti and Jack N. Mayer. A polynomial–time algorithm for determining the isomorphism of graphs of fixed genus. In STOC ’80: Proceedings of the twelfth annual ACM symposium on Theory of computing, pages 236–243, New York, NY, USA, 1980. ACM Press.
- [7] J. E. Hopcroft and J. K. Wong. Linear time algorithm for isomorphism of planar graphs (preliminary report). In STOC ’74: Proceedings of the sixth annual ACM symposium on Theory of computing, pages 172–184, New York, NY, USA, 1974. ACM Press.
- [8] Tommi A. Junttila and Petteri Kaski. Engineering an efficient canonical labeling tool for large and sparse graphs. In ALENEX. SIAM, 2007.
- [9] José Luis López-Presa and Antonio Fernández Anta. Fast algorithm for graph isomorphism testing. In Jan Vahrenhold, editor, SEA, volume 5526 of Lecture Notes in Computer Science, pages 221–232. Springer, 2009.
- [10] Brendan D. McKay. Practical graph isomorphism. Congressus Numerantium, 30:45–87, 1981.
- [11] Brendan D. McKay. The nauty page. Computer Science Department, Australian National University, 2004. http://cs.anu.edu.au/bdm/nauty/.
- [12] Takunari Miyazaki. The complexity of McKay’s canonical labeling algorithm. In Larry Finkelstein and William M. Kantor, editors, Groups and Computation II, volume 28 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 239–256. American Mathematical Society, Providence, Rhode Island, USA, 1997.
- [13] José Luis López Presa. Efficient Algorithms for Graph Isomorphism Testing. Doctoral thesis, Escuela Técnica Superior de Ingeniería de Telecomunicación, Universidad Rey Juan Carlos, Madrid, Spain, March 2009. Available at http://www.diatel.upm.es/jllopez/tesis/thesis.pdf.
- [14] Johannes Singler. Graph isomorphism implementation in LEDA 5.1. Technical report, Algorithmic Solutions Software GmbH, Dec. 2005.
- [15] Gottfried Tinhofer and Mikhail Klin. Algebraic combinatorics in mathematical chemistry. Methods and algorithms III. Graph invariants and stabilization methods. Technical Report TUM-M9902, Technische Universität München, March 1999. http://www-lit.ma.tum.de/veroeff/quel/990.05005.pdf.
- [16] J. R. Ullmann. An algorithm for subgraph isomorphism. Journal of the ACM, 23(1):31–42, 1976.
- [17] Takashi Washio and Hiroshi Motoda. State of the art of graph-based data mining. ACM SIGKDD Explorations Newsletter, 5(1):59–68, 2003.
Appendix A Proof of the Components Theorem (Theorem 2)
The following definition will be needed in the proof.
Definition 9
Let $G = (V, R)$ be a graph. Let $W \subseteq V$. Then the subgraph induced by $G$ on $W$, denoted $G[W]$, is the graph $(W, R \cap (W \times W))$.
A backtracking point arises when a partition does not have singleton cells (suitable for a vertex refinement) and it is not possible to refine such a partition by means of a set refinement. Let us introduce a new concept that will be useful in the following discussion.
Definition 10
Let $G = (V, R)$ be a graph, and let $\mathcal{P}$ be a partition of a subset of $V$. $\mathcal{P}$ is said to be equitable (with respect to $G$) if, for every pair of cells $W_i$ and $W_j$ of $\mathcal{P}$, all the vertices of $W_i$ have the same available degree in $W_j$.
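In code, Definition 10 amounts to checking that no set refinement can distinguish two vertices of the same cell; a hypothetical sketch, reusing available_degree from the sketch in Section 2.1, is the following.

```python
def is_equitable(arcs, partition):
    """A partition is equitable when, within every cell, all vertices have the
    same available degree with respect to every cell (so no set refinement can
    distinguish two vertices of the same cell)."""
    for cell in partition:
        for other in partition:
            degrees = {available_degree(arcs, v, other) for v in cell}
            if len(degrees) > 1:
                return False
    return True
```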
Observation 1
The partition at a backtracking point is equitable.
Proof:
Assume otherwise. Then, there exists some such that there are two vertices in some
, such that . Therefore,
it would be possible to perform a set refinement on the partition, using as
the pivot cell, and vertices and would be distinguished by this refinement,
and cell would be split. This is not possible since, at a backtracking point,
no set refinement has succeeded.
Observation 2
Let be a backtracking level. Let be the partition at that level. Then, for all , is regular.
Proof:
From Observation 1, is equitable. Fix , then,
from Definition 10, for all , .
Therefore, is regular, for all .
Let be a sequence of partitions for graph where , , and . For all let , and . We consider two backtracking levels and that satisfy the preconditions of Theorem 2, i.e., and each cell of is contained in a different cell of .
Let be the pivot vertex used for the vertex refinement at level . Assume there is a vertex that satisfies the following. is a partition that is compatible with . Let , . For all , let be compatible with , where , if , and for some if . This generates an alternative sequence of partitions that is compatible with the original one up to level .
Under these premises, we show in the rest of the section that and are isomorphic, and there is an isomorphism of them that matches the vertices in to the vertices in for all .
To simplify the notation, let us assume . Note that in this case, for all , . In case this correspondence is not trivial. However, we can safely assume that there may be some that are empty, and develop our argument considering this possibility, although we know that in the real sequence of partitions, these empty cells would have been discarded.
For all , let , be the vertices discarded in the refinements from to and respectively, let be the vertices discarded in both alternative refinements, the vertices discarded only in the refinement from to , the vertices discarded only in the refinement from to , and the vertices remaining in both alternative partitions at level . Let , , , , , and . Clearly, , and . Observe that , and hence for all .
…
Observation 3
is isomorphic to , and there is an isomorphism of them that matches the vertices in to those in , for all .
Proof:
Direct from the construction of the sequences of partitions.
Lemma 1
Let . It is satisfied that:
• For each , for all , for all , and .
• For each , for all , for all , and .
Proof: Since none of the vertices in has been able to distinguish among the vertices in cell , each of the discarded vertices has the same type of adjacency with all the vertices in . Otherwise, consider vertex . Assume has at least two different types of adjacency with the vertices in . Since it was discarded during the refinements from to , that had to be for one of the following reasons:
1. It was discarded for having no links (i.e. links of type ), which is impossible since it has two different types of adjacencies with the vertices in .
2. It was used as the pivot set in a vertex refinement, which is impossible since it would have been able to split cell .
The same argument applies to the vertices in with respect to the vertices in each cell .
Consider the adjacency between vertex and vertex is for some . Then, we will denote the adjacency between and () as . Note that if , , if , , if , , and if , .
Lemma 2
For each , there is some such that for all , , , , , and , and .
Proof: Let us take any and any . Since and , from Lemma 1, for each , for all , for some . Let us take any such . Then, for those particular and . Besides, since and , from Lemma 1, for all , for some . Since we already know that for that particular pair of vertices, then we conclude that for all , , and , for some .
and . Since for all , , and , then from Lemma 1, for all , , (clearly, the same ) and .
and . Since for all , , and , then from Lemma 1, for all , , and (clearly, the same ).
Furthermore, all the vertices in have the same number of adjacent vertices of each type in . Otherwise, they would have been distinguished in the refinement process from to . Likewise, all the vertices in have the same number of adjacent vertices of each type in . Otherwise, they would have been distinguished in the refinement process from to . Hence, the vertices of must have the same number of adjacent vertices of each type in and . Hence, since for all , and for all , and , then for all , and for all , and too.
A similar argument may be used to prove that for all , and for all , and . Then, from Lemma 1, since , for all , for all . We already know that for all , for all , and . Hence, for all , too, and .
Putting together all the partial results obtained, we get the assertion stated in the lemma.
Corollary 1
Let . For each , it is satisfied that for all , , where .
Proof:
From Lemma 2, for the case , we get that for all ,
, , and
. Hence, it must hold that , so .
Let us define two families of partitions of for :
Note that, since the vertices of are unable to distinguish among the vertices of , then, if for some or some , then for all and all . Hence, each pair of sets and defines a partition of . Note also that, since each vertex in has the same type of adjacency with all the vertices in (from Lemma 1), then for all , , , , , , and , (from Lemma 2).
Lemma 3
For all , let , and let . Then, any isomorphism of and that maps to , maps the vertices in among themselves.
Proof: From Observation 1, partition is equitable. Hence, for each , for all , . Thus, for all , , , , , .
Let us take any pair of values of and . From Lemma 2, all the vertices of have the same type of adjacency with all the vertices of . Assume this type of adjacency is . From the definition of , all the vertices of have adjacency with all the vertices of . Hence, for , , . Since and , then (note that , , and ).
However, from the definition of , for , . Hence, since , .
Since any isomorphism must match vertices with the same degree, every isomorphism of and that maps to , maps the vertices in among themselves.
Applying this argument over all possible values of , we get that
any isomorphism of and that maps to ,
maps the vertices in among themselves, for all .
Let us focus on any isomorphism of and that maps to for all (there is at least one from Observation 3).
Lemma 4
is isomorphic to , and there is an isomorphism of them that matches the vertices in to those in , for all .
Proof:
Let us analyze the adjacencies between the vertices in , , , , , and for some values of and . From Corollary 1, for all , , , where . From the definition of , for all , .
From Lemma 3, the vertices of are mapped among themselves in any isomorphism of and that maps to . Hence, the vertices of must be mapped to the vertices of . If , then , , and are disconnected. Hence, and must be isomorphic. In the case , taking the inverses of the graphs leads to the same result.
From Lemma 2, for each , there is some such that for all , , , , and . From the definition of , for all , for all , , , , .
Putting all this together, we come to a picture of the adjacencies among , , , , , and as shown in Figure A. The connections between the vertices of and the vertices of , and between the vertices of and the vertices of are all-to-all (all the same) of value or . Similarly, the adjacencies between the vertices of and the vertices of , and the adjacencies between the vertices of and the vertices of are all the same, all-to-all or (not necessarily equal to those of and or ). The adjacencies between and are all the same, all-to-all of any value in the set . This also applies to the adjacencies between and .
If is not isomorphic to , the discrepancy must be in the adjacencies between vertices of and with respect to the adjacencies between vertices of and . In such a case, in the isomorphism between and (recall that from Observation 3 there is an isomorphism of and that maps the vertices of to the vertices in for all ) some vertices of should be mapped to vertices of , and some of the vertices of should be mapped to vertices of . However, due to the adjacencies among , , , , , and , shown in Figure A, that would imply that the adjacencies between the vertices of and had to match adjacencies between the vertices of and . But, in that case, the same adjacency pattern must exist between the vertices of and , to match the corresponding subgraph of . Hence, the adjacencies between and could have been matched to the adjacencies between and .
Since this applies for all values of and , we conclude that is isomorphic to ,
and there is an isomorphism of them that matches the vertices in to those in , for all
, completing the proof.
Lemma 5
and are isomorphic, and there is an isomorphism of them that maps the vertices in to the vertices of for all .
Proof: From Lemma 2, we know that for each , there is some such that for all , , , , , and , and .
Note also that, from Corollary 1, for all , , , , where . This adjacency pattern is graphically shown in Figure A.
From Lemma 4, we know that is isomorphic to , and there is an isomorphism of them that matches the vertices in to those in , for all .
From the fact that is isomorphic to itself, and the previous considerations on the adjacency
pattern between the vertices in , , , , , and for all ,
shown in Figure A, it is easy to see that the isomorphism of and
obtained from Lemma 4, together with the trivial automorphism of yields an
isomorphism of and , which completes the proof.
We have shown that if two alternative sequences of partitions lead to compatible partitions, where all their cells are subcells of different cells of a previous common level, then the remaining subgraphs are isomorphic, and the vertices in each cell of one partition may be mapped to the vertices in its corresponding cell of the other partition by one such isomorphism. Thus, if during the search for a sequence of partitions compatible with the target we have found an incompatibility at some point beyond that level, and we have to backtrack from one level to another level in which all the cells are different supersets of the cells at the current backtracking point, then any compatible path we try will reach the same dead end. Hence, it is of no use to try another path from such a level, and it will be necessary to backtrack to some point where at least two cells of the current backtracking point are subsets of the same cell of the previous backtracking point. This proves Theorem 2.