Two-Dimensional Pattern Languages
This document is a full version (i. e., it contains all proofs) of the conference paper [3].
Abstract
We introduce several classes of array languages obtained by generalising Angluin’s pattern languages to the two-dimensional case. These classes of two-dimensional pattern languages are compared with respect to their expressive power and their closure properties are investigated.
1 Introduction
Several methods of generation of two-dimensional languages (also called array languages or picture languages) have been proposed in the literature, extending the techniques and results of formal string language theory. A picture is considered as a rectangular array of terminal symbols in the two-dimensional plane. Models based on grammars or automata as well as those based on theoretical properties of the string languages are well known and have been extensively investigated. We refer the interested readers to books and surveys like the ones by Rosenfeld [12], Wang [15], Rosenfeld and Siromoney [13], Giammarresi and Restivo [7], or Morita [11]. For example, regular string languages (also known as recognizable string languages) can be characterized in terms of local languages and projections. Based on a similar idea, the class REC of recognizable picture languages (see Giammarresi and Restivo [6]) was proposed as a two-dimensional counterpart of regular string languages. In this work, we attempt to generalise to the two-dimensional case a class of string languages that also provides several desirable features and has therefore attracted considerable interest over the last three decades in the formal language theory community as well as in the learning theory community: Angluin’s pattern languages (see [1]).
In this context, a pattern is a string over an alphabet of variables, e. g., . For some finite alphabet of terminal symbols, the pattern language described by (with respect to ) is the set of all words over that can be derived from by uniformly substituting the variables in by (non-empty) terminal words. For example, if , then and are words of the pattern language given by , since replacing by and by turns into and replacing by and by turns into . On the other hand, the word is not a member of the pattern language of .
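The informal membership condition above can be sketched as a small backtracking test. This is only an illustrative sketch: the function name `matches`, the encoding of a pattern as a list of variable names, and the sample patterns are assumptions made here, not the example of the text.

```python
def matches(pattern, word, binding=None):
    """Return True if `word` arises from `pattern` by uniformly replacing
    every variable with a non-empty terminal word (non-erasing case)."""
    binding = {} if binding is None else binding
    if not pattern:
        return word == ""
    var, rest = pattern[0], pattern[1:]
    if var in binding:
        # The variable already has an image: it must occur here again.
        image = binding[var]
        return word.startswith(image) and matches(rest, word[len(image):], binding)
    # Otherwise try every non-empty prefix as the image of the variable.
    for cut in range(1, len(word) + 1):
        binding[var] = word[:cut]
        if matches(rest, word[cut:], binding):
            return True
        del binding[var]
    return False


# Hypothetical pattern x1 x2 x1, written as a list of variable names.
print(matches(["x1", "x2", "x1"], "abcab"))  # x1 -> ab, x2 -> c
print(matches(["x1", "x2", "x1"], "abcb"))
```

The search is exponential in the worst case, which is acceptable for a sketch whose only purpose is to make the notion of uniform substitution concrete.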
One of the most notable features of pattern languages is that they have natural and compact human readable descriptors (or generators), namely the patterns. In particular, this advantage becomes evident when patterns are compared to other language descriptors such as, e. g., grammars or automata, which are usually quite involved even though the language they describe is rather simple. Nevertheless, patterns can compete with common automata models and grammars in terms of expressive power and their practical relevance is demonstrated by the widespread use of so-called extended regular expressions with backreferences, which implicitly use the concept of patterns and are capable of defining all pattern languages. (In fact, these extended regular expressions with backreferences are nowadays a standard element of most text editors and programming languages; cf. Friedl [5].)
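To illustrate the connection to backreferences, the following sketch translates a pattern into a regular expression for Python's `re` module. The helper names `pattern_to_regex` and `in_pattern_language` as well as the sample pattern are assumptions made for this illustration.

```python
import re


def pattern_to_regex(pattern):
    """Translate a list of variable names into a regex with backreferences.

    The first occurrence of a variable becomes a capturing group `(.+)`,
    every later occurrence becomes a backreference to that group.
    """
    group_of = {}
    parts = []
    for var in pattern:
        if var in group_of:
            parts.append("\\" + str(group_of[var]))
        else:
            group_of[var] = len(group_of) + 1
            parts.append("(.+)")
    return "".join(parts)


def in_pattern_language(pattern, word):
    # DOTALL lets `.` match every symbol, including a newline.
    return re.fullmatch(pattern_to_regex(pattern), word, flags=re.DOTALL) is not None


print(pattern_to_regex(["x", "y", "x"]))
print(in_pattern_language(["x", "y", "x"], "abcab"))
```

Python's `re` syntax supports numbered backreferences only up to group 99, so this sketch is limited to patterns with at most 99 distinct variables.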
The main goal of this paper is to generalise the concept of patterns as language descriptors to the two-dimensional case, while preserving the desirable features of (one-dimensional) pattern languages, i. e., the simplicity and compactness of their descriptors. The work done so far on two-dimensional languages demonstrates that there are difficulties that seem to be symptomatic for the task of generalising a class of string languages to the two-dimensional case. Firstly, such a generalisation is usually accompanied by a substantial increase in complexity of the descriptors (e. g., when extending context-free or contextual grammars to the two-dimensional case (see Fernau et al. [2], Freund et al. [4])) and, secondly, there are often many competing and seemingly different ways to generalise a specific class of string languages, which all can be considered natural (e. g., it is still under debate what the appropriate two-dimensional counterpart of the class of regular languages might be (see Giammarresi et al. [8], Matz [10])). Our two-dimensional patterns, to be introduced in this work, are as simple and compact as their one-dimensional counterparts. Although there are several different possibilities of how these two-dimensional patterns can describe two-dimensional languages, one of these stands out as the intuitively most natural one. Hence, the model of Angluin’s pattern languages seems to be comparatively two-dimensional friendly.
Besides the conceptual contribution of this paper, we present a comparison between the expressive power of different classes of two-dimensional pattern languages and an investigation of their closure properties. We conclude the paper by outlining further research questions and possible extensions to the model of two-dimensional pattern languages.
2 Preliminaries
In this section, we briefly recall the standard definitions and notations regarding one- and two-dimensional words and languages.
Let and let . For a finite alphabet , a string or word (over ) is a finite sequence of symbols from , and stands for the empty string. The notation denotes the set of all nonempty strings over , and . For the concatenation of two strings we write or simply . We say that a string is a factor of a string if there are such that . If or is the empty string, then is a prefix (or a suffix, respectively) of . The notation stands for the length of a string .
A two-dimensional word (or array) over is a tuple
where and, for every , , and , , . We define the number of columns (or width) and number of rows (or height) of by and , respectively. The empty array is denoted by , i. e., . For the sake of convenience, we also denote by or by a matrix of one of the following forms:
If we want to refer to the symbol in row of the array , then we use . By , we denote the set of all nonempty arrays over , and . Every subset is an array language.
Let and be two non-empty arrays over . The column concatenation of and , denoted by , is undefined if and is the array
otherwise. The row concatenation of and , denoted by , is undefined if and is the array
otherwise. Intuitively speaking, the vertical line and the horizontal line in the symbols and , respectively, indicate the edge where the arrays are concatenated. In order to denote that, e. g., is undefined, we also write .
Furthermore, for every array , and . Algebraically speaking, if , then and both form monoids with as an absorbing element.
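The two partial concatenation operations can be sketched as follows. The representation is an assumption of this sketch: a non-empty array is a tuple of equally long strings (one string per row), and `None` stands for the undefined result, which is propagated so that it behaves as an absorbing element. The empty array is not modelled here.

```python
def col_concat(w, v):
    """Column concatenation: place v to the right of w.

    Undefined (None) if the numbers of rows differ or an operand is undefined.
    """
    if w is None or v is None or len(w) != len(v):
        return None
    return tuple(row_w + row_v for row_w, row_v in zip(w, v))


def row_concat(w, v):
    """Row concatenation: place v below w.

    Undefined (None) if the numbers of columns differ or an operand is undefined.
    """
    if w is None or v is None or len(w[0]) != len(v[0]):
        return None
    return w + v


w = ("ab", "cd")   # 2 rows, 2 columns
v = ("e", "f")     # 2 rows, 1 column
print(col_concat(w, v))            # heights agree, so this is defined
print(row_concat(w, v))            # widths differ, so this is undefined
print(row_concat(w, ("xy",)))
```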
Example 1.
Let
Then and , but
Example 2.
Let and be defined as in Example 1.
Next, we define some operations for array languages. The row and column concatenation for array languages and is defined by and , respectively. For an array language and , denotes the -fold row concatenation of , i. e., , , . The -fold column concatenation, denoted by , is defined analogously. The row and column concatenation closure of an array language is defined by and , respectively. Obviously, the row and column concatenation closure of an array language correspond to the Kleene closure of a string language.
Now, we turn our attention to some geometric operations for arrays. The transposition of an array , denoted by , is obtained by reflecting along the main diagonal. The -reflection and -reflection of , denoted by and , respectively, are obtained by reflecting along the horizontal and vertical axis, respectively. The right turn and left turn of , denoted by and , respectively, are obtained by turning through 90 degrees to the right and to the left, respectively. For example, if , then
We address left and right turn also as quarter-turns below. Moreover, the twofold right turn (displayed right-most in the example above) is also known as a half-turn.
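The geometric operations can be sketched on arrays represented as tuples of row strings (an assumption of this sketch). The function names are hypothetical; in particular, the two reflections are named after the axis along which the array is reflected, and the sample array is not the one of the example above.

```python
def transpose(w):
    """Reflect along the main diagonal: rows become columns."""
    return tuple("".join(col) for col in zip(*w))


def reflect_horizontal_axis(w):
    """Reflect along the horizontal axis: the top row becomes the bottom row."""
    return tuple(reversed(w))


def reflect_vertical_axis(w):
    """Reflect along the vertical axis: every row is reversed."""
    return tuple(row[::-1] for row in w)


def right_turn(w):
    """Quarter-turn to the right (clockwise)."""
    return transpose(reflect_horizontal_axis(w))


def left_turn(w):
    """Quarter-turn to the left (counter-clockwise)."""
    return reflect_horizontal_axis(transpose(w))


def half_turn(w):
    """Twofold right turn."""
    return right_turn(right_turn(w))


w = ("abc", "def")
print(transpose(w))
print(right_turn(w))
print(left_turn(w))
print(half_turn(w))
```

Expressing both quarter-turns through transposition and one reflection keeps the sketch short and makes it easy to check that a left turn undoes a right turn.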
A special operation considered in the context of arrays is the conjugation of an array with , denoted by , which means that the two symbols of are exchanged in , e. g., . This can also be viewed as a quite restricted form of a two-dimensional morphism defined in the next section.
All these operations for arrays are extended to array languages in the obvious way.
Next, we briefly summarise the concept of (one-dimensional) pattern languages as introduced in [1] by Angluin. Technically, the version of pattern languages used here is called nonerasing terminal-free pattern languages (for an overview of different versions of one-dimensional pattern languages, the reader is referred to [9] by Mateescu and Salomaa).
A (one-dimensional) pattern is a string over an alphabet of variables, e. g., . In Section 1, we have seen an intuitive definition of the language described by a pattern . This intuition can be formalised in an elegant way by using the concept of (word) morphisms, i. e., mappings , which satisfy , for all . In this regard, for some finite alphabet , the (one-dimensional) pattern language of (with respect to ) is the set . An alternative, yet equivalent, way to define pattern languages is by means of factorisations. To this end, let , , . Then is the set of all words that have a characteristic factorisation for , i. e., a factorisation , such that, for every , , implies . It can easily be seen that these two definitions are equivalent. However, for the two-dimensional case, we shall see that a generalisation of these two approaches will lead to different versions of two-dimensional pattern languages. The class of all one-dimensional pattern languages over the alphabet is denoted by . We recall the example pattern and the words and of Section 1. Since and , where and are the morphisms induced by , and , , we can conclude that , where .
3 Two-Dimensional Pattern Languages
As already mentioned, this work deals with the task of generalising pattern languages from the one-dimensional to the two-dimensional case. In order to motivate our approach to solve this task, we first spend some effort on illustrating the general difficulties and obstacles that arise.
Abstractly speaking, a pattern language for a given pattern is the collection of all elements that satisfy . Thus, a sound definition of how elements satisfy patterns directly entails a sound definition of a class of pattern languages. In the one-dimensional case, the situation that a word satisfies a pattern is intuitively clear and it can be defined in several equivalent ways, i. e., a word satisfies the pattern if and only if
• can be derived from by uniformly substituting the variables in ,
• is a morphic image of ,
• has a characteristic factorisation for .
We shall now demonstrate that for a two-dimensional pattern, i. e., a two-dimensional word over the set of variables , e. g., , these concepts do not work anymore or they describe fundamentally different situations. For instance, the basic operation of substituting a single symbol in a word by another word cannot be extended that easily to the two-dimensional case. For example, the replacements and may turn into one of the following objects,
which are not two-dimensional words, since they all contain holes or are not of rectangular shape and, most importantly, are not uniquely defined. On the other hand, it is straightforward to generalise the concept of a morphism to the two-dimensional case:
Definition 1.
A mapping is a two-dimensional morphism if it satisfies and for all .
Hence, we may say that a two-dimensional word satisfies a two-dimensional pattern if and only if there exists a two-dimensional morphism which maps to . Unfortunately, this definition seems to be too strong as demonstrated by the following example. From an intuitive point of view, the two-dimensional word should satisfy the two-dimensional pattern , but there is no two-dimensional morphism mapping to . This is due to the fact that, as pointed out by the following proposition (which has also been mentioned by Siromoney et al. in [14]), a two-dimensional morphism is a mapping with a surprisingly strong condition.
Proposition 1.
Let and be alphabets. If a mapping is a two-dimensional morphism, then
Proof.
We prove the statement of the proposition by contraposition. To this end, we assume that, for some , , , which implies . Hence, since , we can conclude that , which contradicts the morphism property. Similarly, if , then . ∎
Similarly as in the string case, homomorphisms are uniquely defined by giving the images . If in particular , we term the resulting morphism a letter-to-letter morphism, while in the even more restricted case when the restriction of to yields a surjective mapping , is referred to as a projection. We can conclude that the existence of a two-dimensional morphism seems to be a reasonable sufficient criterion for the situation that a two-dimensional word satisfies a two-dimensional pattern, but not a necessary one.
In fact, it turns out that characteristic factorisations provide the most promising approach to formalise how a two-dimensional word satisfies a two-dimensional pattern. Recall the example pattern from above. Since , a characteristic factorisation of a two-dimensional word for is a factorisation of the form . On the other hand, since , we could as well regard a factorisation as characteristic for . For the sake of convenience, we say that the former factorisation is of column-row type and the latter one is of row-column type. Obviously, the two-dimensional word from above has a characteristic factorisation of column-row type and a characteristic factorisation of row-column type (with respect to ):
As a matter of fact, for every two-dimensional word there exists a characteristic factorisation for of column-row type if and only if there exists a characteristic factorisation for of row-column type. However, this is a particularity of and, e. g., for and , there exists a characteristic factorisation of column-row type , but no characteristic factorisation of row-column type. Furthermore, the column-row factorisation of is somewhat at odds with our intuitive understanding of what it means that a two-dimensional word satisfies a two-dimensional pattern. This is due to the fact that factorising into means that we associate the two-dimensional factors , and with the variables , and , respectively, but in the pattern the vertical neighbourship relation between the occurrence of in the first row and the occurrence of in the second row is not preserved in with respect to the corresponding two-dimensional factors and . More precisely, while a column-row factorisation preserves the horizontal neighbourship relation of the variables, it may violate their vertical neighbourship relation, where for row-column factorisations it is the other way around. Consequently, if we want both the vertical as well as the horizontal neighbourship relation to be preserved, we should require that the two-dimensional word can be disassembled into two-dimensional factors that induce both a column-row as well as a row-column factorisation. More precisely, we say that satisfies if and only if there exist two-dimensional words and , such that , which we call a proper characteristic factorisation of .
We are now ready to formalise the ideas developed so far and we can finally give a sound definition of two-dimensional pattern languages. Although we consider the class of two-dimensional pattern languages that results from the proper characteristic factorisations as the natural two-dimensional counterpart of the class of one-dimensional pattern languages, we shall also define the other classes of two-dimensional pattern languages which were sketched above.
For the definition of two-dimensional patterns, we use the same set of variables that has already been used in the definition of one-dimensional pattern languages. An array pattern is a non-empty two-dimensional word over and a terminal array is a non-empty two-dimensional word over a terminal alphabet . If it is clear from the context that we are concerned with array patterns and terminal arrays, then we simply say pattern and array, respectively. Any mapping is called a substitution. For any substitution , by , we denote the mapping defined in the following way. For any , we define
Similarly, is defined by
Intuitively speaking, both mappings and , when applied to an array pattern , first substitute every variable occurrence of by a terminal array according to the substitution and then these individual terminal arrays are assembled to one terminal array by either first column-concatenating all the terminal arrays in every individual row and then row-concatenating the resulting terminal arrays, or by first row-concatenating all the terminal arrays in every individual column and then column-concatenating the resulting terminal arrays.
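The two ways of assembling the substituted arrays can be sketched as below. The names `image_cr` and `image_rc`, the encoding of a pattern as a tuple of rows of variable names, the encoding of arrays as tuples of row strings, and the use of `None` for an undefined result are assumptions of this sketch; the sample pattern and substitutions are hypothetical.

```python
from functools import reduce


def col_concat(w, v):
    """Place v to the right of w; None if undefined."""
    if w is None or v is None or len(w) != len(v):
        return None
    return tuple(a + b for a, b in zip(w, v))


def row_concat(w, v):
    """Place v below w; None if undefined."""
    if w is None or v is None or len(w[0]) != len(v[0]):
        return None
    return w + v


def image_cr(pattern, sub):
    """Column-concatenate within every pattern row, then row-concatenate the rows."""
    rows = [reduce(col_concat, (sub[x] for x in row)) for row in pattern]
    return reduce(row_concat, rows)


def image_rc(pattern, sub):
    """Row-concatenate within every pattern column, then column-concatenate the columns."""
    cols = [reduce(row_concat, (sub[x] for x in col)) for col in zip(*pattern)]
    return reduce(col_concat, cols)


pattern = (("x", "y"),
           ("y", "x"))

# Images of different widths: only the column-row assembly is defined.
sub = {"x": ("a",), "y": ("bb",)}
print(image_cr(pattern, sub))
print(image_rc(pattern, sub))

# Images of equal size: both assemblies are defined and agree.
sub2 = {"x": ("a",), "y": ("b",)}
print(image_cr(pattern, sub2), image_rc(pattern, sub2))
```

In the first substitution the images of x and y have different widths, so stacking them within a column is undefined, while concatenating them within a row works and both rows happen to have the same total width.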
Let , and let . The array is a (1) column-row image of (with respect to ), (2) a row-column image of (with respect to ) or (3) a proper image of (with respect to ) if and only if (1) , (2) or (3) , respectively. The mapping is called a column-row substitution for and , a row-column substitution for and or a proper substitution for and , respectively. We say that is a column-row, a row-column or a proper image of if there exists a column-row, a row-column or a proper substitution, respectively, for and .
A nice and intuitive way to interpret the different kinds of images of array patterns is to imagine a grid to be placed over the terminal array. The vertical lines of the grid represent a column concatenation and the horizontal lines of the grid represent a row concatenation of the corresponding factorisation. This means that every rectangular area of the grid corresponds to an occurrence of a variable in the array pattern or, more precisely, to the array substituted for . The fact that an array satisfies a pattern is then represented by the situation that each two rectangular areas of the grid that correspond to occurrences of the same variable must have identical content. In Figure 1, an example of each of a morphic image, a proper image, a column-row image and a row-column image of a pattern is represented in this illustrative way.
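For proper images, the grid interpretation suggests a brute-force membership test: try every grid whose lines run through the whole array and check that areas belonging to the same variable have identical content. The following is a sketch under assumptions made here: the names `grids` and `is_proper_image` are hypothetical, arrays are tuples of row strings, patterns are tuples of rows of variable names, and no claim about efficiency is intended.

```python
from itertools import combinations


def grids(total, parts):
    """Yield boundary positions (0, ..., total) cutting `total` into `parts` non-empty pieces."""
    for cuts in combinations(range(1, total), parts - 1):
        yield (0, *cuts, total)


def is_proper_image(pattern, array):
    """Test whether `array` is a proper image of `pattern` by trying all grids."""
    m, n = len(pattern), len(pattern[0])
    height, width = len(array), len(array[0])
    for rows in grids(height, m):
        for cols in grids(width, n):
            seen = {}
            consistent = True
            for i in range(m):
                for j in range(n):
                    # Content of the rectangular grid area for pattern position (i, j).
                    block = tuple(r[cols[j]:cols[j + 1]]
                                  for r in array[rows[i]:rows[i + 1]])
                    if seen.setdefault(pattern[i][j], block) != block:
                        consistent = False
                        break
                if not consistent:
                    break
            if consistent:
                return True
    return False


pattern = (("x", "y"),
           ("y", "x"))
print(is_proper_image(pattern, ("aabb", "bbaa")))
print(is_proper_image(pattern, ("abb", "bba")))
```

The second array in the usage example can be assembled row by row from images of different widths, but no single grid fits it, which is exactly the situation in which the vertical neighbourship relation is lost.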
Alternatively, we can interpret the property that a terminal array is a certain type of image of an array pattern as a tiling of . More precisely, satisfies a given array pattern with different variables if and only if tiles can be allocated to the variables of such that combining the tiles as indicated by the structure of yields . The grids depicted in Figure 1 then illustrate the structure of such a tiling. The definitions of the corresponding classes of pattern languages are now straightforward:
Definition 2.
Let be an array pattern. We define the following variants of two-dimensional pattern languages:
-
•
,
-
•
,
-
•
,
-
•
,
-
•
.
For a pattern , we also denote the above languages by pattern language of , where . For every , we define and .
Since, for a fixed array pattern , every morphic image is a proper image and every proper image is a row-column image as well as a column-row image, the following subset relations between the different types of pattern languages hold (in the following diagram, an arrow denotes a subset relation):
Remark 1.
As indicated in the introductory part of this section, we consider the class of pattern languages as the most natural class of two-dimensional pattern languages. Another observation that supports this claim is that the pattern languages are compatible, in a certain sense, with the one-dimensional pattern languages. More precisely, for a one-dimensional (i. e., ) array pattern the set coincides with the one-dimensional pattern language of . This does not hold for the pattern languages (since in the one-dimensional case the words that variables are mapped to can differ in length), but holds for the , and pattern languages. However, as pointed out above, the , and pattern language of a given pattern may contain arrays that, from an intuitive point of view, do not satisfy .
4 General Observations
In this section, we state some general lemmas about two-dimensional morphisms and array pattern languages, which shall be important for proving the further results presented in this paper. First, we refine Proposition 1, by giving a convenient characterisation for the morphism property for mappings on arrays. To this end, we define a substitution to be -uniform if, for every , and and a substitution is uniform if it is -uniform, for some .
Lemma 1.
A mapping is a two-dimensional morphism if and only if , where is a uniform substitution.
Proof.
We first observe that if is uniform, then obviously holds (so it is sufficient to prove the statement of the lemma only for one of these two mappings). Furthermore, if is uniform, then, for every , and , which proves the if direction. In order to prove the only if direction, we assume that is a two-dimensional morphism and we define a substitution by , . Furthermore, let . We now show that equals by induction. By definition, for every , . Now let with and . Then and, analogously, . By induction, it follows that . Consequently, we can conclude that is uniform. ∎
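The role of uniform substitutions in Lemma 1 can be made concrete with a short sketch. It assumes arrays as tuples of row strings and patterns as tuples of rows of variable names; the names `is_uniform` and `apply_uniform` and the sample data are hypothetical. Because all images have the same height and width, the image of a pattern can be built block by block without any undefined concatenation.

```python
def is_uniform(sub):
    """A substitution is uniform if all images share one height and one width."""
    shapes = {(len(a), len(a[0])) for a in sub.values()}
    return len(shapes) == 1


def apply_uniform(pattern, sub):
    """Replace every variable of `pattern` by its image under a uniform substitution."""
    if not is_uniform(sub):
        raise ValueError("substitution is not uniform")
    out = []
    for row in pattern:
        blocks = [sub[x] for x in row]
        # All blocks have the same height, so their rows can be joined line by line.
        for k in range(len(blocks[0])):
            out.append("".join(block[k] for block in blocks))
    return tuple(out)


pattern = (("x", "y"),
           ("y", "x"))
sub = {"x": ("ab", "cd"), "y": ("ee", "ff")}   # every image is a 2 x 2 array
print(apply_uniform(pattern, sub))
```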
The next lemma states that the composition of two two-dimensional morphisms is again a two-dimensional morphism.
Lemma 2.
Let and be two-dimensional morphisms. Then, the composition is also a two-dimensional morphism.
Proof.
We first observe the following. If and are some uniform substitutions, then is a uniform substitution as well. Furthermore, it can be easily verified that . With Lemma 1, this directly implies the statement of the lemma. ∎
It is intuitively clear that the structure of a pattern fully determines the corresponding pattern language and the actual names of the variables are irrelevant, e. g., the patterns and should be considered identical. In the following we formalise this intuition. Two array patterns and are equivalent up to a renaming, denoted by , if and only if , and, for every , , , if and only if .
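Equivalence up to a renaming can be tested by renaming the variables of both patterns canonically, in order of first occurrence. The following sketch assumes patterns encoded as tuples of rows of variable names; the function names and sample patterns are hypothetical.

```python
def canonical(pattern):
    """Rename variables to 0, 1, 2, ... in row-major order of first occurrence."""
    names = {}
    return tuple(
        tuple(names.setdefault(x, len(names)) for x in row)
        for row in pattern
    )


def equivalent_up_to_renaming(alpha, beta):
    """Same shape, and two positions agree in alpha iff they agree in beta."""
    return canonical(alpha) == canonical(beta)


alpha = (("x", "y"), ("y", "x"))
beta = (("u", "v"), ("v", "u"))
gamma = (("x", "y"), ("x", "y"))
print(equivalent_up_to_renaming(alpha, beta))
print(equivalent_up_to_renaming(alpha, gamma))
```

Comparing canonical forms also covers the condition on the dimensions, since two tuples of different shape are never equal.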
Lemma 3.
Let , let be an alphabet with and let . If , then .
Proof.
We assume that and note that this implies that and . This is due to the fact that if or , then the array obtained from by replacing every variable by a single symbol is in , but not in . We further assume that , which implies that there are with and , such that and (or and , for which an analogous argument applies). We now define a substitution in the following way. For every , if , then and if , then . We observe that, since is a morphic image, a proper image, a row-column image and a column-row image of , . Furthermore, . On the other hand, for every , with and , is satisfied. Thus, , which implies that . ∎
For every , , does not necessarily imply , as pointed out by, e. g., or . On the other hand, since, for every , obviously implies , two pattern languages are equivalent if and only if they are described by two patterns that are equivalent up to a renaming.
In the remainder of this work, we do not distinguish anymore between patterns that are equivalent up to a renaming, i. e., from now on we say that and are equivalent, denoted by for simplicity, if they are actually the same arrays or if they are equivalent up to a renaming.
5 Comparison of Array Pattern Language Classes
In this section, we provide a pairwise comparison of our different classes of array pattern languages and, furthermore, we compare them with the class of recognisable array languages, denoted by REC, which is one of the most prominent classes of array languages. For a detailed description of REC, the reader is referred to the survey [7] by Giammarresi and Restivo. Next, we show that, for every alphabet with , the language classes REC, , , , and are pairwise incomparable. More precisely, for every with , we show that , and . The non-emptiness of the pairwise intersections of these language classes can be easily seen:
Proposition 2.
For every , and .
It remains to find, for every , a separating language and a separating language . We first present all these separating languages in a table and then we formally prove their separating property. In rows to of the following table, if a pattern is the entry that corresponds to the row labeled by class and the column labeled by class , where , , then this means that . Row , on the other hand, contains recognisable array languages that are not array pattern languages.
REC | ||||||
---|---|---|---|---|---|---|
REC | – | |||||
– | ||||||
– | ||||||
– | ||||||
– | ||||||
– |
Lemma 4.
.
Proof.
In this proof, we use the characterisation of REC by local array languages and projections (see Giammarresi and Restivo [7] and also the next section). Let and . Suppose . Then there is a local array language over an alphabet so that is a projection of . For the sake of convenience, we define and . For every , let
Obviously, . Now let be the set of pictures in whose projections are in . In the arrays of there are at most possibilities of how the and row can look like. For sufficiently large Thus, there exist two arrays and in such that the corresponding arrays and in have the same row and the same row. Hence, since is a local array language, and therefore , which is a contradiction. ∎
It can be easily verified that, for every , , where . Hence, for every , , which implies the first column of the table. Furthermore, for every , , but , which implies the first row of the table.
We point out that, by Lemma 3, for every , , if there exists a pattern with , then and . Consequently, in order to prove the remaining entries of the table, it is sufficient to identify, for every , , a pattern with , which is done by the following two lemmas.
Lemma 5.
For every , .
Proof.
Let and let . For every , , since , where is defined by
By Lemma 1, it is obvious that there does not exist any morphism with . Thus, and, for every , . ∎
Lemma 6.
For every , , .
Proof.
Let and let . We observe that , where are defined by
Thus, , and . On the other hand, , since every proper image of must have an even number of columns and an even number of rows. Consequently, for every , . Similarly, and , since every column-row image of must have an even number of rows and every row-column image of must have an even number of columns. This implies that, for every , , , which concludes the proof. ∎
6 Closure Properties of Array Pattern Languages
The study of closure properties of classes of formal languages is a classical topic in this area. However, the number of natural operations is larger in the case of arrays than in the more conventional string case. Thus, in this section, we classify the operations that shall be investigated in this regard according to whether or not they correspond to string language operations.
First, in Section 6.1, we investigate operations that correspond to string language operations. These are the Boolean operations of union, intersection and complementation, and also two special cases of (inverse) morphisms: letter-to-letter morphisms and, more specifically, surjective letter-to-letter morphisms, known as projections in the terminology of array languages, and, more generally, the two-dimensional morphisms as defined in Section 3.
Next, in Section 6.2, we take a closer look at operations similar to string language operations. More precisely, we investigate closure under concatenation and concatenation closure (or Kleene star), which constitute classical operations for string languages, but, with respect to the array case, we encounter an important difference, namely, there are two different types of concatenations: row and column concatenation. In particular, the concatenation of two arrays could be undefined (just because the dimensions do not match), but the concatenation of the two corresponding languages need not be empty.
Finally, in Section 6.3, we investigate operations special to arrays, that are usually not considered or even defined for string languages. These are mainly geometric operations like quarter turn, half turn, reflection or transposition of an array.
6.1 String Language Operations
We first point out that, due to Lemma 7 below, whenever a non-closure result is known for terminal-free non-erasing string pattern languages, this straightforwardly transfers to the array case. We will therefore focus on finding proofs for the string case for non-closure properties, and conversely, we will try to give proofs for the array case for closure properties. Interestingly enough, (non-)closure properties have not been studied for the (classical) terminal-free non-erasing string pattern languages; all published proofs that we are aware of for this topic use terminals or erasing. So, our study also contributes to the theory of string pattern languages. Conversely, if we do not manage to find proofs or examples as required for the mentioned approach, this implicitly always raises an open classical string language question. For any mode and any pattern , let denote those arrays from that have just one row, i. e., . Clearly, such arrays can be interpreted as strings and vice versa. In this sense, and the string language generated by the pattern coincide, as long as . For , we encounter the special case that all inserted words have to be of the same length. Let us formulate this more formally:
Lemma 7.
Let be an array pattern of height one. Then, is, at the same time, a string pattern. Moreover, for any , , while .
We shall now prove non-closure properties for , which directly carry over to the classes , (for some operations, however, the class constitutes a special case, which is treated separately). To this end, we will mostly focus on two patterns: and . The next lemma states an immediate observation for these patterns.
Lemma 8.
Over the terminal alphabet , let () denote the shortest words that can be described by and , respectively, disallowing erasing. Then,
Proposition 3.
For any non-unary alphabet , is not closed under union.
Proof.
Without loss of generality, . In the following argument, we actually focus on , but this can be easily extended to the more general case. Assume that there was a pattern with . Then,
This means that contains exactly three variable occurrences (with more, these words cannot be generated; with fewer, shorter words could be generated), i.e., , , . As , some of these variables must coincide, which leads to a contradiction. More precisely, if or , then cannot be generated and if , then cannot be generated. Hence, with cannot exist. ∎
Now if there was a pattern such that , , then, by Lemma 7, this would imply , contradicting Proposition 3. We point out that in the proof of Proposition 3, we do not use any replacements by words of different lengths to obtain our contradiction. Hence, this argument is also valid in the case when .
Corollary 1.
None of the array pattern language classes under consideration (over some non-unary alphabet) is closed under union.
We proceed with the intersection operation.
Proposition 4.
For any non-unary alphabet , is not closed under intersection.
Proof.
The argument resembles the previous proof. Assume that describes . Notice that , which clearly implies that . However, . ∎
Notice that the replacement words we used for deriving a contradiction are of different lengths, meaning that because of the replacement and , but because of and . Hence, we cannot conclude non-closure for the -mode in the following corollary:
Corollary 2.
None of the array pattern language classes under consideration (over some non-unary alphabet and apart from the -case) is closed under intersection.
Indeed, the -mode plays a special rôle, as can be seen by the following result.
Proposition 5.
Let be some alphabet. Then, is closed under intersection.
Proof.
Assume that . Let be two array patterns. Let be the height (number of rows) of and be the width (number of columns) of . Likewise, and are understood. Then, the width of the smallest arrays in equals , and their height equals . This can be easily seen by substituting each variable in by the unique array of width and height over the alphabet into the pattern , as , , and a similar substitution within .
For each variable that occurs in , take new variables with and . Define a morphism by replacing the variable by the following array of height and width , consisting of the previously introduced variables:
Hence, is some array of height and of width . Accordingly, one can define a morphism such that is again some array of height and of width . Due to Lemma 2, and (*). Namely, if , then there exists some two-dimensional morphism such that , i.e., there also exists some two-dimensional morphism with , so that . Now, define an array pattern of height and of width , consisting exclusively of variable entries, as follows: The variables occurring at positions and in (where and ) are identical if and only if the corresponding variables in at least one of the arrays or are identical.
We claim that .
As is obtained from by identifying certain variables, due to (*) we find that:
and likewise for , so that .
Conversely, we have already argued that the smallest arrays in have height and width . More generally, any array has height and width . As , there is some two-dimensional morphism such that . Moreover, for each variable in , and . We can make an analogous reasoning with , introducing the constants and for the morphism . As , entries in must coincide both according to and to . This is exactly reflected in the construction of provided above, so that and for some two-dimensional morphism with . Hence, . ∎
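The construction used in this proof can be sketched in Python. The sketch rests on an assumption that the extracted text does not spell out: substitutions in this mode are uniform, i.e., every variable is replaced by an array of one common height and width, so that the common pattern dimensions are the least common multiples of the dimensions of the two patterns. The function names and the encoding of variables as tuples are illustrative choices, not notation from the paper. Positions are identified by taking the equivalence closure (via union-find) of "identical in at least one blown-up pattern".

```python
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def blow_up(pattern, fh, fw, tag):
    """Replace each variable x by an fh-by-fw block of fresh variables (tag, x, i, j)."""
    return [[(tag, x, i, j) for x in row for j in range(fw)]
            for row in pattern for i in range(fh)]


def intersection_pattern(p1, p2):
    """Build a pattern whose positions are identified as dictated by p1 or by p2."""
    m = lcm(len(p1), len(p2))
    n = lcm(len(p1[0]), len(p2[0]))
    b1 = blow_up(p1, m // len(p1), n // len(p1[0]), 1)
    b2 = blow_up(p2, m // len(p2), n // len(p2[0]), 2)

    # Union-find over positions; each position starts as its own class.
    parent = {(i, j): (i, j) for i in range(m) for j in range(n)}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for blown in (b1, b2):
        first = {}  # first position at which each variable was seen
        for i in range(m):
            for j in range(n):
                rep = first.setdefault(blown[i][j], (i, j))
                parent[find((i, j))] = find(rep)

    # The class representative of a position serves as its variable name.
    return [[find((i, j)) for j in range(n)] for i in range(m)]
```

For instance, for the hypothetical one-row patterns `[["x", "y"]]` and `[["u", "u", "u"]]`, the sketch yields a pattern of width six in which positions 0, 2, 4 share one variable and positions 1, 3, 5 share another.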
Arguments as in Propositions 3 and 4 can be given for any non-trivial binary set operation, for instance, symmetric difference or set difference. This also gives the according result for complementation, but there is also an easier argument in that case. Notice that, as non-erasing pattern languages or array patterns cannot reasonably cope with the empty word or the empty array, we disregard this in the complement operation.
Proposition 6.
For any alphabet , is not closed under complementation.
Proof.
Let be some alphabet with . Consider the pattern . The complement (disregarding the empty word) contains as shortest words a word of the form . This implies that a pattern describing the complement must have length one, i.e., . Hence, , which is not the complement of . ∎
Corollary 3.
None of the array pattern language classes under consideration (over any alphabet) is closed under complementation.
Notice that for the operations other than complementation, our arguments do not cover unary alphabets; these cases might require different arguments.
We shall now turn to operations that are described by different kinds of morphisms. For the array case, codings (or projections) are a common operation of this kind.
Theorem 1.
Any of our array pattern language classes (over arbitrary alphabets) is closed under projections.
Proof.
The following proof works for every . Let be some array pattern and let be a surjective mapping that describes some projection. We shall now show that .
Namely, consider some array . is obtained from by replacing any variable occurring in by some array . If we replace by instead (which has the same dimensions as ), we can see that .
Conversely, fix some letter from for each for now and denote it by . This is possible, as is a surjective mapping. As for each by construction, for any we can describe some such that , just by taking . ∎
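The first direction of this proof, namely that projecting a derived array gives the same result as substituting the projected images, can be checked on small examples. The following Python sketch is illustrative only: it assumes a simple gluing of blocks in which the images within one pattern row share a height, and it only checks that the glued result is rectangular. The names `substitute` and `project` are hypothetical helpers, not notation from the paper.

```python
def substitute(pattern, sigma):
    """Replace each variable by its image and glue the blocks into one array.

    `pattern` is a list of rows of variable names; `sigma` maps each
    variable to an array given as a list of equally long strings.
    """
    rows = []
    for pattern_row in pattern:
        blocks = [sigma[x] for x in pattern_row]
        height = len(blocks[0])
        if any(len(b) != height for b in blocks):
            raise ValueError("images in one pattern row must share a height")
        for i in range(height):
            rows.append("".join(b[i] for b in blocks))
    if any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("result is not rectangular")
    return rows


def project(array, p):
    """Apply a letter-to-letter map p (a dict) to every entry of an array."""
    return ["".join(p[ch] for ch in row) for row in array]


# Projection commutes with substitution on this example.
pattern = [["x", "y"], ["y", "x"]]
sigma = {"x": ["ab", "ca"], "y": ["c", "b"]}
p = {"a": "0", "b": "1", "c": "1"}
lhs = project(substitute(pattern, sigma), p)
rhs = substitute(pattern, {x: project(img, p) for x, img in sigma.items()})
```

Since a projection never changes the dimensions of an image, the projected substitution is admissible whenever the original one is.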
The result does not generalize to (string) morphisms where each image is of the same length.
Proposition 7.
For any non-unary alphabet , is not closed under morphisms that map every letter to a word of length two.
Proof.
Consider the pattern on the alphabet , describing the universal language , and the morphism with and . Then, , which is not a pattern language, as an easy analysis shows. ∎
Corollary 4.
None of the array pattern language classes under consideration (over some non-unary alphabet) is closed under two-dimensional morphisms.
This is also true for the more general operation of substitution, with the same examples.
Let us also remark that Proposition 7 did not rely on the fact that we restricted our attention to one specific non-unary alphabet . However, if we have a specific alphabet, then we can even state:
Proposition 8.
Any of our array pattern language classes (over some fixed alphabet ) is closed under some projection if and only if is a bijection.
Proof.
If is some bijection, then the argument given in the proof of Theorem 1 applies. Recall now that for finite sets such as , is a bijection if and only if is a surjection. If is not a bijection, then is not a surjection; hence, there is some . Consider the pattern . Clearly, , , but any array pattern language over will also contain arrays with the letter . ∎
This immediately implies the following two results:
Corollary 5.
Any of our array pattern language classes (over binary alphabets) is closed under conjugation.
Corollary 6.
None of our array pattern language classes (over some fixed alphabet ) is closed under all letter-to-letter morphisms.
Alternatively, we could also look at inverse morphisms. Here, we already get negative results for inverse codings.
Proposition 9.
For any alphabet with at least four letters, is not closed under inverse letter-to-letter morphisms.
Proof.
Consider the coding . The pattern generates (over the alphabet ) the language . However, is not a pattern language. The shortest words in are of length two, so that a hypothetical pattern for must be of the form or . In the first case, is produced, while in the second case, cannot be produced. ∎
Corollary 7.
None of the array pattern language classes under consideration (over some sufficiently large alphabet) is closed under inverse (two-dimensional) morphisms.
6.2 Operations Similar to String Language Operations
Let us first turn to the concatenation operation. As a warm-up, we first consider the string case.
Lemma 9.
For any alphabet , is closed under concatenation.
Proof.
Consider two patterns and . After renaming, we can assume that and do not contain any identical variables. Then, . ∎
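The role of the renaming step can be illustrated with a small Python sketch. Patterns are modelled as lists of variable names, and `matches` is a plain backtracking membership test for nonerasing substitutions (exponential in the worst case). Both function names and the tagging scheme are assumptions made for this illustration.

```python
def concat_patterns(alpha, beta):
    """Concatenate two string patterns after renaming their variables apart.

    Tagging each variable with its origin guarantees that the two halves
    share no variable.
    """
    left = [("a", x) for x in alpha]
    right = [("b", x) for x in beta]
    return left + right


def matches(pattern, word, binding=None):
    """Nonerasing membership test by backtracking."""
    binding = {} if binding is None else binding
    if not pattern:
        return word == ""
    var, rest = pattern[0], pattern[1:]
    if var in binding:
        image = binding[var]
        return word.startswith(image) and matches(rest, word[len(image):], binding)
    # Try every non-empty prefix that leaves at least one letter
    # for each remaining variable occurrence.
    for cut in range(1, len(word) - len(rest) + 1):
        if matches(rest, word[cut:], {**binding, var: word[:cut]}):
            return True
    return False


alpha = ["x", "x"]
beta = ["x", "y", "x"]
gamma = concat_patterns(alpha, beta)
```

Here `abab` matches `alpha` and `bab` matches `beta`, and their concatenation `ababbab` matches `gamma`. Without the renaming, the naive pattern `alpha + beta` would force both halves to use the same image for `x` and would reject this word.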
The first thing one should note is that the concatenation of two arrays could be undefined (i. e., if their dimensions do not match), even though the concatenation of the two according languages need not be empty. However, we can prove:
Theorem 2.
Fix some alphabet .
- is closed under row concatenation ;
- is closed under column concatenation ;
- and are closed both under row and under column concatenation.
Proof.
We only prove the first item. The others follow similarly. Observe that in the case of , we need Lemmas 1 and 2 to finish the argument. Let and be two array patterns. We can assume that the variable alphabets of and are disjoint. We want to construct an array pattern such that
Let be the number of rows of and be the number of columns of . Accordingly, is the number of rows of and is the number of columns of . If , then we can set to satisfy as in the string case of Lemma 9. More generally, set . is the width of the smallest arrays in . More generally speaking, any array in has some width that is a multiple of . We are going to exploit this property by constructing two array patterns and of width such that
so that we can apply our previous reasoning, defining now . Consider the two-dimensional morphism that maps every variable of on an array of height one and width , more precisely, onto . Here, are “new variables.” Then, is an array pattern of width . Similarly, we define a morphism , yielding an array pattern of width . Now, any array in has a width that is a multiple of and also belongs to , and conversely any that has a width that is a multiple of belongs to . Together with similar statements for the pattern , the claim follows. ∎
It is not a coincidence that for and , we had to focus on the “correct” concatenation operation in the preceding theorem. More precisely, we can show:
Theorem 3.
Fix some non-unary alphabet .
- is not closed under column concatenation ;
- is not closed under row concatenation ;
- is neither closed under row nor under column concatenation.
Proof.
Again, we only prove the first item; the others can be seen in a similar fashion.
Consider the array patterns and . Notice that . There are arrays of width four and height two in , and these are the smallest arrays in . Hence, any array pattern with has width four and height two. Let , with possibly some of the variables being the same. As , but and also , and . More generally, it can be verified that the array pattern describes those and only those arrays of width four and height two that belong to . However, is an array from (namely, consider , , and ) that does not belong to , as this would mean that for some array or of width two and height two from . However, neither nor belongs to . ∎
Notice that the proofs of negative closure properties necessitate a non-unary alphabet to work.
We now turn to the Kleene closure. Here, we can again first show a non-closure result for the string case that then readily transfers to the array cases.
Lemma 10.
Let be a non-unary alphabet. Consider . Then, .
Proof.
The shortest words in are of length two. Hence, there are only two different possibilities for any pattern with : If , then , while if , then . ∎
Proposition 10.
Let be a non-unary alphabet. Then, none of the array language families with is closed under column concatenation closure or under row concatenation closure.
Proof.
Consider . Due to Lemmas 7 and 10, this class is not closed under row concatenation closure. For the case of column concatenation closure, reconsider the proof of Theorem 3. There, we have presented a language such that . But that argument also shows that the column concatenation closure of does not belong to .
The other cases are similarly seen. For the case of morphisms, observe that the contradiction in Lemma 10 was derived by substituting the variables by words of the same length. ∎
6.3 Operations Special to Arrays
Recall that the transposition operation is first defined for arrays (or patterns) and can then be lifted to languages and even to language classes. Nearly by definition, we find:
Lemma 11.
Let be some alphabet. Let be a pattern. Then, and .
Corollary 8.
Let be some alphabet. Then, and .
Since is identical to its transposition and, as shown in the proof of Lemma 6, describes an pattern language (a pattern language), which is not a pattern language (not an pattern language, respectively), we can conclude the following:
Proposition 11.
Let be an alphabet. Neither nor is closed under transposition.
Proof.
Proposition 12.
For any alphabet and , is closed under transposition.
Proof.
For and , this claim is immediate from the fact that we have proper factorizations. For the case , we use Lemma 11. Let be some pattern. Then,
This immediately implies the claim. ∎
With respect to purely geometric operations as turns and reflections, we find the following:
Proposition 13.
Let be some alphabet.
- , and are closed under quarter-turn.
- For every , is closed under half-turn and reflections.
- and are closed neither under left nor under right turn.
Proof.
For the positive closure results, simply observe that the language described by the quarter-turn, by the half-turn or by a reflection of the array pattern is just the quarter-turn, the half-turn or the reflection of the language described by .
For the non-closure properties, by symmetry it suffices to show that there is a language in whose quarter-turn is not in . To this end, consider . Observe that the quarter-turn of is the same as , which was proven not to be in in Lemma 6. ∎
The positive closure properties can be easily observed by applying the geometric operation directly on the array pattern. In order to show non-closure of and with respect to left and right turn, it is again sufficient to observe that the pattern from above is identical to its left or right turn and then apply a similar argument as in the proof of Lemma 6.
Due to symmetry, it does not matter if we consider horizontal or vertical reflections. Notice that both half-turns and reflections coincide in the string case in any meaningful, non-trivial interpretation; in that case, the operation is also known as mirror image.
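The geometric operations discussed here act on arrays and on array patterns alike, which is what the positive closure arguments exploit. A minimal Python sketch follows; the orientation conventions (a clockwise quarter-turn, a left-right reflection) and the function names are assumptions made for illustration, since the paper's formal definitions are not reproduced in this section.

```python
def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


def reflect(a):
    """Mirror the array left to right."""
    return [list(reversed(row)) for row in a]


def quarter_turn(a):
    """Rotate the array by 90 degrees clockwise."""
    return [list(reversed(col)) for col in zip(*a)]


def half_turn(a):
    """Rotate the array by 180 degrees."""
    return [list(reversed(row)) for row in reversed(a)]


pattern = [["x", "y", "z"],
           ["u", "v", "w"]]
```

Applied to the hypothetical pattern above, two quarter-turns give the half-turn, and the quarter-turn can be expressed as a transposition followed by a reflection. In the one-row (string) case, the half-turn and the reflection coincide, which matches the remark on the mirror image.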
7 Future Research Directions
A thorough investigation of the typical decision problems for two-dimensional pattern languages like the membership, inclusion and equivalence problem is left for future research. It can be easily seen that the NP-completeness of the membership problem for string pattern languages carries over to , . On the other hand, for a given array pattern and a terminal array , the question whether or not can be decided in polynomial time by checking whether is a morphic image of with respect to a -uniform substitution. As shown by Lemma 3, the equivalence problem for all the classes with and can be easily solved by simply comparing the patterns. However, for every , , the problem to decide for given patterns and whether or not might be worth investigating. The inclusion problem for terminal-free nonerasing string pattern languages is still open. Hence, with respect to the inclusion problem, a positive decidability result for two-dimensional pattern languages implies a positive decidability result for terminal-free nonerasing string pattern languages.
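The polynomial-time membership test mentioned above can be sketched as follows. The sketch assumes that a uniform substitution replaces every variable by an array of one common height and width, so the block size is determined by the dimensions of the pattern and of the terminal array. The function name `uniform_member` and the data representation are hypothetical.

```python
def uniform_member(pattern, array):
    """Decide whether `array` is a uniform morphic image of `pattern`.

    `pattern` is a list of rows of variable names; `array` is a list of
    equally long strings. Runs in time linear in the size of `array`
    (up to dictionary operations).
    """
    m, n = len(pattern), len(pattern[0])
    M, N = len(array), len(array[0])
    if M % m or N % n:
        return False
    h, w = M // m, N // n  # common block height and width
    seen = {}  # image assigned to each variable so far
    for i in range(m):
        for j in range(n):
            block = tuple(row[j * w:(j + 1) * w]
                          for row in array[i * h:(i + 1) * h])
            # All occurrences of one variable must carry the same block.
            if seen.setdefault(pattern[i][j], block) != block:
                return False
    return True
```

For the hypothetical pattern `[["x", "y"], ["y", "x"]]`, the array with rows `aabb` and `bbaa` is accepted (blocks of height one and width two), while the array with rows `aabb` and `bbab` is rejected.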
For string pattern languages it is common to use terminal symbols in the patterns as well as to consider the erasing case, i. e., variables can be replaced by the empty word. The pattern languages can be adapted to the erasing case by allowing variables to be substituted by the empty array. Furthermore, the situation of having a terminal symbol at position of an array pattern simply forces all the variables in the row to be substituted by arrays of height and all the variables in the column to be substituted by arrays of width . As in the string case, it is likely that in the two-dimensional case the differences between erasing and nonerasing substitutions, and between patterns with and without terminal symbols, also lead to different language classes with different decidability properties.
Finally, we wish to point out that it is straightforward to generalise our different classes of two-dimensional pattern languages to the three-dimensional or even -dimensional case.
References
- [1] D. Angluin. Finding patterns common to a set of strings. Journal of Computer and System Sciences, 21:46–62, 1980.
- [2] H. Fernau, R. Freund, and M. Holzer. The generative power of -dimensional #-context-free array grammars. In M. Margenstern, editor, Proceedings of MCU’98, Volume 2, pages 43–56. University of Metz, 1998.
- [3] H. Fernau, M. L. Schmid, and K. G. Subramanian. Two-dimensional pattern languages. In S. Bensch, F. Drewes, R. Freund, and F. Otto, editors, Fifth Workshop on Non-Classical Models for Automata and Applications, NCMA, volume 294 of books@ocg.at, pages 117–132. Österreichische Computer Gesellschaft, 2013.
- [4] R. Freund, G. Păun, and G. Rozenberg. Chapter 8: Contextual array grammars. In C. Martín-Vide, V. Mitrana, and G. Păun, editors, Series in Machine Perception and Artificial Intelligence: Volume 66 - Formal Models, Languages and Applications, pages 112–136. World Scientific, 2007.
- [5] J. E. F. Friedl. Mastering Regular Expressions. O’Reilly, Sebastopol, CA, third edition, 2006.
- [6] D. Giammarresi and A. Restivo. Recognizable picture languages. International Journal of Pattern Recognition and Artificial Intelligence, 6:31–46, 1992.
- [7] D. Giammarresi and A. Restivo. Two-dimensional languages. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 3, chapter 4, pages 215–267. Springer, 1997.
- [8] D. Giammarresi, A. Restivo, S. Seibert, and W. Thomas. Monadic second-order logic over rectangular pictures and recognizability by tiling systems. Information and Computation (formerly Information and Control), 125:32–45, 1996.
- [9] A. Mateescu and A. Salomaa. Patterns. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 1, pages 230–242. Springer, 1997.
- [10] O. Matz. Recognizable vs. regular picture languages. In Proc. 2nd International Conference on Algebraic Informatics, CAI 2007, volume 4728 of Lecture Notes in Computer Science, pages 75–86, 2007.
- [11] K. Morita. Two-dimensional languages. In C. Martín-Vide, V. Mitrana, and G. Păun, editors, Studies in Fuzziness and Soft Computing - Formal Languages and Applications, pages 427–437. Springer, 2004.
- [12] A. Rosenfeld. Picture Languages: Formal Models for Picture Recognition. Academic Press, Inc., Orlando, 1979.
- [13] A. Rosenfeld and R. Siromoney. Picture languages – a survey. Languages of Design, 1:229–245, 1993.
- [14] G. Siromoney, R. Siromoney, and K. Krithivasan. Picture languages with array rewriting rules. Information and Control, 22:447–470, 1973.
- [15] P. S. P. Wang. Array Grammars, Patterns and Recognizers. World Scientific Publishing Co., Inc., NJ, USA, 1989.