
A Path Algebra for Multi-Relational Graphs

Marko A. Rodriguez$^{1}$, Peter Neubauer$^{2}$
$^{1}$Graph System Architect, AT&T Interactive
Santa Fe, NM 87506 USA
marko@markorodriguez.com
$^{2}$VP Product Development, Neo Technology
21119 Malmö, Sweden
peter.neubauer@neotechnology.com
Abstract

A multi-relational graph maintains two or more relations over a vertex set. This article defines an algebra for traversing such graphs that is based on an $n$-ary relational algebra, a concatenative single-relational path algebra, and a tensor-based multi-relational algebra. The presented algebra provides a monoid, automata, and formal language theoretic foundation for the construction of a multi-relational graph traversal engine.

I Introduction

The adjacency of vertex $i$ and vertex $j$ is defined by the edge $(i,j)$. A structure of this form is called a graph and is usually defined as $\ddot{G}=(\ddot{V},\ddot{E})$, where $i,j\in\ddot{V}$ are vertices and $(i,j)\in\ddot{E}$ is the edge adjoining those vertices. (Footnote 1: The “high dot” notation denotes that $\ddot{G}\neq\dot{G}\neq G$, where $G$ is the main definition used throughout the article.) When the only distinguishing characteristic between two edges is the vertices they join, the graph is called single-relational, because there is only a single type of relation in the graph: the binary relation $\ddot{E}\subseteq(\ddot{V}\times\ddot{V})$. Single-relational graphs have been used widely to model systems of homogeneous elements related by a single type of relation and, as such, have numerous algorithms associated with their analysis [1].

When the domain of discourse is variegated by a heterogeneous set of relations, the multi-relational graph becomes the more applicable construct. A multi-relational graph can be defined as $\dot{G}=(\dot{V},\dot{\mathbb{E}})$, where $\dot{\mathbb{E}}$ is a family of edge sets and $\dot{\mathbb{E}}=\{\dot{E_{1}},\dot{E_{2}},\ldots,\dot{E_{m}}\subseteq(\dot{V}\times\dot{V})\}$. When $m>1$, there are multiple relations between the vertices of $\dot{V}$. Multi-relational graphs not only specify which vertices are adjacent to one another, they also specify the way in which they are adjacent. With respect to the formalisms of this article and without loss of generality, a multi-relational graph can also be represented as $G=(V,E)$, where $E$ is a ternary relation, $E\subseteq(V\times\Omega\times V)$, and $\Omega$ is a set of edge labels (i.e. relation types). Thus, in reference to the structure $\dot{G}=(\dot{V},\dot{\mathbb{E}})$, $|\dot{\mathbb{E}}|=|\Omega|$ and $\sum_{n=1}^{n\leq|\dot{\mathbb{E}}|}|\dot{E}_{n}|=|E|$, where $\dot{E}_{n}\in\dot{\mathbb{E}}$. The ternary relation model is the multi-relational graph structure used throughout this article. The reason for using this particular definition of $G$ is explained in §II.

Given the growing use of multi-relational graphs in computing [2] and the lack of graph techniques for such structures (relative to single-relational graphs), an algebraic model for traversing multi-relational graphs is presented. This article can be interpreted as a convergence of the $n$-ary relational algebra of [3], the concatenative single-relational path algebra in [4], and the multi-relational tensor algebra presented in [5]. However, unlike [3], the presented algebra is tied specifically to path construction by means of graph traversals as in [5] and [4]. Next, unlike the algebra in [4], which is oriented primarily towards single-relational graphs, the presented algebra conveniently supports multiple relations as in [3] and [5]. Finally, unlike [5], the presented algebra is a concatenative, order-preserving variation of the relational algebra in [3] and, as such, more aligned with [4].

The operations presented in this article are summarized below for ease of reference.

  • $\|a\|$: the path length of path $a$.

  • $\circ:E^{*}\times E^{*}\rightarrow E^{*}$: the concatenation of two paths. (Footnote 2: The unary Kleene star operation forms the free monoid $E^{*}=\bigcup_{n=0}^{\infty}E^{n}$, where $E^{0}=\{\epsilon\}$ and $\epsilon$ is the empty/identity element.)

  • $\sigma:E^{*}\times\mathbb{N}^{+}\rightarrow E$: the projection of the $n^{\text{th}}$ edge of a path.

  • $\gamma^{-}:E^{*}\rightarrow V$: the projection of the tail (first element) of a path.

  • $\gamma^{+}:E^{*}\rightarrow V$: the projection of the head (last element) of a path.

  • $\omega:E\rightarrow\Omega$: the projection of the label of an edge.

  • $\cup:\mathcal{P}(E^{*})\times\mathcal{P}(E^{*})\rightarrow\mathcal{P}(E^{*})$: the union of two path sets.

  • $\bowtie_{\circ}:\mathcal{P}(E^{*})\times\mathcal{P}(E^{*})\rightarrow\mathcal{P}(E^{*})$: the concatenative join of two path sets.

  • $\times_{\circ}:\mathcal{P}(E^{*})\times\mathcal{P}(E^{*})\rightarrow\mathcal{P}(E^{*})$: the concatenative product of two path sets.

Definitions of these operations are provided in §II. The use of these operations to represent basic traversal idioms is presented in §III. §IV presents derivative traversals: regular paths are recognized and generated in §IV-A and §IV-B, respectively, and the use of the algebra to evaluate single-relational graph algorithms is presented in §IV-C. The algebra provides a set of core operations for constructing a multi-relational graph traversal engine that is founded on monoid, automata, and formal language theory.

II Core Operations

Traversing a graph is the process of moving over the edges specified in $E$. During a traversal, paths are derived and properties of those paths can be extracted.

Definition 1 (Path)

A path $a$ in a multi-relational graph is a sequence, or string, where $a\in E^{*}$ and $E\subseteq(V\times\Omega\times V)$. A path allows for repeated edges. The path length is denoted $\|a\|$ and is equal to the number of edges in $a$. Any edge in $E$ is a path with a path length of $1$ as $e\in E\subset E^{*}$.

The binary operation $\circ:E^{*}\times E^{*}\rightarrow E^{*}$ is the concatenation of two paths into a new path such that if $(i,\alpha,j)$ and $(j,\beta,k)$ are two edges in $E$, then their concatenation is the path $(i,\alpha,j,j,\beta,k)$, where $i,j,k\in V$ and $\alpha,\beta\in\Omega$. Concatenation is associative (i.e. $(a\circ b)\circ c=a\circ(b\circ c)$), not commutative (i.e. it is generally true that $a\circ b\neq b\circ a$), and $\epsilon$ serves as an identity (i.e. $\epsilon\circ a=a=a\circ\epsilon$).

Operations exist to extract information out of a path. The operation $\sigma:E^{*}\times\mathbb{N}^{+}\rightarrow E$ is a projection that maps a path to the $n^{\text{th}}$ edge in that path. For example, if $a=(i,\alpha,j,j,\beta,k)$, then $\sigma(a,1)=(i,\alpha,j)$ and $\sigma(a,2)=(j,\beta,k)$. Next, for any path, $\gamma^{-}:E^{*}\rightarrow V$ projects the tail (first vertex) of the path such that $\gamma^{-}((i,\alpha,j))=i$. Likewise, $\gamma^{+}:E^{*}\rightarrow V$ projects the head (last vertex) of the path, where $\gamma^{+}((i,\alpha,j))=j$. Similarly, for edge labels, $\omega:E\rightarrow\Omega$, where $\omega((i,\alpha,j))=\alpha$. (Footnote 3: All projection operations can be reduced to a single string indexing operation, but for the sake of clarity in the following discussion, they are presented as being atomic.)
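The concatenation and projection operations can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that an edge is a `(tail, label, head)` triple and that a path is a tuple of such edges rather than the flat string used in the text; the function names are hypothetical and only mirror the symbols defined above.

```python
# Assumption: an edge is a (tail, label, head) triple and a path is a
# tuple of edges; the empty tuple plays the role of epsilon.
EPSILON = ()

def concat(a, b):
    """Concatenate two paths (the binary concatenation operation)."""
    return a + b

def sigma(a, n):
    """Project the n-th edge of path a (n is 1-indexed, as in the text)."""
    return a[n - 1]

def gamma_minus(a):
    """Project the tail (first vertex) of a non-empty path."""
    return a[0][0]

def gamma_plus(a):
    """Project the head (last vertex) of a non-empty path."""
    return a[-1][2]

def omega(e):
    """Project the label of a single edge."""
    return e[1]

# The two-edge path (i, alpha, j, j, beta, k) from the example above.
a = concat((("i", "alpha", "j"),), (("j", "beta", "k"),))
print(sigma(a, 1), sigma(a, 2), gamma_minus(a), gamma_plus(a))
```

Representing a path as a tuple of edges keeps the edge boundaries explicit, so each projection is a plain index lookup, in the spirit of the string-indexing remark in footnote 3.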

Definition 2 (Path Label)

The path label of path $a$ is defined as the sequence of edge labels contained in $a$. Formally, if $a$ is a path, then the path label is constructed by $\omega^{\prime}:E^{*}\rightarrow\Omega^{*}$, where, using concatenation,

$$\omega^{\prime}(a)=\prod_{n=1}^{n\leq\|a\|}\omega\left(\sigma\left(a,n\right)\right).$$

The path label of any single edge $e\in E$ is simply the edge’s label as $\|e\|=1$ and $\omega^{\prime}(e)=\omega(\sigma(e,1))=\omega(e)$.
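As a small illustration of the path label, the following Python sketch collects the edge labels of a path in order. It assumes (this is not part of the original formalism) that a path is a tuple of `(tail, label, head)` triples; `path_label` is a hypothetical name.

```python
def path_label(a):
    """Return the sequence of edge labels along path a (a tuple of edges)."""
    return tuple(label for (_tail, label, _head) in a)

a = (("i", "alpha", "j"), ("j", "beta", "k"))
print(path_label(a))  # ('alpha', 'beta')
```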

The binary operation $\cup:\mathcal{P}(E^{*})\times\mathcal{P}(E^{*})\rightarrow\mathcal{P}(E^{*})$ is standard set union. The binary operation $\bowtie_{\circ}:\mathcal{P}(E^{*})\times\mathcal{P}(E^{*})\rightarrow\mathcal{P}(E^{*})$ is the concatenative join of two sets of paths such that if $A,B\in\mathcal{P}(E^{*})$, then

$$\begin{aligned}A\bowtie_{\circ}B=&\;\{a\circ b\;|\;a\in A\;\wedge\;b\in B\\&\;\;\wedge\;\left(a=\epsilon\;\vee\;b=\epsilon\;\vee\;\gamma^{+}(a)=\gamma^{-}(b)\right)\},\end{aligned}$$

where $\gamma^{+}(a)=\gamma^{-}(b)$ ensures that only joint (i.e. adjacent) paths are concatenated. (Footnote 4: The defined concatenative join is analogous to the $\theta$-join in [3], where $A\bowtie_{\gamma^{+}(a)=\gamma^{-}(b)}B$. In this form, it is known as an equijoin. A discussion relating the concatenative join and the relational algebra is found in [6].) For example, if

$$A=\left\{(i,\alpha,j),(j,\beta,k,k,\alpha,j)\right\}$$

and

$$B=\left\{(j,\beta,j),(j,\beta,i,i,\alpha,k),(i,\beta,k)\right\},$$

then

$$\begin{aligned}A\bowtie_{\circ}B=&\;\{(i,\alpha,j,j,\beta,j),(i,\alpha,j,j,\beta,i,i,\alpha,k),\\&\;\;(j,\beta,k,k,\alpha,j,j,\beta,j),\\&\;\;(j,\beta,k,k,\alpha,j,j,\beta,i,i,\alpha,k)\},\end{aligned}$$

where $i,j,k\in V$, $\alpha,\beta\in\Omega$, and $(i,\alpha,j),(j,\beta,k),(k,\alpha,j),(j,\beta,j),(j,\beta,i),(i,\alpha,k),(i,\beta,k)\in E$. Given that $\bowtie_{\circ}$ is based on $\circ$, $\bowtie_{\circ}$ is associative, but not commutative.
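The worked example above can be reproduced with a short Python sketch of the concatenative join. The representation is an assumption made for illustration: a path is a tuple of `(tail, label, head)` triples, the empty tuple stands for the empty path, and `join` is a hypothetical name for the join operation.

```python
def join(A, B):
    """Concatenative join: concatenate a and b when either is the empty
    path or the head of a equals the tail of b."""
    return {a + b for a in A for b in B
            if not a or not b or a[-1][2] == b[0][0]}

A = {(("i", "alpha", "j"),),
     (("j", "beta", "k"), ("k", "alpha", "j"))}
B = {(("j", "beta", "j"),),
     (("j", "beta", "i"), ("i", "alpha", "k")),
     (("i", "beta", "k"),)}

result = join(A, B)
for path in sorted(result):
    print(path)
```

Both paths in `A` end at vertex `j`, so only the two paths in `B` that start at `j` participate, giving four joined paths; the edge `(i, beta, k)` never appears in the result. Evaluating `join(B, A)` yields a different set, which illustrates that the join is not commutative.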

Definition 3 (Path Jointness)

A path is joint if it satisfies the characteristic function $f:E^{*}\rightarrow\{\top,\bot\}$ with the function rule

$$f(a)=\begin{cases}\top&\text{if }\|a\|=1,\\ \top&\text{if }\bigwedge_{n=1}^{n\leq\|a\|-1}\gamma^{+}(\sigma(a,n))=\gamma^{-}(\sigma(a,n+1)),\\ \bot&\text{otherwise}.\end{cases}$$

The function maps to $\top$ if the path is joint and $\bot$ if it is disjoint. The conjunction ranges over every pair of consecutive edges, so a path is joint exactly when the head of each edge equals the tail of the edge that follows it.
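A jointness check can be sketched in a few lines of Python. The sketch assumes a path is a tuple of `(tail, label, head)` triples and compares every pair of consecutive edges; `is_joint` is a hypothetical name for the characteristic function.

```python
def is_joint(a):
    """True when the head of each edge equals the tail of the next edge.
    Paths with fewer than two edges are trivially joint."""
    return all(a[n][2] == a[n + 1][0] for n in range(len(a) - 1))

joint = (("i", "alpha", "j"), ("j", "beta", "k"))
disjoint = (("i", "alpha", "j"), ("k", "beta", "i"))
print(is_joint(joint), is_joint(disjoint))  # True False
```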

The binary operation $\bowtie_{\circ}$ constructs joint paths. It may be the case that traversing disjoint paths is desirable. (Footnote 5: For example, priors-based algorithms require the concept of “teleportation” in order to make a disjoint jump in the graph.) The Cartesian product supports the concatenation of potentially disjoint paths. As such, $\times_{\circ}:\mathcal{P}(E^{*})\times\mathcal{P}(E^{*})\rightarrow\mathcal{P}(E^{*})$, where $A\times_{\circ}B=\{a\circ b\;|\;a\in A\;\wedge\;b\in B\}$.
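The difference between the concatenative product and the concatenative join can be seen in a small Python sketch. As an illustrative assumption, a path is a tuple of `(tail, label, head)` triples; `product` and `join` are hypothetical names for the two operations.

```python
def join(A, B):
    """Concatenative join: only adjacent (or empty) paths are concatenated."""
    return {a + b for a in A for b in B
            if not a or not b or a[-1][2] == b[0][0]}

def product(A, B):
    """Concatenative product: every pair is concatenated, joint or not."""
    return {a + b for a in A for b in B}

A = {(("i", "alpha", "j"),)}
B = {(("j", "beta", "k"),), (("k", "alpha", "i"),)}

joined = join(A, B)
crossed = product(A, B)
print(len(joined), len(crossed))  # 1 2
```

The product contains the disjoint path that "teleports" from `j` to `k`, while the join contains only the adjacent one, so the join result is a subset of the product result.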

Finally, to conclude this section, the $\dot{G}=(\dot{V},\dot{\mathbb{E}}=\{\dot{E_{1}},\dot{E_{2}},\ldots,\dot{E_{m}}\subseteq(\dot{V}\times\dot{V})\})$ definition of a multi-relational graph is not used because, when evaluating concatenative joins over binary relations, the edge label information is lost and thus the path label cannot be determined. In other words, if $e$ and $f$ are edges from two different binary relations, then $e\circ f$ would only provide a sequence of vertices and as such would not specify from which relations the join was constructed. This is a deficiency of the algebra in [4], where binary relations are used and $\circ:V^{*}\times V^{*}\rightarrow V^{*}$ as opposed to $\circ:E^{*}\times E^{*}\rightarrow E^{*}$, where $E\subseteq(V\times\Omega\times V)$. While the algebra in [4] is applicable to multi-relational graphs (as any two relations can be joined), it was specifically intended for single-relational graphs, where problems involving path labels are not considered. In contrast, the specification defined in this article preserves path labels.

III Basic Traversals

From the explicit adjacencies (edges) defined in the edge set $E$, there exist implicit adjacencies (paths) defined by $e\circ f$, where $e,f\in E$ and $e\circ f\in E^{*}$. Given the previously defined operations, different types of common traversal idioms can be effected.

III-A Complete Traversal

All joint paths of length $n$ through a graph can be constructed using $\underbrace{E\bowtie_{\circ}\ldots\bowtie_{\circ}E}_{n\text{ times}}$. This type of traversal is called a complete traversal because there is no discrimination when joining except that the join vertex (i.e. the head of the first path and tail of the second) be equal. When it is desirable to limit the set of paths derived by the traversal, the sets $A,B\subseteq E$ need to be defined and joined.
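A complete traversal can be sketched as a repeated join of the edge set with itself. The Python below is a minimal illustration under stated assumptions: edges are `(tail, label, head)` triples, paths are tuples of edges, `n` is at least 1, and `join` and `complete_traversal` are hypothetical names.

```python
def join(A, B):
    """Concatenative join of two path sets."""
    return {a + b for a in A for b in B
            if not a or not b or a[-1][2] == b[0][0]}

def complete_traversal(E, n):
    """All joint paths of length n (n >= 1) over the edge set E."""
    edges = {(e,) for e in E}      # lift each edge to a length-1 path
    paths = edges
    for _ in range(n - 1):         # n operands require n - 1 joins
        paths = join(paths, edges)
    return paths

# A three-vertex cycle: i -> j -> k -> i.
E = {("i", "alpha", "j"), ("j", "beta", "k"), ("k", "alpha", "i")}
print(len(complete_traversal(E, 1)), len(complete_traversal(E, 3)))  # 3 3
```

On a cycle there is exactly one joint path of any given length starting from each edge, which is why both counts are 3. On denser graphs the number of paths can grow quickly with `n`.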

III-B Source Traversal

A source traversal emanates from a particular set of vertices. Such a traversal is left restricting as it constructs paths whose tail vertex is an element of $V_{s}\subseteq V$. The first concatenative join must, on its left side, contain the set of all edges in $E$ that have their tail vertex in $V_{s}$. Therefore, when

$$A=\{e\;|\;e\in E\;\wedge\;\gamma^{-}(e)\in V_{s}\},$$

$\underbrace{A\bowtie_{\circ}E\ldots\bowtie_{\circ}E}_{n\text{ times}}$ yields all joint paths of length $n$ emanating from the vertices in $V_{s}$. When $V_{s}=V$, a complete traversal is evaluated since $A=E$. For ease of expression, the complement of the set $V_{s}$ can be used to denote where not to start a traversal from. For example, $\overline{V_{s}}=V\setminus V_{s}$ states to start the traversal from all vertices in $V$ except those in $V_{s}$.
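A source traversal can be sketched in Python by restricting only the leftmost operand of the join chain. The representation (edges as `(tail, label, head)` triples, paths as tuples of edges), the requirement that `n` be at least 1, and the names `join` and `source_traversal` are assumptions made for illustration.

```python
def join(A, B):
    """Concatenative join of two path sets."""
    return {a + b for a in A for b in B
            if not a or not b or a[-1][2] == b[0][0]}

def source_traversal(E, Vs, n):
    """All joint paths of length n (n >= 1) whose tail vertex is in Vs."""
    edges = {(e,) for e in E}
    paths = {(e,) for e in E if e[0] in Vs}   # the left-restricted set A
    for _ in range(n - 1):
        paths = join(paths, edges)
    return paths

E = {("i", "alpha", "j"), ("j", "beta", "k"), ("k", "alpha", "i")}
print(source_traversal(E, {"i"}, 2))
```

Passing every vertex as `Vs` leaves the first operand unrestricted, so the result coincides with a complete traversal of the same length.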

III-C Destination Traversal

A destination traversal is similar to a source traversal, except that it is right restricting as it constructs all paths of length $n$ whose head, or terminal, vertex is in $V_{d}\subseteq V$. In this way, when

$$B=\{e\;|\;e\in E\;\wedge\;\gamma^{+}(e)\in V_{d}\},$$

$\underbrace{E\bowtie_{\circ}\ldots E\bowtie_{\circ}B}_{n\text{ times}}$ is a destination traversal. When $V_{d}=V$, a complete traversal is evaluated because $B=E$ in such situations.

By combining a source and destination traversal, it is possible to emanate from particular vertices and arrive at particular vertices, where $\underbrace{A\bowtie_{\circ}E\ldots E\bowtie_{\circ}B}_{n\text{ times}}$ is the set of all joint paths that start from vertices in $V_{s}$, end at vertices in $V_{d}$, and are of length $n$. Source and destination traversals can also be used to ensure that a path goes through a particular set of vertices by specifying, at some particular $\bowtie_{\circ}$ step, the source (or destination) vertex set as $V_{s}$ (or $V_{d}$) before enacting the next concatenative join.
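The combined source and destination traversal can be sketched as a join chain whose first and last operands are restricted. The Python below assumes edges are `(tail, label, head)` triples, paths are tuples of edges, and `n` is at least 2 so that the first and last operands are distinct; all function names are hypothetical.

```python
def join(A, B):
    """Concatenative join of two path sets."""
    return {a + b for a in A for b in B
            if not a or not b or a[-1][2] == b[0][0]}

def source_destination_traversal(E, Vs, Vd, n):
    """Joint paths of length n (n >= 2) from a vertex in Vs to one in Vd."""
    edges = {(e,) for e in E}
    A = {(e,) for e in E if e[0] in Vs}   # left-restricted first operand
    B = {(e,) for e in E if e[2] in Vd}   # right-restricted last operand
    paths = A
    for _ in range(n - 2):                # unrestricted middle operands
        paths = join(paths, edges)
    return join(paths, B)

E = {("i", "alpha", "j"), ("j", "beta", "k"),
     ("j", "alpha", "i"), ("k", "alpha", "i")}
print(source_destination_traversal(E, {"i"}, {"i"}, 3))
```

With `Vs` and `Vd` both set to `{"i"}`, the sketch returns the closed walks of the requested length that start and end at `i`.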

III-D Labeled Traversal

A traversal can be constrained to particular path labels by defining an edge set that is a function of its edge labels. For example, if $\Omega_{e}\subseteq\Omega$, $\Omega_{f}\subseteq\Omega$,

$$A=\{e\;|\;e\in E\;\wedge\;\omega(e)\in\Omega_{e}\},$$

and

$$B=\{f\;|\;f\in E\;\wedge\;\omega(f)\in\Omega_{f}\},$$

then $A\bowtie_{\circ}B$ denotes all paths $a$ where $\omega(\sigma(a,1))\in\Omega_{e}$ and $\omega(\sigma(a,2))\in\Omega_{f}$. When $\Omega_{e}=\Omega_{f}=\Omega$, a complete traversal is enacted as, in such situations, $A=B=E$. The labeled traversal is possible because the relation type is represented in the edge definition $E\subseteq(V\times\Omega\times V)$ and there exists the label projection function $\omega:E\rightarrow\Omega$.
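A labeled traversal can be sketched by selecting, for each step, the edges whose label falls in an allowed set and joining the selections in order. In the Python below, edges are assumed to be `(tail, label, head)` triples and paths tuples of edges; `labeled_traversal` is a hypothetical helper that generalizes the two-step example above to any number of label sets.

```python
def join(A, B):
    """Concatenative join of two path sets."""
    return {a + b for a in A for b in B
            if not a or not b or a[-1][2] == b[0][0]}

def labeled_traversal(E, label_sets):
    """Join, in order, the edge subsets selected by each label set."""
    paths = {()}                           # start from the empty path
    for labels in label_sets:
        step = {(e,) for e in E if e[1] in labels}
        paths = join(paths, step)
    return paths

E = {("i", "alpha", "j"), ("j", "beta", "k"), ("j", "alpha", "k")}
print(labeled_traversal(E, [{"alpha"}, {"beta"}]))
```

Supplying the full label set at every step removes the restriction, so the result matches a complete traversal of the same length.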

IV Derivative Traversals

The basic traversals defined in §III can be mixed and matched to yield different types of joint paths in $E^{*}$. This section introduces some typical applications of the presented multi-relational path algebra to problems that are specific to multi-relational graphs, focusing primarily on problems involving regular paths. (Footnote 6: For the sake of simplicity, only regular paths are discussed. However, with more machinery (e.g. memory structures), more complex traversals can be expressed using the core operations presented in §II.)

IV-A Regular Path Recognizer

The presented multi-relational path algebra has application to regular expressions and their corresponding finite state automata. Before presenting this application, an example-specific set-builder notation is introduced in order to specify subsets of $E$ in a more concise, readable manner than previously presented. A source edge set can be specified as $[i,\_,\_]\equiv\bigcup_{\alpha\in\Omega}\bigcup_{j\in V}(i,\alpha,j):(i,\alpha,j)\in E$ in order to denote the set of all edges that emanate from vertex $i$. A destination edge set can be specified as $[\_,\_,j]\equiv\bigcup_{i\in V}\bigcup_{\alpha\in\Omega}(i,\alpha,j):(i,\alpha,j)\in E$ in order to denote the set of all edges that terminate at vertex $j$. A labeled edge set can be specified as $[\_,\alpha,\_]\equiv\bigcup_{i\in V}\bigcup_{j\in V}(i,\alpha,j):(i,\alpha,j)\in E$ in order to denote the set of all edges that have $\alpha$ as their label. Finally, $[\_,\_,\_]=E$.

If $E$ is the regular expression alphabet, then $\emptyset$, $\epsilon$, and any $e\in E$ are regular expressions. If $R$ and $Q$ are regular expressions, then $R\cup Q$, $R\bowtie_{\circ}Q$, and $R^{*}$ are regular expressions [7]. (Footnote 7: The $\times_{\circ}$ operation can be used to recognize potentially disjoint paths, but in practice, when only joint paths are being recognized, $\bowtie_{\circ}$ is a more efficient use of resources as $R\bowtie_{\circ}Q\subseteq R\times_{\circ}Q$.) A regular expression over $E$, and its corresponding finite state automaton, recognizes a set of joint paths in $\mathcal{P}(E^{*})$. (Footnote 8: The common operations $R^{+}$, $R?$, and $R^{n}$ used in practice can be represented as $R\bowtie_{\circ}R^{*}$, $R\cup\{\epsilon\}$, and $\underbrace{R\bowtie_{\circ}\ldots\bowtie_{\circ}R}_{n\text{ times}}$, respectively.) For example,

[i,α,_][_,β,_](([_,α,j]{(j,α,i)})[_,α,k])[i,\alpha,\_]\bowtie_{\circ}[\_,\beta,\_]^{*}\left(\left([\_,\alpha,j]\bowtie_{\circ}\{(j,\alpha,i)\}\right)\;\cup\;[\_,\alpha,k]\right)

recognizes all paths emanating from $i$, terminating at $i$ or $k$, with the first and last label traversed being $\alpha$, and all intermediate edge labels (zero or more) being $\beta$. The corresponding finite state automaton is diagrammed in Figure 1, where the transition function is based on set membership, not equality. (Footnote 9: Given that set membership can be represented element-wise as element equality under or, each element of the transition label edge set can be individually denoted as a transition with the same tail and head state. As such, the typical finite state automaton transition exists. For diagram clarity, set membership is used instead of equality.)

Figure 1: A finite state automaton to recognize and generate a set of paths in $\mathcal{P}(E^{*})$. The leftmost state is the start state and the double-circle states denote accepting states.

Regular paths in graphs are explored in depth in [8], where only paths with particular path labels are considered for recognition. In other words, in [8], a regular expression is defined for the alphabet $\Omega$, whereas above it is defined for $E$.
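The example expression of this section can be recognized with a small nondeterministic automaton whose transitions test set membership of the next edge. The Python sketch below is one possible encoding, not the automaton of Figure 1 itself: the state numbering, the `recognizes` function, and the tuple-of-triples path representation are all assumptions made for illustration, and the edges of the candidate path are assumed to belong to $E$.

```python
def recognizes(path, i="i", j="j", k="k"):
    """Recognize [i,alpha,_] join [_,beta,_]* join
    (([_,alpha,j] join {(j,alpha,i)}) union [_,alpha,k])."""
    # state -> list of (membership predicate on the next edge, next state)
    transitions = {
        0: [(lambda e: e[0] == i and e[1] == "alpha", 1)],
        1: [(lambda e: e[1] == "beta", 1),
            (lambda e: e[1] == "alpha" and e[2] == j, 2),
            (lambda e: e[1] == "alpha" and e[2] == k, 4)],
        2: [(lambda e: e == (j, "alpha", i), 3)],
    }
    accepting = {3, 4}
    # The join only builds joint paths, so disjoint paths are rejected.
    if any(a[2] != b[0] for a, b in zip(path, path[1:])):
        return False
    states = {0}
    for edge in path:
        states = {target
                  for state in states
                  for predicate, target in transitions.get(state, [])
                  if predicate(edge)}
    return bool(states & accepting)

to_k = (("i", "alpha", "x"), ("x", "beta", "y"), ("y", "alpha", "k"))
to_i = (("i", "alpha", "x"), ("x", "alpha", "j"), ("j", "alpha", "i"))
too_short = (("i", "alpha", "j"), ("j", "alpha", "i"))
print(recognizes(to_k), recognizes(to_i), recognizes(too_short))
```

Tracking a set of active states is a standard way to simulate nondeterminism; each predicate plays the role of a transition labeled with an edge set rather than a single edge.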

IV-B Regular Path Generator

By making use of a non-deterministic single-stack automaton with a stack alphabet of $\mathcal{P}(E^{*})$, it is possible to generate all paths in $G$ that can be recognized by some regular expression. The non-deterministic aspect of the automaton ensures that all branches in the state machine are taken “in parallel.” The single-stack aspect refers to the fact that the automaton (and thus, each of its cloned/branched automata) maintains a first-in/last-out stack memory that can be pushed and popped.

Initially, the automaton’s stack contains the element $\{\epsilon\}$. The automaton halts whenever its stack element is $\emptyset$ or it is in an accepting state. For each state transition (which happens unless the automaton has been halted), the path set defined on the transition label is joined on the right with the path set popped off the stack. The result of the join is then pushed back onto the stack. Whenever a branch in the automaton’s state graph is approached, all branches are taken “in parallel.” Thus, given the automaton diagrammed in Figure 1, the following joins are evaluated.

$$\begin{aligned}
&\{\epsilon\}\bowtie_{\circ}[i,\alpha,\_]\bowtie_{\circ}[\_,\alpha,j]\bowtie_{\circ}\{(j,\alpha,i)\}\\
&\{\epsilon\}\bowtie_{\circ}[i,\alpha,\_]\bowtie_{\circ}[\_,\alpha,k]\\
&\{\epsilon\}\bowtie_{\circ}[i,\alpha,\_]\bowtie_{\circ}[\_,\beta,\_]\ldots\bowtie_{\circ}[\_,\alpha,j]\bowtie_{\circ}\{(j,\alpha,i)\}\\
&\{\epsilon\}\bowtie_{\circ}[i,\alpha,\_]\bowtie_{\circ}[\_,\beta,\_]\ldots\bowtie_{\circ}[\_,\alpha,k]
\end{aligned}$$

The union of the first (and only) element of all the stacks across all branches of the accept-state automata forms the set of all paths in $G$ that satisfy the regular expression.
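The generation procedure can be approximated in Python by evaluating the joins directly and collecting the results of every accepting branch. The sketch below is a simplification under stated assumptions: paths are tuples of `(tail, label, head)` triples, the Kleene star is unrolled at most `max_beta` times (a bound introduced here because cyclic graphs can yield unboundedly long paths), and the name `generate_regular_paths` is hypothetical.

```python
def join(A, B):
    """Concatenative join of two path sets."""
    return {a + b for a in A for b in B
            if not a or not b or a[-1][2] == b[0][0]}

def generate_regular_paths(E, i, j, k, max_beta=3):
    """Generate paths for [i,alpha,_] [_,beta,_]* (([_,alpha,j] {(j,alpha,i)})
    union [_,alpha,k]), unrolling the star at most max_beta times."""
    lifted = {(e,) for e in E}
    first = {p for p in lifted if p[0][0] == i and p[0][1] == "alpha"}
    beta = {p for p in lifted if p[0][1] == "beta"}
    to_j = {p for p in lifted if p[0][1] == "alpha" and p[0][2] == j}
    to_k = {p for p in lifted if p[0][1] == "alpha" and p[0][2] == k}
    back = {((j, "alpha", i),)} & lifted   # the literal edge, if it is in E
    ending = join(to_j, back) | to_k       # both accepting branches

    results = set()
    stack = join({()}, first)              # the stack initially holds {epsilon}
    for _ in range(max_beta + 1):
        if not stack:                      # halt when the stack element is empty
            break
        results |= join(stack, ending)     # union over accepting branches
        stack = join(stack, beta)          # take the beta loop once more
    return results

E = {("i", "alpha", "a"), ("a", "beta", "b"), ("b", "alpha", "k"),
     ("a", "alpha", "k"), ("b", "alpha", "j"), ("j", "alpha", "i")}
paths = generate_regular_paths(E, "i", "j", "k")
for p in sorted(paths):
    print(p)
```

On this small graph the sketch yields three paths: one with no `beta` edge ending at `k`, and two with a single `beta` edge ending at `k` and at `i`, respectively.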

IV-C Constructing Semantically-Rich Single-Relational Graphs

Most of the graph algorithms in existence today have been developed for single-relational graphs. Examples of such algorithms include the geodesic (e.g. closeness centrality, betweenness centrality), spectral (e.g. eigenvector centrality, spreading activation), and assortative (e.g. scalar and discrete) algorithms (see [1] for a consolidated review and analysis of many such algorithms). When applied to multi-relational graphs, these algorithms have the potential drawback of losing their meaning and thus, their applicability. To explicate this statement, it is important to consider the way in which a single-relational graph algorithm can be formally applied to multi-relational graphs. One method that can be employed is to simply ignore edge labels and, potentially, repeated edges between the same two vertices. However, when there are numerous ways in which one vertex can be related to another vertex, what is the resulting semantics of, say, a centrality algorithm? Another method is to extract a single edge relation, based on its label, from the multi-relational graph. For example, it is possible to construct the binary edge set

$$E_{\alpha}=\{(\gamma^{-}(e),\gamma^{+}(e))\;|\;e\in E\;\wedge\;\omega(e)=\alpha\}$$

and utilize that subgraph as the source of a single-relational graph algorithm. However, with multiple ways in which vertices can be related, more abstract relationships can be inferred through paths. Thus, in the final method, single-relational graphs can be generated from the multi-relational graph through the derivation of implicit edges defined through paths. Using a simple example, if $\alpha,\beta\in\Omega$ are two edge labels, then all $\alpha\beta$-paths can be constructed by evaluating $A\bowtie_{\circ}B$, where $A=\{e\;|\;e\in E\;\wedge\;\omega(e)=\alpha\}$ and $B=\{e\;|\;e\in E\;\wedge\;\omega(e)=\beta\}$. The tail and head vertices of these paths can then be projected to form a new binary edge set

$$E_{\alpha\beta}=\bigcup_{a\in A\bowtie_{\circ}B}\left(\gamma^{-}(a),\gamma^{+}(a)\right).$$

Thus, $E_{\alpha\beta}\subseteq(V\times V)$ can be subjected to all known single-relational graph algorithms. For regular paths, a regular path generator can be used as in §IV-B. Mapping single-relational graph algorithms over to the multi-relational domain is explored in depth in [5].
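The derivation of the binary edge set from two-step labeled paths can be sketched in Python. The sketch assumes edges are `(tail, label, head)` triples and paths are tuples of edges; `derive_edges` is a hypothetical name, and the example graph is made up for illustration.

```python
def join(A, B):
    """Concatenative join of two path sets."""
    return {a + b for a in A for b in B
            if not a or not b or a[-1][2] == b[0][0]}

def derive_edges(E, alpha, beta):
    """Project the (tail, head) pairs of all alpha-then-beta paths."""
    A = {(e,) for e in E if e[1] == alpha}
    B = {(e,) for e in E if e[1] == beta}
    return {(a[0][0], a[-1][2]) for a in join(A, B)}

E = {("i", "alpha", "j"), ("j", "beta", "k"),
     ("j", "beta", "m"), ("k", "alpha", "i")}
print(sorted(derive_edges(E, "alpha", "beta")))  # [('i', 'k'), ('i', 'm')]
```

The resulting set of vertex pairs carries no labels, so it can be handed to any single-relational algorithm, with each pair standing for an implicit relation defined by the chosen path label.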

V Conclusion

This article defined a path algebra for multi-relational graphs represented as $G=(V,E\subseteq(V\times\Omega\times V))$. The core traversal types (complete, source, destination, and labeled) allow more complex traversals to be expressed through the restriction of the join set $E$. Applications to regular path recognizers (§IV-A), generators (§IV-B), and “semantically-rich” single-relational graph construction (§IV-C) were presented. Generally, the algebra has applicability to the construction of a multi-relational graph traversal engine.

References

  • [1] U. Brandes and T. Erlebach, Eds., Network Analysis: Methodological Foundations. Berlin, DE: Springer, 2005.
  • [2] M. A. Rodriguez and P. Neubauer, “Constructions from dots and lines,” Bulletin of the American Society for Information Science and Technology, vol. 36, no. 6, pp. 35–41, August 2010.
  • [3] E. F. Codd, “A relational model of data for large shared data banks,” Communications of the ACM, vol. 13, no. 6, pp. 377–387, 1970.
  • [4] M. Russling, “A general scheme for breadth-first graph traversal,” in Mathematics of Program Construction, ser. Lecture Notes in Computer Science, M. Russling, Ed., vol. 947. Springer-Verlag, 1995, pp. 380–398.
  • [5] M. A. Rodriguez and J. Shinavier, “Exposing multi-relational networks to single-relational network analysis algorithms,” Journal of Informetrics, vol. 4, no. 1, pp. 29–41, 2009. [Online]. Available: http://arxiv.org/abs/0806.2274
  • [6] P. Pucheral and J.-M. Thévenin, “A graph based data structure for efficient implementation of main memory dbms,” in Proceedings of the Sixth International Workshop on Database Machines. London, UK: Springer-Verlag, 1989, pp. 73–96.
  • [7] B. Moret, The Theory of Computation. Addison-Wesley, 1997.
  • [8] A. O. Mendelzon and P. T. Wood, “Finding regular simple paths in graph databases,” in Proceedings of the 15th International Conference on Very Large Data Bases. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1989, pp. 185–193.