
1 Yale University, 2 SingularityNET

A meta-probabilistic-programming language for bisimulation of probabilistic and non-well-founded type systems

Jonathan Warrell1,2    Alexey Potapov2    Adam Vandervorst2    Ben Goertzel2
Abstract

We introduce a formal meta-language for probabilistic programming, capable of expressing both programs and the type systems in which they are embedded. We are motivated here by the desire to allow an AGI to learn not only relevant knowledge (programs/proofs), but also appropriate ways of reasoning (logics/type systems). We draw on the frameworks of cubical type theory and dependently typed metagraphs to formalize our approach. In doing so, we show that specific constructions within the meta-language can be related via bisimulation (implying path equivalence) to the type systems to which they correspond. This allows our approach to provide a convenient means of deriving synthetic denotational semantics for various type systems. In particular, we derive bisimulations for pure type systems (PTS) and probabilistic dependent type systems (PDTS). We further discuss the relationship of PTS to non-well-founded set theory, and demonstrate the feasibility of our approach with an implementation of a bisimulation proof in a Guarded Cubical Type Theory type checker.

1 Introduction

Probabilistic programming offers a fertile ground between logic-based and machine-learning-based approaches to A(G)I. Formalization within type theory offers a rigorous approach to deriving semantics for probabilistic languages [15], and formalization of dependently typed probabilistic languages offers the promise of drawing a tight connection with probabilistic logics of various kinds (e.g. Markov Logic [19], Probabilistic Paraconsistent Logic [7]).

While the exploration of such individual systems is highly important, we might consider more abstractly how to embody general principles for the formation of diverse probabilistic type systems, logics, and programming languages within a single meta-language. Such a language can be considered a meta-theoretical language or logical framework for expressing individual type systems and logics. However, previous frameworks (such as [9]) have not been designed with probabilistic type systems and logics specifically in mind. Here, we outline a formal language, $\mathbb{M}$, designed for such a purpose. This language is intended as a formal model of the MeTTa language, currently being developed as part of the OpenCog project [14, 8, 16]. The language allows for (probabilistic) reasoning not only about the knowledge embedded in a system, but also about the logic employed by the system itself.

Our approach may also be seen in relation to recent methods to derive synthetic denotational semantics for logical systems using guarded cubical type theory (GCTT) [18, 11]. Such approaches are particularly promising, offering as they do a unified approach to deriving semantics for recursive datatypes as final co-algebras of appropriate functors in the context of a formulation of univalent type theory with a fully computational semantics. We draw on methods from [10] to formalize our approach in this context. This allows us to rigorously define the relationship between an object-language and its expression in our meta-language as one of bisimulation, corresponding to path equivalence in GCTT. We further show how dependently typed metagraphs can be formalized in GCTT as the basis for our framework [6, 12], and how this leads to systems embedding natural type-theoretic equivalents of non-well-founded sets.

We begin by developing a general framework for representing metagraphs in GCTT, before outlining how the final co-algebra of a labeled transition system over this recursive datatype can be used to model our meta-language. We then derive bisimulations for various object-languages in our system, including simply typed (and untyped) lambda calculus, pure type systems, and probabilistic dependent type systems, hence deriving synthetic denotational semantics for these systems. Finally, we demonstrate the feasibility of our approach with an implementation of a bisimulation proof for a small-scale type system in a Guarded Cubical Type Theory type checker [4], before concluding with a discussion.

2 Labeled metagraphs as a guarded recursive datatype

We begin by defining a recursive datatype for typed metagraphs, $\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})}$, using guarded cubical type theory. Here, $\mathcal{T}$ and $\mathcal{L}$ are types of type-symbols and edge labels respectively, and $\preceq_{T}:\mathcal{T}\times\mathcal{T}\rightarrow\mathbb{B}$ is a partial order on type-symbols. The recursive datatype is defined as the final co-algebra of the functor $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L},\preceq_{T})}(A)$, which when applied to type $A$ returns the following datatype (letting $\Delta$ stand for the assumptions $\mathcal{L},\mathcal{T},A:\mathcal{U}_{0}$; the $\epsilon$, $\operatorname{edge}$, and $\operatorname{connect}$ constructors used here follow the approach of [12] and [6]):

$$
\begin{gathered}
\frac{\Gamma\vdash\Delta}{\Gamma\vdash\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)}
\qquad
\frac{\Gamma\vdash\Delta}{\Gamma\vdash\epsilon:\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)}
\\[1ex]
\frac{\Gamma\vdash\Delta,\; n:\mathbb{N},\; t_{0}:\mathcal{T},\; t:\operatorname{Vec}(n,\mathcal{T}),\; l_{0}:\mathcal{L}}{\Gamma\vdash\operatorname{edge}(n,t_{0},l_{0},t):\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)}
\\[1ex]
\frac{\Gamma\vdash\Delta,\; a_{1},a_{2}:A,\; t_{0}:\mathcal{T},\; l_{0}:\mathcal{L},\; q:\mathbb{N}\rightarrow\mathbb{N}_{0,\infty}}{\Gamma\vdash\operatorname{connect}(a_{1},a_{2},t_{0},l_{0},q):\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)}
\end{gathered}
\tag{1}
$$

where $\operatorname{Vec}(n,A)$ is the type of vectors over $A$ of length $n$, and $\mathbb{N}_{0,\infty}$ is $\mathbb{N}$ extended with $0$ and $\infty$. We note that, for notational convenience, we do not explicitly include target labels/indices in the definition of $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)$ above (in contrast to [6], where $\mathcal{L}$ refers to target indices and $\mathcal{V}$ is used for edge values). If explicit indices are required to identify target 'levels', these may be included by letting $\mathcal{L}=\mathcal{L}_{0}\times\sum_{n}\operatorname{Vec}(n,\mathbb{N})$, so that each edge label is paired with a vector of target indices. $\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})}$ is then defined as a final fixed-point of $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}$, such that a set of constraints is satisfied:

$$
\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})} = \sum M:\nu(\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}).\,C(M,\preceq_{T})
\tag{2}
$$

where $C(M,\preceq_{T})$ represents the constraints:

$$
\begin{aligned}
C(M,\preceq_{T}) = {}& \forall n_{1},n_{2}:\mathbb{N},\ t_{1},t_{2}:\mathcal{T}\\
& f(M,n_{1})=t_{1}\ \wedge\\
& f(M,n_{2})=t_{2}\ \wedge\\
& q^{\prime}_{M}(n_{1})=n_{2}\ \Rightarrow\ t_{1}\preceq_{T}t_{2}
\end{aligned}
\tag{3}
$$

Here, $f(M,n)$ represents a function which, for metagraph $M$, returns the type of its $n$'th edge or target. Specifically, when $M$ is of the form $\operatorname{edge}(n,t_{0},l_{0},t)$, $f(M,0)$ is the type of the edge and $f(M,n>0)$ is the type of the $n$'th target. When $M$ is of the form $\operatorname{connect}(a_{1},a_{2},t_{0},l_{0},q)$, $f(M,0)$ is the type of the whole metagraph, while the types of the edges/targets of $a_{1}$ and $a_{2}$ are interleaved when evaluating $f(M,n>0)$ for odd/even values of $n$ respectively. Further, the function $q^{\prime}_{M}:\mathbb{N}\rightarrow\mathbb{N}_{0,\infty}$ is recursively defined on $\nu(\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})})$ (via the $q$ function in the $\operatorname{connect}$ constructor of Eq. 1) to indicate that the $n_{1}$'th target of $M$ is connected to the $n_{2}$'th edge/target of $M$ whenever $q(n_{1})=n_{2}$, with $n_{2}=\infty$ indicating that the target has no connection. $C(M,\preceq_{T})$ thus provides a set of constraints ensuring that the connections in a metagraph respect the $\preceq_{T}$ relation; further constraints are needed to ensure, for instance, that targets receive input from only one other target (as may be appropriate for some metagraphs). Further, $\nu=\operatorname{fix}X.F(\triangleright(\alpha:\mathbb{T}).X[\alpha])$ is the guarded fixed-point operator [10]. By [10], Prop. 3.2, $\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})}$ is a subset of both the initial algebra and the final coalgebra of $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}\circ\triangleright$. Finally, we note that our $\operatorname{connect}$ constructor corresponds to $\text{Connect}_{Q}$ in [6], and the Union constructor is simply $\operatorname{connect}$ with $q(n)=\infty$ for all $n$ (meaning that no new connections are added).

We briefly give some examples of typed metagraphs. For convenience, we set $\mathcal{L}=\{\operatorname{null}\}$ and $\mathcal{T}=\{A,B,C,D,\top\}$, with $\preceq_{T}$ the identity relation along with $t\preceq_{T}\top$ for all $t$. In our first example, we can construct metagraphs $X=\operatorname{edge}(3,A,\operatorname{null},[D,B,C])$ and $Y=\operatorname{edge}(2,B,\operatorname{null},[D,A])$. Then, a combined graph can be constructed as $Z^{\prime}=\operatorname{connect}(X,Y,\top,\operatorname{null},\{(1,1),(2,0)\})$, $Z^{\prime\prime}=\operatorname{connect}(Y,X,\top,\operatorname{null},\{(1,1),(2,0)\})$, $Z^{\prime\prime\prime}=\operatorname{connect}(Z^{\prime},Z^{\prime\prime},\top,\operatorname{null},\{\})$, $Z=\operatorname{connect}(X,Z^{\prime\prime\prime},C,\operatorname{null},\{(3,0)\})$. The entire metagraph is shown in Fig. 1A. We note that, in general, any metagraph with a finite number of edges and targets can be represented by a term in the initial algebra of $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}$ (as is $Z$). Some graphs, however, may also be conveniently represented by terms in the final coalgebra. Consider for instance Fig. 1B. Here, we may define $X^{\prime\prime}=\operatorname{edge}(3,\top,\operatorname{null},[B,B,A])$ and $X^{\prime}=\operatorname{connect}(X^{\prime\prime},X^{\prime\prime},A,\operatorname{null},\{(1,2),(2,1),(3,0)\})$, representing $X^{\prime}$ by a term in the initial algebra (suppressing visualization of the $X^{\prime\prime}$ subgraph). Alternatively, we may define $X^{\prime}_{co}=\operatorname{connect}(\operatorname{edge}(3,A,\operatorname{null},[B,B,A]),X^{\prime}_{co},A,\operatorname{null},\{(1,2),(2,1),(3,0)\})$, which implicitly determines a term in the coalgebra as a solution to the recursive equation.
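To make the term structure of the first example concrete, the following is a minimal Python sketch of the edge and connect constructors as plain data. It is not part of the formal development: the class names `Edge` and `Connect`, the helper `count_edges`, the string `"Top"` standing for the top type, and the encoding of the connection function $q$ as a finite set of pairs (with all unlisted indices understood as unconnected) are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union

TOP = "Top"  # hypothetical stand-in for the top type symbol


@dataclass(frozen=True)
class Edge:
    """edge(n, t0, l0, t): a single edge with n typed targets."""
    n: int
    t0: str
    l0: str
    targets: Tuple[str, ...]


@dataclass(frozen=True)
class Connect:
    """connect(a1, a2, t0, l0, q): q is stored as a finite set of pairs."""
    a1: "Metagraph"
    a2: "Metagraph"
    t0: str
    l0: str
    q: Tuple[Tuple[int, int], ...]


Metagraph = Union[Edge, Connect]


def edge(n, t0, l0, targets):
    if len(targets) != n:
        raise ValueError("target vector must have length n")
    return Edge(n, t0, l0, tuple(targets))


def connect(a1, a2, t0, l0, q):
    return Connect(a1, a2, t0, l0, tuple(sorted(q)))


def leq_T(t1, t2):
    """Identity relation together with t <= Top for every t."""
    return t1 == t2 or t2 == TOP


def count_edges(m):
    """Number of edge leaves in the term (shared subterms are counted each time)."""
    if isinstance(m, Edge):
        return 1
    return count_edges(m.a1) + count_edges(m.a2)


# The finite example from the text, with Z1, Z2, Z3 standing for Z', Z'', Z'''.
X = edge(3, "A", "null", ["D", "B", "C"])
Y = edge(2, "B", "null", ["D", "A"])
Z1 = connect(X, Y, TOP, "null", {(1, 1), (2, 0)})
Z2 = connect(Y, X, TOP, "null", {(1, 1), (2, 0)})
Z3 = connect(Z1, Z2, TOP, "null", set())
Z = connect(X, Z3, "C", "null", {(3, 0)})
```

Because these are finite, eagerly built values, the sketch only covers terms of the initial algebra; a coinductive term such as $X^{\prime}_{co}$ would need a lazy or cyclic representation, which is not attempted here.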

Figure 1: Typed metagraph examples. Boxes show metagraphs, which may be single edges (containing no further boxes) or include several edges. Solid circles show edge target types and dotted circles show metagraph types. Arrows show target-target or target-edge connections. Metagraph letter names are shown on the box of the metagraph to which they refer in the text.

3 $\mathbb{M}$ as the final coalgebra of a labeled transition system

We define the formal meta-probabilistic-programming language, $\mathbb{M}$, as a labeled transition system over typed metagraphs. Here, we are interested in typed metagraphs with a particular form. Specifically, we begin by defining $\mathcal{T}$ by the abstract syntax:

$$
\begin{aligned}
\mathcal{T} ::= {}& t_{n}\;\;|\;\;\mathcal{T}\rightarrow\mathcal{T}\;\;|\;\;\prod a:\mathcal{T}.\mathcal{M}_{\mathbb{M}}\;\;|\;\;\\
& \operatorname{Eq}(\mathcal{T},\mathcal{M}_{\mathbb{M}},\mathcal{M}_{\mathbb{M}})\;\;|\;\;\mathcal{T}\cup\mathcal{T}\;\;|\;\;\mathcal{T}\cap\mathcal{T}\;\;|\;\;\\
& \operatorname{Type}\;\;|\;\;\top_{\operatorname{Type}}\;\;|\;\;\top\;\;|\;\;\mathcal{J}\;\;|\;\;\mathcal{X}
\end{aligned}
\tag{4}
$$

These syntactic constructions represent base-level types, function types, dependent types, equality types, type unions and intersections, a base universe of small types, the union of all small types, the union of all types, judgments and execution states respectively. Notice also that in Eq. 4, $\mathcal{T}$ is defined by mutual recursion with the type $\mathcal{M}_{\mathbb{M}}$, defined in Eq. 6. We then define $\mathcal{L}$ as $\mathcal{L}=\mathcal{S}\cup\mathcal{V}\cup\mathcal{K}\cup\mathcal{T}\times\mathbb{N}$. Notice that $\mathcal{L}$ includes $\mathcal{T}$, so that types may simultaneously serve as labels. Further, $\mathcal{S}=\{s_{1},s_{2},...\}$ and $\mathcal{V}=\{v_{1},v_{2},...\}$ denote collections of symbols and variables respectively, and $\mathcal{K}$ is a special set of $\mathbb{M}$ keywords/key-symbols:

$$
\mathcal{K} = \{:,\preceq,=,\rightarrow,\operatorname{Eq},\operatorname{fun-app},\operatorname{transform},@,\dagger\}
\tag{5}
$$

Further, $\mathcal{L}$ includes an edge-specific identifier in $\mathbb{N}$ to deduplicate edges which are identical in other respects.
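As an informal illustration of how type expressions and labels fit together, the sketch below encodes a fragment of the type syntax as immutable Python values and represents a label as a (symbol, identifier) pair. The class names (`Base`, `Arrow`, `Join`, `Meet`, `Const`), the ASCII spellings of the keywords, and the helper `make_label` are hypothetical; dependent products and equality types are omitted because they are defined by mutual recursion with metagraphs.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Base:
    """A base-level type t_n."""
    index: int


@dataclass(frozen=True)
class Arrow:
    """A function type dom -> cod."""
    dom: "Ty"
    cod: "Ty"


@dataclass(frozen=True)
class Join:
    """A type union."""
    left: "Ty"
    right: "Ty"


@dataclass(frozen=True)
class Meet:
    """A type intersection."""
    left: "Ty"
    right: "Ty"


@dataclass(frozen=True)
class Const:
    """One of the constant types, e.g. "Type", "TopType", "Top", "J", "X"."""
    name: str


Ty = Union[Base, Arrow, Join, Meet, Const]

# ASCII spellings of the key-symbols (an assumption of this sketch).
KEYWORDS = frozenset(
    {":", "<=", "=", "->", "Eq", "fun-app", "transform", "@", "dagger"}
)

Label = Tuple[object, int]


def make_label(symbol, ident) -> Label:
    """Pair a symbol, variable, keyword, or type with an edge identifier."""
    return (symbol, ident)


# Two ':' edges that agree in every other respect remain distinct labels.
labels = {make_label(":", 0), make_label(":", 1), make_label(":", 0)}

# Types are hashable values, so they can serve directly as label symbols.
arrow_label = make_label(Arrow(Base(1), Base(2)), 0)
```

Here the identifier plays only the deduplicating role described above: the set `labels` contains two elements, not three.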

The state of an $\mathbb{M}$ program is represented by a typed metagraph in the following space:

$$
\mathcal{M}_{\mathbb{M}}=\sum\preceq_{T}:(\mathcal{T}\times\mathcal{T}\rightarrow\mathbb{B}).\sum M:\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})}.\,C_{\mathbb{M}}(M,\preceq_{T})
\tag{6}
$$

Hence, this is the space of all metagraphs over $\mathcal{L}$ and $\mathcal{T}$, with a varying $\preceq_{T}$ relation, where $C_{\mathbb{M}}(M,\preceq_{T})$ represents a set of '$\mathbb{M}$-specific constraints' on the structure of the metagraph (to be outlined below). This state represents the Atomspace of the program, and the subgraphs of the Atomspace are the individual atoms (as in MeTTa, see [14, 8]). We note that, since $\mathbb{M}$ serves both as a language for defining programs and type-systems within which these programs are embedded, the atoms may represent base-level propositions and programs (expressions), as well as judgments and computational state information, as reflected by their types. The $\mathbb{M}$-specific constraints, $C_{\mathbb{M}}(M,\preceq_{T})$, determine the interaction of the keywords/key-symbols with the type system:

$$
\begin{aligned}
&\forall m\in M.\ \exists n,n_{1}:\mathbb{N}.\\
&\quad m=\operatorname{edge}(2,\mathcal{J},(:,n),[\top_{\operatorname{Type}}\;\top])\ \vee\\
&\quad m=\operatorname{edge}(2,\mathcal{J},(\preceq,n),[\operatorname{Type}\;\operatorname{Type}])\ \wedge\\
&\qquad (m_{M}[1]=\operatorname{edge}(0,\operatorname{Type},(t_{n_{1}},0),[])\ \wedge\\
&\qquad\ m_{M}[2]=\operatorname{edge}(0,\operatorname{Type},(t_{n_{2}},0),[])\Rightarrow(t_{n_{1}}\preceq t_{n_{2}}))\ \vee\\
&\quad m=\operatorname{edge}(2,\mathcal{J},(=,n),[\top_{\operatorname{Type}}\;\top_{\operatorname{Type}}])\ \vee\\
&\quad m=\operatorname{edge}(2,\operatorname{Type},(\rightarrow,n),[\operatorname{Type}\;\operatorname{Type}])\ \wedge\\
&\qquad ((m_{1})_{M}[2]=m\wedge l(m_{1})=(:,n_{1})\wedge t(m_{M}[1])=A\ \wedge\\
&\qquad\ t(m_{M}[2])=B\Rightarrow t((m_{1})_{M}[1])\preceq A\rightarrow B)\ \vee\\
&\quad m=\operatorname{edge}(2,\operatorname{Type},(\rightarrow,n),[\operatorname{Type}\;\top])\ \wedge\\
&\qquad ((m_{1})_{M}[2]=m\wedge l(m_{1})=(:,n_{1})\wedge t(m_{M}[1])=A\ \wedge\\
&\qquad\ m_{M}[2]=m_{2}\Rightarrow t((m_{1})_{M}[1])\preceq\prod a:A.m_{2})\ \vee\\
&\quad m=\operatorname{edge}(3,\operatorname{Type},(\operatorname{Eq},n),[\operatorname{Type}\;t_{n_{1}}\;t_{n_{1}}])\ \wedge\\
&\qquad m_{M}[1]=\operatorname{edge}(0,\operatorname{Type},(t_{n_{1}},0),[])\ \wedge\\
&\qquad ((m_{1})_{M}[2]=m\wedge l(m_{1})=(\operatorname{Eq},n_{1})\wedge l(m_{M}[1])=(T,n_{1})\ \wedge\\
&\qquad\ m_{M}[2]=A\wedge m_{M}[3]=B\Rightarrow t((m_{1})_{M}[1])=\operatorname{Eq}(T,A,B))\ \vee\\
&\quad m=\operatorname{edge}(2,\top,(\operatorname{transform},n),[\top\;\top])\ \vee\\
&\quad m=\operatorname{edge}(1,\mathcal{X},(@,n),[\top])\ \wedge\\
&\quad m=\operatorname{edge}(1,\mathcal{X},(\dagger,0),[\top])\ \wedge\\
&\quad m=\operatorname{edge}(0,\top,(\mathcal{S}\cup\mathcal{V}\cup\mathcal{T},n),[])\ \wedge\\
&\quad m=\operatorname{edge}(2,B,(\operatorname{fun-app},n),[A\rightarrow B^{\prime}\;A])\wedge B^{\prime}\preceq B\ \vee\\
&\quad m=\operatorname{edge}(2,B,(\operatorname{fun-app},n),[\prod a:A.m_{1}\;A])\ \wedge\\
&\qquad m_{1}[a=m_{M}[1]]\preceq B\ \vee\\
&\quad m=\operatorname{connect}(\_,\_,\_,\_,\_)\ \wedge\\
&\forall n,n_{1},n_{2}:\mathbb{N}.\ t_{n}\preceq\top\ \wedge\\
&\quad s_{n}:\top_{\operatorname{Type}}\vee s_{n}:\operatorname{Type}\ \wedge\\
&\quad v_{n}:\top_{\operatorname{Type}}\vee v_{n}:\operatorname{Type}\ \wedge\\
&\quad t_{n}:\operatorname{Type}\ \wedge\\
&\quad (t_{n_{1}}\preceq t_{n_{2}}\wedge t_{n_{2}}\preceq t_{n}\Rightarrow t_{n_{1}}\preceq t_{n})\ \wedge\\
&\quad t_{n}\preceq t_{n}\cup t_{n_{1}}\ \wedge\\
&\quad t_{n}\cap t_{n_{1}}\preceq t_{n}
\end{aligned}
\tag{7}
$$

where the notation $m_{M}[n]$ denotes the $n$'th target of subgraph $m$ in metagraph $M$, $t(m)$ and $l(m)$ denote the type and label of metagraph $m$ respectively, and we write $a:A$ as shorthand for 'there exists a $:$-edge in $M$ connecting $a$ and $A$'. We note that, for convenience, the above formulation does not include some constructions that may be appropriate in a full implementation, but can be derived from others. For instance, tuples can be constructed by introducing a dependent function $\operatorname{tuple}:\prod A,B:\operatorname{Type}.A\rightarrow B\rightarrow\operatorname{Type}$. The left and right projection functions are then defined by $\pi_{1}(\operatorname{tuple}(A,B,a,b))=a$ and $\pi_{2}(\operatorname{tuple}(A,B,a,b))=b$. Dependent sums can likewise be defined as dependent tuples, $\operatorname{tuple}^{\prime}:\prod A:\operatorname{Type}.\prod B:(A\rightarrow\operatorname{Type}).\prod a:A.B(a)\rightarrow\operatorname{Type}$.
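The final clauses of Eq. 7 require that every base type lies below the top type and that the subtype relation is transitive. As a rough illustration of what such a relation looks like for a finite set of type symbols, the following Python sketch computes the reflexive-transitive closure of a set of declared subtype pairs. The function name `close_subtyping`, the string `"Top"`, and the representation of the relation as a set of pairs are assumptions of this sketch; reflexivity is added because the relation is assumed to be a partial order, and union and intersection types are not handled.

```python
from itertools import product


def close_subtyping(base_types, declared):
    """Return the least reflexive, transitive relation containing `declared`,
    with every type below "Top". The relation is a set of (sub, super) pairs."""
    types = set(base_types) | {"Top"}
    for a, b in declared:
        types.update((a, b))
    rel = {(t, t) for t in types} | {(t, "Top") for t in types} | set(declared)
    changed = True
    while changed:
        changed = False
        # Add (a, d) whenever (a, b) and (b, d) are both present.
        for (a, b), (c, d) in product(list(rel), repeat=2):
            if b == c and (a, d) not in rel:
                rel.add((a, d))
                changed = True
    return rel


rel = close_subtyping({"t1", "t2", "t3"}, {("t1", "t2"), ("t2", "t3")})
```

In this example the pair `("t1", "t3")` is derived by transitivity, while `("t3", "t1")` is not in the relation. A rewrite step that introduces a new subtype edge would correspond to calling the function again with an enlarged `declared` set.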

3.1 Labeled transition system based on metagraph rewriting

In guarded cubical type theory, a guarded labeled transition system (GLTS) may be defined via a state-space $X$, a space of actions $A$, and a function mapping states to sets of (action, state) pairs, $f:X\rightarrow P_{\text{fin}}(A\times\triangleright X)$, where $P_{\text{fin}}$ is the finite powerset functor. The space of all processes, or runs of the GLTS, may then be defined as the final coalgebra of the following functor: $\text{Proc}=\operatorname{fix}X.P_{\text{fin}}(A\times\triangleright(\alpha:\mathbb{T}).X[\alpha])$ (see [10]). We characterize the computational dynamics of evaluation in $\mathbb{M}$ via a GLTS. Here, the state space is the space of all $\mathbb{M}$ metagraphs, $X=\mathcal{M}_{\mathbb{M}}$. The actions are specified by single pushout (SPO) rewriting rules, or sequences of such rules. We therefore introduce the type $\mathcal{A}^{\prime}=\mathcal{M}^{(L,R)}_{\mathbb{M}}\times\hom_{p}(\mathcal{M}_{\mathbb{M}})$, whose values $(M^{\prime},\phi)$ consist of two components. The first, $M^{\prime}$, is an $\mathbb{M}$ metagraph whose label set is $\mathcal{L}^{\prime}=\mathcal{L}\times\{L,R,LR\}\times\{[],*,**\}$, i.e. identical to the above, but with $L$ and $R$ labels added to each edge to indicate its membership of the left or right-hand side of the rule (notice that these may overlap), and $*$ and $**$ to indicate the input and output nodes of the rule (see below). The second, $\phi$, is a partial metagraph homomorphism between the $L$ and $R$ metagraphs of $M^{\prime}$ (defining a partial metagraph homomorphism as in [8]). Since we wish to allow sequences of rewrite rules as actions, we define the full action space to be $\mathcal{A}=\sum n:\mathbb{N}.\operatorname{Vec}(n,\mathcal{A}^{\prime})$, and write the members of $\mathcal{A}$ as $a_{1}\circ a_{2}\circ...\circ a_{n}$, where $a_{1},\ldots,a_{n}:\mathcal{A}^{\prime}$.
The dynamics are then defined (via $f$) by mapping a given metagraph state $M_{1}$ to the set of all pairs $(A,M_{2})$ such that $M_{2}$ results from an application of action $A$ to $M_{1}$. For individual rewrite rules $a\in\mathcal{A}^{\prime}$, their action is determined via a partial homomorphism between $a$ and $M_{1}$. We note that, when there are no partial homomorphisms between $a$ and $M_{1}$, or when the rewrite rule produces an invalid $\mathbb{M}$ graph, we set $M_{2}=M_{1}$. Further, we note that the update may change the $\preceq$ relation, for instance by introducing an edge of the form $t_{1}\preceq t_{2}$.
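The shape of a labeled transition system and its runs can be illustrated with a small Python sketch. This is only an analogy: the guarded fixed point is replaced by unfolding to a finite depth, states are integers rather than metagraphs, and the names `runs` and `toy_step`, together with the two toy actions, are invented for the example.

```python
from typing import Callable, FrozenSet, List, Tuple

State = int
Action = str
Step = Callable[[State], FrozenSet[Tuple[Action, State]]]


def runs(f: Step, state: State, depth: int) -> List[List[Action]]:
    """List the action traces obtained by unfolding f from `state`,
    stopping after `depth` steps or when a state has no successors."""
    if depth == 0:
        return [[]]
    successors = f(state)
    if not successors:
        return [[]]
    traces = []
    for action, nxt in sorted(successors):
        for tail in runs(f, nxt, depth - 1):
            traces.append([action] + tail)
    return traces


def toy_step(s: State) -> FrozenSet[Tuple[Action, State]]:
    """A toy transition function: below 4, a state can be incremented or doubled."""
    if s < 4:
        return frozenset({("inc", s + 1), ("dbl", 2 * s)})
    return frozenset()


traces = runs(toy_step, 3, 5)
```

The `depth` argument plays the role that the later modality plays in the guarded setting: each recursive call is only permitted one step further on. Unlike the dynamics described above, the toy system simply stops when no action applies, rather than setting $M_{2}=M_{1}$.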

3.2 $\mathbb{M}$-interpretation as metagraph dynamics

Figure 2: Metagraph rewriting rules. Notation as in Fig. 1. Subgraphs involving only one variable are not shown explicitly, but notated directly on the targets they are connected to. See Eqs. 8 and 9 for explicit expressions for the graphs.

We can now describe interpretation in $\mathbb{M}$ via the GLTS defined above. To do so, we map specific symbols/edges in the metagraph to actions in $\mathcal{A}$ (corresponding to the grounding domain $F$ in [8]). Specifically, edges carrying symbols of a function type, $A\rightarrow B$, dependent product type, $\prod a:A.B$, or the $\operatorname{transform}$ symbol, are mapped to specific forms of rewrite rule, as specified below. All other edges are mapped to the $\operatorname{null}$ transform. Fig. 2 specifies the general forms of the rewrite rules for function application and transform rules (we note that $\operatorname{transform}$ is equivalent to the 2-argument $\operatorname{match}$ keyword/function in the current version of the MeTTa language, see [16]). The dependent product rule is identical to Fig. 2a, with $A\rightarrow B$ replaced with $\prod a:A.m_{1}$. For explicitness, we give these below also in equational form. We note that, for convenience, variable names are denoted using $\$$, although these should ultimately be mapped to the names $v_{1},v_{2},...$.

$$
\begin{aligned}
R^{1}_{\operatorname{fun-app}} &= \operatorname{edge}(2,\$T_{2},(\operatorname{fun-app},\$n_{0},L),[\$T_{1}\rightarrow\$T_{2}\;\$T_{1}])\\
R^{2}_{\operatorname{fun-app}} &= \operatorname{edge}(2,\mathcal{J},(=,\$n_{1},LR),[\$T_{2}\;\$T_{2}])\\
R^{3}_{\operatorname{fun-app}} &= \operatorname{edge}(2,\$T_{2},(\operatorname{fun-app},\$n_{2},LR),[\$T_{1}\rightarrow\$T_{2}\;\$T_{1}])\\
R^{4}_{\operatorname{fun-app}} &= \operatorname{edge}(0,\$T_{1}\rightarrow\$T_{2},(\$f,\$n_{3},LR),[])\\
R^{5}_{\operatorname{fun-app}} &= \operatorname{edge}(0,\$T_{1},(\$v_{1},\$n_{4},LR),[])\\
R^{6}_{\operatorname{fun-app}} &= \operatorname{edge}(0,\$T_{2},(\$v_{2},\$n_{5},LR{*}{*}),[])\\
R^{7}_{\operatorname{fun-app}} &= \operatorname{connect}(\operatorname{connect}(R^{1}_{\operatorname{fun-app}},R^{4}_{\operatorname{fun-app}},\top,\operatorname{null},\{(1,0)\}),\\
&\qquad R^{5}_{\operatorname{fun-app}},\top,(\operatorname{null},\operatorname{null},*),\{(5,0)\})\\
R^{8}_{\operatorname{fun-app}} &= \operatorname{connect}(\operatorname{connect}(R^{3}_{\operatorname{fun-app}},R^{4}_{\operatorname{fun-app}},\top,\operatorname{null},\{(1,0)\}),\\
&\qquad R^{5}_{\operatorname{fun-app}},\top,\operatorname{null},\{(5,0)\})\\
R^{9}_{\operatorname{fun-app}} &= \operatorname{connect}(\operatorname{connect}(R^{2}_{\operatorname{fun-app}},R^{3}_{\operatorname{fun-app}},\top,\operatorname{null},\{(1,0)\}),\\
&\qquad R^{5}_{\operatorname{fun-app}},\top,\operatorname{null},\{(5,0)\})\\
R_{\operatorname{fun-app}} &= R^{7}_{\operatorname{fun-app}}\cup R^{8}_{\operatorname{fun-app}}\cup R^{9}_{\operatorname{fun-app}}
\end{aligned}
\tag{8}
$$

$$
\begin{aligned}
R^{1}_{\operatorname{transform}} &= \operatorname{edge}(2,\operatorname{Type},(\operatorname{transform},\$n_{0},L),[\top\;\top])\\
R^{2}_{\operatorname{transform}} &= \operatorname{connect}(\operatorname{connect}(R^{1}_{\operatorname{transform}},\$M_{1},\top,\operatorname{null},\{(1,0)\}),\\
&\qquad \$M_{2},\top,(\operatorname{null},\operatorname{null},L{*}),\{(5,0)\})\\
R^{3}_{\operatorname{transform}} &= \operatorname{edge}(2,\operatorname{Type},(\operatorname{tuple},\$n_{0},R{*}{*}),[\top\;\top])\\
R^{4}_{\operatorname{transform}} &= \$M^{\prime}_{1}\cup\$M^{\prime\prime}_{1}\cup\$M^{\prime}_{2}\cup\$M^{\prime\prime}_{2}\cup\operatorname{edge}(0,\top,(\operatorname{null},\operatorname{null},LR),[])\\
R^{5}_{\operatorname{transform}} &= \operatorname{connect}(R^{3}_{\operatorname{transform}},R^{4}_{\operatorname{transform}},\top,(\operatorname{null},\operatorname{null},\operatorname{null}),\{(1,1),(2,2)\})\\
R_{\operatorname{transform}} &= R^{2}_{\operatorname{transform}}\cup R^{5}_{\operatorname{transform}}
\end{aligned}
\tag{9}
$$

In Eq. 9, $M^{\prime}_{1}$ and $M^{\prime}_{2}$ denote metagraphs isomorphic to $M_{1}$ and $M_{2}$, using a disjoint set of variables, while $M^{\prime\prime}_{1}$ and $M^{\prime\prime}_{2}$ are defined similarly, with variables disjoint from the previous sets. The rule in Eq. 9 is defined so as to return a 2-tuple of matches; in general, the size of the tuple returned should be large enough to allow for any number of matches (i.e. the number of nodes in $M$), and if the number of matches is less than this, it will be padded with $\operatorname{null}$ values.
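The padding behaviour described here can be illustrated with a small Python sketch of pattern/template matching over a list of atoms. This is a loose analogy to the transform rule, not the SPO rewrite itself: atoms are nested tuples rather than metagraphs, variables are strings beginning with `$`, and the functions `match`, `substitute` and `transform`, along with the sample data, are hypothetical. Python's `None` stands in for the null value.

```python
def match(pattern, atom, bindings=None):
    """One-way match of a pattern against a ground atom.
    Returns a dict of variable bindings, or None if there is no match."""
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern.startswith("$"):
        if pattern in bindings and bindings[pattern] != atom:
            return None  # variable already bound to a different value
        bindings[pattern] = atom
        return bindings
    if (isinstance(pattern, tuple) and isinstance(atom, tuple)
            and len(pattern) == len(atom)):
        for p, a in zip(pattern, atom):
            bindings = match(p, a, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == atom else None


def substitute(template, bindings):
    """Replace the variables in a template by their bound values."""
    if isinstance(template, str) and template.startswith("$"):
        return bindings.get(template, template)
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


def transform(space, pattern, template, size):
    """Return a tuple of exactly `size` results, padded with None."""
    results = []
    for atom in space:
        b = match(pattern, atom)
        if b is not None:
            results.append(substitute(template, b))
    results = results[:size]
    return tuple(results) + (None,) * (size - len(results))


space = [("parent", "ann", "bob"), ("parent", "bob", "cy"), ("age", "ann", 40)]
out = transform(space, ("parent", "$x", "$y"), ("child", "$y", "$x"), 3)
```

With three atoms in the space and two matches, `out` has two instantiated templates followed by a single `None`, mirroring the fixed-size, null-padded tuple of matches.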

fun-app nodes. For a given annotated $\operatorname{fun-app}$ node, i.e. $\operatorname{connect}(@,F,\operatorname{null},\{(1,0)\})$, where $@=\operatorname{edge}(1,\mathcal{X},(@,n),[\top])$ and $F$ is a graph consisting of a target $\operatorname{fun-app}$ node and its two arguments, the full rewrite rule $\operatorname{rewrite}_{F}$ is found by forming a metagraph homomorphism between $R^{7}_{\operatorname{fun-app}}$ (labeled by $*$ as the input of the rule) and $F$, replacing the variables in $R_{\operatorname{fun-app}}$ by their values in $F$. The resulting graph is denoted $R_{\operatorname{fun-app}}(F)$. The rule $\operatorname{rewrite}_{F}$ is then defined by the subgraphs $L=\operatorname{connect}(\$M_{0},\operatorname{connect}(@,l_{\operatorname{fun-app}}(F),\operatorname{null},\{(1,0)\}),\operatorname{null},\{(n_{1},m_{1}),(n_{2},m_{1}),...\})$ and $R=\operatorname{connect}(\$M_{0},r_{\operatorname{fun-app}}(F),\operatorname{null},\{(n_{1},m_{2}),(n_{2},m_{2}),...\})$, where $M_{0}$ is the graph of all nodes in $M$ targeting $F$, $m_{1}$ is the index of the $\operatorname{fun-app}$ node in $F$, $m_{2}$ is the index of the $**$ output node in $r_{\operatorname{fun-app}}(F)$, and $\phi$ is defined by the partial homomorphism consisting of the identity map on all nodes labeled $LR$.

transform nodes. For a given annotated $\operatorname{transform}$ node, the full rewrite rule is defined similarly. Hence, for $\operatorname{connect}(@,F,\operatorname{null},\{(1,0)\})$, where $@=\operatorname{edge}(1,\mathcal{X},(@,n),[\top])$ and $F$ is a graph consisting of a target $\operatorname{transform}$ node and its two arguments, the full rewrite rule $\operatorname{rewrite}_{F}$ is found by forming a metagraph homomorphism between $R^{2}_{\operatorname{transform}}$ (labeled by $*$ as the input of the rule) and $F$, replacing the variables in $R_{\operatorname{transform}}$ by their values in $F$. The resulting graph is denoted $R_{\operatorname{transform}}(F)$. The rule $\operatorname{rewrite}_{F}$ is then defined by the subgraphs $L=\operatorname{connect}(\$M_{0},\operatorname{connect}(@,l_{\operatorname{transform}}(F),\operatorname{null},\{(1,0)\}),\operatorname{null},\{(n_{1},m_{1}),(n_{2},m_{1}),...\})$ and $R=\operatorname{connect}(\$M_{0},r_{\operatorname{transform}}(F),\operatorname{null},\{(n_{1},m_{2}),(n_{2},m_{2}),...\})$, where $M_{0}$ is the graph of all nodes in $M$ targeting $F$, $m_{1}$ is the index of the $\operatorname{transform}$ node in $F$, $m_{2}$ is the index of the $**$ output node in $r_{\operatorname{transform}}(F)$, and $\phi$ is defined by the partial homomorphism consisting of the identity map on all nodes labeled $LR$.

$\mathbb{M}$-evaluation. The above provides groundings for activated nodes in a metagraph; as noted, nodes not of the form above result in a $\operatorname{null}$ update. Evaluation in $\mathbb{M}$ involves repeatedly updating the current pointed metagraph according to the grounding of the node currently pointed to. The conditions in Eq. 3 imply there will be at most one edge labeled with $\dagger$ in a metagraph, whose target $F$ specifies the rule by which the graph is updated. This is expressed via the single partial function $\operatorname{update}:\mathcal{M}_{\mathbb{M}}\rightarrow\mathcal{M}_{\mathbb{M}}$. The action of $\operatorname{update}$ is determined by the form of $F$. If $F$ is not an activated subgraph, i.e. it is not the target of an $@$-edge, $\operatorname{update}$ cannot be applied (i.e. evaluation halts). If, however, $F$ is the target of an $@$-edge, $\operatorname{update}$ first checks whether $F$ itself has any activated targets. If so, $\operatorname{update}$ simply applies a graph rewrite which moves the pointer $\dagger$ to the first such activated target (in the ordering of the edge). If not, $\operatorname{update}$ applies $\operatorname{rewrite}_{F}$, which automatically ensures that the update will finish with $\dagger$ pointing to the output subgraph, labeled **. These dynamics define a reduced GLTS, with $X=\mathcal{M}_{\mathbb{M}}$, $A=\{\operatorname{update}\}$, and $f(M)=\{(\operatorname{update},M^{\prime})\;|\;\operatorname{update}(M)=M^{\prime}\}$. Note that there may be multiple $M^{\prime}$'s for which $\operatorname{update}(M)=M^{\prime}$ if $\operatorname{rewrite}_{F}$ for a $\operatorname{fun-app}$ node is non-deterministic. Processes are defined by the fixed point $\text{Proc}=\nu(P_{\text{fin}}(A\times\triangleright X))$.
Normal forms of $\mathcal{M}_{\mathbb{M}}$ are metagraphs to which $\operatorname{update}$ cannot be applied (i.e. their grounding is $\operatorname{null}$). Processes which reach a normal form are said to be terminating, and the initial expression of the process is said to evaluate to the normal form reached. Alternatively, certain expressions may not reach a normal form, resulting instead in a non-terminating computation.
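
The reduced GLTS described above can be illustrated with a minimal sketch in Python. It is not the metagraph implementation: states are arbitrary hashable values standing in for pointed metagraphs, `toy_update` is a hypothetical stand-in for $\operatorname{update}$ that returns the list of possible successors (an empty list meaning that no rule applies), and the step budget is an assumption used to cut off computations that may not terminate.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

State = Hashable
UPDATE = "update"  # the single action label of the reduced GLTS


def glts_step(update: Callable[[State], Iterable[State]],
              m: State) -> Set[Tuple[str, State]]:
    """f(M): the set of (action, successor) pairs reachable in one update."""
    return {(UPDATE, m2) for m2 in update(m)}


def normal_forms(update: Callable[[State], Iterable[State]],
                 start: State, max_steps: int = 1000) -> Set[State]:
    """Collect every normal form reachable from `start` within the budget.

    A state is a normal form when `update` yields no successor. Paths still
    running when the budget runs out are dropped, which stands in for
    (possibly) non-terminating computations.
    """
    found: Set[State] = set()
    frontier = {start}
    for _ in range(max_steps + 1):
        next_frontier = set()
        for m in frontier:
            successors = glts_step(update, m)
            if not successors:
                found.add(m)
            next_frontier.update(m2 for _, m2 in successors)
        if not next_frontier:
            break
        frontier = next_frontier
    return found


def toy_update(n: int):
    """Hypothetical update: integers play the role of pointed metagraphs."""
    if n <= 1:
        return []              # no rule applies: n is a normal form
    if n % 2 == 0:
        return [n // 2, n - 2]  # two applicable rewrites: non-deterministic
    return [n - 2]
```

Here `normal_forms(toy_update, 6)` explores both branches of each non-deterministic step, while a state that only rewrites to itself yields no normal form within any budget.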

4 Bisimulation of type systems in $\mathbb{M}$

As described in [10], in guarded cubical type theory, a bisimulation $R:X\rightarrow X\rightarrow U$ for the GLTS $(X,A,f)$ may be defined via the following dependent type:

\begin{equation}
\begin{aligned}
\operatorname{isGLTSBisim}_{f}R = {} & \prod x,y:X.\;R(x,y)\rightarrow \\
& \Big(\prod x^{\prime}:\triangleright X.\;\prod a:A.\;(a,x^{\prime})\in f(x)\rightarrow\exists y^{\prime}:\triangleright X.\;\prod a:A.\;(a,y^{\prime})\in f(y)\times\triangleright(\alpha:\mathbb{T}).R(x^{\prime}[\alpha])(y^{\prime}[\alpha])\Big)\times \\
& \Big(\prod y^{\prime}:\triangleright X.\;\prod a:A.\;(a,y^{\prime})\in f(y)\rightarrow\exists x^{\prime}:\triangleright X.\;\prod a:A.\;(a,x^{\prime})\in f(x)\times\triangleright(\alpha:\mathbb{T}).R(x^{\prime}[\alpha])(y^{\prime}[\alpha])\Big).
\end{aligned}
\tag{10}
\end{equation}

As shown in [10], this type is equivalent to the path type over the recursive data type of processes defined by the GLTS, $\text{Proc}=\operatorname{fix}X.P_{\text{fin}}(A\times\triangleright(\alpha:\mathbb{T}).X[\alpha])$. We may further define a bisimulation $R_{2}:X_{1}\rightarrow X_{2}\rightarrow U$ between two GLTS's over a common action space, $(X_{1},A,f_{1})$ and $(X_{2},A,f_{2})$, via a bisimulation over their coproduct (see [1]):

\begin{equation}
\begin{aligned}
\operatorname{is2GLTSBisim}_{f_{1},f_{2}}R_{2} = {} & \operatorname{isGLTSBisim}_{f_{1}+f_{2}}R^{\prime}_{2}\times\forall x_{1}:X_{1}.\exists x_{2}:X_{2}.R_{2}(x_{1},x_{2})\times \\
& \forall x_{2}:X_{2}.\exists x_{1}:X_{1}.R_{2}(x_{1},x_{2})
\end{aligned}
\tag{11}
\end{equation}

where $R^{\prime}_{2}:(X_{1}+X_{2})\rightarrow(X_{1}+X_{2})\rightarrow U$, with $R^{\prime}_{2}((a,x),(b,y))=R_{2}(x,y)$ when $a=1\wedge b=2$ and $R^{\prime}_{2}(x,y)=\bot$ otherwise, and $f_{1}+f_{2}:(X_{1}+X_{2})\rightarrow\mathcal{P}(A\times(X_{1}+X_{2}))$ is defined similarly. Since $R_{2}$ contains at least one matching element for each $x_{1}$ and each $x_{2}$, we may extract functions $g_{1}:X_{1}\rightarrow X_{2}$ and $g_{2}:X_{2}\rightarrow X_{1}$ as subsets of $R_{2}$, where an element in the codomain of each is chosen arbitrarily when there are multiple matches in $R_{2}$. Since bisimulation corresponds to path equivalence for elements of each type, for $g_{1}$ and $g_{2}$ we can choose $\pi_{1}$ and $\pi_{2}$ such that $g_{1}\circ g_{2}\circ\pi_{1}=i_{1}$ and $g_{2}\circ g_{1}\circ\pi_{2}=i_{2}$, where $i_{1}$ and $i_{2}$ are the identity on $X_{1}$ and $X_{2}$ respectively, and $\pi_{1}(x)=x^{\prime}\Rightarrow\exists p:\operatorname{Path}_{X_{1}}(x,x^{\prime})$, $\pi_{2}(x)=x^{\prime}\Rightarrow\exists p:\operatorname{Path}_{X_{2}}(x,x^{\prime})$. Hence, $(g_{1},g_{2})$ is an equivalence between the recursive process types $\text{Proc}_{1}$ and $\text{Proc}_{2}$ of the two GLTS's, meaning that $\operatorname{Path}_{U}(\text{Proc}_{1},\text{Proc}_{2})$ is inhabited by univalence.
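
For intuition, the coproduct construction of Eq. 11 can be sketched for finite, classical labeled transition systems in Python. This is only an analogue: it ignores the guarded (later) modality and path types, represents each system as a hypothetical dictionary from states to sets of (action, successor) pairs in which every successor also appears as a key, and computes the greatest bisimulation by naive refinement.

```python
from itertools import product


def disjoint_union(f1, f2):
    """Coproduct f1 + f2: tag states with 1 or 2 to keep the systems apart."""
    f = {}
    for tag, fi in ((1, f1), (2, f2)):
        for x, succs in fi.items():
            f[(tag, x)] = {(a, (tag, y)) for a, y in succs}
    return f


def greatest_bisimulation(f):
    """Largest bisimulation on a finite LTS {state: {(action, successor)}}."""
    rel = set(product(f, f))
    changed = True
    while changed:
        changed = False
        for x, y in list(rel):
            # every move of x is matched by a move of y, and vice versa
            forth = all(any(a == b and (x2, y2) in rel for b, y2 in f[y])
                        for a, x2 in f[x])
            back = all(any(a == b and (x2, y2) in rel for a, x2 in f[x])
                       for b, y2 in f[y])
            if not (forth and back):
                rel.discard((x, y))
                changed = True
    return rel


def bisimilar_systems(f1, f2):
    """Check the two totality conditions of Eq. 11 on the union's bisimulation."""
    rel = greatest_bisimulation(disjoint_union(f1, f2))
    r2 = {(x, y) for (tx, x), (ty, y) in rel if tx == 1 and ty == 2}
    return (all(any((x, y) in r2 for y in f2) for x in f1)
            and all(any((x, y) in r2 for x in f1) for y in f2))


# Hypothetical example systems over the single action "update".
CYCLE = {"p": {("update", "q")}, "q": {("update", "p")}}
LOOP = {"s": {("update", "s")}}
HALT = {"t": {("update", "u")}, "u": set()}
```

A two-state cycle and a one-state loop are related by this check, whereas a system that halts after one step is not related to either.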

For a given type system, its computational content may be modeled by a GLTS by setting $X$ to be the type of expressions in the system, $A$ to contain an $\operatorname{update}$ action along with 'actions' corresponding to the judgmental and syntactic relations between expressions (e.g. is-of-type, is-of-subtype, is-a-body-of-lambda-term, and their opposite relations), and $f$ to be the relation over expressions corresponding to the reduction relation in the system for the action $\operatorname{update}$ (for instance $\beta$-reduction). To show that $\mathbb{M}$ can be used as a metalanguage for a given type system, we thus show that there is a bisimulation between two systems: $\mathbb{M}$ with a specific form of Atomspace (i.e. containing specific atoms and/or additional constraints to those of Eq. 3), along with an expanded action space to incorporate the typing and syntactic relations relevant to the specific system; and the GLTS corresponding to computation in the target type system. Hence the process spaces induced by the two systems are equivalent. Below, we sketch how this can be achieved for three type systems of interest, focusing on how the computational dynamics of the $\operatorname{update}$ rule correspond to reduction in the target system (the typing and syntactic relations in each system straightforwardly correspond in $\mathbb{M}$ to the inbuilt typing relation and relationships definable in terms of submetagraph composition respectively).

4.1 Simply typed lambda calculus

The syntax for the simply typed lambda calculus may be defined via mutually recursive definitions of variable, type and expression datatypes:

\begin{equation}
\begin{aligned}
\mathcal{V} &::= v_{n} \\
\mathcal{T} &::= t_{n}\;\;|\;\;\mathcal{T}\rightarrow\mathcal{T} \\
\mathcal{E} &::= \mathcal{V}\;\;|\;\;(\mathcal{E}\;\mathcal{E})\;\;|\;\;\lambda v_{n}:\mathcal{T}.\mathcal{E}
\end{aligned}
\tag{12}
\end{equation}

We refrain from explicitly stating the rules for type assignment, which can be found in [2]; these determine a typing relation $\_:\_$ between $\mathcal{E}$ and $\mathcal{T}$ given a context $\Gamma$, which can be modeled as a partial map from $\mathcal{V}$ to $\mathcal{T}$. Together, these determine a set of valid expressions, $\mathcal{E}_{(\_:\_,\Gamma)}$, and the computational dynamics are defined by the $\beta$-reduction relation over this type:

\begin{equation}
((\lambda v_{n_{1}}:t_{n_{2}}.e_{n_{3}})\;e_{n_{4}})\rightarrow_{\beta}e_{n_{3}}[v_{n_{1}}/e_{n_{4}}]
\tag{13}
\end{equation}

where $a[b/c]$ denotes substitution of $c$ for $b$ in $a$, where any bound variables in $c$ are renamed so as not to clash with bound variables in $a$.
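
The $\beta$-rule of Eq. 13 can be sketched in Python as follows. The tuple encoding of terms (`("var", name)`, `("app", f, a)`, `("lam", name, type, body)`) and the function names are assumptions made for illustration. The sketch avoids capture by renaming the binder of the term being substituted into, which is the usual formulation, and it uses a leftmost-outermost strategy with a step budget.

```python
import itertools


def free_vars(e):
    """Free variable names of a term."""
    if e[0] == "var":
        return {e[1]}
    if e[0] == "app":
        return free_vars(e[1]) | free_vars(e[2])
    _, v, _ty, body = e
    return free_vars(body) - {v}


def subst(a, v, c):
    """a[v/c]: replace free occurrences of variable v in a by the term c."""
    if a[0] == "var":
        return c if a[1] == v else a
    if a[0] == "app":
        return ("app", subst(a[1], v, c), subst(a[2], v, c))
    _, w, ty, body = a
    if w == v:
        return a  # v is shadowed by this binder
    if w in free_vars(c):
        # rename the binder to a name unused in both c and the body
        used = free_vars(c) | free_vars(body)
        w2 = next(f"{w}{i}" for i in itertools.count()
                  if f"{w}{i}" not in used)
        body = subst(body, w, ("var", w2))
        w = w2
    return ("lam", w, ty, subst(body, v, c))


def step(e):
    """One leftmost-outermost beta step, or None if e is in normal form."""
    if e[0] == "app":
        f, arg = e[1], e[2]
        if f[0] == "lam":
            return subst(f[3], f[1], arg)
        f2 = step(f)
        if f2 is not None:
            return ("app", f2, arg)
        a2 = step(arg)
        return None if a2 is None else ("app", f, a2)
    if e[0] == "lam":
        b2 = step(e[3])
        return None if b2 is None else ("lam", e[1], e[2], b2)
    return None


def normalize(e, max_steps=1000):
    """Iterate `step` until a normal form is reached or the budget runs out."""
    for _ in range(max_steps):
        e2 = step(e)
        if e2 is None:
            return e
        e = e2
    raise RuntimeError("no normal form within the step budget")


IDENTITY = ("lam", "x", "t", ("var", "x"))
CONST = ("lam", "x", "t", ("lam", "y", "t", ("var", "x")))
```

Applying `CONST` to the free variable `y` exercises the renaming branch: the inner binder is renamed so that the argument is not captured.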

To simulate the simply typed lambda calculus in $\mathbb{M}$, we restrict the $\mathbb{M}$ atomspace to include only metagraphs labeled with types using the restricted type syntax of Eq. 12, and including only keywords/symbols $\{:,=,\rightarrow,\operatorname{fun-app},@,\dagger\}$. Then, we add the following constraint to those of Eq. 3:

\begin{equation}
\forall m\in M.\;l(m)=(:,n_{1})\Rightarrow(m_{M}[1]\in\mathcal{S}\cup\mathcal{V})\wedge m_{M}[2]\in\mathcal{T}
\tag{14}
\end{equation}

Hence, all typing relations are between symbols or variables (representing global and local variables respectively) and types. The context $\Gamma$ is then represented by an atomspace consisting of a set of $:$ edges between symbols and types. A given lambda expression $e=\lambda x:t_{1}.e^{\prime}$, where $e^{\prime}:t_{2}$, is then simulated by choosing an unused symbol, $f_{e}\in\mathcal{S}$, and introducing the following atoms to the atomspace:

\begin{equation}
\begin{aligned}
& (:\;f_{e}\;(\rightarrow\;t_{1}\;t_{2})) \\
& (=\;(f_{e}\;\$x)\;m_{e^{\prime}})
\end{aligned}
\tag{15}
\end{equation}

where $m_{e^{\prime}}$ is the metagraph corresponding to expression $e^{\prime}$ (we note that Eq. 15 defines a combinator corresponding to the lambda term $e$). With the atomspace so specified, reduction of an expression $e$ in context $\Gamma$ in the simply typed lambda calculus corresponds to repeated application of $\operatorname{update}$ to the pointed atomspace containing $\Gamma$ and $m_{e}$, with $@$ edges attached to all function application nodes, and $\dagger$ pointing to $m_{e}$. The computation terminates with $\dagger$ pointing to the normal form of $e$. The required bisimulation thus involves pairing tuples $(\Gamma,e)$ in the simply typed lambda calculus with their corresponding pointed atomspaces in $\mathbb{M}$. We note further that the untyped lambda calculus can be defined by simply removing $\mathcal{T}$ from the syntax in Eq. 12, and letting lambda expressions take the form $\lambda v_{n}.\mathcal{E}$. All members of $\mathcal{E}$ are considered legal expressions, and the $\mathbb{M}$ bisimulation is achieved by converting all type symbols to $\top_{\operatorname{Type}}$, hence treating $\top_{\operatorname{Type}}$ as a Scott domain.
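
The combinator encoding of Eq. 15 can be illustrated by a small Python sketch in which atoms are nested tuples and variables are strings beginning with `$`. This is a simplification under stated assumptions: the $@$ edges and the $\dagger$ pointer are omitted, evaluation is innermost and takes the first applicable `=` atom, and the symbols `g`, `a`, `f_e` and `f_2` are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("$")


def match(pattern, term, env):
    """Extend env so that pattern instantiated by env equals term, else None."""
    if is_var(pattern):
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if (isinstance(pattern, tuple) and isinstance(term, tuple)
            and len(pattern) == len(term)):
        for p, t in zip(pattern, term):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None


def instantiate(t, env):
    """Replace the variables of t by their bindings in env."""
    if is_var(t):
        return env.get(t, t)
    if isinstance(t, tuple):
        return tuple(instantiate(s, env) for s in t)
    return t


def rewrites(atomspace, term):
    """All results of applying one '=' atom at the root of term."""
    out = []
    for atom in atomspace:
        if isinstance(atom, tuple) and len(atom) == 3 and atom[0] == "=":
            env = match(atom[1], term, {})
            if env is not None:
                out.append(instantiate(atom[2], env))
    return out


def evaluate(atomspace, term, fuel=100):
    """Innermost evaluation using the first applicable '=' atom."""
    if fuel == 0:
        raise RuntimeError("step budget exhausted")
    if isinstance(term, tuple):
        term = tuple(evaluate(atomspace, s, fuel - 1) for s in term)
    results = rewrites(atomspace, term)
    return evaluate(atomspace, results[0], fuel - 1) if results else term


# Context: a : t1 and g : t1 -> t1. The lambda terms (lambda x:t1. g x) and
# (lambda x:t1. f_e (f_e x)) are encoded by the fresh symbols f_e and f_2.
SPACE = [
    (":", "a", "t1"),
    (":", "g", ("->", "t1", "t1")),
    (":", "f_e", ("->", "t1", "t1")),
    ("=", ("f_e", "$x"), ("g", "$x")),
    (":", "f_2", ("->", "t1", "t1")),
    ("=", ("f_2", "$x"), ("f_e", ("f_e", "$x"))),
]
```

Evaluating `("f_2", "a")` in `SPACE` unfolds both combinators and stops at an expression built only from the context symbols, which plays the role of the normal form.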

4.2 Pure Type Systems

In a pure type system (PTS, [2]), types and terms are not distinguished syntactically. PTS expressions follow the syntax:

\begin{equation}
\begin{aligned}
\mathcal{V} &::= v_{n} \\
\mathcal{C} &::= s_{1...N} \\
\mathcal{E} &::= \mathcal{V}\;\;|\;\;\mathcal{C}\;\;|\;\;(\mathcal{E}\;\mathcal{E})\;\;|\;\;\lambda v_{n}:\mathcal{E}.\mathcal{E}\;\;|\;\;\prod v_{n}:\mathcal{E}.\mathcal{E}
\end{aligned}
\tag{16}
\end{equation}

Here, $\mathcal{C}$ is a set of constant symbols, which in a PTS are used to represent sorts. The typing relation $:$ for a PTS is defined via a set of axioms and rules. The former consist of a set of judgements $\mathcal{A}=\{s_{m}:s_{n}\;|\;(m,n)\in A\subset N\times N\}$, and the latter a set of triplets $\mathcal{R}=\{(s_{l},s_{m},s_{n})\;|\;(l,m,n)\in R\subset N\times N\times N\}$. The typing rules for a PTS are identical to those of the typed lambda calculus, except for the introduction rule for dependent products, which takes the form:

\begin{equation*}
\frac{\Gamma\vdash A:s_{l}\;\;\;\;\Gamma,A:s_{l}\vdash B:s_{m}\;\;\;\;(s_{l},s_{m},s_{n})\in\mathcal{R}}{\Gamma\vdash(\prod x:A.B):s_{n}}
\end{equation*}

The legal expressions then consist of the sorts, and any expression that can be typed in a context $\Gamma$ consisting of multiple typing judgments $e_{1}:e_{2}$. The $\beta$-reduction relation is established identically to the simply typed lambda calculus above. Notice that there is no restriction on the form of $\mathcal{A}$ and $\mathcal{R}$; hence the typing relation $:$ may be arbitrary between sorts (and hence may contain cycles), while the dependent products (i.e. dependent function types) may live in arbitrary sorts with respect to their inputs.
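
A PTS specification of this kind is just a triple of finite sets, which the following Python sketch makes concrete. The class and method names are hypothetical; the example instances are the standard presentations of the simply typed lambda calculus and the calculus of constructions as pure type systems, plus a one-sort system with a cyclic axiom, with the sort $\square$ written as the string `"box"`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PTS:
    sorts: frozenset
    axioms: frozenset  # pairs (s_m, s_n), read as s_m : s_n
    rules: frozenset   # triples (s_l, s_m, s_n)

    def product_sorts(self, s_l, s_m):
        """Sorts that may type a product with domain in s_l, codomain in s_m."""
        return {n for (l, m, n) in self.rules if (l, m) == (s_l, s_m)}

    def has_cyclic_axioms(self):
        """True if the ':' relation on sorts contains a cycle (e.g. s : s)."""
        succ = {}
        for m, n in self.axioms:
            succ.setdefault(m, set()).add(n)

        def reachable(start):
            seen, stack = set(), list(succ.get(start, ()))
            while stack:
                s = stack.pop()
                if s not in seen:
                    seen.add(s)
                    stack.extend(succ.get(s, ()))
            return seen

        return any(s in reachable(s) for s in succ)


STLC = PTS(
    sorts=frozenset({"*", "box"}),
    axioms=frozenset({("*", "box")}),
    rules=frozenset({("*", "*", "*")}),
)

COC = PTS(
    sorts=frozenset({"*", "box"}),
    axioms=frozenset({("*", "box")}),
    rules=frozenset({("*", "*", "*"), ("*", "box", "box"),
                     ("box", "*", "*"), ("box", "box", "box")}),
)

TYPE_IN_TYPE = PTS(
    sorts=frozenset({"*"}),
    axioms=frozenset({("*", "*")}),
    rules=frozenset({("*", "*", "*")}),
)
```

The `product_sorts` lookup corresponds to the side condition $(s_{l},s_{m},s_{n})\in\mathcal{R}$ of the product rule, and `has_cyclic_axioms` detects specifications whose sort hierarchy is not well-founded.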

To simulate a PTS in $\mathbb{M}$, we select a collection of fixed types $t_{1}...t_{N}$ to represent the sorts. We then add edges of the following forms to the atomspace:

\begin{equation*}
\begin{aligned}
& (:\;t_{m}\;t_{n}),\;\;\forall(s_{m},s_{n})\in\mathcal{A} \\
& (:\;(\rightarrow\;\$t_{a}\;\$t_{b})\;(\operatorname{transform}\;(:\;\$t_{a}\;t_{l})\wedge(:\;\$t_{b}\;t_{m})\;t_{n})),\;\;\forall(s_{l},s_{m},s_{n})\in\mathcal{R} \\
& (:\;(\prod\$x:\$t_{a}.\$m)\;(\operatorname{transform}\;(:\;\$t_{a}\;t_{l})\wedge(:\;\$m\;t_{m})\;t_{n})),\;\;\forall(s_{l},s_{m},s_{n})\in\mathcal{R}
\end{aligned}
\end{equation*}

As above, lambda expressions are simulated by adding atoms of the form in Eq. 15 to the atomspace, and a context $\Gamma$ is simulated by adding atoms corresponding to the typing relations it contains. Reduction of expression $e$ in context $\Gamma$ is simulated as previously by applying $\operatorname{update}$ to the pointed atomspace consisting of $\{\Gamma,e\}$ and the above constructions, along with $\dagger$ pointing to $e$. Further, we note that PTS's can be regarded as a type-theoretic analogue of non-well-founded sets; from this viewpoint, a cyclical $:$ relation corresponds to an accessible pointed graph (apg) underlying a non-well-founded set. For instance, including the axiom $s_{1}:s_{1}$ in $\mathcal{A}$ defines $s_{1}$ as a type-theoretic analogue of a Quine atom. We note, however, that in the type-theoretic context, a cyclic PTS carries more structure than a non-well-founded set, since the rules ($\mathcal{R}$) carry information about how the $\rightarrow$ constructor interacts with the $:$ relation. An interesting conjecture, though, would be that appropriately defined PTS's provide bisimulations of systems of non-well-founded sets definable within a recursive datatype (via a coalgebra on the powerset functor, definable in GCTT), as a general system of set equations ([3]) involving both $\in$ and $\rightarrow$ relations.

4.3 Probabilistic dependent types

Finally, we outline a version of the probabilistic dependent type system introduced in [19], and its bisimulation in $\mathbb{M}$. The syntax is a variation on the dependently typed lambda calculus:

\begin{equation*}
\begin{aligned}
\mathcal{V} &::= v_{n} \\
\mathcal{T} &::= t_{n}\;\;|\;\;\prod v_{n}:\mathcal{T}.\mathcal{E}\;\;|\;\;\mathcal{D}(\mathcal{T})\;\;|\;\;\mathcal{T}\cup\mathcal{T}\;\;|\;\;\mathcal{T}\cap\mathcal{T}\;\;|\;\;\operatorname{Type} \\
\mathcal{E} &::= \mathcal{V}\;\;|\;\;(\mathcal{E}\;\mathcal{E})\;\;|\;\;\lambda v_{n}:\mathcal{T}.\mathcal{E}\;\;|\;\;\operatorname{random}_{\rho}(\mathcal{E},\mathcal{E})\;\;|\;\;\operatorname{sample}(\mathcal{E})\;\;|\;\;\operatorname{thunk}(\mathcal{E})
\end{aligned}
\end{equation*}

Further, we allow the judgments $\mathcal{E}:\mathcal{T}$ (typing), $\mathcal{T}\preceq\mathcal{T}$ (subtyping), and $\mathcal{E}\rightarrow_{\beta}^{\rho}\mathcal{E}$ (weighted $\beta$-reduction), where $\rho\in\mathbb{R}$. The typing rules are as for the dependently typed lambda calculus for expressions not involving subtypes or probabilistic terms. The typing rules for subtypes include the standard $\Gamma\vdash a:A,\;A\preceq B\Rightarrow\Gamma\vdash a:B$; $\Gamma\vdash A,B:\operatorname{Type}\Rightarrow\Gamma\vdash A\cap B\preceq A,\;A\cap B\preceq B,\;A\preceq A\cup B,\;B\preceq A\cup B$; $\Gamma\vdash A\preceq B\Rightarrow\prod v_{n}:B.\mathcal{E}\preceq\prod v_{n}:A.\mathcal{E}$; and $\Gamma,x:t\vdash A\preceq B\Rightarrow\prod x:t.A\preceq\prod x:t.B$. These interact with the probabilistic terms via the following special rules:

\begin{equation*}
\frac{\Gamma\vdash a:t_{1},\;b:t_{2}}{\Gamma\vdash\operatorname{random}_{\rho}(a,b):t_{1}\cup t_{2}}
\end{equation*}

\begin{equation*}
\frac{\Gamma\vdash A:\operatorname{Type},\;p_{A}:\mathcal{D}(A)}{\Gamma\vdash\operatorname{sample}(p_{A}):A}
\end{equation*}

\begin{equation*}
\frac{\Gamma\vdash a:A}{\Gamma\vdash\operatorname{thunk}(a):\mathcal{D}(A)}
\end{equation*}

where we note that $\mathcal{D}(A)$ denotes the type of distributions over $A$ (so, for instance, if $a:t_{1},\;b:t_{2}$, then $\operatorname{thunk}(\operatorname{random}_{\rho}(a,b)):\mathcal{D}(t_{1}\cup t_{2})$). For all expressions not involving probabilistic terms, $e_{1}\rightarrow_{\beta}e_{2}$ in the dependently typed lambda calculus implies $e_{1}\rightarrow_{\beta}^{1}e_{2}$ in the PDTS above. For probabilistic terms, we have the following computational rules:

\begin{equation}
\begin{aligned}
\operatorname{random}_{\rho}(a,b) &\rightarrow_{\beta}^{\rho} a \\
\operatorname{random}_{\rho}(a,b) &\rightarrow_{\beta}^{1-\rho} b \\
\operatorname{sample}(\operatorname{thunk}(p_{A})) &\rightarrow_{\beta}^{1} p_{A}
\end{aligned}
\tag{19}
\end{equation}

Computationally, evaluation may proceed by stochastic $\beta$-reduction (i.e. sampling a reduction according to the weights $\rho$), or a 'full evaluation' may be made, by returning the set of all possible reduction sequences from a term, annotated with the total probability of each. We note that in any given reduction sequence, $e_{1}\rightarrow_{\beta}^{\rho}e_{2}$ for $\rho>0$ implies $t_{2}\preceq t_{1}$ where $e_{1}:t_{1},\;e_{2}:t_{2}$.
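
The two evaluation modes can be contrasted in a short Python sketch restricted to the probabilistic constructs. The tuple encoding (`("random", rho, a, b)`, `("sample", e)`, `("thunk", e)`, with strings as atomic values) and the function names are assumptions; thunks are treated as values that are only opened by `sample`, and for brevity the full evaluation aggregates probabilities per normal form rather than listing each reduction sequence.

```python
from fractions import Fraction
import random


def full_eval(e):
    """Return {normal form: total probability} over all reduction sequences."""
    if isinstance(e, tuple) and e[0] == "random":
        _, rho, a, b = e
        dist = {}
        for weight, branch in ((rho, a), (1 - rho, b)):
            if weight == 0:
                continue  # only reductions with positive weight are followed
            for nf, p in full_eval(branch).items():
                dist[nf] = dist.get(nf, 0) + weight * p
        return dist
    if isinstance(e, tuple) and e[0] == "sample":
        dist = {}
        for inner, p in full_eval(e[1]).items():
            if isinstance(inner, tuple) and inner[0] == "thunk":
                for nf, q in full_eval(inner[1]).items():
                    dist[nf] = dist.get(nf, 0) + p * q
            else:
                stuck = ("sample", inner)  # no rule applies
                dist[stuck] = dist.get(stuck, 0) + p
        return dist
    return {e: Fraction(1)}  # atoms and thunks are values


def sample_eval(e, rng):
    """Follow a single reduction sequence, choosing branches by weight."""
    if isinstance(e, tuple) and e[0] == "random":
        _, rho, a, b = e
        return sample_eval(a if rng.random() < rho else b, rng)
    if isinstance(e, tuple) and e[0] == "sample":
        inner = sample_eval(e[1], rng)
        if isinstance(inner, tuple) and inner[0] == "thunk":
            return sample_eval(inner[1], rng)
        return ("sample", inner)
    return e


HALF = Fraction(1, 2)
NESTED = ("random", HALF, "a", ("random", HALF, "a", "b"))
DELAYED = ("sample", ("thunk", ("random", Fraction(1, 3), "a", "b")))
```

Using `Fraction` keeps the path probabilities exact, so the probabilities returned by `full_eval` sum to exactly one; `sample_eval(NESTED, random.Random(0))` instead returns a single outcome.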

For the formulation in $\mathbb{M}$, we constrain the typing relation and encode lambda terms as in Eqs. 14 and 15; further, as above we encode contexts $\Gamma$ by fixing atoms of the form $:$ in the atomspace. To encode the probabilistic terms, we choose fixed symbols $s_{1...4}$ to correspond to $\operatorname{Distribution},\operatorname{random},\operatorname{sample},\operatorname{thunk}$. Then, we fix the following atoms in the atomspace:

\begin{equation}
\begin{aligned}
& (:\;\operatorname{Distribution}\;(\rightarrow\;\operatorname{Type}\;\operatorname{Type})), \\
& (:\;\operatorname{random}\;(\rightarrow\;\$t_{1}\;\$t_{2}\;\$t_{1}\cup\$t_{2})), \\
& (=\;(\operatorname{random}\;\$a\;\$b)\;\$a), \\
& (=\;(\operatorname{random}\;\$a\;\$b)\;\$b), \\
& (:\;\operatorname{sample}\;(\rightarrow\;(\operatorname{Distribution}\;\$t_{1})\;\$t_{1})), \\
& (:\;\operatorname{thunk}\;(\rightarrow\;\$t_{1}\;(\operatorname{Distribution}\;\$t_{1}))), \\
& (=\;(\operatorname{sample}\;(\operatorname{thunk}\;\$a))\;\$a)
\end{aligned}
\tag{20}
\end{equation}

Application of $\operatorname{update}$ to the pointed atomspace so defined, with $\dagger$ pointing to $m_{e}$ (corresponding to expression $e$), results in a simulation of a probabilistic reduction of $e$ in the PDTS above. As defined, $\operatorname{update}$ will simulate the 'full evaluation' of all possible paths, and hence a bisimulation exists between full evaluation dynamics in the PDTS GLTS using $\beta\rho$-reduction and the GLTS defined by $\mathbb{M}$ with the restricted atomspace above. We note that, in both cases, the weights on particular paths are lost, since the $\rho$ values are not explicitly recorded; however, it is straightforward to define a GLTS over the extended system, $(X\times\mathbb{R},A,f)$, where $f(x)=\{((x_{1},p_{1}),a_{1}),((x_{2},p_{2}),a_{2}),...\}$ denotes that action $a$ on $x$ results in $x_{1}$ with probability $p_{1}$, $x_{2}$ with probability $p_{2}$, and so on.

5 Implementation of Bisimulation proof in a Guarded Cubical Type Theory type checker

We briefly give an example to show the feasibility of our approach with an implementation of a bisimulation proof for a small-scale type system in a Guarded Cubical Type Theory type checker [4]. Here, we model a minimal type system, which has one type constant $A:\operatorname{Type}$ with two constructors $v_{1},v_{2}:A$; one function constant $f_{1}:A\rightarrow A$, where $f_{1}(v_{1})=v_{2}$ and $f_{1}(v_{2})=v_{1}$; and includes the $\operatorname{sample}$ and $\operatorname{thunk}$ constructs, which are combined following the syntax of Sec. 4.3. Our implementation models a fragment of this system where expressions are restricted to include at most three subexpressions. Hence, valid expressions of the language include: $(f_{1}\;(f_{1}\;v_{1}))$, $(\operatorname{thunk}\;(f_{1}\;v_{2}))$, $(\operatorname{sample}\;(\operatorname{thunk}\;v_{1}))$, $(f_{1}\;v_{2})$. Our implementation in a Haskell-based Guarded Cubical Type Theory type checker [4] is given in Appendix A. Here, we implement evaluation in this system via (i) a pattern matcher over an atomspace ('update'), and (ii) direct implementation of $\beta$-reduction via case analysis over the expression space ('beta3'). We define GLTS's using both forms of evaluation ('str1' and 'str2'), and finally derive a proof that these GLTS's are bisimilar ('bisim'). The code for this example is also provided at: https://github.com/jwarrell/metta_bisimulation
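
To convey the shape of this example without the type checker, the following Python sketch builds the two evaluators for the small system and compares their results by brute force over all well-typed expressions of size at most three. It is not the GCTT proof and checks only final results rather than step-by-step bisimilarity; the names `update_like` and `beta3_like`, the tuple encoding, and the choice to reduce under `thunk` are assumptions made for illustration.

```python
RULES = [
    (("f1", "v1"), "v2"),
    (("f1", "v2"), "v1"),
    (("sample", ("thunk", "$a")), "$a"),
]


def match(pat, term, env):
    """Extend env so that pat instantiated by env equals term, else None."""
    if isinstance(pat, str) and pat.startswith("$"):
        if pat in env and env[pat] != term:
            return None
        return {**env, pat: term}
    if isinstance(pat, tuple) and isinstance(term, tuple) and len(pat) == len(term):
        for p, t in zip(pat, term):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pat == term else None


def fill(t, env):
    """Replace the variables of t by their bindings in env."""
    if isinstance(t, tuple):
        return tuple(fill(s, env) for s in t)
    return env.get(t, t)


def update_like(e):
    """Evaluator (i): innermost rewriting with the pattern rules above."""
    if isinstance(e, tuple):
        e = (e[0],) + tuple(update_like(s) for s in e[1:])
    for lhs, rhs in RULES:
        env = match(lhs, e, {})
        if env is not None:
            return update_like(fill(rhs, env))
    return e


def beta3_like(e):
    """Evaluator (ii): direct case analysis over the expression forms."""
    if not isinstance(e, tuple):
        return e
    op, arg = e[0], beta3_like(e[1])
    if op == "f1":
        return {"v1": "v2", "v2": "v1"}.get(arg, ("f1", arg))
    if op == "sample" and isinstance(arg, tuple) and arg[0] == "thunk":
        return arg[1]
    return (op, arg)


def type_of(e):
    """Type of e ('A' or ('D', t)), or None if e is ill-typed."""
    if e in ("v1", "v2"):
        return "A"
    op, t = e[0], type_of(e[1])
    if t is None:
        return None
    if op == "f1":
        return "A" if t == "A" else None
    if op == "thunk":
        return ("D", t)
    return t[1] if isinstance(t, tuple) else None  # op == "sample"


def expressions(max_size=3):
    """All well-typed expressions built from at most max_size symbols."""
    layer = ["v1", "v2"]
    out = list(layer)
    for _ in range(max_size - 1):
        layer = [(op, e) for op in ("f1", "thunk", "sample") for e in layer]
        out.extend(layer)
    return [e for e in out if type_of(e) is not None]
```

Checking `all(update_like(e) == beta3_like(e) for e in expressions())` confirms that the two evaluators agree on this fragment, which is the finite fact that the bisimulation proof establishes in a stronger, type-theoretic form.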

6 Discussion

In the above, we have introduced a formal meta-probabilistic programming language, formalized in GCTT, and proposed that bisimulations link the specific object-languages (or domain specific languages) outlined above with their simulations in $\mathbb{M}$. Specifically, we have proposed that the restricted forms of $\mathbb{M}$ outlined in Secs. 4.1, 4.2 and 4.3 form bisimulations of the simply typed lambda calculus, arbitrary PTS's, and the target PDTS, respectively.

Finally, we mention some of the areas of investigation opened up by the formal model outlined. First, we note that, while we have focused on 'full' probabilistic programming evaluation, other possibilities include investigation of sampling-based evaluation, which performs only one metagraph update at each step, stochastically chosen from the possible graph rewriting locations. Second, we intend to derive further bisimulations for other kinds of probabilistic logic, particularly probabilistic paraconsistent logic [7] and probabilistic analogues of pure type systems [2], which may be suitable for models involving infinite-order probabilities [5]. Lastly, we intend to expand our implementation of aspects of this framework in Guarded Cubical Agda [17] to provide more complete implementations of the metalanguage and type systems explored here.

References

  • [1] Baier, C. and Katoen, J.P., 2008. Principles of model checking. MIT press.
  • [2] Barendregt, H. and Augustsson, L., 1992. Lambda calculi with types. Handbook of Logic in Computer Science, 34, pp.239-250.
  • [3] Barwise, J. and Moss, L., 1996. Vicious circles: on the mathematics of non-wellfounded phenomena.
  • [4] Birkedal, L., Bizjak, A., Clouston, R., Grathwohl, H.B., Spitters, B. and Vezzosi, A., 2016. Guarded cubical type theory: Path equality for guarded recursion. arXiv preprint arXiv:1606.05223.
  • [5] Goertzel, B., 2008. Modeling Uncertain Self-Referential Semantics with Infinite-Order Probabilities.
  • [6] Goertzel, B., 2020. Folding and Unfolding on Metagraphs. arXiv preprint arXiv:2012.01759.
  • [7] Goertzel, B., 2020. Paraconsistent Foundations for Probabilistic Reasoning, Programming and Concept Formation. arXiv preprint arXiv:2012.14474.
  • [8] Goertzel, B., 2021. Reflective Metagraph Rewriting as a Foundation for an AGI ‘Language of Thought’. arXiv preprint arXiv:2112.08272.
  • [9] Harper, R., 2012. Notes on logical frameworks. Lecture notes, Institute for Advanced Study, Nov, 29, p.34.
  • [10] Møgelberg, R.E. and Veltri, N., 2019. Bisimulation as path type for guarded recursive types. Proceedings of the ACM on Programming Languages, 3(POPL), pp.1-29.
  • [11] Møgelberg, R.E. and Paviotti, M., 2019. Denotational semantics of recursive types in synthetic guarded domain theory. Mathematical Structures in Computer Science, 29(3), pp.465-510.
  • [12] Mokhov, A., 2017. Algebraic graphs with class (functional pearl). ACM SIGPLAN Notices, 52(10), pp.2-13.
  • [13] Paviotti, M., Møgelberg, R.E. and Birkedal, L., 2015. A model of PCF in guarded type theory. Electronic Notes in Theoretical Computer Science, 319, pp.333-349.
  • [14] Potapov, A., 2021. MeTTa language specification. https://wiki.opencog.org/w/Hyperon.
  • [15] Staton, S., Wood, F., Yang, H., Heunen, C. and Kammar, O., 2016, July. Semantics for probabilistic programming: higher-order functions, continuous distributions, and soft constraints. In 2016 31st Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) (pp. 1-10).
  • [16] TrueAGI, 2021. Hyperon-experimental repository. https://github.com/trueagi-io/hyperon-experimental.
  • [17] Veltri, N. and Vezzosi, A., 2020, January. Formalizing $\pi$-calculus in guarded cubical Agda. In Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs (pp. 270-283).
  • [18] Vezzosi, A., Mörtberg, A. and Abel, A., 2021. Cubical Agda: A dependently typed programming language with univalence and higher inductive types. Journal of Functional Programming, 31.
  • [19] Warrell, J. and Gerstein, M., 2018. Dependent Type Networks: A Probabilistic Logic via the Curry-Howard Correspondence in a System of Probabilistic Dependent Types. In Uncertainty in Artificial Intelligence, Workshop on Uncertainty in Deep Learning. http://www.gatsby.ucl.ac.uk/~balaji/udl-camera-ready/UDL-19.pdf.

Appendices

Appendix A Proof of Bisimulation for Small-scale Type System in a Guarded Cubical Type Theory type checker

Below, we provide the code for the example discussed in Sec. 5, which uses a Haskell-based GCTT type checker [4]. The code for this example is also provided at: https://github.com/jwarrell/metta_bisimulation

[Code listing: five uncaptioned images in the original, not recoverable as text]