¹Yale University, ²SingularityNET

A meta-probabilistic-programming language for bisimulation of probabilistic and non-well-founded type systems

Jonathan Warrell¹,², Alexey Potapov², Adam Vandervorst², Ben Goertzel²

Abstract

We introduce a formal meta-language for probabilistic programming, capable of expressing both programs and the type systems in which they are embedded. We are motivated here by the desire to allow an AGI to learn not only relevant knowledge (programs/proofs), but also appropriate ways of reasoning (logics/type systems). We draw on the frameworks of cubical type theory and dependently typed metagraphs to formalize our approach. In doing so, we show that specific constructions within the meta-language can be related via bisimulation (implying path equivalence) to the type systems to which they correspond. This allows our approach to provide a convenient means of deriving synthetic denotational semantics for various type systems. In particular, we derive bisimulations for pure type systems (PTS) and probabilistic dependent type systems (PDTS). We further discuss the relationship of PTS to non-well-founded set theory, and demonstrate the feasibility of our approach with an implementation of a bisimulation proof in a Guarded Cubical Type Theory type checker.

1 Introduction

Probabilistic programming offers a fertile ground between logic-based and machine-learning-based approaches to A(G)I. Formalization within type theory offers a rigorous approach to deriving semantics for probabilistic languages [15], and formalization of dependently typed probabilistic languages offers the promise of drawing a tight connection with probabilistic logics of various kinds (e.g. Markov Logic [19], Probabilistic Paraconsistent Logic [7]).

While the exploration of such individual systems is highly important, we might consider more abstractly how to embody general principles for the formation of diverse probabilistic type systems, logics, and programming languages within a single meta-language. Such a language can be considered a meta-theoretical language or logical framework for expressing individual type systems and logics. However, previous frameworks (such as [9]) have not been designed with probabilistic type systems and logics specifically in mind. Here, we outline a formal language, $\mathbb{M}$ , designed for such a purpose. This language is intended as a formal model of the MeTTa language, currently being developed as part of the OpenCog project [14, 8, 16]. The language allows for (probabilistic) reasoning not only about the knowledge embedded in a system, but also about the logic employed by the system itself.

Our approach may also be seen in relation to recent methods to derive synthetic denotational semantics for logical systems using guarded cubical type theory (GCTT) [18, 11]. Such approaches are particularly promising, offering as they do a unified approach to deriving semantics for recursive datatypes as final co-algebras of appropriate functors in the context of a formulation of univalent type theory with a fully computational semantics. We draw on methods from [10] to formalize our approach in this context. This allows us to rigorously define the relationship between an object-language and its expression in our meta-language as one of bisimulation, corresponding to path equivalence in GCTT. We further show how dependently typed metagraphs can be formalized in GCTT as the basis for our framework [6, 12], and how this leads to systems embedding natural type-theoretic equivalents of non-well-founded sets.

We begin by developing a general framework for representing metagraphs in GCTT, before outlining how the final co-algebra of a labeled transition system over this recursive datatype can be used to model our meta-language. We then derive bisimulations for various object-languages in our system, including simply typed (and untyped) lambda calculus, pure type systems, and probabilistic dependent type systems, hence deriving synthetic denotational semantics for these systems. Finally, we demonstrate the feasibility of our approach with an implementation of a bisimulation proof for a small-scale type system in a Guarded Cubical Type Theory type checker [4], before concluding with a discussion.

2 Labeled metagraphs as a guarded recursive datatype

We begin by defining a recursive datatype for typed metagraphs ( $\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})}$ ) using guarded cubical type theory. Here, $\mathcal{T},\mathcal{L}$ are types of type-symbols and edge labels respectively, and $\preceq_{T}:\mathcal{T}\times\mathcal{T}\rightarrow\mathbb{B}$ is a partial order on type-symbols. The recursive datatype is defined as the final co-algebra of the functor $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L},\preceq_{T})}(A)$ , which when applied to type $A$ returns the following datatype (letting $\Delta$ stand for the assumptions $\mathcal{L},\mathcal{T},A:\mathcal{U}_{0}$ ; the $\epsilon,\operatorname{edge},$ and $\operatorname{connect}$ constructors used here follow the approach of [12] and [6]):

$\displaystyle\frac{\Gamma\vdash\Delta}{\Gamma\vdash\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)}$

$\displaystyle\frac{\Gamma\vdash\Delta}{\Gamma\vdash\epsilon:\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)}$

$\displaystyle\frac{\Gamma\vdash\Delta,n:\mathbb{N},t_{0}:\mathcal{T},t:\operatorname{Vec}(n,\mathcal{T}),l_{0}:\mathcal{L}}{\Gamma\vdash\operatorname{edge}(n,t_{0},l_{0},t):\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)}$

$\displaystyle\frac{\Gamma\vdash\Delta,a_{1},a_{2}:A,t_{0}:\mathcal{T},l_{0}:\mathcal{L},q:\mathbb{N}\rightarrow\mathbb{N}_{0,\infty}}{\Gamma\vdash\operatorname{connect}(a_{1},a_{2},t_{0},l_{0},q):\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)}$	(1)

where $\operatorname{Vec}(n,A)$ is the type of vectors over $A$ of length $n$ , and $\mathbb{N}_{0,\infty}$ is $\mathbb{N}$ extended with $0$ and $\infty$ . We note that, for notational convenience, we do not explicitly include target labels/indices in the definition of $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}(A)$ above (in contrast to [6], where $\mathcal{L}$ refers to target indices and $\mathcal{V}$ is used for edge values). If explicit indices are required to identify target ’levels’, these may be included by letting $\mathcal{L}=\mathcal{L}_{0}\times\sum_{n}\operatorname{Vec}(n,\mathbb{N})$ , so that each edge label is paired with a vector of target indices. $\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})}$ is then defined as a final fixed-point of $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}$ , subject to a set of constraints:

$\displaystyle\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})}$	$\displaystyle=$	$\displaystyle\sum M:\nu(\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}).C(M,\preceq_{T})$	(2)

where $C(M,\preceq_{T})$ represents the constraints:

$\displaystyle C(M,\preceq_{T})$	$\displaystyle=$	$\displaystyle\forall n_{1},n_{2}:\mathbb{N},t_{1},t_{2}:\mathcal{T}$	(3)
		$\displaystyle f(M,n_{1})=t_{1}\wedge$
		$\displaystyle f(M,n_{2})=t_{2}\wedge$
		$\displaystyle q^{\prime}_{M}(n_{1})=n_{2}\Rightarrow t_{1}\preceq_{T}t_{2}$

Here, $f(M,n)$ represents a function, which for metagraph $M$ returns the type of its $n$ ’th edge or target. Specifically, when $M$ is of the form $\operatorname{edge}(n,t_{0},l_{0},t)$ , $f(M,0)$ is the type of the edge, and $f(M,n>0)$ is the type of the $n$ ’th target, and when $M$ is of the form $\operatorname{connect}(a_{1},a_{2},t_{0},l_{0},q)$ , $f(M,0)$ is the type of the whole metagraph, while the types of the edges/targets of $a_{1}$ and $a_{2}$ are interleaved when evaluating $f(M,n>0)$ for odd/even values of $n$ respectively. Further, the function $q^{\prime}_{M}:\mathbb{N}\rightarrow\mathbb{N}_{0,\infty}$ is recursively defined on $\nu(\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})})$ (via the $q$ function in the $\operatorname{connect}$ constructor of Eq. 1) to indicate that the $n_{1}$ ’th target of $M$ is connected to the $n_{2}$ ’th edge/target of $M$ , whenever $q(n_{1})=n_{2}$ , with $n_{2}=\infty$ indicating that the target has no connection. $C(M,\preceq_{T})$ thus provides a set of constraints that ensure the connections in a metagraph respect the $\preceq_{T}$ relation; further constraints are needed to ensure, for instance, that targets receive input from only one other target (as may be appropriate for some metagraphs). In addition, $\nu=\operatorname{fix}X.F(\triangleright(\alpha:\mathbb{T}).X[\alpha])$ is the guarded fixed-point operator [10]. By [10], Prop. 3.2, $\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})}$ is both a subset of the initial algebra and final coalgebra of $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}\circ\triangleright$ . Finally, we note that our $\operatorname{connect}$ constructor corresponds to $\text{Connect}_{Q}$ in [6], and the Union constructor is simply $\operatorname{connect}$ with $q(n)=\infty$ for all $n$ (meaning that no new connections are added).

We briefly give some examples of typed metagraphs. For convenience, we set $\mathcal{L}=\{\operatorname{null}\}$ , and $\mathcal{T}=\{A,B,C,D,\top\}$ , with $\preceq_{T}$ the identity relation along with $t\preceq_{T}\top$ for all $t$ . In our first example, we can construct metagraphs $X=\operatorname{edge}(3,A,\operatorname{null},[D,B,C])$ , and $Y=\operatorname{edge}(2,B,\operatorname{null},[D,A])$ . Then, a combined graph can be constructed as $Z^{\prime}=\operatorname{connect}(X,Y,\top,\operatorname{null},\{(1,1),(2,0)\})$ , $Z^{\prime\prime}=\operatorname{connect}(Y,X,\top,\operatorname{null},\{(1,1),(2,0)\})$ , $Z^{\prime\prime\prime}=\operatorname{connect}(Z^{\prime},Z^{\prime\prime},\top,\operatorname{null},\{\})$ , $Z=\operatorname{connect}(X,Z^{\prime\prime\prime},C,\operatorname{null},\{(3,0)\})$ . The entire metagraph is shown in Fig. 1A. We note that, in general, any metagraph with a finite number of edges and targets can be represented by a term in the initial algebra of $\mathcal{M}^{\prime}_{(\mathcal{T},\mathcal{L})}$ (as is $Z$ ). Some graphs, however, may also conveniently be represented by terms in the final coalgebra. Consider for instance Fig. 1B. Here, we may define $X^{\prime\prime}=\operatorname{edge}(3,\top,\operatorname{null},[B,B,A])$ and $X^{\prime}=\operatorname{connect}(X^{\prime\prime},X^{\prime\prime},A,\operatorname{null},\{(1,2),(2,1),(3,0)\})$ , representing $X^{\prime}$ by a term in the initial algebra (suppressing visualization of the $X^{\prime\prime}$ subgraph). Alternatively, we may define $X^{\prime}_{co}=\operatorname{connect}(\operatorname{edge}(3,A,\operatorname{null},[B,B,A]),X^{\prime}_{co},A,\operatorname{null},\{(1,2),(2,1),(3,0)\})$ , which implicitly determines a term in the coalgebra as a solution to the recursive equation.
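To make the finite examples above concrete, the following Python sketch encodes the $\operatorname{edge}$ and $\operatorname{connect}$ constructors as plain data and checks that a connection map respects $\preceq_{T}$. The names `Edge`, `Connect`, `type_at`, `leq` and `respects_order` are hypothetical and not part of the formalism. Two assumptions are made: a pair $(n_{1},n_{2})$ in $q$ is read as linking index $n_{1}$ of the first argument to index $n_{2}$ of the second (the reading under which the examples above type-check), and the interleaving in `type_at` sends odd indices to the first argument and even indices to the second. Only finite (initial-algebra) terms are covered; coinductive terms such as $X^{\prime}_{co}$ are not.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union


@dataclass(frozen=True)
class Edge:
    n: int                    # number of targets
    t0: str                   # type of the edge itself
    l0: str                   # edge label
    targets: Tuple[str, ...]  # target types, length n


@dataclass(frozen=True)
class Connect:
    a1: "Metagraph"
    a2: "Metagraph"
    t0: str                          # type of the combined metagraph
    l0: str
    q: Tuple[Tuple[int, int], ...]   # finite connection map as (n1, n2) pairs


Metagraph = Union[Edge, Connect]


def type_at(m: Metagraph, n: int) -> str:
    """Sketch of f(M, n): the type of the n'th edge/target of m."""
    if n == 0:
        return m.t0
    if isinstance(m, Edge):
        return m.targets[n - 1]
    # Assumed interleaving: odd n reads from a1, even n reads from a2.
    k = (n - 1) // 2
    return type_at(m.a1, k) if n % 2 == 1 else type_at(m.a2, k)


def leq(t1: str, t2: str) -> bool:
    """Identity relation plus t <= TOP, as in the running example."""
    return t1 == t2 or t2 == "TOP"


def respects_order(c: Connect, order: Callable[[str, str], bool]) -> bool:
    """Check the local constraint: every connection goes from a type to a supertype."""
    return all(order(type_at(c.a1, n1), type_at(c.a2, n2)) for n1, n2 in c.q)


X = Edge(3, "A", "null", ("D", "B", "C"))
Y = Edge(2, "B", "null", ("D", "A"))
Z1 = Connect(X, Y, "TOP", "null", ((1, 1), (2, 0)))   # Z' in the text
bad = Connect(X, Y, "TOP", "null", ((3, 1),))         # C -> D is not allowed
```

Under these assumptions `respects_order(Z1, leq)` holds, since the connections pair $D$ with $D$ and $B$ with $B$, whereas `bad` is rejected.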

Figure 1: Typed metagraph examples. Boxes show metagraphs, which may be single edges (containing no further boxes) or include several edges. Solid circles show edge target types and dotted circles show metagraph types. Arrows show target-target or target-edge connections. Metagraph letter names are shown on the box of the metagraph to which they refer in the text.

3 $\mathbb{M}$ as the final coalgebra of a labeled transition system

We define the formal meta-probabilistic-programming language, $\mathbb{M}$ , as a labeled transition system over typed metagraphs. Here, we are interested in typed metagraphs with a particular form. Specifically, we begin by defining $\mathcal{T}$ by the abstract syntax:

$\displaystyle\mathcal{T}$	$\displaystyle::=$	$\displaystyle t_{n}\;\;\|\;\;\mathcal{T}\rightarrow\mathcal{T}\;\;\|\;\;\prod a:\mathcal{T}.\mathcal{M}_{\mathbb{M}}\;\;\|\;\;$	(4)
		$\displaystyle\operatorname{Eq}(\mathcal{T},\mathcal{M}_{\mathbb{M}},\mathcal{M}_{\mathbb{M}})\;\;\|\;\;\mathcal{T}\cup\mathcal{T}\;\;\|\;\;\mathcal{T}\cap\mathcal{T}\;\;\|\;\;$
		$\displaystyle\operatorname{Type}\;\;\|\;\;\top_{\operatorname{Type}}\;\;\|\;\;\top\;\;\|\;\;\mathcal{J}\;\;\|\;\;\mathcal{X}$

These syntactic constructions represent base-level types, function types, dependent types, equality types, type unions and intersections, a base universe of small types, the union of all small types, the union of all types, judgments and execution states respectively. Notice also that in Eq. 4, $\mathcal{T}$ is defined by mutual recursion with the type $\mathcal{M}_{\mathbb{M}}$ , defined in Eq. 6. We then define $\mathcal{L}$ as $\mathcal{L}=(\mathcal{S}\cup\mathcal{V}\cup\mathcal{K}\cup\mathcal{T})\times\mathbb{N}$ . Notice that $\mathcal{L}$ includes $\mathcal{T}$ , so that types may simultaneously serve as labels. Further, $\mathcal{S}=\{s_{1},s_{2},...\}$ and $\mathcal{V}=\{v_{1},v_{2},...\}$ denote collections of symbols and variables respectively, and $\mathcal{K}$ is a special set of $\mathbb{M}$ keywords/key-symbols:

$\displaystyle\mathcal{K}$	$\displaystyle=$	$\displaystyle\{:,\preceq,=,\rightarrow,\operatorname{Eq},\operatorname{fun-app},\operatorname{transform},@,\dagger\}$	(5)

Further, $\mathcal{L}$ includes an edge-specific identifier $\mathbb{N}$ to deduplicate edges which are identical in other respects.

The state of an $\mathbb{M}$ program is represented by a typed metagraph in the following space:

$\displaystyle\mathcal{M}_{\mathbb{M}}$	$\displaystyle=$	$\displaystyle\sum\preceq_{T}:(\mathcal{T}\times\mathcal{T}\rightarrow\mathbb{B}).\sum M:\mathcal{M}_{(\mathcal{T},\mathcal{L},\preceq_{T})}.C_{\mathbb{M}}(M,\preceq_{T})$	(6)

Hence, this is the space of all metagraphs over $\mathcal{L}$ and $\mathcal{T}$ , with a varying $\preceq_{T}$ relation, where $C_{\mathbb{M}}(M,\preceq_{T})$ represents a set of ’ $\mathbb{M}$ -specific constraints’ on the structure of the metagraph (to be outlined below). This state represents the Atomspace of the program, and the subgraphs of the Atomspace are the individual atoms (as in MeTTa, see [14, 8]). We note that, since $\mathbb{M}$ serves both as a language for defining programs and type-systems within which these programs are embedded, the atoms may represent base-level propositions and programs (expressions), as well as judgments and computational state information, as reflected by their types. The $\mathbb{M}$ -specific constraints, $C_{\mathbb{M}}(M,\preceq_{T})$ , determine the interaction of the keywords/key-symbols with the type system:

$\displaystyle\forall m$	$\displaystyle\in$	$\displaystyle M.\exists n,n_{1}:\mathbb{N}.$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\mathcal{J},(:,n),[\top_{\operatorname{Type}}\;\top])\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\mathcal{J},(\preceq,n),[\operatorname{Type}\;\operatorname{Type}])\wedge$
		$\displaystyle(m_{M}[1]=\operatorname{edge}(0,\operatorname{Type},(t_{n_{1}},0),[])\wedge$
		$\displaystyle m_{M}[2]=\operatorname{edge}(0,\operatorname{Type},(t_{n_{2}},0),[])\Rightarrow(t_{n_{1}}\preceq t_{n_{2}}))\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\mathcal{J},(=,n),[\top_{\operatorname{Type}}\;\top_{\operatorname{Type}}])\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\operatorname{Type},(\rightarrow,n),[\operatorname{Type}\;\operatorname{Type}])\wedge$
		$\displaystyle((m_{1})_{M}[2]=m\wedge l(m_{1})=(:,n_{1})\wedge t(m_{M}[1])=A\wedge$
		$\displaystyle t(m_{M}[2])=B\Rightarrow t((m_{1})_{M}[1])\preceq A\rightarrow B)\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\operatorname{Type},(\rightarrow,n),[\operatorname{Type}\;\top])\wedge$
		$\displaystyle((m_{1})_{M}[2]=m\wedge l(m_{1})=(:,n_{1})\wedge t(m_{M}[1])=A\wedge$
		$\displaystyle m_{M}[2]=m_{2}\Rightarrow t((m_{1})_{M}[1])\preceq\prod a:A.m_{2})\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(3,\operatorname{Type},(\operatorname{Eq},n),[\operatorname{Type}\;t_{n_{1}}\>t_{n_{1}}])\wedge$
		$\displaystyle m_{M}[1]=\operatorname{edge}(0,\operatorname{Type},(t_{n_{1}},0),[])\wedge$
		$\displaystyle((m_{1})_{M}[2]=m\wedge l(m_{1})=(\operatorname{Eq},n_{1})\wedge l(m_{M}[1])=(T,n_{1})\wedge$
		$\displaystyle m_{M}[2]=A\wedge m_{M}[3]=B\Rightarrow t((m_{1})_{M}[1])=\operatorname{Eq}(T,A,B))\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\top,(\operatorname{transform},n),[\top\;\top])\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(1,\mathcal{X},(@,n),[\top])\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(1,\mathcal{X},(\dagger,0),[\top])\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(0,\top,(\mathcal{S}\cup\mathcal{V}\cup\mathcal{T},n),[])\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,B,(\operatorname{fun-app},n),[A\rightarrow B^{\prime}\;A])\wedge B^{\prime}\preceq B\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,B,(\operatorname{fun-app},n),[\prod a:A.m_{1}\;A])\wedge$
		$\displaystyle m_{1}[a=m_{M}[1]]\preceq B\vee$
$\displaystyle m$	$\displaystyle=$	$\displaystyle\operatorname{connect}(\_,\_,\_,\_,\_)\wedge$
$\displaystyle\forall n,n_{1},n_{2}$	$\displaystyle:$	$\displaystyle\mathbb{N}.t_{n}\preceq\top\wedge$
$\displaystyle s_{n}$	$\displaystyle:$	$\displaystyle\top_{\operatorname{Type}}\vee s_{n}:\operatorname{Type}\wedge$
$\displaystyle v_{n}$	$\displaystyle:$	$\displaystyle\top_{\operatorname{Type}}\vee v_{n}:\operatorname{Type}\wedge$
$\displaystyle t_{n}$	$\displaystyle:$	$\displaystyle\operatorname{Type}\wedge$
$\displaystyle(t_{n_{1}}$	$\displaystyle\preceq$	$\displaystyle t_{n_{2}}\wedge t_{n_{2}}\preceq t_{n}\Rightarrow t_{n_{1}}\preceq t_{n})\wedge$
$\displaystyle t_{n}$	$\displaystyle\preceq$	$\displaystyle t_{n}\cup t_{n_{1}}\wedge$
$\displaystyle t_{n}\cap t_{n_{1}}$	$\displaystyle\preceq$	$\displaystyle t_{n}$	(7)

where the notation $m_{M}[n]$ denotes the $n$ ’th target of subgraph $m$ in metagraph $M$ , $t(m)$ and $l(m)$ denote the type and label of metagraph $m$ respectively, and we write $a:A$ as shorthand for ’there exists a $:$ -edge in $M$ connecting $a$ and $A$ ’. We note that, for convenience, the above formulation does not include some constructions that may be appropriate in a full implementation, but can be derived from others. For instance, tuples can be constructed by introducing a dependent function $\operatorname{tuple}:\prod A,B:\operatorname{Type}.A\rightarrow B\rightarrow\operatorname{Type}$ . The left and right projection functions are then defined by $\pi_{1}(\operatorname{tuple}(A,B,a,b))=a$ and $\pi_{2}(\operatorname{tuple}(A,B,a,b))=b$ . Dependent sums can likewise be defined as dependent tuples, $\operatorname{tuple}^{\prime}:\prod A:\operatorname{Type}.\prod B:(A\rightarrow\operatorname{Type}).\prod a:A.B(a)\rightarrow\operatorname{Type}$ .

3.1 Labeled transition system based on metagraph rewriting

In guarded cubical type theory, a guarded labeled transition system (GLTS) may be defined via a state-space $X$ , a space of actions $A$ , and a function mapping states to sets of (action, state) pairs, $f:X\rightarrow P_{\text{fin}}(A\times\triangleright X)$ , where $P_{\text{fin}}$ is the finite powerset functor. The space of all processes, or runs of the GLTS, may then be defined as the final coalgebra of the following functor: $\text{Proc}=\operatorname{fix}X.P_{\text{fin}}(A\times\triangleright(\alpha:\mathbb{T}).X[\alpha])$ (see [10]). To characterize evaluation in $\mathbb{M}$ , we describe its computational dynamics via a GLTS. Here, the state space is the space of all $\mathbb{M}$ metagraphs, $X=\mathcal{M}_{\mathbb{M}}$ . The actions are specified by single pushout (SPO) rewriting rules, or sequences of such rules. We therefore introduce the type, $\mathcal{A}^{\prime}=\mathcal{M}^{(L,R)}_{\mathbb{M}}\times\hom_{p}(\mathcal{M}_{\mathbb{M}})$ , whose values $(M^{\prime},\phi)$ consist of an $\mathbb{M}$ metagraph whose label set is $\mathcal{L}^{\prime}=\mathcal{L}\times\{L,R,LR\}\times\{[],*,**\}$ , i.e. identical to above, but with $L$ and $R$ labels added to each edge to indicate its membership of the left or right-hand side of the rule (notice that these may overlap), * and ** to indicate the input and output nodes of the rule (see below), and $\phi$ , a partial metagraph homomorphism between the $L$ and $R$ metagraphs of $M^{\prime}$ (defining a partial metagraph homomorphism as in [8]). Since we wish to allow sequences of rewrite rules as actions, we define the full action space to be $\mathcal{A}=\sum n:\mathbb{N}.\operatorname{Vec}(n,\mathcal{A}^{\prime})$ , and write the members of $\mathcal{A}$ as $a_{1}\circ a_{2}\circ...\circ a_{n}$ , where $a_{1...n}:\mathcal{A}^{\prime}$ .
The dynamics are then defined (via $f$ ) by mapping a given metagraph state $M_{1}$ to the set of all pairs $(A,M_{2})$ such that $M_{2}$ results from an application of action $A$ to $M_{1}$ . For individual rewrite rules $a\in\mathcal{A}^{\prime}$ , their action is determined via a partial homomorphism between $a$ and $M_{1}$ . We note that, when there are no partial homomorphisms between $a$ and $M_{1}$ , or when the rewrite rule produces an invalid $\mathbb{M}$ graph, we set $M_{2}=M_{1}$ . Further, we note that the update may change the $\preceq$ relation, for instance by introducing an edge of the form $t_{1}\preceq t_{2}$ .
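The shape of the transition function $f$ can be illustrated with a deliberately simplified Python sketch. Here states are tuples of symbols standing in for metagraphs, and a rule simply replaces the first occurrence of one symbol by another; this toy rewriting is an assumption made for illustration and is not SPO metagraph rewriting. The names `apply_rule`, `apply_action` and `transitions` are hypothetical. The sketch does keep two features from the text: an action is a sequence of rules, and a rule with no match leaves the state unchanged ( $M_{2}=M_{1}$ ).

```python
from typing import FrozenSet, Iterable, Tuple

State = Tuple[str, ...]        # toy stand-in for a metagraph
Rule = Tuple[str, str]         # (lhs symbol, rhs symbol): a toy rewrite rule
Action = Tuple[Rule, ...]      # a sequence a_1 . a_2 ... a_n of rules


def apply_rule(rule: Rule, state: State) -> State:
    """Rewrite the first match of lhs to rhs; with no match, return the state unchanged."""
    lhs, rhs = rule
    if lhs not in state:
        return state
    i = state.index(lhs)
    return state[:i] + (rhs,) + state[i + 1:]


def apply_action(action: Action, state: State) -> State:
    """Apply the rules of an action from left to right."""
    for rule in action:
        state = apply_rule(rule, state)
    return state


def transitions(actions: Iterable[Action], state: State) -> FrozenSet[Tuple[Action, State]]:
    """Sketch of f: map a state to the finite set of (action, next state) pairs."""
    return frozenset((a, apply_action(a, state)) for a in actions)


a_to_b: Action = (("a", "b"),)
a_to_c: Action = (("a", "b"), ("b", "c"))
no_match: Action = (("z", "y"),)
succ = transitions([a_to_b, a_to_c, no_match], ("a", "x"))
```

In this example `succ` contains one pair per action, and the pair for `no_match` carries the original state `("a", "x")`.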

3.2 $\mathbb{M}$ -interpretation as metagraph dynamics

We can now describe interpretation in $\mathbb{M}$ via the GLTS defined above. To do so, we map specific symbols/edges in the metagraph to actions in $\mathcal{A}$ (corresponding to the grounding domain $F$ in [8]). Specifically, edges carrying symbols of a function type, $A\rightarrow B$ , dependent product type, $\prod a:A.B$ , or the $\operatorname{transform}$ symbol, are mapped to specific forms of rewrite rule, as specified below. All other edges are mapped to the $\operatorname{null}$ transform. Fig. 2 specifies the general forms of the rewrite rules for function application, and transform rules (we note that $\operatorname{transform}$ is equivalent to the 2-argument $\operatorname{match}$ keyword/function in the current version of the MeTTa language, see [16]). The dependent product rule is identical to Fig. 2a, with $A\rightarrow B$ replaced with $\prod a:A.m_{1}$ . For explicitness, we also give these below in equational form. We note that, for convenience, variable names are denoted using $\$$ , although these should ultimately be mapped to the names $v_{1},v_{2},...$ .

$\displaystyle R^{1}_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\$T_{2},(\operatorname{fun-app},\$n_{0},L),[\$T_{1}\rightarrow\$T_{2}\;\$T_{1}])$
$\displaystyle R^{2}_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\mathcal{J},(=,\$n_{1},LR),[\$T_{2}\;\$T_{2}])$
$\displaystyle R^{3}_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\$T_{2},(\operatorname{fun-app},\$n_{2},LR),[\$T_{1}\rightarrow\$T_{2}\;\$T_{1}])$
$\displaystyle R^{4}_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle\operatorname{edge}(0,\$T_{1}\rightarrow\$T_{2},(\$f,\$n_{3},LR),[])$
$\displaystyle R^{5}_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle\operatorname{edge}(0,\$T_{1},(\$v_{1},\$n_{4},LR),[])$
$\displaystyle R^{6}_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle\operatorname{edge}(0,\$T_{2},(\$v_{2},\$n_{5},LR**),[])$
$\displaystyle R^{7}_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle\operatorname{connect}(\operatorname{connect}(R^{1}_{\operatorname{fun-app}},R^{4}_{\operatorname{fun-app}},\top,\operatorname{null},\{(1,0)\}),$
		$\displaystyle R^{5}_{\operatorname{fun-app}},\top,(\operatorname{null},\operatorname{null},*),\{(5,0)\})$
$\displaystyle R^{8}_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle\operatorname{connect}(\operatorname{connect}(R^{3}_{\operatorname{fun-app}},R^{4}_{\operatorname{fun-app}},\top,\operatorname{null},\{(1,0)\}),$
		$\displaystyle R^{5}_{\operatorname{fun-app}},\top,\operatorname{null},\{(5,0)\})$
$\displaystyle R^{9}_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle\operatorname{connect}(\operatorname{connect}(R^{2}_{\operatorname{fun-app}},R^{3}_{\operatorname{fun-app}},\top,\operatorname{null},\{(1,0)\}),$
		$\displaystyle R^{5}_{\operatorname{fun-app}},\top,\operatorname{null},\{(5,0)\})$
$\displaystyle R_{\operatorname{fun-app}}$	$\displaystyle=$	$\displaystyle R^{7}_{\operatorname{fun-app}}\cup R^{8}_{\operatorname{fun-app}}\cup R^{9}_{\operatorname{fun-app}}$	(8)

$\displaystyle R^{1}_{\operatorname{transform}}$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\operatorname{Type},(\operatorname{transform},\$n_{0},L),[\top\;\top])$
$\displaystyle R^{2}_{\operatorname{transform}}$	$\displaystyle=$	$\displaystyle\operatorname{connect}(\operatorname{connect}(R^{1}_{\operatorname{transform}},\$M_{1},\top,\operatorname{null},\{(1,0)\}),$
		$\displaystyle\$M_{2},\top,(\operatorname{null},\operatorname{null},L*),\{(5,0)\})$
$\displaystyle R^{3}_{\operatorname{transform}}$	$\displaystyle=$	$\displaystyle\operatorname{edge}(2,\operatorname{Type},(\operatorname{tuple},\$n_{0},R**),[\top\;\top])$
$\displaystyle R^{4}_{\operatorname{transform}}$	$\displaystyle=$	$\displaystyle\$M^{\prime}_{1}\cup\$M^{\prime\prime}_{1}\cup\$M^{\prime}_{2}\cup\$M^{\prime\prime}_{2}\cup\operatorname{edge}(0,\top,(\operatorname{null},\operatorname{null},LR),[])$
$\displaystyle R^{5}_{\operatorname{transform}}$	$\displaystyle=$	$\displaystyle\operatorname{connect}(R^{3}_{\operatorname{transform}},R^{4}_{\operatorname{transform}},\top,(\operatorname{null},\operatorname{null},\operatorname{null}),\{(1,1),(2,2)\})$
$\displaystyle R_{\operatorname{transform}}$	$\displaystyle=$	$\displaystyle R^{2}_{\operatorname{transform}}\cup R^{5}_{\operatorname{transform}}$	(9)

In Eq. 9, $M^{\prime}_{1}$ and $M^{\prime}_{2}$ denote metagraphs isomorphic to $M_{1}$ and $M_{2}$ , using a disjoint set of variables, while $M^{\prime\prime}_{1}$ and $M^{\prime\prime}_{2}$ are defined similarly, with variables disjoint from the previous subsets. The rule in Eq. 9 is defined so as to return a 2-tuple of matches; in general, the size of the tuple returned should be large enough to allow for any number of matches (i.e. the number of nodes in $M$ ), and if the number of matches is less than this, it will be padded with $\operatorname{null}$ values.

fun-app nodes. For a given annotated $\operatorname{fun-app}$ node, i.e. $\operatorname{connect}(@,F,\operatorname{null},\{(1,0)\})$ , where $@=\operatorname{edge}(1,\mathcal{X},(@,n),[\top])$ and $F$ is a graph consisting of a target $\operatorname{fun-app}$ node and its two arguments, the full rewrite rule $\operatorname{rewrite}_{F}$ is found by forming a metagraph homomorphism between $R^{7}_{\operatorname{fun-app}}$ (labeled by $*$ as the input of the rule), and $F$ , replacing the variables in $R_{\operatorname{fun-app}}$ by their values in $F$ . The resulting graph is denoted $R_{\operatorname{fun-app}}(F)$ . The rule $\operatorname{rewrite}_{F}$ is then defined by the subgraphs $L=\operatorname{connect}(\$M_{0},\operatorname{connect}(@,l_{\operatorname{fun-app}}(F),\operatorname{null},\{(1,0)\}),\operatorname{null},\{(n_{1},m_{1}),(n_{2},m_{1}),...\})$ , $R=\operatorname{connect}(\$M_{0},r_{\operatorname{fun-app}}(F),\operatorname{null},\{(n_{1},m_{2}),(n_{2},m_{2}),...\})$ , where $M_{0}$ is the graph of all nodes in $M$ targeting $F$ , $m_{1}$ is the index of the $\operatorname{fun-app}$ node in $F$ , $m_{2}$ is the index of the $**$ output node in $r_{\operatorname{fun-app}}(F)$ , and $\phi$ is defined by the partial homomorphism consisting of the identity map on all nodes labeled $LR$ .

transform nodes. For a given annotated $\operatorname{transform}$ node, the full rewrite rule is defined similarly. Hence, for $\operatorname{connect}(@,F,\operatorname{null},\{(1,0)\})$ , where $@=\operatorname{edge}(1,\mathcal{X},(@,n),[\top])$ and $F$ is a graph consisting of a target $\operatorname{transform}$ node and its two arguments, the full rewrite rule $\operatorname{rewrite}_{F}$ is found by forming a metagraph homomorphism between $R^{2}_{\operatorname{transform}}$ (labeled by $*$ as the input of the rule), and $F$ , replacing the variables in $R_{\operatorname{transform}}$ by their values in $F$ . The resulting graph is denoted $R_{\operatorname{transform}}(F)$ . The rule $\operatorname{rewrite}_{F}$ is then defined by the subgraphs $L=\operatorname{connect}(\$M_{0},\operatorname{connect}(@,l_{\operatorname{transform}}(F),\operatorname{null},\{(1,0)\}),\operatorname{null},\{(n_{1},m_{1}),(n_{2},m_{1}),...\})$ , $R=\operatorname{connect}(\$M_{0},r_{\operatorname{transform}}(F),\operatorname{null},\{(n_{1},m_{2}),(n_{2},m_{2}),...\})$ , where $M_{0}$ is the graph of all nodes in $M$ targeting $F$ , $m_{1}$ is the index of the $\operatorname{transform}$ node in $F$ , $m_{2}$ is the index of the $**$ output node in $r_{\operatorname{transform}}(F)$ , and $\phi$ is defined by the partial homomorphism consisting of the identity map on all nodes labeled $LR$ .

$\mathbb{M}$ -evaluation. The above provides groundings for activated nodes in a metagraph; as noted, nodes not of the form above result in a $\operatorname{null}$ update. Evaluation in $\mathbb{M}$ involves repeatedly updating the current pointed metagraph according to the grounding of the node currently pointed to. The conditions in Eq. 7 imply there will be at most one edge labeled with $\dagger$ in a metagraph, whose target $F$ specifies the rule by which the graph is updated. This is expressed via the single partial function, $\operatorname{update}:\mathcal{M}_{\mathbb{M}}\rightarrow\mathcal{M}_{\mathbb{M}}$ . The action of $\operatorname{update}$ is determined by the form of $F$ . If $F$ is not an activated subgraph, i.e. it is not the target of an $@$ -edge, the action $\operatorname{update}$ cannot be applied (i.e. evaluation halts). If, however, $F$ is the target of an $@$ -edge, $\operatorname{update}$ first checks if $F$ itself has any activated targets. If so, then $\operatorname{update}$ simply applies a graph rewrite which moves the pointer $\dagger$ to the first such activated target (in the ordering of the edge). If not, $\operatorname{update}$ applies $\operatorname{rewrite}_{F}$ , which automatically ensures that the update will finish with $\dagger$ pointing to the output subgraph, labeled $**$ . These dynamics define a reduced GLTS, with $X=\mathcal{M}_{\mathbb{M}}$ , $A=\{\operatorname{update}\}$ , and $f(M)=\{(\operatorname{update},M^{\prime})\;|\;\operatorname{update}(M)=M^{\prime}\}$ . Note that there may be multiple $M^{\prime}$ ’s for which $\operatorname{update}(M)=M^{\prime}$ if $\operatorname{rewrite}_{F}$ for a $\operatorname{fun-app}$ node is non-deterministic. Processes are defined by the fixed point $\text{Proc}=\nu(P_{\text{fin}}(A\times\triangleright X))$ . Normal forms of $\mathcal{M}_{\mathbb{M}}$ are metagraphs for which $\operatorname{update}$ cannot be applied (i.e. their grounding is $\operatorname{null}$ ).
Processes which reach a normal form are said to be terminating, and the initial expression of the process is said to evaluate to the normal form reached. Alternatively, certain expressions may not reach a normal form, resulting instead in a non-terminating computation.
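The pointer-driven evaluation just described can be sketched in Python on a toy term language. This is an illustration under strong simplifying assumptions: terms are trees rather than metagraphs, every `App` node is treated as activated (the role of an $@$ -edge), integers are inactive, the pointer $\dagger$ is a path of child indices, and the grounded rewrite is an ordinary Python function. The names `App`, `update` and `evaluate` are hypothetical. Following the text literally, a rewrite leaves the pointer on its output and evaluation halts as soon as the pointed node is not activated.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional, Tuple, Union


@dataclass(frozen=True)
class App:
    fn: Callable[..., int]       # grounded function, standing in for rewrite_F
    args: Tuple["Term", ...]


Term = Union[int, App]           # ints are inactive; App nodes count as activated
Path = Tuple[int, ...]           # position of the pointer within the term


def get(term: Term, path: Path) -> Term:
    for i in path:
        term = term.args[i]
    return term


def put(term: Term, path: Path, new: Term) -> Term:
    """Return a copy of term with the subterm at path replaced by new."""
    if not path:
        return new
    i = path[0]
    args = term.args[:i] + (put(term.args[i], path[1:], new),) + term.args[i + 1:]
    return App(term.fn, args)


def update(term: Term, path: Path) -> Optional[Tuple[Term, Path]]:
    node = get(term, path)
    if not isinstance(node, App):
        return None                              # pointed node not activated: halt
    for i, arg in enumerate(node.args):
        if isinstance(arg, App):
            return term, path + (i,)             # move pointer to first activated target
    return put(term, path, node.fn(*node.args)), path  # rewrite; pointer stays on output


def evaluate(term: Term, max_steps: int = 1000) -> Tuple[Term, Path]:
    path: Path = ()
    for _ in range(max_steps):
        step = update(term, path)
        if step is None:
            return term, path                    # normal form reached
        term, path = step
    raise RuntimeError("no normal form reached within max_steps")


flat = evaluate(App(operator.add, (1, 2)))
nested = evaluate(App(operator.add, (App(operator.mul, (2, 3)), 1)))
```

In this sketch `flat` is `(3, ())`. For `nested`, the pointer first moves to the inner product, rewrites it to `6`, and then halts there because `6` is not activated, leaving the outer application unevaluated; how the pointer is returned to an enclosing activated node is not specified by the description above and is therefore not modeled.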

4 Bisimulation of type systems in $\mathbb{M}$

As described in [10], in guarded cubical type theory, a bisimulation $R:X\rightarrow X\rightarrow U$ for the GLTS $(X,A,f)$ may be defined via the following dependent type:

$$\begin{aligned}
\operatorname{isGLTSBisim}_{f}R=\prod x,y:X.\;R(x,y)\rightarrow\;&\Big(\prod x^{\prime}:\triangleright X.\prod a:A.\;(a,x^{\prime})\in f(x)\rightarrow\exists y^{\prime}:\triangleright X.\prod a:A.\\
&\quad(a,y^{\prime})\in f(y)\times\triangleright(\alpha:\mathbb{T}).R(x^{\prime}[\alpha])(y^{\prime}[\alpha])\Big)\times\\
&\Big(\prod y^{\prime}:\triangleright X.\prod a:A.\;(a,y^{\prime})\in f(y)\rightarrow\exists x^{\prime}:\triangleright X.\prod a:A.\\
&\quad(a,x^{\prime})\in f(x)\times\triangleright(\alpha:\mathbb{T}).R(x^{\prime}[\alpha])(y^{\prime}[\alpha])\Big).
\end{aligned}\tag{10}$$

As shown in [10], this type is equivalent to the path type over the recursive data type of processes defined by the GLTS, $\text{Proc}=\operatorname{fix}X.P_{\text{fin}}(A\times\triangleright(\alpha:\mathbb{T}).X[\alpha])$. We may further define a bisimulation $R_{2}:X_{1}\rightarrow X_{2}\rightarrow U$ between two GLTS’s over a common action space, $(X_{1},A,f_{1})$ and $(X_{2},A,f_{2})$, via a bisimulation over their coproduct (see [1]):

$$\begin{aligned}
\operatorname{is2GLTSBisim}_{f_{1},f_{2}}R_{2}=\;&\operatorname{isGLTSBisim}_{f_{1}+f_{2}}R^{\prime}_{2}\times\forall x_{1}:X_{1}.\exists x_{2}:X_{2}.R_{2}(x_{1},x_{2})\times\\
&\forall x_{2}:X_{2}.\exists x_{1}:X_{1}.R_{2}(x_{1},x_{2})
\end{aligned}\tag{11}$$

where $R^{\prime}_{2}:(X_{1}+X_{2})\rightarrow(X_{1}+X_{2})\rightarrow U$, $R^{\prime}_{2}((a,x),(b,y))=R_{2}(x,y)$ when $a=1\wedge b=2$, $R^{\prime}_{2}(x,y)=\bot$ otherwise, and $f_{1}+f_{2}:(X_{1}+X_{2})\rightarrow\mathcal{P}(A\times(X_{1}+X_{2}))$ is defined similarly. Since $R_{2}$ contains at least one matching element for each $x_{1}$ and each $x_{2}$, we may extract functions $g_{1}:X_{1}\rightarrow X_{2}$ and $g_{2}:X_{2}\rightarrow X_{1}$ as subsets of $R_{2}$, where an element in the codomain of each is chosen arbitrarily when there are multiple matches in $R_{2}$. Since bisimulation corresponds to path equivalence for elements of each type, we can choose $\pi_{1}$ and $\pi_{2}$ such that $g_{2}\circ g_{1}\circ\pi_{1}=i_{1}$ and $g_{1}\circ g_{2}\circ\pi_{2}=i_{2}$, where $i_{1}$ and $i_{2}$ are the identity on $X_{1}$ and $X_{2}$ respectively, and $\pi_{1}(x)=x^{\prime}\Rightarrow\exists p:\operatorname{Path}_{X_{1}}(x,x^{\prime})$, $\pi_{2}(x)=x^{\prime}\Rightarrow\exists p:\operatorname{Path}_{X_{2}}(x,x^{\prime})$. Hence, $(g_{1},g_{2})$ is an equivalence between the recursive process types $\text{Proc}_{1}$ and $\text{Proc}_{2}$ of the two GLTS’s, meaning that $\operatorname{Path}_{U}(\text{Proc}_{1},\text{Proc}_{2})$ is inhabited by univalence.
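For intuition, the conditions of Eqs. 10 and 11 can be checked mechanically on finite labelled transition systems. The following Python sketch is an assumption-laden simplification: it ignores the later modality $\triangleright$ and the cubical structure entirely, represents each $f_{i}$ as a dictionary from states to sets of (action, successor) pairs, and computes the largest relation satisfying the two transfer conditions by iterated refinement. The function names are hypothetical.

```python
def largest_bisimulation(f1, f2):
    """Largest relation between states of f1 and f2 closed under the
    forth and back transfer conditions (classical, finite setting)."""
    rel = {(x, y) for x in f1 for y in f2}
    changed = True
    while changed:
        changed = False
        for (x, y) in list(rel):
            # Every move of x must be matched by a move of y ...
            forth = all(any(a == b and (x2, y2) in rel for (b, y2) in f2[y])
                        for (a, x2) in f1[x])
            # ... and every move of y by a move of x.
            back = all(any(a == b and (x2, y2) in rel for (a, x2) in f1[x])
                       for (b, y2) in f2[y])
            if not (forth and back):
                rel.discard((x, y))
                changed = True
    return rel


def is_total(rel, f1, f2):
    """The two extra conjuncts of Eq. 11: every state on each side
    is related to at least one state on the other side."""
    return (all(any((x, y) in rel for y in f2) for x in f1)
            and all(any((x, y) in rel for x in f1) for y in f2))


# Two toy systems over the single action "update".
F1 = {"p": {("update", "q")}, "q": set()}
F2 = {"s": {("update", "t"), ("update", "u")}, "t": set(), "u": set()}
F3 = {"s": {("update", "t")}, "t": {("update", "t")}}
```

On these toy inputs, `F1` and `F2` are related by a total bisimulation pairing `p` with `s` and `q` with both `t` and `u`, while `F1` and `F3` are not bisimilar because `F3` never halts.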

For a given type system, its computational content may be modeled by a GLTS by setting $X$ to be the type of expressions in the system, $A$ to contain an $\operatorname{update}$ action along with ‘actions’ corresponding to the judgmental and syntactic relations between expressions (e.g. is-of-type, is-of-subtype, is-a-body-of-lambda-term, and their opposite relations), and $f$ to be the relation over expressions corresponding to the reduction relation in the system for the action $\operatorname{update}$ (for instance $\beta$ -reduction). To show that $\mathbb{M}$ can be used as a metalanguage for a given type system, we thus show that there is a bisimulation between $\mathbb{M}$ with a specific form of Atomspace (i.e. containing specific atoms and/or additional constraints to those of Eq. 3), along with an expanded action space to incorporate the typing and syntactic relations relevant to the specific system, and the GLTS corresponding to computation in the target type system; hence the process spaces induced by the two systems are equivalent. Below, we sketch how this can be achieved for three type systems of interest, focusing on how the computational dynamics of the $\operatorname{update}$ rule correspond to reduction in the target system (the typing and syntactic relations in each system straightforwardly correspond in $\mathbb{M}$ to the inbuilt typing relation and relationships definable in terms of submetagraph composition respectively).

4.1 Simply typed lambda calculus

The syntax for the simply typed lambda calculus may be defined via mutually recursive definitions of variable, type and expression datatypes:

$$\begin{aligned}
\mathcal{V}&::=v_{n}\\
\mathcal{T}&::=t_{n}\;\;\|\;\;\mathcal{T}\rightarrow\mathcal{T}\\
\mathcal{E}&::=\mathcal{V}\;\;\|\;\;(\mathcal{E}\;\mathcal{E})\;\;\|\;\;\lambda v_{n}:\mathcal{T}.\mathcal{E}
\end{aligned}\tag{12}$$

We refrain from explicitly stating the rules for type assignment, which can be found in [2]. These rules determine a typing relation $\_:\_$ between $\mathcal{E}$ and $\mathcal{T}$ given a context $\Gamma$, which can be modeled as a partial map from $\mathcal{V}$ to $\mathcal{T}$. Together, these determine a set of valid expressions, $\mathcal{E}_{(\_:\_,\Gamma)}$, and the computational dynamics is defined by the $\beta$ -reduction relation over this type:

$$((\lambda v_{n_{1}}:t_{n_{2}}.e_{n_{3}})\;e_{n_{4}})\rightarrow_{\beta}e_{n_{3}}[v_{n_{1}}/e_{n_{4}}]\tag{13}$$

where $a[b/c]$ denotes substitution of $c$ for $b$ in $a$, with any bound variables in $c$ renamed so as not to clash with bound variables in $a$.
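As an illustration of the reduction relation in Eq. 13, the following Python sketch implements one $\beta$ -step with substitution on a tuple encoding of the syntax in Eq. 12. The encoding, the function names, and the leftmost-outermost strategy are assumptions made for the example (the relation itself does not fix a strategy); the sketch also renames the binder of the function body rather than the bound variables of the argument, which avoids capture in the same way, and fresh names are generated naively.

```python
import itertools

_fresh = itertools.count()  # naive supply of fresh variable suffixes


def free_vars(e):
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "app":
        return free_vars(e[1]) | free_vars(e[2])
    _, name, _ty, body = e          # ("lam", name, type, body)
    return free_vars(body) - {name}


def subst(e, name, value):
    """Replace free occurrences of `name` in `e` by `value`,
    renaming a binder when it would capture a free variable."""
    tag = e[0]
    if tag == "var":
        return value if e[1] == name else e
    if tag == "app":
        return ("app", subst(e[1], name, value), subst(e[2], name, value))
    _, bound, ty, body = e
    if bound == name:
        return e
    if bound in free_vars(value):
        new = f"{bound}_{next(_fresh)}"
        body = subst(body, bound, ("var", new))
        bound = new
    return ("lam", bound, ty, subst(body, name, value))


def step(e):
    """One leftmost-outermost beta step, or None at normal form."""
    tag = e[0]
    if tag == "app":
        fun, arg = e[1], e[2]
        if fun[0] == "lam":
            return subst(fun[3], fun[1], arg)
        r = step(fun)
        if r is not None:
            return ("app", r, arg)
        r = step(arg)
        return None if r is None else ("app", fun, r)
    if tag == "lam":
        r = step(e[3])
        return None if r is None else ("lam", e[1], e[2], r)
    return None


def normalize(e, fuel=1000):
    for _ in range(fuel):
        nxt = step(e)
        if nxt is None:
            return e
        e = nxt
    raise RuntimeError("fuel exhausted")
```

For example, applying the identity `("lam", "x", "t", ("var", "x"))` to `("var", "y")` normalizes to `("var", "y")`, and applying $\lambda x.\lambda y.x$ to the free variable `y` renames the inner binder so that `y` is not captured.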

To simulate the simply typed lambda calculus in $\mathbb{M}$, we restrict the $\mathbb{M}$ atomspace to include only metagraphs labeled with types using the restricted type syntax of Eq. 12, and including only keywords/symbols $\{:,=,\rightarrow,\operatorname{fun-app},@,\dagger\}$. Then, we add the following constraint to those of Eq. 3:

$$\forall m\in M.\;l(m)=(:,n_{1})\Rightarrow(m_{M}[1]\in\mathcal{S}\cup\mathcal{V})\wedge m_{M}[2]\in\mathcal{T}\tag{14}$$

Hence, all typing relations are between symbols or variables (representing global and local variables respectively) and types. The context $\Gamma$ is then represented by an atomspace consisting of a set of $:$ edges between symbols and types. A given lambda expression $e=\lambda x:t_{1}.e^{\prime}$, where $e^{\prime}:t_{2}$, is then simulated by choosing an unused symbol, $f_{e}\in\mathcal{S}$, and introducing the following atoms to atomspace:

$$\begin{aligned}
&(:\;f_{e}\;(\rightarrow\;t_{1}\;t_{2}))\\
&(=\;(f_{e}\;\$x)\;m_{e^{\prime}})
\end{aligned}\tag{15}$$

where $m_{e^{\prime}}$ is the metagraph corresponding to expression $e^{\prime}$ (we note that Eq. 15 defines a combinator corresponding to the lambda term $e$ ). With the atomspace so specified, reduction of an expression $e$ in context $\Gamma$ in the simply typed lambda calculus corresponds to repeated application of $\operatorname{update}$ to the pointed atomspace containing $\Gamma$ and $m_{e}$, with $@$ edges attached to all function application nodes, and the $\dagger$ pointing to $m_{e}$. The computation terminates with $\dagger$ pointing to the normal form of $e$. The required bisimulation thus involves pairing tuples $(\Gamma,e)$ in the simply typed lambda calculus with their corresponding pointed atomspaces in $\mathbb{M}$. We note further that the untyped lambda calculus can be defined by simply removing $\mathcal{T}$ from the syntax in Eq. 12, and letting lambda expressions take the form $\lambda v_{n}.\mathcal{E}$. All members of $\mathcal{E}$ are considered legal expressions, and the $\mathbb{M}$ bisimulation is achieved by converting all type symbols to $\top_{\operatorname{Type}}$, hence treating $\top_{\operatorname{Type}}$ as a Scott domain.
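The role of the equality atoms can be illustrated with a small term-rewriting sketch in Python. This is a toy stand-in, not MeTTa or $\mathbb{M}$ itself: expressions are nested tuples, strings beginning with `$` are pattern variables, each rule is a hypothetical `(lhs, rhs)` pair playing the part of an `=` atom, typing atoms are omitted, and `update` rewrites arguments before the enclosing application, loosely following the pointer-moving behaviour of $\operatorname{update}$.

```python
def match(pattern, term, env):
    """Extend `env` so that `pattern` instantiates to `term`, or return None."""
    if isinstance(pattern, str) and pattern.startswith("$"):
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if (isinstance(pattern, tuple) and isinstance(term, tuple)
            and len(pattern) == len(term)):
        for p, t in zip(pattern, term):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None


def instantiate(template, env):
    """Fill the variables of `template` with their bindings in `env`."""
    if isinstance(template, tuple):
        return tuple(instantiate(t, env) for t in template)
    return env.get(template, template)


def update(term, rules):
    """All results of one rewrite step; an empty list means normal form.
    The first reducible argument is rewritten before the term itself."""
    if isinstance(term, tuple):
        for i, sub in enumerate(term):
            subs = update(sub, rules)
            if subs:
                return [term[:i] + (s,) + term[i + 1:] for s in subs]
    return [instantiate(rhs, env) for lhs, rhs in rules
            if (env := match(lhs, term, {})) is not None]


# Hypothetical rules: f_e plays the combinator of a lambda term,
# and the two `coin` rules make rewriting non-deterministic.
RULES = [(("f_e", "$x"), ("g", "$x", "$x")),
         (("coin",), "heads"),
         (("coin",), "tails")]
```

With these rules, `update(("f_e", "a"), RULES)` yields the single result `("g", "a", "a")`, while `update(("f_e", ("coin",)), RULES)` yields two results, one per matching `coin` rule.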

4.2 Pure Type Systems

In a pure type system (PTS, [2]), types and terms are not distinguished syntactically. PTS expressions follow the syntax:

$$\begin{aligned}
\mathcal{V}&::=v_{n}\\
\mathcal{C}&::=s_{1...N}\\
\mathcal{E}&::=\mathcal{V}\;\;\|\;\;\mathcal{C}\;\;\|\;\;(\mathcal{E}\;\mathcal{E})\;\;\|\;\;\lambda v_{n}:\mathcal{E}.\mathcal{E}\;\;\|\;\;\prod v_{n}:\mathcal{E}.\mathcal{E}
\end{aligned}\tag{16}$$

Here, $\mathcal{C}$ is a set of constant symbols, which in a PTS are used to represent sorts. The typing relation $:$ for a PTS is defined via a set of axioms and rules. The former consist of a set of judgements $\mathcal{A}=\{s_{m}:s_{n}|(m,n)\in A\subset N\times N\}$, and the latter a set of triplets $\mathcal{R}=\{(s_{l},s_{m},s_{n})|(l,m,n)\in R\subset N\times N\times N\}$. The typing rules for a PTS are identical to those of the typed lambda calculus, except for the introduction rule for dependent products, which takes the form:

$$\frac{\Gamma\vdash A:s_{l}\quad\Gamma,x:A\vdash B:s_{m}\quad(s_{l},s_{m},s_{n})\in\mathcal{R}}{\Gamma\vdash(\prod x:A.B):s_{n}}$$

The legal expressions then consist of the sorts, and any expression that can be typed in a context $\Gamma$ , consisting of multiple typing judgments $e_{1}:e_{2}$ . The $\beta$ -reduction relation is established identically to the simple lambda calculus above. Notice that there is no restriction on the form of $\mathcal{A}$ and $\mathcal{R}$ ; hence the typing relation $:$ may be arbitrary between sorts (and hence may contain cycles), while the dependent product (i.e. dependent function types) may live in arbitrary sorts with respect to their inputs.
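The way $\mathcal{A}$ and $\mathcal{R}$ parameterize a PTS can be made concrete with a short Python sketch. The representation (sets of tuples of sort names) and the function names are assumptions for illustration; the System F specification uses the standard presentation with sorts written here as `"*"` and `"BOX"`.

```python
def product_sort(domain_sort, codomain_sort, rules):
    """Sorts s_n such that (s_l, s_m, s_n) is in R for the given
    s_l = domain_sort and s_m = codomain_sort; an empty set means the
    dependent product cannot be formed."""
    return {n for (l, m, n) in rules
            if l == domain_sort and m == codomain_sort}


def has_typing_cycle(axioms):
    """True when the ':' relation between sorts, read as a directed
    graph with an edge s_m -> s_n for each axiom s_m : s_n, has a cycle."""
    graph = {}
    for m, n in axioms:
        graph.setdefault(m, set()).add(n)

    def reaches_itself(start):
        stack, seen = list(graph.get(start, ())), set()
        while stack:
            s = stack.pop()
            if s == start:
                return True
            if s not in seen:
                seen.add(s)
                stack.extend(graph.get(s, ()))
        return False

    return any(reaches_itself(s) for s in graph)


# System F as a PTS: one axiom, two rules.
SYSTEM_F_AXIOMS = {("*", "BOX")}
SYSTEM_F_RULES = {("*", "*", "*"), ("BOX", "*", "*")}

# A cyclic specification containing the axiom s1 : s1.
CYCLIC_AXIOMS = {("s1", "s1")}
```

Under these definitions, products from `"*"` to `"*"` live in `"*"` in System F, products from `"*"` to `"BOX"` are not permitted, and only the second set of axioms contains a cycle.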

To simulate a PTS in $\mathbb{M}$ , we select a collection of fixed types $t_{1}...t_{N}$ to represent the sorts. We then add edges of the following forms to atomspace:

$$\begin{aligned}
&(:\;t_{m}\;t_{n}),\;\;\forall(s_{m},s_{n})\in\mathcal{A}\\
&(:\;(\rightarrow\;\$t_{a}\;\$t_{b})\;(\operatorname{transform}\;(:\;\$t_{a}\;t_{l})\wedge(:\;\$t_{b}\;t_{m})\;t_{n})),\;\;\forall(s_{l},s_{m},s_{n})\in\mathcal{R}\\
&(:\;(\prod\$x:\$t_{a}.\$m)\;(\operatorname{transform}\;(:\;\$t_{a}\;t_{l})\wedge(:\;\$m\;t_{m})\;t_{n})),\;\;\forall(s_{l},s_{m},s_{n})\in\mathcal{R}
\end{aligned}$$

As above, lambda expressions are simulated by adding atoms of the form in Eq. 15 to the atomspace, and a context $\Gamma$ is simulated by adding atoms corresponding to the typing relations it contains. Reduction of expression $e$ in context $\Gamma$ is simulated as previously by applying $\operatorname{update}$ to the pointed atomspace consisting of $\{\Gamma,e\}$ and the above constructions, along with $\dagger$ pointing to $e$. Further, we note that PTS’s can be regarded as a type-theoretic analogue of non-well-founded sets; from this viewpoint, a cyclical $:$ relation corresponds to an accessible pointed graph (apg) underlying a non-well-founded set. For instance, including the axiom $s_{1}:s_{1}$ in $\mathcal{A}$ defines $s_{1}$ as a type-theoretic analogue of a Quine atom. We note, however, that in the type-theoretic context, a cyclic PTS carries more structure than a non-well-founded set, since the rules ( $\mathcal{R}$ ) carry information about how the $\rightarrow$ constructor interacts with the $:$ relation. An interesting conjecture, though, would be that appropriately defined PTS’s provide bisimulations of systems of non-well-founded sets definable within a recursive datatype (via a coalgebra on the powerset functor, definable in GCTT), as a general system of set equations ([3]) involving both $\in$ and $\rightarrow$ relations.

4.3 Probabilistic dependent types

Finally, we outline a version of the probabilistic dependent type system introduced in [19], and its bisimulation in $\mathbb{M}$ . The syntax is a variation on the dependently typed lambda calculus:

$$\begin{aligned}
\mathcal{V}&::=v_{n}\\
\mathcal{T}&::=t_{n}\;\;\|\;\;\prod v_{n}:\mathcal{T}.\mathcal{E}\;\;\|\;\;\mathcal{D}(\mathcal{T})\;\;\|\;\;\mathcal{T}\cup\mathcal{T}\;\;\|\;\;\mathcal{T}\cap\mathcal{T}\;\;\|\;\;\operatorname{Type}\\
\mathcal{E}&::=\mathcal{V}\;\;\|\;\;(\mathcal{E}\;\mathcal{E})\;\;\|\;\;\lambda v_{n}:\mathcal{T}.\mathcal{E}\;\;\|\;\;\operatorname{random}_{\rho}(\mathcal{E},\mathcal{E})\;\;\|\;\;\operatorname{sample}(\mathcal{E})\;\;\|\;\;\operatorname{thunk}(\mathcal{E})
\end{aligned}$$

Further, we allow the judgments $\mathcal{E}:\mathcal{T}$ (typing), $\mathcal{T}\preceq\mathcal{T}$ (subtyping), and $\mathcal{E}\rightarrow_{\beta}^{\rho}\mathcal{E}$ (weighted $\beta$ -reduction), where $\rho\in\mathbb{R}$. The typing rules are as for the dependent typed lambda calculus for expressions not involving subtypes or probabilistic terms. The typing rules for subtypes include the standard $\Gamma\vdash a:A,\;A\preceq B\Rightarrow\Gamma\vdash a:B$, $\Gamma\vdash A,B:\operatorname{Type}\Rightarrow\Gamma\vdash A\cap B\preceq A,\;A\cap B\preceq B,\;A\preceq A\cup B,\;B\preceq A\cup B$, $\Gamma\vdash A\preceq B\Rightarrow\prod v_{n}:B.\mathcal{E}\preceq\prod v_{n}:A.\mathcal{E}$, $\Gamma,x:t\vdash A\preceq B\Rightarrow\prod x:t.A\preceq\prod x:t.B$. These interact with the probabilistic terms via the following special rules:

$$\frac{\Gamma\vdash a:t_{1},\;b:t_{2}}{\Gamma\vdash\operatorname{random}_{\rho}(a,b):t_{1}\cup t_{2}}$$

$$\frac{\Gamma\vdash A:\operatorname{Type},\;p_{A}:\mathcal{D}(A)}{\Gamma\vdash\operatorname{sample}(p_{A}):A}$$

$$\frac{\Gamma\vdash a:A}{\Gamma\vdash\operatorname{thunk}(a):\mathcal{D}(A)}$$

where we note that $\mathcal{D}(A)$ denotes the type of distributions over $A$ (so, for instance, if $a:t_{1},\;b:t_{2}$, then $\operatorname{thunk}(\operatorname{random}_{\rho}(a,b)):\mathcal{D}(t_{1}\cup t_{2})$ ). For all expressions not involving probabilistic terms, $e_{1}\rightarrow_{\beta}e_{2}$ in the dependent typed lambda calculus implies $e_{1}\rightarrow_{\beta}^{1}e_{2}$ in the PDTS above. For probabilistic terms, we have the following computational rules:

$$\begin{aligned}
\operatorname{random}_{\rho}(a,b)&\rightarrow_{\beta}^{\rho}a\\
\operatorname{random}_{\rho}(a,b)&\rightarrow_{\beta}^{1-\rho}b\\
\operatorname{sample}(\operatorname{thunk}(p_{A}))&\rightarrow_{\beta}^{1}p_{A}
\end{aligned}\tag{19}$$

Computationally, evaluation may proceed by stochastic $\beta$ -reduction (i.e. sampling a reduction according to the weights $\rho$ ), or a ‘full evaluation’ may be made, by returning the set of all possible reduction sequences from a term, annotated with the total probability of each. We note that in any given reduction sequence, $e_{1}\rightarrow_{\beta}^{\rho}e_{2}$ for $\rho>0$ implies $t_{2}\preceq t_{1}$ where $e_{1}:t_{1},\;e_{2}:t_{2}$.
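A ‘full evaluation’ of the probabilistic terms can be sketched in Python for the fragment consisting of constants, $\operatorname{random}_{\rho}$, $\operatorname{sample}$ and $\operatorname{thunk}$. The tuple encoding and function names are assumptions; lambda terms are omitted, thunks are treated as values whose contents are not reduced, and, as a simplification of the description above, probabilities are aggregated per normal form rather than per reduction sequence.

```python
from collections import defaultdict
from fractions import Fraction


def steps(e):
    """Weighted one-step reducts of `e` as a list of (probability, term)."""
    if isinstance(e, tuple):
        if e[0] == "random":
            _, rho, a, b = e
            return [(rho, a), (1 - rho, b)]
        if e[0] == "sample":
            inner = e[1]
            if isinstance(inner, tuple) and inner[0] == "thunk":
                return [(Fraction(1), inner[1])]
            # Otherwise reduce the argument until it becomes a thunk.
            return [(p, ("sample", r)) for p, r in steps(inner)]
    return []  # constants and thunks are normal forms in this sketch


def full_evaluation(e):
    """Map each reachable normal form to its total probability."""
    dist = defaultdict(Fraction)
    stack = [(Fraction(1), e)]
    while stack:
        p, term = stack.pop()
        reducts = steps(term)
        if not reducts:
            dist[term] += p
        for q, r in reducts:
            if q > 0:
                stack.append((p * q, r))
    return dict(dist)


# sample(random_{1/4}(thunk(a), thunk(random_{1/2}(a, b))))
EXAMPLE = ("sample",
           ("random", Fraction(1, 4),
            ("thunk", "a"),
            ("thunk", ("random", Fraction(1, 2), "a", "b"))))
```

For `EXAMPLE`, the outcome `a` is reached with probability $1/4+3/4\cdot 1/2=5/8$ and `b` with probability $3/8$. Stochastic $\beta$ -reduction would instead pick a single reduct from `steps` at each stage according to the weights.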

For the formulation in $\mathbb{M}$, we constrain the typing relation and encode lambda terms as in Eqs. 14 and 15; further, as above we encode contexts $\Gamma$ by fixing atoms of the form $:$ in atomspace. To encode the probabilistic terms, we choose fixed symbols $s_{1...4}$ to correspond to $\operatorname{Distribution},\operatorname{random},\operatorname{sample},\operatorname{thunk}$. Then, we fix the following atoms in atomspace:

$$\begin{aligned}
&(:\;\operatorname{Distribution}\;(\rightarrow\;\operatorname{Type}\;\operatorname{Type})),\\
&(:\;\operatorname{random}\;(\rightarrow\;\$t_{1}\;\$t_{2}\;\$t_{1}\cup\$t_{2})),\\
&(=\;(\operatorname{random}\;\$a\;\$b)\;\$a),\\
&(=\;(\operatorname{random}\;\$a\;\$b)\;\$b),\\
&(:\;\operatorname{sample}\;(\rightarrow\;(\operatorname{Distribution}\;\$t_{1})\;\$t_{1})),\\
&(:\;\operatorname{thunk}\;(\rightarrow\;\$t_{1}\;(\operatorname{Distribution}\;\$t_{1}))),\\
&(=\;(\operatorname{sample}\;(\operatorname{thunk}\;\$a))\;\$a)
\end{aligned}\tag{20}$$

Application of $\operatorname{update}$ to the pointed atomspace so defined, with $\dagger$ pointing to $m_{e}$ (corresponding to expression $e$ ), results in a simulation of a probabilistic reduction of $e$ in the PDTS above. As defined, $\operatorname{update}$ will simulate the ‘full evaluation’ of all possible paths, and hence a bisimulation exists between full evaluation dynamics in the PDTS GLTS using $\beta\rho$ -reduction and the GLTS defined by $\mathbb{M}$ with the restricted atomspace above. We note that, in both cases, the weights on particular paths are lost, since the $\rho$ values are not explicitly recorded; however, it is straightforward to define a GLTS over the extended system, $(X\times\mathbb{R},A,f)$, where $f(x)=\{((x_{1},p_{1}),a_{1}),((x_{2},p_{2}),a_{2}),...\}$ denotes that action $a$ on $x$ results in $x_{1}$ with probability $p_{1}$, $x_{2}$ with probability $p_{2}$, and so on.

5 Implementation of Bisimulation proof in a Guarded Cubical Type Theory type checker

We briefly give an example to show the feasibility of our approach with an implementation of a bisimulation proof for a small-scale type system in a Guarded Cubical Type Theory type checker [4]. Here, we model a minimal type system, which has one type constant $A:\operatorname{Type}$ with two constructors $v_{1},v_{2}:A$; one function constant $f_{1}:A\rightarrow A$, where $f_{1}(v_{1})=v_{2}$ and $f_{1}(v_{2})=v_{1}$; and includes the $\operatorname{sample}$ and $\operatorname{thunk}$ constructs, which are combined following the syntax of Sec. 4.3. Our implementation models a fragment of this system where expressions are restricted to include at most three subexpressions. Hence, valid expressions of the language include: $(f_{1}\;(f_{1}\;v_{1}))$, $(\operatorname{thunk}\;(f_{1}\;v_{2}))$, $(\operatorname{sample}\;(\operatorname{thunk}\;v_{1}))$, $(f_{1}\;v_{2})$. Our implementation in a Haskell-based Guarded Cubical Type Theory type checker [4] is given in Appendix A. Here, we implement evaluation in this system via (i) a pattern matcher over an atomspace (‘update’), and (ii) direct implementation of $\beta$ -reduction via case analysis over the expression space (‘beta3’). We define GLTS’s using both forms of evaluation (‘str1’ and ‘str2’), and finally derive a proof that these GLTS’s are bisimilar (‘bisim’). The code for this example is also provided at: https://github.com/jwarrell/metta_bisimulation
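The two styles of evaluation can be mimicked outside the type checker with a small Python sketch. It is not the Appendix A code and proves nothing: it only tests, on an enumerated set of well-typed expressions, that a rule-based rewriter and a direct case analysis reach the same normal forms, which is a much weaker property than the bisimulation established in GCTT. All names are hypothetical, and both evaluators are assumed to reduce inside thunks.

```python
# Rewrite rules standing in for '=' atoms; "$a" is a pattern variable.
RULES = [(("f1", "v1"), "v2"),
         (("f1", "v2"), "v1"),
         (("sample", ("thunk", "$a")), "$a")]


def match(pat, term, env):
    # Patterns here are linear, so no repeated-variable check is needed.
    if isinstance(pat, str) and pat.startswith("$"):
        return {**env, pat: term}
    if isinstance(pat, tuple) and isinstance(term, tuple) and len(pat) == len(term):
        for p, t in zip(pat, term):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pat == term else None


def fill(template, env):
    if isinstance(template, tuple):
        return tuple(fill(t, env) for t in template)
    return env.get(template, template)


def update(term):
    """One innermost rewrite step, or None at normal form."""
    if isinstance(term, tuple):
        for i, sub in enumerate(term):
            r = update(sub)
            if r is not None:
                return term[:i] + (r,) + term[i + 1:]
    for lhs, rhs in RULES:
        env = match(lhs, term, {})
        if env is not None:
            return fill(rhs, env)
    return None


def eval_update(term):
    while (nxt := update(term)) is not None:
        term = nxt
    return term


def eval_direct(term):
    """Evaluation by case analysis on the head symbol."""
    if isinstance(term, str):
        return term
    head, arg = term
    val = eval_direct(arg)
    if head == "f1":
        return {"v1": "v2", "v2": "v1"}[val]
    if head == "sample":
        return val[1]            # val is ("thunk", x) for well-typed input
    return ("thunk", val)


def exprs(depth):
    """Well-typed expressions of type A and D(A) up to a nesting depth."""
    a, d = ["v1", "v2"], []
    for _ in range(depth):
        a, d = (["v1", "v2"] + [("f1", e) for e in a]
                + [("sample", t) for t in d]), [("thunk", e) for e in a]
    return a + d
```

On this fragment, `all(eval_update(e) == eval_direct(e) for e in exprs(3))` holds, and for instance `(f1 (f1 v1))` evaluates to `v1` under both evaluators.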

6 Discussion

In the above, we have introduced a formal meta-probabilistic programming language, formalized in GCTT, and proposed that bisimulations link the specific object-languages (or domain specific languages) outlined above with their simulations in $\mathbb{M}$. Specifically, we have proposed that the restricted forms of $\mathbb{M}$ outlined in Secs. 4.1, 4.2 and 4.3 form bisimulations of the simply typed lambda calculus, arbitrary PTS’s, and the target PDTS, respectively.

Finally, we mention some of the areas of investigation opened up by the formal model outlined. First, we note that, while we have focused on ‘full’ probabilistic programming evaluation, other possibilities include investigation of sampling based evaluation which performs only one meta-graph update at each step, stochastically chosen from the possible graph rewriting locations. Second, we intend to derive further bisimulations for other kinds of probabilistic logic, particularly, probabilistic paraconsistent logic [7], and probabilistic analogues of pure type systems [2], which may be suitable for models involving infinite-order probabilities [5]. Lastly, we intend to expand our implementation of aspects of this framework in Guarded Cubical Agda [17] to provide more complete implementations of the metalanguage and type systems explored here.

References

[1] Baier, C. and Katoen, J.P., 2008. Principles of model checking. MIT press.
[2] Barendregt, H. and Augustsson, L., 1992. Lambda calculi with types. Handbook of Logic in Computer Science, 34, pp.239-250.
[3] Barwise, J. and Moss, L., 1996. Vicious circles: on the mathematics of non-wellfounded phenomena.
[4] Birkedal, L., Bizjak, A., Clouston, R., Grathwohl, H.B., Spitters, B. and Vezzosi, A., 2016. Guarded cubical type theory: Path equality for guarded recursion. arXiv preprint arXiv:1606.05223.
[5] Goertzel, B., 2008. Modeling Uncertain Self-Referential Semantics with Infinite-Order Probabilities.
[6] Goertzel, B., 2020. Folding and Unfolding on Metagraphs. arXiv preprint arXiv:2012.01759.
[7] Goertzel, B., 2020. Paraconsistent Foundations for Probabilistic Reasoning, Programming and Concept Formation. arXiv preprint arXiv:2012.14474.
[8] Goertzel, B., 2021. Reflective Metagraph Rewriting as a Foundation for an AGI ‘Language of Thought’. arXiv preprint arXiv:2112.08272.
[9] Harper, R., 2012. Notes on logical frameworks. Lecture notes, Institute for Advanced Study, Nov, 29, p.34.
[10] Møgelberg, R.E. and Veltri, N., 2019. Bisimulation as path type for guarded recursive types. Proceedings of the ACM on Programming Languages, 3(POPL), pp.1-29.
[11] Møgelberg, R.E. and Paviotti, M., 2019. Denotational semantics of recursive types in synthetic guarded domain theory. Mathematical Structures in Computer Science, 29(3), pp.465-510.
[12] Mokhov, A., 2017. Algebraic graphs with class (functional pearl). ACM SIGPLAN Notices, 52(10), pp.2-13.
[13] Paviotti, M., Møgelberg, R.E. and Birkedal, L., 2015. A model of PCF in guarded type theory. Electronic Notes in Theoretical Computer Science, 319, pp.333-349.
[14] Potapov, A., 2021. MeTTa language specification. https://wiki.opencog.org/w/Hyperon.
[15] Staton, S., Wood, F., Yang, H., Heunen, C. and Kammar, O., 2016, July. Semantics for probabilistic programming: higher-order functions, continuous distributions, and soft constraints. In 2016 31st annual acm/ieee symposium on logic in computer science (lics) (pp. 1-10).
[16] TrueAGI, 2021. Hyperon-experimental repository. https://github.com/trueagi-io/hyperon-experimental.
[17] Veltri, N. and Vezzosi, A., 2020, January. Formalizing $\pi$ -calculus in guarded cubical Agda. In Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs (pp. 270-283).
[18] Vezzosi, A., Mörtberg, A. and Abel, A., 2021. Cubical Agda: A dependently typed programming language with univalence and higher inductive types. Journal of Functional Programming, 31.
[19] Warrell, J. and Gerstein, M., 2018. Dependent Type Networks: A Probabilistic Logic via the Curry-Howard Correspondence in a System of Probabilistic Dependent Types. In Uncertainty in Artificial Intelligence, Workshop on Uncertainty in Deep Learning. http://www.gatsby.ucl.ac.uk/~balaji/udl-camera-ready/UDL-19.pdf.

Appendices

Appendix A Proof of Bisimulation for Small-scale Type System in a Guarded Cubical Type Theory type checker

Below, we provide the code for the example discussed in Sec. 5, which uses a Haskell-based GCTT type checker [4]. The code for this example is also provided at: https://github.com/jwarrell/metta_bisimulation