University of Massachusetts Lowell, Lowell MA 01854, USA

A Calculus for Language Transformations

Benjamin Mourad and Matteo Cimini

Abstract

In this paper we propose a calculus for expressing algorithms for programming language transformations. We present the type system and operational semantics of the calculus, and we prove that it is type sound. We have implemented our calculus, and we demonstrate its applicability with common examples in programming languages. As our calculus manipulates inference systems, our work can, in principle, be applied to logical systems.

1 Introduction

Operational semantics is a de facto standard for defining the semantics of programming languages [PLOTKIN]. However, producing a programming language definition is still a hard task. It is not surprising that theoretical and software tools for supporting the modeling of languages based on operational semantics have received attention in research [LangWorkbenches, Rosu2010, Redex]. In this paper, we address an important aspect of language reuse which has not received attention so far: producing language definitions from existing ones by the application of transformation algorithms. Such algorithms may automatically add features to the language, or switch to different semantics styles. We aim at providing theoretical foundations and a software tool for this aspect.

Consider the typing rule of function application below on the left and its version with algorithmic subtyping on the right.

\displaystyle\inference{\Gamma\vdash\;e_{1}:T_{1}\to T_{2}\qquad\Gamma\vdash\;e_{2}:T_{1}}{\Gamma\vdash\;e_{1}\;e_{2}:T_{2}}\;\textsc{(t-app)}~~\stackrel{f(\textsc{t-app})}{\Longrightarrow}~~\inference{\Gamma\vdash\;e_{1}:T_{11}\to T_{2}\qquad\Gamma\vdash\;e_{2}:T_{12}\qquad T_{12}<:T_{11}}{\Gamma\vdash\;e_{1}\;e_{2}:T_{2}}\;\textsc{(t-app')}

Intuitively, we can describe (t-app’) as a function of (t-app). Such a function includes, at least, giving new variable names when a variable is mentioned more than once, and must relate the new variables with subtyping according to the variance of types (covariant vs contravariant). Our question is: Can we express, easily, language transformations in a safe calculus?

Language transformations are beneficial for a number of reasons. On the theoretical side, they isolate and make explicit the insights that underlie some programming language features or semantics styles. On the practical side, language transformations do not apply just to one language but to several languages. They can alleviate the burden on language designers, who can use them to automatically generate new language definitions using well-established algorithms rather than manually defining them, an error-prone endeavor.

In this paper, we make the following contributions.

• We present $\mathcal{L}\textendash\textsf{Tr}$ (pronounced “Elter”), a formal calculus for language transformations (Section 2). We define the syntax (Section 2.1), operational semantics (Section 2.2), and type system (Section 2.3) of $\mathcal{L}\textendash\textsf{Tr}$ .

• We prove that $\mathcal{L}\textendash\textsf{Tr}$ is type sound (Section 2.3).

• We show the applicability of $\mathcal{L}\textendash\textsf{Tr}$ to the specification of two transformations: adding subtyping and switching from small-step to big-step semantics (Section 3). Our examples show that $\mathcal{L}\textendash\textsf{Tr}$ is expressive and offers a rather declarative style to programmers.

• We have implemented $\mathcal{L}\textendash\textsf{Tr}$ [ltr], and we report that we have applied our transformations to several language definitions.

Related work is discussed in Section LABEL:related, and Section LABEL:conclusion concludes the paper.

2 A Calculus for Language Transformations

We focus on language definitions in the style of operational semantics. To briefly summarize, languages are specified with a BNF grammar and a set of inference rules. BNF grammars have grammar productions such as $\text{\sf Types}\;T::=B\mid\;T\to T$ . We call Types a category name, $T$ is a grammar meta-variable, and $B$ and $T\to T$ , as well as, for example, $(\lambda x.e\;v)$ , are terms. $(\lambda x.e\;v)\longrightarrow e[v/x]$ and $\Gamma\vdash(e_{1}\;e_{2}):T_{2}$ are formulae. An inference rule $\inference{f_{1},\ldots,f_{n}}{f}$ has a set of formulae above the horizontal line, which are called premises, and a formula below the horizontal line, which is called the conclusion.

2.1 Syntax of $\mathcal{L}\textendash\textsf{Tr}$

Below we show the $\mathcal{L}\textendash\textsf{Tr}$ syntax for language definitions, which reflects the operational semantics style of defining languages. Sets are accommodated with lists.

$cname\in\textsc{CatName},{X}\in\textsc{Meta-Var},opname\in\textsc{OpName},predname\in\textsc{PredName}$

\begin{array}[]{l@{\;\;}lcl}\text{\sf Language}&\mathcal{L}&::=&(G,R)\\ \text{\sf Grammar}&G&::=&\{s_{1},\ldots,s_{n}\}\\ \text{\sf Grammar Pr.}&s&::=&cname\;{X}::=lt\\ \text{\sf Rule}&r&::=&\inference{lf}{f}\\ \text{\sf Formula}&f&::=&predname\;lt\\ \text{\sf Term}&t&::=&{X}\mid opname\;lt\mid({X})t\mid t[t/{X}]\\ \text{\sf List of Rules}&R&::=&{\mathtt{nil}}\mid\mathtt{cons}\;r\;R\\ \text{\sf List of Formula}&\mathit{lf}&::=&{\mathtt{nil}}\mid\mathtt{cons}\;f\;\mathit{lf}\\ \text{\sf List of Terms}&lt&::=&{\mathtt{nil}}\mid\mathtt{cons}\;t\;lt\end{array}

We assume a set of category names CatName, a set of meta-variables Meta-Var, a set of constructor operator names OpName, and a set of predicate names PredName. We assume that these sets are pairwise disjoint. OpName contains elements such as $\to$ and $\lambda$ (elements need not be (string) names). PredName contains elements such as $\vdash$ and $\longrightarrow$ . To facilitate the modeling of our calculus, we assume that terms and formulae are defined in abstract syntax tree fashion. Here this means that they always have a top-level constructor applied to a list of terms. $\mathcal{L}\textendash\textsf{Tr}$ also provides syntax to specify unary binding $({z})t$ and capture-avoiding substitution $t[t/{z}]$ . Therefore, $\mathcal{L}\textendash\textsf{Tr}$ is tailored for static scoping rather than dynamic scoping. Lists can be built as usual with the ${\mathtt{nil}}$ and $\mathtt{cons}$ operators. We sometimes use the shorthand $[o_{1},\ldots,o_{n}]$ for the corresponding series of $\mathtt{cons}$ applications terminated by ${\mathtt{nil}}$ .

As an example, the typing rule for function application and the $\beta$ -reduction rule are written as follows ( $app$ is the top-level operator name for function application).

\displaystyle\inference{[\;\vdash\;[{\Gamma},{e_{1}},(\to[{T_{1}},{T_{2}}])],\;\vdash\;[{\Gamma},{e_{2}},{T_{1}}]\;]}{\vdash\;[{\Gamma},(app\;[{e_{1}},{e_{2}}]),{T_{2}}]}\qquad\inference{[]}{\longrightarrow\;[(app\;[(\lambda\;[({x}){e}]),{v}]),{e}[{v}/x]]}
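To make the abstract-syntax-tree view concrete, the following is a minimal sketch of how language definitions could be represented as data. It is an illustration only and is not taken from the implementation of $\mathcal{L}\textendash\textsf{Tr}$ : the class names `Var`, `Op`, `Bind`, `Subst`, `Formula`, and `Rule`, the helper functions, and the ASCII spellings `|-`, `->`, `-->`, and `lam` are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:            # a meta-variable X
    name: str


@dataclass(frozen=True)
class Op:             # opname applied to a list of terms
    name: str
    args: Tuple["Term", ...]


@dataclass(frozen=True)
class Bind:           # unary binding (X)t
    var: str
    body: "Term"


@dataclass(frozen=True)
class Subst:          # substitution t[t'/X]
    body: "Term"
    arg: "Term"
    var: str


Term = Union[Var, Op, Bind, Subst]


@dataclass(frozen=True)
class Formula:        # predname applied to a list of terms
    pred: str
    args: Tuple[Term, ...]


@dataclass(frozen=True)
class Rule:           # premises above the line, conclusion below
    premises: Tuple[Formula, ...]
    conclusion: Formula


def show_term(t: Term) -> str:
    """Print a term in the bracketed list notation used in the text."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Op):
        return f"({t.name} [{', '.join(show_term(a) for a in t.args)}])"
    if isinstance(t, Bind):
        return f"({t.var}){show_term(t.body)}"
    return f"{show_term(t.body)}[{show_term(t.arg)}/{t.var}]"


def show_formula(f: Formula) -> str:
    return f"{f.pred} [{', '.join(show_term(a) for a in f.args)}]"


# The typing rule for application, with "|-" standing for the typing predicate.
T_APP = Rule(
    premises=(
        Formula("|-", (Var("Gamma"), Var("e1"), Op("->", (Var("T1"), Var("T2"))))),
        Formula("|-", (Var("Gamma"), Var("e2"), Var("T1"))),
    ),
    conclusion=Formula("|-", (Var("Gamma"), Op("app", (Var("e1"), Var("e2"))), Var("T2"))),
)

# The beta-reduction rule: no premises, "-->" standing for the reduction predicate.
BETA = Rule(
    premises=(),
    conclusion=Formula(
        "-->",
        (
            Op("app", (Op("lam", (Bind("x", Var("e")),)), Var("v"))),
            Subst(Var("e"), Var("v"), "x"),
        ),
    ),
)
```

Tuples are used here in place of $\mathtt{nil}$ / $\mathtt{cons}$ lists purely for brevity; the structure mirrors the grammar above, where every formula and every operator application carries a list of terms.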

Below we show the rest of the syntax of $\mathcal{L}\textendash\textsf{Tr}$ .
$x\in\textsc{Var},str\in\textsc{String},\{\mathit{self},\mathit{premises},\mathit{conclusion}\}\subseteq\textsc{Var}$

\begin{array}[]{l@{\;\;}lcl}\text{\sf Expression}&e&::=&x\mid cname\mid str\mid{\hat{t}}\mid{\hat{f}}\mid{\hat{r}}\\ &&&\mid\mathtt{nil}\mid\mathtt{cons}\;e\;e\mid{\mathtt{head}}\;e\mid{\mathtt{tail}}\;e\mid e@e\\ &&&\mid\mathtt{map}(e,e)\mid e(e)\mid\mathtt{mapKeys}\;e\\ &&&\mid\mathtt{just}\;e\mid\mathtt{nothing}\mid\mathtt{get}\;e\\ &&&\mid cname\;{X}::=e\;\mid cname\;{X}::=\ldots\;e\\ &&&\mid\mathtt{getRules}\mid\mathtt{setRules}\;e\\ &&&\mid{e}[p]:\;e\mid{e(\mathtt{keep})}[p]:\;e\mid\mathtt{uniquefy}(e,e,str)\Rightarrow(x,x):e\\ &&&\mid\mathtt{if}\;b\;\mathtt{then}\;e\;\mathtt{else}\;e\mid e\;;\;e\mid e;_{\textsf{r}}e\mid\mathtt{skip}\\ &&&\mid\mathtt{newVar}\mid e\texttt{'}\mid\mathtt{fold}\;predname\;e\\ &&&\mid\mathtt{error}\\ \text{\sf Boolean Expr.}&b&::=&e==e\mid\mathtt{isEmpty}\;e\mid e\;\mathtt{in}\;e\mid\mathtt{isNothing}\;e\mid b\;\mathtt{and}\;b\mid b\;\mathtt{or}\;b\mid\mathtt{not}\;b\\ \text{\sf$\mathcal{L}\textendash\textsf{Tr}$ Rule}&{\hat{r}}&::=&\inference{e}{e}\\ \text{\sf$\mathcal{L}\textendash\textsf{Tr}$ Formula}&{\hat{f}}&::=&predname\;e\mid x\;e\\ \text{\sf$\mathcal{L}\textendash\textsf{Tr}$ Term}&{\hat{t}}&::=&{X}\mid opname\;e\mid x\;e\mid({X})e\mid e[e/{X}]\\ \text{\sf Pattern}&p&::=&x:T\mid predname\;p\mid opname\;p\mid x\;p\mid\mathtt{nil}\mid\mathtt{cons}\;p\;p\\ \text{\sf Value}&v&::=&{t}\mid{f}\mid{r}\mid cname\mid str\\ &&&\mid\mathtt{nil}\mid\mathtt{cons}\;v\;v\mid\mathtt{map}(v,v)\mid\mathtt{just}\;v\mid\mathtt{nothing}\mid\mathtt{skip}\end{array}

Programmers write expressions to specify transformations. At run-time, an expression will be executed with a language definition. Evaluating an expression may modify the current language definition.

Design Principles: We strive to offer well-crafted operations that map well to the language manipulations that are frequent in adding features to languages or switching semantics styles. There are three features that we can point out which exemplify our approach the most: 1) The ability to program parts of rules, premises and grammars, 2) selectors ${e}[p]:\;e$ , and 3) the $\mathtt{uniquefy}$ operation. Below, we describe the syntax for transformations, and place some emphasis on motivating these three operations.
Basic Data Types: $\mathcal{L}\textendash\textsf{Tr}$ has strings and has lists with typical operators for extracting their head and tail, as well as for concatenating them ( $@$ ). $\mathcal{L}\textendash\textsf{Tr}$ also has maps (key-value). In $\mathtt{map}(e_{1},e_{2})$ , $e_{1}$ and $e_{2}$ are lists. The first element of $e_{1}$ is the key for the first element of $e_{2}$ , and so on for the rest of the elements. Such a representation fits our language transformation examples better, as we shall see in Section 3. Operation $e_{1}(e_{2})$ queries a map, where $e_{1}$ is a map and $e_{2}$ is a key, and $\mathtt{mapKeys}\;e$ returns the list of keys of the map $e$ . Maps are convenient in $\mathcal{L}\textendash\textsf{Tr}$ to specify information that is not expressible in the language definition. For example, we can use maps to store information about whether some type argument is covariant or contravariant, or to store information about the input-output mode of the arguments of relations. Section 3 shows that we use maps in this way extensively. $\mathcal{L}\textendash\textsf{Tr}$ also has options ( $\mathtt{just}$ , $\mathtt{nothing}$ , and $\mathtt{get}$ ). We include options because they are frequently used in combination with the selector operator described below. Programmers can refer to grammar categories ( $cname$ ) in positions where a list is expected. When $cname$ is used, the corresponding list of grammar items is retrieved.
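The paired-list representation of maps can be sketched as follows. This is an illustrative model only: the function names `make_map`, `lookup`, and `map_keys` are hypothetical stand-ins for $\mathtt{map}(e_{1},e_{2})$ , $e_{1}(e_{2})$ , and $\mathtt{mapKeys}$ , and raising `KeyError` on a missing key is an assumption of this sketch rather than the documented behavior of the calculus.

```python
def make_map(keys, values):
    """Model map(e1, e2): the i-th key is associated with the i-th value."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    return (list(keys), list(values))


def lookup(m, key):
    """Model e1(e2): return the value stored under `key`."""
    keys, values = m
    for k, v in zip(keys, values):
        if k == key:
            return v
    raise KeyError(key)  # assumption: a failed query is an error


def map_keys(m):
    """Model mapKeys e: return the list of keys."""
    return list(m[0])


# Variance information for the function type constructor, written "->" here.
variance = make_map(["->"], [["contravariant", "covariant"]])

# Input-output modes for the typing predicate, written "|-" here.
modes = make_map(["|-"], [["in", "in", "out"]])
```

With these definitions, `lookup(variance, "->")` yields the list of variance labels, one per argument of the operator, which is the shape of information that the $\mathtt{uniquefy}$ operation described below expects.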
Grammar Instructions: $cname\;{X}::=e$ is essentially a grammar production. With this instruction, the current grammar is augmented with this production. $cname\;{X}::=\ldots\;e$ (notice the dots) adds the terms in $e$ to an existing production. $\mathtt{getRules}$ and $\mathtt{setRules}\;e$ retrieve and set the current list of rules, respectively.
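The effect of the two grammar instructions can be sketched as follows. The sketch models a grammar as a Python dictionary from category names to a pair of meta-variable and list of terms; the names `new_syntax`, `add_syntax`, and `LtrError` are hypothetical, and terms are plain strings for brevity. The failure condition follows the description of (r-add-syntax-fail) in Section 2.2.

```python
class LtrError(Exception):
    """Stands in for the `error` expression of the calculus."""


def new_syntax(grammar, cname, metavar, terms):
    """Model `cname X ::= e`: drop any old production for cname, add the new one."""
    updated = {c: prod for c, prod in grammar.items() if c != cname}
    updated[cname] = (metavar, list(terms))
    return updated


def add_syntax(grammar, cname, metavar, terms):
    """Model `cname X ::= ... e`: extend an existing production with new terms."""
    if cname not in grammar or grammar[cname][0] != metavar:
        # The category does not exist, or the meta-variable does not match.
        raise LtrError(f"no production {cname} {metavar} to extend")
    _, old_terms = grammar[cname]
    # The resulting production contains both old and new grammar terms.
    return new_syntax(grammar, cname, metavar, old_terms + list(terms))


grammar = {"Types": ("T", ["B", "T -> T"])}
extended = add_syntax(grammar, "Types", "T", ["Top"])
```

Both functions return a new grammar rather than mutating their argument, which keeps the sketch close to the reduction rules, where a step relates one language definition to the next.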
Selectors: ${e_{1}}[p]:\;e_{2}$ is the selector operator. This operation selects one by one the elements of the list $e_{1}$ that satisfy the pattern $p$ and executes the body $e_{2}$ for each of them. This operation returns a list that collects the result of each iteration. Selectors are useful for selecting elements of a language with great precision, and applying manipulations to them. To make an example, suppose that the variable prems contains the premises of a rule and that we wanted to invert the direction of all subtyping premises in it. The operation ${prems}[T_{1}<:T_{2}]:\;\mathtt{just}\;T_{2}<:T_{1}$ does just that. Notice that the body of a selector is an option. This is because it is common for some iteration to return no values ( $\mathtt{nothing}$ ). The examples in Section 3 show this aspect. Since options are commonly used in the context of selector iterations, we have designed our selector operation to automatically handle them. That is, $\mathtt{nothing}$ s are automatically removed, and the selector above returns the list of new subtyping premises rather than a list of options. The selector ${e(\mathtt{keep})}[p]:\;e$ works like an ordinary selector except that it also returns the elements that failed the pattern-matching.
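The behavior of selectors, including the automatic removal of $\mathtt{nothing}$ s, can be sketched as follows. This is a simplified model under several assumptions: patterns are represented by a hypothetical `match` function that returns a dictionary of bindings or `None`, the option type is modeled with a `Just` class and Python's `None`, formulae are plain tuples, and the $\mathtt{keep}$ variant is assumed to leave non-matching elements at their original position.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Just:
    """Models `just e`; Python's None plays the role of `nothing`."""
    value: Any


def select(items, match, body, keep=False):
    """Model e1[p]: e2 (and e1(keep)[p]: e2 when keep=True)."""
    result = []
    for item in items:
        bindings = match(item)
        if bindings is None:
            # Pattern-matching failed: the element is dropped unless keep is set.
            if keep:
                result.append(item)
            continue
        option = body(item, bindings)
        if option is not None:
            # `just v` is unwrapped; `nothing` contributes no element.
            result.append(option.value)
    return result


def match_subtyping(formula):
    """Hypothetical pattern T1 <: T2 over formulae encoded as tuples."""
    if len(formula) == 3 and formula[0] == "<:":
        return {"T1": formula[1], "T2": formula[2]}
    return None


def flip(_self, b):
    """Body `just T2 <: T1`: invert the direction of the subtyping premise."""
    return Just(("<:", b["T2"], b["T1"]))


prems = [("|-", "Gamma", "e", "T1"), ("<:", "T1", "T2")]
flipped = select(prems, match_subtyping, flip)
flipped_keep = select(prems, match_subtyping, flip, keep=True)
```

In this example `flipped` contains only the inverted subtyping premise, while `flipped_keep` also retains the typing premise that did not match the pattern.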
Uniquefy: When transforming languages it is often necessary to assign distinct variables. The example of algorithmic subtyping in the introduction is archetypal. $\mathcal{L}\textendash\textsf{Tr}$ accommodates this operation as a primitive with $\mathtt{uniquefy}$ .
$\mathtt{uniquefy}(e_{1},e_{2},str)\Rightarrow(x,y):e_{3}$ takes as input a list of formulae $e_{1}$ , a map $e_{2}$ , and a string $str$ (we shall discuss $x$ , $y$ , and $e_{3}$ shortly). This operation modifies the formulae $e_{1}$ to use different variable names when a variable is mentioned more than once. However, not every variable is subject to the replacement. Only the variables that appear in some specific positions are targeted. The map $e_{2}$ and the string $str$ contain the information to identify these positions. $e_{2}$ maps operator names and predicate names to a list that contains a label (as a string) for each of their arguments. For example, the map $m=\{\vdash\;\mapsto[``{in}",``{in}",``{out}"]\}$ says that $\Gamma$ and $e$ are inputs in a formula $\Gamma\vdash e:T$ , and that $T$ is the output. Similarly, the map $\{\to\;\mapsto[``{contravariant}",``{covariant}"]\}$ says that $T_{1}$ is contravariant and $T_{2}$ is covariant in $T_{1}\to T_{2}$ . The string $str$ specifies a label. $\mathcal{L}\textendash\textsf{Tr}$ inspects the formulae in $e_{1}$ and their terms. Arguments that correspond to the label according to the map then receive a new variable. As an example, if $\mathit{lf}$ is the list of premises of (t-app) and $m$ is defined as above (input-output modes), the operation $\mathtt{uniquefy}(\mathit{lf},m,``{out}")\Rightarrow(x,y):e_{3}$ creates the typing premises of (t-app’) shown in the introduction. Furthermore, the computation continues with the expression $e_{3}$ in which $x$ is bound to these premises and $y$ is bound to a map that summarizes the changes made by $\mathtt{uniquefy}$ . This latter map associates every variable $X$ to the list of new variables that $\mathtt{uniquefy}$ has used to replace $X$ . For example, since $\mathtt{uniquefy}$ created the premises of (t-app’) by replacing $T_{1}$ in two different positions with $T_{11}$ and $T_{12}$ , the map $\{T_{1}\mapsto[T_{11},T_{12}]\}$ is passed to $e_{3}$ as $y$ .
Section 3 will show two examples that make use of $\mathtt{uniquefy}$ .
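The renaming performed by $\mathtt{uniquefy}$ on the (t-app) example can be sketched as follows. This is a deliberately simplified model and not the definition used by $\mathcal{L}\textendash\textsf{Tr}$ : it only consults labels attached to predicate names (labels on operator names are ignored), it renames every variable found anywhere inside a labeled argument, it builds new names by appending a counter, and it does not check that the new names are unused. Formulae are encoded as pairs of a predicate name and a list of terms, terms are strings (meta-variables) or pairs of an operator name and a list of terms, and a Python dictionary replaces the paired-list map.

```python
def vars_of(term):
    """List the meta-variables of a term, with repetitions."""
    if isinstance(term, str):
        return [term]
    _, args = term
    return [x for a in args for x in vars_of(a)]


def rename(term, counts, used, changes):
    """Replace each occurrence of a repeated variable with a distinct name."""
    if isinstance(term, str):
        if counts.get(term, 0) > 1:
            used[term] = used.get(term, 0) + 1
            new_name = f"{term}{used[term]}"  # assumption: T1 becomes T11, T12, ...
            changes.setdefault(term, []).append(new_name)
            return new_name
        return term
    op, args = term
    return (op, [rename(a, counts, used, changes) for a in args])


def uniquefy(formulae, labels_of, label):
    """Return (new formulae, map from each replaced variable to its new names)."""
    # Pass 1: count variable occurrences in the positions carrying `label`.
    counts = {}
    for pred, args in formulae:
        labels = labels_of.get(pred)
        if labels is None:
            continue
        if len(labels) != len(args):
            raise ValueError(f"label list for {pred} does not match its arity")
        for lab, arg in zip(labels, args):
            if lab == label:
                for x in vars_of(arg):
                    counts[x] = counts.get(x, 0) + 1
    # Pass 2: rename the occurrences of variables that were counted more than once.
    used, changes, result = {}, {}, []
    for pred, args in formulae:
        labels = labels_of.get(pred)
        if labels is None:
            result.append((pred, list(args)))
            continue
        new_args = [
            rename(arg, counts, used, changes) if lab == label else arg
            for lab, arg in zip(labels, args)
        ]
        result.append((pred, new_args))
    return result, changes


# Premises of (t-app), with "|-" for the typing predicate and "->" for arrow types.
t_app_premises = [
    ("|-", ["Gamma", "e1", ("->", ["T1", "T2"])]),
    ("|-", ["Gamma", "e2", "T1"]),
]
modes = {"|-": ["in", "in", "out"]}
new_premises, summary = uniquefy(t_app_premises, modes, "out")
```

On this input, `new_premises` contains the two typing premises of (t-app’) and `summary` associates `T1` with the two names that replaced it. The subtyping premise $T_{12}<:T_{11}$ is not produced by this step; it would be computed afterwards from the summary map.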
Control Flow: $\mathcal{L}\textendash\textsf{Tr}$ includes the if-then-else statement with typical guards. $\mathcal{L}\textendash\textsf{Tr}$ also has the sequence operation ; (and $\mathtt{skip}$ ) to execute language transformations one after another. $e_{1};_{\text{r}}e_{2}$ , instead, executes sequences of transformations on rules. After $e_{1}$ evaluates to a rule, $e_{2}$ makes use of that rule as the subject of its transformations.
Programming Rules, Premises, and Terms: In $\mathcal{L}\textendash\textsf{Tr}$ a programmer can write $\mathcal{L}\textendash\textsf{Tr}$ terms ( $\hat{t}$ ), $\mathcal{L}\textendash\textsf{Tr}$ formulae ( $\hat{f}$ ), and $\mathcal{L}\textendash\textsf{Tr}$ rules ( $\hat{r}$ ) in expressions. These differ from the terms, formulae and rules of language definitions in that they can contain arbitrary expressions, such as if-then-else statements, at any position. This is a useful feature as it provides a declarative way to create rules, premises, or terms. To make an example with rule creation, we can write

\inference{{prems}[T_{1}<:T_{2}]:\;\mathtt{just}\;T_{2}<:T_{1}}{f}

where prems is the list of premises from above, and $f$ is a formula. As we can see, using expressions above the horizontal line is a convenient way to compute the premises of a rule.
Other Operations: The operation $\mathtt{fold}\;predname\;e$ creates a list of formulae that places $predname$ between any two consecutive elements of the list $e$ . As an example, the operation $\mathtt{fold}\;=\;[T_{1},T_{2},T_{3},T_{4}]$ generates the list of formulae $[T_{1}=T_{2},T_{2}=T_{3},T_{3}=T_{4}]$ . $\mathtt{vars}(e)$ returns the list of the meta-variables in $e$ . $\mathtt{newVar}$ returns a meta-variable that has not been previously used. The tick operator $e\texttt{'}$ gives a prime to the meta-variables of $e$ ( ${X}$ becomes ${X^{\prime}}$ ). $\mathtt{vars}$ and the tick operator also work on lists of terms.
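The $\mathtt{fold}$ and tick operations are simple enough to sketch in a few lines. The encoding is an assumption of this example: meta-variables are strings, operator applications are pairs of a name and a list of terms, and formulae are pairs of a predicate name and a list of terms.

```python
def fold(pred, items):
    """Model `fold predname e`: relate every two consecutive elements of the list."""
    return [(pred, [left, right]) for left, right in zip(items, items[1:])]


def tick(term):
    """Model the tick operator: add a prime to every meta-variable of a term."""
    if isinstance(term, str):
        return term + "'"
    op, args = term
    return (op, [tick(a) for a in args])


equalities = fold("=", ["T1", "T2", "T3", "T4"])
primed = tick(("->", ["T1", "T2"]))
```

A list with fewer than two elements yields the empty list of formulae, since there is no pair of consecutive elements to relate.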
Variables and Substitution: Some variables have a special treatment in $\mathcal{L}\textendash\textsf{Tr}$ . We can refer to the value that a selector iterates over with the variable $\mathit{self}$ . If we are in a context that manipulates a rule, we can also refer to the premises and conclusion with variables $\mathit{premises}$ and $\mathit{conclusion}$ . We use the notation $e[v/x]$ to denote the capture-avoiding substitution. $\theta$ ranges over finite sequences of substitutions denoted with $[v_{1}/x_{1},\ldots,v_{n}/x_{n}]$ . $e[v_{1}/x_{1},v_{2}/x_{2},\ldots,v_{n}/x_{n}]$ means $((e[v_{1}/x_{1}])[v_{2}/x_{2}])\ldots[v_{n}/x_{n}]$ . We omit the definition of substitution because it is standard, for the most part. The only aspect that differs from standard substitution is that we do not substitute $\mathit{self}$ , $\mathit{premises}$ and $\mathit{conclusion}$ in those contexts that will be set at run-time ( $;_{\textsf{r}}$ , and selector body). For example, $(e_{1};_{\textsf{r}}e_{2})[v/\mathcal{X}]\equiv(e_{1}[v/\mathcal{X}]);_{\textsf{r}}e_{2}$ , where $\mathcal{X}\in\{\mathit{self},\mathit{premises},\mathit{conclusion}\}$ .

2.2 Operational Semantics of $\mathcal{L}\textendash\textsf{Tr}$

Dynamic Semantics $V;\mathcal{L};e\longrightarrow V;\mathcal{L};e$

	$\displaystyle\inference{\{cname\;{X}::=v\}\in G}{V;(G,R);cname\longrightarrow_{\mathtt{@}}V;(G,R);v}$		(r-cname-ok)
	$\displaystyle\inference{\{cname\;{X}::=v\}\not\in G}{V;(G,R);cname\longrightarrow_{\mathtt{@}}V;(G,R);\mathtt{error}}$		(r-cname-fail)
	$\displaystyle V;(G,R);\mathtt{getRules}\longrightarrow_{\mathtt{@}}V;(G,R);R$		(r-getRules)
	$\displaystyle V;(G,R);\mathtt{setRules}\;v\longrightarrow_{\mathtt{@}}V;(G,v);\mathtt{skip}$		(r-setRules)
	$\displaystyle\inference{G^{\prime}=(G\backslash cname)\cup\{cname\;{X}::=v\}}{V;(G,R);(cname\;{X}::=v)\longrightarrow_{\mathtt{@}}\emptyset;(G^{\prime},R);\mathtt{skip}}$		(r-new-syntax)
	$\displaystyle\inference{\{cname\;{X}::=v^{\prime}\}\in G}{V;(G,R);(cname\;{X}::=\ldots\;v)\longrightarrow_{\mathtt{@}}\emptyset;(G,R);cname\;{X}::=v^{\prime}@v}$		(r-add-syntax-ok)
	$\displaystyle\inference{\{cname\;{X}::=v^{\prime}\}\not\in G}{V;(G,R);(cname\;{X}::=\ldots\;v)\longrightarrow_{\mathtt{@}}\emptyset;(G,R);\mathtt{error}}$		(r-add-syntax-fail)
	$\displaystyle V;\mathcal{L};(\mathtt{skip};e)\longrightarrow_{\mathtt{@}}V;\mathcal{L};e$		(r-seq)
	$\displaystyle V;\mathcal{L};v;_{\textsf{r}}e\longrightarrow_{\mathtt{@}}V;\mathcal{L};e\theta_{\textsf{rule}}^{(v)}$		(r-rule-comp)
	$\displaystyle V;\mathcal{L};{\mathtt{nil}}[p]:\;e\longrightarrow_{\mathtt{@}}V;\mathcal{L};\mathtt{nil}$		(r-selector-nil)
	$\displaystyle\inference{\mathit{match}(v_{1},p)=\theta\qquad\theta^{\prime}=\mbox{$\begin{cases}\theta_{\textsf{rule}}^{(r)}&\mbox{if }v_{1}=r\\ \{\mathit{self}\mapsto v_{1}\}&\mbox{otherwise}\end{cases}$}}{V;\mathcal{L};{(\mathtt{cons}\;v_{1}\;v_{2})}[p]:\;e\longrightarrow_{\mathtt{@}}V;\mathcal{L};(\mathit{cons}^{*}\;{e\theta\theta^{\prime}}\;{({v_{2}}[p]:\;e)})}$		(r-selector-cons-ok)
	$\displaystyle\inference{\mathit{match}(v_{1},p)\not=\theta}{V;\mathcal{L};{(\mathtt{cons}\;v_{1}\;v_{2})}[p]:\;e\longrightarrow_{\mathtt{@}}V;\mathcal{L};({v_{2}}[p]:\;e)}$		(r-selector-cons-fail)
	$\displaystyle\inference{{X^{\prime}}\not\in V\cup\mathit{vars}(\mathcal{L})\cup\mathit{range}(\mathit{tick})}{V;\mathcal{L};\mathtt{newVar}\;\longrightarrow_{\mathtt{@}}V\cup\{{X^{\prime}}\};\mathcal{L};{X^{\prime}}}$		(r-newvar)
	$\displaystyle\inference{(\mathit{lf}^{\prime},v_{2})=\mathit{uniquefy}_{\textsf{lf}}(\mathit{lf},v_{1},str,\mathtt{map}([],[]))}{V;\mathcal{L};\mathtt{uniquefy}(\mathit{lf},v_{1},str)\Rightarrow(x,y):e\longrightarrow_{\mathtt{@}}V;\mathcal{L};e[\mathit{lf}^{\prime}/x,v_{2}/y]}$		(r-uniquefy-ok)
	$\displaystyle\inference{\mathit{uniquefy}_{\textsf{lf}}(\mathit{lf},v_{1},str,\mathtt{map}([],[]))=\mathit{fail}}{V;\mathcal{L};\mathtt{uniquefy}(\mathit{lf},v_{1},str)\Rightarrow(x,y):e\longrightarrow_{\mathtt{@}}V;\mathcal{L};\mathtt{error}}$		(r-uniquefy-fail)
	$\displaystyle\text{where }\theta_{\textsf{rule}}^{(r)}\equiv[r/\mathit{self},v_{1}/\mathit{premises},v_{2}/\mathit{conclusion}]\qquad\text{if }r=\inference{v_{1}}{v_{2}}$

Figure 1: Reduction Semantics of $\mathcal{L}\textendash\textsf{Tr}$

In this section we show a small-step operational semantics for $\mathcal{L}\textendash\textsf{Tr}$ . A configuration is denoted with $V;\mathcal{L};e$ , where $e$ is an expression, $\mathcal{L}$ is the language subject of the transformation, and $V$ is the set of meta-variables that have been generated by $\mathtt{newVar}$ . Calls to $\mathtt{newVar}$ make sure not to produce name clashes.

The main reduction relation is $V;\mathcal{L};e\longrightarrow V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ , defined as follows. Evaluation contexts $E$ are straightforward and can be found in Appendix LABEL:evaluationcontexts.

This relation relies on a step $V;\mathcal{L};e\longrightarrow_{\mathtt{@}}V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ , which concretely performs the step. Since a transformation may insert ill-formed elements such as $\vdash T\;T$ or $\to\;e\;e$ in the language, we also rely on a notion of type checking for language definitions $\vdash\mathcal{L}^{\prime}$ decided by the language designer. For example, our implementation of $\mathcal{L}\textendash\textsf{Tr}$ compiles languages to $\lambda$ -prolog and detects ill-formed languages at each step, but the logic of Coq, Agda, or Isabelle could be used as well. Our type soundness theorem works regardless of the definition of $\vdash\mathcal{L}^{\prime}$ .

Fig. 1 shows the reduction relation $V;\mathcal{L};e\longrightarrow_{\mathtt{@}}V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ . We show the most relevant rules. The rest of the rules can be found in Appendix LABEL:app:operational. (r-cname-ok) and (r-cname-fail) handle the encounter of a category name. We retrieve the corresponding list of terms from the grammar or throw an error if the production does not exist. (r-getRules) retrieves the list of rules of the current language, and (r-setRules) updates this list. (r-new-syntax) replaces the grammar with a new one that contains the new production. The meta-operation $G\backslash cname$ in that rule removes the production with category name $cname$ from $G$ (the definition is straightforward and omitted). The position of $cname$ in $(cname\;{X}::=v)$ is not an evaluation context, therefore (r-cname-ok) will not replace that name. (r-add-syntax-ok) takes a step to the instruction for adding new syntax. The production to be added includes both old and new grammar terms. (r-add-syntax-fail) throws an error when the category name does not exist in the grammar, or the meta-variable does not match. (r-seq) applies when the first expression has evaluated to $\mathtt{skip}$ , and starts the evaluation of the second expression (the evaluation context $E;e$ evaluates the first expression). (r-rule-comp) applies when the first expression has evaluated to a rule, and starts the evaluation of the second expression where $\theta_{\textsf{rule}}^{(v)}$ sets this rule as the current rule. Rules (r-selector-*) define the behavior of a selector. (r-selector-cons-ok) and (r-selector-cons-fail) make use of the meta-operation $\mathit{match}(v_{1},p)=\theta$ . If this operation succeeds, it returns the substitutions $\theta$ with the associations computed during pattern-matching. The definition of $\mathit{match}$ is standard and is omitted. The body is evaluated with these substitutions and with $\mathit{self}$ instantiated with the element selected.
If the element selected is a rule, then the body is instantiated with $\theta_{\textsf{rule}}^{(v)}$ to refer to that rule as the current rule. The body of the selector always returns an option type. However, $\mathit{cons}^{*}$ is defined as: $\mathit{cons}^{*}\;{e_{1}}\;{e_{2}}\equiv\mathtt{if}\;(\mathtt{isNothing}\;e_{1})\;\mathtt{then}\;e_{2}\;\mathtt{else}\;\mathtt{cons}\;(\mathtt{get}\;e_{1})\;e_{2}$ . Therefore, $\mathtt{nothing}$ s are discarded, and values wrapped in $\mathtt{just}$ s are unwrapped. (r-newvar) returns a new meta-variable and augments $V$ with it. Meta-variables are chosen among those that are not in the language, have not previously been generated by $\mathtt{newVar}$ , and are not in the range of $\mathit{tick}$ . This meta-operation is used by the tick operator to give a prime to meta-variables. (r-newvar) avoids clashes with these variables, too. (r-uniquefy-ok) and (r-uniquefy-fail) define the semantics for $\mathtt{uniquefy}$ . They rely on the meta-operation $\mathit{uniquefy}_{\textsf{lf}}(\mathit{lf},v,str,\mathtt{map}([],[]))$ , which takes the list of formulae $\mathit{lf}$ , the map $v$ , the string $str$ , and an empty map to start computing the result map. The definition of $\mathit{uniquefy}_{\textsf{lf}}$ is mostly a recursive traversal of lists of formulae and terms, and we omit it here. It can be found in Appendix LABEL:uniquefy. This function can succeed and return a pair $(\mathit{lf}^{\prime},v_{2})$ where $\mathit{lf}^{\prime}$ is the modified list of formulae and $v_{2}$ maps meta-variables to the new meta-variables that have replaced them. $\mathit{uniquefy}_{\textsf{lf}}$ can also fail. This may happen when, for example, a map such as $\{\to\;\mapsto``contra"\}$ is passed when $\to$ requires two arguments.
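The freshness condition of (r-newvar) can be sketched as follows. The sketch is illustrative only: the function name `new_var`, the naming scheme `X1`, `X2`, and so on, and the modeling of the range of $\mathit{tick}$ as "all names ending in a prime" are assumptions made for this example.

```python
import itertools


def ends_with_prime(name):
    """Assumed model of range(tick): names that carry a trailing prime."""
    return name.endswith("'")


def new_var(generated, language_vars, in_tick_range=ends_with_prime, base="X"):
    """Return (V', X) where X is fresh and V' extends `generated` with X."""
    for i in itertools.count(1):
        candidate = f"{base}{i}"
        if (
            candidate not in generated          # not produced by an earlier newVar
            and candidate not in language_vars  # not a meta-variable of the language
            and not in_tick_range(candidate)    # cannot clash with the tick operator
        ):
            return generated | {candidate}, candidate


language_vars = {"X1", "T", "e"}
v1, first = new_var(set(), language_vars)
v2, second = new_var(v1, language_vars)
```

Threading the set of generated variables through successive calls mirrors the role of $V$ in configurations: each step returns the enlarged set, so a later call cannot return a name that an earlier call has already produced.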

2.3 Type System of $\mathcal{L}\textendash\textsf{Tr}$

Type System (Configurations) $\Gamma\vdash\;V;\mathcal{L};e$

Type System (Expressions) $\Gamma\vdash\;e:T$

	$\displaystyle\Gamma,x:T\vdash\;x:T$		(t-var)
	$\displaystyle\inference{\Gamma\vdash\;e:{\mathtt{List}}\;\mathtt{Term}}{\Gamma\vdash\;(opname\;e):\mathtt{Term}}$		(t-opname)
	$\displaystyle\inference{\Gamma\vdash\;e:{\mathtt{List}}\;\mathtt{Term}}{\Gamma,x:\mathtt{OpName}\vdash\;(x\;e):\mathtt{Term}}$		(t-opname-var)
	$\displaystyle{X}:\mathtt{Term}$		(t-meta-var)
	$\displaystyle\inference{\Gamma\vdash\;e:\mathtt{Term}}{\Gamma\vdash\;({z})e:\mathtt{Term}}$		(t-abs)
	$\displaystyle\inference{\Gamma\vdash\;e_{1}:\mathtt{Term}\quad\Gamma\vdash\;e_{2}:\mathtt{Term}}{\Gamma\vdash\;e_{1}[e_{2}/{z}]:\mathtt{Term}}$		(t-subs)
	$\displaystyle\inference{\Gamma\vdash\;e:{\mathtt{List}}\;\mathtt{Term}}{\Gamma\vdash\;(predname\;e):\mathtt{Formula}}$		(t-predname)
	$\displaystyle\inference{\Gamma\vdash\;e:{\mathtt{List}}\;\mathtt{Term}}{\Gamma,x:\mathtt{PredName}\vdash\;(x\;e):\mathtt{Formula}}$		(t-predname-var)
	$\displaystyle\inference{\Gamma\vdash\;e_{1}:{\mathtt{List}}\;\mathtt{Formula}\quad\Gamma\vdash\;e_{2}:\mathtt{Formula}}{\Gamma\vdash\;\inference{e_{1}}{e_{2}}:\mathtt{Rule}}$		(t-rule)
	$\displaystyle\inference{\Gamma\vdash\;e_{1}:\mathtt{Language}\quad\Gamma\vdash\;e_{2}:\mathtt{Language}}{\Gamma\vdash\;e_{1};e_{2}:\mathtt{Language}}$		(t-seq)
	$\displaystyle\inference{\Gamma\vdash\;e_{1}:\mathtt{Rule}\quad\Gamma,\Gamma_{\textsf{rule}}\vdash\;e_{2}:\mathtt{Rule}}{\Gamma\vdash\;e_{1};_{\textsf{r}}e_{2}:\mathtt{Rule}}$		(t-rule-comp)
	$\displaystyle\inference{\Gamma\vdash\;e_{1}:{\mathtt{List}}\;T\quad\Gamma\vdash\;p:T\Rightarrow\Gamma^{\prime}\quad\Gamma^{\prime\prime}=\begin{cases}\Gamma_{\textsf{rule}}&\mbox{if }T=\mathtt{Rule}\\ \mathit{self}:T&\mbox{otherwise}\end{cases}\quad\Gamma,\Gamma^{\prime},\Gamma^{\prime\prime}\vdash\;e_{2}:\mathtt{Option}\;T^{\prime}}{\Gamma\vdash\;{e_{1}}[p]:\;e_{2}:{\mathtt{List}}\;T^{\prime}}$		(t-selector)
	$\displaystyle\inference{\Gamma\vdash\;e:{\mathtt{List}}\;\mathtt{Term}}{\begin{array}[]{c}\Gamma\vdash\;cname\;{X}::=e:\mathtt{Language}\\ \Gamma\vdash\;cname\;{X}::=\ldots\;e:\mathtt{Language}\end{array}}$		(t-syntax-new and t-syntax-add)
	$\displaystyle\Gamma\vdash cname:{\mathtt{List}}\;\mathtt{Term}$		(t-cname)
	$\displaystyle\Gamma\vdash\mathtt{getRules}:{\mathtt{List}}\;\mathtt{Rule}$		(t-getRules)
	$\displaystyle\inference{\Gamma\vdash\;e:{\mathtt{List}}\;\mathtt{Rule}}{\Gamma\vdash\;\mathtt{setRules}\;e:\mathtt{Language}}$		(t-setRules)
	$\displaystyle\inference{\Gamma\vdash\;e_{1}:{\mathtt{List}}\;\mathtt{Formula}\\ \Gamma\vdash\;e_{2}:\mathtt{Map}\;T^{\prime}\;({\mathtt{List}}\;\mathtt{String})\quad T^{\prime}=\mathtt{OpName}\text{ or }T^{\prime}=\mathtt{PredName}\\ \Gamma,x:{\mathtt{List}}\;\mathtt{Formula},y:\mathtt{Map}\;\mathtt{Term}\;({\mathtt{List}}\;\mathtt{Term})\vdash\;e_{3}:T}{\Gamma\vdash\;\mathtt{uniquefy}(e_{1},e_{2},str)\Rightarrow(x,y):e_{3}:T}$		(t-uniquefy)
	$\displaystyle\Gamma\vdash\;\mathtt{skip}:\mathtt{Language}$		(t-skip)
	$\displaystyle\Gamma\vdash\;\mathtt{newVar}:\mathtt{Term}$		(t-newvar)
	$\displaystyle\text{where }\Gamma_{\textsf{rule}}\equiv\mathit{self}:\mathtt{Rule},\mathit{premises}:{\mathtt{List}}\;\mathtt{Formula},\mathit{conclusion}:\mathtt{Formula}$

Figure 2: Type System of $\mathcal{L}\textendash\textsf{Tr}$

In this section we define a type system for $\mathcal{L}\textendash\textsf{Tr}$ . Types are defined as follows:

\begin{array}[]{l@{\;\;}lcl}\text{\sf Type}&T&::=&\mathtt{Language}\mid\mathtt{Rule}\mid\mathtt{Formula}\mid\mathtt{Term}\\ &&\mid&{\mathtt{List}}\;T\mid\mathtt{Map}\;T\;T\mid\mathtt{Option}\;T\mid\mathtt{String}\mid\mathtt{OpName}\mid\mathtt{PredName}\\ \text{\sf Type Env}&\Gamma&::=&\emptyset\mid\Gamma,x:T\end{array}

We have a typical type environment that maps variables to types. Fig. 2 shows the type system. The typing judgement $\vdash V;\mathcal{L};e$ means that the configuration $V;\mathcal{L};e$ is well-typed. This judgment checks that the variables of $V$ and those in $\mathcal{L}$ are disjoint. This is an invariant that ensures that $\mathtt{newVar}$ always produces fresh names. We also check that $\mathcal{L}$ is well-typed and that $e$ is of type $\mathtt{Language}$ .

We type check expressions with the typing judgement $\Gamma\vdash\;e:T$ , which means that $e$ has type $T$ under the assignments in $\Gamma$ . Most typing rules are straightforward. We omit rules about lists and maps because they are standard. We comment only on the rules that are more involved. (t-selector) type checks a selector operation. We use $\Gamma\vdash\;p:T\Rightarrow\Gamma^{\prime}$ to type check the pattern $p$ and return the type environment for the variables of the pattern. Its definition is standard and is omitted. When we type check the body $e_{2}$ we then include $\Gamma^{\prime}$ . If the elements of the list are rules then we also include $\Gamma_{\textsf{rule}}$ to give a type to the variables for referring to the current rule. Otherwise, we assign $\mathit{self}$ the type of the elements of the list. Selectors with $\mathtt{keep}$ are analogous and omitted. (t-rule-comp) type checks a rule composition. In doing so, we type check the second expression with $\Gamma_{\textsf{rule}}$ . (t-uniquefy) type checks the $\mathtt{uniquefy}$ operation. As we rename variables depending on the position they hold in terms and formulae, the keys of the map are of type $\mathtt{OpName}$ or $\mathtt{PredName}$ , and the values are lists of strings. We type check $e_{3}$ giving $x$ the type of a list of formulae, and $y$ the type of a map from meta-variables to lists of meta-variables.
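The environment extension performed by (t-selector) can be illustrated with a small sketch. The Python encoding below is only an assumption made for illustration (types as strings or tuples, environments as dictionaries, and the helper names `selector_body_env` and `selector_result_type` are hypothetical); it is not the implementation of $\mathcal{L}\textendash\textsf{Tr}$ .

```python
from typing import Dict, Tuple, Union

# Hypothetical encoding: a type is a string ("Rule") or a tuple (("List", "Formula")).
Type = Union[str, Tuple]
Env = Dict[str, Type]

# Gamma_rule from Fig. 2: bindings available while a rule is in focus.
GAMMA_RULE: Env = {
    "self": "Rule",
    "premises": ("List", "Formula"),
    "conclusion": "Formula",
}


def selector_body_env(gamma: Env, pattern_env: Env, elem_type: Type) -> Env:
    """Build Gamma, Gamma', Gamma'' as used by (t-selector) to type the body."""
    extra = GAMMA_RULE if elem_type == "Rule" else {"self": elem_type}
    return {**gamma, **pattern_env, **extra}


def selector_result_type(body_type: Type) -> Type:
    """The body must have type Option T'; the selector then has type List T'."""
    is_option = (
        isinstance(body_type, tuple)
        and len(body_type) == 2
        and body_type[0] == "Option"
    )
    if not is_option:
        raise TypeError("selector body must have type Option T'")
    return ("List", body_type[1])
```

When the list elements are rules, the body sees `self`, `premises` and `conclusion`; otherwise it only sees `self` at the element type.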

We have proved that $\mathcal{L}\textendash\textsf{Tr}$ is type sound.

Theorem 2.1 (Type Soundness)

For all $\Gamma$ , $V$ , $\mathcal{L}$ , $e$ , if $\vdash V;\mathcal{L};e$ then $V;\mathcal{L};e\longrightarrow^{*}V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ s.t. i) $e^{\prime}=\mathtt{skip}$ , ii) $e^{\prime}=\mathtt{error}$ , or iii) $V^{\prime};\mathcal{L}^{\prime};e^{\prime}\longrightarrow V^{\prime\prime};\mathcal{L}^{\prime\prime};e^{\prime\prime}$ , for some $e^{\prime\prime}$ .

The proof is by induction on the derivation $\vdash V;\mathcal{L};e$ , and follows the standard approach of Wright and Felleisen [WrightFelleisen94] through a progress theorem and a subject reduction theorem. The proof can be found in Appendix 0.D.

3 Examples

We show the applicability of $\mathcal{L}\textendash\textsf{Tr}$ with two examples of language transformations: adding subtyping [tapl] and switching to big-step semantics [Kahn87]. In the code we use let-binding, pattern-matching, and an overlap operation that returns true if two terms have variables in common. These operations can be easily defined in $\mathcal{L}\textendash\textsf{Tr}$ , and we show them in Appendix 0.E. The code below defines the transformation for adding subtyping. We assume that two maps are already defined, $mode=\{\vdash\;\mapsto[``inp",\;``inp",\;``out"]\}$ and $variance=\{\to\;\;\mapsto[``contra",\;``cova"]\}$ .

\mathtt{setRules}

\mathtt{getRules}(\mathtt{keep})[(\vdash[\Gamma,e,T])]:

\mathtt{uniquefy}(premises,mode,``out")=>(uniq,newpremises):

\underline{newpremises\;@\;\mathtt{concat}(\mathtt{mapKeys}(uniq)[T_{f}]:\mathtt{fold}<:uniq(T_{f}))}

conclusion

;_{\textsf{r}}

\mathtt{concat}(premises(\mathtt{keep})[T_{1}<:T_{2}]:

premises[(\vdash[\Gamma,e_{v},(c_{v}\;Ts_{v})])]:

\mathtt{let}\;vmap=\mathtt{map}(Ts_{v},\;variance(c_{v}))\;\mathtt{in}\;

\mathtt{if}\;vmap(T_{1})=``contra"\;\mathtt{then}\;T_{2}<:T_{1}

\underline{\mathtt{else}\;\mathtt{if}\;vmap(T_{1})=``inv"\;\mathtt{and}\;vmap(T_{2})=``inv"\;\mathtt{then}\;T_{1}=T_{2}\;\mathtt{else}\;T_{1}<:T_{2})}

conclusion

;_{\textsf{r}}

\mathtt{let}\;\mathit{outputVars}=\mathtt{match}\;conclusion\;\mathtt{with}\;(\vdash[\Gamma,e_{c},T_{c}])\Rightarrow\mathtt{vars}(T_{c})\;\mathtt{in}\;

\mathtt{let}\;joins=\mathtt{mapKeys}(uniq)[T_{i}]:

\mathtt{if}\;T_{i}\;\mathtt{in}\;\mathit{outputVars}\;\mathtt{then}\;(\sqcup\;uniq(T_{i})\;=\;T_{i})\;\mathtt{else}\;\mathtt{nothing}

\mathtt{in}\;\underline{premises\;@\;joins}

conclusion

Line 1 updates the rules of the language with the rules computed by the code in lines 2-17. Line 2 selects all typing rules, and each of them will be the subject of the transformations in lines 3-17. Line 3 calls $\mathtt{uniquefy}$ on the premises of the selected rule. We instruct $\mathtt{uniquefy}$ to give new variables to the outputs of the typing relation $\vdash$ , if they are used more than once in that position. As previously described, $\mathtt{uniquefy}$ returns the list of new premises, which we bind to $\mathit{newpremises}$ , and the map that assigns variables to the list of the new variables generated to replace them, which we bind to $uniq$ . The body of $\mathtt{uniquefy}$ goes from line 4 to 17. Lines 4 and 5 build a new rule with the conclusion of the selected rule (line 5). It does so using the special variable name conclusion. The premises of this rule include the premises just generated by $\mathtt{uniquefy}$ . Furthermore, we add premises computed as follows. With $\mathtt{mapKeys}(uniq)[T_{f}]$ , we iterate over all the variables replaced by $\mathtt{uniquefy}$ . We take the variables that replaced them and use fold to relate them all with subtyping. In other words, for each $\{T\;\mapsto[T_{1},\ldots,T_{n}]\}$ in $uniq$ , we have the formulae $T_{1}<:T_{2},\ldots,T_{n-1}<:T_{n}$ . This transformation has created a rule with unique outputs and subtyping, but subtyping may be incorrect because if some variable is contravariant its corresponding subtyping premise should be swapped. Lines 7-11, then, adjust the subtyping premises based on the variance of types. Line 7 selects all subtyping premises of the form $T_{1}<:T_{2}$ . For each, Line 8 selects typing premises with output of the form $(c_{v}\;Ts_{v})$ . We do so to understand the variance of variables. If the first argument of $c_{v}$ is contravariant, for example, then the first element of $Ts_{v}$ warrants a swap in a subtyping premise because it is used in the contravariant position. 
We achieve this by creating a map that associates the variance to each argument of $c_{v}$ . The information about the variance for $c_{v}$ is in $variance$ . If $T_{1}$ or $T_{2}$ (from the pattern of the selected premise) appear in $Ts_{v}$ then they find themselves with a variance assigned in $vmap$ . Lines 10-11 generate a new premise based on the variance of variables. For example, if $T_{1}$ is contravariant then we generate $T_{2}<:T_{1}$ .
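The two steps described above (relating the fresh variables with subtyping, then adjusting each premise by variance) can be sketched in Python. This is a minimal illustration under assumed encodings: a premise is a triple `(relation, left, right)`, `uniq` is a dictionary, and the function names are hypothetical.

```python
def chain(rel, xs):
    """Relate consecutive elements: [T1, ..., Tn] gives T1 rel T2, ..., Tn-1 rel Tn."""
    return [(rel, a, b) for a, b in zip(xs, xs[1:])]


def subtyping_premises(uniq):
    """For each replaced variable, relate its fresh copies with subtyping."""
    premises = []
    for fresh_vars in uniq.values():
        premises.extend(chain("<:", fresh_vars))
    return premises


def adjust_variance(premise, vmap):
    """Swap or strengthen a subtyping premise according to the variance map."""
    rel, t1, t2 = premise
    if vmap.get(t1) == "contra":
        return (rel, t2, t1)
    if vmap.get(t1) == "inv" and vmap.get(t2) == "inv":
        return ("=", t1, t2)
    return premise


# (t-app): T1 occurs twice in output position, so it is replaced by T11 and T12.
uniq = {"T1": ["T11", "T12"]}
premises = subtyping_premises(uniq)
# The arrow type has arguments [T11, T2] with variance ["contra", "cova"].
vmap = dict(zip(["T11", "T2"], ["contra", "cova"]))
adjusted = [adjust_variance(p, vmap) for p in premises]
```

On this input the chain first yields `T11 <: T12`, and the variance adjustment swaps it to `T12 <: T11`, which is the subtyping premise of (t-app').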

The program written so far (lines 1-11) is enough to add subtyping to several typing rules. For example, (t-app) can be transformed into (t-app’) with this program. However, some typing rules need a more sophisticated algorithm. Below is the typing rule for if-then-else on the left, and its version with subtyping on the right, which makes use of the join operator ( $\sqcup$ ) (see [tapl]).

{\footnotesize\begin{array}[]{ccc}{\inference{\Gamma\vdash\;e_{1}:\mathtt{Bool}\\ \Gamma\vdash\;e_{2}:T\quad\Gamma\vdash\;e_{3}:T}{\Gamma\vdash\;(\mathit{if}\;e_{1}\;e_{2}\;e_{3}):T}}&~~\Longrightarrow&{\inference{\Gamma\vdash e_{1}:\mathtt{Bool}\qquad\Gamma\vdash e_{2}:T_{1}\\ \\ \Gamma\vdash e_{3}:T_{2}\qquad T_{1}\sqcup T_{2}=T}{\Gamma\vdash(\mathit{if}\;e_{1}\;e_{2}\;e_{3}):T}}\end{array}}

If we removed $T_{1}\sqcup T_{2}=T$ the meta-variable $T$ would have no precise instantiation because its counterpart variables have been given new names. Lines 13-17 accommodate such cases. Line 13 saves the variables that appear in the output type of the rule in $\mathit{outputVars}$ . We then iterate over all the keys of $uniq$ , that is, the variables that have been replaced. For each of them, we check whether it appears in $\mathit{outputVars}$ . If so, we create a join operator with the variables newly generated to replace this variable, which can be retrieved from $uniq$ . We set the output of the join operator to be the variable itself, because that is the one used in the conclusion.
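The generation of join premises can be sketched as follows. The encoding is assumed for illustration only: a join premise is the triple `("join", fresh_variables, output)`, and `join_premises` is a hypothetical helper name.

```python
def join_premises(uniq, output_vars):
    """Create one join premise per replaced variable that occurs in the output type."""
    joins = []
    for original, fresh_vars in uniq.items():
        if original in output_vars:
            # Read as: fresh_1 join ... join fresh_n = original
            joins.append(("join", tuple(fresh_vars), original))
    return joins


# if-then-else: T is replaced by T1 and T2, and T is the output type of the conclusion.
if_joins = join_premises({"T": ["T1", "T2"]}, {"T"})

# (t-app): T1 is replaced, but the output type of the conclusion is T2, so no join is added.
app_joins = join_premises({"T1": ["T11", "T12"]}, {"T2"})
```

For if-then-else this produces the single premise corresponding to $T_{1}\sqcup T_{2}=T$ , while for (t-app) the list of joins is empty and the rule is left as computed by lines 1-11.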

The algorithm above shows that $\mathtt{uniquefy}$ is a powerful operation of $\mathcal{L}\textendash\textsf{Tr}$ . To illustrate $\mathtt{uniquefy}$ further, let us consider a small example before we address big-step semantics. Suppose that we would like to make every test of equality explicit. We therefore want to disallow terms such as $(\mathtt{op}\;e\;e\;e)$ to appear in the premises, and want to turn them into $(\mathtt{op}\;e_{1}\;e_{2}\;e_{3})$ together with premises $e_{1}=e_{2}$ and $e_{2}=e_{3}$ . In $\mathcal{L}\textendash\textsf{Tr}$ we can do this in the following way. Below we assume that the map allOps maps each operator to the string “yes” for each of its arguments. This instructs $\mathtt{uniquefy}$ to look for every argument.

1...

\mathtt{uniquefy}(premises,allOps,``yes")=>(uniq,newpremises):

{newpremises\;@\;\mathtt{concat}(\mathtt{mapKeys}(uniq)[T_{f}]:\mathtt{fold}=uniq(T_{f}))}
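To make the effect of this snippet concrete, the following Python sketch mimics it on the argument list of a single term. It is a simplification under stated assumptions: fresh names are built by appending a counter (whereas $\mathtt{newVar}$ guarantees freshness globally), and the names `uniquefy_args` and `equalities` are hypothetical.

```python
from collections import Counter


def uniquefy_args(args):
    """Give a fresh name to every occurrence of a variable that occurs more than once."""
    counts = Counter(args)
    seen = Counter()
    uniq = {}
    new_args = []
    for arg in args:
        if counts[arg] > 1:
            seen[arg] += 1
            fresh = f"{arg}{seen[arg]}"  # assumption: suffixing yields an unused name
            uniq.setdefault(arg, []).append(fresh)
            new_args.append(fresh)
        else:
            new_args.append(arg)
    return new_args, uniq


def equalities(uniq):
    """Relate consecutive fresh copies of each replaced variable with equality."""
    return [
        (xs[i], "=", xs[i + 1])
        for xs in uniq.values()
        for i in range(len(xs) - 1)
    ]


new_args, uniq = uniquefy_args(["e", "e", "e"])
eqs = equalities(uniq)
```

Here `(op e e e)` becomes `(op e1 e2 e3)` and the generated premises are `e1 = e2` and `e2 = e3`, as in the example discussed above.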

Below, we show the code to turn language definitions into big-step semantics.

\mathtt{setRules}

\mathit{Value}[v]:v\longrightarrow v\;@

\mathtt{getRules}(\mathtt{keep})[(op\;es)\;\longrightarrow\;et]:

\mathtt{if}\;\mathtt{isEmpty}({Expression}[(op\;\_)]:\;self)\;\mathtt{then}\;\mathtt{nothing}\;\mathtt{else}\;

\mathtt{let}\;v_{res}=\mathtt{newVar}\;\mathtt{in}

\mathtt{let}\;emap=\mathtt{createMap}(({es}[e]:\;\mathtt{newVar}),es)\;\mathtt{in}

(\mathtt{mapKeys}(emap)[e]:\mathtt{if}\;\mathtt{isVar}(emap(e))\;\mathtt{and}\;\mathtt{not}(emap(e)\;\mathtt{in}\;\mathtt{vars}(et))

\mathtt{then}\;\mathtt{nothing}\;\mathtt{else}\;e\longrightarrow emap(e))

\underline{@\;(\mathtt{if}\;\mathtt{not}(et\;\mathtt{in}\;es)\;\mathtt{then}\;[(et\;\longrightarrow\;v_{res})]\;\mathtt{else}\;\mathtt{nil})\;@\;premises\qquad\qquad\qquad~~~}

{(op\;(\mathtt{mapKeys}(emap)))\;\longrightarrow\;\mathtt{if}\;\mathtt{not}(et\;\mathtt{in}\;es)\;\mathtt{then}\;v_{res}\;\mathtt{else}\;et}

Line 1 updates the rules of the language with the list computed in lines 2-9. Line 2 generates reduction rules such as $\lambda x.e\longrightarrow\lambda x.e$ , for each value, as is standard in big-step semantics. These rules are appended to those generated in lines 3-9. Line 3 selects all the reduction rules. Line 4 leaves out those rules that are not about a top-level expression operator. This skips contextual rules that take a step $E[e]\longrightarrow E[e^{\prime}]$ , which do not appear in big-step semantics. To do so, line 4 makes use of ${\emph{Expression}}[(op\;\_)]:\;\emph{self}$ . As $op$ is bound to the operator we are focusing on (from line 3), this selector returns a list with one element if $op$ appears in Expression, and an empty list otherwise. This is the check we perform at line 4. Line 5 generates a new variable that will store the final value of the step. Line 6 assigns a new variable to each of the arguments in $es$ . We do so by creating a map emap. These new variables are the formal arguments of the new rule being generated (line 9). Lines 7-8 make each of these variables evaluate to its corresponding argument in $es$ (line 8). For example, for beta-reduction an argument of $es$ would be $\lambda x.e$ and we therefore generate the premise $e_{1}\longrightarrow\lambda x.e$ , where $e_{1}$ is the new variable that we assigned to this argument with line 6. Line 7 skips generating the reduction premise if the argument is a variable that does not appear in $et$ . For example, in the translation of (if-true) $(\mathit{if}\;true\;e_{2}\;e_{3})\longrightarrow e_{2}$ we do not evaluate $e_{3}$ at all. Line 9 handles the result of the overall small-step reduction. This result is evaluated to a value ( $v_{res}$ ), unless it already appears in the arguments $es$ . The conclusion of the rule syncs with this, and we place $v_{res}$ or $et$ in the target of the step accordingly. 
Line 9 also appends the premises from the original rule, as they contain conditions to be checked.
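The per-rule part of this algorithm (lines 5-9) can be sketched in Python. The encoding is an assumption made for illustration: variables are strings, compound terms are tuples `(constructor, arg, ...)`, a reduction is a triple `(source, "-->", target)`, and `to_big_step` and `fresh_names` are hypothetical names. The check of line 4 on the Expression category is not modelled.

```python
import itertools


def is_var(term):
    """In this sketch every string is a meta-variable; tuples are compound terms."""
    return isinstance(term, str)


def vars_of(term):
    """Collect the variables that occur in a term."""
    if is_var(term):
        return {term}
    return set().union(*(vars_of(arg) for arg in term[1:]))


def fresh_names(prefix="x"):
    """Stand-in for newVar: an endless supply of names assumed to be unused."""
    for i in itertools.count(1):
        yield f"{prefix}{i}"


def to_big_step(op, es, et, premises, fresh):
    """Turn the small-step rule (op es) --> et into a big-step rule."""
    v_res = next(fresh)                          # line 5
    emap = {next(fresh): arg for arg in es}      # line 6
    arg_premises = [                             # lines 7-8
        (new, "-->", arg)
        for new, arg in emap.items()
        if not (is_var(arg) and arg not in vars_of(et))
    ]
    result_in_args = et in es                    # line 9
    result_premise = [] if result_in_args else [(et, "-->", v_res)]
    conclusion = ((op, *emap.keys()), "-->", et if result_in_args else v_res)
    return arg_premises + result_premise + list(premises), conclusion


# (if-true): (if true e2 e3) --> e2
if_rule = to_big_step("if", [("true",), "e2", "e3"], "e2", [], fresh_names())

# beta: (app (lam x e) v) --> e[v/x], with the substitution encoded as a term
beta_rule = to_big_step(
    "app", [("lam", "x", "e"), "v"], ("subst", "e", "v", "x"), [], fresh_names()
)
```

For (if-true) no premise is generated for `e3`, and the target of the conclusion is `e2` itself. For beta, the result `e[v/x]` does not appear among the arguments, so it is evaluated to the fresh variable that plays the role of $v_{res}$ .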

When we apply this algorithm to the simply typed $\lambda$ -calculus with if-then-else we obtain: (we use standard notation rather than $\mathcal{L}\textendash\textsf{Tr}$ syntax)

$last(l)$ returns the last element of a list, and $@$ is list append.

The function is mostly a straightforward recursive traversal of terms, formulae, lists of terms and lists of formulae. The only elements to notice are that when $\mathit{uniquefy}$ detects a context that potentially contains the string $str$ then it switches to $\mathit{uniquefy}^{\bullet}$ , which is a meta-operation that looks for $str$ . In turn, when $\mathit{uniquefy}^{\bullet}$ finds an argument in a position prescribed by $str$ , then it switches to $\mathit{uniquefy}^{\dagger}$ , which is a meta-operation that is responsible for actually replacing variables and recording the association. $\mathit{zip}$ is a meta-operation that combines two lists. Of course, it may fail if the two lists do not have the same length. This happens in the scenario described above about $\to$ and its number of arguments. $\mathit{CheckZip}$ performs just that check and can make the function fail.
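The role of $\mathit{zip}$ and $\mathit{CheckZip}$ can be illustrated with a short Python sketch. The names `check_zip`, `positions_marked` and `UniquefyFailure` are hypothetical, and failure is modelled with an exception, which is an assumption of this sketch rather than the definition in the paper.

```python
class UniquefyFailure(Exception):
    """Raised when the argument list and the mode list cannot be zipped."""


def check_zip(args, modes):
    """Pair each argument with its mode, failing on a length mismatch."""
    if len(args) != len(modes):
        raise UniquefyFailure(
            f"expected {len(modes)} arguments but found {len(args)}"
        )
    return list(zip(args, modes))


def positions_marked(args, modes, marker):
    """Return the arguments that sit in a position prescribed by the marker string."""
    return [arg for arg, mode in check_zip(args, modes) if mode == marker]


# The typing relation has modes ["inp", "inp", "out"]; only T is in an "out" position.
outputs = positions_marked(["Gamma", "e", "T"], ["inp", "inp", "out"], "out")
```

With a map entry whose list of strings is shorter or longer than the number of arguments, `check_zip` raises `UniquefyFailure`, which mirrors the case in which the function fails.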

Appendix 0.D Proof of Type Soundness

0.D.1 Progress Theorem

Theorem 0.D.1 (Canonical Form Lemmas)

•

If $\emptyset\vdash\;e:\mathtt{Language}$ and $e$ is a value then $e=\mathtt{skip}$ .
•

If $\emptyset\vdash\;e:\mathtt{Rule}$ and $e$ is a value then $e=r$ .
•

If $\emptyset\vdash\;e:\mathtt{Formula}$ and $e$ is a value then $e=f$ .
•

If $\emptyset\vdash\;e:\mathtt{Term}$ and $e$ is a value then $e=t$ .
•

If $\emptyset\vdash\;e:{\mathtt{List}}\;T$ and $e$ is a value then $e={\mathtt{nil}}\;\lor\;e={\mathtt{cons}}{\;v_{1}}{v_{2}}$ .
•

If $\emptyset\vdash\;e:\mathtt{Map}\;T_{1}\;T_{2}$ and $e$ is a value then $e=\mathtt{map}(v_{1},v_{2})$ .
•

If $\emptyset\vdash\;e:\mathtt{Option}\;T$ and $e$ is a value then $e=\mathtt{nothing}\;\lor\;e=\mathtt{just}\;v$ .
•

If $\emptyset\vdash\;e:\mathtt{String}$ and $e$ is a value then $e=str$ .
•

If $\emptyset\vdash\;e:\mathtt{OpName}$ and $e$ is a value then $e=opname$ .
•

If $\emptyset\vdash\;e:\mathtt{PredName}$ and $e$ is a value then $e=predname$ .

Proof

Each case is proved by case analysis on $\emptyset\vdash\;e:T$ . Each case is straightforward.

Theorem 0.D.2 (Progress Theorem Expressions)

For all $e$ , $T$ , if $\emptyset\vdash e:T$ then either

•

$e=v$ , or
•

$e=\mathtt{error}$ , or
•

for all $V,\mathcal{L}$ , $V;\mathcal{L};e\longrightarrow V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ , for some $V^{\prime},\mathcal{L}^{\prime},e^{\prime}$ .

Proof

We prove the theorem by induction on the derivation of $\emptyset\vdash e:T$ . Let us assume the proviso of the theorem, that is (H1) $\emptyset\vdash e:T$ .

$V;\mathcal{L}\;;\;(opname\;\mathtt{error})\longrightarrow V;\mathcal{L}\;;\;\mathtt{error}$ , for all $V,\mathcal{L}$ because of the evaluation context $(opname\;E)$ . • for all $V,\mathcal{L}$ , $V;\mathcal{L};e\longrightarrow V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ , for some $V^{\prime},\mathcal{L}^{\prime},e^{\prime}$ . Then $(opname\;e)$ takes a step by ctx-succ or ctx-lang-err.

Case 2 (t-rule-comp)

Since (H1) then we have $e=(e_{1};_{\textsf{r}}e_{2})$ with $\emptyset\vdash e_{1}:\mathtt{Rule}$ . By IH, we have that
• $e_{1}=v$ . Then we apply r-rule-comp and take a step.
• $e_{1}=\mathtt{error}$ . Then $V;\mathcal{L}\;;\;\mathtt{error};_{\textsf{r}}e_{2}\longrightarrow V;\mathcal{L}\;;\;\mathtt{error}$ , for all $V,\mathcal{L}$ because of the evaluation context $E;_{\textsf{r}}e$ .
• for all $V,\mathcal{L}$ , $V;\mathcal{L};e_{1}\longrightarrow V^{\prime};\mathcal{L}^{\prime};e_{1}^{\prime}$ , for some $V^{\prime},\mathcal{L}^{\prime},e_{1}^{\prime}$ . Then $e_{1};_{\textsf{r}}e_{2}$ takes a step by ctx-succ or ctx-lang-err.

Case 3 (t-seq)

Since (H1) then we have $e=e_{1};e_{2}$ with $\emptyset\vdash e_{1}:\mathtt{Language}$ . By IH, we have that
• $e_{1}=v$ . By Canonical form $e_{1}=\mathtt{skip}$ . Then we have $\mathtt{skip};e_{2}$ which takes a step by r-seq-ok.
• $e_{1}=\mathtt{error}$ . Then we have $\mathtt{error};e_{2}$ and by ctx-err we take a step to an error.
• for all $V,\mathcal{L}$ , $V;\mathcal{L};e_{1}\longrightarrow V^{\prime};\mathcal{L}^{\prime};e_{1}^{\prime}$ , for some $V^{\prime},\mathcal{L}^{\prime},e_{1}^{\prime}$ . Then by ctx-succ or ctx-lang-err, we take a step.

Case 4 (t-selector)

Since (H1) then we have that $\emptyset\vdash e_{1}:{\mathtt{List}}\;T$ . By IH, we have that
• $e_{1}=v$ . By Canonical form $e_{1}$ can have two forms:
  – $e_{1}={\mathtt{nil}}$ . Then we apply r-selector-nil and take a step.
  – $e_{1}={\mathtt{cons}}{\;v_{1}}{v_{2}}$ . Then we have two cases: either $\mathit{match}(v_{1},p)$ succeeds, then we apply r-selector-cons-ok and take a step, or $\mathit{match}(v_{1},p)$ fails, then we apply r-selector-cons-fail and take a step.
• $e_{1}=\mathtt{error}$ . Then by ctx-err we take a step to an error.
• for all $V,\mathcal{L}$ , $V;\mathcal{L};e_{1}\longrightarrow V^{\prime};\mathcal{L}^{\prime};e_{1}^{\prime}$ , for some $V^{\prime},\mathcal{L}^{\prime},e_{1}^{\prime}$ . Then by ctx-succ or ctx-lang-err, we take a step.
The case for selectors with keep is analogous.

Case 5 (t-uniquefy)

Since (H1) then we have that $\emptyset\vdash e_{1}:\mathtt{Rule}$ and $\emptyset\vdash e_{2}:\mathtt{Map}\;\mathtt{OpName}\;({\mathtt{List}}\;\mathtt{String})$ . By IH on $e_{1}$ , we have that
• $e_{1}=v_{1}$ . By Canonical form $e_{1}=r$ . By IH on $e_{2}$ we have the three cases:
  – $e_{2}=v_{2}$ . Then there are two cases: either $\mathit{uniquefy}$ succeeds and we apply r-uniquefy-ok to take a step, or it fails and we apply r-uniquefy-fail to take a step.
  – $e_{2}=\mathtt{error}$ . Then by ctx-err we take a step to an error.
  – for all $V,\mathcal{L}$ , $V;\mathcal{L};e_{2}\longrightarrow V^{\prime};\mathcal{L}^{\prime};e_{2}^{\prime}$ , for some $V^{\prime},\mathcal{L}^{\prime},e_{2}^{\prime}$ . Then by ctx-succ or ctx-lang-err, we take a step.
• $e_{1}=\mathtt{error}$ . Then by ctx-err we take a step to an error.
• for all $V,\mathcal{L}$ , $V;\mathcal{L};e_{1}\longrightarrow V^{\prime};\mathcal{L}^{\prime};e_{1}^{\prime}$ , for some $V^{\prime},\mathcal{L}^{\prime},e_{1}^{\prime}$ . Then by ctx-succ or ctx-lang-err, we take a step.
The case of $\emptyset\vdash e_{2}:\mathtt{Map}\;\mathtt{PredName}\;({\mathtt{List}}\;\mathtt{String})$ is analogous.

Case 6 (t-tick)

Since (H1) then we have that $\emptyset\vdash e_{1}:T$ and $\emptyset\vdash e_{2}:{\mathtt{List}}\;\mathtt{Term}$ . By IH on $e_{1}$ , we have that
• $e_{1}=v_{1}$ . By IH on $e_{2}$ :
  – $e_{2}=v_{2}$ . Then we have two cases depending on $T$ :
    * $T=\mathtt{Term}$ . By Canonical forms, we have that $e_{1}$ can be of the following forms:
      · $(opname\;v_{1}^{\prime})$ . Then we apply r-tick-opname and take a step.
      · ${X}$ . Then we apply r-tick-var and take a step.
      · $({z}v_{1}^{\prime})$ . Then we apply r-tick-abs and take a step.
      · $v_{1}^{\prime}[v_{1}^{\prime\prime}/{z}]$ . Then we apply r-tick-sub and take a step.
    * $T={\mathtt{List}}\;\mathtt{Term}$ . By Canonical form $e_{1}$ can have two forms:
      · $e_{1}={\mathtt{nil}}$ . Then we apply r-tick-nil and take a step.
      · $e_{1}={\mathtt{cons}}{v_{1}}{v_{2}}$ . Then we apply r-tick-cons and take a step.
  – $e_{2}=\mathtt{error}$ . Then by ctx-err we take a step to an error.
  – for all $V,\mathcal{L}$ , $V;\mathcal{L};e_{2}\longrightarrow V^{\prime};\mathcal{L}^{\prime};e_{2}^{\prime}$ , for some $V^{\prime},\mathcal{L}^{\prime},e_{2}^{\prime}$ . Then by ctx-succ or ctx-lang-err, we take a step.
• $e_{1}=\mathtt{error}$ . Then by ctx-err we take a step to an error.
• for all $V,\mathcal{L}$ , $V;\mathcal{L};e_{1}\longrightarrow V^{\prime};\mathcal{L}^{\prime};e_{1}^{\prime}$ , for some $V^{\prime},\mathcal{L}^{\prime},e_{1}^{\prime}$ . Then by ctx-succ or ctx-lang-err, we take a step.
All other cases follow similar lines as above. ∎

Theorem 0.D.3 (Progress Theorem for Configurations)

For all $V$ , $\mathcal{L}$ , $e$ , if $\emptyset\vdash V;\mathcal{L};e$ then either

•

$e=\mathtt{skip}$ , or
•

$e=\mathtt{error}$ , or
•

$V;\mathcal{L};e\longrightarrow V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ , for some $e^{\prime}$ .

Proof

Let us assume the proviso: $\emptyset\vdash V;\mathcal{L};e$ . Then we have $\emptyset\vdash e:\mathtt{Language}$ . By Progress Theorem for Expressions, we have that

•

$e=v$ . By Canonical forms, $e=\mathtt{skip}$ .
•

$e=\mathtt{error}$ , which satisfies the theorem.
•

$V;\mathcal{L};e\longrightarrow V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ , for some $e^{\prime}$ , which satisfies the theorem.

∎

0.D.2 Subject Reduction Theorem

Lemma 1 (Substitution Lemma)

If $\Gamma,x:T\vdash e:T^{\prime}$ and $\emptyset\vdash v:T$ then $\Gamma\vdash e[v/x]:T^{\prime}$ .

Proof

The proof is by induction on the derivation of $\Gamma,x:T\vdash e:T^{\prime}$ . As usual, the case for variables (t-var) relies on a standard weakening lemma: if $\Gamma\vdash e:T^{\prime}$ and $x$ is not in the free variables of $e$ then $\Gamma,x:T\vdash e:T^{\prime}$ , which can be proved by induction on the derivation of $\Gamma\vdash e:T^{\prime}$ . An aspect that differs from a standard proof is that our substitution does not replace all instances of the variables $\mathit{self}$ , $\mathit{premises}$ , and $\mathit{conclusion}$ in certain contexts. Extra care must therefore be taken in the substitution lemma because the substituted expression may still have those as free variables. The type system accounts for those cases because it augments the type environment with $\Gamma_{\textsf{rule}}$ .

∎

Lemma 2 (Pattern-matching typing and reduction)

If $\emptyset\vdash p:T\Rightarrow\Gamma^{\prime}$ and $\mathit{match}(v,p)=\theta$ then for all $x:T^{\prime}\in\Gamma^{\prime}$ , $[x/v^{\prime}]\in\theta$ and $\emptyset\vdash v^{\prime}:T^{\prime}$ .

Proof

The proof is by induction on the derivation of $\emptyset\vdash p:T\Rightarrow\Gamma^{\prime}$ . Each case is straightforward. ∎
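The shape of $\mathit{match}$ assumed by this lemma can be illustrated with a minimal first-order matcher in Python. This is a sketch under assumed encodings (pattern variables are strings, constructor patterns and values are tuples whose first component is the constructor name); the actual definition of $\mathit{match}$ in the paper may differ.

```python
def match(value, pattern, theta=None):
    """Return a substitution (dict) binding the pattern variables, or None on failure."""
    theta = dict(theta or {})
    if isinstance(pattern, str):
        # A pattern variable: bind it, or check consistency with an earlier binding.
        if pattern in theta and theta[pattern] != value:
            return None
        theta[pattern] = value
        return theta
    # A constructor pattern: the value must have the same constructor and arity.
    if (
        not isinstance(value, tuple)
        or len(value) != len(pattern)
        or value[0] != pattern[0]
    ):
        return None
    for sub_value, sub_pattern in zip(value[1:], pattern[1:]):
        theta = match(sub_value, sub_pattern, theta)
        if theta is None:
            return None
    return theta


# Matching the premise Int <: Top against the pattern T1 <: T2.
theta = match(("<:", "Int", "Top"), ("<:", "T1", "T2"))
```

On success every variable of the pattern receives a binding in the resulting substitution, which is the property that the lemma states at the level of types.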

Lemma 3 ( $\mathit{uniquefy}_{\textsf{lf}}$ produces well-typed results or fails)

If $\mathit{uniquefy}_{\textsf{lf}}(lf,m,str,mr)=res$ then either
• $res=(lf^{\prime},mr^{\prime})$ such that $\emptyset\vdash lf^{\prime}:{\mathtt{List}}\;\mathtt{Formula}$ , and $\emptyset\vdash mr^{\prime}:\mathtt{Map}\;\mathtt{Term}\;({\mathtt{List}}\;\mathtt{Term})$ , or
• $res=\mathit{fail}$ .

Proof

Straightforward induction on the definition of $\mathit{uniquefy}_{\textsf{lf}}$ . Most cases rely on the analogous lemmas for formulae, terms, lists of terms and lists of formulae:
• $\mathit{uniquefy}_{\textsf{f}}(f,m,str,mr)=res$
  – $res=(f^{\prime},mr^{\prime})$ such that $\emptyset\vdash f^{\prime}:\mathtt{Formula}$ , and $\emptyset\vdash mr^{\prime}:\mathtt{Map}\;\mathtt{Term}\;({\mathtt{List}}\;\mathtt{Term})$ .
  – $res=\mathit{fail}$ .
• $\mathit{uniquefy}_{\textsf{t}}(t,m,str,mr)=res$
  – $res=(t^{\prime},mr^{\prime})$ such that $\emptyset\vdash t^{\prime}:\mathtt{Term}$ , and $\emptyset\vdash mr^{\prime}:\mathtt{Map}\;\mathtt{Term}\;({\mathtt{List}}\;\mathtt{Term})$ .
  – $res=\mathit{fail}$ .
• $\mathit{uniquefy}_{\textsf{lt}}(lt,m,str,mr)=res$
  – $res=(lt^{\prime},mr^{\prime})$ such that $\emptyset\vdash lt^{\prime}:{\mathtt{List}}\;\mathtt{Term}$ , and $\emptyset\vdash mr^{\prime}:\mathtt{Map}\;\mathtt{Term}\;({\mathtt{List}}\;\mathtt{Term})$ .
  – $res=\mathit{fail}$ .
Each can be proved with a straightforward induction on the definition of $\mathit{uniquefy}_{\mathcal{X}}$ where $\mathcal{X}\in\{\textsf{f,t,lt}\}$ . ∎

Lemma 4 (Compositionality of $\vdash$ )

If $\emptyset\vdash E[e]:T$ then there exists $T^{\prime}$ such that $\emptyset\vdash e:T^{\prime}$ and, for all $e^{\prime}$ , if $\emptyset\vdash e^{\prime}:T^{\prime}$ then $\emptyset\vdash E[e^{\prime}]:T$ .

Proof

The proof is by induction on the structure of $E$ . Each case is straightforward.

Theorem 0.D.4 (Subject Reduction ( $\longrightarrow_{\mathtt{@}}$ ))

If $V\cap\mathit{vars}(\mathcal{L})=\emptyset$ , $\emptyset\vdash e:T$ , and $V;\mathcal{L}\;;\;e\longrightarrow_{\mathtt{@}}V^{\prime};\mathcal{L}^{\prime}\;;\;e^{\prime}$ then $V^{\prime}\cap\mathit{vars}(\mathcal{L}^{\prime})=\emptyset$ , and $\emptyset\vdash e^{\prime}:T$ .

Proof

Let us assume the proviso of the theorem, that is, (H1) $V\cap\mathit{vars}(\mathcal{L})=\emptyset$ , (H2) $\Gamma\vdash e:T$ , and (H3) $V;\mathcal{L}\;;\;e\longrightarrow_{\mathtt{@}}V^{\prime};\mathcal{L}^{\prime}\;;\;e^{\prime}$ . The proof is by case analysis on (H3).

Case 7 (r-seq-ok)

$V;\mathcal{L}\;;\;(\mathtt{skip};e)\longrightarrow V;\mathcal{L}\;;\;e$ . We need to prove $\mathit{vars}(\mathcal{L})\subseteq V$ , which we already have by (H1). We need to prove $\vdash\mathcal{L}$ , which we have by (H2). We have to prove that $\Gamma\vdash e:T$ where $\Gamma\vdash(\mathtt{skip};e):T$ . By t-seq we have that $\Gamma\vdash(\mathtt{skip};e):\mathtt{Language}$ (i.e. $T=\mathtt{Language}$ ) and $\Gamma\vdash e:\mathtt{Language}$ .

Case 8 (r-newvar)

$V;(G,R)\;;\;\mathtt{newVar}\;str\longrightarrow_{\mathtt{@}}V\cup\{{X^{\prime}}\};\mathcal{L}\;;\;{X^{\prime}}$ . We need to prove $\mathit{vars}(\mathcal{L})\subseteq V\cup\{{X^{\prime}}\}$ , which we have because by (H1) we have that $\mathit{vars}(\mathcal{L})\subseteq V$ , and additionally we have that ${X^{\prime}}\not\in\mathit{vars}(\mathcal{L})$ . We have to prove that $\Gamma\vdash{X^{\prime}}:\mathtt{Term}$ because $\Gamma\vdash\mathtt{newVar}:\mathtt{Term}$ . This holds thanks to t-metaVar.

Case 9 (r-rule-comp)

$V;\mathcal{L}\;;\;v;_{\textsf{r}}e\longrightarrow_{\mathtt{@}}V;\mathcal{L}\;;\;e\theta_{\textsf{rule}}^{(v)}$ . We need to prove $V\cap\mathit{vars}(\mathcal{L})=\emptyset$ , which we have by (H1). We have to prove that (#) $\emptyset\vdash e\theta_{\textsf{rule}}^{(v)}:\mathtt{Rule}$ when (*) $\emptyset\vdash v;_{\textsf{r}}e:\mathtt{Rule}$ . From (*) we infer (HRULE) $\emptyset\vdash v:\mathtt{Rule}$ and $\Gamma_{\textsf{rule}}\vdash\;e:\mathtt{Rule}$ , that is (HE) $\mathit{self}:\mathtt{Rule},\mathit{premises}:{\mathtt{List}}\;\mathtt{Formula},\mathit{conclusion}:\mathtt{Formula}\vdash\;e:\mathtt{Rule}$ . By Canonical Form Lemma, from (HRULE) we infer that $v=\inference{v_{1}}{v_{2}}$ and, since it is typeable (HRULE), by t-rule we have $\emptyset\vdash v_{1}:{\mathtt{List}}\;\mathtt{Formula}$ and $\emptyset\vdash v_{2}:\mathtt{Formula}$ . We have $e\theta_{\textsf{rule}}^{(v)}=e[v/\mathit{self},v_{1}/\mathit{premises},v_{2}/\mathit{conclusion}]=e[v/\mathit{self}][v_{1}/\mathit{premises}][v_{2}/\mathit{conclusion}]$ . Given (HE), and given (HRULE), by Substitution Lemma we have ( $HE_{1}$ ) $\mathit{premises}:{\mathtt{List}}\;\mathtt{Formula},\mathit{conclusion}:\mathtt{Formula}\vdash\;e[v/\mathit{self}]:\mathtt{Rule}$ . Given ( $HE_{1}$ ), and given $\emptyset\vdash v_{1}:{\mathtt{List}}\;\mathtt{Formula}$ , by Substitution Lemma we have ( $HE_{2}$ ) $\mathit{conclusion}:\mathtt{Formula}\vdash\;e[v/\mathit{self}][v_{1}/\mathit{premises}]:\mathtt{Rule}$ . Given ( $HE_{2}$ ), and given $\emptyset\vdash v_{2}:\mathtt{Formula}$ , by Substitution Lemma we have $\emptyset\vdash\;e[v/\mathit{self}][v_{1}/\mathit{premises}][v_{2}/\mathit{conclusion}]:\mathtt{Rule}$ , which is (#).

Case 10 (r-selector-nil)

$V;\mathcal{L}\;;\;{\mathtt{nil}}[p]:\;e\longrightarrow_{\mathtt{@}}V;\mathcal{L}\;;\;\mathtt{nil}$ . We need to prove $V\cap\mathit{vars}(\mathcal{L})=\emptyset$ , which we have by (H1). We have to prove that (#) $\Gamma\vdash\mathtt{nil}:{\mathtt{List}}\;T$ when (*) $\Gamma\vdash{\mathtt{nil}}[p]:\;e:{\mathtt{List}}\;T$ . Thanks to t-emptyList this holds.

V;\mathcal{L}\;;\;(\mathit{cons}^{*}\;{e\theta\theta^{\prime}}\;{({v_{2}}[p]:\;e)})

. We need to prove

V\cap\mathit{vars}(\mathcal{L})=\emptyset

, which we have because by (H1). We have to prove that (#)

\emptyset\vdash(\mathit{cons}^{*}\;{e\theta\theta^{\prime}}\;{({v_{2}}[p]:\;e)}):{\mathtt{List}}\;T^{\prime}

when (*)

\Gamma\vdash{(\mathtt{cons}\;v_{1}\;v_{2})}[p]:\;e:{\mathtt{List}}\;T^{\prime}

. By t-selector we have that

\Gamma\vdash(\mathtt{cons}\;v_{1}\;v_{2}):{\mathtt{List}}\;T

, and therefore by t-cons, we have that

\Gamma\vdash v_{1}:T

. We do a case analysis on whether

T=\mathtt{Rule}

or not, to prove in both cases that

\Gamma\vdash e\theta\theta^{\prime}:\mathtt{Option}\;T^{\prime}

. •

T=\mathtt{Rule}

: By Canonical Form, then we have that

v_{1}=\inference{v_{1}^{\prime}}{v_{2}^{\prime}}

. Then

\theta^{\prime}=[v_{1}/\mathit{self},v_{1}^{\prime}/\mathit{premises},v_{2}^{\prime}/\mathit{conclusion}]

. From (*) we infer that

\Gamma^{\prime},\mathit{self}:\mathtt{Rule},\mathit{premises}:{\mathtt{List}}\;\mathtt{Formula},\mathit{conclusion}:\mathtt{Formula}\vdash\;e_{2}:\mathtt{Option}\;T^{\prime}

, where

\Gamma^{\prime}

comes from the pattern-matching. By applying the same reasoning as in r-rule-comp, we can apply the Substitution lemma three times to have

\Gamma^{\prime}\vdash e\theta^{\prime}:\mathtt{Option}\;T^{\prime}

. By Lemma 2 (pattern-matching correctness) we have that for all

(x:T^{\prime\prime})\in\Gamma^{\prime}

there is

[x/v^{\prime\prime}]\in\theta

such that

\emptyset\vdash v^{\prime\prime}:T^{\prime\prime}

. Then, for all such

(x:T^{\prime\prime})\in\Gamma^{\prime}

we can use the Substitution Lemma to substitute its

[x/v^{\prime\prime}]

, and end up with

\emptyset\vdash e\theta\theta^{\prime}:\mathtt{Option}\;T^{\prime}

. •

T\not=\mathtt{Rule}

: Then

\theta^{\prime}=[v_{1}/\mathit{self}]

and by Substitution lemma we have

\Gamma^{\prime}\vdash e\theta^{\prime}:\mathtt{Option}\;T^{\prime}

. By pattern-matching correctness, the same reasoning as in the previous case leads us to

\emptyset\vdash e\theta\theta^{\prime}:\mathtt{Option}\;T^{\prime}

. As now we know that (*)

\Gamma\vdash e\theta\theta^{\prime}:\mathtt{Option}\;T^{\prime}

in all cases. If we expand

(\mathit{cons}^{*}\;{e\theta\theta^{\prime}}\;{({v_{2}}[p]:\;e)})

we have

\mathtt{if}\;(\mathtt{isNothing}\;e\theta\theta^{\prime})\;\mathtt{then}\;({v_{2}}[p]:\;e)\;\mathtt{else}\;\mathtt{cons}\;(\mathtt{get}\;e\theta\theta^{\prime})\;({v_{2}}[p]:\;e)

. Here

\mathtt{isNothing}

and

\mathtt{get}

are applied to

e\theta\theta^{\prime}

of type

\mathtt{Option}\;T^{\prime}

, and are therefore well-typed. Also, both branches of the if return an expression of type

{\mathtt{List}}\;T^{\prime}

, so the whole expansion has that type.
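The expansion of cons* used in this case can be made concrete with a small executable sketch. It is not the authors' implementation: `Some`, `cons_star`, `select`, `match_int`, and `keep_even_doubled` are hypothetical names, Option values are modelled as `None` or `Some(v)`, lists as Python lists, and pattern-matching as a function that returns a substitution dictionary or `None`. Skipping elements whose match fails is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Some:
    """Hypothetical stand-in for a non-empty Option value; None plays 'nothing'."""
    value: Any


def cons_star(opt, rest):
    # cons* o l  =  if (isNothing o) then l else cons (get o) l
    if opt is None:
        return rest
    return [opt.value] + rest


def select(items, match, body):
    """Sketch of the selector `items[p]: e` unfolding one list element at a time."""
    if not items:
        return []
    head, tail = items[0], items[1:]
    theta = match(head)
    if theta is None:
        # Assumption of this sketch: a failed match skips the element.
        return select(tail, match, body)
    # Mirrors the step  cons* (e theta theta') (v2[p]: e).
    return cons_star(body(theta), select(tail, match, body))


def match_int(v):
    # Hypothetical pattern: binds x when v is an integer.
    return {"x": v} if isinstance(v, int) else None


def keep_even_doubled(theta):
    # Hypothetical body: returns an Option, as the typing rule requires.
    return Some(theta["x"] * 2) if theta["x"] % 2 == 0 else None


result = select([1, 2, "a", 4], match_int, keep_even_doubled)  # [4, 8]
```

The body returns an Option and `cons_star` either drops it or prepends its content, which is why both branches of the expansion produce a list of the same element type.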

Case 12 (r-uniquefy-ok)

V;\mathcal{L}\;;\;\mathtt{uniquefy}(\mathit{lf},v_{1},str)\Rightarrow(x,y):e\longrightarrow_{\mathtt{@}}V;\mathcal{L}\;;\;e[\mathit{lf}^{\prime}/x,v_{2}/y]

. We need to prove

V\cap\mathit{vars}(\mathcal{L})=\emptyset

, which we have by (H1). We have to prove that (#)

\Gamma\vdash e[\mathit{lf}^{\prime}/x,v_{2}/y]:T

when (*)

\Gamma\vdash\mathtt{uniquefy}(r,v,str)\Rightarrow(x,y):e:T

. By r-uniquefy-ok we have

(r^{\prime},m)=\mathit{uniquefy}_{\textsf{r}}(r,v,str,\mathtt{emptyMap})

, and by Lemma 3 we have that

\emptyset\vdash\mathit{lf^{\prime}}:{\mathtt{List}}\;\mathtt{Formula}

, and

\emptyset\vdash v_{2}:\mathtt{Map}\;\mathtt{Term}\;({\mathtt{List}}\;\mathtt{Term})

. By t-uniquefy we have that

\Gamma,x:{\mathtt{List}}\;\mathtt{Formula},y:\mathtt{Map}\;\mathtt{Term}\;({\mathtt{List}}\;\mathtt{Term})\vdash\;e_{3}:T

. By the Substitution Lemma, we then have

\Gamma\vdash e[\mathit{lf}^{\prime}/x,v_{2}/y]:T

, which is (#).

All other cases are analogous.

Theorem 0.D.5 (Subject Reduction ( $\longrightarrow$ ))

For all $V$ , $V^{\prime}$ , $\mathcal{L}$ , $\mathcal{L}^{\prime}$ , $e$ , $e^{\prime}$ , if $\emptyset\vdash V;\mathcal{L};e$ and $V;\mathcal{L};e\longrightarrow V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ then $\emptyset\vdash V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ .

Proof

Let us assume the hypotheses of the theorem, that is, (H1) $\emptyset\vdash V;\mathcal{L};e$ and $V;\mathcal{L};e\longrightarrow V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ . The proof is by case analysis on the derivation of $V;\mathcal{L};e\longrightarrow V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ .

Case 13 (ctx-succ)

$V;\mathcal{L};E[e]\longrightarrow V^{\prime};\mathcal{L}^{\prime};E[e^{\prime}]$ when (H2) $V;\mathcal{L};e\longrightarrow_{\mathtt{@}}V^{\prime};\mathcal{L}^{\prime};e^{\prime}$ . From (H1) we know that (H6) $\emptyset\vdash E[e]:T$ , for some $T$ . Then we can apply Lemma 4 to obtain (H3) $\emptyset\vdash e:T^{\prime}$ , for some $T^{\prime}$ . With (H2) and (H3) we can apply Subject Reduction for $\longrightarrow_{\mathtt{@}}$ and obtain (H4) $\emptyset\vdash e^{\prime}:T^{\prime}$ and (H5) $V^{\prime}\cap\mathit{vars}(\mathcal{L}^{\prime})=\emptyset$ . By Lemma 4, since we have (H6) and (H4), we can derive $\emptyset\vdash E[e^{\prime}]:T$ , and since we have (H5), we can derive $\emptyset\vdash V^{\prime};\mathcal{L}^{\prime};E[e^{\prime}]$ .

Case 14 (ctx-lang-err)

$V;\mathcal{L}\;;\;E[e]\longrightarrow V;\mathcal{L}\;;\;\mathtt{error}$ . (H1) implies $V\cap\mathit{vars}(\mathcal{L})=\emptyset$ and $\vdash\mathcal{L}$ . Since the step leaves $V$ and $\mathcal{L}$ unchanged, these are exactly the two side conditions required of the resulting configuration. It remains to prove $\Gamma\vdash\mathtt{error}:\mathtt{Language}$ , which follows from (t-error).

Case 15 (ctx-err)

The proof proceeds along the same lines as ctx-lang-err.

0.D.3 Type Soundness

Theorem 0.D.6 (Type Soundness)

The proof is straightforward once we have the Subject Reduction ( $\longrightarrow$ ) theorem, the Progress Theorem for Configurations, and the fact that typeability is preserved over multiple steps (provable by a straightforward induction on the derivation of $\longrightarrow^{*}$ ).
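The multi-step preservation property invoked here can be spelled out as follows. This is only a sketch of the standard argument, phrased in the notation of Theorem 0.D.5; it assumes that $\longrightarrow^{*}$ is the reflexive-transitive closure of $\longrightarrow$, and the intermediate configuration $V^{\prime\prime};\mathcal{L}^{\prime\prime};e^{\prime\prime}$ is a name introduced for illustration.

```latex
% Claim (multi-step preservation):
\emptyset\vdash V;\mathcal{L};e
\;\wedge\;
V;\mathcal{L};e\longrightarrow^{*}V^{\prime};\mathcal{L}^{\prime};e^{\prime}
\;\Longrightarrow\;
\emptyset\vdash V^{\prime};\mathcal{L}^{\prime};e^{\prime}

% Base case (zero steps): V' = V, L' = L, e' = e, so the conclusion is the hypothesis.

% Inductive case:
V;\mathcal{L};e\longrightarrow V^{\prime\prime};\mathcal{L}^{\prime\prime};e^{\prime\prime}
\longrightarrow^{*}V^{\prime};\mathcal{L}^{\prime};e^{\prime}
% Theorem 0.D.5 gives  \emptyset\vdash V'';L'';e'' ,
% and the induction hypothesis on the remaining steps gives  \emptyset\vdash V';L';e' .
```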

Appendix 0.E Let-Binding and Match in $\mathcal{L}\textendash\textsf{Tr}$

$\mathtt{let}\;x=e_{1}\;\mathtt{in}\;e_{2}\equiv\mathtt{head}\;({[e_{1}]}[x]:\;e_{2})$

The pattern-matching that we use is unary-branched and either succeeds or throws an error.

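$\mathtt{match}\;e_{1}\;\mathtt{with}\;p\Rightarrow e_{2}\equiv\mathtt{let}\;x=({[e_{1}]}[p]:\;e_{2})\;\mathtt{in}\;\mathtt{if}\;(\mathtt{isEmpty}\;x)\;\mathtt{then}\;\mathtt{error}\;\mathtt{else}\;\mathtt{head}\;x$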
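The two encodings of this appendix can be mimicked in a few lines of Python. This is a minimal sketch under simplifying assumptions, not the calculus itself: `select`, `let`, `match_with`, `MatchError`, and `pair_pattern` are hypothetical names, the selector here simply collects the body results of the elements that match (the Option layer of selector bodies is dropped to keep the encoding visible), and `error` is modelled as a raised exception.

```python
class MatchError(Exception):
    """Plays the role of the calculus's `error` in this sketch."""


def select(items, match, body):
    # Simplified selector `items[p]: e`: evaluate body for each matching element.
    results = []
    for item in items:
        theta = match(item)
        if theta is not None:
            results.append(body(theta))
    return results


def let(value, body):
    # let x = e1 in e2  ==  head ([e1][x]: e2)
    # A variable pattern always matches, so the result list has exactly one element.
    return select([value], lambda v: {"x": v}, lambda theta: body(theta["x"]))[0]


def match_with(value, match, body):
    # match e1 with p => e2
    #   ==  let x = ([e1][p]: e2) in if (isEmpty x) then error else head x
    def continuation(x):
        if not x:
            raise MatchError("pattern did not match")
        return x[0]

    return let(select([value], match, body), continuation)


def pair_pattern(v):
    # Hypothetical pattern (a, b): binds both components of a 2-tuple.
    if isinstance(v, tuple) and len(v) == 2:
        return {"a": v[0], "b": v[1]}
    return None


total = match_with((1, 2), pair_pattern, lambda theta: theta["a"] + theta["b"])  # 3
```

Applying `match_with` to a value that is not a pair, such as `5`, raises `MatchError`, which corresponds to the "throws an error" behaviour of the unary-branched match.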
