Metatheory.jl: Fast and Elegant Algebraic Computation in Julia with Extensible Equality Saturation
Abstract
We introduce Metatheory.jl: a lightweight and performant general purpose symbolics and metaprogramming framework meant to simplify the act of writing complex Julia metaprograms and to significantly enhance Julia with a native term rewriting system, based on state-of-the-art equality saturation techniques, and a dynamic first class AST pattern matching system that is dynamically composable in an algebraic fashion, taking full advantage of the language’s powerful reflection capabilities. Our contribution allows to perform general purpose symbolic mathematics, manipulation, optimization, synthesis or analysis of syntactically valid Julia expressions with a clean and concise programming interface, both during compilation or execution of programs.
1 Statement of Need
The Julia programming language is a fresh approach to technical computing [Bezanson et al., 2017], disrupting the popular conviction that a programming language cannot be high level, easy to learn, and performant at the same time. One of the most practical features of Julia is the excellent metaprogramming and macro system, allowing for homoiconicity: programmatic generation and manipulation of expressions as first-class values, a well-known paradigm similar to several LISP idioms such as Scheme.
We introduce Metatheory.jl: a general purpose metaprogramming and algebraic computation library for the Julia programming language, designed to take advantage of the powerful reflection capabilities to bridge the gap between symbolic mathematics, abstract interpretation, equational reasoning, optimization, composable compiler transforms, and advanced homoiconic pattern matching features. Intuitively, Metatheory.jl transforms Julia expressions in other Julia expressions at both compile and run time. This allows Metatheory.jl users to perform customized and composable compiler optimization specifically tailored to single, arbitrary Julia packages. Our library provides a simple, algebraically composable interface to help scientists in implementing and reasoning about all kinds of formal systems, by defining concise rewriting rules as syntactically valid Julia code.
2 Summary
Theories can then be executed throught two, highly composable, rewriting backends. The first one, is based just on standard rewriting built on top of the pattern matcher developed in [Zhao and Carlsson, 2020]. Such approach suffers of the usual problems of rewriting systems: even trivial equational rules such as commutativity may lead to non-terminating systems and thus need to be adjusted by some sort of structuring or rewriting order (that is known to require extensive user reasoning).
The other back-end for Metatheory.jl, the core of our contribution, is designed to not require the user to reason about rewriting order by relying on state-of-the-art techniques equality saturation on e-graphs, adapted from the egg rust library [Willsey et al., 2021]. Provided with a theory of equational rewriting rules, defined in pure Julia, e-graphs compactly represent many equivalent programs. Saturation iteratively executes an e-graph specific pattern matcher to efficiently compute (and analyze) all possible equivalent expressions contained in the e-graph congruence closure. This latter back-end is suitable for partial evaluators, symbolic mathematics, static analysis, theorem proving and superoptimizers.

The original egg library [Willsey et al., 2021] is known to be the first implementation of generic and extensible e-graphs [Nelson and Oppen, 1980], the contributions of egg include novel amortized algorithms for fast and efficient equivalence saturation and analysis. Differently from the original rust implementation of egg, which handles expressions defined as rust strings and data structures, our system directly manipulates homoiconic Julia expressions, and can therefore fully leverage the Julia subtyping mechanism [Zappa Nardelli et al., 2018], allowing programmers to build expressions containing not only symbols but all kinds of Julia values. This permits rewriting and analyses to be efficiently based on runtime data contained in expressions. Most importantly, users can –and are encouraged to– include type assertions in the left hand of rewriting rules in theories.
One of the project goals of Metatheory, beyond being to be easy to use and composable, is to be fast and efficient: both the first-class pattern matching system and the generation of e-graph analyses from theories rely on RuntimeGeneratedFunctions.jl [Rackauckas and Foster, 2021], generating callable functions at runtime that efficiently bypass Julia’s world age problem [Belyakova et al., 2020] with the full performance of a standard Julia anonymous function.
2.1 Analyses and Extraction
With Metatheory.jl, modeling analyses and conditional/dynamic rewrites is straightforward: it is possible to check conditions on runtime values or to read and write from external data structures during rewriting. The analysis mechanism described in egg [Willsey et al., 2021] and re-implemented in our contribution lets users define ways to compute additional analysis metadata from an arbitrary semi-lattice domain, such as costs of nodes or logical statements attached to terms. Other than for inspection, analysis data can be used to modify expressions in the e-graph both during rewriting steps and after e-graph saturation.
Therefore using the equality saturation (e-graph) backend, extraction can be performed as an on-the-fly e-graph analysis or after saturation. Users can define their own, or choose between a variety of predefined cost functions for automatically extracting the most fitting expressions from the congruence closure represented by an e-graph.
3 Conclusion
Many applications of equality saturation have been recently published, tailoring advanced optimization tasks. Herbie [Panchekha et al., 2015] is a tool for automatically improving the precision of floating point expressions, which recently switched to egg as the core rewriting backend. In [Yang et al., 2021], authors used egg to superoptimize tensor signal flow graphs describing neural networks. However, Herbie requires interoperation and conversion of expressions between different languages and libraries. Implementing similar case studies in pure Julia would make valid research contributions on their own. We are confident that a well-integrated and homoiconic equality saturation engine in pure Julia will permit exploration of many new metaprogramming applications, and allow them to be implemented in an elegant, performant and concise way. Code for Metatheory.jl is available in [Cheli, 2021], or at this link https://github.com/0x0f0f0f/Metatheory.jl.
4 Acknowledgements
We acknowledge Max Willsey and contributors for their work on the original egg library [Willsey et al., 2021], Christopher Rackauckas and Christopher Foster for their efforts in developing RuntimeGeneratedFunctions [Rackauckas and Foster, 2021], Taine Zhao for developing MLStyle [Zhao, 2021] and MatchCore [Zhao and Carlsson, 2020], and Philip Zucker for his original idea of implementing E-Graphs in Julia [Zucker, 2020b, Zucker, 2020a] and support during the development of the project. Special thanks to Filippo Bonchi for a friendly review of a preliminary version of this article.
References
- [Belyakova et al., 2020] Belyakova, J., Chung, B., Gelinas, J., Nash, J., Tate, R., and Vitek, J. (2020). World age in julia.
- [Bezanson et al., 2017] Bezanson, J., Edelman, A., Karpinski, S., and Shah, V. B. (2017). Julia: A fresh approach to numerical computing. SIAM review, 59(1):65–98.
- [Cheli, 2021] Cheli, A. (2021). Metatheory.jl.
- [Nelson and Oppen, 1980] Nelson, G. and Oppen, D. C. (1980). Fast decision procedures based on congruence closure. J. ACM, 27(2):356–364.
- [Panchekha et al., 2015] Panchekha, P., Sanchez-Stern, A., Wilcox, J. R., and Tatlock, Z. (2015). Automatically improving accuracy for floating point expressions. SIGPLAN Not., 50(6):1–11.
- [Rackauckas and Foster, 2021] Rackauckas, C. and Foster, C. (2021). Runtimegeneratedfunctions.jl: Functions generated at runtime without world-age issues.
- [Willsey et al., 2021] Willsey, M., Nandi, C., Wang, Y. R., Flatt, O., Tatlock, Z., and Panchekha, P. (2021). Egg: Fast and extensible equality saturation. Proc. ACM Program. Lang., 5(POPL).
- [Yang et al., 2021] Yang, Y., Phothilimtha, P. M., Wang, Y. R., Willsey, M., Roy, S., and Pienaar, J. (2021). Equality saturation for tensor graph superoptimization. arXiv preprint arXiv:2101.01332.
- [Zappa Nardelli et al., 2018] Zappa Nardelli, F., Belyakova, J., Pelenitsyn, A., Chung, B., Bezanson, J., and Vitek, J. (2018). Julia subtyping: a rational reconstruction. Proceedings of the ACM on Programming Languages, 2(OOPSLA):1–27.
- [Zhao, 2021] Zhao, T. (2021). Mlstyle.jl: Fast, consistent and extensible functional programming infrastructures.
- [Zhao and Carlsson, 2020] Zhao, T. and Carlsson, K. (2020). Matchcore.jl: A minimal implementation of optimized pattern matching.
- [Zucker, 2020a] Zucker, P. (2020a). E-graph pattern matching (part ii).
- [Zucker, 2020b] Zucker, P. (2020b). E-graphs in julia (part i).