Categorification of characteristic structures

Peter A. Brooksbank (pbrooksb@bucknell.edu, Bucknell University, USA)
Heiko Dietrich (heiko.dietrich@monash.edu, Monash University, Australia)
Joshua Maglione (joshua.maglione@universityofgalway.ie, University of Galway, Ireland)
E.A. O’Brien (e.obrien@auckland.ac.nz, University of Auckland, New Zealand)
James B. Wilson (James.Wilson@ColoState.Edu, Colorado State University, USA)
(Date: August 10, 2025)
Abstract.

We develop a representation theory of categories as a means to explore characteristic structures in algebra. Characteristic structures play a critical role in isomorphism testing of groups and algebras, and their construction and description often rely on specific knowledge of the parent object and its automorphisms. In many cases, questions of reproducibility and comparison arise. Here we present a categorical framework that addresses these questions. We prove that every characteristic structure is the image of a functor equipped with a natural transformation. This shifts the local description in the parent object to a global one in the ambient category. Through constructions in representation theory, such as tensor products, we can combine characteristic structure across multiple categories. Our results are constructive, stated in the language of a constructive type theory, which facilitates implementations in theorem checkers.

1. Introduction

The problem of deciding when two algebraic structures are isomorphic is fundamental to algebra and computer science. It encompasses issues of decidability and complexity, and it tests the limits of our theories and algorithms. An initial tactic in deciding isomorphism is to identify substructures that are invariant under isomorphisms because doing so reduces the search space. We first discuss groups, where the literature is most developed (see, for example, [ELGO2002, BOW, Maglione2021, Wilson:filters]), but our results apply to monoids, loops, rings, and non-associative algebras.

A subgroup $H$ of a group $G$ is characteristic if $\varphi(H)=H$ for every automorphism $\varphi:G\to G$; it is fully invariant if $\psi(H)\leqslant H$ for every homomorphism $\psi:G\to G$. We use the language of categories, following [Riehl], and a type of natural transformation to describe our main results (details are given in Section 5.1).
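For very small groups, both definitions can be checked directly by exhaustive search. The following Python sketch is an illustration only and is not part of the paper's methods: the function names and the encoding of a group by its Cayley table are assumptions made here, and the search is exponential in the group order.

```python
from itertools import product

def cyclic_table(n):
    """Cayley table of Z/n under addition (elements 0..n-1)."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

def klein_table():
    """Cayley table of Z/2 x Z/2, with elements 0..3 multiplied by XOR."""
    return [[i ^ j for j in range(4)] for i in range(4)]

def endomorphisms(table):
    """All maps f: G -> G with f(x*y) = f(x)*f(y), by brute force."""
    n = len(table)
    return [f for f in product(range(n), repeat=n)
            if all(f[table[x][y]] == table[f[x]][f[y]]
                   for x in range(n) for y in range(n))]

def automorphisms(table):
    """The bijective endomorphisms."""
    return [f for f in endomorphisms(table) if len(set(f)) == len(table)]

def is_characteristic(table, H):
    """phi(H) = H for every automorphism phi."""
    return all({f[h] for h in H} == set(H) for f in automorphisms(table))

def is_fully_invariant(table, H):
    """psi(H) <= H for every endomorphism psi."""
    return all({f[h] for h in H} <= set(H) for f in endomorphisms(table))

# The subgroup {0, 2} of Z/4 is fully invariant, hence characteristic.
z4_ok = is_characteristic(cyclic_table(4), {0, 2})
# No subgroup of order 2 in Z/2 x Z/2 is characteristic.
klein_ok = is_characteristic(klein_table(), {0, 1})
```

This check is a posteriori: it needs the full list of automorphisms, which is exactly the dependence the categorical formulation aims to avoid.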

Definition 1.1.

Let $\mathsf{A}$ be a category, and let $\mathsf{B}$ be a subcategory with inclusion functor $\mathcal{I}:\mathsf{B}\to\mathsf{A}$. A counital is a natural transformation $\iota:\mathcal{C}\Rightarrow\mathcal{I}$ for some functor $\mathcal{C}:\mathsf{B}\to\mathsf{A}$. The class of all such counitals is denoted $\text{Counital}(\mathsf{B},\mathsf{A})$. For an object $X$ of $\mathsf{B}$, the $X$-component of $\iota$ is the morphism $\iota_{X}:\mathcal{C}(X)\to\mathcal{I}(X)$ in $\mathsf{A}$.

A special case of our results, for the category of groups, can be stated as follows.

Theorem 1.

For the category $\mathsf{Grp}$ of groups and subcategory $\overset{\longleftrightarrow}{\mathsf{Grp}}$ of groups and their isomorphisms, the following equalities of sets hold:

\begin{align*}
\{H\leqslant G \mid H\text{ characteristic in }G\} &= \left\{\mathrm{Im}(\iota_{G}) \,\middle|\, \iota\in\text{Counital}\left(\overset{\longleftrightarrow}{\mathsf{Grp}},\mathsf{Grp}\right)\right\};\\
\{H\leqslant G \mid H\text{ fully invariant in }G\} &= \left\{\mathrm{Im}(\iota_{G}) \,\middle|\, \iota\in\text{Counital}(\mathsf{Grp},\mathsf{Grp})\right\}.
\end{align*}

Theorem 1 contrasts a “recognizable” description of characteristic (fully invariant) subgroups with a “constructive” one. For a fixed group $G$, the sets on the left are of the form $\{H\mid P(G,H)\}$, where $P$ is the appropriate logical predicate that allows us to recognize when a subgroup $H$ belongs to the set; those on the right are of the form $\{f(\iota)\mid\iota\in\text{Counital}(\ldots,\mathsf{Grp})\}$, where $f(\iota)=\mathrm{Im}(\iota_{G})$ allows us to construct members of the subset by applying a function. Also, the descriptions on the left are “local” since they reference just a single parent group, whereas those on the right are “global” since they apply to the ambient categories.

The characterization of characteristic subgroups by natural transformations allows one to recast the lattice theory of characteristic subgroups into the globular compositions of natural transformations as explored in [Baez, Power]. We now explore other implications of Theorem 1.

1.1. Constraining isomorphism by characteristic subgroups

Characteristic subgroups constrain isomorphisms in the following sense:

Fact 1.2.

If $H$ is a characteristic subgroup of $G$, and $\alpha,\beta:G\to\tilde{G}$ are isomorphisms, then $\alpha(H)=\beta(H)$.

It is therefore useful for an isomorphism test to locate characteristic subgroups of a group $G$: every hypothetical isomorphism from $G$ to $\tilde{G}$ must then assign such a subgroup $H$ to a unique corresponding subgroup $\tilde{H}$ of $\tilde{G}$. This raises at least two issues. First, if the task is to construct isomorphisms, then we should assume that $\operatorname{Aut}(G)$ is not yet known. How then do we verify that $H$ is characteristic? Is there an alternative definition of the characteristic property that does not directly reference $\operatorname{Aut}(G)$? A second issue is how to determine the possible $\tilde{H}\leqslant\tilde{G}$ when we know only that $H$ is characteristic in $G$. For familiar characteristic subgroups such as the center $\zeta(G)$ this is possible because the definition is already global to all groups. Hence, a hypothetical isomorphism $\alpha:G\to\tilde{G}$ must satisfy $\alpha(\zeta(G))=\zeta(\tilde{G})$, and typically $\zeta(G)$ and $\zeta(\tilde{G})$ can be constructed without explicit knowledge of $\operatorname{Aut}(G)$ or $\operatorname{Aut}(\tilde{G})$. However, the following family of examples, first explored by Rottlaender [Rottlander28], exhibits groups whose characteristic subgroups have no known global definition, so it is difficult to utilize Fact 1.2.

Example 1.3.

Let $p$ be a prime and $m<p$ a positive integer. Let $q\equiv 1\bmod{p}$ be a prime and denote by $\mathbb{F}_{q}$ the field with $q$ elements. Let $\theta\in\operatorname{GL}_{m}(\mathbb{F}_{q})$, with $\theta^{p}=1$, be diagonalizable with $m$ eigenvalues $a_{1},\ldots,a_{m}$, each different from $1$, satisfying the following property. If there exists $u\in\{1,\dots,p-1\}$ with $a_{i}^{u}=a_{j}$ for all $i\neq j$, then $p\nmid(u^{k}-1)$ for $k\in\{1,\dots,m\}$. For $m=2$, this requires $a_{1}\neq a_{2}^{\pm 1}$.

The cyclic group $C_{p}$ of order $p$ acts on the vector space $V=\mathbb{F}_{q}^{m}$ via $\theta$. The condition on $\theta$ means that each eigenspace in $V$ is a characteristic subgroup of the semidirect product $G_{\theta}=C_{p}\ltimes_{\theta}V$ determined by $\theta$, and exactly $m$ of the $1+q+q^{2}+\cdots+q^{m-1}$ order $q$ subgroups of $G_{\theta}$ are characteristic. Two such groups $G_{\theta}$ and $G_{\tau}$ may be isomorphic even if the eigenvalues of $\theta$ and $\tau$ are different. For example, this occurs when $\tau=\theta^{j}$ for some $j$ coprime to $p$. Thus, the correspondence between characteristic subgroups of $G_{\theta}$ and $G_{\tau}$ is not a priori clear. $\square$

One of the goals of this work is to reinterpret the definition of a characteristic subgroup in a way that is independent of automorphisms and which is unambiguously defined for all groups. We do this by formulating the characteristic condition on the entire category of groups, thereby providing a categorification of the property of being characteristic. Moreover, our formulation pairs well with—and indeed is motivated by—the necessities of computation (see Section 1.3). To address this, we employ methods from theorem-checking, specifically type-theoretic techniques [Hindley-Seldin, Pierce:types, HoTT]; these have recently become accessible through systems such as Agda [Agda], Coq [Coq], and Lean [lean].

1.2. A local-to-global problem

Our approach is to transform the local characteristic property of subgroups into an equivalent global property of the category of all groups and their isomorphisms. Calculations now take place within the category instead of within individual groups, which opens up new ways to search for characteristic subgroups. Our approach also facilitates an a priori verification of the global characteristic property, rather than the usual a posteriori check that requires knowledge of automorphisms. The process is analogous to proving that $\zeta(G)$ is characteristic without employing specific properties of $G$. Our methods extend to every characteristic subgroup, even those discovered via bespoke calculations.

The traditional model of a category $\mathsf{A}$ involves both objects and morphisms. By sometimes focusing only on morphisms, we work with categories as an algebraic structure with a partial binary associative product on $\mathsf{A}$ (given by composition of its morphisms) and $\mathbb{1}_{\mathsf{A}}=\{\operatorname{id}_{X}\mid X\text{ an object in }\mathsf{A}\}$. It is partial because not every pair of morphisms is composable, in which case the product is undefined. This perspective yields a general algebraic framework for our computations.

The morphisms of a category can act on the morphisms of another category either on the left or the right. Although several interpretations of “category action” appear in the literature [Bergner-Hackney, §2], [nlab:action], [FS, 1.271–274], there is no single established meaning. Let $\mathsf{A}$, $\mathsf{B}$, and $\mathsf{X}$ be categories. A left $\mathsf{A}$-action on $\mathsf{X}$ is a partial-function, where $a\cdot x$ is defined for some morphisms $a$ of $\mathsf{A}$ and $x$ of $\mathsf{X}$, that satisfies two conditions inspired by group actions. The first is that $(a\acute{a})\cdot x=a\cdot(\acute{a}\cdot x)$, whenever defined, for all morphisms $a,\acute{a}$ of $\mathsf{A}$ and $x$ of $\mathsf{X}$. The second is that $\mathbb{1}_{\mathsf{A}}\cdot x=\{x\}$. To simplify notation, we write $\mathbb{1}_{\mathsf{A}}\cdot x=x$. As in the theory of bimodules of rings, an $(\mathsf{A},\mathsf{B})$-biaction on $\mathsf{X}$ is a left $\mathsf{A}$-action and a right $\mathsf{B}$-action on $\mathsf{X}$ such that for every morphism $a$ in $\mathsf{A}$, $b$ in $\mathsf{B}$, and $x$ in $\mathsf{X}$,

\[ a\cdot(x\cdot b)=(a\cdot x)\cdot b \]

whenever both sides of the equation are defined. Suppose there are $(\mathsf{A},\mathsf{B})$-biactions on categories $\mathsf{X}$ and $\mathsf{Y}$. An $(\mathsf{A},\mathsf{B})$-morphism is a partial-function, which we denote by $\mathcal{M}:\mathsf{Y}\to\mathsf{X}^{?}$, such that

\[ \mathcal{M}(a\cdot y\cdot b)=a\cdot\mathcal{M}(y)\cdot b \]

whenever $a\cdot y\cdot b$ is defined for morphisms $a$ in $\mathsf{A}$, $b$ in $\mathsf{B}$, and $y$ in $\mathsf{Y}$.
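The partial product and the biaction axiom can be illustrated on a familiar category. The Python sketch below is an assumption made for illustration and does not come from the paper: it uses integer matrices, where an $m\times n$ matrix is a morphism $n\to m$ and the product is defined only when the inner dimensions agree, with `None` standing for "undefined". The category acts on itself on both sides by multiplication.

```python
def compose(a, b):
    """Partial product a.b of matrices (lists of rows); None means undefined."""
    if a is None or b is None or len(a[0]) != len(b):
        return None
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(n):
    """Identity morphism of the object n."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

a = [[1, 2]]                  # a morphism 2 -> 1
x = [[0, 1], [1, 1]]          # a morphism 2 -> 2
b = [[1, 0, 2], [0, 1, 3]]    # a morphism 3 -> 2

# Biaction axiom: a.(x.b) = (a.x).b whenever both sides are defined.
left = compose(a, compose(x, b))
right = compose(compose(a, x), b)

# Partiality: b.a is undefined because the shapes do not match.
undefined = compose(b, a)

# Only one identity acts on x, and it fixes x, mirroring 1_A . x = {x}.
defined = [r for r in (compose(identity(n), x) for n in range(1, 5))
           if r is not None]
```

Here `left` and `right` agree, `undefined` is `None`, and `defined` contains the single matrix `x`.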

We write $\mathsf{A}\leqslant\mathsf{B}$ to indicate that $\mathsf{A}$ is a subcategory of $\mathsf{B}$, and denote the identity functor of $\mathsf{A}$ by $\operatorname{id}_{\mathsf{A}}:\mathsf{A}\to\mathsf{A}$. A counit is a counital of the form $\eta:\mathcal{C}\Rightarrow\operatorname{id}_{\mathsf{A}}$. The following specialization of one of our principal results to groups describes how characteristic subgroups relate to counits and morphisms of category biactions.

Theorem 2.

Let $G$ be a group and $H\leqslant G$ with inclusion $\iota_{G}:H\hookrightarrow G$. There exist categories $\mathsf{A}$ and $\mathsf{B}$, where $\overset{\longleftrightarrow}{\mathsf{Grp}}\leqslant\mathsf{A}\leqslant\mathsf{Grp}$, such that the following are equivalent.

  (1) $H$ is characteristic in $G$.

  (2) There is a functor $\mathcal{C}:\mathsf{A}\to\mathsf{A}$ and a counit $\eta:\mathcal{C}\Rightarrow\operatorname{id}_{\mathsf{A}}$ such that $H=\operatorname{Im}(\eta_{G})$.

  (3) There is an $(\mathsf{A},\mathsf{B})$-morphism $\mathcal{M}:\mathsf{B}\to\mathsf{A}^{?}$ such that $\iota_{G}=\mathcal{M}(\operatorname{id}_{G}\cdot\mathbb{1}_{\mathsf{B}})$.

We emphasize that the category $\mathsf{B}$ in Theorem 2 need not be a subcategory of $\mathsf{Grp}$; see Section 8 for an example. Moreover, our results apply to characteristic substructures of eastern algebras, which include monoids, loops, rings, and non-associative algebras. This generalization (Theorem 2-cat) and its dual version (2-dual) are proved in Section 6. We conclude this section with an example that illustrates how natural transformations arise from characteristic substructures.

Example 1.4.

The derived subgroup $\gamma_{2}(G)$ of a group $G$ determines the inclusion homomorphism $\lambda_{G}:\gamma_{2}(G)\hookrightarrow G$ and a functor $\mathcal{D}:\mathsf{Grp}\to\mathsf{Grp}$ mapping groups to their derived subgroups and mapping homomorphisms to their restrictions to the derived subgroups. For every group homomorphism $\varphi:G\to H$, observe that $\lambda_{H}\mathcal{D}(\varphi)=\operatorname{id}_{\mathsf{Grp}}(\varphi)\lambda_{G}$, so $\lambda:\mathcal{D}\Rightarrow\operatorname{id}_{\mathsf{Grp}}$ is a natural transformation.

The center $\zeta(G)$ of $G$ determines the inclusion homomorphism $\rho_{G}:\zeta(G)\hookrightarrow G$. To define a functor with object map $G\mapsto\zeta(G)$, we must restrict the type of homomorphisms between groups since homomorphisms need not map centers to centers. (Consider, for example, an embedding $\mathbb{Z}/2\hookrightarrow\text{Sym}(3)$.) Since every isomorphism maps center to center, we restrict to $\overset{\longleftrightarrow}{\mathsf{Grp}}$, defining a functor $\mathcal{Z}:\overset{\longleftrightarrow}{\mathsf{Grp}}\to\overset{\longleftrightarrow}{\mathsf{Grp}}$ mapping $G\mapsto\zeta(G)$ and mapping each homomorphism to its restriction. If $\mathcal{I}:\overset{\longleftrightarrow}{\mathsf{Grp}}\to\mathsf{Grp}$ is the inclusion functor, then $\rho:\mathcal{I}\mathcal{Z}\Rightarrow\mathcal{I}$ is a natural transformation. $\square$
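The contrast in Example 1.4 can be checked on the embedding $\mathbb{Z}/2\hookrightarrow\text{Sym}(3)$. The following Python sketch is illustrative only; the encoding of permutations as tuples and all function names are assumptions made here. It confirms that the embedding carries derived subgroup into derived subgroup but does not carry center into center.

```python
from itertools import permutations

def compose(p, q):
    """Composite permutation i -> p[q[i]], with permutations stored as tuples."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def generated(gens, identity):
    """Subgroup generated by gens; closure under products suffices when finite."""
    elems, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems

def center(G):
    return {z for z in G if all(compose(z, g) == compose(g, z) for g in G)}

def derived(G):
    """Subgroup generated by all commutators a^-1 b^-1 a b."""
    identity = tuple(range(len(next(iter(G)))))
    comms = {compose(compose(inverse(a), inverse(b)), compose(a, b))
             for a in G for b in G}
    return generated(comms, identity)

sym3 = set(permutations(range(3)))
c2 = {(0, 1, 2), (1, 0, 2)}   # image of Z/2 under the embedding (inclusion)

derived_is_respected = derived(c2) <= derived(sym3)
center_is_respected = center(c2) <= center(sym3)
```

The first inclusion holds and the second fails, which is why the functor $\mathcal{Z}$ is defined only on isomorphisms.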

1.3. Applications to computation

Part of the motivation for our work comes from computational challenges that arise in contemporary isomorphism tests in algebra. One of these is to develop new ways to discover characteristic subgroups. Standard constructions, such as the commutator subgroup, the center, and the Fitting subgroup, can be applied to any group. However, these subgroups often contribute little to resolving isomorphism. Many ideas have been introduced to search for new structures; see, for example, [BOW, ELGO2002, Maglione2021]. Often these involve very detailed computations with individual groups, and their application is ad hoc. Indeed, a primary motivation for this study is to systematize the disparate techniques currently used to search for characteristic subgroups.

Theorem 2 provides the framework for a systematic search for characteristic subgroups. Indeed, an $(\mathsf{A},\mathsf{B})$-morphism generalizes the familiar and much-studied category-theoretic notion of adjoint functor pairs. We show in Section 4.6 that category actions offer a flexible way to implement the behavior of natural transformations in a computer algebra system. To exploit the full power of the categorical interpretation of characteristic subgroups, we work in a suitably general algebraic framework that allows a seamless transfer of information from one category to another. The familiar examples from Sections 7 and 8 demonstrate how to identify characteristic structure in a category and transfer it back to groups.

A second challenge concerns reproducibility and comparison of characteristic subgroups. Algorithms to decide isomorphism often, as a first step, generate a list of characteristic subgroups in a given group. For example, we could extract such a list for the family described in Example 1.3. An immediate question is: if we rerun this step for a different group, do we obtain the same list of corresponding characteristic subgroups (Fact 1.2)? It is not always clear that we do. For instance, some characteristic subgroup constructions employ randomization or make labelling choices that could change from one run to the next. Such shortcomings can compromise the utility of characteristic subgroup lists in deciding isomorphism.

Our proposed solution is to develop algorithms that return the natural transformation (or a morphism of biactions) from Theorem 2 instead of the characteristic subgroup itself. This will allow us, in principle, to extend the reach of a specific characteristic subgroup of a given group to an entire category, in much the same way that the commutator subgroup and center behave. The natural transformation can then be applied to a group $\tilde{G}$ to produce a characteristic subgroup $\tilde{H}$ that corresponds to $H$ in the sense of Fact 1.2: every isomorphism $G\to\tilde{G}$ necessarily maps $H$ to $\tilde{H}$, thereby allowing a meaningful comparison of characteristic subgroups. The precise circumstances under which such extensions are possible are specified in Theorem 5.4.

A third challenge is verifiability: in a computer algebra system, subgroups are often given by monomorphisms that are defined on a given set of group generators. The construction of such a monomorphism usually invokes computations that prove the claimed properties (such as homomorphism, monic, characteristic image, and so on). We present our work in a framework that combines these computations, data, and proofs, by employing an intuitionistic Martin-Löf type theory; such a model also allows machine verification of proofs. In this setting, if a computer algebra system returns a counital $\iota$, then this counital comes with a “type” that certifies that each morphism $\iota_{G}$ of $\iota$ yields a characteristic substructure.

1.4. Structure of this paper

In Section 2, we discuss the required background for our foundations (type theory). In Section 3, we introduce eastern algebras (essentially algebraic structures) and show how they can be viewed as abstract categories.

Section 4 studies category actions. In particular, we define capsules (category modules) and describe a computational model for natural transformations as category bimorphisms (Proposition 4.10). This also allows us to describe counitals (Theorem 4.11) and adjoint functor pairs (Theorem 4.13) in the language of bicapsules and bimorphisms.

In Section 5, we explain how characteristic structures can be described by counitals. The functors involved in this construction are defined on categories with one object, but Theorem 5.4—which we call the Extension Theorem—allows us to extend these functors to larger categories. This theorem is the essential ingredient for proving our main results. We also generalize Theorem 1 to eastern algebras (Theorem 1-cat).

In Section 6, we generalize Theorem 2 to eastern algebras (Theorem 2-cat). We show that characteristic substructures can be described as certain counits, and as bimorphism actions on capsules. We also prove the dual version of this result for characteristic quotients (Theorem 2-dual).

In Section 7, we use our framework to provide categorical descriptions of common characteristic subgroups, including verbal and marginal subgroups.

In Section 8, we describe a cross-category translation of counitals and explain, in categorical terms, how a counital for a category of groups can be constructed from a counital for a category of algebras.

Table 1 summarizes notation used throughout the paper.

\begin{tabular}{ll}
Symbol & Description \\
\hline
$\mathsf{E}$ & Eastern variety \\
$\mathsf{A},\mathsf{B},\mathsf{C},\mathsf{D}$ & Abstract categories or categories that act \\
$\mathsf{X},\mathsf{Y},\mathsf{Z}$ & Capsules \\
$\Delta,\Sigma$ & Bicapsules \\
$\operatorname{id}_{X}$ & Identity morphism of type $X$ \\
$\mathbb{1}_{\mathsf{A}}$ & Identity morphisms of $\mathsf{A}$ \\
$\mathcal{F},\mathcal{G}$ & Morphisms between categories \\
$\mathcal{I},\mathcal{J},\mathcal{K},\mathcal{L}$ & Inclusion functors \\
$\mathcal{M},\mathcal{N},\mathcal{R},\mathcal{S}$ & Capsule morphisms \\
$A^{X}$ & Functions $X\to A$ \\
$A^{n}$ & Functions $\{1,\dots,n\}\to A$ \\
$\Omega$ & Signature \\
$\mathrm{East}_{\Omega}$ & Type of $\Omega$-eastern algebras \\
$\bot$ & The void type \\
$B^{?}$ & The type $B\sqcup\{\bot\}$ \\
$f(a)\mathrel{>\!\!\!=}b$ & If $f(a)$ is defined, then $f(a)=b$ \\
$f\asymp g$ & Computable equality of functions \\
$f\mathbin{\blacktriangleleft},\ \mathbin{\blacktriangleleft}f$ & Source and target of a morphism \\
$f\lhd,\ \lhd f$ & Guards for a category action \\
$\overset{\leftrightarrow}{\mathsf{A}},\ \overset{\twoheadrightarrow}{\mathsf{A}},\ \overset{\hookrightarrow}{\mathsf{A}}$ & The iso-, epi-, and mono-morphisms of $\mathsf{A}$ (resp.) \\
\end{tabular}
Table 1. A guide to notation

2. Type theory and certifying characteristic structure

To certify that a subgroup $H$ of a group $G$, with inclusion $\iota$, is characteristic, we must verify that

\[ (\forall\varphi\in\operatorname{Aut}(G))\;(\forall h\in H)\;(\exists k\in H)\quad\varphi(\iota(h))=\iota(k). \tag{2.1} \]

At face value, this a posteriori check requires knowledge of $\operatorname{Aut}(G)$. To provide a certificate of being characteristic, we instead develop a constructive version of our main results using the language of type theory. Specifically, we use an intuitionistic Martin-Löf type theory (MLTT), a model of computation capable of expressing aspects of proofs that can be machine verified. In an MLTT, (2.1) can be expressed as

\[ \prod_{\varphi:\operatorname{Aut}(G)}\prod_{h:H}\bigsqcup_{k:H}\mathrm{EQ}_{G}\left(\varphi(\iota(h)),\iota(k)\right); \]

we explain this notation below. An advantage of this approach is that certificate data can be verified by practical type-checkers. An MLTT employs the “propositions as types” paradigm (Curry–Howard Correspondence), where types correspond to propositions and terms are programs that correspond to proofs. The remainder of this section is a concise treatment of type theory from [Hindley-Seldin, Chapters 10–13] and [HoTT, Chapter 3].

2.1. Types

Informally, types annotate data by signalling which syntax rules apply to the data. We write $a:A$ and say “$a$ is a term of type $A$” or “$a$ inhabits $A$”. For example, $a:\mathbb{N}$ signals that $a$ can only be used as a natural number. A type $A$ is inhabited if there exists at least one term $a:A$ and uninhabited if no term of type $A$ exists. The void type $\bot$ has no inhabitants by definition. Deciding whether a type is inhabited or not is computationally undecidable [Hindley-Seldin, pp. 66–67]. Therefore in computational settings types are permitted to be neither inhabited nor uninhabited. Type annotations enable us to use symbols according to their logical purpose; for example, $a:A$ is analogous to $a\in A$, but type theories do not have the axioms of set theory.

Types are introduced from two sources. First, there is a context that defines a priori the types that we need: for example, $\mathbb{N}$. Next, there are type-builders that construct new types from existing ones. We use both $A\to B$ and $B^{A}$ to denote the type of functions, and set $\operatorname{Dom}(A\to B)=A$ and $\operatorname{Codom}(A\to B)=B$. If $n$ is a natural number, then an inhabitant of type $A^{n}$ can be interpreted as an $n$-tuple $(a_{1},\ldots,a_{n})$ with each $a_{i}:A$, or alternatively as a function $\{1,\ldots,n\}\to A$. There is a unique function $\bot\to A$ (akin to the uniqueness of a function $\varnothing\to A$), so $A^{0}$ is a type with a single inhabitant; it is not void.

The notation $\prod_{i:I}A_{i}$ together with projection maps $\pi_{i}:\left(\prod_{i:I}A_{i}\right)\to A_{i}$ is used for Cartesian products, and $\bigsqcup_{i:I}A_{i}$ together with inclusion maps $\iota_{i}:A_{i}\to\bigsqcup_{i:I}A_{i}$ is used for disjoint unions. (The tradition in type theory is to use $\sum_{i:I}A_{i}$ instead of $\bigsqcup_{i:I}A_{i}$, but this conflicts with algebraic uses of $\Sigma$.)

2.2. Propositions as types

In set theory, propositions are part of the existing foundations. In type theory, propositions co-evolve with the theory as special types. A proposition $P$ in logic is associated to a type $\hat{P}:\mathrm{Type}$. (Only in this section do we distinguish propositions $P$ in logic from propositions as types with the notation $\hat{P}$.) If the type $\hat{P}$ is inhabited by data $p:\hat{P}$, then the term $p$ is regarded as a proof that $P$ is true. For example, an implication $P\Rightarrow Q$ (here $\Rightarrow$ means “implies”) can be proved by means of a function $f:\hat{P}\to\hat{Q}$, where $\hat{P}$ and $\hat{Q}$ are the respective types associated with $P$ and $Q$, because it suffices to assume $P$ and derive a proof of $Q$. Likewise, if we assume that there is a term $p:\hat{P}$ and apply the function $f$, then it produces a term $f(p):\hat{Q}$.

In classical logic, it is only the existence of a proof for a proposition that is relevant. Analogously, in type theory, $\hat{P}:\mathrm{Type}$ is a mere proposition, written $\hat{P}:\mathrm{Prop}$, if it has at most one inhabitant.

Consider a function $\hat{P}:A\to\mathrm{Prop}$. Now $(\forall a\in A)(P(a))$ is expressed by terms of type $\left(\prod_{a:A}\hat{P}_{a}\right):\mathrm{Prop}$; and $(\exists a\in A)(P(a))$ is expressed by terms of type $\bigsqcup_{a:A}P(a)$ (technically, the truncation of that type [HoTT, §3.7]). The negation of a proposition $P$ is $P\Rightarrow\textsc{False}$, which accords with functions of type $\hat{P}\to\bot$. For additional details, see [Hindley-Seldin, Chapters 12–13] or [HoTT, Chapter 3].

Martin-Löf developed a notion of equality that imitates the Leibniz Law [Feldman]:

\[ (a=b)\iff\left[(\forall P(x))\;P(a)\Longleftrightarrow P(b)\right], \]

where $P(x)$ runs over all predicates of a single variable $x$. Thus, for every type $A$ and terms $s,t:A$, we define an auxiliary type $\mathrm{EQ}_{A}(s,t)$ (where terms are proofs that $s=t$) with the rule that, given a function $f:A\to B$, there is a function

\[ \mathrm{path}(f):\mathrm{EQ}_{A}(s,t)\to\mathrm{EQ}_{B}(f(s),f(t)). \tag{2.2} \]

2.3. Subtypes and inclusion functions

Sets are a special case of types: we write $S:\mathrm{Set}$ for a type $S$ if the type $\mathrm{EQ}_{S}(s,t)$ is a mere proposition for all $s,t:S$. Let $A$ be a type. If $P:A\to\mathrm{Prop}$, then

\[ \{a:A \mid P(a)\} := \bigsqcup_{a:A}P(a) \]

is the subtype of $A$ defined by $P$. We also write this as $B=\{a:A \mid P(a)\}\subset A$. Terms of type $B$ have the form $\langle a,p\rangle$ for $a:A$ and $p:P(a)$, where $p$ is a proof that $P(a)$ is inhabited. We sometimes use set theory notation to improve readability when describing a subtype. For more details, see [HoTT, §3.5]. For a typed function $f:A\to B$, the image $\{f(a)\mid a:A\}$ is shorthand for $\{b:B\mid(\exists a:A)(f(a)=b)\}$.

Subtypes have an associated inclusion function $\alpha:B\to A$ where $\alpha(\langle a,p\rangle):=a$. A subtlety is that if $C\subset B$ with inclusion map $\beta:C\to B$, then the composition $\alpha\beta:C\to A$ is injective but does not show directly that $C\subset A$. A term of type $C=\bigsqcup_{b:B}Q(b)$, with $Q:B\to\mathrm{Prop}$, has the form $\langle\langle a,p\rangle,q\rangle$, which differs from those of type $B$. A small modification addresses the fact that the relation $\subset$ is not strictly transitive. Define a subtype $C^{\prime}=\bigsqcup_{a:A}R(a)$, where $R(a)=\bigsqcup_{p:P(a)}Q(a)$, and inclusion $\gamma:C^{\prime}\to A$. Now construct a map $\sigma:C\to C^{\prime}$ given by

\[ \langle\langle a,p\rangle,q\rangle\mapsto\langle a,\langle p,q\rangle\rangle, \]

where $a:A$ and $\langle p,q\rangle:R(a)$. Thus, $\alpha\beta=\gamma\sigma$, and the composition $\alpha\beta$ is equivalent to $\gamma$. Hence, $\subset$ is transitive up to this equivalence.

2.4. Partial-functions

For a type $B$, we define

\[ B^{?} := B\sqcup\{\bot\}. \]

A partial-function is a term $f$ of type $A\to B^{?}$. It is defined at $a:A$ if there is $b:B$ with $f(a)=\iota_{B}(b)$, where $\iota_{B}:B\hookrightarrow B^{?}$ is the inclusion.
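In a programming language, a partial-function of this kind is commonly modelled by an optional return value. The Python sketch below is an illustration under stated assumptions, not the paper's implementation: `None` plays the role of the extra point of $B^{?}$ (so $B$ itself must not contain `None`), and the helper names and the two sample functions are hypothetical.

```python
import math
from typing import Callable, Iterable, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")
# A partial-function A -> B^?: None stands in for the added point.
Partial = Callable[[A], Optional[B]]

def is_defined(f: Partial, a) -> bool:
    """f is defined at a when f(a) lies in the image of the inclusion B -> B^?."""
    return f(a) is not None

def agree_where_defined(f: Partial, g: Partial, domain: Iterable) -> bool:
    """Compare f and g only at points where both are defined."""
    return all(f(a) == g(a) for a in domain
               if is_defined(f, a) and is_defined(g, a))

def int_sqrt(n: int) -> Optional[int]:
    """Defined only at perfect squares."""
    if n < 0:
        return None
    r = math.isqrt(n)
    return r if r * r == n else None

def half(n: int) -> Optional[int]:
    """Defined only at even integers."""
    return n // 2 if n % 2 == 0 else None

# On 0..9 both functions are defined only at 0 and 4, where they agree.
small = agree_where_defined(int_sqrt, half, range(10))
# On 0..19 they are also both defined at 16, where they differ.
large = agree_where_defined(int_sqrt, half, range(20))
```

The strict pointwise comparison `f(a) == g(a)` for every `a` would reject these two functions even on `range(10)`, because each is undefined at points where the other is defined.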

Given partial-functions $f,g:A\to B^{?}$, the notion of equality as “$f(a)=g(a)$” for every $a:A$ is too strict. We impose that condition only for those $a:A$ for which both $f(a)$ and $g(a)$ are defined. This motivates a notion of “directional equality”, where having one side defined implies that the other is also defined; only now do we decide whether the results are equal. Freyd and Scedrov [FS, 1.12] introduced the following venturi-tube relation $\mathrel{>\!\!\!=}$ on partial-functions.

Definition 2.1.

Given f:AB?f:A\to B^{?}, a:Aa:A and b:Bb:B, we write

\[ f(a)\venturi b\quad\text{if }f\text{ is defined at }a:A\text{ and }f(a)=\iota_{B}(b). \]

For $f,g:A\to B^{?}$ we write $f\venturi g$ if $f(a)\venturi g(a)$ for all $a:A$, and we write $f\asymp g$ if $f\venturi g$ and $g\venturi f$.

Note that $\venturi$ is a pre-order (reflexive and transitive). If we compare $\asymp$ with (extensional) function equality

f=g[(a:A)f(a)=g(a)],f=g\iff\left[(\forall a:A)\;f(a)=g(a)\right],

then f=gf=g implies that fgf\asymp g. In classical logic (with the law of the excluded middle) the converse also holds because one can, by fiat, declare that ff is defined or undefined at a:Aa:A. In some computational models such a separation is non-constructive, so computing f(a)f(a) may not halt. Thus, we retain the fgf\asymp g notation.
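The relations of Definition 2.1 can be sketched on a finite domain. The following Python fragment is an illustration only, under the assumption that `None` stands for $\bot$ (so $B$ itself must not contain `None`); the names `venturi`, `asymp`, `half` and `half_of_multiples_of_four` are hypothetical and do not come from the paper.

```python
from typing import Callable, Iterable, Optional

# A partial-function A -> B^? is modelled as a function that returns None for bottom.
Partial = Callable[[int], Optional[int]]


def venturi(f: Partial, g: Partial, domain: Iterable[int]) -> bool:
    """Directional equality: wherever f is defined, g is defined with the same value."""
    return all(f(a) is None or f(a) == g(a) for a in domain)


def asymp(f: Partial, g: Partial, domain: Iterable[int]) -> bool:
    """Two-sided relation: f venturi g and g venturi f."""
    points = list(domain)
    return venturi(f, g, points) and venturi(g, f, points)


def half(n: int) -> Optional[int]:
    """Defined only at even inputs."""
    return n // 2 if n % 2 == 0 else None


def half_of_multiples_of_four(n: int) -> Optional[int]:
    """Defined only at multiples of four, where it agrees with half."""
    return n // 2 if n % 4 == 0 else None


DOMAIN = range(12)
SMALLER_BELOW_LARGER = venturi(half_of_multiples_of_four, half, DOMAIN)
LARGER_BELOW_SMALLER = venturi(half, half_of_multiples_of_four, DOMAIN)
```

Here the function with the smaller domain of definition sits below the other in the pre-order, but not conversely, so the two are not related by $\asymp$. Because definedness is decidable in this finite model, $\asymp$ coincides with pointwise equality of the `Optional` values, which matches the classical remark above.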

2.5. Certifying that the trivial group is characteristic

As an illustration, we present a type verifying the characteristic property of the trivial subgroup. Let G:GroupG:\mathrm{Group} be a group with identity 1:G1:G. Let H={x:GEQG(x,1)}H=\{x:G\mid\mathrm{EQ}_{G}(x,1)\} be the subtype of GG representing the trivial subgroup. Recall that terms of HH have the form x,p\langle x,p\rangle, where x:Gx:G and p:EQG(x,1)p:\mathrm{EQ}_{G}(x,1), and there is a map ι:HG\iota:H\to G, x,px\langle x,p\rangle\mapsto x. If h,k:Hh,k:H, then ι(h)=ι(k)=1\iota(h)=\iota(k)=1, and, by (2.2), for every φ:Aut(G)\varphi:\operatorname{Aut}(G) there is an invertible function of type

(2.3) EQG(φ(1),1)EQG(φ(ι(h)),ι(k)).\displaystyle\text{EQ}_{G}(\varphi(1),1)\longleftrightarrow\text{EQ}_{G}(\varphi(\iota(h)),\iota(k)).

The latter function depends on hh and kk, but we suppress this dependency to simplify the exposition. Let idLaw(φ):EQG(φ(1),1)\mathrm{idLaw}(\varphi):\mathrm{EQ}_{G}(\varphi(1),1) be a proof that φ:Aut(G)\varphi:\operatorname{Aut}(G) fixes 1:G1:G. Using (2.3), we define the term

idMap(φ):h:Hk:HEQG(φ(ι(h)),ι(k))\mathrm{idMap}(\varphi):\prod_{h:H}\bigsqcup_{k:H}\text{EQ}_{G}(\varphi(\iota(h)),\iota(k))

that takes as input h:Hh:H and produces 1,idLaw(φ):k:HEQG(φ(ι(h)),ι(k))\langle 1,\mathrm{idLaw}(\varphi)\rangle:\bigsqcup_{k:H}\text{EQ}_{G}(\varphi(\iota(h)),\iota(k)). Therefore we obtain the term

idMap\displaystyle\mathrm{idMap} :φ:Aut(G)h:Hk:HEQG(φ(ι(h)),ι(k)),\displaystyle:\prod_{\varphi:\operatorname{Aut}(G)}\prod_{h:H}\bigsqcup_{k:H}\text{EQ}_{G}(\varphi(\iota(h)),\iota(k)),

which certifies that HH is characteristic in GG; compare to (2.1). Recall that in MLTT, types correspond to propositions, and terms are programs that correspond to proofs. Thus, the term idMap\mathrm{idMap} is not an exhaustive tuple listing Aut(G)\operatorname{Aut}(G), but a program (function) that takes as input φ:Aut(G)\varphi:\operatorname{Aut}(G) and h:Hh:H, and produces k:Hk:H and p:EQG(φ(ι(h)),ι(k))p:\text{EQ}_{G}(\varphi(\iota(h)),\iota(k)).

3. Essentially algebraic structures

To interpret characteristic structure as computable categorical information, we treat categories as algebraic structures. (Computational categories should not be confused with categorical semantics of computation.) For our purpose, it suffices to use operations that may only be partially defined, so categories are important examples, as are monoids, groups, groupoids, rings, and non-associative algebras. We give an abridged account and refer to \citelist[Cohn]*§II.2 [AR1994:categories]*Chapter 3 for details.

3.1. Operators, grammars, and signatures

Informally, a grammar is a description of rules for formulas.

Definition 3.1.

An operator is a symbol with a grammar, which we describe using the Backus–Naur Form (BNF) [Pierce:types]*p. 24. The valence of an operator ω\omega, written |ω||\omega|, is the number of parameters in its grammar. A set Ω\Omega of operators is a signature.

Example 3.2.

A signature for additive formulas specifies three operators:

<Add> ::= (<Add> + <Add>) | 0 | (-<Add>)

The bivalent addition (+)(+) depends on terms to the left and right; zero (0) depends on nothing; and univalent negation (-) is followed by a term. \square

It is easy to reject $+-+\,2\,3\,7$ since it is not meaningful. However, we might write $2+3-7$ intending $(2+3)+(-7)$; the BNF grammar <Add> accepts only the latter.

The purpose of the signature is to formulate important algebraic concepts such as homomorphisms. To declare that a function f:ABf:A\to B is a homomorphism between additive groups, we use the signature of Example 3.2 as follows:

\begin{align*}
f((x+y)) &= (f(x)+f(y)), & f(0) &= 0, & f((-x)) &= (-f(x)).
\end{align*}

3.2. Algebraic structures

An algebra is a single type with a signature [Cohn]*§II.2.

Definition 3.3.

An algebraic structure with signature Ω\Omega is a type AA and a function ωωA\omega\mapsto\omega_{A}, where ω:Ω\omega:\Omega and ωA:A|ω|A\omega_{A}:A^{|\omega|}\to A. A homomorphism of algebraic structures AA and BB, each having signature Ω\Omega, is a function f:ABf:A\to B such that, for every ω:Ω\omega:\Omega and a1,,a|ω|:Aa_{1},\ldots,a_{|\omega|}:A,

f(ωA(a1,,a|ω|))\displaystyle f(\omega_{A}(a_{1},\ldots,a_{|\omega|})) =ωB(f(a1),,f(a|ω|)).\displaystyle=\omega_{B}(f(a_{1}),\ldots,f(a_{|\omega|})).

As in Section 2.2, we extend these propositions to types as follows:

\begin{align*}
\mathrm{Alge}_{\Omega} &:= \bigsqcup_{A:\mathrm{Type}}~\prod_{\omega:\Omega}(A^{|\omega|}\to A),\\
\mathrm{Hom}_{\Omega}(A,B) &:= \bigsqcup_{f:A\to B}~\prod_{\omega:\Omega}~\prod_{a:A^{|\omega|}}\mathrm{EQ}_{B}\left(f(\omega_{A}(a_{1},\ldots,a_{|\omega|})),\omega_{B}(f(a_{1}),\ldots,f(a_{|\omega|}))\right).
\end{align*}

Terms of type AlgeΩ\mathrm{Alge}_{\Omega} are Ω\Omega-algebras.

For example, consider the additive group signature from Example 3.2. The underlying structure of an additive group can be described by a type (set) AA together with assignments of the operators in Add such as (<Add> + <Add>) to +A:A×AA+_{A}:A\times A\to A.
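For finite carriers, the homomorphism condition of Definition 3.3 can be checked by brute force. The sketch below is only an illustration: it assumes a signature is stored as a dictionary from (hypothetical) operator names to valences, and it uses the cyclic groups $\mathbb{Z}/6$ and $\mathbb{Z}/3$ as sample algebras for the additive signature of Example 3.2.

```python
from itertools import product

# Signature of Example 3.2: operator name -> valence (the names are illustrative).
ADD_SIGNATURE = {"add": 2, "zero": 0, "neg": 1}


def cyclic_group(n):
    """Z/n as an algebra over ADD_SIGNATURE, given as (carrier, operator assignment)."""
    carrier = list(range(n))
    ops = {
        "add": lambda x, y: (x + y) % n,
        "zero": lambda: 0,
        "neg": lambda x: (-x) % n,
    }
    return carrier, ops


def is_homomorphism(f, source, target, signature):
    """Check f(w_A(a_1, ..., a_k)) == w_B(f(a_1), ..., f(a_k)) for all operators and tuples."""
    carrier, ops_a = source
    _, ops_b = target
    for name, valence in signature.items():
        # product(..., repeat=0) yields one empty tuple, which handles constants.
        for args in product(carrier, repeat=valence):
            lhs = f(ops_a[name](*args))
            rhs = ops_b[name](*(f(a) for a in args))
            if lhs != rhs:
                return False
    return True


Z6, Z3 = cyclic_group(6), cyclic_group(3)
REDUCTION_OK = is_homomorphism(lambda x: x % 3, Z6, Z3, ADD_SIGNATURE)
SHIFT_OK = is_homomorphism(lambda x: (x + 1) % 3, Z6, Z3, ADD_SIGNATURE)
```

Reduction modulo 3 passes the check, while the shifted map fails because it does not respect the operators. The check quantifies over every operator in the signature, so the constant and the negation are tested alongside addition.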

3.3. Free algebras and formulas

We now extend signatures to include variables that allow us to work with formulas.

Definition 3.4.

Let Ω\Omega be a signature and let XX be a type whose terms are variables. The free Ω\Omega-algebra in variables XX, denoted by ΩX\Omega\langle X\rangle, is the type of every formula in XX constructed using the operators in Ω\Omega.

Example 3.5.

To describe formulas in variables $x,y$ and $z$, we extend the additive signature $\Omega=\texttt{Add}$ of Example 3.2 as follows:

<Add<X>> ::= (<Add<X>> + <Add<X>>) | 0 | (-<Add<X>>) | x | y | z

Here, $x+y$ and $(-x)+(0+z)$ have type $\texttt{Add}\langle X\rangle$, but $x-$ and $x+7$ do not. The operations on the formulas $\Phi_{1}(X),\Phi_{2}(X):\texttt{Add}\langle X\rangle$ are:

\begin{align*}
\Phi_{1}(X)+_{\texttt{Add}\langle X\rangle}\Phi_{2}(X) &:= (\Phi_{1}(X)+\Phi_{2}(X)),\\
0_{\texttt{Add}\langle X\rangle} &:= 0,\\
-_{\texttt{Add}\langle X\rangle}\Phi_{1}(X) &:= (-\Phi_{1}(X)).
\end{align*}

Thus, $\texttt{Add}\langle X\rangle$ is the free additive algebra, but it lacks laws such as $x+y=y+x$ and $x+(-x)=0$. We explain how to impose these laws in Section 3.4. $\square$

Fact 3.6.

Let AA be an Ω\Omega-algebra and a:AXa:A^{X}, where XX is a type whose terms are variables. There is a unique homomorphism evala:ΩXA\mathrm{eval}_{a}:\Omega\langle X\rangle\to A that satisfies evala(x)=ax\mathrm{eval}_{a}(x)=a_{x}.

Consequently, we write $\Phi(a) := \mathrm{eval}_{a}(\Phi)$ for formulas $\Phi:\Omega\langle X\rangle$ and $a:A^{X}$.
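The evaluation homomorphism of Fact 3.6 amounts to a recursion over formula trees. The following sketch assumes formulas are stored as a small syntax tree (the classes `Var` and `Op` and the operator names are hypothetical) and interprets the additive signature in the integers modulo 5, which is an arbitrary choice made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    """A variable of X."""
    name: str


@dataclass(frozen=True)
class Op:
    """An operator of the signature applied to subformulas."""
    symbol: str
    args: Tuple["Formula", ...] = ()


Formula = Union[Var, Op]


def evaluate(formula, assignment, operations):
    """eval_a: send each variable x to a_x and each operator to its interpretation."""
    if isinstance(formula, Var):
        return assignment[formula.name]
    values = [evaluate(arg, assignment, operations) for arg in formula.args]
    return operations[formula.symbol](*values)


# Interpretation of the additive signature in the integers modulo 5.
MOD5 = {
    "add": lambda x, y: (x + y) % 5,
    "zero": lambda: 0,
    "neg": lambda x: (-x) % 5,
}

# The formula (-x) + (0 + z) from Example 3.5.
PHI = Op("add", (Op("neg", (Var("x"),)), Op("add", (Op("zero"), Var("z")))))
VALUE = evaluate(PHI, {"x": 2, "y": 0, "z": 4}, MOD5)
```

With $x\mapsto 2$ and $z\mapsto 4$ the formula evaluates to $3+4=2$ modulo 5. Uniqueness of $\mathrm{eval}_{a}$ is reflected in the code by the fact that the recursion has no choices to make once the assignment and the operator interpretations are fixed.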

Remark 3.7.

The construction in Fact 3.6 is categorical in nature, and we use it in Section 7 to construct characteristic subgroups. The category of Ω\Omega-algebras has objects of type AlgeΩ\mathrm{Alge}_{\Omega} together with homomorphisms. The pair of functors (given only by their object maps)

\[ X\mapsto\Omega\langle X\rangle\quad\text{ and }\quad\langle A:\mathrm{Type},\ (\omega:\Omega)\mapsto(\omega_{A}:A^{|\omega|}\to A)\rangle\mapsto A \]

forms an adjoint functor pair between the categories of types and Ω\Omega-algebras; see Section 4.5 for related discussion.

3.4. Laws and varieties

Let A{A} be an Ω\Omega-algebra and let XX be a type for variables. We now describe the variety of algebraic structures whose operators satisfy a list of laws such as the axioms of a group. A law is a term of type ΩX2\Omega\langle X\rangle^{2}. We index laws by a type LL, so they are terms of type LΩX2L\to\Omega\langle X\rangle^{2} and are written (Λ1,,Λ2,)\ell\mapsto(\Lambda_{1,\ell},\Lambda_{2,\ell}). We say that A{A} is in the variety for the laws LΩX2L\to\Omega\langle X\rangle^{2} if

\[ (\forall a:A^{X})\quad(\forall\ell:L)\quad\Lambda_{1,\ell}(a)=\Lambda_{2,\ell}(a). \]

Example 3.8.

The signature $\Omega$ for groups is the following:

<G> ::= (<G><G>) | 1 | (<G>)${}^{-1}$

The variety of groups uses three laws, indexed by L={𝚊𝚜𝚌,𝚒𝚍,𝚒𝚗𝚟}L=\{{\tt asc},{\tt id},{\tt inv}\} with variables X={x,y,z}X=\{x,y,z\}. If =asc\ell=\texttt{asc}, then (Λ1,,Λ2,):ΩX2(\Lambda_{1,\ell},\Lambda_{2,\ell}):\Omega\langle X\rangle^{2} is

\[ \Lambda_{1,\ell} := x(yz)\quad\text{and}\quad\Lambda_{2,\ell} := (xy)z, \]

so Λ1,(g,h,k)=g(hk)\Lambda_{1,\ell}(g,h,k)=g(hk) and Λ2,(g,h,k)=(gh)k\Lambda_{2,\ell}(g,h,k)=(gh)k. Hence, associativity is imposed on the Ω\Omega-algebra GG by requiring a term (“proof”) of type

g:Gh:Gk:GEQG(g(hk),(gh)k).\prod_{g:G}\prod_{h:G}\prod_{k:G}\text{EQ}_{G}(g(hk),(gh)k).

Encoding $1x=x$ and $x^{-1}x=1$ as additional laws gives a complete description of the variety of groups. Laws need not be algebraically independent: for example, $x1=x$ and $xx^{-1}=1$ are often also encoded. $\square$

For clarity, henceforth we write laws as propositions. For example, we write g(hk)=(gh)kg(hk)=(gh)k rather than terms of a mere proposition type.
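Membership in a variety can be tested exhaustively when the carrier is finite. The sketch below is a brute-force check, not a construction from the paper: it assumes formulas are nested tuples (a variable is a string, an operator node is a tuple headed by a hypothetical operator name) and encodes the three group laws of Example 3.8 as pairs of formulas indexed by `asc`, `id`, and `inv`.

```python
from itertools import product


def evaluate(formula, assignment, ops):
    """Evaluate a formula: strings are variables, tuples are (operator, *subformulas)."""
    if isinstance(formula, str):
        return assignment[formula]
    name, *subformulas = formula
    return ops[name](*(evaluate(s, assignment, ops) for s in subformulas))


VARIABLES = ("x", "y", "z")

# Laws as pairs (Lambda_1, Lambda_2) of formulas, indexed by L = {asc, id, inv}.
GROUP_LAWS = {
    "asc": (("mul", "x", ("mul", "y", "z")), ("mul", ("mul", "x", "y"), "z")),
    "id": (("mul", ("one",), "x"), "x"),
    "inv": (("mul", ("inv", "x"), "x"), ("one",)),
}


def in_variety(carrier, ops, laws):
    """Check Lambda_1(a) == Lambda_2(a) for every law and every assignment a : A^X."""
    for values in product(carrier, repeat=len(VARIABLES)):
        a = dict(zip(VARIABLES, values))
        for lhs, rhs in laws.values():
            if evaluate(lhs, a, ops) != evaluate(rhs, a, ops):
                return False
    return True


Z4 = list(range(4))
ADDITION = {"mul": lambda x, y: (x + y) % 4, "one": lambda: 0, "inv": lambda x: (-x) % 4}
SUBTRACTION = {"mul": lambda x, y: (x - y) % 4, "one": lambda: 0, "inv": lambda x: x}

ADDITION_IS_GROUP = in_variety(Z4, ADDITION, GROUP_LAWS)
SUBTRACTION_IS_GROUP = in_variety(Z4, SUBTRACTION, GROUP_LAWS)
```

Addition modulo 4 satisfies all three laws, whereas subtraction modulo 4 fails associativity. In the type-theoretic setting of the paper the Boolean answer would be replaced by a term of the corresponding product type; the exhaustive loop here is only a finite stand-in for that term.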

3.5. Eastern algebras

We cannot always compose a pair of morphisms in a category: composition may be a partial-function. Hence, the morphisms need not form an algebraic structure under composition. We address this limitation by identifying precisely when the operators yield partial-functions.

Example 3.9.

The type of every function is given as

\[ \mathrm{Fun} := \bigsqcup_{A:\mathrm{Type}}~\bigsqcup_{B:\mathrm{Type}}(A\to B). \]

Technically, to quantify over all types, we shift to a larger universe Type1\text{Type}_{1}; see Remark 3.21. For f:ABf:A\to B, define

\begin{align*}
f\mathbin{\blacktriangleleft} &:= \operatorname{id}_{A}, & \mathbin{\blacktriangleleft}f &:= \operatorname{id}_{B}, & (fg)(x) &:= \begin{cases}f(g(x))&f\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}g,\\ \bot&\text{otherwise}.\end{cases}
\end{align*}

The condition $f\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}g$ guards against composing non-composable functions. (A helpful mnemonic for $f\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}g$ is “What enters $f$ must match what exits $g$”.) This yields the composition signature:

<Comp> ::= (<Comp><Comp>) | ($\blacktriangleleft$<Comp>) | (<Comp>$\blacktriangleleft$)

Note that $\mathbin{\blacktriangleleft}(f\mathbin{\blacktriangleleft})=\mathbin{\blacktriangleleft}\operatorname{id}_{A}=\operatorname{id}_{A}=f\mathbin{\blacktriangleleft}$, and similarly $(\mathbin{\blacktriangleleft}f)\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}f$. $\square$
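The composition signature of Example 3.9 can be modelled for functions between finite sets. This is a minimal sketch under several assumptions: a function is stored together with its domain and codomain, `None` stands for $\bot$, and the names `Fn`, `make_fn`, `source_guard`, `target_guard`, and `compose` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fn:
    """A function between finite sets, stored with its domain and codomain."""
    dom: frozenset
    cod: frozenset
    graph: tuple  # sorted tuple of (input, output) pairs


def make_fn(dom, cod, rule):
    dom = frozenset(dom)
    return Fn(dom, frozenset(cod), tuple(sorted((x, rule(x)) for x in dom)))


def identity(s):
    return make_fn(s, s, lambda x: x)


def source_guard(f):
    """f <| : the identity on the domain of f."""
    return identity(f.dom)


def target_guard(f):
    """<| f : the identity on the codomain of f."""
    return identity(f.cod)


def compose(f, g) -> Optional[Fn]:
    """The product fg, sending x to f(g(x)); None when f <| differs from <| g."""
    if source_guard(f) != target_guard(g):
        return None
    table_f, table_g = dict(f.graph), dict(g.graph)
    return make_fn(g.dom, f.cod, lambda x: table_f[table_g[x]])


G = make_fn({0, 1, 2}, {0, 1}, lambda x: x % 2)
F = make_fn({0, 1}, {10, 11}, lambda x: x + 10)
FG = compose(F, G)  # defined: the domain of F is the codomain of G
GF = compose(G, F)  # bottom: the domain of G is not the codomain of F
```

The guard test compares identity functions rather than sets, mirroring the way the example expresses domains and codomains inside the algebra itself. Storing the graph as a sorted tuple makes equality of `Fn` values extensional.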

The composition signature defined in Example 3.9 is used throughout. Motivated by it, we make the following general definition.

Definition 3.10.

For a signature Ω\Omega, operator ω:Ω\omega:\Omega, variables X={x1,,x|ω|}X=\{x_{1},\dots,x_{|\omega|}\}, type AA, and formulas Φ1,Φ2:ΩX\Phi_{1},\Phi_{2}:\Omega\langle X\rangle, the partial-function ωA:A|ω|A?\omega_{A}:A^{|\omega|}\to A^{?} is (Φ1,Φ2)(\Phi_{1},\Phi_{2})-guarded if

\[ (\forall a:A^{|\omega|})\;\left[\Phi_{1}(a)=\Phi_{2}(a)\iff\omega_{A}\text{ is defined at }a\right]. \tag{3.1} \]

The formulas Φ1,Φ2\Phi_{1},\Phi_{2} are the rails of ω\omega. If Φ1=Φ2\Phi_{1}=\Phi_{2}, then the rails are trivial, so ωA\omega_{A} is everywhere defined and is total.

We define a type Guard(A,ω,(Φ1,Φ2))\mathrm{Guard}(A,\omega,(\Phi_{1},\Phi_{2})) whose terms are pairs ωA,p\langle\omega_{A},p\rangle, where pp is a proof of (3.1). Fix a tuple of variables X:n:{x1,,xn}X:\prod_{n:\mathbb{N}}\{x_{1},\dots,x_{n}\}. Define the type Rails(Ω)\hstretch.13==ω:ΩΩX|ω|2\mathrm{Rails}(\Omega)\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\prod_{\omega:\Omega}\Omega\langle X_{|\omega|}\rangle^{2} whose terms are the rails for the operators in Ω\Omega. Observe that the rails are everywhere defined.

We now define essentially algebraic structures, which we call eastern algebras; see [AR1994:categories]*§3.D.

Definition 3.11.

For a signature Ω\Omega, an Ω\Omega-eastern algebra is a type AA and an assignment ωωA,p\omega\mapsto\langle\omega_{A},p\rangle of operators ω:Ω\omega:\Omega to Φω\Phi_{\omega}-guarded partial-functions. Formally, Ω\Omega-eastern algebras are terms of the type

\[ \mathrm{East}_{\Omega} := \bigsqcup_{A:\mathrm{Type}}~\bigsqcup_{\Phi:\mathrm{Rails}(\Omega)}~\prod_{\omega:\Omega}\mathrm{Guard}(A,\omega,\Phi_{\omega}). \]

Every algebraic structure defines an eastern algebra by using trivial rails for each operator. In the next example, we observe that categories are eastern algebras. Recall that a category 𝖢\mathsf{C} has objects U,V,U,V,\ldots of type 𝖢0\mathsf{C}_{0}, morphisms ff of type 𝖢1(U,V)\mathsf{C}_{1}(U,V), and a composition operation :𝖢1(V,W)×𝖢1(U,V)𝖢1(U,W)\circ:\mathsf{C}_{1}(V,W)\times\mathsf{C}_{1}(U,V)\to\mathsf{C}_{1}(U,W).

Example 3.12 (Categories as eastern algebras).

Let 𝖢\mathsf{C} be a category with object type 𝖢0\mathsf{C}_{0}. Form the type of all morphisms of 𝖢\mathsf{C}:

\[ \mathsf{C}_{1} := \bigsqcup_{U:\mathsf{C}_{0}}\bigsqcup_{V:\mathsf{C}_{0}}\mathsf{C}_{1}(U,V). \tag{3.2} \]

For objects U,V:𝖢0U,V:\mathsf{C}_{0}, there is an inclusion map (see Section 2.1)

ιUV\displaystyle\iota_{UV} :𝖢1(U,V)𝖢1.\displaystyle:\mathsf{C}_{1}(U,V)\hookrightarrow\mathsf{C}_{1}.

Thus, for each φ:𝖢1\varphi:\mathsf{C}_{1}, there exist unique U,V:𝖢0U,V:\mathsf{C}_{0} and f:𝖢1(U,V)f:\mathsf{C}_{1}(U,V) such that φ=ιUV(f)\varphi=\iota_{UV}(f). The type 𝖢1\mathsf{C}_{1} is an eastern algebra with the composition signature from Example 3.9, which is realized as follows. For φ,τ:𝖢1\varphi,\tau:\mathsf{C}_{1}, with φ:UV\varphi:U\to V and τ:UV\tau:U^{\prime}\to V^{\prime},

\begin{align*}
\varphi\mathbin{\blacktriangleleft} &:= \operatorname{id}_{U}, & \mathbin{\blacktriangleleft}\varphi &:= \operatorname{id}_{V}, & \tau\varphi &:= \begin{cases}\tau\varphi&\tau\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}\varphi\quad(\textrm{i.e.}~U^{\prime}=V),\\ \bot&\text{otherwise}.\end{cases}
\end{align*}

As before, $\tau\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}\varphi$ guards against composing incompatible morphisms. $\square$

3.6. Abstract categories

We consider categories as described in [FS]*1.11; their type differs from those of Example 3.12.

Definition 3.13.

Let Ω\Omega be the composition signature of Example 3.9. An abstract category is an Ω\Omega-eastern algebra satisfying the following laws in variables f,g,hf,g,h:

\begin{align*}
\mathbin{\blacktriangleleft}(f\mathbin{\blacktriangleleft}) &= f\mathbin{\blacktriangleleft}, & (\mathbin{\blacktriangleleft}f)\mathbin{\blacktriangleleft} &= \mathbin{\blacktriangleleft}f,\\
(\mathbin{\blacktriangleleft}f)f &= f, & f(f\mathbin{\blacktriangleleft}) &= f,\\
\mathbin{\blacktriangleleft}(fg) &\asymp \mathbin{\blacktriangleleft}(f(\mathbin{\blacktriangleleft}g)), & (fg)\mathbin{\blacktriangleleft} &\asymp ((f\mathbin{\blacktriangleleft})g)\mathbin{\blacktriangleleft},\\
f(gh) &\asymp (fg)h.
\end{align*}

We sometimes refer to the operators $(-)\mathbin{\blacktriangleleft}$ and $\mathbin{\blacktriangleleft}(-)$ as guards.

A useful subtype of an abstract category 𝖠\mathsf{A} is the type of identities:

\[ \mathbb{1}_{\mathsf{A}}=\{a\mathbin{\blacktriangleleft}\mid a:\mathsf{A}\}=\{\mathbin{\blacktriangleleft}a\mid a:\mathsf{A}\}. \]

Lemma 3.14.

The following hold in every abstract category.

  (a) The guards are idempotent, namely
      \[ ((-)\mathbin{\blacktriangleleft})\mathbin{\blacktriangleleft}=(-)\mathbin{\blacktriangleleft},\qquad \mathbin{\blacktriangleleft}(\mathbin{\blacktriangleleft}(-))=\mathbin{\blacktriangleleft}(-). \]

  (b) Terms $f$ and $g$ satisfy
      \[ \mathbin{\blacktriangleleft}(fg)\venturi\mathbin{\blacktriangleleft}f,\qquad (fg)\mathbin{\blacktriangleleft}\venturi g\mathbin{\blacktriangleleft}. \]

Proof.

For a term ff in an abstract category,

\[ \mathbin{\blacktriangleleft}(\mathbin{\blacktriangleleft}f)=\mathbin{\blacktriangleleft}((\mathbin{\blacktriangleleft}f)\mathbin{\blacktriangleleft})=(\mathbin{\blacktriangleleft}f)\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}f. \]

A similar argument shows $(f\mathbin{\blacktriangleleft})\mathbin{\blacktriangleleft}=f\mathbin{\blacktriangleleft}$, so (a) holds.

For (b), suppose $g$ is another term and $f\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}g$. Now $\mathbin{\blacktriangleleft}(fg)\asymp\mathbin{\blacktriangleleft}(f(\mathbin{\blacktriangleleft}g))$ is an equality: $\mathbin{\blacktriangleleft}(fg)=\mathbin{\blacktriangleleft}(f(\mathbin{\blacktriangleleft}g))$. Since $f(f\mathbin{\blacktriangleleft})=f$,

\[ \mathbin{\blacktriangleleft}(fg)=\mathbin{\blacktriangleleft}(f(\mathbin{\blacktriangleleft}g))=\mathbin{\blacktriangleleft}(f(f\mathbin{\blacktriangleleft}))=\mathbin{\blacktriangleleft}f. \]

Hence, $\mathbin{\blacktriangleleft}(fg)\venturi\mathbin{\blacktriangleleft}f$, and the other formula follows similarly. ∎

Proposition 3.15.

Let 𝖢\mathsf{C} be a category. The type 𝖢1\mathsf{C}_{1} from (3.2) of all morphisms of 𝖢\mathsf{C} with the composition signature from Example 3.9 forms an abstract category.

Proof.

If $f:U\to V$ in $\mathsf{C}$, then $\mathbin{\blacktriangleleft}(f\mathbin{\blacktriangleleft})=\operatorname{id}_{\text{Codom}\operatorname{id}_{U}}=\operatorname{id}_{U}=f\mathbin{\blacktriangleleft}$. Similarly, $(\mathbin{\blacktriangleleft}f)\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}f$. Since the operators $(-)\mathbin{\blacktriangleleft}$ and $\mathbin{\blacktriangleleft}(-)$ have trivial rails, both of the equations $\mathbin{\blacktriangleleft}(f\mathbin{\blacktriangleleft})=f\mathbin{\blacktriangleleft}$ and $(\mathbin{\blacktriangleleft}f)\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}f$ are everywhere defined.

Observe that $(\mathbin{\blacktriangleleft}f)f$ is defined and equals $\operatorname{id}_{V}f=f$; also $f(f\mathbin{\blacktriangleleft})$ is defined and equals $f\operatorname{id}_{U}=f$. For $g:\mathsf{C}_{1}(U^{\prime},V^{\prime})$, the expression $\mathbin{\blacktriangleleft}(fg)$ is defined whenever $f\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}g$, and $f(\mathbin{\blacktriangleleft}g)$ is defined whenever $f\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}(\mathbin{\blacktriangleleft}g)$. Since $\mathbin{\blacktriangleleft}(-)$ is idempotent by Lemma 3.14(a), both $\mathbin{\blacktriangleleft}(fg)$ and $f(\mathbin{\blacktriangleleft}g)$ are defined when $f\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}g$. Thus, $f\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}g$ implies

\[ \mathbin{\blacktriangleleft}(fg)=\operatorname{id}_{V}=\mathbin{\blacktriangleleft}(f(f\mathbin{\blacktriangleleft}))=\mathbin{\blacktriangleleft}(f(\mathbin{\blacktriangleleft}g)), \]

so $\mathbin{\blacktriangleleft}(fg)\asymp\mathbin{\blacktriangleleft}(f(\mathbin{\blacktriangleleft}g))$. A similar argument holds for $(fg)\mathbin{\blacktriangleleft}\asymp((f\mathbin{\blacktriangleleft})g)\mathbin{\blacktriangleleft}$.

Lastly, composition is associative everywhere it is defined, so $f(gh)\asymp(fg)h$. ∎

Example 3.16.

Let $\mathsf{A}$ be an abstract category with $\mathbb{1}_{\mathsf{A}}=\{e_{1},\ldots,e_{6}\}$ and additional morphisms $a_{12},a_{23},a_{13},\acute{a}_{13},b_{45},b_{54}$, where $x_{ij}\mathbin{\blacktriangleleft}=e_{j}$ and $\mathbin{\blacktriangleleft}x_{ij}=e_{i}$. Using the signature of Example 3.9, $\mathsf{A}$ is an eastern algebra with multiplication defined in Table 2 where every instance of $\bot$ is omitted. It is not easy to discern structure from this table, so two additional visualizations of $\mathsf{A}$ are given in Figure 1. The first is the Cayley graph of the multiplication with undefined products omitted. The second is the Peirce decomposition, which we now discuss. $\square$

\[
\begin{array}{|c||ccccc|c|cccc|cc|}
\hline
x&e_{1}&e_{2}&e_{3}&e_{4}&e_{5}&e_{6}&a_{12}&a_{23}&a_{13}&\acute{a}_{13}&b_{45}&b_{54}\\
\hline\hline
x\mathbin{\blacktriangleleft}&e_{1}&e_{2}&e_{3}&e_{4}&e_{5}&e_{6}&e_{2}&e_{3}&e_{3}&e_{3}&e_{5}&e_{4}\\
\hline
\mathbin{\blacktriangleleft}x&e_{1}&e_{2}&e_{3}&e_{4}&e_{5}&e_{6}&e_{1}&e_{2}&e_{1}&e_{1}&e_{4}&e_{5}\\
\hline
\multicolumn{13}{c}{}\\
\hline
\cdot&e_{1}&e_{2}&e_{3}&e_{4}&e_{5}&e_{6}&a_{12}&a_{23}&a_{13}&\acute{a}_{13}&b_{45}&b_{54}\\
\hline\hline
e_{1}&e_{1}&&&&&&a_{12}&&a_{13}&\acute{a}_{13}&&\\
e_{2}&&e_{2}&&&&&&a_{23}&&&&\\
e_{3}&&&e_{3}&&&&&&&&&\\
e_{4}&&&&e_{4}&&&&&&&b_{45}&\\
e_{5}&&&&&e_{5}&&&&&&&b_{54}\\
\hline
e_{6}&&&&&&e_{6}&&&&&&\\
\hline
a_{12}&&a_{12}&&&&&&a_{13}&&&&\\
a_{23}&&&a_{23}&&&&&&&&&\\
a_{13}&&&a_{13}&&&&&&&&&\\
\acute{a}_{13}&&&\acute{a}_{13}&&&&&&&&&\\
\hline
b_{45}&&&&&b_{45}&&&&&&&e_{4}\\
b_{54}&&&&b_{54}&&&&&&&e_{5}&\\
\hline
\end{array}
\]

Table 2. The multiplication table for $\mathsf{A}$

[Figure 1, two panels: (a) Cayley graph; (b) Peirce decomposition.]
Figure 1. Visualizing the abstract category $\mathsf{A}$ in Example 3.16
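The data of Table 2 can be checked mechanically. The following Python sketch is one hypothetical encoding of $\mathsf{A}$: the names `GUARDS`, `NONTRIVIAL`, `left`, `right`, and `mul` are ours, the string `"a13'"` stands for $\acute{a}_{13}$, and `None` plays the role of $\bot$. It counts the defined products and tests associativity in the sense of $\asymp$ (both sides undefined, or both defined and equal).

```python
from itertools import product

# Identities e1..e6 of Example 3.16.
IDS = ["e1", "e2", "e3", "e4", "e5", "e6"]
# GUARDS[x] = (left guard ◂x, right guard x◂), read off the top of Table 2.
GUARDS = {e: (e, e) for e in IDS}
GUARDS.update({
    "a12": ("e1", "e2"), "a23": ("e2", "e3"),
    "a13": ("e1", "e3"), "a13'": ("e1", "e3"),  # "a13'" stands for á13
    "b45": ("e4", "e5"), "b54": ("e5", "e4"),
})
# The only defined products in which neither factor is an identity.
NONTRIVIAL = {("a12", "a23"): "a13", ("b45", "b54"): "e4", ("b54", "b45"): "e5"}
A = list(GUARDS)


def left(x):
    return GUARDS[x][0]


def right(x):
    return GUARDS[x][1]


def mul(x, y):
    """Partial product, defined exactly when x◂ = ◂y; None stands for ⊥."""
    if x is None or y is None or right(x) != left(y):
        return None
    if x in IDS:
        return y
    if y in IDS:
        return x
    return NONTRIVIAL[(x, y)]


# All pairs whose product is defined (the nonempty cells of Table 2).
defined = [(x, y) for x, y in product(A, A) if mul(x, y) is not None]
# x(yz) and (xy)z are both undefined, or both defined and equal.
associative = all(mul(x, mul(y, z)) == mul(mul(x, y), z)
                  for x, y, z in product(A, repeat=3))
print(len(defined), associative)  # prints: 21 True
```

The count 21 matches the number of nonempty cells in the lower part of Table 2.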

3.7. Peirce decomposition of abstract categories

Treating categories as algebraic structures allows us to frame aspects of category theory in algebraic terms. Our goal is an elementary representation theory of categories. In particular, we seek matrix-like structures—known as Peirce decompositions in ring theory—for abstract categories.

One can recover from an abstract category $\mathsf{A}$ notions of objects and morphisms by considering the identities $\mathbb{1}_{\mathsf{A}}$. Using the laws in Definition 3.13,

\[
(\forall e:\mathbb{1}_{\mathsf{A}})\;\;(\forall f:\mathbb{1}_{\mathsf{A}})\;\;\;ef=\begin{cases}e&\text{ if }f=e,\\ \bot&\text{ otherwise}.\end{cases}
\]

In algebraic terms, the subtype $\mathbb{1}_{\mathsf{A}}$ is a type of pairwise orthogonal idempotents. For subtypes $X$ and $Y$ of $\mathsf{A}$, define

\[
XY=\{xy\mid x:X,y:Y,\ x\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}y\}.
\]

Fact 3.17.

If $a:\mathsf{A}$, then $\mathbb{1}_{\mathsf{A}}\{a\}=\{a\}=\{a\}\mathbb{1}_{\mathsf{A}}$; we write simply $\mathbb{1}_{\mathsf{A}}a=a=a\mathbb{1}_{\mathsf{A}}$.

Given $e,f:\mathbb{1}_{\mathsf{A}}$, we define three subtypes:

\begin{align*}
(\text{left slice})&& e\mathsf{A}&=\{a:\mathsf{A}\mid e=\mathbin{\blacktriangleleft}a\};\\
(\text{right slice})&& \mathsf{A}f&=\{a:\mathsf{A}\mid a\mathbin{\blacktriangleleft}=f\};\\
(\text{hom-set})&& e\mathsf{A}f&=\{a:\mathsf{A}\mid e=\mathbin{\blacktriangleleft}a,\ a\mathbin{\blacktriangleleft}=f\}.
\end{align*}

These subtypes appear in Figure 2 in the left, middle, and right images, respectively. If $e\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}a$ for $a:\mathsf{A}$, then $ea=(e\mathbin{\blacktriangleleft})a=(\mathbin{\blacktriangleleft}a)a=a$.

[Figure 2, three panels: the left slice $e\mathsf{A}$, the right slice $\mathsf{A}f$, and the hom-set $e\mathsf{A}f$.]
Figure 2. Visualizing the Peirce decomposition of $\mathsf{A}$

If $a:\mathsf{A}$, then $a:\mathsf{A}(a\mathbin{\blacktriangleleft})$, and so on, from which we deduce the following.

Proposition 3.18.

If $\mathsf{A}$ is an abstract category, then $a\mapsto(\mathbin{\blacktriangleleft}a)a$, $a\mapsto a(a\mathbin{\blacktriangleleft})$, and $a\mapsto(\mathbin{\blacktriangleleft}a)a(a\mathbin{\blacktriangleleft})$ induce invertible functions (denoted by “$\leftrightarrow$”) of the following types:

\begin{align*}
\mathsf{A}&\longleftrightarrow\bigsqcup_{e:\mathbb{1}_{\mathsf{A}}}e\mathsf{A},&
\mathsf{A}&\longleftrightarrow\bigsqcup_{f:\mathbb{1}_{\mathsf{A}}}\mathsf{A}f,&
\mathsf{A}&\longleftrightarrow\bigsqcup_{e:\mathbb{1}_{\mathsf{A}}}\bigsqcup_{f:\mathbb{1}_{\mathsf{A}}}e\mathsf{A}f.
\end{align*}

Proposition 3.18, which we use to prove Theorem 5.4, allows us to draw upon intuition from matrix algebras. The morphisms of a category appear in its multiplication table, as in Table 2. Products of morphisms and slices are defined, as with matrix products, only when the inner indices agree. In this model, 𝟙𝖠\mathbb{1}_{\mathsf{A}} can be visualized as the identity matrix, where the entries on the diagonal are the individual identities e:𝟙𝖠e:\mathbb{1}_{\mathsf{A}}. In Figure 1(b), that product is represented in a matrix-like form respecting the conditions of the Peirce decomposition.
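As an illustration of the matrix-like view, the following Python sketch sorts the morphisms of the category $\mathsf{A}$ of Example 3.16 into the hom-sets $e\mathsf{A}f$. The encoding is hypothetical: `GUARDS` and `blocks` are our names, and `"a13'"` stands for $\acute{a}_{13}$. Each morphism is tagged with the pair consisting of its left and right guard, which is the third bijection of Proposition 3.18 restricted to this example.

```python
from collections import defaultdict

# Identities and guards of Example 3.16: GUARDS[x] = (◂x, x◂).
IDS = ["e1", "e2", "e3", "e4", "e5", "e6"]
GUARDS = {e: (e, e) for e in IDS}
GUARDS.update({
    "a12": ("e1", "e2"), "a23": ("e2", "e3"),
    "a13": ("e1", "e3"), "a13'": ("e1", "e3"),  # "a13'" stands for á13
    "b45": ("e4", "e5"), "b54": ("e5", "e4"),
})

# Hom-set eAf = {a : e = ◂a and a◂ = f}, keyed by the pair (e, f).
blocks = defaultdict(list)
for a, (e, f) in GUARDS.items():
    blocks[(e, f)].append(a)

# Print the blocks as a matrix: rows are left guards, columns are right guards.
for e in IDS:
    row = [",".join(blocks.get((e, f), [])).ljust(8) for f in IDS]
    print(" | ".join(row))
```

In this example the printed array is block diagonal: an upper triangular block on $e_{1},e_{2},e_{3}$, a full block on $e_{4},e_{5}$, and a single entry for $e_{6}$.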

Remark 3.19.

While there are differences between the types for categories and abstract categories, every theorem stated in one setting translates to a corresponding theorem in the other. More precisely, the translation is a model-theoretic definable interpretation [Marker:models]*§1.4: there is a prescribed formula that translates every theorem and its proof between the two theories. Example 3.16 shows how the model of categories with both objects and morphisms may be interpreted as definable types in the theory of categories with only morphisms (abstract categories). Conversely, if $\mathsf{A}$ is an abstract category, then we obtain a category $\mathsf{C}$ with object type $\mathsf{C}_{0}:=\mathbb{1}_{\mathsf{A}}$ as follows. For objects $e,f:\mathsf{C}_{0}$, we define

\[
\mathsf{C}_{1}(e,f):=f\mathsf{A}e,
\]

where the identity morphisms of $\mathsf{C}$ are $e:\mathsf{C}_{1}(e,e)$. To compose morphisms $fae:\mathsf{C}_{1}(e,f)$ with $gbf:\mathsf{C}_{1}(f,g)$ for objects $e,f,g:\mathsf{C}_{0}$, we define

\[
(gbf)(fae):=gbae:\mathsf{C}_{1}(e,g).
\]

Hence, we no longer distinguish between categories and abstract categories.

3.8. Eastern algebras as categories

Example 3.12 shows that a category is an eastern algebra. We now show that a variety of eastern algebras forms a category.

Definition 3.20.

Fix a signature $\Omega$. Let $E_{1},E_{2}:\mathrm{East}_{\Omega}$, where $E_{1}=\langle A,\Phi,\omega\mapsto\langle\omega_{A},p\rangle\rangle$ and $E_{2}=\langle B,\Gamma,\omega\mapsto\langle\omega_{B},q\rangle\rangle$. A morphism from $E_{1}$ to $E_{2}$ is a partial-function $f:A\to B^{?}$ such that for every $\omega:\Omega$ and every $a_{1},\ldots,a_{|\omega|}:A$,

\[
f(\omega_{A}(a_{1},\ldots,a_{|\omega|}))\mathrel{\tikz{\draw (0,6.46pt)--(2.15pt,5.17pt)--(6.46pt,5.17pt) (0,1.94pt)--(2.15pt,3.23pt)--(6.46pt,3.23pt);}}\omega_{B}(f(a_{1}),\ldots,f(a_{|\omega|})).
\]

The type of morphisms from $E_{1}$ to $E_{2}$ is

\[
\operatorname{Mor}_{\Omega}(E_{1},E_{2}):=\bigsqcup_{f:A\to B^{?}}~\prod_{\omega:\Omega}~\prod_{a:A^{|\omega|}}\left(f(\omega_{A}(a_{1},\ldots,a_{|\omega|}))\mathrel{\tikz{\draw (0,6.46pt)--(2.15pt,5.17pt)--(6.46pt,5.17pt) (0,1.94pt)--(2.15pt,3.23pt)--(6.46pt,3.23pt);}}\omega_{B}(f(a_{1}),\ldots,f(a_{|\omega|}))\right).
\]

The object type $\mathrm{East}_{\Omega}$ and the morphism type $\operatorname{Mor}_{\Omega}$ form the category of $\Omega$-eastern algebras. In particular, the $\Omega$-morphisms of $\mathrm{East}_{\Omega}$, namely the type

\[
\operatorname{Mor}_{\Omega}:=\bigsqcup_{E_{1}:\mathrm{East}_{\Omega}}~\bigsqcup_{E_{2}:\mathrm{East}_{\Omega}}\operatorname{Mor}_{\Omega}(E_{1},E_{2}),
\]

can be viewed as an abstract category (see Proposition 3.15). Therefore, $\operatorname{Mor}_{\Omega}$ forms an eastern algebra with the composition signature.

We call a subcategory 𝖤\mathsf{E} of eastern algebras an eastern variety (with respect to signature Ω\Omega and laws \mathcal{L}) if it is full and its objects are those eastern Ω\Omega-algebras satisfying the laws \mathcal{L}. We reserve 𝖤\mathsf{E} to denote an eastern variety.
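The morphism condition of Definition 3.20 can be tested on finite examples. The sketch below is a hypothetical illustration, not part of the paper's framework: it assumes the relation in the definition is read as a directed equality (whenever the left side is defined, the right side is defined and the two agree), it models partial functions by Python functions returning `None` for $\bot$, and it takes as its test algebra the category $\mathsf{A}$ of Example 3.16 with three operators that we name `left`, `right`, and `mul`. The maps `collapse`, `swap`, and `only_e6` are invented test cases.

```python
from itertools import product

# Hypothetical encoding of the category A of Example 3.16; GUARDS[x] = (◂x, x◂).
IDS = ["e1", "e2", "e3", "e4", "e5", "e6"]
GUARDS = {e: (e, e) for e in IDS}
GUARDS.update({
    "a12": ("e1", "e2"), "a23": ("e2", "e3"),
    "a13": ("e1", "e3"), "a13'": ("e1", "e3"),  # "a13'" stands for á13
    "b45": ("e4", "e5"), "b54": ("e5", "e4"),
})
NONTRIVIAL = {("a12", "a23"): "a13", ("b45", "b54"): "e4", ("b54", "b45"): "e5"}
A = list(GUARDS)


def left(x):
    return GUARDS[x][0]


def right(x):
    return GUARDS[x][1]


def mul(x, y):
    """Partial product, defined exactly when x◂ = ◂y."""
    if right(x) != left(y):
        return None
    if x in IDS:
        return y
    if y in IDS:
        return x
    return NONTRIVIAL[(x, y)]


# Operator name -> (arity, operation); a stand-in for the composition signature.
OPS = {"left": (1, left), "right": (1, right), "mul": (2, mul)}


def is_morphism(f, carrier, ops_A, ops_B):
    """Whenever f(ω_A(a)) is defined, ω_B(f(a_1), ..., f(a_n)) is defined and equal."""
    for name, (arity, op_A) in ops_A.items():
        op_B = ops_B[name][1]
        for args in product(carrier, repeat=arity):
            out = op_A(*args)
            lhs = f(out) if out is not None else None
            if lhs is None:
                continue  # left side undefined: nothing to check
            images = [f(a) for a in args]
            rhs = None if None in images else op_B(*images)
            if rhs != lhs:
                return False
    return True


def collapse(x):  # total map identifying á13 with a13
    return "a13" if x == "a13'" else x


def swap(x):  # total map exchanging a12 and a23; breaks the guards
    return {"a12": "a23", "a23": "a12"}.get(x, x)


def only_e6(x):  # partial map, defined on e6 only
    return x if x == "e6" else None


print(is_morphism(collapse, A, OPS, OPS),
      is_morphism(swap, A, OPS, OPS),
      is_morphism(only_e6, A, OPS, OPS))  # prints: True False True
```

The third case shows why the directed reading matters: `only_e6` is undefined on most of $\mathsf{A}$, yet every defined left side is matched by a defined right side.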

Remark 3.21.

Regarding categories as eastern algebras could lead to a paradox of Russell type. The paradox is avoided either by limiting Π\Pi-types to forbid some quantifications [Tucker] or by creating an increasing tower of universe types and pushing the larger categories into the next universe [HoTT]*§9.9. Both resolutions allow us to define categories and eastern algebras computationally.

Under the correspondence of Remark 3.19, morphisms between abstract categories are precisely functors between categories. This translation serves two of our goals. The first is an elementary representation theory for categories: by regarding categories as “monoids with partial-operators”, we mimic monoid actions. The second is to treat a category as a single data type with operations defined on it. This is considerably easier to implement as a computer program. Indeed, both GAP [GAP4] and Magma [magma] are designed for such algebras. There are benefits to the usual description of categories, but the translation to abstract categories is essential for our approach to computing with and within categories.

Next, we prove a generalization of Noether’s Isomorphism Theorem.

Theorem 3.22.

Let $\varphi:E_{1}\to E_{2}$ be a morphism between eastern algebras. There exist eastern algebras $\mathrm{Coim}(\varphi)$ and $\mathrm{Im}(\varphi)$, an epimorphism $\mathrm{coim}(\varphi):E_{1}\twoheadrightarrow\mathrm{Coim}(\varphi)$, a monomorphism $\mathrm{im}(\varphi):\mathrm{Im}(\varphi)\hookrightarrow E_{2}$, and an isomorphism $\psi:\mathrm{Coim}(\varphi)\to\mathrm{Im}(\varphi)$ such that the following diagram commutes.

\[
\begin{tikzcd}
E_{1} \arrow[r, "\varphi"] \arrow[d, two heads, "\mathrm{coim}(\varphi)"'] & E_{2} \\
\mathrm{Coim}(\varphi) \arrow[r, "\psi"'] & \mathrm{Im}(\varphi) \arrow[u, hook, "\mathrm{im}(\varphi)"']
\end{tikzcd}
\]

Proof.

Let Ω\Omega be a signature, and let AA be an Ω\Omega-eastern algebra. If every operation in AA is total, then we apply Noether’s Isomorphism Theorem [Cohn]*Theorem II.3.7. It is clear that the existence of the stated algebras and morphisms is constructive.

Otherwise, at least one operation is a partial-function. We define a new eastern algebra where all operations are total. Let $\Xi$ be a formal symbol, disjoint from $\Omega$. Define a new type $E:=A\sqcup\{\Xi\}$ with inclusion function $\iota_{A}:A\hookrightarrow E$. By abuse of notation, we also apply $\iota_{A}$ to tuples over $A$. Define a new signature $\Sigma$, obtained from $\Omega$ by including $\Xi$ as a constant. We use trivial rails for every operator, and define $\omega_{E}:E^{|\omega|}\to E$ via

\[
e\mapsto\begin{cases}\iota_{A}(\omega_{A}(a))&\text{if $e=\iota_{A}(a)$ for some $a:A^{|\omega|}$ with $\omega_{A}(a):A$,}\\ \Xi&\text{otherwise}.\end{cases}
\]

Now every operation in $E$ is total, so the Isomorphism Theorem applies. Since every homomorphism of $\Sigma$-eastern algebras fixes constants, the statement follows. ∎
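The totalization step in this proof is easy to picture computationally. The following Python sketch is a minimal illustration under our own assumptions: partial operations are modelled as functions returning `None` for $\bot$, the string `XI` is a hypothetical stand-in for the formal symbol $\Xi$, and the sample operation `minus` (subtraction on $\{0,1,2\}$, defined only when the result is non-negative) is invented for the example.

```python
from itertools import product

XI = "Ξ"  # hypothetical sink element, assumed not to lie in the carrier A


def totalize(op):
    """Extend a partial operation on A (None = undefined) to a total one on A ⊔ {Ξ}."""
    def total(*args):
        if XI in args:
            return XI  # a tuple involving Ξ is not the image of a tuple over A
        out = op(*args)
        return XI if out is None else out
    return total


# Invented example: subtraction on A = {0, 1, 2}, defined only when a >= b.
A = [0, 1, 2]


def minus(a, b):
    return a - b if a >= b else None


E = A + [XI]
minus_E = totalize(minus)
# The full operation table on E x E; every entry lies in E.
table = {(x, y): minus_E(x, y) for x, y in product(E, repeat=2)}
print(table[(2, 1)], table[(1, 2)], table[(XI, 0)])  # prints: 1 Ξ Ξ
```

The totalized operation agrees with the original wherever the original is defined and sends every other input to the sink.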

The monomorphism im(φ)\mathrm{im}(\varphi) from Theorem 3.22 is the image of φ\varphi, and the epimorphism coim(φ)\mathrm{coim}(\varphi) is the coimage of φ\varphi. Theorem 3.22 asserts, in particular, that categories of eastern algebras have images and coimages. These maps possess universal properties [Riehl]*§E.5.

3.9. Subobjects and images

We close with a list of facts about eastern varieties, which we use heavily in Section 5. We first define a pre-order that enables abbreviation of compositions of multiple morphisms. To motivate this, assume $\varphi:E_{1}\to E_{2}$ is a morphism of eastern algebras. Theorem 3.22 states there exists $\theta:E_{1}\to\mathrm{Im}(\varphi)$ such that $\varphi=\mathrm{im}(\varphi)\theta$. We denote this by $\varphi\ll\mathrm{im}(\varphi)$ and make the following more general definition. For morphisms $a,b:\mathsf{E}$,

\begin{align}
a&\ll b\iff\left[(\exists c:\mathsf{E})\;\;a=bc\right],\tag{3.3}\\
a&\gg b\iff\left[(\exists d:\mathsf{E})\;\;a=db\right].\tag{3.4}
\end{align}

Two monomorphisms $a,b:\mathsf{E}$ are equivalent if $a\ll b$ and $b\ll a$. Similarly, epimorphisms $c,d:\mathsf{E}$ are equivalent if $c\gg d$ and $d\gg c$.
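The relations (3.3) and (3.4) are stated for morphisms of an eastern variety, but their shape can be seen in any finite abstract category. As a hedged illustration, the Python sketch below evaluates them in the small category $\mathsf{A}$ of Example 3.16, used here only as a stand-in; the encoding and the function names `ll` and `gg` are ours, and `"a13'"` stands for $\acute{a}_{13}$.

```python
# Hypothetical encoding of the category A of Example 3.16; GUARDS[x] = (◂x, x◂).
IDS = ["e1", "e2", "e3", "e4", "e5", "e6"]
GUARDS = {e: (e, e) for e in IDS}
GUARDS.update({
    "a12": ("e1", "e2"), "a23": ("e2", "e3"),
    "a13": ("e1", "e3"), "a13'": ("e1", "e3"),  # "a13'" stands for á13
    "b45": ("e4", "e5"), "b54": ("e5", "e4"),
})
NONTRIVIAL = {("a12", "a23"): "a13", ("b45", "b54"): "e4", ("b54", "b45"): "e5"}
A = list(GUARDS)


def mul(x, y):
    """Partial product, defined exactly when x◂ = ◂y; None stands for ⊥."""
    if GUARDS[x][1] != GUARDS[y][0]:
        return None
    if x in IDS:
        return y
    if y in IDS:
        return x
    return NONTRIVIAL[(x, y)]


def ll(a, b):
    """a ≪ b: there is c with a = bc."""
    return any(mul(b, c) == a for c in A)


def gg(a, b):
    """a ≫ b: there is d with a = db."""
    return any(mul(d, b) == a for d in A)


print(ll("a13", "a12"), ll("a13'", "a12"))  # prints: True False
print(ll("e4", "b45"), ll("b45", "e4"))     # prints: True True
print(gg("a13", "a23"))                     # prints: True
```

Here $a_{13}=a_{12}a_{23}$ factors through $a_{12}$ while $\acute{a}_{13}$ does not, and the isomorphism $b_{45}$ is equivalent to the identity $e_{4}$ in the sense defined above.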

Lemma 3.23.

Let $\mathsf{E}$ be an eastern variety. For morphisms $a,b:\mathsf{E}$, if $a\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}b$, then $a\,\mathrm{im}(b)\ll\mathrm{im}(ab)$.

Proof.

By Theorem 3.22, there exist isomorphisms $\psi_{b},\psi_{ab}:\mathsf{E}$ such that

\begin{align*}
b&=\mathrm{im}(b)\psi_{b}\mathrm{coim}(b),& ab&=\mathrm{im}(ab)\psi_{ab}\mathrm{coim}(ab).
\end{align*}

By the universal property of coimages, there exists a unique morphism $\pi:\mathsf{E}$ such that $\mathrm{coim}(ab)=\pi\,\mathrm{coim}(b)$. Therefore,

\[
a\,\mathrm{im}(b)\psi_{b}\mathrm{coim}(b)=ab=\mathrm{im}(ab)\psi_{ab}\mathrm{coim}(ab)=\mathrm{im}(ab)\psi_{ab}\pi\,\mathrm{coim}(b).
\]

Since $\mathrm{coim}(b)$ is an epimorphism, $a\,\mathrm{im}(b)=\mathrm{im}(ab)\psi_{ab}\pi\psi_{b}^{-1}\ll\mathrm{im}(ab)$. ∎

Lemma 3.24.

Let $\mathsf{E}$ be an eastern variety. For morphisms $a,b:\mathsf{E}$, if $a\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}b$, then $\mathrm{im}(ab)\ll\mathrm{im}(a)$. If $a$ is also monic, then $\mathrm{im}(ab)\ll a\,\mathrm{im}(b)$.

Proof.

The first claim follows from the universal property of images, so we assume $a$ is monic. By Theorem 3.22, there exists an isomorphism $\psi_{b}:\mathsf{E}$ such that

\[
b=\mathrm{im}(b)\psi_{b}\mathrm{coim}(b).
\]

Since $a\,\mathrm{im}(b)$ is monic and $ab=(a\,\mathrm{im}(b))(\psi_{b}\mathrm{coim}(b))$, by the universal property of images, there exists a morphism $\iota:\mathsf{E}$ such that $\mathrm{im}(ab)=a\,\mathrm{im}(b)\iota\ll a\,\mathrm{im}(b)$. ∎

Eastern varieties have a coproduct [Riehl]*p. 81 given by the free product [Riehl]*p. 183. An example concerning groups is given in [Riehl]*Corollary 4.5.7. We list some facts concerning coproducts in eastern varieties.

Fact 3.25.

Let $I$ be a type. In an eastern variety $\mathsf{E}$, the following hold for all $e:\mathbb{1}_{\mathsf{E}}$ and $a:I\to e\mathsf{E}$.

  (a) There exists a coproduct morphism $\coprod_{i:I}a_{i}$ and morphisms $\iota:I\to\big(\coprod_{i:I}a_{i}\big)\mathsf{E}$ satisfying $\big(\coprod_{i:I}a_{i}\big)\iota_{j}=a_{j}$ for each $j:I$.

  (b) If $I$ is uninhabited, then $f=\left(\coprod_{i:I}a_{i}\right)\mathbin{\blacktriangleleft}$ is the identity on the free algebra on the empty set. In particular, $\coprod_{i:I}a_{i}$ is the unique morphism inhabiting $e\mathsf{E}f$.

  (c) If $b:\mathsf{E}$ satisfies $b\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}a_{i}$ for all $i:I$, then $\coprod_{i:I}(ba_{i})=b\coprod_{i:I}a_{i}$.

  (d) If $b:I\to\mathsf{E}$ with $a_{i}\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}b_{i}$ for all $i:I$, then $\coprod_{i:I}(a_{i}b_{i})\ll\coprod_{i:I}a_{i}$.

  (e) If $J\subset I$, then $\coprod_{j:J}a_{j}\ll\coprod_{i:I}a_{i}$.

Finally, if $a$ is a monomorphism satisfying $\mathbin{\blacktriangleleft}a=e$, for some identity $e$, then $a\mathbin{\blacktriangleleft}$ can be regarded as a subobject of the object associated to $e$. Given a collection $\{a_{i}\mid i:I\}$ of such monomorphisms, consider the smallest subobject containing all set-wise images of the $a_{i}\mathbin{\blacktriangleleft}$. The coproduct allows us to effectively “glue” together all of the monomorphisms, but the result is not a monomorphism. To obtain a monomorphism, we take the image of the coproduct, namely

\begin{equation}
\mathrm{im}\left(\coprod_{i:I}a_{i}\right).\tag{3.5}
\end{equation}

4. Category actions, capsules, and counits

Theorem 2 asserts that characteristic subgroups arise from categories acting on other categories.

4.1. Category actions

Our formulation of category actions generalizes the familiar notion for groups and also actions of monoids and groupoids [MonoidsAC]*§I.4. The technical aspects of the definition concern the additional guards, denoted $\lhd$, needed to express where products are defined. Their use is similar to the guards $\blacktriangleleft$ used for abstract categories (see Definition 3.13).

Definition 4.1.

Let $\mathsf{A}$ be an abstract category with guards denoted by $(-)\mathbin{\blacktriangleleft}$ and $\mathbin{\blacktriangleleft}(-)$. Let $X$ be a type. A (left) category action of $\mathsf{A}$ on $X$ consists of a type $\lhd X$, functions $(-)\lhd:\mathsf{A}\to\lhd X$ and $\lhd(-):X\to\lhd X$, and a partial-function $\cdot:\mathsf{A}\times X\to X^{?}$ that satisfies the following rules:
    (1) $(\forall a:\mathsf{A})$ $(\forall x:X)$ $\left[(a\lhd=\lhd x)\iff((\exists y:X)\;a\cdot x=y)\right]$;
    (2) $(\forall a:\mathsf{A})$ $(\forall x:X)$ $\left[(a\mathbin{\blacktriangleleft})\lhd=a\lhd\text{ and }((a\mathbin{\blacktriangleleft})\cdot x)\mathrel{\tikz{\draw (0,6.46pt)--(2.15pt,5.17pt)--(6.46pt,5.17pt) (0,1.94pt)--(2.15pt,3.23pt)--(6.46pt,3.23pt);}}x\right]$; and
    (3) $(\forall a,b:\mathsf{A})$ $(\forall x:X)$ $((ab)\cdot x)\mathrel{\tikz{\draw (0,6.46pt)--(2.15pt,5.17pt)--(6.46pt,5.17pt) (0,1.94pt)--(2.15pt,3.23pt)--(6.46pt,3.23pt);}}(a\cdot(b\cdot x))$.

Given a left action of $\mathsf{A}$ on a type $Y$, a partial-function $\mathcal{M}:X\to Y^{?}$ is an $\mathsf{A}$-morphism if $\mathcal{M}(a\cdot x)=a\cdot\mathcal{M}(x)$ whenever $a:\mathsf{A}$ and $x:X$ with $a\lhd=\lhd x$.

Right category actions are similarly defined. We unpack the symbolic expressions in Definition 4.1. Condition (1) states that the functions $(-)\lhd$ and $\lhd(-)$ serve as guards for the partial-function $\cdot:\mathsf{A}\times X\to X^{?}$: namely, (1) characterizes precisely when $\cdot$ is defined. The first part of Condition (2) asserts that $(-)\lhd$ respects the $(-)\mathbin{\blacktriangleleft}$ identity of $\mathsf{A}$; the second part states that identity morphisms of $\mathsf{A}$ act as identities. Condition (3) is the familiar group action axiom in the setting of partial-functions.

For subtypes $S\subset\mathsf{A}$ and $Y\subset X$, we write

\[
S\cdot Y=\{s\cdot y\mid s:S,\;y:Y,\;s\lhd=\lhd y\}.
\]

From Definition 4.1, an $\mathsf{A}$-morphism $\mathcal{M}:X\to Y^{?}$ is always defined on $\mathsf{A}\cdot X$; hence, we do not need guards for $\mathsf{A}$-morphisms.

Definition 4.2.

The category action of $\mathsf{A}$ on $X$ is full if $e\cdot x\mapsto\lhd(e\cdot x)$ defines a bijection from $\mathbb{1}_{\mathsf{A}}\cdot X$ to $\mathsf{A}\lhd=\{a\lhd\mid a:\mathsf{A}\}$.

Note that the category action of $\mathsf{A}$ on $X$ is full if and only if for every $a:\mathsf{A}$ there exists an $x:X$ such that $a\lhd=\lhd x$.

Since we identify categories and abstract categories (Remark 3.19), we say that a category $\mathsf{C}$ acts on a type $X$ if its morphisms $\mathsf{C}_{1}$ act on $X$.

Example 4.3.

Let $\mathsf{C}$ be a category with object type $\mathsf{C}_{0}$ and morphism type $\mathsf{C}_{1}$. Set $X=\lhd X=\mathsf{C}_{0}$. Define $(-)\lhd:\mathsf{C}_{1}\to\mathsf{C}_{0}$ via $f\lhd:=\operatorname{Dom}f$ and define $\lhd(-):\mathsf{C}_{0}\to\mathsf{C}_{0}$ via $\lhd\,U:=U$. Let $\cdot:\mathsf{C}_{1}\times\mathsf{C}_{0}\to\mathsf{C}_{0}^{?}$ be the partial-function defined by

\[
f\cdot U:=\begin{cases}\operatorname{Codom}f&\text{if }f\lhd=\lhd\,U,\\ \bot&\text{otherwise}.\end{cases}
\]

This defines a full left action of $\mathsf{C}$ on $\mathsf{C}_{0}$. A full right action is defined similarly. $\square$
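The rules of Definition 4.1 can be verified for this action on a small example. The Python sketch below does so for the category $\mathsf{A}$ of Example 3.16, with objects identified with the identities $e_{1},\ldots,e_{6}$. It is a hypothetical illustration: the encoding and the names `dom`, `cod`, `act`, and `directed` are ours, `None` stands for $\bot$, and the relation in rules (2) and (3) is assumed to mean that whenever the left side is defined, the right side is defined and equal.

```python
# Hypothetical encoding of the category A of Example 3.16; GUARDS[x] = (◂x, x◂).
IDS = ["e1", "e2", "e3", "e4", "e5", "e6"]
GUARDS = {e: (e, e) for e in IDS}
GUARDS.update({
    "a12": ("e1", "e2"), "a23": ("e2", "e3"),
    "a13": ("e1", "e3"), "a13'": ("e1", "e3"),  # "a13'" stands for á13
    "b45": ("e4", "e5"), "b54": ("e5", "e4"),
})
NONTRIVIAL = {("a12", "a23"): "a13", ("b45", "b54"): "e4", ("b54", "b45"): "e5"}
A = list(GUARDS)


def mul(x, y):
    """Partial product, defined exactly when x◂ = ◂y."""
    if GUARDS[x][1] != GUARDS[y][0]:
        return None
    if x in IDS:
        return y
    if y in IDS:
        return x
    return NONTRIVIAL[(x, y)]


def dom(f):  # Dom f is recorded by the right guard f◂
    return GUARDS[f][1]


def cod(f):  # Codom f is recorded by the left guard ◂f
    return GUARDS[f][0]


def act(f, U):
    """f · U = Codom f when f⊲ = Dom f agrees with ⊲U = U; otherwise None."""
    if f is None or U is None or dom(f) != U:
        return None
    return cod(f)


def directed(lhs, rhs):
    """If lhs is defined, then rhs is defined and equal to it."""
    return lhs is None or lhs == rhs


# Rule (2): (a◂)⊲ = a⊲, and identities act as identities where defined.
rule2 = all(dom(dom(a)) == dom(a) and directed(act(dom(a), U), U)
            for a in A for U in IDS)
# Rule (3): (ab)·U agrees with a·(b·U) whenever the former is defined.
rule3 = all(directed(act(mul(a, b), U), act(a, act(b, U)))
            for a in A for b in A for U in IDS)
# Fullness: every morphism acts on at least one object.
full = all(any(act(a, U) is not None for U in IDS) for a in A)
print(rule2, rule3, full)  # prints: True True True
```

Rule (1) holds by construction, since `act` returns a value exactly when the two guards agree.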

Remark 4.4.

Let $\mathsf{C}$ be a category and let $X=\mathsf{C}_{1}$. The definition of category action in [FS]*1.271–1.274 is similar to ours, but it requires $\lhd X=\mathbb{1}_{\mathsf{C}}=\{f\mathbin{\blacktriangleleft}\mid f:\mathsf{C}_{1}\}$ and $\lhd x=\mathbin{\blacktriangleleft}x$ and $f\lhd=f\mathbin{\blacktriangleleft}$ for every $x:X$ and $f:\mathsf{C}_{1}$. Thus, for $f,g:\mathsf{C}_{1}$ and $x:X$, both $f\cdot x$ and $g\cdot x$ are defined only when $f\mathbin{\blacktriangleleft}=\lhd x=g\mathbin{\blacktriangleleft}$; this is too restrictive for our purposes.

4.2. Capsules

As identified in Section 1.2, we focus on the action of one category $\mathsf{A}$ on another category $\mathsf{X}$; we call these “category modules” capsules. Note the change in notation from $X$ to $\mathsf{X}$ to emphasize this setting. In this case, $\mathsf{X}$ already has a candidate type for $\lhd\mathsf{X}$, namely $\mathbin{\blacktriangleleft}\mathsf{X}=\mathbb{1}_{\mathsf{X}}$. Furthermore, because a category has its own operation of composition, the action by $\mathsf{A}$ respects composition. For example, given a group homomorphism $\varphi:G\to H$, we get an action $g\cdot h:=\varphi(g)h$ that satisfies $g\cdot(hh^{\prime})=(g\cdot h)h^{\prime}$.

Definition 4.5.

A category $\mathsf{X}$ is a left $\mathsf{A}$-capsule if there is a full left $\mathsf{A}$-action on $\mathsf{X}$ with $\lhd\mathsf{X}=\mathbb{1}_{\mathsf{X}}$ such that the following hold:
    (a) $(\forall x:\mathsf{X})$ $(\lhd x=\mathbin{\blacktriangleleft}x)$;
    (b) $(\forall a:\mathsf{A})$ $(\forall x,y:\mathsf{X})$ $a\cdot(xy)\asymp(a\cdot x)y$.

A right 𝖠\mathsf{A}-capsule is similarly defined. We present our results below for left 𝖠\mathsf{A}-capsules, but they can be formulated for both.

Much of our intuition on actions draws on familiar themes in representation theory. A reader may be assisted by translating “$\mathsf{A}$-capsule” to “$A$-module” and considering the matching statement for modules. We write ${}_{\mathsf{A}}\mathsf{X}$ to indicate the presence of a left $\mathsf{A}$-capsule action on $\mathsf{X}$.

From now on, if a category 𝖠\mathsf{A} acts on itself, then we assume it is by the (left) regular action, where :𝖠×𝖠𝖠?\cdot:\mathsf{A}\times\mathsf{A}\to\mathsf{A}^{?} is given by composition in 𝖠\mathsf{A}. Moreover, a category action on another category is implicitly understood to be on the morphisms. We now show that capsules arise from morphisms between categories.

Proposition 4.6.

A category $\mathsf{X}$ is a left $\mathsf{A}$-capsule of a category $\mathsf{A}$ if, and only if, there is a morphism $\mathcal{F}:\mathsf{A}\to\mathsf{X}$ such that $a\cdot x=\mathcal{F}(a)x$. Furthermore, the morphism $\mathcal{F}$ is unique.

The following lemma proves one direction of Proposition 4.6.

Lemma 4.7.

Every morphism $\mathcal{F}:\mathsf{A}\to\mathsf{X}$ of categories makes $\mathsf{X}$ a left $\mathsf{A}$-capsule, where for each $a:\mathsf{A}$ and $x:\mathsf{X}$, the guard is defined by $a\lhd:=\mathcal{F}(a)\mathbin{\blacktriangleleft}$ and the action is defined by $a\cdot x:=\mathcal{F}(a)x$.

Proof.

Condition (1) of Definition 4.1 is satisfied by the defined action.

For the first part of Condition (2), let a:𝖠a:\mathsf{A}. Since \mathcal{F} is a morphism and ()(-)\mathbin{\blacktriangleleft} is everywhere defined, (a)=(a)\mathcal{F}(a\mathbin{\blacktriangleleft})=\mathcal{F}(a)\mathbin{\blacktriangleleft}. Hence, by Lemma 3.14(a),

(a)\displaystyle(a\mathbin{\blacktriangleleft})\lhd =(a)=((a))=(a)=a.\displaystyle=\mathcal{F}(a\mathbin{\blacktriangleleft})\mathbin{\blacktriangleleft}=(\mathcal{F}(a)\mathbin{\blacktriangleleft})\mathbin{\blacktriangleleft}=\mathcal{F}(a)\mathbin{\blacktriangleleft}=a\!\lhd.

For the second part of Condition (2), let a:𝖠a:\mathsf{A} and x:𝖷x:\mathsf{X} with a=xa\lhd=\lhd x, so (a)=x\mathcal{F}(a)\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}x by definition. Thus,

(a)x=(a)x=((a))x=(x)x=x,(a\mathbin{\blacktriangleleft})\cdot x=\mathcal{F}(a\mathbin{\blacktriangleleft})x=(\mathcal{F}(a)\mathbin{\blacktriangleleft})x=(\mathbin{\blacktriangleleft}x)x=x,

so (a)xx(a\mathbin{\blacktriangleleft})\cdot x\mathrel{\leavevmode\hbox to6.86pt{\vbox to4.92pt{\pgfpicture\makeatletter\hbox{\thinspace\lower 1.7375pt\hbox to0.0pt{\pgfsys@beginscope\pgfsys@invoke{ }\definecolor{pgfstrokecolor}{rgb}{0,0,0}\pgfsys@color@rgb@stroke{0}{0}{0}\pgfsys@invoke{ }\pgfsys@color@rgb@fill{0}{0}{0}\pgfsys@invoke{ }\pgfsys@setlinewidth{0.4pt}\pgfsys@invoke{ }\nullfont\pgfsys@beginscope\pgfsys@invoke{ }\pgfsys@invoke{ }\pgfsys@endscope\hbox to0.0pt{\pgfsys@beginscope\pgfsys@invoke{ }{{}}{{}} {}{{}}{} {}{}{}\pgfsys@moveto{0.0pt}{6.45837pt}\pgfsys@lineto{2.15277pt}{5.1667pt}\pgfsys@stroke\pgfsys@invoke{ } {}{{}}{} {}{}{}\pgfsys@moveto{2.15277pt}{5.1667pt}\pgfsys@lineto{6.45831pt}{5.1667pt}\pgfsys@stroke\pgfsys@invoke{ } {}{{}}{} {}{}{}\pgfsys@moveto{0.0pt}{1.9375pt}\pgfsys@lineto{2.15277pt}{3.22919pt}\pgfsys@stroke\pgfsys@invoke{ } {}{{}}{} {}{}{}\pgfsys@moveto{2.15277pt}{3.22919pt}\pgfsys@lineto{6.45831pt}{3.22919pt}\pgfsys@stroke\pgfsys@invoke{ } \pgfsys@invoke{ }\pgfsys@endscope{}{}{}\hss}\pgfsys@discardpath\pgfsys@invoke{ }\pgfsys@endscope\hss}}\endpgfpicture}}}x for every a:𝖠a:\mathsf{A} and x:𝖷x:\mathsf{X}.

For Condition (3), let a,b:𝖠a,b:\mathsf{A} and x:𝖷x:\mathsf{X} with a=ba\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}b and (ab)=x(ab)\lhd=\lhd x, so (ab)x(ab)\cdot x is defined and (ab)=b(ab)\mathbin{\blacktriangleleft}=b\mathbin{\blacktriangleleft}. We need to show that (ab)x=a(bx)(ab)\cdot x=a\cdot(b\cdot x). Since \mathcal{F} is a morphism,

(ab)=((ab))=((ab))=(b)=(b)=b.(ab)\lhd=(\mathcal{F}(ab))\mathbin{\blacktriangleleft}=\mathcal{F}((ab)\mathbin{\blacktriangleleft})=\mathcal{F}(b\mathbin{\blacktriangleleft})=\mathcal{F}(b)\mathbin{\blacktriangleleft}=b\lhd.

Hence, (ab)=x(ab)\lhd=\lhd x implies b=xb\lhd=\lhd x. Thus, (b)x\mathcal{F}(b)x is defined. Also, a=ba\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}b implies (a)=((b))\mathcal{F}(a)\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}(\mathcal{F}(b)), so

a=(a)=((b))=((b)x)=(bx)=(bx).a\lhd=\mathcal{F}(a)\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}(\mathcal{F}(b))=\mathbin{\blacktriangleleft}(\mathcal{F}(b)x)=\mathbin{\blacktriangleleft}(b\cdot x)=\lhd(b\cdot x).

It follows that a(bx)a\cdot(b\cdot x) is defined. Since \mathcal{F} is a morphism,

a(bx)=(a)((b)x)=(ab)x=(ab)x,a\cdot(b\cdot x)=\mathcal{F}(a)(\mathcal{F}(b)x)=\mathcal{F}(ab)x=(ab)\cdot x,

and therefore (ab)xa(bx)(ab)\cdot x\mathrel{\leavevmode\hbox to6.86pt{\vbox to4.92pt{\pgfpicture\makeatletter\hbox{\thinspace\lower 1.7375pt\hbox to0.0pt{\pgfsys@beginscope\pgfsys@invoke{ }\definecolor{pgfstrokecolor}{rgb}{0,0,0}\pgfsys@color@rgb@stroke{0}{0}{0}\pgfsys@invoke{ }\pgfsys@color@rgb@fill{0}{0}{0}\pgfsys@invoke{ }\pgfsys@setlinewidth{0.4pt}\pgfsys@invoke{ }\nullfont\pgfsys@beginscope\pgfsys@invoke{ }\pgfsys@invoke{ }\pgfsys@endscope\hbox to0.0pt{\pgfsys@beginscope\pgfsys@invoke{ }{{}}{{}} {}{{}}{} {}{}{}\pgfsys@moveto{0.0pt}{6.45837pt}\pgfsys@lineto{2.15277pt}{5.1667pt}\pgfsys@stroke\pgfsys@invoke{ } {}{{}}{} {}{}{}\pgfsys@moveto{2.15277pt}{5.1667pt}\pgfsys@lineto{6.45831pt}{5.1667pt}\pgfsys@stroke\pgfsys@invoke{ } {}{{}}{} {}{}{}\pgfsys@moveto{0.0pt}{1.9375pt}\pgfsys@lineto{2.15277pt}{3.22919pt}\pgfsys@stroke\pgfsys@invoke{ } {}{{}}{} {}{}{}\pgfsys@moveto{2.15277pt}{3.22919pt}\pgfsys@lineto{6.45831pt}{3.22919pt}\pgfsys@stroke\pgfsys@invoke{ } \pgfsys@invoke{ }\pgfsys@endscope{}{}{}\hss}\pgfsys@discardpath\pgfsys@invoke{ }\pgfsys@endscope\hss}}\endpgfpicture}}}a\cdot(b\cdot x) for every a,b:𝖠a,b:\mathsf{A} and x:𝖷x:\mathsf{X}.

To see that the action is full, consider $a:\mathsf{A}$ and define $x=\mathcal{F}(a)\mathbin{\blacktriangleleft}$. By the laws of an abstract category, $a\lhd=\mathcal{F}(a)\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}(\mathcal{F}(a)\mathbin{\blacktriangleleft})=\mathbin{\blacktriangleleft}x=\lhd x$. Finally, $(a\cdot x)y=\mathcal{F}(a)xy\asymp a\cdot(xy)$, so $\mathsf{X}$ is a left $\mathsf{A}$-capsule. ∎

Our proof of the reverse direction of Proposition 4.6 uses the following result.

Lemma 4.8.

Let $\mathsf{X}$ be a left $\mathsf{A}$-capsule. For every $a:\mathsf{A}$, there is a unique $e:\mathbb{1}_{\mathsf{X}}$ such that $a\cdot e$ is the unique term of type $a\cdot\mathbb{1}_{\mathsf{X}}$.

Proof.

Since the action is full, for each $a:\mathsf{A}$ there exists $x:\mathsf{X}$ such that $a\lhd=\lhd x$, so $a\cdot x$ is defined. Since $\mathsf{X}$ is a left $\mathsf{A}$-capsule, $a\lhd=\lhd x=\mathbin{\blacktriangleleft}x$, so

\[\lhd(\mathbin{\blacktriangleleft}x)=\mathbin{\blacktriangleleft}(\mathbin{\blacktriangleleft}x)=\mathbin{\blacktriangleleft}x.\]

Hence, $a\cdot(\mathbin{\blacktriangleleft}x)$ is defined and has type $a\cdot\mathbb{1}_{\mathsf{X}}$. Suppose $e,f:\mathbb{1}_{\mathsf{X}}$ and $a\lhd=\lhd e=\lhd f$, so that $a\cdot e,a\cdot f:a\cdot\mathbb{1}_{\mathsf{X}}$. Furthermore,

\[e=\mathbin{\blacktriangleleft}e=\lhd e=\lhd f=\mathbin{\blacktriangleleft}f=f.\]

Thus, $a\cdot e=a\cdot f$, and there is exactly one term with type $a\cdot\mathbb{1}_{\mathsf{X}}$. ∎

Under the assumptions of Lemma 4.8, we simplify notation and identify $a\cdot\mathbb{1}_{\mathsf{X}}$ with its unique term.

Proof of Proposition 4.6.

By Lemma 4.7, it remains to prove the forward direction and uniqueness. Suppose that $\mathsf{X}$ is a left $\mathsf{A}$-capsule. By Lemma 4.8, for each $a:\mathsf{A}$ there is a unique $\mathcal{F}(a\mathbin{\blacktriangleleft}):\mathbb{1}_{\mathsf{X}}$ such that $(a\mathbin{\blacktriangleleft})\cdot\mathcal{F}(a\mathbin{\blacktriangleleft})$ is defined. Since $\mathsf{X}$ is a left $\mathsf{A}$-capsule and $\mathcal{F}(a\mathbin{\blacktriangleleft})$ is an identity,

\[(a\mathbin{\blacktriangleleft})\lhd=a\lhd=\lhd\mathcal{F}(a\mathbin{\blacktriangleleft})=\mathbin{\blacktriangleleft}\mathcal{F}(a\mathbin{\blacktriangleleft})=\mathcal{F}(a\mathbin{\blacktriangleleft}).\]

Thus, $a\cdot\mathcal{F}(a\mathbin{\blacktriangleleft})$ is also defined. Put $\mathcal{F}(a):=a\cdot\mathcal{F}(a\mathbin{\blacktriangleleft})$. If $x:\mathsf{X}$, then $a\cdot x$ is defined whenever

\[\mathbin{\blacktriangleleft}x=\lhd x=a\lhd=\mathcal{F}(a\mathbin{\blacktriangleleft})=\mathcal{F}(a\mathbin{\blacktriangleleft})\mathbin{\blacktriangleleft}.\]

Hence, $\mathcal{F}(a\mathbin{\blacktriangleleft})x$ is also defined in $\mathsf{X}$. Because $\mathcal{F}(a\mathbin{\blacktriangleleft})$ is an identity, $\mathcal{F}(a\mathbin{\blacktriangleleft})x=x$. Since $\mathsf{X}$ is a left $\mathsf{A}$-capsule, $a\cdot x=a\cdot(\mathcal{F}(a\mathbin{\blacktriangleleft})x)=(a\cdot\mathcal{F}(a\mathbin{\blacktriangleleft}))x=\mathcal{F}(a)x$. Hence, it remains to prove that $\mathcal{F}:\mathsf{A}\to\mathsf{X}$ is a morphism of categories.

For $a,b:\mathsf{A}$, by the action laws

\[\mathcal{F}(ab)\asymp(ab)\cdot\mathcal{F}((ab)\mathbin{\blacktriangleleft})\asymp(ab)\cdot\mathcal{F}(b\mathbin{\blacktriangleleft})\mathrel{{>}\!\!{=}}a\cdot(b\cdot\mathcal{F}(b\mathbin{\blacktriangleleft}))\asymp a\cdot\mathcal{F}(b).\]

Thus, $\mathcal{F}(ab)\mathrel{{>}\!\!{=}}a\cdot\mathcal{F}(b)$. But $\mathsf{X}$ is a left $\mathsf{A}$-capsule, so Fact 3.17 implies that

\[a\cdot\mathcal{F}(b)\asymp a\cdot(\mathbb{1}_{\mathsf{X}}\mathcal{F}(b))\asymp(a\cdot\mathbb{1}_{\mathsf{X}})\mathcal{F}(b)\asymp\mathcal{F}(a)\mathcal{F}(b).\tag{4.1}\]

Hence, $\mathcal{F}(ab)\mathrel{{>}\!\!{=}}\mathcal{F}(a)\mathcal{F}(b)$. By Lemma 3.14(b) and (4.1) for all $a:\mathsf{A}$,

\[\mathcal{F}(a)\mathbin{\blacktriangleleft}=(a\cdot\mathcal{F}(a\mathbin{\blacktriangleleft}))\mathbin{\blacktriangleleft}=(\mathcal{F}(a)\mathcal{F}(a\mathbin{\blacktriangleleft}))\mathbin{\blacktriangleleft}=\mathcal{F}(a\mathbin{\blacktriangleleft})\mathbin{\blacktriangleleft}=\mathcal{F}(a\mathbin{\blacktriangleleft}).\]

Similarly, $\mathcal{F}(\mathbin{\blacktriangleleft}a)=\mathbin{\blacktriangleleft}\mathcal{F}(a)$. Hence, $\mathcal{F}$ is a morphism.

Lastly, we prove uniqueness of $\mathcal{F}$. Suppose there exists $\mathcal{G}:\mathsf{A}\to\mathsf{X}$ such that $a\cdot x=\mathcal{G}(a)x$ for every $a:\mathsf{A}$ and $x:\mathsf{X}$ whenever $a\lhd=\lhd x$. Since $a\lhd=\mathcal{F}(a\mathbin{\blacktriangleleft})$, it follows that $\mathcal{G}(a)\mathbin{\blacktriangleleft}=\mathcal{F}(a\mathbin{\blacktriangleleft})$, so $\mathcal{G}(a)=\mathcal{G}(a)\mathcal{F}(a\mathbin{\blacktriangleleft})=a\cdot\mathcal{F}(a\mathbin{\blacktriangleleft})=\mathcal{F}(a)$. ∎

If $\mathsf{B}$ is a subcategory of $\mathsf{A}$ with inclusion $\mathcal{I}:\mathsf{B}\to\mathsf{A}$, then the (left) regular action of $\mathsf{B}$ on $\mathsf{A}$ is defined to be the action given by $\mathcal{I}$. In other words, the regular action of $\mathsf{B}$ on $\mathsf{A}$ is given by $b\cdot a=\mathcal{I}(b)a$ for $a:\mathsf{A}$ and $b:\mathsf{B}$. By Lemma 4.7, each regular action defines a capsule. With regular actions we sometimes omit the “$\cdot$”.
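As an informal illustration of how a functor induces a guarded action, the following Python sketch models a small category of functions between finite sets and the action $a\cdot x=\mathcal{F}(a)x$. It is a toy model under stated assumptions, not the type-theoretic framework of this paper: the names `Mor`, `ident`, `src_id`, `tgt_id`, `compose`, `F`, and `act` are hypothetical, the functor `F` (adjoining a basepoint) is chosen only for the example, and an undefined term is modelled by `None`.

```python
from typing import NamedTuple, Optional, Tuple


class Mor(NamedTuple):
    """A function range(src) -> range(tgt), stored as its tuple of values."""
    src: int
    tgt: int
    vals: Tuple[int, ...]


def ident(n: int) -> Mor:
    """Identity morphism on the object n = {0, ..., n-1}."""
    return Mor(n, n, tuple(range(n)))


def src_id(a: Mor) -> Mor:
    """Source identity of a."""
    return ident(a.src)


def tgt_id(a: Mor) -> Mor:
    """Target identity of a."""
    return ident(a.tgt)


def compose(a: Optional[Mor], b: Optional[Mor]) -> Optional[Mor]:
    """Partial product ab ("a after b"); None models an undefined term."""
    if a is None or b is None or src_id(a) != tgt_id(b):
        return None
    return Mor(b.src, a.tgt, tuple(a.vals[v] for v in b.vals))


def F(a: Mor) -> Mor:
    """Assumed example functor: adjoin a basepoint, sending n to n + 1."""
    return Mor(a.src + 1, a.tgt + 1, a.vals + (a.tgt,))


def act(a: Optional[Mor], x: Optional[Mor]) -> Optional[Mor]:
    """Left action a . x := F(a) x, defined exactly when F(a) x is."""
    return None if a is None else compose(F(a), x)


a = Mor(2, 3, (0, 2))
b = Mor(2, 2, (1, 0))
x = Mor(1, 3, (2,))

# (ab) . x and a . (b . x) are both defined here and agree.
assert act(compose(a, b), x) == act(a, act(b, x))
# The guard fails when the source of F(a) differs from the target of x.
assert act(a, ident(2)) is None
# Fullness: the source identity of F(a) always satisfies the guard.
assert act(a, src_id(F(a))) == F(a)
```

In this model the guard of the action is inherited from the guard of composition in the target category, which mirrors the way the lemma reduces the action laws to the functor laws.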

4.3. Category biactions and cyclic bicapsules

We now define the concepts appearing in Theorem 2(3).

Definition 4.9.

Let $\mathsf{A}$ and $\mathsf{B}$ be categories and let $X$ and $Y$ be types.

  1. (a)

    An $(\mathsf{A},\mathsf{B})$-biaction on $X$ is a left $\mathsf{A}$-action on $X$ and a right $\mathsf{B}$-action on $X$ such that $a\cdot(x\cdot b)\asymp(a\cdot x)\cdot b$ for every $a:\mathsf{A}$, $b:\mathsf{B}$, and $x:X$. Hence, writing $a\cdot x\cdot b$ is unambiguous. If, in addition, $\mathsf{X}$ is a left $\mathsf{A}$-capsule and right $\mathsf{B}$-capsule, then $\mathsf{X}$ is an $(\mathsf{A},\mathsf{B})$-bicapsule.

  2. (b)

    Suppose there are $(\mathsf{A},\mathsf{B})$-biactions on $X$ and $Y$. An $(\mathsf{A},\mathsf{B})$-morphism is a partial-function $\mathcal{M}:X\to Y^{?}$ such that $\mathcal{M}(a\cdot x\cdot b)=a\cdot\mathcal{M}(x)\cdot b$, whenever $a:\mathsf{A}$, $x:X$, $b:\mathsf{B}$ with $a\lhd=\lhd x$ and $x\lhd=\lhd b$.

We sometimes write ${}_{\mathsf{A}}X_{\mathsf{B}}$ for an $(\mathsf{A},\mathsf{B})$-biaction on $X$ for clarity. Notice that an $(\mathsf{A},\mathsf{B})$-morphism $\mathcal{M}:X\to Y^{?}$ must be defined on $\mathsf{A}\cdot X\cdot\mathsf{B}$. As with capsule morphisms, we do not need to establish guards. We abbreviate $(\mathsf{A},\mathsf{A})$-bicapsule to $\mathsf{A}$-bicapsule, $(\mathsf{A},\mathsf{A})$-morphism to $\mathsf{A}$-bimorphism, and $(\mathsf{A},\mathsf{A})$-biaction to $\mathsf{A}$-biaction. Note that $\mathsf{A}$-bimorphisms are defined everywhere. Just as ring homomorphisms are not always linear maps, morphisms of capsules need not be morphisms of categories; morphisms of capsules do not in general send identities to identities.

Motivated by Proposition 4.6, we show that bicapsules provide a computationally useful perspective to record natural transformations of functors. If $\mathcal{F},\mathcal{G}:\mathsf{A}\to\mathsf{B}$ are functors and $\mu:\mathcal{G}\Rightarrow\mathcal{F}$ is a natural transformation, then, using Remark 3.19, the natural transformation property written with guards is

\[\mathcal{F}(a)\mu_{a\mathbin{\blacktriangleleft}}=\mu_{\mathbin{\blacktriangleleft}a}\mathcal{G}(a)\]

for every morphism $a$ in $\mathsf{A}$.

Proposition 4.10.

In the following statements, the category $\mathsf{A}$ is also regarded as an $\mathsf{A}$-bicapsule via its regular action.

  1. (a)

    For every natural transformation $\mu:\mathcal{G}\Rightarrow\mathcal{F}$ between functors $\mathcal{F},\mathcal{G}:\mathsf{A}\to\mathsf{X}$, the assignment

    \[a\cdot x\cdot a^{\prime}:=\mathcal{F}(a)x\mathcal{G}(a^{\prime})\qquad(a,a^{\prime}:\mathsf{A},\;x:\mathsf{X})\]

    makes $\mathsf{X}$ into an $\mathsf{A}$-bicapsule, and the assignment $\mathcal{M}(a):=a\cdot\mu_{a\mathbin{\blacktriangleleft}}$ defines an $\mathsf{A}$-bimorphism $\mathcal{M}:\mathsf{A}\to\mathsf{X}^{?}$.

  2. (b)

    Conversely, for every $\mathsf{A}$-bimorphism $\mathcal{M}:\mathsf{A}\to\mathsf{X}^{?}$, the assignments

    \[\mathcal{F}(a):=a\cdot\mathbb{1}_{\mathsf{X}},\qquad\mathcal{G}(a):=\mathbb{1}_{\mathsf{X}}\cdot a\qquad(a:\mathsf{A})\]

    define functors $\mathcal{F},\mathcal{G}:\mathsf{A}\to\mathsf{X}$, and the assignment

    \[\mu_{a\mathbin{\blacktriangleleft}}:=\mathcal{M}(a\mathbin{\blacktriangleleft})\qquad(a:\mathsf{A})\]

    defines a natural transformation $\mu:\mathcal{G}\Rightarrow\mathcal{F}$.

Proof.
  1. (a)

    By Lemma 3.14(b) for all $a,b,c:\mathsf{A}$,

    \[\mathcal{M}(ab)\asymp(ab)\cdot\mu_{(ab)\mathbin{\blacktriangleleft}}\asymp\mathcal{F}(ab)\mu_{(ab)\mathbin{\blacktriangleleft}}\mathrel{{>}\!\!{=}}\mathcal{F}(a)\mathcal{F}(b)\mu_{b\mathbin{\blacktriangleleft}}\asymp a\cdot\mathcal{M}(b).\]

    Since $\mu$ is a natural transformation,

    \[\mathcal{M}(bc)\asymp\mathcal{F}(bc)\mu_{(bc)\mathbin{\blacktriangleleft}}\asymp\mu_{\mathbin{\blacktriangleleft}(bc)}\mathcal{G}(bc)\mathrel{{>}\!\!{=}}\mu_{\mathbin{\blacktriangleleft}b}\mathcal{G}(b)\mathcal{G}(c)\asymp\mathcal{F}(b)\mu_{b\mathbin{\blacktriangleleft}}\mathcal{G}(c)\asymp\mathcal{M}(b)\cdot c.\]

    Thus, $\mathcal{M}(abc)\mathrel{{>}\!\!{=}}a\cdot\mathcal{M}(b)\cdot c$, so $\mathcal{M}$ is an $\mathsf{A}$-bimorphism.

  2. (b)

    We apply Proposition 4.6, so there are functors $\mathcal{F},\mathcal{G}:\mathsf{A}\to\mathsf{X}$ determined by the left and right actions, respectively. Let $a:\mathsf{A}$ and define $\mu_{e}:=\mathcal{M}(e)$ for $e:\mathbb{1}_{\mathsf{A}}$. Now

    \begin{align*}
    \mathcal{F}(a)\mu_{a\mathbin{\blacktriangleleft}}&=\mathcal{F}(a)\mathcal{M}(a\mathbin{\blacktriangleleft})=a\cdot\mathcal{M}(a\mathbin{\blacktriangleleft})=\mathcal{M}(a(a\mathbin{\blacktriangleleft}))=\mathcal{M}(a)\\
    &=\mathcal{M}((\mathbin{\blacktriangleleft}a)a)=\mathcal{M}(\mathbin{\blacktriangleleft}a)\cdot a=\mathcal{M}(\mathbin{\blacktriangleleft}a)\mathcal{G}(a)=\mu_{\mathbin{\blacktriangleleft}a}\mathcal{G}(a).
    \end{align*}

    Therefore, $\mu$ is a natural transformation, as required. ∎
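The passage between a natural transformation and a bimorphism can be checked on a small example. The Python sketch below is an illustration under assumptions rather than part of the formal development: it models functions between finite sets, takes $\mathcal{G}$ to be the identity functor and $\mathcal{F}$ to be a hypothetical basepoint-adjoining functor, and uses the inclusions $n\to n+1$ as the components of $\mu:\mathcal{G}\Rightarrow\mathcal{F}$. All Python names (`Mor`, `compose`, `F`, `G`, `mu`, `M`) are invented for the example.

```python
from typing import NamedTuple, Optional, Tuple


class Mor(NamedTuple):
    """A function range(src) -> range(tgt), stored as its tuple of values."""
    src: int
    tgt: int
    vals: Tuple[int, ...]


def ident(n: int) -> Mor:
    return Mor(n, n, tuple(range(n)))


def compose(a: Optional[Mor], b: Optional[Mor]) -> Optional[Mor]:
    """Partial product ab ("a after b"); None models an undefined term."""
    if a is None or b is None or a.src != b.tgt:
        return None
    return Mor(b.src, a.tgt, tuple(a.vals[v] for v in b.vals))


def F(a: Mor) -> Mor:
    """Assumed functor: adjoin a basepoint, sending n to n + 1."""
    return Mor(a.src + 1, a.tgt + 1, a.vals + (a.tgt,))


def G(a: Mor) -> Mor:
    """Assumed functor: the identity functor."""
    return a


def mu(e: Mor) -> Mor:
    """Component of mu : G => F at the identity e on n: the inclusion n -> n + 1."""
    return Mor(e.src, e.src + 1, e.vals)


def M(a: Optional[Mor]) -> Optional[Mor]:
    """M(a) := a . mu_(source identity of a) = F(a) mu_(source identity of a)."""
    return None if a is None else compose(F(a), mu(ident(a.src)))


a = Mor(2, 3, (0, 2))
b = Mor(2, 2, (1, 0))
c = Mor(1, 2, (1,))

# Naturality with guards: F(a) mu_(source of a) = mu_(target of a) G(a).
assert compose(F(a), mu(ident(a.src))) == compose(mu(ident(a.tgt)), G(a))

# Bimorphism property: M(abc) = a . M(b) . c = F(a) M(b) G(c).
abc = compose(a, compose(b, c))
assert M(abc) == compose(compose(F(a), M(b)), G(c))

# Recovering mu from M on identities: mu_e = M(e).
assert M(ident(2)) == mu(ident(2))
```

The last assertion reflects the convention that $\mu$ is read off from the values of $\mathcal{M}$ on identities.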

We summarize the conclusion in Proposition 4.10(b), namely $\mu_{e}=\mathcal{M}(e)$ for every $e:\mathbb{1}_{\mathsf{A}}$, by writing $\mu=\mathcal{M}(\mathbb{1}_{\mathsf{A}})$. While $\mathbb{1}_{\mathsf{A}}$ consists of many terms, $x\cdot\mathbb{1}_{\mathsf{A}}$ and $\mathbb{1}_{\mathsf{A}}\cdot x$ produce unique values, so $\mathbb{1}_{\mathsf{A}}$ plays a role similar to multiplying by $1$. Since $\mathcal{M}:\mathsf{A}\to\mathsf{X}^{?}$ is an $\mathsf{A}$-bimorphism, $\mathcal{M}(a)=a\cdot\mathcal{M}(a\mathbin{\blacktriangleleft})=\mathcal{M}(\mathbin{\blacktriangleleft}a)\cdot a$ for $a:\mathsf{A}$, which shows that $\mathcal{M}$ is determined by $\mathcal{M}(\mathbb{1}_{\mathsf{A}})$. We write

\[\mathsf{A}\cdot\mu\cdot\mathsf{A}:=\left\{a\cdot\mu_{e}\cdot\acute{a}~\middle|~a,\acute{a}:\mathsf{A},\ e:\mathbb{1}_{\mathsf{A}},\ a\lhd=\lhd\mu_{e},\ \mu_{e}\lhd=\lhd\acute{a}\right\}.\tag{4.2}\]

The bicapsule in (4.2) is the cyclic $\mathsf{A}$-bicapsule determined by $\mu=\mathcal{M}(\mathbb{1}_{\mathsf{A}})$.

4.4. Units and counits

Given a category $\mathsf{A}$ and functor $\mathcal{H}:\mathsf{A}\to\mathsf{A}$, a unit is a natural transformation $\mu:\operatorname{id}_{\mathsf{A}}\Rightarrow\mathcal{H}$, and a counit is a natural transformation $\nu:\mathcal{H}\Rightarrow\operatorname{id}_{\mathsf{A}}$. We will prove that units and counits are responsible for all characteristic structure. It therefore makes sense to translate these into capsule actions. We show that a unit $\mu$ is characterized by an $(\mathsf{A},\mathsf{B})$-morphism $\mathcal{M}:\mathsf{A}\to\mathsf{B}^{?}$ and a counit $\nu$ by an $(\mathsf{A},\mathsf{B})$-morphism $\mathcal{N}:\mathsf{B}\to\mathsf{A}^{?}$. As the relationship is dual, and we emphasize substructures instead of quotients, we state and prove this relationship only for counits.

Theorem 4.11.

Let $\mathsf{A}$ and $\mathsf{B}$ be categories.

  1. (a)

    If both $\mathsf{A}$ and $\mathsf{B}$ are $(\mathsf{A},\mathsf{B})$-bicapsules and $\mathcal{N}:\mathsf{B}\to\mathsf{A}^{?}$ is an $(\mathsf{A},\mathsf{B})$-morphism, then $\mathcal{F}(b):=\mathbb{1}_{\mathsf{A}}\cdot b$ and $\mathcal{G}(a):=a\cdot\mathbb{1}_{\mathsf{B}}$ define functors $\mathcal{F}:\mathsf{B}\to\mathsf{A}$ and $\mathcal{G}:\mathsf{A}\to\mathsf{B}$, and $\nu:=\mathcal{N}\mathcal{G}(\mathbb{1}_{\mathsf{A}})$ is a counit $\nu:\mathcal{F}\mathcal{G}\Rightarrow\operatorname{id}_{\mathsf{A}}$.

  2. (b)

    If $\mathcal{F}:\mathsf{B}\to\mathsf{A}$ and $\mathcal{G}:\mathsf{A}\to\mathsf{B}$ are functors and $\nu:\mathcal{F}\mathcal{G}\Rightarrow\operatorname{id}_{\mathsf{A}}$ is a counit, then $\mathsf{A}$ and $\mathsf{B}$ are $(\mathsf{A},\mathsf{B})$-bicapsules, where $a\cdot y\cdot b:=\mathcal{G}(a)yb$ and $a\cdot x\cdot b:=\mathcal{F}\mathcal{G}(a)x\mathcal{F}\mathcal{G}\mathcal{F}(b)$ for $a,x:\mathsf{A}$ and $b,y:\mathsf{B}$. Also, $\mathcal{N}^{\prime}(b):=\mathcal{F}(b)\nu_{\mathcal{F}(b)\mathbin{\blacktriangleleft}}$ is an $(\mathsf{A},\mathsf{B})$-morphism $\mathsf{B}\to\mathsf{A}$ such that $\mathcal{N}^{\prime}\mathcal{G}(e)=\nu_{\mathcal{F}\mathcal{G}(e)}$ for all $e:\mathbb{1}_{\mathsf{A}}$.

Proof.
  1. (a)

    By Proposition 4.6, the maps $\mathcal{F}$ and $\mathcal{G}$ define functors where $x\cdot b\asymp x\mathcal{F}(b)$ and $a\cdot y\asymp\mathcal{G}(a)y$, for $a,x:\mathsf{A}$ and $b,y:\mathsf{B}$. Put $\nu=\mathcal{N}(\mathcal{G}(\mathbb{1}_{\mathsf{A}}))$. For $a:\mathsf{A}$,

    \[a\nu_{a\mathbin{\blacktriangleleft}}=a\mathcal{N}\mathcal{G}(a\mathbin{\blacktriangleleft})=a\mathcal{N}(\mathcal{G}(a)\mathbin{\blacktriangleleft})=\mathcal{N}(a\cdot(\mathcal{G}(a))\mathbin{\blacktriangleleft})=\mathcal{N}(\mathcal{G}(a)(\mathcal{G}(a))\mathbin{\blacktriangleleft})=\mathcal{N}\mathcal{G}(a),\]

    and

    \[\nu_{\mathbin{\blacktriangleleft}a}\mathcal{F}\mathcal{G}(a)=\mathcal{N}\mathcal{G}(\mathbin{\blacktriangleleft}a)\cdot\mathcal{G}(a)=\mathcal{N}(\mathbin{\blacktriangleleft}\mathcal{G}(a))\cdot\mathcal{G}(a)=\mathcal{N}((\mathbin{\blacktriangleleft}\mathcal{G}(a))\mathcal{G}(a))=\mathcal{N}\mathcal{G}(a).\]

    Hence, $a\nu_{a\mathbin{\blacktriangleleft}}=\nu_{\mathbin{\blacktriangleleft}a}\mathcal{F}\mathcal{G}(a)$ for all $a:\mathsf{A}$, so $\nu:\mathcal{F}\mathcal{G}\Rightarrow\operatorname{id}_{\mathsf{A}}$ is a natural transformation.

  2. (b)

    We show that $\mathcal{N}^{\prime}(b):=\mathcal{F}(b)\nu_{\mathcal{F}(b)\mathbin{\blacktriangleleft}}$ yields an $(\mathsf{A},\mathsf{B})$-morphism $\mathcal{N}^{\prime}:\mathsf{B}\to\mathsf{A}$. First, if $a:\mathsf{A}$ and $y:\mathsf{B}$ with $a\lhd=\lhd y$, then

    \begin{align*}
    \mathcal{N}^{\prime}(a\cdot y)&=\mathcal{N}^{\prime}(\mathcal{G}(a)y)\\
    &=\mathcal{F}(\mathcal{G}(a)y)\nu_{\mathcal{F}(\mathcal{G}(a)y)\mathbin{\blacktriangleleft}}\\
    &=\mathcal{F}\mathcal{G}(a)\mathcal{F}(y)\nu_{(\mathcal{F}\mathcal{G}(a)\mathcal{F}(y))\mathbin{\blacktriangleleft}}\\
    &=\mathcal{F}\mathcal{G}(a)\mathcal{F}(y)\nu_{\mathcal{F}(y)\mathbin{\blacktriangleleft}}\\
    &=\mathcal{F}\mathcal{G}(a)\mathcal{N}^{\prime}(y)\\
    &=a\cdot\mathcal{N}^{\prime}(y).
    \end{align*}

    Next, if $b:\mathsf{B}$ such that $y\lhd=\lhd b$, then

    \begin{align*}
    \mathcal{N}^{\prime}(yb)&=\mathcal{F}(yb)\nu_{\mathcal{F}(yb)\mathbin{\blacktriangleleft}}\\
    &=\nu_{\mathbin{\blacktriangleleft}\mathcal{F}(yb)}\mathcal{F}\mathcal{G}\mathcal{F}(yb)\\
    &=\nu_{\mathbin{\blacktriangleleft}(\mathcal{F}(y)\mathcal{F}(b))}\mathcal{F}\mathcal{G}\mathcal{F}(y)\mathcal{F}\mathcal{G}\mathcal{F}(b)\\
    &=\nu_{\mathbin{\blacktriangleleft}\mathcal{F}(y)}\mathcal{F}\mathcal{G}\mathcal{F}(y)\mathcal{F}\mathcal{G}\mathcal{F}(b)\\
    &=\mathcal{F}(y)\nu_{\mathcal{F}(y)\mathbin{\blacktriangleleft}}\mathcal{F}\mathcal{G}\mathcal{F}(b)\\
    &=\mathcal{N}^{\prime}(y)\mathcal{F}\mathcal{G}\mathcal{F}(b)\\
    &=\mathcal{N}^{\prime}(y)\cdot b.
    \end{align*}

    Finally, consider $e:\mathbb{1}_{\mathsf{A}}$. Since functors map identities to identities, we deduce that

    \[\mathcal{N}^{\prime}(\mathcal{G}(e))=\mathcal{F}\mathcal{G}(e)\nu_{\mathcal{F}\mathcal{G}(e)\mathbin{\blacktriangleleft}}=\nu_{\mathcal{F}\mathcal{G}(e)}.\qed\]

4.5. Adjoint functor pairs

Adjoint functor pairs are an important special case of natural transformations. We give one of many equivalent definitions [Riehl, §4.1].

Definition 4.12.

Let $\mathsf{A}$ and $\mathsf{B}$ be categories. An adjoint functor pair is a pair of functors $\mathcal{F}:\mathsf{B}\to\mathsf{A}$ and $\mathcal{G}:\mathsf{A}\to\mathsf{B}$ with the following property. For every object $U$ in $\mathsf{B}$ and $V$ in $\mathsf{A}$, there is an invertible function

\[\Psi_{UV}:\mathsf{A}_{1}(\mathcal{F}(U),V)\to\mathsf{B}_{1}(U,\mathcal{G}(V))\]

that is natural in the following sense: if $b:\mathsf{B}_{1}(X,U)$ and $a:\mathsf{A}_{1}(V,Y)$ for objects $X$ in $\mathsf{B}$ and $Y$ in $\mathsf{A}$ then, for every $x:\mathsf{A}_{1}(\mathcal{F}(U),V)$,

\[\Psi_{XY}(ax\mathcal{F}(b))=\mathcal{G}(a)\Psi_{UV}(x)b.\tag{4.3}\]

We say that $\mathcal{F}$ is left-adjoint to $\mathcal{G}$ and $\mathcal{G}$ is right-adjoint to $\mathcal{F}$ and write this as $\mathcal{F}:\mathsf{B}\dashv_{\Psi}\mathsf{A}:\mathcal{G}$.
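A small sanity check of Definition 4.12 can be carried out for thin categories, where an adjoint functor pair is a Galois connection. The Python sketch below is illustrative only and rests on assumptions not made in the text: both $\mathsf{A}$ and $\mathsf{B}$ are taken to be the poset $(\mathbb{Z},\leq)$ viewed as a category, with $\mathcal{F}(u)=2u$ and $\mathcal{G}(v)=\lfloor v/2\rfloor$, and the helper names `hom`, `psi`, `after`, `Fm`, and `Gm` are hypothetical. An arrow from $s$ to $t$ is encoded as the pair `(s, t)`.

```python
from itertools import product


def F(u: int) -> int:
    """Assumed left adjoint on objects: doubling."""
    return 2 * u


def G(v: int) -> int:
    """Assumed right adjoint on objects: floor of half."""
    return v // 2


def hom(s: int, t: int):
    """Arrows from s to t in the thin category (Z, <=): at most one."""
    return [(s, t)] if s <= t else []


def Fm(b):
    """F on arrows (well defined because doubling is monotone)."""
    return (F(b[0]), F(b[1]))


def Gm(a):
    """G on arrows (well defined because floor division is monotone)."""
    return (G(a[0]), G(a[1]))


def after(g, f):
    """Composite "g after f" of composable arrows."""
    assert f[1] == g[0]
    return (f[0], g[1])


def psi(U: int, V: int, x):
    """Psi_UV sends the arrow F(U) -> V to the arrow U -> G(V)."""
    assert x == (F(U), V)
    return (U, G(V))


R = range(-3, 4)

# Psi_UV is a bijection between hom-sets: both are empty or both are singletons.
for U, V in product(R, R):
    assert [psi(U, V, x) for x in hom(F(U), V)] == hom(U, G(V))

# Naturality: Psi_XY(a x F(b)) = G(a) Psi_UV(x) b whenever the arrows exist.
for X, U, V, Y in product(R, repeat=4):
    for b, x, a in product(hom(X, U), hom(F(U), V), hom(V, Y)):
        lhs = psi(X, Y, after(a, after(x, Fm(b))))
        rhs = after(Gm(a), after(psi(U, V, x), b))
        assert lhs == rhs
```

In a thin category the naturality condition holds automatically once the hom-sets match, so the substantive content of the check is the equivalence $2U\leq V\iff U\leq\lfloor V/2\rfloor$.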

We now characterize adjoint functor pairs in terms of bicapsules. A reader may find it useful to review the translation between categories and abstract categories in Remark 3.19. The invertibility of $\Psi_{UV}$ in Definition 4.12 is equivalent to a pseudo-inverse property of morphisms of bicapsules.

For types $X$ and $Y$, partial-functions $\mathcal{M}:X\to Y^{?}$ and $\mathcal{N}:Y\to X^{?}$ are pseudo-inverses if, for $x:X$ and $y:Y$, $\mathcal{M}\mathcal{N}\mathcal{M}(x)\asymp\mathcal{M}(x)$ and $\mathcal{N}\mathcal{M}\mathcal{N}(y)\asymp\mathcal{N}(y)$.
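The pseudo-inverse condition is easy to test on finite data. The following Python sketch is a hedged illustration, not part of the formal development: partial-functions are modelled by lookup tables, `None` stands for an undefined value, and the names `partial_fn`, `kleene_eq`, and `are_pseudo_inverses` are hypothetical. The relation $\asymp$ is read here as "both sides undefined, or both defined and equal".

```python
from typing import Callable, Dict, Hashable, Iterable, Optional

PartialFn = Callable[[Optional[Hashable]], Optional[Hashable]]


def partial_fn(table: Dict[Hashable, Hashable]) -> PartialFn:
    """Partial-function from a lookup table; undefined inputs propagate as None."""
    return lambda v: None if v is None else table.get(v)


def kleene_eq(s: Optional[Hashable], t: Optional[Hashable]) -> bool:
    """Both undefined, or both defined and equal."""
    if s is None or t is None:
        return s is None and t is None
    return s == t


def are_pseudo_inverses(M: PartialFn, N: PartialFn,
                        xs: Iterable[Hashable], ys: Iterable[Hashable]) -> bool:
    """Check M N M = M on xs and N M N = N on ys, up to Kleene equality."""
    return (all(kleene_eq(M(N(M(x))), M(x)) for x in xs)
            and all(kleene_eq(N(M(N(y))), N(y)) for y in ys))


X = [0, 1, 2]
Y = ["a", "b", "c", "d"]

M = partial_fn({0: "a", 1: "b"})              # undefined at 2
N = partial_fn({"a": 0, "b": 1, "c": 0})      # undefined at "d"
N_bad = partial_fn({"a": 1, "b": 0})          # swaps the two values

assert are_pseudo_inverses(M, N, X, Y)
assert not are_pseudo_inverses(M, N_bad, X, Y)
```

Note that `N` is not injective (both `"a"` and `"c"` map to `0`), yet `M` and `N` are pseudo-inverses; the condition only forces the two maps to invert each other on the images involved.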

Theorem 4.13.

Let $\mathsf{A}$ and $\mathsf{B}$ be categories.

  1. (a)

    If $\mathsf{A}$ and $\mathsf{B}$ are $(\mathsf{A},\mathsf{B})$-bicapsules and $\mathcal{M}:\mathsf{A}\to\mathsf{B}^{?}$ and $\mathcal{N}:\mathsf{B}\to\mathsf{A}^{?}$ are $(\mathsf{A},\mathsf{B})$-morphisms that are pseudo-inverses, then $\mathcal{F}:\mathsf{B}\dashv_{\Psi}\mathsf{A}:\mathcal{G}$ where

    \begin{align*}
    \mathcal{F}&:\mathsf{B}\to\mathsf{A},\quad\mathcal{F}(b):=\mathbb{1}_{\mathsf{A}}\cdot b,\\
    \mathcal{G}&:\mathsf{A}\to\mathsf{B},\quad\mathcal{G}(a):=a\cdot\mathbb{1}_{\mathsf{B}},
    \end{align*}

    and for $x:\mathsf{A}_{1}(\mathcal{F}(U),V)$ and $y:\mathsf{B}_{1}(U,\mathcal{G}(V))$ the bijections $\Psi_{UV}$ and $\Psi_{UV}^{-1}$ are given by

    \[\Psi_{UV}(x):=\mathcal{M}(x),\qquad\Psi_{UV}^{-1}(y):=\mathcal{N}(y).\]
  2. (b)

    If $\mathcal{F}:\mathsf{B}\dashv_{\Psi}\mathsf{A}:\mathcal{G}$ is an adjoint functor pair, then $\mathsf{A}$ and $\mathsf{B}$ are $(\mathsf{A},\mathsf{B})$-bicapsules with actions defined by

    \[a\cdot y:=\mathcal{G}(a)y\quad\text{and}\quad x\cdot b:=x\mathcal{F}(b)\]

    for $a,x:\mathsf{A}$ and $b,y:\mathsf{B}$, and $\Psi$ yields a pair of $(\mathsf{A},\mathsf{B})$-morphisms $\mathcal{M}:\mathsf{A}\to\mathsf{B}^{?}$ and $\mathcal{N}:\mathsf{B}\to\mathsf{A}^{?}$ that are pseudo-inverses, where

    \begin{align*}
    \mathcal{M}(x)&:=\Psi_{UV}(x),& x&:\mathsf{A}\cdot\mathbb{1}_{\mathsf{B}}:=\bigsqcup_{\operatorname{id}_{V}:\mathbb{1}_{\mathsf{A}}}\bigsqcup_{\operatorname{id}_{U}:\mathbb{1}_{\mathsf{B}}}\mathsf{A}_{1}(\mathcal{F}(U),V),\\
    \mathcal{N}(y)&:=\Psi^{-1}_{UV}(y),& y&:\mathbb{1}_{\mathsf{A}}\cdot\mathsf{B}:=\bigsqcup_{\operatorname{id}_{V}:\mathbb{1}_{\mathsf{A}}}\bigsqcup_{\operatorname{id}_{U}:\mathbb{1}_{\mathsf{B}}}\mathsf{B}_{1}(U,\mathcal{G}(V)).
    \end{align*}
Proof.

First we prove (a). Since $\mathsf{A}$ and $\mathsf{B}$ are $(\mathsf{A},\mathsf{B})$-bicapsules, by Proposition 4.6 there are functors $\mathcal{F}:\mathsf{B}\to\mathsf{A}$ and $\mathcal{G}:\mathsf{A}\to\mathsf{B}$ defining the right $\mathsf{B}$-capsule $\mathsf{A}_{\mathsf{B}}$ and the left $\mathsf{A}$-capsule ${}_{\mathsf{A}}\mathsf{B}$ respectively. Since $\mathcal{M}$ and $\mathcal{N}$ are pseudo-inverses and capsule actions are full, $\mathcal{M}$ inverts $\mathcal{N}$ on $\mathsf{A}\cdot\mathbb{1}_{\mathsf{B}}$ and $\mathcal{N}$ inverts $\mathcal{M}$ on $\mathbb{1}_{\mathsf{A}}\cdot\mathsf{B}$. For objects $U$ of $\mathsf{B}$ and $V$ of $\mathsf{A}$, let $e=\operatorname{id}_{U}$ and $f=\operatorname{id}_{V}$. For $x:\mathsf{A}_{1}(\mathcal{F}(U),V)=f\mathsf{A}\cdot e$ (see Remark 3.19), we define $\Psi_{UV}(x):=\mathcal{M}(x)$. Therefore, for $y:\mathsf{B}_{1}(U,\mathcal{G}(V))=f\cdot\mathsf{B}e$, the map $y\mapsto\mathcal{N}(y)$ inverts $\Psi_{UV}$, so the result follows.

Now we prove (b). By Proposition 4.6, we can exchange functors for capsules, so $\mathcal{F}:\mathsf{B}\to\mathsf{A}$ affords a right $\mathsf{B}$-capsule $\mathsf{A}_{\mathsf{B}}$. We enrich this action by adding the left regular action by $\mathsf{A}$ to produce an $(\mathsf{A},\mathsf{B})$-bicapsule ${}_{\mathsf{A}}\mathsf{A}_{\mathsf{B}}$. We do likewise with $\mathcal{G}:\mathsf{A}\to\mathsf{B}$, producing a second $(\mathsf{A},\mathsf{B})$-bicapsule ${}_{\mathsf{A}}\mathsf{B}_{\mathsf{B}}$.

To encode $\Psi$, we define an $(\mathsf{A},\mathsf{B})$-morphism $\mathcal{M}:\mathsf{A}\to\mathsf{B}^{?}$ by $\mathcal{M}(x):=\Psi_{UV}(x)$ for $x:\mathsf{A}_{1}(\mathcal{F}(U),V)$. This defines $\mathcal{M}$ on

\[\bigsqcup_{U:\mathsf{B}_{0}}\bigsqcup_{V:\mathsf{A}_{0}}\mathsf{A}_{1}(\mathcal{F}(U),V)=\bigsqcup_{e:\mathbb{1}_{\mathsf{B}}}\bigsqcup_{f:\mathbb{1}_{\mathsf{A}}}f\mathsf{A}\cdot e=\mathsf{A}\cdot\mathbb{1}_{\mathsf{B}}.\]

For all other values, $\mathcal{M}$ is undefined. Now (4.3) shows that on $\mathsf{A}\cdot\mathbb{1}_{\mathsf{B}}$ with $a:\mathsf{A}_{1}(V,Y)$, $b:\mathsf{B}_{1}(X,U)$, and $x:\mathsf{A}_{1}(\mathcal{F}(U),V)$,

\[\mathcal{M}(ax\cdot b)=\Psi_{XY}(ax\mathcal{F}(b))=\mathcal{G}(a)\Psi_{UV}(x)b=a\cdot\mathcal{M}(x)b,\]

so $\mathcal{M}$ is an $(\mathsf{A},\mathsf{B})$-morphism. We define $\mathcal{N}:\mathsf{B}\to\mathsf{A}^{?}$ analogously: if $y:\mathbb{1}_{\mathsf{A}}\cdot\mathsf{B}$, then $\mathcal{N}(y):=\Psi^{-1}(y)$ (for suitable subscripts of $\Psi$), and otherwise $\mathcal{N}(y)$ is undefined. Therefore, for $x:\mathsf{A}\cdot\mathbb{1}_{\mathsf{B}}$ and $y:\mathbb{1}_{\mathsf{A}}\cdot\mathsf{B}$,

\begin{align*}
(\mathcal{M}\mathcal{N}\mathcal{M})(x)&=\Psi(\Psi^{-1}(\Psi(x)))=\Psi(x)=\mathcal{M}(x),\\
(\mathcal{N}\mathcal{M}\mathcal{N})(y)&=\Psi^{-1}(\Psi(\Psi^{-1}(y)))=\Psi^{-1}(y)=\mathcal{N}(y).\qed
\end{align*}

4.6. A computational model for natural transformations

We use the algebraic perspective of Section 3.6 to discuss briefly a model for computing with natural transformations. The next definition formalizes how to treat morphisms of a category as functors between two other categories.

Definition 4.14.

Let 𝖭\mathsf{N}, 𝖠\mathsf{A} and 𝖡\mathsf{B} be abstract categories. A natural map of 𝖭\mathsf{N} from 𝖠\mathsf{A} to 𝖡\mathsf{B} consists of functions :𝟙𝖭×𝖠𝖡\cdot:\mathbb{1}_{\mathsf{N}}\times\mathsf{A}\to\mathsf{B} and :𝖭×𝟙𝖠𝖡\bullet:\mathsf{N}\times\mathbb{1}_{\mathsf{A}}\to\mathsf{B} that satisfy the following properties:
    (1) $(\forall x,y:\mathsf{A})$ $(\forall e:\mathbb{1}_{\mathsf{N}})$ $e\cdot(xy)=(e\cdot x)(e\cdot y)$ whenever $xy$ is defined;
    (2) $(\forall x:\mathsf{A})$ $(\forall e:\mathbb{1}_{\mathsf{N}})$ $e\cdot(x\mathbin{\blacktriangleleft})=(e\cdot x)\mathbin{\blacktriangleleft}$ and $e\cdot(\mathbin{\blacktriangleleft}x)=\mathbin{\blacktriangleleft}(e\cdot x)$;
    (3) $(\forall x:\mathsf{A})$ $(\forall s:\mathsf{N})$ $(s\bullet(\mathbin{\blacktriangleleft}x))((s\mathbin{\blacktriangleleft})\cdot x)=((\mathbin{\blacktriangleleft}s)\cdot x)(s\bullet(x\mathbin{\blacktriangleleft}))$;
    (4) $(\forall f:\mathbb{1}_{\mathsf{A}})$ $(\forall s,t:\mathsf{N})$ $(st)\bullet f=(s\bullet f)(t\bullet f)$ whenever $st$ is defined.

The directed equalities in (1) and (4) depend only on $xy$ and $st$, respectively, being defined. For the composition signature $\Omega$ from Example 3.9, the first two conditions of Definition 4.14 imply that there is a function $\mathbb{1}_{\mathsf{N}}\to\operatorname{Mor}_{\Omega}(\mathsf{A},\mathsf{B})$ given by $e\mapsto(x\mapsto e\cdot x)$, where $\operatorname{Mor}_{\Omega}(\mathsf{A},\mathsf{B})$ is the type of morphisms between abstract categories; see also Definition 3.20. This function enables us to treat the objects of $\mathsf{N}$ as functors from $\mathsf{A}$ to $\mathsf{B}$. As illustrated in Example 4.15, conditions (3) and (4) are equivalent to the commutative diagrams in Figure 3 in the shaded $(2,2)$ and $(3,1)$ entries, respectively.

Figure 3. A natural map of 𝖭\mathsf{N} (displayed in the left dotted column) from 𝖠\mathsf{A} (displayed in top row) to 𝖡\mathsf{B} (shaded gray)
Example 4.15.

We illustrate how the four conditions of Definition 4.14 translate to categories with objects and morphisms. Let $\mathsf{A}$ and $\mathsf{B}$ be two such categories, and let $\Omega$ be the composition signature from Example 3.9. Then $\operatorname{Mor}_{\Omega}(\mathsf{A},\mathsf{B})$ is the type of functors from $\mathsf{A}$ to $\mathsf{B}$. Let $\mathsf{N}$ be the category whose objects are the functors in $\operatorname{Mor}_{\Omega}(\mathsf{A},\mathsf{B})$ and whose morphisms are natural transformations. Let $\eta:\mathcal{F}\Rightarrow\mathcal{G}$ be a natural transformation between $\mathcal{F},\mathcal{G}:\operatorname{Mor}_{\Omega}(\mathsf{A},\mathsf{B})$. Treating $\mathsf{N}$ as an abstract category, the guards are defined as follows: $\eta\mathbin{\blacktriangleleft}:=\operatorname{id}_{\mathcal{F}}:\mathcal{F}\Rightarrow\mathcal{F}$ and $\mathbin{\blacktriangleleft}\eta:=\operatorname{id}_{\mathcal{G}}:\mathcal{G}\Rightarrow\mathcal{G}$. Define $\cdot:\mathbb{1}_{\mathsf{N}}\times\mathsf{A}\to\mathsf{B}$ by $(\operatorname{id}_{\mathcal{F}},\varphi)\longmapsto\mathcal{F}(\varphi)$, and $\bullet:\mathsf{N}\times\mathbb{1}_{\mathsf{A}}\to\mathsf{B}$ by $(\eta,\operatorname{id}_{X})\longmapsto\eta_{X}$. Now the conditions of Definition 4.14 become:
    (1) $(\forall a,b:\mathsf{A})$ $(\forall\operatorname{id}_{\mathcal{F}}:\mathbb{1}_{\mathsf{N}})$ $\mathcal{F}(ab)=\mathcal{F}(a)\mathcal{F}(b)$ whenever $ab$ is defined;
    (2) $(\forall a:\mathsf{A})$ $(\forall\operatorname{id}_{\mathcal{F}}:\mathbb{1}_{\mathsf{N}})$ $\mathcal{F}(a\mathbin{\blacktriangleleft})=(\mathcal{F}(a))\mathbin{\blacktriangleleft}$ and $\mathcal{F}(\mathbin{\blacktriangleleft}a)=\mathbin{\blacktriangleleft}(\mathcal{F}(a))$;
    (3) $(\forall(a:X\to Y):\mathsf{A})$ $(\forall(\eta:\mathcal{F}\Rightarrow\mathcal{G}):\mathsf{N})$ $\eta_{Y}\mathcal{F}(a)=\mathcal{G}(a)\eta_{X}$;
    (4) $(\forall\operatorname{id}_{X}:\mathbb{1}_{\mathsf{A}})$ $(\forall\eta,\epsilon:\mathsf{N})$ $(\eta\epsilon)_{X}=\eta_{X}\epsilon_{X}$ whenever $\eta\epsilon$ is defined. $\square$

The theory of functors and natural transformations is equivalent to that of natural maps on abstract categories, but the latter allows us to use multiple encodings of functors and natural transformations such as those available in computer algebra systems. If, for example, we compute the derived subgroup γ2(G)\gamma_{2}(G) of a group GG in Magma, then the system may use an encoding for γ2(G)\gamma_{2}(G) that differs from that supplied for GG. In such cases, Magma also returns an inclusion homomorphism λG:γ2(G)G\lambda_{G}:\gamma_{2}(G)\hookrightarrow G.
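The point about encodings can be sketched in a few lines of Python. This is not Magma code and does not reproduce Magma's interface; the function names (`derived_subgroup_with_inclusion`, `generated_subgroup`) and the choice to re-encode the subgroup by integer labels with a multiplication table are assumptions made for illustration. The sketch computes $\gamma_2(S_3)$, stores it in an encoding different from that of $S_3$, and returns the inclusion $\lambda_G$ alongside it.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(gens, identity):
    """Close gens under composition; in a finite group this is the generated subgroup."""
    elems, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems

def derived_subgroup_with_inclusion(group):
    """Return (multiplication table on labels 0..k-1, inclusion label -> permutation)."""
    group = list(group)
    identity = tuple(range(len(group[0])))
    commutators = {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in group for h in group
    }
    derived = sorted(generated_subgroup(commutators, identity))
    inclusion = dict(enumerate(derived))      # the new encoding forgets the permutations
    label = {perm: k for k, perm in inclusion.items()}
    size = len(derived)
    table = [[label[compose(inclusion[i], inclusion[j])] for j in range(size)]
             for i in range(size)]
    return table, inclusion

S3 = list(permutations(range(3)))
table, inclusion = derived_subgroup_with_inclusion(S3)

# The inclusion is a homomorphism from the re-encoded group back into S3.
is_homomorphism = all(
    inclusion[table[i][j]] == compose(inclusion[i], inclusion[j])
    for i in inclusion for j in inclusion
)
```

Here `table` alone carries no information about how the subgroup sits inside $S_3$; only the returned `inclusion` relates the two encodings, which is the role played by $\lambda_G$.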

5. The Extension Theorem

One of our goals is a categorification of characteristic subgroups and their analogues in eastern algebras. We start by translating the characteristic condition into the language of natural transformations.

5.1. Natural transformations express characteristic subgroups

Suppose that HH is a characteristic subgroup of a group GG. Hence, every automorphism φ:GG\varphi:G\to G restricts to an automorphism φ|H:HH\varphi|_{H}:H\to H of HH. In categorical terms, we now treat 𝖠\hstretch.13==𝖠𝗎𝗍(G)\mathsf{A}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathsf{Aut}(G) as the subcategory of 𝖦𝗋𝗉\mathsf{Grp} consisting of a single object GG and all isomorphisms GGG\to G. Likewise, we treat 𝖡\hstretch.13==𝖠𝗎𝗍(H)\mathsf{B}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathsf{Aut}(H) as a subcategory of 𝖦𝗋𝗉\mathsf{Grp}. The restriction defines a functor 𝒞:𝖠𝖡\mathcal{C}:\mathsf{A}\to\mathsf{B}. Of course, 𝖠𝗎𝗍(G)\mathsf{Aut}(G) and 𝖠𝗎𝗍(H)\mathsf{Aut}(H) are also groups and 𝒞\mathcal{C} is a group homomorphism, but the discussion below justifies the functor language.

Now we use the fact that HH is a subgroup of GG (by using the inclusion map ρG:HG\rho_{G}:H\hookrightarrow G). That φ(H)\varphi(H) is a subgroup of HH can be expressed as φρG=ρGφ|H=ρG𝒞(φ).\varphi\rho_{G}=\rho_{G}\varphi|_{H}=\rho_{G}\mathcal{C}(\varphi). Recognizing the different categories, we use the inclusion functors :𝖠𝖦𝗋𝗉\mathcal{I}:\mathsf{A}\to\mathsf{Grp} and 𝒥:𝖡𝖦𝗋𝗉\mathcal{J}:\mathsf{B}\to\mathsf{Grp} to deduce the following:

(φ)ρG=ρG𝒥𝒞(φ).\mathcal{I}(\varphi)\rho_{G}=\rho_{G}\mathcal{J}\mathcal{C}(\varphi).

Thus, a characteristic subgroup determines a natural transformation

ρ:𝒥𝒞.\rho:\mathcal{J}\mathcal{C}\Rightarrow\mathcal{I}.
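A small numerical check may help fix ideas. The sketch below is an illustration under stated assumptions, not part of the text: it takes $G=\mathbb{Z}/12$, encodes its unique (hence characteristic) subgroup of order 4 separately as $H=\mathbb{Z}/4$, and uses the inclusion $\rho_G(h)=3h$. The names `phi`, `restrict`, and `rho` are hypothetical. It verifies $\varphi\rho_G=\rho_G\mathcal{C}(\varphi)$ for every automorphism $\varphi$ of $G$, and that $\mathcal{C}$ respects composition.

```python
from math import gcd

N, M, STEP = 12, 4, 3      # G = Z/12, H = Z/4, rho(h) = 3h in Z/12

def rho(h):
    """Inclusion of H = Z/4 into G = Z/12."""
    return (STEP * h) % N

# Automorphisms of Z/12 are multiplications by units modulo 12.
units = [u for u in range(1, N) if gcd(u, N) == 1]

def phi(u):
    """The automorphism g -> u*g of G."""
    return lambda g: (u * g) % N

def restrict(u):
    """C(phi_u): the automorphism h -> u*h of H induced by phi_u."""
    return lambda h: (u * h) % M

# Naturality square: phi ∘ rho == rho ∘ C(phi) for every automorphism phi.
naturality_holds = all(
    phi(u)(rho(h)) == rho(restrict(u)(h))
    for u in units for h in range(M)
)

# Functoriality: C(phi_u ∘ phi_v) == C(phi_u) ∘ C(phi_v).
functoriality_holds = all(
    restrict((u * v) % N)(h) == restrict(u)(restrict(v)(h))
    for u in units for v in units for h in range(M)
)
```

Because $H$ is stored as $\mathbb{Z}/4$ instead of as a subset of $G$, the map `rho` is doing real work: it is the component $\rho_G$ of the natural transformation, not a formality.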

The next definition generalizes Definition 1.1.

Definition 5.1.

Fix an eastern variety 𝖤\mathsf{E} with subcategories 𝖠\mathsf{A} and 𝖡\mathsf{B} and inclusion functors :𝖠𝖤\mathcal{I}:\mathsf{A}\to\mathsf{E} and 𝒥:𝖡𝖤\mathcal{J}:\mathsf{B}\to\mathsf{E}. A counital is a natural transformation ρ:𝒥𝒞\rho:\mathcal{JC}\Rightarrow\mathcal{I} for some functor 𝒞:𝖠𝖡\mathcal{C}:\mathsf{A}\to\mathsf{B}. The counital ρ:𝒥𝒞\rho:\mathcal{JC}\Rightarrow\mathcal{I} is monic if ρX\rho_{X} is a monomorphism for all objects XX in 𝖠\mathsf{A}.

A common way to illustrate categories, functors, and natural transformations uses a 2-dimensional diagram where categories are vertices, functors are directed edges, and natural transformations are oriented 2-cells. The next diagram illustrates the counital discussed above.

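[Diagram: the 2-cell $\rho:\mathcal{J}\mathcal{C}\Rightarrow\mathcal{I}$ between the functors $\mathcal{I}:\mathsf{Aut}(G)\to\mathsf{Grp}$ and $\mathcal{J}\mathcal{C}$, where $\mathcal{C}:\mathsf{Aut}(G)\to\mathsf{Aut}(H)$ and $\mathcal{J}:\mathsf{Aut}(H)\to\mathsf{Grp}$.]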

We now generalize the notion of a characteristic subgroup to an arbitrary eastern algebra. Let $\mathsf{A}$ be a subcategory of an eastern variety $\mathsf{E}$. For $G:\mathsf{E}$, let $\mathsf{A}(G)$ be the category with a single object $G$ and morphisms $\mathsf{A}_{1}(G,G)$, so $\mathsf{Aut}(G)=\overset{\longleftrightarrow}{\mathsf{Grp}}(G)$. Observe that $\mathsf{A}(G)$ is a full subcategory of $\mathsf{A}$.

Definition 5.2.

Let G,H:𝖤G,H:\mathsf{E}, and let :𝖠(G)𝖠\mathcal{I}:\mathsf{A}(G)\to\mathsf{A} and 𝒥:𝖠(H)𝖠\mathcal{J}:\mathsf{A}(H)\to\mathsf{A} be inclusions. A monomorphism ι:HG\iota:H\hookrightarrow G is 𝖠\mathsf{A}-invariant if there is a functor 𝒞:𝖠(G)𝖠(H)\mathcal{C}:\mathsf{A}(G)\to\mathsf{A}(H) and a monic counital η:𝒥𝒞\eta:\mathcal{JC}\Rightarrow\mathcal{I} such that ηG\eta_{G} is equivalent to ι\iota (see Section 3.9 for the definition of equivalence).

Using the language of Definition 5.2, a characteristic subgroup $H$ of a group $G$ determines and is determined by a $\overset{\longleftrightarrow}{\mathsf{Grp}}$-invariant monomorphism $H\hookrightarrow G$. For fully invariant subgroups, the corresponding monomorphism is $\mathsf{Grp}$-invariant.

5.2. The extension problem and representation theory

In Section 5.1, we observed that a characteristic subgroup HH of GG determines a functor 𝒞:𝖠𝖡\mathcal{C}:\mathsf{A}\to\mathsf{B} and a natural transformation ρ:𝒥𝒞\rho:\mathcal{JC}\Rightarrow\mathcal{I}, where 𝖠\mathsf{A} and 𝖡\mathsf{B} are categories with one object, namely GG and HH respectively. If a group G´\acute{G} is isomorphic to GG, then by Fact 1.2 G´\acute{G} has a characteristic subgroup corresponding to HH. It seems plausible that we may be able to extend the functor 𝒞\mathcal{C} to more groups and, hence, to larger categories. We now make this notion of extension precise and generalize it to the setting of eastern algebras.

Fix an eastern variety 𝖤\mathsf{E}. Let 𝖠\mathsf{A}, 𝖡\mathsf{B}, and 𝖢\mathsf{C} be subcategories of 𝖤\mathsf{E} where 𝖠𝖢\mathsf{A}\leqslant\mathsf{C}. We have inclusion functors :𝖠𝖤\mathcal{I}:\mathsf{A}\to\mathsf{E}, 𝒥:𝖡𝖤\mathcal{J}:\mathsf{B}\to\mathsf{E}, 𝒦:𝖢𝖤\mathcal{K}:\mathsf{C}\to\mathsf{E} and :𝖠𝖢\mathcal{L}:\mathsf{A}\to\mathsf{C}, where =𝒦\mathcal{I}=\mathcal{K}\mathcal{L}. Suppose that ρ:𝒥𝒞\rho:\mathcal{J}\mathcal{C}\Rightarrow\mathcal{I} is a monic counital as depicted in Figure 4(a). The extension problem asks whether there is a functor 𝒟:𝖢𝖤\mathcal{D}:\mathsf{C}\to\mathsf{E} and a natural transformation σ:𝒟𝒦\sigma:\mathcal{D}\Rightarrow\mathcal{K} such that

(5.1) ρX=σ(X)τX\displaystyle\rho_{X}=\sigma_{\mathcal{L}(X)}\tau_{X}

for some invertible morphism τX:𝒥𝒞(X)𝒟(X)\tau_{X}:\mathcal{JC}(X)\to\mathcal{DL}(X) for all objects XX of 𝖠\mathsf{A}. This is depicted in Figure 4(b).
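One instance of the extension problem, offered as an illustration and not taken from the text, extends the derived-subgroup counital from a single group $G$ to all groups. The particular choices of $\mathsf{C}$, $\mathcal{D}$, $\sigma$, and $\tau$ below are assumptions made for this example; $\overset{\longleftrightarrow}{\mathsf{Grp}}$ is read here as the subcategory of groups and isomorphisms.

```latex
\begin{align*}
\mathsf{E} &= \mathsf{Grp}, \qquad
\mathsf{A} = \mathsf{Aut}(G), \qquad
\mathsf{B} = \mathsf{Aut}(\gamma_2(G)), \qquad
\mathsf{C} = \overset{\longleftrightarrow}{\mathsf{Grp}}, \\
\mathcal{D}(X) &= \gamma_2(X), \qquad
\mathcal{D}(\varphi) = \varphi|_{\gamma_2(X)}
  \quad\text{for an isomorphism } \varphi : X \to Y, \\
\sigma_X &: \gamma_2(X) \hookrightarrow X \quad\text{(inclusion)}, \qquad
\varphi\,\sigma_X = \sigma_Y\,\mathcal{D}(\varphi), \\
\tau_G &= \operatorname{id}_{\gamma_2(G)}, \qquad
\rho_G = \sigma_{\mathcal{L}(G)}\,\tau_G .
\end{align*}
```

In this instance $\tau$ is the identity because $\mathcal{JC}(G)$ and $\mathcal{DL}(G)$ are the same subgroup; in general (5.1) only asks that $\tau_X$ be invertible.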

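[Diagram (a): the counital $\rho:\mathcal{J}\mathcal{C}\Rightarrow\mathcal{I}$, with functors $\mathcal{I}:\mathsf{A}\to\mathsf{E}$, $\mathcal{C}:\mathsf{A}\to\mathsf{B}$, and $\mathcal{J}:\mathsf{B}\to\mathsf{E}$.]
(a) A diagram of a counital
[Diagram (b): the same data together with $\mathcal{L}:\mathsf{A}\to\mathsf{C}$, $\mathcal{K}:\mathsf{C}\to\mathsf{E}$, $\mathcal{D}:\mathsf{C}\to\mathsf{E}$, the natural transformation $\sigma:\mathcal{D}\Rightarrow\mathcal{K}$, and the invertible morphisms $\tau$.]
(b) A diagram of an extension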
Figure 4. Extending a counital

For now, we are concerned only with the existence and construction of such extensions. For use within an isomorphism test, it will be necessary to develop tools to compute efficiently with categories; the data types of Section 4.6 are designed for that purpose.

In light of Proposition 4.10, we can explore the natural transformations from Figure 4 through the lens of actions. Recall that concatenation always denotes regular actions. The natural transformation ρ\rho defined above is encoded as an 𝖠\mathsf{A}-bimorphism :𝖠𝖤\mathcal{R}:\mathsf{A}\to\mathsf{E}, where ρ=(𝟙𝖠)\rho=\mathcal{R}(\mathbb{1}_{\mathsf{A}}), and this bimorphism defines a cyclic 𝖠\mathsf{A}-bicapsule Δ\hstretch.13==𝖠ρ𝖠\Delta\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathsf{A}\rho\cdot\mathsf{A} via (4.2), which we fix throughout.

Our goal in part is to extend the cyclic 𝖠\mathsf{A}-bicapsule Δ\Delta to a cyclic 𝖢\mathsf{C}-bicapsule Σ\Sigma. Specifically, we will define Σ\hstretch.13==𝖢σ𝖢\Sigma\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathsf{C}\sigma\cdot\mathsf{C}, where σ:𝒟𝒦\sigma:\mathcal{D}\Rightarrow\mathcal{K} is depicted in Figure 4(b). This is the content of Theorem 5.4, but given in the general setting of eastern algebras. By construction (Proposition 4.10), the left actions on Δ\Delta and Σ\Sigma are regular; hence, we focus on right actions.

Example 5.3.

For the purposes of illustration, we consider a familiar construction that is similar to our context, namely Frobenius reciprocity and Morita condensation [Rowen, Theorem 25A.19]. Here EE is a ring, and AA and CC are subrings. Considering a Peirce decomposition of EE, let 𝟙A\mathbb{1}_{A} and 𝟙C\mathbb{1}_{C} be idempotents in EE such that 𝟙A𝟙C=𝟙A=𝟙C𝟙A\mathbb{1}_{A}\mathbb{1}_{C}=\mathbb{1}_{A}=\mathbb{1}_{C}\mathbb{1}_{A}. Then C=𝟙CE𝟙CC=\mathbb{1}_{C}E\mathbb{1}_{C} is a (non-unital) subring of EE, and A=𝟙AE𝟙AA=\mathbb{1}_{A}E\mathbb{1}_{A} is a subring of both CC and EE. Furthermore, 𝟙AE𝟙C=𝟙AC\mathbb{1}_{A}E\mathbb{1}_{C}=\mathbb{1}_{A}C is an (A,C)(A,C)-bimodule and C𝟙A=𝟙CE𝟙AC\mathbb{1}_{A}=\mathbb{1}_{C}E\mathbb{1}_{A} is a (C,A)(C,A)-bimodule. Suppose Δ\Delta is a right AA-module and Σ\Sigma a right CC-module. The theory of induction and restriction provides us, respectively, with a right CC-module and a right AA-module: namely,

IndAC(Δ)\hstretch.13==ΔA(𝟙AC)andResAC(Σ)\hstretch.13==ΣC(C𝟙A).\mathrm{Ind}^{C}_{A}(\Delta)\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\Delta\otimes_{A}(\mathbb{1}_{A}C)\quad\text{and}\quad\mathrm{Res}^{C}_{A}(\Sigma)\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\Sigma\otimes_{C}(C\mathbb{1}_{A}).

Thus, $A\to(\mathbb{1}_{A}C)\otimes_{C}(C\mathbb{1}_{A})$ yields a map $\Delta\cong\Delta\otimes_{A}A\to\mathrm{Res}_{A}^{C}(\mathrm{Ind}_{A}^{C}(\Delta))$. If, for example, $C\mathbb{1}_{A}C=C$, then $\Delta\cong\mathrm{Res}_{A}^{C}(\mathrm{Ind}_{A}^{C}(\Delta))$. $\square$
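The Peirce corners in Example 5.3 can be checked on a concrete ring. The following sketch assumes $E=M_3(\mathbb{Z})$ with diagonal idempotents $\mathbb{1}_A=\mathrm{diag}(1,0,0)$ and $\mathbb{1}_C=\mathrm{diag}(1,1,0)$; these choices, and the helper names `unit`, `diag`, and `corner`, are illustrative and not from the text. Each corner is represented by the matrix units that span it.

```python
def matmul(x, y):
    n = len(x)
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def unit(i, j, n=3):
    """Matrix unit with a single 1 in position (i, j)."""
    return tuple(tuple(1 if (r, c) == (i, j) else 0 for c in range(n)) for r in range(n))

def diag(entries):
    n = len(entries)
    return tuple(tuple(entries[r] if r == c else 0 for c in range(n)) for r in range(n))

ZERO = diag((0, 0, 0))
one_A = diag((1, 0, 0))
one_C = diag((1, 1, 0))
basis_E = [unit(i, j) for i in range(3) for j in range(3)]

def corner(left, right):
    """Nonzero products left*x*right over the matrix units x; they span left*E*right."""
    return {matmul(matmul(left, x), right) for x in basis_E} - {ZERO}

A = corner(one_A, one_A)          # 1_A E 1_A
C = corner(one_C, one_C)          # 1_C E 1_C
A_C = corner(one_A, one_C)        # 1_A C, an (A, C)-bimodule
C_A = corner(one_C, one_A)        # C 1_A, a (C, A)-bimodule

# Spanning set of C 1_A C, built from the spanning set of C.
C_oneA_C = {matmul(matmul(c, one_A), d) for c in C for d in C} - {ZERO}
```

With these choices `C_oneA_C == C`, so the hypothesis $C\mathbb{1}_AC=C$ of the example holds; this reflects the fact that the corner $C$ is a full $2\times 2$ matrix ring.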

Guided by the Peirce decomposition from Example 5.3, we seek similar constructions for categories and capsules. Recall that $\mathsf{E}$ contains a subcategory $\mathsf{C}$ that contains a subcategory $\mathsf{A}$. This containment implies that $\mathbb{1}_{\mathsf{A}}$ is contained in (or rather, embeds under the inclusion functors into) $\mathbb{1}_{\mathsf{C}}$. The bicapsule action of $\mathsf{A}$ on $\Delta$ induces a $\mathsf{C}$-bicapsule, denoted $\mathrm{Ind}_{\mathsf{A}}^{\mathsf{C}}(\Delta)$. By mimicking modules, we can consider a formal extension process. We form the type $\Delta\otimes_{\mathsf{A}}\mathsf{C}$ whose terms are pairs, denoted $\delta\otimes c$ for $\delta:\Delta$ and $c:\mathsf{C}$, subject to the equivalence relation $(\delta\cdot a)\otimes c=\delta\otimes(ac)$. Then we equip this type with the right $\mathsf{C}$-action $(\delta\otimes c)\cdot c^{\prime}=\delta\otimes(cc^{\prime})$. Defining $\mathsf{A}\backslash\mathsf{C}:=\{\mathsf{A}c\mid c:\mathsf{C}\}$, we write

Δ𝖠𝖢\displaystyle\Delta\otimes_{\mathsf{A}}\mathsf{C} =𝖠c:𝖠\𝖢Δ𝖠𝖠c.\displaystyle=\coprod_{\mathsf{A}c:\mathsf{A}\backslash\mathsf{C}}\Delta\otimes_{\mathsf{A}}\mathsf{A}c.
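The formal tensor construction can be tried out in the simplest one-object case, where the categories are groups. The sketch below is an assumption-laden toy, not the construction of the text: it takes $\mathsf{C}=\mathbb{Z}/4$, $\mathsf{A}=\{0,2\}$, and a two-point right $\mathsf{A}$-set $\Delta$ on which the element 2 swaps the points. It forms the pairs $\delta\otimes c$ modulo $(\delta\cdot a)\otimes c=\delta\otimes(ac)$ with a small union-find, and groups the classes by the cosets $\mathsf{A}c$ as in the displayed decomposition.

```python
from itertools import product

C = range(4)        # the cyclic group Z/4, written additively
A = (0, 2)          # the subgroup {0, 2}
Delta = (0, 1)      # a two-element right A-set (hypothetical example data)

def act(delta, a):
    """Right action of A on Delta: the element 2 swaps the two points."""
    return (delta + a // 2) % 2

parent = {pair: pair for pair in product(Delta, C)}

def find(pair):
    while parent[pair] != pair:
        pair = parent[pair]
    return pair

# Impose (delta . a) (x) c = delta (x) (a c) for all delta, a, c.
for delta, a, c in product(Delta, A, C):
    left, right = find((act(delta, a), c)), find((delta, (a + c) % 4))
    parent[left] = right

def tensor(delta, c):
    """Canonical representative of the class of delta (x) c."""
    return find((delta, c))

classes = {tensor(delta, c) for delta, c in product(Delta, C)}

def coset(c):
    return frozenset((a + c) % 4 for a in A)

# Group the classes by the coset A c of their second coordinate.
by_coset = {}
for delta, c in product(Delta, C):
    by_coset.setdefault(coset(c), set()).add(tensor(delta, c))

# The right C-action (delta (x) c) . c' = delta (x) (c c') respects the relation.
action_well_defined = all(
    tensor(d1, (c1 + k) % 4) == tensor(d2, (c2 + k) % 4)
    for (d1, c1), (d2, c2) in product(product(Delta, C), repeat=2)
    if tensor(d1, c1) == tensor(d2, c2)
    for k in C
)
```

The eight pairs collapse to four classes, two over each of the two cosets in $\mathsf{A}\backslash\mathsf{C}$, matching the coproduct indexed by $\mathsf{A}c:\mathsf{A}\backslash\mathsf{C}$.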

We return to this construction in Section 8. Finally, since Δ=𝖠ρ𝖠\Delta=\mathsf{A}\rho\cdot\mathsf{A} and 𝖢\mathsf{C} are both subtypes of 𝖤\mathsf{E}, the product in 𝖤\mathsf{E} defines a map Δ×𝖢𝖤\Delta\times\mathsf{C}\to\mathsf{E} that factors through Δ𝖠𝖢\Delta\otimes_{\mathsf{A}}\mathsf{C}. The image of the map is a cyclic 𝖢\mathsf{C}-bicapsule Σ\Sigma embedded in 𝖤\mathsf{E}, with corresponding 𝖢\mathsf{C}-bimorphism 𝒮:𝖢𝖤\mathcal{S}:\mathsf{C}\to\mathsf{E}. The following theorem states that this is always possible if 𝖠\mathsf{A} is full in 𝖢\mathsf{C}. Table 3 summarizes some of the notation fixed throughout this section.

\begin{tabular}{ll}
$\mathsf{E}$ & eastern variety \\
$\mathsf{B},\mathsf{C}$ & subcategories of $\mathsf{E}$ \\
$\mathsf{A}$ & full subcategory of $\mathsf{C}$ \\
\end{tabular}

\begin{tabular}{lll}
bimorphism & monic counital & cyclic bicapsule \\
\hline
$\mathcal{R}:\mathsf{A}\to\mathsf{E}$ & $\rho=\mathcal{R}(\mathbb{1}_{\mathsf{A}})$ & $\Delta=\mathsf{A}\rho\cdot\mathsf{A}$ \\
$\mathcal{S}:\mathsf{C}\to\mathsf{E}$ & $\sigma=\mathcal{S}(\mathbb{1}_{\mathsf{C}})$ & $\Sigma=\mathsf{C}\sigma\cdot\mathsf{C}$ \\
\end{tabular}
Table 3. Data for the proof of Theorem 5.4
Theorem 5.4 (Extension).

Let 𝖤,𝖢,𝖠,Δ\mathsf{E},\mathsf{C},\mathsf{A},\Delta be as in Table 3. If 𝖠\mathsf{A} is full in 𝖢\mathsf{C}, then there is a cyclic 𝖢\mathsf{C}-bicapsule Σ\mathsf{\Sigma} on 𝖤\mathsf{E} and unique cyclic 𝖠\mathsf{A}-bicapsules Υ\mathsf{\Upsilon}, Λ\mathsf{\Lambda} on 𝖤\mathsf{E} such that

Δ=Res𝖠𝖢(Σ)𝖠ΥandRes𝖠𝖢(Σ)=Δ𝖠Λ.\Delta=\mathrm{Res}_{\mathsf{A}}^{\mathsf{C}}(\Sigma)\otimes_{\mathsf{A}}\Upsilon\quad\text{and}\quad\mathrm{Res}_{\mathsf{A}}^{\mathsf{C}}(\Sigma)=\Delta\otimes_{\mathsf{A}}\Lambda.

We briefly describe the idea of the proof. We start with a cyclic 𝖠\mathsf{A}-bicapsule Δ\Delta with associated 𝖠\mathsf{A}-bimorphism :𝖠𝖤\mathcal{R}:\mathsf{A}\to\mathsf{E}. We seek an extension of \mathcal{R} to a 𝖢\mathsf{C}-bimorphism 𝒮:𝖢𝖤\mathcal{S}\colon\mathsf{C}\to\mathsf{E} that satisfies 𝒮=\mathcal{S}\mathcal{L}=\mathcal{R}, where :𝖠𝖢\mathcal{L}\colon\mathsf{A}\to\mathsf{C} is the inclusion functor. If this holds, then, for every e:𝟙𝖠e:\mathbb{1}_{\mathsf{A}} and every c:𝖢c:\mathsf{C} with c=(e)c\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}\mathcal{R}(e),

c(e)=c𝒮(e)=𝒮(ce)=𝒮(c)=𝒮(c)c.c\cdot\mathcal{R}(e)=c\cdot\mathcal{S}\mathcal{L}(e)=\mathcal{S}(c\cdot e)=\mathcal{S}(c)=\mathcal{S}(\mathbin{\blacktriangleleft}c)\cdot c.

We now derive some necessary conditions for a putative $\mathcal{S}$ of this type. Recall from (3.3) that the notation $\alpha\ll\beta$ for morphisms $\alpha$ and $\beta$ means that there is a morphism $\gamma$ such that $\alpha=\beta\gamma$. Applying Lemma 3.24 to $c\cdot\mathcal{R}(e)=\mathcal{S}(\mathbin{\blacktriangleleft}c)\cdot c$ yields $\mathrm{im}(c\cdot\mathcal{R}(e))\ll\mathrm{im}(\mathcal{S}(\mathbin{\blacktriangleleft}c))$. For $f:\mathbb{1}_{\mathsf{C}}$ define

𝕌𝖢(f)\hstretch.13==e:𝟙𝖠f𝖢e.\displaystyle\mathbb{U}_{\mathsf{C}}(f)\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\bigsqcup_{e:\mathbb{1}_{\mathsf{A}}}f\mathsf{C}\cdot e.

The left 𝖠\mathsf{A}-actions on 𝖢\mathsf{C} and 𝖤\mathsf{E} are regular, as is the left 𝖢\mathsf{C}-action on 𝖤\mathsf{E}. Hence, for e,c:𝕌𝖢(f)\langle e,c\rangle:\mathbb{U}_{\mathsf{C}}(f),

c=(c𝟙𝖤)=(c)𝟙𝖤=e𝟙𝖤.c\lhd=(c\cdot\mathbb{1}_{\mathsf{E}})\mathbin{\blacktriangleleft}=(c\mathbin{\blacktriangleleft})\cdot\mathbb{1}_{\mathsf{E}}=e\cdot\mathbb{1}_{\mathsf{E}}.

Therefore, c=e𝟙𝖤=(e(e))=(e)=(e)c\lhd=e\cdot\mathbb{1}_{\mathsf{E}}=\mathbin{\blacktriangleleft}(e\cdot\mathcal{R}(e))=\mathbin{\blacktriangleleft}\mathcal{R}(e)=\lhd\mathcal{R}(e), so c(e)c\cdot\mathcal{R}(e) is defined. Since im(c(e))im(𝒮(f))\mathrm{im}(c\cdot\mathcal{R}(e))\ll\mathrm{im}(\mathcal{S}(f)) holds for every e,c:𝕌𝖢(f)\langle e,c\rangle:\mathbb{U}_{\mathsf{C}}(f), we can make a single inclusion (see (3.5)):

(5.2) im(e,c𝕌𝖢(f)im(c(e)))im(𝒮(f)).\mathrm{im}\left(\coprod_{{\langle e,c\rangle\in\mathbb{U}_{\mathsf{C}}(f)}}\mathrm{im}(c\cdot\mathcal{R}(e))\right)\ll\mathrm{im}(\mathcal{S}(f)).
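The inclusion (5.2) has a transparent model in finite sets, where $\alpha\ll\beta$ amounts to the image of $\alpha$ lying inside the image of $\beta$, and the image of a coproduct of images is the union of the images. The sketch below works under that assumption only; the text's $\mathsf{E}$ is an eastern variety, and the maps in `family` are hypothetical stand-ins for the morphisms $c\cdot\mathcal{R}(e)$.

```python
def image(f):
    """Image of a finite function given as a dict."""
    return frozenset(f.values())

def precedes(small, big):
    """Model alpha << beta (alpha = beta∘gamma for some gamma) as image containment."""
    return image(small) <= image(big)

def join_of_images(maps):
    """Inclusion map of the union of the images; models im(coprod im(alpha_i))."""
    union = frozenset().union(*(image(f) for f in maps))
    return {y: y for y in union}

# Hypothetical stand-ins for the morphisms c . R(e) with <e, c> : U_C(f).
family = [{"p": 1, "q": 2}, {"r": 2, "s": 3}]
sigma_f = join_of_images(family)

# A candidate for S(f) through which every member of the family factors
# must also receive the join, which is what (5.2) records.
candidate = {"w": 1, "x": 2, "y": 3, "z": 4}
each_factors = all(precedes(alpha, candidate) for alpha in family)
join_factors = precedes(sigma_f, candidate)
```

In this model the join is the smallest subobject that every member of the family factors through, which is why it is a natural choice for the value of $\mathcal{S}$ at $f$.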

Observe that (5.2) also holds if, instead of $\mathcal{R}=\mathcal{SL}$, we assume that there exists $\mathcal{T}:\mathsf{A}\to\mathsf{E}$ such that $\mathcal{R}(a)=\mathrm{Res}_{\mathsf{A}}^{\mathsf{C}}(\mathcal{S})(a)\mathcal{T}(a\mathbin{\blacktriangleleft})$ for all $a:\mathsf{A}$; here $\mathrm{Res}_{\mathsf{A}}^{\mathsf{C}}(\mathcal{S})(a)=\mathcal{S}(\mathbb{1}_{\mathsf{C}}\cdot a)$ denotes the restriction of $\mathcal{S}$ to $\mathsf{A}$. This motivates us to choose $\mathcal{S}$ such that $\mathcal{S}(f)$, for $f:\mathbb{1}_{\mathsf{C}}$, is defined as the left-hand side of (5.2), and then solve for a suitable $\mathcal{T}$.

In the language of bimorphisms, Theorem 5.4 asserts that there exists an 𝖠\mathsf{A}-bimorphism 𝒯:𝖠𝖤\mathcal{T}\colon\mathsf{A}\to\mathsf{E} such that, for a:𝖠a:\mathsf{A},

(5.3) (a)\displaystyle\mathcal{R}(a) =𝒮(𝟙𝖢(a))𝒯(a)=𝒮(𝟙𝖢a)𝒯(a)\displaystyle=\mathcal{S}(\mathbb{1}_{\mathsf{C}}\cdot(\mathbin{\blacktriangleleft}a))\mathcal{T}(a)=\mathcal{S}(\mathbb{1}_{\mathsf{C}}\cdot a)\mathcal{T}(a\mathbin{\blacktriangleleft})

where 𝒮\mathcal{S} is the 𝖢\mathsf{C}-bimorphism corresponding to Σ\Sigma; we use this language in its proof. The second equality in (5.3) reflects the tensor product over 𝖠\mathsf{A} shown in Theorem 5.4.

5.3. Building blocks

We prove Theorem 5.4 in Section 5.4 using the three intermediate results presented in this section.

Lemma 5.5.

Let $\mathsf{C}$ and $\mathsf{E}$ be as in Table 3. For $\sigma:\prod_{f:\mathbb{1}_{\mathsf{C}}}(f\cdot\overset{\hookrightarrow}{\mathsf{E}})$, the following are equivalent.

    (1) There is a $\mathsf{C}$-bicapsule $\Sigma$ on $\mathsf{E}$ such that the function $\mathcal{S}:\mathsf{C}\to\Sigma$ given by $\mathcal{S}(c):=c\cdot\sigma_{c\mathbin{\blacktriangleleft}}$ is a $\mathsf{C}$-bimorphism.
    (2) For all $c:\mathsf{C}$, $c\cdot\sigma_{c\mathbin{\blacktriangleleft}}\ll\sigma_{\mathbin{\blacktriangleleft}c}$.

Proof.

We assume (1) holds and prove (2). By Proposition 4.10(b), there exists a unique functor 𝒢:𝖢𝖤\mathcal{G}:\mathsf{C}\to\mathsf{E} that induces the action of 𝖢\mathsf{C} on the right of 𝖤\mathsf{E}. Since 𝒮\mathcal{S} is a function and a 𝖢\mathsf{C}-bimorphism by assumption, c=σcc\lhd=\lhd\sigma_{c\mathbin{\blacktriangleleft}} for all c:𝖢c:\mathsf{C}, and

cσc\displaystyle c\cdot\sigma_{c\mathbin{\blacktriangleleft}} =𝒮(c)=𝒮((c)c)=𝒮(c)c=((c)σ(c))c=σc𝒢(c).\displaystyle=\mathcal{S}(c)=\mathcal{S}((\mathbin{\blacktriangleleft}c)\cdot c)=\mathcal{S}(\mathbin{\blacktriangleleft}c)\cdot c=((\mathbin{\blacktriangleleft}c)\cdot\sigma_{(\mathbin{\blacktriangleleft}c)\mathbin{\blacktriangleleft}})\cdot c=\sigma_{\mathbin{\blacktriangleleft}c}\mathcal{G}(c).

Thus, (2) holds.

We now assume (2) holds and prove (1). First, we show that an x:𝖤x:\mathsf{E} satisfying cσc=σcxc\cdot\sigma_{c\mathbin{\blacktriangleleft}}=\sigma_{\mathbin{\blacktriangleleft}c}x is unique. Suppose y:𝖤y:\mathsf{E} satisfies cσc=σcyc\cdot\sigma_{c\mathbin{\blacktriangleleft}}=\sigma_{\mathbin{\blacktriangleleft}c}y, so σcx=σcy\sigma_{\mathbin{\blacktriangleleft}c}x=\sigma_{\mathbin{\blacktriangleleft}c}y. Since σc\sigma_{\mathbin{\blacktriangleleft}c} is a monomorphism, x=yx=y. We denote this unique morphism by uc:𝖤u_{c}:\mathsf{E}. Since σcuc\sigma_{\mathbin{\blacktriangleleft}c}u_{c} is defined for all c:𝖢c:\mathsf{C},

uc=(σ(c))=(σc)=uc.\displaystyle\mathbin{\blacktriangleleft}u_{\mathbin{\blacktriangleleft}c}=(\sigma_{\mathbin{\blacktriangleleft}(\mathbin{\blacktriangleleft}c)})\mathbin{\blacktriangleleft}=(\sigma_{\mathbin{\blacktriangleleft}c})\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}u_{c}.
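The uniqueness of $u_c$ is the familiar fact that a map factors through a monomorphism in at most one way. A minimal sketch in finite sets, assumed here as a stand-in for $\mathsf{E}$, makes the construction explicit; the function name `factor_through_mono` and the sample data are hypothetical.

```python
def factor_through_mono(g, sigma):
    """Return the unique u with g = sigma∘u, or None if image(g) is not inside image(sigma).

    Functions are dicts; sigma must be injective (a monomorphism of finite sets).
    """
    if len(set(sigma.values())) != len(sigma):
        raise ValueError("sigma must be injective")
    preimage = {value: key for key, value in sigma.items()}
    if not set(g.values()) <= set(preimage):
        return None
    # Injectivity leaves exactly one choice for u(x), namely the preimage of g(x).
    return {x: preimage[y] for x, y in g.items()}

sigma = {"s0": 10, "s1": 20, "s2": 30}      # plays the role of the mono sigma
g = {"a": 20, "b": 20, "c": 10}             # plays the role of c . sigma
u = factor_through_mono(g, sigma)

factorization_holds = all(g[x] == sigma[u[x]] for x in g)
no_factorization = factor_through_mono({"a": 40}, sigma)
```

The dictionary comprehension has no freedom in it: each value of `u` is forced by injectivity of `sigma`, mirroring the cancellation step $\sigma x=\sigma y\Rightarrow x=y$ in the proof.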

Next, we define a right 𝖢\mathsf{C}-capsule structure on 𝖤\mathsf{E} as follows. Let ():𝖢𝟙𝖤\lhd(-):\mathsf{C}\to\mathbb{1}_{\mathsf{E}} be given by c=uc\lhd c=\mathbin{\blacktriangleleft}u_{c}, and let ():𝖤𝟙𝖤(-)\lhd:\mathsf{E}\to\mathbb{1}_{\mathsf{E}} be given by x=xx\lhd=x\mathbin{\blacktriangleleft}. For all c:𝖢c:\mathsf{C} and x:𝖤x:\mathsf{E}, let xc=xucx\cdot c=xu_{c}, which is defined if, and only if, x=ucx\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}u_{c}. Condition (2) of Definition 4.1 follows from (c)=uc=uc=c\lhd(\mathbin{\blacktriangleleft}c)=\mathbin{\blacktriangleleft}u_{\mathbin{\blacktriangleleft}c}=\mathbin{\blacktriangleleft}u_{c}=\lhd c and ue:𝟙𝖤u_{e}:\mathbb{1}_{\mathsf{E}} for all e:𝟙𝖢e:\mathbb{1}_{\mathsf{C}} since σe\sigma_{e} is monic. Lastly, let c,d:𝖢c,d:\mathsf{C} with c=dc\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}d. Then (cd)σ(cd)=σ(cd)ucd=σcucd(cd)\cdot\sigma_{(cd)\mathbin{\blacktriangleleft}}=\sigma_{\mathbin{\blacktriangleleft}(cd)}u_{cd}=\sigma_{\mathbin{\blacktriangleleft}c}u_{cd} and, since we have a regular left action,

(cd)σ(cd)=c(dσ(cd))=c(dσd)=cσdud=σcucud.\displaystyle(cd)\cdot\sigma_{(cd)\mathbin{\blacktriangleleft}}=c\cdot(d\cdot\sigma_{(cd)\mathbin{\blacktriangleleft}})=c\cdot(d\cdot\sigma_{d\mathbin{\blacktriangleleft}})=c\cdot\sigma_{\mathbin{\blacktriangleleft}d}u_{d}=\sigma_{\mathbin{\blacktriangleleft}c}u_{c}u_{d}.

Since $\sigma_{\mathbin{\blacktriangleleft}c}$ is a monomorphism, $u_{cd}=u_{c}u_{d}$. Hence, this defines a right $\mathsf{C}$-capsule on $\mathsf{E}$ since $\lhd c=\mathbin{\blacktriangleleft}u_{c}:\mathbb{1}_{\mathsf{E}}$ for all $c:\mathsf{C}$. Since $\mathsf{C}$ acts regularly on $\mathsf{E}$ on the left, there exists a $\mathsf{C}$-bicapsule $\Sigma$ on $\mathsf{E}$ by Proposition 4.6, with the regular left action and the right action just defined. Finally, we prove that $\mathcal{S}$ is a $\mathsf{C}$-bimorphism. For all $c,x,y:\mathsf{C}$,

𝒮(cx)\displaystyle\mathcal{S}(cx) =(cx)σ(cx)=(cx)σx=c(xσx)=c𝒮(x),\displaystyle=(cx)\cdot\sigma_{(cx)\mathbin{\blacktriangleleft}}=(cx)\cdot\sigma_{x\mathbin{\blacktriangleleft}}=c\cdot(x\cdot\sigma_{x\mathbin{\blacktriangleleft}})=c\cdot\mathcal{S}(x),

provided c=xc\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}x. If y=cy\mathbin{\blacktriangleleft}=\mathbin{\blacktriangleleft}c, then

𝒮(yc)\displaystyle\mathcal{S}(yc) =(yc)σ(yc)=(yc)σc=y(cσc)=y(σcuc)=(yσy)uc\displaystyle=(yc)\cdot\sigma_{(yc)\mathbin{\blacktriangleleft}}=(yc)\cdot\sigma_{c\mathbin{\blacktriangleleft}}=y\cdot(c\cdot\sigma_{c\mathbin{\blacktriangleleft}})=y\cdot(\sigma_{\mathbin{\blacktriangleleft}c}u_{c})=(y\cdot\sigma_{y\mathbin{\blacktriangleleft}})u_{c}
=𝒮(y)c.\displaystyle=\mathcal{S}(y)\cdot c.\qed
Lemma 5.6.

For $\mathsf{E},\mathsf{C},\mathcal{R}$ as in Table 3, define $\sigma:\prod_{f:\mathbb{1}_{\mathsf{C}}}(f\cdot\overset{\hookrightarrow}{\mathsf{E}})$ via

σf\hstretch.13==im(e,x:𝕌𝖢(f)im(x(e))).\sigma_{f}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathrm{im}\left(\coprod_{\langle e,x\rangle:\mathbb{U}_{\mathsf{C}}(f)}\mathrm{im}(x\cdot\mathcal{R}(e))\right).

For each c:𝖢c:\mathsf{C}, there is a unique y:𝖤y:\mathsf{E} satisfying cσc=σcyc\cdot\sigma_{c\mathbin{\blacktriangleleft}}=\sigma_{\mathbin{\blacktriangleleft}c}y.

Proof.

Let c:𝖢c:\mathsf{C}, so c=cc\lhd=c\mathbin{\blacktriangleleft}, and c=σcc\mathbin{\blacktriangleleft}=\lhd\sigma_{c\mathbin{\blacktriangleleft}} by definition of σ\sigma. We show that cσcσcc\cdot\sigma_{c\mathbin{\blacktriangleleft}}\ll\sigma_{\mathbin{\blacktriangleleft}c}. By Lemma 3.23,

(e,x:𝕌𝖢(f))\displaystyle(\forall\langle e,x\rangle:\mathbb{U}_{\mathsf{C}}(f)) cim(x(e))\displaystyle c\cdot\mathrm{im}(x\cdot\mathcal{R}(e)) im(c(x(e)))=im((cx)(e)).\displaystyle\ll\mathrm{im}(c\cdot(x\cdot\mathcal{R}(e)))=\mathrm{im}((cx)\cdot\mathcal{R}(e)).

Thus, by Fact 3.25(d),

(5.4) e,x:𝕌𝖢(f)(cim(x(e)))\displaystyle\coprod_{\langle e,x\rangle:\mathbb{U}_{\mathsf{C}}(f)}\left(c\cdot\mathrm{im}(x\cdot\mathcal{R}(e))\right) e,x:𝕌𝖢(f)im((cx)(e)).\displaystyle\ll\coprod_{\langle e,x\rangle:\mathbb{U}_{\mathsf{C}}(f)}\mathrm{im}((cx)\cdot\mathcal{R}(e)).

Therefore, using (3.5),

cim(e,xim(x(e)))\displaystyle c\cdot\mathrm{im}\left(\coprod_{\langle e,x\rangle}\mathrm{im}(x\cdot\mathcal{R}(e))\right) im(ce,xim(x(e)))\displaystyle\ll\mathrm{im}\left(c\cdot\coprod_{\langle e,x\rangle}\mathrm{im}(x\cdot\mathcal{R}(e))\right) (Lemma 3.23)\displaystyle(\text{Lemma~\ref{lem:im}})
=im(e,x(cim(x(e))))\displaystyle=\mathrm{im}\left(\coprod_{\langle e,x\rangle}(c\cdot\mathrm{im}(x\cdot\mathcal{R}(e)))\right) (Fact 3.25(c))\displaystyle(\text{Fact~\ref{fact:coprod}\ref{factpart:factor-out}})
im(e,xim((cx)(e)))\displaystyle\ll\mathrm{im}\left(\coprod_{\langle e,x\rangle}\mathrm{im}((cx)\cdot\mathcal{R}(e))\right) (Equation (5.4))\displaystyle(\text{Equation (\ref{eq:coprod-ineq})})

where all of the coproducts are over e,x:𝕌𝖢(c)\langle e,x\rangle:\mathbb{U}_{\mathsf{C}}(c\mathbin{\blacktriangleleft}). By Fact 3.25(e),

im(e,x:𝕌𝖢(c)im((cx)(e)))im(e,z:𝕌𝖢(c)im(z(e))).\mathrm{im}\left(\coprod_{\langle e,x\rangle:\mathbb{U}_{\mathsf{C}}(c\mathbin{\blacktriangleleft})}\mathrm{im}((cx)\cdot\mathcal{R}(e))\right)\ll\mathrm{im}\left(\coprod_{\langle e,z\rangle:\mathbb{U}_{\mathsf{C}}(\mathbin{\blacktriangleleft}c)}\mathrm{im}(z\cdot\mathcal{R}(e))\right).

Putting this together, we deduce that

cσc\displaystyle c\cdot\sigma_{c\mathbin{\blacktriangleleft}} =cim(e,x:𝕌𝖢(c)im(x(e)))im(e,z:𝕌𝖢(c)im(z(e)))=σc,\displaystyle=c\cdot\mathrm{im}\left(\coprod_{\langle e,x\rangle:\mathbb{U}_{\mathsf{C}}(c\mathbin{\blacktriangleleft})}\mathrm{im}(x\cdot\mathcal{R}(e))\right)\ll\mathrm{im}\left(\coprod_{\langle e,z\rangle:\mathbb{U}_{\mathsf{C}}(\mathbin{\blacktriangleleft}c)}\mathrm{im}(z\cdot\mathcal{R}(e))\right)=\sigma_{\mathbin{\blacktriangleleft}c},

so $c\cdot\sigma_{c\mathbin{\blacktriangleleft}}=\sigma_{\mathbin{\blacktriangleleft}c}y$ for some $y:\mathsf{E}$. Since $\sigma_{\mathbin{\blacktriangleleft}c}$ is monic, $y$ is unique. ∎

Proposition 5.7.

Let 𝖤,𝖢,𝖠,\mathsf{E},\mathsf{C},\mathsf{A},\mathcal{R} be as in Table 3. If 𝖠\mathsf{A} is full in 𝖢\mathsf{C}, then 𝒮:𝖢𝖤\mathcal{S}:\mathsf{C}\to\mathsf{E} defined by

𝒮(c)\hstretch.13==cim(e,x:𝕌𝖢(c)im(x(e)))\mathcal{S}(c)\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}c\cdot\mathrm{im}\left(\coprod_{\langle e,x\rangle:\mathbb{U}_{\mathsf{C}}(c\mathbin{\blacktriangleleft})}\mathrm{im}(x\cdot\mathcal{R}(e))\right)

is a 𝖢\mathsf{C}-bimorphism, and there exists a unique λ:f:𝟙𝖠f𝖤f\lambda:\prod_{f:\mathbb{1}_{\mathsf{A}}}f\!\!\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\!\!f such that for all a:𝖠a:\mathsf{A},

(a)=𝒮(a𝟙𝖢)λaand𝒮(a𝟙𝖢)=(a)λa1.\mathcal{R}(a)=\mathcal{S}(a\cdot\mathbb{1}_{\mathsf{C}})\lambda_{a\mathbin{\blacktriangleleft}}\quad\text{and}\quad\mathcal{S}(a\cdot\mathbb{1}_{\mathsf{C}})=\mathcal{R}(a)\lambda_{a\mathbin{\blacktriangleleft}}^{-1}.
Proof.

Take e,f:𝟙𝖠e,f:\mathbb{1}_{\mathsf{A}}, and recall that 𝖠\mathsf{A} acts regularly on both the left and right of 𝖢\mathsf{C}. Since 𝖠\mathsf{A} is full in 𝖢\mathsf{C}, these actions are full, so

f𝖢e=𝟙𝖢(f𝖠e)=(f𝖠e)𝟙𝖢.f\cdot\mathsf{C}\cdot e=\mathbb{1}_{\mathsf{C}}\cdot(f\mathsf{A}e)=(f\mathsf{A}e)\cdot\mathbb{1}_{\mathsf{C}}.

Since the left actions of 𝖠\mathsf{A} on 𝖢\mathsf{C} and on 𝖤\mathsf{E} and the left action of 𝖢\mathsf{C} on 𝖤\mathsf{E} are regular, for each a:𝖠a:\mathsf{A} and x:𝖤x:\mathsf{E} with a=xa\lhd=\lhd x,

(5.5) (a𝟙𝖢)x\displaystyle(a\cdot\mathbb{1}_{\mathsf{C}})\cdot x =ax.\displaystyle=a\cdot x.

Fix a:𝖠a:\mathsf{A}. Set c=a𝟙𝖢c=a\cdot\mathbb{1}_{\mathsf{C}}, so (c)𝖢=(a)𝖢(c\mathbin{\blacktriangleleft})\mathsf{C}=(a\mathbin{\blacktriangleleft})\cdot\mathsf{C}. Thus, since the 𝖠\mathsf{A}-action on 𝖢\mathsf{C} is full,

(5.6) 𝕌𝖢(c)\hstretch.13==e:𝟙𝖠((c)𝖢e)\displaystyle\mathbb{U}_{\mathsf{C}}(c\mathbin{\blacktriangleleft})\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\bigsqcup_{e:\mathbb{1}_{\mathsf{A}}}\left((c\mathbin{\blacktriangleleft})\mathsf{C}\cdot e\right) =e:𝟙𝖠((a)𝖢e)=((a)𝖠)𝟙𝖢,\displaystyle=\bigsqcup_{e:\mathbb{1}_{\mathsf{A}}}\left((a\mathbin{\blacktriangleleft})\cdot\mathsf{C}\cdot e\right)=((a\mathbin{\blacktriangleleft})\mathsf{A})\cdot\mathbb{1}_{\mathsf{C}},

where (a)𝖠(a\mathbin{\blacktriangleleft})\mathsf{A} acts on 𝟙𝖢\mathbb{1}_{\mathsf{C}} in the final expression. Therefore,

𝒮(a𝟙𝖢)\displaystyle\mathcal{S}(a\cdot\mathbb{1}_{\mathsf{C}}) =aim(e,x:𝕌𝖢((a)𝟙𝖢)im(x(e)))\displaystyle=a\cdot\mathrm{im}\left(\coprod_{\langle e,x\rangle:\mathbb{U}_{\mathsf{C}}((a\mathbin{\blacktriangleleft})\cdot\mathbb{1}_{\mathsf{C}})}\mathrm{im}(x\cdot\mathcal{R}(e))\right) (Equation (5.5))\displaystyle(\text{Equation \eqref{eqn:C-to-A}})
=aim(a:(a)𝖠im(a(a)))\displaystyle=a\cdot\mathrm{im}\left(\coprod_{a^{\prime}:(a\mathbin{\blacktriangleleft})\mathsf{A}}\mathrm{im}(a^{\prime}\cdot\mathcal{R}(a^{\prime}\mathbin{\blacktriangleleft}))\right) (Equation (5.6))\displaystyle(\text{Equation \eqref{eqn:subscripts}})
=aim(a:(a)𝖠im((a)a))\displaystyle=a\cdot\mathrm{im}\left(\coprod_{a^{\prime}:(a\mathbin{\blacktriangleleft})\mathsf{A}}\mathrm{im}(\mathcal{R}(a\mathbin{\blacktriangleleft})\cdot a^{\prime})\right) (𝖠-bimorphism, a=a)\displaystyle(\text{$\mathsf{A}$-bimorphism, $\mathbin{\blacktriangleleft}a^{\prime}=a\mathbin{\blacktriangleleft}$})
a(a)=(a).\displaystyle\ll a\cdot\mathcal{R}(a\mathbin{\blacktriangleleft})=\mathcal{R}(a). (Lemma 3.24, Fact 3.25(c))\displaystyle(\text{Lemma~\ref{lem:im-monic}, Fact~\ref{fact:coprod}\ref{factpart:factor-out}})

For the application of Lemma 3.24 in the last step, recall that (e)\mathcal{R}(e) is monic for e:𝖠e:\mathsf{A} by our assumption (Table 3).

We establish the other direction as follows:

(a)\displaystyle\mathcal{R}(a\mathbin{\blacktriangleleft}) im((a))=im((a)(a))\displaystyle\ll\mathrm{im}(\mathcal{R}(a\mathbin{\blacktriangleleft}))=\mathrm{im}((a\mathbin{\blacktriangleleft})\cdot\mathcal{R}(a\mathbin{\blacktriangleleft})) (Theorem 3.22)\displaystyle(\text{Theorem~\ref{thm:Noether}})
a:(a)𝖠im(a(a))\displaystyle\ll\coprod_{a^{\prime}:(a\mathbin{\blacktriangleleft})\mathsf{A}}\mathrm{im}(a^{\prime}\cdot\mathcal{R}(a^{\prime}\mathbin{\blacktriangleleft})) (Fact 3.25(e))\displaystyle(\text{Fact~\ref{fact:coprod}\ref{factpart:smaller-coprod}})
im(a:(a)𝖠im(a(a))).\displaystyle\ll\mathrm{im}\left(\coprod_{a^{\prime}:(a\mathbin{\blacktriangleleft})\mathsf{A}}\mathrm{im}(a^{\prime}\cdot\mathcal{R}(a^{\prime}\mathbin{\blacktriangleleft}))\right). (Theorem 3.22)\displaystyle(\text{Theorem~\ref{thm:Noether}})

Acting with a:𝖠a:\mathsf{A} from the left, we obtain (a)𝒮(a𝟙𝖢)\mathcal{R}(a)\ll\mathcal{S}(a\cdot\mathbb{1}_{\mathsf{C}}).

From both computations, there exist λ,μ:f:𝟙𝖠f𝖤f\lambda,\mu:\prod_{f:\mathbb{1}_{\mathsf{A}}}f\mathsf{E}f such that

(a)\displaystyle\mathcal{R}(a) =𝒮(a𝟙𝖢)λa,\displaystyle=\mathcal{S}(a\cdot\mathbb{1}_{\mathsf{C}})\lambda_{a\mathbin{\blacktriangleleft}}, 𝒮(a𝟙𝖢)\displaystyle\mathcal{S}(a\cdot\mathbb{1}_{\mathsf{C}}) =(a)μa.\displaystyle=\mathcal{R}(a)\mu_{a\mathbin{\blacktriangleleft}}.

It remains to show that μa=λa1\mu_{a\mathbin{\blacktriangleleft}}=\lambda_{a\mathbin{\blacktriangleleft}}^{-1} for all a:𝖠a:\mathsf{A} and λ\lambda is unique. For all e:𝟙𝖠e:\mathbb{1}_{\mathsf{A}}, (e)\mathcal{R}(e) is monic by the assumptions in Theorem 5.4, and 𝒮(e𝟙𝖢)\mathcal{S}(e\cdot\mathbb{1}_{\mathsf{C}}) is also monic by the definition of 𝒮\mathcal{S}. Since (e)\mathcal{R}(e) is monic,

(e)\displaystyle\mathcal{R}(e) =𝒮(e𝟙𝖢)λe=(e)μeλe\displaystyle=\mathcal{S}(e\cdot\mathbb{1}_{\mathsf{C}})\lambda_{e}=\mathcal{R}(e)\mu_{e}\lambda_{e}

implies μeλe:𝟙𝖤\mu_{e}\lambda_{e}:\mathbb{1}_{\mathsf{E}}. Similarly, λeμe:𝟙𝖤\lambda_{e}\mu_{e}:\mathbb{1}_{\mathsf{E}} because 𝒮(e𝟙𝖢)\mathcal{S}(e\cdot\mathbb{1}_{\mathsf{C}}) is monic and

𝒮(e𝟙𝖢)\displaystyle\mathcal{S}(e\cdot\mathbb{1}_{\mathsf{C}}) =(e)μe=𝒮(e𝟙𝖢)λeμe.\displaystyle=\mathcal{R}(e)\mu_{e}=\mathcal{S}(e\cdot\mathbb{1}_{\mathsf{C}})\lambda_{e}\mu_{e}.

The uniqueness of λ\lambda follows since (e)\mathcal{R}(e) is monic. ∎

5.4. Proof of Theorem 5.4

Let 𝒮:𝖢𝖤\mathcal{S}:\mathsf{C}\to\mathsf{E} be the 𝖢\mathsf{C}-bimorphism in Proposition 5.7. This proposition shows that there exists a unique λ:f:𝟙𝖠f𝖤f\lambda:\prod_{f:\mathbb{1}_{\mathsf{A}}}f\!\!\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\!\!f such that for all a:𝖠a:\mathsf{A},

(5.7) (a)\displaystyle\mathcal{R}(a) =𝒮(a𝟙𝖢)λa.\displaystyle=\mathcal{S}(a\cdot\mathbb{1}_{\mathsf{C}})\lambda_{a\mathbin{\blacktriangleleft}}.

Since \mathcal{R} is an 𝖠\mathsf{A}-bimorphism,

(5.8) (a𝟙𝖤)(a)\displaystyle(a\cdot\mathbb{1}_{\mathsf{E}})\cdot\mathcal{R}(a\mathbin{\blacktriangleleft}) =(a)=(a)(𝟙𝖤a).\displaystyle=\mathcal{R}(a)=\mathcal{R}(\mathbin{\blacktriangleleft}a)\cdot(\mathbb{1}_{\mathsf{E}}\cdot a).

Let Σ\Sigma be the 𝖢\mathsf{C}-bicapsule on 𝖤\mathsf{E} defined by the 𝖢\mathsf{C}-bimorphism 𝒮\mathcal{S}. Both Δ\Delta and Σ\Sigma are bicapsules, so applying (5.7) to (5.8) yields

(5.9) (a𝟙𝖤)𝒮((a)𝟙𝖢)λa\displaystyle(a\cdot\mathbb{1}_{\mathsf{E}})\mathcal{S}((a\mathbin{\blacktriangleleft})\cdot\mathbb{1}_{\mathsf{C}})\lambda_{a\mathbin{\blacktriangleleft}} =𝒮((a)𝟙𝖢)λa(𝟙𝖤a).\displaystyle=\mathcal{S}((\mathbin{\blacktriangleleft}a)\cdot\mathbb{1}_{\mathsf{C}})\lambda_{\mathbin{\blacktriangleleft}a}(\mathbb{1}_{\mathsf{E}}\cdot a).

Since the left 𝖢\mathsf{C}-action on 𝖤\mathsf{E} and the left 𝖠\mathsf{A}-action on 𝖢\mathsf{C} are regular, a𝟙𝖤=(a𝟙𝖢)𝟙𝖤a\cdot\mathbb{1}_{\mathsf{E}}=(a\cdot\mathbb{1}_{\mathsf{C}})\cdot\mathbb{1}_{\mathsf{E}}. Thus, (a𝟙𝖤)𝒮((a)𝟙𝖢)=𝒮(a𝟙𝖢)=𝒮((a)𝟙𝖢)(𝟙𝖤(𝟙𝖢a))(a\cdot\mathbb{1}_{\mathsf{E}})\mathcal{S}((a\mathbin{\blacktriangleleft})\cdot\mathbb{1}_{\mathsf{C}})=\mathcal{S}(a\cdot\mathbb{1}_{\mathsf{C}})=\mathcal{S}((\mathbin{\blacktriangleleft}a)\cdot\mathbb{1}_{\mathsf{C}})(\mathbb{1}_{\mathsf{E}}\cdot(\mathbb{1}_{\mathsf{C}}\cdot a)). Applying this to (5.9) and using the monic property of 𝒮((a)𝟙𝖢)\mathcal{S}((\mathbin{\blacktriangleleft}a)\cdot\mathbb{1}_{\mathsf{C}}), we deduce that

(𝟙𝖤(𝟙𝖢a))λa\displaystyle(\mathbb{1}_{\mathsf{E}}\cdot(\mathbb{1}_{\mathsf{C}}\cdot a))\lambda_{a\mathbin{\blacktriangleleft}} =λa(𝟙𝖤a).\displaystyle=\lambda_{\mathbin{\blacktriangleleft}a}(\mathbb{1}_{\mathsf{E}}\cdot a).

Since the actions are capsules, λ\lambda defines a natural transformation. By Proposition 4.10(a), the function a(𝟙𝖤(𝟙𝖢a))λaa\mapsto(\mathbb{1}_{\mathsf{E}}\cdot(\mathbb{1}_{\mathsf{C}}\cdot a))\lambda_{a\mathbin{\blacktriangleleft}} defines an 𝖠\mathsf{A}-bimorphism 𝒯:𝖠𝖤\mathcal{T}:\mathsf{A}\to\mathsf{E}. Thus, 𝒯(a)=λa:𝖤\mathcal{T}(a\mathbin{\blacktriangleleft})=\lambda_{a\mathbin{\blacktriangleleft}}:\,\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}, and therefore

(a)\displaystyle\mathcal{R}(a) =𝒮(𝟙𝖢a)𝒯(a)=𝒮(𝟙𝖢(a))(𝟙𝖤(𝟙𝖢a))𝒯(a)=𝒮(𝟙𝖢(a))𝒯(a).\displaystyle=\mathcal{S}(\mathbb{1}_{\mathsf{C}}\cdot a)\mathcal{T}(a\mathbin{\blacktriangleleft})=\mathcal{S}(\mathbb{1}_{\mathsf{C}}\cdot(\mathbin{\blacktriangleleft}a))(\mathbb{1}_{\mathsf{E}}\cdot(\mathbb{1}_{\mathsf{C}}\cdot a))\mathcal{T}(a\mathbin{\blacktriangleleft})=\mathcal{S}(\mathbb{1}_{\mathsf{C}}\cdot(\mathbin{\blacktriangleleft}a))\mathcal{T}(a).

The uniqueness of 𝒯\mathcal{T} follows from Proposition 4.10(b) and the uniqueness of λ\lambda. ∎

5.5. Proof of Theorem 1 for eastern varieties

Recall that Counital(𝖡,𝖤)\text{Counital}(\mathsf{B},\mathsf{E}) denotes the type of all counitals ι:𝒦𝒞\iota:\mathcal{K}\mathcal{C}\Rightarrow\mathcal{I}, where 𝖡𝖤\mathsf{B}\leqslant\mathsf{E} and 𝖢\mathsf{C} are categories, 𝒞:𝖡𝖢\mathcal{C}:\mathsf{B}\to\mathsf{C} is a functor, and :𝖡𝖤\mathcal{I}:\mathsf{B}\to\mathsf{E} and 𝒦:𝖢𝖤\mathcal{K}:\mathsf{C}\to\mathsf{E} are inclusion functors. Let Unital(𝖡,𝖤)\text{Unital}(\mathsf{B},\mathsf{E}) be the type of all unitals π:𝒦𝒞\pi:\mathcal{I}\Rightarrow\mathcal{K}\mathcal{C}; these are the duals of counitals. Recall from Theorem 3.22 that im\mathrm{im} and coim\mathrm{coim} produce categorical morphisms, and from Section 3.9 the equivalence relations on monomorphisms and epimorphisms. Our use of set theory notation in the following generalization of Theorem 1 is justified because we compare subsets of a fixed algebra.

Theorem 1-cat.

Let 𝖤\mathsf{E} be an eastern variety. For every G:𝖤G:\mathsf{E}, the following equalities of sets hold up to equivalence:

\begin{align}
\{\iota:H\hookrightarrow G\mid\iota\text{ is $\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}$-invariant}\}&=\left\{\mathrm{im}(\eta_{G})\mid\eta:\text{Counital}\left(\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}},\mathsf{E}\right)\right\};\tag{1}\\
\{\pi:G\twoheadrightarrow Q\mid\pi\text{ is $\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}$-invariant}\}&=\left\{\mathrm{coim}(\tau_{G})\mid\tau:\text{Unital}\left(\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}},\mathsf{E}\right)\right\};\tag{2}\\
\{\iota:H\hookrightarrow G\mid\iota\text{ is $\mathsf{E}$-invariant}\}&=\left\{\mathrm{im}(\eta_{G})\mid\eta:\text{Counital}(\mathsf{E},\mathsf{E})\right\};\tag{3}\\
\{\pi:G\twoheadrightarrow Q\mid\pi\text{ is $\mathsf{E}$-invariant}\}&=\left\{\mathrm{coim}(\tau_{G})\mid\tau:\text{Unital}(\mathsf{E},\mathsf{E})\right\}.\tag{4}
\end{align}
Proof.

We prove (1) in detail; the proof of (3) is analogous but requires replacing 𝖤\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}} with 𝖤\mathsf{E}. The proofs of (2) and (4) are dual to the proofs of (1) and (3), respectively.

Let ι:HG\iota:H\hookrightarrow G, which is a morphism in 𝖤\mathsf{E}. Recall that the single-object category 𝖠𝗎𝗍(G)\mathsf{Aut}(G) consists of GG and all its automorphisms, and likewise for 𝖠𝗎𝗍(H)\mathsf{Aut}(H). Both are subcategories of 𝖤\mathsf{E}, and full subcategories of 𝖤\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}. We denote the relevant inclusion functors by :𝖠𝗎𝗍(G)𝖤\mathcal{I}:\mathsf{Aut}(G)\to\mathsf{E}, 𝒥:𝖠𝗎𝗍(H)𝖤\mathcal{J}:\mathsf{Aut}(H)\to\mathsf{E}, :𝖠𝗎𝗍(G)𝖤\mathcal{L}:\mathsf{Aut}(G)\to\;\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}, and 𝒦:𝖤𝖤\mathcal{K}:\;\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\;\to\mathsf{E}. As in Section 5.1, we obtain a natural transformation ρ:𝒥𝒞\rho:\mathcal{J}\mathcal{C}\Rightarrow\mathcal{I} with (restriction) functor 𝒞:𝖠𝗎𝗍(G)𝖠𝗎𝗍(H)\mathcal{C}:\mathsf{Aut}(G)\to\mathsf{Aut}(H), so ρ:Counital(𝖠𝗎𝗍(G),𝖤)\rho:\text{Counital}(\mathsf{Aut}(G),\mathsf{E}) is a monic counital.

We now use Proposition 4.10 to pass to the associated cyclic 𝖠𝗎𝗍(G)\mathsf{Aut}(G)-bicapsule Δ=𝖠𝗎𝗍(G)ρ𝖠𝗎𝗍(G)\Delta=\mathsf{Aut}(G)\cdot\rho\cdot\mathsf{Aut}(G). Recall that the left action is defined by \mathcal{I}, hence regular, and the right action is defined by 𝒥𝒞\mathcal{JC}. By construction, Δ\Delta satisfies the conditions of Theorem 5.4 since 𝖠𝗎𝗍(G)\mathsf{Aut}(G) is full in 𝖤\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}. We extend Δ\Delta to a cyclic 𝖤\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}-bicapsule Σ=𝖤σ𝖤\Sigma=\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\cdot\sigma\cdot\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}} where σ:𝒦𝒟𝒦\sigma:\mathcal{K}\mathcal{D}\Rightarrow\mathcal{K} is a monic counital extending ρ\rho, namely, there exists an isomorphism τG:𝒥𝒞(G)𝒟(G)\tau_{G}:\mathcal{JC}(G)\to\mathcal{DL}(G) such that ι=ρG=σ(G)τG\iota=\rho_{G}=\sigma_{\mathcal{L}(G)}\tau_{G}; see (5.1). Since \mathcal{L} is the inclusion functor, there exists an isomorphism τ:𝖤\tau^{\prime}:\mathsf{E} such that ι=σGτ\iota=\sigma_{G}\tau^{\prime}, so ι\iota and σG\sigma_{G} are equivalent. Hence, ι\iota and im(σG)\mathrm{im}(\sigma_{G}) are equivalent. Since σ:Counital(𝖤,𝖤)\sigma:\text{Counital}(\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}},\mathsf{E}), this proves the “\subseteq” part of (1).

For the converse, consider η:Counital(𝖤,𝖤)\eta:\text{Counital}(\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}},\mathsf{E}), say η:𝒟𝒦\eta:\mathcal{HD}\Rightarrow\mathcal{K} for some functor 𝒟:𝖤𝖢\mathcal{D}:\;\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\;\to\mathsf{C}, subcategory 𝖢𝖤\mathsf{C}\leqslant\mathsf{E}, and inclusion :𝖢𝖤\mathcal{H}:\mathsf{C}\to\mathsf{E}. If φ:𝖠𝗎𝗍(G)\varphi:\mathsf{Aut}(G), then (φ):𝖤\mathcal{L}(\varphi):\;\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}, and so 𝒦(φ)ηG=ηG𝒟(φ)\mathcal{K}\mathcal{L}(\varphi)\eta_{G}=\eta_{G}\mathcal{H}\mathcal{D}\mathcal{L}(\varphi). Since G=(G)=𝒦(G)G=\mathcal{L}(G)=\mathcal{K}(G), it follows that the morphism ηG:𝒟(G)G\eta_{G}:\mathcal{HD}(G)\to G is characteristic, and therefore so is its monic image im(ηG)\mathrm{im}(\eta_{G}). This proves the “\supseteq” part of (1). ∎

6. Categorification of characteristic substructure

The final step in our work is to describe the source of all characteristic subgroups, and more generally characteristic substructures in eastern algebras. In Section 5, we showed that characteristic structure arises naturally from counitals. Now we demonstrate that all counitals are derived from counits. In particular, in Section 6.3, we prove the following generalization of Theorem 2 to eastern algebras.

Theorem 2-cat.

Fix an eastern variety 𝖤\mathsf{E}. Let GG be an object in 𝖤\mathsf{E} with subobject HH and inclusion ι:HG\iota:H\hookrightarrow G. There exist categories 𝖠\mathsf{A} and 𝖡\mathsf{B}, where 𝖤𝖠𝖤\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\;\leqslant\mathsf{A}\leqslant\mathsf{E}, such that the following are equivalent.

\begin{enumerate}
\item[(1)] $H$ is characteristic in $G$.
\item[(2)] There is a functor $\mathcal{C}:\mathsf{A}\to\mathsf{A}$ and a counit $\eta:\mathcal{C}\Rightarrow\operatorname{id}_{\mathsf{A}}$ such that $H=\operatorname{Im}(\eta_{G})$.
\item[(3)] There is an $(\mathsf{A},\mathsf{B})$-morphism $\mathcal{M}:\mathsf{B}\to\mathsf{A}$ such that $\iota=\mathcal{M}(\operatorname{id}_{G}\cdot\mathbb{1}_{\mathsf{B}})$.
\end{enumerate}

Our proof relies on the Extension Theorem 5.4 and further consideration of counitals.

Definition 6.1.

Fix a category 𝖤\mathsf{E} with subcategories 𝖠\mathsf{A} and 𝖡\mathsf{B} and inclusion functors :𝖠𝖤\mathcal{I}:\mathsf{A}\to\mathsf{E} and 𝒥:𝖡𝖤\mathcal{J}:\mathsf{B}\to\mathsf{E}. A counital η:𝒥𝒞\eta:\mathcal{JC}\Rightarrow\mathcal{I} is isosceles if 𝖠=𝖡\mathsf{A}=\mathsf{B} and =𝒥\mathcal{I}=\mathcal{J}, and flat if, in addition, 𝖠=𝖡=𝖤\mathsf{A}=\mathsf{B}=\mathsf{E} and =𝒥=id𝖤\mathcal{I}=\mathcal{J}=\operatorname{id}_{\mathsf{E}}. Otherwise, it is scalene.

Example 6.2.

We mention three examples in $\mathsf{Grp}$ and illustrate their natural transformations in Figure 5. The first two are the derived subgroup and the centre of a group considered in Example 1.4. For the third example, we consider an arbitrary characteristic subgroup $H$ of a group $G$. As discussed in Section 5.2, let $\mathsf{Aut}(G)$ be the category with the single object $G$ whose morphisms are the automorphisms of $G$, and define $\mathsf{Aut}(H)$ likewise. Then $\mathsf{Aut}(G)$ and $\mathsf{Aut}(H)$ are subcategories of $\mathsf{Grp}$ with inclusion functors $\mathcal{J}$ and $\mathcal{K}$, respectively. We define a functor $\mathcal{C}:\mathsf{Aut}(G)\to\mathsf{Aut}(H)$ by mapping $G$ to $H$ and each automorphism of $G$ to its restriction to $H$, and so obtain a natural transformation $\iota:\mathcal{K}\mathcal{C}\Rightarrow\mathcal{J}$. $\square$

Figure 5. Natural transformations from Example 6.2: (a) the derived subgroup; (b) the center; (c) a characteristic subgroup.
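
The third example can be checked by brute force on a small group. The following is a minimal sketch, not part of the original development: it assumes the toy group $G=\mathbb{Z}_2\times\mathbb{Z}_4$, and the helper names (`automorphisms`, `restrict`, `is_characteristic`) are hypothetical. It enumerates $\mathsf{Aut}(G)$, tests whether a subgroup is characteristic, and confirms that restriction to a characteristic subgroup respects composition and that the inclusion satisfies the naturality square.

```python
from itertools import permutations, product

# Assumed toy model (not from the paper): G = Z_2 x Z_4, written additively.
G = list(product(range(2), range(4)))
ZERO = (0, 0)

def add(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 4)

def automorphisms():
    """Brute-force every bijection G -> G that preserves addition."""
    rest = [g for g in G if g != ZERO]
    autos = []
    for image in permutations(rest):
        phi = {ZERO: ZERO, **dict(zip(rest, image))}
        if all(phi[add(x, y)] == add(phi[x], phi[y]) for x in G for y in G):
            autos.append(phi)
    return autos

AUT = automorphisms()

def is_characteristic(H):
    """H is characteristic if every automorphism maps H onto H."""
    return all({phi[h] for h in H} == set(H) for phi in AUT)

def restrict(phi, H):
    """Action of the restriction functor on a morphism: phi |-> phi|_H."""
    return {h: phi[h] for h in H}

def compose(phi, psi):
    """Right-to-left composition: first psi, then phi."""
    return {x: phi[psi[x]] for x in psi}

H = [(0, 0), (0, 2)]   # the subgroup 2G
K = [(0, 0), (1, 0)]   # a subgroup of order 2 that is moved by some automorphism
iota = {h: h for h in H}  # the inclusion H -> G

# Restriction respects composition on the characteristic subgroup H.
functorial = all(
    restrict(compose(phi, psi), H) == compose(restrict(phi, H), restrict(psi, H))
    for phi in AUT for psi in AUT
)

# Naturality square: phi after iota equals iota after the restriction of phi.
natural = all(compose(phi, iota) == compose(iota, restrict(phi, H)) for phi in AUT)
```

In this sketch `is_characteristic(H)` holds while `is_characteristic(K)` fails, so only `H` gives rise to a restriction functor of the kind described in the example.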

In our study of characteristic structure we consider the three types of counitals. First, we use induced actions from Theorem 6.5 to pass from a scalene counital to one that is isosceles. Next, we work with isosceles counitals to determine an intermediate class of isosceles counitals known as internal counitals. Finally, we show that internal counitals are completely determined by a morphism of bicapsules.

Counits are common in many categorical contexts; for example, they occur for every adjoint functor pair. The case of flat counitals coincides precisely with the stricter class of fully-invariant substructures.

6.1. Composing counitals

In this section, we describe two ways to construct new counitals from given counitals by composing natural transformations and functors in different ways. These are two instances of a much larger theory; see [Baez, Power]. Figure 4 illustrates the usual composition of natural transformations. We now describe how to compose a functor with a natural transformation. Consider functors ,𝒢:𝖡𝖠\mathcal{F},\mathcal{G}:\mathsf{B}\to\mathsf{A}, :𝖢𝖡\mathcal{H}:\mathsf{C}\to\mathsf{B}, and 𝒦:𝖠𝖣\mathcal{K}:\mathsf{A}\to\mathsf{D} for categories 𝖠\mathsf{A}, 𝖡\mathsf{B}, 𝖢\mathsf{C}, and 𝖣\mathsf{D}, and a natural transformation η:𝒢\eta:\mathcal{F}\Rightarrow\mathcal{G}. Define η:𝒢\eta\mathcal{H}:\mathcal{F}\mathcal{H}\Rightarrow\mathcal{G}\mathcal{H} by setting (η)X\hstretch.13==η(X)(\eta\mathcal{H})_{X}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\eta_{\mathcal{H}(X)} for each object XX in 𝖢\mathsf{C}. Similarly, define 𝒦η:𝒦𝒦𝒢\mathcal{K}\eta:\mathcal{K}\mathcal{F}\Rightarrow\mathcal{K}\mathcal{G} by setting (𝒦η)Y\hstretch.13==𝒦(ηY)(\mathcal{K}\eta)_{Y}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathcal{K}(\eta_{Y}) for each YY in 𝖡\mathsf{B}. The effects of η\eta\mathcal{H} and 𝒦η\mathcal{K}\eta are displayed in Figure 6.

Figure 6. Composing natural transformations with functors: (a) a diagram for $\eta\mathcal{H}$; (b) a diagram for $\mathcal{K}\eta$.
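
As a brief check, both composites are again natural transformations. The worked equations below are a sketch under the conventions of this section (morphisms composed right to left); the morphism names $f:X\to X'$ in $\mathsf{C}$ and $g:Y\to Y'$ in $\mathsf{B}$ are introduced here only for illustration.

```latex
% Naturality of \eta\mathcal{H}: apply naturality of \eta to the morphism \mathcal{H}(f).
\[
\mathcal{G}\mathcal{H}(f)\,(\eta\mathcal{H})_{X}
=\mathcal{G}(\mathcal{H}(f))\,\eta_{\mathcal{H}(X)}
=\eta_{\mathcal{H}(X')}\,\mathcal{F}(\mathcal{H}(f))
=(\eta\mathcal{H})_{X'}\,\mathcal{F}\mathcal{H}(f).
\]
% Naturality of \mathcal{K}\eta: apply the functor \mathcal{K} to the naturality square of \eta at g.
\[
\mathcal{K}\mathcal{G}(g)\,(\mathcal{K}\eta)_{Y}
=\mathcal{K}(\mathcal{G}(g)\,\eta_{Y})
=\mathcal{K}(\eta_{Y'}\,\mathcal{F}(g))
=(\mathcal{K}\eta)_{Y'}\,\mathcal{K}\mathcal{F}(g).
\]
```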

The composition we describe next is specific to natural transformations of a particular form, which include counitals. It composes two natural transformations that share a functor and reflects our expectation that the characteristic relation is transitive. In 𝖦𝗋𝗉\mathsf{Grp}, for example, given a counital describing a characteristic subgroup HH of GG, and a counital describing a characteristic subgroup KK of HH, we expect to have a counital that prescribes how KK is characteristic in GG.

To that end, suppose 𝖤\mathsf{E} is a category of eastern algebras with subcategories 𝖠\mathsf{A}, 𝖡\mathsf{B}, and 𝖢\mathsf{C}, and respective inclusions \mathcal{I}, 𝒥\mathcal{J}, and 𝒦\mathcal{K}. Suppose η:𝒥𝒞\eta:\mathcal{J}\mathcal{C}\Rightarrow\mathcal{I} and μ:𝒦𝒟𝒥\mu:\mathcal{K}\mathcal{D}\Rightarrow\mathcal{J} are natural transformations. Define μη:𝒦𝒟𝒞\mu\triangledown\eta:\mathcal{KDC}\Rightarrow\mathcal{I} by

(μη)X\hstretch.13==ηXμ𝒞(X)\displaystyle(\mu\triangledown\eta)_{X}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\eta_{X}\mu_{\mathcal{C}(X)}

for all objects $X$ in $\mathsf{A}$; see Figure 7. This construction reflects the fact that being a characteristic substructure is a transitive property.

Figure 7. The $\triangledown$-composition of counitals explains transitivity
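
To see that $\mu\triangledown\eta$ is natural, one can chain the naturality squares of $\eta$ and $\mu$. The computation below is a sketch; the morphism name $\varphi:X\to X'$ in $\mathsf{A}$ is introduced here for illustration.

```latex
\[
\mathcal{I}(\varphi)\,(\mu\triangledown\eta)_{X}
=\mathcal{I}(\varphi)\,\eta_{X}\,\mu_{\mathcal{C}(X)}
=\eta_{X'}\,\mathcal{J}\mathcal{C}(\varphi)\,\mu_{\mathcal{C}(X)}
=\eta_{X'}\,\mu_{\mathcal{C}(X')}\,\mathcal{K}\mathcal{D}\mathcal{C}(\varphi)
=(\mu\triangledown\eta)_{X'}\,\mathcal{K}\mathcal{D}\mathcal{C}(\varphi).
\]
% Second equality: naturality of \eta at \varphi.
% Third equality: naturality of \mu at \mathcal{C}(\varphi).
```

As an illustration in $\mathsf{Grp}$, take both $\eta$ and $\mu$ to be the derived-subgroup counital, whose component at $G$ is the inclusion $G'\hookrightarrow G$. The composite then has component $G''\hookrightarrow G'\hookrightarrow G$, the inclusion of the second derived subgroup.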

6.2. Categorifying isosceles counitals

All extensions used in our proof of Theorem 1-cat lead to isosceles counitals. Counits—namely, counitals 𝒥𝒞\mathcal{J}\mathcal{C}\Rightarrow\mathcal{I} in which 𝒥=\mathcal{J}=\mathcal{I} is the identity functor—are one source of isosceles counitals. This hints at a way to characterize characteristic subgroups.

We now prove that all counitals arising from characteristic subgroups extend to isosceles counitals, thereby proving Theorem 2-cat. The most direct proof might utilize Kan lifts, the dual of the better-known Kan extensions [Riehl, Chapter 6], but we give a self-contained proof.

Let :𝖠𝖢\mathcal{I}:\mathsf{A}\to\mathsf{C} be an inclusion functor of categories and let 𝒞:𝖠𝖢\mathcal{C}:\mathsf{A}\to\mathsf{C} be a functor. If ι:𝒞\iota:\mathcal{C}\Rightarrow\mathcal{I} is a natural transformation, then, for every object XX in 𝖠\mathsf{A}, the morphism ιX:𝒞(X)(X)\iota_{X}:\mathcal{C}(X)\to\mathcal{I}(X) is a morphism in 𝖢\mathsf{C}. We consider the special case when this morphism is also in 𝖠\mathsf{A}.

Definition 6.3.

A natural transformation ι:𝒞\iota:\mathcal{C}\Rightarrow\mathcal{I} is internal if, for every object XX in 𝖠\mathsf{A}, the morphism ιX\iota_{X} is a morphism in 𝖠\mathsf{A}.

The property of being internal is strong. Take, for example, $\mathsf{A}=\;\mathrel{\mathop{\mathsf{C}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}$, so the morphisms are exclusively isomorphisms. If $\iota$ is internal, then $\iota_{X}:\mathcal{C}(X)\to X$ is an isomorphism. Such an $\iota$ does not identify a new substructure. In other words, $\mathsf{A}$ has too few morphisms for our purposes. By extending the types of morphisms, we prove in Proposition 6.4 that every monic isosceles counital lifts to an internal one; see Figure 8 for an illustration.

Figure 8. Extending an isosceles counital to an internal one
Proposition 6.4.

Let 𝖤\mathsf{E} be a category with subcategory 𝖡\mathsf{B} and inclusion \mathcal{I}. Suppose every object in 𝖤\mathsf{E} is also an object in 𝖡\mathsf{B}. Let η:\eta:\mathcal{I}\mathcal{E}\Rightarrow\mathcal{I} be a monic isosceles counital with :𝖡𝖡\mathcal{E}:\mathsf{B}\to\mathsf{B}. There exists a category 𝖠\mathsf{A} with inclusions 𝒥:𝖡𝖠\mathcal{J}:\mathsf{B}\to\mathsf{A} and 𝒦:𝖠𝖤\mathcal{K}:\mathsf{A}\to\mathsf{E}, a functor 𝒟:𝖠𝖠\mathcal{D}:\mathsf{A}\to\mathsf{A}, and an internal monic isosceles counital η^:𝒦𝒟𝒦\hat{\eta}:\mathcal{K}\mathcal{D}\Rightarrow\mathcal{K} such that 𝒥=𝒟𝒥\mathcal{JE}=\mathcal{DJ}, =𝒦𝒥\mathcal{I}=\mathcal{KJ}, and η^𝒥=η\hat{\eta}\mathcal{J}=\eta.

Proof.

We define a subcategory $\mathsf{A}$ of $\mathsf{E}$ as follows: its objects are the objects of $\mathsf{E}$; its morphisms are the finite compositions of morphisms $\mathcal{I}(\varphi):\mathsf{E}$, where $\varphi$ is a morphism in $\mathsf{B}$, and morphisms $\eta_{X}:\mathsf{E}$, where $X$ is an object in $\mathsf{B}$. Hence, we have inclusions $\mathcal{J}:\mathsf{B}\to\mathsf{A}$ and $\mathcal{K}:\mathsf{A}\to\mathsf{E}$ such that $\mathcal{I}=\mathcal{KJ}$. Since both $\mathsf{A}$ and $\mathsf{B}$ have the same objects as $\mathsf{E}$, it follows that $\mathcal{I}$, $\mathcal{J}$, and $\mathcal{K}$ are the identities on objects. Moreover, $\mathcal{K}$ is the identity on morphisms.

We now construct a functor 𝒟:𝖠𝖠\mathcal{D}:\mathsf{A}\to\mathsf{A} such that 𝒥=𝒟𝒥\mathcal{J}\mathcal{E}=\mathcal{D}\mathcal{J}. It suffices to define 𝒟\mathcal{D} on morphisms and then verify that 𝒟\mathcal{D} is a functor. Set

\[
\mathcal{D}(\varphi)=\begin{cases}\mathcal{JE}(\varphi^{\prime})&\varphi=\mathcal{J}(\varphi^{\prime})\text{ for a morphism $\varphi^{\prime}$ in }\mathsf{B},\\ \eta_{\mathcal{E}(X)}&\varphi=\eta_{X}\text{ for some object $X$ in $\mathsf{B}$},\\ \mathcal{D}(\sigma)\mathcal{D}(\tau)&\varphi=\sigma\tau.\end{cases}
\]

If 𝒟\mathcal{D} is well defined, then 𝒟(φ)\mathcal{D}(\varphi) is a morphism in 𝖠\mathsf{A}, and 𝒥=𝒟𝒥\mathcal{J}\mathcal{E}=\mathcal{D}\mathcal{J} by construction. To verify that 𝒟\mathcal{D} is well defined, it suffices to consider the case where ηX\eta_{X} (with XX an object in 𝖡\mathsf{B}) is also a morphism in 𝖡\mathsf{B}: specifically, there is a morphism β:𝖡\beta:\mathsf{B} such that ηX=(β)\eta_{X}=\mathcal{I}(\beta). Since \mathcal{I} is the identity on objects, β:(X)X\beta:\mathcal{E}(X)\to X. We will show that η(X)=𝒦𝒟(ηX)=(β)\eta_{\mathcal{E}(X)}=\mathcal{K}\mathcal{D}(\eta_{X})=\mathcal{IE}(\beta). To see this, we apply η\eta to the morphism β:(X)X\beta:\mathcal{E}(X)\to X and obtain the following diagram (see shaded entry (2,2)(2,2) of Figure 3).

\[
\begin{array}{ccc}
\mathcal{I}\mathcal{E}\mathcal{E}(X) & \xrightarrow{\ \mathcal{IE}(\beta)\ } & \mathcal{I}\mathcal{E}(X)\\
{\scriptstyle\eta_{\mathcal{E}(X)}}\big\downarrow & & \big\downarrow{\scriptstyle\eta_{X}}\\
\mathcal{I}\mathcal{E}(X) & \xrightarrow{\ \mathcal{I}(\beta)\ } & \mathcal{I}(X)
\end{array}
\]

Since ηX=(β)\eta_{X}=\mathcal{I}(\beta), the diagram implies that ηXη(X)=ηX(β)\eta_{X}\eta_{\mathcal{E}(X)}=\eta_{X}\mathcal{IE}(\beta). Since ηX\eta_{X} is monic by assumption, (β)=η(X)\mathcal{IE}(\beta)=\eta_{\mathcal{E}(X)}. This proves that 𝒟\mathcal{D} is well defined.

We claim that there exists a natural transformation η^:𝒦𝒟𝒦\hat{\eta}:\mathcal{K}\mathcal{D}\Rightarrow\mathcal{K} such that η^𝒥=η\hat{\eta}\mathcal{J}=\eta. Since the objects of 𝖠\mathsf{A} are those of 𝖡\mathsf{B}, we define η^X\hat{\eta}_{X} to be ηX\eta_{X} and show that this yields the required counital. First, we consider the case that φ:XY\varphi:X\to Y is a morphism in 𝖡\mathsf{B}. Then 𝒦𝒟𝒥(φ)=𝒦𝒥(φ)=(φ)\mathcal{K}\mathcal{D}\mathcal{J}(\varphi)=\mathcal{K}\mathcal{J}\mathcal{E}(\varphi)=\mathcal{I}\mathcal{E}(\varphi), so

η^Y𝒦𝒟(𝒥(φ))\displaystyle\hat{\eta}_{Y}\mathcal{K}\mathcal{D}(\mathcal{J}(\varphi)) =ηY(φ)=(φ)ηX=𝒦(𝒥(φ))η^X.\displaystyle=\eta_{Y}\mathcal{I}\mathcal{E}(\varphi)=\mathcal{I}(\varphi)\eta_{X}=\mathcal{K}(\mathcal{J}(\varphi))\hat{\eta}_{X}.

Now we assume φ=ηX:(X)(X)\varphi=\eta_{X}:\mathcal{IE}(X)\to\mathcal{I}(X) for some object XX in 𝖡\mathsf{B}. Since \mathcal{I} is the identity on objects and 𝒦\mathcal{K} is the identity on morphisms,

\[
\hat{\eta}_{\mathcal{I}(X)}\mathcal{K}\mathcal{D}(\eta_{X}) = \hat{\eta}_{X}\mathcal{KD}(\eta_{X}) = \eta_{X}\mathcal{K}(\eta_{\mathcal{E}(X)}) = \eta_{X}\eta_{\mathcal{E}(X)} = \mathcal{K}(\eta_{X})\hat{\eta}_{\mathcal{E}(X)}.
\]

Lastly, we consider the case of an arbitrary finite composition φ=φ1φn\varphi=\varphi_{1}\cdots\varphi_{n} where each φk\varphi_{k} is either 𝒥(φk)\mathcal{J}(\varphi_{k}^{\prime}) for some morphism φk\varphi_{k}^{\prime} in 𝖡\mathsf{B} or a morphism ηX\eta_{X} for some object XX in 𝖡\mathsf{B}. It suffices to consider only the case where n=2n=2, say φ=φ1φ2\varphi=\varphi_{1}\varphi_{2} with φ2:XZ\varphi_{2}:X\to Z and φ1:ZY\varphi_{1}:Z\to Y. Now

\begin{align*}
\hat{\eta}_{Y}\mathcal{K}\mathcal{D}(\varphi) &= \hat{\eta}_{Y}\mathcal{K}\mathcal{D}(\varphi_{1})\mathcal{K}\mathcal{D}(\varphi_{2}) \\
&= \mathcal{K}(\varphi_{1})\hat{\eta}_{Z}\mathcal{K}\mathcal{D}(\varphi_{2}) \\
&= \mathcal{K}(\varphi_{1})\mathcal{K}(\varphi_{2})\hat{\eta}_{X} \\
&= \mathcal{K}(\varphi)\hat{\eta}_{X}.
\end{align*}

Thus, η^:𝒦𝒟𝒦\hat{\eta}:\mathcal{K}\mathcal{D}\Rightarrow\mathcal{K}. Since η\eta is monic, so is η^\hat{\eta}. Also, η^X\hat{\eta}_{X} is a morphism in 𝖠\mathsf{A} for every object XX, so it is internal, as claimed. ∎

We now prove that every characteristic substructure of an eastern algebra is induced by a morphism of category biactions.

Theorem 6.5.

Let XX be an object in a category 𝖤\mathsf{E} of eastern algebras. Let YY be characteristic in XX with inclusion ι:YX\iota:Y\to X. There exist subcategories 𝖠\mathsf{A} and 𝖡\mathsf{B} with 𝖤𝖠,𝖡𝖤\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\leqslant\mathsf{A},\mathsf{B}\leqslant\mathsf{E}, and an (𝖠,𝖡)(\mathsf{A},\mathsf{B})-morphism :𝖡𝖠\mathcal{M}:\mathsf{B}\to\mathsf{A} such that (idX𝟙𝖡)=ι\mathcal{M}(\operatorname{id}_{X}\cdot\mathbb{1}_{\mathsf{B}})=\iota.

Proof.

Let :𝖤𝖤\mathcal{I}:\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\to\mathsf{E} be the inclusion functor. The proof of Theorem 1-cat shows that there exists a functor :𝖤𝖤\mathcal{E}:\;\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\;\to\;\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}} and a monic counital η:\eta:\mathcal{I}\mathcal{E}\Rightarrow\mathcal{I} such that ηX=ι\eta_{X}=\iota. We use Proposition 6.4 (with 𝖡=𝖤\mathsf{B}=\;\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}) to create a category 𝖠\mathsf{A} generated from 𝖤\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}} and η\eta, an inclusion functor 𝒦:𝖠𝖤\mathcal{K}:\mathsf{A}\to\mathsf{E}, a functor 𝒟:𝖠𝖠\mathcal{D}:\mathsf{A}\to\mathsf{A}, and an internal monic counital η^:𝒦𝒟𝒦\hat{\eta}:\mathcal{K}\mathcal{D}\Rightarrow\mathcal{K} with η^Z=ηZ\hat{\eta}_{Z}=\eta_{Z} for all objects in 𝖤\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}. Lastly, we apply Proposition 4.10(a) to η^\hat{\eta} to obtain an 𝖠\mathsf{A}-bimorphism 𝒩:𝖠𝖤\mathcal{N}:\mathsf{A}\to\mathsf{E} such that η^=𝒩(𝟙𝖠)\hat{\eta}=\mathcal{N}(\mathbb{1}_{\mathsf{A}}). Since η^\hat{\eta} is internal, there exists an 𝖠\mathsf{A}-bimorphism :𝖠𝖠\mathcal{M}:\mathsf{A}\to\mathsf{A} such that 𝒩=𝒦\mathcal{N}=\mathcal{KM}. Hence, η^=𝒦(𝟙𝖠)\hat{\eta}=\mathcal{K}\mathcal{M}(\mathbb{1}_{\mathsf{A}}). With 𝖡\hstretch.13==𝖠\mathsf{B}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathsf{A}, it follows that (idX𝟙𝖡)=η^X=ηX=ι\mathcal{M}(\operatorname{id}_{X}\cdot\mathbb{1}_{\mathsf{B}})=\hat{\eta}_{X}=\eta_{X}=\iota, as claimed. ∎

6.3. Proofs of main theorems

Having developed the required theory, we can now complete the proofs of our main results. Theorem 1 is a special case of Theorem 1-cat, which we proved in the previous section.

Proof of Theorem 2-cat. If (1) holds, then Theorem 6.5 yields (3). If (3) holds, then (2) follows from Theorem 4.11(a) and the fact that ι=(idG𝟙𝖡)\iota=\mathcal{M}(\operatorname{id}_{G}\cdot\mathbb{1}_{\mathsf{B}}). If (2) holds, then (1) follows from Theorem 1-cat. ∎

Theorem 2 follows from Theorem 2-cat.

6.4. Duality

Recall from Section 5.5 that a natural transformation η:𝒟\eta:\mathcal{I}\Rightarrow\mathcal{D} is a unital if \mathcal{I} is an inclusion functor. If =id\mathcal{I}=\operatorname{id}, then η:id𝒟\eta:\operatorname{id}\Rightarrow\mathcal{D} is a unit. A unital η:𝒟\eta:\mathcal{I}\Rightarrow\mathcal{D} is epic if ηX:(X)𝒟(X)\eta_{X}:\mathcal{I}(X)\to\mathcal{D}(X) is epic for all objects XX. Units and unitals are the duals of counits and counitals.

We state a dual analogue of Theorem 2-cat for characteristic quotients in eastern algebras; its proof follows mutatis mutandis from that of Theorem 2-cat.

Theorem 2-dual.

Let 𝖤\mathsf{E} be an eastern variety, and let GG be an object of 𝖤\mathsf{E} with quotient QQ and projection π\pi. There exist categories 𝖠\mathsf{A} and 𝖡\mathsf{B}, where 𝖤𝖠𝖤\mathrel{\mathop{\mathsf{E}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\leftrightarrow$}\vss}}}\;\leqslant\mathsf{A}\leqslant\mathsf{E}, such that the following are equivalent.

  1. (1)

    QQ is a characteristic quotient of GG.

  2. (2)

    There is a functor 𝒰:𝖠𝖠\mathcal{U}:\mathsf{A}\to\mathsf{A} and a unit ϵ:id𝖠𝒰\epsilon:\operatorname{id}_{\mathsf{A}}\Rightarrow\mathcal{U} such that Q=Coim(ϵG)Q=\mathrm{Coim}(\epsilon_{G}).

  3. (3)

    There is an (𝖠,𝖡)(\mathsf{A},\mathsf{B})-morphism :𝖠𝖡\mathcal{M}:\mathsf{A}\to\mathsf{B} such that π=(𝟙𝖠idG)\pi=\mathcal{M}(\mathbb{1}_{\mathsf{A}}\cdot\operatorname{id}_{G}).

Although a characteristic subgroup of a group GG is associated with a characteristic quotient of GG, and vice-versa, there are subtle differences in other categories of eastern algebras.

Example 6.6.

Let \mathbb{Q} be the ring of rational numbers and \mathbb{Z} its subring of integers. If φ:\varphi:\mathbb{Q}\to\mathbb{Q} is a homomorphism of unital rings, then φ(1)=1\varphi(1)=1. This forces φ=id\varphi=\operatorname{id}_{\mathbb{Q}}, so \mathbb{Z} is fully invariant in \mathbb{Q}. Since \mathbb{Q} is a field, its only quotients are itself and the trivial ring. Hence, \mathbb{Q} has many fully-invariant substructures, but only two fully-invariant quotients. \square

In general, kernels of group homomorphisms are representable as subgroups (unlike ideals, which are not necessarily unital subrings). Conversely, every characteristic subgroup is normal and has an associated quotient. Formalizing these observations, we say that invariant structures of groups are self-dual up to equivalence of natural transformations in 𝖢𝖺𝗍\mathsf{Cat}, see [HoTT]*pp. 59–61. The next proposition provides a categorical description of this observation for 𝖦𝗋𝗉\mathsf{Grp}; we use it in Section 7.

Proposition 6.7.

The following hold for categories 𝖦𝗋𝗉𝖠𝖦𝗋𝗉\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\longleftrightarrow$}\vss}}}\;\leqslant\mathsf{A}\leqslant\mathsf{Grp} and 𝖡𝖦𝗋𝗉\mathsf{B}\leqslant\mathsf{Grp} with inclusion functors :𝖠𝖦𝗋𝗉\mathcal{I}:\mathsf{A}\to\mathsf{Grp} and 𝒥:𝖡𝖦𝗋𝗉\mathcal{J}:\mathsf{B}\to\mathsf{Grp}.

  1. (a)

    Given a unital π:𝒥𝒰\pi:\mathcal{I}\Rightarrow\mathcal{J}\mathcal{U}, there is a category 𝖢𝖦𝗋𝗉\mathsf{C}\leqslant\mathsf{Grp}, with inclusion 𝒦\mathcal{K}, and a functor 𝒞:𝖠𝖢\mathcal{C}:\mathsf{A}\to\mathsf{C} such that ker(π):𝒦𝒞{\rm ker\,}(\pi):\mathcal{K}\mathcal{C}\Rightarrow\mathcal{I} is a counital where 𝒞(G)=ker(πG)\mathcal{C}(G)={\rm ker\,}(\pi_{G}) and (ker(π))G:ker(πG)G({\rm ker\,}(\pi))_{G}:{\rm ker\,}(\pi_{G})\hookrightarrow G is the inclusion for every group GG.

  2. (b)

    Given a counital ι:𝒥𝒞\iota:\mathcal{J}\mathcal{C}\Rightarrow\mathcal{I}, there is a category 𝖢𝖦𝗋𝗉\mathsf{C}\leqslant\mathsf{Grp}, with inclusion 𝒦\mathcal{K}, and a functor 𝒰:𝖠𝖢\mathcal{U}:\mathsf{A}\to\mathsf{C} such that coker(ι):𝒦𝒰{\rm coker\,}(\iota):\mathcal{I}\Rightarrow\mathcal{K}\mathcal{U} is a unital where 𝒰(G)=G/Im(ιG)\mathcal{U}(G)=G/\operatorname{Im}(\iota_{G}) and (coker(ι))G:GG/Im(ιG)({\rm coker\,}(\iota))_{G}:G\twoheadrightarrow G/\operatorname{Im}(\iota_{G}) for every group GG.

  3. (c)

    With the notation of (a) and (b), there are unique invertible μ,τ:𝖠\mu,\tau:\mathsf{A} such that coker(ker(π))=μ(im(π)){\rm coker\,}({\rm ker\,}(\pi))=\mu(\mathrm{im}(\pi)) and ker(coker(ι))=ιτ{\rm ker\,}({\rm coker\,}(\iota))=\iota\tau.

Proof.
  1. (a)

    For every morphism φ:GH\varphi:G\to H in 𝖠\mathsf{A}, there is an induced morphism φ:Im(πG)Im(πH)\varphi^{\prime}:\operatorname{Im}(\pi_{G})\to\operatorname{Im}(\pi_{H}) such that φπG=πHφ\varphi^{\prime}\pi_{G}=\pi_{H}\varphi, so

    \[\pi_{H}\varphi(\ker(\pi_{G})) = \varphi^{\prime}\pi_{G}(\ker(\pi_{G})) = 1.\]

    Therefore φ(ker(πG))ker(πH)\varphi({\rm ker\,}(\pi_{G}))\leqslant{\rm ker\,}(\pi_{H}). In particular, the restriction

    \[\varphi|_{\ker(\pi_{G})} : \ker(\pi_{G}) \to \ker(\pi_{H})\]

    is well defined. Let 𝖢\mathsf{C} be the category whose objects are ker(πG){\rm ker\,}(\pi_{G}) for all groups GG and whose morphisms are φ|ker(πG)\varphi|_{{\rm ker\,}(\pi_{G})} for all morphisms φ:GH\varphi:G\to H in 𝖠\mathsf{A}. Let 𝒦:𝖢𝖦𝗋𝗉\mathcal{K}:\mathsf{C}\to\mathsf{Grp} be the inclusion functor. Moreover, there is a functor 𝒞:𝖠𝖢\mathcal{C}:\mathsf{A}\to\mathsf{C} given by 𝒞(G)=ker(πG)\mathcal{C}(G)={\rm ker\,}(\pi_{G}) and 𝒞(φ)=φ|ker(πG)\mathcal{C}(\varphi)=\varphi|_{{\rm ker\,}(\pi_{G})}. If we define ιG:ker(πG)G\iota_{G}:{\rm ker\,}(\pi_{G})\hookrightarrow G to be the associated inclusion map for the kernel, then ι:𝒦𝒞\iota:\mathcal{K}\mathcal{C}\Rightarrow\mathcal{I} is the required counital.

  2. (b)

    The proof is dual to that of (a).

  3. (c)

    Consider the unital π:𝒥𝒰\pi:\mathcal{I}\Rightarrow\mathcal{J}\mathcal{U}. By Theorem 3.22, for each group GG there is an isomorphism

    \[\mu : \mathcal{U}(G) = \operatorname{Im}\pi_{G} \to G/\ker\pi_{G} = \operatorname{coker}(\ker\pi_{G}).\]

    Thus, coker(ker(π))=μ(im(π)){\rm coker\,}({\rm ker\,}(\pi))=\mu(\text{im}(\pi)); likewise, for ker(coker(ι)){\rm ker\,}({\rm coker\,}(\iota)) and ι\iota. ∎

7. Categorification of standard characteristic subgroups

Theorem 2 states that every characteristic subgroup can be studied in three ways: as a group, as a natural transformation, and as a morphism of category biactions. In this section, we describe common characteristic subgroups using all three forms. In so doing, we reveal insights gained from the categorical perspective.

Throughout, we use the following notation for restriction and induction. Let $\varphi:G\to H$ be a homomorphism of groups, and let $\mathcal{C}(G)$ and $\mathcal{C}(H)$ be subgroups of $G$ and $H$, respectively. If the restriction of $\varphi$ to $\mathcal{C}(G)$ maps into $\mathcal{C}(H)$, then we denote it by

\[\varphi|_{\mathcal{C}} : \mathcal{C}(G)\to\mathcal{C}(H),\quad c\mapsto\varphi(c). \tag{7.1}\]

Similarly, if φ\varphi maps a normal subgroup 𝒬(G)\mathcal{Q}(G) of GG into a normal subgroup 𝒬(H)\mathcal{Q}(H) of HH, then the induction of φ\varphi via 𝒬\mathcal{Q} is

\[\varphi|^{\mathcal{Q}} : G/\mathcal{Q}(G)\to H/\mathcal{Q}(H),\quad g\mathcal{Q}(G)\mapsto\varphi(g)\mathcal{Q}(H). \tag{7.2}\]

7.1. Abelianization and derived subgroups

Figure 9 gives the three perspectives on the derived subgroup. We develop this example, so that we may also treat the lower central series and all verbal subgroups in Section 7.2.

The counital λ:𝒟id𝖦𝗋𝗉\lambda:\mathcal{D}\Rightarrow\operatorname{id}_{\mathsf{Grp}} of Example 6.2 associated with the derived subgroup γ2(G)\gamma_{2}(G) of a group GG can be constructed also as the kernel of the unital associated with abelianization. We explore the category biaction interpretation. Let 𝖠𝖻𝖾𝗅\mathsf{Abel} be the category of abelian groups, a subcategory of 𝖦𝗋𝗉\mathsf{Grp} with inclusion :𝖠𝖻𝖾𝗅𝖦𝗋𝗉\mathcal{I}:\mathsf{Abel}\to\mathsf{Grp}. We define a morphism 𝒜:𝖦𝗋𝗉𝖠𝖻𝖾𝗅\mathcal{A}:\mathsf{Grp}\to\mathsf{Abel} given by φφ|γ2\varphi\mapsto\varphi|^{\gamma_{2}}. The functors 𝒜\mathcal{A} and \mathcal{I} turn the categories 𝖦𝗋𝗉\mathsf{Grp} and 𝖠𝖻𝖾𝗅\mathsf{Abel} into (𝖦𝗋𝗉,𝖠𝖻𝖾𝗅)(\mathsf{Grp},\mathsf{Abel})-bicapsules.

We show that 𝒜:𝖦𝗋𝗉𝖠𝖻𝖾𝗅\mathcal{A}:\mathsf{Grp}\to\mathsf{Abel} is a (𝖦𝗋𝗉,𝖠𝖻𝖾𝗅)(\mathsf{Grp},\mathsf{Abel})-morphism. Let φ\varphi and τ\tau be group homomorphisms, and let α\alpha be a homomorphism of abelian groups. Now

\[\mathcal{A}(\alpha\cdot\varphi\tau) = (\mathcal{I}(\alpha)\varphi\tau)|^{\gamma_{2}} = \alpha\;\varphi|^{\gamma_{2}}\;\tau|^{\gamma_{2}} = \alpha\mathcal{A}(\varphi)\cdot\tau.\]

To obtain the counital associated with the derived subgroup, we apply Proposition 6.7 and take the kernel of 𝒜(𝟙𝖦𝗋𝗉)\mathcal{A}(\mathbb{1}_{\mathsf{Grp}}). Since the unital-counital pair obtained through this process is a unit-counit pair, we obtain the well-known observation that the derived subgroup is fully invariant.

Figure 9. Three perspectives on the derived subgroup
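
As a small computational illustration of full invariance (not part of the original development), the following Python sketch computes the derived subgroup of the symmetric group $S_3$ by brute force and checks that every endomorphism of $S_3$ maps it into itself. The helper names (`mul`, `inv`, `generated`, `endomorphisms`) and the choice of $S_3$ are assumptions made for this example only; the exhaustive search is feasible only for very small groups.

```python
from itertools import permutations, product

def mul(g, h):
    """Compose permutations as functions: (g*h)(i) = g(h(i))."""
    return tuple(g[i] for i in h)

def inv(g):
    """Inverse permutation."""
    out = [0] * len(g)
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def commutator(g, h):
    """[g, h] = g^-1 h^-1 g h."""
    return mul(mul(inv(g), inv(h)), mul(g, h))

def generated(gens, identity):
    """Subgroup generated by gens (finite case: closure under products)."""
    seen, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = mul(x, s)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

def endomorphisms(G):
    """All homomorphisms G -> G, found by exhaustive search (tiny G only)."""
    for images in product(G, repeat=len(G)):
        f = dict(zip(G, images))
        if all(f[mul(g, h)] == mul(f[g], f[h]) for g in G for h in G):
            yield f

S3 = list(permutations(range(3)))
identity = (0, 1, 2)

# gamma_2(S3): the subgroup generated by all commutators.
derived = generated({commutator(g, h) for g in S3 for h in S3}, identity)
endos = list(endomorphisms(S3))

# Full invariance: every endomorphism maps gamma_2(S3) into gamma_2(S3).
fully_invariant = all(f[d] in derived for f in endos for d in derived)
print(len(derived), len(endos), fully_invariant)
```

The check ranges over all endomorphisms, not only automorphisms, which is the distinction between "fully invariant" and "characteristic" that the unit-counit pair encodes.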

7.2. Verbal subgroups

We generalize the approach taken in Section 7.1. Let Ω\Omega be the group signature from Example 3.8. To each set WW of words from the free group ΩX\Omega\langle X\rangle we associate a category 𝖵𝖺𝗋(W)\mathsf{Var}(W) as follows (see Section 3.3). For each word w:Ww:W, group GG, and XX-tuple g:GXg:G^{X}, define wG:GXGw_{G}:G^{X}\to G by gevalg(w)g\mapsto\mathrm{eval}_{g}(w). Define 𝖵𝖺𝗋(W)\mathsf{Var}(W) to be the full subcategory of 𝖦𝗋𝗉\mathsf{Grp} with objects

\[\{G:\mathsf{Grp}\mid(\forall g:G^{X})(\forall w:W)\;w_{G}(g)=1\}\]

with inclusion functor :𝖵𝖺𝗋(W)𝖦𝗋𝗉\mathcal{I}:\mathsf{Var}(W)\to\mathsf{Grp}. The category 𝖵𝖺𝗋(W)\mathsf{Var}(W) is the group variety with laws WW. Let RadW(G)\mathrm{Rad}_{W}(G) be the minimal normal subgroup of a group GG such that G/RadW(G)G/\mathrm{Rad}_{W}(G) is in 𝖵𝖺𝗋(W)\mathsf{Var}(W). Let :𝖦𝗋𝗉𝖵𝖺𝗋(W)\mathcal{R}:\mathsf{Grp}\to\mathsf{Var}(W) be the functor such that (G)\mathcal{R}(G) is the largest quotient of GG contained in 𝖵𝖺𝗋(W)\mathsf{Var}(W), where the functor carries GG to G/RadW(G)G/\mathrm{Rad}_{W}(G), and morphisms φ\varphi are sent to φ|RadW\varphi|^{\mathrm{Rad}_{W}}.

Proposition 7.1.

The functors \mathcal{R} and \mathcal{I} form an adjoint functor pair: :𝖦𝗋𝗉𝖵𝖺𝗋(W):\mathcal{R}:\mathsf{Grp}\dashv\mathsf{Var}(W):\mathcal{I}.

Proof.

By Proposition 4.6, the functors \mathcal{R} and \mathcal{I} turn both 𝖵𝖺𝗋(W)\mathsf{Var}(W) and 𝖦𝗋𝗉\mathsf{Grp} into (𝖵𝖺𝗋(W),𝖦𝗋𝗉)(\mathsf{Var}(W),\mathsf{Grp})-bicapsules. The functor \mathcal{R} is a (𝖵𝖺𝗋(W),𝖦𝗋𝗉)(\mathsf{Var}(W),\mathsf{Grp})-morphism: for morphisms α\alpha in 𝖵𝖺𝗋(W)\mathsf{Var}(W) and φ,τ\varphi,\tau in 𝖦𝗋𝗉\mathsf{Grp},

\[\mathcal{R}(\alpha\varphi\cdot\tau) = (\alpha\varphi\mathcal{I}(\tau))|^{\mathrm{Rad}_{W}} = \alpha|^{\mathrm{Rad}_{W}}\;\varphi|^{\mathrm{Rad}_{W}}\;\tau = \alpha\cdot\mathcal{R}(\varphi)\tau.\]

Since \mathcal{R} and \mathcal{I} are pseudo-inverses, the result follows from Theorem 4.13(a). ∎

The adjoint functor pair in Proposition 7.1 categorifies verbal subgroups. The dual version of Theorem 4.11 describes how to obtain the unit π:id𝖦𝗋𝗉\pi:\operatorname{id}_{\mathsf{Grp}}\Rightarrow\mathcal{IR} from \mathcal{R}. Applying Proposition 6.7, the kernel of π\pi yields a counit ι:𝒱id𝖦𝗋𝗉\iota:\mathcal{V}\Rightarrow\operatorname{id}_{\mathsf{Grp}} for some functor 𝒱:𝖦𝗋𝗉𝖦𝗋𝗉\mathcal{V}:\mathsf{Grp}\to\mathsf{Grp}. If GG is a group, then 𝒱(G)\mathcal{V}(G) is the WW-verbal subgroup. We conclude that all verbal subgroups are fully invariant. Thus, from Proposition 7.1, we get an exact sequence of natural transformations

\[
\mathcal{V} \xrightarrow{\;\ker(\pi)\;} \operatorname{id}_{\mathsf{Grp}} \xrightarrow{\;\pi\;} \mathcal{IR}.
\]

The corresponding diagram appears in Figure 10.

Figure 10. Three perspectives on verbal subgroups

7.3. Marginal subgroups

Now we consider characteristic subgroups such as the center ζ(G)\zeta(G) of a group GG. As seen in Example 1.4, there are group homomorphisms φ:GH\varphi:G\to H for which φ(ζ(G))⩽̸ζ(H)\varphi(\zeta(G))\not\leqslant\zeta(H), so, unlike verbal subgroups, the center is not fully invariant. This fact is revealed by the categorification of the center—it does not yield a counit between functors 𝖦𝗋𝗉𝖦𝗋𝗉\mathsf{Grp}\to\mathsf{Grp}, but rather a proper counital between functors of the form 𝖦𝗋𝗉-↠𝖦𝗋𝗉\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}}\;\to\mathsf{Grp}, where 𝖦𝗋𝗉-↠\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}} is the category of groups whose morphisms are epimorphisms. We establish this fact more generally for the class of marginal subgroups introduced by P. Hall [Hall40].
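
The failure of full invariance for the center can be seen in a very small case. The Python sketch below is an illustration chosen for this purpose (it is not claimed to be the example referred to in the text): the inclusion of a subgroup of order 2 into $S_3$ sends the center of the domain outside the center of the codomain, while the sign epimorphism from $S_3$ onto that subgroup does respect centers. All function names are hypothetical helpers.

```python
from itertools import permutations

def mul(g, h):
    """Compose permutations as functions: (g*h)(i) = g(h(i))."""
    return tuple(g[i] for i in h)

def center(G):
    """Elements of G commuting with every element of G."""
    return {z for z in G if all(mul(z, g) == mul(g, z) for g in G)}

def is_hom(f, G):
    """Check f(gh) = f(g)f(h) for all g, h in G."""
    return all(f[mul(g, h)] == mul(f[g], f[h]) for g in G for h in G)

def parity(g):
    """Number of inversions of g modulo 2."""
    n = len(g)
    return sum(1 for i in range(n) for j in range(i + 1, n) if g[i] > g[j]) % 2

S3 = list(permutations(range(3)))
e, t = (0, 1, 2), (1, 0, 2)
C2 = [e, t]  # subgroup of order 2 generated by a transposition

inclusion = {e: e, t: t}                       # C2 -> S3, not surjective
sign = {g: (t if parity(g) else e) for g in S3}  # S3 ->> C2, surjective

# The inclusion moves the center of C2 outside the center of S3 ...
inclusion_preserves_center = all(inclusion[z] in center(S3) for z in center(C2))
# ... whereas the epimorphism maps center into center
# (here trivially, since the center of S3 is trivial).
sign_preserves_center = all(sign[z] in center(C2) for z in center(S3))

print(inclusion_preserves_center, sign_preserves_center)
```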

Example 7.2 (Hall’s Isoclinism).

For an integer $n>0$ we write $G^{n}$ for the $n$-fold direct product of a group $G$. The commutator map $\kappa:G^{2}\to G$ is given by $(g,h)\mapsto[g,h] := g^{-1}h^{-1}gh$. We define a congruence relation $\equiv$ on $G$ and write $x\equiv y$ if and only if $[x,z]=[y,z]$ for all $z:G$. Factoring through this congruence relation and restricting the outputs to the verbal subgroup $\gamma_{2}(G)$, we obtain a map $*:(G/\zeta(G))^{2}\to\gamma_{2}(G)$ such that the following diagram commutes.

\[
\begin{array}{ccc}
G^{2} & \xrightarrow{\ \kappa\ } & G \\
\big\downarrow & & \big\uparrow \\
(G/\zeta(G))^{2} & \xrightarrow{\ *\ } & \gamma_{2}(G)
\end{array}
\]

Two groups are isoclinic if their commutator maps are equivalent.  \square
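
To make the factorization concrete, the following Python sketch checks, for the dihedral group of order 8 realized as permutations of four points, that the commutator of two elements is unchanged when either argument is multiplied by a central element, and that all commutators lie in the derived subgroup. This is exactly what is needed for the commutator map to descend to $(G/\zeta(G))^{2}\to\gamma_{2}(G)$. The choice of group and all helper names are assumptions made for illustration.

```python
def mul(g, h):
    """Compose permutations as functions: (g*h)(i) = g(h(i))."""
    return tuple(g[i] for i in h)

def inv(g):
    """Inverse permutation."""
    out = [0] * len(g)
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def commutator(g, h):
    """[g, h] = g^-1 h^-1 g h."""
    return mul(mul(inv(g), inv(h)), mul(g, h))

def generated(gens, identity):
    """Subgroup generated by gens (finite case: closure under products)."""
    seen, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = mul(x, s)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

e = (0, 1, 2, 3)
r = (1, 2, 3, 0)  # rotation of the square
s = (0, 3, 2, 1)  # reflection fixing vertices 0 and 2
D4 = generated({r, s}, e)

zeta = {z for z in D4 if all(mul(z, g) == mul(g, z) for g in D4)}
gamma2 = generated({commutator(x, y) for x in D4 for y in D4}, e)

# The commutator only depends on the cosets x*zeta and y*zeta.
factors_through_center = all(
    commutator(mul(x, z1), mul(y, z2)) == commutator(x, y)
    for x in D4 for y in D4 for z1 in zeta for z2 in zeta
)
print(len(D4), sorted(zeta), sorted(gamma2), factors_through_center)
```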

For each group GG and each word ww, there is a unique minimal normal subgroup w(G)w^{*}(G) such that the map w¯G:(G/w(G))nG\overline{w}_{G}:(G/w^{*}(G))^{n}\to G given by

\[(g_{1}w^{*}(G),\dots,g_{n}w^{*}(G)) \longmapsto w_{G}(g_{1},\dots,g_{n})\]

is non-degenerate: namely, fixing any n1n-1 entries of the nn-tuple argument of w¯G\overline{w}_{G} yields an injective map G/w(G)GG/w^{*}(G)\to G. Here wGw_{G} is as defined in Section 7.2.

For a set WW of words, the associated marginal subgroup of a group GG is defined as W(G)=w:Ww(G)W^{*}(G)=\bigcap_{w:W}w^{*}(G). Clearly, W(G)W^{*}(G) is characteristic in GG. The image of w¯G\overline{w}_{G}, and thus also wGw_{G}, is the verbal subgroup associated with ww, written w(G)w(G).

Figure 11. Marginal subgroups and quotients categorified

Hall [Hall40] introduced the general notion of isologism for word-map equivalence. We extend this language to categories. Each word ww determines a category 𝖫𝗈𝗀-↠w\mathrel{\mathop{\mathsf{Log}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}}_{w} with maps w¯G:(G/w(G))nw(G)\overline{w}_{G}:(G/w^{*}(G))^{n}\to w(G) as objects, where the morphisms are pairs (φ1,φ2)(\varphi_{1},\varphi_{2}) of group epimorphisms such that the following diagram commutes.

\[
\begin{array}{ccc}
(G/w^{*}(G))^{n} & \xrightarrow{\ \overline{w}_{G}\ } & w(G) \\
{\scriptstyle \varphi_{1}^{n}}\big\downarrow & & \big\downarrow{\scriptstyle \varphi_{2}} \\
(H/w^{*}(H))^{n} & \xrightarrow{\ \overline{w}_{H}\ } & w(H)
\end{array}
\]

We define two functors. The first is :𝖦𝗋𝗉-↠𝖫𝗈𝗀-↠w\mathcal{L}:\;\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}}\;\to\;\mathrel{\mathop{\mathsf{Log}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}}_{w} given by Gw¯GG\mapsto\overline{w}_{G} and φ(φ|w,φ|w)\varphi\mapsto(\varphi|^{w^{*}},\varphi|_{w}). The second is 𝒫:𝖫𝗈𝗀-↠w𝖦𝗋𝗉-↠\mathcal{P}:\;\mathrel{\mathop{\mathsf{Log}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}}_{w}\;\to\;\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}} given by w¯GG/w(G)\overline{w}_{G}\mapsto G/w^{*}(G) and (φ1,φ2)φ1(\varphi_{1},\varphi_{2})\mapsto\varphi_{1}. For a group GG, let πG:GG/w(G)\pi_{G}:G\twoheadrightarrow G/w^{*}(G) be the usual projection homomorphism. Now π:id𝖦𝗋𝗉-↠𝒫\pi:\operatorname{id}_{\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}}}\Rightarrow\mathcal{PL} is a unit. Let :𝖦𝗋𝗉-↠𝖦𝗋𝗉\mathcal{I}:\;\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}}\;\to\mathsf{Grp} be the inclusion functor. Then the unital π:𝒫\mathcal{I}\pi:\mathcal{I}\Rightarrow\mathcal{IPL} is a categorification of marginal quotients.

To categorify the marginal subgroup, we take the kernel of π\pi via Proposition 6.7 and compose with \mathcal{I}: namely, ker(π):𝒞\mathcal{I}{\rm ker\,}(\pi):\mathcal{IC}\Rightarrow\mathcal{I} for some functor 𝒞:𝖦𝗋𝗉-↠𝖦𝗋𝗉-↠\mathcal{C}:\;\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}}\;\to\;\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\relbar\joinrel\twoheadrightarrow$}\vss}}}. Figure 11 displays the various morphisms and their relationships. This construction demonstrates that marginal subgroups are not just characteristic, but invariant under all epimorphisms.

The construction applies to other algebraic structures by simply involving formulas in the appropriate signature. However, the notion of congruence does not always yield a substructure, so the structures are more naturally expressed as characteristic quotients.

8. Composite characteristic structures

We now address one remaining powerful feature of our categorical description of characteristic structure. It relates to a comment we made after Theorem 2: a characteristic subgroup may arise from (𝖠,𝖡)(\mathsf{A},\mathsf{B})-morphisms 𝖡𝖠\mathsf{B}\to\mathsf{A} where 𝖡\mathsf{B} is not a category of groups. We give one illustration of how this “transferability” explains techniques currently used in isomorphism tests.

In [Wilson:filters]*§4, it is shown that a pp-group GG of class at most 22 with exponent pp has a characteristic subgroup induced by the Jacobson radical of an algebra associated to the bilinear commutator map of GG. Here we construct that characteristic subgroup using a tensor product of capsules, as described in Section 5.2.

8.1. From groups to bimaps

Fix an odd prime pp, and let 𝖦\hstretch.13==𝖦𝗋𝗉2,p\mathsf{G}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathrel{\mathop{\mathsf{Grp}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\longleftrightarrow$}\vss}}}_{2,p} be the category whose objects are pp-groups of class at most 22 with exponent pp, and whose morphisms are isomorphisms. The objects of 𝖦\mathsf{G} are groups GG with exponent pp and central derived subgroup, so γ2(G)ζ(G)\gamma_{2}(G)\leqslant\zeta(G).

Let 𝔽p\mathbb{F}_{p} be the field with pp elements, and let 𝖡\hstretch.13==𝖡𝗂(𝔽p)\mathsf{B}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathrel{\mathop{\mathsf{Bi}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\longleftrightarrow$}\vss}}}(\mathbb{F}_{p}) be the category of alternating 𝔽p\mathbb{F}_{p}-bilinear maps. The objects of 𝖡\mathsf{B} are bilinear maps b:V×VWb:V\times V\to W, where VV and WW are 𝔽p\mathbb{F}_{p}-spaces, such that b(u,v)=b(v,u)b(u,v)=-b(v,u) for all vectors u,vu,v. For objects b:V×VWb:V\times V\to W and b:V×VWb^{\prime}:V^{\prime}\times V^{\prime}\to W^{\prime} in 𝖡\mathsf{B}, a morphism φ:bb\varphi:b\to b^{\prime} is a pair of invertible linear maps (α:VV,β:WW)(\alpha:V\to V^{\prime},\beta:W\to W^{\prime}) such that, for all u,vVu,v\in V,

\[b^{\prime}(\alpha u,\alpha v)=\beta b(u,v).\]

Define a functor :𝖦𝖡\mathcal{B}:\mathsf{G}\to\mathsf{B} that takes a group GG to

\[b_{G}:G/\gamma_{2}(G)\times G/\gamma_{2}(G)\to\gamma_{2}(G),\quad(x\gamma_{2}(G),y\gamma_{2}(G))\mapsto[x,y],\]

and a homomorphism φ:GH\varphi:G\to H to the pair (φ|γ2,φ|γ2)(\varphi|^{\gamma_{2}},\ \varphi|_{\gamma_{2}}), as defined in (7.1) and (7.2). Since GG has exponent pp and γ2(G)ζ(G)\gamma_{2}(G)\leqslant\zeta(G) by assumption, bGb_{G} is an alternating 𝔽p\mathbb{F}_{p}-bilinear map.

Next, define a functor 𝒢:𝖡𝖦\mathcal{G}:\mathsf{B}\to\mathsf{G} that takes an 𝔽p\mathbb{F}_{p}-bilinear map b:V×VWb:V\times V\to W to the group GbG_{b} on V×WV\times W with binary operation

\[(v_{1},w_{1})\cdot(v_{2},w_{2})=\left(v_{1}+v_{2},\,w_{1}+w_{2}+\frac{1}{2}b(v_{1},v_{2})\right).\]

A morphism (α,β)(\alpha,\beta) from b:V×VWb:V\times V\to W to b:V×VWb^{\prime}:V^{\prime}\times V^{\prime}\to W^{\prime} in 𝖡\mathsf{B} induces a group isomorphism, denoted αβ\alpha\boxtimes\beta, mapping Gb=V×WG_{b}=V\times W to Gb=V×WG_{b^{\prime}}=V^{\prime}\times W^{\prime} by

\[(\alpha\boxtimes\beta)(v,w) := (\alpha v,\beta w).\]
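
The construction of $G_b$ can be checked by direct computation in a small case. The Python sketch below takes $p=3$, $V=\mathbb{F}_3^2$, $W=\mathbb{F}_3$ and the determinant form as $b$ (all of these choices, and the particular pair $(\alpha,\beta)$, are assumptions made for illustration). It verifies that the operation is associative, that every element has order dividing $p$, that commutators recover $b$, and that $\alpha\boxtimes\beta$ is a homomorphism when $b(\alpha u,\alpha v)=\beta b(u,v)$.

```python
from itertools import product

p = 3
half = (p + 1) // 2  # inverse of 2 modulo an odd prime p

def b(u, v):
    """Alternating bilinear map F_p^2 x F_p^2 -> F_p (2x2 determinant)."""
    return (u[0] * v[1] - u[1] * v[0]) % p

def mul(x, y):
    """(v1, w1)(v2, w2) = (v1 + v2, w1 + w2 + b(v1, v2)/2)."""
    (v1, w1), (v2, w2) = x, y
    v = ((v1[0] + v2[0]) % p, (v1[1] + v2[1]) % p)
    return (v, (w1 + w2 + half * b(v1, v2)) % p)

def inv(x):
    """Inverse of (v, w) is (-v, -w), because b(v, v) = 0."""
    (v, w) = x
    return (((-v[0]) % p, (-v[1]) % p), (-w) % p)

def power(x, n):
    """x multiplied with itself n times."""
    out = ((0, 0), 0)
    for _ in range(n):
        out = mul(out, x)
    return out

identity = ((0, 0), 0)
G = [((a, c), w) for a, c, w in product(range(p), repeat=3)]

associative = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                  for x in G for y in G for z in G)
exponent_p = all(power(x, p) == identity for x in G)
# [x, y] = x^-1 y^-1 x y has trivial V-part and W-part b(v1, v2).
commutators_give_b = all(
    mul(mul(inv(x), inv(y)), mul(x, y)) == ((0, 0), b(x[0], y[0]))
    for x in G for y in G
)

# A hypothetical morphism (alpha, beta) of b to itself:
# alpha has matrix [[1, 1], [0, 2]] (determinant 2), beta is scaling by 2.
def alpha(v):
    return ((v[0] + v[1]) % p, (2 * v[1]) % p)

def beta(w):
    return (2 * w) % p

def boxtimes(x):
    """(alpha boxtimes beta)(v, w) = (alpha v, beta w)."""
    return (alpha(x[0]), beta(x[1]))

V = [x[0] for x in G if x[1] == 0]
compatible = all(b(alpha(u), alpha(v)) == beta(b(u, v)) for u in V for v in V)
is_homomorphism = all(boxtimes(mul(x, y)) == mul(boxtimes(x), boxtimes(y))
                      for x in G for y in G)
print(associative, exponent_p, commutators_give_b, compatible, is_homomorphism)
```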
Lemma 8.1.

The functor :𝖦𝖡\mathcal{B}:\mathsf{G}\to\mathsf{B} is a (𝖦,𝖡)(\mathsf{G},\mathsf{B})-morphism.

Proof.

The functor $\mathcal{B}$ induces a left $\mathsf{G}$-action on (the morphisms of) $\mathsf{B}$, and $\mathcal{G}$ induces a right $\mathsf{B}$-action on $\mathsf{G}$, so $\mathsf{B}$ and $\mathsf{G}$ are $(\mathsf{G},\mathsf{B})$-bicapsules. Let $\lambda,\mu$ be morphisms of $\mathsf{G}$ and let $(\alpha,\beta)$ be a morphism of $\mathsf{B}$ such that $\lambda\mu\cdot(\alpha,\beta)=\lambda\mu(\alpha\boxtimes\beta)$ is defined. Now

\begin{align*}
\mathcal{B}(\lambda\mu\cdot(\alpha,\beta)) &= \left((\lambda\mu(\alpha\boxtimes\beta))|^{\gamma_{2}},\ (\lambda\mu(\alpha\boxtimes\beta))|_{\gamma_{2}}\right) \\
&= \left(\lambda|^{\gamma_{2}}\mu|^{\gamma_{2}}\alpha,\ \lambda|_{\gamma_{2}}\mu|_{\gamma_{2}}\beta\right) \\
&= \lambda\cdot\mathcal{B}(\mu)(\alpha,\beta),
\end{align*}

so \mathcal{B} is a (𝖦,𝖡)(\mathsf{G},\mathsf{B})-morphism. ∎

By applying the dual version of Theorem 4.11(a), we obtain a unit id𝖦𝒢\operatorname{id}_{\mathsf{G}}\Rightarrow\mathcal{BG}. There is also a counit id𝖦𝒢\operatorname{id}_{\mathsf{G}}\Leftarrow\mathcal{BG}. Together these give a categorical interpretation of the Baer correspondence [Baer].

8.2. From bimaps to algebras

Let 𝖠\hstretch.13==𝖠𝗅𝗀𝖾(𝔽p)\mathsf{A}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathrel{\mathop{\mathsf{Alge}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\longleftrightarrow$}\vss}}}(\mathbb{F}_{p}) be the category of 𝔽p\mathbb{F}_{p}-matrix algebras with algebra isomorphisms. Using [Wilson:filters]*§4, define a functor 𝒜:𝖡𝖠\mathcal{A}:\mathsf{B}\to\mathsf{A} by

\[\mathcal{A}(b) = \left\{f\in\operatorname{End}(V)~\middle|~\exists f^{*}\in\operatorname{End}(V)^{op},\ \forall u,v\in V,\; b(fu,v)=b(u,f^{*}v)\right\}.\]

Invertible morphisms (α,β)(\alpha,\beta) in 𝖡\mathsf{B} from b:V×VWb:V\times V\to W to b:V×VWb^{\prime}:V^{\prime}\times V^{\prime}\to W^{\prime} are sent to

\[\mathcal{A}(\alpha,\beta): f\in\mathcal{A}(b)\mapsto f^{\alpha^{-1}}\in\mathcal{A}(b^{\prime}).\]
Fact 8.2.

The functor 𝒜\mathcal{A} is a (𝖡,𝖠)(\mathsf{B},\mathsf{A})-morphism.

8.3. From matrix algebras to semisimple algebras

Every matrix algebra AA over a field is Artinian, so the quotient of AA by its Jacobson radical Jac(A)\mathrm{Jac}(A) is a semisimple algebra. The map AA/Jac(A)A\mapsto A/\mathrm{Jac}(A) is a functor from 𝖠\mathsf{A} to the category 𝖲\hstretch.13==𝖲𝖲𝖠𝗅𝗀𝖾(𝔽p)\mathsf{S}\mathrel{\hstretch{.13}{=}\hskip 0.86108pt{=}}\mathrel{\mathop{\mathsf{SSAlge}}\limits^{\vbox to0.0pt{\kern-2.0pt\hbox{$\scriptstyle\longleftrightarrow$}\vss}}}(\mathbb{F}_{p}) of semisimple 𝔽p\mathbb{F}_{p}-algebras. It is also an (𝖠,𝖲)(\mathsf{A},\mathsf{S})-morphism.

8.4. Combining capsules

Recall that

\[\mathsf{G}=\overset{\longleftrightarrow}{\mathsf{Grp}}_{2,p},\qquad \mathsf{B}=\overset{\longleftrightarrow}{\mathsf{Bi}}(\mathbb{F}_{p}),\qquad \mathsf{A}=\overset{\longleftrightarrow}{\mathsf{Alge}}(\mathbb{F}_{p}),\qquad \mathsf{S}=\overset{\longleftrightarrow}{\mathsf{SSAlge}}(\mathbb{F}_{p}).\]

Denote by Δ\Delta the bicapsule associated to the (𝖦,𝖡)(\mathsf{G},\mathsf{B})-morphism in Lemma 8.1. Denote by Γ\Gamma and Υ\Upsilon, respectively, the bicapsules associated to the (𝖡,𝖠)(\mathsf{B},\mathsf{A})- and (𝖠,𝖲)(\mathsf{A},\mathsf{S})-morphisms in Fact 8.2 and Section 8.3. These three capsules can now be combined to produce the (𝖦,𝖲)(\mathsf{G},\mathsf{S})-capsule

\[\Delta\otimes_{\mathsf{B}}\Gamma\otimes_{\mathsf{A}}\Upsilon=\mathsf{G}\cdot\mu\cdot\mathsf{S}.\]

The resulting generator μ\mu of this cyclic bicapsule is a unital. By Theorem 2-dual this provides the characteristic subgroup used in [Maglione:adjoints] and [Wilson:filters]*§4.

Acknowledgements

We thank Chris Liu for fruitful discussions and some proof-of-concept implementations. We thank John Power and Mima Stanojkovski for comments on a draft. Brooksbank was supported by NSF grant DMS-2319371. Maglione was supported by DFG grant VO 1248/4-1 (project number 373111162) and DFG-GRK 2297. O’Brien was supported by the Marsden Fund of New Zealand Grant 23-UOA-080 and by a Research Award of the Alexander von Humboldt Foundation. Wilson was supported by a Simons Foundation Grant identifier #636189 and by NSF grant DMS-2319370.
