Maximal Ideals in Commutative Rings and the Axiom of Choice

Alexei Entin
Raymond and Beverly Sackler School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel
aentin@tauex.tau.ac.il

Abstract.

It is well-known that within Zermelo-Fraenkel set theory (ZF), the Axiom of Choice (AC) implies the Maximal Ideal Theorem (MIT), namely that every nontrivial commutative ring has a maximal ideal. The converse implication MIT $\Rightarrow$ AC was first proved by Hodges, with subsequent proofs given by Banaschewski and Erné. Here we give another derivation of MIT $\Rightarrow$ AC, aiming to make the exposition self-contained and accessible to non-experts with only introductory familiarity with commutative ring theory and naive set theory.

1. Introduction

In the present note we work in Zermelo-Fraenkel set theory (ZF) and all numbered assertions are theorems of ZF. By a ring we will always mean an associative ring with a unit and by a domain we mean a nontrivial commutative ring without zero-divisors. A foundational result in the theory of commutative rings (attributed to Krull) is the

Maximal Ideal Theorem (MIT). Every nontrivial commutative ring $R$ has a maximal ideal.

This result is usually proved using Zorn’s Lemma, which is equivalent to the

Axiom of Choice (AC). Let $(A_{i})_{i\in I}$ be a family of non-empty sets. There exists a function $f:I\to\bigcup_{i\in I}A_{i}$ such that $f(i)\in A_{i}$ for each $i\in I$.

Thus we have AC $\Rightarrow$ MIT. Scott [Sco54] raised the question of whether the converse implication MIT $\Rightarrow$ AC holds. It was answered affirmatively by Hodges [Hod79]. Banaschewski [Ban94] and Erné [Ern95] gave alternative proofs. All of the proofs are based on a similar principle: a combinatorial statement equivalent to AC is encoded in the existence of a maximal ideal in a suitable localized polynomial ring. Hodges’s proof establishes (assuming MIT) that every tree has a branch [Jec03, Definition 9.10] (which implies AC). Banaschewski’s proof establishes AC directly and Erné’s proof establishes the Teichmüller-Tukey lemma [Jec08, p. 10] (which implies AC). The variant presented in this note establishes a weak form of Zorn’s Lemma, which (as we will see below) is actually equivalent to the original Zorn’s Lemma. The proof presented here was developed independently of [Ban94, Ern95] (the author was initially familiar only with Hodges’s work and learned about the work of Banaschewski and Erné later). The material in the present note can be viewed as a special case of the results in [Ern95] and is based on similar arguments. Nevertheless we hope that the presentation here will be more accessible and intuitive to non-experts and perhaps offer a new point of view that may eventually lead to novel results. The only background required from the reader is introductory familiarity with commutative rings, ideals, polynomial rings, localization, partial orders and Zorn’s Lemma.

We now state a more precise version of MIT $\Rightarrow$ AC that we will prove below (this statement is also proved in [Ern95] and in slightly weaker forms in [Hod79, Ban94]).

Theorem 1.

Let $R$ be a domain and assume that for every set of variables $X$ and any multiplicative subset $S$ of the polynomial ring $R[X]$, the localization $S^{-1}R[X]$ has a maximal ideal. Then $\mathrm{AC}$ holds.

2. Weak Zorn’s Lemma

Our proof does not establish AC directly, but instead establishes an equivalent statement which we call the Weak Zorn’s Lemma. Before stating it we review some basic definitions and (the original) Zorn’s Lemma.

Definition 2.

Let $X$ be a partially-ordered set (poset). A subset $Y\subset X$ is a chain if for any $x,y\in Y$ we have $x\leqslant y$ or $y\leqslant x$.

Definition 3.

Let $X$ be a poset, $Y\subset X$. An upper bound on $Y$ is an element $x\in X$ such that $y\leqslant x$ for all $y\in Y$.

Definition 4.

Let $X$ be a poset. A subset $Y\subset X$ is (upward) compatible if any finite subset of $Y$ has an upper bound in $X$.

Note that every chain is compatible, since every non-empty finite subset of a chain has a greatest element, which is in particular an upper bound.
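As an informal illustration of Definitions 2-4 (not part of the argument), the following Python sketch tests both notions on a small finite poset. The poset itself, the helper names `closure`, `is_chain`, `has_upper_bound` and `is_compatible`, and the encoding of $\leqslant$ as a set of pairs are all assumptions made for this example. It exhibits a compatible subset that is not a chain, so the converse of the note above fails in general.

```python
from itertools import combinations

def closure(elements, pairs):
    """Reflexive-transitive closure of `pairs`; (u, v) in the result means u <= v."""
    leq = {(e, e) for e in elements} | set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(leq):
            for (c, d) in list(leq):
                if b == c and (a, d) not in leq:
                    leq.add((a, d))
                    changed = True
    return leq

def is_chain(Y, leq):
    """Definition 2: any two elements of Y are comparable."""
    return all((x, y) in leq or (y, x) in leq for x in Y for y in Y)

def has_upper_bound(F, elements, leq):
    """Definition 3: some x in the poset satisfies y <= x for all y in F."""
    return any(all((y, x) in leq for y in F) for x in elements)

def is_compatible(Y, elements, leq):
    """Definition 4: every finite subset of Y has an upper bound in the poset."""
    Y = sorted(Y)
    return all(has_upper_bound(F, elements, leq)
               for r in range(len(Y) + 1)
               for F in combinations(Y, r))

# Hypothetical poset: a <= c and b <= c, while d is comparable only to itself.
elements = ["a", "b", "c", "d"]
leq = closure(elements, [("a", "c"), ("b", "c")])

print(is_chain({"a", "b"}, leq))                 # False: a and b are incomparable
print(is_compatible({"a", "b"}, elements, leq))  # True: c bounds every finite subset
print(is_compatible({"a", "d"}, elements, leq))  # False: {a, d} has no upper bound
```

For a finite $Y$ the check could be shortened to testing whether $Y$ itself has an upper bound; the sketch follows the wording of Definition 4 literally.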

Definition 5.

Let $X$ be a poset. An element $x\in X$ is maximal if there is no $y\in X$ such that $x<y$.

Zorn’s Lemma (ZL). Let $X$ be a poset such that every chain $Y\subset X$ has an upper bound. Then $X$ has a maximal element.

Weak Zorn’s Lemma (WZL). Let $X$ be a poset such that every (upward) compatible subset $Y\subset X$ has an upper bound. Then $X$ has a maximal element.

Since every chain in a poset $X$ is compatible, we clearly have ZL $\Rightarrow$ WZL. In fact WZL is equivalent to ZL. We now show that WZL directly implies both ZL and AC.

Proposition 6.

(i) $\mathrm{WZL}\Rightarrow\mathrm{AC}$.

(ii) $\mathrm{WZL}\Rightarrow\mathrm{ZL}$.

Proof.

(i). Let $(A_{i})_{i\in I}$ be a family of non-empty sets. Consider the set $X$ of partial choice functions for the family, i.e. the set of functions $f$ with domain $\mathrm{dom}(f)\subset I$ and $f(i)\in A_{i}$ for all $i\in\mathrm{dom}(f)$. We order $X$ by inclusion, i.e. $f\leqslant g$ iff ${\mathrm{dom}}(f)\subset{\mathrm{dom}}(g)$ and $g|_{{\mathrm{dom}}(f)}=f$.

Let $Y\subset X$ be compatible. Then for any $f,g\in Y$ and $i\in{\mathrm{dom}}(f)\cap{\mathrm{dom}}(g)$ we have $f(i)=g(i)$, because $\{f,g\}$ has an upper bound $h\in X$ and $f(i)=h(i)=g(i)$. Consequently $\bigcup_{f\in Y}f$ is itself a partial choice function and is an upper bound on $Y$. Assuming WZL, there is a maximal element $f\in X$. If ${\mathrm{dom}}(f)\subsetneq I$ we can take $g=f\cup\{(i,a)\}$ for some $i\in I\setminus{\mathrm{dom}}(f),\,a\in A_{i}$ and then $f<g$, contradicting maximality. Therefore ${\mathrm{dom}}(f)=I$ and $f$ is a (full) choice function for $(A_{i})_{i\in I}$. Thus AC holds.

(ii). Let $X$ be a poset in which every chain has an upper bound. Let $C$ be the set of chains in $X$, partially ordered by inclusion. We want to apply WZL to $C$, so we check that its condition holds. Let $D\subset C$ be compatible and set $e=\bigcup_{c\in D}c$. If $x,y\in e$ then $x\in c,\,y\in d$ for some $c,d\in D$; since $D$ is compatible, $\{c,d\}$ has an upper bound $u\in C$, and thus $x,y\in c\cup d\subset u$ are comparable, i.e. $x\leqslant y$ or $y\leqslant x$. Therefore $e$ is a chain, i.e. $e\in C$, and it is an upper bound on $D$. Assuming WZL, there must be a maximal element $c\in C$ (i.e. a maximal chain in $X$). By assumption $c$ has an upper bound $x$ in $X$. If $x$ is not maximal, say $x<y$ for some $y\in X$, then $c\cup\{x,y\}\supsetneq c$ is a chain, contradicting the maximality of $c$. Thus $X$ has a maximal element as required.

∎

We note that many important applications of Zorn’s Lemma involve partial orders which satisfy the conditions of WZL, e.g. the standard proofs of MIT and of the fact that every vector space has a basis (but many other applications involve partial orders satisfying the condition of ZL but not of WZL). Thus WZL is a fairly natural statement, which could potentially be useful for deriving other equivalent forms of AC.

3. Polynomial rings with partially ordered variables

In light of Proposition 6, Theorem 1 would follow at once from the following

Proposition 7.

Let $R$ be a domain and $X$ a set of variables. Assume that for any multiplicative subset $S$ of the polynomial ring $R[X]$ the localization $S^{-1}R[X]$ has a maximal ideal. Then any partial order on $X$ satisfying the condition of WZL has a maximal element.

It remains to prove Proposition 7, which will occupy the rest of this note. We fix an arbitrary poset $(X,\leqslant)$ and an arbitrary domain $R$. We view $X$ as a set of variables for the polynomial ring $R[X]$, and thus naturally as a subset of $R[X]$. A monomial in $R[X]$ is a product (possibly empty, equalling 1) of variables in $X$. We say that a monomial $m$ appears in $f\in R[X]$ if the coefficient of $m$ in $f$ is non-zero and we say that a variable $x$ appears in $m$ (or that $m$ contains $x$) if the exponent of $x$ in $m$ is positive.

Definition 8.

Let $x\in X$ be a variable and $f\in R[X]$ a polynomial. We say that $f$ is dominated by $x$ if every monomial appearing in $f$ contains a variable $y$ with $y\leqslant x$. We denote this by $f\leqslant x$.

We now make two observations (immediate from the definition), which will be used repeatedly.

(i) If $f,g\leqslant x$ and $p,q\in R[X]$ then $pf+qg\leqslant x$ (in other words the set of polynomials dominated by $x$ is an ideal).

(ii) If $f\leqslant x,\,g\not\leqslant x$ then $f+g\not\leqslant x$.
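To make Definition 8 and observation (ii) concrete, here is a small Python sketch, offered only as an illustration under stated assumptions. Polynomials over $R=\mathbb{Z}$ are stored as dictionaries from monomials to non-zero coefficients, and a monomial is a sorted tuple of (variable, exponent) pairs; this encoding, the three-element poset, and the helper names `monomial`, `add` and `dominated` are hypothetical choices for the example.

```python
def monomial(*variables):
    """Encode a product of variables as a sorted tuple of (variable, exponent) pairs."""
    exps = {}
    for v in variables:
        exps[v] = exps.get(v, 0) + 1
    return tuple(sorted(exps.items()))

def add(f, g):
    """Sum of two polynomials stored as {monomial: non-zero integer coefficient}."""
    h = dict(f)
    for m, c in g.items():
        h[m] = h.get(m, 0) + c
    return {m: c for m, c in h.items() if c != 0}

def dominated(f, x, leq):
    """Definition 8: every monomial appearing in f contains some y with y <= x."""
    return all(any((y, x) in leq for (y, _) in m) for m in f)

# Hypothetical poset on X = {a, b, c} with a <= c and b <= c.
X = ["a", "b", "c"]
leq = {(v, v) for v in X} | {("a", "c"), ("b", "c")}

f = {monomial("a", "b"): 1, monomial("a"): 2}   # f = ab + 2a
g = {monomial("b"): 1}                          # g = b
one = {monomial(): 1}                           # the constant polynomial 1

print(dominated(f, "a", leq))          # True: both monomials of f contain a
print(dominated(f, "b", leq))          # False: the monomial a has no variable <= b
print(dominated(g, "a", leq))          # False: b is not <= a
print(dominated(add(f, g), "a", leq))  # False, as observation (ii) predicts
print(any(dominated(one, x, leq) for x in X))  # False: the empty monomial has no variables
```

Note that with this encoding the zero polynomial (the empty dictionary) is dominated by every variable, in agreement with a literal reading of Definition 8.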

Definition 9.

A polynomial $f\in R[X]$ is called small if $f\leqslant x$ for some $x\in X$. It is called big if no such $x$ exists.

Lemma 10.

The set $S$ of big polynomials in $R[X]$ is multiplicative.

Proof.

Clearly $1\in S$. Let $f,g\in S$ be big and let $x\in X$ be a variable. Write $f=f_{1}+f_{2},\,g=g_{1}+g_{2}$, where $f_{1}$ (resp. $g_{1}$) consists of the monomials of $f$ (resp. $g$) dominated by $x$, and $f_{2}$ (resp. $g_{2}$) consists of the monomials of $f$ (resp. $g$) not dominated by $x$. Since $f,g$ are big we must have $f_{2},g_{2}\neq 0$, hence $f_{2}g_{2}\neq 0$ because $R[X]$ is a domain, and therefore $f_{2}g_{2}\not\leqslant x$ (no monomial appearing in $f_{2}$ or $g_{2}$ contains a variable $\leqslant x$, so the same holds for every monomial appearing in $f_{2}g_{2}$). Since $fg=(f_{1}g_{1}+f_{1}g_{2}+f_{2}g_{1})+f_{2}g_{2}$ and $f_{1}g_{1}+f_{1}g_{2}+f_{2}g_{1}\leqslant x$ (by observation (i) above), we have $fg\not\leqslant x$ (by observation (ii) above). This is true for any $x\in X$, so $fg\in S$ is big and $S$ is multiplicative. ∎
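Lemma 10 can be sanity-checked by brute force on a small family of polynomials. The Python sketch below is only an illustration, not a substitute for the proof. It assumes $R=\mathbb{Z}$, a hypothetical poset on three variables, and the same dictionary encoding of polynomials as above (monomials as sorted tuples of (variable, exponent) pairs); the helper names `mul`, `dominated` and `is_big` are invented for the example.

```python
from itertools import product

def mul(f, g):
    """Product of polynomials stored as {monomial: non-zero integer coefficient}."""
    h = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            exps = dict(m1)
            for v, e in m2:
                exps[v] = exps.get(v, 0) + e
            m = tuple(sorted(exps.items()))
            h[m] = h.get(m, 0) + c1 * c2
    return {m: c for m, c in h.items() if c != 0}

def dominated(f, x, leq):
    """Definition 8: every monomial appearing in f contains some y with y <= x."""
    return all(any((y, x) in leq for (y, _) in m) for m in f)

def is_big(f, X, leq):
    """Definition 9: f is big if it is dominated by no variable."""
    return not any(dominated(f, x, leq) for x in X)

# Hypothetical poset on X = {a, b, c}: a <= c, and b is comparable only to itself.
X = ["a", "b", "c"]
leq = {(v, v) for v in X} | {("a", "c")}

# All polynomials with coefficients in {-1, 0, 1} on the monomials 1, a, b, c.
monomials = [(), (("a", 1),), (("b", 1),), (("c", 1),)]
family = [
    {m: c for m, c in zip(monomials, coeffs) if c != 0}
    for coeffs in product([-1, 0, 1], repeat=len(monomials))
]
big = [f for f in family if is_big(f, X, leq)]
all_products_big = all(is_big(mul(f, g), X, leq) for f in big for g in big)
print(len(family), len(big), all_products_big)
```

In this family the small polynomials are exactly those supported on $\{b\}$ or on $\{a,c\}$, and the check confirms that the product of any two big members is again big.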

Definition 11.

An ideal $I\lhd R[X]$ is called small if every $f\in I$ is small.

Definition 12.

A maximal small ideal is a small ideal $P\lhd R[X]$ which is not properly contained in another small ideal.

Lemma 13.

Let $A$ be a commutative ring, $S\subset A$ a multiplicative subset with $0\not\in S$. Assume that the localization $S^{-1}A$ has a maximal ideal. Then the set of ideals $I\lhd A$ disjoint from $S$ contains a maximal element (with respect to inclusion).

Proof.

Let $M\lhd S^{-1}A$ be a maximal ideal and consider the localization homomorphism $\mathrm{loc}:A\to S^{-1}A$ given by $\mathrm{loc}(a)=\frac{a}{1}$. The ideal $\mathrm{loc}^{-1}(M)\lhd A$ is disjoint from $S$, since the elements of $S$ map to units of $S^{-1}A$ and $M$ is a proper ideal. Moreover $\mathrm{loc}^{-1}(M)$ is maximal among the ideals of $A$ disjoint from $S$: if $\mathrm{loc}^{-1}(M)\subsetneq I\lhd A$ then $M\subsetneq S^{-1}I\lhd S^{-1}A$, so $1\in S^{-1}I$ and therefore $I\cap S\neq\emptyset$.∎

Lemma 14.

Assume that every localization of $R[X]$ by a multiplicative subset has a maximal ideal. Then there exists a maximal small ideal $P\lhd R[X]$ .

Proof.

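Apply the previous lemma to $A=R[X]$ and $S$ the set of big polynomials, which is multiplicative by Lemma 10 (note that $0\not\in S$ when $X\neq\emptyset$, since the zero polynomial is vacuously dominated by every variable). By Definition 11 an ideal of $R[X]$ is small precisely when it is disjoint from $S$, so a maximal element among the ideals disjoint from $S$ is a maximal small ideal.∎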

Lemma 15.

Let $Y\subset X$ be compatible (in the sense of Definition 4). Then the ideal $(Y)\lhd R[X]$ generated by $Y$ is small.

Proof.

Let $g=\sum_{i=1}^{n}g_{i}y_{i}\in(Y)$ with $y_{i}\in Y,\,g_{i}\in R[X]$. Since $Y$ is compatible there exists $x\in X$ such that $y_{i}\leqslant x$ for $1\leq i\leq n$. Each $y_{i}$ is then dominated by $x$, so $g\leqslant x$ by observation (i) following Definition 8. Thus every $g\in(Y)$ is small and $(Y)$ is a small ideal.∎

Proposition 16.

Let $P\lhd R[X]$ be a maximal small ideal and denote $Y=P\cap X$ . Then

(i) $P=(Y)$ is generated by $Y$.

(ii) $Y$ is compatible.

(iii) $Y$ is maximal compatible: if $Y\subset Y^{\prime}\subset X$ and $Y^{\prime}$ is compatible, then $Y^{\prime}=Y$.

Proof.

(i). Clearly $P\supset(Y)$, so it is enough to show that $P\subset(Y)$. Let $f\in P$ and let $m$ be a monomial appearing in $f$.

Claim. There exists $z\in Y$ which appears in $m$ .

Since $f\in P$ is small we have $m\neq 1$, so let $x$ be a variable appearing in $m$. Let $g\in P$ and consider $h=x^{d}f+g\in P$, where $d>\deg g$. Since $h$ is small and the monomials appearing in $g$ and $x^{d}m$ also appear in $h$ because of our assumption on $d$ (they cannot cancel each other out), there exists $y\in X$ such that $h\leqslant y$ and therefore $m,g\leqslant y$ (note that since $x$ occurs in $m$ the condition $x^{d}m\leqslant y$ is equivalent to $m\leqslant y$). This implies in particular that $z\leqslant y$ for some $z\in X$ appearing in $m$. Consequently $qz+g\leqslant y$ is small for any $q\in R[X],\,g\in P$. Therefore the ideal $(P,z)\supset P$ is small and by the maximality of $P$ we have $z\in P\cap X=Y$, establishing the claim.

Now since every monomial appearing in $f$ contains a variable from $Y$ , we have $f\in(Y)$ and the proof of (i) is complete.

(ii). Let $x_{1},\ldots,x_{n}\in Y$. We assume WLOG that they are distinct. Then $x_{1}+x_{2}+\ldots+x_{n}\in P$ is small, i.e. dominated by some $x\in X$. Since each $x_{i}$ is a monomial appearing in this sum, we get $x_{i}\leqslant x$ for all $i$, so $x$ is an upper bound on $\{x_{1},\ldots,x_{n}\}$. Hence $Y$ is compatible.

(iii). Let $Y\subset Y^{\prime}\subset X$ with $Y^{\prime}$ compatible. By Lemma 15, $(Y^{\prime})\supset(Y)=P$ is a small ideal. By the maximality of $P$ we have $(Y^{\prime})=P$ and therefore $Y^{\prime}=(Y^{\prime})\cap X=P\cap X=Y$. ∎

Remark 17.

The converse of Proposition 16(iii) also holds: if $Y\subset X$ is maximal compatible, the ideal $(Y)$ is a maximal small ideal in $R[X]$. Indeed, $(Y)$ is small by Lemma 15 and if $f\in R[X]\setminus(Y)$ then either $f$ is a constant and thus big, or $f$ has a monomial $m=x_{1}\cdots x_{n}$ with $x_{i}\not\in Y$ for $1\leq i\leq n$. Since $Y\cup\{x_{i}\}$ is not compatible for any $i$, one can pick $y_{ij}\in Y,\,1\leq i\leq n,\,1\leq j\leq n_{i}$ such that each $\{y_{i1},\ldots,y_{in_{i}},x_{i}\}$ has no upper bound in $X$. Therefore $f+\sum_{y\in\{y_{ij}:1\leq i\leq n,1\leq j\leq n_{i}\}}y$ is big. In either case the ideal $(Y,f)\lhd R[X]$ is not small and therefore $(Y)$ is a maximal small ideal. Thus there is a bijection between maximal compatible subsets of $X$ and maximal small ideals of $R[X]$. This is a special case of [Ern95, Proposition on p. 126].

4. Conclusion of the proof

Proof of Proposition 7.

Let $X$ be a poset such that every compatible $Y\subset X$ has an upper bound. By the assumption of Proposition 7 and Lemma 14 there exists a maximal small ideal $P\lhd R[X]$. By Proposition 16(ii-iii) the set $Y=P\cap X\subset X$ is maximal compatible and by assumption $Y$ has an upper bound $x\in X$. We claim that $x$ is a maximal element of $X$. Otherwise $x<y$ for some $y\in X$; then $y\not\in Y$ (because $x$ is an upper bound on $Y$ and $x<y$), so $Y\cup\{x,y\}\supsetneq Y$, and this set is compatible (since $y$ is an upper bound on the entire set), contradicting the maximality of $Y$. ∎
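The last step of the proof can be traced on a finite poset, where maximal elements exist trivially, so the sketch below is purely illustrative. It enumerates the maximal compatible subsets of a hypothetical four-element poset and confirms that each of their upper bounds is a maximal element. The poset and the helper names `has_upper_bound`, `upper_bounds` and `is_maximal` are assumptions made for the example. It uses the fact that a finite subset is compatible exactly when it has an upper bound itself.

```python
from itertools import combinations

def has_upper_bound(Y, X, leq):
    """Some x in X satisfies y <= x for every y in Y."""
    return any(all((y, x) in leq for y in Y) for x in X)

def upper_bounds(Y, X, leq):
    """All upper bounds of Y in X."""
    return [x for x in X if all((y, x) in leq for y in Y)]

def is_maximal(x, X, leq):
    """Definition 5: there is no y in X with x < y."""
    return not any((x, y) in leq and x != y for y in X)

# Hypothetical poset: a <= c, b <= c, b <= d (already transitively closed).
X = ["a", "b", "c", "d"]
leq = {(v, v) for v in X} | {("a", "c"), ("b", "c"), ("b", "d")}

# A finite Y is compatible exactly when Y itself has an upper bound.
subsets = [frozenset(F) for r in range(len(X) + 1) for F in combinations(X, r)]
compatible = [Y for Y in subsets if has_upper_bound(Y, X, leq)]
maximal_compatible = [Y for Y in compatible if not any(Y < Z for Z in compatible)]

for Y in maximal_compatible:
    bounds = upper_bounds(Y, X, leq)
    print(sorted(Y), bounds, all(is_maximal(x, X, leq) for x in bounds))
```

Here the maximal compatible subsets are $\{b,d\}$ and $\{a,b,c\}$, with upper bounds $d$ and $c$ respectively, both of which are maximal in $X$.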

Acknowledgment. The author was partially supported by Israel Science Foundation grant no. 2507/19.

References

[Ban94] B. Banaschewski. A new proof that Krull implies Zorn. Math. Log. Quart., 40:478–480, 1994.
[Ern95] M. Erné. A primrose path from Krull to Zorn. Comment. Math. Univ. Carolin., 36(1):123–126, 1995.
[Hod79] W. Hodges. Krull implies Zorn. J. London Math. Soc., 19:285–287, 1979.
[Jec03] T. J. Jech. Set Theory. Springer Monogr. Math. Springer Berlin, Heidelberg, 3rd edition, 2003.
[Jec08] T. J. Jech. The axiom of choice. Courier Corporation, 2008.
[Sco54] D. S. Scott. Prime ideal theorems for rings, lattices and boolean algebras. Bull. Amer. Math. Soc., 60:390, 1954.