Maximal Ideals in Commutative Rings and the Axiom of Choice

Alexei Entin Raymond and Beverly Sackler School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel aentin@tauex.tau.ac.il
Abstract.

It is well-known that within Zermelo-Fraenkel set theory (ZF), the Axiom of Choice (AC) implies the Maximal Ideal Theorem (MIT), namely that every nontrivial commutative ring has a maximal ideal. The converse implication MIT $\Rightarrow$ AC was first proved by Hodges, with subsequent proofs given by Banaschewski and Erné. Here we give another derivation of MIT $\Rightarrow$ AC, aiming to make the exposition self-contained and accessible to non-experts with only introductory familiarity with commutative ring theory and naive set theory.

1. Introduction

In the present note we work in Zermelo-Fraenkel set theory (ZF) and all numbered assertions are theorems of ZF. By a ring we will always mean an associative ring with a unit and by a domain we mean a nontrivial commutative ring without zero-divisors. A foundational result in the theory of commutative rings (attributed to Krull) is the

Maximal Ideal Theorem (MIT). Every nontrivial commutative ring $R$ has a maximal ideal.

This result is usually proved using Zorn’s Lemma, which is equivalent to the

Axiom of Choice (AC). Let $(A_{i})_{i\in I}$ be a family of non-empty sets. There exists a function $f:I\to\bigcup_{i\in I}A_{i}$ such that $f(i)\in A_{i}$ for each $i\in I$.

Thus we have AC $\Rightarrow$ MIT. Scott [Sco54] raised the question of whether the converse implication MIT $\Rightarrow$ AC holds. It was answered affirmatively by Hodges [Hod79]. Banaschewski [Ban94] and Erné [Ern95] gave alternative proofs. All of the proofs are based on a similar principle: a combinatorial statement equivalent to AC is encoded in the existence of a maximal ideal in a suitable localized polynomial ring. Hodges’s proof establishes (assuming MIT) that every tree has a branch [Jec03, Definition 9.10] (which implies AC). Banaschewski’s proof establishes AC directly and Erné’s proof establishes the Teichmüller-Tukey lemma [Jec08, p. 10] (which implies AC). The variant presented in this note establishes a weak form of Zorn’s Lemma, which (as we will see below) is actually equivalent to the original Zorn’s Lemma. The proof presented here was developed independently from [Ban94, Ern95] (the author was initially only familiar with Hodges’s work and learned about the work of Banaschewski and Erné later). The material in the present note can be viewed as a special case of the results in [Ern95] and is based on similar arguments. Nevertheless we hope that the presentation here will be more accessible and intuitive to non-experts and perhaps offer a new point of view that may eventually lead to novel results. The only background required from the reader is introductory familiarity with commutative rings, ideals, polynomial rings, localization, partial orders and Zorn’s Lemma.

We now state a more precise version of MIT $\Rightarrow$ AC that we will prove below (this statement is also proved in [Ern95] and in slightly weaker forms in [Hod79, Ban94]).

Theorem 1.

Let $R$ be a domain and assume that for every set of variables $X$ and any multiplicative subset $S$ of the polynomial ring $R[X]$, the localization $S^{-1}R[X]$ has a maximal ideal. Then $\mathrm{AC}$ holds.

2. Weak Zorn’s Lemma

Our proof does not establish AC directly, but instead establishes an equivalent statement which we call the Weak Zorn’s Lemma. Before stating it we review some basic definitions and (the original) Zorn’s Lemma.

Definition 2.

Let $X$ be a partially-ordered set (poset). A subset $Y\subset X$ is a chain if for any $x,y\in Y$ we have $x\leqslant y$ or $y\leqslant x$.

Definition 3.

Let $X$ be a poset, $Y\subset X$. An upper bound on $Y$ is an element $x\in X$ such that $y\leqslant x$ for all $y\in Y$.

Definition 4.

Let $X$ be a poset. A subset $Y\subset X$ is (upward) compatible if any finite subset of $Y$ has an upper bound in $X$.

Note that every chain is compatible (since every finite subset has an element greater than the rest).
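For a finite poset, Definitions 2 to 4 can be checked mechanically. The following is a minimal sketch and not part of the argument: the helper names and the example poset $\{2,3,4,6\}$ ordered by divisibility are assumptions chosen for illustration. It exhibits a compatible subset that is not a chain.

```python
from itertools import combinations

def is_chain(Y, leq):
    """Definition 2: any two elements of Y are comparable."""
    return all(leq(a, b) or leq(b, a) for a, b in combinations(Y, 2))

def has_upper_bound(Y, X, leq):
    """Definition 3: some x in X satisfies y <= x for all y in Y."""
    return any(all(leq(y, x) for y in Y) for x in X)

def is_compatible(Y, X, leq):
    """Definition 4, spelled out literally for a finite Y:
    every (finite) subset of Y has an upper bound in X."""
    Y = list(Y)
    return all(has_upper_bound(sub, X, leq)
               for r in range(len(Y) + 1)
               for sub in combinations(Y, r))

# Hypothetical example poset: a <= b iff a divides b.
X = {2, 3, 4, 6}
divides = lambda a, b: b % a == 0

print(is_chain({2, 3}, divides), is_compatible({2, 3}, X, divides))
print(is_chain({2, 4}, divides), is_compatible({2, 4}, X, divides))
print(is_compatible({4, 6}, X, divides))
```

For a finite $Y$ the test reduces to asking whether $Y$ itself has an upper bound; the definition in terms of finite subsets only makes a difference for infinite $Y$.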

Definition 5.

Let $X$ be a poset. An element $x\in X$ is maximal if there is no $y\in X$ such that $x<y$.

Zorn’s Lemma (ZL). Let $X$ be a poset such that every chain $Y\subset X$ has an upper bound. Then $X$ has a maximal element.

Weak Zorn’s Lemma (WZL). Let $X$ be a poset such that every (upward) compatible subset $Y\subset X$ has an upper bound. Then $X$ has a maximal element.

Since every chain in a poset $X$ is compatible, we clearly have ZL $\Rightarrow$ WZL. In fact WZL is equivalent to ZL. We now show that WZL directly implies both ZL and AC.

Proposition 6.
  (i) $\mathrm{WZL}\Rightarrow\mathrm{AC}$.

  (ii) $\mathrm{WZL}\Rightarrow\mathrm{ZL}$.

Proof.

(i). Let $(A_{i})_{i\in I}$ be a family of non-empty sets. Consider the set $X$ of partial choice functions for the family, i.e. the set of functions $f$ with domain $\mathrm{dom}(f)\subset I$ and $f(i)\in A_{i}$ for all $i\in\mathrm{dom}(f)$. We order $X$ by inclusion, i.e. $f\leqslant g$ iff $\mathrm{dom}(f)\subset\mathrm{dom}(g)$ and $g|_{\mathrm{dom}(f)}=f$.

Let $Y\subset X$ be compatible. Then for any $f,g\in Y$ and $i\in\mathrm{dom}(f)\cap\mathrm{dom}(g)$ we have $f(i)=g(i)$. Consequently $\bigcup_{f\in Y}f$ is itself a partial choice function and is an upper bound on $Y$. Assuming WZL, there is a maximal element $f\in X$. If $\mathrm{dom}(f)\subsetneq I$ we can take $g=f\cup\{(i,a)\}$ for some $i\in I\setminus\mathrm{dom}(f),\,a\in A_{i}$ and then $f<g$, contradicting maximality. Therefore $\mathrm{dom}(f)=I$ and $f$ is a (full) choice function for $(A_{i})_{i\in I}$. Thus AC holds.
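The poset of partial choice functions used in part (i) can be illustrated on a finite family, where of course no choice principle is needed. The sketch below is only an illustration under stated assumptions: partial choice functions are modelled as Python dictionaries, and the names `family`, `leq`, `union_of` and `extend` are hypothetical helpers, not notation from the text.

```python
def is_partial_choice(f, family):
    """f is defined on a subset of the index set and f[i] lies in family[i]."""
    return all(i in family and f[i] in family[i] for i in f)

def leq(f, g):
    """f <= g iff dom(f) is contained in dom(g) and g restricted to dom(f) is f."""
    return all(i in g and g[i] == f[i] for i in f)

def union_of(fs):
    """Union of partial choice functions that agree on overlapping domains."""
    out = {}
    for f in fs:
        for i, a in f.items():
            assert out.get(i, a) == a, "the given functions are not compatible"
            out[i] = a
    return out

def extend(f, family):
    """Return g = f + {(i, a)} for some i outside dom(f), or None if dom(f) = I."""
    for i in family:
        if i not in f:
            a = next(iter(family[i]))  # finite family: picking an element is harmless
            return {**f, i: a}
    return None

# Hypothetical finite family (A_i) indexed by I = {"p", "q", "r"}.
family = {"p": {1, 2}, "q": {3}, "r": {4, 5}}
f, g = {"p": 1}, {"q": 3}
u = union_of([f, g])      # upper bound of the compatible set {f, g}
h = extend(u, family)     # u is not maximal because dom(u) != I
print(u, h)
```

Here `u` plays the role of $\bigcup_{f\in Y}f$ and `extend` is the step $g=f\cup\{(i,a)\}$ that contradicts maximality whenever $\mathrm{dom}(f)\subsetneq I$.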

(ii) Let $X$ be a poset where every chain has an upper bound. Let $C$ be the set of chains in $X$, partially ordered by inclusion. We want to apply WZL to $C$, so we check that its condition holds. If $D\subset C$ is compatible then for any $c_{1},\ldots,c_{n}\in D$ we have that $\bigcup_{i=1}^{n}c_{i}\subset c\in C$ is a chain ($c$ is an upper bound on $\{c_{1},\ldots,c_{n}\}$). Therefore $e=\bigcup_{c\in D}c$ is a chain (if $x,y\in e$ then $x\in c,\,y\in d$ for some $c,d\in D$ and thus $x,y\in c\cup d\subset u\in C$ are comparable, i.e. $x\leqslant y$ or $y\leqslant x$; here $u\in C$ is an upper bound for $\{c,d\}$). Therefore $e\in C$ and $e$ is an upper bound on $D$. Assuming WZL, there must be a maximal element $c\in C$ (i.e. a maximal chain in $X$). By assumption $c$ has an upper bound $x$ in $X$. If $x$ is not maximal, say $x<y$ for some $y\in X$, then $c\cup\{x,y\}\supsetneq c$ is a chain, contradicting the maximality of $c$. Thus $X$ has a maximal element as required. ∎

We note that many important applications of Zorn’s Lemma involve partial orders which satisfy the conditions of WZL, e.g. the standard proofs of MIT and of the fact that every vector space has a basis (but many other applications involve partial orders satisfying the condition of ZL but not of WZL). Thus WZL is a fairly natural statement, which could potentially be useful for deriving other equivalent forms of AC.

3. Polynomial rings with partially ordered variables

In light of Proposition 6, Theorem 1 would follow at once from the following

Proposition 7.

Let $R$ be a domain and $X$ a set of variables. Assume that for any multiplicative subset $S$ of the polynomial ring $R[X]$ the localization $S^{-1}R[X]$ has a maximal ideal. Then any partial order on $X$ satisfying the condition of WZL has a maximal element.

It remains to prove Proposition 7, which will occupy the rest of this note. We fix an arbitrary poset $(X,\leqslant)$ and an arbitrary domain $R$. We view $X$ as a set of variables for the polynomial ring $R[X]$. We naturally view $X$ as a subset of $R[X]$. A monomial in $R[X]$ is a product (possibly empty, equalling 1) of variables in $X$. We say that a monomial $m$ appears in $f\in R[X]$ if the coefficient of $m$ in $f$ is non-zero and we say that a variable $x$ appears in $m$ (or that $m$ contains $x$) if the exponent of $x$ in $m$ is positive.

Definition 8.

Let $x\in X$ be a variable and $f\in R[X]$ a polynomial. We say that $f$ is dominated by $x$ if every monomial appearing in $f$ contains a variable $y$ with $y\leqslant x$. We denote this by $f\leqslant x$.

We now make two observations (immediate from the definition), which will be used repeatedly.

  (i) If $f,g\leqslant x$ and $p,q\in R[X]$ then $pf+qg\leqslant x$ (in other words the set of polynomials dominated by $x$ is an ideal).

  (ii) If $f\leqslant x,\,g\not\leqslant x$ then $f+g\not\leqslant x$.
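Definition 8 and observation (ii) can be tried out on small examples. The following sketch makes several assumptions that are not in the text: a polynomial over $\mathbb{Z}$ is stored as a dictionary mapping monomials to coefficients, a monomial is a sorted tuple of variables (repeated according to the exponent), and the variables are the integers $2,3,4,6$ ordered by divisibility.

```python
def dominated(poly, x, leq):
    """f <= x: every monomial with non-zero coefficient contains a variable y <= x."""
    return all(any(leq(y, x) for y in mono)
               for mono, coeff in poly.items() if coeff != 0)

def add(p, q):
    """Sum of two polynomials, dropping monomials whose coefficients cancel."""
    out = dict(p)
    for mono, coeff in q.items():
        out[mono] = out.get(mono, 0) + coeff
    return {m: c for m, c in out.items() if c != 0}

# Hypothetical order on the variables: a <= b iff a divides b.
divides = lambda a, b: b % a == 0

f = {(2, 3): 1, (4,): 1}   # x_2 * x_3 + x_4
g = {(3,): 1}              # x_3
one = {(): 1}              # the constant polynomial 1 (empty monomial)

print(dominated(f, 4, divides))          # f <= x_4
print(dominated(g, 4, divides))          # g is not dominated by x_4
print(dominated(add(f, g), 4, divides))  # observation (ii): f + g is not dominated by x_4
print(dominated(one, 6, divides))        # the empty monomial contains no variable
```

Since the empty monomial contains no variable, a polynomial with a non-zero constant term is never dominated by any variable in this model.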

Definition 9.

A polynomial $f\in R[X]$ is called small if $f\leqslant x$ for some $x\in X$. It is called big if no such $x$ exists.

Lemma 10.

The set $S$ of big polynomials in $R[X]$ is multiplicative.

Proof.

Clearly $1\in S$. Let $f,g\in S$ be big and let $x\in X$ be a variable. Write $f=f_{1}+f_{2},\,g=g_{1}+g_{2}$, where $f_{1}$ (resp. $g_{1}$) consists of the monomials of $f$ (resp. $g$) dominated by $x$, and $f_{2}$ (resp. $g_{2}$) consists of the monomials of $f$ (resp. $g$) not dominated by $x$. Since $f,g$ are big we must have $f_{2},g_{2}\neq 0$, hence $f_{2}g_{2}\neq 0$ because $R[X]$ is a domain, and therefore $f_{2}g_{2}\not\leqslant x$ (since the monomials appearing in $f_{2},g_{2}$ do not contain variables $\leqslant x$). Since $fg=(f_{1}g_{1}+f_{1}g_{2}+f_{2}g_{1})+f_{2}g_{2}$ and $f_{1}g_{1}+f_{1}g_{2}+f_{2}g_{1}\leqslant x$ (by observation (i) above), we have $fg\not\leqslant x$ (by observation (ii) above). This is true for any $x\in X$, so $fg\in S$ is big and $S$ is multiplicative. ∎
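Lemma 10 can be sanity-checked on a concrete product. The sketch below assumes the same illustrative model as before, none of which comes from the text: coefficients in $\mathbb{Z}$ (a domain), polynomials as dictionaries from monomials to coefficients, monomials as sorted tuples of variables, and the variable set $\{2,3,4,6\}$ ordered by divisibility. It checks one instance and is not a proof.

```python
def dominated(poly, x, leq):
    """f <= x: every monomial with non-zero coefficient contains a variable y <= x."""
    return all(any(leq(y, x) for y in mono)
               for mono, coeff in poly.items() if coeff != 0)

def is_big(poly, X, leq):
    """Definition 9: no variable x in X dominates the polynomial."""
    return not any(dominated(poly, x, leq) for x in X)

def mul(p, q):
    """Product of two polynomials; a monomial is a sorted tuple of variables."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(sorted(m1 + m2))
            out[m] = out.get(m, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

# Hypothetical poset of variables: a <= b iff a divides b.
X = {2, 3, 4, 6}
divides = lambda a, b: b % a == 0

f = {(4,): 1, (6,): 1}   # x_4 + x_6, big because {4, 6} has no upper bound
g = {(3,): 1, (4,): 1}   # x_3 + x_4, big because {3, 4} has no upper bound
fg = mul(f, g)

print(is_big(f, X, divides), is_big(g, X, divides), is_big(fg, X, divides))
```

In this example the monomial $x_{4}^{2}$ of $fg$ already rules out domination by $x_{2}$, $x_{3}$ and $x_{6}$, and $x_{3}x_{6}$ rules out $x_{4}$.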

Definition 11.

An ideal $I\lhd R[X]$ is called small if every $f\in I$ is small.

Definition 12.

A maximal small ideal is a small ideal $P\lhd R[X]$ which is not properly contained in another small ideal.

Lemma 13.

Let $A$ be a commutative ring, $S\subset A$ a multiplicative subset with $0\not\in S$. Assume that the localization $S^{-1}A$ has a maximal ideal. Then the set of ideals $I\lhd A$ disjoint from $S$ contains a maximal element (with respect to inclusion).

Proof.

Let $M\lhd S^{-1}A$ be a maximal ideal and consider the localization homomorphism $\mathrm{loc}:A\to S^{-1}A$ given by $\mathrm{loc}(a)=\frac{a}{1}$. The ideal $\mathrm{loc}^{-1}(M)\lhd A$ is disjoint from $S$, because the elements of $S$ become units in $S^{-1}A$ and $M$ is a proper ideal. Moreover $\mathrm{loc}^{-1}(M)$ is maximal among the ideals of $A$ disjoint from $S$ (if $\mathrm{loc}^{-1}(M)\subsetneq I\lhd A$ then $M\subsetneq S^{-1}I\lhd S^{-1}A$, so $1\in S^{-1}I$ and therefore $I\cap S\neq\emptyset$). ∎

Lemma 14.

Assume that every localization of $R[X]$ by a multiplicative subset has a maximal ideal. Then there exists a maximal small ideal $P\lhd R[X]$.

Proof.

Apply the previous lemma to $A=R[X]$ and $S$ the set of big polynomials (which is multiplicative by Lemma 10), noting that an ideal of $R[X]$ is small precisely when it is disjoint from $S$. ∎

Lemma 15.

Let $Y\subset X$ be compatible (in the sense of Definition 4). Then the ideal $(Y)\lhd R[X]$ generated by $Y$ is small.

Proof.

Let $g=\sum_{i=1}^{n}g_{i}y_{i}\in(Y),\,y_{i}\in Y,\,g_{i}\in R[X]$. Since $Y$ is compatible there exists $x\in X$ such that $y_{i}\leqslant x$ for $1\leq i\leq n$. Therefore $g\leqslant x$, so every $g\in(Y)$ is small and $(Y)$ is a small ideal. ∎

Proposition 16.

Let $P\lhd R[X]$ be a maximal small ideal and denote $Y=P\cap X$. Then

  (i) $P=(Y)$ is generated by $Y$.

  (ii) $Y$ is compatible.

  (iii) $Y$ is maximal compatible: if $Y\subset Y^{\prime}\subset X$ and $Y^{\prime}$ is compatible, then $Y^{\prime}=Y$.

Proof.

(i). Clearly $P\supset(Y)$, so it is enough to show that $P\subset(Y)$. Let $f\in P$ and let $m$ be a monomial appearing in $f$.

Claim. There exists $z\in Y$ which appears in $m$.

Since $f\in P$ is small we have $m\neq 1$, so let $x$ be a variable appearing in $m$. Let $g\in P$ and consider $h=x^{d}f+g\in P$, where $d>\deg g$. Since $h$ is small and the monomials appearing in $g$ and $x^{d}m$ also appear in $h$ because of our assumption on $d$ (they cannot cancel each other out), there exists $y\in X$ such that $h\leqslant y$ and therefore $m,g\leqslant y$ (note that since $x$ occurs in $m$ the condition $x^{d}m\leqslant y$ is equivalent to $m\leqslant y$). This implies in particular that $z\leqslant y$ for some $z\in X$ appearing in $m$. Consequently $qz+g\leqslant y$ is small for any $q\in R[X],\,g\in P$. Therefore the ideal $(P,z)\supset P$ is small and by the maximality of $P$ we have $z\in P\cap X=Y$, establishing the claim.

Now since every monomial appearing in $f$ contains a variable from $Y$, we have $f\in(Y)$ and the proof of (i) is complete.

(ii). Let $x_{1},\ldots,x_{n}\in Y$. We assume WLOG that they are distinct. Then $x_{1}+x_{2}+\ldots+x_{n}\in P$ is small and therefore there exists an upper bound $x\in X$ on $\{x_{1},\ldots,x_{n}\}$. Hence $Y$ is compatible.

(iii). Let $Y\subset Y^{\prime}\subset X$ with $Y^{\prime}$ compatible. By Lemma 15, $(Y^{\prime})\supset(Y)=P$ is a small ideal. By the maximality of $P$ we have $(Y^{\prime})=P$ and therefore $Y^{\prime}=(Y^{\prime})\cap X=P\cap X=Y$. ∎

Remark 17.

The converse of Proposition 16(iii) also holds: if $Y\subset X$ is maximal compatible, the ideal $(Y)$ is a maximal small ideal in $R[X]$. Indeed, $(Y)$ is small by Lemma 15 and if $f\in R[X]\setminus(Y)$ then either $f$ has a non-zero constant term and is thus big, or $f$ has a monomial $m=x_{1}\cdots x_{n}$ with $x_{i}\not\in Y$ for $1\leq i\leq n$. Since $Y\cup\{x_{i}\}$ is not compatible for any $i$, one can pick $y_{ij}\in Y,\,1\leq i\leq n,\,1\leq j\leq n_{i}$ such that each $\{y_{i1},\ldots,y_{in_{i}},x_{i}\}$ has no upper bound in $X$. Therefore $f+\sum_{y\in\{y_{ij}:1\leq i\leq n,1\leq j\leq n_{i}\}}y$ is big. In either case the ideal $(Y,f)\lhd R[X]$ is not small and therefore $(Y)$ is a maximal small ideal. Thus there is a bijection between maximal compatible subsets of $X$ and maximal small ideals of $R[X]$. This is a special case of [Ern95, Proposition on p. 126].

4. Conclusion of the proof

Proof of Proposition 7.

Let $X$ be a poset such that every compatible $Y\subset X$ has an upper bound. By the assumption of Proposition 7 and Lemma 14 there exists a maximal small ideal $P\lhd R[X]$. By Proposition 16(ii-iii) the set $Y=P\cap X\subset X$ is maximal compatible and by assumption $Y$ has an upper bound $x\in X$. We claim that $x$ is a maximal element of $X$. Otherwise $x<y$ for some $y\in X$ and $Y\cup\{x,y\}\supsetneq Y$ is compatible (since $y$ is an upper bound on the entire set), contradicting the maximality of $Y$. ∎
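The last step of the proof, passing from a maximal compatible subset to a maximal element, can be observed directly in a finite poset, where maximal compatible subsets can be found by brute force. The sketch below is an illustration only; the function names and the example poset $\{2,3,4,6\}$ ordered by divisibility are assumptions, and it relies on the fact that a finite subset is compatible exactly when it has an upper bound.

```python
from itertools import combinations

def upper_bounds(Y, X, leq):
    """All x in X with y <= x for every y in Y."""
    return {x for x in X if all(leq(y, x) for y in Y)}

def maximal_compatible_subsets(X, leq):
    """Brute force over all subsets of a finite poset X."""
    elems = sorted(X)
    compatible = [frozenset(s)
                  for r in range(len(elems) + 1)
                  for s in combinations(elems, r)
                  if upper_bounds(s, X, leq)]   # finite s: compatible iff bounded
    return [s for s in compatible if not any(s < t for t in compatible)]

def maximal_elements(X, leq):
    """Elements x with no y in X such that x < y."""
    return {x for x in X if not any(x != y and leq(x, y) for y in X)}

# Hypothetical example poset: a <= b iff a divides b.
X = {2, 3, 4, 6}
divides = lambda a, b: b % a == 0

maximal_sets = maximal_compatible_subsets(X, divides)
for Y in maximal_sets:
    print(sorted(Y), "has upper bounds", sorted(upper_bounds(Y, X, divides)))
print("maximal elements:", sorted(maximal_elements(X, divides)))
```

In this example the maximal compatible subsets are $\{2,4\}$ and $\{2,3,6\}$, and their upper bounds $4$ and $6$ are exactly the maximal elements, as the argument above predicts.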

Acknowledgment. The author was partially supported by Israel Science Foundation grant no. 2507/19.

References

  • [Ban94] B. Banaschewski. A new proof that Krull implies Zorn. Math. Log. Quart., 40:478–480, 1994.
  • [Ern95] M. Erné. A primrose path from Krull to Zorn. Comment. Math. Univ. Carolin., 36(1):123–126, 1995.
  • [Hod79] W. Hodges. Krull implies Zorn. J. London Math. Soc., 19:285–287, 1979.
  • [Jec03] T. J. Jech. Set Theory. Springer Monogr. Math. Springer, Berlin, Heidelberg, 3rd edition, 2003.
  • [Jec08] T. J. Jech. The Axiom of Choice. Courier Corporation, 2008.
  • [Sco54] D. S. Scott. Prime ideal theorems for rings, lattices and boolean algebras. Bull. Amer. Math. Soc., 60:390, 1954.