A Simple Combinatorial Proof of Szemerédi’s Theorem via Three Levels of Infinities

Mathematics Subject Classification 2020: Primary 11B25, Secondary 03H05

Renling Jin

This work was partially supported by a collaboration grant (ID: 513023) from Simons Foundation.

Abstract

We present a nonstandard simple elementary proof of Szemerédi’s theorem by a straightforward induction with the help of three levels of infinities and four different bounded elementary embeddings in a nonstandard universe.

Keywords: arithmetic progression, Szemerédi’s theorem, nonstandard analysis, iterated nonstandard extensions

Received 11 April 2022; revised 5 July 2023; published 25 September 2023. Year 2023, number 15. DOI: 10.19086/da.87772

1 Introduction

This article is strongly influenced by Terence Tao’s notes [9].

Theorem 1.1 (van der Waerden, 1927)

Given any $k,n\in{\mathbb{N}}$ , there exists $\Gamma(k,n)\in{\mathbb{N}}$ , called the van der Waerden number, such that if $\{U_{1},U_{2},\ldots,U_{n}\}$ is a partition of $\{1,2,\ldots,\Gamma(k,n)\}$ , then there is a $u\leq n$ such that $U_{u}$ contains a $k$ –term arithmetic progression.

Theorem 1.2 (E. Szemerédi, 1975 [8])

If $D\subseteq{\mathbb{N}}$ has a positive upper density, then $D$ contains a $k$ –term arithmetic progression for every $k\in{\mathbb{N}}$ .

Szemerédi’s theorem confirms a conjecture of P. Erdős and P. Turán made in 1936, which implies van der Waerden’s theorem.

Nonstandard versions of Furstenberg’s ergodic proof and Gowers’s harmonic proof of Szemerédi’s theorem have been tried by T. Tao (see Tao’s blog post [10]). In the workshop Nonstandard methods in combinatorial number theory, sponsored by the American Institute of Mathematics in San Jose, CA, in August 2017, Tao gave a series of lectures explaining Szemerédi’s original combinatorial proof, hoping to simplify it so that the proof can be better understood. He believed that Szemerédi’s combinatorial method should have a greater impact on combinatorics.

During these lectures Tao challenged the audience to produce a nonstandard proof of Szemerédi’s theorem which is noticeably simpler and more transparent than Szemerédi’s original proof. The current article is the result of Tao’s challenge and inspiration. However, in his later blog post [11], Tao commented that “in fact there are now signs that perhaps nonstandard analysis is not the optimal framework in which to place this argument.” We disagree. The current article is our effort to show that with the help of a nonstandard universe with three levels of infinities, Szemerédi’s original argument can be made simpler and more transparent.

The main simplification in our proof of Szemerédi’s theorem compared to the standard proof in [8, 9] is that a Tower of Hanoi type induction in [9, Theorem 6.6] and in [8, Lemma 5, Lemma 6, and Fact 12] is replaced by a straightforward induction (see Lemma 5.1 below), which makes Szemerédi’s idea more transparent. To achieve this, we work within a chain of nonstandard extensions ${\mathbb{V}}_{0}\prec{\mathbb{V}}_{1}\prec{\mathbb{V}}_{2}\prec{\mathbb{V}}_{3}$ which supply three levels of infinities, plus various bounded elementary embeddings from ${\mathbb{V}}_{j}$ to ${\mathbb{V}}_{j^{\prime}}$ for some $0\leq j<j^{\prime}\leq 3$ .

The paper is organized as follows. §2 is a brief introduction to the logical foundations for constructing the nonstandard extensions ${\mathbb{V}}_{0}\prec{\mathbb{V}}_{1}\prec{\mathbb{V}}_{2}\prec{\mathbb{V}}_{3}$ and bounded elementary embeddings including $i_{0}$ , $i_{*}$ , $i_{1}$ , and $i_{2}$ , intended for the reader without a background in logic. The reader who is only interested in applications can become familiar with the notation, Property 2.7, Proposition 2.14, and Proposition 2.15, safely skip the proofs, and return to them at a later time. In §3 we translate the density along arithmetic progressions in the standard setting to strong upper Banach density in the nonstandard setting, and also translate some consequences of the double counting argument from the standard setting to the nonstandard setting. The reader who is only interested in applications can become familiar with the notation, Lemma 3.3, Lemma 3.4, and Lemma 3.5, safely skip the proofs, and return to them at a later time. In §4 we rewrite, in a nonstandard setting, the proof of a so-called mixing lemma in [9] based on a weak regularity lemma. This section offers no new ideas and is included only to keep the paper self-contained. §5 is the main part of the paper, where we present the proof of Szemerédi’s theorem. In §6 we ask whether the presented proof of Szemerédi’s theorem can be carried out without the axiom of choice.

2 Construction of Nonstandard Extensions

The notation we use here is consistent with standard textbooks. Consult, for example, [1, 4, 7] for more details. If $f:A\to B$ is a function, then $f(a)$ denotes the image of $a$ as an element in $B$ and $f[C]:=\{f(a)\mid a\in C\}$ for any $C\subseteq A$ .

§2.1 Superstructure. Let $\omega$ be the set of all standard non-negative integers used at the meta-level and $X$ be an infinite set of urelements, i.e., elements without members. The superstructure on $X$ (cf. [1, page 263]), denoted by ${\mathbb{V}}(X)$ , is composed of the base set $V(X)$ and the membership relation $\in$ on $V(X)$ where

V(X):=\bigcup_{n\in\omega}V(X,n)

and $V(X,n)$ is defined recursively by letting

V(X,0):=X\,\mbox{ and }\,V(X,n+1):=V(X,n)\cup\mathscr{P}(V(X,n))

for every $n\in\omega$ where $\mathscr{P}$ is the powerset operator. For notational convenience we often write ${\mathbb{V}}(X)$ also for its base set $V(X)$ . Hopefully, this will not cause confusion. One can define a rank function for every $a\in{\mathbb{V}}(X)$ recursively. Set $\mbox{rank}(a)=0$ iff (the abbreviation of “if and only if”) $a\in X$ and

\mbox{rank}(a)=1+\max\{\mbox{rank}(b)\mid b\in a\}

(1)

for any $a\in{\mathbb{V}}(X)\setminus X$ . Notice that, for $n\geq 1$ , $\mbox{rank}(a)=n$ iff $a\in V(X,n)\setminus V(X,n-1)$ .

Let ${\mathbb{N}}_{0}$ be the set of all standard positive integers and ${\mathbb{R}}_{0}$ be the set of all standard real numbers. By the standard universe we mean the superstructure ${\mathbb{V}}_{0}:={\mathbb{V}}({\mathbb{R}}_{0})$ on $X={\mathbb{R}}_{0}$ . Notice that all standard mathematical objects mentioned in this paper have ranks below, say, $100$ . For example, an ordered pair $(a,b)$ of standard real numbers can be viewed as the set $\{\{a\},\{a,b\}\}\in{\mathbb{V}}({\mathbb{R}}_{0},2)$ and a function $f:{\mathbb{N}}_{0}\to{\mathbb{R}}_{0}$ can be viewed as a set of ordered pairs in ${\mathbb{V}}({\mathbb{R}}_{0},2)$ . Hence $f\in{\mathbb{V}}({\mathbb{R}}_{0},3)$ .
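As a check on these rank claims, the recursion (1) can be unwound by hand. The following is a worked sketch for $a,b\in{\mathbb{R}}_{0}$ and a function $f:{\mathbb{N}}_{0}\to{\mathbb{R}}_{0}$ . It assumes only the coding of ordered pairs given above and that ${\mathbb{N}}_{0}\subseteq{\mathbb{R}}_{0}$ , so that positive integers are urelements of rank $0$ .

```latex
\begin{align*}
\mbox{rank}(a) &= \mbox{rank}(b) = 0, \\
\mbox{rank}(\{a\}) &= 1+\max\{\mbox{rank}(a)\} = 1, \\
\mbox{rank}(\{a,b\}) &= 1+\max\{\mbox{rank}(a),\mbox{rank}(b)\} = 1, \\
\mbox{rank}((a,b)) &= \mbox{rank}(\{\{a\},\{a,b\}\}) = 1+\max\{1,1\} = 2, \\
\mbox{rank}(f) &= 1+\max\{\mbox{rank}((n,f(n)))\mid n\in{\mathbb{N}}_{0}\} = 1+2 = 3.
\end{align*}
```

So the pair has rank $2$ and the function has rank $3$ , in agreement with the memberships stated above.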

§2.2 Logic and model theory. Before introducing the nonstandard universe we should mention briefly, without rigor, some concepts in model theory. For simplicity we consider only model theory on finite relational languages. We call a set $\mathscr{L}$ of finitely many symbols $P_{1},P_{2},\ldots,P_{n}$ with arity $m_{i}\in\omega$ for each $P_{i}$ a (relational) language. An $\mathscr{L}$ –model $\mathcal{M}$ is a structure composed of a nonempty base set $M$ and an $m_{i}$ -ary relation $P^{\mathcal{M}}_{i}\subseteq M^{m_{i}}$ , called the interpretation of $P_{i}$ in $\mathcal{M}$ , for each symbol $P_{i}\in\mathscr{L}$ . For notational convenience we sometimes write $\mathcal{M}$ for the base set of $\mathcal{M}$ .

We can define (first-order) $\mathscr{L}$ –formulas recursively starting from atomic formulas. If $x$ and $x^{\prime}$ are variables, then $x=x^{\prime}$ is an atomic formula. If $P_{i}\in\mathscr{L}$ and $\overline{x}:=(x_{1},x_{2},\ldots,x_{m_{i}})$ is an $m_{i}$ -tuple of variables, then $P_{i}(\overline{x})$ is an atomic formula. The word “first-order” means that these variables are intended to take only elements of some $\mathscr{L}$ –model as their values. All formulas mentioned in this paper are first-order. We will use the symbol $\overline{a}$ to represent an $m$ -tuple of elements with some suitable generic number $m\in\omega$ . When $\overline{x}$ is intended to be substituted by $\overline{a}$ , we assume implicitly that they have the same length.

If $\varphi$ and $\psi$ are $\mathscr{L}$ –formulas and $x$ is a variable, then $\neg\varphi$ , $\varphi\wedge\psi$ , $\varphi\vee\psi$ , $\varphi\to\psi$ , $\varphi\leftrightarrow\psi$ , $\forall x\varphi$ , and $\exists x\varphi$ are also $\mathscr{L}$ –formulas. The symbols $\neg$ , $\wedge$ , $\vee$ , $\to$ , and $\leftrightarrow$ are called logical connectives, and $\forall$ and $\exists$ are called universal and existential quantifiers. An occurrence of $x$ in $\varphi$ is called bound if it occurs in a sub-formula of the form $\forall x\psi$ or $\exists x\psi$ in $\varphi$ . An occurrence of $x$ in $\varphi$ is called free if it is not bound. A formula without free variables is called a sentence.

Let $\mathcal{M}$ be an $\mathscr{L}$ –model. For any $\mathscr{L}$ –formula $\varphi(\overline{x})$ , where $\overline{x}$ is an $m$ -tuple of variables containing all free variables in $\varphi$ , and any $\overline{a}\in\mathcal{M}^{m}$ , called parameters, we can define $\mathcal{M}\models\varphi(\overline{a})$ , meaning $\varphi(\overline{a})$ is true in $\mathcal{M}$ , recursively by

(a) $\mathcal{M}\models a=a^{\prime}$ iff $a$ and $a^{\prime}$ are identical elements in $\mathcal{M}$ and $\mathcal{M}\models P_{i}(\overline{a})$ iff $\overline{a}\in P_{i}^{\mathcal{M}}$ ;

(b) for any $\mathscr{L}$ –formulas $\varphi$ and $\psi$ with all free variables $\overline{x}$ being substituted by $\overline{a}$ in $\mathcal{M}$ , $\mathcal{M}\models\neg\varphi$ iff $\mathcal{M}\not\models\varphi$ ( $\neg$ means “not”), $\mathcal{M}\models\varphi\vee\psi$ iff $\mathcal{M}\models\varphi$ or $\mathcal{M}\models\psi$ ( $\vee$ means “or”), $\mathcal{M}\models\varphi\wedge\psi$ iff $\mathcal{M}\models\varphi$ and $\mathcal{M}\models\psi$ ( $\wedge$ means “and”), $\mathcal{M}\models\varphi\to\psi$ iff $\mathcal{M}\models\neg\varphi\vee\psi$ ( $\to$ means “imply”), and $\mathcal{M}\models\varphi\leftrightarrow\psi$ iff $\mathcal{M}\models(\varphi\to\psi)\wedge(\psi\to\varphi)$ ( $\leftrightarrow$ means “if and only if”);

(c) for any $\mathscr{L}$ –formulas $\varphi(y,\overline{x})$ , $\mathcal{M}\models\forall y\,\varphi(y,\overline{a})$ iff $\mathcal{M}\models\varphi(b,\overline{a})$ for every $b\in\mathcal{M}$ ( $\forall$ means “for all”) and $\mathcal{M}\models\exists y\varphi(y,\overline{a})$ iff $\mathcal{M}\models\varphi(b,\overline{a})$ for some $b\in\mathcal{M}$ ( $\exists$ means “there exist”).

From the truth definition above, for every $\mathscr{L}$ –formula $\varphi(\overline{x})$ there is another $\mathscr{L}$ –formula $\psi(\overline{x})$ using only logic connectives $\neg$ and $\vee$ , and only quantifier $\exists$ , such that $\mathcal{M}\models\varphi(\overline{a})$ iff $\mathcal{M}\models\psi(\overline{a})$ for any $\mathscr{L}$ –model $\mathcal{M}$ and any $\overline{a}\in\mathcal{M}^{m}$ .
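The reduction to $\neg$ , $\vee$ , and $\exists$ can be made explicit. The following sketch lists the standard rewriting rules, which are not spelled out in the text. Applying them recursively, from the innermost sub-formulas outward, produces a formula $\psi$ of the required form.

```latex
\begin{align*}
\varphi\wedge\psi &\;\longmapsto\; \neg(\neg\varphi\vee\neg\psi), \\
\varphi\to\psi &\;\longmapsto\; \neg\varphi\vee\psi, \\
\varphi\leftrightarrow\psi &\;\longmapsto\; \neg\bigl(\neg(\neg\varphi\vee\psi)\vee\neg(\neg\psi\vee\varphi)\bigr), \\
\forall x\,\varphi &\;\longmapsto\; \neg\exists x\,\neg\varphi.
\end{align*}
```

Each rule preserves truth in every $\mathscr{L}$ –model by clauses (b) and (c) of the truth definition. This is why an induction on the complexity of formulas only needs to treat the cases $\neg$ , $\vee$ , and $\exists$ .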

We sometimes call a formula with all free variables being substituted by parameters a sentence. Clearly, the truth value of a sentence with parameters from $\mathcal{M}$ is determined in $\mathcal{M}$ . When we write $\varphi(\overline{x},\overline{a})$ , we mean implicitly that all free variables in the formula are among $\overline{x}$ and all parameters from model $\mathcal{M}$ are among $\overline{a}$ .

Suppose $\mathcal{M}$ and $\mathcal{N}$ are two $\mathscr{L}$ –models. A function $i:\mathcal{M}\to\mathcal{N}$ is called an elementary embedding from $\mathcal{M}$ to $\mathcal{N}$ if for any $\mathscr{L}$ –formula $\varphi(\overline{x})$ and any $m$ -tuple $\overline{a}=(a_{1},a_{2},\ldots,a_{m})\in\mathcal{M}^{m}$ we have

\mathcal{M}\models\varphi(\overline{a})\,\mbox{ iff }\,\mathcal{N}\models\varphi(i(\overline{a}))

(2)

where $i(\overline{a}):=(i(a_{1}),i(a_{2}),\ldots,i(a_{m}))\in\mathcal{N}^{m}$ . An elementary embedding is necessarily injective. If there exists an elementary embedding $i:\mathcal{M}\to\mathcal{N}$ , we can view $\mathcal{M}$ as an elementary submodel of $\mathcal{N}$ and call $\mathcal{N}$ an elementary extension of $\mathcal{M}$ , denoted by $\mathcal{M}\preceq\mathcal{N}$ . We sometimes write $\mathcal{M}\prec\mathcal{N}$ to emphasize that $i$ is not surjective.

If $\mathcal{M}$ is an $\mathscr{L}$ –model, $\mathscr{L}^{\prime}:=\mathscr{L}\cup\{P_{n+1},\ldots,P_{k}\}$ , and $P_{i}^{\mathcal{M}}\subseteq\mathcal{M}^{m_{i}}$ for $n<i\leq k$ , the $\mathscr{L}^{\prime}$ –model $\mathcal{M}^{\prime}$ obtained by adding the relations $P_{i}^{\mathcal{M}}$ to $\mathcal{M}$ is denoted by $(\mathcal{M};P^{\mathcal{M}}_{n+1},\ldots,P^{\mathcal{M}}_{k})$ . We call $\mathcal{M}^{\prime}$ a model expansion of $\mathcal{M}$ .

§2.3 Ultrapower construction. Next we construct an elementary extension of a model using the ultrapower construction (cf. [1, §4]).

Definition 2.1

Let $X$ be an infinite set. A set ${\mathcal{F}}\subseteq\mathscr{P}(X)$ is a non-principal ultrafilter on $X$ if it satisfies the following:

1.

$X\in{\mathcal{F}}$ and $F\not\in{\mathcal{F}}$ for any $F\in\mathscr{P}_{<\omega}(X)$ where $\mathscr{P}_{<\omega}(X)$ is the collection of all finite subsets of $X$ ,
2.

$\forall A,B\in\mathscr{P}(X)\,(A,B\in{\mathcal{F}}\,\mbox{ implies }\,A\cap B\in{\mathcal{F}})$ ,
3.

$\forall A,B\in\mathscr{P}(X)\,(A\in{\mathcal{F}}\,\mbox{ and }\,A\subseteq B\,\mbox{ imply }\,B\in{\mathcal{F}})$ ,
4.

$\forall A\in\mathscr{P}(X)\,(A\in{\mathcal{F}}\,\mbox{ or }\,(X\setminus A)\in{\mathcal{F}})$ .

It is well known that the existence of non-principal ultrafilters on $X$ follows from $\mathsf{ZFC}$ .
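Two consequences of Definition 2.1 are used repeatedly when working with ultrapowers. The following short derivation is a sketch based only on Parts 1, 2, and 4 of the definition.

```latex
\begin{enumerate}
\item If $X\setminus A$ is finite, then $X\setminus A\not\in{\mathcal{F}}$ by Part 1,
      hence $A\in{\mathcal{F}}$ by Part 4.
      So every cofinite subset of $X$ belongs to ${\mathcal{F}}$.
\item If both $A\in{\mathcal{F}}$ and $X\setminus A\in{\mathcal{F}}$, then
      $\emptyset=A\cap(X\setminus A)\in{\mathcal{F}}$ by Part 2,
      which contradicts Part 1 because $\emptyset$ is finite.
      So exactly one of $A$ and $X\setminus A$ belongs to ${\mathcal{F}}$.
\end{enumerate}
```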

Definition 2.2

Let $\mathcal{M}$ be an $\mathscr{L}$ –model and ${\mathcal{F}}$ be a non-principal ultrafilter on an infinite set $X$ . Let $\mathcal{M}^{X}$ be the set of all functions from $X$ to $\mathcal{M}$ . For any $f,g\in\mathcal{M}^{X}$ define

f\sim_{{\mathcal{F}}}g\,\mbox{ iff }\,\{n\in X\mid f(n)=g(n)\}\in{\mathcal{F}}.

It is easy to check that $\sim_{{\mathcal{F}}}$ is an equivalence relation. Denote $[f]_{{\mathcal{F}}}$ for the equivalence class containing $f$ . The ultrapower of $\mathcal{M}$ modulo ${\mathcal{F}}$ , denoted by $\mathcal{M}^{X}/{\mathcal{F}}$ , is an $\mathscr{L}$ –model which is composed of the base set

\{[f]_{{\mathcal{F}}}\mid f\in\mathcal{M}^{X}\}

and the interpretation of $P_{i}$ by

P_{i}^{\mathcal{M}^{X}/{\mathcal{F}}}:=\left\{([f_{1}]_{{\mathcal{F}}},[f_{2}]_{{\mathcal{F}}},\ldots,[f_{m_{i}}]_{{\mathcal{F}}})\mid\{n\in X\mid(f_{1}(n),f_{2}(n),\ldots,f_{m_{i}}(n))\in P_{i}^{\mathcal{M}}\}\in{\mathcal{F}}\right\}

for every $P_{i}\in\mathscr{L}$ . Notice that

P_{i}^{\mathcal{M}^{X}/{\mathcal{F}}}=\{[\overline{f}]_{{\mathcal{F}}}\mid\overline{f}\,\mbox{ is a function from }\,X\,\mbox{ to }\,P_{i}^{\mathcal{M}}\}.

The right side above is the ultrapower of $P_{i}^{\mathcal{M}}$ modulo ${\mathcal{F}}$ .

For an element $c\in\mathcal{M}$ write $\phi_{c}:X\to\mathcal{M}$ for the constant function with value $c$ . Let $i:\mathcal{M}\to\mathcal{M}^{X}/{\mathcal{F}}$ be the natural embedding, i.e., $i(c)=[\phi_{c}]_{{\mathcal{F}}}$ . The following is often called Łoś’s theorem (cf. [1, Theorem 4.1.9, Corollary 4.1.13]).

Proposition 2.3

For any $\mathscr{L}$ –formula $\varphi(\overline{x})$ and any $\overline{[a]}_{{\mathcal{F}}}\in\mathcal{M}^{X}/{\mathcal{F}}$ it is true that

\mathcal{M}^{X}/{\mathcal{F}}\models\varphi(\overline{[a]}_{{\mathcal{F}}})\,\mbox{ iff }\,\left\{n\in X\mid\mathcal{M}\models\varphi(\overline{a(n)})\right\}\in{\mathcal{F}}.

The proof of Proposition 2.3 is done by induction on the complexity of $\varphi$ .

Corollary 2.4

The natural embedding $i:\mathcal{M}\to\mathcal{M}^{X}/{\mathcal{F}}$ with $i(c)=[\phi_{c}]_{{\mathcal{F}}}$ is an elementary embedding. Furthermore, if $\mathcal{M}^{\prime}=(\mathcal{M};R)$ is a model expansion of $\mathcal{M}$ , then $i$ is also an elementary embedding from $\mathcal{M}^{\prime}$ to $\mathcal{M}^{\prime X}/{\mathcal{F}}=(\mathcal{M}^{X}/{\mathcal{F}};R^{X}/{\mathcal{F}})$ .

So, the model $\mathcal{M}$ can be viewed as an elementary submodel of $\mathcal{M}^{X}/{\mathcal{F}}$ via the natural embedding $i$ and $\mathcal{M}^{X}/{\mathcal{F}}$ is an elementary extension of $\mathcal{M}$ .

§2.4 Construction of ${\mathbb{V}}_{1}$. Fix a non-principal ultrafilter ${\mathcal{F}}_{0}$ on ${\mathbb{N}}_{0}$ .

$\blacklozenge$ : From now on let $\mathscr{L}=\{\in\}$ . The ultrafilters which will be used are ${\mathcal{F}}_{0}$ , the tensor product of ${\mathcal{F}}_{0}$ ’s, and nonstandard versions of them.

Recall that ${\mathbb{V}}_{0}$ represents the standard universe, which is an $\mathscr{L}$ –model. Let ${\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}$ be the ultrapower of ${\mathbb{V}}_{0}$ modulo ${\mathcal{F}}_{0}$ and $i^{0}:{\mathbb{V}}_{0}\to{\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}$ be the natural embedding. Denote ${}^{*}\!\!\in$ for the interpretation of $\in$ in ${\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}$ . Notice that $a\in b$ iff $i^{0}(a)\,^{*}\!\!\in i^{0}(b)$ for any $a,b\in{\mathbb{V}}_{0}$ . One can define the rank function $\mbox{rank}(b)$ for every $b\in{\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}$ according to (1) with $\in$ being replaced by ${}^{*}\!\!\in$ . Notice that some elements $[f]_{{\mathcal{F}}_{0}}$ in ${\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}$ may not have a finite ${}^{*}\!\!\in$ -rank. For example, if $f(1)=0$ and $f(n+1)=\{f(n)\}$ for every $n\in{\mathbb{N}}_{0}$ , then $[f]_{{\mathcal{F}}_{0}}\in{\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}$ does not have a finite ${}^{*}\!\!\in$ -rank. Let

V_{1}:=\left\{b\in{\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}\mid\,^{*}\!\!\in\!\mbox{-rank}(b)\in\omega\right\}.

The set $V_{1}$ is just the ultrapower of ${\mathbb{V}}_{0}$ modulo ${\mathcal{F}}_{0}$ truncated at ${}^{*}\!\!\in$ -rank $\omega$ . Notice that $\in\!\mbox{-rank}(a)=\,^{*}\!\!\in\!\mbox{-rank}(i^{0}(a))$ for every $a\in{\mathbb{V}}_{0}$ and $V_{1}=\bigcup_{n\in\omega}i^{0}(V({\mathbb{R}}_{0},n))$ .

Let ${\mathbb{R}}_{1}:=i^{0}({\mathbb{R}}_{0})$ and ${\mathbb{N}}_{1}:=i^{0}({\mathbb{N}}_{0})$ . Assume that every element in ${\mathbb{R}}_{1}$ is an urelement and identify $i^{0}(r)$ with $r$ for every $r\in{\mathbb{R}}_{0}$ . Then ${\mathbb{R}}_{0}\subseteq{\mathbb{R}}_{1}$ . Since the natural order $\leq$ on ${\mathbb{R}}_{0}$ , addition $+$ and multiplication $\times$ on ${\mathbb{R}}_{0}$ can be viewed as elements in ${\mathbb{V}}_{0}$ , we have that $i^{0}(\leq)$ is a linear order on ${\mathbb{R}}_{1}$ extending $\leq$ , $i^{0}(+)$ is the addition on ${\mathbb{R}}_{1}$ extending $+$ , and $i^{0}(\times)$ is the multiplication on ${\mathbb{R}}_{1}$ extending $\times$ . For notational convenience we write $\leq$ , $+$ , and $\times$ for $i^{0}(\leq)$ , $i^{0}(+)$ , and $i^{0}(\times)$ , respectively. By the elementarity of $i^{0}$ the structure $({\mathbb{R}}_{1};+,\times,\leq,0,1)$ is an ordered field containing the standard real field $({\mathbb{R}}_{0};+,\times,\leq,0,1)$ as its subfield. Notice that if $\mbox{Id}(n)=n$ for every $n\in{\mathbb{N}}_{0}$ , then $[\mbox{Id}]_{{\mathcal{F}}_{0}}\in i^{0}({\mathbb{N}}_{0})={\mathbb{N}}_{1}$ and $[\mbox{Id}]_{{\mathcal{F}}_{0}}\geq r$ for every $r\in{\mathbb{R}}_{0}$ , i.e., ${\mathbb{N}}_{1}$ contains natural numbers such as $[\mbox{Id}]_{{\mathcal{F}}_{0}}$ which are infinitely large relative to real numbers in ${\mathbb{R}}_{0}$ .
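The claim that $[\mbox{Id}]_{{\mathcal{F}}_{0}}$ exceeds every standard real can be checked directly from Proposition 2.3. The sketch below uses one additional standard fact: every cofinite subset of ${\mathbb{N}}_{0}$ lies in ${\mathcal{F}}_{0}$ , because its finite complement is excluded by Part 1 of Definition 2.1 and Part 4 then applies.

```latex
Fix $r\in{\mathbb{R}}_{0}$ and recall that $i^{0}(r)=[\phi_{r}]_{{\mathcal{F}}_{0}}$. Then
\[
\{n\in{\mathbb{N}}_{0}\mid\mbox{Id}(n)\geq\phi_{r}(n)\}
  =\{n\in{\mathbb{N}}_{0}\mid n\geq r\},
\]
and the complement $\{n\in{\mathbb{N}}_{0}\mid n<r\}$ of this set is finite.
Hence the set belongs to ${\mathcal{F}}_{0}$, and Proposition 2.3,
applied to the formula expressing $x\geq y$, gives
\[
[\mbox{Id}]_{{\mathcal{F}}_{0}}\geq[\phi_{r}]_{{\mathcal{F}}_{0}}=i^{0}(r)=r.
\]
```

Since $r\in{\mathbb{R}}_{0}$ was arbitrary, $[\mbox{Id}]_{{\mathcal{F}}_{0}}$ is an upper bound of ${\mathbb{R}}_{0}$ in ${\mathbb{R}}_{1}$ .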

Let $\mathscr{M}$ be the Mostowski collapsing map on $V_{1}$ , i.e., $\mathscr{M}(a)=a$ for every $a\in{\mathbb{R}}_{1}$ and

\mathscr{M}(b):=\{\mathscr{M}(a)\mid a\,^{*}\!\!\in b\}

for every $b\in V_{1}\setminus{\mathbb{R}}_{1}$ . Then $\mathscr{M}$ is an injection and $a\,^{*}\!\!\in b$ iff $\mathscr{M}(a)\in\mathscr{M}(b)$ . If one identifies $V_{1}$ with the image of $V_{1}$ under $\mathscr{M}$ , one can consider $V_{1}$ as a subset of the superstructure ${\mathbb{V}}({\mathbb{R}}_{1})$ . Hence, we can pretend that ${}^{*}\!\!\in$ is the true membership relation $\in$ and drop the superscript ${}^{*}$ for notational convenience. The purpose of the truncation of ${\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}$ at ${}^{*}\!\!\in$ -rank $\omega$ is to make sure $\mathscr{M}$ is well defined in the standard sense.

Let ${\mathbb{V}}_{1}:=(V_{1};\in)$ . We call ${\mathbb{V}}_{1}$ a nonstandard universe extending ${\mathbb{V}}_{0}$ . We will extend ${\mathbb{V}}_{1}$ further later. Notice that due to the truncation, ${\mathbb{V}}_{1}$ is no longer an elementary extension of ${\mathbb{V}}_{0}$ from the model theoretic point of view. However, ${\mathbb{V}}_{1}$ is a so-called bounded elementary extension of ${\mathbb{V}}_{0}$ .

An $\mathscr{L}$ –formula $\theta$ has bounded quantifiers if every occurrence of quantifiers $\forall$ and $\exists$ in $\theta$ has the form $\forall x\!\in\!y$ and $\exists x\!\in\!y$ . Notice that $\forall x\!\in\!y\,\varphi$ is the abbreviation of $\forall x(x\!\in\!y\to\varphi)$ and $\exists x\!\in\!y\,\varphi$ is the abbreviation of $\exists x(x\!\in\!y\wedge\varphi)$ . Similar to (2), it is easy to show that for any $\mathscr{L}$ –formula $\varphi(\overline{x})$ with bounded quantifiers and any $\overline{a}\in{\mathbb{V}}_{0}$ we have

{\mathbb{V}}_{0}\models\varphi(\overline{a})\,\mbox{ iff }\,{\mathbb{V}}_{1}\models\varphi(i^{0}(\overline{a})).

(3)

So, the map $i^{0}$ is called a bounded elementary embedding from ${\mathbb{V}}_{0}$ to ${\mathbb{V}}_{1}$ . It is a common abuse of notation to write ${\mathbb{V}}_{0}\preceq{\mathbb{V}}_{1}$ to indicate the existence of the bounded elementary embedding (instead of just elementary embedding) $i^{0}$ and ${\mathbb{V}}_{0}\prec{\mathbb{V}}_{1}$ to emphasize that $i^{0}$ is not surjective. The property (3) is sometimes called the transfer principle in nonstandard analysis. Notice that if $A\in{\mathbb{V}}_{0}$ , then $i^{0}(A)$ can be viewed as the ultrapower of $A$ modulo ${\mathcal{F}}_{0}$ , i.e., $i^{0}(A)=\{[f]_{{\mathcal{F}}_{0}}\mid f\in A^{{\mathbb{N}}_{0}}\}$ .

Each $i^{0}(a)$ for $a\in{\mathbb{V}}_{0}$ is called ${\mathbb{V}}_{0}$ –internal (or “standard” in some literature) and each $b\in{\mathbb{V}}_{1}$ is called ${\mathbb{V}}_{1}$ –internal. Hence every ${\mathbb{V}}_{0}$ –internal set is also a ${\mathbb{V}}_{1}$ –internal set. For example, ${\mathbb{R}}_{1}=i^{0}({\mathbb{R}}_{0})$ and ${\mathbb{N}}_{1}=i^{0}({\mathbb{N}}_{0})$ are ${\mathbb{V}}_{0}$ –internal sets. Some ${\mathbb{V}}_{1}$ –internal sets are not ${\mathbb{V}}_{0}$ –internal. For example, the set $\{1,2,\ldots,[\mbox{Id}]_{{\mathcal{F}}_{0}}\}$ is a ${\mathbb{V}}_{1}$ –internal subset of ${\mathbb{N}}_{1}$ but not ${\mathbb{V}}_{0}$ –internal. Some subsets of a ${\mathbb{V}}_{1}$ –internal set are not ${\mathbb{V}}_{1}$ –internal. For example, ${\mathbb{N}}_{0}$ as a subset of ${\mathbb{N}}_{1}$ is not ${\mathbb{V}}_{1}$ –internal because it is bounded above in ${\mathbb{N}}_{1}$ and has no largest element. Notice that $i^{0}[{\mathbb{N}}_{0}]={\mathbb{N}}_{0}$ and $i^{0}({\mathbb{N}}_{0})={\mathbb{N}}_{1}$ . The following proposition says that a subset of $B\in{\mathbb{V}}_{1}$ defined by an $\mathscr{L}$ –formula with bounded quantifiers and parameters from ${\mathbb{V}}_{1}$ is ${\mathbb{V}}_{1}$ –internal. The proposition is an easy consequence of Proposition 2.3.

Proposition 2.5

Let $\varphi(\overline{a},\overline{x})$ be an $\mathscr{L}$ –formula with bounded quantifiers, and parameters $\overline{a}$ and $B^{m}$ being ${\mathbb{V}}_{1}$ –internal. Then $\left\{\overline{b}\in B^{m}\mid{\mathbb{V}}_{1}\models\varphi(\overline{a},\overline{b})\right\}$ is a ${\mathbb{V}}_{1}$ –internal set. (cf. [1, Theorem 4.4.14].)

The following is called the overspill principle in nonstandard analysis. Notice that ${\mathbb{N}}_{0}$ is an infinite initial segment of ${\mathbb{N}}_{1}$ .

Proposition 2.6

Let $U\subseteq{\mathbb{N}}_{1}$ be an infinite proper initial segment of ${\mathbb{N}}_{1}$ which is not ${\mathbb{V}}_{1}$ –internal. Let $A$ be a ${\mathbb{V}}_{1}$ –internal subset of ${\mathbb{N}}_{1}$ . If $A\cap U$ is unbounded above in $U$ , then $A\cap({\mathbb{N}}_{1}\setminus U)\not=\emptyset$ .

The proof of Proposition 2.6 is easy. If $A\cap({\mathbb{N}}_{1}\setminus U)=\emptyset$ , then $U$ can be defined by a formula with bounded quantifiers and parameter $A$ . Hence $U$ is ${\mathbb{V}}_{1}$ –internal.
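The defining formula in this argument is not written out in the text. One possible choice, given here as a sketch, uses the order $\leq$ on ${\mathbb{N}}_{1}$ , which is itself an element of ${\mathbb{V}}_{1}$ and can therefore serve as a parameter.

```latex
Suppose $A\cap({\mathbb{N}}_{1}\setminus U)=\emptyset$, i.e., $A\subseteq U$. We claim
\[
U=\{x\in{\mathbb{N}}_{1}\mid\exists y\!\in\!A\,(x\leq y)\}.
\]
($\subseteq$) If $x\in U$, then, since $A\cap U$ is unbounded above in $U$,
there is $y\in A\cap U$ with $x\leq y$.

($\supseteq$) If $x\leq y$ for some $y\in A\subseteq U$, then $x\in U$
because $U$ is an initial segment of ${\mathbb{N}}_{1}$.

The right-hand side is defined by a formula with bounded quantifiers and
${\mathbb{V}}_{1}$--internal parameters $A$ and $\leq$, so it is
${\mathbb{V}}_{1}$--internal by Proposition 2.5. This contradicts the
assumption that $U$ is not ${\mathbb{V}}_{1}$--internal.
```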

§2.5 Construction of ${\mathbb{V}}_{2}$ and ${\mathbb{V}}_{3}$. We now extend ${\mathbb{V}}_{1}$ further to ${\mathbb{V}}_{2}$ and ${\mathbb{V}}_{3}$ to form a nonstandard extension chain

{\mathbb{V}}_{0}\prec{\mathbb{V}}_{1}\prec{\mathbb{V}}_{2}\prec{\mathbb{V}}_{3}

using ultrapower construction and show the existence of bounded elementary embeddings $i_{0}$ , $i_{*}$ , $i_{1}$ , and $i_{2}$ besides the natural embeddings $i^{1}:{\mathbb{V}}_{1}\to{\mathbb{V}}_{2}$ and $i^{2}:{\mathbb{V}}_{2}\to{\mathbb{V}}_{3}$ . The chain and embeddings will satisfy the following properties. Let ${\mathbb{N}}_{j+1}=i^{j}({\mathbb{N}}_{j})$ and ${\mathbb{R}}_{j+1}=i^{j}({\mathbb{R}}_{j})$ be the set of all positive integers and the set of all real numbers, respectively, in ${\mathbb{V}}_{j+1}$ for $j=1,2$ .

Property 2.7

1.

For $j=0,1,2$ , ${\mathbb{N}}_{j+1}$ is an end–extension of ${\mathbb{N}}_{j}$ , i.e., every number in ${\mathbb{N}}_{j+1}\setminus{\mathbb{N}}_{j}$ is greater than each number in ${\mathbb{N}}_{j}$ .
2.

There is a bounded elementary embedding $i_{*}$ from the $\mathscr{L}^{\prime}$ –model
$({\mathbb{V}}_{2};{\mathbb{R}}_{0},{\mathbb{R}}_{1})$ to the $\mathscr{L}^{\prime}$ –model $({\mathbb{V}}_{3};{\mathbb{R}}_{1},{\mathbb{R}}_{2})$ , where $\mathscr{L}^{\prime}:=\mathscr{L}\cup\{P_{1},P_{2}\}$ for two new unary predicate symbols $P_{1}$ and $P_{2}$ not in $\mathscr{L}$ . Furthermore, the map $i_{1}:=i_{*}\!\upharpoonright\!{\mathbb{V}}_{1}$ is a bounded elementary embedding from $({\mathbb{V}}_{1};{\mathbb{R}}_{0})$ to $({\mathbb{V}}_{2};{\mathbb{R}}_{1})$ . Notice that $i_{1}(a)\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ for each $a\in{\mathbb{N}}_{1}\setminus{\mathbb{N}}_{0}$ .
3.

There is a bounded elementary embedding $i_{2}$ from ${\mathbb{V}}_{2}$ to ${\mathbb{V}}_{3}$ such that $i_{2}\!\upharpoonright\!{\mathbb{N}}_{1}$ is an identity map and $i_{2}(a)\in{\mathbb{N}}_{3}\setminus{\mathbb{N}}_{2}$ for each $a\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ .

The rest of this section is devoted to establishing this property.

$\blacklozenge$ : From now on, let ${\mathbb{V}}_{j}^{{\mathbb{N}}_{j}}$ always represent, for notational convenience, the set of all functions $f$ from ${\mathbb{N}}_{j}$ to ${\mathbb{V}}_{j}$ such that $\{\mbox{rank}(f(n))\mid n\in{\mathbb{N}}_{j}\}$ is a bounded set in $\omega$ .

Notice that ${\mathcal{F}}_{1}:=i^{0}({\mathcal{F}}_{0})\in{\mathbb{V}}_{1}$ satisfies Parts 1–4 of Definition 2.1 with $X$ , $\mathscr{P}_{<\omega}(X)$ , and $\mathscr{P}(X)$ being replaced by ${\mathbb{N}}_{1}:=i^{0}({\mathbb{N}}_{0})$ , $i^{0}(\mathscr{P}_{<{\mathbb{N}}_{0}}({\mathbb{N}}_{0}))=\left\{A\subseteq{\mathbb{N}}_{1}\mid A\in{\mathbb{V}}_{1}\,\mbox{ and }\,|A|\in{\mathbb{N}}_{1}\right\}$ , and $i^{0}(\mathscr{P}({\mathbb{N}}_{0}))={\mathbb{V}}_{1}\cap\mathscr{P}({\mathbb{N}}_{1})$ , respectively. We call ${\mathcal{F}}_{1}$ a ${\mathbb{V}}_{1}$ –internal non-principal ultrafilter on ${\mathbb{N}}_{1}$ .

Definition 2.8

Let ${\mathcal{F}}_{1}:=i^{0}({\mathcal{F}}_{0})$ . Denote ${\mathbb{V}}_{2}$ for the model $(V_{2};\,\in_{2})$ such that

V_{2}:=({\mathbb{V}}_{1}^{{\mathbb{N}}_{1}}\cap{\mathbb{V}}_{1})/{\mathcal{F}}_{1}\,\mbox{ and}

[f]_{{\mathcal{F}}_{1}}\in_{2}[g]_{{\mathcal{F}}_{1}}\,\mbox{ iff }\,\{n\in{\mathbb{N}}_{1}\mid f(n)\in g(n)\}\in{\mathcal{F}}_{1}

for all $f,g\in{\mathbb{V}}_{1}^{{\mathbb{N}}_{1}}\cap{\mathbb{V}}_{1}$ . Let $i^{1}:{\mathbb{V}}_{1}\to{\mathbb{V}}_{2}$ be the natural embedding defined by $i^{1}(c)=[\phi_{c}]_{{\mathcal{F}}_{1}}$ for every $c\in{\mathbb{V}}_{1}$ .

We call ${\mathbb{V}}_{2}$ a ${\mathbb{V}}_{1}$ –internal ultrapower of ${\mathbb{V}}_{1}$ modulo the ${\mathbb{V}}_{1}$ –internal ultrafilter ${\mathcal{F}}_{1}$ .

Proposition 2.9

If $\varphi(\overline{x})$ is an $\mathscr{L}$ –formula with bounded quantifiers and $\overline{[a]}_{{\mathcal{F}}_{1}}\in{\mathbb{V}}_{2}$ , then

{\mathbb{V}}_{2}\models\varphi(\overline{[a]}_{{\mathcal{F}}_{1}})\,\mbox{ iff }\,\left\{n\in{\mathbb{N}}_{1}\mid{\mathbb{V}}_{1}\models\varphi(\overline{a(n)})\right\}\in{\mathcal{F}}_{1}.

Corollary 2.10

The natural embedding $i^{1}:{\mathbb{V}}_{1}\to{\mathbb{V}}_{2}$ is a bounded elementary embedding from ${\mathbb{V}}_{1}$ to ${\mathbb{V}}_{2}$ .

The proof of Proposition 2.9 is almost the same as the proof of Proposition 2.3 except for the step showing that $\left\{n\in{\mathbb{N}}_{1}\mid{\mathbb{V}}_{1}\models\exists x\!\in\!B(n)\,\varphi(\overline{a(n)},x)\right\}\in{\mathcal{F}}_{1}$ implies ${\mathbb{V}}_{2}\models\exists x\!\in\![B]_{{\mathcal{F}}_{1}}\,\varphi(\overline{[a]}_{{\mathcal{F}}_{1}},x)$ . Let $M_{z}=i^{0}({\mathbb{V}}({\mathbb{R}}_{0},z))\in{\mathbb{V}}_{1}$ with $z\in\omega$ such that $B,\overline{a}\in M_{z}$ . By the axiom of choice there is a well-order $\lhd$ on ${\mathbb{V}}({\mathbb{R}}_{0},z)$ . So, every nonempty set $A\in M_{z}$ has an $i^{0}(\lhd)$ -least element by the transfer principle. Suppose that $\left\{n\in{\mathbb{N}}_{1}\mid{\mathbb{V}}_{1}\models\exists x\!\in\!B(n)\,\varphi(\overline{a(n)},x)\right\}=X\in{\mathcal{F}}_{1}$ . For each $n\in{\mathbb{N}}_{1}$ , if $n\not\in X$ , let $f(n)=0$ and if $n\in X$ , let $f(n)$ be the $i^{0}(\lhd)$ -least element in the nonempty set $\left\{x\in B(n)\mid{\mathbb{V}}_{1}\models\varphi(\overline{a(n)},x)\right\}$ . Since $B,\overline{a},i^{0}(\lhd)$ are in ${\mathbb{V}}_{1}$ , so is the function $f$ by Proposition 2.5. Hence $[f]_{{\mathcal{F}}_{1}}\in{\mathbb{V}}_{2}\cap[B]_{{\mathcal{F}}_{1}}$ . By the induction hypothesis we have that

\left\{n\in{\mathbb{N}}_{1}\mid{\mathbb{V}}_{1}\models\varphi(\overline{a(n)},f(n))\right\}\supseteq X\in{\mathcal{F}}_{1}\,\mbox{ implies }\,{\mathbb{V}}_{2}\models\varphi(\overline{[a]}_{{\mathcal{F}}_{1}},[f]_{{\mathcal{F}}_{1}}),

which implies ${\mathbb{V}}_{2}\models\exists x\!\in\![B]_{{\mathcal{F}}_{1}}\,\varphi(\overline{[a]}_{{\mathcal{F}}_{1}},x)$ .

Let

i_{0}=i^{1}\!\circ\!i^{0}.

(4)

Then $i_{0}$ is a bounded elementary embedding from ${\mathbb{V}}_{0}$ to ${\mathbb{V}}_{2}$ , which will be used in Theorem 5.4.

As for ${\mathbb{V}}_{1}$ , we can assume by Mostowski collapsing that ${\mathbb{V}}_{2}$ is a subset of the superstructure ${\mathbb{V}}({\mathbb{R}}_{2})$ and $\in_{2}$ is the true membership relation $\in$ . Notice that $i^{1}\!\upharpoonright\!{\mathbb{R}}_{1}$ is an identity map. Each element $a\in{\mathbb{V}}_{2}$ is called ${\mathbb{V}}_{2}$ –internal, $i^{1}(b)$ is called ${\mathbb{V}}_{1}$ –internal for any $b\in{\mathbb{V}}_{1}$ , and $i_{0}(c)$ is called ${\mathbb{V}}_{0}$ –internal for every $c\in{\mathbb{V}}_{0}$ . Notice that ${\mathbb{N}}_{0}$ and ${\mathbb{N}}_{1}$ as subsets of ${\mathbb{N}}_{2}$ are not ${\mathbb{V}}_{2}$ –internal.

If $f\in{\mathbb{V}}_{1}$ is a function from ${\mathbb{N}}_{1}$ to $[n]$ for some $n\in{\mathbb{N}}_{1}$ , then $[f]_{{\mathcal{F}}_{1}}=m$ for some $m\in{\mathbb{N}}_{1}$ by (3) with ${\mathbb{V}}_{0},{\mathbb{V}}_{1},i^{0}$ being replaced by ${\mathbb{V}}_{1},{\mathbb{V}}_{2},i^{1}$ . Hence ${\mathbb{N}}_{2}$ is a proper end-extension of ${\mathbb{N}}_{1}$ .

In the same way, we can define ${\mathbb{V}}_{3}$ as a ${\mathbb{V}}_{2}$ –internal ultrapower of ${\mathbb{V}}_{2}$ modulo $i^{1}({\mathcal{F}}_{1})$ .

Definition 2.11

Let ${\mathcal{F}}_{2}:=i^{1}({\mathcal{F}}_{1})$ be the ${\mathbb{V}}_{2}$ –internal ultrafilter on ${\mathbb{N}}_{2}$ . Let ${\mathbb{V}}_{3}$ be the model $(V_{3};\,\in_{3})$ such that

V_{3}:=({\mathbb{V}}_{2}^{{\mathbb{N}}_{2}}\cap{\mathbb{V}}_{2})/{\mathcal{F}}_{2}\,\mbox{ and}

[f]_{{\mathcal{F}}_{2}}\in_{3}[g]_{{\mathcal{F}}_{2}}\,\mbox{ iff }\,\{n\in{\mathbb{N}}_{2}\mid f(n)\in g(n)\}\in{\mathcal{F}}_{2}

for all $f,g\in{\mathbb{V}}_{2}^{{\mathbb{N}}_{2}}\cap{\mathbb{V}}_{2}$ , and define the natural embedding $i^{2}:{\mathbb{V}}_{2}\to{\mathbb{V}}_{3}$ by $i^{2}(c)=[\phi_{c}]_{{\mathcal{F}}_{2}}$ for every $c\in{\mathbb{V}}_{2}$ .

Generalizing the arguments above, we have that the map $i^{2}$ is a bounded elementary embedding from ${\mathbb{V}}_{2}$ to ${\mathbb{V}}_{3}$ . We say that ${\mathbb{V}}_{3}$ is a nonstandard extension of ${\mathbb{V}}_{2}$ . It is also easy to see that ${\mathbb{N}}_{3}:=i^{2}({\mathbb{N}}_{2})$ is an end-extension of ${\mathbb{N}}_{2}$ .

We have completed the construction of ${\mathbb{V}}_{0}\prec{\mathbb{V}}_{1}\prec{\mathbb{V}}_{2}\prec{\mathbb{V}}_{3}$ and verified that Part 1 of Property 2.7 is true.

§2.6 Bounded elementary embeddings $i_{*}$ , $i_{1}$ , and $i_{2}$

To verify Parts 2–3 of Property 2.7 we have to view the construction of ${\mathbb{V}}_{j}$ for $j=2,3$ from a different angle. For a set $A\subseteq{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}$ and $n\in{\mathbb{N}}_{0}$ , let $A_{n}:=\left\{m\in{\mathbb{N}}_{0}\mid(m,n)\in A\right\}$ . Let ${\mathcal{F}}_{0}$ and ${\mathcal{F}}_{0}^{\prime}$ be two non-principal ultrafilters on ${\mathbb{N}}_{0}$ . The tensor product of ${\mathcal{F}}_{0}$ and ${\mathcal{F}}^{\prime}_{0}$ is defined by

{\mathcal{F}}_{0}\otimes{\mathcal{F}}_{0}^{\prime}:=\left\{A\subseteq{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}\mid\{n\in{\mathbb{N}}_{0}\mid A_{n}\in{\mathcal{F}}_{0}\}\in{\mathcal{F}}_{0}^{\prime}\right\}.

It is easy to check that ${\mathcal{F}}_{0}\otimes{\mathcal{F}}_{0}^{\prime}$ is a non-principal ultrafilter on ${\mathbb{N}}_{0}\times{\mathbb{N}}_{0}$ . For simplicity we assume that ${\mathcal{F}}_{0}$ and ${\mathcal{F}}^{\prime}_{0}$ are the same ultrafilter.
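
As a concrete illustration of the definition (an example added for orientation only; it is not used later), consider $A:=\left\{(m,n)\in{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}\mid m>n\right\}$ . For every $n\in{\mathbb{N}}_{0}$ the section $A_{n}=\{m\in{\mathbb{N}}_{0}\mid m>n\}$ is cofinite and therefore belongs to the non-principal ultrafilter ${\mathcal{F}}_{0}$ , so

\{n\in{\mathbb{N}}_{0}\mid A_{n}\in{\mathcal{F}}_{0}\}={\mathbb{N}}_{0}\in{\mathcal{F}}_{0}^{\prime},\,\mbox{ and hence }\,A\in{\mathcal{F}}_{0}\otimes{\mathcal{F}}_{0}^{\prime}.

By contrast, the transposed set $A^{\prime}:=\left\{(m,n)\in{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}\mid m<n\right\}$ has only finite sections $A^{\prime}_{n}=\{m\in{\mathbb{N}}_{0}\mid m<n\}$ , none of which is in ${\mathcal{F}}_{0}$ , so $A^{\prime}\not\in{\mathcal{F}}_{0}\otimes{\mathcal{F}}_{0}^{\prime}$ . In particular, the tensor product is not symmetric in its two coordinates, even when ${\mathcal{F}}_{0}={\mathcal{F}}_{0}^{\prime}$ .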

Lemma 2.12

Let ${\mathbb{V}}^{\prime}_{2}:=(V_{0}^{{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}\otimes{\mathcal{F}}_{0}^{\prime};\in^{\prime})$ where $[f]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}\in^{\prime}[g]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}$ iff
$\left\{(m,n)\in{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}\mid f(m,n)\in g(m,n)\right\}\in{\mathcal{F}}_{0}\otimes{\mathcal{F}}_{0}^{\prime}$ . Then

{\mathbb{V}}_{2}\cong{\mathbb{V}}_{2}^{\prime}.

Proof Let $m$ range over the first copy of ${\mathbb{N}}_{0}$ and $n$ over the second copy of ${\mathbb{N}}_{0}$ in ${\mathbb{N}}_{0}\times{\mathbb{N}}_{0}$ . Given $a\in{\mathbb{V}}_{2}$ , there is an $f_{a}\in{\mathbb{V}}_{1}^{{\mathbb{N}}_{1}}\cap{\mathbb{V}}_{1}$ such that $a=[f_{a}]_{{\mathcal{F}}_{1}}$ . Since the ranks of all values of $f_{a}$ are bounded, the range of $f_{a}$ is contained in a ${\mathbb{V}}_{0}$ –internal set $i^{0}(B)$ . So, by identifying $f_{a}$ with its graph, we have $f_{a}\subseteq{\mathbb{N}}_{1}\times i^{0}(B)$ . Since $f_{a}\in{\mathbb{V}}_{1}$ , there is a $g_{a}\in{\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}$ such that $f_{a}=[g_{a}]_{{\mathcal{F}}^{\prime}_{0}}$ where $g_{a}(n)\subseteq{\mathbb{N}}_{0}\times B$ is the graph of a function $g_{a}(n):{\mathbb{N}}_{0}\to B$ for all $n\in{\mathbb{N}}_{0}$ . Now let $F_{a}:{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}\to B$ be such that $F_{a}(m,n)=g_{a}(n)(m)\in B$ . Then $F_{a}\in{\mathbb{V}}_{0}^{{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}}$ .

We view $a\mapsto[F_{a}]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}$ as a relation between ${\mathbb{V}}_{2}$ and ${\mathbb{V}}^{\prime}_{2}$ . Notice that $[X]_{{\mathcal{F}}^{\prime}_{0}}\in{\mathcal{F}}_{1}$ iff $\{n\in{\mathbb{N}}_{0}\mid X(n)\in{\mathcal{F}}_{0}\}\in{\mathcal{F}}^{\prime}_{0}$ . Notice also that for any $a,b\in{\mathbb{V}}_{2}$ , we have

	$\displaystyle a=b\,\mbox{ iff }$
			$\displaystyle\left\{[x]_{{\mathcal{F}}^{\prime}_{0}}\in{\mathbb{N}}_{1}\mid f_{a}([x]_{{\mathcal{F}}^{\prime}_{0}})=f_{b}([x]_{{\mathcal{F}}^{\prime}_{0}})\right\}\in{\mathcal{F}}_{1}\,\mbox{ iff}$
			$\displaystyle\left\{[x]_{{\mathcal{F}}^{\prime}_{0}}\in{\mathbb{N}}_{1}\mid[g_{a}]_{{\mathcal{F}}^{\prime}_{0}}([x]_{{\mathcal{F}}^{\prime}_{0}})=[g_{b}]_{{\mathcal{F}}^{\prime}_{0}}([x]_{{\mathcal{F}}^{\prime}_{0}})\right\}\in{\mathcal{F}}_{1}\,\mbox{ iff}$
			$\displaystyle\left\{[x]_{{\mathcal{F}}^{\prime}_{0}}\in{\mathbb{N}}_{1}\mid[g_{a}(n)(x(n))]_{{\mathcal{F}}^{\prime}_{0}}=[g_{b}(n)(x(n))]_{{\mathcal{F}}^{\prime}_{0}}\right\}\in{\mathcal{F}}_{1}\,\mbox{ iff}$
			$\displaystyle\left\{n\in{\mathbb{N}}_{0}\mid\{m=x(n)\in{\mathbb{N}}_{0}\mid g_{a}(n)(x(n))=g_{b}(n)(x(n))\}\in{\mathcal{F}}_{0}\right\}\in{\mathcal{F}}^{\prime}_{0}\,\mbox{ iff}$
			$\displaystyle\left\{n\in{\mathbb{N}}_{0}\mid\{m\in{\mathbb{N}}_{0}\mid g_{a}(n)(m)=g_{b}(n)(m)\}\in{\mathcal{F}}_{0}\right\}\in{\mathcal{F}}^{\prime}_{0}\,\mbox{ iff}$
			$\displaystyle\left\{n\in{\mathbb{N}}_{0}\mid\{m\in{\mathbb{N}}_{0}\mid F_{a}(m,n)=F_{b}(m,n)\}\in{\mathcal{F}}_{0}\right\}\in{\mathcal{F}}^{\prime}_{0}\,\mbox{ iff}$
			$\displaystyle\left\{(m,n)\in{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}\mid F_{a}(m,n)=F_{b}(m,n)\right\}\in{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}\,\mbox{ iff}$
			$\displaystyle[F_{a}]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}=[F_{b}]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}.$

Hence, the relation $a\mapsto[F_{a}]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}$ is an injective function from ${\mathbb{V}}_{2}$ to ${\mathbb{V}}^{\prime}_{2}$ .

On the other hand, given $[F]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}\in{\mathbb{V}}_{0}^{{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}$ , let $g(n)\in{\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}$ be such that $g(n)(m)=F(m,n)$ . Let $f:{\mathbb{N}}_{1}\to{\mathbb{V}}_{1}$ be such that $f([x]_{{\mathcal{F}}^{\prime}_{0}}):=[g]_{{\mathcal{F}}^{\prime}_{0}}([x]_{{\mathcal{F}}^{\prime}_{0}})=[g(n)(x(n))]_{{\mathcal{F}}^{\prime}_{0}}$ . Then $f=[g]_{{\mathcal{F}}^{\prime}_{0}}\in{\mathbb{V}}_{1}$ . So, $a=[f]_{{\mathcal{F}}_{1}}\in{\mathbb{V}}_{2}$ is such that $[F_{a}]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}=[F]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}$ . This shows that $a\mapsto[F_{a}]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}$ is surjective.

Notice also that for any $a,b\in{\mathbb{V}}_{2}$ ,

	$\displaystyle a\in b\,\mbox{ iff}$
			$\displaystyle\left\{[x]_{{\mathcal{F}}^{\prime}_{0}}\in{\mathbb{N}}_{1}\mid f_{a}([x]_{{\mathcal{F}}^{\prime}_{0}})\in f_{b}([x]_{{\mathcal{F}}^{\prime}_{0}})\right\}\in{\mathcal{F}}_{1}\,\mbox{ iff}$
			$\displaystyle\left\{[x]_{{\mathcal{F}}^{\prime}_{0}}\in{\mathbb{N}}_{1}\mid[g_{a}]_{{\mathcal{F}}^{\prime}_{0}}([x]_{{\mathcal{F}}^{\prime}_{0}})\in[g_{b}]_{{\mathcal{F}}^{\prime}_{0}}([x]_{{\mathcal{F}}^{\prime}_{0}})\right\}\in{\mathcal{F}}_{1}\,\mbox{ iff}$
			$\displaystyle\left\{[x]_{{\mathcal{F}}^{\prime}_{0}}\in{\mathbb{N}}_{1}\mid[g_{a}(n)(x(n))]_{{\mathcal{F}}^{\prime}_{0}}\in[g_{b}(n)(x(n))]_{{\mathcal{F}}^{\prime}_{0}}\right\}\in{\mathcal{F}}_{1}\,\mbox{ iff}$
			$\displaystyle\left\{n\in{\mathbb{N}}_{0}\mid\{m=x(n)\in{\mathbb{N}}_{0}\mid g_{a}(n)(x(n))\in g_{b}(n)(x(n))\}\in{\mathcal{F}}_{0}\right\}\in{\mathcal{F}}^{\prime}_{0}\,\mbox{ iff}$
			$\displaystyle\left\{n\in{\mathbb{N}}_{0}\mid\{m\in{\mathbb{N}}_{0}\mid g_{a}(n)(m)\in g_{b}(n)(m)\}\in{\mathcal{F}}_{0}\right\}\in{\mathcal{F}}^{\prime}_{0}\,\mbox{ iff}$
			$\displaystyle\left\{n\in{\mathbb{N}}_{0}\mid\{m\in{\mathbb{N}}_{0}\mid F_{a}(m,n)\in F_{b}(m,n)\}\in{\mathcal{F}}_{0}\right\}\in{\mathcal{F}}^{\prime}_{0}\,\mbox{ iff}$
			$\displaystyle\left\{(m,n)\in{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}\mid F_{a}(m,n)\in F_{b}(m,n)\right\}\in{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}\,\mbox{ iff}$
			$\displaystyle[F_{a}]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}\in^{\prime}[F_{b}]_{{\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}}.$

This completes the proof of ${\mathbb{V}}_{2}\cong{\mathbb{V}}^{\prime}_{2}$ . $\blacksquare$

If we identify each of ${\mathbb{V}}_{2}$ and ${\mathbb{V}}^{\prime}_{2}$ with its image under the Mostowski collapse, then ${\mathbb{V}}_{2}$ and ${\mathbb{V}}^{\prime}_{2}$ can be viewed as the same model.

Lemma 2.13

Let ${\mathbb{V}}^{\prime\prime}_{2}:=\left((V_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0})^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}^{\prime};\in^{\prime\prime}\right)$ where, for any $f,g:{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}\to{\mathbb{V}}_{0}$ , $[[f]_{{\mathcal{F}}_{0}}]_{{\mathcal{F}}^{\prime}_{0}}\in^{\prime\prime}[[g]_{{\mathcal{F}}_{0}}]_{{\mathcal{F}}^{\prime}_{0}}$ iff
$\left\{n\in{\mathbb{N}}_{0}\mid\left\{m\in{\mathbb{N}}_{0}\mid f(m,n)\in g(m,n)\right\}\in{\mathcal{F}}_{0}\right\}\in{\mathcal{F}}_{0}^{\prime}$ . Then

{\mathbb{V}}^{\prime}_{2}\cong{\mathbb{V}}_{2}^{\prime\prime}.

The proof of Lemma 2.13 can be found in [1, Proposition 6.5.2]. We call ${\mathbb{V}}_{2}^{\prime\prime}$ the external ultrapower of ${\mathbb{V}}_{1}$ modulo ${\mathcal{F}}^{\prime}_{0}$ . By Lemma 2.12 and Lemma 2.13 we can view ${\mathbb{V}}_{2}$ and ${\mathbb{V}}_{2}^{\prime\prime}$ as the same model and write $\in$ for $\in^{\prime\prime}$ . To summarize, we have that

{\mathbb{V}}^{\prime\prime}_{2}=({\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0})^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}^{\prime}={\mathbb{V}}_{0}^{{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}\otimes{\mathcal{F}}_{0}^{\prime}=({\mathbb{V}}_{1}^{{\mathbb{N}}_{1}}\cap{\mathbb{V}}_{1})/{\mathcal{F}}_{1}={\mathbb{V}}_{2}.

(5)

The term on the left side is the two-step iteration when we take the ultrapowers by using ${\mathcal{F}}_{0}$ first and ${\mathcal{F}}^{\prime}_{0}$ second. The term on the right side is when we use ${\mathcal{F}}^{\prime}_{0}$ first so that $V_{0}^{{\mathbb{N}}_{0}}$ becomes ${\mathbb{V}}_{1}$ , ${\mathbb{N}}_{0}$ becomes ${\mathbb{N}}_{1}$ , and ${\mathcal{F}}_{0}$ becomes ${\mathcal{F}}_{1}$ .

However, the bounded elementary embedding from ${\mathbb{V}}_{1}$ to ${\mathbb{V}}_{2}$ induced by the ${\mathbb{V}}_{1}$ –internal ultrapower modulo ${\mathcal{F}}_{1}$ and the bounded elementary embedding from ${\mathbb{V}}_{1}$ to ${\mathbb{V}}_{2}^{\prime\prime}$ induced by the external ultrapower of ${\mathbb{V}}_{1}$ modulo ${\mathcal{F}}_{0}^{\prime}$ are different. Let $i_{1}:{\mathbb{V}}_{1}\to{\mathbb{V}}_{2}^{\prime\prime}={\mathbb{V}}_{1}^{{\mathbb{N}}_{0}}/{\mathcal{F}}^{\prime}_{0}$ be such that $i_{1}(c)=[\phi_{c}]_{{\mathcal{F}}^{\prime}_{0}}$ for each $c\in{\mathbb{V}}_{1}$ . If $c\in{\mathbb{N}}_{0}\subseteq{\mathbb{N}}_{1}$ , then clearly we have $[\phi_{c}]_{{\mathcal{F}}_{0}^{\prime}}=c$ . If $c,c^{\prime}\in{\mathbb{N}}_{1}\setminus{\mathbb{N}}_{0}$ with $c=[f]_{{\mathcal{F}}^{\prime}_{0}}$ and $c^{\prime}=[g]_{{\mathcal{F}}^{\prime}_{0}}$ for some $f,g\in{\mathbb{N}}_{0}^{{\mathbb{N}}_{0}}$ , then $c>g(n)$ for every $n\in{\mathbb{N}}_{0}$ . Hence $i_{1}(c)=[\phi_{c}]_{{\mathcal{F}}_{0}^{\prime}}>[g]_{{\mathcal{F}}_{0}^{\prime}}=c^{\prime}$ . This shows that $i_{1}(c)\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ . The map $i_{1}$ will be $i_{*}\!\upharpoonright\!{\mathbb{V}}_{1}$ for a bounded elementary embedding $i_{*}$ defined below. Notice that the above arguments still work if the ultrafilters ${\mathcal{F}}_{0}$ and ${\mathcal{F}}_{0}^{\prime}$ are different.

Generalizing the construction further and with the help of Mostowski collapsing, we have that

	$\displaystyle{\mathbb{V}}^{\prime\prime}_{3}:=(({\mathbb{V}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0})^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}^{\prime})^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}^{\prime\prime}={\mathbb{V}}_{2}^{{\mathbb{N}}_{0}}/{\mathcal{F}}^{\prime\prime}_{0}$
			$\displaystyle={\mathbb{V}}_{0}^{{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}\times{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}\otimes{\mathcal{F}}_{0}^{\prime}\otimes{\mathcal{F}}_{0}^{\prime\prime}$
			$\displaystyle=({\mathbb{V}}_{1}^{{\mathbb{N}}_{1}\times{\mathbb{N}}_{1}}\cap{\mathbb{V}}_{1})/{\mathcal{F}}_{1}\otimes{\mathcal{F}}^{\prime}_{1}=({\mathbb{V}}_{2}^{{\mathbb{N}}_{1}}\cap{\mathbb{V}}_{1})/{\mathcal{F}}^{\prime}_{1}={\mathbb{V}}_{3}^{\prime}$
			$\displaystyle=({\mathbb{V}}_{1}^{{\mathbb{N}}_{1}\times{\mathbb{N}}_{1}}\cap{\mathbb{V}}_{1})/{\mathcal{F}}_{1}\otimes{\mathcal{F}}^{\prime}_{1}=({\mathbb{V}}_{2}^{{\mathbb{N}}_{2}}\cap{\mathbb{V}}_{2})/{\mathcal{F}}_{2}={\mathbb{V}}_{3}.$

Let $i_{*}:{\mathbb{V}}_{2}\to{\mathbb{V}}_{3}^{\prime\prime}={\mathbb{V}}_{2}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}^{\prime\prime}$ be the bounded elementary embedding determined by (2) with $i_{*}(c)=[\phi_{c}]_{{\mathcal{F}}_{0}^{\prime\prime}}$ for every $c\in{\mathbb{V}}_{2}$ . Since ${\mathbb{R}}_{0}^{{\mathbb{N}}_{0}}/{\mathcal{F}}^{\prime\prime}_{0}={\mathbb{R}}_{1}$ and ${\mathbb{R}}_{1}^{{\mathbb{N}}_{0}}/{\mathcal{F}}^{\prime\prime}_{0}=({\mathbb{R}}_{1}^{{\mathbb{N}}_{1}}\cap{\mathbb{V}}_{1})/{\mathcal{F}}_{1}={\mathbb{R}}_{2}$ we conclude that the map $i_{*}$ is also a bounded elementary embedding from the expanded model $({\mathbb{V}}_{2};{\mathbb{R}}_{0},{\mathbb{R}}_{1})$ to $({\mathbb{V}}_{3};{\mathbb{R}}_{1},{\mathbb{R}}_{2})$ . Since ${\mathbb{V}}_{2}={\mathbb{V}}_{1}^{{\mathbb{N}}_{0}}/{\mathcal{F}}_{0}^{\prime\prime}$ , we have that $i_{*}\!\upharpoonright\!{\mathbb{V}}_{1}$ coincides with the map $i_{1}$ introduced above in the construction of ${\mathbb{V}}_{2}^{\prime\prime}$ , and is a bounded elementary embedding from $({\mathbb{V}}_{1};{\mathbb{R}}_{0})$ to $({\mathbb{V}}_{2};{\mathbb{R}}_{1})$ . Hence Part 2 of Property 2.7 is verified.

Let $i_{2}:{\mathbb{V}}_{2}\to{\mathbb{V}}_{3}^{\prime}=({\mathbb{V}}_{2}^{{\mathbb{N}}_{1}}\cap{\mathbb{V}}_{1})/{\mathcal{F}}_{1}^{\prime}$ be the bounded elementary embedding determined by (2) with $i_{2}(c)=[\phi_{c}]_{{\mathcal{F}}_{1}^{\prime}}$ for every $c\in{\mathbb{V}}_{2}$ . Since every ${\mathbb{V}}_{1}$ –internal function from ${\mathbb{N}}_{1}$ to $[n]$ for some $n\in{\mathbb{N}}_{1}$ is equivalent to a constant function modulo ${\mathcal{F}}^{\prime}_{1}$ we conclude that $i_{2}\!\upharpoonright\!{\mathbb{N}}_{1}$ is an identity map. Similar to the argument for $i_{1}$ , we have $i_{2}(c)\in{\mathbb{N}}_{3}\setminus{\mathbb{N}}_{2}$ for every $c\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ . This verifies Part 3 of Property 2.7.

The following propositions are ${\mathbb{V}}_{j}$ –versions of Proposition 2.5 and Proposition 2.6. Let $0<j\leq 3$ .

Proposition 2.14

Let $\varphi(\overline{a},\overline{x})$ be an $\mathscr{L}$ –formula with bounded quantifiers, and let the parameters $\overline{a}$ and the set $B^{m}$ be ${\mathbb{V}}_{j}$ –internal. Then $\left\{\overline{b}\in B^{m}\mid{\mathbb{V}}_{j}\models\varphi(\overline{a},\overline{b})\right\}$ is a ${\mathbb{V}}_{j}$ –internal set.

Proposition 2.15

Let $U$ be an infinite proper initial segment of ${\mathbb{N}}_{j}$ that is not ${\mathbb{V}}_{j}$ –internal. Let $A\subseteq{\mathbb{N}}_{j}$ be ${\mathbb{V}}_{j}$ –internal. If $A\cap U$ is unbounded above in $U$ , then $A\cap({\mathbb{N}}_{j}\setminus U)\not=\emptyset$ .

We would like to mention that ${\mathbb{V}}_{1}$ , ${\mathbb{V}}_{2}$ , and ${\mathbb{V}}_{3}$ are ultrapowers of ${\mathbb{V}}_{0}$ modulo the ultrafilters ${\mathcal{F}}_{0}$ , ${\mathcal{F}}_{0}\otimes{\mathcal{F}}_{0}^{\prime}$ , and ${\mathcal{F}}_{0}\otimes{\mathcal{F}}^{\prime}_{0}\otimes{\mathcal{F}}^{\prime\prime}_{0}$ , respectively. Hence, they are all countably saturated (cf. [1, Corollary 4.4.24]) although the countable saturation is not used in this paper. The proofs of Proposition 2.14 and Proposition 2.15 are the same as those of Proposition 2.5 and Proposition 2.6. Notice also that ${\mathbb{V}}_{2}$ and ${\mathbb{V}}_{3}$ are ${\mathbb{V}}_{1}$ –internal ultrapowers of ${\mathbb{V}}_{1}$ modulo the ${\mathbb{V}}_{1}$ –internal ultrafilters ${\mathcal{F}}_{1}$ and ${\mathcal{F}}_{1}\otimes{\mathcal{F}}^{\prime}_{1}$ , respectively.

The ultrafilters ${\mathcal{F}}_{0}$ , ${\mathcal{F}}^{\prime}_{0}$ , and ${\mathcal{F}}^{\prime\prime}_{0}$ do not have to be on the countable set ${\mathbb{N}}_{0}$ and do not have to be the same. The only restriction is that they have to be in ${\mathbb{V}}_{0}$ .

Iterated nonstandard extensions were used in combinatorial number theory before, e.g. in [2, 3, 6]. But we will use them in a new way by exploring the advantages of various bounded elementary embeddings between ${\mathbb{V}}_{j}$ and ${\mathbb{V}}_{j^{\prime}}$ (see Property 2.7). For most of the time we work within ${\mathbb{V}}_{2}$ . The nonstandard universe ${\mathbb{V}}_{3}$ is only used for one step in the proof of Lemma 5.1.

3 Nonstandard Versions of Some Facts

The Greek letters $\alpha,\beta,\eta,\epsilon$ will represent “standard” reals unless specified otherwise. Let $0\leq j<j^{\prime}\leq 3$ in this paragraph. An $r\in{\mathbb{R}}_{3}$ is a ${\mathbb{V}}_{j}$ –infinitesimal, denoted by $r\approx_{j}0$ , if $|r|<1/n$ for every $n\in{\mathbb{N}}_{j}$ . By an infinitesimal we mean a ${\mathbb{V}}_{0}$ –infinitesimal. Write $st_{j}$ for the ${\mathbb{V}}_{j}$ –standard part map, i.e., $st_{j}(r)$ is the unique real number $r^{\prime}\in{\mathbb{R}}_{j}$ such that $r-r^{\prime}\approx_{j}0$ when $r\in{\mathbb{R}}_{3}\cap(-m,m)$ for some $m\in{\mathbb{N}}_{j}$ . Notice that $st_{j}$ and ${\mathbb{N}}_{j}$ are definable by a formula with bounded quantifiers and parameters in $({\mathbb{V}}_{j^{\prime}};{\mathbb{R}}_{j})$ . Sometimes, the subscript $0$ will be dropped. For example, $\approx$ means $\approx_{0}$ and $st$ means $st_{0}$ . For any two positive integers $m,n\in{\mathbb{N}}_{3}$ we write $m\ll n$ when $m\in{\mathbb{N}}_{j}$ and $n\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ . Hence $m\gg 1$ means $m\in{\mathbb{N}}_{3}\setminus{\mathbb{N}}_{0}$ . Notice that if $r\in{\mathbb{R}}_{1}\cap(-m,m)$ for some $m\in{\mathbb{N}}_{0}$ and $st(r)=\alpha$ , then $st(i_{1}(r))=\alpha$ where $i_{1}$ is in Part 2 of Property 2.7. This is true because $i_{1}(\alpha)=\alpha$ , $i_{1}(1/n)=1/n$ , and $|r-\alpha|<1/n$ iff $|i_{1}(r)-i_{1}(\alpha)|<i_{1}(1/n)$ for each $n\in{\mathbb{N}}_{0}$ . Similarly, we have $st_{1}(i_{2}(r))=st_{1}(r)$ for $r\in{\mathbb{R}}_{2}$ in the domain of $st_{1}$ .

Capital letters $A$ , $B$ , $C$ , $\ldots$ represent sets of integers except $H$ , $J$ , $K$ , $N$ which are reserved for integers in ${\mathbb{N}}_{3}\setminus{\mathbb{N}}_{0}$ . The letter $k\geq 3$ represents exclusively the length of the arithmetic progression in Szemerédi’s theorem and $l$ represents an integer between $1$ and $k$ . All unspecified sets mentioned in this paper will be either standard subsets of ${\mathbb{N}}_{0}$ or ${\mathbb{V}}_{j}$ –internal sets for some $j=1,2,3$ . For any $n\in{\mathbb{N}}_{3}$ let $[n]:=\{1,2,\ldots,n\}$ .

For any bounded set $A\subseteq{\mathbb{N}}_{j^{\prime}}$ and $n\in{\mathbb{N}}_{j^{\prime}}$ write $\delta_{n}(A)$ for the quantity $|A|/n$ in ${\mathbb{V}}_{j^{\prime}}$ where $|A|$ means the internal cardinality of $A$ in ${\mathbb{V}}_{j^{\prime}}$ . Define $\mu^{j}_{n}(A):=st_{j}(\delta_{n}(A))$ for $j<j^{\prime}$ . Notice that $\delta_{n}$ is an internal function, while the functions $\mu^{j}_{n}$ are often external but definable in $({\mathbb{V}}_{j^{\prime}};{\mathbb{R}}_{j})$ for $j^{\prime}>j$ , i.e.,

\mu^{j}_{H}(A)=\alpha\,\mbox{ iff }\,\forall n\in{\mathbb{N}}_{j^{\prime}}\cap{\mathbb{R}}_{j}\,\left(|\delta_{H}(A)-\alpha|<\frac{1}{n}\right).

We often write $\mu_{n}$ for $\mu^{0}_{n}$ . If $A\subseteq\Omega$ and $|\Omega|=H$ , then $\mu_{H}(A)$ coincides with the Loeb measure of $A$ in $\Omega$ . The term $\delta_{H}$ is often used for an internal argument.
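
For instance (an illustration added here, with $H\in{\mathbb{N}}_{1}\setminus{\mathbb{N}}_{0}$ chosen arbitrarily), let $A$ be the set of even numbers in $[H]$ . Then $|A|=\lfloor H/2\rfloor$ and

\left|\delta_{H}(A)-\frac{1}{2}\right|=\left|\frac{\lfloor H/2\rfloor}{H}-\frac{1}{2}\right|\leq\frac{1}{2H}<\frac{1}{n}\,\mbox{ for every }\,n\in{\mathbb{N}}_{0},

so $\mu_{H}(A)=st(\delta_{H}(A))=1/2$ , although $\delta_{H}(A)$ itself differs from $1/2$ by the infinitesimal $1/(2H)$ when $H$ is odd. This is the typical situation: $\delta_{H}$ keeps the exact internal count, while $\mu_{H}$ discards infinitesimal errors.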

$\blacklozenge$ : The abbreviation a.p. stands for “arithmetic progression” and $n$ –a.p. stands for “ $n$ -term arithmetic progression.”

The length of an a.p. $p$ is the number of terms in $p$ which can be written as $|p|$ . The letters $P,Q,R$ are reserved exclusively for a.p.’s of length $\gg 1$ , and $p,q,r$ for a.p.’s of length $k$ or other standard length. When we run out of letters, we may also use $\vec{x},\vec{y}$ for $k$ –a.p.’s. If $1\leq l\leq|p|$ , then $p(l)$ represents the $l$ -th term of $p$ . We denote $p\subseteq A$ for $p(l)\in A$ for all $1\leq l\leq|p|$ . We allow the common difference $d$ of an a.p. to be any integer including, occasionally, the trivial case for $d=0$ . If $p$ and $q$ are two a.p.’s of the same length, then $p\oplus q$ represents the $|p|$ –a.p. $\{p(l)+q(l)\mid l=1,2,\ldots,|p|\}$ . If $p$ is an a.p. and $X$ is an element or a set, then $p\oplus X$ represents the sequence $\{p(l)+X\mid 1\leq l\leq|p|\}$ . By $p\sqsubseteq q\oplus X$ we mean $p(l)\in q(l)+X$ for every $1\leq l\leq|p|=|q|$ .
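
As a small illustration of this notation (the particular numbers are chosen only for the example), let $p=\{1,4,7\}$ and $q=\{10,12,14\}$ be $3$ –a.p.’s with common differences $3$ and $2$ . Then

p\oplus q=\{11,16,21\},\quad p\oplus 5=\{6,9,12\},\quad p\oplus[2]=\{\{2,3\},\{5,6\},\{8,9\}\}.

So $p\oplus q$ is again a $3$ –a.p., with common difference $5$ . Moreover, the $3$ –a.p. $r=\{12,14,16\}$ satisfies $r\sqsubseteq q\oplus[2]$ because $12\in\{11,12\}$ , $14\in\{13,14\}$ , and $16\in\{15,16\}$ .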

If $X\subseteq{\mathbb{R}}_{j}$ and $X\in{\mathbb{V}}_{j}$ let ${\sup}_{j}(X)$ be the supremum of $X$ in the sense of ${\mathbb{V}}_{j}$ , i.e., the unique least upper bound of $X$ in ${\mathbb{R}}_{j}$ , or $\infty$ if $X$ is unbounded above in ${\mathbb{R}}_{j}$ .

Let $1\leq j<j^{\prime}\leq 3$ . For any $A\subseteq{\mathbb{N}}_{j^{\prime}-j}$ and any collection of a.p.’s ${\mathcal{P}}\in{\mathbb{V}}_{j^{\prime}-j}$ , there exists $X\in{\mathbb{V}}_{0}$ with $X\subseteq{\mathbb{R}}_{0}$ (because every subset of ${\mathbb{R}}_{0}$ is in ${\mathbb{V}}_{0}$ ) such that $x\in X$ iff there exists a $P\in{\mathcal{P}}$ with $|P|\in{\mathbb{N}}_{j^{\prime}-j}\setminus{\mathbb{N}}_{0}$ and $\mu_{|P|}(A\cap P)=x$ . By the elementarity of $i_{*}$ in Part 2 of Property 2.7 we have that for any $A\subseteq{\mathbb{N}}_{j^{\prime}}$ and any collection of a.p.’s ${\mathcal{P}}\in{\mathbb{V}}_{j^{\prime}}$ , there exists $X\in{\mathbb{V}}_{j}$ with $X\subseteq{\mathbb{R}}_{j}$ such that $x\in X$ iff there exists a $P\in{\mathcal{P}}$ with $|P|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ and $\mu^{j}_{|P|}(A\cap P)=x$ . Therefore, the operator ${\sup}_{j}$ and hence $S\!D^{j}$ below are well defined. Similarly, $S\!D_{S}^{j}$ below is also well defined.

Definition 3.1

For $0\leq j<j^{\prime}\leq 3$ and $A\subseteq{\mathbb{N}}_{j^{\prime}}$ with $|A|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ the strong upper Banach density $S\!D^{j}(A)$ of $A$ in ${\mathbb{V}}_{j}$ is defined by

S\!D^{j}(A):={\sup}_{j}\left\{\mu^{j}_{|P|}(A\cap P)\mid|P|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}\right\}.

(8)

The letter $P$ above always represents an a.p. If $S\subseteq{\mathbb{N}}_{j^{\prime}}$ has $S\!D^{j}(S)=\eta\in{\mathbb{R}}_{j}$ and $A\subseteq{\mathbb{N}}_{j^{\prime}}$ , the strong upper Banach density $S\!D_{S}^{j}$ of $A$ relative to $S$ is defined by

S\!D^{j}_{S}(A):={\sup}_{j}\left\{\mu^{j}_{|P|}(A\cap P)\mid|P|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j},\mbox{ and }\mu^{j}_{|P|}(S\cap P)=\eta\right\}.

(9)

When $S\!D^{j}_{S}(A)$ , defined in (9), is used in this paper, the set $A$ is often a subset of $S$ although there is no such restriction in the definition.

Definition 3.2

If $A\subseteq{\mathbb{N}}_{0}$ , then the strong upper Banach density of $A$ is defined by $S\!D(A):=S\!D^{0}(i_{0}(A))$ where $S\!D^{0}$ is defined by (8) and $i_{0}$ is defined by (4).

We would like to point out that for standard sets $A\subseteq{\mathbb{N}}_{0}$ ,

S\!D(A)=\lim_{n\in{\mathbb{N}}_{0},\,n\to\infty}\sup\{\delta_{|p|}(A\cap p)\mid|p|\geq n\}.

This equality will not be used. The purpose here is to give the reader some intuition because the right side is a standard expression.

The superscript $0$ in $S\!D^{0}$ will be omitted. Notice that the upper density of a set $A\subseteq{\mathbb{N}}_{0}$ is less than or equal to the upper Banach density of $A$ , which is less than or equal to the strong upper Banach density of $A$ . The strong upper Banach density of $A$ is the nonstandard version of the density of $A$ along a collection of arbitrarily long arithmetic progressions satisfying the double counting property in [9]. Of course, if we know that Szemerédi’s theorem is true, then $S\!D(S)=\eta>0$ implies that $\eta=1$ . Also $S\!D(S)=1$ and $S\!D_{S}(A)=\alpha>0$ imply that $\alpha=1$ .
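
For a concrete instance of the gap between these densities (an illustration added here, not used in the sequel), let $A\subseteq{\mathbb{N}}_{0}$ be the set of positive even numbers, whose upper density is $1/2$ . For any $H\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{0}$ the a.p. $P:=\{2,4,\ldots,2H\}$ has length $H$ and is contained in $i_{0}(A)$ by the transfer principle, so

\delta_{|P|}(i_{0}(A)\cap P)=\frac{|P|}{|P|}=1,\,\mbox{ and hence }\,S\!D(A)=1>\frac{1}{2}.

The point of the example is that $S\!D$ measures density along a.p.’s with arbitrary common difference, not only along intervals.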

Suppose that the strong upper Banach density of $A\subseteq{\mathbb{N}}_{0}$ is a positive real number $\alpha$ . Instead of looking for $k$ –a.p.’s in $A$ we will look for $k$ –a.p.’s in $i_{0}(A)\cap P$ for some infinitely long a.p. $P$ such that the distribution of $i_{0}(A)\cap P$ in $P$ is very uniform, i.e., the measure and strong upper Banach density of $i_{0}(A)\cap P$ in $P$ are the same value $\alpha$ . The uniformity allows the use of an argument similar to the so-called density increment argument in the standard literature. The next lemma is the beginning of this effort.

Lemma 3.3

For $0\leq j<j^{\prime}\leq 3$ let $A\subseteq S\subseteq{\mathbb{N}}_{j^{\prime}}$ with $|A|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ and $\alpha,\eta\in{\mathbb{R}}_{j}$ with $0\leq\alpha\leq\eta\leq 1$ . Then the following are true:

1.

$S\!D^{j}(S)\geq\eta$ iff there exists a $P$ with $|P|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ and $\mu^{j}_{|P|}(S\cap P)\geq\eta$ ;
2.

If $S\!D^{j}(S)=\eta$ , then there exists a $P$ with $|P|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\mu^{j}_{|P|}(S\cap P)=S\!D^{j}(S\cap P)=\eta$ ;
3.

Suppose $S\!D^{j}(S)=\eta$ . Then $S\!D^{j}_{S}(A)\geq\alpha$ iff there exists a $P$ with $|P|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ , $\mu^{j}_{|P|}(S\cap P)=\eta$ , and $\mu^{j}_{|P|}(A\cap P)\geq\alpha$ ;
4.

Suppose $S\!D^{j}(S)=\eta$ . If $S\!D^{j}_{S}(A)=\alpha$ , then there exists a $P$ with $|P|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\mu^{j}_{|P|}(S\cap P)=\eta$ and $\mu^{j}_{|P|}(A\cap P)=S\!D^{j}_{S\cap P}(A\cap P)=\alpha$ .

Proof Part 1: If $S\!D^{j}(S)\geq\eta$ , then for every $n\in{\mathbb{N}}_{j}$ there is a $P_{n}$ with $|P_{n}|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\delta_{|P_{n}|}(S\cap P_{n})>\eta-1/n$ . Let

A:=\left\{n\in{\mathbb{N}}_{j^{\prime}}\mid\exists P\subseteq{\mathbb{N}}_{j^{\prime}}\,(|P|\geq n\wedge\delta_{|P|}(S\cap P)>\eta-1/n)\right\}.

Then $A$ is ${\mathbb{V}}_{j^{\prime}}$ –internal and $A\cap{\mathbb{N}}_{j}$ is unbounded above in ${\mathbb{N}}_{j}$ . By Proposition 2.15, there is a $J\in A\setminus{\mathbb{N}}_{j}$ . Hence there is an a.p. $P_{J}\subseteq{\mathbb{N}}_{j^{\prime}}$ such that $|P_{J}|\geq J\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ and $\delta_{|P_{J}|}(S\cap P_{J})>\eta-1/J\approx_{j}\eta$ . Therefore, $\mu^{j}_{|P_{J}|}(S\cap P_{J})\geq\eta$ . On the other hand, if $\mu^{j}_{|P|}(S\cap P)\geq\eta$ , then $S\!D^{j}(S)\geq\eta$ by the definition of $S\!D^{j}$ in (8).

Part 2: If $S\!D^{j}(S)=\eta$ , we can find $P$ with $|P|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\mu^{j}_{|P|}(S\cap P)=\eta^{\prime}\geq\eta$ by Part 1. Clearly, $\eta=S\!D^{j}(S)\geq S\!D^{j}(S\cap P)\geq\mu^{j}_{|P|}(S\cap P)=\eta^{\prime}$ by the definition of $S\!D^{j}$ . Hence $\eta=\eta^{\prime}$ .

Part 3: If $S\!D^{j}_{S}(A)\geq\alpha$ , then for every $n\in{\mathbb{N}}_{j}$ there is a $P$ with $|P|>n$ such that $|\delta_{|P|}(S\cap P)-\eta|<1/n$ and $\delta_{|P|}(A\cap P)>\alpha-1/n$ . By Proposition 2.15 as in the proof of Part 1 there is a $P_{J}$ for some $J\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ with $|P_{J}|\geq J$ such that $|\delta_{|P_{J}|}(S\cap P_{J})-\eta|<1/J$ and $\delta_{|P_{J}|}(A\cap P_{J})>\alpha-1/J$ , which implies $\mu^{j}_{|P_{J}|}(S\cap P_{J})=\eta$ and $\mu^{j}_{|P_{J}|}(A\cap P_{J})\geq\alpha$ . On the other hand, if $\mu^{j}_{|P|}(S\cap P)=\eta$ and $\mu^{j}_{|P|}(A\cap P)\geq\alpha$ , then $S\!D_{S}^{j}(A)\geq\alpha$ by the definition of $S\!D_{S}^{j}$ in (9).

Part 4: If $S\!D^{j}_{S}(A)=\alpha$ , then $\mu^{j}_{|P|}(S\cap P)=\eta$ and $\mu^{j}_{|P|}(A\cap P)=\alpha^{\prime}\geq\alpha$ for some $P$ with $|P|\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ by Part 3. Clearly, $\alpha=S\!D^{j}_{S}(A)\geq S\!D^{j}_{S\cap P}(A\cap P)\geq\mu^{j}_{|P|}(A\cap P)=\alpha^{\prime}$ by the definition of $S\!D_{S}^{j}$ . Hence $\alpha=\alpha^{\prime}$ . $\blacksquare$

The following lemma is the internal version of an argument similar to the so-called double counting property in the standard literature. Let $1\ll H\leq N/2$ . Roughly speaking, if $C$ is very uniformly distributed in $[N]$ with measure $\alpha$ , then for almost all $x\in[N-H]$ the measure of $C\cap(x+[H])$ inside $x+[H]$ is $\alpha$ . Since the measure $\mu_{H}$ is not an internal function we use $\delta_{H}$ instead and require $|\delta_{H}(C\cap(x+[H]))-\alpha|<1/J$ for some infinite $J$ instead of $\mu_{H}(C\cap(x+[H]))=\alpha$ . The lemma is stated in a more general case with ${\mathbb{V}}_{j}$ being viewed as the “standard” universe in ${\mathbb{V}}_{j^{\prime}}$ .

Lemma 3.4

Let $N,H\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ , $H\leq N/2$ , and $C\subseteq[N]$ with $\mu^{j}_{N}(C)=S\!D^{j}(C)=\alpha\in{\mathbb{R}}_{j}$ for $0\leq j<j^{\prime}\leq 3$ . For each $n\in{\mathbb{N}}_{j^{\prime}}$ let

D_{n,H,C}:=\left\{x\in[N-H]\mid\left|\delta_{H}(C\cap(x+[H]))-\alpha\right|<\frac{1}{n}\right\}.

(10)

Then there exists a $J\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\mu^{j}_{N-H}(D_{J,H,C})=1$ .

Notice that $D_{n,H,C}\subseteq D_{n^{\prime},H,C}$ if $n\geq n^{\prime}$ .

Proof Fix $N$ , $H$ , and $C$ . The subscripts $H$ and $C$ in $D_{n,H,C}$ will be omitted in the proof. If $st_{j}(H/N)>0$ , then for every $x\in[N-H]$ we have $\mu^{j}_{H}(C\cap(x+[H]))=\alpha$ by the supremality of $\alpha$ . Hence the maximal $J$ with $J\leq H$ such that $|\delta_{H}(C\cap(x+[H]))-\alpha|<1/J$ for every $x\in[N-H]$ is in ${\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ . Now $D_{J}=[N-H]$ works.

Assume that $st_{j}(H/N)=0$ . So, $\mu^{j}_{N-H}$ and $\mu^{j}_{N}$ coincide. If $\delta_{N}(D_{n})\approx_{j}1$ for every $n\in{\mathbb{N}}_{j}$ , then the maximal $J$ satisfying $|\delta_{N}(D_{J})-1|<1/J$ must be in ${\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ by Proposition 2.15. Hence $\mu^{j}_{N}(D_{J})=1$ . So we can assume that $\mu^{j}_{N}(D_{n})<1$ for some $n\in{\mathbb{N}}_{j}$ and derive a contradiction.

Notice that for each $x\in[N-H]$ , it is impossible to have $\mu^{j}_{H}(C\cap(x+[H]))>\alpha=S\!D^{j}(C)$ by the definition of $S\!D^{j}$ . Let $\overline{D}_{n}:=[N-H]\setminus D_{n}$ . Then $\mu^{j}_{N}(\overline{D}_{n})=1-\mu^{j}_{N}(D_{n})>0$ . Notice that $x\in\overline{D}_{n}$ implies $\delta_{H}(C\cap(x+[H]))\leq\alpha-1/n$ . By the following double counting argument, by ignoring some ${\mathbb{V}}_{j}$ –infinitesimal amount inside $st_{j}$ , we have

	$\displaystyle\alpha=st_{j}\left(\frac{1}{H}\sum_{y=1}^{H}\delta_{N}(C-y)\right)=st_{j}\left(\frac{1}{HN}\sum_{y=1}^{H}\sum_{x=1}^{N}\chi_{C}(x+y)\right)$
			$\displaystyle=st_{j}\left(\frac{1}{NH}\sum_{x=1}^{N}\sum_{y=1}^{H}\chi_{C}(x+y)\right)=st_{j}\left(\frac{1}{N}\sum_{x=1}^{N}\delta_{H}(C\cap(x+[H]))\right)$
			$\displaystyle=st_{j}\left(\frac{1}{N}\sum_{x\in D_{n}}\delta_{H}(C\cap(x+[H]))+\frac{1}{N}\sum_{x\in\overline{D}_{n}}\delta_{H}(C\cap(x+[H]))\right)$
			$\displaystyle\leq\alpha\mu^{j}_{N}(D_{n})+\left(\alpha-\frac{1}{n}\right)\mu^{j}_{N}(\overline{D}_{n})<\alpha$

which is absurd. This completes the proof. $\blacksquare$
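
The ${\mathbb{V}}_{j}$ –infinitesimal amount ignored in the first equality of the double counting argument can be made explicit. The following is only a sketch, assuming that $\delta_{N}(C-y)$ is read as $|(C-y)\cap[N]|/N$ (this reading is an assumption made here; it matches the inner sum $\sum_{x=1}^{N}\chi_{C}(x+y)$ ). Since $C\subseteq[N]$ , for each $y\in[H]$ we have $(C-y)\cap[N]=\{c-y\mid c\in C,\,c>y\}$ , and hence

0\leq\delta_{N}(C)-\delta_{N}(C-y)=\frac{|C\cap[y]|}{N}\leq\frac{H}{N}\approx_{j}0.

Averaging over $y\in[H]$ therefore changes $\delta_{N}(C)$ by at most $H/N$ , and the standard part of the average is $\mu^{j}_{N}(C)=\alpha$ . The exchange of the two sums can be checked on a finite toy instance with $N=4$ , $H=2$ , and $C=\{2,3\}$ :

\sum_{y=1}^{2}\sum_{x=1}^{4}\chi_{C}(x+y)=2+1=3=2+1+0+0=\sum_{x=1}^{4}\sum_{y=1}^{2}\chi_{C}(x+y).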

Suppose $0\leq j<j^{\prime}\leq 3$ , $N\geq H\gg 1$ in ${\mathbb{N}}_{j^{\prime}}$ , $U\subseteq[N]$ , $A\subseteq S\subseteq[N]$ , $0\leq\alpha\leq\eta\leq 1$ , and $x\in[N]$ . For each $n\in{\mathbb{N}}_{j}$ let $\xi(x,\alpha,\eta,A,S,U,H,n)$ be the following internal statement:

\begin{array}[]{rcl}\vspace{0.1in}|\delta_{H}((x+[H])\cap U)-1|&<&1/n,\\ \vspace{0.1in}|\delta_{H}((x+[H])\cap S)-\eta|&<&1/n,\mbox{ and}\\ |\delta_{H}((x+[H])\cap A)-\alpha|&<&1/n.\end{array}

(11)

The statement $\xi(x,\alpha,\eta,A,S,U,H,n)$ says that the densities of $A,S,U$ in the interval $x+[H]$ are within $1/n$ of $\alpha,\eta,1$ , respectively, and hence approach these values as $n\to\infty$ in ${\mathbb{N}}_{j}$ . The statement $\xi$ will be referred to a few times in Lemma 5.1 and its proof.

The following lemma is the application of Lemma 3.4 to the sets $U,S,A$ simultaneously.

Lemma 3.5

Let $N\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ , $U\subseteq[N]$ , and $A\subseteq S\subseteq[N]$ be such that $\mu^{j}_{N}(U)=1$ , $\mu^{j}_{N}(S)=S\!D^{j}(S)=\eta$ , and $\mu^{j}_{N}(A)=S\!D^{j}_{S}(A)=\alpha$ for some $\eta,\alpha\in{\mathbb{R}}_{j}$ and $0\leq j<j^{\prime}\leq 3$ . For any $n,h\in{\mathbb{N}}_{j^{\prime}}$ let

G_{n,h}:=\{x\in[N-h]\mid{\mathbb{V}}_{j^{\prime}}\models\xi(x,\alpha,\eta,A,S,U,h,n)\}.

(12)

(a)

For each $H\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ with $H\leq N/2$ there exists a $J\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\mu^{j}_{N-H}(G_{J,H})=1$ ;
(b)

For each $n\in{\mathbb{N}}_{j}$ , there is an $h_{n}\in{\mathbb{N}}_{j}$ with $h_{n}>n$ such that $\delta_{N}(G_{n,h_{n}})>1-1/n$ .

Proof Part (a): Applying Lemma 3.4 for $U$ and $S$ we can find $J_{1},J_{2}\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\mu^{j}_{N-H}(D_{J_{1},H,U})=1$ and $\mu^{j}_{N-H}(D_{J_{2},H,S})=1$ where $D_{n,h,C}$ is defined in (10) and $\alpha$ is replaced by $1$ for $U$ and $\eta$ for $S$ . Let $G^{\prime}:=D_{J_{1},H,U}\cap D_{J_{2},H,S}$ . For each $n\leq\min\{J_{1},J_{2}\}$ let

\overline{G}^{\prime\prime}_{n}:=\left\{x\in[N-H]\mid\delta_{H}(A\cap(x+[H]))>\alpha+\frac{1}{n}\right\},\mbox{ and}

\underline{G}^{\prime\prime}_{n}:=\left\{x\in[N-H]\mid\delta_{H}(A\cap(x+[H]))<\alpha-\frac{1}{n}\right\}.

Notice that both $\overline{G}^{\prime\prime}_{n}$ and $\underline{G}^{\prime\prime}_{n}$ are ${\mathbb{V}}_{j^{\prime}}$ –internal. If $\mu^{j}_{N-H}(\overline{G}^{\prime\prime}_{n})>0$ for some $n\in{\mathbb{N}}_{j}$ , then $\overline{G}^{\prime\prime}_{n}\cap G^{\prime}\not=\emptyset$ . Let $x_{0}\in\overline{G}^{\prime\prime}_{n}\cap G^{\prime}$ . Then we have $\mu^{j}_{H}(S\cap(x_{0}+[H]))=\eta$ and $\mu^{j}_{H}(A\cap(x_{0}+[H]))>\alpha+1/n$ , which contradicts $S\!D^{j}_{S}(A)=\alpha$ . Hence $\delta_{N-H}(\overline{G}^{\prime\prime}_{n})\approx_{j}0$ for every $n\in{\mathbb{N}}_{j}$ . By Proposition 2.15 we can find $J_{+}\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\mu^{j}_{N-H}(\overline{G}^{\prime\prime}_{n})=0$ for any $n\leq J_{+}$ . If $\mu^{j}_{N-H}(\underline{G}^{\prime\prime}_{n})>0$ for some $n\in{\mathbb{N}}_{j}$ , then $\mu^{j}_{N-H}(\overline{G}_{m}^{\prime\prime})>0$ for some $m\in{\mathbb{N}}_{j}$ by the fact that $\mu^{j}_{N-H}(A)=\alpha$ . Hence $\delta_{N-H}(\underline{G}_{n}^{\prime\prime})\approx_{j}0$ for every $n\in{\mathbb{N}}_{j}$ . By Proposition 2.15 again we can find $J_{-}\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\mu^{j}_{N-H}(\underline{G}^{\prime\prime}_{n})=0$ for any $n\leq J_{-}$ . The proof is complete by setting $J:=\min\{J_{1},J_{2},J_{+},J_{-}\}$ and

G_{J,H}:=(D_{J,H,U}\cap D_{J,H,S})\setminus(\overline{G}^{\prime\prime}_{J}\cup\underline{G}^{\prime\prime}_{J}).

Part (b): Suppose Part (b) is not true. Then there exists an $n\in{\mathbb{N}}_{j}$ such that $\delta_{N-h}(G_{n,h})\leq 1-1/n$ for any $h>n$ in ${\mathbb{N}}_{j}$ . By Proposition 2.15 there is an $H\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{j}$ such that $\delta_{N-H}(G_{n,H})\leq 1-1/n$ . By Part (a) there is a $J\gg n$ such that $\mu^{j}_{N-H}(G_{J,H})=1$ . We have a contradiction because $n<J$ and hence $G_{J,H}\subseteq G_{n,H}$ . $\blacksquare$

Notice that for a given $n$ one can choose $h_{n}$ to be the least such that $\delta_{N}(G_{n,h_{n}})>1-1/n$ in Lemma 3.5 (b). So we can assume that $h_{n}$ is an internal function of $n$ . Hence we can assume that $G_{n,h_{n}}$ is also an internal function of $n$ .

4 Mixing Lemma

We work within ${\mathbb{V}}_{j^{\prime}}$ for $0<j^{\prime}\leq 3$ in this section. Any unspecified sets are ${\mathbb{V}}_{j^{\prime}}$ –internal. The letter $V$ will sometimes be used for a set other than the standard/nonstandard universes in §2. Hopefully, no confusion will arise. The following standard lemma is a consequence of Szemerédi’s Regularity Lemma in [8]. The proof of the lemma can be found in the appendix of [9].

Lemma 4.1

Let $U,W$ be finite sets, let $\epsilon>0$ , and for each $w\in W$ , let $E_{w}$ be a subset of $U$ . Then there exists a partition $U=U_{1}\cup U_{2}\cup\cdots\cup U_{n_{\epsilon}}$ for some $n_{\epsilon}\in{\mathbb{N}}_{0}$ , and real numbers $0\leq c_{u,w}\leq 1$ in ${\mathbb{R}}_{0}$ for $u\in[n_{\epsilon}]$ and $w\in W$ such that for any set $F\subseteq U$ , one has

\left||F\cap E_{w}|-\sum_{u=1}^{n_{\epsilon}}c_{u,w}|F\cap U_{u}|\right|\leq\epsilon|U|

for all but $\epsilon|W|$ values of $w\in W$ .

The following lemma, a nonstandard version of the so–called mixing lemma in [9], can be derived from Lemma 4.1. We present a proof similar to the proof in [9] in a nonstandard setting. Part (i) and Part (ii) of the lemma are used to prove Part (iii), and only Part (iii) will be referred to in the proof of Lemma 5.1.

Lemma 4.2 (Mixing Lemma)

Let $N\in{\mathbb{N}}_{j^{\prime}}\setminus{\mathbb{N}}_{0}$ , $A\subseteq S\subseteq[N]$ , $1\ll H\leq N/2$ , and $R\subseteq[N-H]$ be an a.p. with $|R|\gg 1$ such that

\mu_{N}(S)=S\!D(S)=\eta>0,\,\mu_{N}(A)=S\!D_{S}(A)=\alpha>0,

(13)

\mu_{H}((x+[H])\cap S)=\eta,\,\mbox{ and }\,\mu_{H}((x+[H])\cap A)=\alpha

(14)

for every $x\in R$ . Then the following are true.

(i)

For any set $E\subseteq[H]$ with $\mu_{H}(E)>0$ , there is an $x\in R$ such that

$\mu_{H}(A\cap(x+E))\geq\alpha\mu_{H}(E);$
(ii)

Let $m\gg 1$ be such that the van der Waerden number $\Gamma\left(3^{m},m\right)\leq|R|$ . For any internal partition $\{U_{n}\mid n\in[m]\}$ of $[H]$ there exists an $m$ –a.p. $P\subseteq R$ , a set $I\subseteq[m]$ with $\mu_{H}(U_{I})=1$ where $U_{I}=\bigcup\{U_{n}\mid n\in I\}$ , and an infinitesimal $\epsilon>0$ such that

$|\delta_{H}(A\cap(x+U_{n}))-\alpha\delta_{H}(U_{n})|\leq\epsilon\delta_{H}(U_{n})$

for all $n\in I$ and all $x\in P$ ;
(iii)

Given an internal collection of sets $\{E_{w}\subseteq[H]\mid w\in W\}$ with $|W|\gg 1$ and $\mu_{H}(E_{w})>0$ for every $w\in W$ , there exists an $x\in R$ and $T\subseteq W$ such that $\mu_{|W|}(T)=1$ and

$\mu_{H}(A\cap(x+E_{w}))=\alpha\mu_{H}(E_{w})$

for every $w\in T$ .

Proof Part (i): Assume that (i) is not true. For each $x\in R$ let $r_{x}$ be such that $\delta_{H}(A\cap(E+x))=(\alpha-r_{x})\delta_{H}(E)$ . Then $r_{x}$ must be positive non-infinitesimal. We can set $r:=\min\{r_{x}\mid x\in R\}$ since the function $x\mapsto r_{x}$ is internal. Clearly, the number $r$ is positive non-infinitesimal. Hence $\delta_{H}(A\cap(E+x))\leq(\alpha-r)\delta_{H}(E)$ for all $x\in R$ . Notice that by (13) and (14), for $\mu_{H}$ –almost all $y\in[H]$ we have $\mu_{|R|}(S\cap(y+R))=\eta$ which implies that for $\mu_{H}$ –almost all $y\in[H]$ we have $\mu_{|R|}(A\cap(y+R))=\alpha$ . So

	$\displaystyle\alpha\mu_{H}(E)\approx\frac{1}{H}\sum_{y\in E}\frac{1}{|R|}\sum_{x\in R}\chi_{A}(x+y)=\frac{1}{|R|}\sum_{x\in R}\frac{1}{H}\sum_{y=1}^{H}\chi_{A\cap(E+x)}(x+y)$
			$\displaystyle\leq\frac{1}{|R|}\sum_{x\in R}(\alpha-r)\delta_{H}(E)=(\alpha-r)\delta_{H}(E)\approx(\alpha-st(r))\mu_{H}(E)<\alpha\mu_{H}(E),$

which is absurd.

Part (ii): To make the argument explicitly internal we use $\delta_{H}$ instead of $\mu_{H}$ . For each $t\in{\mathbb{N}}_{j^{\prime}}$ , $x\in R$ , and $n\in[m]$ let

c^{t}_{n}(x)=\left\{\begin{array}[]{cl}1&\,\mbox{ if }\,\delta_{H}((x+U_{n})\cap A)\geq\left(\alpha+\frac{1}{t}\right)\delta_{H}(U_{n}),\\ 0&\,\mbox{ if }\,\left(\alpha-\frac{1}{t}\right)\delta_{H}(U_{n})<\delta_{H}((x+U_{n})\cap A)<\left(\alpha+\frac{1}{t}\right)\delta_{H}(U_{n}),\\ -1&\,\mbox{ if }\,\delta_{H}((x+U_{n})\cap A)\leq\left(\alpha-\frac{1}{t}\right)\delta_{H}(U_{n}).\end{array}\right.

and let $c^{t}:R\to\{-1,0,1\}^{[m]}$ be such that $c^{t}(x)(n)=c^{t}_{n}(x)$ . For each $t\in{\mathbb{N}}_{0}$ , since the van der Waerden number $\Gamma(3^{m},m)\leq|R|$ , there exists an $m$ –a.p. $P_{t}\subseteq R$ such that $c^{t}(x)=c^{t}(x^{\prime})$ for any $x,x^{\prime}\in P_{t}$ . For each $x\in P_{t}$ let

\begin{array}[]{rcl}\vskip 6.0pt plus 2.0pt minus 2.0ptI^{+}_{t}&=&\{n\in[m]\mid c^{t}(x)(n)=1\},\\ \vskip 6.0pt plus 2.0pt minus 2.0ptI^{-}_{t}&=&\{n\in[m]\mid c^{t}(x)(n)=-1\},\mbox{ and}\\ I_{t}&=&[m]\setminus(I^{+}_{t}\cup I^{-}_{t}),\mbox{ and}\end{array}

\begin{array}[]{rcl}\vskip 6.0pt plus 2.0pt minus 2.0ptU^{+}_{t}&=&\bigcup\{U_{n}\mid n\in I^{+}_{t}\},\\ \vskip 6.0pt plus 2.0pt minus 2.0ptU^{-}_{t}&=&\bigcup\{U_{n}\mid n\in I^{-}_{t}\},\,\mbox{ and}\\ U_{t}&=&[H]\setminus(U^{+}_{t}\cup U^{-}_{t}).\end{array}

Clearly, $\delta_{H}((x+U^{-}_{t})\cap A)\leq(\alpha-1/t)\delta_{H}(U^{-}_{t})$ because $U^{-}_{t}$ is a disjoint union of the $U_{n}$ ’s for $n\in I^{-}_{t}$ . Since $t\in{\mathbb{N}}_{0}$ we have that $\mu_{H}(U^{-}_{t})=0$ by (i) with $P_{t}$ in the place of $R$ and $U^{-}_{t}$ in the place of $E$ . Notice that $\delta_{H}(A\cap(x+U^{+}_{t}))\geq(\alpha+1/t)\delta_{H}(U^{+}_{t})$ . Since $\alpha\geq\mu_{H}(A\cap(x+U^{+}_{t}))\geq(\alpha+1/t)\mu_{H}(U^{+}_{t})$ , we have that $\mu_{H}(U^{+}_{t})<1$ , which implies $\mu_{H}(U_{t})>0$ . If $\mu_{H}(U^{+}_{t})>0$ , then $\delta_{H}(A\cap(x+U^{+}_{t}))\geq(\alpha+1/t)\delta_{H}(U^{+}_{t})$ implies $\mu_{H}(A\cap(x+U_{t}))<\alpha\mu_{H}(U_{t})$ for all $x\in P_{t}$ , which again contradicts (i). Hence $\mu_{H}(U^{+}_{t})=0$ and therefore, $\delta_{H}(U_{t})>1-1/t$ is true for every $t\in{\mathbb{N}}_{0}$ .

Since the set of all $t\in{\mathbb{N}}_{j^{\prime}}$ with $\delta_{H}(U_{t})>1-1/t$ is ${\mathbb{V}}_{j^{\prime}}$ –internal, by Proposition 2.15 there is a $J\gg 1$ such that $\delta_{H}(U_{J})>1-1/J\approx 1$ . The proof of (ii) is completed by letting $P:=P_{J}$ , $I:=I_{J}$ , and $U_{I}:=U_{J}$ .
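The choice of $P_{t}$ in Part (ii) rests on van der Waerden's theorem (Theorem 1.1): the map $c^{t}$ colours $R$ with at most $3^{m}$ colours, so a long enough $R$ must contain a monochromatic $m$ –a.p. As a small illustration only, the following brute-force sketch checks the smallest nontrivial instance, two colours and $3$ –term progressions, where the threshold is the known value $9$. The helper names `has_mono_ap` and `every_coloring_has_mono_ap` are hypothetical, and an interval stands in for the a.p. $R$.

```python
from itertools import product

def has_mono_ap(coloring, k):
    """True if some k-term a.p. in 1..len(coloring) is monochromatic (k >= 2)."""
    n = len(coloring)
    for a in range(1, n + 1):
        # Largest common difference d with a + (k - 1) * d <= n.
        for d in range(1, (n - a) // (k - 1) + 1):
            if len({coloring[a + i * d - 1] for i in range(k)}) == 1:
                return True
    return False

def every_coloring_has_mono_ap(n, colors, k):
    """Exhaustively test all colourings of 1..n with the given number of colours."""
    return all(has_mono_ap(c, k) for c in product(range(colors), repeat=n))

# Nine points force a monochromatic 3-term a.p. under any 2-colouring;
# eight points do not (for example the colouring 0,0,1,1,0,0,1,1).
assert every_coloring_has_mono_ap(9, 2, 3)
assert not has_mono_ap((0, 0, 1, 1, 0, 0, 1, 1), 3)
```

The exhaustive search is only feasible for tiny parameters; the proof itself needs nothing beyond the existence statement of Theorem 1.1.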

Part (iii): Choose a sufficiently large positive infinitesimal $\epsilon$ such that there is an internal partition $[H]=U_{1}\cup U_{2}\cup\cdots\cup U_{m}$ and real numbers $0\leq c_{n,w}\leq 1$ for each $n\in[m]$ and $w\in W$ such that the van der Waerden number $\Gamma(3^{m},m)\leq|R|$ , and for any internal set $F\subseteq[H]$ there is a $T_{F}\subseteq W$ with $|W\setminus T_{F}|\leq\epsilon|W|$ such that

\left||F\cap E_{w}|-\sum_{n=1}^{m}c_{n,w}|F\cap U_{n}|\right|\leq\epsilon H

(15)

for all $w\in T_{F}$ . Notice that such $\epsilon$ exists because if $\epsilon$ is a standard positive real, then $m=n_{\epsilon}$ is in ${\mathbb{N}}_{0}$ . From (15) with $F$ being replaced by $[H]$ we have

\left||E_{w}|-\sum_{n=1}^{m}c_{n,w}|U_{n}|\right|\leq\epsilon H

(16)

for all $w\in T_{[H]}$ . By (ii) we can find a $P\subseteq R$ of length $m$ , a positive infinitesimal $\epsilon_{1}$ , and $I\subseteq[m]$ where, for some $x\in P$ ,

I:=\left\{n\in[m]\mid|\delta_{H}((x+U_{n})\cap A)-\alpha\delta_{H}(U_{n})|<\epsilon_{1}\delta_{H}(U_{n})\right\}

( $I$ is independent of the choice of $x$ ), and $V:=\bigcup\{U_{n}\mid n\in I\}$ with $\mu_{H}(V)=1$ . Let $I^{\prime}=[m]\setminus I$ and $V^{\prime}=[H]\setminus V$ . Then for each $w\in T:=T_{[H]}\cap T_{(A-x)\cap[H]}$ we have

	$\displaystyle\left|\delta_{H}(A\cap(x+E_{w}))-\alpha\delta_{H}(E_{w})\right|$
			$\displaystyle\leq\frac{1}{H}\left(\left||A\cap(x+E_{w})|-\sum_{n\in[m]}c_{n,w}|A\cap(x+U_{n})|\right|\right.$
			$\displaystyle\quad+\left|\sum_{n\in[m]}c_{n,w}|A\cap(x+U_{n})|-\sum_{n\in[m]}c_{n,w}\alpha|U_{n}|\right|$
			$\displaystyle\quad\left.+\left|\alpha\sum_{n\in[m]}c_{n,w}|U_{n}|-\alpha|E_{w}|\right|\right)$
			$\displaystyle\leq\epsilon+\frac{1}{H}\sum_{n\in I}c_{n,w}\epsilon_{1}|U_{n}|+2\delta_{H}(V^{\prime})+\alpha\epsilon$
			$\displaystyle\leq\epsilon+\epsilon_{1}\delta_{H}(V)+2\delta_{H}(V^{\prime})+\alpha\epsilon\approx 0.$

Hence $\mu_{H}(A\cap(x+E_{w}))=\alpha\mu_{H}(E_{w})$ for all $w\in T$ . Notice that $\mu_{|W|}(T)=1$ because $\epsilon\approx 0$ and $\mu_{|W|}(T_{[H]})=\mu_{|W|}(T_{[H]\cap(A-x)})=1$ . $\blacksquare$

The set $S$ in Lemma 4.2, although seemingly unnecessary there, is needed in the proof of Lemma 5.1.

5 Proof of Szemerédi’s Theorem

We work within ${\mathbb{V}}_{2}$ in this section except in the proof of Claim 1 in Lemma 5.1 where ${\mathbb{V}}_{3}$ is needed.

Szemerédi’s theorem is an easy consequence of Lemma 5.1, denoted by ${\bf L}(m)$ for all $m\in[k]$ . For an integer $n\geq 2k+1$ define an interval $C_{n}\subseteq[n]$ by

C_{n}:=\left[\left\lceil\frac{kn}{2k+1}\right\rceil,\,\left\lfloor\frac{(k+1)n}{2k+1}\right\rfloor\right].

(17)

The set $C_{n}$ is the subinterval of $[n]$ in the middle of $[n]$ with the length $\lfloor n/(2k+1)\rfloor\pm\iota$ for $\iota=0$ or $1$ . If $n\gg 1$ , then $\mu_{n}(C_{n})=1/(2k+1)$ . For notational convenience we denote

D:=3k^{3}\,\mbox{ and }\,\eta_{0}:=1-\frac{1}{D}.

(18)
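The description of $C_{n}$ in (17) as a middle subinterval of length $\lfloor n/(2k+1)\rfloor$ or $\lfloor n/(2k+1)\rfloor+1$ can be checked directly for finite $n$. The following is a minimal sketch; the function name `middle_interval` is hypothetical, and only integer arithmetic is used so that the ceiling and floor in (17) are computed exactly.

```python
def middle_interval(n, k):
    """Endpoints of C_n = [ceil(k n/(2k+1)), floor((k+1) n/(2k+1))]."""
    m = 2 * k + 1
    lo = -(-k * n // m)      # integer ceiling of k*n/m
    hi = (k + 1) * n // m    # integer floor of (k+1)*n/m
    return lo, hi

# Check, for small k and all n with 2k+1 <= n < 300, that C_n lies inside [n]
# and that its length exceeds floor(n/(2k+1)) by 0 or 1.
for k in range(1, 7):
    for n in range(2 * k + 1, 300):
        lo, hi = middle_interval(n, k)
        assert 1 <= lo <= hi <= n
        assert (hi - lo + 1) - n // (2 * k + 1) in (0, 1)
```

For instance, with $k=1$ and $n=5$ the sketch returns the endpoints $2$ and $3$, the middle of $[5]$.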

$\blacklozenge$ : Fix a $K\in{\mathbb{N}}_{1}\setminus{\mathbb{N}}_{0}$ . The number $K$ is the length of an interval which will play an important role in Lemma 5.1. Keeping $K$ unchanged is one of the advantages of nonstandard analysis, which is unavailable in the standard setting.

There is a summary of ideas used in the proof of Lemma 5.1 right after the proof. It explains some motivation of the steps taken in the proof.

Lemma 5.1 ( ${\bf L}(m)$ )

Given any $\alpha>0$ , $\eta>\eta_{0}$ , any $N\in{\mathbb{N}}_{2}\setminus\!{\mathbb{N}}_{1}$ , and any $A\subseteq S\subseteq[N]$ and $U\subseteq[N]$ with

\mu_{N}(U)=1,\mu_{N}(S)=S\!D(S)=\eta,\mbox{ and }\,\mu_{N}(A)=S\!D_{S}(A)=\alpha,

(19)

the following are true:

${\bf L}_{1}(m)(\alpha,\eta,N,A,S,U,K)$ : There exists a $k$ –a.p. $\vec{x}\subseteq U$ with $\vec{x}\oplus[K]\subseteq[N]$ satisfying the statement $(\forall n\in{\mathbb{N}}_{0})\,\xi(\vec{x}(l),\alpha,\eta,A,S,U,K,n)$ for $l\in[k]$ , where $\xi$ is defined in (11), and there exist $T_{l}\subseteq C_{K}$ with $\mu_{|C_{K}|}(T_{l})=1$ where $C_{K}$ is defined in (17) and $V_{l}\subseteq[K]$ with $\mu_{K}(V_{l})=1$ for every $l\geq m$ , and collections of $k$ –a.p.’s

\begin{array}[]{rcl}\vskip 6.0pt plus 2.0pt minus 2.0pt{\mathcal{P}}&:=&\bigcup\{{\mathcal{P}}_{l,t}\mid t\in T_{l}\,\mbox{ and }\,l\geq m\}\,\mbox{ and}\\ {\mathcal{Q}}&:=&\bigcup\{{\mathcal{Q}}_{l,v}\mid v\in V_{l}\,\mbox{ and }\,l\geq m\}\,\mbox{ such that}\end{array}

{\mathcal{P}}_{l,t}\subseteq\{p\sqsubseteq(\vec{x}\oplus[K])\cap U\mid\forall l^{\prime}<m\,(p(l^{\prime})\in A)\mbox{ and }p(l)=\vec{x}(l)+t\}

(20)

satisfying $\mu_{K}({\mathcal{P}}_{l,t})=\alpha^{m-1}/k$ for all $l\geq m$ and $t\in T_{l}$ , and

{\mathcal{Q}}_{l,v}=\{q\sqsubseteq\vec{x}\oplus[K]\mid\forall l^{\prime}<m\,(q(l^{\prime})\in A)\,\mbox{ and }\,q(l)=\vec{x}(l)+v\}

(21)

satisfying $\mu_{K}({\mathcal{Q}}_{l,v})\leq\alpha^{m-1}$ for all $l\geq m$ and $v\in V_{l}$ .

${\bf L}_{2}(m)(\alpha,\eta,N,A,S,K)$ : There exist a set $W_{0}\subseteq S$ of $\min\{K,\lfloor 1/D(1-\eta)\rfloor\}$ –consecutive integers where $D$ is defined in (18) and a collection of $k$ –a.p.’s ${\mathcal{R}}=\{r_{w}\mid w\in W_{0}\}$ such that for each $w\in W_{0}$ we have $r_{w}(l)\in A$ for $l<m$ , $r_{w}(l)\in S$ for $l>m$ , and $r_{w}(m)=w$ .

Remark 5.2

(a)

${\bf L}_{2}(m)$ is an internal statement in ${\mathbb{V}}_{2}$ . Both ${\bf L}_{1}(m)$ and ${\bf L}_{2}(m)$ depend on $K$ . Since $K$ is fixed throughout the whole proof, it may, as a parameter, be omitted in some expressions.
(b)

If $H\gg 1$ and $T\subseteq[H]$ with $\mu_{H}(T)>1-\epsilon$ , then $T$ contains $\lfloor 1/\epsilon\rfloor$ consecutive integers because otherwise we have $\mu_{H}(T)\leq(\lfloor 1/\epsilon\rfloor-1)/\lfloor 1/\epsilon\rfloor$ $=1-1/\lfloor 1/\epsilon\rfloor\leq 1-1/(1/\epsilon)=1-\epsilon$ .
(c)

The purpose of defining $C_{K}$ is that if $t\in C_{K}$ , then the number of $k$ –a.p.’s $p\sqsubseteq\vec{x}\oplus[K]$ with $p(l)=\vec{x}(l)+t$ is guaranteed to be at least $K/(k-1)$ .
(d)

It is not essential to require the specific constant $c=1/k$ for $\mu_{K}({\mathcal{P}}_{l,t})=c\alpha^{m-1}$ in ${\bf L}_{1}(m)$ . Just requiring that $\mu_{K}({\mathcal{P}}_{l,t})\geq c\alpha^{m-1}$ for some positive standard real $c$ is sufficient. We use the more specific expression “ $\mu_{K}({\mathcal{P}}_{l,t})=\alpha^{m-1}/k$ ” for notational simplicity.
(e)

Some “bad” $k$ –a.p.’s in ${\mathcal{P}}$ in ${\bf L}_{1}(m)$ will be thinned out so that ${\mathcal{R}}$ in ${\bf L}_{2}(m)$ can be constructed from ${\mathcal{P}}$ . The collection ${\mathcal{Q}}$ is only used to prevent ${\mathcal{P}}$ from being thinned out too much. See the proof of Lemma 5.3.
(f)

It is important to notice that in ${\bf L}_{1}(m)$ the collection ${\mathcal{P}}_{l,t}$ is a part of the collection on the right side of (20) while the collection ${\mathcal{Q}}_{l,v}$ is equal to the collection on the right side of (21).
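Part (b) of Remark 5.2 has a simple finite counterpart: a subset of $[H]$ with no block of $L$ consecutive integers must miss at least one point from each of the $\lfloor H/L\rfloor$ disjoint blocks of length $L$, so it has at most $H-\lfloor H/L\rfloor$ elements. The following brute-force sketch checks this for small parameters; the name `longest_run` and the values $H=10$, $L=3$ are illustrative assumptions.

```python
from itertools import combinations

def longest_run(T):
    """Length of the longest block of consecutive integers contained in T."""
    best = cur = 0
    prev = None
    for t in sorted(T):
        # Extend the current run if t follows prev, otherwise start a new run.
        cur = cur + 1 if prev is not None and t == prev + 1 else 1
        best = max(best, cur)
        prev = t
    return best

H, L = 10, 3
for size in range(H + 1):
    for T in combinations(range(1, H + 1), size):
        # More than H - floor(H/L) elements forces L consecutive integers.
        if size > H - H // L:
            assert longest_run(T) >= L
```

In the remark, $L$ corresponds to $\lfloor 1/\epsilon\rfloor$ and the cardinality bound to the density bound $1-\epsilon$.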

Lemma 5.3

${\bf L}_{1}(m)(\alpha,\eta,N,A,S,U)$ implies ${\bf L}_{2}(m)(\alpha,\eta,N,A,S)$ for any $\alpha,\eta,N,A,S,U$ satisfying the conditions of Lemma 5.1.

Proof Assume we have obtained the $k$ –a.p. $\vec{x}\subseteq U$ with $\vec{x}\oplus[K]\subseteq[N]$ , sets $T_{l}\subseteq C_{K}$ and $V_{l}\subseteq[K]$ with $\mu_{|C_{K}|}(T_{l})=1$ and $\mu_{K}(V_{l})=1$ , and collections of $k$ –a.p.’s ${\mathcal{P}}$ and ${\mathcal{Q}}$ as in ${\bf L}_{1}(m)$ .

Call a $k$ –a.p. $p\in{\mathcal{P}}_{m}:=\bigcup\{{\mathcal{P}}_{m,t}\mid t\in T_{m}\}$ good if $p(l)\in S\cap(\vec{x}(l)+[K])$ for $l\geq m$ and bad otherwise. Let ${\mathcal{P}}_{m}^{g}\subseteq{\mathcal{P}}_{m}$ be the collection of all good $k$ –a.p.’s and ${\mathcal{P}}^{b}_{m}:={\mathcal{P}}_{m}\setminus{\mathcal{P}}_{m}^{g}$ be the collection of all bad $k$ –a.p.’s. Let $T_{m}^{g}:=\{p(m)-\vec{x}(m)\mid p\in{\mathcal{P}}_{m}^{g}\}$ . Then $T_{m}^{g}\subseteq T_{m}\cap(S-\vec{x}(m))\cap C_{K}$ . We show that $\mu_{|C_{K}|}(T_{m}^{g})>1-D(1-\eta)$ .

Let $Q:=\{q\sqsubseteq\vec{x}\oplus[K]\mid q(l^{\prime})\in A\,\mbox{ for }\,l^{\prime}<m\}$ . Notice that

{\mathcal{P}}_{m}^{b}\subseteq\bigcup_{l\geq m}\{q\in Q\mid q(l)\not\in S\}

and for each $v\in V_{l}$ , $q\in{\mathcal{Q}}_{l,v}$ iff $q\in Q$ and $q(l)=\vec{x}(l)+v$ .

	$\displaystyle\mbox{Hence, }\,|{\mathcal{P}}_{m}^{b}|\leq\sum_{l=m}^{k}\sum_{w\in[K]\setminus(S-\vec{x}(l))}|\{q\in Q\mid q(l)=\vec{x}(l)+w\}|$
			$\displaystyle\leq\sum_{l=m}^{k}\left(\sum_{w\in[K]\setminus V_{l}}|\{q\in Q\mid q(l)=\vec{x}(l)+w\}|+\sum_{v\in V_{l}\setminus(S-\vec{x}(l))}|{\mathcal{Q}}_{l,v}|\right)$
			$\displaystyle\leq K\sum_{l=m}^{k}(|[K]\setminus V_{l}|+|V_{l}\setminus(S-\vec{x}(l))|\alpha^{m-1}).$

\mbox{So }\,\,|{\mathcal{P}}_{m}^{g}|=|{\mathcal{P}}_{m}|-|{\mathcal{P}}_{m}^{b}|\geq\sum_{t\in T_{m}}|{\mathcal{P}}_{m,t}|-K\sum_{l=m}^{k}(|[K]\setminus V_{l}|+|V_{l}\setminus(S-\vec{x}(l))|\alpha^{m-1}).

Notice that $\mu_{K}([K]\setminus V_{l})=0$ . Hence we have $|[K]\setminus V_{l}|/|C_{K}|\approx 0$ and

	$\displaystyle\mu_{|C_{K}|}(T_{m}^{g})\cdot\frac{\alpha^{m-1}}{k}=st\left(\frac{1}{|C_{K}|}\sum_{t\in T_{m}^{g}}\frac{1}{K}|{\mathcal{P}}_{m,t}|\right)\geq st\left(\frac{1}{|C_{K}|K}|{\mathcal{P}}_{m}^{g}|\right)$
			$\displaystyle\geq st\left(\frac{1}{|C_{K}|}\sum_{t\in T_{m}}\frac{1}{K}|{\mathcal{P}}_{m,t}|-\frac{1}{|C_{K}|}\sum_{l=m}^{k}(|V_{l}\setminus(S-\vec{x}(l))|\alpha^{m-1})\right)$
			$\displaystyle\geq\mu_{|C_{K}|}(T_{m})\cdot\frac{\alpha^{m-1}}{k}-(2k+1)k(1-\eta)\cdot\alpha^{m-1}$
			$\displaystyle=\left(\frac{1}{k}-(2k+1)k(1-\eta)\right)\cdot\alpha^{m-1},$

which implies $\displaystyle\mu_{|C_{K}|}(T_{m}^{g})\geq 1-(2k+1)k^{2}(1-\eta)>1-D(1-\eta)$ . Recall that $T^{g}_{m}\subseteq C_{K}$ . Hence $\vec{x}(m)+T^{g}_{m}$ contains a set $W_{0}$ of $\lfloor 1/D(1-\eta)\rfloor$ consecutive integers. So, ${\bf L}_{2}(m)$ is proven if we let ${\mathcal{R}}:=\{r_{w}\mid w\in W_{0}\}$ where $r_{w}$ is one of the $k$ –a.p.’s in ${\mathcal{P}}_{m}^{g}$ such that $r_{w}(m)=\vec{x}(m)+w$ . $\blacksquare$

The idea of the proof of Lemma 5.3 is due to Szemerédi. See [9].

Proof of Lemma 5.1 We prove ${\bf L}(m)$ by induction on $m$ . By Lemma 5.3 it suffices to prove ${\bf L}_{1}(m)$ .

For ${\bf L}(1)$ , given any $\alpha>0$ , $\eta>\eta_{0}$ , $N\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ , $A$ , $S$ , and $U$ satisfying (19), by Lemma 3.5 (b) we can find a $k$ –a.p. $\vec{x}\subseteq[N]$ such that $(\forall n\in{\mathbb{N}}_{0})\,\xi(\vec{x}(l),\alpha,\eta,A,S,U,K,n)$ is true for $l\in[k]$ , where $\xi$ is defined in (11). For each $l\in[k]$ let $T_{l}=C_{K}\cap(U-\vec{x}(l))$ and $V_{l}=[K]$ . For each $l\in[k]$ , $t\in T_{l}$ , and $v\in V_{l}$ let

\begin{array}[]{rcl}\vskip 6.0pt plus 2.0pt minus 2.0pt{\mathcal{P}}_{l,t}&:=&\{p\sqsubseteq(\vec{x}\oplus[K])\cap U\mid p(l)=\vec{x}(l)+t\}\\ {\mathcal{Q}}_{l,v}&:=&\{q\sqsubseteq(\vec{x}\oplus[K])\mid q(l)=\vec{x}(l)+v\}.\end{array}

Clearly, we have $\mu_{K}({\mathcal{P}}_{l,t})\geq 1/(k-1)>1/k$ . By some pruning we can assume that $\mu_{K}({\mathcal{P}}_{l,t})=1/k$ . It is trivial that $\mu_{K}({\mathcal{Q}}_{l,v})\leq 1$ and $q\in{\mathcal{Q}}_{l,v}$ iff $q(l)=\vec{x}(l)+v$ for each $q\sqsubseteq\vec{x}\oplus[K]$ . This completes the proof of ${\bf L}_{1}(1)(\alpha,\eta,N,A,S,U)$ . ${\bf L}_{2}(1)(\alpha,\eta,N,A,S)$ follows from Lemma 5.3.

Assume ${\bf L}(m-1)$ is true for some $2\leq m\leq k$ .

We now prove ${\bf L}(m)$ . Given any $\alpha>0$ and $\eta>\eta_{0}$ , fix $N\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ , $U\subseteq[N]$ , and $A\subseteq S\subseteq[N]$ satisfying (19). For each $n\in{\mathbb{N}}_{1}\setminus{\mathbb{N}}_{0}$ , by Lemma 3.5 (b), there is an $h_{n}>n$ in ${\mathbb{N}}_{1}$ and $G_{n,h_{n}}\subseteq[N]$ defined in (12) such that $d_{n}:=\delta_{N-h_{n}}(G_{n,h_{n}})>1-1/n$ . Notice that $d_{n}\approx_{1}\mu^{1}_{N-h_{n}}(G_{n,h_{n}})>\eta_{0}$ because $n\gg 1$ and $\mu_{N-h_{n}}(G_{n,h_{n}})=1$ . Let $\eta^{1}_{n}:=\mu^{1}_{N-h_{n}}(G_{n,h_{n}})$ and fix an $n\in{\mathbb{N}}_{1}\setminus{\mathbb{N}}_{0}$ .

Claim 1 The following internal statement $\theta(n,A,N)$ is true:

$\exists W\subseteq[N]\,\exists{\mathcal{R}}\,(W\,\mbox{ is an a.p.}\,\wedge|W|\geq\min\{K,\lfloor 1/2D(1-d_{n})\rfloor\}\,\wedge\,{\mathcal{R}}=\{r_{w}\mid w\in W\}$ is a collection of $k$ –a.p.’s such that

\forall w\in W\,((\forall l\geq m)\,(r_{w}(l)\in\,G_{n,h_{n}})\,\wedge\,r_{w}(m-1)=w\,\wedge\,(\forall l,l^{\prime}\leq m-2)

((A\cap(r_{w}(l)+[h_{n}]))-r_{w}(l)=(A\cap(r_{w}(l^{\prime})+[h_{n}]))-r_{w}(l^{\prime}))).

Proof of Claim 1 Working in ${\mathbb{V}}_{2}$ by considering ${\mathbb{V}}_{1}$ as the standard universe, we can find $P\subseteq[N]$ with $|P|\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ by Lemma 3.3 and Part 2 of Property 2.7 such that

S\!D^{1}(G_{n,h_{n}})=\mu^{1}_{|P|}(P\cap G_{n,h_{n}})=S\!D^{1}(G_{n,h_{n}}\cap P)=\eta^{1}_{n}.

For each $x\in P\cap G_{n,h_{n}}$ let $\tau_{x}=((x+[h_{n}])\cap A)-x$ . Since there are at most $2^{h_{n}}\in{\mathbb{N}}_{1}$ different $\tau_{x}$ ’s and $|P|\gg 2^{h_{n}}$ , we can find one, say, $\tau_{n}\subseteq[h_{n}]$ such that the set

B_{n}:=\{x\in P\cap G_{n,h_{n}}\mid\tau_{x}=\tau_{n}\}

satisfies $\mu^{1}_{|P|}(B_{n})\geq\eta^{1}_{n}/2^{h_{n}}>0$ . Notice that $\mu^{0}_{|P|}(B_{n})$ could be $0$ .

Let $P^{\prime}\subseteq P$ with $|P^{\prime}|=N^{\prime}\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ be such that $\mu^{1}_{N^{\prime}}(G_{n,h_{n}}\cap P^{\prime})=\eta^{1}_{n}$ and

\begin{array}[]{rcl}\vspace{0.1in}\beta^{1}_{n}&:=&\mu^{1}_{N^{\prime}}(B_{n}\cap P^{\prime})=S\!D^{1}_{G_{n,h_{n}}\cap P}(B_{n}\cap P)\\ &=&S\!D^{1}_{G_{n,h_{n}}\cap P^{\prime}}(B_{n}\cap P^{\prime})\geq\mu^{1}_{|P|}(B_{n})>0\end{array}

by Part 4 of Lemma 3.3 and Part 2 of Property 2.7. Let $d$ be the common difference of the a.p. $P^{\prime}$ and $\varphi:P^{\prime}\to[N^{\prime}]$ be the order-preserving bijection, i.e.,

\varphi(x):=1+(x-\min P^{\prime})/d.

Let $B^{\prime}:=\varphi[B_{n}\cap P^{\prime}]$ and $S^{\prime}:=\varphi[G_{n,h_{n}}\cap P^{\prime}]$ . We have that $B^{\prime},S^{\prime},N^{\prime}$ and $\beta^{1}_{n},\eta^{1}_{n}$ in the place of $A,S,N$ and $\alpha,\eta$ satisfy the ${\mathbb{V}}_{1}$ –version of (19) with $\mu$ , $S\!D$ and $S\!D_{S}$ being replaced by $\mu^{1}$ , $S\!D^{1}$ , and $S\!D^{1}_{S^{\prime}}$ .

Let $N^{\prime\prime}=i_{2}(N^{\prime})$ , $B^{\prime\prime}=i_{2}(B^{\prime})$ , and $S^{\prime\prime}=i_{2}(S^{\prime})$ where $i_{2}$ is in Part 3 of Property 2.7. Recall that $i_{2}\!\upharpoonright\!{\mathbb{V}}_{1}$ is an identity map. Since $N^{\prime}\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ , we have $N^{\prime\prime}\in{\mathbb{N}}_{3}\setminus{\mathbb{N}}_{2}$ . Notice also that $\mu^{1}_{N^{\prime\prime}}(S^{\prime\prime})=S\!D^{1}(S^{\prime\prime})=\eta^{1}_{n}$ and $\mu^{1}_{N^{\prime\prime}}(B^{\prime\prime})=S\!D^{1}_{S^{\prime\prime}}(B^{\prime\prime})=\beta^{1}_{n}$ . By the induction hypothesis that ${\bf L}(m-1)$ is true we have

	$\displaystyle({\mathbb{V}}_{2};{\mathbb{R}}_{0},{\mathbb{R}}_{1})\models\forall\alpha,\eta\in{\mathbb{R}}_{0}\,\forall N\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}\,\forall A,S\subseteq[N]$
			$\displaystyle(\alpha>0\wedge\eta>\eta_{0}\wedge A\subseteq S\wedge\mu_{N}(S)=S\!D(S)=\eta\wedge\mu_{N}(A)=S\!D_{S}(A)=\alpha$
			$\displaystyle\to{\bf L}_{2}(m-1)(\alpha,\eta,N,A,S)).$

Since $({\mathbb{V}}_{2};{\mathbb{R}}_{0},{\mathbb{R}}_{1})$ and $({\mathbb{V}}_{3};{\mathbb{R}}_{1},{\mathbb{R}}_{2})$ are elementarily equivalent by Part 2 of Property 2.7 via $i_{*}$ , we have, by universal instantiation, that

({\mathbb{V}}_{3};{\mathbb{R}}_{1},{\mathbb{R}}_{2})\models{\bf L}_{2}(m-1)(\beta^{1}_{n},\eta^{1}_{n},N^{\prime\prime},B^{\prime\prime},S^{\prime\prime}).

(23)

Notice that the right side above no longer depends on ${\mathbb{R}}_{1}$ or ${\mathbb{R}}_{2}$ . So, we have

{\mathbb{V}}_{3}\models{\bf L}_{2}(m-1)(i_{2}(\beta^{1}_{n}),i_{2}(\eta^{1}_{n}),i_{2}(N^{\prime}),i_{2}(B^{\prime}),i_{2}(S^{\prime}))

(24)

because $i_{2}(\beta^{1}_{n})=\beta^{1}_{n}$ and $i_{2}(\eta^{1}_{n})=\eta^{1}_{n}$ . Since $i_{2}$ is a bounded elementary embedding, we have

{\mathbb{V}}_{2}\models{\bf L}_{2}(m-1)(\beta^{1}_{n},\eta^{1}_{n},N^{\prime},B^{\prime},S^{\prime}),

which means that there is a set $W^{\prime}\subseteq[N^{\prime}]$ of $\min\{K,\lfloor 1/D(1-\eta^{1}_{n})\rfloor\}$ –consecutive integers and a collection of $k$ –a.p.’s ${\mathcal{R}}^{\prime}=\{r^{\prime}_{w}\mid w\in W^{\prime}\}$ such that for every $w\in W^{\prime}$ we have $r^{\prime}_{w}(l)\in B^{\prime}$ for $l<m-1$ , $r^{\prime}_{w}(m-1)=w$ , and $r^{\prime}_{w}(l)\in S^{\prime}$ for $l\geq m$ . Notice that $\varphi^{-1}[[N^{\prime}]]\subseteq[N]$ . Let $W=\varphi^{-1}[W^{\prime}]$ and ${\mathcal{R}}=\{r_{w}\mid w\in W\}$ , where $r_{w}=\varphi^{-1}[r^{\prime}_{\varphi(w)}]$ , such that for each $w\in W$ we have $r_{w}(l)\in\varphi^{-1}[B^{\prime}]\subseteq B_{n}$ for $l<m-1$ , $r_{w}(m-1)=w$ , and $r_{w}(l)\in\varphi^{-1}[S^{\prime}]\subseteq G_{n,h_{n}}$ for $l\geq m$ . If $\eta^{1}_{n}=1$ , then $|W|\geq K$ . If $\eta^{1}_{n}<1$ , then $2(1-d_{n})>1-\eta^{1}_{n}$ . Hence $|W|\geq\min\{K,\lfloor 1/2D(1-d_{n})\rfloor\}$ . $\blacksquare$ (Claim 1)

The following claim follows from Claim 1 by Proposition 2.15.

Claim 2 There exists a $J\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ such that $\theta(J,A,N)$ is true, i.e., $\exists W\subseteq[N]\,\exists{\mathcal{R}}\,(W\,\mbox{ is an a.p.}\,\wedge|W|\geq\min\{K,\lfloor 1/2D(1-d_{J})\rfloor\}\,\wedge\,{\mathcal{R}}=\{r_{w}\mid w\in W\}$ is a collection of $k$ –a.p.’s such that $\forall w\in W\,((\forall l\geq m)\,(r_{w}(l)\in\,G_{J,h_{J}})$ , $r_{w}(m-1)=w$ , and $(\forall l,l^{\prime}\leq m-2)\,((A\cap(r_{w}(l)+[h_{J}]))-r_{w}(l)=(A\cap(r_{w}(l^{\prime})+[h_{J}]))-r_{w}(l^{\prime}))))$ .

For notational convenience let $W_{H}:=W$ and ${\mathcal{R}}_{H}:={\mathcal{R}}$ be obtained in Claim 2 and rename $H:=h_{J}$ , $S_{H}:=G_{J,h_{J}}$ , $\tau_{H}:=(A\cap(r_{w}(l)+[h_{J}]))-r_{w}(l)$ for some (or any) $w\in W_{H}$ and $l<m-1$ . Let $\{w_{s}\mid 1\leq s\leq|W_{H}|\}$ be the increasing enumeration of $W_{H}$ . Notice that $H\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ . We now go back to consider ${\mathbb{V}}_{0}$ as our standard universe. Notice that $\mu_{N-H}(S_{H})=1$ , $|W_{H}|\gg 1$ , and $(\forall n\in{\mathbb{N}}_{0})\,\xi(x,\alpha,\eta,A,S,U,H,n)$ is true for every $x\in S_{H}$ where $\xi$ is defined in (11).

Claim 3 For each $s\in{\mathbb{N}}_{0}$ we can find an internal $U_{s}\subseteq[H]$ with $\mu_{H}(U_{s})=1$ such that for each $y\in U_{s}$ and each $l\in[k]$ , $r_{w_{s}}(l)+y\in U$ and $(\forall n\in{\mathbb{N}}_{0})\,\xi(r_{w_{s}}(l)+y,\alpha,\eta,A,S,U,K,n)$ is true.

Proof of Claim 3 For each $l\in[k]$ we have $\xi(r_{w_{s}}(l),\alpha,\eta,A,S,U,H,n)$ is true because $r_{w_{s}}(l)\in S_{H}$ . By Lemma 3.5 (a), we can find a set $G_{l}\subseteq r_{w_{s}}(l)+[H]$ with $\mu_{H}(G_{l})=1$ such that
$\xi(r_{w_{s}}(l)+y,\alpha,\eta,A,S,U,K,n)$ is true for every $r_{w_{s}}(l)+y\in G_{l}$ . Set

U_{s}:=\bigcap_{l=1}^{k}((U\cap G_{l})-r_{w_{s}}(l)).

Then we have $U_{s}\subseteq[H]$ and $\mu_{H}(U_{s})=1$ . $\blacksquare$ (Claim 3)

Notice that $\delta_{H}(\bigcap_{i=1}^{s}U_{i})>1-1/s$ . By Proposition 2.15 we can find $1\ll I\leq|W_{H}|$ and

U^{\prime}:=\bigcap\{U_{s}\mid 1\leq s\leq I\}

such that $\delta_{H}(U^{\prime})>1-1/I$ . Hence $\mu_{H}(U^{\prime})=1$ . Applying the induction hypothesis for ${\bf L}_{1}(m-1)(\alpha,1,H,\tau_{H},[H],U^{\prime})$ , we obtain a $k$ –a.p. $\vec{y}\subseteq U^{\prime}$ with $\vec{y}\oplus[K]\subseteq[H]$ , $T^{\prime}_{l}\subseteq C_{K}\cap U^{\prime}$ with $\mu_{|C_{K}|}(T^{\prime}_{l})=1$ and $V^{\prime}_{l}\subseteq[K]$ with $\mu_{K}(V^{\prime}_{l})=1$ for each $l\geq m-1$ , and collections of $k$ –a.p.’s

\begin{array}[]{rcl}\vspace{0.1in}{\mathcal{P}}^{\prime}&=&\bigcup\{{\mathcal{P}}^{\prime}_{l,t}\mid t\in T^{\prime}_{l}\,\mbox{ and }\,l\geq m-1\}\,\mbox{ and}\\ {\mathcal{Q}}^{\prime}&=&\bigcup\{{\mathcal{Q}}^{\prime}_{l,v}\mid v\in V^{\prime}_{l}\,\mbox{ and }\,l\geq m-1\}\end{array}

such that (i) for each $l\geq m-1$ and $t\in T^{\prime}_{l}$ we have $\mu_{K}({\mathcal{P}}^{\prime}_{l,t})=\alpha^{m-2}/k$ and for each $p\in{\mathcal{P}}^{\prime}_{l,t}$ we have $p\sqsubseteq(\vec{y}\oplus[K])\cap U^{\prime}$ , $p(l^{\prime})\in\tau_{H}$ for $l^{\prime}<m-1$ , $p(l)=\vec{y}(l)+t$ , and (ii) for each $l\geq m-1$ and $v\in V^{\prime}_{l}$ we have $\mu_{K}({\mathcal{Q}}^{\prime}_{l,v})\leq\alpha^{m-2}$ , and for each $q\sqsubseteq\vec{y}\oplus[K]$ we have $q\in{\mathcal{Q}}^{\prime}_{l,v}$ iff $q(l^{\prime})\in\tau_{H}$ for every $l^{\prime}<m-1$ and $q(l)=\vec{y}(l)+v$ . For each $l\geq m$ , $t\in T^{\prime}_{l}$ , and $v\in V^{\prime}_{l}$ let

E_{l,t}:=\{p(m-1)\mid p\in{\mathcal{P}}^{\prime}_{l,t}\}\,\mbox{ and }\,F_{l,v}:=\{q(m-1)\mid q\in{\mathcal{Q}}^{\prime}_{l,v}\}.

Then $E_{l,t},F_{l,v}\subseteq\vec{y}(m-1)+[K]$ , $\mu_{K}(E_{l,t})=\mu_{K}({\mathcal{P}}^{\prime}_{l,t})=\alpha^{m-2}/k$ , and $\mu_{K}(F_{l,v})=\mu_{K}({\mathcal{Q}}^{\prime}_{l,v})\leq\alpha^{m-2}$ . Since $\vec{y}\subseteq U^{\prime}$ we have that for each $l\in[k]$ , $(\forall n\in{\mathbb{N}}_{0})\,\xi(r_{w_{s}}(l)+\vec{y}(l),\alpha,\eta,A,S,U,K,n)$ is true.

Applying Part (iii) of Lemma 4.2 with $R:=\{w_{s}+\vec{y}(m-1)\mid 1\leq s\leq I\}$ and $H$ being replaced by $K$ we can find $s_{0}\in[I]$ , $T_{l}\subseteq T^{\prime}_{l}$ with $\mu_{|C_{K}|}(T_{l})=1$ and $V_{l}\subseteq V^{\prime}_{l}$ with $\mu_{K}(V_{l})=1$ for each $l\geq m$ such that for each $t\in T_{l}$ and $v\in V_{l}$ we have

\begin{array}[]{l}\vspace{0.1in}\mu_{K}((w_{s_{0}}+E_{l,t})\cap((w_{s_{0}}+\vec{y}(m-1)+[K])\cap A))\\ \qquad=\alpha\mu_{K}(E_{l,t})=\alpha(\alpha^{m-2}/k)=\alpha^{m-1}/k\,\mbox{ and}\end{array}

(25)

\begin{array}[]{l}\vspace{0.1in}\mu_{K}((w_{s_{0}}+F_{l,v})\cap((w_{s_{0}}+\vec{y}(m-1)+[K])\cap A))\\ \qquad\qquad=\alpha\mu_{K}(F_{l,v})\leq\alpha\!\cdot\!\alpha^{m-2}=\alpha^{m-1}.\end{array}

(26)

Let $\vec{x}:=r_{w_{s_{0}}}\oplus\vec{y}$ . Clearly, we have $\vec{x}\oplus[K]\subseteq[N]$ . We also have that $\vec{x}\subseteq U$ , $\mu_{K}((\vec{x}(l)+[K])\cap S)=\eta$ , and $\mu_{K}((\vec{x}(l)+[K])\cap A)=\alpha$ because $r_{w_{s_{0}}}\subseteq S_{H}$ and $\vec{y}\subseteq U^{\prime}\subseteq U_{s_{0}}$ . For each $l\geq m$ , $t\in T_{l}$ , and $v\in V_{l}$ let

	$\displaystyle{\mathcal{P}}_{l,t}:=\{r_{w_{s_{0}}}\oplus p\mid p\in{\mathcal{P}}^{\prime}_{l,t}\,\mbox{ and}$
			$\displaystyle\,p(m-1)\in E_{l,t}\cap(((w_{s_{0}}+\vec{y}(m-1)+[K])\cap A)-w_{s_{0}})\},$
			$\displaystyle{\mathcal{Q}}_{l,v}:=\{r_{w_{s_{0}}}\oplus q\mid q\in{\mathcal{Q}}^{\prime}_{l,v}\,\mbox{ and}$
			$\displaystyle\,q(m-1)\in F_{l,v}\cap(((w_{s_{0}}+\vec{y}(m-1)+[K])\cap A)-w_{s_{0}})\}.$

Then $\mu_{K}({\mathcal{P}}_{l,t})=\alpha^{m-1}/k$ by (25). If $\bar{q}\sqsubseteq\vec{x}\oplus[K]$ , then there is a $q\sqsubseteq\vec{y}\oplus[K]$ such that $\bar{q}=r_{w_{s_{0}}}\oplus q$ . If $\bar{q}(l^{\prime})\in A$ for $l^{\prime}<m$ and $v\in V_{l}$ for some $l\geq m$ such that $\bar{q}(l)=\vec{x}(l)+v$ , then $q(l^{\prime})\in\tau_{H}$ for $l^{\prime}<m-1$ , $v\in V^{\prime}_{l}$ , and $q(l)=\vec{y}(l)+v$ , which imply $q\in{\mathcal{Q}}^{\prime}_{l,v}$ by induction hypothesis. Hence we have $q(m-1)\in F_{l,v}$ . Clearly, $\bar{q}(m-1)=w_{s_{0}}+q(m-1)\in A$ implies $q(m-1)\in F_{l,v}\cap(((w_{s_{0}}+\vec{y}(m-1)+[K])\cap A)-w_{s_{0}})$ . Thus we have $\bar{q}\in{\mathcal{Q}}_{l,v}$ . Clearly, $\mu_{K}({\mathcal{Q}}_{l,v})\leq\alpha^{m-1}$ by (26).

Summarizing the argument above we have that for each $r_{w_{s_{0}}}\oplus p\in{\mathcal{P}}_{l,t}$

• $r_{w_{s_{0}}}(l^{\prime})+p(l^{\prime})\in r_{w_{s_{0}}}(l^{\prime})+\tau_{H}\subseteq A$ for $l^{\prime}<m-1$ because $r_{w_{s_{0}}}(l^{\prime})\in B_{H}$ ,
• $r_{w_{s_{0}}}(m-1)+p(m-1)=w_{s_{0}}+p(m-1)\in(w_{s_{0}}+E_{l,t})\cap(w_{s_{0}}+\vec{y}(m-1)+[K])\cap A\subseteq A$ ,
• $r_{w_{s_{0}}}(l^{\prime})+p(l^{\prime})\in(\vec{x}(l^{\prime})+[K])\cap U\subseteq U$ for $l^{\prime}\geq m$ because $p\subseteq U^{\prime}$ ,
• $r_{w_{s_{0}}}(l)+p(l)=r_{w_{s_{0}}}(l)+\vec{y}(l)+t=\vec{x}(l)+t$ .

For each $\bar{q}\sqsubseteq\vec{x}\oplus[K]$ , $\bar{q}\in{\mathcal{Q}}_{l,v}$ iff there is a $q\sqsubseteq\vec{y}\oplus[K]$ with $\bar{q}=r_{w_{s_{0}}}\oplus q$ such that

• $r_{w_{s_{0}}}(l^{\prime})+q(l^{\prime})\in r_{w_{s_{0}}}(l^{\prime})+\tau_{H}\subseteq A$ for $l^{\prime}<m-1$ because $r_{w_{s_{0}}}(l^{\prime})\in B_{H}$ ,
• $r_{w_{s_{0}}}(m-1)+q(m-1)=w_{s_{0}}+q(m-1)\in A$ which is equivalent to $w_{s_{0}}+q(m-1)\in(w_{s_{0}}+F_{l,v})\cap(w_{s_{0}}+\vec{y}(m-1)+[K])\cap A\subseteq A$ ,
• $r_{w_{s_{0}}}(l)+q(l)=r_{w_{s_{0}}}(l)+\vec{y}(l)+v=\vec{x}(l)+v$ .

This completes the proof of ${\bf L}_{1}(m)(\alpha,\eta,N,A,S,U)$ as well as ${\bf L}(m)$ by Lemma 5.3. $\blacksquare$
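The combined progression $\vec{x}=r_{w_{s_{0}}}\oplus\vec{y}$ used in the proof relies on the fact that a termwise sum of two $k$-term progressions is again a $k$-term progression, whose common difference is the sum of the two differences. The following Python sketch illustrates this with small finite numbers. It assumes that $\oplus$ denotes the termwise sum, as suggested by the identity $r_{w_{s_{0}}}(l)+\vec{y}(l)=\vec{x}(l)$ above; the function names `is_ap` and `oplus` and the sample values are hypothetical and only serve the illustration.

```python
def is_ap(seq):
    """Return True if seq is an arithmetic progression with positive difference."""
    if len(seq) < 2:
        return False
    d = seq[1] - seq[0]
    return d > 0 and all(b - a == d for a, b in zip(seq, seq[1:]))


def oplus(r, y):
    """Termwise sum of two progressions of equal length (assumed meaning of r (+) y)."""
    if len(r) != len(y):
        raise ValueError("progressions must have the same length")
    return [a + b for a, b in zip(r, y)]


# Toy stand-ins: r plays the role of the block starting points r_w,
# y plays the role of a progression chosen inside one block.
r = [100, 200, 300, 400]   # common difference 100
y = [3, 5, 7, 9]           # common difference 2
x = oplus(r, y)            # [103, 205, 307, 409], common difference 102
```

In this toy instance each term `x[l]` lies in the block starting at `r[l]`, which mirrors how $\vec{x}(l)$ lies in the block $r_{w_{s_{0}}}(l)+[H]$.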

Summary of the ideas used in the proof of Lemma 5.1: We want to use ${\bf L}_{2}(m-1)$ to create a sequence ${\mathcal{R}}_{H}$ of $k$ blocks $r_{w}\oplus[H]$ of size $H$ that are in arithmetic progression such that (a) the portions of $A$ in the first $m-2$ blocks $r_{w}(l)+[H]$ for $l\leq m-2$ are identical copies of $\tau_{H}$ in $[H]$ , (b) the initial points $\{r_{w}(m-1)\mid w\in W_{H}\}$ of all $(m-1)$ -st blocks form an a.p. of infinite length, which is used when applying Part (iii) of Lemma 4.2, (c) the initial points of the rest of the blocks satisfy an appropriate version of (11) for some infinite $n=J$ , i.e., $r_{w}\subseteq S_{H}$ . Then we work inside $[H]$ , using ${\bf L}_{1}(m-1)$ to create collections of $k$ –a.p.’s ${\mathcal{P}}^{\prime}$ and ${\mathcal{Q}}^{\prime}$ in $[H]$ instead of $[N]$ . For applying Part (iii) of Lemma 4.2 we want to make sure, if we can, that the $(m-1)$ -st terms of all $p\in{\mathcal{P}}^{\prime}_{l,t}$ form a set of positive measure and the $(m-1)$ -st terms of all $q\in{\mathcal{Q}}^{\prime}_{l,v}$ form a set of positive measure. Then mixing one $r_{w}$ for some $w\in W_{H}$ with ${\mathcal{P}}^{\prime}$ and ${\mathcal{Q}}^{\prime}$ at the $(m-1)$ -st terms yields ${\mathcal{P}}$ and ${\mathcal{Q}}$ validating ${\bf L}_{1}(m)$ .

Unfortunately, using ${\bf L}_{1}(m-1)$ to create collections ${\mathcal{P}}^{\prime}$ and ${\mathcal{Q}}^{\prime}$ of $k$ –a.p.’s in $[H]$ cannot guarantee that the set $E_{l,t}$ of the $(m-1)$ -st terms of all $p\in{\mathcal{P}}^{\prime}_{l,t}$ and the set $F_{l,v}$ of the $(m-1)$ -st terms of all $q\in{\mathcal{Q}}^{\prime}_{l,v}$ have positive measures in $[H]$ (the positive measures can be guaranteed when $k\leq 4$ but not for $k>4$ ). So, instead of requiring the sets $E_{l,t}$ and $F_{l,v}$ to have positive measures in $[H]$ , we make sure that the sets $E_{l,t}$ and $F_{l,v}$ have positive measures in some subinterval $\vec{y}(m-1)+[K]$ of length $K$ in $[H]$ , where $K$ is fixed and could be much smaller than $H$ . This is achieved by requiring $\vec{y}\subseteq U^{\prime}$ . When we use the mixing lemma we want to make sure that $A$ has measure $\alpha$ in all of the relevant intervals $(r_{w_{s}}\oplus\vec{y})(m-1)+[K]$ of length $K$ . This requirement is again achieved by requiring $r_{w_{s}}\subseteq S_{H}$ and $\vec{y}\subseteq U^{\prime}$ after shrinking $W_{H}$ to its initial segment $\{w_{s}\mid s\in[I]\}$ .

Since we want $K$ to be significantly smaller than $N$ we assume that $N$ and $K$ are at least one universe apart. Hence $N$ must be at least in ${\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ . If we want to apply ${\bf L}_{1}(m-1)$ for $H$ instead of $N$ , then $H$ must also be at least in ${\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ . If we want $B_{H}$ to have a ${\mathbb{V}}_{2}$ –standard positive measure, $N^{\prime}$ must be in ${\mathbb{N}}_{4}$ in order to use the ${\mathbb{V}}_{4}$ –version of ${\bf L}(m-1)$ with ${\mathbb{V}}_{2}$ being considered as the “standard” universe. But $N^{\prime}$ can only be guaranteed to be one universe apart from $H$ by Definition 3.1 even though $N$ is assumed to be at least two universes apart from $H$ . Therefore, we use $h_{n}\in{\mathbb{N}}_{1}$ (instead of $H$ ), which leads to $N^{\prime}\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ , and use $i_{2}$ to lift $N^{\prime}$ to $N^{\prime\prime}\in{\mathbb{N}}_{3}\setminus{\mathbb{N}}_{2}$ while keeping $h_{n}\in{\mathbb{N}}_{1}$ . This allows the use of the ${\mathbb{V}}_{3}$ –version of ${\bf L}_{2}(m-1)$ with ${\mathbb{V}}_{1}$ being considered as the “standard” universe. We then spill $h_{n}$ over to $H\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ and apply ${\bf L}_{1}(m-1)$ with $N,A,S,U$ being replaced by $H,\tau_{H},[H],U^{\prime}$ to obtain the desired collections ${\mathcal{P}}^{\prime}$ and ${\mathcal{Q}}^{\prime}$ of $k$ –a.p.’s. All these steps rely on the fact that ${\bf L}_{2}(m-1)$ is an internal statement with internal parameters.

Theorem 5.4 (E. Szemerédi, 1975)

Let $k\in{\mathbb{N}}_{0}$ . If $D\subseteq{\mathbb{N}}_{0}$ has positive upper density, then $D$ contains nontrivial $k$ -term arithmetic progressions.

Proof It suffices to find a nontrivial $k$ –a.p. in $i_{0}(D)$ . Let $P$ be an a.p. such that $|P|\gg 1$ and $\mu_{|P|}(i_{0}(D)\cap P)=S\!D(D)=\alpha$ . Then $\alpha>0$ because $\alpha$ is greater than or equal to the upper density of $D$ . Let $A=i_{0}(D)\cap P$ . Without loss of generality, we can assume $P=[N]$ for some $N\gg 1$ . We can also assume that $N\in{\mathbb{N}}_{2}\setminus{\mathbb{N}}_{1}$ because otherwise we can replace $N$ by $i_{1}(N)$ and $A$ by $i_{1}(A)$ . Then we have $\mu_{N}(A)=S\!D(A)=\alpha$ . Set $U=S=[N]$ . Trivially, $\mu_{N}(S)=S\!D(S)=\eta=1$ , $A\subseteq S$ , and $S\!D_{S}(A)=S\!D(A)=\alpha$ . Starting with $k^{\prime}=k+1$ instead of $k$ , ${\bf L}_{1}(k^{\prime})$ yields many nontrivial $k^{\prime}$ –a.p.’s $p\in{\mathcal{P}}$ such that $p(l)\in A$ for $l\leq k^{\prime}-1=k$ . So there must be many nontrivial $k$ –a.p.’s in $A\subseteq i_{0}(D)$ . By ${\mathbb{V}}_{0}\prec{\mathbb{V}}_{2}$ , there must be nontrivial $k$ –a.p.’s in $D$ . $\blacksquare$
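As a finite sanity check of what Theorem 5.4 asserts (and not as part of the proof), the following Python sketch searches a finite set of integers for a nontrivial $k$-term arithmetic progression by brute force. The function name `find_kap` and the sample sets are hypothetical. A finite search of this kind says nothing about the infinitary argument above; it only makes the conclusion of the theorem concrete.

```python
def find_kap(A, k):
    """Return the first nontrivial k-term a.p. contained in A as a list, or None.

    "Nontrivial" means that the common difference d is at least 1.
    The search is brute force over all starting points a and differences d.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    S = set(A)
    if not S:
        return None
    elems = sorted(S)
    top = elems[-1]
    for a in elems:
        # The last term a + (k - 1) * d must not exceed the largest element.
        max_d = (top - a) // (k - 1)
        for d in range(1, max_d + 1):
            if all(a + j * d in S for j in range(k)):
                return [a + j * d for j in range(k)]
    return None


# The multiples of 3 below 100 contain 5-term progressions.
print(find_kap(range(0, 100, 3), 5))    # [0, 3, 6, 9, 12]
# The powers of 2 up to 16 contain no 3-term progression.
print(find_kap({1, 2, 4, 8, 16}, 3))    # None
```

The second sample is a sparse set with no $3$-term progression, which is consistent with the theorem because the hypothesis of positive upper density concerns infinite sets and is not a statement about any fixed finite set.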

6 A Question

The construction of the nonstandard universes above requires the existence of a non-principal ultrafilter ${\mathcal{F}}_{0}$ on ${\mathbb{N}}_{0}$ and a well-order $\lhd$ on some ${\mathbb{V}}({\mathbb{R}}_{0},z)$ , which is a consequence of $\mathsf{ZFC}$ . Notice that $\mathsf{ZF}$ cannot guarantee the existence of ${\mathcal{F}}_{0}$ although $\mathsf{ZF}$ plus the existence of ${\mathcal{F}}_{0}$ and $\lhd$ is strictly weaker than $\mathsf{ZFC}$ . However, assuming the existence of ${\mathcal{F}}_{0}$ and $\lhd$ may be avoided by the axiomatic approach of nonstandard analysis developed in [5]. In [5] two systems of axioms $\mathsf{SPOT}$ and $\mathsf{SCOT}$ are introduced. Roughly speaking, $\mathsf{SPOT}$ contains $\mathsf{ZF}$ plus some primitive tools for nonstandard analysis and $\mathsf{SCOT}$ contains $\mathsf{ZF}$ plus the axiom of dependent choice and some primitive tools for nonstandard analysis. It is shown in [5] that $\mathsf{SPOT}$ is a conservative extension of $\mathsf{ZF}$ and sufficient for developing basic calculus while $\mathsf{SCOT}$ is a conservative extension of $\mathsf{ZF}$ plus the axiom of dependent choice and sufficient for developing basic calculus and Lebesgue integration.

Question 6.1

Can the nonstandard proof of Szemerédi’s theorem in §5 be carried out in $\mathsf{SCOT}$ or even in $\mathsf{SPOT}$ ?

Acknowledgments

The author would like to thank the American Institute of Mathematics which sponsored the workshop Nonstandard Methods in Additive Combinatorics, where he had an opportunity to attend Terence Tao’s lectures and learn from Tao’s interpretation of Szemerédi’s original proof of Szemerédi’s theorem [9]. The author would also like to thank Steven Leth, Isaac Goldbring, Mikhail Katz, Michael Benedikt, Karel Hrbáček, and the anonymous referee for comments, suggestions, and correcting some mistakes and typos in earlier versions of the paper.

References

[1] Chen Chung Chang and H. Jerome Keisler. Model Theory. (3rd edition), North-Holland, 1990.
[2] Mauro Di Nasso. Hypernatural numbers as ultrafilters. In Nonstandard Analysis for the Working Mathematician (Springer, Dordrecht, 2015), 443–474.
[3] Mauro Di Nasso, Isaac Goldbring, and Martino Lupini. Nonstandard Methods in Ramsey Theory and Combinatorial Number Theory. Lecture Notes in Mathematics book series, volume 2239, Springer, 2019.
[4] Robert Goldblatt. Lectures on the hyperreals–an introduction to nonstandard analysis. Springer, 1998.
[5] Karel Hrbáček and Mikhail G. Katz. Infinitesimal analysis without the Axiom of Choice. Annals of Pure and Applied Logic, 172 (6), June 2021.
[6] Lorenzo Luperi Baglini. Hyperintegers and Nonstandard Techniques in Combinatorics of Numbers. PhD Dissertation (2012), University of Siena, arXiv: 1212.2049.
[7] Peter A. Loeb and Manfred P. H. Wolff, editors. Nonstandard Analysis for the Working Mathematician–Second edition. Springer, Dordrecht, 2015.
[8] Endre Szemerédi. On sets of integers containing no $k$ elements in arithmetic progression. Collection of articles in memory of Juriǐ Vladimirovič Linnik. Acta Arithmetica. 27 (1975): 199–245.
[9] Terence Tao. Szemerédi’s proof of Szemerédi’s theorem. Acta Mathematica Hungarica. 161 (2020): 443–487. https://terrytao.files.wordpress.com/2017/09/szemeredi-proof1.pdf
[10] Terence Tao. A nonstandard analysis proof of Szemerédi’s theorem.
https://terrytao.wordpress.com/2015/07/20/a-nonstandard-analysis-proof-of-szemeredis-theorem/
[11] Terence Tao. Szemeredi’s proof of Szemeredi’s theorem.
https://terrytao.wordpress.com/2017/09/12/szemeredis-proof-of-szemeredis-theorem/.

Renling Jin
College of Charleston
Charleston, South Carolina, USA

jinr (at) cofc (dot) edu
http://jinr.people.cofc.edu

	$\displaystyle\left|\delta_{H}(A\cap(x+E_{w}))-\alpha\delta_{H}(E_{w})\right|$
			$\displaystyle\leq\frac{1}{H}\left(\left|\,|A\cap(x+E_{w})|-\sum_{n\in[m]}c_{n,w}|A\cap(x+U_{n})|\,\right|\right.$
			$\displaystyle\quad+\left|\sum_{n\in[m]}c_{n,w}|A\cap(x+U_{n})|-\sum_{n\in[m]}c_{n,w}\alpha|U_{n}|\right|$
			$\displaystyle\quad\left.+\left|\alpha\sum_{n\in[m]}c_{n,w}|U_{n}|-\alpha|E_{w}|\right|\right)$
			$\displaystyle\leq\epsilon+\frac{1}{H}\sum_{n\in I}c_{n,w}\epsilon_{1}|U_{n}|+2\delta_{H}(V^{\prime})+\alpha\epsilon$
			$\displaystyle\leq\epsilon+\epsilon_{1}\delta_{H}(V)+2\delta_{H}(V^{\prime})+\alpha\epsilon\approx 0.$

	$\displaystyle\mu_{|C_{K}|}(T_{m}^{g})\cdot\frac{\alpha^{m-1}}{k}=st\left(\frac{1}{|C_{K}|}\sum_{t\in T_{m}^{g}}\frac{1}{K}|{\mathcal{P}}_{m,t}|\right)\geq st\left(\frac{1}{|C_{K}|K}|{\mathcal{P}}_{m}^{g}|\right)$
			$\displaystyle\geq st\left(\frac{1}{|C_{K}|}\sum_{t\in T_{m}}\frac{1}{K}|{\mathcal{P}}_{m,t}|-\frac{1}{|C_{K}|}\sum_{l=m}^{k}(|V_{l}\setminus(S-\vec{x}(l))|\alpha^{m-1})\right)$
			$\displaystyle\geq\mu_{|C_{K}|}(T_{m})\cdot\frac{\alpha^{m-1}}{k}-(2k+1)k(1-\eta)\cdot\alpha^{m-1}$
			$\displaystyle=\left(\frac{1}{k}-(2k+1)k(1-\eta)\right)\cdot\alpha^{m-1},$
