
Diophantine Approximation of Anergodic Birkhoff Sums over Rotations

Paul Verschueren
(paul@verschueren.org.uk)
(Date: started 05/2020; this version 03/2023)
Abstract.

We study Birkhoff sums over rotations (series of the form $\sum_{r=1}^{N}\phi(r\alpha)$), in which the summed function $\phi$ may be unbounded at the origin. Estimates of these sums have been of significant interest and application in pure mathematics since the late 1890s, but in recent years they have also appeared in numerous areas of applied mathematics, and have enjoyed significant renewed interest. Functions which have been intensively studied include the reciprocals of number theoretical functions such as $\phi(x)=1/\{x\},\,1/\{\{x\}\},\,1/\left\|x\right\|$ and trigonometric functions such as $\phi(x)=\cot\pi x$ or $\left|\csc\pi x\right|$. Classically the Birkhoff sum of each function has been studied in relative isolation using function specific tools, and the results have frequently been restricted to Bachmann-Landau estimates. We introduce here a more general unified theory which is applicable to all of the above functions. The theory uses only “elementary” tools (no tools of complex analysis), is capable of giving “effective” results (explicit bounds), and generally matches or improves on previously available results.

The author wishes to thank Professor Sebastian van Strien of Imperial College, without whose encouragement and support this paper would never have seen publication, and also the College itself for providing access to the literature.
The author also wishes to thank Ben Mestel of the Open University for introducing him to the subject, for his infectious enthusiasm, and for many wonderful discussions.
This work was partly supported by ERC AdG RGDD No 339523

1. Introduction

We study estimates for series of the form $\sum_{r=1}^{N}\phi(r\alpha)$ where $\phi$ is a real function of period 1, and $\alpha$ is an irrational real number. Such series can be classified as Birkhoff sums over rotations of the circle (with initial condition 0). When $\phi$ is Lebesgue integrable, a first order estimate is easily available via the powerful theorems of Ergodic Theory. In particular Birkhoff’s Ergodic Theorem gives us $\frac{1}{N}\sum_{r=1}^{N}\phi(r\alpha)\rightarrow\int\phi$. Series which cannot be estimated in this way we call anergodic Birkhoff sums. In this paper we study the case where $\phi$ is not Lebesgue integrable due to being unbounded.

Such series have been of great historical interest in pure mathematics (eg Diophantine Approximation, Discrepancy Theory, q-series) but there has been a recent resurgence of interest as new applications have emerged in a number of areas including KAM theory, Quantum Field Theory, Quantum Chaos, Quantum Computing, and String Theory.

These series seem most naturally situated at the intersection of the disciplines of Diophantine Approximation and Dynamical Systems, although some sophisticated techniques of Complex Analysis were also famously deployed by Hardy and Littlewood in their studies. We will briefly position our objects of study within the two aforementioned disciplines.

One notable difference between the two disciplines is that studies in Diophantine Approximation have almost always focused on the sums $\sum_{r=1}^{N}\phi(r\alpha)$ as the natural objects of study, where $\phi$ is unbounded at the origin. In Dynamical Systems the natural objects of study are the more general sums $\sum_{r=1}^{N}\phi(x_{0}+r\alpha)$ where $x_{0}$ is an “initial condition”. This broadening of perspective proves important.

Whilst the general approach of this paper is designed to be applicable to quite general sums, space will restrict us to developing detailed estimates only for a single class of functions $\phi$ and for the homogeneous ($x_{0}=0$) case. This case does however cover the results of a surprisingly high proportion of previous studies. We hope to address inhomogeneous sums and other classes of functions in later papers.

1.1. The context of Diophantine Approximation

Particular examples of series $\sum_{r=1}^{N}\phi(r\alpha)$ have long been studied within the discipline of Diophantine Approximation, particularly exploiting the theory of Continued Fractions. These studies are typically challenging, and individual papers have focused on developing results for a particular given function $\phi$. Perhaps as a result, techniques have often been dependent upon identities tied to the particular function being studied. The major contribution of this paper is an approach which separates the study of the underlying dynamics (an irrational rotation) from the characteristics of the function $\phi$. We develop new results about the dynamics, and these are then immediately available for use with any chosen function $\phi$.

The simplest such series is undoubtedly the “sum of remainders”, namely the series $\sum_{r=1}^{N}\{r\alpha\}$ (ie $\phi(x)=\{x\}$) in which $\{x\}$ is the fractional part of $x$ (or remainder modulo 1). This case is ergodic and a simple application of ergodic theory immediately gives $\frac{1}{N}\sum_{r=1}^{N}\{r\alpha\}\rightarrow\frac{1}{2}$. However the rate of convergence is of great interest, and was closely studied by many illustrious mathematicians in the early 20th century, including Lerch [10], Sierpinski [13, 14], Ostrowski [12], Hardy & Littlewood [4, 5], Behnke [1] and Hecke [7], who all contributed insights and techniques. Even after such intense study, Lang still saw value in developing a new approach as late as the 1960s (see [8]), and there have been more papers on the topic in more recent years.
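The convergence of the averaged sum of remainders can be observed numerically. The following is a minimal Python sketch, not taken from the paper: the helper names `frac` and `remainder_average`, and the choice of the golden-ratio rotation `ALPHA`, are assumptions made purely for illustration. It prints the average and its deviation from $\frac{1}{2}$ for a few values of $N$, which is the quantity whose rate of decay the classical papers study.

```python
import math


def frac(x: float) -> float:
    """Fractional part {x}, a value in [0, 1)."""
    return x - math.floor(x)


def remainder_average(alpha: float, n: int) -> float:
    """Return (1/n) * sum_{r=1}^{n} {r * alpha}."""
    return sum(frac(r * alpha) for r in range(1, n + 1)) / n


# Illustrative rotation number (an assumption): the golden-ratio rotation.
ALPHA = (math.sqrt(5) - 1) / 2

for n in (10, 100, 1000, 10000):
    avg = remainder_average(ALPHA, n)
    # Second column: the average; third column: its deviation from 1/2.
    print(n, avg, avg - 0.5)
```

Floating point arithmetic limits how far such an experiment can be pushed, so it is a sanity check on the limit rather than evidence about the precise convergence rate.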

Of all the tools and techniques introduced in these papers, we will make the greatest use of Ostrowski’s representation of integers developed in [12] above. The way in which Ostrowski used this technique within his paper is not easily generalisable, and the technique seems to have been forgotten for many years (in fact it was independently rediscovered in 1952 by Lekkerkerker [9] and in 1972 by Zeckendorf[20], but then only for the case of Fibonacci numbers). However in recent years it has been recognised as a powerful technique quite independent of its original “sum of remainders” context.
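For readers unfamiliar with the representation, the following minimal Python sketch shows the classical greedy construction of the Ostrowski digits of an integer with respect to the continued fraction denominators $q_{k}$ of $\alpha$. It illustrates only the standard representation, not the operator extension developed later in the paper. The function names, and the convention of passing the partial quotients $a_{1},a_{2},\ldots$ of $\alpha=[0;a_{1},a_{2},\ldots]$ as a list, are assumptions of this sketch.

```python
def denominators(partial_quotients):
    """Denominators q_0, q_1, ... of the convergents of [0; a_1, a_2, ...]."""
    q_prev, q = 0, 1  # q_{-1} = 0, q_0 = 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev  # q_k = a_k * q_{k-1} + q_{k-2}
        qs.append(q)
    return qs


def ostrowski_digits(n, partial_quotients):
    """Greedy digits b_k (indexed like q_k) with n = sum_k b_k * q_k."""
    qs = denominators(partial_quotients)
    if n < 0 or n >= qs[-1]:
        raise ValueError("supply more partial quotients for this n")
    digits = [0] * len(qs)
    # Work down from the largest denominator, taking as many copies as fit.
    for k in range(len(qs) - 1, -1, -1):
        digits[k], n = divmod(n, qs[k])
    return digits


# Golden-ratio case: all partial quotients are 1 and the q_k are Fibonacci
# numbers, so the representation reduces to the Zeckendorf representation.
golden = [1] * 11
digits = ostrowski_digits(100, golden)
terms = [q for b, q in zip(digits, denominators(golden)) if b]
print(terms)  # the Fibonacci numbers used to represent 100
```

In the golden-ratio case no two consecutive denominators are used, which is the familiar Zeckendorf condition.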

Although the approach in our paper is applicable to the “sum of remainders” problem, that is not our focus. Rather we are concerned with series in which $\phi$ is not integrable due to unbounded singularities. In such cases the Ergodic Theorems do not apply, and other techniques are necessary to estimate the sums. A remarkable range of techniques have been applied. Hardy & Littlewood studied $\phi(x)=\csc\pi x$ using a number of advanced techniques including double zeta functions, functional equations, and Cesàro means. Lang [8] studied $\phi(x)=1/\{x\}$ and $\left|\csc\pi x\right|$ using an elegant and elementary recursive technique he developed for the purpose. Sudler [16] and Wright [19] studied $\phi(x)=\log|\sin\pi x|$ giving perhaps the first explicit bounds for such a sum. This sum occurs in a remarkable range of application areas and guises (see the Bibliographies of [11, 18] for a list of 30 papers on the topic prior to 2016). An open question of Erdős-Szekeres [3] from 1959 was settled positively in 1998 for almost all rotation numbers $\alpha$ by Lubinsky [11] who felt certain it would prove to apply to all $\alpha$. It was finally settled negatively in 2016 by Verschueren [17], but using techniques rather different from those in this paper. Recently (2020) Beresnevich, Haynes and Velani [2] studied $\phi(x)=1/\left\|x\right\|$ and explored connections between this particular Birkhoff sum and recent advances in the long standing Littlewood conjecture, using both elementary techniques and Minkowski’s Theorem. Sinai & Ulcigrai [15] (2009) studied the case $\phi(x)=\cot\pi x$ arising out of a problem in Quantum Computing, and used the “Cut and Stack” technique developed within the Dynamical Systems discipline.
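To make the objects concrete, the following Python sketch evaluates the Birkhoff sums of several of the unbounded observables named above. It is an illustration under stated assumptions rather than any of the cited authors' methods: the names `OBSERVABLES` and `birkhoff_sum` and the golden-ratio rotation `ALPHA` are hypothetical, and $\left\|x\right\|$ is taken in its usual sense as the distance from $x$ to the nearest integer.

```python
import math


def frac(x: float) -> float:
    """Fractional part {x}, a value in [0, 1)."""
    return x - math.floor(x)


def dist_to_nearest_int(x: float) -> float:
    """||x||: distance from x to the nearest integer."""
    f = frac(x)
    return min(f, 1.0 - f)


# Period-1 observables, each unbounded at the origin.
OBSERVABLES = {
    "1/{x}": lambda x: 1.0 / frac(x),
    "1/||x||": lambda x: 1.0 / dist_to_nearest_int(x),
    "cot(pi x)": lambda x: 1.0 / math.tan(math.pi * x),
    "|csc(pi x)|": lambda x: abs(1.0 / math.sin(math.pi * x)),
}


def birkhoff_sum(phi, alpha: float, n: int) -> float:
    """S_N phi = sum_{r=1}^{N} phi(r * alpha), reducing r*alpha mod 1 first."""
    return sum(phi(frac(r * alpha)) for r in range(1, n + 1))


ALPHA = (math.sqrt(5) - 1) / 2  # illustrative rotation number (an assumption)

for name, phi in OBSERVABLES.items():
    for n in (100, 1000, 10000):
        s = birkhoff_sum(phi, ALPHA, n)
        print(name, n, s, s / n)  # the sum and its average
```

The printed averages can be compared across $N$ by eye; the sketch makes no claim about their asymptotic behaviour, which is the subject of the estimates in the paper.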

In this paper we develop instead a single unified approach with which to tackle these problems. We also show how we can apply it with relatively little effort to a particular class ($\sum_{r=1}^{N}\phi(r\alpha)$ where $\phi$ has a single unbounded point at the origin), where it gives results equivalent to, or improving upon, those currently existing in the literature.

1.2. The context of Dynamical Systems

In the Dynamical Systems context, our series $\sum_{r=1}^{N}\phi(r\alpha)$ is an example of an additive co-cycle (or equivalently a skew product) and more specifically, a Birkhoff sum. The base phase space is the circle, and the space morphism (or evolution function) is an irrational rotation. Birkhoff sums occur widely in the modelling of physical systems, representing sums of an observable along an orbit of the phase space. Although we are restricting here our study of Birkhoff sums to one of the simplest of Dynamical Systems (the irrational rotation), we recall that many physical systems can be studied via, or even reduced to, rotations of the circle.

Key tools of study in this area are normally Ergodic theorems (eg Birkhoff, von Neumann, Oseledets) which establish that averages of the Birkhoff sums converge (ie the sequence $\frac{1}{N}\sum_{r=1}^{N}\phi(r\alpha)$, or the multiplicative analogue $\left(\prod_{r=1}^{N}\phi(r\alpha)\right)^{1/N}$, has a limit) almost everywhere under suitable constraints (primarily that $\phi$ be $L^{1}$-integrable). We will call the Birkhoff sum anergodic when ergodic theory cannot be applied.

Such cases already occur in Quantum Field Theory and String Theory (where tools of renormalisation theory replace those of ergodic theory) and where the techniques of this paper (seen as a form of renormalisation) may provide the beginnings of a further alternative approach. Sinai & Ulcigrai report the occurrence of the case $\phi(x)=\cot\pi x$ in a problem of quantum computing. They developed their own approach to the problem, but the approach developed in this paper is also applicable and provides an alternative (see the application in 9.7).

More recently, the advent of “Big Data” has refocused attention on Koopman operators and their adjoints, Perron-Frobenius operators, as these seem better suited to data-driven (rather than equation-driven) analysis, and high dimensional systems. The original theory was developed in the context of ergodic theory, and is limited to Banach spaces of observables (usually Hilbert spaces). Birkhoff sums are an important type of Koopman operator, and our study of anergodic sums extends the study of Koopman operators into non-Banach spaces of observables with singularities. It seems entirely possible that Big Data problems will at some point need such extensions. We therefore expend some effort in situating our theory within the context of operator theory (see particularly Section 3).

1.3. Overview of main results

  (1) We develop general upper and lower bounds for the Birkhoff sum $S_{N}\phi=\sum_{r=1}^{N}\phi(r\alpha)$ where $\phi$ is a monotonic period-1 function which may be unbounded at the origin. These bounds are independent of any other characteristics of $\phi$, and hence widely applicable.

  (2) These bounds are easily extended to the related Birkhoff sums $S_{N}\theta$ in the cases $\theta(x)=\phi(1-x)$, $\theta(x)=\phi(x)+\phi(1-x)$, and $\theta(x)=\phi(x)-\phi(1-x)$.

  (3) We compute specific bounds for the particular family $\phi(x)=x^{-\beta}$ for $\beta\geq 1$. (The techniques can also be used for $\beta<1$ but this case is $L^{1}$-integrable and is already covered by existing methods.) We show that in most cases this leads relatively quickly to results which match or improve upon results in the literature.

  (4) The underlying theory includes a new analysis of the distribution of the sequence of fractional parts $\{r\alpha\}$ across $[0,1)$ using an extension of Ostrowski’s representation of integers which seems of interest in its own right.

1.4. Overview of this paper

The aim of this paper is to develop a general theory of Birkhoff sums of the form $\sum_{1}^{N}\phi(r\alpha)$ where $\alpha$ is a real irrational and $\phi$ is a period 1 real function which may be unbounded at the origin. As we shall use a number of Dynamical Systems concepts in the paper, we note that the Birkhoff sum is equivalent to an additive co-cycle or skew-product $\sum_{1}^{N}\phi(R_{\alpha}^{r}0)$ where $R_{\alpha}$ is an irrational rotation of the circle, $0$ is the initial condition, and $\phi$ is an observable on the circle which may be unbounded at the initial condition.

In section 2 we introduce a collection of concepts and results which we use in the rest of the paper. The paper is slightly unusual in that we do not require any advanced results from any particular mathematical discipline, but we will however use basic tools and results from a range of disciplines. This gives us two problems. The first is that finding a level of introduction which will suit all readers is an impossible task, and we apologise to every reader in advance for both insulting her intelligence in some sections, and for assuming too much of him in others. The second is that we have encountered numerous notation collisions between different disciplines. Rather than embark on the Sisyphean task of unifying mathematical notation, we have reused the concept of namespaces from Computer Science, albeit with an informal implementation which we feel is more suited to mathematical writing.

In section 3 we begin developing a theory of hom-set magmas which provides the algebraic context for our later sections. This gives us a set of dualities which save us much duplication of effort later, and also provides a structure within which to make sense of later sections, particularly in Section 7.

In section 4 we introduce and develop the idea of separation of concerns, a concept borrowed from the world of software engineering. The sum $\sum_{1}^{n}\phi(r\alpha)$ mixes values of the observable $\phi$ with the dynamics of an irrational rotation. Previous approaches to these sums have developed identities concerning the observable $\phi$ and developed specific techniques for summing these particular identities. In software engineering terms the concerns of summation and dynamics are tightly bound. The result is that each observable with distinct characteristics tends to result in a different approach. In the approach of this paper, we will consciously aim to separate the two concerns. This allows us to study the dynamics once only, independent of any particular observable. The summation process then becomes a separate concern, and its study is simplified by being unbound from the dynamics.

The way we achieve this is to use an approach from Category Theory/Operator Theory to recast the normal Birkhoff sum $\sum_{1}^{n}\phi(r\alpha)$ into the sum $\phi^{*}\sum_{1}^{*n}(R_{\alpha}^{r}0)$ where $\phi^{*},\sum^{*}$ are now operators on the space of orbits under the irrational rotation $R_{\alpha}$. This achieves the desired separation albeit at the cost of a more abstract approach. We are now left with studying the distribution of the orbit $\sum_{1}^{*n}(R_{\alpha}^{r}0)$ (independently of $\phi$), and then studying the problem of applying operators $\phi^{*}$ to such sequences. As with many things in basic Category Theory, the approach becomes almost trivial once the concepts are understood. The value of Category Theory lies in encouraging thinking which leads to a fruitful recasting.

In section 5 we study our first concern which is the distribution of the orbit $\sum_{1}^{n}(R_{\alpha}^{r}0)$. The equi-distribution of the points of such an orbit, regarded as a set, is well understood and much studied. However our concern is rather to study the distribution as a sequence, ie it is to develop an estimate of the location of the point $r\alpha$ for each $r$. It turns out we can do this remarkably well using the number theoretic tool of Ostrowski representation, but lifting this tool to the level of an operator acting on “quasiperiod” orbit segments - namely the segments of an orbit which lie between closest returns. The error terms in these positional estimates are well controlled and effectively obey a renormalisation law.
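The closest returns that delimit these orbit segments can be located by a direct search. The Python sketch below is an illustration only: the function names and the golden-ratio rotation are assumptions, and the paper's precise definition of a quasiperiod segment is given in its Section 5 and is not reproduced here. It records each time $r$ at which the orbit point $r\alpha$ comes closer to the initial condition $0$ on the circle than any earlier point has.

```python
import math


def dist_to_nearest_int(x: float) -> float:
    """Distance on the circle from the point x (mod 1) to 0."""
    return abs(x - round(x))


def closest_returns(alpha: float, n_max: int):
    """Times r <= n_max at which r*alpha sets a new record closeness to 0."""
    best = math.inf
    times = []
    for r in range(1, n_max + 1):
        d = dist_to_nearest_int(r * alpha)
        if d < best:  # strictly closer than every earlier orbit point
            best = d
            times.append(r)
    return times


ALPHA = (math.sqrt(5) - 1) / 2  # illustrative rotation number (an assumption)
print(closest_returns(ALPHA, 100))
```

For this rotation the record times are Fibonacci numbers, in line with the classical fact that closest returns occur at the denominators of the continued fraction convergents of $\alpha$.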

In section 6 we develop some basic theory on unbounded functions and their spaces, as this does not appear to be a well studied area.

In section 7 we move to our second concern of applying the derived operator $\phi^{*}$ to the orbit, and this is now largely the problem of applying $\phi^{*}$ to the quasiperiod segments identified in Section 5. This results in upper and lower bounds for the Birkhoff sum $\sum_{1}^{n}\phi(r\alpha)$ which of course involve $\phi$, but which are perfectly general, ie they are independent of the particular characteristics of the specific observable $\phi$. The approach also provides a way of identifying important equivalence classes of observables which share the same higher order growth estimates. This completes our development of the general theory.

In section 8 we make an application of the general theory to the particular family of functions $\phi(x)=x^{-\beta}$ for real $\beta\geq 1$. The general theory can of course be applied to the family $\beta<1$, but as $x^{-\beta}$ is then $L^{1}$-integrable on the circle, these Birkhoff sums can be estimated via existing methods and we shall ignore them here. For $\beta\geq 1$ the fact that we now have a general theory enables us to develop estimates using a variety of estimating techniques developed independently in the literature. The theory enables us to develop the estimates relatively rapidly compared with earlier papers, perhaps with greater clarity. In addition, as they are developed within a unifying framework, the relative merits of the different techniques can be assessed, something which was not easy to do previously.

In section 9, we compare the estimates from the general theory to the results available in the literature. With a few special case exceptions, we show that the results from the general theory are equivalent to, or improve upon those in the literature. (It is possible that the exceptions could be improved on with more careful application of the theory, as we have made no attempt to achieve best possible results at this stage).

2. Preliminaries

This paper requires little in the way of pre-requisites from any one mathematical discipline, but we do draw some basic ideas and results from a range of disciplines. Some of these disciplines have several notations in common use and in addition there are some notation collisions between disciplines. In this section, we standardise our notation, summarise the necessary disciplinary background, and develop from these some simple theory and results for use in the following sections.

2.1. Basic Terminology & Notation

We will adopt the following notation for logical operations: $\&$ (AND), $|$ (OR), $!$ (NOT), $\coloneqq$ (Definition), $\left\llbracket.\right\rrbracket$ (Iverson bracket - see Definition 2).

We will use symbols from the $\mathbb{B}$lackboard font for our main spaces of interest: we assign $\mathbb{Z},\mathbb{R},\mathbb{T}$ their usual meanings as the ring of integers, the topological field of real numbers with Lebesgue measure, and the circle (as a Lie Group/real manifold - see 2.4 below). We assign $\mathbb{N}$ the “semi-standard” meaning as an additive monoid (and so $0\in\mathbb{N}$ and we define $\mathbb{N}^{+}\coloneqq\mathbb{N}\backslash\{0\}$). Finally we define two real semi-open intervals $\mathbb{I}\coloneqq[0,1)$ and $\mathbb{I}^{/}\coloneqq\mathbb{I}-\frac{1}{2}=[-\frac{1}{2},\frac{1}{2})$ which together form the atlas (the covering of coordinate spaces) for our circle manifold.

We define the set $\{x_{r}\}_{r=1}^{N}\coloneqq\{x_{1},x_{2},\ldots,x_{N}\}$ with the analogous notation $(x_{r})_{r=1}^{N}$ for sequences.

2.1.1. Contextual Annotation

Motivation. In borrowing from a number of disciplines in this paper, we quickly hit a problem with homonyms - symbols which have different meanings in different disciplines. The problem is one of disambiguation - how do we ensure that the reader understands which meaning is signified by a homonym at any particular occurrence?

In mathematics we have historically aimed to solve this problem by eliminating it, stressing the importance of unambiguous notation. In theory we can simply define sufficient new symbols. However we found two obstacles to this approach in this paper. First, we are using very basic concepts with very familiar notation, and this means that introducing new notation may be unhelpful. For example the notation $\{x\}$ signifies different meanings in set theory and number theory, but it seems undesirable to replace it with new notation in either discipline. Second, although the supply of new symbols is theoretically limitless, modern mathematics is in reality constrained by the supply of defined Unicode symbols. The latter is very finite and can be exhausted surprisingly quickly.

Both of these problems were encountered in the process of writing down the theory presented in this paper. This leads us to considering ambiguous notations, and in practice this proves to be much less of a problem than it sounds.

Firstly, the meaning of any particular homonym is often easily derived from the immediate context, and then the danger of ambiguity is theoretical rather than real. In fact we frequently recognise this in mathematics by saying we are “abusing notation”. The phrase suggests that we know we are doing something wrong or lazy, but perhaps feel we can “get away with it”. However we now argue that, provided that the context does remove ambiguity, we are not “abusing” so much as “reusing” notation, and this is an effective and laudable approach to managing a finite resource. As an example, consider the trivial function $f:\mathbb{N}\rightarrow\mathbb{N}$ defined by $n\mapsto n+1$. We now extend this to a function $f:\mathbb{C}\rightarrow\mathbb{C}$ defined by $z\mapsto z+1$. We refer to this as “abuse of notation” since we are using the symbol $f$ to signify two different functions, but it would serve no good cause to use different symbols for the two functions, and indeed doing so would lose sight of the deep connection between the functions.

We will therefore feel free to subject our notation to abusive practices, subject to an additional authorial responsibility to ensure that ambiguity is resolved by the context. The reader must judge how well we succeed.

Secondly, the normal way of resolving ambiguity in formulae or equations is to add explanatory detail to the surrounding text. However this can become clumsy, and the relevant text can seem somewhat dissociated from the symbols themselves. We borrow from Computer Science the idea of “inlining” context by the direct annotation of ambiguous tokens (such as variable names) with context (“namespace”) tokens which determine the correct disambiguation. In Computer Science the goal is to provide a formal disambiguation syntax suitable for processing by machine, and existing implementations we have considered are probably too heavy handed to suit mathematical writing. We have instead adopted a lighter and more mathematical notational approach with the more modest goal of providing sufficient disambiguation to suit an intelligent mathematical reader, rather than an efficient algorithm. Like any new notation there is a learning hurdle, but we feel the benefits for this paper make it worthwhile. A quick-start summary follows here, but see Appendix A for a more formal explanation.

2.1.2. Contextual Annotation Quickstart

Embracing the positive reuse of notation as described above enables us to write $X\coloneqq(X,+)$ in order to define the group $X$ with underlying set $X$ and operator ‘$+$’. We call ‘$X$’ here a homonym because we have given two different mathematical objects (the set and the group) the same name $X$. Depending upon the document we are writing, we might judge the intended meaning to be clear from surrounding text. If there is need for disambiguation, we can make the meaning of each occurrence of the homonym $X$ explicit by annotating each symbol with a suitable contextual label. For example we could write $X_{Grp}\coloneqq(X_{Set},+)$. In a more formal setting we might decide to be even more explicit with something like $X_{\epsilon Grp}\coloneqq(X_{::Set},+_{AssocOp})$. Whether we needed to define the context labels $Grp,Set,AssocOp$ would again depend on authorial judgment. In these examples, the label represents a category or class of mathematical objects but it is useful not to restrict ourselves to this. Also given subscripted $X_{i}$ we could write $(X_{i})_{Grp}$. Use $::$ or $\epsilon$ only when appropriate.

Example 1.

Classically in mathematics, the type of a variable is given in accompanying text, visually separated from the reasoning which follows. The annotation approach complements the classical approach and the two can be usefully intermixed. In some situations annotation may be distracting, cluttering an equation or formula, and in this case the classical approach will be best. In other situations it may enhance the flow of information by concisely inlining all necessary context. For example we might write $z_{\mathbb{C}}=x_{\mathbb{R}}+iy_{\mathbb{R}}$, and this may be either highly useful or highly distracting depending upon the circumstances.

2.1.3. The abuttal operator (Annotation conventions)

Rather than writing $Y^{X}$ we will use $XY$ as a concise notation for the morphisms from the object $X$ to the object $Y$. In addition to providing more natural directionality, this notation is particularly convenient for use as an annotation. For example, writing $\phi_{XY}(a)=b$ tells us that $\phi$ is a morphism from $X$ to $Y$, and hence we can also deduce that $a\in X,b\in Y$: it would be redundant to write $\phi_{XY}(a_{X})=b_{Y}$, although it is permitted. Note that writing $\phi_{XY}$ means the same as $\phi\in XY$ and $\phi:X\rightarrow Y$, and sometimes it also makes sense to say simply “$\phi$ is $XY$”. If $X=Y$ it often makes sense to abbreviate $\phi_{XX}$ to simply $\phi_{X}$ but then it needs to be clear from other context that $\phi$ is a function on $X$, and not an element of $X$. This is an example of why we regard $X$ here as a contextual label rather than the category or class $X$.

We also borrow from mathematical logic the convention of using abuttal to represent a suitable distinguished operator in a particular context. For example in the context of an algebraic ring we retain the usual convention $xy\coloneqq x\times y$, ie the abuttal operator represents the multiplication operator. However in the context of the function space $XY$, we define abuttal as evaluation, ie $\phi_{XY}x\coloneqq\phi(x)$. Given also $\theta_{YZ}$ we will define abuttal to be functional composition, ie $(\theta\phi)_{XZ}x\coloneqq\theta_{YZ}(\phi_{XY}x)$.

2.2. Logic and Abstract Algebra

Functions of Propositions

Definition 2.

Let $\mathscr{P}$ be a Boolean algebra of logical propositions, $\mathscr{B}$ the Boolean integers $\{0,1\}$, and $\mathscr{B}^{\prime}$ the alternate Boolean integers $\{-1,1\}$. The Iverson bracket $\left\llbracket.\right\rrbracket_{\mathscr{PB}}$ is a function which maps logical propositions to $\{0,1\}$ as follows: for any proposition $P$ we define $\left\llbracket P\right\rrbracket:=1$ if $P$ is true, and $0$ if $P$ is false. The alternate Iverson bracket $\left\llbracket.\right\rrbracket_{\mathscr{PB}^{\prime}}^{\prime}$ is analogous: we define $\left\llbracket P\right\rrbracket^{\prime}:=1$ if $P$ is true, and $-1$ if $P$ is false.

Note the useful identities $\left\llbracket!P\right\rrbracket=1-\left\llbracket P\right\rrbracket$, and $\left\llbracket P\right\rrbracket^{\prime}=(-1)^{\left\llbracket!P\right\rrbracket}=\left\llbracket P\right\rrbracket-\left\llbracket!P\right\rrbracket$. These functions allow us to define discontinuous and piecewise functions very efficiently: for example $f_{\mathbb{R}\mathbb{R}}(x)\coloneqq\left\llbracket x>0\right\rrbracket x$ gives us the ramp function (the real function which takes the value $x$ for $x>0$, and $0$ otherwise), while $\left\llbracket x>0\right\rrbracket$ alone gives the Heaviside step function.

We will often need to distinguish odd and even cases, and it is useful to define a set of constants for this purpose.

Definition 3.

For $r\in\mathbb{Z}$ we define $E_{r}\coloneqq\left\llbracket r\,\textrm{is\,even}\right\rrbracket$, $O_{r}\coloneqq\left\llbracket r\,\textrm{is\,odd}\right\rrbracket$.

Note the useful identities $E_{r}+O_{r}=1$ (so $x_{r}=E_{r}x_{r}+O_{r}x_{r}$), and $E_{r}-O_{r}=(-1)^{r}$.
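The brackets of Definition 2 and the constants of Definition 3 translate directly into code, which gives a quick mechanical check of the stated identities. The following Python sketch is illustrative only, and all function names in it are assumptions.

```python
def iverson(p: bool) -> int:
    """Iverson bracket: 1 if the proposition is true, 0 if it is false."""
    return 1 if p else 0


def iverson_alt(p: bool) -> int:
    """Alternate Iverson bracket: 1 if true, -1 if false."""
    return 1 if p else -1


def positive_part(x: float) -> float:
    """The function [x > 0] * x: equal to x for x > 0, and 0 otherwise."""
    return iverson(x > 0) * x


def E(r: int) -> int:
    """E_r = [r is even]."""
    return iverson(r % 2 == 0)


def O(r: int) -> int:
    """O_r = [r is odd]."""
    return iverson(r % 2 != 0)


# Check the identities over both truth values and a range of integers.
for p in (True, False):
    assert iverson(not p) == 1 - iverson(p)
    assert iverson_alt(p) == (-1) ** iverson(not p) == iverson(p) - iverson(not p)
for r in range(-5, 6):
    assert E(r) + O(r) == 1
    assert E(r) - O(r) == (-1) ** r
```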

Abstract Algebra

Basic terminology. We recall some basic algebraic concepts and terminology, fix precisely how we will use them here, and establish notation.

Set:

We will not be concerned with foundational issues here, so we content ourselves with a naive approach and in particular we allow the definition of sets by use of the naive notation $X\coloneqq\{x:Px\}$ where $P$ is a logical proposition.

Function:

A function $\phi_{XY}$ here always means a single valued relation on the set $X\times Y$. The annotation $XY$ represents the set of morphisms from $X$ to $Y$, which in the case of sets is precisely the set of all possible functions from $X$ to $Y$. The latter is often denoted $Y^{X}$, but we prefer the more natural direction implied by $XY$. When it seems more appropriate, we will also use the traditional notation $\phi:X\rightarrow Y$.
A function is defined explicitly using the notation $\phi_{XY}\coloneqq x_{X}\mapsto y_{Y}$, which also allows the definition of anonymous functions, eg $x_{X}\mapsto y_{Y}$.
We will also use placeholder notation to allow for function notation involving decorations (eg $\bar{.}$) or braces (eg $\{.\}$). Here the character ‘$.$’ is the placeholder.
We define the $n_{\mathbb{N}}$-ary extension of $\phi_{XY}$ to be $\phi_{X^{n}Y^{n}}:(x_{i})_{i=1}^{n}\mapsto(\phi x_{i})_{i=1}^{n}$ and the skew extension $\phi_{X^{n}Y^{n}}^{Op}:(x_{i})\mapsto(\phi x_{i})^{Op}$ where $\left(.^{Op}\right)_{X^{n}X^{n}}$ is the function which reverses sequences in $X^{n}$. (This is of course a specialisation of the concept of the $Op$ functor from Category Theory but we wish to avoid deploying the full generality here. We retain the “skew” terminology as it is more familiar in the contexts we are using.)
If $X=Y$ then $\phi$ is a set endomorphism, and any $x_{X}$ satisfying $\phi x=x$ is called a fixed point of $\phi$.

Algebraic Operation:

A function XnX\circ_{X^{n}X} is called an nn_{\mathbb{N}}-ary algebraic operation on a set XX, abbreviated to simply an operation on XX when nn is unimportant.
A unary operation is also a set endomorphism, which will often be denoted by the decoration of a symbol, eg .¯\overline{.} denotes the function xx¯x\mapsto\overline{x}.
Binary operations have the formal notation (x1,x2)\circ(x_{1},x_{2}) and this is occasionally helpful, but we shall of course normally use the more common notation x1x2x_{1}\circ x_{2}. Often it is useful to use abuttal (eg abab) to indicate the presence of a well known binary operation - when needed we will use ′′ to indicate the abuttal operator, eg ′′(a,b)a×b{}^{\prime\prime}\coloneqq(a,b)\mapsto a\times b means abab is to be interpreted as a×ba\times b.
$n$-ary operations on sequences $(x_i)_{X^n}$ are often constructed by extending unary or binary operations, eg by $\overline{(x_i)}\coloneqq(\overline{x_i})$ or $\circ(x_i)\coloneqq\circ(x_1,\circ(x_2,\ldots\circ(x_{n-1},x_n)\ldots))$, and we will take these extensions as given unless otherwise specified. If we also omit specification of $n$ then we intend that the extension may be made to any $n\geq 1$ (unary case) or $n\geq 2$ (binary case). Examples: the unary operation $-:x\mapsto-x$ gives $-(x_1,\ldots,x_n)=(-x_1,\ldots,-x_n)$; a commutative binary operation $+$ gives $+(x_1,\ldots,x_n)=\sum_{1}^{n}x_i$; whereas the non-commutative binary skew operation $\circ^{Op}:(x_1,x_2)\mapsto x_2\circ x_1$ gives $\circ^{Op}(x_1,\ldots,x_n)=\circ(x_n,\ldots,x_1)$. (Borrowed from Universal Algebra).
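The extensions just described can be sketched in a few lines of Python. This is only an illustration: the helper names `extend_unary`, `extend_binary` and `skew` are hypothetical, and the skew example uses string concatenation, which is associative, so that the right-nested extension of the skew operation agrees with the operation applied to the reversed sequence.

```python
from functools import reduce

def extend_unary(f):
    """Extend a unary operation componentwise to sequences of any length n >= 1."""
    return lambda xs: tuple(f(x) for x in xs)

def extend_binary(op):
    """Extend a binary operation to sequences of length n >= 2 by right-nesting:
    op(x1, op(x2, ... op(x_{n-1}, x_n) ...))."""
    def extended(xs):
        xs = list(xs)
        if len(xs) < 2:
            raise ValueError("a binary operation extends only to n >= 2")
        # Start from x_n and fold in x_{n-1}, ..., x_1 on the left.
        return reduce(lambda acc, x: op(x, acc), reversed(xs[:-1]), xs[-1])
    return extended

def skew(op):
    """The skew operation: (x1, x2) -> op(x2, x1)."""
    return lambda x1, x2: op(x2, x1)

def add(a, b):
    return a + b

negate = extend_unary(lambda x: -x)
total = extend_binary(add)               # commutative case: plain summation
concat = extend_binary(add)              # on strings: concatenation
skew_concat = extend_binary(skew(add))   # concatenation in reverse order

print(negate((1, 2, 3)))
print(total([1, 2, 3, 4]))
print(concat(["a", "b", "c"]), skew_concat(["a", "b", "c"]))
```

For a non-associative operation the right-nested skew extension and the operation on the reversed sequence need not agree, so the last comparison should be read as specific to the associative case.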

Algebraic Structures:

An algebraic structure $X_{Struc}$ is an ordered pair $(X,\{\circ_{\alpha}\}_{\alpha\in A})$ where $X_{Set}$ is called the underlying set, and $\{\circ_{\alpha}\}$ is a set of operations on $X$ indexed by a set $A$. We will also call $X$ a $\left|A\right|-Struc$. In the case of a singleton set $A$ (ie $X$ is a $1-Struc$), we simplify the notation to $(X,\circ)$ rather than $(X,\{\circ\})$. For example, a bare set has $A=\emptyset$ (meaning there is no additional structure), a bare monoid or group has $A$ a singleton, a bare ring or field has $|A|=2$. For $|A|\geq 2$ there will usually be relations defining interaction between operators.

Generated Structures:

Let $(U,\circ)$ be a structure with $\circ$ a binary operation. If $U$ has a unit it is unital. Note that if $U$ is not unital we can always extend it with a unit, and there can be only one unit for $\circ$. If $X\subseteq U$ and $X$ is closed under $\circ$, then $X$ is a substructure of $U$. For any $X\subseteq U$, the smallest substructure containing $X$ (which could be $U$ or $X$ or something in between) is the closure of $X$, designated $<X,\circ>$ (and also called the structure generated by $X$).
If \circ is not associative (following Bourbaki ##paper) the structure is a magma and its elements are represented by binary trees with leaves in XX, eg w1w2w_{1}\circ w_{2} can be regarded as a tree rooted at \circ with w1,w2w_{1},w_{2} as left and right sub-trees (which may be simply leaves). When \circ is associative, the need for brackets disappears, the structure becomes a semi-group, and elements such as u1u2u3u_{1}\circ u_{2}\circ u_{3} can be represented as strings u1u2u3u_{1}u_{2}u_{3} or sequences (u1,u2,u3)(u_{1},u_{2},u_{3}). If one of these structures lacks a unit, we can always extend it with a unique unit, but this is more important in the case of a semi-group - in this case the structure becomes a monoid, and the unit can be represented as an empty string or sequence, typically designated ϵ\epsilon.
The structure is called free if there are no relations between elements, ie if w1=w2w_{1}=w_{2} means that any representations of w1,w2w_{1},w_{2} (trees or strings) are identical. Note that even if (U,)(U,\circ) is not free, we can always introduce a second operator 2\circ_{2}, forget 1\circ_{1}, and consider USetU_{Set} as a set of generators for a free magma, semi-group or monoid (U,2)(U^{*},\circ_{2}). In this case elements of USetU_{Set} are called the base elements.
We shall be particularly interested in using free monoids to represent orbits in a Dynamical System, and in magmas of hom-sets to develop our theory of induced morphisms (see 3).
We can regard a free magma as a naturally graded structure (ie there is a function $U^{*}\rightarrow\mathbb{N}_{0}$) in two ways. The first, introduced by Bourbaki, is effectively the width of a term's binary tree, measured as the width of the base of the tree, ie the number of leaf elements in the tree. (This grading also extends to associative free magmas, where it reduces to being simply the length of a string.) We introduce a second grading for non-associative magmas in 3, which we will call the height of the tree, and which is more useful for our purposes. We define base elements to be trees of height 0, and inductively a tree of height $n$ is one which has one or both sub-trees of height $n-1$ (and neither sub-tree of greater height). Note this means $a\circ b$ has a height of $1$, $(a\circ b)\circ(c\circ d)$ has a height of $2$, whereas $(a\circ(b\circ(c\circ d)))$ has a height of $3$.
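The two gradings can be illustrated with a small sketch. The representation is an assumption made only for this example: a base element is any non-tuple value, and a product $w_1\circ w_2$ is the Python pair `(w1, w2)`.

```python
def width(tree):
    """Number of leaves of the binary tree (the Bourbaki-style grading)."""
    if not isinstance(tree, tuple):
        return 1                      # a base element is a single leaf
    left, right = tree
    return width(left) + width(right)

def height(tree):
    """Height grading: base elements have height 0, and a product is one
    level higher than its taller sub-tree."""
    if not isinstance(tree, tuple):
        return 0
    left, right = tree
    return 1 + max(height(left), height(right))

balanced = (("a", "b"), ("c", "d"))   # (a∘b)∘(c∘d)
comb = ("a", ("b", ("c", "d")))       # a∘(b∘(c∘d))

print(width(balanced), height(balanced))
print(width(comb), height(comb))
```

Both example trees have the same width but different heights, which is the distinction the second grading is meant to capture.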

Homomorphisms/Morphisms:

A structure homomorphism $\phi_{XY}$ between two algebraic structures $X_{Struc}=(X_{Set},\{\circ_{\alpha}\})$, $Y_{Struc}=(Y_{Set},\{\circ_{\beta}\})$ is a function $\phi_{X_{Set}Y_{Set}}$ which preserves (is compatible with/distributes over) the algebraic structure. This means that for each $n$-ary ($n>0$) algebraic operation $\circ_{\alpha}$ on $X$ there is an $n$-ary $\circ_{\beta}$ on $Y$ such that $\phi\left(\circ_{\alpha}(x_{i})_{n-Seq}\right)=\circ_{\beta}\left(\phi x_{i}\right)_{n-Seq}$ for all sequences $(x_{i})_{X^{n}}$. It is useful to define for each $n$ the Cartesian product function $\phi_{X^{n}Y^{n}}=\prod_{i=1}^{n}\phi_{XY}\coloneqq(x_{i})_{X^{n}}\mapsto(\phi_{XY}x_{i})_{Y^{n}}$, which gives us $\phi_{XY}\circ\circ_{\alpha}=\circ_{\beta}\circ\phi_{X^{n}Y^{n}}$. We can also define the involution $Op_{X^{n}}\coloneqq(x_{i})_{i=1}^{n}\mapsto(x_{i})_{i=n}^{1}$, which pulls up to an involution on $X^{n}Y^{n}$ defined by $Op(\phi_{X^{n}Y^{n}})=\phi_{X^{n}Y^{n}}^{Op}\coloneqq\phi_{X^{n}Y^{n}}\circ Op_{X^{n}}$. We call $\phi^{Op}$ an anti-structure homomorphism as it reverses sequences. For example, given groups $(X,\circ_{X}),(Y,\circ_{Y})$, then $\phi_{XY}$ is an anti-group homomorphism iff $\phi(x_{1}\circ_{X}x_{2})=(\phi x_{2})\circ_{Y}(\phi x_{1})$.
Note the standard additional terminology for specialised homomorphisms: if the function ϕXSetYSet\phi_{X_{Set}Y_{Set}} is a bijection then ϕ\phi is an isomorphism, and if X=YX=Y then ϕXX\phi_{XX} is an endomorphism, and if both conditions hold then ϕXX\phi_{XX} is an automorphism. If ϕXXx=x\phi_{XX}x=x, xx is a fixed point of the endomorphism ϕ\phi.

hom-sets:

Note that $XY$ represents the hom-set (the set of homomorphisms or morphisms) from $X$ to $Y$, and that the hom-set $XX$ of endomorphisms on $X$ is a subset of the set of unary operators on $X$, so we need to be careful. $XY$ can often be equipped with algebraic operations defined by the operations of the underlying algebraic structures $X$ and $Y$. We call these new operations derived/induced and $XY$ is then a (second order) algebraic structure in its own right. We will make significant use of this type of construction - see section (3) below.
The abuttal map ${}^{\prime\prime}\coloneqq(X,Y)\mapsto XY$ is itself a binary operation on sets, and so given a set of sets $X=\{X_{\alpha}\}$ we will write $X^{*}=<X,{}^{\prime\prime}>$ for the set of sets generated from $X$ by ${}^{\prime\prime}$. Note that since ${}^{\prime\prime}$ is not associative, $(X^{*},{}^{\prime\prime})$ is a magma rather than a semi-group (its elements are represented by binary trees rather than strings), and we will call it the magma of hom-sets. Also the elements of $X^{*}$ typically contain morphisms between sets of morphisms rather than morphisms between morphisms, so that these are distinct from the “higher order” morphisms encountered in higher order category theory.
second cant use of eg YY group gives XYXY group, endomorphisms of group form group (or comp of monids XYZ).

Involution:

An involution $\left(\overline{.}\right)_{XX}$ on an algebraic structure $X=(X_{Set},\{\circ\}_{\alpha})$ is a unary operation of order 2 (ie for every $x_{X}$ we have $\overline{\overline{x}}=x$) which is also a skew-automorphism, ie it reverses the order of the arguments of each $\circ_{\alpha}$: $\overline{x\circ_{\alpha}y}=\overline{y}\circ_{\alpha}\overline{x}$. The second constraint is automatically satisfied if $\{\circ\}_{\alpha}$ is empty. Given another set $Y$, an involution on $X$ induces an involution on $(XY)_{Set}$ by $\overline{\phi_{XY}}x=\phi\overline{x}$. However this involution does not generally commute with operators on $XY$ such as composition or addition, so it is not generally an involution on $XY$ as an algebraic structure.

We call $\overline{x}$ the involute of $x$, and call $x$ self-involute when $x=\overline{x}$. The set of self-involute points is the set of fixed points of $\overline{.}$, denoted $FP(X,\bar{.})$ or just $FP(X)$ when $\bar{.}$ is clear. A subset $W\subseteq X$ is fixed point free under $\bar{.}$ if it is disjoint from $FP(X,\bar{.})$. Note that this means $W,\bar{W}$ are mutually disjoint, and that $W\neq X$ unless $X=\emptyset$. A partition of $X$ under $\bar{.}$ is a triple of mutually disjoint subsets $(W,\overline{W},FP(X))$ which cover $X$.

Lemma 4.

Let X,YX,Y be algebraic structures with involutions .¯X,.¯Y\bar{.}_{X},\bar{.}_{Y}, and let WXW\subset X be fixed point free under .¯X\bar{.}_{X}. Suppose further that A,BA,B are real-valued functions on X×YX\times Y with the property that A(x,y)>B(x,y)A(x,y)>B(x,y) and A(x¯,y)<B(x¯,y)A(\overline{x},y)<B(\overline{x},y) for any yy and any xWx\in W. Then A¯(x,y)<B¯(x,y)\overline{A}(x,y)<\overline{B}(x,y) and A¯(x¯,y)>B¯(x¯,y)\overline{A}(\overline{x},y)>\overline{B}(\overline{x},y) for any yy and any xWx\in W.

Proof.

A¯(x,y)=A(x¯,y¯)<B(x¯,y¯)=B¯(x,y)\overline{A}(x,y)=A(\overline{x},\overline{y})<B(\overline{x},\overline{y})=\overline{B}(x,y) and the second result follows similarly. ∎

Involutions

Definition 5.

An involution .¯\overline{.} on a set XX is a self-inverse bijection, ie x¯¯=x\overline{\overline{x}}=x. An involution on an algebraic structure (X,)(X,\circ) is an involution on XX which is also an anti-isomorphism, ie xy¯=y¯x¯\overline{x\circ y}=\overline{y}\circ\overline{x}. If \circ is commutative, .¯\overline{.} becomes an automorphism.

If (X,)(X,\circ) is a group, we call x¯=x1\overline{x}=x^{-1} the inverse involution, noting xy¯=y¯x¯\overline{xy}=\overline{y}\,\overline{x}. When X=/X=\mathbb{R}/\mathbb{Z} this gives x¯=x\overline{x}=-x, x+y¯=(y)+(x)\overline{x+y}=(-y)+(-x), and we will use this extensively.

On the free monoid (X,+),(X^{*},+), reversal is an involution where x1x2xN¯=xNxN1x1\overline{x_{1}x_{2}\ldots x_{N}}=x_{N}x_{N-1}\ldots x_{1} and w1+w2¯=w2¯+w1¯\overline{w_{1}+w_{2}}=\overline{w_{2}}+\overline{w_{1}}. Note that reversal restricted to XX is the identity.
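Both involutions of this definition are easy to check numerically. The sketch below is illustrative only: words of the free monoid are modelled as Python tuples (so `+` is concatenation), points of $\mathbb{R}/\mathbb{Z}$ are modelled by exact fractions in $[0,1)$, and the function names are hypothetical.

```python
from fractions import Fraction

def reverse(word):
    """Reversal involution on the free monoid; words are tuples of letters."""
    return tuple(reversed(word))

def circle_inverse(x):
    """Inverse involution on R/Z in the natural coordinate: x -> {-x}."""
    return (-x) % 1

w1, w2 = ("a", "b"), ("c",)

# Order 2, and an anti-homomorphism for concatenation.
print(reverse(reverse(w1)) == w1)
print(reverse(w1 + w2) == reverse(w2) + reverse(w1))

x, y = Fraction(1, 3), Fraction(1, 4)
# Addition on the circle is commutative, so the involution is an automorphism.
print(circle_inverse((x + y) % 1) == (circle_inverse(y) + circle_inverse(x)) % 1)
```

Single-letter words are fixed by `reverse`, matching the remark that reversal restricted to $X$ is the identity.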

We will use some useful abstractions inspired by Category Theory in order to simplify and structure our theory. We start with concrete categories, namely categories whose objects are mathematical spaces each with an underlying set. By abuse of notation we will denote the underlying space of XObjX_{Obj} by XSetX_{Set}. Also in a concrete category, the morphism set XYXY of morphisms from XObjX_{Obj} to YObjY_{Obj} is a set of functions {ϕXSetYSet}\{\phi_{X_{Set}Y_{Set}}\} between the underlying sets. Of course XYXY will normally be a strict subset of the full set of functions from XSetX_{Set} to YSetY_{Set} (usually denoted YXY^{X}), and will be chosen to preserve in some way the additional structure of the spaces XObj,YObjX_{Obj},Y_{Obj}.

Given a concrete category CC we construct the second order category C+C^{+} whose objects are Obj(C){XY:X,YObj(C)}Obj(C)\bigcup\{XY:X,Y\in Obj(C)\}, ie we extend CC by making the morphism sets XYXY of CC into objects of C+C^{+}. This allows us to construct morphisms between morphism sets of CC, and between morphism sets of CC and objects of CC.

We use this to separate $S_{N}$ from $\phi$, and we make light use of duality to reduce the number of cases we need to consider.

Induced/Constructed/Dependent Morphisms General: A(XY)(VW):ϕXYϕVWA_{(XY)(VW)}:\phi_{XY}\mapsto\phi_{VW} Need to check whether ϕVW\phi_{VW} is a morphism.

Sequence extension (XY)(XnYn)n:ϕXYϕXnYn\prod_{(XY)(X^{n}Y^{n})}^{n}:\phi_{XY}\mapsto\phi_{X^{n}Y^{n}} where (nϕ)(xi)(ϕxi)Yn\left(\prod^{n}\phi\right)(x_{i})\coloneqq(\phi x_{i})_{Y^{n}} and hence to Kleene Star ϕXY\phi_{X^{*}Y^{*}} which is ϕXnYn\phi_{X^{n}Y^{n}} on seq of length n0n\geq 0. But note image (XY)(XnYn)nXY\prod_{(XY)(X^{n}Y^{n})}^{n}XY is usually a strict subset of XnYnX^{n}Y^{n} consisting of the diagonal entries: general term of XnYnX^{n}Y^{n} is (ϕj)XnYn:(xj)Xn(ϕjxj)Yn(\phi_{j})_{X^{n}Y^{n}}:(x_{j})_{X^{n}}\mapsto(\phi_{j}x_{j})_{Y^{n}} - diagonal element of XnYnX^{n}Y^{n}, XnX^{n} has ϕj=ϕ\phi_{j}=\phi, xj=xx_{j}=x respectively.

(XY)n={(ϕ1,,ϕn)}(XY)^{n}=\left\{(\phi_{1},\ldots,\phi_{n})\right\} is ambiguous over domain - is it XYnXY^{n} or XnYnX^{n}Y^{n}? - don’t use - use either of the latter which are well defined. But then ϕXYn(XY)Setn\phi_{XY^{n}}\in(XY)_{Set}^{n}- but application of (XY)n(XY)^{n} as a map is undefined, better to say XYnXY^{n} is naturally isomorphic to XnYnDiag(Xn)X^{n}Y^{n}\mid_{Diag(X^{n})}

Also (ϕi)XYn:x(ϕix)Yn(\phi_{i})_{XY^{n}}:x\mapsto(\phi_{i}x)_{Y^{n}} - restriction of domain of XnYnX^{n}Y^{n} from XnX^{n} to diagonal elements of XnX^{n}, Diag(Xn)Diag(X^{n})

Pullback/Koopman Operator: Cψ:ϕXY(ϕψWX)WYC_{\psi}:\phi_{XY}\mapsto\left(\phi\psi_{WX}\right)_{WY} CψC_{\psi} is linear in ϕ\phi and has signature (XY)(WY)(XY)(WY), CC is (WX)((XY)((XY)(WY)))\left(WX\right)\left((XY)\left((XY)(WY)\right)\right) function to operator (meta operator, 2nd order operator)

Given (Y,YnY),(Y,\circ_{Y^{n}Y}), (XYn)(XY):(ϕi)XYn(YnY(ϕi)XYn)XY\circ_{(XY^{n})(XY)}:(\phi_{i})_{XY^{n}}\mapsto\left(\circ_{Y^{n}Y}\cdot(\phi_{i})_{XY^{n}}\right)_{XY} eg (n=2) (ϕ1XYϕ2)x(ϕ1x)Y(ϕ2x)(\phi_{1}\circ_{XY}\phi_{2})x\coloneqq\left(\phi_{1}x\right)\circ_{Y}(\phi_{2}x) (pushforward of \circ ) Equivalent to f(g)=fgf(g)=f\circ g not g(f)=fgg(f)=f\circ g - simple composition for 1-functions, extended to sequences. What about YnZ\circ_{Y^{n}Z} ? (Y,YnZ),(XYn)(XZ):(ϕi)XYn(YnZ(ϕi)XYn)XZ(Y,\circ_{Y^{n}Z}),\circ_{(XY^{n})(XZ)}:(\phi_{i})_{XY^{n}}\mapsto\left(\circ_{Y^{n}Z}\cdot(\phi_{i})_{XY^{n}}\right)_{XZ} - this normal comp with YY replaced by Yn.Y^{n}. Generalisation of distribution when X=Y=ZX=Y=Z.

PushForward/Koopman Adjoint: Cϕ:ψWX(ϕXYψWX)WYC_{\phi}:\psi_{WX}\mapsto\left(\phi_{XY}\psi_{WX}\right)_{WY} is (WX)(WY)(WX)(WY) instead of (XY)(WY)(XY)(WY). CC is linear in ϕ\phi, but CϕC_{\phi} is not linear in ψ\psi. So the codomain XX of ϕ\phi is pushed forward to YY, rather than the domain of XX being pulled back to WW. In the above we have (W,X,Y)(W,X,Y) replaced by (X,Yn,Y)(X,Y^{n},Y).

Let X,YX,Y be two objects of some concrete category CC, and suppose YY is equipped with a binary operation +Y+_{Y} (not necessarily commutative). Then +Y+_{Y} induces a dependent binary operation +XY+_{XY} on the set of functions YXY^{X} as follows: (ϕXY+XYθXY)xϕx+Yθx(\phi_{XY}\,+_{XY}\,\theta_{XY})x\coloneqq\phi x\,+_{Y}\,\theta x. Of course the subset XYXY of YXY^{X} may not be closed under +XY+_{XY}. If it is closed, then we say the space XY(XY,+XY)XY\coloneqq(XY,+_{XY}) is a dependent space of YY.

Now given an endomorphism TXXT_{XX} and a morphism ϕXY\phi_{XY}, then the composition ϕXYTXX\phi_{XY}T_{XX} is itself in XYXY. In other words TT acts on XYXY by ϕϕT\phi\mapsto\phi T. When XYXY is a dependent space on YY, the action is also linear in the sense that (ϕ1+ϕ2)(Tx)=ϕ1(Tx)+ϕ2(Tx)=(ϕ1T)(x)+(ϕ2T)(x)=(ϕ1T+ϕ2T)(x)(\phi_{1}+\phi_{2})(Tx)=\phi_{1}(Tx)+\phi_{2}(Tx)=(\phi_{1}T)(x)+(\phi_{2}T)(x)=(\phi_{1}T+\phi_{2}T)(x).
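A minimal sketch of the dependent operation and of the action $\phi\mapsto\phi T$ follows, assuming $Y=\mathbb{R}$ with ordinary addition and $X=[0,1)$. The helper names `induced_add` and `pull_back`, and the particular choices of `T`, `phi1`, `phi2`, are hypothetical and serve only to check the linearity identity at a sample point.

```python
import math

def induced_add(phi, theta):
    """The dependent operation +_{XY}: (phi + theta)(x) = phi(x) + theta(x)."""
    return lambda x: phi(x) + theta(x)

def pull_back(T):
    """The action of an endomorphism T on observables: phi -> phi∘T."""
    return lambda phi: (lambda x: phi(T(x)))

def T(x):
    """Example endomorphism: rotation of [0,1) by a quarter turn."""
    return (x + 0.25) % 1.0

def phi1(x):
    return x * x

def phi2(x):
    return math.cos(2 * math.pi * x)

act = pull_back(T)
x = 0.3
lhs = act(induced_add(phi1, phi2))(x)          # ((phi1 + phi2)T)(x)
rhs = induced_add(act(phi1), act(phi2))(x)     # (phi1 T + phi2 T)(x)
print(math.isclose(lhs, rhs))
```

The check only samples one point, so it illustrates the identity rather than proving it.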

We now wish to be able to reason further about the map $\phi\mapsto\phi T$, which is an endomorphism of $XY$ and consequently does not exist within $C$. Informally we will just add it in, but we will also describe how to do this formally. We move to the higher order category $C^{+}$ where we can define morphisms between sets of morphisms: a morphism between $XY$ and $AB$ will lie in the morphism set $(XY)(AB)$. In particular we can now define the dependent morphism $\mathbb{T}_{(XY)(XY)}(\phi_{XY})\coloneqq\phi T_{XX}$. Now by the result above, $+_{XY}$ in $XY$ induces a dependent binary operation $+_{(XY)(XY)}$ in $XY^{(XY)}$ defined by $(T+U)\phi\coloneqq T\phi+U\phi$. Again $(XY)(XY)$ is a dependent space of $XY$ if closed under $+$.

The endomorphisms $T_{XX}$ of $X$ form a monoid under composition, and so do the induced morphisms in $(XY)(XY)$, which also form a semi-group if closed under the operator $+_{XY}$ pulled up from $+_{Y}$. Also $T^{n}\phi=\phi T^{n}$ for $n\geq 0$.

X=(X,+)X^{*}=(X^{*},+) is the free linear monoid on XX, and we extend TT to XX^{*} by T(r=1Nxr)r=1NTxrT(\sum_{r=1}^{N}x_{r})\coloneqq\sum_{r=1}^{N}Tx_{r} so the induced TT is XXX^{*}X^{*} and linear. Now the orbit segment of x0x_{0} with length NN in (X,T)(X,T) is (Tx0,T2x0,,TNx0)(Tx_{0},T^{2}x_{0},\ldots,T^{N}x_{0}) and now regard this as the word r=1N(Tr(x0)X)\sum_{r=1}^{N}\left(T^{r}\left(x_{0}\right)_{X^{*}}\right) in XX^{*} where (x0)X\left(x_{0}\right)_{X^{*}} is the single letter word in XX^{*} corresponding to (x0)X(x_{0})_{X}. Using the linearity of TrT^{r} we can rewrite this as (r=1NTr)x0\left(\sum_{r=1}^{N}T^{r}\right)x_{0} where the summation is the induced operator +(XX)+_{(X^{*}X^{*})}. We now define the linear operator SN=r=1NTrS_{N}=\sum_{r=1}^{N}T^{r} and we can write the orbit segment as SNx0S_{N}x_{0}. Finally we similarly extend ϕXY\phi_{XY} to XX^{*}by ϕ(r=1Nxr)=r=1Nϕxr\phi(\sum_{r=1}^{N}x_{r})=\sum_{r=1}^{N}\phi x_{r} where now the second summation uses +Y+_{Y}. Hence we can use the rewriting rules of (4) ϕ(SNx0)=r=1Nϕ(Trx0)=r=1N(Trϕ)x0=(r=1NTr)ϕx0=(SNϕ)x0\phi(S_{N}x_{0})=\sum_{r=1}^{N}\phi(T^{r}x_{0})=\sum_{r=1}^{N}(T^{r}\phi)x_{0}=\left(\sum_{r=1}^{N}T^{r}\right)\phi x_{0}=\left(S_{N}\phi\right)x_{0} where the second homonym SNS_{N} is now (XY)Y(X^{*}Y)Y and SNϕS_{N}\phi is XYX^{*}Y (ie an observable on XX^{*}) and its restriction to XX is now the Birkhoff sum operator

Free Monoids

Definition 6.

Given a set XX, the free monoid (X,+)(X^{*},+) on XX is defined as the set of finite words (or strings) x1x2xNx_{1}x_{2}\ldots x_{N} with each letter xiXx_{i}\in X, including the empty word 0, and with the operation ++ being the concatenation of words (which of course is NOT commutative). We write r=1Nwr\sum_{r=1}^{N}w_{r} to represent w1+w2+wNw_{1}+w_{2}\ldots+w_{N} where wrw_{r} is a word of XX^{*}. We write the reverse sum wN+wN1+w1w_{N}+w_{N-1}\ldots+w_{1} as r=1NwN+1r\sum_{r=1}^{N}w_{N+1-r}.

Duality A duality between two spaces allows us to obtain dual results automatically in one space from results in the other. We will exploit duality in a number of places to reduce the number of cases we need to consider. However the dualities are quite simple, and the theory development required to make them formal and explicit does not repay the effort. Hence we will simply identify the dualities informally and use them to structure the cases to be considered.

We will exploit 2 types of duality: domain dualities which arise through the existence of useful involutions on underlying sets and groups, and a parity duality (a duality between odd and even results) which arises in connection with the study of Continued Fractions.

2.3. Dynamical Systems

Although we noted above that Birkhoff sums were studied pre-Birkhoff, they are normally today associated with Ergodic Theory and Dynamical Systems, and we will introduce appropriate terminology and notation here.

A Dynamical System (X,TX)(X,T_{X}) is simply a space XX (a set XX equipped with some suitable algebraic and analytical structure) with an endomorphism T:XXT:X\rightarrow X. Given x0Xx_{0}\in X (called an initial condition), the point xN=TNx0x_{N}=T^{N}x_{0} is the NN-th iterate of x0x_{0}, and the sequence (x1,x2,)(x_{1},x_{2},\ldots) is the (forward) orbit of x0x_{0} (NB it is very convenient NOT to regard x0x_{0} itself as the first point of the orbit). For n,m0n,m\geq 0, the subsequence of the orbit (xr)r=m+1m+n(x_{r})_{r=m+1}^{m+n} is called a segment of length nn, and an initial segment if m=0m=0.

We will focus in this paper on the Dynamical System $(\mathbb{T},R_{\alpha})$ where $\mathbb{T}=[0,1)$ is the circle (a compact 1 dimensional real manifold) and $R_{\alpha}$ is a rotation through $\alpha_{\mathbb{R}}$ revolutions. Note that $R_{\alpha}=R_{\{\alpha\}}$.

When $X$ has a distance function $d$ we say $x_{N}$ is a point of closest return to $x_{0}$ if $x_{N}=T^{N}x_{0}$ satisfies $d(x_{0},x_{N})<d(x_{0},x_{r})$ for $1\leq r<N$, ie $x_{N}$ is the first point of minimum distance from $x_{0}$ in the initial orbit segment of length $N$. We call $N$ a quasiperiod with quasiperiod error $d(x_{0},x_{N})$. Note that $1$ is trivially a quasiperiod.
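The definition of a quasiperiod translates directly into a search over the initial orbit segment. The sketch below assumes the circle rotation $x\mapsto\{x+\alpha\}$ with the usual circle distance (distance to the nearest integer), uses floating point arithmetic, and takes the golden rotation as its example; the function names are hypothetical.

```python
import math

def circle_distance(x, y):
    """Distance on R/Z: how far x - y is from the nearest integer."""
    d = (x - y) % 1.0
    return min(d, 1.0 - d)

def quasiperiods(alpha, n_max, x0=0.0):
    """Return every N <= n_max for which x_N = x0 + N*alpha is strictly
    closer to x0 than all earlier orbit points x_1, ..., x_{N-1}."""
    found = []
    best = math.inf                  # no earlier points yet, so N = 1 qualifies
    for n in range(1, n_max + 1):
        d = circle_distance(x0 + n * alpha, x0)
        if d < best:
            found.append(n)
            best = d
    return found

golden = (math.sqrt(5) - 1) / 2      # alpha = 0.618..., continued fraction [0; 1, 1, 1, ...]
print(quasiperiods(golden, 100))
```

For the golden rotation the quasiperiods found are Fibonacci numbers, which are the denominators of the continued fraction convergents of this $\alpha$. For much larger `n_max` the floating point rounding would eventually need to be replaced by exact or higher precision arithmetic.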

Note also that if $x_{N}=x_{0}$ then the orbit is periodic, $N$ is a multiple of the period $p=\min\{r>0:T^{r}x_{0}=x_{0}\}$, and the period is itself a quasiperiod with quasiperiod error $0$.

We now introduce the concept of an observable of the Dynamical System as a function $\phi_{XY}$ with values in a semi-group $(Y,\circ_{Y})$. By 2.2 this induces a natural binary function $\circ_{XY}$ on $XY$. In general $\circ_{Y},\circ_{XY}$ need not be commutative (as in the theory of co-cycles, where $(Y,\circ)$ is a space of matrices with matrix multiplication). In this paper we restrict ourselves to the simple case $(Y,\circ)=(\mathbb{R},+)$, and so in the sequel an observable on $X$ will be a real function on $X$, and the set of observables $X\mathbb{R}$ is a commutative group, namely $(X\mathbb{R},+_{X\mathbb{R}})$.

Definition 7 (Birkhoff Sum).

The NNth Birkhoff sum over the Dynamical System (X,T)(X,T) of the observable ϕX\phi_{X\mathbb{R}} with initial condition xx is SNT(ϕ,x)r=1Nϕ(Trx)S_{N}^{T}(\phi,x)\coloneqq\sum_{r=1}^{N}\phi(T^{r}x).

We shall find it convenient to define a number of homonymous forms derived from $S_{N}$. We allow the omission of $T$ or $\phi$ when these are understood, obtaining the forms $S_{N}(\phi,x)\coloneqq S_{N}^{T}(\phi,x)$, $S_{N}x\coloneqq S_{N}(\phi,x)$. When $T$ is $R_{\alpha}$, namely a rotation of the circle through $\alpha_{\mathbb{R}}$ revolutions, we will also write $S_{N}^{\alpha}\coloneqq S_{N}^{R_{\alpha}}$. Finally we define the homogeneous form $S_{N}\phi\coloneqq S_{N}(\phi,0)$.
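Definition 7 can be transcribed almost literally. In the sketch below the function names are hypothetical, and a rational rotation with exact fractions is used purely so that the printed value can be checked by hand; the paper itself is concerned with irrational $\alpha$.

```python
from fractions import Fraction

def birkhoff_sum(phi, T, x, n):
    """S_N^T(phi, x) = sum_{r=1}^{N} phi(T^r x).
    The initial condition x itself is not part of the sum."""
    total = 0
    for _ in range(n):
        x = T(x)          # advance first, so the first term is phi(T x)
        total += phi(x)
    return total

def rotation(alpha):
    """The rotation R_alpha in the natural coordinate: x -> {x + alpha}."""
    return lambda x: (x + alpha) % 1

# Orbit of 0 under rotation by 1/4 is 1/4, 1/2, 3/4, 0, so the sum of the
# coordinate function over the first four iterates is 3/2.
print(birkhoff_sum(lambda x: x, rotation(Fraction(1, 4)), Fraction(0), 4))
```

The homogeneous form $S_{N}\phi$ corresponds to calling `birkhoff_sum` with initial condition `0`.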

Separation of concerns Let (X,T)(X,T) be a Dynamical system with a Value system (Y,+)(Y,+) and Observables system (XY,+)(XY,+).

In (2.2) we noted that, given $T:X\rightarrow X$ and $\phi\in(XY,+)$, $T$ has a homomorphic action $T:XY\rightarrow XY$ defined by

(Tϕ)x=(ϕT)x=ϕ(Tx)(T\phi)x=(\phi\circ T)x=\phi(Tx)

For the purposes of this paper, the right hand side is more tractable since it means we can study a given Dynamical System (X,T)(X,T) independent of ϕ\phi. However our goal is to study Birkhoff sums which have the form (SNϕ)x0(S_{N}\phi)x_{0}. Noting that this has the same form as the left hand side of (2.2), it would be nice to rewrite (SNϕ)x0(S_{N}\phi)x_{0} as ϕ(SNx0)\phi(S_{N}x_{0}). Unfortunately SNx0S_{N}x_{0} is undefined, but we can easily remedy this by extending our study from XX to XX^{*}, the free monoid on XX (see 2.2), which provides a richer context in which to work.

Note that we can write any word $w=x_{1}\ldots x_{N}\in X^{*}$ as a concatenation $\sum_{r=1}^{N}x_{r}$. We can also identify the initial orbit segment $(T^{r}x_{0})_{r=1}^{N}$ of $(X,T)$ with the concatenation $\sum_{r=1}^{N}T^{r}x_{0}$.

We now extend the function $\phi:X\rightarrow Y$ to a monoid homomorphism $\phi:X^{*}\rightarrow Y$ by $\phi(\sum_{r=1}^{N}x_{r})=\sum_{r=1}^{N}\phi x_{r}$. Similarly we extend each function $T^{r}:X\rightarrow X$ to a monoid homomorphism $T^{r}:X^{*}\rightarrow X^{*}$ by $T^{r}(\sum_{i=1}^{n}x_{i})=\sum_{i=1}^{n}T^{r}x_{i}$, noting by (2.2) that $+$ on $X^{*}$ induces $+$ on $X^{*}X^{*}$ so that $M=\left((X^{*}X^{*}),+\right)$ is a monoid of monoid endomorphisms. In particular we can now write $S_{N}=\sum_{r=1}^{N}T^{r}\in M$ with $S_{N}w=\sum_{r=1}^{N}T^{r}w$ for each word $w\in X^{*}$. Applying (2.2) with $U:X^{*}\rightarrow X^{*}\in M$ and $\phi\in(X^{*}Y,+)=M_{2}$, we see that $U$ has a homomorphic action $U:M_{2}\rightarrow M_{2}$ given by $(U\phi)w=\phi(Uw)$. In particular taking $U=S_{N},w=x_{0}$ gives us (as desired) $(S_{N}\phi)x_{0}=\phi(S_{N}x_{0})$, and indeed much more.
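The identity $(S_{N}\phi)x_{0}=\phi(S_{N}x_{0})$ can be checked on a toy model. The sketch assumes words of $X^{*}$ are Python tuples (concatenation is tuple `+`), $Y$ is the rationals under addition, and all function names are hypothetical; a rational rotation is used only to keep the arithmetic exact.

```python
from fractions import Fraction

def power(T, r):
    """The r-th iterate T^r of a map T on X."""
    def iterate(x):
        for _ in range(r):
            x = T(x)
        return x
    return iterate

def extend_map(f):
    """Extend f: X -> X letterwise to a monoid homomorphism X* -> X*."""
    return lambda word: tuple(f(x) for x in word)

def S(T, n):
    """S_N = T^1 + T^2 + ... + T^N on words, where + is concatenation."""
    def apply(word):
        out = ()
        for r in range(1, n + 1):
            out += extend_map(power(T, r))(word)
        return out
    return apply

def extend_observable(phi):
    """Extend phi: X -> Y to a monoid homomorphism X* -> (Y, +)."""
    return lambda word: sum(phi(x) for x in word)

def T(x):
    return (x + Fraction(1, 5)) % 1

def phi(x):
    return x * x

x0 = (Fraction(0),)                               # x0 as a single-letter word
segment = S(T, 3)(x0)                             # the initial orbit segment
direct = sum(phi(power(T, r)(Fraction(0))) for r in range(1, 4))
print(segment)
print(extend_observable(phi)(segment) == direct)
```

The point of the separation is visible in the code: `S(T, 3)(x0)` is computed with no reference to `phi`, which is applied only afterwards.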

Noting that we can now identify SNx0S_{N}x_{0} with the initial orbit segment (Trx0)r=1N(T^{r}x_{0})_{r=1}^{N} in the Dynamical System (X,T)(X,T), we use the identity above to separate the study of Birkhoff sums SNϕx0S_{N}\phi x_{0} into the study of initial orbit segments, and subsequently the action of ϕ\phi on these segments. We do this in Sections (5,6) respectively.

2.4. Geometry of the circle

As usual we will adopt the topological quotient group $\mathbb{T}=\mathbb{R}/\mathbb{Z}$ as our model of the circle, together with its quotient topology and (pushed forward from $\mathbb{I}$) Lebesgue measure. Recall that the elements of $\mathbb{R}/\mathbb{Z}$ are the cosets $x_{\mathbb{R}}+\mathbb{Z}$. We write $x_{\mathbb{T}}\coloneqq x_{\mathbb{R}}+\mathbb{Z}$ and designate the quotient homomorphism $\pi_{\mathbb{R}\mathbb{T}}$ so that $\pi(x_{\mathbb{R}})=x_{\mathbb{R}}+\mathbb{Z}=x_{\mathbb{T}}$. Given $x,x_{0}\in\mathbb{R}$, we also regard $\mathbb{T}$ as a compact manifold with the atlas of charts $\chi_{x_{0}}^{\mathbb{T}\mathbb{I}}:x_{\mathbb{T}}=x+\mathbb{Z}\mapsto\{x-x_{0}\}$. (Strictly a chart is a homeomorphism between open sets, but we can easily extend these charts to the full manifold: we first define $\chi_{x_{0}}:\mathbb{T}\backslash\{x_{0}+\mathbb{Z}\}_{Set}\rightarrow\mathbb{I}\backslash\{x_{0}\}_{Set}$ and then extend it to include $x_{0}+\mathbb{Z}\mapsto\{x_{0}\}$. Similarly for $\chi_{x_{0}}^{/}$.) Note that $\chi_{x_{0}}^{\mathbb{T}\mathbb{I}}\chi^{\mathbb{R}\mathbb{T}}=\chi_{x_{0}}^{\mathbb{R}\mathbb{I}}:x\mapsto\{x-x_{0}\}$. We also have a similar set of charts $\chi_{x_{0}}^{\mathbb{T}\mathbb{I}^{/}}:x_{\mathbb{T}}\mapsto\{\{x-x_{0}\}\}$ with $\chi_{x_{0}}^{\mathbb{T}\mathbb{I}^{/}}\chi^{\mathbb{R}\mathbb{T}}=\chi_{x_{0}}^{\mathbb{R}\mathbb{I}^{/}}:x\mapsto\{\{x-x_{0}\}\}$. When $x_{0}=0$ we will drop the subscript from each of the maps $\chi$ and regard these as canonical maps.
In particular we take χ𝕋𝕀,χ𝕋𝕀/\chi^{\mathbb{T}\mathbb{I}},\chi^{\mathbb{T}\mathbb{I}^{/}} as a canonical atlas, and call x𝕀=χ𝕋𝕀(x𝕋),x𝕀/=χ𝕋𝕀/(x𝕋)x_{\mathbb{I}}=\chi^{\mathbb{T}\mathbb{I}}(x_{\mathbb{T}}),x_{\mathbb{I}^{/}}=\chi^{\mathbb{T}\mathbb{I}^{/}}(x_{\mathbb{T}}) the natural coordinate, and signed coordinate of x𝕋x_{\mathbb{T}} respectively.
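The two canonical coordinates can be computed as follows. This is a sketch under an explicit assumption: the signed fractional part $\{\{x\}\}$ is taken to lie in $[-\frac{1}{2},\frac{1}{2})$, which may differ at the single point $\frac{1}{2}$ from the convention fixed elsewhere in the paper. The function names are hypothetical, and the test values are chosen to be exactly representable in floating point.

```python
import math

def frac(x):
    """Fractional part {x}, a value in [0, 1)."""
    return x - math.floor(x)

def signed_frac(x):
    """Signed fractional part {{x}}, assumed here to lie in [-1/2, 1/2)."""
    return frac(x + 0.5) - 0.5

def natural_coordinate(x, x0=0.0):
    """The chart x + Z -> {x - x0}; canonical when x0 = 0."""
    return frac(x - x0)

def signed_coordinate(x, x0=0.0):
    """The chart x + Z -> {{x - x0}}; canonical when x0 = 0."""
    return signed_frac(x - x0)

# Any real representative of the coset gives the same coordinates.
print(natural_coordinate(2.75), natural_coordinate(-0.25))
print(signed_coordinate(2.75), signed_coordinate(0.25))
```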

Note the transition map or change of coordinates χx1𝕋𝕀(χx0𝕋𝕀/)1x={x(x1x0)}\chi_{x_{1}}^{\mathbb{T}\mathbb{I}}\left(\chi_{x_{0}}^{\mathbb{T}\mathbb{I}^{/}}\right)^{-1}x=\{x-(x_{1}-x_{0})\}, and so for the canonical coordinates the morphism χx1𝕋𝕀(χx0𝕋𝕀/)1\chi_{x_{1}}^{\mathbb{T}\mathbb{I}}\left(\chi_{x_{0}}^{\mathbb{T}\mathbb{I}^{/}}\right)^{-1} lies in the hom-set 𝕀/𝕀\mathbb{I}^{/}\mathbb{I} and its inverse lies in 𝕀𝕀/\mathbb{I}\mathbb{I}^{/}.

The natural order on $\mathbb{R}$ and $\mathbb{I}$ is ambiguous on $\mathbb{T}$. Whilst it is normal to define circle order as a ternary relation (without reference to coordinates), it will suit us better to use the order induced from the natural coordinate, so that we define $x_{\mathbb{T}}\leq y_{\mathbb{T}}\,\Leftrightarrow\,0\leq x_{\mathbb{I}}\leq y_{\mathbb{I}}<1$. We will then say that a sequence $\left(x_{i}\right)_{i=1}^{n}$ for $n\geq 3$ is in cyclic order if there is a rotation $R_{\alpha}$ under which the full sequence is placed in natural order, ie $\{x_{i}+\alpha\}_{\mathbb{I}}\leq\{x_{i+1}+\alpha\}_{\mathbb{I}}$ for $i=1,\ldots,n-1$. We say $\left(x_{i}\right)_{i=1}^{n}$ for $n\geq 3$ is in anti-cyclic order if the reverse sequence $\left(x_{i}\right)_{i=n}^{1}$ is in cyclic order. A sequence which is in cyclic or anti-cyclic order is simply ordered. A sequence of length $0\leq n\leq 2$ is always both in cyclic and anti-cyclic order.
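A possible test for cyclic order is sketched below. It relies on an observation that is not stated in the text and should be read as an assumption of the sketch: if some rotation places the sequence in natural order, then so does the rotation taking $x_{1}$ to $0$, because $x_{1}$ then has the least coordinate and rotating it back to $0$ causes no wrap-around. The function names are hypothetical, and exact fractions are used to avoid rounding at ties.

```python
from fractions import Fraction

def is_cyclic(points):
    """True if some rotation places the sequence in natural order.
    It is enough to try the rotation that sends the first point to 0."""
    pts = list(points)
    if len(pts) <= 2:
        return True                               # short sequences are always ordered
    coords = [(p - pts[0]) % 1 for p in pts]      # coordinates after rotating by -x_1
    return all(a <= b for a, b in zip(coords, coords[1:]))

def is_anticyclic(points):
    """True if the reversed sequence is in cyclic order."""
    return is_cyclic(list(points)[::-1])

a, b, c = Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)
print(is_cyclic([a, b, c]), is_cyclic([b, c, a]))
print(is_cyclic([b, a, c]), is_anticyclic([b, a, c]))
```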

The ambiguity of order means that a pair of distinct circle points a𝕋<b𝕋a_{\mathbb{T}}<b_{\mathbb{T}} identifies 2 circle intervals defined by the direction of travel from aa to bb. We will find it useful to have 2 ways of resolving this ambiguity. First we use the coordinate order: the cyclic interval 𝕀\mathbb{I} contains interior points xx such that (a,x,b)(a,x,b) are in cyclic order; and the anti-cyclic interval contains xx such that (a,x,b)(a,x,b) are in anti-cyclic order. We call these directed intervals, and if we do not specify a direction, we will take cyclic by default. Secondly we define undirected intervals: the minor interval is the one of smaller measure, the major interval is the one of larger measure. If the measure of both is precisely 1/21/2, we take the minor interval to be the cyclic directed interval, although we will not make use of such intervals in this paper. Note that for directed intervals (a,b)(b,a)(a,b)\neq(b,a) whereas for undirected intervals (of measure not 12\frac{1}{2}) we have (a,b)=(b,a)(a,b)=(b,a) for both minor and major intervals.

In addition to open and closed intervals, we are equally concerned with semi-open intervals. We will use the usual notation of open ()() and closed [][] brackets to indicate whether endpoints are included or not. We will also use || to avoid specifying which is the case, so that |a,b||a,b| represents a particular but unspecified one of the 4 possible interval configurations of endpoints.

We will also use $1$ as a synonym for $0$ where this aids readability, so $[\frac{3}{4},1]$ represents the subset $[\frac{3}{4},1)\cup\{0\}$ in $\mathbb{I}$, and the cyclic directed interval $[\frac{3}{4},0]$ in $\mathbb{T}$.

Any circle interval JJ is also a topological subspace of 𝕋\mathbb{T} equipped with the subspace topology. One subtlety to note is that a subspace may have boundary points in the parent topology, but these are not boundary points in the subspace topology: for example if J=[a,b)J=[a,b) then the interval [a,cJ)[a,c_{J}) is open (not semi-open) in the subspace topology of JJ. With this in mind, it becomes simple to extend the well known characterisation about open sets of \mathbb{R} to arbitrary real or circle intervals. We include a proof for completeness:

Proposition 8.

In the subspace topology of an interval $J$ of either $\mathbb{R}$ or $\mathbb{T}$, an open set is an at most countable union of disjoint open intervals.

Proof.

Let XX be an open set in the topology of JJ. For each xXx_{X} let IxI_{x} be the maximal interval in XX containing xx. If Ix,IyI_{x},I_{y} are not disjoint, their union is an interval, so their maximality requires Ix=IxIy=IyI_{x}=I_{x}\cup I_{y}=I_{y}. Hence XX is a union of disjoint intervals. However XX is also open and so contains no boundary points. Hence each IxI_{x} is an open interval. Finally each interval with interior contains a rational, so any family of disjoint intervals with interior is at most countable. ∎

A partition of the circle of length $n\geq 1$ is an ordered sequence $\left(x_{i}\right)_{i=1}^{n}$. A partition of a circle interval $J$ is a partition of the circle, all of whose points lie in $J$. (Note that this is a slight generalisation of the usual definition: we do not require $J$ to be closed, nor do we require the endpoints of $J$ to be included in the partition).

We introduce the natural involution on $\mathbb{T}$ defined by $\overline{x_{\mathbb{T}}}=-x_{\mathbb{T}}$, which induces the coordinate involutions $\overline{\{\chi x_{\mathbb{T}}\}}=\{\chi\overline{x_{\mathbb{T}}}\}=\{-x_{\mathbb{I}}\}=\{1-x\}$ and $\overline{\{\{\chi^{/}x_{\mathbb{T}}\}\}}=\{\{\chi^{/}\overline{x_{\mathbb{T}}}\}\}=\{\{-x_{\mathbb{I}^{/}}\}\}=-\{\{x_{\mathbb{I}^{/}}\}\}$. Note that for $x_{\mathbb{I}}\neq 0$ we have $\{1-x\}=1-x$, whereas $\{1-0\}=0$. The duality which results from application of this involution we call Circle Duality.

We now introduce the real translation $R_{\alpha}^{\mathbb{R}}:x\mapsto x+\alpha$. This induces the translations (which we will here call rotations) $R_{\alpha_{\mathbb{T}}}^{\mathbb{T}}:x_{\mathbb{T}}\mapsto x_{\mathbb{T}}+\alpha_{\mathbb{T}}$ and $R_{\alpha_{\mathbb{I}}}^{\mathbb{I}}:x_{\mathbb{I}}\mapsto\{x_{\mathbb{I}}+\alpha_{\mathbb{I}}\}$, $R_{\alpha_{\mathbb{I}^{/}}}^{\mathbb{I}^{/}}:x_{\mathbb{I}^{/}}\mapsto\{\{x_{\mathbb{I}^{/}}+\alpha_{\mathbb{I}^{/}}\}\}$. Since each of these has identical structure, we will abuse notation and regard all of them as $R_{\alpha}$.

Note that $\overline{R_{\alpha}}=R_{\overline{\alpha}}=R_{1-\alpha}$ (and $R_{\alpha}=R_{\{\alpha\}}=R_{\alpha+n}$ for $n\in\mathbb{Z}$).

Note that in these systems we have RαNx={x+Nα}R_{\alpha}^{N}x=\{x+N\alpha\} and SN(ϕ,x,Rα)=r=1Nϕ({x+rα})S_{N}(\phi,x,R_{\alpha})=\sum_{r=1}^{N}\phi(\{x+r\alpha\}). Note we can always lift ϕ:𝕋\phi:\mathbb{T}\rightarrow\mathbb{\mathbb{\mathbb{\mathbb{R}}}} to ϕ:\phi^{\uparrow}:\mathbb{\mathbb{\mathbb{\mathbb{\mathbb{R}}}\rightarrow}\mathbb{\mathbb{\mathbb{R}}}} by defining ϕ(x)=ϕ({x})\phi^{\uparrow}(x)=\phi(\{x\}), and by abuse of notation we will simply write ϕ\phi for ϕ\phi^{\uparrow} so for example we can simply write SN(ϕ,x,Rα)=r=1Nϕ(x+rα)S_{N}(\phi,x,R_{\alpha})=\sum_{r=1}^{N}\phi(x+r\alpha). We call SN(ϕ,x,Rα)S_{N}(\phi,x,R_{\alpha}) a homogeneous sum for {x}=0\{x\}=0, and an inhomogeneous sum for {x}0\{x\}\neq 0.
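The definition of $S_{N}(\phi,x,R_{\alpha})$ can be illustrated numerically. The following is only a minimal sketch in floating point arithmetic: the function name `birkhoff_sum` is ours (not from the paper), and the observable `phi` is assumed to be defined on $[0,1)$, with the lift to $\mathbb{R}$ handled by reducing modulo 1.

```python
def birkhoff_sum(phi, x, alpha, N):
    """Return S_N(phi, x, R_alpha) = sum_{r=1}^{N} phi({x + r*alpha}).

    phi is assumed to be defined on [0, 1); x = 0 gives the homogeneous sum.
    """
    total = 0.0
    for r in range(1, N + 1):
        total += phi((x + r * alpha) % 1.0)  # {x + r*alpha}: the r-th orbit point
    return total


# Homogeneous sum of the fractional part itself over a rational rotation.
print(birkhoff_sum(lambda t: t, 0.0, 0.25, 4))
```

For observables which are unbounded at the origin, such as $\cot\pi x$, the caller must ensure that no orbit point lands on 0, which is automatic for the homogeneous sum when $\alpha$ is irrational.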

Duality results: $S_{N}^{\overline{T}}\,\overline{\phi}\,\overline{x}=S_{N}^{T}\phi x$, $S_{N}(-\phi)x=-S_{N}\phi x$, and $S_{N}^{\overline{T}}(-\overline{\phi})\,\overline{x}=-S_{N}^{T}\phi x$.

A partition of the circle of length $q\geq 1$ is a sequence of $q$ circle points with coordinates satisfying $0\leq x_{1}<x_{2}<\ldots<x_{q}<1$. Note that any sequence of distinct points can be reordered to give a partition. The intervals of the partition are the images under $\chi_{\mathbb{T}\mathbb{I}}^{-1}$ of the $q-1$ intervals $(x_{i},x_{i+1})$ for $i<q$, and the interval $(x_{q},1)\bigcup(0,x_{1})$.

The partition is regular if xr+1xr=1qx_{r+1}-x_{r}=\frac{1}{q} for 1r<q1\leq r<q, and also (if q>1)q>1) {x1xq}=1q\{x_{1}-x_{q}\}=\frac{1}{q}. We will only deal with regular partitions, so we will drop the word regular. Note that a partition of length 1 is always regular.

2.5. Number Theory

Algebraic Context. Given some set $A\subseteq\mathbb{R}$ we define $x_{\mathbb{R}}A=\{xa:a\in A\}$. If $A$ is an additive group then so is $xA$, and there is a quotient group $\mathbb{R}/xA$ which is also an additive group with elements $y_{(\mathbb{R}/xA)}=y+xA\coloneqq\{y+xa:a\in A\}$. Further, taking $A=\mathbb{Z}$, the group $z\mathbb{Z}$ is also closed under multiplication iff $z\in\mathbb{Z}$. When $z\in\mathbb{Z}$, $z\mathbb{Z}$ is a commutative ring (without unit for $|z|>1$), and the quotient $\mathbb{Z}/z\mathbb{Z}$ is a commutative ring with unit (a field if $z$ is a prime, but with zero divisors if $z$ is composite).

In this paper we will work primarily in the projective closure of the positive reals, ie the interval $\mathbb{P}=[0,\infty]$ equipped with the involution $R:x\mapsto\frac{1}{x}$, where we define $\infty\coloneqq R0$ (and hence $R\infty=0$, and $R[0,1]=[1,\infty]$). We also define $\infty$ to be an integer, together with the usual operator extensions $\infty+x_{\mathbb{P}}\coloneqq\infty$, $\infty\cdot x_{\mathbb{P}}\coloneqq\infty$, $x_{\mathbb{P}}\leq\infty$.

Number Functions. The most important group of functions for our purposes are the remainder (residue) and floor functions. Recall $\mathbb{I}\coloneqq[0,1)$ and, for $x_{\mathbb{R}}\neq 0$, $x\mathbb{I}=[0,x)$.

For each z0z_{\mathbb{R}}\neq 0 we define the zz-remainder function {.}z:|z|𝕀\{.\}_{z}:\mathbb{R}\rightarrow\left|z\right|\mathbb{I} by {x}zmin{xkz:kzx}\{x\}_{z}\coloneqq\min\{x-k_{\mathbb{Z}}z:kz\leq x\}, ie {x}z\{x\}_{z} is the positive remainder after integer division by zz. Note as z0z\rightarrow 0, {x}z0\{x\}_{z}\rightarrow 0 and so we extend to {x}0=0\{x\}_{0}=0. Note {x+nz}z={x}z\{x+nz\}_{z}=\{x\}_{z} so there is a natural bijection x+z{x}zx+z\mathbb{Z}\longleftrightarrow\{x\}_{z}, ie /z|z|𝕀\mathbb{\mathbb{\mathbb{R}}}/z\mathbb{Z}\leftrightarrow|z|\mathbb{I}. This means we can give |z|𝕀|z|\mathbb{I} group structure via x+z𝕀y{x+y}zx+_{z\mathbb{I}}\,y\coloneqq\{x+y\}_{z}. Similarly /q|q|𝕀\mathbb{Z}/q\mathbb{Z}\longleftrightarrow|q|\mathbb{I}\bigcap\mathbb{Z} and we can add distributive multiplication via r|q|𝕀s={rs}qr\cdot_{|q|\mathbb{I}}s=\{rs\}_{q}.

The signed remainder is {{x}}z={x}z{x}z|z|2|z||z|𝕀/\{\{x\}\}_{z}=\{x\}_{z}-\left\llbracket\{x\}_{z}\geq\frac{\left|z\right|}{2}\right\rrbracket\left|z\right|\in\left|z\right|\mathbb{I}^{/}. The remainder and signed remainder are inverse bijections |z|𝕀/|z|𝕀|z|\mathbb{I}^{/}\longleftrightarrow|z|\mathbb{I}. Together they provide an involution on |z|(𝕀𝕀/)|z|\left(\mathbb{I}\bigcup\mathbb{I}^{/}\right) which fixes |z|[0,12)|z|[0,\frac{1}{2}) and transposes [12,0)[-\frac{1}{2},0) with [12,1)[\frac{1}{2},1).

We also define the floor function xzx{x}z\left\lfloor x\right\rfloor_{z}\coloneqq x-\{x\}_{z} and the smallest distance from xx to zz\mathbb{Z} given by xz=|{{x}}z||z|2𝕀\left\|x\right\|_{z}=\left|\{\{x\}\}_{z}\right|\in\frac{|z|}{2}\mathbb{I}. Note that the latter is a metric on 𝕀\mathbb{I} though not a norm (it fails the requirement of scalar multiplication). It is useful to extend these functions by defining {}=0\{\infty\}=0 from which we also get {{}}=0,=,=0\{\{\infty\}\}=0,\left\lfloor\infty\right\rfloor=\infty,\left\|\infty\right\|=0.

When z=1z=1 we will omit it from the notation. The functions {x},{{x}}\{x\},\{\{x\}\} are then also called fractional parts of xx.
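The four number functions above are easy to realise in exact arithmetic. The sketch below assumes integer or `Fraction` arguments (floating point inputs would introduce rounding at the boundaries); the names `rem`, `signed_rem`, `floor_z` and `dist` are our own labels for $\{x\}_{z}$, $\{\{x\}\}_{z}$, $\left\lfloor x\right\rfloor_{z}$ and $\left\|x\right\|_{z}$.

```python
from fractions import Fraction


def rem(x, z=1):
    """z-remainder {x}_z in [0, |z|), with the convention {x}_0 = 0."""
    if z == 0:
        return 0
    return x % abs(z)  # Python's % is non-negative for a positive modulus


def signed_rem(x, z=1):
    """Signed remainder {{x}}_z in [-|z|/2, |z|/2)."""
    r = rem(x, z)
    if z != 0 and 2 * r >= abs(z):
        return r - abs(z)
    return r


def floor_z(x, z=1):
    """Floor function: x minus its z-remainder."""
    return x - rem(x, z)


def dist(x, z=1):
    """Smallest distance ||x||_z from x to the lattice zZ."""
    return abs(signed_rem(x, z))


x = Fraction(7, 4)
print(rem(x), signed_rem(x), floor_z(x), dist(x))
```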

We can use {x}z\{x\}_{z} to derive other common functions:

The highest common factor is the binary operation (x,y)max{z:{x}z=0&{y}z=0}(x_{\mathbb{R}},y_{\mathbb{R}})\coloneqq\max\{z_{\mathbb{R}}:\{x\}_{z}=0\&\{y\}_{z}=0\}. Note (x,x)=(x,0)=|x|(x,x)=(x,0)=\left|x\right|.

The reflexive & transitive relation ’divides’ is given by z|x{x}z=0z|x\coloneqq\;\{x\}_{z}=0 (so we allow 0|x0|x), and the equivalence relation ’congruent’ by xymodz{x}z={y}zx\equiv y\bmod z\;\coloneqq\;\{x\}_{z}=\{y\}_{z}. Now {x}z={y}zx+z=y+z\{x\}_{z}=\{y\}_{z}\Longleftrightarrow x+z\mathbb{Z}=y+z\mathbb{Z} so that there is a natural bijection between [0,z)[0,z) and /z\mathbb{\mathbb{\mathbb{R}}}/z\mathbb{Z} giving [0,z)[0,z) a group structure by x[0,z)[x]x_{[0,z)}\leftrightarrow[x], and a ring structure on {r}r=0q1\{r\}_{r=0}^{q-1} induced from /q\mathbb{Z}/q\mathbb{Z}.

The signum function $\operatorname{sgn}$ maps $x_{\mathbb{R}}$ to its sign: $\operatorname{sgn}(x_{\mathbb{R}})=\left\llbracket x>0\right\rrbracket-\left\llbracket x<0\right\rrbracket$. (Note $\operatorname{sgn}(0)=0$.)

Residues. When $z,x$ are natural numbers we call $\{x\}_{z}$ a residue of $z$, noting that there are just $z$ possible residues, namely the set $\{i_{\mathbb{N}}:0\leq i<z\}$. We have the basic result that if $(p_{\mathbb{N}},q_{\mathbb{N}})=1$ then $\left\{\,\{r_{\mathbb{N}}p\}_{q}\,\right\}_{r=1}^{q-1}=\{r\}_{r=1}^{q-1}$, ie the postfix operator $\times p$ induces a permutation on the non-zero residues of $q$. In particular for $q>1$ there is a unique $1\leq r_{p}\leq q-1$ with $\{r_{p}p\}_{q}=1$, which we designate $\{p\}_{q}^{-1}$.

The remainder $\{.\}_{q}$ induces $+_{q\mathbb{I}}$ by $\{x\}_{q}+_{q\mathbb{I}}\{y\}_{q}=\{x+y\}_{q}$; we need to show that this is independent of the choice of $x,y$. This follows from

$\{x+y\}_{q}=\left\{\{x\}_{q}+\{y\}_{q}\right\}_{q}=\{x\}_{q}+\{y\}_{q}-\left\llbracket\{x\}_{q}+\{y\}_{q}\geq|q|\right\rrbracket|q|$, and similarly $\{rs\}_{q}=\left\{\{r\}_{q}\{s\}_{q}\right\}_{q}$.
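The permutation property of $\times p$ on non-zero residues, and the resulting inverse $\{p\}_{q}^{-1}$, can be checked directly for small cases. This is an illustrative sketch only: the function names are ours, and the inverse is obtained from Python's built-in three-argument `pow` (available from Python 3.8) rather than from any construction in the text.

```python
from math import gcd


def nonzero_residue_images(p, q):
    """The residues {r*p}_q for r = 1, ..., q-1, in order of r."""
    return [(r * p) % q for r in range(1, q)]


def residue_inverse(p, q):
    """The unique r in 1..q-1 with {r*p}_q = 1, assuming (p, q) = 1 and q > 1."""
    if q <= 1 or gcd(p, q) != 1:
        raise ValueError("need q > 1 and p coprime to q")
    return pow(p, -1, q)


# Multiplication by 3 permutes the non-zero residues of 7.
print(nonzero_residue_images(3, 7), residue_inverse(3, 7))
```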

Diophantine Approximation Given aa_{\mathbb{R}}, the theory of Continued Fractions provides a sequence (prqr)r=0\left(\frac{p_{r}}{q_{r}}\right)_{r=0}^{\infty} (convergents) of exponentially improving rational approximations of aa. These have long provided a primary tool in our area of study, and there are many good introductions, eg Hardy & Wright[6]. We will give a very brief summary from the slightly novel viewpoint of our needs in this paper.

Recall that we can split any xx_{\mathbb{R}} into its integer and fractional parts, ie x=x+{x}x=\left\lfloor x\right\rfloor+\left\{x\right\}. In the sequel it will be useful to adopt the convention that symbols with ./.^{/} indicate real numbers, and symbols without indicate integer numbers. In addition we will regard \infty as a positive integer reciprocal with 0, so that 10=+1\frac{1}{0}\coloneqq\infty=\left\lfloor\infty\right\rfloor+\frac{1}{\infty}. An integer, real therefore means an element of ^={}\widehat{\mathbb{Z}}=\mathbb{Z}\bigcup\{\infty\},^={}\widehat{\mathbb{R}}=\mathbb{R}\bigcup\{\infty\} respectively.

We start with a modernisation of the method of anthyphairesis (alternating subtraction) developed by the early Greeks, probably by the School of Pythagoras, and later a central component of what we now call Euclid’s algorithm.

Definition 9.

The anthyphairetic relation is the recurrence relation $a_{n}^{/}=a_{n}+\frac{1}{a_{n+1}^{/}}$ where $a_{n}=\left\lfloor a_{n}^{/}\right\rfloor$. Given an initial real number $a^{/}$ and setting $a_{0}^{/}=a^{/}$, the relation defines (for $n\geq 0$) two infinite sequences $(a_{n}^{/}),(a_{n})$ of real numbers and their integer parts respectively (see footnote 3). For historical reasons, we call $(a_{n}^{/})$ the sequence of complete quotients of $a^{/}$, and $(a_{n})$ the sequence of partial quotients. The integer sequence $(a_{n})$ is also called the anthyphairesis of the real number $a^{/}$.

Footnote 3 (technical note): Classically, expositions distinguish between finite sequences (for $a_{0}^{/}$ rational) and infinite sequences (for $a_{0}^{/}$ irrational). For our purposes it is simpler to regard all sequences as infinite, extending finite sequences with the integer $\infty$. In practice this makes little difference to the discussion.

Note for n0n\geq 0 that 1an+1/\frac{1}{a_{n+1}^{/}} is the fractional part of an/a_{n}^{/}, and so 1<an+1/1<a_{n+1}^{/}\leq\infty which also means an+11a_{n+1}\geq 1. Also by definition anan/<an+1a_{n}\leq a_{n}^{/}<a_{n}+1 unless an/=an=a_{n}^{/}=a_{n}=\infty.
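For a rational starting value the anthyphairetic relation can be iterated exactly. The sketch below is a minimal illustration under that assumption (irrational inputs would need arbitrary precision arithmetic); the function name `anthyphairesis` and the use of `math.inf` for the integer $\infty$ are our own choices.

```python
from fractions import Fraction
import math


def anthyphairesis(a, n_terms):
    """First n_terms partial quotients a_0, a_1, ... of a rational a.

    Once a complete quotient is an integer, the next is infinity, and the
    sequence is padded with math.inf from then on.
    """
    quotients = []
    x = Fraction(a)
    terminated = False
    for _ in range(n_terms):
        if terminated:
            quotients.append(math.inf)
            continue
        a_n = math.floor(x)           # a_n is the integer part of a_n'
        quotients.append(a_n)
        frac = x - a_n                # equals 1 / a_{n+1}'
        if frac == 0:
            terminated = True         # a_{n+1}' is infinite
        else:
            x = 1 / frac              # a_{n+1}' > 1
    return quotients


print(anthyphairesis(Fraction(43, 19), 6))
```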

In modern terms, the relation defines the anthyphairetic map $\mathscr{A}:a^{/}\mapsto(a_{n})$ of real numbers to sequences of integers. It is easy to see that this map is injective, and so it provides an invertible encoding of reals by integers. The map is not surjective: if $a_{n}=\infty$ then also $a_{n+1}=\infty$ and $a_{n-1}\neq 1$ (with the single exception of $\mathscr{A}(1)=(1,\infty,\infty,\ldots)$). The image of $\mathscr{A}$ is therefore the set of anthyphaireses, a subset of all integer sequences.

The natural question arises as to whether we can construct the inverse map $\mathscr{A}^{-1}$ from the set of anthyphaireses to $[1,\infty]$. Note first that using the anthyphairetic relation recursively gives us $a_{0}^{/}=a_{0}+\frac{1}{a_{1}^{/}}=a_{0}+\frac{1}{a_{1}+\frac{1}{a_{2}^{/}}}=\ldots$. These expressions are called Continued Fractions, although a better modern term might be recursive fractions, namely fractions $\frac{a}{b}$ in which $a,b$ may themselves be recursive fractions. Let us write the $n$-th Continued Fraction as $CF_{n}^{/}(a_{0},\ldots,a_{n-1},a_{n}^{/})$, noting that if we simplify the continued fraction back to a normal fraction we will end up with $a_{0}^{/}=CF_{n}^{/}(a_{0},\ldots,a_{n}^{/})=\frac{P_{n}(a_{0},\ldots,a_{n}^{/})}{Q_{n}(a_{0},\ldots,a_{n}^{/})}$, a rational polynomial in $a_{0},\ldots,a_{n}^{/}$. If $a_{0}^{/}$ is irrational, so is $a_{n}^{/}$, and we have simply shifted the problem of constructing the inverse from $a_{0}^{/}$ to $a_{n}^{/}$ (see footnote 4).

Footnote 4: There is an important exception: if the anthyphairesis is eventually periodic, Euler/Lagrange showed that $a_{0}^{/}$ is a root of a quadratic equation which can be recovered from the sequence.

However if $a_{n}^{/}$ is itself an integer, $a_{0}^{/}$ is rational and easy to calculate, and so we consider the effect of replacing $a_{n}^{/}$ with its integer part $a_{n}$. This gives us a derived sequence of Continued Fractions $CF_{n}=CF_{n}^{/}(a_{0},\ldots,a_{n})$, each of which now simplifies to some normal fraction $\frac{p_{n}}{q_{n}}$. For example $CF_{0}=\frac{p_{0}}{q_{0}}=a_{0}$, $CF_{1}=\frac{p_{1}}{q_{1}}=a_{0}+\frac{1}{a_{1}}=\frac{a_{0}a_{1}+1}{a_{1}}$, $CF_{2}=\frac{p_{2}}{q_{2}}=a_{0}+\frac{1}{a_{1}+\frac{1}{a_{2}}}=\frac{a_{0}a_{1}a_{2}+a_{0}+a_{2}}{a_{1}a_{2}+1}$ etc.

The remarkable results of Continued Fraction theory are now that

(2.1) a0/pnqn=(1)nqn(an+1/qn+qn1)a_{0}^{/}-\frac{p_{n}}{q_{n}}=\frac{(-1)^{n}}{q_{n}\left(a_{n+1}^{/}q_{n}+q_{n-1}\right)}

This means that (pnqn)\left(\frac{p_{n}}{q_{n}}\right) is a sequence whose limit is a0/a_{0}^{/}, ie A1(ar)=limnpnqnA^{-1}(a_{r})=\lim_{n\rightarrow\infty}\frac{p_{n}}{q_{n}}, and we call pnqn\frac{p_{n}}{q_{n}} a convergent of a0/a_{0}^{/}. Further, these are the best possible rational approximations given the size of denominator, meaning that |a0/pq|<|a0/pnqn|\left|a_{0}^{/}-\frac{p}{q}\right|<\left|a_{0}^{/}-\frac{p_{n}}{q_{n}}\right| requires q>qnq>q_{n}. Indeed each approximation is actually stronger: the (stricter) inequality |qa0/p|<|qna0/pn|\left|qa_{0}^{/}-p\right|<\left|q_{n}a_{0}^{/}-p_{n}\right| in fact requires qqn+1q\geq q_{n+1}.

We will need two other basic results of Continued Fraction theory. We assume each pnqn\frac{p_{n}}{q_{n}} is in lowest terms. Then first there is an invariant of the sequence given by the identity:

(2.2) pn+1qnpnqn+1=(1)np_{n+1}q_{n}-p_{n}q_{n+1}=(-1)^{n}

Second, (pn),(qn)(p_{n}),(q_{n}) both satisfy the continuant recurrence relation rn=anrn1+rn2,r_{n}=a_{n}r_{n-1}+r_{n-2},which means:

pn\displaystyle p_{n} =anpn1+pn2\displaystyle=a_{n}p_{n-1}+p_{n-2}
qn\displaystyle q_{n} =anqn1+qn2\displaystyle=a_{n}q_{n-1}+q_{n-2}

Note that from p0q0=a0\frac{p_{0}}{q_{0}}=a_{0} we get p0=a0,q0=1p_{0}=a_{0},q_{0}=1 and from p1q1=a0+1a1=a0a1+1a1\frac{p_{1}}{q_{1}}=a_{0}+\frac{1}{a_{1}}=\frac{a_{0}a_{1}+1}{a_{1}} we get p1=a0a1+1p_{1}=a_{0}a_{1}+1,q1=a1q_{1}=a_{1}. Noting the recurrence relation can also be written in descending form as rn2=rnanrn1r_{n-2}=r_{n}-a_{n}r_{n-1}, we can then deduce the simpler initial conditions:

p1\displaystyle p_{-1} =1,p2=0\displaystyle=1,p_{-2}=0
q1\displaystyle q_{-1} =0,q2=1\displaystyle=0,q_{-2}=1
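The continuant recurrence with the initial conditions above gives a direct way to generate convergents, and the invariant (2.2) can then be checked numerically. This is a sketch under our own naming (`convergents` is not a name from the paper); it takes the partial quotients $a_{0},a_{1},\ldots$ as a finite list of integers.

```python
def convergents(partial_quotients):
    """Return [(p_0, q_0), (p_1, q_1), ...] via r_n = a_n r_{n-1} + r_{n-2}."""
    p_prev2, p_prev1 = 0, 1   # p_{-2}, p_{-1}
    q_prev2, q_prev1 = 1, 0   # q_{-2}, q_{-1}
    result = []
    for a_n in partial_quotients:
        p_n = a_n * p_prev1 + p_prev2
        q_n = a_n * q_prev1 + q_prev2
        result.append((p_n, q_n))
        p_prev2, p_prev1 = p_prev1, p_n
        q_prev2, q_prev1 = q_prev1, q_n
    return result


pq = convergents([2, 3, 1, 4])   # partial quotients of 43/19
print(pq)

# Invariant (2.2): p_{n+1} q_n - p_n q_{n+1} = (-1)^n
for n in range(len(pq) - 1):
    assert pq[n + 1][0] * pq[n][1] - pq[n][0] * pq[n + 1][1] == (-1) ** n
```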

Continued Fraction theory began from the Greeks’ algebraic interest in the relations between natural numbers, rationals and reals, and this algebraic focus continued into exploring their connection with the solution of Diophantine equations (ie solutions in integers or rationals). Our interest by contrast is in using the results of the theory to study Dynamical Systems consisting of rotations of the circle. This changes our view of the first class citizens of the theory. So whereas the first class citizens in classical theory are rational convergents and approximation errors, ours are periods of closest return, and the periods of the associated return errors. Let us unpack this.

Given a Dynamical System $(X,T)$ with $(X,d)$ a metric space, recall that a period of closest return (or quasiperiod) to $x_{0}$ is an integer $n>0$ such that $d(T^{m}x_{0},x_{0})<d(T^{n}x_{0},x_{0})$ requires $m>n$. In the system $(\mathbb{T},R_{a_{0}^{/}})$ (being rotations of the circle through $a_{0}^{/}$ revolutions), the “best possible” results of classical Continued Fraction theory are equivalent to the result that each $q_{n}$ is a quasiperiod. The approximation error $q_{n}a_{0}^{/}-p_{n}=\frac{(-1)^{n}}{a_{n+1}^{/}q_{n}+q_{n-1}}$ is equivalent to saying that $q_{n}$ rotations through $a_{0}^{/}$ return to a (directed) distance of $\frac{(-1)^{n}}{a_{n+1}^{/}q_{n}+q_{n-1}}$ revolutions from the starting point, ie it is the return error, which has an error period of $a_{n+1}^{/}q_{n}+q_{n-1}$. Since this will be of primary interest we will give it its own notation: motivated by the continuant relation for quasiperiods, namely $q_{n+1}=a_{n+1}q_{n}+q_{n-1}$, we will denote the $n$-th error period $q_{n+1}^{/}\coloneqq a_{n+1}^{/}q_{n}+q_{n-1}$ (see footnote 5).

Footnote 5 (historical note): The notation $q_{n+1}^{/}$ was in fact used by Hardy & Wright[6], but purely as a convenient local shorthand for the expression $a_{n+1}^{/}q_{n}+q_{n-1}$ in the proof of the inequality $\left|\alpha-\frac{p_{n}}{q_{n}}\right|<\frac{1}{q_{n}q_{n+1}}$. It does not seem to have been named, or used in further theory development. In fact Lang[8] does not use any shorthand at all, always writing $a_{n+1}^{/}q_{n}+q_{n-1}$ in full. We are not aware of distinct names being introduced for $(q_{n}),(q_{n}^{/})$ previously.

In summary, the sequences an,an/a_{n},a_{n}^{/} are first class citizens of both classical theory and of the theory in this paper. However classical theory then focuses on the convergents pnqn\frac{p_{n}}{q_{n}} and their numerators/denominators, whereas our focus is on the periods qn,qn/q_{n},q_{n}^{/}. For convenience we give a formal definition:

Definition 10.

Given a real number $a_{0}^{/}$ with full quotient sequence $(a_{n}^{/})$ and partial quotient sequence $(a_{n})$, we also define for $n\geq 0$ the quasiperiod sequence of $a_{0}^{/}$ as $(q_{n}=a_{n}q_{n-1}+q_{n-2})$ and the error-period sequence of $a_{0}^{/}$ as $(q_{n}^{/}=a_{n}^{/}q_{n-1}+q_{n-2})$, where $q_{-1}=0,q_{-2}=1$. Although of less interest, we also define $p_{n},p_{n}^{/}$ analogously.
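As an illustration of Definition 10, the quasiperiods and error periods of a rational $a_{0}^{/}$ can be computed exactly and compared with the return error formula (2.1). The sketch assumes a rational input (so the sequences stop when a full quotient becomes an integer); the function name `periods` and the returned layout are our own.

```python
from fractions import Fraction
import math


def periods(a0):
    """Return (p, q, q_err): p_n, q_n and error periods q_n' of a rational a0."""
    p = [0, 1]            # p_{-2}, p_{-1}
    q = [1, 0]            # q_{-2}, q_{-1}
    q_err = []
    x = Fraction(a0)      # current full quotient a_n'
    while True:
        a_n = math.floor(x)
        q_err.append(x * q[-1] + q[-2])    # q_n' = a_n' q_{n-1} + q_{n-2}
        p.append(a_n * p[-1] + p[-2])      # p_n  = a_n  p_{n-1} + p_{n-2}
        q.append(a_n * q[-1] + q[-2])      # q_n  = a_n  q_{n-1} + q_{n-2}
        if x == a_n:                       # next full quotient is infinite
            return p[2:], q[2:], q_err
        x = 1 / (x - a_n)


a0 = Fraction(43, 19)
p, q, q_err = periods(a0)
print(q, q_err)

# Return error (2.1): a0 - p_n/q_n = (-1)^n / (q_n q_{n+1}')
for n in range(len(q) - 1):
    assert a0 - Fraction(p[n], q[n]) == Fraction((-1) ** n) / (q[n] * q_err[n + 1])
```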

Finally we remark that a rotation through a0/1a_{0}^{/}\geq 1 is equivalent to a rotation through α={a0/}<1\alpha=\{a_{0}^{/}\}<1. In this paper we focus on the dynamics of irrational rotations, and so we will adopt the convention that the symbol α\alpha is used to represent a real number (usually irrational) with 0α{a0/}<10\leq\alpha\coloneqq\{a_{0}^{/}\}<1, noting that A(α)=CF(0,a1,a2)A(\alpha)=CF(0,a_{1},a_{2}\ldots), ie a0=0a_{0}=0 and hence also α=a0/(α)=1a1/\alpha=a_{0}^{/}(\alpha)=\frac{1}{a_{1}^{/}}. It also gives p0=0,p1=1p_{0}=0,p_{1}=1.

Definition 11 (Diophantine type functions).

In our estimates we find that it is useful to establish bounds on the growth of the sequences $(q_{n}),(q_{n}^{/})$. Since $q_{n}=a_{n}q_{n-1}+q_{n-2}$ we have for $n\geq 1$ that $a_{n}q_{n-1}\leq q_{n}<(a_{n}+1)q_{n-1}$, so that $\prod_{r=1}^{n}a_{r}\leq q_{n}<\prod_{r=1}^{n}(a_{r}+1)$ with equality only for $n=1$. Define $a_{n}^{\max}=\max_{r\leq n}a_{r}$; then $q_{n}<(a_{n}^{\max}+1)^{n}$. Note $a_{n}^{\max}$ is an increasing positive sequence. Define $a_{n}^{/\max}$ analogously, together with $A_{n}=\max_{r\leq n}\left(\frac{q_{r}}{q_{r-1}}\right)$ and $A_{n}^{/}=\max_{r\leq n}\left(\frac{q_{r}^{/}}{q_{r-1}}\right)$. Note that $a_{n}^{\max}<a_{n}^{/\max}<a_{n}^{\max}+1$, $A_{n}<A_{n}^{/}<A_{n}+1$, $a_{n}^{\max}<A_{n}<a_{n}^{\max}+1$, and $a_{n}^{/\max}<A_{n}^{/}<a_{n}^{/\max}+1$. All are $O(a_{n}^{\max})$, and $A_{n}^{/}<a_{n}^{\max}+2$.

Estimates relating $n,q_{n}$. Since $q_{n}=a_{n}q_{n-1}+q_{n-2}$ we get $q_{n}=\left(a_{n}a_{n-1}+1\right)q_{n-2}+a_{n}q_{n-3}\geq 2q_{n-2}$ for $n\geq 2$ (equality only for $n=2$). Hence for $n$ even $q_{n}\geq 2^{n/2}$ for $n\geq 2$, and for $n$ odd $q_{n}\geq 2^{(n-1)/2}q_{1}$ for $n\geq 1$. Hence $n\leq\frac{2}{\log 2}\log q_{n}$ for $n$ even, and $n\leq 1+\frac{2}{\log 2}(\log q_{n}-\log q_{1})$ for $n$ odd.

Writing $\phi\coloneqq\frac{1}{2}\left(\sqrt{5}+1\right)$ for the golden ratio, we can get a slightly better estimate by observing that $q_{n}$ increases with increasing $a_{n}$, and so the lowest quasiperiods are those of the golden rotation, namely $\phi-1=\frac{1}{2}\left(\sqrt{5}-1\right)$ with $a_{i}=1$. Hence $q_{n}\geq F_{n+1}=\frac{1}{\sqrt{5}}\left(\phi^{n+1}-(-\phi)^{-(n+1)}\right)$, giving $n\leq\frac{1}{\log\phi}\left(\log q_{n}+\log\sqrt{5}-\log\left(1-\phi^{-2(n+1)}\right)\right)-1$. Since $\phi^{-2(n+1)}$ is a decreasing function of $n$ we have for $n\geq 1$, $n\leq\frac{1}{\log\phi}\left(\log q_{n}+\log\sqrt{5}-\log\left(1-\phi^{-4}\right)\right)-1$. Now $1-\phi^{-4}=\phi^{-2}(\phi^{2}-\phi^{-2})=\phi^{-2}F_{2}\sqrt{5}$ and $F_{2}=1$, so that $\log\sqrt{5}-\log\left(1-\phi^{-4}\right)=2\log\phi$. So we have $n\leq\frac{\log q_{n}}{\log\phi}+1$ for $n\geq 1$, with equality only for $n=1$ and $q_{1}=1$ (and in fact the strict inequality still holds for $n=0$).

Also qn=r=1nqrqr1r=1nAr(An)nq_{n}=\prod_{r=1}^{n}\frac{q_{r}}{q_{r-1}}\leq\prod_{r=1}^{n}A_{r}\leq\left(A_{n}\right)^{n}. Hence

(2.3) logqnlogAnnlogqnlogϕ+1\frac{\log q_{n}}{\log A_{n}}\leq n\leq\frac{\log q_{n}}{\log\phi}+1

where the left inequality holds for $q_{n}>1$ (and hence also $A_{n}>1$), and the right inequality holds for $n\geq 0$ (see footnote 6).

Footnote 6: In this context it is tempting to index the $a_{r}$ from 0 so $\alpha=[0;a_{0},a_{1},\ldots]$, as this gives the nicely corresponding sequences $(a_{r})_{r=0}^{\infty},(b_{r})_{r=0}^{\infty},(q_{r})_{r=0}^{\infty}$ with $b_{r}\leq a_{r}$. However this would be a significant departure from a strong historic convention which works well in a wider context. We will therefore adhere to the established convention so that $\alpha=[0;a_{1},a_{2},\ldots]$ and $q_{r+1}=a_{r+1}q_{r}+q_{r-1}$, as elsewhere in the literature.
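The two-sided estimate (2.3) can be checked numerically for any finite list of partial quotients. The following floating point sketch uses our own function name `growth_bounds` and takes the quotients $a_{1},a_{2},\ldots$ of $\alpha=[0;a_{1},a_{2},\ldots]$; the lower bound is reported as `None` when $q_{n}=1$, where the left inequality is not claimed.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio


def growth_bounds(a):
    """For a = [a_1, ..., a_m] return rows (n, q_n, lower, upper) of (2.3)."""
    q_prev, q = 0, 1          # q_{-1}, q_0
    A = 0.0                   # running maximum A_n of q_r / q_{r-1}
    rows = []
    for n, a_n in enumerate(a, start=1):
        q_prev, q = q, a_n * q + q_prev
        A = max(A, q / q_prev)
        lower = math.log(q) / math.log(A) if q > 1 else None
        upper = math.log(q) / math.log(PHI) + 1
        rows.append((n, q, lower, upper))
    return rows


for row in growth_bounds([1, 2, 3, 4]):
    print(row)
```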

Continued Fractions Preliminary Results. We will need some additional simple results in following sections. Given the maturity of CF theory, it is likely that these results are not new; however we have not been able to find them (perhaps they are hidden by notational differences), and so we will give a proof here.

Lemma 12.

pn(1)n+1qn11modqnp_{n}\equiv(-1)^{n+1}q_{n-1}^{-1}\bmod q_{n} where r1r^{-1} signifies the inverse of rmodqr\bmod q, ie 0r1<q0\leq r^{-1}<q with rr11modqrr^{-1}\equiv 1\bmod q. For α=a0/<1\alpha=a_{0}^{/}<1 this is equivalent to pn=qn11p_{n}=q_{n-1}^{-1} (nn odd) and pn=qnqn11p_{n}=q_{n}-q_{n-1}^{-1} (nn even).

Proof.

The first part follows directly from expressing (2.2) modqn\bmod q_{n}. For the second part, note for α<1\alpha<1 we have pn<qnp_{n}<q_{n}, and the result follows. ∎
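Lemma 12 is straightforward to test on examples. The sketch below assumes $\alpha<1$ (so $a_{0}=0$) and skips indices with $q_{n}=1$, where the residue inverse is degenerate; the function names are ours, and the modular inverse again comes from Python's built-in `pow` (Python 3.8 or later).

```python
def convergents(a):
    """[(p_0, q_0), (p_1, q_1), ...] from partial quotients a = [a_0, a_1, ...]."""
    p_prev2, p_prev1, q_prev2, q_prev1 = 0, 1, 1, 0
    out = []
    for a_n in a:
        p_n, q_n = a_n * p_prev1 + p_prev2, a_n * q_prev1 + q_prev2
        out.append((p_n, q_n))
        p_prev2, p_prev1, q_prev2, q_prev1 = p_prev1, p_n, q_prev1, q_n
    return out


def check_lemma_12(a):
    """True if p_n = inv (n odd) or q_n - inv (n even), inv = q_{n-1}^{-1} mod q_n."""
    pq = convergents(a)
    for n in range(1, len(pq)):
        p_n, q_n = pq[n]
        if q_n == 1:
            continue                      # no non-zero residues to invert
        inv = pow(pq[n - 1][1], -1, q_n)  # q_{n-1}^{-1} mod q_n
        expected = inv if n % 2 == 1 else q_n - inv
        if p_n != expected:
            return False
    return True


print(check_lemma_12([0, 2, 3, 1, 4]))   # alpha = 19/43
```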

Lemma 13.

We have the equalities qn/=1nar/q_{n}^{/}=\prod_{1}^{n}a_{r}^{/} and pn/qn/=a0/\frac{p_{n}^{/}}{q_{n}^{/}}=a_{0}^{/}

Proof.

Using the recurrence relations we have qn+1/=an+1/qn+qn1=an+1/(anqn1+qn2)+qn1q_{n+1}^{/}=a_{n+1}^{/}q_{n}+q_{n-1}=a_{n+1}^{/}\left(a_{n}q_{n-1}+q_{n-2}\right)+q_{n-1}. The anthyphairetic relation gives an+1/an+1=an+1/an/a_{n+1}^{/}a_{n}+1=a_{n+1}^{/}a_{n}^{/} and hence qn+1/=an+1/(an/qn1+qn2)=an+1/qn/q_{n+1}^{/}=a_{n+1}^{/}\left(a_{n}^{/}q_{n-1}+q_{n-2}\right)=a_{n+1}^{/}q_{n}^{/}. This gives us qn/=(1nar/)q0/q_{n}^{/}=\left(\prod_{1}^{n}a_{r}^{/}\right)q_{0}^{/}. Now q0/=a0/q1+q2=1q_{0}^{/}=a_{0}^{/}q_{-1}+q_{-2}=1, and the first result follows. A similar argument gives pn/=(1nar/)p0/p_{n}^{/}=\left(\prod_{1}^{n}a_{r}^{/}\right)p_{0}^{/} and p0/=a0/p1+p2=a0/p_{0}^{/}=a_{0}^{/}p_{-1}+p_{-2}=a_{0}^{/}, and the second result follows. ∎
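Both equalities of Lemma 13 can be confirmed in exact arithmetic for a rational $a_{0}^{/}$. This is a minimal sketch with our own function name; it builds $p_{n}^{/},q_{n}^{/}$ from the recurrences of Definition 10 and compares them with the running product of the full quotients.

```python
from fractions import Fraction
import math


def check_lemma_13(a0):
    """True if q_n' = prod_{r=1}^{n} a_r' and p_n'/q_n' = a0 for all available n."""
    a0 = Fraction(a0)
    p = [0, 1]                     # p_{-2}, p_{-1}
    q = [1, 0]                     # q_{-2}, q_{-1}
    product = Fraction(1)          # running product a_1' a_2' ... a_n'
    x, n = a0, 0                   # x is the full quotient a_n'
    while True:
        if n >= 1:
            product *= x
        p_err = x * p[-1] + p[-2]  # p_n' = a_n' p_{n-1} + p_{n-2}
        q_err = x * q[-1] + q[-2]  # q_n' = a_n' q_{n-1} + q_{n-2}
        if q_err != product or p_err / q_err != a0:
            return False
        a_n = math.floor(x)
        p.append(a_n * p[-1] + p[-2])
        q.append(a_n * q[-1] + q[-2])
        if x == a_n:
            return True
        x, n = 1 / (x - a_n), n + 1


print(check_lemma_13(Fraction(43, 19)))
```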

Definition 14.

We define the upper Gauss map $\Gamma:[0,\infty]\rightarrow[1,\infty]$ by $x\mapsto\frac{1}{x-\left\lfloor x\right\rfloor}$, noting this defines the left shift on the sequence of full quotients, ie $\Gamma(a_{n}^{/})=a_{n+1}^{/}$ (see footnote 7).

Footnote 7: It is also the conjugate of the lower Gauss map $\gamma:[0,1]\rightarrow[0,1]$ by $\Gamma=R\gamma R$, where $R$ is the involution $x\mapsto\frac{1}{x}$ on $[0,\infty]$.

Recall we define the dual rotation α¯1α\overline{\alpha}\coloneqq 1-\alpha. For any sequence of observables xrxr(α)x_{r}\coloneqq x_{r}(\alpha) we define the dual observable values x¯rxr(α¯)\overline{x}_{r}\coloneqq x_{r}(\overline{\alpha})

We have $a_{0}^{/}=\alpha$, $a_{n}=\left\lfloor a_{n}^{/}\right\rfloor$, $a_{n+1}^{/}=\frac{1}{a_{n}^{/}-a_{n}}$, so for $\alpha<\frac{1}{2}$ we get $a_{1}=\left\lfloor\frac{1}{\alpha}\right\rfloor\geq 2$, $a_{0}^{/}=\alpha$, $a_{1}^{/}=\frac{1}{\alpha}$, $\overline{a}_{0}^{/}=1-\alpha$, $\overline{a}_{1}^{/}=\frac{1}{1-\alpha}$, and hence $\overline{a}_{2}^{/}=\frac{1}{\overline{a}_{1}^{/}-\overline{a}_{1}}=\frac{1}{\frac{1}{1-\alpha}-1}=\frac{1-\alpha}{\alpha}=\frac{1}{\alpha}-1=a_{1}^{/}-1$. Hence $\overline{a}_{2}=a_{1}-1$.

Lemma 15.

For $0<\alpha<\frac{1}{2}$, the quotients $\overline{a}_{r}^{/},\overline{a}_{r}$ of $\overline{\alpha}$ satisfy the identity $\overline{x}_{r+1}=x_{r}$ for $r\geq 2$, and further $\overline{q}_{r}^{/}$ satisfies it for $r\geq 1$, and $\overline{q}_{r}$ for $r\geq 0$. In addition the early sequence values are $\overline{a}_{0}^{/}=1-\alpha$, $\overline{a}_{1}^{/}=\frac{1}{1-\alpha}$, $\overline{a}_{2}^{/}=\frac{1-\alpha}{\alpha}$, $\overline{a}_{0}=0$, $\overline{a}_{1}=1$, $\overline{a}_{2}=a_{1}-1$, $\overline{q}_{0}^{/}=1$, $\overline{q}_{1}^{/}=\frac{1}{1-\alpha}$ and $\overline{q}_{0}=1$.

Proof.

Standard results from earlier in this section immediately give us a¯0/=1α,q¯0=1,q¯0/=1\overline{a}_{0}^{/}=1-\alpha,\overline{q}_{0}=1,\overline{q}_{0}^{/}=1, and also since 1α<11-\alpha<1, a¯0=0,q¯1/=a¯1/=11α\overline{a}_{0}=0,\overline{q}_{1}^{/}=\overline{a}_{1}^{/}=\frac{1}{1-\alpha} (note that q¯1/q0/\overline{q}_{1}^{/}\neq q_{0}^{/},a¯1/a0/\overline{a}_{1}^{/}\neq a_{0}^{/}).

Now we use 0<α<120<\alpha<\frac{1}{2} so that 12<α¯<1\frac{1}{2}<\overline{\alpha}<1 and hence a¯1/=11α<2\overline{a}_{1}^{/}=\frac{1}{1-\alpha}<2, giving q¯1=a¯1=1=q0\overline{q}_{1}=\overline{a}_{1}=1=q_{0}. We can now calculate a¯2/=1a¯1/a¯1=111α1=1αα=1α1=a1/1\overline{a}_{2}^{/}=\frac{1}{\overline{a}_{1}^{/}-\overline{a}_{1}}=\frac{1}{\frac{1}{1-\alpha}-1}=\frac{1-\alpha}{\alpha}=\frac{1}{\alpha}-1=a_{1}^{/}-1 and hence a¯2=a11\overline{a}_{2}=a_{1}-1. These results then give q¯2=(a11).1+1=a1=q1\overline{q}_{2}=(a_{1}-1).1+1=a_{1}=q_{1} and similarly q¯2/=(a1/1).1+1=a1/=q1/\overline{q}_{2}^{/}=(a_{1}^{/}-1).1+1=a_{1}^{/}=q_{1}^{/}. Finally a¯3/=1a¯2/a¯2=1(a1/1)(a11)=a2/\overline{a}_{3}^{/}=\frac{1}{\overline{a}_{2}^{/}-\overline{a}_{2}}=\frac{1}{\left(a_{1}^{/}-1\right)-\left(a_{1}-1\right)}=a_{2}^{/} and hence also a¯3=a2\overline{a}_{3}=a_{2}. This then gives q¯3/=a¯3/q¯2/=a2/q1/=q2/\overline{q}_{3}^{/}=\overline{a}_{3}^{/}\overline{q}_{2}^{/}=a_{2}^{/}q_{1}^{/}=q_{2}^{/}, and we have established the early sequence values.

Now for n2n\geq 2, we can write a¯n+1/=Γn2a¯3/=Γn2a2/=an/\overline{a}_{n+1}^{/}=\Gamma^{n-2}\overline{a}_{3}^{/}=\Gamma^{n-2}a_{2}^{/}=a_{n}^{/} and hence also a¯n+1=an\overline{a}_{n+1}=a_{n}. The recurrence relations for q¯n+1/,q¯n+1\overline{q}_{n+1}^{/},\overline{q}_{n+1} quickly give the results for these sequences for n2n\geq 2, and the early sequence results extend them to the claimed starting values. ∎
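The index shift of Lemma 15 can be observed on a rational example. The sketch below computes the partial quotients and quasiperiods of $\alpha$ and of $\overline{\alpha}=1-\alpha$ exactly; the helper names are our own, and a rational $\alpha$ with $0<\alpha<\frac{1}{2}$ is assumed.

```python
from fractions import Fraction
import math


def partial_quotients(x):
    """Finite list [a_0, a_1, ...] of partial quotients of a rational x."""
    x, out = Fraction(x), []
    while True:
        a_n = math.floor(x)
        out.append(a_n)
        if x == a_n:
            return out
        x = 1 / (x - a_n)


def quasiperiods(a):
    """[q_0, q_1, ...] from a = [a_0, a_1, ...] (a_0 does not affect q_n)."""
    q_prev, q = 0, 1               # q_{-1}, q_0
    qs = [q]
    for a_n in a[1:]:
        q_prev, q = q, a_n * q + q_prev
        qs.append(q)
    return qs


alpha = Fraction(19, 43)
a, a_bar = partial_quotients(alpha), partial_quotients(1 - alpha)
q, q_bar = quasiperiods(a), quasiperiods(a_bar)
print(a, a_bar)
print(q, q_bar)
```

In this example the dual sequence has one extra leading term, and from then on the quotients and quasiperiods of $\overline{\alpha}$ are those of $\alpha$ shifted by one index.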

We now develop some useful identities

Lemma 16.

Identities:

  1. (1)

    For n0n\geq 0 and a0/a_{0}^{/} irrational we have 1an+1<an+1/=an+1+1an+2/<an+1+11\leq a_{n+1}<a_{n+1}^{/}=a_{n+1}+\frac{1}{a_{n+2}^{/}}<a_{n+1}+1

  2. (2)
  3. (3)

    For n>0n>0, qn<qn/<qn+qn1<min((1+1an)qn,qn+1)2qnq_{n}<q_{n}^{/}<q_{n}+q_{n-1}<\min\left(\left(1+\frac{1}{a_{n}}\right)q_{n},q_{n+1}\right)\leq 2q_{n} Hence r=1nar<qn<qn/=r=1nar/=r=1n(ar+1ar+1/)<r=1n(ar+1)\prod_{r=1}^{n}a_{r}<q_{n}<q_{n}^{/}=\prod_{r=1}^{n}a_{r}^{/}=\prod_{r=1}^{n}(a_{r}+\frac{1}{a_{r+1}^{/}})<\prod_{r=1}^{n}(a_{r}+1)

  4. (4)

    an+1/(an/an)=1a_{n+1}^{/}(a_{n}^{/}-a_{n})=1 (equivalent to an+1/an/=an+1/an+1a_{n+1}^{/}a_{n}^{/}=a_{n+1}^{/}a_{n}+1 hence an+1/an/>2a_{n+1}^{/}a_{n}^{/}>2 and qn+1/>2qn1/q_{n+1}^{/}>2q_{n-1}^{/}) and hence qn/=qn+qn1an+1/<qn(1+1anan+1/)q_{n}^{/}=q_{n}+\frac{q_{n-1}}{a_{n+1}^{/}}<q_{n}\left(1+\frac{1}{a_{n}a_{n+1}^{/}}\right). This is <32qn<\frac{3}{2}q_{n} unless an+1=an=1a_{n+1}=a_{n}=1

  5. (5)

    $\frac{1}{q_{r+2}^{/}}+\frac{a_{r+1}}{q_{r+1}^{/}}=\frac{1}{q_{r}^{/}}\leq\frac{1}{q_{r}}$ (equality only for $r=0$)

  6. (6)

    1qr1qr/=qr1qrqr+1/min{1qr+1/,12qr}12qr\frac{1}{q_{r}}-\frac{1}{q_{r}^{/}}=\frac{q_{r-1}}{q_{r}q_{r+1}^{/}}\leq\min\{\frac{1}{q_{r+1}^{/}},\frac{1}{2q_{r}}\}\leq\frac{1}{2q_{r}} with equality only possible for r=1,q1=1r=1,q_{1}=1

Proof.

$\frac{1}{q_{r+2}^{/}}+\frac{a_{r+1}}{q_{r+1}^{/}}=\frac{q_{r+1}^{/}+a_{r+1}q_{r+2}^{/}}{q_{r+2}^{/}q_{r+1}^{/}}=\frac{\left(1+a_{r+1}a_{r+2}^{/}\right)q_{r+1}^{/}}{q_{r+2}^{/}q_{r+1}^{/}}=\frac{a_{r+1}^{/}a_{r+2}^{/}}{q_{r+2}^{/}}=\frac{1}{q_{r}^{/}}$, and $q_{r}^{/}\geq q_{r}$ with equality only for $r=0$.

$\frac{1}{q_{r}}-\frac{1}{q_{r}^{/}}=\frac{q_{r}^{/}-q_{r}}{q_{r}q_{r}^{/}}=\frac{\left(a_{r}^{/}-a_{r}\right)q_{r-1}}{q_{r}q_{r}^{/}}=\frac{\frac{1}{a_{r+1}^{/}}q_{r-1}}{q_{r}q_{r}^{/}}=\frac{q_{r-1}}{q_{r}q_{r+1}^{/}}$. The inequalities follow from $q_{r-1}\leq q_{r}$ (equality only possible for $r=1,\alpha>\frac{1}{2}$), and $q_{r+1}^{/}>q_{r+1}>2q_{r-1}$.
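The two identities proved above (items (5) and (6) of Lemma 16) can be confirmed in exact arithmetic. This sketch assumes a rational $\alpha$, for which the sequences are finite, and uses our own function name `sequences`; it checks the identities only at indices where every term involved is finite.

```python
from fractions import Fraction
import math


def sequences(alpha):
    """Return (a, q, q_err): partial quotients, quasiperiods, error periods."""
    a, q, q_err = [], [1, 0], []   # q starts as [q_{-2}, q_{-1}]
    x = Fraction(alpha)            # current full quotient a_n'
    while True:
        a_n = math.floor(x)
        q_err.append(x * q[-1] + q[-2])
        q.append(a_n * q[-1] + q[-2])
        a.append(a_n)
        if x == a_n:
            return a, q[2:], q_err
        x = 1 / (x - a_n)


a, q, q_err = sequences(Fraction(19, 43))

# Item (5): 1/q_{r+2}' + a_{r+1}/q_{r+1}' = 1/q_r'
for r in range(len(q) - 2):
    assert 1 / q_err[r + 2] + a[r + 1] / q_err[r + 1] == 1 / q_err[r]

# Item (6): 1/q_r - 1/q_r' = q_{r-1} / (q_r q_{r+1}') <= 1/(2 q_r)
for r in range(1, len(q) - 1):
    gap = Fraction(1, q[r]) - 1 / q_err[r]
    assert gap == Fraction(q[r - 1]) / (q[r] * q_err[r + 1])
    assert gap <= Fraction(1, 2 * q[r])

print(q_err)
```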

Ostrowski Representation. Ostrowski introduced this representation in his paper (##paper) studying the sum of remainders (the Birkhoff sum $\sum_{r=1}^{N}\left\{r\alpha\right\}$). The precise application of the tool in that paper does not seem to have a natural generalisation, but the tool itself has found many other applications since, and it will be a primary tool in our study.

We first recall the idea of a non-standard number system for representing integers. In a standard number system we use a radix q>1q>1 to represent an integer NN by taking the sequence of weights 1=q0<q1<q2<qn1=q^{0}<q^{1}<q^{2}...<q^{n} defined by qnN<qn+1q^{n}\leq N<q^{n+1} and finding integer coefficients br0b_{r}\geq 0 such that N=r=0nbrqrN=\sum_{r=0}^{n}b_{r}q^{r}. In general NN has many such representations, but if we impose the condition that we maximise bnb_{n}, then bn1b_{n-1} and so on down to b0b_{0}, then the brb_{r} are determined and we have a canonical representation. In a non-standard number system we take an arbitrary unbounded sequence of weights 1=q0q1q21=q_{0}\leq q_{1}\leq q_{2}... and apply precisely the same procedure to arrive at a canonical representation N=r=0nbrqrN=\sum_{r=0}^{n}b_{r}q_{r}. (We can allow qr=qr+1q_{r}=q_{r+1} because maximising br+1b_{r+1} before brb_{r} will result in br=0b_{r}=0).

Definition 17.

For NN\in\mathbb{N} the Ostrowski representation Oα(N)O_{\alpha}(N) is the canonical representation N=r=0nbrqrN=\sum_{r=0}^{n}b_{r}q_{r} in the non-standard number system defined by the quasiperiods qrq_{r} of α\alpha.

Preliminary results

In this section we will build a number of simple results from the foregoing which will be useful in our main development.

We note the following characteristics of the Ostrowski representation:

Lemma 18.

Let $\alpha$ have partial quotient sequence $(a_{n})$ and quasiperiod sequence $(q_{n})$, and suppose $N$ has Ostrowski representation $\sum_{r=0}^{n}b_{r}q_{r}$. Then:

  1. (1)

    brar+1b_{r}\leq a_{r+1}

  2. (2)

    b0<a1b_{0}<a_{1}

  3. (3)

    br=ar+1br1=0b_{r}=a_{r+1}\,\Longrightarrow\,b_{r-1}=0

  4. (4)
  5. (5)

    q1=1b0=0q_{1}=1\,\Longrightarrow b_{0}=0

Proof.

The crucial insight is that if $b_{r}q_{r}\geq q_{r+1}$ then $b_{r+1}$ is not the maximal coefficient of $q_{r+1}$. So in the canonical representation we must have $b_{r}q_{r}<q_{r+1}$. Since $q_{r+1}=a_{r+1}q_{r}+q_{r-1}$ and $q_{r}>0$ for $r\geq 0$, this gives $b_{r}<a_{r+1}+\frac{q_{r-1}}{q_{r}}$ for $r\geq 0$. Recall $q_{-1}=0,q_{0}=1,q_{1}\geq 1,q_{2}>1$, and so for $r=0$ we have $b_{0}<a_{1}$, and for $r\geq 1$ (where $q_{r-1}\leq q_{r}$ and $b_{r}$ is an integer) we have $b_{r}\leq a_{r+1}$. Also if $q_{1}=1$ then $a_{1}=1$ and then $b_{0}<a_{1}$ gives $b_{0}=0$. Finally, refining the initial insight, we note that if $b_{r}=a_{r+1}$ then $b_{r}q_{r}+b_{r-1}q_{r-1}=q_{r+1}+(b_{r-1}-1)q_{r-1}$, which again means that $b_{r+1}$ cannot be maximal if $b_{r-1}-1\geq 0$. ∎
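The greedy construction of the Ostrowski representation and the properties of Lemma 18 can be checked numerically. The following is a minimal sketch, assuming $\alpha\in(0,1)$ is given by a finite list of partial quotients $a_{1},a_{2},\ldots$; the function names are our own illustrative choices and not part of the theory.

```python
def quasiperiods(partial_quotients):
    """Return [q_0, q_1, ..., q_m] from a_1..a_m, using
    q_{-1} = 0, q_0 = 1, q_{r+1} = a_{r+1} * q_r + q_{r-1}."""
    q_prev, q_cur = 0, 1
    qs = [q_cur]
    for a in partial_quotients:
        q_prev, q_cur = q_cur, a * q_cur + q_prev
        qs.append(q_cur)
    return qs


def ostrowski_digits(N, partial_quotients):
    """Greedy digits [b_0, ..., b_m] with N = sum(b_r * q_r),
    maximising the coefficient of the largest weight first."""
    qs = quasiperiods(partial_quotients)
    if not 0 <= N < qs[-1]:
        raise ValueError("more partial quotients are needed to represent N")
    digits = [0] * len(qs)
    remainder = N
    for r in range(len(qs) - 1, -1, -1):
        digits[r], remainder = divmod(remainder, qs[r])
    return digits


def satisfies_lemma_18(digits, partial_quotients):
    """Check b_0 < a_1, b_r <= a_{r+1}, (b_r = a_{r+1} => b_{r-1} = 0)
    and (q_1 = 1 => b_0 = 0). Note a[r] holds a_{r+1}."""
    a = partial_quotients
    ok = digits[0] < a[0]
    for r in range(len(a)):
        ok = ok and digits[r] <= a[r]
        if r >= 1 and digits[r] == a[r]:
            ok = ok and digits[r - 1] == 0
    if a[0] == 1:  # q_1 = a_1 = 1
        ok = ok and digits[0] == 0
    return ok
```

For the golden rotation (all $a_{r}=1$) the weights are Fibonacci numbers and the digits reduce to a representation with no two consecutive non-zero digits.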

3. Theory of Hom-set Magmas

3.1. Introduction

Given the Birkhoff sum SN(ϕ,x0)=r=1Nϕ(x0+rα)S_{N}(\phi,x_{0})=\sum_{r=1}^{N}\phi(x_{0}+r\alpha), we can rewrite this in curried form as SNx0ϕS_{N}^{x_{0}}\phi, ie as the action of an operator SNx0S_{N}^{x_{0}} with signature (𝕋)(\mathbb{TR})\mathbb{R} (see**) mapping the function space of observables {ϕ𝕋}\{\phi_{\mathbb{TR}}\} to \mathbb{R}. More specifically therefore, SNx0S_{N}^{x_{0}} is a functional. In later sections we will decompose SNx0S_{N}^{x_{0}} into a sum of many functionals of the same signature. This leads to much repetitive work which can be saved by a structured understanding of a number of related underlying dualities.

In this section we will develop the algebraic theory which contains these dualities, which seems of interest in itself. Amongst other things, it delivers a set of 8 dual relations between functionals of signature $(ST)T$ built from a Source structure $(S,\sigma_{S},R_{S})$ and a Target structure $(T,\tau_{T},R_{T})$ equipped with endomorphisms $\sigma,\tau$ and endorelations $R_{S},R_{T}$. If any one of the dual relations holds, we can immediately deduce the other 7.

We will later (Section 7.2) apply this theory to the environment of $(\mathbb{T},\overline{.},\leq_{\mathbb{T}}),(\mathbb{R},-_{\mathbb{R}},\leq_{\mathbb{R}})$. Although this is an extremely simple environment in which to apply the theory, we have still found this approach has saved duplicated effort, and perhaps more importantly, confusion, by providing a rigorous and systematic approach to manipulating these relations.

The General Hom-set Magma

For a brief summary of free magmas and the height function $h$, and also of hom-sets, see the sections on generated structure and on hom-sets in subsection 3.

Let X={Xα}X=\{X_{\alpha}\} be a set of sets indexed by another set AA. We will call each XαX_{\alpha} a base set, or height 0 set, and set M0(X)XM_{0}(X)\coloneqq X. Recall that we write ABAB for the set of functions from AA to BB. For n1n\geq 1 define Mn(X)={AB:AMn1(X)|BMn1(X)}M_{n}(X)=\left\{AB:A\in M_{n-1}(X)|B\in M_{n-1}(X)\right\} (ie at least one of A,BA,B is in Mn1(X)M_{n-1}(X)). Each mMn(X)m\in M_{n}(X) is a hom-set and we say that it has height nn (written h(m)=nh(m)=n) and is of type (h(A),h(B))(h(A),h(B)). Note that h(AB)=max(h(A),h(B))+1h(AB)=\max(h(A),h(B))+1. Finally we define M(X)=n=0Mn(X)M(X)=\bigcup_{n=0}^{\infty}M_{n}(X) called the Magma of hom-sets on XX. In the sequel we will find it more convenient to refer to the height of ABAB as its Level.
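The height and type bookkeeping above can be made concrete with representation trees. The following is a small sketch, assuming base sets are represented by strings and a hom-set $AB$ by the pair `(A, B)`; this encoding is an illustrative assumption of ours, chosen only to mirror the free magma on the symbols.

```python
def height(m):
    """Height (Level) of a representation tree: base sets have height 0,
    and h(AB) = max(h(A), h(B)) + 1."""
    if isinstance(m, str):
        return 0
    A, B = m
    return max(height(A), height(B)) + 1


def hom_type(m):
    """Type (h(A), h(B)) of a hom-set AB given as the pair (A, B)."""
    A, B = m
    return (height(A), height(B))


# Hom-sets on the base sets S and T.
ST = ("S", "T")                              # functions from S to T
ST_endo = (ST, ST)                           # (ST)(ST)
functionals = (ST, "T")                      # (ST)T
functional_endo = (functionals, functionals) # ((ST)T)((ST)T)
```

With this encoding `height(functionals)` is 2 and `hom_type(functionals)` is `(1, 0)`, matching the description of functionals as operators of type $(n,0)$.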

Note that if A,BA,B are sets then ABAB is an element of P2(A×B)P^{2}\left(A\times B\right) where PP is the power set operator on sets, and hence together with M(X)M(X) itself satisfies the requirements of a set in ZFZF so that there are no foundational issues.

There is however another technical issue: we cannot easily guarantee that the generating set XX is free. Hence M(X)M(X) may not be a free magma, and a given element mm may appear at different levels, with different tree representations. We avoid this issue by regarding M(X)M(X) as a family of sets indexed by the free magma on the symbols Xα{}^{\prime}X_{\alpha}^{\prime}. Then when we talk about ABM(X)AB\in M(X) we must specify which representation tree in the free magma we have in mind, and then the height, width and type of ABAB are well defined.

Further notes on notation and annotation

We will adopt the conventions of operator theory in the sequel, and call functions of Level 2 and above operators. Further, we will call operators of type $(n,0)$ (which map functions to a base set) functionals. We will use Greek letters for functions of Level 1 (which are necessarily of type $(0,0)$), eg $\phi_{X_{\alpha}X_{\beta}}$, and capital letters for functionals, eg $A_{(X_{\alpha}X_{\beta})X_{\beta}}$ (so that we can correctly write $\left(A_{(X_{\alpha}X_{\beta})X_{\beta}}\phi_{X_{\alpha}X_{\beta}}\right)_{X_{\beta}}$). We also need to refer to relations and will use the symbol '$R$' to indicate these. In particular $R_{XX}$ refers to an endorelation on $X$.

For convenience we will also introduce a condensed endo annotation by writing the annotation $T^{\circ}$ for the annotation $TT$, and $T^{\circ\circ}=(TT)^{\circ}=(TT)(TT)$ etc. This applies to both endomorphisms and endorelations. For example the Level 2 endomorphism $\alpha_{(XY)(XY)}$ on $XY$ can be written $\alpha_{\left(XY\right)^{\circ}}$, and the Level 3 endomorphism $\sigma_{\left((ST)(ST)\right)\left((ST)(ST)\right)}$ on $(ST)^{\circ}$ can be written $\sigma_{(ST)^{\circ\circ}}$. Note that the presence of $.^{\circ}$ always indicates that the annotated morphism is an endomorphism. Later, when we hope the reader has become comfortable with which symbols represent endomorphisms or endorelations, we will drop the $.^{\circ}$ annotation altogether, so that the (Level 2) annotated endomorphism $\sigma_{(ST)^{\circ}}$ will be annotated even more succinctly as $\sigma_{ST}$. This of course has the risk of confusing it with a (Level 1) function $\sigma_{ST}$. The quack rule applies: if it looks like an endomorphism, and behaves like an endomorphism, it probably is a higher level endomorphism rather than a lower level function.

We will also adopt the convention of using annotation only when there are changes in the referents, reading left to right. For example in the formula σ(ST)ϕST=ϕσS\sigma_{(ST)^{\circ}}\phi_{ST}=\phi\circ\sigma_{S^{\circ}}, the symbol σ{}^{\prime}\sigma^{\prime} changes referents (from Level 22 of type (1,1)(1,1) to Level 11 of type (0,0)(0,0)) whereas the symbol ϕ{}^{\prime}\phi^{\prime} has the same referent in both places.

Finally we use the symbol {}^{\prime}\circ^{\prime} conventionally as the binary operator which composes morphisms, so that the function composition (fTZgST)SZ(f_{TZ}\circ g_{ST})_{SZ} represents the function sSf(g(s))s_{S}\mapsto f(g(s)), with the appropriate corresponding definition for relations.

Source-Target Magma

In this section we will be working with Levels 0 to 3 of the hom-set magma $M(X)$ on a pair of Level 0 sets $X=\{S,T\}$. We will be focusing on morphisms from $S$ to $T$, and call $S$ the source, and $T$ the target structure. We will here point out the hom-sets of particular interest.

Level 1 hom-sets contain functions of type $(0,0)$, ie between ordered pairs of the Level 0 sets $S,T$. In particular the hom-sets $SS,TT$ (or $S^{\circ},T^{\circ}$) are the endomorphisms of $S$ and $T$ respectively, and $ST$ is the hom-set of functions from the source to the target. The other Level 1 hom-set is $TS$, but we shall not focus on this here.

Level 2 hom-sets contain Level 2 operators (ie involving Level 1 objects), and we shall focus on:

  1. (1)

    (ST)(ST)(ST)(ST) or (ST)(ST)^{\circ}, the hom-set of type (1,1) endomorphisms mapping functions in STST to functions in STST ,

  2. (2)

    (ST)T(ST)T the hom-set of type (1,0)(1,0) operators (functionals) which map functions in STST to “values”, ie to elements of the base set TT.

Level 3 hom-sets contain Level 3 operators (ie involving Level 2 objects), and we shall focus on $\left((ST)T\right)\left((ST)T\right)$ or $\left((ST)T\right)^{\circ}$, the hom-set of endomorphisms of type $(2,2)$ mapping functionals to functionals in $(ST)T$.

We will also develop the concept of a Pull Up. As the name suggests, Pull Ups map objects of one level to a higher level, unlike functionals which map objects down to the base level. Technically therefore pull ups are operators, but there is less value in surfacing them explicitly, and we will simply explore their results. In particular we will introduce pull ups of Level 1 objects to Levels 2 and 3, and of Level 2 objects to Level 3. Objects may be functions or relations. Pull Ups are specialisations of the well known pull back and push forward constructions.

3.2. Pull Ups of EndoMorphisms

Level 2 Pull Ups of Endomorphisms on S,TS,T

Definition 19 (Pull Up Endomorphisms).

Let σSS,τTT\sigma_{SS},\tau_{TT} be Level 1 endomorphisms of S,TS,T respectively.

We now “pull back” σSS\sigma_{SS} to STST to induce a level 2 endomorphism σ(ST)(ST)\sigma_{(ST)(ST)} which is of type (1,1)(1,1) on STST. Using the condensed endomorphism annotation, we have:

(3.1) σ(ST)ϕST(ϕσS)ST\sigma_{(ST)^{\circ}}\coloneqq\phi_{ST}\mapsto(\phi\circ\sigma_{S^{\circ}})_{ST}

We call σ(ST)\sigma_{(ST)^{\circ}} the (source) pull up of σS\sigma_{S^{\circ}} from SS to STST . Similarly we can “push forward” τT\tau_{T^{\circ}} to induce the (target) pull up of τT\tau_{T^{\circ}} from TT to STST, which is again a level 2 endomorphism of type (1,1)(1,1):

(3.2) τ(ST)ϕST(τTϕ)ST\tau_{(ST)^{\circ}}\coloneqq\phi_{ST}\mapsto\left(\tau_{T^{\circ}}\circ\phi\right)_{ST}

Note the reversal of the order of composition between the pull ups of source and target endomorphisms.

Lemma 20.

The pull up of an involution is also an involution.

Proof.

Let σ(ST)\sigma_{\left(ST\right)^{\circ}} be the source pull up of σS\sigma_{S^{\circ}}. Then

(3.3) σ(ST)2ϕ=σ(σϕ)=σ(ϕσS)=(ϕσS)σS=ϕσS2\sigma_{(ST)^{\circ}}^{2}\phi=\sigma(\sigma\phi)=\sigma(\phi\circ\sigma_{S^{\circ}})=(\phi\circ\sigma_{S^{\circ}})\circ\sigma_{S^{\circ}}=\phi\circ\sigma_{S^{\circ}}^{2}

so that if σS\sigma_{S^{\circ}} is an involution (ie σ2=Id\sigma^{2}=Id) then σ(ST)\sigma_{(ST)^{\circ}} is also an involution. A similar argument works for target pull ups. ∎

Lemma 21.

Two source (or target) pull ups commute if their underlying endomorphisms commute. Source and target pull ups always commute.

Proof.

Let σS,ςS\sigma_{S^{\circ}},\varsigma_{S^{\circ}} be two commuting endomorphisms. Using the associativity of composition we get

(3.4) (σ(ST)ς(ST))ϕST=σ(ς(ϕ))=(ϕςS)σS=ϕ(ςσ)=ϕ(σς)=(ς(ST)σ(ST))ϕ\left(\sigma_{(ST)^{\circ}}\circ\varsigma_{(ST)^{\circ}}\right)\phi_{ST}=\sigma(\varsigma(\phi))=(\phi\circ\varsigma_{S^{\circ}})\circ\sigma_{S^{\circ}}=\phi\circ(\varsigma\circ\sigma)=\phi\circ(\sigma\circ\varsigma)=(\varsigma_{(ST)^{\circ}}\circ\sigma_{(ST)^{\circ}})\phi

A similar argument establishes the result for two target endomorphisms. By another similar argument:

(3.5) (σ(ST)τ(ST))ϕST=σ(τ(ϕ))=(τTTϕ)σSS=τ(ϕσ)=τ(ST)(σ(ST)(ϕ))=(τσ)ϕ\left(\sigma_{(ST)^{\circ}}\circ\tau_{(ST)^{\circ}}\right)\phi_{ST}=\sigma(\tau(\phi))=(\tau_{TT}\circ\phi)\circ\sigma_{SS}=\tau\circ(\phi\circ\sigma)=\tau_{(ST)^{\circ}}(\sigma_{(ST)^{\circ}}(\phi))=(\tau\circ\sigma)\phi

and so σ(ST),τ(ST)\sigma_{(ST)^{\circ}},\tau_{(ST)^{\circ}} always commute. ∎
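The pull ups (3.1) and (3.2) are simply pre-composition and post-composition, and Lemmas 20 and 21 can be spot-checked on a finite example. The sketch below assumes $S=\{0,\ldots,5\}$ with $\sigma s=5-s$, a second endomorphism $\varsigma s=(s+3)\bmod 6$ commuting with $\sigma$, and $T=\mathbb{Z}$ with $\tau t=-t$; these choices and all names are illustrative only.

```python
def source_pull_up(sigma):
    """Level 2 source pull up (3.1): phi -> phi o sigma."""
    return lambda phi: (lambda s: phi(sigma(s)))


def target_pull_up(tau):
    """Level 2 target pull up (3.2): phi -> tau o phi."""
    return lambda phi: (lambda s: tau(phi(s)))


def same_function(f, g, domain):
    """Pointwise equality of two functions on a finite domain."""
    return all(f(s) == g(s) for s in domain)


S = range(6)
sigma = lambda s: 5 - s            # involution on S (illustrative)
varsigma = lambda s: (s + 3) % 6   # commutes with sigma on S (illustrative)
tau = lambda t: -t                 # involution on T = integers (illustrative)
phi = lambda s: s * s + 1          # a sample Level 1 function in ST

sigma_ST = source_pull_up(sigma)
varsigma_ST = source_pull_up(varsigma)
tau_ST = target_pull_up(tau)

# Lemma 20: the pull up of an involution is an involution.
involution_ok = same_function(sigma_ST(sigma_ST(phi)), phi, S)
# Lemma 21: commuting source pull ups, and source/target pull ups.
source_commute_ok = same_function(
    sigma_ST(varsigma_ST(phi)), varsigma_ST(sigma_ST(phi)), S)
mixed_commute_ok = same_function(
    sigma_ST(tau_ST(phi)), tau_ST(sigma_ST(phi)), S)
```

A finite check of this kind is not a proof, but it is a convenient guard against annotation slips when manipulating the order of composition.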

Remark 22.

If two involutions σ,τ\sigma,\tau commute then (στ)2=σ2τ2=Id\left(\sigma\tau\right)^{2}=\sigma^{2}\tau^{2}=Id so that στ\sigma\tau is also an involution.

Level 3 Pull Ups of endomorphisms on $S,T$

We can now use the above results about $ST$ to deduce similar results about $(ST)T$ by replacing $S$ with $ST$ and keeping $T$ unchanged. Note that $(ST)T$ is a Level 2 set of morphisms of type $(1,0)$ and so $\left((ST)T\right)^{\circ}$ is a Level 3 set of type $(2,2)$ morphisms.

Proceeding as in the previous subsection, the pullback of σ(ST)\sigma_{\left(ST\right)^{\circ}} gives us the source pull up:

(3.6) σ((ST)T)\displaystyle\sigma_{\left((ST)T\right)^{\circ}} A(ST)TAσ(ST)\displaystyle\coloneqq A_{(ST)T}\mapsto A\circ\sigma_{\left(ST\right)^{\circ}}

and the pushforward of τT\tau_{T^{\circ}} gives us the target pull up:

(3.7) τ((ST)T)A(ST)TτTA\tau_{\left((ST)T\right)^{\circ}}\coloneqq A_{(ST)T}\mapsto\tau_{T^{\circ}}\circ A

However this time we have an additional endomorphism on the source $ST$, namely $\tau_{\left(ST\right)^{\circ}}$, which we can again pull back. The natural labelling of this new pull up would be $\tau_{\left((ST)T\right)^{\circ}}$ but we have just used that symbol and annotation for the pull up of $\tau_{T^{\circ}}$! Instead, since the new pull up is a source pull up, we will label it $\sigma_{\left((ST)T\right)^{\circ}}^{/}$ to get:

(3.8) σ((ST)T)/\displaystyle\sigma_{\left((ST)T\right)^{\circ}}^{/} A(ST)TAτ(ST)\displaystyle\coloneqq A_{(ST)T}\mapsto A\circ\tau_{\left(ST\right)^{\circ}}

By the second part of Lemma 21 we have immediately that σ((ST)T),τ((ST)T)\sigma_{\left((ST)T\right)^{\circ}},\tau_{\left((ST)T\right)^{\circ}} commute and that σ((ST)T)/,τ((ST)T)\sigma_{\left((ST)T\right)^{\circ}}^{/},\tau_{\left((ST)T\right)^{\circ}} commute. The same part of the lemma also gives us that σ(ST),τ(ST)\sigma_{\left(ST\right)^{\circ}},\tau_{\left(ST\right)^{\circ}} commute, and then we can apply the first part to find that σ((ST)T),σ((ST)T)/\sigma_{\left((ST)T\right)^{\circ}},\sigma_{\left((ST)T\right)^{\circ}}^{/} commute.

It follows by repeated application of Lemma 20 that if $\sigma_{S^{\circ}}$ is an involution, so is $\sigma_{\left((ST)T\right)^{\circ}}$, and if $\tau_{T^{\circ}}$ is an involution, so are $\tau_{\left((ST)T\right)^{\circ}}$ and $\sigma_{\left((ST)T\right)^{\circ}}^{/}$. If both $\sigma_{S^{\circ}},\tau_{T^{\circ}}$ are involutions, so are all the 8 compositions of the three Level 3 pull ups (by Remark 22, since the pull ups commute). Hence we have shown:

Theorem 23.

If $S,T$ are equipped with involutions $\sigma_{S^{\circ}},\tau_{T^{\circ}}$ respectively, then the space of functionals $(ST)T$ is equipped with a $\left(C_{2}\right)^{3}$ group of involutions, generated by the 3 involutions $\sigma,\sigma^{/},\tau$ of $\left((ST)T\right)^{\circ}$ which are themselves induced by $\sigma_{S^{\circ}},\tau_{T^{\circ}}$.
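Theorem 23 can be spot-checked by building the three Level 3 pull ups (3.6), (3.7), (3.8) as higher order functions and forming all 8 compositions. The sketch below assumes $S=T=\mathbb{Z}$ with $\sigma s=-s$ and $\tau t=7-t$, a sample functional $A$, and a small family of test functions; all of these are illustrative assumptions, and equality of functionals is only tested on that family.

```python
from itertools import product


def pull_back(f):
    """g -> g o f (pre-composition with f)."""
    return lambda g: (lambda x: g(f(x)))


def push_forward(f):
    """g -> f o g (post-composition with f)."""
    return lambda g: (lambda x: f(g(x)))


sigma = lambda s: -s       # illustrative involution on S
tau = lambda t: 7 - t      # illustrative involution on T

sigma_ST = pull_back(sigma)         # Level 2 source pull up (3.1)
tau_ST = push_forward(tau)          # Level 2 target pull up (3.2)

sigma_3 = pull_back(sigma_ST)       # (3.6): A -> A o sigma_ST
tau_3 = push_forward(tau)           # (3.7): A -> tau o A
sigma_3_prime = pull_back(tau_ST)   # (3.8): A -> A o tau_ST


def compose(ops):
    """Compose a list of Level 3 operators, rightmost applied first."""
    def composed(A):
        for op in reversed(ops):
            A = op(A)
        return A
    return composed


generators = [sigma_3, sigma_3_prime, tau_3]
group = [compose([g for g, use in zip(generators, bits) if use])
         for bits in product([0, 1], repeat=3)]

A = lambda phi: phi(1) ** 2 + 3 * phi(-2)   # a sample functional in (ST)T
test_functions = [lambda s: s, lambda s: s ** 3 + s + 1, lambda s: 2 * s * s - 5]


def agree(B1, B2):
    """Equality of two functionals on the finite test family."""
    return all(B1(phi) == B2(phi) for phi in test_functions)


all_involutions = all(agree(g(g(A)), A) for g in group)
```

Note that for this $A$ the operators $\sigma^{/}$ and $\tau$ give different results, so the three generators are genuinely distinct here.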

3.3. Pull Ups of EndoRelations

The theory we have developed so far could be given purely within the context of Category Theory (ie in terms of objects and morphisms without defining what we mean by them). However at this stage our progress is easiest if we introduce the standard set theoretic notion of Relations as elements of sets.

Definition 24.

We define a relation RSTR_{ST} between S,TS,T as a subset of S×TS\times T, ie an element of the power set of the Cartesian product of S,TS,T. We write sSRSTtTs_{S}\,R_{ST}\,t_{T} to mean (s,t)RST(s,t)\in R_{ST} and we also define the dual or opposite relation RTSOp={(t,s):(s,t)RST}R_{TS}^{Op}=\left\{(t,s):(s,t)\in R_{ST}\right\}. If S=TS=T, then RR and ROpR^{Op} are called endorelations on SS.

Note that OpOp is an involution on relations, ie (ROp)Op=R(R^{Op})^{Op}=R, and that sRttROpssRt\Leftrightarrow tR^{Op}s.

We now suppose that S,TS,T are equipped with endorelations RSS×S,RTT×TR_{S^{\circ}}\subseteq S\times S,R_{T^{\circ}}\subseteq T\times T in addition to the endomorphisms σS,τT\sigma_{S^{\circ}},\tau_{T^{\circ}}. As usual if we have (s1,s2)RS(s_{1},s_{2})\in R_{S^{\circ}} we will write this as s1RSs2s_{1}\,R_{S^{\circ}}\,s_{2}, and we define the opposite relation RSOpR_{S^{\circ}}^{Op} by s1RSs2s2RSOps1s_{1}\,R_{S^{\circ}}\,s_{2}\Leftrightarrow s_{2}\,R_{S^{\circ}}^{Op}\,s_{1}, and then RTOpR_{T^{\circ}}^{Op} similarly.

Definition 25.

A relation homomorphism from $R_{S^{\circ}}$ to $R_{T^{\circ}}$ is a morphism $\phi_{ST}$ such that $\phi\times\phi$ is an element of $R_{S^{\circ}}R_{T^{\circ}}$, ie $s_{1}R_{S^{\circ}}s_{2}\Leftrightarrow(s_{1},s_{2})\in R_{S^{\circ}}\Rightarrow(\phi s_{1},\phi s_{2})\in R_{T^{\circ}}\Leftrightarrow\phi s_{1}R_{T^{\circ}}\phi s_{2}$. A relation anti homomorphism from $R_{S^{\circ}}$ to $R_{T^{\circ}}$ is a relation homomorphism from $R_{S^{\circ}}$ to $R_{T^{\circ}}^{Op}$, and has the opposite result, ie $s_{1}R_{S}s_{2}\Rightarrow\phi s_{1}R_{T}^{Op}\phi s_{2}\Leftrightarrow\phi s_{2}R_{T}\phi s_{1}$. If $R_{S^{\circ}},R_{T^{\circ}}$ are both functions we call $\phi_{ST}$ a function homomorphism.

Remark 26.

Note that a relation anti-morphism of RSRTR_{S^{\circ}}R_{T^{\circ}} transposes its images under RTR_{T^{\circ}}, so that a composition of a relation morphism and anti-morphism is a relation anti-morphism, whilst a composition of two relation morphisms or two relation anti-morphisms is a relation morphism. Further if ϕ\phi is an RSRTR_{S^{\circ}}R_{T^{\circ}} morphism, then we have s2RSOps1s1RSs2ϕs1RTϕs2ϕs2RTOpϕs1s_{2}R_{S^{\circ}}^{Op}s_{1}\Leftrightarrow s_{1}R_{S^{\circ}}s_{2}\Rightarrow\phi s_{1}R_{T^{\circ}}\phi s_{2}\Leftrightarrow\phi s_{2}R_{T^{\circ}}^{Op}\phi s_{1} so that ϕ\phi is also an RSRTOpR_{S^{\circ}}R_{T^{\circ}}^{Op}, RSOpRTR_{S^{\circ}}^{Op}R_{T^{\circ}} anti morphism, and an RSOpRTOpR_{S^{\circ}}^{Op}R_{T^{\circ}}^{Op} morphism.

Now suppose that σS\sigma_{S^{\circ}} is an RSRSR_{S^{\circ}}R_{S^{\circ}} anti-morphism, and that similarly τT\tau_{T^{\circ}} is an RTRTR_{T^{\circ}}R_{T^{\circ}} anti-morphism. Then if ϕ\phi is also a relation anti-morphism of RSRTR_{S^{\circ}}R_{T^{\circ}}, then σST(ϕ)=ϕσS,τST(ϕ)=τTϕ\sigma_{ST}(\phi)=\phi\circ\sigma_{S^{\circ}},\tau_{ST}(\phi)=\tau_{T}\circ\phi are both relation morphisms of RSRTR_{S^{\circ}}R_{T^{\circ}} and (στ)ϕ(\sigma\circ\tau)\circ\phi is a relation anti-morphism of RSRTR_{S^{\circ}}R_{T^{\circ}}.

Remark 27.

When $R$ is a function we write $xRy$ as $y=R(x)$, so a function homomorphism has $s_{2}=R_{S}(s_{1})\Rightarrow\phi s_{2}=R_{T}(\phi s_{1})$, hence $\phi(R_{S}s_{1})=R_{T}(\phi s_{1})$ and so $\phi\circ R_{S}=R_{T}\circ\phi$.

Level 2 Pull Ups of endorelations from $T$

We define in this section a pull up of an endorelation on $T$. This pull up is not the same operation as the pull up of a function: there are many possible pull ups, and we will only define the operation in one direction, as a pull back from the target. Nevertheless the term pull up still seems appropriate.

Definition 28.

For any subset $Z\subseteq S$, we define the pull up of the endorelation $R_{T}$ to be the endorelation $\left(R_{Z}\right)_{(ST)^{\circ}}$ on $ST$ defined by $\phi_{1}\,R_{Z}\,\phi_{2}\Leftrightarrow(\phi_{1}s\,R_{T^{\circ}}\,\phi_{2}s$ for all $s_{Z})$. If $Z=\emptyset$, $R_{Z}$ is the full relation.

Remark 29.

Recall that we may allow ourselves at any point to change annotation at the risk of confusion. This seems to be a point where improvement in legibility outweighs the risk.

We will use $\sigma$ (without annotation) to signify $\sigma_{S^{\circ}}$, $\sigma_{ST}$ to signify $\sigma_{(ST)^{\circ}}$, and analogously for the symbol $\tau$. We will also call a morphism of $R_{Z}R_{Z}$ an endomorphism of $R_{Z}$.

Remark 30.

Note that if Z/ZZ^{/}\subseteq Z then ϕ1RZϕ2ϕ1RZ/ϕ2\phi_{1}\,R_{Z}\,\phi_{2}\Rightarrow\phi_{1}\,R_{Z^{/}}\,\phi_{2}, ie RZRZ/R_{Z}\subseteq R_{Z^{/}}.

Given ϕ1,ϕ2\phi_{1},\phi_{2} such that ϕ1RZϕ2\phi_{1}R_{Z}\phi_{2} we can now deduce a number of dual results, ie given a result involving RZR_{Z} we can deduce a corresponding result involving RZOpR_{Z}^{Op}. However using we will find it possible, and in fact more convenient, to translate these results back into the alternative form of results in terms of RZR_{Z}.

Proposition 31.

If τ\tau is an anti-endomorphism of RTR_{T^{\circ}} then τST\tau_{ST} is an anti-endomorphism of RZR_{Z}, ie ϕ1RZϕ2(τSTϕ2)RZ(τSTϕ1)\phi_{1}\,R_{Z}\,\phi_{2}\Rightarrow\left(\tau_{ST}\phi_{2}\right)\,R_{Z}\,\left(\tau_{ST}\phi_{1}\right). A similar result holds for endomorphisms.

Proof.

If τ\tau is an anti morphism then for any sSs_{S}, ϕ1sRTϕ2sτ(ϕ2s)RTτ(ϕ1s)(τSTϕ2)sRT(τSTϕ1)s\phi_{1}s\,R_{T^{\circ}}\,\phi_{2}s\Rightarrow\tau(\phi_{2}s)\,R_{T^{\circ}}\,\tau(\phi_{1}s)\Leftrightarrow(\tau_{ST}\phi_{2})s\,R_{T^{\circ}}\,(\tau_{ST}\phi_{1})s. Since this holds trivially for all sZs_{Z}, the result for anti-morphisms follows. The morphism case is similar. ∎

Recall that σZ{σz:zZ}\sigma Z\coloneqq\{\sigma z:z\in Z\}.

Proposition 32.

If σ\sigma is an involution on SS, then σST\sigma_{ST} is an endomorphism of RσZR_{\sigma Z}, ie ϕ1RZϕ2(σSTϕ1)RσZ(σSTϕ2)\phi_{1}R_{Z}\phi_{2}\Rightarrow\left(\sigma_{ST}\phi_{1}\right)\,R_{\sigma Z}\,\left(\sigma_{ST}\phi_{2}\right)

Proof.

Since σ\sigma is an involution, ϕs=ϕ(σ2s)=(ϕσ)(σs)=(σSTϕ)(σs)\phi s=\phi(\sigma^{2}s)=(\phi\circ\sigma)(\sigma s)=(\sigma_{ST}\phi)(\sigma s), so ϕ1sRTϕ2s(σSTϕ1)(σs)RT(σSTϕ2)(σs)\phi_{1}s\,R_{T^{\circ}}\,\phi_{2}s\Rightarrow(\sigma_{ST}\phi_{1})(\sigma s)\,R_{T^{\circ}}\,(\sigma_{ST}\phi_{2})(\sigma s) which establishes the result. ∎

Proposition 33.

If σ\sigma is an involution on SS, and τT\tau_{T} is an (anti) morphism on RTR_{T}, then σSTτST\sigma_{ST}\circ\tau_{ST} is an (anti) morphism of RσZR_{\sigma Z}.

Proof.

This follows immediately by applying 31 followed by 32, eg in the anti morphism case, ϕ1RZϕ2(σSTτST)ϕ2RσZ(σSTτST)ϕ1\phi_{1}R_{Z}\phi_{2}\Rightarrow(\sigma_{ST}\circ\tau_{ST})\phi_{2}\,R_{\sigma Z}\,(\sigma_{ST}\circ\tau_{ST})\phi_{1}

Note that we established σSTτST=τSTσST\sigma_{ST}\circ\tau_{ST}=\tau_{ST}\circ\sigma_{ST} in (3.5), and so the order of application of results above is immaterial.
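Propositions 31 to 33 can be illustrated with the order relation on the integers. The sketch below assumes $S=T=\mathbb{Z}$, $R_{T}$ the relation $\leq$, $\sigma s=-s$, $\tau t=-t$ (an anti-endomorphism of $\leq$), and $Z=\{1,2,3\}$; the functions $\phi_{1},\phi_{2}$ and all names are illustrative assumptions.

```python
def pull_back(f):
    """g -> g o f (source pull up)."""
    return lambda g: (lambda x: g(f(x)))


def push_forward(f):
    """g -> f o g (target pull up)."""
    return lambda g: (lambda x: f(g(x)))


def related_on(Z, phi1, phi2, rel=lambda a, b: a <= b):
    """The pulled-up relation R_Z of Definition 28:
    phi1 R_Z phi2 iff phi1(s) R_T phi2(s) for all s in Z."""
    return all(rel(phi1(s), phi2(s)) for s in Z)


sigma = lambda s: -s   # involution on S (illustrative)
tau = lambda t: -t     # involution on T, order reversing (illustrative)
sigma_ST = pull_back(sigma)
tau_ST = push_forward(tau)

Z = [1, 2, 3]
sigma_Z = [sigma(z) for z in Z]

phi1 = lambda s: s
phi2 = lambda s: 2 * s   # phi1 <= phi2 on Z, but not on sigma_Z

hypothesis = related_on(Z, phi1, phi2)
# Proposition 31 (anti-morphism case): arguments swap, Z is unchanged.
prop_31 = related_on(Z, tau_ST(phi2), tau_ST(phi1))
# Proposition 32: order is kept, but the relation moves to sigma Z.
prop_32 = related_on(sigma_Z, sigma_ST(phi1), sigma_ST(phi2))
# Proposition 33 (anti-morphism case): both effects combined.
prop_33 = related_on(sigma_Z, sigma_ST(tau_ST(phi2)), sigma_ST(tau_ST(phi1)))
```

The example is chosen so that the change of index set matters: the conclusion of Proposition 32 holds on $\sigma Z$ but fails if tested on $Z$ itself.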

Level 3 Pull Ups of endorelations from $T$

Recall (Theorem 23) that if $\sigma,\tau$ are involutions on $S,T$ then $\sigma,\sigma^{/},\tau$ on $(ST)T$ are also commuting involutions and generate an (Abelian) $(C_{2})^{3}$ group of 8 involutions. We wish to investigate how these behave with respect to induced relations on $(ST)T$. We first look at results we can take from the previous section by replacing $S$ by $ST$, $Z\subseteq S$ by $\Psi\subseteq ST$, and $\phi_{ST}$ with $A_{(ST)T}$. Recall that $\sigma_{\left((ST)T\right)^{\circ}}^{/}(A)=A\circ\tau_{\left(ST\right)^{\circ}}$.

Note we can apply proposition 31 to τ(ST)T\tau_{(ST)T}, and proposition 32 to σ(ST)T,σ(ST)T/\sigma_{(ST)T},\sigma_{(ST)T}^{/}. 31 requires τT\tau_{T} to be an anti-morphism on RTR_{T}, and 32 requires σST,σST/\sigma_{ST},\sigma_{ST}^{/} to be involutions on STST. Now σST\sigma_{ST} is an involution if σS\sigma_{S} is, and σST/\sigma_{ST}^{/} is an involution if τT\tau_{T} is. If these hold then 33 will also hold for the pairs σ,τ\sigma,\tau and σ/,τ\sigma^{/},\tau. Hence 31-33 give dual results relating to Id,σ,σ/,τ,στ,σ/τId,\sigma,\sigma^{/},\tau,\sigma\circ\tau,\sigma^{/}\circ\tau but we are missing results for σσ/,σσ/τ\sigma\circ\sigma^{/},\sigma\circ\sigma^{/}\circ\tau which we will now develop.

Proposition 34.

If σST,σST/\sigma_{ST},\sigma_{ST}^{/} are involutions on STST, then σ(ST)Tσ(ST)T/\sigma_{(ST)T}\circ\sigma_{(ST)T}^{/} is itself an involution and also an endomorphism of Rσσ/ΨR_{\sigma\sigma^{/}\Psi}, ie A1RΨA2(σσ/)A1R(σσ/Ψ)(σσ/)A2A_{1}R_{\Psi}A_{2}\Leftrightarrow(\sigma\sigma^{/})A_{1}\,R_{(\sigma\sigma^{/}\Psi)}\,(\sigma\sigma^{/})A_{2}

Proof.

By Theorem 23 σσ/\sigma\circ\sigma^{/} is itself an involution on STST, and the result follows by applying Proposition 32

Proposition 35.

If σST,σST/\sigma_{ST},\sigma_{ST}^{/} are involutions on STST, and τT\tau_{T} is an (anti)morphism of RTR_{T}, then σ(ST)Tσ(ST)T/τ(ST)T\sigma_{(ST)T}\circ\sigma_{(ST)T}^{/}\circ\tau_{(ST)T} is an (anti)morphism of Rσσ/ΨR_{\sigma\sigma^{/}\Psi}, ie A1RΨA2(σσ/τ)A2R(σσ/Ψ)(σσ/τ)A1A_{1}R_{\Psi}A_{2}\Leftrightarrow(\sigma\sigma^{/}\tau)A_{2}\,R_{(\sigma\sigma^{/}\Psi)}\,(\sigma\sigma^{/}\tau)A_{1}

Proof.

By the previous proposition σ(ST)Tσ(ST)T/\sigma_{(ST)T}\circ\sigma_{(ST)T}^{/} is an involution, and the result follows by applying proposition 33

We have now established the following:

Theorem 36.

If σS,τT\sigma_{S},\tau_{T} are involutions on (S,σS,RS),(T,τT,RT)(S,\sigma_{S},R_{S}),(T,\tau_{T},R_{T}) respectively, and τT\tau_{T} is also an anti-morphism on RTR_{T}, then we have 8 dual equivalences involving the induced endorelations RΨ,RσΨ,RτΨ,RστΨ,R_{\Psi},R_{\sigma\Psi},R_{\tau\Psi},R_{\sigma\tau\Psi}, on (ST)T(ST)T and the induced involutions σ(ST)T,σ(ST)T/,τ(ST)T\sigma_{(ST)T},\sigma_{(ST)T}^{/},\tau_{(ST)T}, namely

A1RΨA2\displaystyle A_{1}R_{\Psi}A_{2} σA1RσΨσA2σ/A1RτΨσ/A2σσ/A1RστΨσσ/A2\displaystyle\Leftrightarrow\sigma A_{1}R_{\sigma\Psi}\sigma A_{2}\Leftrightarrow\sigma^{/}A_{1}R_{\tau\Psi}\sigma^{/}A_{2}\Leftrightarrow\sigma\sigma^{/}A_{1}R_{\sigma\tau\Psi}\sigma\sigma^{/}A_{2}
τA1RΨOpτA2στA1RσΨOpστA2σ/τA1RτΨOpσ/τA2σσ/τA1RστΨOpσσ/τA2\Leftrightarrow\tau A_{1}R_{\Psi}^{Op}\tau A_{2}\Leftrightarrow\sigma\tau A_{1}R_{\sigma\Psi}^{Op}\sigma\tau A_{2}\Leftrightarrow\sigma^{/}\tau A_{1}R_{\tau\Psi}^{Op}\sigma^{/}\tau A_{2}\Leftrightarrow\sigma\sigma^{/}\tau A_{1}R_{\sigma\tau\Psi}^{Op}\sigma\sigma^{/}\tau A_{2}

3.4. Special Cases and Application

We now consider 2 special cases of Theorem 36: the case of self dual $\Psi$, and the case of function homomorphisms from $\tau_{ST}$ to $\tau_{T}$ in $\Psi$.

When $\Psi$ is self-conjugate under the involution $\sigma\tau$

Given any $\Phi\subseteq ST$, let $\Psi(\Phi)=\Phi\cup\sigma\tau\Phi$. Note then that $\sigma\tau\Psi=\Psi$ (ie $\Psi$ is self-conjugate under the involution $\sigma\tau=\sigma\circ\tau$), which immediately gives $\tau\Psi=\sigma\Psi$ since $\sigma,\tau$ are commuting involutions. Hence both $\Psi(\Phi)$ and $\Psi(\sigma\Phi)$ are self-conjugate under $\sigma\tau$ and we can interchange $R_{\Psi}$ with $R_{\sigma\tau\Psi}$, and $R_{\sigma\Psi}$ with $R_{\tau\Psi}$ in each of the above dual forms of Theorem 36.

(3.9) A1RΨA2\displaystyle A_{1}R_{\Psi}A_{2} σA1RσΨσA2σ/A1RσΨσ/A2σσ/A1RΨσσ/A2\displaystyle\Leftrightarrow\sigma A_{1}R_{\sigma\Psi}\sigma A_{2}\Leftrightarrow\sigma^{/}A_{1}R_{\sigma\Psi}\sigma^{/}A_{2}\Leftrightarrow\sigma\sigma^{/}A_{1}R_{\Psi}\sigma\sigma^{/}A_{2}
τA1RΨOpτA2στA1RσΨOpστA2σ/τA1RσΨOpσ/τA2σσ/τA1RΨOpσσ/τA2\Leftrightarrow\tau A_{1}R_{\Psi}^{Op}\tau A_{2}\Leftrightarrow\sigma\tau A_{1}R_{\sigma\Psi}^{Op}\sigma\tau A_{2}\Leftrightarrow\sigma^{/}\tau A_{1}R_{\sigma\Psi}^{Op}\sigma^{/}\tau A_{2}\Leftrightarrow\sigma\sigma^{/}\tau A_{1}R_{\Psi}^{Op}\sigma\sigma^{/}\tau A_{2}

Now suppose $\sigma_{S}$ is a relation anti-homomorphism on $R_{S}$ in addition to being an involution. If $\Phi$ is a set of relation (anti) homomorphisms then $\sigma\tau\Phi$ is also a set of relation (anti) homomorphisms by Remark 26, and so the full set $\Psi(\Phi)$ is also a set of relation (anti) homomorphisms. Similarly $\sigma\Phi,\tau\Phi,\Psi(\sigma\Phi)$ are sets of relation anti-homomorphisms (homomorphisms). Finally note that $\Phi,\sigma\tau\Phi\subseteq\Psi(\Phi)$, and $\sigma\Phi,\tau\Phi\subseteq\sigma\Psi(\Phi)$, and by Remark 30, if any of these relations holds, we can then substitute any of the subsets for the full set, eg from $A_{1}R_{\Psi}A_{2}$ we can deduce $\sigma\sigma^{/}A_{1}R_{\Phi}\sigma\sigma^{/}A_{2}$.

When $A_{1},A_{2}$ are function homomorphisms of $\tau_{ST}\tau_{T}$

There is in general no necessary relationship between $\sigma_{(ST)T}^{/}A_{(ST)T}=A\circ\tau_{ST}$ and $\tau_{(ST)T}A_{(ST)T}=\tau_{T}\circ A$. However this changes when $A$ is a function homomorphism from $\tau_{ST}$ to $\tau_{T}$, ie a homomorphism from the algebraic structure $(ST,\tau_{ST})$ to $(T,\tau_{T})$, ie $A\left(\tau_{ST}\phi\right)=\tau_{T}\left(A\phi\right)$.

Lemma 37.

σ(ST)T/=τ(ST)T\sigma_{(ST)T}^{/}=\tau_{(ST)T} on function homomorphisms from τST\tau_{ST} to τT\tau_{T}

Proof.

By Remark 27, $A_{(ST)T}$ being a function homomorphism means $A\circ\tau_{ST}=\tau_{T}\circ A$. Then

σ(ST)T/(A)=AτST=τTA=τ(ST)T(A)\sigma_{(ST)T}^{/}(A)=A\circ\tau_{ST}=\tau_{T}\circ A=\tau_{(ST)T}(A)

This simplifies the dual forms of Theorem 36 when A1,A2A_{1},A_{2} are function homomorphisms to:

(3.10) A1RΨA2\displaystyle A_{1}R_{\Psi}A_{2} σA1RσΨσA2τA1RτΨτA2στA1RστΨστA2\displaystyle\Leftrightarrow\sigma A_{1}R_{\sigma\Psi}\sigma A_{2}\Leftrightarrow\tau A_{1}R_{\tau\Psi}\tau A_{2}\Leftrightarrow\sigma\tau A_{1}R_{\sigma\tau\Psi}\sigma\tau A_{2}
τA1RΨOpτA2στA1RσΨOpστA2A1RτΨOpA2σA1RστΨOpσA2\Leftrightarrow\tau A_{1}R_{\Psi}^{Op}\tau A_{2}\Leftrightarrow\sigma\tau A_{1}R_{\sigma\Psi}^{Op}\sigma\tau A_{2}\Leftrightarrow A_{1}R_{\tau\Psi}^{Op}A_{2}\Leftrightarrow\sigma A_{1}R_{\sigma\tau\Psi}^{Op}\sigma A_{2}

The combined case

Finally let us consider both special cases together. Then we have the simplified equivalences of (3.10), and the addition of $\sigma\tau\Psi=\Psi$ and $\sigma\Psi=\tau\Psi$ gives:

(3.11) A1RΨA2σA1RτΨσA2τA1RτΨτA2στA1RΨστA2\displaystyle A_{1}R_{\Psi}A_{2}\Leftrightarrow\sigma A_{1}R_{\tau\Psi}\sigma A_{2}\Leftrightarrow\tau A_{1}R_{\tau\Psi}\tau A_{2}\Leftrightarrow\sigma\tau A_{1}R_{\Psi}\sigma\tau A_{2}
τA1RΨOpτA2στA1RτΨOpστA2A1RτΨOpA2σA1RΨOpσA2\Leftrightarrow\tau A_{1}R_{\Psi}^{Op}\tau A_{2}\Leftrightarrow\sigma\tau A_{1}R_{\tau\Psi}^{Op}\sigma\tau A_{2}\Leftrightarrow A_{1}R_{\tau\Psi}^{Op}A_{2}\Leftrightarrow\sigma A_{1}R_{\Psi}^{Op}\sigma A_{2}

Example

The theory we have developed is of quite wide application, but we will give one simple example which will become useful to us in Section 7.

We let S,TS,T be 2 real intervals having relations RS,RTR_{S},R_{T} both as appropriate subsets of the {}^{\prime}\leq_{\mathbb{R}}^{\prime} order relation. Let σS\sigma_{S} be any involution on SS which is an anti-morphism of \leq, ie for s1s2s_{1}\leq s_{2} in SS, σs2σs1\sigma s_{2}\leq\sigma s_{1}, and let τT\tau_{T} be the same on TT. Suppose ΨST\Psi\subset ST is self-conjugate under στ\sigma\tau.

Suppose further that $A_{1},A_{2}$ are functionals in $(ST)T$ and that under the pull up of $R_{T}$ to $R_{\Psi}$ we have $A_{1}R_{\Psi}A_{2}$, ie for all $\phi_{\Psi}$ we have $A_{1}\phi\leq_{T}A_{2}\phi$. It makes sense then to write $R_{\Psi}=R_{\sigma\tau\Psi}=\leq_{\Psi}$ and similarly $R_{\sigma\Psi}=R_{\tau\Psi}=\leq_{\sigma\Psi}$. This means we write $A_{1}\leq_{\Psi}A_{2}$ to mean $A_{1}\leq A_{2}$ on $\Psi$, and similarly $A_{1}\leq_{\sigma\Psi}A_{2}$ to mean $A_{1}\leq A_{2}$ on $\sigma\Psi$. Then by (3.9) when $A_{1}\leq_{\Psi}A_{2}$ we also have the dual results (recalling $A_{1}R^{Op}A_{2}\Leftrightarrow A_{2}RA_{1}$):

(3.12) $\sigma A_{1}\leq_{\sigma\Psi}\sigma A_{2},\;\sigma^{/}A_{1}\leq_{\sigma\Psi}\sigma^{/}A_{2},\;\sigma\sigma^{/}A_{1}\leq_{\Psi}\sigma\sigma^{/}A_{2}$
$\tau A_{2}\leq_{\Psi}\tau A_{1},\;\sigma\tau A_{2}\leq_{\sigma\Psi}\sigma\tau A_{1},\;\sigma^{/}\tau A_{2}\leq_{\sigma\Psi}\sigma^{/}\tau A_{1},\;\sigma\sigma^{/}\tau A_{2}\leq_{\Psi}\sigma\sigma^{/}\tau A_{1}$
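The dual results of this example can be checked on a small instance. The sketch below assumes $S=[0,4]$ with $\sigma s=4-s$, $T=\mathbb{R}$ with $\tau t=-t$, a family $\Phi$ of three increasing functions, and the evaluation functionals $A_{1}\phi=\phi(1)$, $A_{2}\phi=\phi(3)$; all of these choices are illustrative assumptions. Since $\sigma\tau$ maps increasing functions to increasing functions, $\Psi=\Phi\cup\sigma\tau\Phi$ consists of increasing functions and $A_{1}\leq_{\Psi}A_{2}$ holds.

```python
def pull_back(f):
    """g -> g o f (source pull up)."""
    return lambda g: (lambda x: g(f(x)))


def push_forward(f):
    """g -> f o g (target pull up)."""
    return lambda g: (lambda x: f(g(x)))


sigma = lambda s: 4 - s   # order reversing involution on S = [0, 4]
tau = lambda t: -t        # order reversing involution on T

sigma_ST, tau_ST = pull_back(sigma), push_forward(tau)
sig = pull_back(sigma_ST)     # sigma on (ST)T:  A -> A o sigma_ST
sig_p = pull_back(tau_ST)     # sigma^/ on (ST)T: A -> A o tau_ST
tau_3 = push_forward(tau)     # tau on (ST)T:    A -> tau o A

A1 = lambda phi: phi(1)       # illustrative functionals in (ST)T
A2 = lambda phi: phi(3)

Phi = [lambda s: s, lambda s: s ** 3, lambda s: 2 * s - 5]   # increasing
Psi = Phi + [sigma_ST(tau_ST(phi)) for phi in Phi]            # self-conjugate
sigma_Psi = [sigma_ST(phi) for phi in Psi]                    # decreasing


def leq_on(family, B1, B2):
    """B1 <= B2 on every function in the given family."""
    return all(B1(phi) <= B2(phi) for phi in family)


# The hypothesis followed by the seven dual results of (3.12).
dual_checks = [
    leq_on(Psi, A1, A2),
    leq_on(sigma_Psi, sig(A1), sig(A2)),
    leq_on(sigma_Psi, sig_p(A1), sig_p(A2)),
    leq_on(Psi, sig(sig_p(A1)), sig(sig_p(A2))),
    leq_on(Psi, tau_3(A2), tau_3(A1)),
    leq_on(sigma_Psi, sig(tau_3(A2)), sig(tau_3(A1))),
    leq_on(sigma_Psi, sig_p(tau_3(A2)), sig_p(tau_3(A1))),
    leq_on(Psi, sig(sig_p(tau_3(A2))), sig(sig_p(tau_3(A1)))),
]
```

The original inequality itself fails on $\sigma\Psi$ (the decreasing functions), which is why the index set of each dual relation must be tracked.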

4. Separation of Concerns in Birkhoff Sums

4.1. Introduction

Recall that given a Dynamical System (X,TXX)(X,T_{XX}) with a value space VMonoid=(V,+V)V_{Monoid}=(V,+_{V}) we define the NNth Birkhoff sum of an observable ϕXV\phi_{XV} as SN(ϕ,x)r=1Nϕ(Trx)S_{N}(\phi,x)\coloneqq\sum_{r=1}^{N}\phi(T^{r}x) (where the summation takes place in VV). As we have noted, previous studies with the Birkhoff sum in this form have proceeded with properties of ϕ\phi being tightly bound with the entire reasoning process. Our strategy is to separate the role of ϕ\phi from the underlying dynamics of the system (X,T)(X,T). The key result which enables us to do this is:

(4.1) SN(ϕ,x)=(SNϕ)x=ϕ(SNx)S_{N}(\phi,x)=(S_{N}\phi)x=\phi(S_{N}x)

The right side of this identity allows us to separate the study of the composite Birkhoff sum $S_{N}(\phi,x)$ into two parts. First we study the effects of the homonymic operator $S_{N}$ on $x$. Unlike the original $S_{N}$ this newly constructed operator is independent of $\phi$ and is purely dynamical. In Section 5 we use it to develop a theory of the sequential distribution of orbits on the circle. This needs to be done only once, and then the results are available for use with any observable $\phi$.

In section 6 we will study the general classification of unbounded observables, and develop some initial estimates for the general Birkhoff sum. In section 5 we will develop the estimates for the case of irrational rotations of the circle. Finally in section 8 we apply the developed theory to obtain more specialised estimates for the Birkhoff sum in the case of some important specific classes of observables.

The result (4.1) is probably best positioned as a result in Abstract Algebra. In essence it captures a general insight into a structural decomposition which, as with many such results, is “obvious” once seen, and can then be more quickly introduced by way of an ansatz. However as it is fundamental to our approach we will provide two proofs - a relatively formal derivation and a quick proof by ansatz. The first proof is much longer but provides a deeper understanding of the mechanisms at work. The ansatz simply delivers the result as quickly as possible, and will suit those to whom the result is intuitive and who are anxious to proceed to the content of subsequent sections. However the significant difference between the two approaches also helps to show that, although the result may appear “intuitive” to some, this intuition is built upon a fair amount of internalised mathematical machinery. Also, like much “abstract nonsense”, the proof itself is relatively simple - the challenge lies in developing the right set of definitions and concepts within which the result becomes natural.

4.2. Formal development

In the sequel we will fix N0N\geq 0 and to simplify notation we will write merely xr\sum x_{r} to denote the formal sum x1+x2+xNx_{1}+x_{2}\ldots+x_{N} (where ++ is an associative binary operator, not necessarily commutative).

Recall three universal constructions

Given sets X,Y,ZX,Y,Z then XYXY is the set of functions ϕXYxX(ϕx)Y\phi_{XY}\coloneqq x_{X}\mapsto(\phi x)_{Y} from XX to YY, and (X×Y)Z(X\times Y)Z is the set of bifunctions (functions of two arguments) f(x,y)(f(x,y))Zf\coloneqq(x,y)\mapsto\left(f(x,y)\right)_{Z}.

C1:

From a set $X$ we can construct the Kleene star (or free monoid) $X^{*}$ by $X^{*}=(X^{*},+_{X^{*}})$ where $X^{*}$ is the set of formal sums $x_{1}+x_{2}\ldots+x_{k}$ for $k\geq 0$, and $+$ is the concatenation of formal sums (with the empty sum as unit). We regard $X$ as a subset of $X^{*}$ by regarding $x_{X}$ as a formal sum of length 1. Note that “free” here means free of relations, ie $\sum_{r=1}^{m}x_{r}=\sum_{r=1}^{n}x_{r}^{/}$ means $m=n$ and $x_{r}=x_{r}^{/}$. We will write $\sum^{*}x_{r}$ to denote the formal sum $\sum_{r=1}^{N}x_{r}$ (recall $N$ is fixed).

C2:

From an element $x_{X}$, we can construct the pull back function (or functional) $x_{(XY)Y}$ of $x_{X}$ by $x\phi\coloneqq\phi x$ for each $\phi_{XY}$.

C3:

From a bifunction f(X×Y)Zf_{(X\times Y)Z} we can construct the two curried functions fX(YZ)c1x(yf(x,y))f_{X(YZ)}^{c1}\coloneqq x\mapsto(y\mapsto f(x,y)) so that (fx)(y)=f(x,y)\left(fx\right)(y)=f(x,y), and fY(XZ)c2y(xf(x,y))f_{Y(XZ)}^{c2}\coloneqq y\mapsto(x\mapsto f(x,y)) so that (fy)(x)=f(x,y)\left(fy\right)(x)=f(x,y).

Let us equip YY with a binary operator +Y+_{Y}, ie an element of the set of bifunctions (Y×Y)Y(Y\times Y)Y, so that Y=(Y,+Y)Y=(Y,+_{Y}) is a semi-group. Recall we can always add a unit if necessary to make a semi-group into a monoid, so we assume YY is a monoid.

C4:

From +Y+_{Y} we construct the pull back (binary) operator (+Y)XY\left(+_{Y}\right)_{XY} on XYXY by (ϕ1+Yϕ2)xϕ1x+Yϕ2x(\phi_{1}+_{Y}\phi_{2})x\coloneqq\phi_{1}x+_{Y}\phi_{2}x. We call (XY,+Y)(XY,+_{Y}) the pull back monoid. We write Yϕr\sum^{Y}\phi_{r} to denote a sum in XYXY using the pull back +Y+_{Y}.

Note that we now have two monoids constructed on $XY$, namely the free monoid $\left((XY)^{*},+_{(XY)^{*}}\right)$ and the pull back monoid $(XY,+_{XY})$. Their general elements have the forms $\left(\sum^{*}\phi_{r}\right)_{(XY)^{*}}$ and $\left(\sum^{Y}\phi_{r}\right)_{XY}$ respectively.

C5:

Given the two monoid constructions $(XY)^{*}$ and $XY$ on the underlying set $XY$, we construct the natural homomorphism $\theta_{(XY)^{*}(XY)}$ by $\theta\left(\sum^{*}\phi_{r}\right)\coloneqq\sum^{Y}\phi_{r}$ which is surjective but not generally injective. This also allows us to define $\left(\sum^{*}\phi_{r}\right)x\coloneqq\left(\sum^{Y}\phi_{r}\right)x=\sum\left(\phi_{r}x\right)$. An important case is when $Y=W^{*}$, ie $Y$ is already a free monoid. Then the natural homomorphism is $\left(\sum^{*}\phi_{r}\right)_{(XY)^{*}}\mapsto\sum^{*}\phi_{r}$ which is now an isomorphism, ie $(XW^{*})^{*}\cong XW^{*}$.

C6:

Finally, given a set function ϕXY\phi_{XY} we construct the monoid homomorphism ϕXY(xi)(ϕxi)\phi_{X^{*}Y}\coloneqq\left(\sum^{*}x_{i}\right)\mapsto\sum(\phi x_{i}). Note that this is well defined since XX^{*} is free. If this were not the case ϕXY\phi_{X^{*}Y} may be multi-valued (ie no longer strictly a function).

Theorem 38.

Let X,YX,Y be sets and Z=(Z,+Z)Z=(Z,+_{Z}) a monoid, then for any set of bifunctions frf_{r} in (X×Y)Z(X\times Y)Z we have (fr)(x,y)=((frc1)(x))y=x((frc2)(y))\left(\sum^{*}f_{r}\right)(x,y)=\left((\sum^{*}f_{r}^{c1})(x)\right)y=x\left((\sum^{*}f_{r}^{c2})(y)\right)

Proof.

\[
\left(\sum^{*}f_{r}\right)(x,y)=_{C5}\left(\sum^{Z}f_{r}\right)(x,y)=_{C4}\sum f_{r}(x,y)=_{C3}\sum\left((f_{r}^{c2}y)(x)\right)=_{C2}\sum\left(x(f_{r}^{c2}y)\right)=_{C5}x\left(\sum^{*}(f_{r}^{c2}y)\right)=_{C5}x\left((\sum^{*}f_{r}^{c2})y\right)
\]

Corollary 39.

Given a Dynamical System (X,TXX)(X,T_{XX}) with a Value System (V,+V)(V,+_{V}), we have SN(ϕ,x)=(SNϕ)x=ϕ(SNx)S_{N}(\phi,x)=(S_{N}\phi)x=\phi(S_{N}x)

Proof.

From $T$ we construct $T_{(\Phi\times X)V}(\phi,x)\coloneqq\phi(Tx)$. From the theorem we get $\left(\sum^{*}T_{r}\right)(\phi,x)=_{Thm}\phi\left((\sum^{*}T_{r}^{c2})(x)\right)=_{C6}\sum\phi\left(T_{r}^{c2}x\right)$. Now $\phi\left(T_{r}^{c2}x\right)=_{C2}\left(T_{r}^{c2}x\right)\phi=_{C3}T_{r}(\phi,x)=_{\textrm{def}}\phi(T_{r}x)$ and so $\sum\phi\left(T_{r}^{c2}x\right)=\sum\phi(T_{r}x)=_{C6}\phi\left(\sum^{*}(T_{r}x)\right)=_{C5}\phi\left((\sum^{*}T_{r})x\right)$.

Now put Tr=TrT_{r}=T^{r} and SN=TXXrS_{N}=\sum^{*}T_{XX}^{r}, SN=(Tr)(Φ×X)V.S_{N}=\sum^{*}(T^{r})_{(\Phi\times X)V}.

Alternate form gives SN(ϕ,x)=(SNϕ)xS_{N}(\phi,x)=(S_{N}\phi)x where SNS_{N} is ΦΦ\Phi\Phi^{*} but under \sim this becomes ΦΦ\Phi\Phi , ie SNϕS_{N}\phi can be regarded simply as another observable. ∎

4.3. Proof by Ansatz

Define SNxr=1NTrxS_{N}x\coloneqq\sum_{r=1}^{N}T^{r}x as a formal sum, and ϕ(r=1Nxr)r=1Nϕ(xr)\phi(\sum_{r=1}^{N}x_{r})\coloneqq\sum_{r=1}^{N}\phi(x_{r}) then ϕ(SNx)=r=1Nϕ(Trx)=SN(ϕ,x)\phi(S_{N}x)=\sum_{r=1}^{N}\phi(T^{r}x)=S_{N}(\phi,x)
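As a purely illustrative sketch (not part of the formal development), the ansatz can be mirrored in Python by modelling a formal sum as a tuple of points. The names `orbit_segment`, `extend` and `birkhoff_sum` are hypothetical, and the rational rotation and the observable $x\mapsto x^{2}$ are assumed only so that both sides can be compared exactly with `Fraction` arithmetic.

```python
from fractions import Fraction

def orbit_segment(T, x, N):
    """S_N x as a formal sum, modelled as the tuple (T x, T^2 x, ..., T^N x)."""
    points = []
    for _ in range(N):
        x = T(x)
        points.append(x)
    return tuple(points)

def extend(phi):
    """Extend an observable to formal sums: phi(x_1 + ... + x_N) := sum of phi(x_r)."""
    return lambda formal_sum: sum(phi(x) for x in formal_sum)

def birkhoff_sum(phi, T, x, N):
    """Direct definition S_N(phi, x) = sum_{r=1}^{N} phi(T^r x)."""
    total = 0
    for _ in range(N):
        x = T(x)
        total += phi(x)
    return total

# Assumed example data: rotation by 2/7 and a simple bounded observable.
rotate = lambda x: (x + Fraction(2, 7)) % 1
square = lambda x: x * x
x0 = Fraction(0)

lhs = birkhoff_sum(square, rotate, x0, 10)            # S_N(phi, x)
rhs = extend(square)(orbit_segment(rotate, x0, 10))   # phi(S_N x)
```

The point of the design is that `orbit_segment` never sees `phi`: the orbit is computed once and can be reused with any observable.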

5. Distribution of orbit segments on the circle

Given a Dynamical System (X,T)(X,T) we now develop the theory of orbit segments SNxr=1NTrxS_{N}x\coloneqq\sum_{r=1}^{N}T^{r}x which we introduced in the previous section, but specifically for the case where TT is a rotation RαR_{\alpha} of the circle through α\alpha_{\mathbb{R}} revolutions. In this case for a circle point x𝕋x_{\mathbb{T}} we have SNx=r=1NTrx=(Trx)r=1N=(x+𝕋rα)r=1NS_{N}x=\sum_{r=1}^{N}T^{r}x=\left(T^{r}x\right)_{r=1}^{N}=\left(x+_{\mathbb{T}}r\alpha\right)_{r=1}^{N} (where +𝕋+_{\mathbb{T}} is addition on the circle). Note that we will find it useful to be able to switch between the formal sum notation r=1NTrx\sum_{r=1}^{N}T^{r}x and the sequence notation (Trx)r=1N=(x+𝕋rα)r=1N\left(T^{r}x\right)_{r=1}^{N}=\left(x+_{\mathbb{T}}r\alpha\right)_{r=1}^{N}.

The distribution of the orbit (rα)r=1N=(α,2α,3α,Nα)(r\alpha)_{r=1}^{N}=(\alpha,2\alpha,3\alpha\ldots,N\alpha) is well understood classically when considered simply as a set, and is a primary example of equidistribution. In this section we instead study the distribution of the orbit as a sequence. This introduces some additional and quite elegant structure. In later sections we shall exploit this structure in deriving estimates for our anergodic Birkhoff sums. However the results of this section seem of interest in their own right.

5.1. QuasiPeriod Decomposition

Introduction/Motivation Intuitively if α\alpha and β\beta lie “close” to each other, then the orbits of a circle point xx under the rotations RαR_{\alpha} and RβR_{\beta} will continue to lie close to one another over suitably short initial orbit segments. We make this notion precise. If prqr\frac{p_{r}}{q_{r}} is a convergent of α\alpha, then we know (see 2.1) that αprqr=(1)rqrqr+1/\alpha-\frac{p_{r}}{q_{r}}=\frac{(-1)^{r}}{q_{r}q_{r+1}^{/}}, and so the tt-th points of the orbits of xx under Rα,RprqrR_{\alpha},R_{\frac{p_{r}}{q_{r}}} are separated from each other by a distance of tαtprqr=tqrqr+1/\left\|t\alpha-t\frac{p_{r}}{q_{r}}\right\|=\left\|\frac{t}{q_{r}q_{r+1}^{/}}\right\|. Further for 1tqr1\leq t\leq q_{r} we have 0<(1)r(tαtprqr)=tqrqr+1/1qr+1/<1qr0<(-1)^{r}(t\alpha-\frac{tp_{r}}{q_{r}})=\frac{t}{q_{r}q_{r+1}^{/}}\leq\frac{1}{q_{r+1}^{/}}<\frac{1}{q_{r}}. But since the sequence (tpr)t=1qr(tp_{r})_{t=1}^{q_{r}} is just a permutation modqr\bmod q_{r} of (u)u=1qr(u)_{u=1}^{q_{r}}, this tells us that the orbital points {tα}t=1qr\{t\alpha\}_{t=1}^{q_{r}} are ’pigeon-holed’ by the circle partition defined by the points (uqr)u=1qr(\frac{u}{q_{r}})_{u=1}^{q_{r}}: in other words, for each 1uqr1\leq u\leq q_{r} the partition interval (u1qr,uqr)(\frac{u-1}{q_{r}},\frac{u}{q_{r}}) contains precisely one orbital point tαt\alpha. Recall that each qrq_{r} we call a quasiperiod of α\alpha. The preceding discussion now motivates decomposing the initial orbit segment of length NN into sub-segments of quasiperiod length. A sub-segment of length qrq_{r} we will then relate (using the observations above) to rational (periodic) orbits of qrq_{r}. The Ostrowski representation (see 2.5) will be used to give us precisely this decomposition.
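The pigeon-holing observation can be checked numerically. The following is a minimal sketch under stated assumptions: the example value $\alpha=\sqrt{2}-1=[0;2,2,2,\ldots]$, floating point arithmetic, and the hypothetical helper names `convergent_denominators`, `buckets` and `is_pigeonholed` are ours and do not come from the text.

```python
import math

def convergent_denominators(partial_quotients):
    """Quasiperiods q_0 = 1, q_1 = a_1, q_{r+1} = a_{r+1} q_r + q_{r-1}."""
    q = [1, partial_quotients[0]]
    for a in partial_quotients[1:]:
        q.append(a * q[-1] + q[-2])
    return q

def buckets(alpha, q_r):
    """For t = 1..q_r, the index u-1 of the interval ((u-1)/q_r, u/q_r) holding {t*alpha}."""
    return [math.floor(q_r * ((t * alpha) % 1.0)) for t in range(1, q_r + 1)]

def is_pigeonholed(alpha, q_r):
    """True when every partition interval contains exactly one orbital point."""
    return sorted(buckets(alpha, q_r)) == list(range(q_r))

ALPHA = math.sqrt(2) - 1                 # assumed example, alpha = [0; 2, 2, 2, ...]
QS = convergent_denominators([2] * 6)    # q_0 .. q_6
```

For each quasiperiod in `QS` the call `is_pigeonholed(ALPHA, q)` is expected to hold, in line with the discussion above.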

What we have said so far is a classical strategy developed by Koksma(##paper) and further developed by Herman(##paper). However we now introduce two new elements:

  (1) Previously the approach was used directly in the decomposition of the Birkhoff sum under study. However our strategy is to decouple the Birkhoff sum itself from the underlying dynamics of the irrational rotation. This is an important conceptual distinction which bears fruit later. It comes at the expense of slightly more theory and notation, in order to describe the decomposition of orbits rather than scalar sums.

  (2) Previously the approach required the sum-function in the Birkhoff sum to be of bounded variation. This is because the periodic orbits could be allowed to float to begin at the start of a quasiperiod segment. In order to study unbounded functions, we are forced instead to use periodic orbits anchored to the same fixed initial point. This requires more analysis, but also results in a deeper understanding of the distribution of the points of an initial orbit segment.

Ostrowski Decomposition of Orbit Segments Given α\alpha, the Ostrowski Representation (see 2.5) ORα(N)OR_{\alpha}(N) gives us a canonical way of representing an integer N0N\geq 0 as a sum of quasiperiods of α\alpha, which we write N=r=0nbrqrN=\sum_{r=0}^{n}b_{r}q_{r}. With a little care over notation we can use this to induce a corresponding canonical decomposition of an orbit segment of length NN into segments of quasiperiod length.
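Section 2.5 is not reproduced here, so the following sketch assumes that $OR_{\alpha}(N)$ is obtained by the usual greedy algorithm (repeatedly removing the largest available quasiperiod). The function name `ostrowski_digits` and the example quasiperiods of $\alpha=\sqrt{2}-1$ are illustrative assumptions.

```python
def ostrowski_digits(N, q):
    """Greedy digits [b_0, ..., b_n] with N = sum(b_r * q[r]).

    q is the list of quasiperiods q_0 = 1 < q_1 < ... and must extend beyond N.
    """
    if q[0] != 1 or q[-1] <= N:
        raise ValueError("need q[0] == 1 and q[-1] > N")
    if N == 0:
        return []
    n = max(r for r in range(len(q)) if q[r] <= N)
    digits = [0] * (n + 1)
    remainder = N
    for r in range(n, -1, -1):
        # Take as many copies of q_r as fit, then pass the remainder down.
        digits[r], remainder = divmod(remainder, q[r])
    return digits

# Assumed example: quasiperiods of alpha = sqrt(2) - 1 = [0; 2, 2, 2, ...].
Q = [1, 2, 5, 12, 29, 70, 169]
example = ostrowski_digits(100, Q)   # 100 = 70 + 29 + 1
```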

We first introduce a way of representing other integers MM in terms of the Ostrowski representation of NN:

Definition 40 (Ostrowski Triples).

Let $\alpha,N$ be fixed, and let $N$ have Ostrowski Representation $OR_{\alpha}(N)=\sum_{r=0}^{n}b_{r}q_{r}$ wrt $\alpha$. We say $(r,s,t)_{\mathbb{Z}^{3}}$ is an Ostrowski triple representing the integer $M_{\mathbb{Z}}$ (with respect to $\alpha,N$) if $M=\sum_{u=r+1}^{n}b_{u}q_{u}+sq_{r}+t$. For convenience we will use the homonymous formal notation $rst$ to represent both the triple $(r,s,t)$ and the represented integer $M$ so that we can write $M=rst=\sum_{u=r+1}^{n}b_{u}q_{u}+sq_{r}+t$.

Note from the definition that $rst=rs0+t=r00+sq_{r}+t=\sum_{u=r+1}^{n}b_{u}q_{u}+sq_{r}+t$. In particular $n00=0$, $n01=1$, $(-1)00=N$, and $000=\sum_{u=1}^{n}b_{u}q_{u}=N-b_{0}q_{0}=N-b_{0}$. For $0\leq r\leq n-1$, $r00=b_{n}q_{n}+b_{n-1}q_{n-1}\ldots+b_{r+1}q_{r+1}$ so that $rst$ increases as $r$ decreases from $n$ to $-1$. Note also that $M$ may in general be represented by many Ostrowski triples with respect to $N,\alpha$ but we can define a distinguished representation as follows.

Definition 41.

For $1\leq M\leq N$ we define the Ostrowski triple of $M$ (wrt $N,\alpha$) to be the unique Ostrowski triple representing $M$ defined by $r=\min\{k:k00<M\}$, $s=\left\lfloor\frac{M-1-r00}{q_{r}}\right\rfloor$, $t=M-rs0$. For $M=1..N$ we call the Ostrowski triple of $M$ a canonical triple of $(\alpha,N)$.

Note that if $rst$ is a canonical triple then from the definition $sq_{r}\leq M-1-r00$ giving $t\geq 1$. In particular it is easy to see then that for canonical triples we have $0\leq r\leq n$, $0\leq s\leq b_{r}-1<a_{r+1}$, $1\leq t\leq q_{r}$, and $1\leq rst\leq N$. $N$ itself is represented by the canonical triple $N=0(b_{0}-1)q_{0}$ when $b_{0}\geq 1$ (and in general by $r(b_{r}-1)q_{r}$ for the least $r$ with $b_{r}\neq 0$), and $rs0$ is never a canonical triple (since $t\geq 1$).
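Definitions 40 and 41 translate directly into code. This is a minimal sketch: the helper names (`ostrowski_digits`, `r00`, `canonical_triple`, `triple_value`) are hypothetical, the digits are assumed to come from the greedy algorithm, and the quasiperiods are those of the assumed example $\alpha=\sqrt{2}-1$.

```python
def ostrowski_digits(N, q):
    """Greedy digits [b_0, ..., b_n] of N >= 1 (assumed construction of OR_alpha(N))."""
    n = max(r for r in range(len(q)) if q[r] <= N)
    digits, remainder = [0] * (n + 1), N
    for r in range(n, -1, -1):
        digits[r], remainder = divmod(remainder, q[r])
    return digits

def r00(r, b, q):
    """The integer r00 = sum of b_u q_u over u = r+1 .. n."""
    return sum(b[u] * q[u] for u in range(r + 1, len(b)))

def triple_value(r, s, t, b, q):
    """The integer rst represented by the triple (r, s, t)."""
    return r00(r, b, q) + s * q[r] + t

def canonical_triple(M, b, q):
    """Definition 41: r = min{k : k00 < M}, s = floor((M-1-r00)/q_r), t = M - rs0."""
    r = min(k for k in range(len(b)) if r00(k, b, q) < M)
    s = (M - 1 - r00(r, b, q)) // q[r]
    t = M - r00(r, b, q) - s * q[r]
    return r, s, t

Q = [1, 2, 5, 12, 29, 70, 169]     # assumed example quasiperiods
B11 = ostrowski_digits(11, Q)      # 11 = 2*5 + 0*2 + 1*1
```

With these data `canonical_triple(8, B11, Q)` evaluates to the triple written $213$ in the notation above, since $8=0+1\cdot 5+3$.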

We are now ready to decompose the orbit segment SN(x0)S_{N}(x_{0}) into segments of quasiperiod length. If M=rstM=rst we have the equivalent notations xM=TMx0=Trstx0=xrstx_{M}=T^{M}x_{0}=T^{rst}x_{0}=x_{rst}.

Definition 42 (Ostrowski Decomposition).

Given α\alpha irrational, N0N\geq 0 and a Dynamical system (X,T)(X,T), we define the Ostrowski decomposition of the orbit segment SN(x0)=u=1NTux0=u=1NxuS_{N}(x_{0})=\sum_{u=1}^{N}T^{u}x_{0}=\sum_{u=1}^{N}x_{u} to be the reverse formal sum SN(x0)=r=0nSbrqr(xr00)S_{N}(x_{0})=\underleftarrow{\sum_{r=0}^{n}}S_{b_{r}q_{r}}(x_{r00}). We further decompose each Sbrqr(y)S_{b_{r}q_{r}}(y) into brb_{r} segments of length qrq_{r} to obtain the Extended Ostrowski Decomposition SN(x0)=r=0n(s=0br1Sqr(xrs0))S_{N}(x_{0})=\underleftarrow{\sum_{r=0}^{n}}\left(\sum_{s=0}^{b_{r}-1}S_{q_{r}}(x_{rs0})\right).

We can write this more explicitly (recalling n00=0)n00=0) as

\begin{align*}
S_{N}(x_{0}) &=S_{b_{n}q_{n}}(x_{0})+S_{b_{n-1}q_{n-1}}(x_{b_{n}q_{n}})+S_{b_{n-2}q_{n-2}}(x_{b_{n}q_{n}+b_{n-1}q_{n-1}})+\ldots+S_{b_{0}q_{0}}(x_{N-b_{0}q_{0}})\\
&=S_{q_{n}}(x_{n00})+S_{q_{n}}(x_{n10})+S_{q_{n}}(x_{n20})+\ldots+S_{q_{n}}(x_{n(b_{n}-1)0})\\
&\quad+S_{q_{n-1}}(x_{(n-1)00})+S_{q_{n-1}}(x_{(n-1)10})+S_{q_{n-1}}(x_{(n-1)20})+\ldots+S_{q_{n-1}}(x_{(n-1)(b_{n-1}-1)0})\\
&\quad\;\vdots\\
&\quad+S_{q_{0}}(x_{000})+S_{q_{0}}(x_{010})+S_{q_{0}}(x_{020})+\ldots+S_{q_{0}}(x_{0(b_{0}-1)0})
\end{align*}

Note that if br=0b_{r}=0 the corresponding line is empty.

Also note that $0s0=(N-b_{0}q_{0})+s+0$ and $q_{0}=1$ so the last line simplifies to $x_{001}+x_{011}+\ldots+x_{0(b_{0}-1)1}=x_{(N-b_{0})+1}+x_{(N-b_{0})+2}+\ldots+x_{N}$.
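To make the bookkeeping of the Extended Ostrowski Decomposition concrete, the sketch below lists the blocks $S_{q_{r}}(x_{rs0})$ in the order in which they appear and confirms that together they cover the indices $1..N$ exactly once. The names `extended_blocks` and `covered_indices` are hypothetical, and the digits and quasiperiods are assumed example data for $N=11$.

```python
def extended_blocks(b, q):
    """Blocks (r, s, start, length) of the Extended Ostrowski Decomposition.

    Blocks are listed from r = n down to r = 0, and 'start' is the integer rs0,
    so the block covers the orbit points x_{rs0+1}, ..., x_{rs0+q_r}.
    """
    blocks = []
    start = 0                          # n00 = 0
    for r in range(len(b) - 1, -1, -1):
        for s in range(b[r]):          # an empty range when b_r = 0
            blocks.append((r, s, start, q[r]))
            start += q[r]
    return blocks

def covered_indices(blocks):
    """Orbit indices visited by the blocks, in order."""
    return [start + t for (_, _, start, length) in blocks
            for t in range(1, length + 1)]

# Assumed example: N = 11 = 2*5 + 0*2 + 1*1 with quasiperiods 1, 2, 5.
B, Q = [1, 0, 2], [1, 2, 5]
BLOCKS = extended_blocks(B, Q)
```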

5.2. Application to irrational rotations

When TT is an irrational rotation RαR_{\alpha} of the circle we have TMx0={x0+Mα}T^{M}x_{0}=\left\{x_{0}+M\alpha\right\} so that if M=rstM=rst we have xrst={x0+(rst)α}x_{rst}=\left\{x_{0}+(rst)\alpha\right\}. In particular we will write αrst{(rst)α}\alpha_{rst}\coloneqq\left\{(rst)\alpha\right\} and then the Ostrowski Decomposition is SN(x0)=r=0ns=0br1Sqr(x0+αrs0)S_{N}(x_{0})=\underleftarrow{\sum_{r=0}^{n}}\sum_{s=0}^{b_{r}-1}S_{q_{r}}(x_{0}+\alpha_{rs0}) and Sqr(x0+αrs0)S_{q_{r}}(x_{0}+\alpha_{rs0}) is the sequence of points ({x0+αrst})t=1qr\left(\left\{x_{0}+\alpha_{rst}\right\}\right)_{t=1}^{q_{r}}. We are now in a position to extend our discussion to each orbit segment SqrS_{q_{r}} in the Ostrowski decomposition. In section (5.1) we introduced the notion of tracking quasiperiod segments of the orbit under RαR_{\alpha} by means of quasiperiod partitions. We introduce a quantity which will help us measure the precision of this tracking behaviour.

Definition 43 (Tracking error).

We define the tracking error $(\epsilon_{rst})_{\mathbb{R}}$ of $\alpha_{rst}$ to be the signed real number

ϵrst=(1)rαr00+(sqr+1/+tqrqr+1/)\epsilon_{rst}=(-1)^{r}\alpha_{r00}+\left(\frac{s}{q_{r+1}^{/}}+\frac{t}{q_{r}q_{r+1}^{/}}\right)

We shall see later that the tracking error $\epsilon_{rst}$ is the signed smallest distance of the circle point $\alpha_{rst}$ from the circle point $t\frac{p_{r}}{q_{r}}$. Our goal is to develop bounds to show that this error is always suitably small, although at this point we cannot even rule out $\left|\epsilon_{rst}\right|>1$. However we can observe immediately that for $r=n$ we have $\alpha_{n0t}=t\alpha$ and so our earlier discussion (5.1) tells us that for $t\in 1..q_{n}$ we have $0<\epsilon_{n0t}=(-1)^{n}\left(t\alpha-t\frac{p_{n}}{q_{n}}\right)\leq\frac{1}{q_{n+1}^{/}}<\frac{1}{a_{n+1}q_{n}}$.

The general case becomes more complex as we have to take into account the effects of quasiperiod segments which do not start from the origin. We first show that we can in fact recover αrst\alpha_{rst} from the definition of ϵrst\epsilon_{rst}, via the following simple result:

Proposition 44.

αrst={tprqr+(1)rϵrst}\alpha_{rst}=\left\{t\frac{p_{r}}{q_{r}}+(-1)^{r}\epsilon_{rst}\right\}

Proof.

By definition αrst={αr00+(sqr+t)α}=αr00+(sqr+t)α+M\alpha_{rst}=\left\{\alpha_{r00}+(sq_{r}+t)\alpha\right\}=\alpha_{r00}+(sq_{r}+t)\alpha+M for some integer MM. But α=prqr+(1)rqrqr+1/\alpha=\frac{p_{r}}{q_{r}}+\frac{(-1)^{r}}{q_{r}q_{r+1}^{/}} so that αrst=αr00+(1)r(sqr+1/+tqrqr+1/)+tprqr+M=(1)rϵrst+tprqr+M\alpha_{rst}=\alpha_{r00}+(-1)^{r}\left(\frac{s}{q_{r+1}^{/}}+\frac{t}{q_{r}q_{r+1}^{/}}\right)+t\frac{p_{r}}{q_{r}}+M=(-1)^{r}\epsilon_{rst}+t\frac{p_{r}}{q_{r}}+M and the result follows since αrst={αrst}\alpha_{rst}=\left\{\alpha_{rst}\right\}. ∎
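Proposition 44 can be sanity-checked in floating point. The sketch below is only illustrative and rests on several assumptions of ours: the example $\alpha=\sqrt{2}-1$ with its convergents, $N=100$ with Ostrowski digits obtained greedily, $\alpha_{r00}$ taken as the fractional part $\{r00\cdot\alpha\}$, and $1/q_{r+1}^{/}$ computed from the identity $\frac{1}{q_{r+1}^{/}}=(-1)^{r}(q_{r}\alpha-p_{r})$. All function names are hypothetical.

```python
import math

ALPHA = math.sqrt(2) - 1            # assumed example: alpha = [0; 2, 2, 2, ...]
P = [0, 1, 2, 5, 12, 29, 70]        # numerators p_r of the convergents
Q = [1, 2, 5, 12, 29, 70, 169]      # denominators q_r (quasiperiods)
B = [1, 0, 0, 0, 1, 1]              # digits of N = 100 = 70 + 29 + 1

def frac(x):
    """Fractional part {x} in [0, 1)."""
    return x % 1.0

def inv_q_prime_next(r):
    """1/q'_{r+1}, using 1/q'_{r+1} = (-1)^r (q_r*alpha - p_r)."""
    return abs(Q[r] * ALPHA - P[r])

def r00(r):
    return sum(B[u] * Q[u] for u in range(r + 1, len(B)))

def tracking_error(r, s, t):
    """Definition 43, with alpha_{r00} read as the fractional part {r00 * alpha}."""
    alpha_r00 = frac(r00(r) * ALPHA)
    return (-1) ** r * alpha_r00 + (s + t / Q[r]) * inv_q_prime_next(r)

def circle_distance(x, y):
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def max_discrepancy():
    """Largest circle distance between alpha_rst and {t p_r/q_r + (-1)^r eps_rst}."""
    worst = 0.0
    for r in range(len(B)):
        for s in range(B[r]):
            for t in range(1, Q[r] + 1):
                M = r00(r) + s * Q[r] + t
                lhs = frac(M * ALPHA)
                rhs = frac(t * P[r] / Q[r] + (-1) ** r * tracking_error(r, s, t))
                worst = max(worst, circle_distance(lhs, rhs))
    return worst
```

The comparison uses distance on the circle rather than ordinary subtraction so that points close to $0$ and $1$ are treated as neighbours.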

We now investigate the two components of ϵrst\epsilon_{rst} separately, namely (1)rαr00(-1)^{r}\alpha_{r00} and sqr+1/+tqrqr+1/\frac{s}{q_{r+1}^{/}}+\frac{t}{q_{r}q_{r+1}^{/}}. We start with the latter as it is the simpler.

Lemma 45.

For canonical rstrst we have 0<sqr+1/+tqrqr+1/s+1qr+1/<1qr/0<\frac{s}{q_{r+1}^{/}}+\frac{t}{q_{r}q_{r+1}^{/}}\leq\frac{s+1}{q_{r+1}^{/}}<\frac{1}{q_{r}^{/}}

Proof.

For canonical $rst$ we have $0\leq s\leq b_{r}-1$ and $1\leq t\leq q_{r}$ so that $0<\frac{s}{q_{r+1}^{/}}+\frac{t}{q_{r}q_{r+1}^{/}}\leq\frac{s+1}{q_{r+1}^{/}}\leq\frac{b_{r}}{q_{r+1}^{/}}\leq\frac{a_{r+1}}{q_{r+1}^{/}}<\frac{a_{r+1}^{/}}{q_{r+1}^{/}}=\frac{1}{q_{r}^{/}}$.

We now turn to investigate the term αr00\alpha_{r00}. Recall the definition αr00={u=r+1nbuquα}\alpha_{r00}=\left\{\sum_{u=r+1}^{n}b_{u}q_{u}\alpha\right\}. We might expect this sum to be capable of taking a good range of values within [0,1)[0,1) but fortunately for our purposes it turns out to be surprisingly constrained, and this rigidity is the foundational result of this section.

Lemma 46.

If br0b_{r}\neq 0 then αr00<1qr+1/\left\|\alpha_{r00}\right\|<\frac{1}{q_{r+1}^{/}} and more precisely 1qr+1/+1qr+2/<(1)r{{αr00}}<1qr+2/\frac{-1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}<(-1)^{r}\left\{\left\{\alpha_{r00}\right\}\right\}<\frac{1}{q_{r+2}^{/}}

Proof.

For 0rn0\leq r\leq n we have αr00=u=r+1nbuquα=u=r+1nbupu+u=r+1n(EuOu)buqu+1/\alpha_{r00}=\sum_{u=r+1}^{n}b_{u}q_{u}\alpha=\sum_{u=r+1}^{n}b_{u}p_{u}+\sum_{u=r+1}^{n}\frac{\left(E_{u}-O_{u}\right)b_{u}}{q_{u+1}^{/}}. Hence

(5.1) {{αr00}}={{u=r+1nEubuqu+1/u=r+1nOubuqu+1/}}\left\{\left\{\alpha_{r00}\right\}\right\}=\left\{\left\{\sum_{u=r+1}^{n}\frac{E_{u}b_{u}}{q_{u+1}^{/}}-\sum_{u=r+1}^{n}\frac{O_{u}b_{u}}{q_{u+1}^{/}}\right\}\right\}

We now estimate the even and odd sums on the right side. We will primarily use the fact that buau+1b_{u}\leq a_{u+1} but with an important refinement. Recall (Lemma 18) that if br0b_{r}\neq 0 then br+1<ar+2b_{r+1}<a_{r+2} and so we have 0u=r+1nEubuqu+1/(u=r+1nEuau+1qu+1/)Er+1qr+2/=(u=r+1+ErnOnEuau+1qu+1/)Orqr+2/0\leq\sum_{u=r+1}^{n}\frac{E_{u}b_{u}}{q_{u+1}^{/}}\leq\left(\sum_{u=r+1}^{n}\frac{E_{u}a_{u+1}}{q_{u+1}^{/}}\right)-\frac{E_{r+1}}{q_{r+2}^{/}}=\left(\sum_{u=r+1+E_{r}}^{n-O_{n}}\frac{E_{u}a_{u+1}}{q_{u+1}^{/}}\right)-\frac{O_{r}}{q_{r+2}^{/}}, and 0u=r+1nOubuqu+1/(u=r+1nOuau+1qu+1/)Or+1qr+2/=(u=r+1+OrnEnOuau+1qu+1/)Erqr+2/0\leq\sum_{u=r+1}^{n}\frac{O_{u}b_{u}}{q_{u+1}^{/}}\leq\left(\sum_{u=r+1}^{n}\frac{O_{u}a_{u+1}}{q_{u+1}^{/}}\right)-\frac{O_{r+1}}{q_{r+2}^{/}}=\left(\sum_{u=r+1+O_{r}}^{n-E_{n}}\frac{O_{u}a_{u+1}}{q_{u+1}^{/}}\right)-\frac{E_{r}}{q_{r+2}^{/}}. We now estimate the sums over aua_{u}. Since au+1qu+1/<1qu/\frac{a_{u+1}}{q_{u+1}^{/}}<\frac{1}{q_{u}^{/}}, and considering the parity of r,nr,n we can refine the sums to:

(5.2) u=r+1nEuau+1qu+1/<u=r+1+ErnOnEuqu/<2qr+1+Er/\sum_{u=r+1}^{n}\frac{E_{u}a_{u+1}}{q_{u+1}^{/}}<\sum_{u=r+1+E_{r}}^{n-O_{n}}\frac{E_{u}}{q_{u}^{/}}<\frac{2}{q_{r+1+E_{r}}^{/}}
(5.3) u=r+1nOuau+1qu+1/<u=r+1+OrnEnOuqu/<2qr+1+Or/\sum_{u=r+1}^{n}\frac{O_{u}a_{u+1}}{q_{u+1}^{/}}<\sum_{u=r+1+O_{r}}^{n-E_{n}}\frac{O_{u}}{q_{u}^{/}}<\frac{2}{q_{r+1+O_{r}}^{/}}

Now r+1+Er2r+1+E_{r}\geq 2 and q2/>2q_{2}^{/}>2 and so u=r+1nEuau+1qu+1/={u=r+1nEuau+1qu+1/}\sum_{u=r+1}^{n}\frac{E_{u}a_{u+1}}{q_{u+1}^{/}}=\left\{\sum_{u=r+1}^{n}\frac{E_{u}a_{u+1}}{q_{u+1}^{/}}\right\}. Similarly r+1+Or2r+1+O_{r}\geq 2 for r1r\geq 1 and so u=r+1nOuau+1qu+1/={u=r+1nOuau+1qu+1/}\sum_{u=r+1}^{n}\frac{O_{u}a_{u+1}}{q_{u+1}^{/}}=\left\{\sum_{u=r+1}^{n}\frac{O_{u}a_{u+1}}{q_{u+1}^{/}}\right\} unless r=0r=0 and q1/<2q_{1}^{/}<2. But in the latter case q1=1q_{1}=1 and hence α>12\alpha>\frac{1}{2} which means b0=0b_{0}=0. Hence u=r+1nOuau+1qu+1/={u=r+1nOuau+1qu+1/}\sum_{u=r+1}^{n}\frac{O_{u}a_{u+1}}{q_{u+1}^{/}}=\left\{\sum_{u=r+1}^{n}\frac{O_{u}a_{u+1}}{q_{u+1}^{/}}\right\} whenever br0b_{r}\neq 0. We now know both these sums are less than 11.

Now recall α=puqu+(1)uququ+1/\alpha=\frac{p_{u}}{q_{u}}+\frac{(-1)^{u}}{q_{u}q_{u+1}^{/}} so that 1qu+1/=(1)u(quαpu)\frac{1}{q_{u+1}^{/}}=(-1)^{u}\left(q_{u}\alpha-p_{u}\right). Hence au+1qu+1/=(1)uau+1(quαpu)=(1)u(qu+1qu1)α(1)uau+1pu\frac{a_{u+1}}{q_{u+1}^{/}}=(-1)^{u}a_{u+1}\left(q_{u}\alpha-p_{u}\right)=(-1)^{u}\left(q_{u+1}-q_{u-1}\right)\alpha-(-1)^{u}a_{u+1}p_{u}. Summing over uu even gives the telescoping sum:

u=r+1nEuau+1qu+1/={u=r+1nEuau+1qu+1/}={u=r+1+Or+1nOnEu(qu+1qu1)α}={(qn+1Onqr+Or+1)α}={1qn+2On/+1qr+1+Or+1/}<1qr+1+Er/\sum_{u=r+1}^{n}\frac{E_{u}a_{u+1}}{q_{u+1}^{/}}=\left\{\sum_{u=r+1}^{n}\frac{E_{u}a_{u+1}}{q_{u+1}^{/}}\right\}=\left\{\sum_{u=r+1+O_{r+1}}^{n-O_{n}}E_{u}\left(q_{u+1}-q_{u-1}\right)\alpha\right\}=\left\{\left(q_{n+1-O_{n}}-q_{r+O_{r+1}}\right)\alpha\right\}=\left\{-\frac{1}{q_{n+2-O_{n}}^{/}}+\frac{1}{q_{r+1+O_{r+1}}^{/}}\right\}<\frac{1}{q_{r+1+E_{r}}^{/}}

Again using $r+1+E_{r}\geq 2$ gives $\frac{1}{q_{r+1+E_{r}}^{/}}<\frac{1}{2}$ and so $0\leq\sum_{u=r+1}^{n}\frac{E_{u}b_{u}}{q_{u+1}^{/}}<\frac{1}{q_{r+1+E_{r}}^{/}}-\frac{E_{r+1}}{q_{r+2}^{/}}<\frac{1}{2}$. Following the same argument for $u$ odd gives $0\leq\sum_{u=r+1}^{n}\frac{O_{u}b_{u}}{q_{u+1}^{/}}<\frac{1}{q_{r+1+O_{r}}^{/}}-\frac{O_{r+1}}{q_{r+2}^{/}}<\frac{1}{2}$ but again with the proviso that $b_{r}\neq 0$.

We can now use these results in (5.1) to obtain (for br0b_{r}\neq 0):

12<(1qr+1+Or/+Erqr+2/)<(u=r+1nOubuqu+1/)<{{αr00}}<(u=r+1nEubuqu+1/)<(1qr+1+Er/Orqr+2/)<12-\frac{1}{2}<\left(-\frac{1}{q_{r+1+O_{r}}^{/}}+\frac{E_{r}}{q_{r+2}^{/}}\right)<\left(-\sum_{u=r+1}^{n}\frac{O_{u}b_{u}}{q_{u+1}^{/}}\right)<\,\,\left\{\left\{\alpha_{r00}\right\}\right\}\,\,<\left(\sum_{u=r+1}^{n}\frac{E_{u}b_{u}}{q_{u+1}^{/}}\right)<\left(\frac{1}{q_{r+1+E_{r}}^{/}}-\frac{O_{r}}{q_{r+2}^{/}}\right)<\frac{1}{2}

It is easily checked that this is a restatement of the lemma. ∎

We now combine the previous lemmas of this section to obtain:

Lemma 47.

For rstrst canonical, ϵrst\epsilon_{rst} has lower and upper bounds ϵrstL\epsilon_{rst}^{L} and ϵrstU=ϵrstL+1qr+1/\epsilon_{rst}^{U}=\epsilon_{rst}^{L}+\frac{1}{q_{r+1}^{/}} respectively, satisfying:

\[
-\frac{1}{q_{r+1}^{/}}<\left(\frac{s-1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)<\left(\frac{(s-1)q_{r}+t}{q_{r}q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)=\epsilon_{rst}^{L}<\epsilon_{rst}<\epsilon_{rst}^{U}=\left(\frac{sq_{r}+t}{q_{r}q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)\leq\left(\frac{s+1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)\leq\frac{1}{q_{r}^{/}}.
\]

In particular $\left|\epsilon_{rst}\right|<\frac{1}{q_{r}^{/}}$, and if $s\neq 0$ then $\epsilon_{rst}>0$. Further for $r=n$ we have $\epsilon_{rst}=\frac{sq_{r}+t}{q_{r}q_{r+1}^{/}}>0$, and for $r<n,s=0,t=q_{r}$ we have $\epsilon_{rst}>\frac{1}{q_{r+2}^{/}}>0$.

Proof.

Using 0sbr1,1tqr0\leq s\leq b_{r}-1,1\leq t\leq q_{r} for rstrst canonical gives 0<sqr+tqrqr+1/s+1qr+1/0<\frac{sq_{r}+t}{q_{r}q_{r+1}^{/}}\leq\frac{s+1}{q_{r+1}^{/}}. Further 1qr+2/+s+1qr+1/1qr+2/+ar+1qr+1/=1qr/\frac{1}{q_{r+2}^{/}}+\frac{s+1}{q_{r+1}^{/}}\leq\frac{1}{q_{r+2}^{/}}+\frac{a_{r+1}}{q_{r+1}^{/}}=\frac{1}{q_{r}^{/}}. Hence 1qr+1/<ϵ<1qr/\frac{-1}{q_{r+1}^{/}}<\epsilon<\frac{1}{q_{r}^{/}} and for qr2q_{r}\geq 2 this means 12<ϵ<12\frac{-1}{2}<\epsilon<\frac{1}{2} so that ϵrst={{ϵrst}}\epsilon_{rst}=\{\{\epsilon_{rst}\}\}. The various inequalities of the first result then follow and we turn to the remaining results. For r=nr=n we have αn00=0\alpha_{n00}=0 and so ϵrst=sqr+tqrqr+1/>0\epsilon_{rst}=\frac{sq_{r}+t}{q_{r}q_{r+1}^{/}}>0. For s0s\neq 0, ϵrst>s1qr+1/+1qr+2/>0\epsilon_{rst}>\frac{s-1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}>0. If s=0s=0 but t=qrt=q_{r}, ϵrst>(s1)qr+tqrqr+1/+1qr+2/=1qr+2/>0\epsilon_{rst}>\frac{(s-1)q_{r}+t}{q_{r}q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}=\frac{1}{q_{r+2}^{/}}>0.

The results can be packaged into one simple and quite elegant result which includes all of the cases above:

Corollary 48 (Parity Duality).

Writing lrst={(1)rtprqr}+ϵrstLl_{rst}=\left\{(-1)^{r}\frac{tp_{r}}{q_{r}}\right\}+\epsilon_{rst}^{L} and urst={(1)rtprqr}+ϵrstUu_{rst}=\left\{(-1)^{r}\frac{tp_{r}}{q_{r}}\right\}+\epsilon_{rst}^{U}, then 0<1qr+2/<lrst=urst1qr+1/0<\frac{1}{q_{r+2}^{/}}<l_{rst}=u_{rst}-\frac{1}{q_{r+1}^{/}} and urst11qr+1qr/<1u_{rst}\leq 1-\frac{1}{q_{r}}+\frac{1}{q_{r}^{/}}<1 (r>0)r>0) and urst11qr+1/<1u_{rst}\leq 1-\frac{1}{q_{r+1}^{/}}<1 (r=0r=0). In addition:

lrst\displaystyle l_{rst} <αrst<urstreven\displaystyle<\text{$\alpha_{rst}$$<$$u_{rst}$}\qquad r\,\mathrm{even}
lrst\displaystyle l_{rst} <αrst¯<urst rodd\displaystyle<\text{$\overline{\alpha_{rst}}$$<$$u_{rst}$ }\qquad r\,\mathrm{odd}
Proof.

Note that {(1)rtprqr}11qr\left\{(-1)^{r}\frac{tp_{r}}{q_{r}}\right\}\leq 1-\frac{1}{q_{r}} and by (47) ϵrstU1qr/\epsilon_{rst}^{U}\leq\frac{1}{q_{r}^{/}} giving urst11qr+1qr/u_{rst}\leq 1-\frac{1}{q_{r}}+\frac{1}{q_{r}^{/}}. But 1qr/<1qr\frac{1}{q_{r}^{/}}<\frac{1}{q_{r}} for r>0r>0 and then urst<1u_{rst}<1.

For r=0r=0 we have {(1)rtprqr}=0\left\{(-1)^{r}\frac{tp_{r}}{q_{r}}\right\}=0 but we now have q0/=q0=1q_{0}^{/}=q_{0}=1 which does not achieve the result. However also from (47) if r=0r=0 ϵ0stU1q0/1q1/=11q1/\epsilon_{0st}^{U}\leq\frac{1}{q_{0}^{/}}-\frac{1}{q_{1}^{/}}=1-\frac{1}{q_{1}^{/}}, and q1/>1q_{1}^{/}>1 so that u0st<1u_{0st}<1 and so urst<1u_{rst}<1 for any r0r\geq 0. We now consider lrstl_{rst}.

If {tprqr}0\left\{\frac{tp_{r}}{q_{r}}\right\}\neq 0 then qr>1q_{r}>1 and we have {(1)rtprqr}1qr\left\{(-1)^{r}\frac{tp_{r}}{q_{r}}\right\}\geq\frac{1}{q_{r}}. From (47) ϵrstU>tqrqr+1/+1qr+2/\epsilon_{rst}^{U}>\frac{t}{q_{r}q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}} and so lrst>1qr+1qr+2/1qr+1/>1qr+2/>0l_{rst}>\frac{1}{q_{r}}+\frac{1}{q_{r+2}^{/}}-\frac{1}{q_{r+1}^{/}}>\frac{1}{q_{r+2}^{/}}>0.

If {tprqr}=0\left\{\frac{tp_{r}}{q_{r}}\right\}=0 then lrst=ϵrstU1qr+1/l_{rst}=\epsilon_{rst}^{U}-\frac{1}{q_{r+1}^{/}} and either qr=1q_{r}=1 (and then t=1=qrt=1=q_{r}) or t=qrt=q_{r} - in either case t=qrt=q_{r}. Then ϵrsqrU>1qr+1/+1qr+2/\epsilon_{rsq_{r}}^{U}>\frac{1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}} and hence lrst>1qr+2/>0l_{rst}>\frac{1}{q_{r+2}^{/}}>0.

Finally ϵrstL=ϵrstU1qr+1/\epsilon_{rst}^{L}=\epsilon_{rst}^{U}-\frac{1}{q_{r+1}^{/}} in (47), which gives us lrst=urst1qr+1/l_{rst}=u_{rst}-\frac{1}{q_{r+1}^{/}}.

The first result follows, and we turn to consideration of the results involving αrst\alpha_{rst}.

Recall from (44) αrst={tprqr+(1)rϵrst}\alpha_{rst}=\left\{\frac{tp_{r}}{q_{r}}+(-1)^{r}\epsilon_{rst}\right\} and |ϵrst|<1qr/\left|\epsilon_{rst}\right|<\frac{1}{q_{r}^{/}} and ϵrstL<ϵrst<ϵrstU\epsilon_{rst}^{L}<\epsilon_{rst}<\epsilon_{rst}^{U}

For {tprqr}0\left\{\frac{tp_{r}}{q_{r}}\right\}\neq 0 (so that qr>1q_{r}>1) {tprqr+(1)rϵrst}={tprqr}+(1)rϵrst\left\{\frac{tp_{r}}{q_{r}}+(-1)^{r}\epsilon_{rst}\right\}=\left\{\frac{tp_{r}}{q_{r}}\right\}+(-1)^{r}\epsilon_{rst}.

For rr even this gives αrst={tprqr}+ϵrst\alpha_{rst}=\left\{\frac{tp_{r}}{q_{r}}\right\}+\epsilon_{rst}.

If rr is odd then αrst¯=1αrst=1({tprqr}ϵrst)={tprqr}+ϵrst\overline{\alpha_{rst}}=1-\alpha_{rst}=1-\left(\left\{\frac{tp_{r}}{q_{r}}\right\}-\epsilon_{rst}\right)=\left\{-\frac{tp_{r}}{q_{r}}\right\}+\epsilon_{rst}.

The results follow in both cases from ϵrstL<ϵrst<ϵrstU\epsilon_{rst}^{L}<\epsilon_{rst}<\epsilon_{rst}^{U} and the definitions of urst,lrstu_{rst},l_{rst}.

If {tprqr}=0\left\{\frac{tp_{r}}{q_{r}}\right\}=0 we have {tprqr+(1)rϵrst}={(1)rϵrst}\left\{\frac{tp_{r}}{q_{r}}+(-1)^{r}\epsilon_{rst}\right\}=\left\{(-1)^{r}\epsilon_{rst}\right\}. We also have urst=ϵrstU,lrst=ϵrstLu_{rst}=\epsilon_{rst}^{U},l_{rst}=\epsilon_{rst}^{L}. Hence 0<lrst=ϵrstL<ϵrst<ϵrstU=urst<10<l_{rst}=\epsilon_{rst}^{L}<\epsilon_{rst}<\epsilon_{rst}^{U}=u_{rst}<1 which also gives us {ϵrst}=ϵrst\left\{\epsilon_{rst}\right\}=\epsilon_{rst}.

For rr even this gives αrst={ϵrst}=ϵrst\alpha_{rst}=\left\{\epsilon_{rst}\right\}=\epsilon_{rst}.

If rr is odd then αrst¯=1{ϵrst}={ϵrst}=ϵrst\overline{\alpha_{rst}}=1-\left\{-\epsilon_{rst}\right\}=\left\{\epsilon_{rst}\right\}=\epsilon_{rst}.

The results follow directly. ∎

Note that we can rewrite the second inequality using x¯=1x\overline{x}=1-x to obtain the dual result lrst¯>αrst>urst¯\overline{l_{rst}}>\alpha_{rst}>\overline{u_{rst}} for rr odd.

We have now developed a lot of understanding of where the point αrst\alpha_{rst} lies in the partition defined by the points tprqrt\frac{p_{r}}{q_{r}}, but there are also some important special cases where we can go further (and which are also important when we come to consider the application to anergodic Birkhoff sums).

Corollary 48 tells us that αrst\alpha_{rst} always lies in one of the two intervals of length 1/qr1/q_{r} either side of tpr/qrtp_{r}/q_{r}, and in the majority of cases (s0s\neq 0 or r=nr=n or t=qrt=q_{r}) it lies in the primary interval. With a little extra work we can identify the interval more precisely.

The partition interval of αrst\alpha_{rst}

Definition 49.

Fixing rr, the circle points kqr\frac{k}{q_{r}} (k=1..qrk=1..q_{r}) form a partition defining qrq_{r} distinct open intervals of length 1qr\frac{1}{q_{r}}. For qr=1q_{r}=1 there is only one such point and interval, but for qr>1q_{r}>1 each point is surrounded by two distinct intervals, one on each side, which we will label IkI_{k} and Ik-I_{k} (the positive and negative intervals of the point kqr\frac{k}{q_{r}}), and which we will write generically as (1)uIk(-1)^{u}I_{k} for some uu_{\mathbb{Z}}. We define (1)uIk(-1)^{u}I_{k} specifically as the set of points {k+(1)uνqr}Set\left\{\frac{k\,+\,(-1)^{u}\nu}{q_{r}}\right\}_{Set} for ν(0,1)\nu\in(0,1).

Note that in fact the latter definition also holds for qr=1q_{r}=1, but in this case the two intervals coincide (Ik=IkI_{k}=-I_{k}).

Recall that circle points kqr,k+mqrqr\frac{k}{q_{r}},\frac{k+mq_{r}}{q_{r}} coincide for any integer mm_{\mathbb{Z}}, so that (1)uIk=(1)uIk+mqr(-1)^{u}I_{k}=(-1)^{u}I_{k+mq_{r}}.

The set of positive intervals and the set of negative intervals represent two different ways of labelling the same underlying set of intervals which make up the partition. There is therefore a natural bijection between the two sets of labels. We can capture this explicitly as follows:

Lemma 50.

(1)uIk=(1)u+1Ik+(1)u(-1)^{u}I_{k}=(-1)^{u+1}I_{k+(-1)^{u}}

Proof.

(1)uIk={k+(1)uνqr}={k+(1)u(1)u(1ν)qr}=(1)u+1Ik+(1)u(-1)^{u}I_{k}=\left\{\frac{k\,+\,(-1)^{u}\nu}{q_{r}}\right\}=\left\{\frac{k+(-1)^{u}\,-\,(-1)^{u}(1-\nu)}{q_{r}}\right\}=(-1)^{u+1}I_{k+(-1)^{u}}. ∎
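The relabelling in Lemma 50 can be checked on sample points. The following is a minimal sketch in exact rational arithmetic, assuming a hypothetical denominator `q = 7` and a finite sample of values of ν; the helper names `interval_point` and `relabel` are illustrative and not part of the paper.

```python
from fractions import Fraction

def interval_point(k, u, nu, q):
    """Circle point (k + (-1)^u * nu) / q, reduced to [0, 1)."""
    return ((k + (-1) ** u * nu) / q) % 1

def relabel(k, u):
    """Alternative label from Lemma 50: (-1)^u I_k = (-1)^(u+1) I_(k + (-1)^u)."""
    return k + (-1) ** u, u + 1

q = 7  # hypothetical denominator q_r, chosen only for illustration
nus = [Fraction(j, 10) for j in range(1, 10)]  # sample values of nu in (0, 1)

checks = []
for k in range(1, q + 1):
    for u in (0, 1):
        k2, u2 = relabel(k, u)
        original = {interval_point(k, u, nu, q) for nu in nus}
        # Under the relabelling the parameter nu is replaced by 1 - nu.
        relabelled = {interval_point(k2, u2, 1 - nu, q) for nu in nus}
        checks.append(original == relabelled)

all_match = all(checks)
```

A finite sample cannot prove the set identity, but it makes visible that the parameter ν on one label corresponds to 1 - ν on the other.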

Lemma 51.

αrst(1)rsgn(ϵrst)Ik\alpha_{rst}\in(-1)^{r}\operatorname{sgn}(\epsilon_{rst})I_{k} where k=tprmodqrk=tp_{r}\bmod q_{r}.

Proof.

This follows immediately from αrst={tprqr+(1)rϵrst}\alpha_{rst}=\left\{t\frac{p_{r}}{q_{r}}+(-1)^{r}\epsilon_{rst}\right\} and |ϵrst|<1qr\left|\epsilon_{rst}\right|<\frac{1}{q_{r}}. ∎

Proposition 52.

For r0r\geq 0, pr=(1)r+1qr11modqrp_{r}=(-1)^{r+1}q_{r-1}^{-1}\bmod q_{r}

Proof.

This follows immediately from the identity qrpr1prqr1=(1)rq_{r}p_{r-1}-p_{r}q_{r-1}=(-1)^{r} for r0r\geq 0. ∎
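As a numerical illustration of Proposition 52, the sketch below builds convergents from a list of partial quotients by the standard recurrence and checks both the determinant identity and the resulting congruence. The partial quotient lists are hypothetical examples, the case q_r = 1 is skipped because the congruence is then trivial, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
def convergents(partial_quotients):
    """Convergents (p_r, q_r), r = 0, 1, ..., of [a_0; a_1, a_2, ...]."""
    p_prev, q_prev = 1, 0              # (p_{-1}, q_{-1})
    p, q = partial_quotients[0], 1     # (p_0, q_0)
    out = [(p, q)]
    for a in partial_quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out

def check_proposition_52(partial_quotients):
    """For each r >= 1 with q_r > 1, test p_r = (-1)^(r+1) * q_{r-1}^{-1} mod q_r."""
    cs = convergents(partial_quotients)
    results = []
    for r in range(1, len(cs)):
        (p_prev, q_prev), (p, q) = cs[r - 1], cs[r]
        # Determinant identity used in the proof.
        assert q * p_prev - p * q_prev == (-1) ** r
        if q > 1:
            inverse = pow(q_prev, -1, q)   # modular inverse of q_{r-1} mod q_r
            results.append(p % q == ((-1) ** (r + 1) * inverse) % q)
    return results

golden_like = [0] + [1] * 12           # hypothetical example: all partial quotients 1
mixed = [0, 2, 3, 1, 4, 1, 5]          # hypothetical example with varied quotients
golden_ok = check_proposition_52(golden_like)
mixed_ok = check_proposition_52(mixed)
```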

Lemma 53.

For canonical rstrst, and any integer uu, the points αrst\alpha_{rst} which lie in (1)uIk(-1)^{u}I_{k} are precisely those for which either (a) t=(1)r+1kqr1modqrt=(-1)^{r+1}kq_{r-1}\bmod q_{r} and sgn(ϵrst)=(1)u+r\operatorname{sgn}(\epsilon_{rst})=(-1)^{u+r}, or (b) t=(1)r+1(k+(1)u)qr1modqrt=(-1)^{r+1}\left(k+(-1)^{u}\right)q_{r-1}\bmod q_{r} and sgn(ϵrst)=(1)u+r+1\operatorname{sgn}(\epsilon_{rst})=(-1)^{u+r+1}

Proof.

This follows from Lemmas 50 and 51 together with Proposition 52. ∎

Special case - the intervals around the origin

Corollary 54.

Given rr, the canonical triples for which αrst\alpha_{rst} lies in an interval around the origin (ie ±I0\pm I_{0}) are given by:

  1. (1)

    For (1)rI0(-1)^{r}I_{0}: t=qrt=q_{r} (for any ss), or (r<n,s=0,t=qrqr1r<n,s=0,t=q_{r}-q_{r-1} and sgn(ϵr0(qrqr1))=1\operatorname{sgn}(\epsilon_{r0(q_{r}-q_{r-1})})=-1). In the latter case we get the lower bound lr0(qrqr1)>1qr+1qr+1/+1qr+2/>12qrl_{r0(q_{r}-q_{r-1})}>\frac{1}{q_{r}}+\frac{-1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}>\frac{1}{2q_{r}}.

  2. (2)

    For (1)r+1I0(-1)^{r+1}I_{0}: t=qr1modqrt=q_{r-1}\bmod q_{r} and sgn(ϵrst)=+1\operatorname{sgn}(\epsilon_{rst})=+1. Then we get the upper bound ursqr1<1ar+1/sqr+1/u_{rsq_{r-1}}<1-\frac{a_{r+1}^{/}-s}{q_{r+1}^{/}}.

Proof.

For canonical rstrst, the points αrst\alpha_{rst} which lie in (1)uI0(-1)^{u}I_{0} are precisely those for which either (a) t=0modqrt=0\bmod q_{r} and sgn(ϵrst)=(1)u+r\operatorname{sgn}(\epsilon_{rst})=(-1)^{u+r}, or (b) t=(1)r+1((1)u)qr1modqrt=(-1)^{r+1}\left((-1)^{u}\right)q_{r-1}\bmod q_{r} and sgn(ϵrst)=(1)u+r+1\operatorname{sgn}(\epsilon_{rst})=(-1)^{u+r+1}

(a) For t=qrt=q_{r} we have from (47) that always sgn(ϵrst)=+1\operatorname{sgn}(\epsilon_{rst})=+1 and so u+r=0mod2u+r=0\bmod 2 and αrsqr\alpha_{rsq_{r}} always lies in (1)rI0(-1)^{r}I_{0}.

(b) If u=rmod2u=r\bmod 2, the possibilities are sgn(ϵrst)=1\operatorname{sgn}(\epsilon_{rst})=-1 and t=qrqr1t=q_{r}-q_{r-1} giving αrst\alpha_{rst} in (1)rI0(-1)^{r}I_{0}. But in this case we can only have sgn(ϵrst)=1\operatorname{sgn}(\epsilon_{rst})=-1 if r<n,s=0r<n,s=0. Then lr0(qrqr1)>1qr+(1qr+1/+qrqr1qrqr+1/+1qr+2/)>12qrl_{r0(q_{r}-q_{r-1})}>\frac{1}{q_{r}}+\left(\frac{-1}{q_{r+1}^{/}}+\frac{q_{r}-q_{r-1}}{q_{r}q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)>\frac{1}{2q_{r}}.

If u=r+1mod2u=r+1\bmod 2, the possibilities are sgn(ϵrst)=+1\operatorname{sgn}(\epsilon_{rst})=+1 and t=qr1t=q_{r-1} giving αrst\alpha_{rst} in (1)r+1I0(-1)^{r+1}I_{0}. But in this case we can only have sgn(ϵrst)+1\operatorname{sgn}(\epsilon_{rst})\neq+1 if s=0s=0. And ursqr1=(11qr)+(sqr+qr1qrqr+1/+1qr+2/)<1ar+1/sqr+1/u_{rsq_{r-1}}=\left(1-\frac{1}{q_{r}}\right)+\left(\frac{sq_{r}+q_{r-1}}{q_{r}q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)<1-\frac{a_{r+1}^{/}-s}{q_{r+1}^{/}} by (48). ∎

6. Analysis of spaces of unbounded observables

The analysis of spaces of unbounded (and non integrable) functions is not well developed, and so in this section we lay some foundations suitable for our purposes. The usual approaches to functional analysis via the theory of Banach spaces and Schauder bases do not seem immediately helpful here. Instead, noting that fundamentally we are interested in summing functions over large orbits, we look for help from the elementary theory of integration. In particular we find that extending the ideas of Bounded Variation and Jordan Decomposition leads to a natural set of function spaces suitable for our use.

For precise definitions of circle terminology in this section, see Section 2.4.

6.1. Definitions and Basic Topology

Recall that a real observable on the circle is a (total) function ϕ𝕋\phi_{\mathbb{T}\mathbb{R}}. Our first inclination may be to define an unbounded observable as a function which can take values on the extended real line {±}\mathbb{R}\bigcup\{\pm\infty\}. However this is not necessary, and instead it proves technically simpler to restrict the codomain of unbounded observables to \mathbb{R} for the following reasons:

  1. (1)

    The sum i=1nϕ(xi)\sum_{i=1}^{n}\phi(x_{i}) is always well defined (we avoid the difficulty of defining \infty-\infty)

  2. (2)

    The function i=1nϕi\sum_{i=1}^{n}\phi_{i} is always well defined (again we avoid the difficulty of defining \infty-\infty)

  3. (3)

    We can derive observables (total functions) ϕ𝕋\phi_{\mathbb{T}\mathbb{R}} from partial functions by using the simple convention that the resulting observable vanishes outside the domain of the partial function. We will write ϕX(x)=ψY(x)\phi_{X\mathbb{R}}(x)=\psi_{Y\mathbb{R}}(x) for YXY\subseteq X to mean ϕ\phi coincides with ψ\psi on YY and vanishes on XX outside YY.

Example 55.

The function ϕ𝕀(x)=1x\phi_{\mathbb{I}\mathbb{R}}(x)=\frac{1}{x} on 𝕀=(0,1)\mathbb{I}=(0,1) is a partial function on 𝕋=[0,1)\mathbb{T}=[0,1) whose domain is (0,1).(0,1). It extends to the observable ϕ𝕋\phi_{\mathbb{T}\mathbb{R}}, a total function on [0,1)[0,1) by ϕ𝕋(0)0\phi_{\mathbb{T}\mathbb{R}}(0)\coloneqq 0. The observable ϕ\phi is unbounded but defined on the whole of [0,1)[0,1), and has codomain \mathbb{\mathbb{R}} (there is no need to extend \mathbb{\mathbb{R}} with points at infinity). We allow ourselves to write ϕ𝕋(x)=1x\phi_{\mathbb{T}\mathbb{R}}(x)=\frac{1}{x} with the convention that ϕ(x)=0\phi(x)=0 where 1x\frac{1}{x} is undefined.

The following treatment of locally bounded variation is slightly non-standard but again allows us to avoid some technical difficulties. We first extend the definition of bounded variation to non-closed intervals:

Definition 56 (Bounded Variation on an Interval).

Let JJ be any interval (not necessarily closed) of the real line or the circle, and let Part(J)Part(J) be the set of partitions of JJ (ie ordered sequences of the form (xi)i=1n(x_{i})_{i=1}^{n} for n1n\geq 1). Given a function ϕJ\phi_{J\mathbb{\mathbb{R}}}, if JJ is not the full circle, we define the variation of ϕ\phi over JJ as VarJϕ=supPart(J)r=1n1|ϕ(xr+1)ϕ(xr)|\operatorname{Var}_{J}\phi=\sup_{Part(J)}\sum_{r=1}^{n-1}\left|\phi(x_{r+1})-\phi(x_{r})\right|. If VarJϕ\operatorname{Var}_{J}\phi exists (ie is finite) we say ϕ\phi is of Bounded Variation (ϕ\phi is BV) on JJ, otherwise ϕ\phi is of Unbounded Variation (ϕ\phi is UBV) on JJ. If JJ is the full circle, we define Var𝕋ϕ=supPart(𝕋)r=1n|ϕ(xr+1)ϕ(xr)|\operatorname{Var}_{\mathbb{T}}\phi=\sup_{Part(\mathbb{T})}\sum_{r=1}^{n}\left|\phi(x_{r+1})-\phi(x_{r})\right| where xn+1x1x_{n+1}\coloneqq x_{1}.

Note that this definition coincides with the classical definitions of Bounded Variation on closed intervals and on the full circle.

Example 57.

Given the observable ϕx=1x\phi x=\frac{1}{x} on the circle (using our convention that ϕ(0)=0\phi(0)=0) then ϕ\phi is BV on the directed intervals [12,1)[\frac{1}{2},1),[12,1][\frac{1}{2},1] but UBV on [0,12)[0,\frac{1}{2}) and (0,12)(0,\frac{1}{2}).
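The contrast in Example 57 can be seen by evaluating partition sums directly. The sketch below is illustrative only: it uses uniform partitions (a hypothetical choice, since the definition takes a supremum over all partitions) and exact rational arithmetic, and the names `partition_variation` and `uniform_partition` are not from the paper.

```python
from fractions import Fraction

def phi(x):
    """Observable 1/x on the circle [0, 1), with the convention phi(0) = 0."""
    return Fraction(0) if x == 0 else 1 / Fraction(x)

def partition_variation(points):
    """Sum of |phi(x_{r+1}) - phi(x_r)| along one ordered partition."""
    values = [phi(x) for x in points]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

def uniform_partition(left, right, n):
    """n + 1 equally spaced points from left to right inclusive."""
    return [left + (right - left) * Fraction(j, n) for j in range(n + 1)]

half = Fraction(1, 2)
sizes = (10, 100, 1000)

# Partitions of [1/2, 1 - 1/n]: phi is monotone here, so the sums stay below 1.
bounded = [partition_variation(uniform_partition(half, 1 - Fraction(1, n), n))
           for n in sizes]

# Partitions of [1/n, 1/2]: the sums grow without bound as the left end nears 0.
unbounded = [partition_variation(uniform_partition(Fraction(1, n), half, n))
             for n in sizes]
```

Because 1/x is monotone on each of these ranges, every partition sum telescopes to the difference of the endpoint values, which is why the first list stays below 1 while the second grows like n.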

We now give a simple but powerful principle which is not specifically related to unboundedness, but includes it.

Definition 58 (Localisation).

Let (X,)(X,\mathscr{F}) be a topological space, and PP a proposition defined on the open sets of XX. We derive the proposition locally PXP_{X} (written LPXLP_{X}), which is defined at points of XX as follows: xx is locally PP (ie LPX(x)LP_{X}(x)) iff xx has an open neighbourhood NxN_{x}\in\mathscr{F} satisfying PX(Nx)P_{X}(N_{x}). In particular if the proposition is defined in terms of a function ϕXY\phi_{XY}, ie PP(ϕXY)P\coloneqq P(\phi_{XY}), then when xx is locally PP (ie LP(x)LP(x)) we will also say that ϕ\phi is locally PP at xx. We will denote the set of points at which ϕ\phi is locally PP as LP(ϕ)LP(\phi).

Example 59.

Let (J,)(J,\mathscr{F}) be an interval of the real line or the circle equipped with the ambient subspace topology, and ϕJ\phi_{J\mathbb{R}} a fixed function. Let P=BVP=BV, the proposition defined on each KK\in\mathscr{F} that ϕ\phi is of Bounded Variation on KK. Then ϕJ\phi_{J\mathbb{R}} is of Locally Bounded Variation (ie LP=LBVLP=LBV) at xJx_{J} iff xx has a neighbourhood KK\in\mathscr{F} on which ϕ\phi is of Bounded Variation. We then write LBVJ(x)LBV_{J}(x). Note that if ϕ\phi is not LBVLBV at xx, then there is no neighbourhood of xx in JJ on which ϕ\phi is BVBV, which means in particular that ϕ\phi is UBVUBV on every open interval containing xx, and we will write UBV(x)UBV(x). Finally LBVJ(ϕ)LBV_{J}(\phi) is the set of points in JJ at which ϕ\phi is of locally bounded variation, and UBVJ(ϕ)UBV_{J}(\phi) is the set of points at which ϕ\phi is of Unbounded Variation.

Proposition 60.

Given a topological space (X,)(X,\mathscr{F}), an observable ϕX\phi_{X\mathbb{R}}, and a proposition PXP_{X} defined on open sets of XX, then the set of points LPX(ϕ)LP_{X}(\phi) (the points at which ϕ\phi is locally PP) is open, and its boundary belongs to the complement of LP(ϕ)LP(\phi), the closed set of points at which ϕ\phi is not locally PP

Proof.

Let xLP(ϕ)x\in LP(\phi) so there is a neighbourhood NxN_{x} with xNxx\in N_{x} and P(Nx)P(N_{x}). By definition there is an open set OxO_{x}\in\mathscr{F} with xOxNxx\in O_{x}\subseteq N_{x}. Now for any yOxy\in O_{x}, NxN_{x} is also a neighbourhood of yy, and since P(Nx)P(N_{x}) we have yLP(ϕ)y\in LP(\phi). Hence OxLP(ϕ)O_{x}\subseteq LP(\phi) and so xx is an interior point of LP(ϕ)LP(\phi) and LP(ϕ)LP(\phi) is open.

Now LP(x)LP(x) is defined for every xXx_{X} and so either LP(x)LP(x) or !LP(x)!LP(x), and so the points at which !LP(x)!LP(x) form the complement of LP(ϕ)LP(\phi). Since LP(ϕ)LP(\phi) is open, its complement is closed and the boundary of LP(ϕ)LP(\phi) lies completely in that complement. ∎

Corollary 61 (LBV topology).

In an interval JJ with the subspace topology, the points LBVJ(ϕ)LBV_{J}(\phi) at which a function ϕJ\phi_{J\mathbb{\mathbb{R}}} is of locally bounded variation, form an at most countable set of disjoint open intervals in JJ, and the boundaries of these intervals in JJ are of unbounded variation.

Proof.

Applying the Proposition to example 59 gives us that LBVJ(ϕ)LBV_{J}(\phi) (the set of points of JJ at which ϕ\phi is of locally bounded variation) is an open set in JJ, and hence (by Proposition 8) is an at most countable union of disjoint open intervals. The boundary of the set LBVJ(ϕ)LBV_{J}(\phi) in JJ is therefore the set of boundaries of these intervals and the proposition tells us these lie in !LBV(ϕ)!LBV(\phi), ie they are in UBV(ϕ)UBV(\phi).

Note that the endpoints of an interval JJ itself are not boundary points in the subspace topology, so that the endpoints of JJ may still be in LBVJ(ϕ).LBV_{J}(\phi). For example let ϕx=1/x\phi x=1/x with J=[34,1],J=[\frac{3}{4},1], then ϕJ\phi_{J\mathbb{\mathbb{R}}} is LBVLBV on the whole of JJ including both endpoints, and so LBV(ϕ)LBV(\phi) is the single interval JJ which is open without boundary in the subspace topology of JJ. Contrast this with the case of J=[0,14]J=[0,\frac{1}{4}], when LBV(ϕ)LBV(\phi) is now (0,14](0,\frac{1}{4}]: this interval has the boundary point 0 in JJ, which is UBVUBV, but 14\frac{1}{4} is LBVLBV and is not a boundary point of LBV(ϕ)LBV(\phi) in JJ.

Also from Corollary 61 we have !LBV(ϕ)=UBV(ϕ)!LBV(\phi)=UBV(\phi), ie ϕ\phi is of Unbounded Variation at all points (including endpoints) outside the open intervals of LBV(ϕ)LBV(\phi). Note that UBV(ϕ)UBV(\phi) itself may have positive measure. For example the Dirichlet observable (the indicator function on the rationals in [0,1)[0,1)) is UBVUBV everywhere and LBV(ϕ)LBV(\phi) is empty.

We summarise our findings in the following Theorem:

Theorem 62.

For any interval JJ of the circle or real line (with the subspace topology), and observable ϕJ\phi_{J\mathbb{\mathbb{R}}}, there is a canonical decomposition of JJ into the disjoint union of the closed set UBVJ(ϕ)UBV_{J}(\phi) with an at most countable set of open subintervals {Ik}\{I_{k}\} on which ϕ\phi is LBVLBV. Further, writing ϕU=xUBV(ϕ)ϕ,ϕk=xIkϕ\phi_{U}=\left\llbracket x\in UBV(\phi)\right\rrbracket\phi,\phi_{k}=\left\llbracket x\in I_{k}\right\rrbracket\phi we have

(6.1) ϕ=ϕU+kϕk\phi=\phi_{U}+\sum_{k}\phi_{k}

where each ϕk\phi_{k} is LBVLBV with disjoint open support IkI_{k}, and the boundary (endpoints) of each IkI_{k} are in UBV(ϕ)UBV(\phi)

Definition 63 (Primitive functions).

Given intervals IJI\subseteq J, a function ϕJ\phi_{J\mathbb{R}} is a primitive on II if it vanishes outside II (this means II contains the set theoretic support of ϕ\phi in JJ, but not necessarily the topological support, as this is the closure of the set theoretic support), is monotone on II, and has a bound on II (ie infIϕ\inf_{I}\phi or supIϕ\sup_{I}\phi exists, or both). If it has both bounds it is a bounded primitive, otherwise it is an upper or lower bounded primitive.

Notes:

  1. (1)

    A monotone function on an interval is LBVLBV on that interval, so that any primitive on II is LBVLBV on II.

  2. (2)

    The monotone constraint rules out the existence of primitives which are bounded functions of unbounded variation (eg sin1/x\sin 1/x on (0,1)(0,1)).

  3. (3)

    We need to take the same care over primitives and interval endpoints as we do for any LBVLBV functions. So a primitive may be unbounded on non-included endpoints, eg ϕ𝕋x=1/x\phi_{\mathbb{TR}}x=1/x is a lower bounded primitive on I=(0,1)I=(0,1) but unbounded at 0 (where its value is 0). If II includes an endpoint aa, then ϕ\phi must be bounded on II at aa though it may be unbounded outside II at aa, eg ϕ𝕋x=1/x\phi_{\mathbb{TR}}x=1/x is also a bounded primitive on I=[12,1]I=[\frac{1}{2},1], where it is bounded at 11 in II, but unbounded at 11 in 𝕋\mathbb{T}.

6.2. Some initial function classes

Normalised functions

Definition 64.

We say an observable ϕ\phi is normalised if ϕ(x)=0\phi(x)=0 on UBV(ϕ)UBV(\phi). (Note this requires UBV(ϕ)UBV(\phi) to have empty interior). The normalisation of ϕ\phi is the normalised function ϕNorm(x)=xUBV(ϕ)ϕ(x)\phi^{Norm}(x)=\left\llbracket x\not\in UBV(\phi)\right\rrbracket\phi(x). We say ϕ\phi is equivalent to ψ\psi under normalisation if the equivalence relation ϕNorm=ψNorm\phi^{Norm}=\psi^{Norm} holds.

Example 65.

The observable ϕx=1/x\phi x=1/x is normalised (since UBV(ϕ)={0}UBV(\phi)=\{0\} and ϕ0=0\phi 0=0). However the Dirichlet function is not normalised, since it is UBV everywhere but not 0 everywhere. Its normalisation is the 0 function.

When considering Birkhoff sums SNϕ=r=1Nϕ(x0+rα)S_{N}\phi=\sum_{r=1}^{N}\phi(x_{0}+r\alpha) we are primarily interested only in orbits which avoid the unbounded points of ϕ\phi. In these cases the value of ϕ\phi at each unbounded point has no effect on the Birkhoff sum, ie SNϕ=SNϕNormS_{N}\phi=S_{N}\phi^{Norm}. We will henceforth regard observables as normalised unless otherwise indicated.

Note that if ϕ\phi is normalised then ϕU=0\phi_{U}=0 in (6.1), and so this equation simplifies to ϕ=kϕk\phi=\sum_{k}\phi_{k}. In the next section we will investigate the decomposition of the functions ϕk\phi_{k}.

Banach congruence classes

Given an additive group GG of real valued functions, let GB={ϕ:sup|ϕ|<}G_{B}=\{\phi:\sup|\phi|<\infty\} be the Banach space (and subgroup) of bounded functions. Then each congruence class of G/GBG/G_{B}, other than GBG_{B} itself, consists of unbounded functions whose differences lie in GBG_{B}.

Definition 66.

We say that two observables ϕ,ψ\phi,\psi are of finite difference if ϕψ\phi-\psi is bounded, and we write ϕBψ\phi\stackrel{{\scriptstyle B}}{{\sim}}\psi, noting that this is an equivalence relation.

Example 67.

On [0,1)[0,1) it is easily shown that πcotπxB12xx(1x)B1{{x}}\pi\cot\pi x\stackrel{{\scriptstyle B}}{{\sim}}\frac{1-2x}{x(1-x)}\stackrel{{\scriptstyle B}}{{\sim}}\frac{1}{\{\{x\}\}} but cscπx≁cotπx\csc\pi x\not\sim\cot\pi x as their difference is unbounded as x1x\rightarrow 1.
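The claims of Example 67 can be probed numerically. The sketch below evaluates the pairwise differences on a uniform grid in floating point, so it only suggests boundedness and does not prove it. It assumes the signed fractional part {{x}} takes values in [-1/2, 1/2), which is a convention chosen here for illustration, and the helper names are hypothetical.

```python
import math

def signed_frac(x):
    """Signed fractional part {{x}}, assumed here to lie in [-1/2, 1/2)."""
    return x - math.floor(x + 0.5)

def pi_cot(x):
    return math.pi / math.tan(math.pi * x)

def rational(x):
    return (1 - 2 * x) / (x * (1 - x))

def recip_signed(x):
    return 1 / signed_frac(x)

def csc(x):
    return 1 / math.sin(math.pi * x)

def cot(x):
    return 1 / math.tan(math.pi * x)

def sup_difference(f, g, n):
    """Largest |f - g| over the interior grid points k/n, k = 1..n-1."""
    return max(abs(f(k / n) - g(k / n)) for k in range(1, n))

# Differences that should stay bounded however fine the grid.
d_cot_rational = [sup_difference(pi_cot, rational, n) for n in (100, 10000)]
d_rational_signed = [sup_difference(rational, recip_signed, n) for n in (100, 10000)]

# Difference that should blow up near x = 1 as the grid is refined.
d_csc_cot = [sup_difference(csc, cot, n) for n in (100, 10000)]
```

The first two lists stay of order 1 as the grid is refined, while the third grows roughly in proportion to the grid size, reflecting the identity csc(πx) - cot(πx) = tan(πx/2).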

If ϕ,ψ\phi,\psi are of finite difference, this means SNϕSNψ=O(N)S_{N}\phi-S_{N}\psi=O(N) and further if (ϕψ)=0\oint(\phi-\psi)=0 then SNϕSNψ=O(logN)2+ϵS_{N}\phi-S_{N}\psi=O(\log N)^{2+\epsilon} (**) for ae α\alpha. In particular this means if SNϕS_{N}\phi has a super-linear growth rate (ie greater than O(N)O(N)), then SNψS_{N}\psi has the same growth rate and the same set of unbounded points, and that this super-linear growth rate is determined purely by the behaviour of ϕ,ψ\phi,\psi at unbounded points.

We now refine some existing understanding and terminology related to boundedness to enable us to study unboundedness.

Monotone functions

We will use the word “increasing” always to mean monotone increasing unless we specify otherwise. The word “decreasing” is dual to increasing, and can be substituted in the following paragraph.

6.3. Classification of primitive functions on an interval

Figure 6.1. Quadrants of Primitives

Let IJI\subseteq J be an interval with endpoint coordinates a,ba,b, such that JJ contains the endpoints a,ba,b. (If JJ is the circle we allow a,ba,b to represent the same circle point, but require b=a+1b=a+1). Recall ϕJ\phi_{J\mathbb{\mathbb{R}}} is a primitive on II if ϕ\phi vanishes outside II, is monotone on II, and has a bound (ie at least one of infϕ\inf\phi or supϕ\sup\phi exists). Note that a primitive function which is constant is both ascending and descending on II, but any other primitive is either ascending or descending.

A primitive is called upper bounded if supϕ\sup\phi exists, and lower bounded if infϕ\inf\phi exists. It is called a bounded primitive if both bounds exist, and an unbounded primitive otherwise. We can therefore classify an unbounded primitive by whether it is ascending or descending (A or D) and whether it is upper or lower bounded (U or L).

Let us write the set of Descending, Lower bounded primitives on II as DL, we similarly define the sets AL, AU, and DU. An unbounded primitive belongs to precisely one of these sets, but a bounded primitive is both upper and lower bounded and so will belong to both AU and AL, or to both DU and DL, except for the case of a constant function which belongs to all 4 sets. Note that DL is closed under addition and contains the constant primitive 0. It therefore forms a commutative monoid. Further it forms a module over the semi-ring +={x:x0}\mathbb{R}^{+}=\{x\in\mathbb{R}:x\geq 0\}. The same applies to each of the other 3 sets.

Note that if ϕ\phi is a primitive on II, then so is ϕ-\phi. Further, if II is either open or closed (ie not semi-open) then ϕ¯(x)ϕ(a+bx)\overline{\phi}(x)\coloneqq\phi(a+b-x) is also a primitive on II, and hence so is ϕ¯-\overline{\phi}. Finally ϕ¯(x)=(ϕ)(a+bx)=ϕ¯(x)\overline{-\phi}(x)=(-\phi)(a+b-x)=-\overline{\phi}(x).

Let \mathcal{F} be a family of primitives of II. Then {ϕ:ϕ}-\mathcal{F}\coloneqq\{\phi:-\phi\in\mathcal{F}\} is clearly also a family of primitives of II. If II is either open or closed we can also define another family of primitives ¯{ϕ:ϕ¯}\overline{\mathcal{F}}\coloneqq\{\phi:\overline{\phi}\in\mathcal{F}\}, and then ¯=¯={ϕ:ϕ¯}\overline{-\mathcal{F}}=-\overline{\mathcal{F}}=\{\phi:-\overline{\phi}\in\mathcal{F}\} is a 4th family. We call the set {,,¯,¯}\{\mathcal{F},-\mathcal{F},\overline{\mathcal{F}},-\overline{\mathcal{F}}\} the dual quadrants of \mathcal{F} (under the dualities ,.¯-,\bar{.}). Note that the set {DL,AU,DU,AL}\{DL,AU,DU,AL\} comprises the dual quadrants of each of its 4 members when II is open or closed (but not semi-open), and fixing DLDL this gives AU=DL,AL=DL¯,DU=DL¯AU=-DL,AL=\overline{DL},DU=-\overline{DL}.

If ϕ\phi is a primitive of II which has a lower bound of 0 (ie ϕ0\phi\geq 0 on II) then we say ϕ\phi is a positive primitive on II, and if ϕ0\phi\leq 0 we say ϕ\phi is negative. We denote the set of descending positive primitives on II as Φ(I)\Phi(I), and note this is a sub-monoid of DLDL (Φ(I)DL(I)\Phi(I)\subset DL(I)). Then the dual quadrant Φ(I)AU(I)-\Phi(I)\subset AU(I) is the monoid of ascending negative primitives. If II is open or closed, we can also identify the other dual quadrants as the monoids Φ¯(I)AL(I),Φ¯(I)DU(I)\overline{\Phi}(I)\subset AL(I),-\overline{\Phi}(I)\subset DU(I).

Positive and negative primitives have the important properties that they can be extended and translated in JJ.

If ϕΦ(I)\phi\in\Phi(I) and KK is a right extension of II (ie xK\Ixbx\in K\backslash I\,\Rightarrow\,x\geq b where bb is the right endpoint of II), then ϕΦ(K)\phi\in\Phi(K) so that Φ(I)Φ(K)\Phi(I)\subset\Phi(K). Similarly Φ(I)Φ(K)-\Phi(I)\subset-\Phi(K), and if KK^{\prime} is a left extension of II (and open or closed with II) then also Φ¯(I)Φ¯(K),Φ¯(I)Φ¯(K)\overline{\Phi}(I)\subset\overline{\Phi}(K^{\prime}),-\overline{\Phi}(I)\subset-\overline{\Phi}(K^{\prime}).

We now assume I=(a,b)I=(a,b) is open. Let K=|c,d|K=|c,d| (where we set c=0,d=1c=0,d=1 when KK is the circle [0,1)[0,1)). If ϕΦ(I)\phi\in\Phi(I) we can define the translation Tϕ(x)=ϕ(x+(ac))T\phi(x)=\phi(x+(a-c)) and then TϕΦ(c,c+(ba))T\phi\in\Phi(c,c+(b-a)). By the remarks on extension above we can write TϕΦ(c,d)T\phi\in\Phi(c,d) and TϕΦ(c,d)-T\phi\in-\Phi(c,d). In particular when KK is the circle, TϕΦ(0,1),TϕΦ(0,1)T\phi\in\Phi(0,1),-T\phi\in-\Phi(0,1). Similarly if ϕΦ¯(I)\phi\in\overline{\Phi}(I) we define the translation T¯ϕ(x)=ϕ(x(db))\overline{T}\phi(x)=\phi(x-(d-b)) and then T¯ϕΦ¯(d(ba),d)\overline{T}\phi\in\overline{\Phi}(d-(b-a),d) which we can extend to T¯ϕΦ¯(c,d)\overline{T}\phi\in\overline{\Phi}(c,d) and T¯ϕΦ¯(c,d)-\overline{T}\phi\in-\overline{\Phi}(c,d). In particular when KK is the circle, T¯ϕΦ¯(0,1),T¯ϕΦ¯(0,1)\overline{T}\phi\in\overline{\Phi}(0,1),-\overline{T}\phi\in-\overline{\Phi}(0,1). In summary, positive and negative primitives on an open interval II can be translated to primitives on the interior of KK.

Note that a monotone function on a closed interval is BV. Since a primitive ϕ\phi on II vanishes outside II, if II is closed ϕ\phi is BV (and hence LBV) on the whole of JJ. If II is open then ϕ\phi is LBV on II (by 61) and LBVLBV outside the closure of II, and also LBVLBV at the bounded endpoint of II.

6.4. Representation of functions by primitives

Recall we denote the set of points of locally bounded variation LBV(ϕ)LBV(\phi), and its complement UBV(ϕ)UBV(\phi). We have shown LBV(ϕ)LBV(\phi) is a union of disjoint open intervals, although it is important to note that this includes the possibilities LBV(ϕ)=𝕋LBV(\phi)=\mathbb{T} (when ϕ\phi is BV on the whole circle) and LBV=LBV=\emptyset (for example ϕ(x)=x=pq,(p,q)=1q\phi(x)=\left\llbracket x=\frac{p}{q},(p,q)=1\right\rrbracket q is unbounded everywhere and so is not locally bounded anywhere). Otherwise there is an at most countable set of disjoint open intervals Ik=(xkL,xkR)I_{k}=(x_{k}^{L},x_{k}^{R}) with LBV(ϕ)=kIkLBV(\phi)=\bigcup_{k}I_{k} and xkL,xkRUBV(ϕ)x_{k}^{L},x_{k}^{R}\in UBV(\phi). We will assume LBV(ϕ),UBV(ϕ)LBV(\phi),UBV(\phi)\neq\emptyset going forward. We can now define the observables ϕUx=xUBV(ϕ)ϕx\phi_{U}x=\left\llbracket x\in UBV(\phi)\right\rrbracket\phi x and ϕi=x(xiL,xiR)ϕ\phi_{i}=\left\llbracket x\in(x_{i}^{L},x_{i}^{R})\right\rrbracket\phi, which enables us to write ϕ=ϕU+iϕi\phi=\phi_{U}+\sum_{i}\phi_{i} with UBV(ϕi)={xiL,xiR}UBV(\phi_{i})=\{x_{i}^{L},x_{i}^{R}\}. In the special case of ϕ\phi having a single UBV point xx, we get x1L=x1R=xx_{1}^{L}=x_{1}^{R}=x, otherwise the endpoints are all distinct.

Note that for a normalised function, by definition ϕU=0\phi_{U}=0.

We can now extend the Jordan Decomposition theorem from BV functions to LBV functions:

Lemma 68 (Jordan Decomposition of LBV).

If ϕ\phi is LBV on a (non-empty) open interval I=(a,b)I=(a,b) then for each point m(a,b)m\in(a,b) there is a canonical decomposition into primitives on II, namely ϕ=ϕ(m)+i=14ϕi\phi=\phi(m)+\sum_{i=1}^{4}\phi_{i} where ϕ1Φ(a,m),ϕ2Φ(a,m),ϕ3Φ¯(m,b),ϕ4Φ¯(m,b)\phi_{1}\in\Phi(a,m),\,\phi_{2}\in-\Phi(a,m),\,\phi_{3}\in\overline{\Phi}(m,b),\,\phi_{4}\in-\overline{\Phi}(m,b).

Proof.

We partition ϕ\phi into left and right functions ϕL=x(a,m)(ϕ(x)ϕ(m)),ϕR=x(m,b)(ϕ(x)ϕ(m))\phi^{L}=\left\llbracket x\in(a,m)\right\rrbracket(\phi(x)-\phi(m)),\,\phi^{R}=\left\llbracket x\in(m,b)\right\rrbracket(\phi(x)-\phi(m)). We now define the left and right variations of ϕ\phi for a<x<ba<x<b by VL(x)=x(a,m)Var[x,m]ϕ,VR(x)=x(m,b)Var[m,x]ϕV^{L}(x)=\left\llbracket x\in(a,m)\right\rrbracket\operatorname{Var}_{[x,m]}\phi,\,V^{R}(x)=\left\llbracket x\in(m,b)\right\rrbracket\operatorname{Var}_{[m,x]}\phi (the total variation of ϕ\phi over [x,m],[m,x][x,m],[m,x] respectively). These functions are well defined since ϕ\phi is BV on each closed interval [x,m][x,m] or [m,x][m,x]. We now define the difference functions DL(x)=VL(x)ϕL(x),DR(x)=VR(x)ϕR(x)D^{L}(x)=V^{L}(x)-\phi^{L}(x),\,D^{R}(x)=V^{R}(x)-\phi^{R}(x). Note that these 4 defined functions all vanish outside (a,m)(a,m) or (m,b)(m,b) and are monotone and positive and hence positive primitives on their domains. VL,DLV^{L},D^{L} are both decreasing and so VL,DLΦ(a,m)V^{L},D^{L}\in\Phi(a,m), whereas VR,DRV^{R},D^{R} are increasing and so VR,DRΦ¯(m,b)V^{R},D^{R}\in\overline{\Phi}(m,b). But now x(a,b)ϕ(x)=ϕ(m)+VL(x)DL(x)+VR(x)DR(x)\left\llbracket x\in(a,b)\right\rrbracket\phi(x)=\phi(m)+V^{L}(x)-D^{L}(x)+V^{R}(x)-D^{R}(x) and the result follows. ∎
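The construction in this proof has a direct discrete analogue, sketched below under stated assumptions: the variation is accumulated only along a fixed grid rather than as a supremum over all partitions, and the sample function `phi` (unbounded at both ends of (0,1) and non-monotone to the right of m = 1/2) is a hypothetical choice. The list names mirror the four functions of the proof.

```python
from fractions import Fraction

def jordan_pieces(values, m_index):
    """Grid analogues of V^L, D^L, V^R, D^R for sampled values of phi.

    values[i] is phi at the i-th grid point; m_index is the index of m.
    """
    n = len(values)
    base = values[m_index]
    VL = [Fraction(0)] * n
    DL = [Fraction(0)] * n
    VR = [Fraction(0)] * n
    DR = [Fraction(0)] * n
    # Right of m: accumulate |increments| moving away from m.
    for i in range(m_index + 1, n):
        VR[i] = VR[i - 1] + abs(values[i] - values[i - 1])
        DR[i] = VR[i] - (values[i] - base)
    # Left of m: accumulate |increments| moving away from m.
    for i in range(m_index - 1, -1, -1):
        VL[i] = VL[i + 1] + abs(values[i] - values[i + 1])
        DL[i] = VL[i] - (values[i] - base)
    return VL, DL, VR, DR

def phi(x):
    """Hypothetical LBV function on (0, 1), unbounded at both endpoints."""
    return 1 / x + 1 / (1 - x) - 6 * x

def is_nonincreasing(seq):
    return all(a >= b for a, b in zip(seq, seq[1:]))

def is_nondecreasing(seq):
    return all(a <= b for a, b in zip(seq, seq[1:]))

xs = [Fraction(k, 20) for k in range(1, 20)]
values = [phi(x) for x in xs]
m_index = xs.index(Fraction(1, 2))
VL, DL, VR, DR = jordan_pieces(values, m_index)

# phi = phi(m) + V^L - D^L + V^R - D^R at every grid point.
rebuilt = [values[m_index] + VL[i] - DL[i] + VR[i] - DR[i]
           for i in range(len(xs))]
```

On the grid, the two left pieces are non-negative and decreasing, the two right pieces are non-negative and increasing, and the four pieces together with the constant φ(m) rebuild the sampled values exactly.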

We can combine (6.1) and Lemma 68 to obtain:

Theorem 69 (Decomposition Theorem).

Given an observable ϕJ\phi_{J\mathbb{R}} then LBV(ϕ)LBV(\phi) is an at most countable union of disjoint open intervals Ik=(ak,bk)I_{k}=(a_{k},b_{k}), and ϕ=ϕU+kϕk\phi=\phi_{U}+\sum_{k}\phi_{k} where ϕU=xUBV(ϕ)ϕ\phi_{U}=\left\llbracket x\in UBV(\phi)\right\rrbracket\phi and ϕk\phi_{k} is an LBV function on IkI_{k}. Further, given mkIkm_{k}\in I_{k} we have the representation of ϕk\phi_{k} in terms of primitives given by ϕk=ϕ(mk)+j=14ϕkj\phi_{k}=\phi(m_{k})+\sum_{j=1}^{4}\phi_{kj} where ϕk1Φ(ak,mk),ϕk2Φ(ak,mk),ϕk3Φ¯(mk,bk),ϕk4Φ¯(mk,bk)\phi_{k1}\in\Phi(a_{k},m_{k}),\,\phi_{k2}\in-\Phi(a_{k},m_{k}),\,\phi_{k3}\in\overline{\Phi}(m_{k},b_{k}),\,\phi_{k4}\in-\overline{\Phi}(m_{k},b_{k}).

7. Homogeneous Birkhoff Sums of unbounded observables

7.1. Introduction

In the last section we established that if ϕ𝕋\phi_{\mathbb{TR}} is an observable on the circle we have the decomposition ϕ=ϕU+kKϕk\phi=\phi_{U}+\sum_{k\in K}\phi_{k} into (at most countable) functions of disjoint support, where each ϕk\phi_{k} is LBV with its support in an open interval IkI_{k}. We also showed we can decompose each ϕk\phi_{k} into a sum of 4 primitive functions ϕki\phi_{ki} from each quadrant of Φ(I)\Phi(I). Since SNS_{N} is linear this means we have reduced the study of SN(ϕ,x)S_{N}(\phi,x) to the study of SN(ϕU,x)S_{N}(\phi_{U},x) and the study of SN(ϕki,x)S_{N}(\phi_{ki},x) with ϕki\phi_{ki} primitive. We will generally be concerned with sums for which x+rαx+r\alpha never coincides with an unbounded point of ϕ\phi so that SN(ϕU,x)=0S_{N}(\phi_{U},x)=0. This means we can replace ϕ\phi by its normalised form ϕNorm\phi^{Norm} (which vanishes on the set of unbounded points of ϕ\phi) without changing the overall sum. In this section we concentrate our attention on the study of SN(ϕ,x)S_{N}(\phi,x) for primitive ϕ\phi.

Now suppose ϕ\phi is a primitive on the directed interval (a,b)(a,b). If ϕ\phi is unbounded at aa then ϕax𝕋ϕ{x+a}\phi_{a}\coloneqq x_{\mathbb{T}}\rightarrow\phi\{x+a\} is a primitive on (0,1)(0,1), unbounded at 0+0^{+}. Similarly if ϕ\phi is unbounded at bb then ϕbx𝕋ϕ{x+b}\phi_{b}\coloneqq x_{\mathbb{T}}\rightarrow\phi\{x+b\} is a primitive on (0,1)(0,1), unbounded at 11^{-}. Then SN(ϕ,x0)=SN(ϕa,x0a)S_{N}(\phi,x_{0})=S_{N}(\phi_{a},x_{0}-a) and SN(ϕ,x0)=SN(ϕb,x0b)S_{N}(\phi,x_{0})=S_{N}(\phi_{b},x_{0}-b) which means we can reduce the study of Birkhoff sums of general primitives to the study of Birkhoff sums of primitives on (0,1)(0,1). We will now assume ϕ\phi is a primitive on (0,1)(0,1).
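The reduction for a primitive unbounded at its left endpoint can be checked exactly. The following sketch assumes a hypothetical primitive on (a, b) = (1/5, 7/10) and, so that the arithmetic stays exact, uses a rational stand-in for the rotation number (the identity being tested is purely algebraic and does not depend on α being irrational). All names are illustrative.

```python
from fractions import Fraction

def frac(x):
    """Fractional part {x} in [0, 1)."""
    return x % 1

def birkhoff_sum(phi, x0, alpha, N):
    """S_N(phi, x0) = sum over r = 1..N of phi({x0 + r * alpha})."""
    return sum(phi(frac(x0 + r * alpha)) for r in range(1, N + 1))

a, b = Fraction(1, 5), Fraction(7, 10)  # hypothetical directed interval (a, b)

def phi(x):
    """Hypothetical primitive on (a, b): decreasing, unbounded at a, zero elsewhere."""
    return 1 / (x - a) if a < x < b else Fraction(0)

def phi_a(x):
    """Translate phi_a(x) = phi({x + a}): a primitive on (0, 1), unbounded at 0+."""
    return phi(frac(x + a))

alpha = Fraction(987, 1597)  # rational stand-in for an irrational rotation number
x0 = Fraction(1, 3)
N = 200

lhs = birkhoff_sum(phi, x0, alpha, N)       # S_N(phi, x0)
rhs = birkhoff_sum(phi_a, x0 - a, alpha, N)  # S_N(phi_a, x0 - a)
```

Here `phi_a` is supported on (0, b - a) = (0, 1/2), and the two sums agree term by term.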

Recall that we call SN(ϕ,x)S_{N}(\phi,x) a homogeneous sum when x=0x=0 and inhomogeneous otherwise.

The general case of an inhomogeneous sum SN(ϕ,x)S_{N}(\phi,x) is challenging. This may seem surprising because we can easily convert an inhomogeneous sum to a homogeneous sum by defining ϕc(x)=ϕ(x+c)\phi_{c}(x)=\phi(x+c), and then SN(ϕ,x0)=SN(ϕx0,0)S_{N}(\phi,x_{0})=S_{N}(\phi_{x_{0}},0). The problem is that in general ϕx0\phi_{x_{0}} will not then be a primitive on (0,1)(0,1), and the underlying problem has not been simplified.

Fortunately homogeneous sums are sufficient to address a high proportion of problems of interest (including the cases described in the abstract of this paper). Given that this paper is already somewhat lengthy, we therefore content ourselves with postponing consideration of inhomogeneous sums to another paper, and restrict this paper to the homogeneous case.

We will now use the results on the sequential distribution of (rα)(r\alpha) from section 5 to develop estimates of the homogeneous sum SNϕSN(ϕ,0)=r=1Nϕ(rα)S_{N}\phi\coloneqq S_{N}(\phi,0)=\sum_{r=1}^{N}\phi(r\alpha) when ϕ\phi is a monotonic function on (0,1)(0,1). We will use another variant of notation SN(x)SN(ϕ,x)=r=1Nϕ(x+rα)S_{N}(x)\coloneqq S_{N}(\phi,x)=\sum_{r=1}^{N}\phi(x+r\alpha), noting carefully that this is semantically distinct from the homonymous notation SN(x)r=1NxrS_{N}(x)\coloneqq\sum_{r=1}^{N}x_{r} of section 4.
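For orientation, a homogeneous sum of a monotone observable can be evaluated directly. The sketch below is a floating point illustration only, assuming the observable 1/{x} and the rotation number (√5 - 1)/2 as hypothetical examples; it makes no claim about the precise growth rate, which is the subject of the estimates developed in this section.

```python
import math

def birkhoff_sum_homogeneous(phi, alpha, N):
    """S_N(phi) = sum over r = 1..N of phi({r * alpha}), in floating point."""
    return sum(phi((r * alpha) % 1.0) for r in range(1, N + 1))

def recip_frac(x):
    """Observable 1/x on (0, 1), with the convention that it vanishes at 0."""
    return 1.0 / x if x != 0.0 else 0.0

golden = (math.sqrt(5) - 1) / 2  # hypothetical choice of rotation number
sums = {N: birkhoff_sum_homogeneous(recip_frac, golden, N) for N in (10, 100, 1000)}
```

Since every orbit point lies in (0, 1), each term exceeds 1, so each computed sum is larger than N; the excess over N comes from the orbit points that fall close to the origin.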

The restriction to monotone functions might at first seem restrictive. Classical studies on sums of bounded functions tend to focus on functions of bounded variation as the primary objects of interest. However since any function of bounded variation can be represented as the sum of two monotonic functions (one increasing, one decreasing), a focus on monotone functions is not a limitation, but rather a recognition of the fact that monotonic functions in fact lie slightly deeper.

We will first establish some duality results which reduce the number of cases we will need to consider. We will use the Source-Target Magma theory developed in Section 3, where we put the source $S=(0,1)$ with involution $\sigma_{S}\coloneqq x\mapsto(1-x)$ and target $T=\mathbb{R}$ with involution $\tau_{T}\coloneqq x\mapsto-x$.

7.2. Dual Relations for Birkhoff Functionals

In this section we will explore the use of dualities to reduce the number of cases we will need to consider when we look at bounds for Birkhoff sums. We will exploit both the quite general Source and Target dualities investigated in Section 3, and also a more specialised quasiperiodic duality which arises in the context of Continued Fraction theory.

These dualities arise under the operation of various involutions, and we will call fixed points of an involution self-conjugate under that involution.

We will also define a duality which combines the effects of the Source and Quasiperiodic involutions. As this will be an important duality for us, we will call it simply the Double Duality, and its fixed points are the double self-conjugate points.

Source-Target Dualities in Birkhoff Sums on the circle Recall that a Birkhoff sum SNϕS_{N}\phi can be regarded as a functional operator (SN)(𝕋)\left(S_{N}\right)_{(\mathbb{TR})\mathbb{R}} acting on the function ϕ𝕋\phi_{\mathbb{TR}}. Note that since the orbit (rα)(r\alpha) avoids the origin, the sum SNϕ𝕋=1Nϕ𝕋(rα)S_{N}\phi_{\mathbb{TR}}=\sum_{1}^{N}\phi_{\mathbb{TR}}(r\alpha) is identical in value with the sum SNϕS{rα}S_{N}\phi_{S\mathbb{R}}\left\{r\alpha\right\} where ϕ𝕋\phi_{\mathbb{TR}} has been replaced by its restriction to the interval S=(0,1)S=(0,1), and the Birkhoff Sum functional (SN)(𝕋)\left(S_{N}\right)_{(\mathbb{TR})\mathbb{R}} becomes (SN)(S)\left(S_{N}\right)_{(S\mathbb{R})\mathbb{R}}.

Using this observation, we can now apply the theory developed in Section 3. In particular we will build on the example application given in Subsection 3.4. We take as the real intervals S(0,1)S\coloneqq(0,1) and TT\coloneqq\mathbb{R} (again equipped with the endorelations RS=RT=R_{S}=R_{T}=\leq_{\mathbb{R}}). We now set the source involution σS\sigma_{S} to be the circle involution .¯:x{1x}\overline{.}:x\mapsto\{1-x\} but restricted to (0,1)(0,1) (where we note it is still an involution and can be written x1xx\mapsto 1-x), and we set the target involution to be τTxx\tau_{T}\coloneqq x_{\mathbb{R}}\mapsto-x.

Note that now the source pull up of σS\sigma_{S} is σST=(.¯)STϕSTϕσS\sigma_{ST}=(\overline{.})_{ST}\coloneqq\phi_{ST}\mapsto\phi\circ\text{$\sigma_{S}$}, ie ϕ¯αϕα¯\overline{\phi}\alpha\coloneqq\phi\overline{\alpha}. Similarly the target pull up of τT\tau_{T} is τST=STϕTϕ\tau_{ST}=-_{ST}\coloneqq\phi\mapsto-_{T}\circ\phi, ie (STϕ)α(ϕα)\left(-_{ST}\phi\right)\alpha\coloneqq-_{\mathbb{R}}\left(\phi\alpha\right). Again we can pull up to Level 2, to get σ(ST)TA(ST)TAσST\sigma_{(ST)T}\coloneqq A_{(ST)T}\mapsto A\circ\sigma_{ST}, ie A¯ϕAϕ¯\overline{A}\phi\coloneqq A\overline{\phi} and similarly ((ST)TA)ϕ(Aϕ)\left(-_{(ST)T}A\right)\phi\coloneqq-_{\mathbb{R}}\left(A\phi\right).

If we further assume that AA is a τ\tau-morphism this means A(STϕ)=(Aϕ)A(-_{ST}\phi)=-_{\mathbb{R}}\left(A\phi\right).

In this source-target environment we also have further structure than in 3.4 since $T=\left(\mathbb{R},+_{\mathbb{R}},\times_{\mathbb{R}}\right)$ is a field. The pull up of the binary operations of $T$ to $ST$ induces an algebra of functions over $T$ (where $\lambda_{T}\phi_{ST}\,+_{ST}\,\mu_{T}\psi_{ST}$ and $\phi\,\times_{ST}\,\psi$ have their natural meanings). It is easily verified that the involutions $\sigma_{ST},\tau_{ST}$ are then linear in both $+_{ST},\times_{ST}$. The same argument applies analogously to the algebra $(ST)T$ of functionals over $T$.

If ΨST\Psi\subset ST is self-conjugate under στ\sigma\tau, the Example application of Subsection 3.4 now applies, giving us:

$$\begin{aligned}A_{1}\leq_{\Psi}A_{2} & \Leftrightarrow\sigma A_{1}\leq_{\overline{\Psi}}\sigma A_{2}\Leftrightarrow\sigma^{/}A_{1}\leq_{\overline{\Psi}}\sigma^{/}A_{2}\Leftrightarrow\sigma\sigma^{/}A_{1}\leq_{\Psi}\sigma\sigma^{/}A_{2}\\
 & \Leftrightarrow\tau A_{1}\leq_{\Psi}^{Op}\tau A_{2}\Leftrightarrow\sigma\tau A_{1}\leq_{\overline{\Psi}}^{Op}\sigma\tau A_{2}\Leftrightarrow\sigma^{/}\tau A_{1}\leq_{\overline{\Psi}}^{Op}\sigma^{/}\tau A_{2}\Leftrightarrow\sigma\sigma^{/}\tau A_{1}\leq_{\Psi}^{Op}\sigma\sigma^{/}\tau A_{2}\end{aligned}\tag{7.1}$$

We can go further in the case that A1,A2A_{1},A_{2} are τ\tau-morphisms. We use the results of (3.10) to obtain from A1ΨA2A_{1}\leq_{\Psi}A_{2} :

$$\begin{aligned}A_{1}\leq_{\Psi}A_{2} & \Leftrightarrow\sigma A_{1}\leq_{\overline{\Psi}}\sigma A_{2}\Leftrightarrow\tau A_{1}\leq_{\overline{\Psi}}\tau A_{2}\Leftrightarrow\sigma\tau A_{1}\leq_{\Psi}\sigma\tau A_{2}\\
 & \Leftrightarrow\tau A_{1}\leq_{\Psi}^{Op}\tau A_{2}\Leftrightarrow\sigma\tau A_{1}\leq_{\overline{\Psi}}^{Op}\sigma\tau A_{2}\Leftrightarrow A_{1}\leq_{\overline{\Psi}}^{Op}A_{2}\Leftrightarrow\sigma A_{1}\leq_{\Psi}^{Op}\sigma A_{2}\end{aligned}\tag{7.2}$$

We pick out the first and last dual relations as being of particular use to us later in this Section:

$$A_{1}\leq_{\Psi}A_{2}\Leftrightarrow\sigma A_{2}\leq_{\Psi}\sigma A_{1}\tag{7.3}$$
Remark 70.

Let $\Theta\subset ST$ be a set of functions which are descending (ascending) on $S$. Now $\sigma_{ST}\tau_{ST}\Theta=-\overline{\Theta}$ and this is also a set of descending (ascending) functions. Recall $\Psi(\Theta)\coloneqq\Theta\bigcup\sigma\tau\Theta=\Theta\bigcup-\overline{\Theta}$ is self-conjugate under $\sigma\tau$, as is also $\sigma\Psi=\overline{\Psi}=\overline{\Theta}\bigcup-\Theta$. So in these cases $\Psi$ is a self-conjugate set of descending (ascending) functions, and $\overline{\Psi}$ is a self-conjugate set of ascending (descending) functions for which the dual relations in (7.1) apply, and further (7.2) applies when $A_{1},A_{2}$ are $\tau$-morphisms. There are two important special cases:

  1. (1)

    When Θ=Ψ\Theta=\Psi is the set of all monotonic descending (or monotonic ascending, or simply all monotonic) functions

  2. (2)

    When Ψ=ΦστΦ\Psi=\Phi\bigcup\sigma\tau\Phi where Φ\Phi is the monoid of positive descending primitives from (**).

Quasi-Period Duality

In this section we explore a specialised duality which arises in the context of continued fraction developments. We will use some of the Source/Target Duality theory from the previous section, but now in our specific environment of Birkhoff sums over rotations.

Indexed Families Recall that it is sometimes convenient to call a function $\iota_{IX}:I\rightarrow X$ an index function. We then call $I$ the index set, and $x_{i}\coloneqq\iota(i)$ an indexed element. We call the collection $(x_{i})_{i\in I}$ an indexed family (indexed by $I$). If $I=\{i\}_{i=1}^{n}$ we also write the indexed family as $(x_{i})_{i=1}^{n}$.

Remark 71.

It is particularly important to note that (xi)(x_{i}) is not a set if ι\iota is not injective (it is a collection containing identical elements). This means we must be particularly careful if we wish to define functions on the set X={xi}X=\{x_{i}\} making use of the index ii. For example, given an endomorphism fIIf_{II} of II we can define the pull back function gIX=ιfg_{IX}=\iota\circ f which is a well-defined index function from II to XX. It is then tempting to think that we can then also define a derived endomorphism hXX:xixf(i)h_{XX}:x_{i}\mapsto x_{f(i)} but this is not the case in general: if hh exists and ι(i)=ι(j)\iota(i)=\iota(j) then xf(i)=h(xi)=h(xj)=xf(j)x_{f(i)}=h(x_{i})=h(x_{j})=x_{f(j)} and this is not the case for general ff, even if ff is a bijection. In general then, hXXh_{XX} is a relation (or multi-valued function).
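The failure described in this remark can be seen on a three-element index set. The sketch below is purely illustrative: the index set $\{0,1,2\}$, the index map `iota` and the bijection `f` are hypothetical choices, and `derived_relation` is a helper name introduced here.

```python
# Hypothetical index set I = {0, 1, 2} with a non-injective index map iota.
iota = {0: "a", 1: "a", 2: "b"}
# A bijection f of I (a cyclic permutation).
f = {0: 2, 1: 0, 2: 1}

# The pull back g = iota o f is always a well-defined index function on I.
g = {i: iota[f[i]] for i in iota}

def derived_relation(iota, f):
    """Collect, for each element x_i of X, every candidate image x_{f(i)}."""
    h = {}
    for i, x in iota.items():
        h.setdefault(x, set()).add(iota[f[i]])
    return h

h = derived_relation(iota, f)
# h["a"] contains two values, so x_i -> x_{f(i)} is a relation, not a function.
```

Here the indices 0 and 1 both label the element "a", but their images under `f` label different elements, so the derived map is multi-valued even though `f` is a bijection.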

Quasi-Period Indexed Families Recall (10) that we denote the sequence of quasiperiods of $\alpha_{\mathbb{R}\backslash\mathbb{Q}}$ as $(q_{r}^{\alpha})_{r=0}^{\infty}$, and that for $\alpha\in(0,\frac{1}{2})\backslash\mathbb{Q}$ we have $q_{r}^{\alpha}=q_{\overline{r}}^{\overline{\alpha}}$ where $\overline{\alpha}=\{1-\alpha\}$ and $\overline{r}=r+1$. This result reveals a non-trivial duality which we will develop in this section.
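The index shift between the quasiperiods of $\alpha$ and $\overline{\alpha}$ can be checked on a small example. This is a sketch under stated assumptions: the rational $408/985$ (a convergent of $\sqrt{2}-1$) stands in for an irrational $\alpha<\frac{1}{2}$, the helper name `quasiperiods` is introduced here, and the convention $q_{0}=1$, $q_{1}=a_{1}$ is assumed.

```python
from fractions import Fraction

def quasiperiods(x, count):
    """Return up to `count` convergent denominators q_0, q_1, ... of x in (0, 1).

    Assumes the convention q_0 = 1, q_1 = a_1, q_{r+1} = a_{r+1} q_r + q_{r-1}.
    """
    qs = [1]
    q_prev, q_curr = 0, 1          # q_{-1} = 0, q_0 = 1
    rem = Fraction(x)
    while len(qs) < count and rem != 0:
        inv = 1 / rem
        a = inv.numerator // inv.denominator   # next partial quotient
        rem = inv - a
        q_prev, q_curr = q_curr, a * q_curr + q_prev
        qs.append(q_curr)
    return qs

# Rational stand-in for an irrational alpha < 1/2 (an assumption for illustration).
alpha = Fraction(408, 985)
qs_alpha = quasiperiods(alpha, 6)          # q_0 .. q_5 for alpha
qs_alpha_bar = quasiperiods(1 - alpha, 7)  # q_0 .. q_6 for 1 - alpha
```

Dropping the first entry of `qs_alpha_bar` recovers `qs_alpha`, which is the relation $q_{r}^{\alpha}=q_{r+1}^{\overline{\alpha}}$ on this example.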

Note first that we have here two maps with the homonymous notation '$\overline{.}$'. We can extend both to involutions as follows. The first is defined on $\mathbb{I}=[0,1)$ by $\left(\overline{.}\right)_{\mathbb{I}}\coloneqq\alpha\mapsto\{1-\alpha\}$ (note that $\overline{0}=0$ is a fixed point). The second needs slightly more care in definition: recalling $\mathbb{N}=\mathbb{N}^{+}\bigcup\{0\}$ we take the disjoint union $\mathbb{N}_{2}=\mathbb{N}\bigsqcup\mathbb{N}^{+}$ and define the involution $\left(\overline{.}\right)_{\mathbb{N}_{2}}$ by: $\overline{r_{\mathbb{N}}}\coloneqq(r+1)_{\mathbb{N}^{+}}$ and $\overline{r_{\mathbb{N}^{+}}}\coloneqq(r-1)_{\mathbb{N}}$. We can now use these two involutions to define a product involution $\left(\overline{.}\right)_{\mathbb{I}}\times\left(\overline{.}\right)_{\mathbb{N}_{2}}$ on a suitable set as follows:

Definition 72.

The QuasiPeriod index set QPI=(QPISet,.¯)QPI=\left(QPI_{Set},\overline{.}\right) has as its underlying set the strict subset QPISet𝕀×QPI_{Set}\subset\mathbb{I}\times\mathbb{N} defined by

$$QPI\coloneqq\left\{(\alpha,r):\alpha\in\mathbb{I}\backslash\mathbb{Q}\;\&\;r\geq\left\llbracket\alpha\geq\frac{1}{2}\right\rrbracket\right\}$$

It is equipped with the Quasiperiod involution QQPIQ_{QPI} defined by Q(α,r)(α,r)¯=(α¯,r¯)Q(\alpha,r)\coloneqq\overline{(\alpha,r)}=(\overline{\alpha},\overline{r}) where α¯{1α}\overline{\alpha}\coloneqq\{1-\alpha\} and r¯r+(1)α>12\overline{r}\coloneqq r+(-1)^{\left\llbracket\alpha>\frac{1}{2}\right\rrbracket}.

A QP indexed family (QPI family) $X=\left\{(x_{r}^{\alpha}),\iota_{(QPI)X}\right\}$ is a family $X=(x_{r}^{\alpha})$ indexed by the QP index set $QPI$ using the index map $\iota_{(QPI)X}:(\alpha,r)_{QPI}\mapsto x_{r}^{\alpha}$. There is usually no need to specify the index map, in which case we simply write $X=(x_{r}^{\alpha})$. Points satisfying $x_{\overline{r}}^{\overline{\alpha}}=x_{r}^{\alpha}$ we call QP self-conjugate. If the index map is independent of $\alpha$, ie for each $r$, $\iota(\alpha,r)=x_{r}$ for every $\alpha$, we will also allow ourselves to write the family as $(x_{r}^{\alpha})=(x_{r})$. Similarly if $\iota$ is independent of $r$, we will write the family as $(x^{\alpha})$, and as $(x)$ if $\iota$ is a constant map.

Remark 73.

Note that (α,r)QPI(\alpha,r)\in QPI means r0r\geq 0 for α<12\alpha<\frac{1}{2} and r1r\geq 1 for α>12\alpha>\frac{1}{2}. In particular this means that a QPI family (xrα)(x_{r}^{\alpha}) contains no element x0αx_{0}^{\alpha} for α>12\alpha>\frac{1}{2}.
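The quasiperiod involution of Definition 72 is simple enough to sketch directly. In the sketch below the function names `in_qpi` and `qp_involution` are introduced for illustration, and a rational number again stands in for an irrational $\alpha$ (so the irrationality condition of the definition is not checked).

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def in_qpi(alpha, r):
    """Membership test for the QP index set: r >= [alpha >= 1/2].

    Irrationality of alpha is assumed rather than checked.
    """
    return 0 < alpha < 1 and r >= (1 if alpha >= HALF else 0)

def qp_involution(alpha, r):
    """Q(alpha, r) = ({1 - alpha}, r + (-1)^[alpha > 1/2])."""
    alpha_bar = 1 - alpha                       # equals {1 - alpha} on (0, 1)
    r_bar = r - 1 if alpha > HALF else r + 1
    return alpha_bar, r_bar

# Rational stand-in for an irrational alpha < 1/2 (an assumption for illustration).
alpha = Fraction(408, 985)
images = [qp_involution(alpha, r) for r in range(5)]
round_trips = [qp_involution(*image) for image in images]
```

Applying `qp_involution` twice returns the original pair, every image lies in the index set, and the parity of $r$ is flipped, which is the mechanism behind $E_{\overline{r}}^{\overline{\alpha}}=O_{r}^{\alpha}$.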

Note that a QP indexed family may be further indexed by other index sets, eg the family (xrzα)(x_{rz}^{\alpha}) is indexed by QPI×ZQPI\times Z, but the QP involution only affects the QP index within the family, ie xrzα=ι((α,r),z)x_{rz}^{\alpha}=\iota\left((\alpha,r),z\right) and ι((α¯,r¯),z)=xr¯zα¯\iota\left((\overline{\alpha},\overline{r}),z\right)=x_{\overline{r}z}^{\overline{\alpha}}.

In line with Remark 71 we cannot assume that there is an induced qp involution $Q_{X}\coloneqq x_{r}^{\alpha}\mapsto x_{\overline{r}}^{\overline{\alpha}}$ on a QPI family $X=(x_{r}^{\alpha})$ if the index map is not injective. However in many (but not all) important cases, the function $x_{r}^{\alpha}\mapsto x_{\overline{r}}^{\overline{\alpha}}$ is well-defined (and then the induced function is also an involution since $Q_{QPI}$ is itself an involution). When this is the case some individual proofs below could be simplified. However our overall presentation is simplified by having a single set of proofs which work whether an induced involution is defined or not. Hence we will NOT generally assume $Q\coloneqq x_{r}^{\alpha}\mapsto x_{\overline{r}}^{\overline{\alpha}}$ is well-defined in the sequel.

We now define a duality which will prove important in later sections. This is a hybrid using both the QP involution and the source involution $\left(\overline{.}\right)_{\mathbb{II}}\coloneqq\alpha\mapsto\{1-\alpha\}$.

Definition 74 (Double Dual).

Given a QPI family of functions (ϕrα)(\phi_{r}^{\alpha}), we call the derived family (ϕr¯α¯¯)\left(\overline{\phi_{\overline{r}}^{\overline{\alpha}}}\right) the family of the double dual functions of (ϕrα)(\phi_{r}^{\alpha}). For convenience we will use the simplified notation ϕ¯rαϕr¯α¯¯\overline{\phi}_{r}^{\alpha}\coloneqq\overline{\phi_{\overline{r}}^{\overline{\alpha}}}. When ϕ¯rα=ϕrα\overline{\phi}_{r}^{\alpha}=\phi_{r}^{\alpha} we say ϕrα\phi_{r}^{\alpha} is double self-conjugate.

Remark 75.

It is important to note that following Remark (**) we cannot regard the Double Dual as a function map because $\overline{\phi}_{r}^{\alpha}\mapsto\phi_{r}^{\alpha}$ may be multi-valued.

Example 76.

Important examples of QPI families

  1. (1)

    The quasiperiod index set QPIQPI is itself a QP indexed family using the index map IdQPIId_{QPI}. Since IdId is injective, in this case the involution Q:(α,r)(α,r)¯Q:(\alpha,r)\mapsto\overline{(\alpha,r)} exists trivially.

  2. (2)

    A scalar qp family (λrα)\left(\lambda_{r}^{\alpha}\right) with each λrα\lambda_{r}^{\alpha}\in\mathbb{R}. Important examples are qrα,ar+1α,brαq_{r}^{\alpha},a_{r+1}^{\alpha},b_{r}^{\alpha} (where these symbols have their standard meanings). Note that these particular examples are also qp self-conjugate (eg qrα=qr¯α¯q_{r}^{\alpha}=q_{\overline{r}}^{\overline{\alpha}}) with the single exception of ar+1αa_{r+1}^{\alpha} for α<12,r=0\alpha<\frac{1}{2},r=0 when a2α¯=a1α1a_{2}^{\overline{\alpha}}=a_{1}^{\alpha}-1. Scalar families will not generally have injective index maps (Q(λrα)Q_{{}_{\left(\lambda_{r}^{\alpha}\right)}}is not well defined), although the identity function serves as a partial qp involution when restricted to the set of self-conjugate elements.

  3. (3)

    The particular scalar qp family $\left(E_{r}^{\alpha}\right)$ where $E_{r}^{\alpha}\coloneqq E_{r}=\left\llbracket r\,\mathrm{even}\right\rrbracket$ (so the family $\left(E_{r}^{\alpha}\right)$ is independent of $\alpha$). Note $E_{\overline{r}}^{\overline{\alpha}}=O_{r}^{\alpha}$ and $O_{\overline{r}}^{\overline{\alpha}}=E_{r}^{\alpha}$ and so $\left(E_{r}^{\alpha}\right)=\left(O_{r}^{\alpha}\right)$. The index map $\iota:(\alpha,r)\mapsto E_{r}^{\alpha}=E_{r}$ is clearly not injective, but in fact the induced involution $Q_{\left(E_{r}^{\alpha}\right)}$ does exist.

  4. (4)

    Given a scalar qp family $\{\lambda_{r}^{\alpha}\}$ we can form new scalar families such as $\left\{f_{\mathbb{RR}}(\lambda_{r}^{\alpha})\right\}$ or $\left\{\phi_{\mathbb{IR}}(\lambda_{r}^{\alpha})\right\}$ (when $\lambda_{r}^{\alpha}\in\mathbb{I}$), and $\left\llbracket P(\lambda_{r}^{\alpha})\right\rrbracket$ (where $P$ is a proposition). If $\lambda_{r}^{\alpha}$ is qp self conjugate, so are $f(\lambda_{r}^{\alpha})$, $\phi(\lambda_{r}^{\alpha})$ and $\left\llbracket P(\lambda_{r}^{\alpha})\right\rrbracket$. In particular $\phi\left(\frac{1}{q_{r}^{\alpha}}\right)$ is qp self conjugate for $r>0$.

  5. (5)

    The qp family of qp index functions nα:+n^{\alpha}:\mathbb{N}^{+}\rightarrow\mathbb{N} defined by nα(N)max{t:qtαN}n^{\alpha}(N)\coloneqq\max\{t:q_{t}^{\alpha}\leq N\}. Note we have dropped rr from the notation to indicate that these functions are independent of rr, ie for any rr, nrα=nαn_{r}^{\alpha}=n^{\alpha}. Note also that we can now write the Ostrowski representation of NN as N=r=α>12nα(N)brqrN=\sum_{r=\left\llbracket\alpha>\frac{1}{2}\right\rrbracket}^{n^{\alpha}(N)}b_{r}q_{r}, and also that we can now write the relationship between the indexes nα,nα¯n^{\alpha},n^{\overline{\alpha}} as nα¯(N)=nα(N)+(1)α>12n^{\overline{\alpha}}(N)=n^{\alpha}(N)+(-1)^{\left\llbracket\alpha>\frac{1}{2}\right\rrbracket}. In particular this gives qnαα=qnα¯α¯.q_{n^{\alpha}}^{\alpha}=q_{n^{\overline{\alpha}}}^{\overline{\alpha}}.

  6. (6)

    The qp scalar family crstαc_{rst}^{\alpha} for some symbol cc:

    1. (a)

      When cc is the null symbol we will simply write αrst\alpha_{rst}. The QP conjugate is α¯r¯st\overline{\alpha}_{\overline{r}st}, and since α¯r¯st={1αrst}=αrst¯\overline{\alpha}_{\overline{r}st}=\{1-\alpha_{rst}\}=\overline{\alpha_{rst}} qp involution and source involution coincide on 𝕀\\mathbb{I}\backslash\mathbb{Q}

    2. (b)

      When $c$ is an observable symbol, eg $\phi$, we write $\phi_{rst}^{\alpha}\coloneqq\phi(\alpha_{rst})$ with QP conjugate $\phi_{\overline{r}st}^{\overline{\alpha}}=\phi(\overline{\alpha}_{\overline{r}st})=\phi(\overline{\alpha_{rst}})=\overline{\phi}_{rst}^{\alpha}$, and again qp involution and source involution coincide
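The index function $n^{\alpha}(N)$ and the Ostrowski digits from item (5) of the examples above can be sketched as follows. This is an illustration under assumptions: the quasiperiod lists are hard-coded for the rational stand-in $\alpha=408/985$ and for $\overline{\alpha}=1-\alpha$, the digits are produced by a greedy top-down division (assumed here to agree with the Ostrowski digits $b_{r}$), and the function names are introduced for this sketch.

```python
def n_index(qs, N):
    """n(N) = max{t : q_t <= N} for a list qs = [q_0, q_1, ...]."""
    return max(t for t, q in enumerate(qs) if q <= N)

def ostrowski_digits(qs, N):
    """Greedy digits b_0..b_n with N = sum_r b_r q_r, taken from the top index down."""
    n = n_index(qs, N)
    digits = [0] * (n + 1)
    rem = N
    for t in range(n, -1, -1):
        digits[t], rem = divmod(rem, qs[t])
    return digits

# Quasiperiods for a stand-in alpha < 1/2 and for 1 - alpha (illustrative values).
qs_alpha = [1, 2, 5, 12, 29, 70]
qs_alpha_bar = [1, 1, 2, 5, 12, 29, 70]

N = 20
n_alpha = n_index(qs_alpha, N)
n_alpha_bar = n_index(qs_alpha_bar, N)
b_alpha = ostrowski_digits(qs_alpha, N)
b_alpha_bar = ostrowski_digits(qs_alpha_bar, N)
```

On this example $n^{\overline{\alpha}}(N)=n^{\alpha}(N)+1$, the digit $b_{0}$ for $\overline{\alpha}$ vanishes, and the remaining digits coincide after the index shift, in line with $b_{r}^{\alpha}$ being qp self-conjugate.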

QPI Families of Operators

Let $\mathscr{X}$ be the set of QPI families of functionals (families whose elements lie in $(ST)T$), and let $\left(X_{r}^{\alpha}\right),\left(Y_{r}^{\alpha}\right)$ be two members of $\mathscr{X}$. Define $Z_{r}^{\alpha}\coloneqq\lambda X_{r}^{\alpha}+\mu Y_{r}^{\alpha}$. By this definition $Z_{\overline{r}}^{\overline{\alpha}}\coloneqq\lambda X_{\overline{r}}^{\overline{\alpha}}+\mu Y_{\overline{r}}^{\overline{\alpha}}$ and hence $\left(Z_{r}^{\alpha}\right)$ is also a QPI family, and we can write $\left(Z_{r}^{\alpha}\right)=\lambda_{T}\left(X_{r}^{\alpha}\right)\,+_{\mathscr{X}}\,\mu_{T}\left(Y_{r}^{\alpha}\right)$ to mean for each $(\alpha,r)$ that $Z_{r}^{\alpha}=\lambda X_{r}^{\alpha}+\mu Y_{r}^{\alpha}$. Note then $\mathscr{X}$ is a vector space over $T$, and $+_{\mathscr{X}}$ is the pull up of $+_{T}$ to $\mathscr{X}$. Similarly we can also pull up $\times_{T}$ to $\mathscr{X}$ so that $\left(X_{r}^{\alpha}\right)\times\left(Y_{r}^{\alpha}\right)$ is again a QPI family, and so $\mathscr{X}$ is also an algebra over $T$.

Recall also that when $ST$ is equipped with an involution $\upsilon_{ST}$ then this induces a pullup involution $\upsilon_{(ST)T}$ defined by $\upsilon_{(ST)T}(X)\coloneqq X\circ\upsilon_{ST}$ which is also linear in $+_{ST},\times_{ST}$. Again given a QPI family $\left(X_{r}^{\alpha}\right)$ we can define the pullup involution $\upsilon_{\mathscr{X}}$ by $\upsilon_{\mathscr{X}}\left(\,\left(X_{r}^{\alpha}\right)\,\right)=\left(\upsilon_{(ST)T}X_{r}^{\alpha}\right)$ and again this is linear in $+_{\mathscr{X}},\times_{\mathscr{X}}$, so that $\upsilon_{\mathscr{X}}$ is an algebra automorphism.

In particular we have proved:

Proposition 77 (Linearity of Double Dual).

If $Z_{r}^{\alpha}=\sum_{i=1}^{k}X_{ir}^{\alpha}\times Y_{ir}^{\alpha}$ then $\left(Z_{r}^{\alpha}\right)$ is a QPI family and $\overline{Z_{\overline{r}}^{\overline{\alpha}}}=\sum_{i=1}^{k}\overline{X_{i\overline{r}}^{\overline{\alpha}}}\times\overline{Y_{i\overline{r}}^{\overline{\alpha}}}$, ie the double dual of an algebraic combination of $QPI$ families is the same algebraic combination of the double duals.

Proof.

We showed above that $\left(Z_{r}^{\alpha}\right)$ is a QPI family. The double dual of $Z_{r}^{\alpha}$ is $\sigma_{(ST)T}Z_{\overline{r}}^{\overline{\alpha}}$ which by definition of $Z_{r}^{\alpha}$ is $\sigma_{(ST)T}\left(\sum_{i=1}^{k}X_{i\overline{r}}^{\overline{\alpha}}\times Y_{i\overline{r}}^{\overline{\alpha}}\right)$ and the result follows since $\sigma$ is an algebra homomorphism. ∎

Example 78.

Important examples of qp operator families

  1. (1)

    Given an operator XX we can define the constant qp family 𝒳X\mathscr{X}_{X} in which Xrαϕ=XϕX_{r}^{\alpha}\phi=X\phi for all (α,r)QPI(\alpha,r)\in QPI. This family is automatically qp self conjugate (Xr¯α¯ϕ=Xϕ=XrαϕX_{\overline{r}}^{\overline{\alpha}}\phi=X\phi=X_{r}^{\alpha}\phi).

  2. (2)

    The partition sum operator family PqrαP_{q_{r}^{\alpha}} (see Subsection 7.3 for details) is a linear operator family which is Source self conjugate (Pqrα=Pqrα¯P_{q_{r}^{\alpha}}=\overline{P_{q_{r}^{\alpha}}}), QP self conjugate (Pqrα=Pqr¯α¯)(P_{q_{r}^{\alpha}}=P_{q_{\overline{r}}^{\overline{\alpha}}}), and hence also double self conjugate (Pqrα=P¯qrαPqr¯α¯¯P_{q_{r}^{\alpha}}=\overline{P}_{q_{r}^{\alpha}}\coloneqq\overline{P_{q_{\overline{r}}^{\overline{\alpha}}}})

  3. (3)

    $\phi\mapsto\phi_{rst}^{\alpha}\coloneqq\phi\left(\alpha_{rst}\right)$ is an (anonymous) linear operator family which is neither source nor qp self conjugate, but is double self conjugate ($\phi_{rst}^{\alpha}=\overline{\phi}_{rst}^{\alpha}\coloneqq\overline{\phi_{\overline{r}st}^{\overline{\alpha}}}$).

  4. (4)

    The family Srsαϕ=Sqrα(ϕ,αrs)t=1qrϕrstαS_{rs}^{\alpha}\phi=S_{q_{r}}^{\alpha}(\phi,\alpha_{rs})\coloneqq\sum_{t=1}^{q_{r}}\phi_{rst}^{\alpha} is double self conjugate since ϕϕrstα\phi\mapsto\phi_{rst}^{\alpha} is double self conjugate, and hence also the sum s=0br1Srsα\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha} is double self-conjugate.

  5. (5)

    ϕϕ(f(qrα))\phi\mapsto\phi(f(q_{r}^{\alpha})) is an (anonymous) linear operator family which is not source self conjugate, but which is qp self conjugate since the qp family (qrα)\left(q_{r}^{\alpha}\right) is qp self conjugate (qrα=qr¯α¯q_{r}^{\alpha}=q_{\overline{r}}^{\overline{\alpha}}). In particular ϕϕ(1qrα)\phi\mapsto\phi(\frac{1}{q_{r}^{\alpha}}) is qp self conjugate.

Recall that in this section we are considering the Source-Target Magma where S=(0,1),T=S=(0,1),T=\mathbb{R}.

Definition 79.

Given two QPI operator families $\left(X_{r}^{\alpha}\right),\left(Y_{r}^{\alpha}\right)$ we will say $\left(X_{r}^{\alpha}\right)$ is dominated by $\left(Y_{r}^{\alpha}\right)$ on $\Theta\subseteq ST$ if for every $(\alpha,r)\in QPI$ we have $X_{r}^{\alpha}\leq_{\Theta}Y_{r}^{\alpha}$. We write this as $\left(X_{r}^{\alpha}\right)\leq_{\Theta}\left(Y_{r}^{\alpha}\right)$.

Lemma 80 (Domination Duals).

Let ΨST\Psi\subset ST be self-conjugate under the involution σSTτST\sigma_{ST}\tau_{ST}. Given two linear QPI families (Xrα),(Yrα)\left(X_{r}^{\alpha}\right),\left(Y_{r}^{\alpha}\right) of functionals from (ST)T(ST)T, then we have the following dual results:

(Xrα)Ψ(Yrα)(Xr¯α¯)Ψ(Yr¯α¯)(Yrα¯)Ψ(Xrα¯)(Yr¯α¯¯)Ψ(Xr¯α¯¯)\left(X_{r}^{\alpha}\right)\leq_{\Psi}\left(Y_{r}^{\alpha}\right)\Leftrightarrow\left(X_{\overline{r}}^{\overline{\alpha}}\right)\leq_{\Psi}\left(Y_{\overline{r}}^{\overline{\alpha}}\right)\Leftrightarrow\left(\overline{Y_{r}^{\alpha}}\right)\leq_{\Psi}\left(\overline{X_{r}^{\alpha}}\right)\Leftrightarrow\left(\overline{Y_{\overline{r}}^{\overline{\alpha}}}\right)\leq_{\Psi}\left(\overline{X_{\overline{r}}^{\overline{\alpha}}}\right)
Proof.

($1\Leftrightarrow 2$): By definition $\left(X_{r}^{\alpha}\right)=\left(X_{\overline{r}}^{\overline{\alpha}}\right)$ as families (the QP involution is a bijection of the index set $QPI$), and so trivially $\left(X_{r}^{\alpha}\right)\leq_{\Psi}\left(Y_{r}^{\alpha}\right)\Leftrightarrow\left(X_{\overline{r}}^{\overline{\alpha}}\right)\leq_{\Psi}\left(Y_{\overline{r}}^{\overline{\alpha}}\right)$.

(131\Leftrightarrow 3): Since Ψ\Psi is self-conjugate, and Xrα,YrαX_{r}^{\alpha},Y_{r}^{\alpha} are linear (and hence τ\tau-morphisms), then by 7.3 we have XrαΨYrαYrα¯ΨXrα¯X_{r}^{\alpha}\leq_{\Psi}Y_{r}^{\alpha}\Leftrightarrow\overline{Y_{r}^{\alpha}}\leq_{\Psi}\overline{X_{r}^{\alpha}}, and the result follows.

(242\Leftrightarrow 4): This follows by taking the duality (131\Leftrightarrow 3) and applying the duality (121\Leftrightarrow 2). ∎

Analogous results follow for Ψ¯\overline{\Psi}.

Recall that Srsαϕt=1qrϕrstαS_{rs}^{\alpha}\phi\coloneqq\sum_{t=1}^{q_{r}}\phi_{rst}^{\alpha}

Corollary 81.

If $\Psi$ is self-conjugate and $\left(X_{r}^{\alpha}\right)$ is linear, then $\left(\sum_{s=0}^{b_{r}^{\alpha}-1}S_{rs}^{\alpha}\right)\leq_{\Psi}\left(X_{r}^{\alpha}\right)$ if and only if $\left(\overline{X_{\overline{r}}^{\overline{\alpha}}}\right)\leq_{\Psi}\left(\sum_{s=0}^{b_{r}^{\alpha}-1}S_{rs}^{\alpha}\right)$.

Proof.

This follows directly from the lemma putting $Y_{r}^{\alpha}=\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha}$ and using the fact that $b_{r}^{\alpha}$ and $S_{rs}^{\alpha}$ are double self conjugate, and hence by linearity of the double dual (Proposition 77) so is the sum $Y_{r}^{\alpha}$. ∎

7.3. Partition Sums

We now introduce a set of linear functionals on observables which will form an important part of our Birkhoff Sum estimates.

Definition 82.

Partition Sums. Consider the partition of [0,1)[0,1) into k1k\geq 1 equal sub-intervals starting from 0. We call the k1k-1 points with coordinates {tk}t=1k1\left\{\frac{t}{k}\right\}_{t=1}^{k-1} (ie not including 0) the interior points, and define the partition sum of ϕ\phi on the partition to be the sum of values of ϕ\phi over the interior points, namely Pk(ϕ)=t=1k1ϕ(tk)P_{k}(\phi)=\sum_{t=1}^{k-1}\phi(\frac{t}{k}).

Remark 83.

Note that PkP_{k} is a linear functional of ϕ\phi, and that P1(ϕ)P_{1}(\phi) is an empty sum taking the value 0, and that P2(ϕ)=ϕ(12)P_{2}(\phi)=\phi(\frac{1}{2}).

Recall from the previous section that we are using the notation (.¯)𝕋\left(\overline{.}\right)_{\mathbb{T}^{\circ}} for the Source involution of the circle xx¯={1x}x\mapsto\overline{x}=\{1-x\}. This has the pull ups (.¯)(𝕋):ϕϕ¯\left(\overline{.}\right)_{\left(\mathbb{TR}\right)^{\circ}}:\phi\mapsto\overline{\phi}, (.¯)((𝕋)T):AA¯\left(\overline{.}\right)_{\left((\mathbb{TR})T\right)^{\circ}}:A\mapsto\overline{A} defined by ϕ¯x=ϕx¯,A¯ϕ=Aϕ¯\overline{\phi}x=\phi\overline{x},\overline{A}\phi=A\overline{\phi}

Proposition 84.

PkP_{k} is self-conjugate under the Source involution (.¯)𝕋\left(\overline{\text{.}}\right)_{\mathbb{T}}, and if ϕ𝕋\phi_{\mathbb{TR}} is anti-symmetric then Pk(ϕ)=0P_{k}(\phi)=0

Proof.

By definition, putting u=ktu=k-t:

Pk¯(ϕ)=Pk(ϕ¯)=t=1k1ϕ(1tk)=u=1k1ϕ(uk)=Pk(ϕ)\overline{P_{k}}(\phi)=P_{k}(\overline{\phi})=\sum_{t=1}^{k-1}\phi(1-\frac{t}{k})=\sum_{u=1}^{k-1}\phi(\frac{u}{k})=P_{k}(\phi)

Now if ϕ\phi is anti-symmetric on the circle we have ϕ=ϕ¯\phi=-\overline{\phi}. Since PkP_{k} is linear Pk(ϕ)=Pk(ϕ¯)=Pk(ϕ¯)=Pk(ϕ)P_{k}\left(\phi\right)=P_{k}\left(-\overline{\phi}\right)=-P_{k}\left(\overline{\phi}\right)=-P_{k}\left(\phi\right), hence Pk(ϕ)=0P_{k}(\phi)=0. ∎
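Definition 82 and Proposition 84 can be checked with exact rational arithmetic. The sketch below is illustrative: the function names and the two test observables ($1/x$, and the anti-symmetric $1/x-1/(1-x)$) are choices made here rather than taken from the text.

```python
from fractions import Fraction

def partition_sum(phi, k):
    """P_k(phi) = sum_{t=1}^{k-1} phi(t/k), the sum over the interior points."""
    return sum(phi(Fraction(t, k)) for t in range(1, k))

def source_conjugate(phi):
    """Return the source dual of phi, x -> phi(1 - x), on (0, 1)."""
    return lambda x: phi(1 - x)

# Illustrative observables (assumptions for this sketch).
recip = lambda x: 1 / x                   # 1/x, unbounded at 0+
antisym = lambda x: 1 / x - 1 / (1 - x)   # satisfies phi(1 - x) = -phi(x)

k = 7
p_recip = partition_sum(recip, k)
p_recip_dual = partition_sum(source_conjugate(recip), k)
p_antisym = partition_sum(antisym, k)
```

Because the interior points are symmetric under $x\mapsto1-x$, `p_recip` and `p_recip_dual` coincide exactly and `p_antisym` is exactly zero; the edge cases $P_{1}(\phi)=0$ and $P_{2}(\phi)=\phi(\frac{1}{2})$ of Remark 83 also hold for this implementation.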

We will usually be concerned with partition sums PqrαP_{q_{r}^{\alpha}} where qrαq_{r}^{\alpha} is a quasiperiod of α𝕋\alpha_{\mathbb{T}}.

Proposition 85.

PqrαP_{q_{r}^{\alpha}} is self-conjugate under the quasiperiod involution (α,r)(α¯,r¯)(\alpha,r)\mapsto(\overline{\alpha},\overline{r})

Proof.

Recall that qrαq_{r}^{\alpha} is itself self-conjugate, ie qrα=qr¯α¯q_{r}^{\alpha}=q_{\overline{r}}^{\overline{\alpha}}, and the result follows trivially. ∎

7.4. Birkhoff sums of monotonic functions on (0,1)(0,1)

Recall SNαϕ=r=0nαs=0br1SrsαϕS_{N}^{\alpha}\phi=\sum_{r=0}^{n^{\alpha}}\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha}\phi where Srsαϕ=Sqrα(ϕ,αrs)=t=1qrαϕ(αrs+tα)=t=1qrαϕrstαS_{rs}^{\alpha}\phi=S_{q_{r}^{\alpha}}(\phi,\alpha_{rs})=\sum_{t=1}^{q_{r}^{\alpha}}\phi(\alpha_{rs}+t\alpha)=\sum_{t=1}^{q_{r}^{\alpha}}\phi_{rst}^{\alpha}. We will fix α\alpha for the moment and drop it from the annotation.

When $q_{r}=1$ we may have $q_{1}=q_{0}=1$. However in this case we also have $\alpha>\frac{1}{2}$. Also from (18) $b_{0}=0$ so that the $S_{q_{0}}$ term is not of interest and we can write $S_{N}^{\alpha}\phi=\sum_{r=\left\llbracket\alpha>\frac{1}{2}\right\rrbracket}^{n}\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha}\phi$.

Now for qr=1q_{r}=1, t(1..qr)t\in(1..q_{r}) gives t(1)t\in(1) and so Sqr(αrs)=t=1qrϕrst=ϕrsqrS_{q_{r}}(\alpha_{rs})=\sum_{t=1}^{q_{r}}\phi_{rst}=\phi_{rsq_{r}} and similarly for qr=2q_{r}=2 we have Sqr(αrs)=t=1qrϕrst=ϕrsqr+ϕrsqr1S_{q_{r}}(\alpha_{rs})=\sum_{t=1}^{q_{r}}\phi_{rst}=\phi_{rsq_{r}}+\phi_{rsq_{r-1}} (since qr1=1q_{r-1}=1). Combining these results we have for qr2q_{r}\leq 2 (and where br=0b_{r}=0 gives the empty sum 0):

$$\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha}\phi=\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}^{\alpha}}^{\alpha}+\left\llbracket q_{r}^{\alpha}>1\right\rrbracket\phi_{rsq_{r-1}^{\alpha}}^{\alpha}\right)\tag{7.4}$$
Definition 86 (Bounds Operators).

Given α,N\alpha,N, the Bounds Functionals (Brα)\left(B_{r}^{\alpha}\right) form a QP family of operators in (𝕋)\left(\mathbb{TR}\right)\mathbb{R} defined on observables ϕ\phi by:

$$B_{r}^{\alpha}\phi=\left\llbracket b_{r}^{\alpha}>0\right\rrbracket\left(\left\llbracket q_{r}^{\alpha}>1\right\rrbracket b_{r}^{\alpha}\left(P_{q_{r}^{\alpha}}(\phi)-\overline{\phi}\left(\frac{1}{q_{r}^{\alpha}}\right)\right)+\left\llbracket 2<q_{r}^{\alpha}<q_{n^{\alpha}}^{\alpha}\right\rrbracket E_{r}^{\alpha}\left(\phi_{r0,q_{r}-q_{r-1}}^{\alpha}-\overline{\phi}\left(\frac{2}{q_{r}^{\alpha}}\right)\right)+\sum_{s=0}^{b_{r}^{\alpha}-1}\left(\phi_{rsq_{r}}^{\alpha}+\left\llbracket q_{r}^{\alpha}>1\right\rrbracket\phi_{rsq_{r-1}}^{\alpha}\right)\right)\tag{7.5}$$
Remark 87.

Note this reduces to (7.4) for qrα2q_{r}^{\alpha}\leq 2.

Further note that each Bounds Functional is an algebraic combination of other linear QPI functionals and hence linear.

Lemma 88.

The Bounds Functionals (Brα)\left(B_{r}^{\alpha}\right) have QP duals (Br¯α¯)\left(B_{\overline{r}}^{\overline{\alpha}}\right) and Double Duals (Br¯α¯¯)\left(\overline{B_{\overline{r}}^{\overline{\alpha}}}\right) which can be written (when α\alpha is fixed):

$$B_{\overline{r}}^{\overline{\alpha}}\phi=\left\llbracket b_{r}>0\right\rrbracket\left(\left\llbracket q_{r}>1\right\rrbracket b_{r}\left(P_{q_{r}}(\phi)-\overline{\phi}\left(\frac{1}{q_{r}}\right)\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\overline{\phi}_{r0,q_{r}-q_{r-1}}-\overline{\phi}\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\overline{\phi}_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\overline{\phi}_{rsq_{r-1}}\right)\right)\tag{7.6}$$
$$\overline{B_{\overline{r}}^{\overline{\alpha}}}\phi=\left\llbracket b_{r}>0\right\rrbracket\left(\left\llbracket q_{r}>1\right\rrbracket b_{r}\left(P_{q_{r}}(\phi)-\phi\left(\frac{1}{q_{r}}\right)\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\phi\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\phi_{rsq_{r-1}}\right)\right)\tag{7.7}$$
Proof.

Applying the QP involution to the definition (7.5) gives us the QP dual:

Br¯α¯ϕ=br¯α¯>0(qr¯α¯>1br¯α¯(Pqr¯α¯(ϕ)ϕ¯(1qr¯α¯))+2<qr¯α¯<qnα¯α¯Er¯α¯(ϕr¯0,qr¯α¯qr¯1α¯α¯ϕ¯(2qr¯α¯))+s=0br¯α¯1(ϕr¯sqr¯α¯α¯+qr¯α¯>1ϕr¯sqr¯1α¯α¯))B_{\overline{r}}^{\overline{\alpha}}\phi=\left\llbracket b_{\overline{r}}^{\overline{\alpha}}>0\right\rrbracket\left(\left\llbracket q_{\overline{r}}^{\overline{\alpha}}>1\right\rrbracket b_{\overline{r}}^{\overline{\alpha}}\left(P_{q_{\overline{r}}^{\overline{\alpha}}}(\phi)-\overline{\phi}(\frac{1}{q_{\overline{r}}^{\overline{\alpha}}})\right)+\left\llbracket 2<q_{\overline{r}}^{\overline{\alpha}}<q_{n^{\overline{\alpha}}}^{\overline{\alpha}}\right\rrbracket E_{\overline{r}}^{\overline{\alpha}}\left(\phi_{\overline{r}0,q_{\overline{r}}^{\overline{\alpha}}-q_{\overline{r}-1}^{\overline{\alpha}}}^{\overline{\alpha}}-\overline{\phi}\left(\frac{2}{q_{\overline{r}}^{\overline{\alpha}}}\right)\right)+\sum_{s=0}^{b_{\overline{r}}^{\overline{\alpha}}-1}\left(\phi_{\overline{r}sq_{\overline{r}}^{\overline{\alpha}}}^{\overline{\alpha}}+\left\llbracket q_{\overline{r}}^{\overline{\alpha}}>1\right\rrbracket\phi_{\overline{r}sq_{\overline{r}-1}^{\overline{\alpha}}}^{\overline{\alpha}}\right)\right)

But now we have by (**) that brα,qrα,Pqrαb_{r}^{\alpha},q_{r}^{\alpha},P_{q_{r}^{\alpha}} and qnααq_{n^{\alpha}}^{\alpha} are all self-conjugate under the QP involution, so that for example brα=br¯α¯b_{r}^{\alpha}=b_{\overline{r}}^{\overline{\alpha}} and we can and will denote either of them by brb_{r} when α\alpha is fixed. Further Er¯α¯=Orα=OrE_{\overline{r}}^{\overline{\alpha}}=O_{r}^{\alpha}=O_{r} and ϕr¯stα¯=ϕ¯rstα\phi_{\overline{r}st}^{\overline{\alpha}}=\overline{\phi}_{rst}^{\alpha} and the QP dual result (7.6) follows. For the double dual we have Br¯α¯¯ϕBr¯α¯ϕ¯\overline{B_{\overline{r}}^{\overline{\alpha}}}\phi\coloneqq B_{\overline{r}}^{\overline{\alpha}}\overline{\phi}. From (7.6) we then obtain:

Br¯α¯ϕ¯=br>0(qr>1br(Pqr(ϕ¯)ϕ(1qr))+2<qr<qnOr(ϕr0,qrqr1ϕ(2qr))+s=0br1(ϕrsqr+qr>1ϕrsqr1))B_{\overline{r}}^{\overline{\alpha}}\overline{\phi}=\left\llbracket b_{r}>0\right\rrbracket\left(\left\llbracket q_{r}>1\right\rrbracket b_{r}\left(P_{q_{r}}(\overline{\phi})-\phi(\frac{1}{q_{r}})\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\phi\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\phi_{rsq_{r-1}}\right)\right)

Now note that PqrP_{q_{r}} is also self-conjugate under ϕϕ¯\phi\mapsto\overline{\phi} and (7.7) follows. ∎
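The self-conjugacy used in this last step can be written out in one line. This is a sketch under two assumptions taken from how the symbols are used in this section rather than from their formal definitions: that $P_{q}(\phi)=\sum_{t=1}^{q-1}\phi(\frac{t}{q})$ (consistent with $P_{1}=0$ and $P_{2}=\phi(\frac{1}{2})$ below), and that $\overline{\phi}(x)=\phi(1-x)$.

\[
P_{q}(\overline{\phi})=\sum_{t=1}^{q-1}\phi\left(1-\frac{t}{q}\right)=\sum_{t=1}^{q-1}\phi\left(\frac{q-t}{q}\right)=\sum_{u=1}^{q-1}\phi\left(\frac{u}{q}\right)=P_{q}(\phi),
\]

where the third equality is the substitution $u=q-t$.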

Theorem 89 (Bounds for monotonic functions).

Given α,N,r\alpha,N,r and the associated Bounds Functionals Brα,Br¯α¯¯B_{r}^{\alpha},\overline{B_{\overline{r}}^{\overline{\alpha}}}, let ϕ\phi be a monotonic decreasing observable on (0,1)(0,1). Then we have:

(7.8) \overline{B_{\overline{r}}^{\overline{\alpha}}}\phi\leq\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha}\phi\leq B_{r}^{\alpha}\phi

Further, these are equalities for constant ϕ\phi (and also if br=0b_{r}=0 or qr2q_{r}\leq 2).

Proof.

Given the right hand inequality s=0br1SrsαϕBrαϕ\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha}\phi\leq B_{r}^{\alpha}\phi we can deduce the left hand inequality via the corollary to the domination lemma, Corollary 81. It remains to prove the right hand. We first deal with the equality claims and qr2q_{r}\leq 2.

If br=0b_{r}=0 then we have the equality 0=Br¯α¯¯ϕ=Brαϕ=s=0br1Sqr(αrs)0=\overline{B_{\overline{r}}^{\overline{\alpha}}}\phi=B_{r}^{\alpha}\phi=\sum_{s=0}^{b_{r}-1}S_{q_{r}}(\alpha_{rs}). For qr2q_{r}\leq 2, we have P2=ϕ(12)=ϕ¯(12)P_{2}=\phi(\frac{1}{2})=\overline{\phi}(\frac{1}{2}) and P1=0P_{1}=0, and so for qr2q_{r}\leq 2 both the first and mid terms of Br¯α¯¯,Brα\overline{B_{\overline{r}}^{\overline{\alpha}}},B_{r}^{\alpha} vanish and we are left with (7.4) which is an equality. If qr>2q_{r}>2 and ϕ(x)=c\phi(x)=c then the mid terms vanish, the first term is br((qr1)cc)b_{r}\left((q_{r}-1)c-c\right) and the final term becomes br(c+c)b_{r}(c+c) so that Arϕ=Brϕ=brqrcA_{r}\phi=B_{r}\phi=b_{r}q_{r}c which is also s=0br1Sqr(αrs)\sum_{s=0}^{b_{r}-1}S_{q_{r}}(\alpha_{rs}). We now deal with the general case for qr>2q_{r}>2.
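For the constant case the arithmetic can be displayed in full. The following is a sketch assuming $P_{q}(\phi)=\sum_{t=1}^{q-1}\phi(\frac{t}{q})$, so that $P_{q_{r}}(c)=(q_{r}-1)c$; with $\phi(x)=c$, $b_{r}>0$ and $q_{r}>2$ the mid term is $c-c=0$ and

\[
B_{r}^{\alpha}\phi=b_{r}\left((q_{r}-1)c-c\right)+\sum_{s=0}^{b_{r}-1}(c+c)=b_{r}(q_{r}-2)c+2b_{r}c=b_{r}q_{r}c=\sum_{s=0}^{b_{r}-1}S_{q_{r}}(\alpha_{rs}),
\]

the last equality holding because each $S_{q_{r}}(\alpha_{rs})$ is a sum of $q_{r}$ copies of $c$.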

We now use the results of Section 5 to understand where αrst\alpha_{rst} lies in the regular qrq_{r} partition of the circle.

Case: s>0s>0 or s=0&r=ns=0\&r=n.

In this case for each tt, the co-siting condition holds, ie αrst\alpha_{rst} lies in I(tα)I(t\alpha) so that ϕrst\phi_{rst} lies between the values of ϕ\phi at the interval endpoints. Summing over increasing values of αrst\alpha_{rst} (ie in the direction 0 to 11) gives us:

For rr even we have Sqr(αrs)=t=1qrϕ(αrst)ϕrsqr+(Pqrϕ(1qr))+ϕrsqr1S_{q_{r}}(\alpha_{rs})=\sum_{t=1}^{q_{r}}\phi(\alpha_{rst})\geq\phi_{rsq_{r}}+\left(P_{q_{r}}-\phi(\frac{1}{q_{r}})\right)+\phi_{rsq_{r-1}} and Sqr(αrs)ϕrsqr+(Pqrϕ(11qr))+ϕrsqr1S_{q_{r}}(\alpha_{rs})\leq\phi_{rsq_{r}}+\left(P_{q_{r}}-\phi(1-\frac{1}{q_{r}})\right)+\phi_{rsq_{r-1}}.

Similarly for rr odd we have ϕrsqr1+(Pqrϕ(11qr))+ϕrsqrSqr(αrs)ϕrsqr1+(Pqrϕ(1qr))+ϕrsqr\phi_{rsq_{r-1}}+\left(P_{q_{r}}-\phi(1-\frac{1}{q_{r}})\right)+\phi_{rsq_{r}}\geq S_{q_{r}}(\alpha_{rs})\geq\phi_{rsq_{r-1}}+\left(P_{q_{r}}-\phi(\frac{1}{q_{r}})\right)+\phi_{rsq_{r}} which is structurally identical to the rr even case, even though αrsqr,αrsqr1\alpha_{rsq_{r}},\alpha_{rsq_{r-1}} are switched in their positions around 0.
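The co-siting picture behind these bounds can be checked numerically in the simplest setting. The sketch below is illustrative only and rests on assumptions not taken from the paper: it uses the unshifted orbit (initial point 0 rather than a general $\alpha_{rs}$), the sample value $\alpha=\sqrt{2}-1$, and the helper names `frac` and `partition_cells`. It checks that the points $\{t\alpha\}$, $t=1,\ldots,q$, occupy each cell of the regular $q$ partition exactly once when $q$ is a convergent denominator.

```python
import math


def frac(x):
    """Fractional part of x."""
    return x - math.floor(x)


def partition_cells(alpha, q):
    """Index k of the cell [k/q, (k+1)/q) containing {t*alpha}, for t = 1..q."""
    return [math.floor(q * frac(t * alpha)) for t in range(1, q + 1)]


# Assumed sample data: sqrt(2) - 1 = [0; 2, 2, 2, ...] has convergent
# denominators 1, 2, 5, 12, 29, 70, ...
ALPHA = math.sqrt(2) - 1
DENOMINATORS = [2, 5, 12, 29, 70]


def one_point_per_cell(alpha, q):
    """True when the q orbit points hit every cell of the q partition once."""
    return sorted(partition_cells(alpha, q)) == list(range(q))


if __name__ == "__main__":
    for q in DENOMINATORS:
        print(q, one_point_per_cell(ALPHA, q))
```

With one point per cell, a decreasing $\phi$ at each point is bounded by its values at the two cell endpoints, which is the mechanism the inequalities above sum over.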

Case: s=0&r<n,qr>1s=0\,\&\,r<n,q_{r}>1.

In this case for each tqrt\neq q_{r}, αrst\alpha_{rst} may be shifted by one partition interval from I(tα)I(t\alpha). In particular, αrs(qrqr1)\alpha_{rs(q_{r}-q_{r-1})} may be shifted into the same partition interval as αrsqr\alpha_{rsq_{r}}.

For rr even, the possible shift is backward (towards 0), so that the lower bound is not affected, but the upper bound increases to Sqr(αrs)(Pqrϕ(11qr))+ϕr0qr+ϕr0,qrqr1+(ϕrsqr1ϕ(12qr))=(Pqr+ϕrsqr+ϕrsqr1ϕ(11qr))+(ϕr0,qrqr1ϕ(12qr))S_{q_{r}}(\alpha_{rs})\leq\left(P_{q_{r}}-\phi(1-\frac{1}{q_{r}})\right)+\phi_{r0q_{r}}+\phi_{r0,q_{r}-q_{r-1}}+\left(\phi_{rsq_{r-1}}-\phi(1-\frac{2}{q_{r}})\right)=\left(P_{q_{r}}+\phi_{rsq_{r}}+\phi_{rsq_{r-1}}-\phi(1-\frac{1}{q_{r}})\right)+\left(\phi_{r0,q_{r}-q_{r-1}}-\phi(1-\frac{2}{q_{r}})\right).

For rr odd, the possible shift is forward (towards 1), so that the upper bound is not affected, but the lower bound decreases to Sqr(αrs)(Pqrϕ(1qr))+(ϕrsqr1ϕ(2qr))+ϕrsqr+ϕr0,qrqr1=(ϕrsqr+Pqrϕ(1qr)+ϕrsqr1)+(ϕr0,qrqr1ϕ(2qr))S_{q_{r}}(\alpha_{rs})\geq\left(P_{q_{r}}-\phi\left(\frac{1}{q_{r}}\right)\right)+\left(\phi_{rsq_{r-1}}-\phi(\frac{2}{q_{r}})\right)+\phi_{rsq_{r}}+\phi_{r0,q_{r}-q_{r-1}}=\left(\phi_{rsq_{r}}+P_{q_{r}}-\phi(\frac{1}{q_{r}})+\phi_{rsq_{r-1}}\right)+\left(\phi_{r0,q_{r}-q_{r-1}}-\phi(\frac{2}{q_{r}})\right).

The results follow on reorganising the terms. ∎

Corollary 90.

r=0nBr¯α¯¯ϕSNϕr=0nBrαϕ\sum_{r=0}^{n}\overline{B_{\overline{r}}^{\overline{\alpha}}}\phi\leq S_{N}\phi\leq\sum_{r=0}^{n}B_{r}^{\alpha}\phi

Proof.

This follows from $S_{N}\phi=\sum_{r=0}^{n}\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha}\phi$ on summing the inequalities (7.8) over $r$. ∎

Corollary 91 (Duality results).

Let ΨST\Psi\subset ST be a στ\sigma\tau self-conjugate set of decreasing functions on (0,1)(0,1). Then we have:

\[
\begin{aligned}
\overline{B_{\overline{r}}^{\overline{\alpha}}} &\leq_{\Psi} \sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha} \leq_{\Psi} B_{r}^{\alpha}\\
\sigma\overline{B_{\overline{r}}^{\overline{\alpha}}} &\leq_{\overline{\Psi}} \sigma\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha} \leq_{\overline{\Psi}} \sigma B_{r}^{\alpha}\\
\tau\overline{B_{\overline{r}}^{\overline{\alpha}}} &\leq_{\overline{\Psi}} \tau\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha} \leq_{\overline{\Psi}} \tau B_{r}^{\alpha}\\
\sigma\tau\overline{B_{\overline{r}}^{\overline{\alpha}}} &\leq_{\Psi} \sigma\tau\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha} \leq_{\Psi} \sigma\tau B_{r}^{\alpha}\\
\tau\overline{B_{\overline{r}}^{\overline{\alpha}}} &\geq_{\Psi} \tau\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha} \geq_{\Psi} \tau B_{r}^{\alpha}\\
\tau\sigma\overline{B_{\overline{r}}^{\overline{\alpha}}} &\geq_{\overline{\Psi}} \tau\sigma\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha} \geq_{\overline{\Psi}} \tau\sigma B_{r}^{\alpha}\\
\overline{B_{\overline{r}}^{\overline{\alpha}}} &\geq_{\overline{\Psi}} \sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha} \geq_{\overline{\Psi}} B_{r}^{\alpha}\\
\sigma\overline{B_{\overline{r}}^{\overline{\alpha}}} &\geq_{\Psi} \sigma\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha} \geq_{\Psi} \sigma B_{r}^{\alpha}
\end{aligned}
\]
Proof.

This is an application of the dualities in Subsection 7.2. ∎

Corollary 92.

If ϕ\phi is a decreasing anti-symmetric function then

(7.9) Brαϕ\displaystyle B_{r}^{\alpha}\phi =br>0(s=0br1(ϕrsqr+qr>1(ϕrsqr1+ϕ(1qr)))+2<qr<qnEr(ϕr0,qrqr1+ϕ(2qr)))\displaystyle=\left\llbracket b_{r}>0\right\rrbracket\left(\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\left(\phi_{rsq_{r-1}}+\phi(\frac{1}{q_{r}})\right)\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\phi_{r0,q_{r}-q_{r-1}}+\phi\left(\frac{2}{q_{r}}\right)\right)\right)
(7.10) Br¯α¯¯ϕ\displaystyle\overline{B_{\overline{r}}^{\overline{\alpha}}}\phi =br>0(s=0br1(ϕrsqr+qr>1(ϕrsqr1ϕ(1qr)))+2<qr<qnOr(ϕr0,qrqr1ϕ(2qr)))\displaystyle=\left\llbracket b_{r}>0\right\rrbracket\left(\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\left(\phi_{rsq_{r-1}}-\phi(\frac{1}{q_{r}})\right)\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\phi\left(\frac{2}{q_{r}}\right)\right)\right)

Further for qr>1q_{r}>1 each sum s=0br1(ϕrsqr+ϕrsqr1)\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\phi_{rsq_{r-1}}\right) is positive for rr even, and negative for rr odd.


Proof.

Note if ϕ\phi is anti-symmetric then ϕ=ϕ¯\phi=-\overline{\phi} and Pqr(ϕ)=0P_{q_{r}}(\phi)=0. The expressions (7.9,7.10) then follow easily from the theorem. We now examine Xr=s=0br1(ϕrsqr+ϕrsqr1)X_{r}=\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\phi_{rsq_{r-1}}\right). For rr even ϕrsqr+ϕrsqr1ϕ(s+1/ar+2/qr+1/)ϕ(ar+1sqr+1/)\phi_{rsq_{r}}+\phi_{rsq_{r-1}}\leq\phi\left(\frac{s+1/a_{r+2}^{/}}{q_{r+1}^{/}}\right)-\phi\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right).

Reversing the order of second terms gives us Xr=s=0br1(ϕ(s+1/ar+2/qr+1/)ϕ(ar+1(br1)+sqr+1/))X_{r}=\sum_{s=0}^{b_{r}-1}\left(\phi\left(\frac{s+1/a_{r+2}^{/}}{q_{r+1}^{/}}\right)-\phi\left(\frac{a_{r+1}-(b_{r}-1)+s}{q_{r+1}^{/}}\right)\right). Since brar+1,b_{r}\leq a_{r+1}, each term in this sum is positive, and hence Xr0X_{r}\geq 0 for rr even. Similarly Xr0X_{r}\leq 0 for rr odd. ∎
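The vanishing of $P_{q_{r}}(\phi)$ for anti-symmetric $\phi$ is easy to confirm numerically. The sketch below assumes $P_{q}(\phi)=\sum_{t=1}^{q-1}\phi(\frac{t}{q})$ (inferred from the values $P_{1}=0$, $P_{2}=\phi(\frac{1}{2})$ used above) and takes $\phi(x)=\cot\pi x$ from the abstract as the sample anti-symmetric function; the names `P` and `cot_pi` are introduced here for illustration.

```python
import math


def P(q, phi):
    """Assumed form of P_q(phi): sum of phi over the interior points t/q."""
    return sum(phi(t / q) for t in range(1, q))


def cot_pi(x):
    """cot(pi x), a decreasing anti-symmetric function on (0, 1)."""
    return math.cos(math.pi * x) / math.sin(math.pi * x)


if __name__ == "__main__":
    for q in (2, 3, 7, 12, 29):
        # Terms at t/q and (q - t)/q cancel, so each value is zero up to rounding.
        print(q, P(q, cot_pi))
```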

7.5. Further bounds for primitive functions

The results of the previous section hold for any monotonic decreasing $\phi$. In this section we develop the upper bound results further for functions which are bounded below, and in particular for the positive primitives in $\Phi$. Recall from Subsection 6.4 that we can reduce any unbounded observable to a sum of primitives.

The key new ingredient arises from the fact that if $\phi\in\Phi$ is lower bounded at say $1^{-}$ with bound $c$, then for irrational $0<\alpha<\frac{1}{2}$ the sequence of extremal points $\alpha_{rsq_{r}}$ tends to $1^{-}$ for $r$ odd and hence $\phi_{rsq_{r}}\rightarrow c$.

Recall our notation for limits when they exist, for example ϕ(1)=limx1ϕ(x)\phi(1^{-})=\lim_{x\uparrow 1}\phi(x). Note the limit exists whenever ϕ\phi is LBV left of 11, and in particular for ϕΦΦ\phi\in\Phi\bigcup-\Phi. Also note that ϕϕ(1)\phi\mapsto\phi(1^{-}) is then a linear functional, and in particular (ϕ)(1)=ϕ(1)(-\phi)(1^{-})=-\phi(1^{-}). Analogous remarks apply to ϕ(0+)=limx0ϕ(x)\phi(0^{+})=\lim_{x\downarrow 0}\phi(x). Note that if ϕ(1)\phi(1^{-}) exists, so does ϕ¯(0+)\overline{\phi}(0^{+}) and the two limits are equal.

Finally, if $\phi$ is monotonic descending with $\phi(1^{-})=C<0$, then $\phi-C\in\Phi$ and $S_{N}\phi=S_{N}(\phi-C)+CN$, which means we can restrict our attention to $\phi\in\Phi$. Similar remarks apply to all quadrants.
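The reduction $S_{N}\phi=S_{N}(\phi-C)+CN$ is plain linearity, and a short numerical check makes it concrete. The sketch below uses assumed sample data: $\phi(x)=1/x-3$ (decreasing on $(0,1)$ with $\phi(1^{-})=-2$), $\alpha=\sqrt{2}-1$ and $N=50$; the function name `birkhoff_sum` is illustrative.

```python
import math


def birkhoff_sum(phi, alpha, N):
    """S_N(phi) = sum over r = 1..N of phi({r * alpha})."""
    return sum(phi((r * alpha) % 1.0) for r in range(1, N + 1))


ALPHA = math.sqrt(2) - 1
N = 50
C = -2.0  # the limit phi(1^-) of the sample function below


def phi(x):
    """Sample decreasing observable with phi(1^-) = C < 0."""
    return 1.0 / x - 3.0


def shifted(x):
    """phi - C, which is non-negative and decreasing on (0, 1)."""
    return phi(x) - C


lhs = birkhoff_sum(phi, ALPHA, N)
rhs = birkhoff_sum(shifted, ALPHA, N) + C * N

if __name__ == "__main__":
    print(lhs, rhs)
```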

Lemma 93 (Upper bounds for primitive functions).

Let ϕΦ\phi\in\Phi. Then for qr>2q_{r}>2 we have:

(7.11) Brα(ϕ)\displaystyle B_{r}^{\alpha}(\phi) br>0(brPqr+2<qr<qnEr(ϕr0,qrqr1ϕ¯(1qr))+s=0br1(Erϕrsqr+Orϕrsqr1))\displaystyle\leq\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}P_{q_{r}}+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\overline{\phi}\left(\frac{1}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(E_{r}\phi_{rsq_{r}}+O_{r}\phi_{rsq_{r-1}}\right)\right)
(7.12) Br¯α¯¯ϕ¯\displaystyle\overline{B_{\overline{r}}^{\overline{\alpha}}}\overline{\phi} br>0(brPqr+2<qr<qnOr(ϕ¯r0,qrqr1ϕ¯(1qr))+s=0br1(Orϕ¯rsqr+Erϕ¯rsqr1))\displaystyle\leq\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}P_{q_{r}}+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\overline{\phi}_{r0,q_{r}-q_{r-1}}-\overline{\phi}\left(\frac{1}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(O_{r}\overline{\phi}_{rsq_{r}}+E_{r}\overline{\phi}_{rsq_{r-1}}\right)\right)

Further, these inequalities are reversed under the duality ϕϕ\phi\mapsto-\phi, and are equalities for constant ϕ\phi (and trivially also for br=0b_{r}=0).

Proof.

All terms are linear functionals, so that results for $-\phi$ follow by duality. The equalities follow by the same argument as in Theorem 89.

Since $\phi\in\Phi$, $\phi$ is descending and we have from Theorem 89 for $q_{r}>2$

(7.13) Brα(ϕ)\displaystyle B_{r}^{\alpha}(\phi) =br>0(br(Pqrϕ¯(1qr))+2<qr<qnEr(ϕr0,qrqr1ϕ¯(2qr))+s=0br1(ϕrsqr+ϕrsqr1))\displaystyle=\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}\left(P_{q_{r}}-\overline{\phi}(\frac{1}{q_{r}})\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\overline{\phi}\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\phi_{rsq_{r-1}}\right)\right)
(7.14) Br¯α¯¯(ϕ)\displaystyle\overline{B_{\overline{r}}^{\overline{\alpha}}}(\phi) =br>0(br(Pqrϕ(1qr))+2<qr<qnOr(ϕr0,qrqr1ϕ(2qr))+s=0br1(ϕrsqr+ϕrsqr1))\displaystyle=\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}\left(P_{q_{r}}-\phi(\frac{1}{q_{r}})\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\phi\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\phi_{rsq_{r-1}}\right)\right)

Also since ϕ\phi is descending, for qr>1q_{r}>1 we have OrϕrsqrOrϕ¯(1qr)O_{r}\phi_{rsq_{r}}\leq O_{r}\overline{\phi}(\frac{1}{q_{r}}), and for s>0s>0 or r=nr=n we also have Erϕrsqr1Erϕ¯(1qr)E_{r}\phi_{rsq_{r-1}}\leq E_{r}\overline{\phi}(\frac{1}{q_{r}}). However for s=0&r<ns=0\&r<n we may have αr0qr1\alpha_{r0q_{r-1}} pulled back to its alternate interval.

In this case, since $q_{r}>2$ we have $E_{r}\phi_{r0q_{r-1}}\leq E_{r}\overline{\phi}(\frac{2}{q_{r}})$, which then gives us $\phi_{rsq_{r}}+\phi_{rsq_{r-1}}-\overline{\phi}(\frac{1}{q_{r}})\leq E_{r}\phi_{rsq_{r}}+O_{r}\phi_{rsq_{r-1}}+\left\llbracket s=0\&r<n\right\rrbracket E_{r}(\overline{\phi}(\frac{2}{q_{r}})-\overline{\phi}(\frac{1}{q_{r}}))$. For $b_{r}>0$ we can now substitute this result into (7.13) to obtain (7.11). The result for $B_{r}(\overline{\phi})$ in (7.12) follows by an analogous argument using $\overline{\phi}$ ascending. ∎

Lemma 94.

Let ϕΦ\phi\in\Phi. Then for qr=2q_{r}=2 we have:

(7.15) Brαϕ\displaystyle B_{r}^{\alpha}\phi br>0(brPqr+2=qr<qnEr(ϕr0qr1αϕ¯(1qr))+s=0br1(Erϕrsqrα+Orϕrsqr1α))\displaystyle\leq\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}P_{q_{r}}+\left\llbracket 2=q_{r}<q_{n}\right\rrbracket E_{r}(\phi_{r0q_{r-1}}^{\alpha}-\overline{\phi}(\frac{1}{q_{r}}))+\sum_{s=0}^{b_{r}-1}\left(E_{r}\phi_{rsq_{r}}^{\alpha}+O_{r}\phi_{rsq_{r-1}}^{\alpha}\right)\right)
(7.16) Br¯α¯¯ϕ¯\displaystyle\overline{B_{\overline{r}}^{\overline{\alpha}}}\overline{\phi} br>0(brPqr+2=qr<qnOr(ϕ¯r0qr1αϕ¯(1qr))+s=0br1(Orϕ¯rsqrα+Erϕ¯rsqr1α))\displaystyle\leq\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}P_{q_{r}}+\left\llbracket 2=q_{r}<q_{n}\right\rrbracket O_{r}(\overline{\phi}_{r0q_{r-1}}^{\alpha}-\overline{\phi}(\frac{1}{q_{r}}))+\sum_{s=0}^{b_{r}-1}\left(O_{r}\overline{\phi}_{rsq_{r}}^{\alpha}+E_{r}\overline{\phi}_{rsq_{r-1}}^{\alpha}\right)\right)

Writing Xrα(ϕ)=2=qr<qnEr(ϕr0qr1αϕ¯(1qr))X_{r}^{\alpha}(\phi)=\left\llbracket 2=q_{r}<q_{n}\right\rrbracket E_{r}\left(\phi_{r0q_{r-1}}^{\alpha}-\overline{\phi}(\frac{1}{q_{r}})\right) for qr=2q_{r}=2 we have |Xr(ϕ)|<ϕ(13)ϕ(12)\left|X_{r}(\phi)\right|<\phi(\frac{1}{3})-\phi(\frac{1}{2}). If Xrϕ0X_{r}\phi\neq 0 then sgnXrϕ¯={α}<12=sgnXrϕ-\operatorname{sgn}X_{r}\overline{\phi}=\left\llbracket\left\{\alpha\right\}<\frac{1}{2}\right\rrbracket=\operatorname{sgn}X_{r}\phi, so that the XrX_{r} term can be ignored in 7.16 for {α}<12\left\llbracket\left\{\alpha\right\}<\frac{1}{2}\right\rrbracket, and in 7.16 for {α}>12\left\llbracket\left\{\alpha\right\}>\frac{1}{2}\right\rrbracket.

Further, these inequalities are reversed under the duality ϕϕ\phi\mapsto-\phi, and are equalities for constant ϕ\phi (and also trivially for br=0b_{r}=0).

Proof.

All terms are linear functionals, so that results for $-\phi$ follow by duality. The equalities again follow by the same argument as in Theorem 89 above.

Since $\phi\in\Phi$, $\phi$ is descending and we have from Theorem 89 for $q_{r}=2$

(7.17) Br(ϕ)\displaystyle B_{r}(\phi) =br>0(br(Pqrϕ¯(1qr))+s=0br1(ϕrsqr+ϕrsqr1))\displaystyle=\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}\left(P_{q_{r}}-\overline{\phi}(\frac{1}{q_{r}})\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\phi_{rsq_{r-1}}\right)\right)
(7.18) B¯r(ϕ)\displaystyle\overline{B}_{r}(\phi) =br>0(br(Pqrϕ(1qr))+s=0br1(ϕrsqr+ϕrsqr1))\displaystyle=\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}\left(P_{q_{r}}-\phi(\frac{1}{q_{r}})\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\phi_{rsq_{r-1}}\right)\right)

Also since ϕ\phi is descending, for qr>1q_{r}>1 we have OrϕrsqrOrϕ¯(1qr)O_{r}\phi_{rsq_{r}}\leq O_{r}\overline{\phi}(\frac{1}{q_{r}}), and for s>0s>0 or r=nr=n we also have Erϕrsqr1Erϕ¯(1qr)E_{r}\phi_{rsq_{r-1}}\leq E_{r}\overline{\phi}(\frac{1}{q_{r}}). However for s=0&r<ns=0\&r<n we may have αr0qr1\alpha_{r0q_{r-1}} pulled back to its alternate interval which gives us ϕrsqr+ϕrsqr1ϕ¯(1qr)Erϕrsqr+Orϕrsqr1+s=0&r<nEr(ϕr0qr1ϕ¯(1qr))\phi_{rsq_{r}}+\phi_{rsq_{r-1}}-\overline{\phi}(\frac{1}{q_{r}})\leq E_{r}\phi_{rsq_{r}}+O_{r}\phi_{rsq_{r-1}}+\left\llbracket s=0\&r<n\right\rrbracket E_{r}(\phi_{r0q_{r-1}}-\overline{\phi}(\frac{1}{q_{r}})). This establishes 7.15 and then 7.16 follows as its qp-dual.

If ϕ\phi is a constant function, the mid-term is zero. We now investigate requirements for the mid-terms to be either zero or positive (non-negative). Note that if qr=2q_{r}=2 we have qr1=1q_{r-1}=1 and r{1,2}r\in\{1,2\}.

For $r=1$ we have $E_{r}=0$ and so the mid-term vanishes in $B_{1}\phi$. However $O_{r}=1$ and so the mid-term in $B_{1}^{\alpha}\overline{\phi}$ is $\overline{\phi}_{101}^{\alpha}-\overline{\phi}(\frac{1}{2})=\overline{X}_{1}^{\alpha}\overline{\phi}$, which is non-negative for $\alpha_{101}\geq\frac{1}{2}$. But $\alpha_{101}<1-\left(\frac{1}{2}+\frac{1}{q_{r+2}^{/}}-\frac{1}{2q_{r+1}^{/}}\right)<\frac{1}{2}+\frac{1}{2q_{2}^{/}}$ and $q_{2}^{/}>3$, so $\alpha_{101}<\frac{1}{2}+\frac{1}{2\cdot 3}=\frac{2}{3}$. Also for $\alpha_{101}\geq\frac{1}{2}$ we must have $\frac{1}{q_{r+2}^{/}}\leq\frac{1}{2q_{r+1}^{/}}$, giving $a_{r+2}^{/}\geq 2$ and requiring $a_{3}\geq 2$. Finally $q_{1}=q_{r}=2$, which means $a_{1}=2$ and hence $\frac{1}{3}<\alpha<\frac{1}{2}$.

For $r=2$ we have $O_{r}=0$ and so the mid-term vanishes in $B_{2}\overline{\phi}$. However $E_{r}=1$ and so the mid-term in $B_{2}\phi$ is $\phi_{201}-\overline{\phi}(\frac{1}{2})$. We now use the identity $\phi_{rst}^{\alpha}=\overline{\phi}_{\overline{r}st}^{\overline{\alpha}}$ to get $\phi_{201}^{\alpha}-\overline{\phi}(\frac{1}{2})=\overline{\phi}_{101}^{\overline{\alpha}}-\overline{\phi}(\frac{1}{2})$. But this is the mid-term of $B_{1}^{\overline{\alpha}}\overline{\phi}$ and so, by the previous result, non-negativity requires $\frac{1}{3}<\overline{\alpha}<\frac{1}{2}$, $\frac{1}{2}<\overline{\alpha}_{101}<\frac{2}{3}$ and $a_{1}^{\alpha}=1$, $a_{2}^{\alpha}=a_{1}^{\overline{\alpha}}-1=1$, $a_{4}^{\alpha}=a_{3}^{\overline{\alpha}}\geq 2$, which establishes the result. ∎

Note that the constant XrϕX_{r}\phi in (7.16) is generally a small constant which will be insignificant for most purposes, and is often negative in which case it can be ignored entirely in these inequalities. The conditions for it to be positive are necessary but not sufficient. In the following example X2ϕ>0X_{2}\phi>0 for strictly descending ϕ\phi, showing that this term cannot be eliminated in the inequality.

Example 95.

$\alpha=[2,1,3,100,...]\approx 0.36$, $N=8=2\cdot 3+1\cdot 2+0\cdot 1$, $(101)=7$, $\alpha_{101}=\{7\alpha\}\approx 0.54>\frac{1}{2}$.
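The numbers in this example can be reproduced with exact rational arithmetic. The sketch below makes one assumption that the example leaves open: the continued fraction is truncated after the partial quotient 100, which does not affect the two-decimal values quoted. The helper names are illustrative, and the Ostrowski digits are obtained by the usual greedy division by convergent denominators.

```python
from fractions import Fraction


def convergent_denominators(partial_quotients):
    """Return [q_0, q_1, ...] with q_0 = 1 and q_k = a_k q_{k-1} + q_{k-2}."""
    qs = [1]
    prev = 0  # plays the role of q_{-1}
    for a in partial_quotients:
        qs.append(a * qs[-1] + prev)
        prev = qs[-2]
    return qs


def cf_value(partial_quotients):
    """Value of the finite continued fraction [0; a_1, a_2, ...]."""
    x = Fraction(0)
    for a in reversed(partial_quotients):
        x = 1 / (a + x)
    return x


def ostrowski_digits(N, qs):
    """Greedy digits b_r with N = sum of b_r * q_r."""
    digits = [0] * len(qs)
    for r in range(len(qs) - 1, -1, -1):
        digits[r], N = divmod(N, qs[r])
    return digits


QUOTIENTS = [2, 1, 3, 100]  # truncation after 100 is an assumption
qs = convergent_denominators(QUOTIENTS)
alpha = cf_value(QUOTIENTS)
digits = ostrowski_digits(8, qs)
alpha_101 = (7 * alpha) % 1

if __name__ == "__main__":
    print(qs, float(alpha), digits, float(alpha_101))
```

Under this truncation $\alpha=\frac{401}{1103}$, the digits are $b_{0}=0,b_{1}=1,b_{2}=2$, and $\{7\alpha\}=\frac{601}{1103}$, which exceeds $\frac{1}{2}$ as the example states.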

Analysis of Upper Bound Components

(7.19) Brαϕ\displaystyle B_{r}^{\alpha}\phi br>0(brPqr+2<qr<qnEr(ϕr0,qrqr1αϕ¯(1qr))+s=0br1(Erϕrsqrα+qr>1Orϕrsqr1α))\displaystyle\leq\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}P_{q_{r}}+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\phi_{r0,q_{r}-q_{r-1}}^{\alpha}-\overline{\phi}\left(\frac{1}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(E_{r}\phi_{rsq_{r}}^{\alpha}+\left\llbracket q_{r}>1\right\rrbracket O_{r}\phi_{rsq_{r-1}}^{\alpha}\right)\right)
(7.20) Br¯α¯¯ϕ¯\displaystyle\overline{B_{\overline{r}}^{\overline{\alpha}}}\overline{\phi} br>0(brPqr+2<qr<qnOr(ϕ¯r0,qrqr1αϕ¯(1qr))+s=0br1(Orϕ¯rsqrα+qr>1Erϕ¯rsqr1α))\displaystyle\leq\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}P_{q_{r}}+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\overline{\phi}_{r0,q_{r}-q_{r-1}}^{\alpha}-\overline{\phi}\left(\frac{1}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(O_{r}\overline{\phi}_{rsq_{r}}^{\alpha}+\left\llbracket q_{r}>1\right\rrbracket E_{r}\overline{\phi}_{rsq_{r-1}}^{\alpha}\right)\right)

We now analyse the third terms above further, exploiting the bounds for $\alpha_{rst}$ developed in Section 5.

First we need to treat the case qr=1q_{r}=1 separately.

Lemma 96.

For qr=1q_{r}=1 we have Pqr=0P_{q_{r}}=0 and then Brαϕ=s=0br1ErϕrsqrαB_{r}^{\alpha}\phi=\sum_{s=0}^{b_{r}-1}E_{r}\phi_{rsq_{r}}^{\alpha}

(7.21) s=0br1ϕ(s+1qr+1/+1qr+2/)B¯r(ϕ)=s=0br1ϕrsqr=Br(ϕ)s=0br1ϕ(sqr+1/+1qr+2/)\sum_{s=0}^{b_{r}-1}\phi\left(\frac{s+1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)\leq\overline{B}_{r}(\phi)=\sum_{s=0}^{b_{r}-1}\phi_{rsq_{r}}=B_{r}(\phi)\leq\sum_{s=0}^{b_{r}-1}\phi\left(\frac{s}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)

where the inequalities are reversed under the dualities ϕϕ,ϕϕ¯\phi\mapsto-\phi,\phi\mapsto\overline{\phi}, and become equalities for constant ϕ\phi. The upper bounds also give s=0br1ϕrsqr=Brϕϕ(1qr+2/)+s=1br1ϕ(sqr+1/)\sum_{s=0}^{b_{r}-1}\phi_{rsq_{r}}=B_{r}\phi\leq\phi\left(\frac{1}{q_{r+2}^{/}}\right)+\sum_{s=1}^{b_{r}-1}\phi\left(\frac{s}{q_{r+1}^{/}}\right) and s=0br1ϕ¯rsqr=Brϕ¯s=1brϕ¯(ar+1/sqr+1/)\sum_{s=0}^{b_{r}-1}\overline{\phi}_{rsq_{r}}=B_{r}\overline{\phi}\leq\sum_{s=1}^{b_{r}}\overline{\phi}\left(\frac{a_{r+1}^{/}-s}{q_{r+1}^{/}}\right)

Proof.

Since ϕΦ\phi\in\Phi, ϕ\phi is monotonic descending and we have by definition (see 7.5) for any monotonic descending ϕ\phi and qr=1q_{r}=1:

(7.22) Br(ϕ)\displaystyle B_{r}(\phi) =B¯r(ϕ)=br>0s=0br1ϕrsqr\displaystyle=\overline{B}_{r}(\phi)=\left\llbracket b_{r}>0\right\rrbracket\sum_{s=0}^{b_{r}-1}\phi_{rsq_{r}}

If $b_{r}=0$ the sum is empty, so the condition $\left\llbracket b_{r}>0\right\rrbracket$ becomes redundant, which establishes the central equalities in (7.21). The rest of (7.21) then follows from (48). The duality result follows from (2.4).

The second upper bound for s=0br1ϕrsqr\sum_{s=0}^{b_{r}-1}\phi_{rsq_{r}} is just a rewrite of (7.21). Using duality we have s=0br1ϕ¯rsqrs=0br1ϕ¯(s+1qr+1/+1qr+2/)\sum_{s=0}^{b_{r}-1}\overline{\phi}_{rsq_{r}}\leq\sum_{s=0}^{b_{r}-1}\overline{\phi}\left(\frac{s+1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)

Now using 1qr+2/+ar+1sqr+1/=ar+1/sqr+1/\frac{1}{q_{r+2}^{/}}+\frac{a_{r+1}-s}{q_{r+1}^{/}}=\frac{a_{r+1}^{/}-s}{q_{r+1}^{/}} from (16) we get s+1qr+1/+1qr+2/=ar+1/(ar+1s1)qr+1/=ar+1/uqr+1/\frac{s+1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}=\frac{a_{r+1}^{/}-(a_{r+1}-s-1)}{q_{r+1}^{/}}=\frac{a_{r+1}^{/}-u}{q_{r+1}^{/}} where u=ar+1s1u=a_{r+1}-s-1 so s=0br1ϕ¯(s+1qr+1/+1qr+2/)=u=ar+1brar+11ϕ¯(ar+1/uqr+1/)u=ar+1brtar+11tϕ¯(ar+1/uqr+1/)\sum_{s=0}^{b_{r}-1}\overline{\phi}\left(\frac{s+1}{q_{r+1}^{/}}+\frac{1}{q_{r+2}^{/}}\right)=\sum_{u=a_{r+1}-b_{r}}^{a_{r+1}-1}\overline{\phi}\left(\frac{a_{r+1}^{/}-u}{q_{r+1}^{/}}\right)\leq\sum_{u=a_{r+1}-b_{r}-t}^{a_{r+1}-1-t}\overline{\phi}\left(\frac{a_{r+1}^{/}-u}{q_{r+1}^{/}}\right) for t0t\geq 0. We put t=ar+1br1t=a_{r+1}-b_{r}-1 to get u=1brϕ¯(ar+1/uqr+1/)\sum_{u=1}^{b_{r}}\overline{\phi}\left(\frac{a_{r+1}^{/}-u}{q_{r+1}^{/}}\right).

We now look at the general case where qr>1q_{r}>1. For convenience we define some further shorthand notation:

Definition 97.

Define $C_{rs}^{\alpha}\phi\coloneqq E_{r}\phi_{rsq_{r}}^{\alpha}+\left\llbracket q_{r}>1\right\rrbracket O_{r}\phi_{rsq_{r-1}}^{\alpha}$ and its double dual $\overline{C}_{rs}^{\alpha}\overline{\phi}\coloneqq O_{r}\overline{\phi}_{rsq_{r}}^{\alpha}+\left\llbracket q_{r}>1\right\rrbracket E_{r}\overline{\phi}_{rsq_{r-1}}^{\alpha}$ (the summands of the third terms of (7.19) and (7.20) respectively).

For ϕΦ\phi\in\Phi (ie monotone decreasing, bounded below), we now investigate upper bounds for the qr>1q_{r}>1 terms in CrsϕC_{rs}\phi.

Using the bounds on $\alpha_{rst}$ from 44 gives:

(7.23) Crsϕ=(ϕrsqrE+ϕrsqr1O)\displaystyle C_{rs}\phi=\left(\phi_{rsq_{r}}^{E}+\phi_{rsq_{r-1}}^{O}\right) =ϕE(s+1qr+1/)+ϕO(1qr+2/+ar+1sqr+1/)\displaystyle=\phi^{E}\left(\frac{s+1}{q_{r+1}^{/}}\right)+\phi^{O}\left(\frac{1}{q_{r+2}^{/}}+\frac{a_{r+1}-s}{q_{r+1}^{/}}\right) r=n\displaystyle r=n
ϕE(1qr+2/+s+1qr+1/)+ϕO(ar+1(s1)qr+1/)\displaystyle\geq\phi^{E}\left(\frac{1}{q_{r+2}^{/}}+\frac{s+1}{q_{r+1}^{/}}\right)+\phi^{O}\left(\frac{a_{r+1}-\left(s-1\right)}{q_{r+1}^{/}}\right) r<n\displaystyle r<n
ϕE(1qr+2/+sqr+1/)+ϕO(ar+1sqr+1/)\displaystyle\leq\phi^{E}\left(\frac{1}{q_{r+2}^{/}}+\frac{s}{q_{r+1}^{/}}\right)+\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right) r<n\displaystyle r<n

and

(7.24) C¯rsϕ¯=(ϕ¯rsqrO+ϕ¯rsqr1E)\displaystyle\overline{C}_{rs}\overline{\phi}=\left(\overline{\phi}_{rsq_{r}}^{O}+\overline{\phi}_{rsq_{r-1}}^{E}\right) =ϕO(s+1qr+1/)+ϕE(1qr+2/+ar+1sqr+1/)\displaystyle=\phi^{O}\left(\frac{s+1}{q_{r+1}^{/}}\right)+\phi^{E}\left(\frac{1}{q_{r+2}^{/}}+\frac{a_{r+1}-s}{q_{r+1}^{/}}\right) r=n\displaystyle r=n
ϕO(1qr+2/+s+1qr+1/)+ϕE(ar+1(s1)qr+1/)\displaystyle\geq\phi^{O}\left(\frac{1}{q_{r+2}^{/}}+\frac{s+1}{q_{r+1}^{/}}\right)+\phi^{E}\left(\frac{a_{r+1}-\left(s-1\right)}{q_{r+1}^{/}}\right) r<n\displaystyle r<n
ϕO(1qr+2/+sqr+1/)+ϕE(ar+1sqr+1/)\displaystyle\leq\phi^{O}\left(\frac{1}{q_{r+2}^{/}}+\frac{s}{q_{r+1}^{/}}\right)+\phi^{E}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right) r<n\displaystyle r<n

Grouping terms by quasiperiod

Note in the sums (7.23) that the denominator of each fraction is of order $q_{r+1}^{/}$, with the exception of the upper bounds for $\phi_{rsq_{r}}^{E},\overline{\phi}_{rsq_{r}}^{O}$ for $s=0,r<n$, which are of order $q_{r+2}^{/}$.

But the size of ϕ(1qr/)\phi\left(\frac{1}{q_{r}^{/}}\right) is sensitively dependent on the size of qr/q_{r}^{/} when ϕ\phi is unbounded at 0+0^{+}, and so we will now group terms by denominator, as follows.

Definition 98.

Let CrC_{r} be the sum of the terms in the double sum DSϕ=r=0ns=1br1CrsϕDS\phi=\sum_{r=0}^{n}\sum_{s=1}^{b_{r}-1}C_{rs}\phi with denominator qr+1/q_{r+1}^{/}, so that DS(ϕ)=r=0nCr(ϕ)DS(\phi)=\sum_{r=0}^{n}C_{r}(\phi).

For qr=1q_{r}=1, we have from (7.21) Cr(ϕ)=s=1br1ϕ(sqr+1/)C_{r}(\phi)=\sum_{s=1}^{b_{r}-1}\phi\left(\frac{s}{q_{r+1}^{/}}\right) and C¯r(ϕ)=s=1brϕ¯(ar+1sqr+1/)\overline{C}_{r}(\phi)=\sum_{s=1}^{b_{r}}\overline{\phi}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right).

Then by inspection of (7.23), promoting the terms with denominators qr+2/q_{r+2}^{/} we have for r1r\geq 1:

(7.25) Cn(ϕ)\displaystyle C_{n}(\phi) =s=0bn1ϕE(s+1qn+1/)+s=0bn1ϕO(an+1/sqn+1/)+bn1>0Orϕ(1qn+1/)\displaystyle=\sum_{s=0}^{b_{n}-1}\phi^{E}\left(\frac{s+1}{q_{n+1}^{/}}\right)+\sum_{s=0}^{b_{n}-1}\phi^{O}\left(\frac{a_{n+1}^{/}-s}{q_{n+1}^{/}}\right)\,\,+\left\llbracket b_{n-1}>0\right\rrbracket O_{r}\phi\left(\frac{1}{q_{n+1}^{/}}\right) r=n\displaystyle r=n
Cr(ϕ)\displaystyle C_{r}(\phi) s=1br1ϕE(1qr+2/+sqr+1/)+s=0br1ϕO(ar+1sqr+1/)+br1>0Orϕ(1qr+1/)\displaystyle\leq\sum_{s=1}^{b_{r}-1}\phi^{E}\left(\frac{1}{q_{r+2}^{/}}+\frac{s}{q_{r+1}^{/}}\right)+\sum_{s=0}^{b_{r}-1}\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right)\,\,+\left\llbracket b_{r-1}>0\right\rrbracket O_{r}\phi\left(\frac{1}{q_{r+1}^{/}}\right) r<n\displaystyle r<n

Note that at this point we have simply rearranged terms using the originating inequalities (7.23). However it is difficult to make use of all the information in (7.25) except in special circumstances such as periodic partial quotients. Our focus in this paper is on the general case, and so we now develop some relaxations of (7.25) which are less precise but more tractable for our purposes. Since ϕ\phi is decreasing we have ϕE(1qr+2/+sqr+1/)ϕE(sqr+1/)\phi^{E}\left(\frac{1}{q_{r+2}^{/}}+\frac{s}{q_{r+1}^{/}}\right)\leq\phi^{E}\left(\frac{s}{q_{r+1}^{/}}\right) and ϕO(ar+1/sqr+1/)ϕO(ar+1sqr+1/)\phi^{O}\left(\frac{a_{r+1}^{/}-s}{q_{r+1}^{/}}\right)\leq\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right), which with some rewriting results in the slightly less precise but more uniform inequalities:

(7.26) Cn(ϕ)\displaystyle C_{n}(\phi) s=1bnϕE(sqn+1/)+bn1>0ϕO(1qn+1/)+s=0bn1ϕO(an+1sqn+1/)\displaystyle\leq\sum_{s=1}^{b_{n}}\phi^{E}\left(\frac{s}{q_{n+1}^{/}}\right)+\left\llbracket b_{n-1}>0\right\rrbracket\phi^{O}\left(\frac{1}{q_{n+1}^{/}}\right)+\sum_{s=0}^{b_{n}-1}\phi^{O}\left(\frac{a_{n+1}-s}{q_{n+1}^{/}}\right)
Cr(ϕ)\displaystyle C_{r}(\phi) s=1br1ϕE(sqr+1/)+br1>0ϕO(1qr+1/)+s=0br1ϕO(ar+1sqr+1/)\displaystyle\leq\sum_{s=1}^{b_{r}-1}\phi^{E}\left(\frac{s}{q_{r+1}^{/}}\right)+\left\llbracket b_{r-1}>0\right\rrbracket\phi^{O}\left(\frac{1}{q_{r+1}^{/}}\right)+\sum_{s=0}^{b_{r}-1}\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right)

Note that for rr even CrC_{r} has br1b_{r}-1 or brb_{r} (for r=nr=n only) non-zero terms, whereas for rr odd CrC_{r} has brb_{r} or br+1b_{r}+1 (for br1>0b_{r-1}>0) non-zero terms. We will use this form for one particular application in the next section, but for many purposes we can afford to relax the inequalities still further to the point where we can eliminate the asymmetry.

We first use the fact that ϕ\phi is descending so that for br<ar+1b_{r}<a_{r+1} we have s=0br1ϕO(ar+1sqr+1/)=s=ar+1(br1)ar+1ϕO(sqr+1/)s=2br+1ϕO(sqr+1/)\sum_{s=0}^{b_{r}-1}\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right)=\sum_{s=a_{r+1}-(b_{r}-1)}^{a_{r+1}}\phi^{O}\left(\frac{s}{q_{r+1}^{/}}\right)\leq\sum_{s=2}^{b_{r}+1}\phi^{O}\left(\frac{s}{q_{r+1}^{/}}\right). Using br1>0=1br1=0\left\llbracket b_{r-1}>0\right\rrbracket=1-\left\llbracket b_{r-1}=0\right\rrbracket and ϕ\phi descending gives

br1>0ϕO(1qr+1/)+s=0br1ϕO(ar+1sqr+1/)s=1br+1ϕO(sqr+1/)br1=0ϕO(1qr+1/)\left\llbracket b_{r-1}>0\right\rrbracket\phi^{O}\left(\frac{1}{q_{r+1}^{/}}\right)+\sum_{s=0}^{b_{r}-1}\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right)\leq\sum_{s=1}^{b_{r}+1}\phi^{O}\left(\frac{s}{q_{r+1}^{/}}\right)-\left\llbracket b_{r-1}=0\right\rrbracket\phi^{O}\left(\frac{1}{q_{r+1}^{/}}\right)
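The reindexing and relaxation in this step can be sanity-checked on sample numbers. In the sketch below all inputs are assumptions chosen for illustration: $\phi(x)=1/x$ as a decreasing observable, `a` and `b` standing in for $a_{r+1}$ and $b_{r}$ with $b<a$, and `q` a positive real standing in for $q_{r+1}^{/}$.

```python
def descending_sum(phi, a, b, q):
    """sum over s = 0..b-1 of phi((a - s) / q), as it first appears."""
    return sum(phi((a - s) / q) for s in range(b))


def reindexed_sum(phi, a, b, q):
    """The same terms written as a sum over s = a-(b-1)..a of phi(s / q)."""
    return sum(phi(s / q) for s in range(a - (b - 1), a + 1))


def relaxed_sum(phi, b, q):
    """The relaxed upper bound: sum over s = 2..b+1 of phi(s / q)."""
    return sum(phi(s / q) for s in range(2, b + 2))


def phi(x):
    """Sample decreasing observable on (0, 1)."""
    return 1.0 / x


A, B, Q = 7, 4, 9.3  # illustrative values with B < A < Q

if __name__ == "__main__":
    print(descending_sum(phi, A, B, Q), reindexed_sum(phi, A, B, Q), relaxed_sum(phi, B, Q))
```

The relaxation is termwise: for $b<a$ the lowest index $a-(b-1)$ is at least 2, so each argument in the reindexed sum is at least as large as the corresponding argument in the relaxed sum, and a decreasing $\phi$ is correspondingly no larger.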

If $b_{r}=a_{r+1}$ we have $\sum_{s=0}^{b_{r}-1}\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right)=\sum_{s=1}^{a_{r+1}}\phi^{O}\left(\frac{s}{q_{r+1}^{/}}\right)$, and we also have from (18) that $b_{r-1}=0$, so that $\left\llbracket b_{r-1}>0\right\rrbracket\phi^{O}\left(\frac{1}{q_{r+1}^{/}}\right)+\sum_{s=0}^{b_{r}-1}\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right)=\sum_{s=1}^{a_{r+1}}\phi^{O}\left(\frac{s}{q_{r+1}^{/}}\right)$.

In addition we have ϕE(1qr+2/+sqr+1/)ϕE(sqr+1/)\phi^{E}\left(\frac{1}{q_{r+2}^{/}}+\frac{s}{q_{r+1}^{/}}\right)\leq\phi^{E}\left(\frac{s}{q_{r+1}^{/}}\right) and ϕO(ar+1/sqr+1/)ϕO(ar+1sqr+1/)\phi^{O}\left(\frac{a_{r+1}^{/}-s}{q_{r+1}^{/}}\right)\leq\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right). Hence, writing crα=c(N,α,r)=Er(br1+r=n)+Ormin(br+1,ar+1)c_{r}^{\alpha}=c(N,\alpha,r)=E_{r}\left(b_{r}-1+\left\llbracket r=n\right\rrbracket\right)+O_{r}\min(b_{r}+1,a_{r+1}), we can combine all these results for 2rn2\leq r\leq n into:

(7.27) Cr(ϕ)\displaystyle C_{r}(\phi) s=1crαϕ(sqr+1/)br1=0ϕO(1qr+1/)\displaystyle\leq\sum_{s=1}^{c_{r}^{\alpha}}\phi\left(\frac{s}{q_{r+1}^{/}}\right)\,\,-\left\llbracket b_{r-1}=0\right\rrbracket\phi^{O}\left(\frac{1}{q_{r+1}^{/}}\right)

Finally, if ϕ(1qr+1/)0\phi\left(\frac{1}{q_{r+1}^{/}}\right)\geq 0 we can reduce this to Crs=1crϕ(sqr+1/)C_{r}\leq\sum_{s=1}^{c_{r}}\phi\left(\frac{s}{q_{r+1}^{/}}\right). Note that for rr odd, this adds significant imprecision if br1=0b_{r-1}=0, but then we cannot use this information if we do not know br1=0b_{r-1}=0, and in general we will not.

The argument for ϕ¯\overline{\phi} is precisely analogous giving for c¯rα=cr¯α¯=Or(br1+r=n)+Ermin(br+1,ar+1)\overline{c}_{r}^{\alpha}=c_{\overline{r}}^{\overline{\alpha}}=O_{r}\left(b_{r}-1+\left\llbracket r=n\right\rrbracket\right)+E_{r}\min(b_{r}+1,a_{r+1}), and 1rn1\leq r\leq n:

(7.28) C¯rα(ϕ¯)\displaystyle\overline{C}_{r}^{\alpha}(\overline{\phi}) s=1c¯rαϕ(sqr+1/)br1=0ϕE(1qr+1/)\displaystyle\leq\sum_{s=1}^{\overline{c}_{r}^{\alpha}}\phi\left(\frac{s}{q_{r+1}^{/}}\right)\,\,-\left\llbracket b_{r-1}=0\right\rrbracket\phi^{E}\left(\frac{1}{q_{r+1}^{/}}\right)

7.6. Summary of Results

For convenience we summarise here the results of this Section.

Recall SNϕ=r=0ns=0br1SrsαϕS_{N}\phi=\sum_{r=0}^{n}\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha}\phi where Srsαϕ=Sqrα(ϕ,αrs)=t=1qrϕ(αrs+tα)=t=1qrϕrstαS_{rs}^{\alpha}\phi=S_{q_{r}}^{\alpha}(\phi,\alpha_{rs})=\sum_{t=1}^{q_{r}}\phi(\alpha_{rs}+t\alpha)=\sum_{t=1}^{q_{r}}\phi_{rst}^{\alpha}.

Lemma 99 (Bounds for monotonic functions).

Let ϕ\phi be a monotonic decreasing observable. Then for r0r\geq 0 we have Br¯α¯¯ϕs=0br1SrsαϕBrαϕ\overline{B_{\overline{r}}^{\overline{\alpha}}}\phi\leq\sum_{s=0}^{b_{r}-1}S_{rs}^{\alpha}\phi\leq B_{r}^{\alpha}\phi where Brα,Br¯α¯¯B_{r}^{\alpha},\overline{B_{\overline{r}}^{\overline{\alpha}}} are the Bound Functionals:

(7.29) Brαϕ\displaystyle B_{r}^{\alpha}\phi =br>0(qr>1br(Pqrϕ¯(1qr))+2<qr<qnEr(ϕr0,qrqr1ϕ¯(2qr))+s=0br1(ϕrsqr+qr>1ϕrsqr1))\displaystyle=\left\llbracket b_{r}>0\right\rrbracket\left(\left\llbracket q_{r}>1\right\rrbracket b_{r}\left(P_{q_{r}}-\overline{\phi}(\frac{1}{q_{r}})\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\overline{\phi}\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\phi_{rsq_{r-1}}\right)\right)
(7.30) Br¯α¯¯ϕ\displaystyle\overline{B_{\overline{r}}^{\overline{\alpha}}}\phi =Br¯α¯ϕ¯=br>0(qr>1br(Pqrϕ(1qr))+2<qr<qnOr(ϕr0,qrqr1ϕ(2qr))+s=0br1(ϕrsqr+qr>1ϕrsqr1))\displaystyle=B_{\overline{r}}^{\overline{\alpha}}\overline{\phi}=\left\llbracket b_{r}>0\right\rrbracket\left(\left\llbracket q_{r}>1\right\rrbracket b_{r}\left(P_{q_{r}}-\phi(\frac{1}{q_{r}})\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\phi\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\phi_{rsq_{r-1}}\right)\right)
Lemma 100 (Upper bounds for primitive functions).

Let $\phi\in\Phi$, the set of monotone descending positive functions on $(0,1)$. Then for $q_{r}>2$ we have:

(7.31) Brαϕ\displaystyle B_{r}^{\alpha}\phi br>0(brPqr+2<qr<qnEr(ϕr0,qrqr1αϕ¯(1qr))+s=0br1(Erϕrsqrα+Orϕrsqr1α))\displaystyle\leq\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}P_{q_{r}}+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\phi_{r0,q_{r}-q_{r-1}}^{\alpha}-\overline{\phi}\left(\frac{1}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(E_{r}\phi_{rsq_{r}}^{\alpha}+O_{r}\phi_{rsq_{r-1}}^{\alpha}\right)\right)
(7.32) Br¯α¯¯ϕ¯\displaystyle\overline{B_{\overline{r}}^{\overline{\alpha}}}\overline{\phi} br>0(brPqr+2<qr<qnOr(ϕ¯r0,qrqr1αϕ¯(1qr))+s=0br1(Orϕ¯rsqrα+Erϕ¯rsqr1α))\displaystyle\leq\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}P_{q_{r}}+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\overline{\phi}_{r0,q_{r}-q_{r-1}}^{\alpha}-\overline{\phi}\left(\frac{1}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(O_{r}\overline{\phi}_{rsq_{r}}^{\alpha}+E_{r}\overline{\phi}_{rsq_{r-1}}^{\alpha}\right)\right)
Definition 101.

We define CrsαϕErϕrsqrα+Orϕrsqr1αC_{rs}^{\alpha}\phi\coloneqq E_{r}\phi_{rsq_{r}}^{\alpha}+O_{r}\phi_{rsq_{r-1}}^{\alpha} and its quasiperiod dual C¯rsαϕ¯Orϕ¯rsqrα+Erϕ¯rsqr1α\overline{C}_{rs}^{\alpha}\overline{\phi}\coloneqq O_{r}\overline{\phi}_{rsq_{r}}^{\alpha}+E_{r}\overline{\phi}_{rsq_{r-1}}^{\alpha}. We let CrC_{r} be the sum of the terms in the double sum DSϕ=r=0ns=1br1CrsϕDS\phi=\sum_{r=0}^{n}\sum_{s=1}^{b_{r}-1}C_{rs}\phi with denominator qr+1/q_{r+1}^{/}, so that DS(ϕ)=r=0nCr(ϕ)DS(\phi)=\sum_{r=0}^{n}C_{r}(\phi). And SNαϕrBrαϕrSSrαϕ+CrαϕS_{N}^{\alpha}\phi\leq\sum_{r}B_{r}^{\alpha}\phi\leq\sum_{r}SS_{r}^{\alpha}\phi+C_{r}^{\alpha}\phi

(7.33) Cn(ϕ)\displaystyle C_{n}(\phi) =s=1bnϕE(sqn+1/)+s=0bn1ϕO(an+1/sqn+1/)+bn1>0Orϕ(1qn+1/)\displaystyle=\sum_{s=1}^{b_{n}}\phi^{E}\left(\frac{s}{q_{n+1}^{/}}\right)+\sum_{s=0}^{b_{n}-1}\phi^{O}\left(\frac{a_{n+1}^{/}-s}{q_{n+1}^{/}}\right)\,\,+\left\llbracket b_{n-1}>0\right\rrbracket O_{r}\phi\left(\frac{1}{q_{n+1}^{/}}\right) r=n\displaystyle r=n
Cr(ϕ)\displaystyle C_{r}(\phi) s=1br1ϕE(1qr+2/+sqr+1/)+s=0br1ϕO(ar+1sqr+1/)+br1>0Orϕ(1qr+1/)\displaystyle\leq\sum_{s=1}^{b_{r}-1}\phi^{E}\left(\frac{1}{q_{r+2}^{/}}+\frac{s}{q_{r+1}^{/}}\right)+\sum_{s=0}^{b_{r}-1}\phi^{O}\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right)\,\,+\left\llbracket b_{r-1}>0\right\rrbracket O_{r}\phi\left(\frac{1}{q_{r+1}^{/}}\right) r<n\displaystyle r<n

Writing crα=c(N,α,r)=Er(br1+r=n)+Ormin(br+1,ar+1)c_{r}^{\alpha}=c(N,\alpha,r)=E_{r}\left(b_{r}-1+\left\llbracket r=n\right\rrbracket\right)+O_{r}\min(b_{r}+1,a_{r+1}) this gives

(7.34) Cr(ϕ)\displaystyle C_{r}(\phi) s=1crαϕ(sqr+1/)br1=0ϕO(1qr+1/)\displaystyle\leq\sum_{s=1}^{c_{r}^{\alpha}}\phi\left(\frac{s}{q_{r+1}^{/}}\right)\,\,-\left\llbracket b_{r-1}=0\right\rrbracket\phi^{O}\left(\frac{1}{q_{r+1}^{/}}\right)

8. Application to the function family θβ(x)=xβ,β1\theta^{\beta}(x)=x^{-\beta},\beta\geq 1

The range $0\leq\beta<1$ could also be covered by the same methods, but we do not pursue it here.

In the previous section we established inequalities applicable to any suitable decreasing observable $\phi$. In this section we will further develop the inequalities for the special family of functions $\theta^{\beta}(x)=x^{-\beta}$ for $\beta\geq 1$. (For $\beta\leq 1$, $S_{N}\theta^{\beta}$ can be estimated using the Denjoy-Koksma(**) result, combining it with the technique of truncated forms of the observables for $0<\beta<1$.)

We recall from the previous section that the lower bound for $S_{N}\phi$ can be studied in a straightforward way, but the upper bound is more complex and we will treat it in parts. In particular the upper bound for $S_{N}$ splits naturally into a sum of 3 parts: $S_{N}\phi\leq\left(S_{N}^{1}+S_{N}^{2}+S_{N}^{3}\right)\phi$ where

(8.1) SN1ϕ\displaystyle S_{N}^{1}\phi =r=0nbr>0brPqr(ϕ)\displaystyle=\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket b_{r}P_{q_{r}}(\phi)
(8.2) SN2ϕ\displaystyle S_{N}^{2}\phi =r=0nbr>02<qr<qnEr(ϕr0,qrqr1ϕ¯(1qr))\displaystyle=\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\overline{\phi}\left(\frac{1}{q_{r}}\right)\right)
(8.3) SN3ϕ\displaystyle S_{N}^{3}\phi =r=0nbr>0s=0br1(Erϕrsqr+Orϕrsqr1)=r=0nCrϕ\displaystyle=\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket\sum_{s=0}^{b_{r}-1}\left(E_{r}\phi_{rsq_{r}}+O_{r}\phi_{rsq_{r-1}}\right)=\sum_{r=0}^{n}C_{r}\phi

Note that the first two sums are single sums whilst the third is a double sum. Also SN1ϕS_{N}^{1}\phi vanishes for ϕ\phi anti-symmetric (**).

From this point we assume α\alpha is fixed and can be dropped from the notation, and that the canonical Ostrowski representation of NN is N=r=0nbrqrN=\sum_{r=0}^{n}b_{r}q_{r} .

8.1. Some ancillary lemmas

The following function will play a large role in this section:

Definition 102 (Generalised Harmonic Function).

For y>0y>0 we define Hkβ(y)s=0k1(y+s)βH_{k}^{\beta}(y)\coloneqq\sum_{s=0}^{k-1}(y+s)^{-\beta} (with Hkβ(y)0H_{k}^{\beta}(y)\coloneqq 0 for k0k\leq 0). We will also write HkβHkβ(1)H_{k}^{\beta}\coloneqq H_{k}^{\beta}(1).

Note that this is a positive function decreasing in $y$ and increasing in $k$. It follows that for $y\geq 1$ we have $H_{k}^{\beta}(y)\leq H_{k}^{\beta}<H_{\infty}^{\beta}=\zeta(\beta)$. So for $\beta>1$, $H_{k}^{\beta}(y)$ is $O(1)$, but the constant becomes arbitrarily large as $\beta\downarrow 1$.
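As a quick numerical illustration of Definition 102 and the remarks above, the following Python sketch evaluates the generalised harmonic function directly from its defining sum. The function name `gen_harmonic` and the sample arguments are illustrative choices of ours, not notation from the paper.

```python
import math


def gen_harmonic(k, beta, y=1.0):
    """Return H_k^beta(y) = sum_{s=0}^{k-1} (y+s)^(-beta); zero when k <= 0."""
    if k <= 0:
        return 0.0
    return sum((y + s) ** (-beta) for s in range(k))


if __name__ == "__main__":
    # Decreasing in y: compare y = 1 with y = 2 (sample values only).
    print(gen_harmonic(5, 2.0, 1.0), gen_harmonic(5, 2.0, 2.0))
    # Increasing in k: compare k = 5 with k = 6.
    print(gen_harmonic(5, 2.0), gen_harmonic(6, 2.0))
    # For beta = 2 the partial sums stay below zeta(2) = pi^2 / 6.
    print(gen_harmonic(10_000, 2.0), math.pi ** 2 / 6)
```

The direct sum costs $O(k)$ operations, which is adequate for spot checks of the inequalities in this section.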

Proposition 103.

Nbnqn>bnbn+1N12NN\geq b_{n}q_{n}>\frac{b_{n}}{b_{n}+1}N\geq\frac{1}{2}N

Proof.

By definition $N<\left(b_{n}+1\right)q_{n}$, and the middle inequality follows immediately. The left hand inequality holds since $N=\sum_{r=0}^{n}b_{r}q_{r}\geq b_{n}q_{n}$, and the right hand inequality follows since $b_{n}\geq 1$. ∎
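Proposition 103 can be checked exhaustively for small cases. The sketch below assumes the usual recurrence $q_{0}=1$, $q_{1}=a_{1}$, $q_{r}=a_{r}q_{r-1}+q_{r-2}$ and computes the Ostrowski digits greedily. The helper names and the sample partial quotients are hypothetical, chosen only for illustration.

```python
def denominators(partial_quotients):
    """Return [q_0, q_1, ...] with q_0 = 1 and q_r = a_r*q_{r-1} + q_{r-2}."""
    qs, prev = [1], 0  # prev holds q_{r-2}, starting from q_{-1} = 0
    for a in partial_quotients:
        qs.append(a * qs[-1] + prev)
        prev = qs[-2]
    return qs


def ostrowski_digits(N, qs):
    """Greedy digits b_0..b_n with N = sum b_r*q_r, valid for 1 <= N < qs[-1]."""
    n = max(r for r, q in enumerate(qs) if q <= N)
    digits, rem = [0] * (n + 1), N
    for r in range(n, -1, -1):
        digits[r], rem = divmod(rem, qs[r])
    return digits


def check_prop_103(N, qs):
    """Check N >= b_n q_n > b_n N / (b_n + 1) >= N / 2 in integer arithmetic."""
    b = ostrowski_digits(N, qs)
    bn, qn = b[-1], qs[len(b) - 1]
    return (N >= bn * qn
            and (bn + 1) * bn * qn > bn * N
            and 2 * bn * N >= (bn + 1) * N)


if __name__ == "__main__":
    qs = denominators([1, 2, 3, 1, 4])  # sample a_1..a_5, illustration only
    print(qs)
    print(all(check_prop_103(N, qs) for N in range(1, qs[-1])))
```

Clearing denominators keeps every comparison in exact integer arithmetic, so the strict inequality is tested without rounding concerns.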

This gives yet less sharp, but again more digestible alternatives.

Lemma 104.

Suppose β1\beta\geq 1 then r=0n1br>0qrβr=0n1brqrβ<qnβ\sum_{r=0}^{n-1}\left\llbracket b_{r}>0\right\rrbracket q_{r}^{\beta}\leq\sum_{r=0}^{n-1}b_{r}q_{r}^{\beta}<q_{n}^{\beta}

Proof.

The first inequality is trivial. For β1\beta\geq 1, r=0n1brqrβ(r=0n1brqr)β\sum_{r=0}^{n-1}b_{r}q_{r}^{\beta}\leq\left(\sum_{r=0}^{n-1}b_{r}q_{r}\right)^{\beta} and r=0n1brqr=Nbnqn<qn\sum_{r=0}^{n-1}b_{r}q_{r}=N-b_{n}q_{n}<q_{n}. ∎
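The chain of inequalities in Lemma 104 can be spot-checked numerically. The following Python sketch uses a greedy Ostrowski expansion, assuming the recurrence $q_{0}=1$, $q_{1}=a_{1}$, $q_{r}=a_{r}q_{r-1}+q_{r-2}$. It also tests the related bound $\sum_{r=0}^{n}\llbracket b_{r}>0\rrbracket q_{r}^{\beta}\leq\min(2q_{n}^{\beta},N^{\beta})$. All function names, the sample partial quotients and the floating point tolerance are our own assumptions.

```python
def denominators(partial_quotients):
    """Return [q_0, q_1, ...] with q_0 = 1 and q_r = a_r*q_{r-1} + q_{r-2}."""
    qs, prev = [1], 0
    for a in partial_quotients:
        qs.append(a * qs[-1] + prev)
        prev = qs[-2]
    return qs


def ostrowski_digits(N, qs):
    """Greedy digits b_0..b_n with N = sum b_r*q_r, valid for 1 <= N < qs[-1]."""
    n = max(r for r, q in enumerate(qs) if q <= N)
    digits, rem = [0] * (n + 1), N
    for r in range(n, -1, -1):
        digits[r], rem = divmod(rem, qs[r])
    return digits


def check_lemma_104(N, qs, beta, tol=1e-12):
    """Check the Lemma 104 chain and the min(2 q_n^beta, N^beta) bound for one N."""
    b = ostrowski_digits(N, qs)
    n = len(b) - 1
    weighted = sum(b[r] * qs[r] ** beta for r in range(n))
    indicator = sum(qs[r] ** beta for r in range(n) if b[r] > 0)
    lemma = indicator <= weighted < qs[n] ** beta
    with_top = indicator + qs[n] ** beta  # b_n > 0 by construction
    bound = min(2 * qs[n] ** beta, N ** beta)
    return lemma and with_top <= bound * (1 + tol)


if __name__ == "__main__":
    qs = denominators([1, 2, 3, 1, 4])  # sample partial quotients only
    print(all(check_lemma_104(N, qs, beta)
              for beta in (1.0, 1.5, 2.0, 3.0)
              for N in range(1, qs[-1])))
```

A passing check on a finite range is of course no substitute for the proof. It only guards against transcription slips in the indices.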

Corollary 105.

r=0nbr>0qrβmin(2qnβ,Nβ)\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket q_{r}^{\beta}\leq\min\left(2q_{n}^{\beta},N^{\beta}\right) and r=0n1br>0qrβ<min(2qn1β,qnβ)\sum_{r=0}^{n-1}\left\llbracket b_{r}>0\right\rrbracket q_{r}^{\beta}<\min\left(2q_{n-1}^{\beta},q_{n}^{\beta}\right)

Corollary 106.

For ϕx=(1x)β,β1\phi x=(1-x)^{-\beta},\beta\geq 1 From (48), for r>0r>0 we have both ϕr0,qrqr1Eϕ¯(12qr)=(2qr)β\phi_{r0,q_{r}-q_{r-1}}^{E}\leq\overline{\phi}(\frac{1}{2q_{r}})=(2q_{r})^{\beta} and also ϕr0,qrqr1Oϕ¯(12qr)\phi_{r0,q_{r}-q_{r-1}}^{O}\leq\overline{\phi}(\frac{1}{2q_{r}}) and hence also r=0n1br>0(ϕr0,qrqr1E+ϕr0,qrqr1O)r=0n1br>0ϕ¯(12qr)2βqnβ.\sum_{r=0}^{n-1}\left\llbracket b_{r}>0\right\rrbracket\left(\phi_{r0,q_{r}-q_{r-1}}^{E}+\phi_{r0,q_{r}-q_{r-1}}^{O}\right)\leq\sum_{r=0}^{n-1}\left\llbracket b_{r}>0\right\rrbracket\overline{\phi}(\frac{1}{2q_{r}})\leq 2^{\beta}q_{n}^{\beta}. By the lemma r=1n1ϕr0,qrqr1E+ϕr0,qrqr1O2βqnβ\sum_{r=1}^{n-1}\phi_{r0,q_{r}-q_{r-1}}^{E}+\phi_{r0,q_{r}-q_{r-1}}^{O}\leq 2^{\beta}q_{n}^{\beta}.

Similarly r=1n1ϕ¯E(2qr)+ϕ¯O(2qr)(12)βqnβ\sum_{r=1}^{n-1}\overline{\phi}^{E}\left(\frac{2}{q_{r}}\right)+\overline{\phi}^{O}\left(\frac{2}{q_{r}}\right)\leq\left(\frac{1}{2}\right)^{\beta}q_{n}^{\beta}

Lemma 107 (Generalised Harmonic Function).

The following results hold:

Pqrθβt=1qr1qrβtβ=qrβHqr1βP_{q_{r}}\theta^{\beta}\leq\sum_{t=1}^{q_{r}-1}\frac{q_{r}^{\beta}}{t^{\beta}}=q_{r}^{\beta}H_{q_{r}-1}^{\beta}.

For β=1,k1\beta=1,k\geq 1 we have log(1+ky)<Hk(y)1y+log(1+k1y)\log\left(1+\frac{k}{y}\right)<H_{k}(y)\leq\frac{1}{y}+\log\left(1+\frac{k-1}{y}\right) . For y=1y=1 this gives log(1+k)<Hk1+logk\log\left(1+k\right)<H_{k}\leq 1+\log k

For β>1,k1\beta>1,k\geq 1 we have 1β1(1yβ11(y+k)β1)<Hkβ(y)1yβ+1β1(1yβ11(y+k1)β1)\frac{1}{\beta-1}\left(\frac{1}{y^{\beta-1}}-\frac{1}{(y+k)^{\beta-1}}\right)<H_{k}^{\beta}(y)\leq\frac{1}{y^{\beta}}+\frac{1}{\beta-1}\left(\frac{1}{y^{\beta-1}}-\frac{1}{(y+k-1)^{\beta-1}}\right).

Finally a useful trivial estimate is for n0n\geq 0 Hnβn>0(1+(12β)(n1))H_{n}^{\beta}\leq\left\llbracket n>0\right\rrbracket\left(1+\left(\frac{1}{2^{\beta}}\right)(n-1)\right) (equality for n{1,2}n\in\{1,2\}) which also gives Hn+11+12nH_{n+1}\leq 1+\frac{1}{2}n and Hn112nH_{n-1}\leq\frac{1}{2}n

By definition Hnβ=0H_{n}^{\beta}=0 for n0n\leq 0.

For ij0i\geq j\geq 0,

(8.4) $H_{i-j}^{\beta}(j+1)=H_{i}^{\beta}-H_{j}^{\beta}=\sum_{t=j+1}^{i}\frac{1}{t^{\beta}}\leq\frac{1}{\left(j+1\right)^{\beta}}(i-j)$

and for $i\geq 1$, $H_{i}^{\beta}\leq 1+\sum_{t=2}^{i}\frac{1}{t^{\beta}}\leq 1+\frac{1}{2^{\beta}}(i-1)$. In particular for $\beta=1$ we get for $n\geq 0$

(8.5) $H_{n}\leq\left\llbracket n>0\right\rrbracket\frac{1}{2}\left(n+1\right)$

(equality for n2n\leq 2) which also gives Hn+11+12nH_{n+1}\leq 1+\frac{1}{2}n and Hn112nH_{n-1}\leq\frac{1}{2}n
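The integral comparison bounds and the trivial estimate of Lemma 107 are easy to test numerically. The sketch below is a sanity check under our own naming (`gen_harmonic`, `integral_bounds`, `trivial_bound` are hypothetical helpers). It implements the stated bounds for $\beta=1$ and $\beta>1$ and compares them with the directly computed sum.

```python
import math


def gen_harmonic(k, beta, y=1.0):
    """Return H_k^beta(y) = sum_{s=0}^{k-1} (y+s)^(-beta); zero when k <= 0."""
    if k <= 0:
        return 0.0
    return sum((y + s) ** (-beta) for s in range(k))


def integral_bounds(k, beta, y=1.0):
    """Lower and upper bounds for H_k^beta(y) as stated in Lemma 107 (k >= 1, beta >= 1)."""
    if beta == 1.0:
        lower = math.log(1 + k / y)
        upper = 1 / y + math.log(1 + (k - 1) / y)
    else:
        c = 1 / (beta - 1)
        lower = c * (y ** (1 - beta) - (y + k) ** (1 - beta))
        upper = y ** (-beta) + c * (y ** (1 - beta) - (y + k - 1) ** (1 - beta))
    return lower, upper


def trivial_bound(n, beta):
    """The estimate [n > 0] * (1 + (n - 1) / 2^beta) for H_n^beta."""
    return 0.0 if n <= 0 else 1 + (n - 1) / 2 ** beta


if __name__ == "__main__":
    for beta in (1.0, 1.5, 3.0):
        lo, hi = integral_bounds(10, beta)
        print(beta, lo, gen_harmonic(10, beta), hi, trivial_bound(10, beta))
```

For $k=1$ the upper integral bound reduces to $y^{-\beta}$, so it is attained exactly. Any comparison in floating point should therefore allow a small tolerance.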

QrEr(12br+1)+Or(12br)=Er+12brQ_{r}\leq E_{r}\left(\frac{1}{2}b_{r}+1\right)+O_{r}\left(\frac{1}{2}b_{r}\right)=E_{r}+\frac{1}{2}b_{r}

8.2. Lower bound for SNθβS_{N}\theta^{\beta} using the generalised harmonic function HβH^{\beta}

Most interest in the mathematical community has been on upper bounds for these sums, and we will also focus primarily on these upper bounds. However first we take a little time here to derive some simple results for a lower bound also.

Since θβ>0\theta^{\beta}>0 we have the simple and crude lower bound SNθβ=r=0nSbrqrθβSbnqnθβS_{N}\theta^{\beta}=\sum_{r=0}^{n}S_{b_{r}q_{r}}\theta^{\beta}\geq S_{b_{n}q_{n}}\theta^{\beta} (where N=r=0nbrqrN=\sum_{r=0}^{n}b_{r}q_{r}). Recall we are regarding α\alpha as fixed and drop it from the notation, so that from (7.8),(7.7) we get:

s=0br1SrsϕB¯rϕ=br>0(qr>1br(Pqrϕ(1qr))+2<qr<qnOr(ϕr0,qrqr1ϕ(2qr))+s=0br1(ϕrsqr+qr>1ϕrsqr1))\sum_{s=0}^{b_{r}-1}S_{rs}\phi\geq\overline{B}_{r}\phi=\\ \left\llbracket b_{r}>0\right\rrbracket\left(\left\llbracket q_{r}>1\right\rrbracket b_{r}\left(P_{q_{r}}-\phi(\frac{1}{q_{r}})\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\phi\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\phi_{rsq_{r-1}}\right)\right)
s=0br1Srsϕ¯Brαϕ¯=br>0(qr>1br(Pqrϕ(1qr))+2<qr<qnEr(ϕ¯r0,qrqr1ϕ(2qr))+s=0br1(ϕ¯rsqr+qr>1ϕ¯rsqr1))\sum_{s=0}^{b_{r}-1}S_{rs}\overline{\phi}\geq B_{r}^{\alpha}\overline{\phi}=\\ \left\llbracket b_{r}>0\right\rrbracket\left(\left\llbracket q_{r}>1\right\rrbracket b_{r}\left(P_{q_{r}}-\phi(\frac{1}{q_{r}})\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\overline{\phi}_{r0,q_{r}-q_{r-1}}-\phi\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\overline{\phi}_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\overline{\phi}_{rsq_{r-1}}\right)\right)

Hence SNθβSbnqnθβ=s=0bn1SnsθβB¯nθβS_{N}\theta^{\beta}\geq S_{b_{n}q_{n}}\theta^{\beta}=\sum_{s=0}^{b_{n}-1}S_{ns}\theta^{\beta}\geq\overline{B}_{n}\theta^{\beta}.

Note for qn>1q_{n}>1, B¯nθβ=bn(Pqnθβ(1qn))+s=0bn1(θnsqnβ+θnsqn1β)\overline{B}_{n}\theta^{\beta}=b_{n}\left(P_{q_{n}}-\theta^{\beta}(\frac{1}{q_{n}})\right)+\sum_{s=0}^{b_{n}-1}\left(\theta_{nsq_{n}}^{\beta}+\theta_{nsq_{n-1}}^{\beta}\right). And for any 1<qrqn1<q_{r}\leq q_{n}, Pqrθβ(1qr)=qrβHqr1βqrβ=qrβ(Hqr1β1)P_{q_{r}}-\theta^{\beta}(\frac{1}{q_{r}})=q_{r}^{\beta}H_{q_{r}-1}^{\beta}-q_{r}^{\beta}=q_{r}^{\beta}\left(H_{q_{r}-1}^{\beta}-1\right).

For nn even s=0bn1θnsqnβ=qn+1/βHbnβ\sum_{s=0}^{b_{n}-1}\theta_{nsq_{n}}^{\beta}=q_{n+1}^{/\beta}H_{b_{n}}^{\beta} and s=0bn1θnsqn1β>s=0bn11=bn\sum_{s=0}^{b_{n}-1}\theta_{nsq_{n-1}}^{\beta}>\sum_{s=0}^{b_{n}-1}1=b_{n}. For nn odd s=0bn1θnsqnβbn\sum_{s=0}^{b_{n}-1}\theta_{nsq_{n}}^{\beta}\geq b_{n} and θnsqn1β>(qn+1/an+1/s)β>qn/β\theta_{nsq_{n-1}}^{\beta}>\left(\frac{q_{n+1}^{/}}{a_{n+1}^{/}-s}\right)^{\beta}>q_{n}^{/\beta} giving s=0bn1θnsqn1β>bnqn/β\sum_{s=0}^{b_{n}-1}\theta_{nsq_{n-1}}^{\beta}>b_{n}q_{n}^{/\beta}. Hence for qn>1q_{n}>1

(8.6) Sbnqnθβ>bnqnβ(Hqn1β1)+Enqn+1/βHbnβ+Onbnqn/β+bnS_{b_{n}q_{n}}\theta^{\beta}>b_{n}q_{n}^{\beta}\left(H_{q_{n}-1}^{\beta}-1\right)+E_{n}q_{n+1}^{/\beta}H_{b_{n}}^{\beta}+O_{n}b_{n}q_{n}^{/\beta}+b_{n}

Using QP duality this also gives us for qn>1q_{n}>1

(8.7) Sbnqnθ¯β>bnqnβ(Hqn1β1)+Onqn+1/βHbnβ+Enbnqn/β+bnS_{b_{n}q_{n}}\overline{\theta}^{\beta}>b_{n}q_{n}^{\beta}\left(H_{q_{n}-1}^{\beta}-1\right)+O_{n}q_{n+1}^{/\beta}H_{b_{n}}^{\beta}+E_{n}b_{n}q_{n}^{/\beta}+b_{n}
(8.8) Sbnqn(θβ+θ¯β)>2bnqnβ(Hqn1β1)+qn+1/βHbnβ+bnqn/β+2bnS_{b_{n}q_{n}}\left(\theta^{\beta}+\overline{\theta}^{\beta}\right)>2b_{n}q_{n}^{\beta}\left(H_{q_{n}-1}^{\beta}-1\right)+q_{n+1}^{/\beta}H_{b_{n}}^{\beta}+b_{n}q_{n}^{/\beta}+2b_{n}

We derived estimates for the $H$ terms in Lemma 107.

8.3. Upper bound for SNθβS_{N}\theta^{\beta} using the generalised harmonic function HβH^{\beta}

Recall we have bounded the Birkhoff sum $S_{N}\phi$ above by the sum of three components $S_{N}^{1}\phi+S_{N}^{2}\phi+S_{N}^{3}\phi$. We now study each of these component sums for the function $\theta^{\beta}$.

The Single Sum Components $S_{N}^{1}\theta^{\beta}$ and $S_{N}^{2}\theta^{\beta}$

From (8.1) we have $S_{N}^{1}(\theta^{\beta})=\sum_{r=0}^{n}b_{r}P_{q_{r}}(\theta^{\beta})\leq\sum_{r=0}^{n}b_{r}q_{r}^{\beta}H_{q_{r}-1}^{\beta}$. Since $P_{q_{r}}$ is symmetric, it follows that $S_{N}^{1}$ is also symmetric, i.e. $S_{N}^{1}(\overline{\theta}^{\beta})=S_{N}^{1}(\theta^{\beta})$. Also from (8.2) we have $S_{N}^{2}(\theta^{\beta})=\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\theta_{r0,q_{r}-q_{r-1}}^{\beta}-\overline{\theta}^{\beta}\left(\frac{1}{q_{r}}\right)\right)\leq\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\theta^{\beta}\left(\frac{1}{2q_{r}}\right)-1\right)$.

Noting $E_{r}\left(\theta^{\beta}\left(\frac{1}{2q_{r}}\right)-1\right)=E_{r}\left((2q_{r})^{\beta}-1\right)$, and then invoking QP duality, we get:

(SN1+SN2)θβ\displaystyle\left(S_{N}^{1}+S_{N}^{2}\right)\theta^{\beta} r=0nbr>0(brqrβHqr1β+2<qr<qnEr((2qr)β1))\displaystyle\leq\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}q_{r}^{\beta}H_{q_{r}-1}^{\beta}+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left((2q_{r})^{\beta}-1\right)\right)
(SN1+SN2)θ¯β\displaystyle\left(S_{N}^{1}+S_{N}^{2}\right)\overline{\theta}^{\beta} r=0nbr>0(brqrβHqr1β+2<qr<qnOr((2qr)β1))\displaystyle\leq\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}q_{r}^{\beta}H_{q_{r}-1}^{\beta}+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket O_{r}\left((2q_{r})^{\beta}-1\right)\right)

The Double Sum Component $S_{N}^{3}\theta^{\beta}$

Initial estimates using the harmonic function

From (**) we have $S_{N}^{3}\theta^{\beta}=\sum_{r=0}^{n}C_{r}\theta^{\beta}$ and from (**) we have

(8.9) Cn(ϕ)\displaystyle C_{n}(\phi) =Ens=1bnϕ(sqn+1/)+On(s=0bn1ϕ(an+1/sqn+1/)+bn1>0ϕ(1qn+1/))\displaystyle=E_{n}\sum_{s=1}^{b_{n}}\phi\left(\frac{s}{q_{n+1}^{/}}\right)+O_{n}\left(\sum_{s=0}^{b_{n}-1}\phi\left(\frac{a_{n+1}^{/}-s}{q_{n+1}^{/}}\right)\,\,+\left\llbracket b_{n-1}>0\right\rrbracket\phi\left(\frac{1}{q_{n+1}^{/}}\right)\right) r=n\displaystyle r=n
Cr(ϕ)\displaystyle C_{r}(\phi) Ers=1br1ϕ(s+(ar+1/ar+1)qr+1/)+Or(s=0br1ϕ(ar+1sqr+1/)+br1>0ϕ(1qr+1/))\displaystyle\leq E_{r}\sum_{s=1}^{b_{r}-1}\phi\left(\frac{s+(a_{r+1}^{/}-a_{r+1})}{q_{r+1}^{/}}\right)+O_{r}\left(\sum_{s=0}^{b_{r}-1}\phi\left(\frac{a_{r+1}-s}{q_{r+1}^{/}}\right)\,\,+\left\llbracket b_{r-1}>0\right\rrbracket\phi\left(\frac{1}{q_{r+1}^{/}}\right)\right) r<n\displaystyle r<n

Since θβ(sqr+1/)=1sβqr+1/β\theta^{\beta}\left(\frac{s}{q_{r+1}^{/}}\right)=\frac{1}{s^{\beta}}q_{r+1}^{/\beta}, the substitution ϕ=θβ\phi=\theta^{\beta} in the equations above gives us Cr(θβ)=Qrqr+1/βC_{r}\left(\theta^{\beta}\right)=Q_{r}q_{r+1}^{/\beta} for some coefficient QrQ_{r}. The sums on the right hand side above can now be written in terms of the generalised harmonic function as:

(8.10) Cn(ϕ)=Qnqn+1/β\displaystyle C_{n}(\phi)=Q_{n}q_{n+1}^{/\beta}\leq (EnHbnβ+On(Hbnβ(an+1/bn+1)+bn1>0))qn+1/β\displaystyle\left(E_{n}H_{b_{n}}^{\beta}+O_{n}\left(H_{b_{n}}^{\beta}(a_{n+1}^{/}-b_{n}+1)+\left\llbracket b_{n-1}>0\right\rrbracket\right)\right)q_{n+1}^{/\beta}
(8.11) \displaystyle\leq (EnHbnβ+On(Hbnβ(an+1bn+1)+bn1>0))qn+1/β\displaystyle\left(E_{n}H_{b_{n}}^{\beta}+O_{n}\left(H_{b_{n}}^{\beta}(a_{n+1}-b_{n}+1)+\left\llbracket b_{n-1}>0\right\rrbracket\right)\right)q_{n+1}^{/\beta} r=n\displaystyle r=n
Cr(ϕ)=Qrqr+1/β\displaystyle C_{r}(\phi)=Q_{r}q_{r+1}^{/\beta}\leq (ErHbr1β(1+(ar+1/ar+1))+Or(Hbrβ(ar+1br+1)+br1>0))qr+1/β\displaystyle\left(E_{r}H_{b_{r}-1}^{\beta}(1+(a_{r+1}^{/}-a_{r+1}))+O_{r}\left(H_{b_{r}}^{\beta}(a_{r+1}-b_{r}+1)+\left\llbracket b_{r-1}>0\right\rrbracket\right)\right)q_{r+1}^{/\beta}
(8.12) \displaystyle\leq (ErHbr1β+Or(Hbrβ(ar+1br+1)+br1>0))qr+1/β\displaystyle\left(E_{r}H_{b_{r}-1}^{\beta}+O_{r}\left(H_{b_{r}}^{\beta}(a_{r+1}-b_{r}+1)+\left\llbracket b_{r-1}>0\right\rrbracket\right)\right)q_{r+1}^{/\beta} r<n\displaystyle r<n

Some Refined Estimates

For rr odd we have Qr=Hbrβ(ar+1br+1)+br1>0Q_{r}=H_{b_{r}}^{\beta}(a_{r+1}-b_{r}+1)+\left\llbracket b_{r-1}>0\right\rrbracket which we will now examine more closely. Recall if br=ar+1b_{r}=a_{r+1} then br1=0b_{r-1}=0 and so then Qr=Har+1βQ_{r}=H_{a_{r+1}}^{\beta}. If br<ar+1b_{r}<a_{r+1} then Hbrβ(ar+1br+1)Hbrβ(2)=Hbr+1β1H_{b_{r}}^{\beta}(a_{r+1}-b_{r}+1)\leq H_{b_{r}}^{\beta}(2)=H_{b_{r}+1}^{\beta}-1 and so (for rr odd) QrHbr+1βbr1=0Q_{r}\leq H_{b_{r}+1}^{\beta}-\left\llbracket b_{r-1}=0\right\rrbracket. Hence we can set

(8.13) crα=c(N,α,r)=Er(brbr>0+r=n)+Ormin(br+br1>0,ar+1)br+1c_{r}^{\alpha}=c(N,\alpha,r)=E_{r}\left(b_{r}-\left\llbracket b_{r}>0\right\rrbracket+\left\llbracket r=n\right\rrbracket\right)+O_{r}\min(b_{r}+\left\llbracket b_{r-1}>0\right\rrbracket,a_{r+1})\leq b_{r}+1

to get

(8.14) QrHcrβOrbr1=0Q_{r}\leq H_{c_{r}}^{\beta}-O_{r}\left\llbracket b_{r-1}=0\right\rrbracket

and we have Hcrβmin{ζ(β),cr>0+log+cr}min{ζ(β),1+logar+1}H_{c_{r}}^{\beta}\leq\min\left\{\zeta(\beta),\left\llbracket c_{r}>0\right\rrbracket+\log^{+}c_{r}\right\}\leq\min\left\{\zeta(\beta),1+\log a_{r+1}\right\} and also Hcrcr>012(1+cr)H_{c_{r}}\leq\left\llbracket c_{r}>0\right\rrbracket\frac{1}{2}(1+c_{r}) - see lemma 107 and hence for β=1\beta=1 and r<nr<n we have

(8.15) QrEr12(1+(br1))+Or(12br+br1>0)Or+12brQ_{r}\leq E_{r}\frac{1}{2}\left(1+(b_{r}-1)\right)+O_{r}\left(\frac{1}{2}b_{r}+\left\llbracket b_{r-1}>0\right\rrbracket\right)\leq O_{r}+\frac{1}{2}b_{r}


Note that in both even and odd cases, the estimate for $Q_{r}$ increases with $b_{r}$. However the even estimates begin at $H_{1}=1$ and the increments decrease in size, whereas the odd estimates begin lower at $H_{1}^{\beta}(a_{r+1}^{/})=\frac{1}{a_{r+1}^{/\beta}}$ but the increments increase in size. The estimates converge as $b_{r}$ increases. Each $Q_{r}$ estimate is therefore maximal for $b_{r}=a_{r+1}$. However the sum $\sum Q_{r}q_{r+1}^{/\beta}$ may not be maximal with $b_{r}=a_{r+1}$ for each $r$ because each $b_{r}=a_{r+1}$ forces $b_{r-1}=0$ which reduces the size of $Q_{r-1}$. To maximise the double sum component, we therefore also need to investigate the case of $b_{r}=a_{r+1}-1$:
For ar+1>1a_{r+1}>1 we then have for rr odd Qr<Hbrβ(2)+br1>0Q_{r}<H_{b_{r}}^{\beta}(2)+\left\llbracket b_{r-1}>0\right\rrbracket and if br1>0b_{r-1}>0 this gives QrHar+11βQ_{r}\leq H_{a_{r+1}-1}^{\beta} (with equality only for ar+1=1,br1>0a_{r+1}=1,b_{r-1}>0).
In the case of rr even we get QrHar+12βQ_{r}\leq H_{a_{r+1}-2}^{\beta}.
In all cases:

(8.16) QrHcrβmin{ζ(β),1+log+cr}min{ζ(β),1+logar+1}Q_{r}\leq H_{c_{r}}^{\beta}\leq\min\{\zeta(\beta),1+\log^{+}c_{r}\}\leq\min\{\zeta(\beta),1+\log a_{r+1}\}

Now ζ(β)<2\zeta(\beta)<2 for β>β01.73\beta>\beta_{0}\approx 1.73 and logar+1>1\log a_{r+1}>1 for ar+13a_{r+1}\geq 3 so for larger values of β\beta, Qr<ζ(β)Q_{r}<\zeta(\beta) will be the better estimate, but for values of β\beta close to 11, Qr<1+logcrQ_{r}<1+\log c_{r} will be the better estimate. Of course for β=1\beta=1 Qr1+log+crQ_{r}\leq 1+\log^{+}c_{r} (equality only for cr1c_{r}\leq 1) will always be the better estimate. The case of ar+1=1a_{r+1}=1 needs further consideration.
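The threshold $\beta_{0}$ at which $\zeta(\beta)=2$ can be located numerically. The sketch below is an illustration only. It approximates $\zeta$ by a partial sum with a two-term Euler-Maclaurin tail correction and then bisects. The function names, the number of terms and the bracketing interval $[1.5,2]$ are our own assumptions.

```python
import math


def zeta(beta, terms=1000):
    """Approximate zeta(beta) for beta > 1: partial sum plus Euler-Maclaurin tail."""
    head = sum(k ** (-beta) for k in range(1, terms))
    tail = terms ** (1 - beta) / (beta - 1) + 0.5 * terms ** (-beta)
    return head + tail


def crossing(target=2.0, lo=1.5, hi=2.0, tol=1e-10):
    """Bisect for zeta(beta) = target, using that zeta is decreasing on (1, inf)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if zeta(mid) > target:
            lo = mid  # zeta still too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(crossing())                # close to the value 1.73 quoted in the text
    print(math.log(2), math.log(3))  # log a > 1 first holds at a = 3
```

Bisection is used rather than a derivative-based method since only monotonicity of $\zeta$ on $(1,\infty)$ is needed.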

Hence the estimate is maximal for N=brqrN=\sum b_{r}q_{r} with br=ar+11b_{r}=a_{r+1}-1 (except ar+1=1a_{r+1}=1)

8.4. Using Estimates for the harmonic function

Here we use the ancillary estimates established in subsection 8.1.

The Single Sum Components $S_{N}^{1}\theta^{\beta},S_{N}^{2}\theta^{\beta}$

We have $SS_{r}(\theta^{\beta})=S_{N}^{1}\theta^{\beta}+S_{N}^{2}\theta^{\beta}\leq\left\llbracket b_{r}>0\right\rrbracket\left(b_{r}q_{r}^{\beta}H_{q_{r}-1}^{\beta}+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left((2q_{r})^{\beta}-1\right)\right)$ for $q_{r}\geq 1$.

Now for $\beta>1$, $H_{q_{r}-1}^{\beta}<\zeta(\beta)$ and so $\sum_{r=0}^{n}b_{r}q_{r}^{\beta}H_{q_{r}-1}^{\beta}<\sum_{r=0}^{n}b_{r}^{\beta}q_{r}^{\beta}\zeta(\beta)\leq N^{\beta}\zeta(\beta)$. Also, for $\beta\geq 1$ we have $H_{q_{r}-1}^{\beta}\leq H_{q_{r}-1}<1+\log q_{n}$, so that $\sum_{r=0}^{n}b_{r}q_{r}^{\beta}H_{q_{r}-1}^{\beta}<\left(1+\log q_{n}\right)\sum_{r=0}^{n}b_{r}q_{r}^{\beta}\leq N^{\beta}(1+\log q_{n})$, giving us:

(8.17) SN1θβ=SN1θ¯β<Nβmin{ζ(β),1+logqn}S_{N}^{1}\theta^{\beta}=S_{N}^{1}\overline{\theta}^{\beta}<N^{\beta}\min\left\{\zeta(\beta),1+\log q_{n}\right\}

By (105) we have r=0nbr>02<qr<qnErqrβ<r=0n1br>0ErqrβOnmin(112βqn1β,qnβ)+Enmin(112βqn2β,qn1β)\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}q_{r}^{\beta}<\sum_{r=0}^{n-1}\left\llbracket b_{r}>0\right\rrbracket E_{r}q_{r}^{\beta}\leq O_{n}\min\left(\frac{1}{1-2^{-\beta}}q_{n-1}^{\beta},q_{n}^{\beta}\right)+E_{n}\min\left(\frac{1}{1-2^{-\beta}}q_{n-2}^{\beta},q_{n-1}^{\beta}\right)

From subsection 8.1 r=0k12rβ<1112β=2β2β1=1+12β1\sum_{r=0}^{k}\frac{1}{2^{r\beta}}<\frac{1}{1-\frac{1}{2^{\beta}}}=\frac{2^{\beta}}{2^{\beta}-1}=1+\frac{1}{2^{\beta}-1} and so

(8.18) SN2θβr=0nbr>02<qr<qnEr((2qr)β1)<2β(Onmin{qn1β12β,qnβ}+Enmin{qn2β12β,qn1β})2βqnβS_{N}^{2}\theta^{\beta}\leq\sum_{r=0}^{n}\left\llbracket b_{r}>0\right\rrbracket\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left((2q_{r})^{\beta}-1\right)<2^{\beta}\left(O_{n}\min\left\{\frac{q_{n-1}^{\beta}}{1-2^{-\beta}},q_{n}^{\beta}\right\}+E_{n}\min\left\{\frac{q_{n-2}^{\beta}}{1-2^{-\beta}},q_{n-1}^{\beta}\right\}\right)\leq 2^{\beta}q_{n}^{\beta}

Note that since qr=arqr1+qr2q_{r}=a_{r}q_{r-1}+q_{r-2}, if ar>1a_{r}>1 then qrβ>(2qr1)βqr1β12βq_{r}^{\beta}>(2q_{r-1})^{\beta}\geq\frac{q_{r-1}^{\beta}}{1-2^{-\beta}} for β1\beta\geq 1. If ar=1a_{r}=1 the situation is more complex. However in the case β=1\beta=1, we do have qr<2qr1q_{r}<2q_{r-1} for ar=1a_{r}=1. By duality, the same result holds for θ¯β\overline{\theta}^{\beta} but with En,OnE_{n},O_{n} interchanged. This also gives us

(8.19) SN2(θβ+θ¯β)<2β(min{qn1β12β,qnβ}+min{qn2β12β,qn1β})2β(qnβ+qn1β)S_{N}^{2}\left(\theta^{\beta}+\overline{\theta}^{\beta}\right)<2^{\beta}\left(\min\left\{\frac{q_{n-1}^{\beta}}{1-2^{-\beta}},q_{n}^{\beta}\right\}+\min\left\{\frac{q_{n-2}^{\beta}}{1-2^{-\beta}},q_{n-1}^{\beta}\right\}\right)\leq 2^{\beta}(q_{n}^{\beta}+q_{n-1}^{\beta})

Double Sum Component $S_{N}^{3}\theta^{\beta}$ - initial Landau estimates

In this section we will introduce the 3 main approaches to estimating the double sum, and use them to develop quick estimates using Landau (big O) notation. We will use the approaches more carefully in the following section to estimate explicit constants.

We now consider the double sum component of SNθβS_{N}\theta^{\beta}, namely SN3θβ=r=0nCr(θβ)=r=0nQrqr+1/βS_{N}^{3}\theta^{\beta}=\sum_{r=0}^{n}C_{r}\left(\theta^{\beta}\right)=\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta}. This is by far the most complex of the terms to estimate partly because it is irregular and partly because the estimates are aesthetically rather unsatisfying. Those who require beauty in their estimates should look elsewhere!

Another problem is that whereas $q_{r+1}^{/}=O(N)$ for $r<n$, for $r=n$ the term $q_{n+1}^{/}$ can be arbitrarily larger than $N$. This means we cannot expect to develop a general estimate purely in terms of $N$ unless we impose restrictions on the type of $\alpha$ (e.g. $\alpha$ of constant type gives $q_{n+1}^{/\beta}=O(N^{\beta})$), and this has often been the approach used. Otherwise the estimate must be dependent on both $N$ and $\alpha$. We will work first with general $\alpha$, but then also look at the effects of restricting $\alpha$.

A number of approaches have been developed in the literature in studying both the function θ\theta and a number of related functions. These approaches are presented very differently in different contexts, and can be difficult to compare in terms of concepts, terminology and notation. However the theoretical structure we have developed in this paper suggests we can usefully group them into one or other of 2 broad approaches. We will also consider a 3rd approach in this paper which tends to give slightly better results. However each of the approaches has strengths and weaknesses, and in fact for each approach we can find combinations of α,N\alpha,N for which that approach produces the best estimate.

We start by noting that we can write $S_{N}^{3}\theta^{\beta}=\sum_{r=0}^{n}C_{r}\theta^{\beta}=\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta}$ where $Q_{r}$ is a positive real number. This means we are seeking an estimate of the sum of products $\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta}$ with all terms positive. The most obvious technique to use is that of partial summation, but unfortunately it does not seem to help us greatly here. We must content ourselves therefore with less sophisticated approaches which we will call estimates by extraction:

Definition 108.

Given sequences of positive real numbers ai,bi,cia_{i},b_{i},c_{i} (with ci0c_{i}\neq 0), and the sum of products i=0naibi\sum_{i=0}^{n}a_{i}b_{i} then we define “the estimate by extraction of cic_{i}” as (maxin(bici))i=0naici\left(\max_{i\leq n}(\frac{b_{i}}{c_{i}})\right)\sum_{i=0}^{n}a_{i}c_{i}. In the special case ci=1c_{i}=1 the definition becomes (maxinbi)i=0nai\left(\max_{i\leq n}b_{i}\right)\sum_{i=0}^{n}a_{i} which we will call instead “the estimate by extraction of bib_{i}”.

Note that i=0naibi=i=0naibicici(maxin(bici))i=0naici\sum_{i=0}^{n}a_{i}b_{i}=\sum_{i=0}^{n}a_{i}\frac{b_{i}}{c_{i}}c_{i}\leq\left(\max_{i\leq n}(\frac{b_{i}}{c_{i}})\right)\sum_{i=0}^{n}a_{i}c_{i}, so that the estimate is an upper bound for i=0naibi\sum_{i=0}^{n}a_{i}b_{i}.
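Definition 108 translates directly into code. The following Python sketch computes an estimate by extraction for arbitrary positive sequences, so that different choices of $c_{i}$ can be compared against the exact sum of products. The function name and the sample sequences are hypothetical, chosen only for illustration.

```python
def estimate_by_extraction(a, b, c=None):
    """Return max_i(b_i / c_i) * sum_i(a_i * c_i); c defaults to all ones.

    With c = None this is "the estimate by extraction of b_i" of Definition 108.
    All sequences are assumed positive and of equal length.
    """
    if c is None:
        c = [1.0] * len(a)
    ratio = max(bi / ci for bi, ci in zip(b, c))
    return ratio * sum(ai * ci for ai, ci in zip(a, c))


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]  # sample data only
    b = [4.0, 5.0, 6.0]
    exact = sum(ai * bi for ai, bi in zip(a, b))
    print(exact)
    print(estimate_by_extraction(a, b))                   # extraction of b_i
    print(estimate_by_extraction(a, b, [2.0, 2.0, 3.0]))  # extraction of c_i
    print(estimate_by_extraction(a, b, b))                # c_i = b_i recovers the exact sum
```

The closer the ratios $b_{i}/c_{i}$ are to being constant, the tighter the estimate. In the degenerate case $c_{i}=b_{i}$ the bound is attained.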

If we now return to our sum r=0nQrqr+1/β\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta} we can now see immediately that there are 3 natural ways of estimating this sum by extraction, namely setting ci=1c_{i}=1 and extracting either QrQ_{r} or qr+1/βq_{r+1}^{/\beta}, or extracting another term ci1c_{i}\neq 1. Most approaches in the literature extract the quasiperiod term qr+1/βq_{r+1}^{/\beta} and the remaining approaches extract a “type” of α\alpha (recall that a type gives a bound on the growth of the quasiperiods qrq_{r}). From this analysis there is an obvious 3rd candidate, namely the extraction of QrQ_{r}, the coefficient of qr+1/βq_{r+1}^{/\beta}. We shall see that the latter approach tends to give best results, but we have not found it in the literature. This seems surprising, and so it may be that we have just missed it.

We will label these approaches, namely of QP extraction, type extraction, and coefficient extraction as A,B,CA,B,C respectively. We shall see below that this also places them in a general sense in order of improving accuracy. For convenience, given a sequence (xi)(x_{i}) of real numbers, we also introduce the notation Mnximax{xi:in}M_{n}x_{i}\coloneqq\max\{x_{i}:i\leq n\}.

We now make some high level remarks about each of the 3 approaches A,B,CA,B,C.

Approach A (Quasiperiod Extraction): $S_{N}^{3}\theta^{\beta}=\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta}\leq q_{n+1}^{/\beta}\sum_{r=0}^{n}Q_{r}$

This approach extracts the largest term, namely the quasiperiod term ($q_{n+1}^{/\beta}$), and will not usually give the best results. However it has been the most common approach used historically, and for $\beta=1$ it allows for the use of the elegant inequality $\sum_{r=0}^{n}\log a_{r+1}=\log\prod_{r=0}^{n}a_{r+1}\leq\log q_{n+1}$ (equality only for $n=0$). This inequality helps aesthetically, but unfortunately does not improve accuracy.

Order of Magnitude of $S_{N}^{3A}\theta^{\beta}\coloneqq q_{n+1}^{/\beta}\sum_{r=0}^{n}Q_{r}$

We have $\sum_{r=0}^{n}Q_{r}\leq\min\left\{(n+1)\zeta(\beta),(n+1)+\sum_{r=0}^{n}\log c_{r}\right\}$. Now for $q_{n}>1$, $n=O(\log q_{n})$ and $\sum_{r=0}^{n}\log c_{r}\leq\sum_{r=0}^{n}\log a_{r+1}\leq\log q_{n+1}$. Hence, using $q_{n+1}^{/}=O(q_{n+1})$ we obtain for $\beta>1$, $S_{N}^{3A}\theta^{\beta}=O(q_{n+1}^{/\beta}\log q_{n})$ whereas for $\beta=1$, $S_{N}^{3A}\theta^{\beta}=O(q_{n+1}^{/}\log q_{n+1})$.

Approach B (Type Extraction): $\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta}\leq\left(\max_{r\leq n}\frac{q_{r+1}^{/\beta}}{q_{r}^{\beta}}\right)\sum_{r=0}^{n}Q_{r}q_{r}^{\beta}$

This approach extracts the upper type function $A_{n+1}^{/\beta}=\max_{r\leq n}q_{r+1}^{/\beta}/q_{r}^{\beta}$, a term which is much smaller than the term $q_{n+1}^{/\beta}$ extracted in approach A. It is not as small as the term extracted in approach C, but it has the benefit of naturally producing results involving $N$ rather than $q_{n}$. This approach was introduced by Lang, but we will apply it in a different way from Lang to produce an improved result (see ##paper for comparison).

Order of Magnitude of $S_{N}^{3B}\theta^{\beta}\coloneqq A_{n+1}^{/\beta}\sum_{r=0}^{n}Q_{r}q_{r}^{\beta}$

Note $A_{n+1}^{/}=O(A_{n+1})$ and for $\beta\geq 1$ $\sum_{r=0}^{n}Q_{r}q_{r}^{\beta}=O\left(\sum_{r=0}^{n}(1+b_{r})q_{r}^{\beta}\right)=O(N^{\beta})=O(q_{n+1}^{\beta})$. Hence $S_{N}^{3B}\theta^{\beta}=O(A_{n+1}^{\beta}q_{n+1}^{\beta})$ ($\beta\geq 1$).

Approach C (Coefficient Extraction): $\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta}\leq\left(\max_{r\leq n}Q_{r}\right)\sum_{r=0}^{n}q_{r+1}^{/\beta}$

This approach extracts the smallest term, namely the maximum coefficient ($\max_{r\leq n}Q_{r}$), and will generally produce the most accurate results. Alas it also produces the least elegant results.

Order of Magnitude of $S_{N}^{3C}\theta^{\beta}\coloneqq\left(\max_{r\leq n}Q_{r}\right)\sum_{r=0}^{n}q_{r+1}^{/\beta}$

Note $\sum_{r=0}^{n}q_{r+1}^{/\beta}=O(q_{n+1}^{\beta})$, and for $\beta>1$ $\max_{r\leq n}Q_{r}=O(1)$, whereas for $\beta=1$ $Q_{r}\leq 1+\log^{+}c_{r}\leq 1+\log A_{n+1}$. Hence $S_{N}^{3C}\theta^{\beta}=O(q_{n+1}^{\beta})$ ($\beta>1$) and $S_{N}^{3C}\theta=O(q_{n+1}\log A_{n+1})$ ($\beta=1$).

Summary

We summarise the order of magnitude estimates for each approach in the following table:

Order Estimates for $S_{N}^{3}\theta^{\beta}$

Method | $\beta=1$ | $\beta>1$
Method A | $q_{n+1}\log q_{n+1}$ | $q_{n+1}^{\beta}\log q_{n}$
Method B | $q_{n+1}A_{n+1}$ | $q_{n+1}^{\beta}A_{n+1}^{\beta}$
Method C | $q_{n+1}\log A_{n+1}$ | $q_{n+1}^{\beta}$
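To get a feel for how the three $\beta=1$ entries of the table compare, the sketch below evaluates them for two hypothetical continued fractions: one with all partial quotients equal to 1, and one containing a single large partial quotient. It assumes $A_{n+1}=\max_{r\leq n}q_{r+1}/q_{r}$ and drops all implied constants, so only the relative sizes are meaningful; the function names are illustrative.

```python
import math

def denominators(partial_quotients):
    """Return [q_0, ..., q_m] for alpha = [0; a_1, ..., a_m] (q_{-1} = 0, q_0 = 1)."""
    q_prev, q = 0, 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs

def order_estimates(partial_quotients):
    """Leading-order expressions (beta = 1) for Methods A, B, C, constants dropped."""
    qs = denominators(partial_quotients)
    q_top = qs[-1]                                                     # q_{n+1}
    type_a = max(qs[r + 1] / qs[r] for r in range(len(qs) - 1))        # A_{n+1}
    return {
        "A": q_top * math.log(q_top),
        "B": q_top * type_a,
        "C": q_top * math.log(type_a),
    }

golden = order_estimates([1] * 20)                     # bounded type
spiky = order_estimates([1] * 10 + [1000] + [1] * 9)   # one large partial quotient
```

In both cases the Method C expression is the smallest. Method B beats Method A for the bounded-type example, while the single large partial quotient reverses that order.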

Refined Double Sum Estimates

We now introduce optimisations of each approach in the previous section, which improve on the results of the basic approach. We first introduce two types of optimisation which are applicable to more than one approach.

Head and Tail Optimisation: For each $r<n$ we have $q_{r+1}^{/\beta}=O(N^{\beta})$, but for $r=n$ the term $q_{n+1}^{/\beta}$ may be arbitrarily large in comparison with $N$. We will therefore usually split the sum $\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta}$ into the sum of its head term $H=Q_{n}q_{n+1}^{/\beta}$ and the tail sum $T=\sum_{r=0}^{n-1}Q_{r}q_{r+1}^{/\beta}$.

Extraction of constants: If $a_{r}=c+d_{r}$ where $c$ is a constant and $d_{r}$ increasing, then $\sum_{n}a_{r}b_{r}\leq c\sum b_{r}+\left(\max b_{r}\right)\sum d_{r}$ is a better estimate than the simple extraction $\sum_{n}a_{r}b_{r}\leq\max b_{r}\sum a_{r}$.
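A small numerical illustration of extraction of constants follows. It is a sketch under the assumption that $c\geq 0$, $d_{r}\geq 0$ and $b_{r}\geq 0$; the sample values are invented for illustration.

```python
def simple_extraction(a, b):
    """Bound sum(a_r * b_r) by max(b) * sum(a)."""
    return max(b) * sum(a)

def constant_extraction(a, b, c):
    """Bound sum(a_r * b_r) by c * sum(b) + max(b) * sum(d), where d_r = a_r - c."""
    d = [x - c for x in a]
    return c * sum(b) + max(b) * sum(d)

# Hypothetical data: a_r = 1 + d_r with d_r >= 0, and non-negative b_r
c = 1.0
a = [c + d for d in [0.0, 0.5, 2.0, 3.0]]
b = [5.0, 1.0, 2.0, 4.0]

exact = sum(x * y for x, y in zip(a, b))
refined = constant_extraction(a, b, c)
crude = simple_extraction(a, b)
```

The refinement helps because the constant part is paired with the true sum of the $b_{r}$ rather than with $\max b_{r}$ repeated for every term.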

We are now ready to examine each approach in detail.

The estimates in (8.10) contain enough information to allow us to exploit specific patterns in the distribution of the coefficients $b_{r}$ (e.g. if we are given $b_{r}=0$ for $r$ even we can exploit this). Here we shall only be concerned with generic estimates, and we will discard some of the information to simplify our results.

We will use the generic estimate $Q_{r}\leq\min\{\zeta(\beta),1+\log^{+}c_{r}\}$ for simpler results.

Also for $r<n$ we have $Q_{r}\leq E_{r}\frac{1}{2}\left(1+(b_{r}-1)\right)+O_{r}\left(\frac{1}{2}b_{r}+1\right)=O_{r}+\frac{1}{2}b_{r}$.

Approach A: Quasiperiod Extraction

We use the approach to estimate the tail sum $T=\sum_{r=0}^{n-1}Q_{r}q_{r+1}^{/\beta}$. Note that $\max_{r\leq n-1}\left(q_{r+1}^{/\beta}\right)=q_{n}^{/\beta}$.

The estimate $Q_{r}\leq\zeta(\beta)$ gives us $T\leq n\zeta(\beta)q_{n}^{/\beta}<\zeta(\beta)q_{n}^{/\beta}\left(\frac{\log q_{n}}{\log\phi}+1\right)$, where $\phi$ is the golden ratio.
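The bound on $n$ in terms of $\log q_{n}$ used here rests on the fact that quasiperiods grow at least as fast as the Fibonacci numbers. The sketch below checks $n\leq\frac{\log q_{n}}{\log\phi}+1$ for a couple of hypothetical continued fractions, with a small floating point tolerance; names are illustrative.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def denominators(partial_quotients):
    """Return [q_0, ..., q_m] for alpha = [0; a_1, ..., a_m] (q_{-1} = 0, q_0 = 1)."""
    q_prev, q = 0, 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs

def index_bound(q_n):
    """Upper bound log(q_n)/log(phi) + 1 for the index n."""
    return math.log(q_n) / math.log(PHI) + 1

def bound_holds(partial_quotients, tol=1e-9):
    """Check n <= index_bound(q_n) for every denominator generated."""
    qs = denominators(partial_quotients)
    return all(n <= index_bound(q) + tol for n, q in enumerate(qs))

golden_ok = bound_holds([1] * 30)              # slowest possible growth of q_n
mixed_ok = bound_holds([2, 3, 1, 7, 1, 1, 4])  # arbitrary sample
```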

The estimate $Q_{r}\leq 1+\log^{+}c_{r}$ gives, by separation of constants, $T\leq\sum_{r=0}^{n-1}q_{r+1}^{/\beta}+q_{n}^{/\beta}\sum_{r=0}^{n-1}\log^{+}c_{r}$. Now $c_{r}\leq a_{r+1}$ and $q_{n}\geq\prod_{r=1}^{n}a_{r}$ (equality only for $n\leq 1$), and hence $\sum_{r=0}^{n-1}\log^{+}c_{r}\leq\log q_{n}$.

Hence $T\leq\min\left\{n\zeta(\beta)q_{n}^{/\beta},\,\sum_{r=0}^{n-1}q_{r+1}^{/\beta}+q_{n}^{/\beta}\log q_{n}\right\}$. Also $\sum_{r=0}^{n-1}q_{r+1}^{/\beta}\leq\frac{1}{1-2^{-\beta}}\left(q_{n}^{/\beta}+q_{n-1}^{/\beta}\right)$, which is $O(q_{n}^{/\beta})$.

It follows that the two estimates are asymptotically $n\zeta(\beta)q_{n}^{/\beta}$ and $q_{n}^{/\beta}\log q_{n}$, and the question of which is the lower estimate depends upon the relative values of $\zeta(\beta)$ and $\log q_{n}^{1/n}$.

Approach B: Type extraction

Recall that the upper type function is defined as $A_{n+1}^{/}=\max_{r\leq n}\frac{q_{r+1}^{/}}{q_{r}}=\max_{r\leq n}\left(a_{r+1}^{/}+\frac{q_{r-1}}{q_{r}}\right)$ and that $a_{r+1}^{/}+\frac{q_{r-1}}{q_{r}}<a_{r+1}^{/}+\frac{1}{a_{r}}<a_{r+1}+2$, giving $\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta}\leq A_{n+1}^{/\beta}\sum_{r=0}^{n}Q_{r}q_{r}^{\beta}$.

In this approach results naturally involve $N$ and it becomes counter-productive to split the head and tail. We will use the alternative estimates for $Q_{r}$ from (**).

Estimate:

ODD: For $r\leq n$, $O_{r}Q_{r}\leq O_{r}\left(\left\llbracket b_{r-1}>0\right\rrbracket+\frac{1}{2}b_{r}+\frac{1}{2}\left\llbracket b_{r}=a_{r+1}\right\rrbracket\right)\leq O_{r}\left(1+\frac{1}{2}b_{r}\right)$

EVEN: For $r=n$ (in which case $b_{r}\geq 1$), $Q_{r}^{E}\leq E_{r}\left(1+\frac{1}{2}(b_{r}-1)\right)=E_{r}\frac{1}{2}\left(1+b_{r}\right)$

and for $r<n$, $Q_{r}^{E}\leq E_{r}\left\llbracket b_{r}>1\right\rrbracket\left(1+\frac{1}{2}(b_{r}-2)\right)=E_{r}\left\llbracket b_{r}>1\right\rrbracket\frac{1}{2}b_{r}$. These combine to $E_{r}Q_{r}\leq E_{r}\left\llbracket b_{r}>1\right\rrbracket\frac{1}{2}b_{r}+\left\llbracket r=n\right\rrbracket\frac{1}{2}\left(1+\left\llbracket b_{r}=1\right\rrbracket\right)$. Or for $r<n$, $Q_{r}\leq O_{r}+\frac{1}{2}b_{r}$.

So for $n$ even $\sum_{r=0}^{n}Q_{r}q_{r}^{\beta}\leq\frac{1}{2}\left(1+b_{n}\right)q_{n}^{\beta}+\sum_{r=0}^{n-1}\left(O_{r}+\frac{1}{2}b_{r}\right)q_{r}^{\beta}=\frac{1}{2}\sum_{r=0}^{n}b_{r}q_{r}^{\beta}+\frac{1}{2}q_{n}^{\beta}+\sum_{r=0}^{n-1}O_{r}q_{r}^{\beta}\leq\frac{1}{2}N^{\beta}+\frac{1}{2}q_{n}^{\beta}+\frac{1}{1-2^{-\beta}}q_{n-1}^{\beta}$, and for $n$ odd the corresponding bound is $\frac{1}{2}N^{\beta}+\frac{1}{1-2^{-\beta}}q_{n}^{\beta}$.

Also $Q_{r}^{E}\leq E_{r}\left(b_{r}-\left\llbracket r<n,b_{r}>0\right\rrbracket\right)$ and $Q_{r}^{O}\leq O_{r}\left(b_{r}+\left\llbracket b_{r-1}>0\right\rrbracket\right)$, hence $Q_{r}\leq b_{r}+O_{r}\left\llbracket b_{r-1}>0\right\rrbracket-E_{r}\left\llbracket r<n,b_{r}>0\right\rrbracket$, which gives us for $r$ odd $Q_{r}q_{r}^{\beta}+Q_{r-1}q_{r-1}^{\beta}\leq b_{r}q_{r}^{\beta}+b_{r-1}q_{r-1}^{\beta}+\left\llbracket b_{r-1}>0\right\rrbracket\left(q_{r}^{\beta}-q_{r-1}^{\beta}\right)$.

Hence $\sum_{t=0}^{2k+1}Q_{t}q_{t}^{\beta}=\sum_{r=0}^{k}\left(Q_{2r+1}q_{2r+1}^{\beta}+Q_{2r}q_{2r}^{\beta}\right)\leq\sum_{r=0}^{2k+1}b_{r}q_{r}^{\beta}+\sum_{r=0}^{k}\left\llbracket b_{2r}>0\right\rrbracket\left(q_{2r+1}^{\beta}-q_{2r}^{\beta}\right)$. Let $q_{L}$ be the largest quasiperiod in the second sum (or 0 if the sum is empty), so that we can telescope the second sum to obtain $\sum_{r=0}^{k}\left\llbracket b_{2r}>0\right\rrbracket\left(q_{2r+1}^{\beta}-q_{2r}^{\beta}\right)\leq q_{L}^{\beta}$, and so $\sum_{t=0}^{2k+1}Q_{t}q_{t}^{\beta}\leq\sum_{r=0}^{2k+1}b_{r}q_{r}^{\beta}+q_{L}^{\beta}$. Formally

(8.20) $L=\max_{-1\leq r\leq n}\left\{r\textrm{ odd}:(r=-1)|(b_{r-1}>0)\right\}$

(recalling $q_{-1}=0$). If $n$ is odd take $k=(n-1)/2$, and if $n$ is even take $k=(n/2)-1$ and note $Q_{n}\leq b_{n}$, giving in both cases $\sum_{r=0}^{n}Q_{r}q_{r}^{\beta}\leq\sum_{r=0}^{n}b_{r}q_{r}^{\beta}+q_{L}^{\beta}\leq N^{\beta}+q_{L}^{\beta}$ from (**). Note $q_{L}\leq q_{n}$ for $n$ odd and $q_{L}\leq q_{n-1}$ for $n$ even.

Approach C: Coefficient Extraction

In this case $T<\left(\max_{r\leq n-1}Q_{r}\right)\sum_{r=0}^{n-1}q_{r+1}^{/\beta}$. Now $\max_{r\leq n-1}Q_{r}\leq\max_{r\leq n-1}H_{c_{r}}^{\beta}\leq\min\left(\zeta(\beta),1+\log\left(c_{n-1}^{\max}\right)\right)$ and $c_{n-1}^{\max}=\max_{r\leq n-1}c_{r}\leq\max_{r\leq n-1}a_{r+1}=a_{n}^{\max}$.

Hence $T\leq\min\left(\zeta(\beta),1+\log c_{n-1}^{\max}\right)\,\sum_{r=0}^{n-1}q_{r+1}^{/\beta}$. Unless $a_{r}=1$ for all $r\leq n$, we have $A_{n}\geq 2$, which gives $1+\log A_{n}\geq 1.69$, whilst $\zeta(2)=\pi^{2}/6<1.65$, so that the $\zeta(\beta)$ form is better for $\beta\geq 2$ once $a_{n}\geq 2$.

Compared with Approach B, where we extract $A_{n}^{/\beta}$, we are here extracting $1+\log c_{n-1}^{\max}<1+\log A_{n}$. The difference is small for small $A_{n}$ but can be significant with well approximable $\alpha$. On the other hand we are left with the sum $\sum_{r=0}^{n-1}q_{r+1}^{/\beta}$, which proves not to lend itself well to aesthetically pleasing estimates.

8.5. Summary of upper bound results for $S_{N}\theta^{\beta}$

We have $S_{N}\theta^{\beta}\leq\left(S_{N}^{1}+S_{N}^{2}+S_{N}^{3}\right)\theta^{\beta}$ where:

$S_{N}^{1}\theta^{\beta}<N^{\beta}\min\left\{\zeta(\beta),1+\log q_{n}\right\}$
$S_{N}^{2}\theta^{\beta}<2^{\beta}\left(O_{n}\min\left\{\frac{2^{\beta}}{2^{\beta}-1}q_{n-1}^{\beta},q_{n}^{\beta}\right\}+E_{n}\min\left\{\frac{2^{\beta}}{2^{\beta}-1}q_{n-2}^{\beta},q_{n-1}^{\beta}\right\}\right)$

Using three methods, we have derived several estimates for $S_{N}^{3}\theta^{\beta}$, each of which may be best in particular circumstances. However Method C will tend to give the best results.

$S_{N}^{3}\theta^{\beta}=\sum_{r=0}^{n}Q_{r}q_{r+1}^{/\beta}=Q_{n}q_{n+1}^{/\beta}+\sum_{r=0}^{n-1}Q_{r}q_{r+1}^{/\beta}$. The high order term $Q_{n}q_{n+1}^{/\beta}$ is split out in Methods A and C, and we have

$Q_{n}=\left(\,E_{n}H_{b_{n}}^{\beta}+O_{n}\left(\left\llbracket b_{n-1}>0\right\rrbracket+H_{a_{n+1}^{/}}^{\beta}-H_{a_{n+1}^{/}-b_{n}}^{\beta}\right)\,\right)\leq\min\left\{\zeta(\beta),\,E_{n}\left(1+\log b_{n}\right)+O_{n}\left(\left\llbracket b_{n-1}>0\right\rrbracket+\left\llbracket b_{n}<a_{n+1}\right\rrbracket\log\frac{a_{n+1}}{a_{n+1}-b_{n}}+\left\llbracket b_{n}=a_{n+1}\right\rrbracket\left(1+\log a_{n+1}\right)\right)\right\}$

(8.21) Method A: $S_{N}^{3A}\theta^{\beta}\leq Q_{n}q_{n+1}^{/\beta}+\min\begin{cases}n\zeta(\beta)q_{n}^{/\beta}\\ \left(\sum_{r=0}^{n-1}q_{r+1}^{/\beta}\right)+q_{n}^{/\beta}\log q_{n}\\ q_{n}^{/\beta}\left(1+\left(1+\frac{1}{\log\phi}\right)\log q_{n}\right)\end{cases}$
(8.22) Method B: $S_{N}^{3B}\theta^{\beta}\leq A_{n+1}^{/\beta}\min\begin{cases}\frac{1}{2}N^{\beta}+O_{n}\left(\frac{1}{1-2^{-\beta}}q_{n}^{\beta}\right)+E_{n}\left(\frac{1}{2}q_{n}^{\beta}+\frac{1}{1-2^{-\beta}}q_{n-1}^{\beta}\right)\\ N^{\beta}+q_{L}^{\beta}\end{cases}$
(8.23) Method C: $S_{N}^{3C}\theta^{\beta}\leq Q_{n}q_{n+1}^{/\beta}+\left(\sum_{r=0}^{n-1}q_{r+1}^{/\beta}\right)\min\begin{cases}\zeta(\beta)\\ 1+\log^{+}c_{n-1}^{\max}\end{cases}$

where:

and from (8.13) we have $c_{n-1}^{\max}\leq\max_{r\leq n-1}\left(E_{r}\left(b_{r}-\left\llbracket b_{r}>0\right\rrbracket\right)+O_{r}\min(b_{r}+1,a_{r+1})\right)\leq a_{n}^{\max}$. Further note $a_{n}^{\max}<A_{n}=\max_{r\leq n}q_{r}/q_{r-1}$, so $\log^{+}c_{n-1}^{\max}\leq\max_{r\leq n}\left(\log q_{r}-\log q_{r-1}\right)$

and from (8.20) $L=\max_{-1\leq r\leq n}\left\{r\textrm{ odd}:(r=-1)|(b_{r-1}>0)\right\}$, giving $q_{L}\leq O_{n}q_{n}+E_{n}q_{n-1}\leq q_{n}\leq N$

and from (2.3) $n\leq\frac{\log q_{n}}{\log\phi}+1$

and $\sum_{r=0}^{n}q_{r}^{/\beta}<\sum_{r=0}^{n}E_{r}q_{r}^{/\beta}+\sum_{r=0}^{n}O_{r}q_{r}^{/\beta}<\frac{1}{1-2^{-\beta}}\left(q_{n}^{/\beta}+q_{n-1}^{/\beta}\right)<\frac{2^{\beta}}{1-2^{-\beta}}\left(q_{n}^{\beta}+q_{n-1}^{\beta}\right)$. For $\beta=1$ we can improve this slightly by using $q_{r}^{/}<q_{r}+q_{r-1}$ to get $\sum_{r=0}^{n}q_{r}^{/}<\sum_{r=0}^{n}E_{r}(q_{r}+q_{r-1})+\sum_{r=0}^{n}O_{r}(q_{r}+q_{r-1})<2(q_{n}+q_{n-1})+2(q_{n-1}+q_{n-2})<3q_{n}+4q_{n-1}$. It should be noted that these are very much worst case estimates.

8.6. Duality results

Finally the duality results (see 91) give us that $S_{N}\overline{\theta}^{\beta}$ (for $n>2$ for $\alpha>\frac{1}{2}$, to ensure $a_{n+1}$ is self-conjugate) has the same upper bounds, except that instances of $E_{n},O_{n}$ are interchanged.

We can of course combine appropriate estimates to obtain estimates for $S_{N}(\theta^{\beta}+\overline{\theta}^{\beta})$.

Finally $\theta^{\beta}\geq 1$ on $(0,1)$, so that $S_{N}(\theta^{\beta}-\overline{\theta}^{\beta})\leq S_{N}\theta^{\beta}-N$. However a better result can be obtained by taking (**)

9. Comparisons with Prior Art

9.1. Introduction

As we discussed in the introduction to this paper, there have been many studies of particular anergodic Birkhoff sums over the circle, using techniques specific to the observable sum being studied. Our aim in this section is to show that the generic technique developed in this paper can now be applied relatively easily and quickly to generate equivalent or better results for each of these particular sums. The main challenge lies not in generating results, but in comparing them with prior art because of the many different forms and notations which have been used to date.

Typically these estimates are not aesthetically pleasing! Some of the gruesome detail is removed by working in Landau notation (i.e. without providing details of the constants involved). In some cases there may be no easy way to estimate the constants in the first place. Whether for these reasons or others, many studies have restricted themselves to Landau type results. In most cases our methods lend themselves readily to providing details of the constants, and where constants have been calculated previously in the literature, we show that we can in most cases provide improved constants. However where only Landau results exist in the literature, we have contented ourselves with also providing such results. The proofs and results are then less clouded with distracting detail, but the proofs are often easily extended to provide constants if so desired.

We will typically find that we will need to sacrifice precision somewhere in order to derive commensurable results. When this is necessary we will carry out this sacrifice on our own results rather than those in the literature. This means that our results will typically be somewhat better in their original form rather than the form they are forced into for the purpose of comparison.

We will continue the convention of the previous section in setting $\theta x=\frac{1}{x}$, $\overline{\theta}x=\frac{1}{1-x}$ on $(0,1)$ and using $\alpha$ for an irrational rotation number. Recall that we say $\phi,\psi$ are equivalent if $\phi-\psi$ is bounded on $(0,1)$, and hence $S_{N}\phi-S_{N}\psi=O(N)$ (i.e. their Birkhoff sums differ by at most a fixed multiple of $N$).

In this section we will now review our estimates and prior art for the following Birkhoff sums:

  (1) The sum of reciprocals of fractional parts $\sum_{r=1}^{N}\frac{1}{\left\{r\alpha\right\}}$

  (2) The sum of reciprocals of distances to the nearest integer $\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}$

  (3) The sum of signed remainders $\sum_{r=1}^{N}\frac{1}{\left\{\left\{r\alpha\right\}\right\}}$

  (4) The sum of cotangents $\sum_{r=1}^{N}\cot\pi r\alpha$

  (5) The double exponential sum $\sum_{u=0}^{N-1}\sum_{v=0}^{N-1}e^{2\pi iuv\alpha}$

In addition we will also cover a number of closely related sums which seem of interest but for which we are not aware of previous work.

9.2. Summary of our results for $\theta x=\frac{1}{x}$

We shall see that all the functions in the list above are equivalent to a form of $S_{N}\theta$ (i.e. $S_{N}\theta^{\beta}$ with $\beta=1$) in either its native $(\theta)$, symmetric $(\theta+\overline{\theta})$ or anti-symmetric $(\theta-\overline{\theta})$ form. We therefore give here the summary table from Subsection 8.5 of estimates for $\theta^{\beta}$, simplified to show only the case $\beta=1$.

We have:

(9.1) $S_{N}\theta\leq\left(S_{N}^{1}+S_{N}^{2}+S_{N}^{3}\right)\theta$
(9.2) $S_{N}^{1}\theta<N\left(1+\log q_{n}\right)$
(9.3) $S_{N}^{2}\theta<2\left(O_{n}\min\left\{2q_{n-1},q_{n}\right\}+E_{n}\min\left\{2q_{n-2},q_{n-1}\right\}\right)$

We have derived several estimates for $S_{N}^{3}\theta$ using three methods labelled $A,B,C$. Each of these may give the best results in particular circumstances, although Method C will tend to give the best results on average.

We start from the identity $S_{N}^{3}\theta=\sum_{r=0}^{n}Q_{r}q_{r+1}^{/}=Q_{n}q_{n+1}^{/}+\sum_{r=0}^{n-1}Q_{r}q_{r+1}^{/}$ (where the right hand expression simply splits out the high order term) where:

$Q_{n}=\left(\,E_{n}H_{b_{n}}+O_{n}\left(\left\llbracket b_{n-1}>0\right\rrbracket+H_{a_{n+1}^{/}}-H_{a_{n+1}^{/}-b_{n}}\right)\,\right)\leq E_{n}\left(1+\log b_{n}\right)+O_{n}\left(\left\llbracket b_{n-1}>0\right\rrbracket+\left\llbracket b_{n}<a_{n+1}\right\rrbracket\log\frac{a_{n+1}}{a_{n+1}-b_{n}}+\left\llbracket b_{n}=a_{n+1}\right\rrbracket\left(1+\log a_{n+1}\right)\right)$

The high order term $Q_{n}q_{n+1}^{/}$ is split out in Methods A and C, and we have the following upper bounds:

(9.4) Method A: $S_{N}^{3A}\theta\leq Q_{n}q_{n+1}^{/}+\min\begin{cases}\left(\sum_{r=0}^{n-1}q_{r+1}^{/}\right)+q_{n}^{/}\log q_{n}\\ q_{n}^{/}\left(1+\left(1+\frac{1}{\log\phi}\right)\log q_{n}\right)\end{cases}$
(9.5) Method B: $S_{N}^{3B}\theta\leq A_{n+1}^{/}\min\begin{cases}\frac{1}{2}N+O_{n}\left(2q_{n}\right)+E_{n}\left(\frac{1}{2}q_{n}+2q_{n-1}\right)\\ N+q_{L}\end{cases}$
(9.6) Method C: $S_{N}^{3C}\theta\leq Q_{n}q_{n+1}^{/}+\left(\sum_{r=0}^{n-1}q_{r+1}^{/}\right)\left(1+\log^{+}c_{n-1}^{\max}\right)$

where:

and from (8.13) we have $c_{n-1}^{\max}\leq\max_{r\leq n-1}\left(E_{r}\left(b_{r}-\left\llbracket b_{r}>0\right\rrbracket\right)+O_{r}\min(b_{r}+1,a_{r+1})\right)\leq a_{n}^{\max}$. Further note $a_{n}^{\max}\leq\max_{r\leq n}q_{r}/q_{r-1}$, so also $\log^{+}c_{n-1}^{\max}\leq\max_{r\leq n}\left(\log q_{r}-\log q_{r-1}\right)$

and from (8.20) $L=\max_{-1\leq r\leq n}\left\{r\textrm{ odd}:(r=-1)|(b_{r-1}>0)\right\}$, giving $q_{L}\leq O_{n}q_{n}+E_{n}q_{n-1}\leq q_{n}\leq N$

and from (2.3) $n\leq\frac{\log q_{n}}{\log\phi}+1$, where $\phi$ is the golden ratio

and using $q_{r}^{/}<q_{r}+q_{r-1}$ we get $\sum_{r=0}^{n}q_{r}^{/}<\sum_{r=0}^{n}E_{r}(q_{r}+q_{r-1})+\sum_{r=0}^{n}O_{r}(q_{r}+q_{r-1})<2(q_{n}+q_{n-1})+2(q_{n-1}+q_{n-2})<3q_{n}+4q_{n-1}$. It should be noted that these are very much worst case estimates.

9.3. The sum of reciprocals of fractional parts $\sum_{r=1}^{N}\frac{1}{\{r\alpha\}}$

Analysis

This is the simplest case for comparison as we have the direct equality $\sum_{r=1}^{N}\frac{1}{\{r\alpha\}}=S_{N}\theta$.

Prior art

Although it dates back to 1966, Lang’s estimate of this sum [8] still stands:

(9.7) $\sum_{r=1}^{N}\frac{1}{\{r\alpha\}}<2N\log N+20Ng(N)+K_{0}$

Here $k_{0}\geq 0$ is an arbitrary integer, $K_{0}=\sum_{r=1}^{k_{0}}\frac{1}{\{r\alpha\}}$ and $g(N)$ is a “co-type” of $\alpha$ for $N>k_{0}$, defined as follows:

Definition 109.

(Lang) The function $g(N)$ is a co-type of $\alpha$ for $N>k_{0}$ if $g$ is a monotonic increasing function, and for any $N>k_{0}$ there is always a quasiperiod $q_{n}$ of $\alpha$ such that $N<q_{n}\leq Ng(N)$.

Remarks:

The minimal co-type for $k_{0}=0$ is in fact our type function $A(N)$ from 11. To see this, let $g$ be a co-type for $N>0$. If we choose $N=q_{n}$ then we have $N<q_{n+1}=\frac{q_{n+1}}{q_{n}}q_{n}\leq Ng(N)$, so $g(q_{n})\geq\frac{q_{n+1}}{q_{n}}$. Since $g$ is increasing, for any $q_{n}\leq N<q_{n+1}$ and $r\leq n$ we have $g(N)\geq g(q_{n})\geq g(q_{r})\geq\frac{q_{r+1}}{q_{r}}$, so that our type sequence satisfies $A_{n+1}=\max_{r\leq n}\frac{q_{r+1}}{q_{r}}\leq g(N)$.

The constant terms $k_{0},K_{0}$ simply allow us to disregard large initial outliers in the initial set $\{a_{r}\}_{r\leq k_{0}}$ (consider for example the effects of setting $a_{1}=10^{100},a_{r}=1$ for $r>1$). It is straightforward to introduce this refinement into our results also, by defining and using the refined type function $A_{n+1}^{k_{0}}=\max_{n\geq r>k_{0}}\{q_{r+1}/q_{r}\}$. However we will leave this as an exercise for the reader to keep the core comparison clearer. Effectively this means we will take $k_{0}=0$ so that $K_{0}=0$, and (9.7) becomes for comparison purposes:

$\sum_{r=1}^{N}\frac{1}{\{r\alpha\}}<2N\log N+20NA_{n+1}$

Comparison

Lang’s estimate is simple and attractive. It concentrates the effects of the choice of $\alpha$ into the type function $g$. We will compare it with our estimate via Method B. Although we know Method B is not generally as sharp as Method C, it provides here for an easier comparison.

By the remarks above, we will focus on comparing the results of our method with the estimate $2N\log N+20NA_{n+1}$.

Our results for comparison are (using Method B):

Lemma 110.

$\sum_{r=1}^{N}\frac{1}{\{r\alpha\}}=S_{N}\theta<N\log q_{n}+\left(N+q_{n}\right)A_{n+1}+(2N+3q_{n})$

Proof.

We have from (9.1) $S_{N}\theta\leq\left(S_{N}^{1}+S_{N}^{2}+S_{N}^{3B}\right)\theta<N\left(1+\log q_{n}\right)+2\left(E_{n}q_{n}+O_{n}q_{n-1}\right)+A_{n+1}^{/}\left(N+q_{L}\right)\leq N\log q_{n}+N+2q_{n}+A_{n+1}^{/}\left(N+q_{n}\right)$. Now from 11 $A_{n+1}^{/}<A_{n+1}+1$, which gives the result. ∎

Note that we have taken worst assumptions to obtain this result, namely $n$ even, $N\leq 2q_{n}$, $A_{n+1}^{/}=A_{n+1}+1$, but the result gives better constants than $2N\log N+20NA_{n+1}$ for little effort.
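As a sanity check on the comparison, the sketch below evaluates the sum $\sum_{r=1}^{N}1/\{r\alpha\}$ directly in floating point for $\alpha=(\sqrt{5}-1)/2=[0;1,1,1,\ldots]$ and $N=100$, and compares it with the right hand side of Lemma 110 and with Lang's bound in the form $2N\log N+20NA_{n+1}$. The choice of $\alpha$ and $N$ and the function names are assumptions made for illustration only.

```python
import math

def denominators(partial_quotients):
    """Return [q_0, ..., q_m] for alpha = [0; a_1, ..., a_m] (q_{-1} = 0, q_0 = 1)."""
    q_prev, q = 0, 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs

def reciprocal_fractional_sum(alpha, N):
    """Direct floating point evaluation of sum_{r=1}^{N} 1/{r*alpha}."""
    return sum(1.0 / ((r * alpha) % 1.0) for r in range(1, N + 1))

def comparison_bounds(qs, N):
    """Return (Lemma 110 bound, Lang bound) for q_n <= N < q_{n+1}."""
    n = max(r for r in range(len(qs) - 1) if qs[r] <= N)
    q_n = qs[n]
    type_a = max(qs[r + 1] / qs[r] for r in range(n + 1))   # A_{n+1}
    lemma_110 = N * math.log(q_n) + (N + q_n) * type_a + 2 * N + 3 * q_n
    lang = 2 * N * math.log(N) + 20 * N * type_a
    return lemma_110, lang

alpha = (math.sqrt(5) - 1) / 2
N = 100
S = reciprocal_fractional_sum(alpha, N)
lemma_bound, lang_bound = comparison_bounds(denominators([1] * 15), N)
```

For this example $q_{n}=89$ and $A_{n+1}=2$, so the Lemma 110 expression is roughly a quarter of Lang's.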

9.4. The sum of reciprocals of the distance to nearest integer function $\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}$

The Birkhoff sum $\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}$ has been studied by many authors. The most recent major study is by Beresnevich et al [2], who included explicit (with constants) upper and lower bounds.

In fact this is the only sum (of which we are aware) for which lower bounds have been investigated, and so Beresnevich et al’s results are the only test case for our method on lower bounds.

We will treat upper and lower bounds separately.

Analysis

First we cast the problem appropriately as a suitable anergodic sum.

Proposition 111.

$\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}=S_{N}\left(\theta+\overline{\theta}-\psi\right)$ where $\theta x=\frac{1}{x}$ and $S_{N}\psi\in 2N\log 2\pm 2d_{\alpha}(N)$

Proof.

Recall that we define the distance to the nearest integer function on real numbers by $\left\|x\right\|=\min_{n\in\mathbb{Z}}\left|x-n\right|=\left\llbracket 0\leq\{x\}<\frac{1}{2}\right\rrbracket\{x\}+\left\llbracket\frac{1}{2}\leq\{x\}<1\right\rrbracket(1-\{x\})$. Since $\left\|x\right\|=\left\|\{x\}\right\|$ we may assume $0<x<1$, so that $\frac{1}{\left\|x\right\|}=\frac{\left\llbracket 0\leq x<\frac{1}{2}\right\rrbracket}{x}+\frac{\left\llbracket\frac{1}{2}\leq x<1\right\rrbracket}{1-x}=\left(\frac{1}{x}+\frac{1}{1-x}\right)-\left(\frac{\left\llbracket 0\leq x<\frac{1}{2}\right\rrbracket}{1-x}+\frac{\left\llbracket\frac{1}{2}\leq x<1\right\rrbracket}{x}\right)$. We put $\psi x=\frac{\left\llbracket 0\leq x<\frac{1}{2}\right\rrbracket}{1-x}+\frac{\left\llbracket\frac{1}{2}\leq x<1\right\rrbracket}{x}$ so that $\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}=S_{N}\left(\theta+\overline{\theta}-\psi\right)$. Now $\psi$ has integral $2\log 2$ and variation $2$, so it can immediately be estimated by Denjoy-Koksma (**) as $S_{N}\psi\in 2N\log 2\pm 2d_{\alpha}(N)$, where $d_{\alpha}(N)=\sum_{r=0}^{n}b_{r}$ is the sum of digits in the Ostrowski representation of $N$, with $1\leq d_{\alpha}(N)<q_{n}^{1/n}\left(\frac{\log q_{n}}{\log\phi}+1\right)$ (equality when $N=q_{n}$). ∎
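The decomposition in the proof can be checked numerically. The sketch below verifies the pointwise identity $\frac{1}{\|x\|}=\theta x+\overline{\theta}x-\psi x$ on a grid in $(0,1)$ and approximates $\int_{0}^{1}\psi$ by the midpoint rule; the grid sizes and function names are arbitrary choices made for illustration.

```python
import math

def dist_to_nearest_int(x):
    """Distance from x to the nearest integer."""
    f = x % 1.0
    return min(f, 1.0 - f)

def theta(x):
    return 1.0 / x

def theta_bar(x):
    return 1.0 / (1.0 - x)

def psi(x):
    """Bounded correction term: 1/(1-x) on [0, 1/2), 1/x on [1/2, 1)."""
    return 1.0 / (1.0 - x) if x < 0.5 else 1.0 / x

def max_relative_identity_error(samples=999):
    """Largest relative gap between 1/||x|| and theta + theta_bar - psi on a grid."""
    worst = 0.0
    for k in range(1, samples + 1):
        x = k / (samples + 1)
        lhs = 1.0 / dist_to_nearest_int(x)
        rhs = theta(x) + theta_bar(x) - psi(x)
        worst = max(worst, abs(lhs - rhs) / lhs)
    return worst

def midpoint_integral(f, steps=100000):
    """Midpoint rule approximation of the integral of f over (0, 1)."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

identity_error = max_relative_identity_error()
psi_integral = midpoint_integral(psi)
```

Since $\psi$ takes values in $[1,2]$, rising on $[0,\frac{1}{2}]$ and falling on $[\frac{1}{2},1)$, its variation of 2 can be read off directly.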

Prior Art - Upper Bound

In this section a factor of 2 appears in all terms of the sum. For convenience, therefore, we will estimate the half sum $\frac{1}{2}\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}$.

(9.8) $\frac{1}{2}\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}\leq 2q_{n+1}\left(1+\log\left(1+\frac{N}{q_{n}}\right)\right)+32N\log q_{n}+q_{3}N$

We note that this is $O\left(q_{n+1}\log\left(1+\frac{N}{q_{n}}\right)+N\log q_{n}\right)$, where either term can dominate since $\frac{q_{n+1}}{N}$ can be arbitrarily large or $O(q_{n})$ depending upon $\alpha$.

Results from our theory

We first derive an estimate using our own methods.

Lemma 112.

$\frac{1}{2}\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}<(1+\log c_{n})q_{n+1}^{/}+N\log q_{n}+(1+\log c_{n-1}^{\max})\sum_{r=1}^{n}q_{r}^{/}+\left(\left(1-\log 2\right)N+q_{n}+q_{n-1}\right)+d_{\alpha}(N)$

Proof.

From Proposition 111 we have $\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}=S_{N}\left(\theta+\overline{\theta}-\psi\right)\leq S_{N}\left(\theta+\overline{\theta}\right)-2N\log 2+2d_{\alpha}(N)$. Now (9.6) gives us an upper bound for $S_{N}\theta$, and by duality (Subsection 8.6) $S_{N}\overline{\theta}$ has the same bound with $E_{n}/O_{n}$ interchanged. The full result follows from combining these partial results. ∎

Comparison

To compare our result with (9.8) we start by noting $c_{n}\leq b_{n}+1=1+\left\lfloor\frac{N}{q_{n}}\right\rfloor$ and $q_{n+1}^{/}<q_{n+1}+q_{n}$, so that $(1+\log c_{n})q_{n+1}^{/}<\left(q_{n+1}+q_{n}\right)\left(1+\log\left(1+\frac{N}{q_{n}}\right)\right)$. Next by (**) $\sum_{r=1}^{n}q_{r}^{/}<3q_{n}+4q_{n-1}$ and $d_{\alpha}(N)<$**?. Putting these results together gives

$\frac{1}{2}\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}<\left(q_{n+1}+q_{n}\right)\left(1+\log\left(1+\frac{N}{q_{n}}\right)\right)+N\log q_{n}+\left(3q_{n}+4q_{n-1}\right)\log c_{n-1}^{\max}+\left(N\left(1-\log 2\right)+4q_{n}+5q_{n-1}\right)+d_{\alpha}(N)$

The main obstacle to comparison with Beresnevich et al now lies with the third term, and in particular with the component $c_{n-1}^{\max}$.

Note that for a.e. $\alpha$ we have $c_{n}=O\left((\log q_{n})^{1+\epsilon}\right)$, so that for a.e. $\alpha$ $\log c_{n-1}^{\max}=O(L^{2}q_{n-1})$, and so this term is asymptotically insignificant compared with the first two terms and we get

$\frac{1}{2}\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}\leq(q_{n+1}+q_{n})\left(1+\log\left(1+\frac{N}{q_{n}}\right)\right)+N\log q_{n}+O(q_{n}L^{2}q_{n-1}+N)$

For all $\alpha$ we can make the (very) coarse estimate $c_{n-1}^{\max}\leq a_{n}^{\max}<\max_{r\leq n}\frac{q_{r}}{q_{r-1}}<q_{n}$ to give

$\frac{1}{2}\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}<\left(q_{n+1}+q_{n}\right)\left(1+\log\left(1+\frac{N}{q_{n}}\right)\right)+\left(N+3q_{n}+4q_{n-1}\right)\log q_{n}+O(N)$

which shows the same asymptotic order as Beresnevich et al but with improved bounds. Beresnevich et al point out that they have not sought best possible results, but then neither have we: these results are simplifications read off our more general results.

Prior Art - Lower Bound

Beresnevich et al show (also using elementary methods) that $\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}\geq LB2$ where

(9.9) $LB2=q_{n+1}\log\left(1+\left\lfloor\frac{N}{q_{n}}\right\rfloor\right)+\frac{1}{24}N\log q_{n}-\left(\frac{1}{3}\log q_{2}+\frac{1}{2}\right)N$

Footnote: Strictly, the paper has $\frac{N}{q_{n}}$ rather than $\left\lfloor\frac{N}{q_{n}}\right\rfloor$, but this makes negligible difference and appears to be a typographical error.

They also give an alternative result derived using Minkowski’s Convex Body Theorem, showing that for $N\geq 2$ we have $\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}\geq LB3$ where:

(9.10) $LB3=N\log N+N(1-\log 2)+2$

The latter is an interesting result because it is derived in such a different manner: it exploits the convexity of the particular function $\frac{1}{\left\|x\right\|}$ but does not use any properties of the rotation number $\alpha$. We therefore expect it to give less good results when specific properties of the rotation number become important, and indeed this is what we find.

Results from our theory

Given the Ostrowski representation $N=\sum_{r=0}^{n}b_{r}q_{r}$ we will use the rather crude estimate from (8.8), where we exploit the fact that $\phi\geq 0$ and so $S_{N}\phi\geq S_{b_{n}q_{n}}\phi$. Clearly this lower bound of $S_{b_{n}q_{n}}\phi$ is at its best when $N=b_{n}q_{n}$ and becomes less good as $N$ is increased towards its maximum of $(b_{n}+1)q_{n}-1$ (or in the case of $b_{n}=a_{n+1}$ a maximum of $q_{n+1}-1$). The proportional error will be greatest when $b_{n}=1$ and $N=2q_{n}-1$. More refined estimates can be made by considering additional terms from the Ostrowski representation. However as we shall see, this initial estimate already compares well with other published results.
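The Ostrowski digits $b_{r}$, the digit sum $d_{\alpha}(N)$ and the truncation point $b_{n}q_{n}$ used above can be computed with the standard greedy algorithm. The following is a minimal sketch, assuming the list of quasiperiods is supplied up to some $q_{m}>N$; the function names and the sample values are illustrative.

```python
def denominators(partial_quotients):
    """Return [q_0, ..., q_m] for alpha = [0; a_1, ..., a_m] (q_{-1} = 0, q_0 = 1)."""
    q_prev, q = 0, 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs

def ostrowski_digits(N, qs):
    """Greedy digits [b_0, ..., b_m] with N = sum(b_r * q_r); requires qs[-1] > N."""
    if qs[-1] <= N:
        raise ValueError("need a quasiperiod larger than N")
    digits = [0] * len(qs)
    remainder = N
    # Work downwards from the largest quasiperiod, taking as many copies as fit
    for r in range(len(qs) - 1, -1, -1):
        digits[r], remainder = divmod(remainder, qs[r])
    return digits

# Hypothetical example: alpha = [0; 1, 1, 1, ...], N = 100
qs = denominators([1] * 12)
digits = ostrowski_digits(100, qs)
digit_sum = sum(digits)                                   # d_alpha(N)
n = max(r for r, b in enumerate(digits) if b > 0)         # top non-zero index
truncation = digits[n] * qs[n]                            # b_n * q_n
```

Here $100=89+8+3$, so the crude lower bound replaces $S_{100}$ by $S_{89}$.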

Lemma 113.

For $q_{n}>1$ we have the lower bound $\sum_{r=1}^{b_{n}q_{n}}\frac{1}{\left\|r\alpha\right\|}>LB1$ where

(9.11) $LB1=q_{n+1}^{/}\log^{*}(1+b_{n})+b_{n}q_{n}\left(2\log^{*}q_{n}-(1+2\log 2)\right)$

where $\log^{*}x=\max\{1,\log x\}$. (Note that $2\log^{*}q_{n}-(1+2\log 2)>0$ for $q_{n}^{2}>4e$, and hence for $q_{n}>3$.)

Proof.

From proposition 111 r=1N1rα=SN(θ+θ¯ψ)Sbnqn(θ+θ¯ψ)\sum_{r=1}^{N}\frac{1}{\left\|r\alpha\right\|}=S_{N}\left(\theta+\overline{\theta}-\psi\right)\geq S_{b_{n}q_{n}}\left(\theta+\overline{\theta}-\psi\right) where ψ(x)=x<1211x+x121x\psi(x)=\left\llbracket x<\frac{1}{2}\right\rrbracket\frac{1}{1-x}+\left\llbracket x\geq\frac{1}{2}\right\rrbracket\frac{1}{x}. Note that Varψ=2\operatorname{Var}\psi=2 and so by Denjoy-Koksma(**) SNψ<2Nlog2+2dα(N)S_{N}\psi<2N\log 2+2d_{\alpha}(N). Now dα(bnqn)=bnd_{\alpha}(b_{n}q_{n})=b_{n} so this gives Sbnqnψ<2bnqnlog2+2bnS_{b_{n}q_{n}}\psi<2b_{n}q_{n}\log 2+2b_{n}.

From (8.8) Sbnqn(θ+θ¯)>2bnqn(Hqn11)+qn+1/Hbn+bnqn/+2bn>2bnqn(logqn1)+qn+1/log(1+bn)+bnqn+2bnS_{b_{n}q_{n}}\left(\theta+\overline{\theta}\right)>2b_{n}q_{n}\left(H_{q_{n}-1}-1\right)+q_{n+1}^{/}H_{b_{n}}+b_{n}q_{n}^{/}+2b_{n}>2b_{n}q_{n}\left(\log^{*}q_{n}-1\right)+q_{n+1}^{/}\log^{*}(1+b_{n})+b_{n}q_{n}+2b_{n} (since qn/>qnq_{n}^{/}>q_{n}). The result follows. ∎
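As a numerical sanity check (not part of the proof), the following Python sketch compares the directly computed sum with $LB1$ and $LB3$ at $N=q_{n}$ for the golden rotation. It is a sketch under stated assumptions: the golden rotation is our choice of test case, the helper names are hypothetical, and $q_{n+1}^{/}$ is read as $a_{n+1}^{/}q_{n}+q_{n-1}$ with $a_{n+1}^{/}$ the complete quotient, as used in the comparison with $LB3$ below.

```python
import math

GOLDEN = (math.sqrt(5) - 1) / 2  # assumed test rotation, alpha = [0; 1, 1, 1, ...]
PHI = (math.sqrt(5) + 1) / 2     # every complete quotient a'_r of GOLDEN equals PHI


def dist_to_int(x):
    """||x||: distance from x to the nearest integer."""
    return abs(x - round(x))


def reciprocal_sum(N, alpha=GOLDEN):
    """Direct evaluation of sum_{r=1}^{N} 1/||r*alpha||."""
    return sum(1.0 / dist_to_int(r * alpha) for r in range(1, N + 1))


def log_star(x):
    """log* x = max(1, log x)."""
    return max(1.0, math.log(x))


def lb1(q_n, q_prev, b_n=1, a_complete=PHI):
    """LB1 of (9.11), reading q'_{n+1} as a'_{n+1} * q_n + q_{n-1}."""
    q_next = a_complete * q_n + q_prev
    return (q_next * log_star(1 + b_n)
            + b_n * q_n * (2 * log_star(q_n) - (1 + 2 * math.log(2))))


def lb3(N):
    """LB3 of (9.10)."""
    return N * math.log(N) + N * (1 - math.log(2)) + 2


# Consecutive Fibonacci denominators (q_{n-1}, q_n) of the golden rotation.
for q_prev, q_n in [(3, 5), (8, 13), (89, 144), (610, 987)]:
    print(q_n, reciprocal_sum(q_n), lb1(q_n, q_prev), lb3(q_n))
```

For the golden rotation $b_{n}=1$ is forced, so $N=q_{n}$ is exactly the case $N=b_{n}q_{n}$ in which the crude estimate is at its best.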

Comparison of Lower Bound results $LB1$ vs $LB2$:

Let us compare the three terms of our result LB1LB1 and Beresnevich’s result LB2LB2. To compare first terms we note that Nqn=bn\left\lfloor\frac{N}{q_{n}}\right\rfloor=b_{n} and that qn+1/qn+1=(an+1/an+1)qn=1an+2/qnq_{n+1}^{/}-q_{n+1}=(a_{n+1}^{/}-a_{n+1})q_{n}=\frac{1}{a_{n+2}^{/}}q_{n}. To compare the second and third terms we use N<(bn+1)qnN<(b_{n}+1)q_{n}.

This gives us: $LB1-LB2>\frac{1}{a_{n+2}^{/}}q_{n}\log^{*}(1+b_{n})+\left(2-\frac{1+\frac{1}{b_{n}}}{24}\right)b_{n}q_{n}\log^{*}q_{n}+b_{n}q_{n}\left(\left(\frac{1}{3}\log q_{2}+\frac{1}{2}\right)\left(1+\frac{1}{b_{n}}\right)-1-2\log 2\right)$. This is positive for $q_{n}\geq q_{2}$ and dominated by the central term which has a minimum of $\frac{23}{12}b_{n}q_{n}\log q_{n}$.

$LB1$ vs $LB3$:

To compare LB1LB1 and LB3LB3 we define b=Nqnb=\frac{N}{q_{n}} so that bnb<bn+1b_{n}\leq b<b_{n}+1 (also b<an+1+qn1qnb<a_{n+1}+\frac{q_{n-1}}{q_{n}}). This gives LB3=bqn(logqn+logb)+bqn(1log2)+2LB3=bq_{n}(\log q_{n}+\log b)+bq_{n}(1-\log 2)+2 and hence LB1LB3=(qn+1/log(1+bn)bqnlogb)+qn((2bnlogqnblogqn)bn(1+2log2)b(1log2))2LB1-LB3=\left(q_{n+1}^{/}\log^{*}(1+b_{n})-bq_{n}\log b\right)+q_{n}\left((2b_{n}\log^{*}q_{n}-b\log q_{n})-b_{n}(1+2\log 2)-b(1-\log 2)\right)-2.

Now bqn=N<qn+1/bq_{n}=N<q_{n+1}^{/} and b<1+bnb<1+b_{n} so that the first term is always positive. We now investigate the coefficient of the second (qnq_{n}) term which we designate C=(2bnlogqnblogqn)bn(1+2log2)b(1log2)C=(2b_{n}\log^{*}q_{n}-b\log q_{n})-b_{n}(1+2\log 2)-b(1-\log 2). Note that Cqn>2Cq_{n}>2 gives LB1>LB3LB1>LB3.

Lemma 114.

If bn>1b_{n}>1 then LB1>LB3LB1>LB3 for qn27q_{n}\geq 27

Proof.

Using b<bn+1b<b_{n}+1 gives C(bn1)logqnbn(2+log2)(1log2)C\geq(b_{n}-1)\log q_{n}-b_{n}(2+\log 2)-(1-\log 2). Hence for bn>1b_{n}>1 CC is dominated by the term (bn1)qnlogqn(b_{n}-1)q_{n}\log q_{n}. Further C0C\geq 0 for logqn1bn1(bn(2+log2)(1log2))+2qn=2+log2+1bn1+2qn\log q_{n}\geq\frac{1}{b_{n}-1}\left(b_{n}(2+\log 2)-(1-\log 2)\right)+\frac{2}{q_{n}}=2+\log 2+\frac{1}{b_{n}-1}+\frac{2}{q_{n}} or qn2e2.5+2/qnq_{n}\geq 2e^{2.5+2/q_{n}} which holds for integers qn27q_{n}\geq 27. ∎

As we noted above, the proportional error as a result of using SbnqnS_{b_{n}q_{n}} as a lower bound is greatest when bn=1b_{n}=1 and NN is close to its maximal value of 2qn12q_{n}-1. Indeed our next lemma guarantees LB1>LB3LB1>LB3 when bn=1b_{n}=1 for most values of NN, but not for the largest values.

Lemma 115.

If bn=1b_{n}=1 then LB1>LB3LB1>LB3 if an+1/(3+2log2)qn12qna_{n+1}^{/}\geq(3+2\log 2)-\frac{q_{n-1}-2}{q_{n}} or if N/qn23logqn+1log2N/q_{n}\leq 2-\frac{3}{\log q_{n}+1-\log 2}. If neither condition holds, we may still have LB1>LB3LB1>LB3 but it is no longer guaranteed.

Proof.

For bn=1b_{n}=1 we have C3C\geq-3

C=(2b)logqnb(1log2)(1+2log2)C=(2-b)\log q_{n}-b(1-\log 2)-(1+2\log 2) and b<2b<2. Put b=2λb=2-\lambda then C=λlogqn+λ(1log2)3C=\lambda\log q_{n}+\lambda(1-\log 2)-3 so C0C\geq 0 for λ3logqn+1log2\lambda\geq\frac{3}{\log q_{n}+1-\log 2}. Hence given any fixed λ\lambda we will always have C>0C>0 for large enough qnq_{n}, but it is also true that C<0C<0 if λ\lambda is small enough. Put λ=2Nqn=kqn\lambda=2-\frac{N}{q_{n}}=\frac{k}{q_{n}} for some 1kqn1\leq k\leq q_{n}, so for large qnq_{n} there will always be values of N2qn1N\leq 2q_{n}-1 which give C<0C<0, namely N=2qnkN=2q_{n}-k where k<3qnlogqn+1log2k<\frac{3q_{n}}{\log q_{n}+1-\log 2} and kqnk\leq q_{n}. For these, taking b<2b<2 gives LB1LB3>(qn+1/2qnlog2)3qn2LB1-LB3>\left(q_{n+1}^{/}-2q_{n}\log 2\right)-3q_{n}-2 which is positive if qn+1/(3+2log2)qn+2q_{n+1}^{/}\geq(3+2\log 2)q_{n}+2. This gives an+1/qn+qn1(3+2log2)qn+2a_{n+1}^{/}q_{n}+q_{n-1}\geq(3+2\log 2)q_{n}+2 or an+1/(3+2log2)qn12qna_{n+1}^{/}\geq(3+2\log 2)-\frac{q_{n-1}-2}{q_{n}}, which is guaranteed true for an+1>4a_{n+1}>4 and often true for an+1=4a_{n+1}=4 (specifically when 1an+2/>2log21qn12qn)\frac{1}{a_{n+2}^{/}}>2\log 2-1-\frac{q_{n-1}-2}{q_{n}}). When it is false, we may still have LB1>LB3LB1>LB3, but it is no longer guaranteed. ∎
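The threshold in the proof above is easy to explore numerically. The sketch below is illustrative only (the function names are hypothetical); it evaluates the coefficient $C$ for $b_{n}=1$ and the critical value $\lambda=\frac{3}{\log q_{n}+1-\log 2}$, assuming $q_{n}\geq 3$ so that $\log^{*}q_{n}=\log q_{n}$.

```python
import math


def c_coefficient(b, q_n):
    """C for b_n = 1: (2 - b) log q_n - b (1 - log 2) - (1 + 2 log 2).

    Assumes q_n >= 3 so that log* q_n = log q_n; b stands for N / q_n in [1, 2).
    """
    return ((2 - b) * math.log(q_n)
            - b * (1 - math.log(2))
            - (1 + 2 * math.log(2)))


def lambda_threshold(q_n):
    """Smallest lambda = 2 - N/q_n for which C >= 0."""
    return 3 / (math.log(q_n) + 1 - math.log(2))


q = 100
lam = lambda_threshold(q)
print(lam, c_coefficient(2 - lam, q))    # C vanishes at the threshold
print(c_coefficient(1.0, q), c_coefficient(1.99, q))
```

The sign change of $C$ as $b=N/q_{n}$ approaches 2 is the reason the guarantee is lost for the largest values of $N$.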

In summary $LB1>LB3$ if $b_{n}>1$ (and $q_{n}\geq 27$, Lemma 114), or if $b_{n}=1$ and either $a_{n+1}^{/}\geq(3+2\log 2)-\frac{q_{n-1}-2}{q_{n}}$ or $N/q_{n}\leq 2-\frac{3}{\log q_{n}+1-\log 2}$. It is possible that, given $\alpha$ and taking into account more terms of the Ostrowski expansion, we could obtain a refined lower bound always guaranteed to exceed $LB3$ for large enough $N$. On the other hand $LB3$ exploits the convexity of $\frac{1}{\left\|x\right\|}$ which our method does not. Possibly this guarantees $LB3>LB1$ in certain circumstances. We leave this as a conjecture.

Conjecture 116.

The estimate for SN1xS_{N}\frac{1}{\left\|x\right\|} in (8.8) can be improved beyond the current estimate for Sbnqn1xS_{b_{n}q_{n}}\frac{1}{\left\|x\right\|} to the point where it always exceeds LB3LB3 for large enough NN.

9.5. The sum of the anti-symmetrisation of reciprocals SN(θθ¯)=r=1N1{rα}11{rα}S_{N}(\theta-\overline{\theta})=\sum_{r=1}^{N}\frac{1}{\{r\alpha\}}-\frac{1}{1-\{r\alpha\}}

Analysis Anti-symmetric functions have been less thoroughly investigated than the lower bounded functions studied above. As far as we are aware, published results to date have not included estimates of the asymptotic constants. We will also restrict ourselves to Landau notation, and focus instead on showing how we can obtain such results very quickly. Note that for anti-symmetric functions, we can expect lower bounds to be similar to the negative of the upper bounds, so we will consider only the upper bound here.

Recall that if gg is a positive function, f=O(g)f=O(g) means |f|<Cg\left|f\right|<Cg for some constant CC, and in particular ff may take negative values.

The following result is trivial but useful:

Remark 117.

Recall SNϕ1Nϕ(rα)S_{N}\phi\coloneqq\sum_{1}^{N}\phi(r\alpha) and so depends only on the values of ϕ\phi along the orbit (rα)(r\alpha). Hence if ϕ,ψ\phi,\psi agree at the (countable) set of orbital points we have SNϕ=SNψS_{N}\phi=S_{N}\psi. In particular if α\alpha is irrational, and ϕ,ψ\phi,\psi disagree only at some set of rational points, then SNϕ=SNψS_{N}\phi=S_{N}\psi.

Lemma 118.

For α\alpha of constant type, let ϕ\phi be a monotonic decreasing function satisfying ϕ(x)=O(1x)\phi(x)=O\left(\frac{1}{\left\|x\right\|}\right) on (0,1)(0,1), then Brϕ=brPqr+O(qr)B_{r}\phi=b_{r}P_{q_{r}}+O(q_{r}) and the same result holds for B¯rϕ\overline{B}_{r}\phi

Proof.

Recall the definition of BrϕB_{r}\phi:

(9.12) $B_{r}\phi=\left\llbracket b_{r}>0\right\rrbracket\left(\left\llbracket q_{r}>1\right\rrbracket b_{r}\left(P_{q_{r}}-\overline{\phi}\left(\frac{1}{q_{r}}\right)\right)+\left\llbracket 2<q_{r}<q_{n}\right\rrbracket E_{r}\left(\phi_{r0,q_{r}-q_{r-1}}-\overline{\phi}\left(\frac{2}{q_{r}}\right)\right)+\sum_{s=0}^{b_{r}-1}\left(\phi_{rsq_{r}}+\left\llbracket q_{r}>1\right\rrbracket\phi_{rsq_{r-1}}\right)\right)$

We will need the following results:

Since ϕ(x)=O(1x)\phi(x)=O(\frac{1}{\left\|x\right\|}) we have for qr>1q_{r}>1, ϕ¯(1qr)=ϕ(11qr)=O(qr)\overline{\phi}(\frac{1}{q_{r}})=\phi(1-\frac{1}{q_{r}})=O(q_{r}), and similarly for qr>2q_{r}>2, ϕ¯(2qr)=O(qr)\overline{\phi}\left(\frac{2}{q_{r}}\right)=O(q_{r}).

Since α\alpha is of constant type, qr+2/=O(qr)q_{r+2}^{/}=O(q_{r}). But αrst>1qr+2/\left\|\alpha_{rst}\right\|>\frac{1}{q_{r+2}^{/}} and so ϕrst=ϕ(αrst)=O(qr)\phi_{rst}=\phi(\alpha_{rst})=O(q_{r}).

Also since α\alpha is of constant type brar+1=O(1)b_{r}\leq a_{r+1}=O(1) and so s=0br1ϕrst=O(qr)\sum_{s=0}^{b_{r}-1}\phi_{rst}=O(q_{r})

Combining these results in (9.12) gives Brϕ=brPqr+O(qr)B_{r}\phi=b_{r}P_{q_{r}}+O(q_{r}). A similar argument gives also B¯rϕ=brPqr+O(qr)\overline{B}_{r}\phi=b_{r}P_{q_{r}}+O(q_{r})

Note that a monotonic decreasing anti-symmetric real function on (0,1)(0,1) must satisfy ϕ(12)=0\phi(\frac{1}{2})=0 and hence ϕ(x)0\phi(x)\geq 0 on (0,12](0,\frac{1}{2}]. Also since in this case ϕ(x)=ϕ(1x)\phi(x)=-\phi(1-x), the constraint ϕ(x)=O(1x)\phi(x)=O\left(\frac{1}{\left\|x\right\|}\right) on (0,1)(0,1) is equivalent to ϕ(x)=O(1x)\phi(x)=O\left(\frac{1}{x}\right)on (0,12)(0,\frac{1}{2}).

Theorem 119.

For α\alpha of constant type, let ϕ\phi be a monotonic decreasing anti-symmetric function satisfying ϕ(x)=O(1x)\phi(x)=O\left(\frac{1}{x}\right)on (0,12)(0,\frac{1}{2}), then SNϕ=O(N)S_{N}\phi=O(N)

Proof.

From Lemma 89 for any decreasing ϕ\phi we have SNϕr=0nBr(ϕ)S_{N}\phi\leq\sum_{r=0}^{n}B_{r}(\phi).

Since ϕ\phi is anti-symmetric we have Pqr=t=1qr1ϕ(tqr)=0P_{q_{r}}=\sum_{t=1}^{q_{r}-1}\phi(\frac{t}{q_{r}})=0. Hence in this case using Lemma 118 gives us Brϕ=O(qr)B_{r}\phi=O(q_{r}).

Now r=0nqr<2(qn+qn1)<4qn\sum_{r=0}^{n}q_{r}<2(q_{n}+q_{n-1})<4q_{n} and hence SNϕr=0nBrϕ=O(r=0nqr)=O(qn)=O(N)S_{N}\phi\leq\sum_{r=0}^{n}B_{r}\phi=O(\sum_{r=0}^{n}q_{r})=O(q_{n})=O(N).

A similar argument gives SNϕr=0nB¯rϕ=O(N)S_{N}\phi\geq\sum_{r=0}^{n}\overline{B}_{r}\phi=O(N) and so SNϕ=O(N)S_{N}\phi=O(N). ∎

Corollary 120.

For $\alpha$ of constant type, $S_{N}\phi=O(N)$ for each of $\phi_{1}(x)=\frac{1}{x}-\frac{1}{1-x}$, $\phi_{2}(x)=\cot\pi x$, $\phi_{3}(x)=\frac{1}{\{\{x\}\}}$.

Proof.

Note that each of ϕ1,ϕ2\phi_{1},\phi_{2} is natively anti-symmetric, and so is the derived function ϕ3/(x)=x12ϕ3(x)\phi_{3}^{/}(x)=\left\llbracket x\neq\frac{1}{2}\right\rrbracket\phi_{3}(x). All 3 are monotonic decreasing. We also have:

  1. (1)

    ϕ1(x)=1x11x<1x\phi_{1}(x)=\frac{1}{x}-\frac{1}{1-x}<\frac{1}{x} on (0,12)(0,\frac{1}{2})

  2. (2)

    πcotπx<1x\pi\cot\pi x<\frac{1}{x} on (0,12)(0,\frac{1}{2})

  3. (3)

    x12{{x}}=1x\frac{\left\llbracket x\neq\frac{1}{2}\right\rrbracket}{\{\{x\}\}}=\frac{1}{x} on (0,12)(0,\frac{1}{2})

Hence Theorem 119 applies, and the result is proven for ϕ1,ϕ2,ϕ3/\phi_{1},\phi_{2},\phi_{3}^{/}. Now ϕ3/,ϕ3\phi_{3}^{/},\phi_{3} differ only at the rational point x=12x=\frac{1}{2} , so by Remark 117 SNϕ3=SNϕ3/=O(N)S_{N}\phi_{3}=S_{N}\phi_{3}^{/}=O(N). ∎
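The contrast between the anti-symmetric and the lower bounded case can be seen numerically. The following Python sketch is illustrative only: the golden rotation (which is of constant type) and the range of $N$ are our own choices, and the helper names are hypothetical. It tabulates $S_{N}\cot\pi x$ and $S_{N}\frac{1}{\left\|x\right\|}$ and normalises by $N$.

```python
import math

GOLDEN = (math.sqrt(5) - 1) / 2  # assumed test rotation (constant type)


def birkhoff_sums(phi, N, alpha=GOLDEN):
    """Return [S_1 phi, ..., S_N phi] along the orbit r*alpha mod 1."""
    total, sums = 0.0, []
    for r in range(1, N + 1):
        total += phi((r * alpha) % 1.0)
        sums.append(total)
    return sums


def cot_pi(x):
    """phi_2(x) = cot(pi x)."""
    return math.cos(math.pi * x) / math.sin(math.pi * x)


def recip_norm(x):
    """1/||x|| for x in (0, 1)."""
    return 1.0 / min(x, 1.0 - x)


def max_ratio(sums):
    """max over N of |S_N| / N."""
    return max(abs(s) / n for n, s in enumerate(sums, start=1))


N = 3000
cot_sums = birkhoff_sums(cot_pi, N)
rec_sums = birkhoff_sums(recip_norm, N)
print("max |S_N cot| / N      :", max_ratio(cot_sums))
print("S_N (1/||x||) / N at N :", rec_sums[-1] / N)
```

Since $\left|\cot\pi x\right|\leq\frac{1}{\pi\left\|x\right\|}$, the first quantity can never exceed $\frac{1}{\pi}$ times the second term by term; the point of Theorem 119 is that cancellation makes it much smaller, bounded rather than growing like $\log N$.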

9.6. The series r=1N1rγϕ(rα)\sum_{r=1}^{N}\frac{1}{r^{\gamma}}\phi(r\alpha)

Analysis Note that this series is not itself a Birkhoff sum (since $r^{\gamma}$ is not a function of the value of $r\alpha$), but we can estimate it using Birkhoff sums.

Theorem 121.

Let $\phi$ be any function with $S_{N}\phi=O\left(N^{\beta}\right)$ for some $\beta\geq 1$, then for any $\gamma_{\mathbb{R}}$ the series $S=\sum_{r=1}^{N}\frac{1}{r^{\gamma}}\phi(r\alpha)$ is $O(1)$ for $\gamma>\beta$, is $O(\log N)$ for $\gamma=\beta$, and is $O(N^{\beta-\gamma})$ for $\gamma<\beta$.

Proof.

By partial summation $\sum_{r=1}^{N}\frac{1}{r^{\gamma}}\phi(r\alpha)=\sum_{r=1}^{N-1}\left(S_{r}\phi\right)\left(\frac{1}{r^{\gamma}}-\frac{1}{(1+r)^{\gamma}}\right)+\left(S_{N}\phi\right)\left(\frac{1}{N^{\gamma}}\right)$. Note the final term is $O(N^{\beta-\gamma})$. Now $\left(\frac{1}{r^{\gamma}}-\frac{1}{(1+r)^{\gamma}}\right)=r^{-\gamma}\left(1-\left(1+\frac{1}{r}\right)^{-\gamma}\right)=\gamma r^{-(\gamma+1)}+O(r^{-(\gamma+2)})$. Hence $\sum_{r=1}^{N-1}\left(S_{r}\phi\right)\left(\frac{1}{r^{\gamma}}-\frac{1}{(1+r)^{\gamma}}\right)=O\left(\sum_{r=1}^{N-1}r^{\beta}r^{-(\gamma+1)}\right)$. Now $\sum_{r=1}^{N-1}r^{-(\gamma+1-\beta)}$ converges for $\gamma+1-\beta>1$, is $O(\log N)$ for $\gamma+1-\beta=1$, and is $O(N^{\beta-\gamma})$ for $\gamma+1-\beta<1$, as required. ∎
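The partial summation identity used in the proof can be checked directly. The sketch below is illustrative only (function names hypothetical); it evaluates $\sum_{r=1}^{N}r^{-\gamma}v_{r}$ both directly and via the partial sums $S_{r}=\sum_{k\leq r}v_{k}$, for an arbitrary non-empty list of values $v_{r}$.

```python
import math


def weighted_sum_direct(values, gamma):
    """sum_{r=1}^{N} values[r-1] / r**gamma."""
    return sum(v / r ** gamma for r, v in enumerate(values, start=1))


def weighted_sum_by_parts(values, gamma):
    """Same sum via sum_{r<N} S_r (r^-gamma - (r+1)^-gamma) + S_N N^-gamma.

    Requires a non-empty list of values.
    """
    N = len(values)
    partial = 0.0  # running partial sum S_r
    total = 0.0
    for r in range(1, N):
        partial += values[r - 1]
        total += partial * (r ** -gamma - (r + 1) ** -gamma)
    partial += values[N - 1]  # now S_N
    return total + partial * N ** -gamma


# Example values: cot(pi r alpha) along the golden rotation (our choice of test data).
alpha = (math.sqrt(5) - 1) / 2
vals = [math.cos(math.pi * r * alpha) / math.sin(math.pi * r * alpha)
        for r in range(1, 501)]
for g in (0.5, 1.0, 2.0):
    print(g, weighted_sum_direct(vals, g), weighted_sum_by_parts(vals, g))
```

The two evaluations agree up to rounding for every $\gamma$; the theorem then follows by bounding $S_{r}$ inside the second form.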

Corollary 122.

For each of the functions ϕ(x)=1x11x,ϕ(x)=cotπx,ϕ(x)=1{{x}}\phi(x)=\frac{1}{x}-\frac{1}{1-x},\phi(x)=\cot\pi x,\phi(x)=\frac{1}{\{\{x\}\}} we have r=1N1rϕ(rα)=O(logN)\sum_{r=1}^{N}\frac{1}{r}\phi(r\alpha)=O(\log N)

Proof.

This is a direct consequence of Corollary 120 (which requires $\alpha$ of constant type) and Theorem 121 with $\gamma=\beta=1$. ∎

9.7. The double exponential sum Exp2(N)=u=0N1v=0N1e2πiuvαExp2(N)=\sum_{u=0}^{N-1}\sum_{v=0}^{N-1}e^{2\pi iuv\alpha}

For convenience we will write e(x)e2πixe(x)\coloneqq e^{2\pi ix}, noting that |1e(x)|2\left|1-e(x)\right|\leq 2.

Analysis This sum was studied by Sinai & Ulcigrai [15] (in connection with a problem in quantum computing) who showed that for α\alpha of bounded type and N=qn,N=q_{n},the sum is O(qn)O(q_{n}). Although the double sum is not itself a Birkhoff sum, several Birkhoff sums are involved in estimating it.

The inner single sum over $v$ is a Birkhoff sum, although this fact is unimportant as we can simply sum it as a geometric progression giving $\sum_{v=0}^{N-1}e(uv\alpha)=\left\llbracket u=0\right\rrbracket N+\left\llbracket u\neq 0\right\rrbracket\frac{1-e(uN\alpha)}{1-e(u\alpha)}$.

Hence $Exp2(N)=N+\sum_{u=1}^{N-1}\frac{1-e(uN\alpha)}{1-e(u\alpha)}$ and the second term we can now write as another Birkhoff sum $S_{N-1}f$ where $f(x)=\frac{1-e(Nx)}{1-e(x)}$. We are left with showing $S_{N-1}f=O(N)$ when $N=q_{n}$.

Now $\frac{1}{1-e(x)}=\frac{1-e(-x)}{2-\left(e(x)+e(-x)\right)}=\frac{(1-\cos 2\pi x)+i\sin 2\pi x}{2(1-\cos 2\pi x)}=\frac{1}{2}\left(1+i\cot\pi x\right)$ and hence $f(x)=\frac{1}{2}\left(1-e(Nx)\right)+\frac{i}{2}\cot\pi x-\frac{i}{2}\cot\pi x\,e(Nx)$. Denoting the 3 terms of $f$ as $T_{1},T_{2},T_{3}$ we have $\left|T_{1}(x)\right|=\frac{1}{2}\left|1-e(Nx)\right|\leq 1$ so that $S_{N-1}T_{1}=O(N)$, and also by Corollary 120 $S_{N-1}\cot\pi x=O(N)$. Hence $Exp2(N)=O(N)+S_{N-1}T_{3}$. We are left with showing $S_{N-1}T_{3}=O(N)$ for $N=q_{n}$. In fact this is the central result proved by Sinai & Ulcigrai, and it uses substantial machinery. More precisely they reduce $S_{N-1}T_{3}$ to a sum involving $\{\{x\}\}$ instead of $\cot\pi x$ and then prove the result $S_{q_{n}-1}\left(\frac{e(q_{n}x)}{\{\{x\}\}}\right)=O(q_{n})$. We will prove a more general result which provides both $Exp2(N)=O(N)$ and $S_{N-1}T_{3}=O(N)$ for $N=q_{n}$ as particular cases.
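The reduction of the double sum to the three terms $T_{1},T_{2},T_{3}$ can be verified numerically. The following Python sketch is illustrative only (the function names and the golden rotation are our own choices); it computes $Exp2(N)$ directly as a double sum and again as $N+S_{N-1}(T_{1}+T_{2}+T_{3})$.

```python
import cmath
import math

GOLDEN = (math.sqrt(5) - 1) / 2  # assumed test rotation number


def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def cot_pi(x):
    return math.cos(math.pi * x) / math.sin(math.pi * x)


def exp2_direct(N, alpha=GOLDEN):
    """Double sum over 0 <= u, v < N of e(u v alpha)."""
    return sum(e(u * v * alpha) for u in range(N) for v in range(N))


def exp2_via_terms(N, alpha=GOLDEN):
    """N + S_{N-1}(T1 + T2 + T3), using 1/(1 - e(x)) = (1 + i cot(pi x)) / 2."""
    total = complex(N)
    for u in range(1, N):
        x = u * alpha
        t1 = 0.5 * (1 - e(N * x))
        t2 = 0.5j * cot_pi(x)
        t3 = -0.5j * cot_pi(x) * e(N * x)
        total += t1 + t2 + t3
    return total


N = 34  # a Fibonacci number, i.e. some q_n of the golden rotation
print(exp2_direct(N), exp2_via_terms(N))
```

The two values agree up to rounding error for any $N$; only the subsequent bound on $S_{N-1}T_{3}$ requires $N=q_{n}$.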

Recall that θ:XY\theta:X\rightarrow Y is a Lipschitz continuous map between two metric spaces if there is a C0C\geq 0 with dY(y1,y2)CdX(x1,x2)d_{Y}(y_{1},y_{2})\leq Cd_{X}(x_{1},x_{2}) for any x1,x2Xx_{1},x_{2}\in X. In particular when X=𝕋X=\mathbb{T} we will take the metric to be .\left\|.\right\|, and note that since 𝕋\mathbb{T} is compact, θ\theta is bounded.

Lemma 123 (Partial Birkhoff Summation).

Let ψ,ϕ\psi,\phi be two complex valued observables on the circle 𝕋\mathbb{T} such that ψ\psi is Lipschitz continuous with constant CC, and |SNϕ|BN\left|S_{N}\phi\right|\leq BN for some constant BB (ie SNϕS_{N}\phi is O(N)O(N)). Given an irrational rotation number α\alpha we then have |SN(ψMϕ)|<BN(12CMα(N1)+|ψ|)\left|S_{N}(\psi_{M}\phi)\right|<BN\left(\frac{1}{2}C\left\|M\alpha\right\|(N-1)+\left|\psi\right|\right) where ψM(x)ψ({Mx})\psi_{M}(x)\coloneqq\psi(\{Mx\})

Proof.

Using partial summation we get $S_{N}\left(\psi_{M}\phi\right)=\sum_{r=1}^{N}\psi(rM\alpha)\phi(r\alpha)=\sum_{r=1}^{N-1}\left(S_{r}\phi\right)\left(\psi(Mr\alpha)-\psi((r+1)M\alpha)\right)+\left(S_{N}\phi\right)\left(\psi(NM\alpha)\right)$. Now the Lipschitz condition ensures that $\left|\psi\right|=\sup_{x\in\mathbb{T}}\left|\psi(x)\right|<\infty$ and that $\left|\psi(Mr\alpha)-\psi((r+1)M\alpha)\right|\leq C\left\|rM\alpha-(r+1)M\alpha\right\|=C\left\|M\alpha\right\|$. Also $\left|S_{r}\phi\right|\leq Br$ so that $\left|\sum_{r=1}^{N-1}\left(S_{r}\phi\right)\left(\psi(Mr\alpha)-\psi((r+1)M\alpha)\right)\right|\leq\sum_{r=1}^{N-1}\left(Br\right)\left(C\left\|M\alpha\right\|\right)=\frac{1}{2}N(N-1)B\left(C\left\|M\alpha\right\|\right)$. Finally $\left|\left(S_{N}\phi\right)\left(\psi(NM\alpha)\right)\right|\leq BN\left|\psi\right|$ and the result follows. ∎
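The bound of Lemma 123 can be tested numerically in the case of interest, $\phi(x)=\cot\pi x$ and $\psi(x)=e(x)$. The sketch below is illustrative only and rests on several assumptions of ours: the golden rotation as test case, a hypothetical function name, $B$ taken empirically as $\max_{r\leq N}\left|S_{r}\phi\right|/r$ (which is all the proof uses), and the constants $C=2\pi$, $\left|\psi\right|=1$ for $\psi=e$, since $\left|e(x)-e(y)\right|=2\left|\sin\pi(x-y)\right|\leq 2\pi\left\|x-y\right\|$.

```python
import cmath
import math

GOLDEN = (math.sqrt(5) - 1) / 2  # assumed test rotation number


def dist_to_int(x):
    """||x||: distance from x to the nearest integer."""
    return abs(x - round(x))


def twisted_sum_and_bound(N, M, alpha=GOLDEN):
    """Return (|S_N(psi_M * phi)|, bound) for phi = cot(pi x), psi = e(x)."""
    partial = 0.0   # running S_r phi
    B = 0.0         # empirical constant with |S_r phi| <= B r for r <= N
    twisted = 0j    # running S_r (psi_M * phi)
    for r in range(1, N + 1):
        x = r * alpha
        phi = math.cos(math.pi * x) / math.sin(math.pi * x)
        partial += phi
        B = max(B, abs(partial) / r)
        twisted += cmath.exp(2j * math.pi * M * x) * phi
    C, sup_psi = 2 * math.pi, 1.0  # Lipschitz constant and sup norm of e(x)
    bound = B * N * (0.5 * C * dist_to_int(M * alpha) * (N - 1) + sup_psi)
    return abs(twisted), bound


for N, M in [(232, 233), (100, 89), (50, 7)]:
    print(N, M, *twisted_sum_and_bound(N, M))
```

When $M=q_{n}$ and $N<q_{n+1}$ the factor $\left\|M\alpha\right\|(N-1)$ stays bounded, so the bound is linear in $N$; for a generic $M$ such as $M=7$ it is quadratic and correspondingly weak.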

Corollary 124.

We have |SN(ψMϕ)|=O(N)\left|S_{N}(\psi_{M}\phi)\right|=O(N) whenever Mα=O(1N)\left\|M\alpha\right\|=O(\frac{1}{N}) holds. Given k0,c0k_{0},c_{0}, then when N<qn+1N<q_{n+1} this holds if Mα<k0qn+1\left\|M\alpha\right\|<\frac{k_{0}}{q_{n+1}}, and in particular if M=kqnM=kq_{n} for kk0k\leq k_{0}.

Proof.

This is a simple consequence of the lemma and $q_{n-c}>A^{-c}q_{n}$. ∎

We can now deduce the major result of Sinai & Ulcigrai:

Corollary 125 (Sinai & Ulcigrai).

Sqn1(cotπxe(qnx))=O(qn)S_{q_{n}-1}\left(\cot\pi x\,e(q_{n}x)\right)=O(q_{n}) and Sqn1(e(qnx){{x}})=O(qn)S_{q_{n}-1}\left(\frac{e(q_{n}x)}{\{\{x\}\}}\right)=O(q_{n}) and Exp2(qn)=u=0qn1v=0qn1e2πiuvα=O(qn)Exp2(q_{n})=\sum_{u=0}^{q_{n}-1}\sum_{v=0}^{q_{n}-1}e^{2\pi iuv\alpha}=O(q_{n})

Proof.

$e(x)$ is Lipschitz continuous on the circle with constant $C=2\pi$, since $\left|e(x)-e(y)\right|=2\left|\sin\pi(x-y)\right|\leq 2\pi\left\|x-y\right\|$. For $\alpha$ of constant type we also have from Corollary 120 $S_{N}(\cot\pi x)=O(N)$, and $S_{N}\frac{1}{\{\{x\}\}}=O(N)$. The results follow immediately from Corollary 124 with $M=q_{n}$. ∎

Note that the key ingredient in the result above is that $\left\|M\alpha\right\|$ is sufficiently small, and this restricts the values of $M$. However this does not necessarily mean that the growth rate of $S_{N}(\psi_{M}\phi)$ will be higher for other values of $M$. Indeed Sinai & Ulcigrai wonder if $Exp2(N)=O(N)$ for all $N$ (which is equivalent to $S_{N}(\psi_{N}\phi)=O(N)$), and more generally we wonder if:

Conjecture 126.

Under the conditions of lemma 84, SN(ψMϕ)=O(N)S_{N}(\psi_{M}\phi)=O(N) for any fixed MM, and the constant is independent of MM

It seems unlikely that our methods will answer this question. It seems dependent upon showing that the sign of ψMϕ\psi_{M}\phi is well distributed so as not to destroy the cancellative effect of a sum across an anti-symmetric function.

10. Conclusion

We have developed a general method for estimating the series r=1Nϕ(rα)\sum_{r=1}^{N}\phi(r\alpha) in anergodic cases where ϕ\phi has an unbounded singularity at 0. Although many such series have been studied case by case in the literature, the current paper introduces a single unified approach, which in many cases also leads to improved results.

We outline a number of obvious directions for further study:

  1. (1)

    Extension of the theory to more general circle morphisms which are “sufficiently” close to rotations to benefit from the theory developed in the current paper

  2. (2)

    Extension of the theory to the inhomogeneous case $\sum_{r=1}^{N}\phi(x_{0}+r\alpha)$ with $x_{0}\neq 0$. It may be simpler to start with specific values, eg $x_{0}=\frac{1}{2}$ or $x_{0}$ rational.

  3. (3)

    Extension to higher dimensions. Here the general strategy of separating concerns remains valid, but there is an obstacle to be overcome: the estimation of sums analogous to SqrϕS_{q_{r}}\phi is problematic in the absence of an easy extension of Continued Fraction theory to higher dimensions.

References

  • [1] Heinrich Behnke. Zur theorie der diophantischen approximationen. In Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, volume 3, pages 261–318. Springer, 1924.
  • [2] Victor Beresnevich, Alan Haynes, and Sanju Velani. Sums of reciprocals of fractional parts and multiplicative Diophantine approximation. Mem. Amer. Math. Soc., 263(1276):vii + 77, 2020.
  • [3] P. Erdős and G. Szekeres. On the product $\prod_{k=1}^{n}(1-z^{a_{k}})$. Acad. Serbe Sci. Publ. Inst. Math., 13:29–34, 1959.
  • [4] G. H. Hardy and J. E. Littlewood. Some Problems of Diophantine Approximation: The Lattice-Points of a Right-Angled Triangle. Proc. London Math. Soc. (2), 20(1):15–36, 1921.
  • [5] G. H. Hardy and J. E. Littlewood. Some problems of Diophantine approximation: The lattice-points of a right-angled triangle. (Second memoir.). Abh. Math. Sem. Univ. Hamburg, 1(1):211–248, 1922.
  • [6] G. H. Hardy and E. M. Wright. An Introduction to the Theory of Numbers. Oxford, fourth edition, 1975.
  • [7] E Hecke. Über analytische funktionen und die verteilung von zahlen mod. eins. In Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, volume 1, pages 54–76. Springer, 1922.
  • [8] Serge Lang. Introduction to diophantine approximations. Springer-Verlag, New York, 1995.
  • [9] C. G. Lekkerkerker. Representation of natural numbers as a sum of Fibonacci numbers. Simon Stevin, 29:190–195, 1952.
  • [10] M. Lerch. Question 1547. L’Intermédiaire des Mathématiciens, 11:145–146, 1904.
  • [11] D. S. Lubinsky. The size of $(q;q)_{n}$ for $q$ on the unit circle. J. Number Theory, 76(2):217–247, 1999.
  • [12] Alexander Ostrowski. Bemerkungen zur theorie der diophantischen approximationen. In Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, volume 1, pages 77–98. Springer, 1922.
  • [13] W. Sierpiński. Un théorème sur les nombres irrationnels. Krakau Anz. [2], 725–727, 1909.
  • [14] W. Sierpiński. Sur la valeur asymptotique d’une certaine somme. Krakau Anz. (A), 9–11, 1910.
  • [15] Yakov G. Sinai and Corinna Ulcigrai. Estimates from above of certain double trigonometric sums. Journal of Fixed Point Theory and Applications, 6(1):93–113, 2009.
  • [16] C. Sudler, Jr. An estimate for a restricted partition function. Quart. J. Math. Oxford Ser. (2), 15:1–10, 1964.
  • [17] Paul Verschueren. Quasiperiodic Renormalisation: Quasiperiodic Sums, Products and Composition Sum Operators. Open University, 2016. PhD Thesis https://doi.org/10.21954/ou.ro.0000ef50.
  • [18] Paul Verschueren and Ben Mestel. Growth of the Sudler product of sines at the golden rotation number. J. Math. Anal. Appl., 433(1):200–226, 2016.
  • [19] E. M. Wright. Proof of a conjecture of Sudler’s. Quart. J. Math. Oxford Ser. (2), 15:11–15, 1964.
  • [20] E. Zeckendorf. Représentation des nombres naturels par une somme de nombres de Fibonacci ou de nombres de Lucas. Bull. Soc. Roy. Sci. Liège, 41:179–182, 1972.

Appendix A Context Annotation

This annotation has been developed whilst working on this paper to be a more efficient way of indicating context whilst working in related spaces with related morphisms. The annotation seems widely useful, particularly in higher order categories (ie in situations where we wish to discuss induced functions, or functions of functions). The main benefit is that it provides the flexibility to introduce formal precision when it is needed but without binding ourselves to heavy notational machinery when it is not needed. Of course this places more responsibility on the author to make good stylistic decisions!

Mathematical notation is extremely abbreviated - mathematicians tend to use very short signifiers (usually individual symbols) to signify mathematical objects. Since the pool of signifiers is severely limited (“not enough letters” syndrome), we end up “abusing notation”, ie we will use the same signifier to signify different mathematical objects. The term “abuse” taken too literally is unfair - “reuse of notation” is probably a better term: we are simply reusing scarce notational resources: done well this is a good thing, but done badly it can introduce confusion or even incomprehensibility. Taken not too literally, the term “abuse” is useful as a warning that there is a trade-off to be managed: the economy of notation comes at the expense of ambiguity. Done well, the ambiguity is easily resolved by the reader from the context, but sometimes it is useful or even necessary to make the context explicit.

There are two classical approaches to disambiguation which we compare briefly. Given two groups $(X,+),(Y,+)$, how do we make the meaning of $x+y$ clear? The more common approach is to write “For $x,y\in X$, $x+y$”. A less common approach is to write “$x\,+_{X}\,y$” to mean that $+$ is taken from $(X,+)$, so that the reader can deduce that $x,y$ are required to lie in $X$. The latter approach is less familiar but it is arguably more economical to write and read. It is normally restricted for use with binary operations, but we will here extend and formalise it for all mathematical objects.

Definition 127.

If we wish to record the fact that the mathematical object signified by ‘$x$’ is “contextualised by” another object signified by ‘$X$’ we may do so by annotating the first signifier with the second, for example by writing ‘$x_{X}$’. We call $X$ a context of $x$.

We will deal with the semantics of “contextualised by” in a moment, but first we need to do some unpacking of the definition.

It is crucial to note that this is an annotation, not part of the notation: it is an optional extra which can be used by the author stylistically, and does not need to be rigorously applied.

For example we might write the anonymous function $x_{\mathbb{R}}\mapsto x^{2}$ rather than $x_{\mathbb{R}}\mapsto x_{\mathbb{R}}^{2}$. We may also be flexible in positioning the annotation: given an object $x_{i}$ we could write it for example as $x_{i}^{X}$ or $(x_{i})_{X}$.

In addition, whereas the notations ‘$x$’, ‘$x_{i}$’ generally signify different objects, ‘$x$’ and the annotation ‘$x_{X}$’ generally signify the same object. A key exception is when there is abuse of notation, for example using ‘$x$’ to signify elements of two disjoint sets $X_{1},X_{2}$. Now, writing $x_{X_{1}},x_{X_{2}}$ successfully disambiguates the situation, but if we choose $x=x_{X_{1}}$, then we have chosen $x,x_{X_{2}}$ to be different objects.

The semantics of “contextualised by $X$” needs to be defined for any given context $X$, and $x$ generally will have many contexts which could be used. However a sensible context is often obvious, as is the semantics.

If $X$ is a set, the obvious default semantics is that ‘$x_{X}$’ means $x\in X$, but more specialised signifiers have a different obvious default. For example, in the case of a classical function symbol such as ‘$f$’, the meaning of $f_{X}$ defaults to $f\in X^{X}$, ie $f$ is a function on $X$. Similarly $+_{X}$ will mean $+_{X}\in X^{X^{2}}$, ie $+_{X}$ is a binary operation on $X$: and generally if $\circ$ is an $n_{\mathbb{N}}$-ary operation on $X$, ie $\circ\in X^{X^{n}}$, we will still write $\circ_{X}$.

If $X$ is not a set, we are free to assign $x_{X}$ our meaning of choice, but often it will be obvious. For example if ‘$Set$’, ‘$Grp$’ are our signifiers for the categories of sets, groups respectively, then $G_{Grp}$ means informally that $G$ is a group, or more formally it means $G$ satisfies the predicates of the proper class $ob(Grp)$ (and of course the class $ob(Grp)$ is itself an object in the category $Grp$). We may formally define $G_{Grp}\coloneqq(X_{Set},+_{X})$, but by abuse of notation we might also write $G\coloneqq(G,+)$ when we feel the meaning is clear. We can also define $x\in_{G}\,G_{Grp}$ to mean $x\in G_{Set}$ and then by abuse of notation write simply $x\in G$. This gives us the freedom to introduce formal precision when it is needed, without binding ourselves to heavy notation when it is not needed. At the same time it places more responsibility on the author to make good choices.

Finally we introduce a useful abbreviation. Given objects $X$ and $Y$ of a category, the morphisms from $X$ to $Y$ are normally designated $Hom(X,Y)$, so a morphism $\phi:X\rightarrow Y$ could be annotated $\phi_{Hom(X,Y)}$. We will abbreviate $Hom(X,Y)$ to $XY$ and write $\phi_{XY}$ to mean $\phi:X\rightarrow Y$. Note that writing $\phi_{XY}x$ now tells us that we have $x_{X}$ and $(\phi_{XY}x)_{Y}$. We can also express composition of morphisms by writing $\phi_{YZ}\phi_{XY}=\phi_{XZ}$ (which is of course even more expressive in reverse Polish notation, giving us $\phi_{XY}\phi_{YZ}=\phi_{XZ}$).