
Lower bounds for constant query affine-invariant LCCs and LTCs

Arnab Bhattacharyya
Indian Institute of Science
arnabb@csa.iisc.ernet.in
Research partially supported by a DST Ramanujan Fellowship.
   Sivakanth Gopi
Princeton University
sgopi@cs.princeton.edu
Research partially supported by NSF grants CCF-1523816 and CCF-1217416
Abstract

Affine-invariant codes are codes whose coordinates form a vector space over a finite field and which are invariant under affine transformations of the coordinate space. They form a natural, well-studied class of codes; they include popular codes such as Reed-Muller and Reed-Solomon. A particularly appealing feature of affine-invariant codes is that they seem well-suited to admit local correctors and testers.

In this work, we give lower bounds on the length of locally correctable and locally testable affine-invariant codes with constant query complexity. We show that if a code $\mathcal{C}\subset\Sigma^{\mathbb{K}^{n}}$ is an $r$-query locally correctable code (LCC), where $\mathbb{K}$ is a finite field and $\Sigma$ is a finite alphabet, then the number of codewords in $\mathcal{C}$ is at most $\exp(O_{\mathbb{K},r,|\Sigma|}(n^{r-1}))$. Also, we show that if $\mathcal{C}\subset\Sigma^{\mathbb{K}^{n}}$ is an $r$-query locally testable code (LTC), then the number of codewords in $\mathcal{C}$ is at most $\exp(O_{\mathbb{K},r,|\Sigma|}(n^{r-2}))$. The dependence on $n$ in these bounds is tight for constant-query LCCs/LTCs, since Guo, Kopparty and Sudan (ITCS '13) construct affine-invariant codes via lifting that have the same asymptotic tradeoffs. Note that our result holds for non-linear codes, whereas previously, Ben-Sasson and Sudan (RANDOM '11) assumed linearity to derive similar results.

Our analysis uses higher-order Fourier analysis. In particular, we show that the codewords of an affine-invariant LCC/LTC must be far from each other with respect to the Gowers norm of an appropriate order. This then allows us to bound the number of codewords, using known decomposition theorems which approximate any bounded function in terms of a finite number of low-degree non-classical polynomials, up to a small error in the Gowers norm.

1 Introduction

Error-correcting codes which admit local algorithms are of significant interest in theoretical computer science. A code is called a locally correctable code (LCC) if there is a randomized algorithm that, given an index $i$ and a received word $w$ close to a codeword $c$ in Hamming distance, outputs $c_{i}$ by querying only a few positions of $w$. A code is called a locally testable code (LTC) if there is a randomized algorithm that, given a received word $w$, determines whether $w$ is in the code or whether $w$ is far in Hamming distance from every codeword, based on queries to a small number of locations of $w$. The number of positions of the received word queried is called the query complexity of the LCC or LTC.
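As a concrete illustration of local correction, the following minimal Python sketch implements the classical 2-query corrector for the Hadamard code over $\mathbb{F}_2^n$. The function names, the bitmask encoding of points, and the choice of corrupted positions are assumptions made for illustration and do not come from the paper. Instead of sampling, it enumerates every query point $y$ to compute the exact failure probability of the corrector.

```python
from fractions import Fraction

def hadamard_codeword(a, n):
    """Codeword x -> <a, x> mod 2 on F_2^n; points and a are n-bit integers."""
    return [bin(a & x).count("1") % 2 for x in range(2 ** n)]

def corrector_output(w, x, y):
    """Two queries to the received word w: positions y and x + y (XOR is addition in F_2^n)."""
    return w[y] ^ w[x ^ y]

def failure_probability(w, c, x, n):
    """Exact probability, over uniform y, that the corrector's guess differs from c[x]."""
    bad = sum(corrector_output(w, x, y) != c[x] for y in range(2 ** n))
    return Fraction(bad, 2 ** n)

# Hypothetical example: corrupt 3 of the 32 positions of one codeword.
n = 5
c = hadamard_codeword(0b10110, n)
w = list(c)
for pos in (3, 17, 29):
    w[pos] ^= 1
worst = max(failure_probability(w, c, x, n) for x in range(2 ** n))
```

The corrector can only err when $y$ or $x+y$ lands on a corrupted position, so the failure probability is at most twice the fraction of corrupted positions.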

The notions of local correctability and local testability have a long history in computer science by now. Also called “self-correction”, the idea of local correction originated in works by Lipton [Lip90] and by Blum and Kannan [BK95] on program checkers. LCCs are closely related to locally decodable codes (LDCs), where the goal is to recover a symbol of the underlying message when given a corrupted codeword, using a small number of queries [KT00]. LDCs and LCCs have found applications in private information retrieval schemes [CKGS98, BIW07] and derandomization [STV99]. See [Yek11] for a detailed survey on LDCs and LCCs. Research on LTCs implicitly started with Blum, Luby, and Rubinfeld’s seminal discovery [BLR93] that the Hadamard code is an LTC with query complexity $3$; they were first formally defined by Goldreich and Sudan in [GS06]. LTCs have been used (implicitly and explicitly) in many contexts, most notably in the construction of PCPs [AS98, ALM+98, Din07].

In spite of the wide interest in them, some basic questions about LCCs and LTCs remain unanswered. We restrict ourselves throughout to the setting where the query complexity is a constant (independent of the length of the code) and consider the tradeoff between query complexity and code length. The current best constant-query LCCs have exponential length, while the current best constant-query LTCs have near-linear length but they are quite complicated [BS08, Din07, Mei09, Vid15]. Getting subexponential length LCCs or linear length LTCs with constant query complexity are major open problems in the area.

Intuitively, for LCCs and LTCs with constant query complexity, there must be a lot of redundancy in the code, since every symbol of the codeword must satisfy local constraints with most other symbols in the codeword. A systematic way to generate redundancy is to make sure that the code has a large group of invariances. (Footnote: A quite different way to generate redundancy is through tensoring; see [BS04]. Invariances and tensoring are essentially the only two “generic” reasons known to cause local correctability/testability.) Formally, given a code $\mathcal{C}\subset\Sigma^{N}$ of length $N$ over alphabet $\Sigma$, a codeword $c\in\mathcal{C}$ can be naturally viewed as a function $c:[N]\to\Sigma$. Then, we say that $\mathcal{C}$ is invariant under a set $G\subset\{[N]\to[N]\}$ (here $\{A\to B\}$ and $B^{A}$ both denote the set of all functions from $A$ to $B$) if for every $\pi\in G$ and codeword $c\in\mathcal{C}$, $c\circ\pi$ also describes a codeword $c^{\prime}\in\mathcal{C}$. Now, the key observation is that if, for every codeword $c\in\mathcal{C}$, there is a constraint among $c(i_{1}),\dots,c(i_{k})$ for some $i_{1},\dots,i_{k}\in[N]$, then for every $c\in\mathcal{C}$, there must also be a constraint among $c(\pi(i_{1})),\dots,c(\pi(i_{k}))$ for any $\pi$ in the invariance set $G$, since $c\circ\pi$ is itself another codeword. Hence if $G$ is large, the presence of one local constraint immediately implies the presence of many and suggests the possibility of local algorithms for the code. This connection between invariance and correctability/testability was first explicitly examined by Kaufman and Sudan [KS08]. One is then motivated to understand more clearly the possibilities and limitations of local correctors/testers for codes possessing natural symmetries.

We focus on affine-invariant codes, for which the domain $[N]$ is an $n$-dimensional vector space $\mathbb{K}^{n}$ over a finite field $\mathbb{K}$ and the code $\mathcal{C}\subset\{\mathbb{K}^{n}\to\Sigma\}$ is invariant under affine transformations $A:\mathbb{K}^{n}\to\mathbb{K}^{n}$. Affine invariance is a very natural symmetry for “algebraic codes” and has long been studied in coding theory [KLP67]. The study of affine-invariant LCCs and LTCs was initiated in [KS08] and has been investigated in several follow-up works [BS11, Guo13, BRS12, GSVW15]. The hope is that because affine-invariant codes have a large group of invariance and, at the same time, are conducive to non-trivial algebraic constructions, they may contain a code that improves current constructions of LCCs or LTCs.

The current best parameters for constant-query affine-invariant LCCs and LTCs are achieved by the lifted codes of Guo, Kopparty and Sudan [GKS13]. They construct an affine-invariant code $\mathcal{F}\subset\{\mathbb{F}_{2^{\ell}}^{n}\to\mathbb{F}_{2}\}$ with $\exp(\Theta(n^{r-2}))$ codewords that is an $(r-1)$-query LCC and an $r$-query LTC, where $r=2^{\ell}$. The $\Theta(\cdot)$ notation hides factors that depend on $r$ but not $n$. For LCCs, the same asymptotic tradeoff between query complexity and code length is achieved by the Reed-Muller code. For every $r\geq 2$, the Reed-Muller code of order $r-1$ (i.e., polynomials over $\mathbb{F}_{q}$ on $n$ variables of total degree $\leq r-1$ with $q>r$) is an affine-invariant $r$-query LCC with $\exp(\Theta(n^{r-1}))$ codewords. In fact, even if we drop the affine-invariance requirement, Reed-Muller codes and the construction of [GKS13] achieve the best known codeword length for constant query LCCs. (Footnote: In contrast, there exist non-affine-invariant LTCs of constant query complexity and inverse polylogarithmic rate. This corresponds to an LTC with $\exp(N/\mathrm{polylog}(N))$ codewords, where $N$ is the code length, while the affine-invariant LTC of [GKS13] and Reed-Muller codes have $\exp(\mathrm{polylog}(N))$ codewords for constant query complexity.)

In this work, we show that the parameters for the lifted codes of [GKS13] are, in fact, tight for affine-invariant LCCs/LTCs in $\{\mathbb{K}^{n}\to\Sigma\}$ for any fixed finite field $\mathbb{K}$ and any fixed finite alphabet $\Sigma$.

Theorem 1 (Main Result, informal).

  (i) Let $\mathcal{C}\subset\{\mathbb{K}^{n}\to\Sigma\}$ be an $r$-query affine-invariant LCC. Then $|\mathcal{C}|\leq\exp\left(O_{\mathbb{K},r,|\Sigma|}(n^{r-1})\right)$.

  (ii) Let $\mathcal{C}\subset\{\mathbb{K}^{n}\to\Sigma\}$ be an $r$-query affine-invariant LTC. Then $|\mathcal{C}|\leq\exp\left(O_{\mathbb{K},r,|\Sigma|}(n^{r-2})\right)$.

1.1 Related Work

Ben-Sasson and Sudan [BS11] obtained a result similar to Theorem 1 when the code is assumed to be linear, i.e., when the codewords form a vector space. They showed that if $\mathcal{C}\subset\{\mathbb{K}^{n}\to\mathbb{F}\}$ is an $(r-1)$-query locally correctable or $r$-query locally testable linear, affine-invariant code, where $\mathbb{K}$ and $\mathbb{F}$ are finite fields of characteristic $p>0$ with $\mathbb{K}$ an extension of $\mathbb{F}$, then the dimension of $\mathcal{C}$ as a vector space over $\mathbb{F}$ is at most $(n\log_{p}|\mathbb{K}|)^{r-2}$. When $\mathbb{K}$ is fixed (as in [GKS13]’s construction of constant query LCCs/LTCs), the result of [BS11] is a very special case of our Theorem 1. On the other hand, [BS11]’s result also applies when the size of $\mathbb{K}$ is growing (as long as $\mathbb{K}$ extends $\mathbb{F}$), whereas ours does not.

There are several works which study lower bounds for constant query LCCs [KT00, GKST02, DS07, KdW03, BDYW11, BDSS11, Woo12, DSW14]. For general (non-affine-invariant) LCCs, tight lower bounds are known only for 2-query LCCs. Kerenidis and de Wolf [KdW03] prove that if $\mathcal{C}\subset\{\{0,1\}^{n}\to\Sigma\}$ is a 2-query LCC, then $|\mathcal{C}|\leq\exp(O(n|\Sigma|^{5}))$. (Footnote: Their lower bound also holds for the weaker notion of locally decodable code (LDC).) This is tight for constant $|\Sigma|$ and achieved by the Hadamard code. For $r$-query LCCs where $r>2$, the lower bounds known are much weaker. The best known bounds, due to [KdW03, Woo07], show that if $\mathcal{C}\subset\{\{0,1\}^{n}\to\{0,1\}\}$ is an $r$-query LCC, then

\[
|\mathcal{C}|\leq\exp\left(2^{n/(1+1/(\lceil r/2\rceil+1))+o(n)}\right).
\]

Higher-order Fourier analysis was applied to other problems in coding theory in [BL15b, TW14].

1.2 Proof Overview

Our arguments are based on standard techniques from higher-order Fourier analysis [Tao12], but they are new in this context. We show that if an affine-invariant code is an $r$-query LCC, then its codewords are far from each other in the $U^{r}$-norm, the Gowers norm of order $r$. Similarly, we show that the codewords of an affine-invariant $r$-query LTC are far from each other in the $U^{r-1}$-norm. Therefore, we can upper bound the number of LCC/LTC codewords in terms of the size of a net that is fine enough with respect to the Gowers norm of an appropriate order. We bound the size of such a net by explicitly constructing one using a standard decomposition theorem (analogous to Szemerédi’s regularity lemma): any bounded function $f:\mathbb{K}^{n}\to\mathbb{C}$ can be approximated, up to a small error in the Gowers norm, by a composition of a bounded number of low-degree non-classical polynomials [TZ12].

The way we argue that two codewords $f$ and $g$ of an $r$-query LCC are far in the Gowers norm is that if $\|f-g\|_{U^{r}}<\epsilon$, then for small enough $\epsilon$ (with respect to $r$, $|\Sigma|$ and correctness probability), the local corrector when applied to $f$ can act as if it is applied to $g$. The argument is, briefly, as follows. On the one hand, the codewords $f$ and $g$ must be far in Hamming distance, because the definition of LCC implies that there is a unique codeword close to any string. So, with constant probability over the choice of $y\in\mathbb{K}^{n}$, the local corrector’s guess for $f(y)$ must differ from $g(y)$. On the other hand, we can lower bound by a constant the probability of the event that the corrector outputs $g(y)$ when it queries coordinates of $f$, because $f$ and $g$ are close in the $\|\cdot\|_{U^{r}}$ norm. This last calculation uses the affine invariance of the code and the generalized von Neumann inequality, which bounds by $\|f_{0}\|_{U^{k}}$ the expectation over $z_{1},\dots,z_{m}\in\mathbb{K}^{n}$ of the product $\prod_{i=0}^{k}f_{i}(\mathcal{L}_{i}(z_{1},\dots,z_{m}))$, where the $\mathcal{L}_{i}$’s are arbitrary linear forms such that no two are linearly dependent and $f_{i}:\mathbb{K}^{n}\to\mathbb{C}$ are arbitrary functions with $|f_{i}|\leq 1$.

The argument for $r$-query LTCs is similar. Suppose $f$ and $g$ are close in the $\|\cdot\|_{U^{r-1}}$ norm. Consider the random function $H$ such that for every $x$ independently, $H(x)$ equals $f(x)$ with probability $1/2$ and $g(x)$ with probability $1/2$. $H$ itself is far from a codeword with high probability. But we show that since the local tester accepts $f$, it will also accept $H\circ\ell$ for a random invertible affine map $\ell:\mathbb{K}^{n}\to\mathbb{K}^{n}$ with good probability. This implies that with good probability, $H\circ\ell$ is close to a codeword and, by affine-invariance, $H$ itself is close to a codeword, which gives a contradiction. To draw this conclusion, we again use the generalized von Neumann inequality as well as a hybrid argument.

Organization.

Section 2 contains preliminaries that lay the foundations of our analysis. Section 3 proves the first part of our main result about LCCs, while Section 4 proves the second part about LTCs.

2 Preliminaries

2.1 Error-correcting codes

Let $\mathcal{X}$ be a finite set called the set of coordinates and $\Sigma$ be another finite set called the alphabet. Let $\Sigma^{\mathcal{X}}$ denote the set of all functions from $\mathcal{X}$ to $\Sigma$. A subset $\mathcal{C}\subset\Sigma^{\mathcal{X}}$ is called a code and its elements are called codewords.

Definition 1 (Hamming distance).

Given $f,g\in\Sigma^{\mathcal{X}}$, the normalized Hamming distance between $f$ and $g$ is defined as $\Delta(f,g):=\Pr_{x\in\mathcal{X}}[f(x)\neq g(x)]$ where $x$ is uniformly chosen from $\mathcal{X}$. For a code $\mathcal{C}\subset\Sigma^{\mathcal{X}}$, we define the minimum distance of $\mathcal{C}$ as $\min_{f,g\in\mathcal{C},f\neq g}\Delta(f,g)$.

Let $\blacktriangle_{\Sigma}=\{q:\Sigma\to\mathbb{R}_{\geq 0}:\sum_{i\in\Sigma}q(i)=1\}$ denote the probability simplex on $\Sigma$. We embed $\Sigma$ into $\blacktriangle_{\Sigma}$ by sending $i\in\Sigma$ to $e_{i}$, which is the $i^{th}$ coordinate vector in $\mathbb{R}^{\Sigma}$. This also lets us extend functions $f:\mathcal{X}\to\Sigma$ to $\hat{f}:\mathcal{X}\to\blacktriangle_{\Sigma}$ using the embedding. We call $\hat{f}$ the simplex extension of $f$. Now given $f,g\in\Sigma^{\mathcal{X}}$, we can write the Hamming distance between them as

\[
\Delta(f,g)=1-\Pr_{x\in\mathcal{X}}[f(x)=g(x)]=1-\mathbb{E}_{x\in\mathcal{X}}\langle\hat{f}(x),\hat{g}(x)\rangle
\]

where $\left\langle\cdot,\cdot\right\rangle$ is the standard inner product in $\mathbb{R}^{\Sigma}$.
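The identity between the Hamming distance and the inner product of simplex extensions can be checked on a small example. The sketch below is illustrative only: the helper names and the sample words are assumptions, and exact rational arithmetic is used so that the two expressions can be compared for equality.

```python
from fractions import Fraction

def simplex_extension(f, alphabet):
    """Replace each symbol f(x) by the coordinate vector e_{f(x)} in R^Sigma."""
    index = {s: i for i, s in enumerate(alphabet)}
    return [[1 if index[v] == j else 0 for j in range(len(alphabet))] for v in f]

def hamming(f, g):
    """Normalized Hamming distance Pr_x[f(x) != g(x)]."""
    return Fraction(sum(a != b for a, b in zip(f, g)), len(f))

def hamming_via_inner_product(f, g, alphabet):
    """1 - E_x <f_hat(x), g_hat(x)>, computed from the simplex extensions."""
    fh, gh = simplex_extension(f, alphabet), simplex_extension(g, alphabet)
    total = sum(sum(a * b for a, b in zip(u, v)) for u, v in zip(fh, gh))
    return 1 - Fraction(total, len(f))

# Hypothetical words over the alphabet {a, b, c}, differing in 2 of 6 positions.
f = list("abcabc")
g = list("abbacc")
```

The inner product $\langle e_i, e_j\rangle$ is $1$ exactly when $i=j$, which is why the expectation counts agreements.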

Definition 2 (Affine invariance).

Let $\mathcal{X}$ be a finite dimensional vector space over some finite field $\mathbb{K}$. Then $\mathcal{C}\subset\Sigma^{\mathcal{X}}$ is called affine invariant if for every $f\in\mathcal{C}$ and every invertible affine map $\ell:\mathcal{X}\to\mathcal{X}$, $f\circ\ell\in\mathcal{C}$.
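To make the definition concrete, the following Python sketch checks affine invariance by brute force over $\mathbb{F}_2^3$. It is an illustration under stated assumptions: points are encoded as bitmasks, the helper names are hypothetical, and the two example codes (the order-1 Reed-Muller code and a single-codeword code) were chosen here for illustration and are not taken from the paper.

```python
from itertools import product

def parity(v):
    return bin(v).count("1") % 2

def affine_map(rows, b):
    """The map x -> A x + b on F_2^n; A is given by row bitmasks, b by a bitmask."""
    def ell(x):
        y = 0
        for i, row in enumerate(rows):
            y |= (parity(row & x) ^ ((b >> i) & 1)) << i
        return y
    return ell

def is_affine_invariant(code, n):
    """Check f o ell in code for every codeword f and every invertible affine map ell."""
    points = range(2 ** n)
    for rows in product(points, repeat=n):
        for b in points:
            ell = affine_map(rows, b)
            if len({ell(x) for x in points}) < 2 ** n:
                continue  # A is singular, so ell is not invertible
            if any(tuple(f[ell(x)] for x in points) not in code for f in code):
                return False
    return True

n = 3
# All functions x -> <a, x> + c: polynomials of degree at most 1 over F_2.
rm1 = {tuple(parity(a & x) ^ c for x in range(2 ** n))
       for a in range(2 ** n) for c in (0, 1)}
# A code containing only the function x -> x_1, which is not affine invariant.
single = {tuple(x & 1 for x in range(2 ** n))}
```

Composing a polynomial of degree at most 1 with an affine map gives another such polynomial, so `rm1` passes the check, while `single` fails because some affine map sends $x_1$ to a different function.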

Locally correctable and testable codes are defined formally in Sections 3 and 4 respectively.

2.2 Higher order Fourier analysis

Fix a finite field $\mathbb{F}_{p}$ of prime order $p$, and let $\mathbb{K}=\mathbb{F}_{q}$ where $q=p^{t}$ for a positive integer $t$. $\mathbb{K}$ is then a vector space of dimension $t$ over $\mathbb{F}_{p}$. We denote by $\mathrm{Tr}:\mathbb{K}\to\mathbb{F}_{p}$ the trace function:

\[
\mathrm{Tr}(x)=x+x^{p}+x^{p^{2}}+\cdots+x^{p^{t-1}}.
\]

Also, we use $|\cdot|$ to denote the obvious map from $\mathbb{F}_{p}$ to $\{0,1,\dots,p-1\}$.
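For a small worked instance of the trace, take $p=2$ and $t=2$, so that $\mathbb{K}=\mathbb{F}_4$ and $\mathrm{Tr}(x)=x+x^{2}$. The sketch below assumes the representation $\mathbb{F}_4=\mathbb{F}_2[w]/(w^2+w+1)$ with elements stored as 2-bit integers; this encoding and the function names are choices made for illustration.

```python
def gf4_mul(a, b):
    """Multiply in F_4 = F_2[w]/(w^2 + w + 1); elements are 2-bit integers."""
    prod = 0
    for i in range(2):
        if (b >> i) & 1:
            prod ^= a << i          # carry-less (polynomial) multiplication
    if prod & 0b100:                # reduce using w^2 = w + 1
        prod ^= 0b111
    return prod

def trace(x):
    """Tr(x) = x + x^2 for p = 2, t = 2; addition in F_4 is XOR."""
    return x ^ gf4_mul(x, x)

# Trace of 0, 1, w, w + 1 in that order.
trace_table = [trace(x) for x in range(4)]
```

The table shows that the trace takes values in $\{0,1\}=\mathbb{F}_2$, and it is additive, as expected of an $\mathbb{F}_p$-linear map from $\mathbb{K}$ to $\mathbb{F}_p$.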

Given functions $f,g:\mathbb{K}^{n}\to\mathbb{C}$, we define their inner product as $\left\langle f,g\right\rangle=\mathbb{E}_{x}[\overline{f(x)}g(x)]$ where $x$ is chosen uniformly from $\mathbb{K}^{n}$. We define the $\|\cdot\|_{p}$-norm on such functions as $\|f\|_{p}=\mathbb{E}_{x}[|f(x)|^{p}]^{1/p}$. We say a function $f:\mathbb{K}^{n}\to\mathbb{C}$ is bounded if $|f|\leq 1$. Let $\mathbb{T}$ denote the circle group $\mathbb{R}/\mathbb{Z}$ and $e:\mathbb{T}\to\mathbb{C}$ be the map given by $e(x)=\exp(2\pi ix)$.

Definition 3 (Non-classical Polynomials).

A function $f:\mathbb{K}^{n}\to\mathbb{T}$ is called a non-classical polynomial of degree $<d$ if

\[
\forall h_{1},h_{2},\cdots,h_{d}\in\mathbb{K}^{n}\quad D_{h_{1}}D_{h_{2}}\cdots D_{h_{d}}f=0
\]

where $D_{h}$ is the difference operator defined as $D_{h}f(x)=f(x+h)-f(x)$. For such an $f$, the function $e(f)$ is called a non-classical phase polynomial of degree $<d$.
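A standard example helps to see why these polynomials are called non-classical. The Python sketch below takes the function $f(x_1,x_2)=|x_1|/4+|x_1||x_2|/2 \pmod 1$ on $\mathbb{F}_2^2$ and checks by brute force that all triple derivatives vanish while some double derivative does not, so $f$ has degree $<3$ but not degree $<2$. The example function, the bitmask encoding, and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

def derivative(f, h):
    """Additive derivative D_h f(x) = f(x + h) - f(x), with values in R/Z."""
    return {x: (f[x ^ h] - f[x]) % 1 for x in f}

def vanishes_after(f, d, n):
    """True if D_{h_1} ... D_{h_d} f = 0 for all h_1, ..., h_d in F_2^n."""
    for hs in product(range(2 ** n), repeat=d):
        g = f
        for h in hs:
            g = derivative(g, h)
        if any(v != 0 for v in g.values()):
            return False
    return True

n = 2
# Bit 0 of x is x_1 and bit 1 is x_2; the term |x_1|/4 takes values outside (1/2)Z/Z.
f = {x: (Fraction(x & 1, 4) + Fraction((x & 1) * ((x >> 1) & 1), 2)) % 1
     for x in range(2 ** n)}
# A classical linear polynomial x_1 / 2 for comparison.
linear = {x: Fraction(x & 1, 2) for x in range(2 ** n)}
```

The term $|x_1|/4$ cannot be written as a classical polynomial over $\mathbb{F}_2$ divided by $2$, yet it behaves like a degree-2 object under differencing.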

Let $\alpha_{1},\cdots,\alpha_{t}\in\mathbb{K}$ be a basis for $\mathbb{K}$ when viewed as a vector space over $\mathbb{F}_{p}$. It is known [TZ12, BB15] that non-classical polynomials of degree $\leq d$ are exactly those functions $P:\mathbb{K}^{n}\to\mathbb{T}$ which have the following form:

\[
P(x_{1},\dots,x_{n})=\theta+\sum_{k\geq 0}\ \sum_{\substack{0\leq d_{i,j}<p\ \forall i\in[n],\,j\in[t];\\ 0<\sum_{i=1}^{n}\sum_{j=1}^{t}d_{i,j}\leq d-k(p-1)}}\frac{c_{d_{1,1},\dots,d_{n,t},k}\prod_{i=1}^{n}\prod_{j=1}^{t}|\mathrm{Tr}(\alpha_{j}x_{i})|^{d_{i,j}}}{p^{k+1}}\pmod{1}\qquad(1)
\]

for some $c_{d_{1,1},\dots,d_{n,t},k}\in\{0,1,\cdots,p-1\}$ and $\theta\in\mathbb{T}$. Next, we define the Gowers norm for arbitrary functions $f:\mathbb{K}^{n}\to\mathbb{C}$.

Definition 4 (Gowers uniformity norm [Gow01]).

For a function $f:\mathbb{K}^{n}\to\mathbb{C}$, the Gowers norm of order $r$, denoted by $\|\cdot\|_{U^{r}}$, is defined as

\[
\|f\|_{U^{r}}=\left(\mathbb{E}_{x,h_{1},\cdots,h_{r}\in\mathbb{K}^{n}}[\Delta_{h_{1}}\Delta_{h_{2}}\cdots\Delta_{h_{r}}f(x)]\right)^{1/2^{r}}
\]

where $\Delta_{h}$ is the multiplicative difference operator defined as $\Delta_{h}f(x)=f(x+h)\overline{f(x)}$.
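The definition can be evaluated directly for small $n$. The following Python sketch computes $\|f\|_{U^{r}}$ over $\mathbb{F}_2^n$ by brute force, expanding the $r$ multiplicative derivatives into a product over subsets $S\subseteq[r]$ of the values $f(x+\sum_{j\in S}h_j)$, conjugated when $r-|S|$ is odd. The bitmask encoding, the function names, and the three sample functions are assumptions made for illustration.

```python
import cmath
from itertools import product

def gowers_norm(f, n, r):
    """Brute-force U^r norm of f: F_2^n -> C, given as a list indexed by bitmask."""
    N = 2 ** n
    total = 0
    for x in range(N):
        for hs in product(range(N), repeat=r):
            term = 1
            for mask in range(2 ** r):          # mask encodes a subset S of [r]
                shift = 0
                for j in range(r):
                    if (mask >> j) & 1:
                        shift ^= hs[j]
                val = f[x ^ shift]
                if (r - bin(mask).count("1")) % 2 == 1:
                    val = val.conjugate()       # conjugate when r - |S| is odd
                term *= val
            total += term
    return abs(total / N ** (r + 1)) ** (1 / 2 ** r)

n = 3
# Phase of a linear form, of the quadratic x_1 x_2, and of the non-classical |x_1|/4.
linear = [complex((-1) ** (bin(0b101 & x).count("1") % 2)) for x in range(2 ** n)]
quad = [complex((-1) ** ((x & 1) * ((x >> 1) & 1))) for x in range(2 ** n)]
nonclassical = [cmath.exp(2j * cmath.pi * (x & 1) / 4) for x in range(2 ** n)]
```

On these inputs, the quadratic phase has $U^{3}$ norm $1$ but $U^{2}$ norm $(1/4)^{1/4}\approx 0.707$, and the norms are non-decreasing in $r$.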

The Gowers norm is an actual norm when $r\geq 2$. It also satisfies a useful monotonicity property: for any function $f:\mathbb{K}^{n}\to\mathbb{C}$,

\[
|\mathbb{E}[f(x)]|=\|f\|_{U^{1}}\leq\|f\|_{U^{2}}\leq\cdots\leq\|f\|_{U^{r}}\leq\cdots\leq\|f\|_{\infty}.
\]

See [Tao12] for more on the Gowers norm. Observe that if $f:\mathbb{K}^{n}\to\mathbb{C}$ is a non-classical phase polynomial of degree $<r$ then $\|f\|_{U^{r}}=1$. The inverse Gowers theorem is a partial converse to this. It shows that the Gowers norm of order $r$ of a function is in direct correspondence with its correlation with non-classical phase polynomials of degree $<r$. In particular:

Lemma 1 (Inverse Gowers theorem [TZ12]).

For any bounded $f:\mathbb{K}^{n}\to\mathbb{C}$, if $\|f\|_{U^{r}}>\delta$ then there exists a non-classical polynomial $P$ of degree $<r$ such that

\[
|\left\langle f,e(P)\right\rangle|\geq c(\delta,\mathbb{K},r)
\]

where $c(\delta,\mathbb{K},r)$ is a constant depending only on $\delta,\mathbb{K},r$.

A linear form on $m$ variables is a vector $\mathcal{L}=(w_{1},\cdots,w_{m})\in\mathbb{K}^{m}$ that is interpreted as a function $\mathcal{L}:(\mathbb{K}^{n})^{m}\to\mathbb{K}^{n}$ via the map $(x_{1},\cdots,x_{m})\mapsto\sum_{i=1}^{m}w_{i}x_{i}$. A key reason that the Gowers norm is useful in applications is that if a function has small Gowers norm of the appropriate order, then it behaves pseudorandomly in a certain way with respect to linear forms.

Lemma 2 (Generalized von Neumann inequality (Exercise 1.3.23 in [Tao12])).

Let $f_{0},f_{1},f_{2},\cdots,f_{k}:\mathbb{K}^{n}\to\mathbb{C}$ be bounded functions and let $\mathcal{L}=\{\mathcal{L}_{0},\mathcal{L}_{1},\cdots,\mathcal{L}_{k}\}$ be a system of $k+1$ linear forms in $m$ variables such that no form is a multiple of another. Then

\[
\left|\mathbb{E}_{z_{1},\cdots,z_{m}\in\mathbb{K}^{n}}\left[\prod_{i=0}^{k}f_{i}(\mathcal{L}_{i}(z_{1},\cdots,z_{m}))\right]\right|\leq\min_{0\leq i\leq k}\|f_{i}\|_{U^{k}}.
\]

See Appendix A for proof.
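The smallest non-trivial case of the inequality can be checked numerically. The sketch below takes $k=2$ over $\mathbb{F}_2^3$ with the three forms $(1,0)$, $(0,1)$ and $(1,1)$, none of which is a multiple of another, and compares the average of $f_0(z_1)f_1(z_2)f_2(z_1+z_2)$ with the smallest $U^{2}$ norm. The random $\pm 1$ test functions, the seed, and the helper names are assumptions for illustration; this is a sanity check, not a proof.

```python
import random

def u2_norm(f, n):
    """U^2 norm over F_2^n from the definition, with f a list indexed by bitmask."""
    N = 2 ** n
    total = sum(
        f[x] * f[x ^ h1].conjugate() * f[x ^ h2].conjugate() * f[x ^ h1 ^ h2]
        for x in range(N) for h1 in range(N) for h2 in range(N)
    )
    return abs(total / N ** 3) ** 0.25

def triple_average(f0, f1, f2, n):
    """|E_{z1,z2} f0(z1) f1(z2) f2(z1 + z2)| for the forms (1,0), (0,1), (1,1)."""
    N = 2 ** n
    total = sum(f0[z1] * f1[z2] * f2[z1 ^ z2] for z1 in range(N) for z2 in range(N))
    return abs(total / N ** 2)

rng = random.Random(1)
n = 3
# Three hypothetical bounded functions with values in {-1, +1}.
fs = [[complex(rng.choice((-1, 1))) for _ in range(2 ** n)] for _ in range(3)]
lhs = triple_average(fs[0], fs[1], fs[2], n)
rhs = min(u2_norm(f, n) for f in fs)
```

In this case the left-hand side equals $|\sum_{a}\hat{f_0}(a)\hat{f_1}(a)\hat{f_2}(a)|$ in terms of Fourier coefficients, which is one way to see why a small $U^{2}$ norm forces a small average.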

2.3 A net for Gowers norm

The goal of this section is to establish the following claim.

Theorem 2 (ϵ\epsilon-net for UrU^{r} norm).

The metric induced by the $\|\cdot\|_{U^{r}}$ norm on the space of all bounded functions $\{f:\mathbb{K}^{n}\to\mathbb{C}\}$ has an $\epsilon$-net of size $\exp(O_{\epsilon,\mathbb{K},r}(n^{r-1}))$.

For the proof, we need the following definitions.

Definition 5 (Polynomial factors).

A polynomial factor $\mathcal{B}$ is a sequence of non-classical polynomials $P_{1},\dots,P_{k}:\mathbb{K}^{n}\to\mathbb{T}$. We also identify it with the function $\mathcal{B}:\mathbb{K}^{n}\to\mathbb{T}^{k}$ mapping $x\mapsto(P_{1}(x),\dots,P_{k}(x))$. The partition induced by $\mathcal{B}$ is the partition of $\mathbb{K}^{n}$ given by $\{\mathcal{B}^{-1}(y):y\in\mathbb{T}^{k}\}$. The complexity of $\mathcal{B}$ is the number of defining polynomials, $|\mathcal{B}|=k$. The degree of $\mathcal{B}$ is the maximum degree among its defining polynomials $P_{1},\cdots,P_{k}$. A function $f:\mathbb{K}^{n}\to\mathbb{C}$ is called $\mathcal{B}$-measurable if it is constant in each cell of the partition induced by $\mathcal{B}$ or, equivalently, if $f$ can be written as $\tau(P_{1},\cdots,P_{k})$ for some function $\tau:\mathbb{T}^{k}\to\mathbb{C}$.

Definition 6 (Conditional expectations).

Given a polynomial factor $\mathcal{B}$, the conditional expectation of $f:\mathbb{K}^{n}\to\mathbb{C}$ over $\mathcal{B}$, denoted by $\mathbb{E}[f|\mathcal{B}]$, is the $\mathcal{B}$-measurable function defined by

\[
\mathbb{E}[f|\mathcal{B}](x)=\mathbb{E}_{y\in\mathcal{B}^{-1}\left(\mathcal{B}(x)\right)}[f(y)].
\]

Definition 7 (Factor refinement).

Given two polynomial factors $\mathcal{B},\mathcal{B}^{\prime}$, we say $\mathcal{B}^{\prime}$ is a refinement of $\mathcal{B}$, denoted by $\mathcal{B}^{\prime}\preceq\mathcal{B}$, if every cell in the partition induced by $\mathcal{B}^{\prime}$ is contained in some cell in the partition induced by $\mathcal{B}$.
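Conditional expectations and refinements are easy to experiment with on a small domain. The Python sketch below represents a factor on $\mathbb{F}_2^3$ by the list of cell labels $\mathcal{B}(x)$, computes $\mathbb{E}[f|\mathcal{B}]$ by averaging over cells, and checks the Pythagoras identity of Lemma 3 for a factor and a refinement of it. The specific factors, the test function, and the helper names are hypothetical choices for illustration; exact rationals are used so that equality can be tested.

```python
from fractions import Fraction

def conditional_expectation(f, labels):
    """E[f|B](x): the average of f over the cell of x, where labels[x] = B(x)."""
    sums, counts = {}, {}
    for x, lab in enumerate(labels):
        sums[lab] = sums.get(lab, 0) + f[x]
        counts[lab] = counts.get(lab, 0) + 1
    return [Fraction(sums[lab], counts[lab]) for lab in labels]

def refines(fine, coarse):
    """True if every cell of `fine` lies inside a single cell of `coarse`."""
    seen = {}
    for lf, lc in zip(fine, coarse):
        if seen.setdefault(lf, lc) != lc:
            return False
    return True

def sq_norm(f):
    """Squared 2-norm E_x |f(x)|^2."""
    return sum(abs(v) ** 2 for v in f) / len(f)

points = range(8)
coarse = [x & 1 for x in points]                                   # B  = (x_1)
fine = [(x & 1, ((x >> 1) & 1) * ((x >> 2) & 1)) for x in points]  # B' = (x_1, x_2 x_3)
f = [3, -1, 4, 1, -5, 9, 2, -6]
f_coarse = conditional_expectation(f, coarse)
f_fine = conditional_expectation(f, fine)
```

Adding a polynomial to a factor can only split cells, which is why the factor obtained this way always refines the original one.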

The definition of refinement immediately implies:

Lemma 3 (Pythagoras theorem).

Let $\mathcal{B},\mathcal{B}^{\prime}$ be polynomial factors such that $\mathcal{B}^{\prime}\preceq\mathcal{B}$. Then for any function $f:\mathbb{K}^{n}\to\mathbb{C}$,

\[
\|\mathbb{E}[f|\mathcal{B}^{\prime}]\|_{2}^{2}=\|\mathbb{E}[f|\mathcal{B}]\|_{2}^{2}+\|\mathbb{E}[f|\mathcal{B}^{\prime}]-\mathbb{E}[f|\mathcal{B}]\|_{2}^{2}.
\]

The next claim shows that any bounded function is “close” to being measurable by a polynomial factor of bounded complexity. Precisely:

Lemma 4 (Decomposition Theorem).

Any bounded $f:\mathbb{K}^{n}\to\mathbb{C}$ can be approximated in $\|\cdot\|_{U^{r}}$ by a function of a small number of degree $<r$ non-classical polynomials, i.e., for any $\epsilon>0$, there exist non-classical polynomials $P_{1},P_{2},\cdots,P_{k}$ of degree $<r$ with $P_{i}(\bar{0})=0\ \forall i$ and a bounded function $\tau:\mathbb{T}^{k}\to\mathbb{C}$ such that

\[
\|f-\tau(P_{1},P_{2},\cdots,P_{k})\|_{U^{r}}\leq\epsilon
\]

where $k=k(\epsilon,\mathbb{K},r)$ is a constant depending only on $\epsilon,\mathbb{K},r$.

Proof.

The proof is similar to the proof of the Quadratic Koopman-von Neumann decomposition (Prop 3.7 in [Gre06]), but uses the full Inverse Gowers Theorem (Lemma 1). Similar claims are implicit elsewhere; for completeness, we give the proof.

The main idea is to approximate the function $f$ using its conditional expectation over a suitable polynomial factor $\mathcal{B}$ of degree $<r$. We will start with the trivial factor $\mathcal{B}_{0}=(1)$ and iteratively construct more refined partitions $\mathcal{B}_{i}\preceq\mathcal{B}_{i-1}$ until we find a factor $\mathcal{B}_{k}$ which satisfies $\|f-\mathbb{E}[f|\mathcal{B}_{k}]\|_{U^{r}}\leq\epsilon$. To bound the number of iterations needed to achieve this, we will show that the energy $\|\mathbb{E}[f|\mathcal{B}_{i}]\|_{2}^{2}$, which is bounded above by 1, increases by a fixed constant in every step.

Suppose that after step $i-1$, we still have $\|f-\mathbb{E}[f|\mathcal{B}_{i-1}]\|_{U^{r}}>\epsilon$. Let $g=f-\mathbb{E}[f|\mathcal{B}_{i-1}]$. Then by the inverse Gowers theorem (Lemma 1), there is some non-classical polynomial $P_{i}$ of degree $<r$ such that $|\left\langle g,e(P_{i})\right\rangle|\geq\kappa=c(\epsilon,\mathbb{K},r)$. We can assume that $P_{i}(\bar{0})=0$. Refine the factor $\mathcal{B}_{i-1}$ by adding the polynomial $P_{i}$ to obtain $\mathcal{B}_{i}\preceq\mathcal{B}_{i-1}$. Now consider the energy increment,

\[
\|\mathbb{E}[f|\mathcal{B}_{i}]\|_{2}^{2}-\|\mathbb{E}[f|\mathcal{B}_{i-1}]\|_{2}^{2}=\|\mathbb{E}[f|\mathcal{B}_{i}]-\mathbb{E}[f|\mathcal{B}_{i-1}]\|_{2}^{2}=\|\mathbb{E}[g|\mathcal{B}_{i}]\|_{2}^{2}
\]

where we used the Pythagoras theorem (Lemma 3) and the fact that $\mathbb{E}\big[\mathbb{E}[f|\mathcal{B}_{i-1}]\big|\mathcal{B}_{i}\big]=\mathbb{E}[f|\mathcal{B}_{i-1}]$ since $\mathcal{B}_{i}\preceq\mathcal{B}_{i-1}$. So

\begin{align*}
\kappa^{2}&\leq|\mathbb{E}[g\cdot e(P_{i})]|^{2}=\left|\mathbb{E}\big[\mathbb{E}[g\cdot e(P_{i})|\mathcal{B}_{i}]\big]\right|^{2}=\left|\mathbb{E}\big[e(P_{i})\mathbb{E}[g|\mathcal{B}_{i}]\big]\right|^{2}\\
&\leq\|\mathbb{E}[g|\mathcal{B}_{i}]\|_{1}^{2}\leq\|\mathbb{E}[g|\mathcal{B}_{i}]\|_{2}^{2}=\|\mathbb{E}[f|\mathcal{B}_{i}]\|_{2}^{2}-\|\mathbb{E}[f|\mathcal{B}_{i-1}]\|_{2}^{2}.
\end{align*}

Thus the energy increases by at least $\kappa^{2}$ in every step. But since the energy is bounded above by 1, the process must end after a finite number of steps $k\leq\frac{1}{\kappa^{2}}$. So $\|f-\mathbb{E}[f|\mathcal{B}_{k}]\|_{U^{r}}\leq\epsilon$, and since $\mathbb{E}[f|\mathcal{B}_{k}]$ is $\mathcal{B}_{k}$-measurable, we can write $\mathbb{E}[f|\mathcal{B}_{k}]=\tau(P_{1},\cdots,P_{k})$ for some function $\tau$ with $|\tau|=|\mathbb{E}[f|\mathcal{B}_{k}]|\leq|f|\leq 1$. ∎

We are now ready to prove Theorem 2.

Proof of Theorem 2.

Recall that $\mathbb{K}$ is an extension field of dimension $t$ over a prime field $\mathbb{F}_{p}$. The $\epsilon$-net will be the set $\mathcal{N}$ of all functions of the form $\tau(P_{1},\cdots,P_{k})$ where $P_{1},\cdots,P_{k}$ are degree $<r$ non-classical polynomials with zero constant terms, $\tau:\mathbb{T}^{k}\to\mathbb{C}$ is a bounded function and $k=k(\epsilon,\mathbb{K},r)$ is the constant given by Lemma 4. But we will not include all possible bounded $\tau:\mathbb{T}^{k}\to\mathbb{C}$. Firstly, by Equation 1, $P_{1},\cdots,P_{k}$ take values only in $\frac{1}{p^{r}}\mathbb{Z}/\mathbb{Z}$. Next, we will discretize the set $\{z\in\mathbb{C}:|z|\leq 1\}$ into the $\epsilon$-lattice, i.e., we will only consider maps $\tau:(\frac{1}{p^{r}}\mathbb{Z}/\mathbb{Z})^{k}\to\{z\in\mathbb{C}:|z|\leq 1\}\cap\epsilon(\mathbb{Z}+i\mathbb{Z})$. The number of such maps is bounded by $(4/\epsilon^{2})^{p^{rk}}$.

By Equation 1, a non-classical polynomial of degree $<r$ in $n$ variables with zero constant term can be represented by $\leq\binom{nt+r-1}{r-1}r$ coefficients in $\{0,1,\cdots,p-1\}$. So the number of such non-classical polynomials is bounded by $\exp\left(O_{r,\mathbb{K}}(n^{r-1})\right)$. Combining both the bounds,

\[
|\mathcal{N}|\leq\exp\left(O_{r,\mathbb{K}}(n^{r-1})\right)^{k}\cdot(4/\epsilon^{2})^{p^{rk}}=\exp\left(O_{\epsilon,\mathbb{K},r}(n^{r-1})\right).
\]
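The growth rate $n^{r-1}$ in the coefficient count can be sanity-checked numerically. The sketch below compares the count $r\binom{nt+r-1}{r-1}$ used in the proof with the crude bound $r(t+r-1)^{r-1}n^{r-1}$, which follows from $\binom{N}{j}\leq N^{j}$ and $nt+r-1\leq n(t+r-1)$ for $n\geq 1$. The function names and the crude bound itself are choices made here for illustration; the paper only states the $O_{r,\mathbb{K}}(n^{r-1})$ estimate.

```python
from math import comb

def coefficient_slots(n, t, r):
    """The count r * C(nt + r - 1, r - 1) of coefficients used in the proof."""
    return r * comb(n * t + r - 1, r - 1)

def crude_bound(n, t, r):
    """r * (t + r - 1)^(r-1) * n^(r-1), an explicit O(n^(r-1)) bound valid for n >= 1."""
    return r * (t + r - 1) ** (r - 1) * n ** (r - 1)

# Ratio of the count to n^(r-1) for r = 3, t = 2 and a few values of n.
ratios = [coefficient_slots(n, 2, 3) / n ** 2 for n in (1, 10, 100, 1000)]
```

For fixed $r$ and $t$ the ratio stays bounded as $n$ grows, so the number of polynomials is $p^{O_{r,t}(n^{r-1})}$, in line with the bound stated above.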

We will now prove that $\mathcal{N}$ is a $3\epsilon$-net. Given any $f:\mathbb{K}^{n}\to[-1,1]$, using Lemma 4, there is a function $\tau(P_{1},\cdots,P_{k})$ such that

\[
\|f-\tau(P_{1},P_{2},\cdots,P_{k})\|_{U^{r}}\leq\epsilon.
\]

Let $\tilde{\tau}$ be obtained by rounding the real and imaginary parts of the values of $\tau$ to the nearest multiple of $\epsilon$, so that $\tilde{\tau}(P_{1},\cdots,P_{k})\in\mathcal{N}$. Then we get

\begin{align*}
\|f-\tilde{\tau}(P_{1},P_{2},\cdots,P_{k})\|_{U^{r}}&\leq\|f-\tau(P_{1},P_{2},\cdots,P_{k})\|_{U^{r}}+\|\tau(P_{1},P_{2},\cdots,P_{k})-\tilde{\tau}(P_{1},P_{2},\cdots,P_{k})\|_{U^{r}}\\
&\leq\epsilon+\|\tau(P_{1},P_{2},\cdots,P_{k})-\tilde{\tau}(P_{1},P_{2},\cdots,P_{k})\|_{\infty}\leq 3\epsilon.
\end{align*}

Since $\epsilon>0$ was arbitrary, applying this with $\epsilon/3$ in place of $\epsilon$ gives an $\epsilon$-net of the claimed size.

3 Locally Correctable Codes

We begin by defining locally correctable codes formally. Note that the definition below differs from the conventional one in terms of a local correction algorithm and adversarial errors (see, for instance, [Yek11]); however, our definition is certainly weaker. Therefore, this makes our lower bounds stronger.

Definition 8 (Locally Correctable Code (LCC)).

An $(r,\delta,\tau)$ LCC is a code $\mathcal{C}\subset\Sigma^{\mathcal{X}}$ with the following property:
For each $x\in\mathcal{X}$ there is a distribution $\mathcal{M}_{x}$ over $r$-tuples of distinct coordinates (Footnote: WLOG we can assume the tuples have distinct coordinates by adding dummy coordinates and modifying the decoding functions $\mathcal{D}_{x,y_{1},\cdots,y_{r}}$.) such that whenever $\tilde{f}\in\Sigma^{\mathcal{X}}$ is $\delta$-close to some codeword $f\in\mathcal{C}$ in Hamming distance,

\[
\Pr_{(y_{1},\cdots,y_{r})\sim\mathcal{M}_{x}}[\mathcal{D}_{x,y_{1},\cdots,y_{r}}(\tilde{f}(y_{1}),\tilde{f}(y_{2}),\cdots,\tilde{f}(y_{r}))=f(x)]\geq 1-\tau
\]

where $\mathcal{D}_{x,y_{1},\cdots,y_{r}}:\Sigma^{r}\to\Sigma$, called the decoding operator, depends only on $x,y_{1},\cdots,y_{r}$.
If furthermore $\mathcal{X}$ is a vector space and $\mathcal{C}$ is affine invariant then we call it an affine invariant LCC.

Remark 1.

Let $|\Sigma|=m$; WLOG we can assume that $\Sigma=\{1,2,\cdots,m\}$. Then we can extend functions $f:\mathcal{X}\to\Sigma$ to $\hat{f}:\mathcal{X}\to\blacktriangle_m$. The decoding operators $\mathcal{D}:\Sigma^r\to\Sigma$ can also be extended to $\widehat{\mathcal{D}}:\blacktriangle_m^r\to\blacktriangle_m$ as follows: for $z_1,\cdots,z_r\in\blacktriangle_m$ define

\[
\widehat{\mathcal{D}}(z_1,\cdots,z_r)=\sum_{1\leq\ell_1,\cdots,\ell_r\leq m}e_{\mathcal{D}(\ell_1,\cdots,\ell_r)}\,(z_1)_{\ell_1}\cdots(z_r)_{\ell_r}
\]

where $e_j$ stands for the $j^{th}$ coordinate vector in $\mathbb{R}^m$ and $(z_j)_\ell$ is the $\ell^{th}$ coordinate of the vector $z_j$. Now we can rewrite the decoding condition as:

\[
\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}_x}\left[\left\langle\hat{f}(x),\widehat{\mathcal{D}}_{x,y_1,\cdots,y_r}(\hat{f}(y_1),\hat{f}(y_2),\cdots,\hat{f}(y_r))\right\rangle\right]\geq 1-\tau.
\]
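To make the extension in Remark 1 concrete, the sketch below builds the multilinear extension of a decoder and checks that, on one-hot inputs, the inner product with the one-hot encoding of a symbol is exactly the indicator that the decoder outputs that symbol. The toy decoder `D`, the alphabet size, and the helper names are assumptions chosen only for illustration.

```python
import itertools

def simplex(symbol, m):
    """One-hot vector e_symbol in R^m for a symbol in {1, ..., m}."""
    return [1.0 if j == symbol else 0.0 for j in range(1, m + 1)]

def extend_decoder(D, m, r):
    """Multilinear extension of D: {1..m}^r -> {1..m} to r points of the simplex."""
    def D_hat(*zs):
        out = [0.0] * m
        for ls in itertools.product(range(1, m + 1), repeat=r):
            weight = 1.0
            for z, l in zip(zs, ls):
                weight *= z[l - 1]          # (z_1)_{l_1} ... (z_r)_{l_r}
            out[D(*ls) - 1] += weight       # add weight * e_{D(l_1..l_r)}
        return out
    return D_hat

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

m, r = 3, 2
D = lambda a, b: (a + b) % m + 1            # hypothetical toy decoder
D_hat = extend_decoder(D, m, r)

# On one-hot inputs the inner product is the indicator [D(a, b) == s].
print(inner(simplex(2, m), D_hat(simplex(1, m), simplex(3, m))))
```

On general points of the simplex, the output of `D_hat` is again a probability vector, which is what allows the decoding condition to be written as an expectation of an inner product.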

First, we make the observation that any LCC must have good minimum distance.

Lemma 5.

Let $\mathcal{C}\subset\Sigma^{\mathcal{X}}$ be an $(r,\delta,\tau)$ LCC with $\tau<1/2$. Then the minimum distance of $\mathcal{C}$ is at least $2\delta$.

Proof.

Let $f,g\in\mathcal{C}$ be two distinct codewords such that $\Delta(f,g)<2\delta$. Let $h$ be the midpoint of $f$ and $g$, i.e., $h$ is $\delta$-close to both $f$ and $g$. Let $x\in\mathcal{X}$ be such that $f(x)\neq g(x)$. By the LCC property,

\[
\begin{aligned}
\Pr_{(y_1,\cdots,y_r)\sim\mathcal{M}_x}\left[f(x)=\mathcal{D}_{x,y_1,\cdots,y_r}(h(y_1),\cdots,h(y_r))\right]&\geq 1-\tau, \\
\Pr_{(y_1,\cdots,y_r)\sim\mathcal{M}_x}\left[g(x)=\mathcal{D}_{x,y_1,\cdots,y_r}(h(y_1),\cdots,h(y_r))\right]&\geq 1-\tau.
\end{aligned}
\]

Since $f(x)\neq g(x)$, the two events are disjoint, so this is a contradiction when $\tau<\frac{1}{2}$. Therefore every two codewords must be at least $2\delta$ apart. ∎
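The midpoint word $h$ in the proof of Lemma 5 can be built by alternating between $f$ and $g$ on the coordinates where they disagree. The sketch below is one possible construction under that assumption (the paper does not specify one); the function names and the sample words are illustrative.

```python
def hamming_fraction(u, v):
    """Fraction of coordinates on which the words u and v differ."""
    return sum(a != b for a, b in zip(u, v)) / len(u)

def midpoint(f, g):
    """Word that copies f on every other disagreement position and g elsewhere."""
    h = list(g)
    take_f = True
    for i, (a, b) in enumerate(zip(f, g)):
        if a != b:
            if take_f:
                h[i] = a
            take_f = not take_f
    return h

f = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
g = [0, 1, 0, 1, 1, 0, 1, 0, 1, 0]
h = midpoint(f, g)
print(hamming_fraction(f, g), hamming_fraction(h, f), hamming_fraction(h, g))
```

If $f$ and $g$ disagree on $d$ coordinates, this $h$ disagrees with each of them on at most $\lceil d/2\rceil$ coordinates.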

Now, we are ready to prove our main result of this section.

Theorem 3 (Lower bound for LCCs).

Let $\mathcal{C}\subset\Sigma^{\mathbb{K}^n}$ be an $(r,\delta,\tau)$ affine-invariant LCC where $\tau<\frac{2\delta}{3}$. Then $|\mathcal{C}|\leq\exp\left(O_{\delta,\mathbb{K},r,|\Sigma|}(n^{r-1})\right)$.

Proof.

Let $|\Sigma|=m$. Let $\mathcal{N}$ be an $\epsilon/2$-net for the space of all bounded functions $\{f:\mathbb{K}^n\to\mathbb{C}\}$ with the metric induced by the $\|\cdot\|_{U^r}$-norm, where $\epsilon=\frac{2\delta}{3m^r}$. Given a bounded $f:\mathbb{K}^n\to\mathbb{C}$, define

\[
\phi(f):=\mathrm{argmin}_{h\in\mathcal{N}}\|f-h\|_{U^r}
\]

(break ties arbitrarily). Since $\mathcal{N}$ is an $\epsilon/2$-net, we have $\|f-\phi(f)\|_{U^r}\leq\epsilon/2$. Define $\Psi:\mathcal{C}\to\mathcal{N}^m$ as

\[
\Psi(f):=(\phi(\hat{f}_1),\cdots,\phi(\hat{f}_m))
\]

where $\hat{f}_i:\mathbb{K}^n\to\mathbb{R}_{\geq 0}$ is the $i^{th}$ coordinate function of the simplex extension $\hat{f}:\mathbb{K}^n\to\blacktriangle_m$ of $f$. We claim that $\Psi$ is one-to-one, which implies that $|\mathcal{C}|\leq|\mathcal{N}|^m$; the required bound then follows from Theorem 2. Suppose that $\Psi$ is not one-to-one. Let $f,g\in\mathcal{C}$ be two distinct codewords such that $\Psi(f)=\Psi(g)$. This implies that

\[
\forall\ i\in[m]\quad \|\hat{f}_i-\hat{g}_i\|_{U^r}\leq\|\hat{f}_i-\phi(\hat{f}_i)\|_{U^r}+\|\hat{g}_i-\phi(\hat{g}_i)\|_{U^r}\leq\epsilon.
\]

By affine invariance of $\mathcal{C}$, $f\circ\ell\in\mathcal{C}$ for all invertible affine maps $\ell:\mathbb{K}^n\to\mathbb{K}^n$. So by the local correction property,

\[
\Pr_{\ell,y_0,(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\left[f\circ\ell(y_0)=\mathcal{D}_{y_0,y_1,\cdots,y_r}(f\circ\ell(y_1),\cdots,f\circ\ell(y_r))\right]\geq 1-\tau
\]

where $\ell$ ranges uniformly over all invertible affine maps from $\mathbb{K}^n\to\mathbb{K}^n$ and $y_0$ ranges uniformly over $\mathbb{K}^n$. Now consider the following difference:

\[
\begin{aligned}
&\Pr_{\ell,y_0,(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\left[f\circ\ell(y_0)=\mathcal{D}_{y_0,y_1,\cdots,y_r}(f\circ\ell(y_1),\cdots,f\circ\ell(y_r))\right] \\
&\qquad-\Pr_{\ell,y_0,(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\left[g\circ\ell(y_0)=\mathcal{D}_{y_0,y_1,\cdots,y_r}(f\circ\ell(y_1),\cdots,f\circ\ell(y_r))\right] \\
&=\mathbb{E}_{\ell}\mathbb{E}_{y_0}\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\Big[\left\langle\hat{f}\circ\ell(y_0),\widehat{\mathcal{D}}_{y_0,y_1,\cdots,y_r}(\hat{f}\circ\ell(y_1),\cdots,\hat{f}\circ\ell(y_r))\right\rangle \\
&\qquad\qquad-\left\langle\hat{g}\circ\ell(y_0),\widehat{\mathcal{D}}_{y_0,y_1,\cdots,y_r}(\hat{f}\circ\ell(y_1),\cdots,\hat{f}\circ\ell(y_r))\right\rangle\Big] \\
&=\mathbb{E}_{y_0}\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\left[\mathbb{E}_{\ell}\left[\left\langle\hat{f}\circ\ell(y_0)-\hat{g}\circ\ell(y_0),\widehat{\mathcal{D}}_{y_0,y_1,\cdots,y_r}(\hat{f}\circ\ell(y_1),\cdots,\hat{f}\circ\ell(y_r))\right\rangle\right]\right].
\end{aligned}
\]

Now we fix $y_0,y_1,\cdots,y_r$ and show that the inner expectation is small for each tuple $(y_0,y_1,\cdots,y_r)$. Let us denote $\mathcal{D}=\mathcal{D}_{y_0,y_1,\cdots,y_r}$ for brevity. Let $t=\mathrm{rank}(y_0,y_1,\cdots,y_r)$, so there exist linearly independent vectors $v_1,\cdots,v_t\in\mathbb{K}^n$ such that for every $0\leq i\leq r$, $y_i=\sum_{j=1}^{t}\lambda_{ij}v_j$ for some fixed $\lambda_{ij}\in\mathbb{K}$. The action of a random invertible affine map $\ell$ can be approximated by sampling $z_0,z_1,\cdots,z_t\in\mathbb{K}^n$ uniformly and mapping $y_i\mapsto z_0+\sum_{j=1}^{t}\lambda_{ij}z_j$, since with probability $1-o_n(1)$, $z_1,\cdots,z_t$ will be linearly independent. Therefore,

\[
\begin{aligned}
&\mathbb{E}_{\ell}\left[\left\langle\hat{f}\circ\ell(y_0)-\hat{g}\circ\ell(y_0),\widehat{\mathcal{D}}_{y_0,y_1,\cdots,y_r}(\hat{f}\circ\ell(y_1),\cdots,\hat{f}\circ\ell(y_r))\right\rangle\right] \\
&=_{o_n(1)}\mathbb{E}_{z_0,z_1,\cdots,z_t\in\mathbb{K}^n}\left[\left\langle(\hat{f}-\hat{g})\Big(z_0+\sum_{j=1}^{t}\lambda_{0j}z_j\Big),\widehat{\mathcal{D}}\Big(\hat{f}\Big(z_0+\sum_{j=1}^{t}\lambda_{1j}z_j\Big),\cdots,\hat{f}\Big(z_0+\sum_{j=1}^{t}\lambda_{rj}z_j\Big)\Big)\right\rangle\right] \\
&=\mathbb{E}_{z_0,z_1,\cdots,z_t\in\mathbb{K}^n}\left[\left\langle(\hat{f}-\hat{g})\Big(z_0+\sum_{j=1}^{t}\lambda_{0j}z_j\Big),\sum_{1\leq\ell_1,\cdots,\ell_r\leq m}e_{\mathcal{D}(\ell_1,\cdots,\ell_r)}\prod_{i=1}^{r}\hat{f}_{\ell_i}\Big(z_0+\sum_{j=1}^{t}\lambda_{ij}z_j\Big)\right\rangle\right] \\
&=\mathbb{E}_{z_0,z_1,\cdots,z_t\in\mathbb{K}^n}\left[\sum_{1\leq\ell_1,\cdots,\ell_r\leq m}(\hat{f}-\hat{g})_{\mathcal{D}(\ell_1,\cdots,\ell_r)}\Big(z_0+\sum_{j=1}^{t}\lambda_{0j}z_j\Big)\cdot\prod_{i=1}^{r}\hat{f}_{\ell_i}\Big(z_0+\sum_{j=1}^{t}\lambda_{ij}z_j\Big)\right] \\
&\leq\sum_{1\leq\ell_1,\cdots,\ell_r\leq m}\|(\hat{f}-\hat{g})_{\mathcal{D}(\ell_1,\cdots,\ell_r)}\|_{U^r}\leq m^r\epsilon
\end{aligned}
\]

where the first inequality is obtained by applying the generalized von Neumann inequality (Lemma 2) to each term. Therefore

\[
\begin{aligned}
&\Pr_{\ell,y_0,(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\left[g\circ\ell(y_0)=\mathcal{D}_{y_0,y_1,\cdots,y_r}(f\circ\ell(y_1),\cdots,f\circ\ell(y_r))\right] \\
&\geq\Pr_{\ell,y_0,(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\left[f\circ\ell(y_0)=\mathcal{D}_{y_0,y_1,\cdots,y_r}(f\circ\ell(y_1),\cdots,f\circ\ell(y_r))\right]-m^r\epsilon\geq 1-\tau-2\delta/3.
\end{aligned}
\]

On the other hand,

\[
\begin{aligned}
&\Pr_{\ell,y_0,(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\left[g\circ\ell(y_0)=\mathcal{D}_{y_0,y_1,\cdots,y_r}(f\circ\ell(y_1),\cdots,f\circ\ell(y_r))\right] \\
&\leq\Pr_{\ell,y_0,(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\left[g\circ\ell(y_0)=f\circ\ell(y_0)\right]+\Pr_{\ell,y_0,(y_1,\cdots,y_r)\sim\mathcal{M}_{y_0}}\left[f\circ\ell(y_0)\neq\mathcal{D}_{y_0,y_1,\cdots,y_r}(f\circ\ell(y_1),\cdots,f\circ\ell(y_r))\right] \\
&\leq\Pr_{x}[f(x)=g(x)]+\tau\leq 1-2\delta+\tau\qquad\text{(by Lemma 5)}.
\end{aligned}
\]

Combining the two bounds gives $1-\tau-2\delta/3\leq 1-2\delta+\tau$, i.e., $\tau\geq\frac{2\delta}{3}$. This is a contradiction when $\tau<\frac{2\delta}{3}$. ∎
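The proof above replaces a random invertible affine map by uniformly sampled points, using the fact that $t$ uniform vectors in $\mathbb{K}^n$ are linearly independent with probability $1-o_n(1)$. As a rough sketch, this probability has the standard closed form $\prod_{i=0}^{t-1}(1-q^{i-n})$ for a field with $q$ elements; the function name `prob_independent` and the parameter values below are illustrative assumptions.

```python
def prob_independent(q: int, n: int, t: int) -> float:
    """Probability that t uniform vectors in F_q^n are linearly independent.

    The i-th vector (0-based) must avoid the span of the previous i vectors,
    which contains q^i of the q^n points.
    """
    p = 1.0
    for i in range(t):
        p *= 1.0 - q ** (i - n)
    return p

# For fixed q and t, the failure probability decays like q^(t - n).
for n in (5, 10, 20, 40):
    print(n, 1.0 - prob_independent(2, n, 4))
```

Since $t\leq r+1$ is a constant, the failure probability is at most $q^{t-n}$, which is the $o_n(1)$ error absorbed by the $=_{o_n(1)}$ step.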

4 Locally Testable Codes

We start by defining locally testable codes in a formulation convenient for our use.

Definition 9 (Locally Testable Code (LTC)).

An $(r,\delta,\tau)$ LTC is a code $\mathcal{C}\subset\Sigma^{\mathcal{X}}$ with minimum distance at least $\delta$ and the following property:
There is a distribution $\mathcal{M}$ over $r$-tuples of distinct coordinates (WLOG we can assume the tuples have distinct coordinates by adding dummy coordinates and modifying the testing operators $\mathcal{D}_{y_1,\cdots,y_r}$) such that for each codeword $f\in\mathcal{C}$,

\[
\Pr_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathcal{D}_{y_1,\cdots,y_r}(f(y_1),f(y_2),\cdots,f(y_r))=1\right]\geq 3/4
\]

and for every $g\in\Sigma^{\mathcal{X}}$ which is $\tau$-far from every codeword,

\[
\Pr_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathcal{D}_{y_1,\cdots,y_r}(g(y_1),g(y_2),\cdots,g(y_r))=1\right]\leq 1/4
\]

where $\mathcal{D}_{y_1,\cdots,y_r}:\Sigma^r\to\{0,1\}$, called the testing operator, depends only on $y_1,\cdots,y_r$.
If furthermore $\mathcal{X}$ is a vector space and $\mathcal{C}$ is affine invariant, then we call it an affine-invariant LTC.

Remark 2.

Let $|\Sigma|=m$; WLOG we can assume that $\Sigma=\{1,2,\cdots,m\}$. We can extend $f:\mathcal{X}\to\Sigma$ to $\hat{f}:\mathcal{X}\to\blacktriangle_m$. The testing operator $\mathcal{D}:\Sigma^r\to\{0,1\}$ can also be extended to $\widehat{\mathcal{D}}:\blacktriangle_m^r\to[0,1]$ as follows: for $z_1,\cdots,z_r\in\blacktriangle_m$ define

\[
\widehat{\mathcal{D}}(z_1,\cdots,z_r)=\sum_{1\leq\ell_1,\cdots,\ell_r\leq m}\mathcal{D}(\ell_1,\cdots,\ell_r)\,(z_1)_{\ell_1}\cdots(z_r)_{\ell_r}. \qquad (2)
\]

Now we can rewrite the probability in terms of expectation as:

\[
\Pr_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathcal{D}_{y_1,\cdots,y_r}(f(y_1),\cdots,f(y_r))=1\right]=\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\widehat{\mathcal{D}}_{y_1,\cdots,y_r}(\hat{f}(y_1),\cdots,\hat{f}(y_r))\right].
\]

We are now ready to prove the main result of this section.

Theorem 4 (Lower bound for LTCs).

Let $\mathcal{C}\subset\Sigma^{\mathbb{K}^n}$ be an $(r,\delta,\delta/3)$ affine-invariant LTC. Then $|\mathcal{C}|\leq\exp\left(O_{\delta,\mathbb{K},r,|\Sigma|}(n^{r-2})\right)$.

Proof.

Let $|\Sigma|=m$. The proof is very similar to that of Theorem 3. Let $\mathcal{N}$ be an $\epsilon/2$-net for the space of all bounded functions $\{f:\mathbb{K}^n\to\mathbb{C}\}$ with the metric induced by the $\|\cdot\|_{U^{r-1}}$-norm, where $\epsilon=\frac{1}{2rm^r}$. Define $\Psi:\mathcal{C}\to\mathcal{N}^m$ as in the proof of Theorem 3; it is enough to show that $\Psi$ is one-to-one. Suppose that $\Psi$ is not one-to-one. Then there exist distinct $f,g\in\mathcal{C}$ such that $\Psi(f)=\Psi(g)$. This implies that

\[
\forall\ i\in[m]\quad \|\hat{f}_i-\hat{g}_i\|_{U^{r-1}}\leq\epsilon.
\]

By affine invariance of $\mathcal{C}$, $f\circ\ell\in\mathcal{C}$ for all invertible affine maps $\ell:\mathbb{K}^n\to\mathbb{K}^n$. So

\[
\mathbb{E}_{\ell}\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathcal{D}_{y_1,\cdots,y_r}(f\circ\ell(y_1),f\circ\ell(y_2),\cdots,f\circ\ell(y_r))\right]\geq 3/4
\]

where $\ell$ ranges over all invertible affine maps from $\mathbb{K}^n\to\mathbb{K}^n$. Let $H\in\Sigma^{\mathcal{X}}$ be a random word where for each coordinate $x\in\mathcal{X}$ independently,

\[
H(x)=\begin{cases}f(x)&\text{w.p. } 1/2\\ g(x)&\text{w.p. } 1/2\end{cases}
\]

Define $\hat{h}:\mathcal{X}\to\blacktriangle_m$ as $\hat{h}(x)=\mathbb{E}_H[\widehat{H}(x)]=\frac{\hat{f}(x)+\hat{g}(x)}{2}$, where $\hat{f},\hat{g}$ are the simplex extensions of the original $f,g$. So $\forall\ i\in[m]$, $\|\hat{f}_i-\hat{h}_i\|_{U^{r-1}}=\|\hat{f}_i-\hat{g}_i\|_{U^{r-1}}/2\leq\epsilon/2$. We will now show that the test accepts $H\circ\ell$ with good probability when $\ell$ is a random invertible affine map from $\mathbb{K}^n\to\mathbb{K}^n$.

\[
\begin{aligned}
&\mathbb{E}_H\mathbb{E}_{\ell}\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathcal{D}_{y_1,\cdots,y_r}(f\circ\ell(y_1),\cdots,f\circ\ell(y_r))-\mathcal{D}_{y_1,\cdots,y_r}(H\circ\ell(y_1),\cdots,H\circ\ell(y_r))\right] \\
&=\mathbb{E}_H\mathbb{E}_{\ell}\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\widehat{\mathcal{D}}_{y_1,\cdots,y_r}(\hat{f}\circ\ell(y_1),\cdots,\hat{f}\circ\ell(y_r))-\widehat{\mathcal{D}}_{y_1,\cdots,y_r}(\widehat{H}\circ\ell(y_1),\cdots,\widehat{H}\circ\ell(y_r))\right] \\
&=\mathbb{E}_{\ell}\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\widehat{\mathcal{D}}_{y_1,\cdots,y_r}(\hat{f}\circ\ell(y_1),\cdots,\hat{f}\circ\ell(y_r))-\widehat{\mathcal{D}}_{y_1,\cdots,y_r}(\hat{h}\circ\ell(y_1),\cdots,\hat{h}\circ\ell(y_r))\right] \\
&\qquad\text{(by using the multilinear expansion of $\widehat{\mathcal{D}}_{y_1,\cdots,y_r}$ (Equation 2) and taking expectation over $H$)} \\
&=\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathbb{E}_{\ell}\left[\widehat{\mathcal{D}}_{y_1,\cdots,y_r}(\hat{f}\circ\ell(y_1),\cdots,\hat{f}\circ\ell(y_r))-\widehat{\mathcal{D}}_{y_1,\cdots,y_r}(\hat{h}\circ\ell(y_1),\cdots,\hat{h}\circ\ell(y_r))\right]\right]
\end{aligned}
\]

Now we fix $y_1,\cdots,y_r$ and show that the inner expectation is small for each tuple $(y_1,\cdots,y_r)$. Let us denote $\mathcal{D}=\mathcal{D}_{y_1,\cdots,y_r}$ for brevity. Let $t=\mathrm{rank}(y_1,\cdots,y_r)$, so there exist linearly independent vectors $v_1,\cdots,v_t\in\mathbb{K}^n$ such that for every $1\leq i\leq r$, $y_i=\sum_{j=1}^{t}\lambda_{ij}v_j$ for some fixed $\lambda_{ij}\in\mathbb{K}$. The action of a random invertible affine map $\ell$ can be approximated by sampling $z_0,z_1,\cdots,z_t\in\mathbb{K}^n$ uniformly and mapping $y_i\mapsto z_0+\sum_{j=1}^{t}\lambda_{ij}z_j$, since with probability $1-o_n(1)$, $z_1,\cdots,z_t$ will be linearly independent. Therefore,

\[
\begin{aligned}
&\mathbb{E}_{\ell}\left[\widehat{\mathcal{D}}_{y_1,\cdots,y_r}(\hat{f}\circ\ell(y_1),\cdots,\hat{f}\circ\ell(y_r))-\widehat{\mathcal{D}}_{y_1,\cdots,y_r}(\hat{h}\circ\ell(y_1),\cdots,\hat{h}\circ\ell(y_r))\right] \\
&=_{o_n(1)}\mathbb{E}_{z_0,\cdots,z_t\in\mathbb{K}^n}\left[\widehat{\mathcal{D}}\Big(\hat{f}\Big(z_0+\sum_{j=1}^{t}\lambda_{1j}z_j\Big),\cdots,\hat{f}\Big(z_0+\sum_{j=1}^{t}\lambda_{rj}z_j\Big)\Big)-\widehat{\mathcal{D}}\Big(\hat{h}\Big(z_0+\sum_{j=1}^{t}\lambda_{1j}z_j\Big),\cdots,\hat{h}\Big(z_0+\sum_{j=1}^{t}\lambda_{rj}z_j\Big)\Big)\right] \\
&=\mathbb{E}_{z_0,z_1,\cdots,z_t\in\mathbb{K}^n}\left[\sum_{1\leq\ell_1,\cdots,\ell_r\leq m}\mathcal{D}(\ell_1,\cdots,\ell_r)\left(\prod_{i=1}^{r}\hat{f}_{\ell_i}\Big(z_0+\sum_{j=1}^{t}\lambda_{ij}z_j\Big)-\prod_{i=1}^{r}\hat{h}_{\ell_i}\Big(z_0+\sum_{j=1}^{t}\lambda_{ij}z_j\Big)\right)\right] \\
&\leq r\cdot m^r\cdot\frac{\epsilon}{2}=\frac{1}{4}
\end{aligned}
\]

where the last line is obtained by forming hybrids, i.e., writing

\[
\hat{f}_{\ell_1}\cdot\hat{f}_{\ell_2}\cdots\hat{f}_{\ell_r}-\hat{h}_{\ell_1}\cdot\hat{h}_{\ell_2}\cdots\hat{h}_{\ell_r}=(\hat{f}_{\ell_1}-\hat{h}_{\ell_1})\cdot\hat{f}_{\ell_2}\cdots\hat{f}_{\ell_r}+\hat{h}_{\ell_1}\cdot(\hat{f}_{\ell_2}-\hat{h}_{\ell_2})\cdots\hat{f}_{\ell_r}+\cdots+\hat{h}_{\ell_1}\cdot\hat{h}_{\ell_2}\cdots(\hat{f}_{\ell_r}-\hat{h}_{\ell_r})
\]

and using Lemma 2 for each term. Therefore

\[
\begin{aligned}
&\mathbb{E}_H\mathbb{E}_{\ell}\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathcal{D}_{y_1,\cdots,y_r}(H\circ\ell(y_1),\cdots,H\circ\ell(y_r))\right] \\
&\geq\mathbb{E}_{\ell}\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathcal{D}_{y_1,\cdots,y_r}(f\circ\ell(y_1),\cdots,f\circ\ell(y_r))\right]-\frac{1}{4}\geq\frac{3}{4}-\frac{1}{4}=\frac{1}{2}.
\end{aligned}
\]
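The hybrid decomposition used in this proof is a telescoping identity: the difference of two $r$-fold products is a sum of $r$ terms, each containing exactly one factor of the form $\hat{f}_{\ell_i}-\hat{h}_{\ell_i}$. The following sketch verifies the identity numerically on arbitrary values; the helper names and the random inputs are illustrative assumptions.

```python
import random

def hybrid_terms(f_vals, h_vals):
    """Terms h_1 ... h_{i-1} (f_i - h_i) f_{i+1} ... f_r of the telescoping sum."""
    r = len(f_vals)
    terms = []
    for i in range(r):
        term = f_vals[i] - h_vals[i]
        for j in range(i):
            term *= h_vals[j]
        for j in range(i + 1, r):
            term *= f_vals[j]
        terms.append(term)
    return terms

def product(vals):
    out = 1.0
    for v in vals:
        out *= v
    return out

random.seed(1)
f_vals = [random.random() for _ in range(5)]
h_vals = [random.random() for _ in range(5)]
lhs = product(f_vals) - product(h_vals)
rhs = sum(hybrid_terms(f_vals, h_vals))
print(abs(lhs - rhs) < 1e-12)
```

Each of the $r$ hybrid terms is bounded by $\epsilon/2$ via Lemma 2, and there are at most $m^r$ index tuples, which gives the factor $r\cdot m^r\cdot\frac{\epsilon}{2}$.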

By Markov's inequality,

\[
\begin{aligned}
\frac{1}{4}&\leq\Pr_H\left[\mathbb{E}_{\ell}\mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathcal{D}_{y_1,\cdots,y_r}(H\circ\ell(y_1),\cdots,H\circ\ell(y_r))\right]\geq\frac{1}{3}\right] \\
&\leq\Pr_H\left[\exists\ell\ \ \mathbb{E}_{(y_1,\cdots,y_r)\sim\mathcal{M}}\left[\mathcal{D}_{y_1,\cdots,y_r}(H\circ\ell(y_1),\cdots,H\circ\ell(y_r))\right]\geq\frac{1}{3}\right] \\
&\leq\Pr_H\left[\exists\ell\ \ \Delta(H\circ\ell,\mathcal{C})\leq\frac{\delta}{3}\right]\qquad\text{(by the soundness of the tester)} \\
&=\Pr_H\left[\Delta(H,\mathcal{C})\leq\frac{\delta}{3}\right]\qquad\text{(since $\ell$ is invertible and $\mathcal{C}$ is affine invariant)}
\end{aligned}
\]
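The constant $\frac{1}{4}$ in the first inequality comes from applying Markov's inequality to $1-X$, where $X\in[0,1]$ is the acceptance probability with $\mathbb{E}[X]\geq\frac{1}{2}$: this gives $\Pr[X\geq t]\geq\frac{\mathbb{E}[X]-t}{1-t}$. The sketch below evaluates this bound with exact rational arithmetic; the function names are illustrative assumptions.

```python
from fractions import Fraction

def reverse_markov_bound(mean, threshold):
    """Lower bound on Pr[X >= threshold] for X in [0, 1] with E[X] >= mean.

    Follows from Markov's inequality applied to the non-negative variable 1 - X.
    """
    return (mean - threshold) / (1 - threshold)

def tail_probability(dist, threshold):
    """Pr[X >= threshold] for a finite distribution given as {value: probability}."""
    return sum(p for v, p in dist.items() if v >= threshold)

bound = reverse_markov_bound(Fraction(1, 2), Fraction(1, 3))
print(bound)

# Sanity check on a toy distribution with mean exactly 1/2.
toy = {Fraction(0): Fraction(1, 2), Fraction(1): Fraction(1, 2)}
print(tail_probability(toy, Fraction(1, 3)) >= bound)
```

With mean $\frac{1}{2}$ and threshold $\frac{1}{3}$ the bound is $\frac{1/6}{2/3}=\frac{1}{4}$, matching the display above.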

Let $\mathcal{H}=\mathrm{Supp}(H)$ be the set of words between $f$ and $g$, i.e., the set of all words $e\in\Sigma^{\mathbb{K}^n}$ such that $e(x)=f(x)$ or $e(x)=g(x)$ for all $x\in\mathbb{K}^n$. We have $|\mathcal{H}|=2^{\Delta(f,g)n}$. Since the distribution of $H$ is uniform on $\mathcal{H}$, we have shown that at least a $\frac{1}{4}$ fraction of the words in $\mathcal{H}$ contain a codeword in their $\delta/3$-neighborhood; let $\mathcal{H}'\subset\mathcal{H}$ denote this subset. The $\delta/6$-neighborhoods around the points in $\mathcal{H}'$ must therefore be disjoint, or else two distinct codewords would be less than $\delta$ apart. The number of words in $\mathcal{H}$ which lie in a Hamming ball of radius $\delta/6$ around a point of $\mathcal{H}'$ is

\[
\sum_{i=0}^{\delta n/6}\binom{\Delta(f,g)n}{i}\geq 2^{H\left(\frac{\delta}{6\Delta(f,g)}\right)\Delta(f,g)n-o(n)}\geq 2^{H(\delta/6)\Delta(f,g)n-o(n)}
\]

where $H(\cdot)$ is the binary entropy function. By a packing argument, we can upper bound the size of $\mathcal{H}'$ as

\[
|\mathcal{H}'|\leq\frac{2^{\Delta(f,g)n}}{2^{H(\delta/6)\Delta(f,g)n-o(n)}}=o(|\mathcal{H}|).
\]

This contradicts the fact that $|\mathcal{H}'|\geq|\mathcal{H}|/4$. ∎
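The packing argument relies on the standard entropy estimate for the volume of a Hamming ball: for $k\leq M/2$, the number of binary words of length $M$ with at most $k$ ones lies between $2^{H(k/M)M}/(M+1)$ and $2^{H(k/M)M}$. The sketch below checks these two bounds for one choice of parameters; the function names and the values of `M` and `k` are illustrative assumptions and are not taken from the paper.

```python
import math

def binary_entropy(a: float) -> float:
    """Binary entropy H(a) in bits, with H(0) = H(1) = 0."""
    if a in (0.0, 1.0):
        return 0.0
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)

def log2_ball_volume(M: int, k: int) -> float:
    """log2 of the number of binary words of length M with at most k ones."""
    return math.log2(sum(math.comb(M, i) for i in range(k + 1)))

M, k = 600, 100                      # radius fraction k / M = 1/6
upper = binary_entropy(k / M) * M    # exponent of the entropy upper bound
vol = log2_ball_volume(M, k)
print(upper - math.log2(M + 1) <= vol <= upper)
```

The gap between the two bounds is only logarithmic in $M$, which is the source of the $o(n)$ term in the exponent.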

5 Concluding Remarks

In this work, we proved tight lower bounds for constant-query affine-invariant LCCs and LTCs when the number of queries $r$, the underlying field $\mathbb{K}$ and the alphabet $\Sigma$ are constant. However, the constants in the bounds we obtain are of Ackermann-type in $r,|\mathbb{K}|,|\Sigma|$ because of the use of higher-order Fourier analysis. Improving the dependence on these parameters is an open problem which might require new ideas. In a recent work, Bhowmick and Lovett [BL15a] obtain a “bias implies low rank” theorem for polynomials over growing fields. This might be a first step towards proving a variant of the inverse Gowers theorem (Lemma 1) for growing field size, which could then be used to extend our lower bounds to the case of growing field size.

We also remark that our lower bounds work for any LCC or LTC where the queries are obtained as fixed linear combinations of uniformly chosen points from $\mathbb{K}^n$. Affine-invariant codes are a natural class of local codes where this is true. Relaxing these conditions to get lower bounds for a more general class of LCCs or LTCs is an open problem.

Acknowledgements

We thank Madhu Sudan for helpful pointers to previous work. The second author would like to thank his advisor, Zeev Dvir, for his guidance and encouragement.

References

  • [ALM+98] Sanjeev Arora, Carsten Lund, Rajeev Motwani, Madhu Sudan, and Mario Szegedy. Proof verification and the hardness of approximation problems. Journal of the ACM (JACM), 45(3):501–555, 1998.
  • [AS98] Sanjeev Arora and Shmuel Safra. Probabilistic checking of proofs: A new characterization of NP. Journal of the ACM (JACM), 45(1):70–122, 1998.
  • [BB15] Arnab Bhattacharyya and Abhishek Bhowmick. Using higher-order Fourier analysis over general fields. arXiv preprint arXiv:1505.00619, 2015.
  • [BDSS11] Arnab Bhattacharyya, Zeev Dvir, Amir Shpilka, and Shubhangi Saraf. Tight lower bounds for 2-query LCCs over finite fields. In Foundations of Computer Science (FOCS), 2011 IEEE 52nd Annual Symposium on, pages 638–647. IEEE, 2011.
  • [BDYW11] Boaz Barak, Zeev Dvir, Amir Yehudayoff, and Avi Wigderson. Rank bounds for design matrices with applications to combinatorial geometry and locally correctable codes. In Proceedings of the forty-third annual ACM symposium on Theory of computing, pages 519–528. ACM, 2011.
  • [BIW07] Omer Barkol, Yuval Ishai, and Enav Weinreb. On locally decodable codes, self-correctable codes, and t-private PIR. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 311–325. Springer, 2007.
  • [BK95] Manuel Blum and Sampath Kannan. Designing programs that check their work. Journal of the ACM (JACM), 42(1):269–291, 1995.
  • [BL15a] Abhishek Bhowmick and Shachar Lovett. Bias vs structure of polynomials in large fields, and applications in effective algebraic geometry and coding theory. CoRR, abs/1506.02047, 2015.
  • [BL15b] Abhishek Bhowmick and Shachar Lovett. The list decoding radius of Reed-Muller codes over small fields. In Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 277–285, 2015.
  • [BLR93] Manuel Blum, Michael Luby, and Ronitt Rubinfeld. Self-testing/correcting with applications to numerical problems. Journal of Computer and System Sciences, 47(3):549–595, 1993.
  • [BRS12] Eli Ben-Sasson, Noga Ron-Zewi, and Madhu Sudan. Sparse affine-invariant linear codes are locally testable. In Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on, pages 561–570. IEEE, 2012.
  • [BS04] Eli Ben-Sasson and Madhu Sudan. Robust locally testable codes and products of codes. In Approximation, Randomization and Combinatorial Optimization. Algorithms and Techniques, pages 286–297, 2004.
  • [BS08] Eli Ben-Sasson and Madhu Sudan. Short PCPs with polylog query complexity. SIAM Journal on Computing, 38(2):551–607, 2008.
  • [BS11] Eli Ben-Sasson and Madhu Sudan. Limits on the rate of locally testable affine-invariant codes. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 412–423. Springer, 2011.
  • [CKGS98] Benny Chor, Eyal Kushilevitz, Oded Goldreich, and Madhu Sudan. Private information retrieval. Journal of the ACM (JACM), 45(6):965–981, 1998.
  • [Din07] Irit Dinur. The PCP theorem by gap amplification. Journal of the ACM (JACM), 54(3):12, 2007.
  • [DS07] Zeev Dvir and Amir Shpilka. Locally decodable codes with two queries and polynomial identity testing for depth 3 circuits. SIAM Journal on Computing, 36(5):1404–1434, 2007.
  • [DSW14] Zeev Dvir, Shubhangi Saraf, and Avi Wigderson. Breaking the quadratic barrier for 3-LCC’s over the reals. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing, pages 784–793. ACM, 2014.
  • [GKS13] Alan Guo, Swastik Kopparty, and Madhu Sudan. New affine-invariant codes from lifting. In Proceedings of the 4th conference on Innovations in Theoretical Computer Science, pages 529–540. ACM, 2013.
  • [GKST02] Oded Goldreich, Howard Karloff, Leonard J Schulman, and Luca Trevisan. Lower bounds for linear locally decodable codes and private information retrieval. In Computational Complexity, 2002. Proceedings. 17th IEEE Annual Conference on, pages 143–151. IEEE, 2002.
  • [Gow01] William T Gowers. A new proof of Szemerédi’s theorem. Geometric and Functional Analysis, 11(3):465–588, 2001.
  • [Gre06] Ben Green. Montreal lecture notes on quadratic Fourier analysis. arXiv preprint math/0604089, 2006.
  • [GS06] Oded Goldreich and Madhu Sudan. Locally testable codes and PCPs of almost-linear length. Journal of the ACM (JACM), 53(4):558–655, 2006.
  • [GSVW15] Venkatesan Guruswami, Madhu Sudan, Ameya Velingker, and Carol Wang. Limitations on testable affine-invariant codes in the high-rate regime. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1312–1325. SIAM, 2015.
  • [Guo13] Alan Xinyu Guo. Some closure features of locally testable affine-invariant properties. PhD thesis, Massachusetts Institute of Technology, 2013.
  • [KdW03] Iordanis Kerenidis and Ronald de Wolf. Exponential lower bound for 2-query locally decodable codes via a quantum argument. In Proceedings of the thirty-fifth annual ACM symposium on Theory of computing, pages 106–115. ACM, 2003.
  • [KLP67] T. Kasami, S. Lin, and W.W. Peterson. Some results on cyclic codes which are invariant under the affine group and their applications. Information and Control, 11(5–6):475–496, 1967.
  • [KS08] Tali Kaufman and Madhu Sudan. Algebraic property testing: the role of invariance. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 403–412. ACM, 2008.
  • [KT00] Jonathan Katz and Luca Trevisan. On the efficiency of local decoding procedures for error-correcting codes. In Proceedings of the thirty-second annual ACM symposium on Theory of computing, pages 80–86. ACM, 2000.
  • [Lip90] Richard J Lipton. Efficient checking of computations. In STACS 90, pages 207–215. Springer, 1990.
  • [Mei09] Or Meir. Combinatorial construction of locally testable codes. SIAM J. Comput., 39(2):491–544, 2009.
  • [STV99] Madhu Sudan, Luca Trevisan, and Salil Vadhan. Pseudorandom generators without the XOR lemma. In Proceedings of the thirty-first annual ACM symposium on Theory of computing, pages 537–546. ACM, 1999.
  • [Tao12] Terence Tao. Higher order Fourier analysis, volume 142. American Mathematical Soc., 2012.
  • [TW14] Madhur Tulsiani and Julia Wolf. Quadratic Goldreich-Levin theorems. SIAM Journal on Computing, 43(2):730–766, 2014.
  • [TZ12] Terence Tao and Tamar Ziegler. The inverse conjecture for the Gowers norm over finite fields in low characteristic. Annals of Combinatorics, 16(1):121–188, 2012.
  • [Vid15] Michael Viderman. Explicit strong LTCs with inverse poly-log rate and constant soundness. Electronic Colloquium on Computational Complexity (ECCC), 22:20, 2015.
  • [Woo07] David Woodruff. New lower bounds for general locally decodable codes. In Electronic Colloquium on Computational Complexity (ECCC), volume 14, 2007.
  • [Woo12] David P Woodruff. A quadratic lower bound for three-query linear locally decodable codes over any field. Journal of Computer Science and Technology, 27(4):678–686, 2012.
  • [Yek11] Sergey Yekhanin. Locally decodable codes. In Computer Science–Theory and Applications, pages 289–290. Springer, 2011.

Appendix A Proof of generalized von Neumann inequality (Lemma 2)

Since the lemma is not stated in the form we want in [Tao12], we will include a proof here for completeness. To prove Lemma 2, we need the following lemma first.

Lemma 6 (Exercise 1.3.22 in [Tao12]).

Let $f:\mathbb{K}^n\to\mathbb{C}$ be a function, and for each $1\leq i\leq k$, let $g_i:(\mathbb{K}^n)^k\to\mathbb{C}$ be a bounded function which is independent of the $i^{th}$ coordinate of $(\mathbb{K}^n)^k$. Then,

\[
\left|\mathbb{E}_{x_1,\cdots,x_k\in\mathbb{K}^n}\left[f(x_1+x_2+\cdots+x_k)\prod_{i=1}^{k}g_i(x_1,\cdots,x_k)\right]\right|\leq\|f\|_{U^k}.
\]

Proof.

The proof is by induction on kk and using Cauchy-Schwarz inequality repeatedly. The case k=1k=1 is true by definition of U1\|\cdot\|_{U^{1}}.

For $k\geq 2$, assume the claim holds for $k-1$. Then

\begin{align*}
&\left|{\mathbb{E}}_{x_{1},\cdots,x_{k}\in{\mathbb{K}}^{n}}\left[f(x_{1}+x_{2}+\cdots+x_{k})\prod_{i=1}^{k}g_{i}(x_{1},\cdots,x_{k})\right]\right| \\
&=\left|{\mathbb{E}}_{x_{2},\cdots,x_{k}}\left[g_{1}(x_{1},\cdots,x_{k})\,{\mathbb{E}}_{x_{1}}\left[f(x_{1}+x_{2}+\cdots+x_{k})\prod_{i=2}^{k}g_{i}(x_{1},\cdots,x_{k})\right]\right]\right| && \text{(since $g_{1}$ does not depend on $x_{1}$)} \\
&\leq\left|{\mathbb{E}}_{x_{2},\cdots,x_{k}}\left[{\mathbb{E}}_{x_{1}^{\prime}}\left[f(x_{1}^{\prime}+x_{2}+\cdots+x_{k})\prod_{i=2}^{k}g_{i}(x_{1}^{\prime},x_{2},\cdots,x_{k})\right]{\mathbb{E}}_{x_{1}}\left[\bar{f}(x_{1}+x_{2}+\cdots+x_{k})\prod_{i=2}^{k}\bar{g}_{i}(x_{1},x_{2},\cdots,x_{k})\right]\right]\right|^{1/2} && \text{(by the Cauchy-Schwarz inequality and the fact that $|g_{1}|\leq 1$)} \\
&=\left|{\mathbb{E}}_{x_{1},h_{1}}\left[{\mathbb{E}}_{x_{2},\cdots,x_{k}}\left[\Delta_{h_{1}}f(x_{1}+x_{2}+\cdots+x_{k})\prod_{i=2}^{k}g_{i}(x_{1}+h_{1},x_{2},\cdots,x_{k})\,\bar{g}_{i}(x_{1},x_{2},\cdots,x_{k})\right]\right]\right|^{1/2} && \text{(by substituting $x_{1}^{\prime}=x_{1}+h_{1}$)} \\
&\leq\left|{\mathbb{E}}_{x_{1},h_{1}}\left[{\mathbb{E}}_{h_{2},\cdots,h_{k},z}\left[\Delta_{h_{k}}\cdots\Delta_{h_{1}}f(x_{1}+z)\right]^{1/2^{k-1}}\right]\right|^{1/2} && \text{(by the induction hypothesis and the definition of the Gowers norm)} \\
&\leq\left|{\mathbb{E}}_{x_{1},h_{1},h_{2},\cdots,h_{k},z}\left[\Delta_{h_{k}}\cdots\Delta_{h_{1}}f(x_{1}+z)\right]\right|^{1/2^{k}} && \text{(by Jensen's inequality)} \\
&=\left|{\mathbb{E}}_{h_{1},h_{2},\cdots,h_{k},z}\left[\Delta_{h_{k}}\cdots\Delta_{h_{1}}f(z)\right]\right|^{1/2^{k}}=\|f\|_{U^{k}}.
\end{align*}
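To see the induction at work, the case $k=2$ can be written out in full. This is a sketch following the same steps as the proof of Lemma 6. It assumes $|g_{1}|,|g_{2}|\leq 1$ and $\Delta_{h}f(x)=f(x+h)\overline{f(x)}$. Here $g_{1}$ depends only on $x_{2}$ and $g_{2}$ only on $x_{1}$.

```latex
\begin{align*}
&\left|{\mathbb{E}}_{x_{1},x_{2}}\left[f(x_{1}+x_{2})\,g_{1}(x_{2})\,g_{2}(x_{1})\right]\right| \\
% Cauchy-Schwarz in x_2, using |g_1| <= 1
&\leq\left({\mathbb{E}}_{x_{2}}\left|{\mathbb{E}}_{x_{1}}\left[f(x_{1}+x_{2})\,g_{2}(x_{1})\right]\right|^{2}\right)^{1/2} \\
% expand the square with x_1' = x_1 + h_1
&=\left({\mathbb{E}}_{x_{1},h_{1}}\left[g_{2}(x_{1}+h_{1})\,\overline{g_{2}(x_{1})}\;{\mathbb{E}}_{x_{2}}\left[\Delta_{h_{1}}f(x_{1}+x_{2})\right]\right]\right)^{1/2} \\
% |g_2| <= 1; this is the k = 1 case applied for fixed x_1, h_1
&\leq\left({\mathbb{E}}_{x_{1},h_{1}}\left|{\mathbb{E}}_{z}\left[\Delta_{h_{1}}f(x_{1}+z)\right]\right|\right)^{1/2} \\
% shift z by x_1
&=\left({\mathbb{E}}_{h_{1}}\left|{\mathbb{E}}_{z}\left[\Delta_{h_{1}}f(z)\right]\right|\right)^{1/2} \\
% Jensen's inequality (equivalently, Cauchy-Schwarz in h_1)
&\leq\left({\mathbb{E}}_{h_{1}}\left|{\mathbb{E}}_{z}\left[\Delta_{h_{1}}f(z)\right]\right|^{2}\right)^{1/4} \\
% |E_z F(z)|^2 = E_{z,h_2} F(z+h_2) \bar{F}(z) with F = \Delta_{h_1} f
&=\left({\mathbb{E}}_{h_{1},h_{2},z}\left[\Delta_{h_{2}}\Delta_{h_{1}}f(z)\right]\right)^{1/4}=\|f\|_{U^{2}}.
\end{align*}
```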

Proof of Lemma 2.

By symmetry, it is enough to show that

$$\left|{\mathbb{E}}_{z_{1},\cdots,z_{m}\in{\mathbb{K}}^{n}}\left[f_{0}({\mathcal{L}}_{0}(z_{1},\cdots,z_{m}))\prod_{i=1}^{k}f_{i}({\mathcal{L}}_{i}(z_{1},\cdots,z_{m}))\right]\right|\leq\|f_{0}\|_{U^{k}}.$$

We will make a linear change of variables so that we can apply Lemma 6 to get the required bound. For each $1\leq i\leq k$, since ${\mathcal{L}}_{0}$ is not a multiple of ${\mathcal{L}}_{i}$, there exists a vector $v_{i}\in{\mathbb{K}}^{m}$ such that ${\mathcal{L}}_{0}(v_{i})=1$ and ${\mathcal{L}}_{i}(v_{i})=0$. Now we make the following change of variables: $(z_{1},\cdots,z_{m})\rightarrow(x_{1},\cdots,x_{m})+\sum_{i=1}^{k}y_{i}v_{i}^{T}$, where $x_{1},\cdots,x_{m}$ and $y_{1},\cdots,y_{k}$ are the new variables, which range over ${\mathbb{K}}^{n}$.

\begin{align*}
&\left|{\mathbb{E}}_{z_{1},\cdots,z_{m}\in{\mathbb{K}}^{n}}\left[f_{0}({\mathcal{L}}_{0}(z_{1},\cdots,z_{m}))\prod_{i=1}^{k}f_{i}({\mathcal{L}}_{i}(z_{1},\cdots,z_{m}))\right]\right| \\
&=\left|{\mathbb{E}}_{x_{1},\cdots,x_{m},y_{1},\cdots,y_{k}\in{\mathbb{K}}^{n}}\left[f_{0}\left({\mathcal{L}}_{0}(x_{1},\cdots,x_{m})+\sum_{j\in[k]}y_{j}\right)\prod_{i\in[k]}f_{i}\left({\mathcal{L}}_{i}(x_{1},\cdots,x_{m})+\sum_{j\in[k]\setminus\{i\}}y_{j}{\mathcal{L}}_{i}(v_{j})\right)\right]\right| && \text{(by the change of variables and linearity of ${\mathcal{L}}_{i}$)} \\
&\leq{\mathbb{E}}_{x_{1},\cdots,x_{m}\in{\mathbb{K}}^{n}}\left[\left|{\mathbb{E}}_{y_{1},\cdots,y_{k}\in{\mathbb{K}}^{n}}\left[f_{0}\left({\mathcal{L}}_{0}(x_{1},\cdots,x_{m})+\sum_{j\in[k]}y_{j}\right)\prod_{i\in[k]}f_{i}\left({\mathcal{L}}_{i}(x_{1},\cdots,x_{m})+\sum_{j\in[k]\setminus\{i\}}y_{j}{\mathcal{L}}_{i}(v_{j})\right)\right]\right|\right] \\
&\leq\|f_{0}\|_{U^{k}} && \text{(by Lemma 6)}
\end{align*}

In the last step, Lemma 6 is applied for each fixed $x_{1},\cdots,x_{m}$ to the function $y\mapsto f_{0}({\mathcal{L}}_{0}(x_{1},\cdots,x_{m})+y)$, whose $U^{k}$ norm equals $\|f_{0}\|_{U^{k}}$ because the Gowers norm is invariant under translation. The factor involving $f_{i}$ does not depend on $y_{i}$, as the lemma requires.
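As an illustration of the change of variables, consider a small hypothetical system that is not taken from the paper: ${\mathbb{K}}={\mathbb{F}}_{3}$, $m=k=2$, with ${\mathcal{L}}_{0}(z_{1},z_{2})=z_{1}+z_{2}$, ${\mathcal{L}}_{1}(z_{1},z_{2})=z_{1}$ and ${\mathcal{L}}_{2}(z_{1},z_{2})=z_{1}-z_{2}$. One valid choice of vectors is $v_{1}=(0,1)$ and $v_{2}=(2,2)$. All scalar arithmetic below is modulo $3$.

```latex
\begin{align*}
% the vectors satisfy L_0(v_i) = 1 and L_i(v_i) = 0
{\mathcal{L}}_{0}(v_{1}) &= 0+1 = 1, & {\mathcal{L}}_{1}(v_{1}) &= 0, \\
{\mathcal{L}}_{0}(v_{2}) &= 2+2 = 1, & {\mathcal{L}}_{2}(v_{2}) &= 2-2 = 0, \\
% change of variables z -> x + y_1 v_1 + y_2 v_2
(z_{1},z_{2}) &= (x_{1}+2y_{2},\; x_{2}+y_{1}+2y_{2}), \\
% resulting arguments of f_0, f_1, f_2 (using 4 = 1 in F_3)
{\mathcal{L}}_{0}(z_{1},z_{2}) &= x_{1}+x_{2}+y_{1}+y_{2}, \\
{\mathcal{L}}_{1}(z_{1},z_{2}) &= x_{1}+2y_{2}, \\
{\mathcal{L}}_{2}(z_{1},z_{2}) &= x_{1}-x_{2}-y_{1}.
\end{align*}
```

The argument of $f_{1}$ is free of $y_{1}$ and the argument of $f_{2}$ is free of $y_{2}$, which is exactly the independence that Lemma 6 requires, while $f_{0}$ is evaluated at a shift of $y_{1}+y_{2}$.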