
A new point of NP-hardness for 2-to-1 Label Cover

Per Austrin Department of Computer Science, University of Toronto. Funded by NSERC.    Ryan O’Donnell Department of Computer Science, Carnegie Mellon University. Supported by NSF grants CCF-0747250 and CCF-0915893, and by a Sloan fellowship.    John Wright Department of Computer Science, Carnegie Mellon University.
Abstract

We show that given a satisfiable instance of the 2-to-1 Label Cover problem, it is $\mathsf{NP}$-hard to find a $(\frac{23}{24}+\epsilon)$-satisfying assignment.

1 Introduction

Over the past decade, a significant amount of progress has been made in the field of hardness of approximation via results based on the conjectured hardness of certain forms of the Label Cover problem. The Unique Games Conjecture (UGC) of Khot [Kho02] states that it is $\mathsf{NP}$-hard to distinguish between nearly satisfiable and almost completely unsatisfiable instances of Unique, or 1-to-1, Label Cover. Using the UGC as a starting point, we now have optimal inapproximability results for Vertex Cover [KR03], Max-Cut [KKMO07], and many other basic constraint satisfaction problems (CSPs). Indeed, assuming the UGC we have essentially optimal inapproximability results for all CSPs [Rag08]. In short, modulo the understanding of Unique Label Cover itself, we have an excellent understanding of the (in-)approximability of a wide range of problems.

Where the UGC’s explanatory powers falter is in pinning down the approximability of satisfiable CSPs. By this we mean the task of finding a good assignment to a CSP when guaranteed that the CSP is fully satisfiable. For example, we know from the work of Håstad [Hås01] that given a fully satisfiable 3Sat instance, it is $\mathsf{NP}$-hard to satisfy a $\frac{7}{8}+\epsilon$ fraction of the clauses for any $\epsilon>0$. However, given a fully satisfiable 1-to-1 Label Cover instance, it is completely trivial to find a fully satisfying assignment. Thus the UGC cannot be used as the starting point for hardness results for satisfiable CSPs. Because of this, Khot additionally posed his $d$-to-1 Conjectures:

Conjecture 1.1 ([Kho02]).

For every integer $d\geq 2$ and every $\epsilon>0$, there is a label set size $q$ such that it is $\mathsf{NP}$-hard to $(1,\epsilon)$-decide the $d$-to-1 Label Cover problem.

Here by $(c,s)$-deciding a CSP we mean the task of determining whether an instance is at least $c$-satisfiable or less than $s$-satisfiable. It is well known (from the Parallel Repetition Theorem [FK94, Raz95]) that the conjecture is true if $d$ is allowed to depend on $\epsilon$. The strength of this conjecture, therefore, is that it is stated for each fixed $d$ greater than 1.

The $d$-to-1 Conjectures have been used to resolve the approximability of several basic “satisfiable CSP” problems. The first result along these lines was due to Dinur, Mossel, and Regev [DMR09], who showed that the 2-to-1 Conjecture implies that it is $\mathsf{NP}$-hard to $C$-color a 4-colorable graph for any constant $C$. (They also showed hardness for 3-colorable graphs via another Unique Games variant.) O’Donnell and Wu [OW09] showed that the $d$-to-1 Conjecture for any fixed $d$ implies that it is $\mathsf{NP}$-hard to $(1,\frac{5}{8}+\epsilon)$-approximate instances of a certain 3-bit CSP — the “Not Two” predicate. This is an optimal result among all 3-bit predicates, since Zwick [Zwi98] showed that every satisfiable 3-bit CSP instance can be efficiently $\frac{5}{8}$-approximated. In another example, Guruswami and Sinop [GS09] have shown that the 2-to-1 Conjecture implies that given a $q$-colorable graph, it is $\mathsf{NP}$-hard to find a $q$-coloring in which less than a $\frac{1}{q}-O(\frac{\ln q}{q^{2}})$ fraction of the edges are monochromatic. This result would be tight up to the $O(\cdot)$ by an algorithm of Frieze and Jerrum [FJ97]. It is therefore clear that settling the $d$-to-1 Conjectures, especially in the most basic case of $d=2$, is an important open problem.

Regarding the hardness of the 2-to-1 Label Cover problem, the only evidence we have is a family of integrality gaps for the canonical SDP relaxation of the problem, in [GKO+10]. Regarding algorithms for the problem, an important recent line of work beginning in [ABS10] (see also [BRS11, GS11, Ste10]) has sought subexponential-time algorithms for Unique Label Cover and related problems. In particular, Steurer [Ste10] has shown that for any constant $\beta>0$ and label set size, there is an $\exp(O(n^{\beta}))$-time algorithm which, given a satisfiable 2-to-1 Label Cover instance, finds an assignment satisfying an $\exp(-O(1/\beta^{2}))$-fraction of the constraints. E.g., there is a $2^{O(n^{.001})}$-time algorithm which $(1,s_{0})$-approximates 2-to-1 Label Cover, where $s_{0}>0$ is a certain universal constant.

In light of this, it is interesting not only to seek $\mathsf{NP}$-hardness results for certain approximation thresholds, but to additionally seek evidence that nearly full exponential time is required for these thresholds. This can be done by assuming the Exponential Time Hypothesis (ETH) [IP01] and by reducing from the Moshkovitz–Raz Theorem [MR10], which shows a near-linear size reduction from 3Sat to the standard Label Cover problem with subconstant soundness. In this work, we show reductions from 3Sat to the problem of $(1,s+\epsilon)$-approximating several CSPs, for certain values of $s$ and for all $\epsilon>0$. In fact, though we omit it in our theorem statements, it can be checked that all of the reductions in this paper are quasilinear in size for $\epsilon=\epsilon(n)=\Theta\left(\frac{1}{(\log\log n)^{\beta}}\right)$, for some $\beta>0$.

1.1 Our results

In this paper, we focus on proving $\mathsf{NP}$-hardness for the 2-to-1 Label Cover problem. To the best of our knowledge, no explicit $\mathsf{NP}$-hardness factor has previously been stated in the literature. However, it is “folklore” that one can obtain an explicit factor for label set sizes 3 & 6 by performing the “constraint-variable” reduction on an $\mathsf{NP}$-hardness result for 3-coloring (more precisely, Max-3-Colorable-Subgraph). The best known hardness for 3-coloring is due to Guruswami and Sinop [GS09], who showed a factor-$\frac{32}{33}$ hardness via a somewhat involved gadget reduction from the 3-query adaptive PCP result of [GLST98]. This yields $\mathsf{NP}$-hardness of $(1,\frac{65}{66}+\epsilon)$-approximating 2-to-1 Label Cover with label set sizes 3 & 6. It is not known how to take advantage of larger label set sizes. On the other hand, for label set sizes 2 & 4 it is known that satisfying assignments to satisfiable 2-to-1 Label Cover instances can be found in polynomial time.

The main result of our paper gives an improved hardness result:

Theorem 1.2.

For all $\epsilon>0$, $(1,\frac{23}{24}+\epsilon)$-deciding the 2-to-1 Label Cover problem with label set sizes 3 & 6 is $\mathsf{NP}$-hard.

By duplicating labels, this result also holds for label set sizes $3k$ & $6k$ for any $k\in\mathbbm{N}^{+}$.

Let us describe the high-level idea behind our result. The folklore constraint-variable reduction from 3-coloring to 2-to-1 Label Cover would work just as well if we started from “3-coloring with literals” instead. By this we mean the CSP with domain $\mathbbm{Z}_{3}$ and constraints of the form “$v_{i}-v_{j}\neq c \pmod{3}$”. Starting from this CSP — which we call $\textsf{2NLin}(\mathbbm{Z}_{3})$ — has two benefits: first, it is at least as hard as 3-coloring and hence could yield a stronger hardness result; second, it is a bit more “symmetrical” for the purposes of designing reductions. We obtain the following hardness result for $\textsf{2NLin}(\mathbbm{Z}_{3})$.

Theorem 1.3.

For all $\epsilon>0$, it is $\mathsf{NP}$-hard to $(1,\frac{11}{12}+\epsilon)$-decide the 2NLin problem.

As 3-coloring is a special case of $\textsf{2NLin}(\mathbbm{Z}_{3})$, [GS09] also shows that $(1,\frac{32}{33}+\epsilon)$-deciding 2NLin is $\mathsf{NP}$-hard for all $\epsilon>0$, and to our knowledge this was previously the only hardness known for $\textsf{2NLin}(\mathbbm{Z}_{3})$. The best current algorithm achieves an approximation ratio of $0.836$ (and does not need the instance to be satisfiable) [GW04]. To prove Theorem 1.3, we proceed by designing an appropriate “function-in-the-middle” dictator test, as in the recent framework of [OW12]. Although the [OW12] framework gives a direct translation of certain types of function-in-the-middle tests into hardness results, we cannot employ it in a black-box fashion. Among other reasons, [OW12] assumes that the test has “built-in noise”, but we cannot afford this as we need our test to have perfect completeness.

Thus, we need a different proof to derive a hardness result from this function-in-the-middle test. We first accomplished this by an analysis similar to the Fourier-based proof of $2\mathsf{Lin}(\mathbbm{Z}_{2})$ hardness given in Appendix F of [OW12]. Just as that proof “reveals” that the function-in-the-middle $2\mathsf{Lin}(\mathbbm{Z}_{2})$ test can be equivalently thought of as Håstad’s $3\mathsf{Lin}(\mathbbm{Z}_{2})$ test composed with the $3\mathsf{Lin}(\mathbbm{Z}_{2})$-to-$2\mathsf{Lin}(\mathbbm{Z}_{2})$ gadget of [TSSW00], our proof for the $\textsf{2NLin}(\mathbbm{Z}_{3})$ function-in-the-middle test revealed it to be the composition of a function test for a certain four-variable CSP with a gadget. We call this particular four-variable CSP 4-Not-All-There, or 4NAT for short. Because it is a 4-CSP, we are able to prove the following $\mathsf{NP}$-hardness of approximation result for it using a classic, Håstad-style Fourier-analytic proof.

Theorem 1.4.

For all $\epsilon>0$, it is $\mathsf{NP}$-hard to $(1,\frac{2}{3}+\epsilon)$-decide the 4NAT problem.

Thus, the final form in which we present our Theorem 1.2 is as a reduction from Label Cover to 4NAT using a function test (yielding Theorem 1.4), followed by a 4NAT-to-$\textsf{2NLin}(\mathbbm{Z}_{3})$ gadget (yielding Theorem 1.3), followed by the constraint-variable reduction to 2-to-1 Label Cover. Indeed, all of the technology needed to carry out this proof has been in place for over a decade, but without the function-in-the-middle framework of [OW12] it seems that pinpointing the 4NAT predicate as a good starting point would have been unlikely.

1.2 Organization

We leave to Section 2 most of the definitions, including those of the CSPs we use. The heart of the paper is in Section 3, where we give both the $\textsf{2NLin}(\mathbbm{Z}_{3})$ and 4NAT function tests, explain how one is derived from the other, and then perform the Fourier analysis for the 4NAT test. The actual hardness proof for 4NAT is presented in Section 4, and it mostly follows the techniques put in place by Håstad in [Hås01].

2 Preliminaries

We primarily work with strings $x\in\mathbbm{Z}_{3}^{K}$ for some integer $K$. We write $x_{i}$ to denote the $i$th coordinate of $x$. Oftentimes, our strings $y\in\mathbbm{Z}_{3}^{dK}$ are “blocked” into $K$ “blocks” of size $d$. In this case, we write $y[i]\in\mathbbm{Z}_{3}^{d}$ for the $i$th block of $y$, and $(y[i])_{j}\in\mathbbm{Z}_{3}$ for the $j$th coordinate of this block. Define the function $\pi:[dK]\rightarrow[K]$ such that $\pi(k)=i$ if $k$ falls in the $i$th block of size $d$ (e.g., $\pi(k)=1$ for $1\leq k\leq d$, $\pi(k)=2$ for $d+1\leq k\leq 2d$, and so on).

2.1 Definitions of problems

An instance $\mathcal{I}$ of a constraint satisfaction problem (CSP) is a set of variables $V$, a set of labels $D$, and a weighted list of constraints on these variables. We assume that the weights of the constraints are nonnegative and sum to 1. The weights therefore induce a probability distribution on the constraints. Given an assignment to the variables $f:V\rightarrow D$, the value of $f$ is the probability that $f$ satisfies a constraint drawn from this probability distribution. The optimum of $\mathcal{I}$ is the highest value of any assignment. We say that $\mathcal{I}$ is $s$-satisfiable if its optimum is at least $s$. If it is 1-satisfiable we simply call it satisfiable.

We define a CSP $\mathcal{P}$ to be a set of CSP instances. Typically, these instances will have similar constraints. We will study the problem of $(c,s)$-deciding $\mathcal{P}$. This is the problem of determining whether an instance of $\mathcal{P}$ is at least $c$-satisfiable or less than $s$-satisfiable. Related is the problem of $(c,s)$-approximating $\mathcal{P}$, in which one is given a $c$-satisfiable instance of $\mathcal{P}$ and asked to find an assignment of value at least $s$. It is easy to see that $(c,s)$-deciding $\mathcal{P}$ is at least as easy as $(c,s)$-approximating $\mathcal{P}$. Thus, as all our hardness results are for $(c,s)$-deciding CSPs, we also prove hardness for $(c,s)$-approximating these CSPs.

We now state the three CSPs that are the focus of our paper.

$\textsf{2NLin}(\mathbbm{Z}_{3})$:

In this CSP the label set is $\mathbbm{Z}_{3}$ and the constraints are of the form

$$v_{i}-v_{j}\neq a \pmod{3},\quad a\in\mathbbm{Z}_{3}.$$

The special case in which each right-hand side is 0 is the 3-coloring problem. We often drop the $(\mathbbm{Z}_{3})$ from this notation and simply write 2NLin. The reader may think of the ‘N’ in $\textsf{2NLin}(\mathbbm{Z}_{3})$ as standing for ‘N’on-linear, although we prefer to think of it as standing for ‘N’early-linear. The reason is that when generalizing to moduli $q>3$, the techniques in this paper generalize to constraints of the form “$v_{i}-v_{j} \pmod{q} \in\{a,a+1\}$” rather than “$v_{i}-v_{j}\neq a \pmod{q}$”. For the ternary version of this constraint, “$v_{i}-v_{j}+v_{k} \pmod{q} \in\{a,a+1\}$”, it is folklore (Venkatesan Guruswami and Subhash Khot, personal communications) that a simple modification of Håstad’s work [Hås01] yields $\mathsf{NP}$-hardness of $(1,\frac{2}{q})$-approximation.

4-Not-All-There:

For the 4-Not-All-There problem, denoted 4NAT, we define $\textsf{4NAT}:\mathbbm{Z}_{3}^{4}\rightarrow\{0,1\}$ to have output 1 if and only if at least one of the elements of $\mathbbm{Z}_{3}$ is not present among the four inputs. The 4NAT CSP has label set $D=\mathbbm{Z}_{3}$ and constraints of the form $\textsf{4NAT}(v_{1}+k_{1},v_{2}+k_{2},v_{3}+k_{3},v_{4}+k_{4})=1$, where the $k_{i}$’s are constants in $\mathbbm{Z}_{3}$.

We additionally define the “Two Pairs” predicate $\textsf{TwoPair}:\mathbbm{Z}_{3}^{4}\rightarrow\{0,1\}$, which has output 1 if and only if its input contains two distinct elements of $\mathbbm{Z}_{3}$, each appearing twice. Note that an input which satisfies TwoPair also satisfies 4NAT.
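As a brute-force sanity check (ours, not part of the paper's argument), both predicates are small enough to enumerate; note in particular that a uniformly random 4-tuple satisfies 4NAT with probability $\frac{5}{9}$, the constant that reappears in the arithmetization of Section 3.2.

```python
from itertools import product

def four_nat(a):
    # 1 iff some element of Z_3 is absent among the four inputs
    return int(len(set(a)) < 3)

def two_pair(a):
    # 1 iff the input consists of two distinct elements, each appearing twice
    v = sorted(a)
    return int(v[0] == v[1] and v[2] == v[3] and v[1] != v[2])

tuples = list(product(range(3), repeat=4))
nat_count = sum(four_nat(t) for t in tuples)
tp_count = sum(two_pair(t) for t in tuples)

print(nat_count, len(tuples))   # 45 81, i.e. a 5/9 fraction
print(tp_count)                 # 18
# every TwoPair-satisfying input also satisfies 4NAT
print(all(four_nat(t) for t in tuples if two_pair(t)))  # True
```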

$\mathbf{d}$-to-1 Label Cover:

An instance of the $d$-to-1 Label Cover problem is a bipartite graph $G=(U\cup V,E)$, a label set size $K$, and a $d$-to-1 map $\pi_{e}:[dK]\rightarrow[K]$ for each edge $e\in E$. The elements of $U$ are labeled from the set $[K]$, and the elements of $V$ are labeled from the set $[dK]$. A labeling $f:U\cup V\rightarrow[dK]$ satisfies an edge $e=(u,v)$ if $\pi_{e}(f(v))=f(u)$. Of particular interest is the $d=2$ case, i.e., 2-to-1 Label Cover.

Label Cover serves as the starting point for most 𝖭𝖯\mathsf{NP}-hardness of approximation results. We use the following theorem of Moshkovitz and Raz:

Theorem 2.1 ([MR10]).

For any $\epsilon=\epsilon(n)\geq n^{-o(1)}$ there exist $K,d\leq 2^{\mathrm{poly}(1/\epsilon)}$ such that the problem of deciding a 3Sat instance of size $n$ can be Karp-reduced in $\mathrm{poly}(n)$ time to the problem of $(1,\epsilon)$-deciding a $d$-to-1 Label Cover instance of size $n^{1+o(1)}$ with label set size $K$.

2.2 Gadgets

A typical way of relating two separate CSPs is by constructing a gadget reduction which translates from one to the other. A gadget reduction from $\mathsf{CSP}_{1}$ to $\mathsf{CSP}_{2}$ is one which maps any $\mathsf{CSP}_{1}$ constraint into a weighted set of $\mathsf{CSP}_{2}$ constraints. The $\mathsf{CSP}_{2}$ constraints are over the same set of variables as the $\mathsf{CSP}_{1}$ constraint, plus some new, auxiliary variables (these auxiliary variables are not shared between constraints of $\mathsf{CSP}_{1}$). We require that for every assignment which satisfies the $\mathsf{CSP}_{1}$ constraint, there is a way to label the auxiliary variables so as to fully satisfy the $\mathsf{CSP}_{2}$ constraints. Furthermore, there is some parameter $0<\gamma<1$ such that for every assignment which does not satisfy the $\mathsf{CSP}_{1}$ constraint, the optimum labeling of the auxiliary variables satisfies exactly a $\gamma$ fraction of the $\mathsf{CSP}_{2}$ constraints. We call such a gadget reduction a $\gamma$-gadget-reduction from $\mathsf{CSP}_{1}$ to $\mathsf{CSP}_{2}$. The following proposition is well known:

Proposition 2.2.

Suppose it is $\mathsf{NP}$-hard to $(c,s)$-decide $\mathsf{CSP}_{1}$. If there exists a $\gamma$-gadget-reduction from $\mathsf{CSP}_{1}$ to $\mathsf{CSP}_{2}$, then it is $\mathsf{NP}$-hard to $(c+(1-c)\gamma,\,s+(1-s)\gamma)$-decide $\mathsf{CSP}_{2}$.
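Proposition 2.2 composes cleanly: applying a $\gamma_{1}$-gadget and then a $\gamma_{2}$-gadget acts as a $(\gamma_{1}+(1-\gamma_{1})\gamma_{2})$-gadget. The following exact-arithmetic check (our illustration, using the parameters that appear later in the paper) traces how the hardness thresholds propagate.

```python
from fractions import Fraction as F

def gadget(value, gamma):
    # a gamma-gadget maps an instance of value v to one of value v + (1 - v) * gamma
    return value + (1 - value) * gamma

g1 = F(3, 4)   # 4NAT -> 2NLin   (Lemma 3.1)
g2 = F(1, 2)   # 2NLin -> 2-to-1 (Lemma 3.2)

# composing the two gadgets gives a 7/8-gadget (Corollary 3.3)
composed = g1 + (1 - g1) * g2
print(composed)  # 7/8

# (1, 2/3)-hardness for 4NAT (Theorem 1.4) propagates to:
print(gadget(F(2, 3), g1))        # 11/12 for 2NLin  (Theorem 1.3)
print(gadget(F(2, 3), composed))  # 23/24 for 2-to-1 (Theorem 1.2)
print(gadget(F(1, 1), composed))  # 1: perfect completeness is preserved
```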

We note that the notation $\gamma$-gadget-reduction is similar to a piece of notation employed by [TSSW00], but the two have different (though related) definitions.

2.3 Fourier analysis on 3\mathbbm{Z}_{3}

Let $\omega=e^{2\pi i/3}$ and set $U_{3}=\{\omega^{0},\omega^{1},\omega^{2}\}$. For $\alpha\in\mathbbm{Z}_{3}^{n}$, consider the Fourier character $\chi_{\alpha}:\mathbbm{Z}_{3}^{n}\rightarrow U_{3}$ defined by $\chi_{\alpha}(x)=\omega^{\alpha\cdot x}$. Then it is easy to see that $\mathbf{E}[\chi_{\alpha}(\boldsymbol{x})\overline{\chi_{\beta}(\boldsymbol{x})}]=\mathbf{1}[\alpha=\beta]$, where here and throughout $\boldsymbol{x}$ has the uniform probability distribution on $\mathbbm{Z}_{3}^{n}$ unless otherwise specified. As a result, the Fourier characters form an orthonormal basis for the set of functions $f:\mathbbm{Z}_{3}^{n}\rightarrow U_{3}$ under the inner product $\langle f,g\rangle=\mathbf{E}[f(\boldsymbol{x})\overline{g(\boldsymbol{x})}]$; i.e.,

$$f=\sum_{\alpha\in\mathbbm{Z}_{3}^{n}}\hat{f}(\alpha)\chi_{\alpha},$$

where the $\hat{f}(\alpha)$’s are complex numbers defined as $\hat{f}(\alpha)=\mathbf{E}[f(\boldsymbol{x})\overline{\chi_{\alpha}(\boldsymbol{x})}]$. For $\alpha\in\mathbbm{Z}_{3}^{n}$, we use the notation $|\alpha|$ to denote $\sum\alpha_{i}$ and $\#\alpha$ to denote the number of nonzero coordinates in $\alpha$. When $d$ is clear from context and $\alpha\in\mathbbm{Z}_{3}^{dK}$, define $\pi_{3}(\alpha)\in\mathbbm{Z}_{3}^{K}$ so that $(\pi_{3}(\alpha))_{i}\equiv|\alpha[i]| \pmod{3}$ (recall the notation $\alpha[i]$ from the beginning of this section).

We have Parseval’s identity: for every $f:\mathbbm{Z}_{3}^{n}\rightarrow U_{3}$ it holds that $\sum_{\alpha\in\mathbbm{Z}_{3}^{n}}|\hat{f}(\alpha)|^{2}=1$. Note that this implies that $|\hat{f}(\alpha)|\leq 1$ for all $\alpha$, as otherwise $|\hat{f}(\alpha)|^{2}$ would be greater than 1. A function $f:\mathbbm{Z}_{3}^{n}\rightarrow\mathbbm{Z}_{3}$ is said to be folded if for every $x\in\mathbbm{Z}_{3}^{n}$ and $c\in\mathbbm{Z}_{3}$, it holds that $f(x+c)=f(x)+c$, where $(x+c)_{i}=x_{i}+c$.

Proposition 2.3.

Let $f:\mathbbm{Z}_{3}^{n}\rightarrow U_{3}$ be folded. Then $\hat{f}(\alpha)\neq 0\Rightarrow|\alpha|\equiv 1 \pmod{3}$.

Proof.
$$\hat{f}(\alpha)=\mathbf{E}[f(\boldsymbol{x}+1)\overline{\chi_{\alpha}(\boldsymbol{x}+1)}]=\mathbf{E}[\omega f(\boldsymbol{x})\overline{\chi_{\alpha}(\boldsymbol{x})}\,\overline{\chi_{\alpha}(1,1,\ldots,1)}]=\omega\overline{\chi_{\alpha}(1,1,\ldots,1)}\hat{f}(\alpha).$$

This means that if $\hat{f}(\alpha)\neq 0$, then $\omega\overline{\chi_{\alpha}(1,1,\ldots,1)}$ must be 1. Expanding this quantity,

$$\omega\overline{\chi_{\alpha}(1,1,\ldots,1)}=\omega^{1-\alpha\cdot(1,1,\ldots,1)}=\omega^{1-|\alpha|}.$$

So, $|\alpha|\equiv 1 \pmod{3}$, as promised. ∎
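Proposition 2.3 and Parseval's identity can be sanity-checked numerically on a small folded function; the sketch below (our illustration, with an arbitrarily chosen map $g$) brute-forces the Fourier coefficients over $\mathbbm{Z}_{3}^{2}$.

```python
import cmath
from itertools import product

n = 2
w = cmath.exp(2j * cmath.pi / 3)
points = list(product(range(3), repeat=n))

# A folded function: f(x) = g(x_1 - x_2) + x_2 for an arbitrary map g: Z_3 -> Z_3.
# Identified with w^f, adding c to all coordinates multiplies the value by w^c.
g = {0: 0, 1: 0, 2: 1}
f = {x: w ** ((g[(x[0] - x[1]) % 3] + x[1]) % 3) for x in points}
folded = all(abs(f[tuple((xi + 1) % 3 for xi in x)] - w * f[x]) < 1e-9
             for x in points)

# Fourier coefficients: \hat{f}(alpha) = E[f(x) * conj(chi_alpha(x))]
fhat = {a: sum(f[x] * w ** (-sum(ai * xi for ai, xi in zip(a, x)) % 3)
               for x in points) / len(points)
        for a in points}
support = [a for a, c in fhat.items() if abs(c) > 1e-9]

print(folded)                                  # True
print(all(sum(a) % 3 == 1 for a in support))   # True: |alpha| = 1 (mod 3)
print(round(sum(abs(c) ** 2 for c in fhat.values()), 6))  # 1.0 (Parseval)
```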

3 2-to-1 hardness

In this section, we give our hardness result for 2-to-1 Label Cover, following the proof outline described at the end of Section 1.1.

Theorem 1.2 (restated).

For all $\epsilon>0$, it is $\mathsf{NP}$-hard to $(1,\frac{23}{24}+\epsilon)$-decide the 2-to-1 Label Cover problem.

First, we state a pair of simple gadget reductions:

Lemma 3.1.

There is a $3/4$-gadget-reduction from 4NAT to 2NLin.

Lemma 3.2.

There is a $1/2$-gadget-reduction from 2NLin to 2-to-1 Label Cover.

Together with Proposition 2.2, these imply the following corollary:

Corollary 3.3.

There is a $7/8$-gadget-reduction from 4NAT to 2-to-1 Label Cover. Thus, if it is $\mathsf{NP}$-hard to $(c,s)$-decide the 4NAT problem, then it is $\mathsf{NP}$-hard to $((7+c)/8,(7+s)/8)$-decide the 2-to-1 Label Cover problem.

The gadget reduction from 4NAT to 2NLin relies on the simple fact that if $a,b,c,d\in\mathbbm{Z}_{3}$ satisfy the 4NAT predicate, then there is some element of $\mathbbm{Z}_{3}$ that none of them equals.

Proof of Lemma 3.1.

A 4NAT constraint $C$ on the variables $S=(v_{1},v_{2},v_{3},v_{4})$ is of the form

$$\textsf{4NAT}(v_{1}+k_{1},v_{2}+k_{2},v_{3}+k_{3},v_{4}+k_{4}),$$

where the $k_{i}$’s are all constants in $\mathbbm{Z}_{3}$. To create the 2NLin instance, introduce the auxiliary variable $y_{C}$ and add the four 2NLin equations

$$v_{i}+k_{i}\neq y_{C} \pmod{3},\quad i\in[4]. \qquad (1)$$

If $f:S\rightarrow\mathbbm{Z}_{3}$ is an assignment which satisfies the 4NAT constraint, then there is some $a\in\mathbbm{Z}_{3}$ such that $f(v_{i})+k_{i}\neq a \pmod{3}$ for all $i\in[4]$. Assigning $a$ to $y_{C}$ satisfies all four equations (1). On the other hand, if $f$ doesn’t satisfy the 4NAT constraint, then $\{f(v_{i})+k_{i}\}_{i\in[4]}=\mathbbm{Z}_{3}$, so no assignment to $y_{C}$ satisfies all four equations. However, it is easy to see that there is an assignment which satisfies three of the equations. This gives a $\frac{3}{4}$-gadget-reduction from 4NAT to 2NLin, which proves the lemma. ∎
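The case analysis in this proof is small enough to confirm exhaustively; a brute-force check of ours (with the constants $k_{i}$ folded into the variables) over all $3^{4}$ assignments:

```python
from itertools import product

def four_nat(a):
    # 1 iff some element of Z_3 is absent among the four inputs
    return int(len(set(a)) < 3)

# The best choice of y_C satisfies all four inequations v_i != y_C exactly
# when 4NAT holds, and exactly 3 of them otherwise.
results = {}
for v in product(range(3), repeat=4):
    best = max(sum(vi != y for vi in v) for y in range(3))
    results[v] = best
    assert best == (4 if four_nat(v) else 3)
print("3/4-gadget verified on all", len(results), "assignments")
```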

The reduction from 2NLin to 2-to-1 Label Cover is the well-known constraint-variable reduction, and uses the fact that in the equation $v_{i}-v_{j}\neq a \pmod{3}$, for any assignment to $v_{j}$ there are two valid assignments to $v_{i}$, and vice versa.

Proof of Lemma 3.2.

A 2NLin constraint $C$ on the variables $S=(v_{1},v_{2})$ is of the form

$$v_{1}-v_{2}\neq a \pmod{3},$$

for some $a\in\mathbbm{Z}_{3}$. To create the 2-to-1 Label Cover instance, introduce the variable $y_{C}$, which will be labeled by one of the six possible functions $g:S\rightarrow\mathbbm{Z}_{3}$ which satisfy $C$. Finally, introduce the 2-to-1 constraints $y_{C}(v_{1})=f(v_{1})$ and $y_{C}(v_{2})=f(v_{2})$.

If $f:S\rightarrow\mathbbm{Z}_{3}$ is an assignment which satisfies the 2NLin constraint, then we label $y_{C}$ with $f$. In this case,

$$y_{C}(v_{i})=f(v_{i}),\quad i=1,2.$$

Thus, both equations are satisfied. On the other hand, if $f$ does not satisfy the 2NLin constraint, then any $g$ which $y_{C}$ is labeled with disagrees with $f$ on at least one of $v_{1}$ or $v_{2}$. It is easy to see, though, that a $g$ can be selected to satisfy one of the two equations. This gives a $\frac{1}{2}$-gadget-reduction from 2NLin to 2-to-1 Label Cover, which proves the lemma. ∎
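Here too the case analysis is finite; a brute-force check of ours over all constraints and assignments:

```python
from itertools import product

# For each 2NLin constraint v_1 - v_2 != a (mod 3): if the assignment f
# satisfies it, some label g for y_C agrees with f on both variables;
# otherwise the best g agrees on exactly one of the two.
checked = 0
for a in range(3):
    sat = [g for g in product(range(3), repeat=2) if (g[0] - g[1]) % 3 != a]
    assert len(sat) == 6          # six possible labels for y_C
    for f in product(range(3), repeat=2):
        best = max(sum(g[i] == f[i] for i in range(2)) for g in sat)
        assert best == (2 if (f[0] - f[1]) % 3 != a else 1)
        checked += 1
print("1/2-gadget verified on", checked, "cases")
```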

3.1 A pair of tests

Now that we have shown that 2NLin hardness results translate into 2-to-1 Label Cover hardness results, we present our 2NLin function test. Even though we don’t directly use it, it helps explain how we were led to consider the 4NAT CSP. Furthermore, the Fourier analysis that we eventually use for the 4NAT Test could instead be performed directly on the 2NLin Test without any direct reference to the 4NAT predicate. The test is:

2NLin Test

Given folded functions $f:\mathbbm{Z}_{3}^{K}\rightarrow\mathbbm{Z}_{3}$ and $g,h:\mathbbm{Z}_{3}^{dK}\rightarrow\mathbbm{Z}_{3}$:

  • Let $\boldsymbol{x}\in\mathbbm{Z}_{3}^{K}$ and $\boldsymbol{y}\in\mathbbm{Z}_{3}^{dK}$ be independent and uniformly random.

  • For each $i\in[K]$, $j\in[d]$, select $(\boldsymbol{z}[i])_{j}$ independently and uniformly from the elements of $\mathbbm{Z}_{3}\setminus\{\boldsymbol{x}_{i},(\boldsymbol{y}[i])_{j}\}$.

  • With probability $\frac{1}{4}$, test $f(\boldsymbol{x})\neq h(\boldsymbol{z})$; with probability $\frac{3}{4}$, test $g(\boldsymbol{y})\neq h(\boldsymbol{z})$.

Figure 1: An illustration of the 2NLin test distribution; $d=3$, $K=5$.

Above is an illustration of the test. We remark that for any given block $i$, $z[i]$ determines $x_{i}$ (with very high probability), because as soon as $z[i]$ contains two distinct elements of $\mathbbm{Z}_{3}$, $x_{i}$ must be the third element of $\mathbbm{Z}_{3}$. Notice also that in every column of indices, the input to $h$ always differs from the inputs to both $f$ and $g$. Thus, “matching dictator” assignments pass the test with probability 1. (This is the case in which $f(x)=x_{i}$ and $g(y)=(y[i])_{j}$ for some $i\in[K]$, $j\in[d]$.) On the other hand, if $f$ and $g$ are “nonmatching dictators”, then they succeed with probability only $\frac{11}{12}$. This turns out to be essentially optimal among functions $f$ and $g$ without “matching influential coordinates/blocks”. We will obtain the following theorem:
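Both dictator probabilities can be computed exactly on a single column. The sketch below (our illustration) evaluates one representative nonmatching configuration, in which $f$ reads a coordinate independent of the column shared by $g$ and $h$.

```python
from fractions import Fraction as F
from itertools import product

# Exact pass probability of the 2NLin test for dictators, per column:
# x_i and the y-column value b are uniform, z is uniform on Z_3 \ {x_i, b}.
# A matching dictator f reads x_i; a nonmatching one reads an independent
# uniform coordinate x'.
def pass_prob(matching):
    total = F(0)
    for xi, b, xprime in product(range(3), repeat=3):
        opts = [z for z in range(3) if z != xi and z != b]
        fx = xi if matching else xprime
        total += F(1, 27) * sum(F(1, 4) * (fx != z) + F(3, 4) * (b != z)
                                for z in opts) / len(opts)
    return total

print(pass_prob(True))    # 1
print(pass_prob(False))   # 11/12
```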

Theorem 1.3 (restated).

For all $\epsilon>0$, it is $\mathsf{NP}$-hard to $(1,\frac{11}{12}+\epsilon)$-decide the 2NLin problem.

Before proving this, let us further discuss the 2NLin test. Given $\boldsymbol{x}$, $\boldsymbol{y}$, and $\boldsymbol{z}$ from the 2NLin test, consider the following method of generating two additional strings $\boldsymbol{y}',\boldsymbol{y}''\in\mathbbm{Z}_{3}^{dK}$ which represent $h$’s “uncertainty” about $\boldsymbol{y}$. For each $i\in[K]$, $j\in[d]$: if $\boldsymbol{x}_{i}=(\boldsymbol{y}[i])_{j}$, then set both $(\boldsymbol{y}'[i])_{j}$ and $(\boldsymbol{y}''[i])_{j}$ to the lone element of $\mathbbm{Z}_{3}\setminus\{\boldsymbol{x}_{i},(\boldsymbol{z}[i])_{j}\}$. Otherwise, set one of $(\boldsymbol{y}'[i])_{j}$ or $(\boldsymbol{y}''[i])_{j}$ (chosen at random) to $\boldsymbol{x}_{i}$, and the other one to $(\boldsymbol{y}[i])_{j}$. It can be checked that $\textsf{TwoPair}(\boldsymbol{x}_{i},(\boldsymbol{y}[i])_{j},(\boldsymbol{y}'[i])_{j},(\boldsymbol{y}''[i])_{j})=1$, a more stringent requirement than satisfying 4NAT. In fact, the marginal distribution on these four variables is that of a uniformly random assignment satisfying the TwoPair predicate. Conditioned on $\boldsymbol{x}$ and $\boldsymbol{z}$, the distribution on $\boldsymbol{y}'$ and $\boldsymbol{y}''$ is identical to the distribution on $\boldsymbol{y}$. To see this, first note that by construction, neither $(\boldsymbol{y}'[i])_{j}$ nor $(\boldsymbol{y}''[i])_{j}$ ever equals $(\boldsymbol{z}[i])_{j}$. Further, because these indices are distributed as uniformly random satisfying assignments to TwoPair, $\mathbf{Pr}[(\boldsymbol{y}'[i])_{j}=x_{i}]=\mathbf{Pr}[(\boldsymbol{y}''[i])_{j}=x_{i}]=\frac{1}{3}$, which matches the corresponding probability for $\boldsymbol{y}$.
Thus, as $\boldsymbol{y}$, $\boldsymbol{y}'$, and $\boldsymbol{y}''$ are distributed identically, we may rewrite the test’s success probability as:

\begin{align*}
\mathbf{Pr}[\text{$f$, $g$, and $h$ pass the test}] &= \tfrac{1}{4}\,\mathbf{Pr}[f(\boldsymbol{x})\neq h(\boldsymbol{z})]+\tfrac{3}{4}\,\mathbf{Pr}[g(\boldsymbol{y})\neq h(\boldsymbol{z})]\\
&= \operatorname{avg}\bigl\{\mathbf{Pr}[f(\boldsymbol{x})\neq h(\boldsymbol{z})],\ \mathbf{Pr}[g(\boldsymbol{y})\neq h(\boldsymbol{z})],\ \mathbf{Pr}[g(\boldsymbol{y}')\neq h(\boldsymbol{z})],\ \mathbf{Pr}[g(\boldsymbol{y}'')\neq h(\boldsymbol{z})]\bigr\}\\
&\leq \tfrac{3}{4}+\tfrac{1}{4}\,\mathbf{E}[\textsf{4NAT}(f(\boldsymbol{x}),g(\boldsymbol{y}),g(\boldsymbol{y}'),g(\boldsymbol{y}''))].
\end{align*}

This is because if 4NAT fails to hold on the tuple $(f(\boldsymbol{x}),g(\boldsymbol{y}),g(\boldsymbol{y}'),g(\boldsymbol{y}''))$, then $h(\boldsymbol{z})$ can disagree with at most 3 of them.
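The per-column claims about $\boldsymbol{y}'$ and $\boldsymbol{y}''$ admit a finite check: every column of $(\boldsymbol{x},\boldsymbol{y},\boldsymbol{y}',\boldsymbol{y}'')$ satisfies TwoPair, and conditioned on $\boldsymbol{x}$ and $\boldsymbol{z}$ the string $\boldsymbol{y}'$ (equivalently $\boldsymbol{y}''$) is distributed exactly like $\boldsymbol{y}$. A brute-force sketch of ours:

```python
from fractions import Fraction as F
from itertools import product
from collections import defaultdict

def two_pair(a):
    v = sorted(a)
    return v[0] == v[1] and v[2] == v[3] and v[1] != v[2]

# Column distribution of the 2NLin test: x, y uniform, z uniform on Z_3 \ {x, y}.
# "deriv" records the marginal of y', averaged over the random swap in the
# construction (by symmetry, y'' has the same marginal).
dist, deriv = defaultdict(F), defaultdict(F)
tp_always = True
for x, y in product(range(3), repeat=2):
    opts = [z for z in range(3) if z != x and z != y]
    for z in opts:
        p = F(1, 9) / len(opts)
        dist[(x, y, z)] += p
        if x == y:
            yp = ypp = next(e for e in range(3) if e not in (x, z))
        else:
            yp, ypp = x, y
        tp_always = tp_always and two_pair((x, y, yp, ypp))
        deriv[(x, yp, z)] += p / 2
        deriv[(x, ypp, z)] += p / 2

print(tp_always)        # True: (x, y, y', y'') always satisfies TwoPair
print(dist == deriv)    # True: y' is distributed exactly like y, given x and z
```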

At this point, we have removed $h$ from the test analysis and have uncovered what appears to be a hidden 4NAT test inside the 2NLin Test: simply generate four strings $\boldsymbol{x}$, $\boldsymbol{y}$, $\boldsymbol{y}'$, and $\boldsymbol{y}''$ as described earlier, and test $\textsf{4NAT}(f(\boldsymbol{x}),g(\boldsymbol{y}),g(\boldsymbol{y}'),g(\boldsymbol{y}''))$. With some renaming of variables, this is exactly what our 4NAT Test does:

4NAT Test

Given folded functions $f:\mathbbm{Z}_{3}^{K}\rightarrow\mathbbm{Z}_{3}$ and $g:\mathbbm{Z}_{3}^{dK}\rightarrow\mathbbm{Z}_{3}$:

  • Let $\boldsymbol{x}\in\mathbbm{Z}_{3}^{K}$ be uniformly random.

  • Select $\boldsymbol{y},\boldsymbol{z},\boldsymbol{w}$ as follows: for each $i\in[K]$, $j\in[d]$, select $((\boldsymbol{y}[i])_{j},(\boldsymbol{z}[i])_{j},(\boldsymbol{w}[i])_{j})$ uniformly at random from the triples of elements of $\mathbbm{Z}_{3}$ satisfying $\textsf{TwoPair}(\boldsymbol{x}_{i},(\boldsymbol{y}[i])_{j},(\boldsymbol{z}[i])_{j},(\boldsymbol{w}[i])_{j})$.

  • Test $\textsf{4NAT}(f(\boldsymbol{x}),g(\boldsymbol{y}),g(\boldsymbol{z}),g(\boldsymbol{w}))$.

Figure 2: An illustration of the 4NAT test distribution; $d=3$, $K=5$.

Above is an illustration of this test. In this illustration, the strings $\boldsymbol{z}$ and $\boldsymbol{w}$ were derived from the strings in Figure 1 using the process detailed above for generating $\boldsymbol{y}'$ and $\boldsymbol{y}''$. Note that each column is missing one of the elements of $\mathbbm{Z}_{3}$, and that each column satisfies the TwoPair predicate. Because satisfying TwoPair implies satisfying 4NAT, matching dictators pass this test with probability 1. On the other hand, it can be seen that nonmatching dictators pass the test with probability $\frac{2}{3}$. In the next section we show that this is optimal among functions $f$ and $g$ without “matching influential coordinates/blocks”.
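As with the 2NLin test, both dictator probabilities can be computed exactly per column; a sketch of ours, modeling the nonmatching dictator as reading an independent uniform coordinate:

```python
from fractions import Fraction as F
from itertools import product

def two_pair(a):
    v = sorted(a)
    return v[0] == v[1] and v[2] == v[3] and v[1] != v[2]

def four_nat(a):
    return len(set(a)) < 3

# One column of the 4NAT test: (y, z, w) is uniform among triples with
# TwoPair(x_i, y, z, w).  A matching dictator f reads x_i; a nonmatching
# one reads an independent uniform coordinate x'.
def pass_prob(matching):
    total = F(0)
    for xi, xprime in product(range(3), repeat=2):
        cols = [c for c in product(range(3), repeat=3) if two_pair((xi,) + c)]
        fx = xi if matching else xprime
        total += F(1, 9) * sum(four_nat((fx,) + c) for c in cols) / len(cols)
    return total

print(pass_prob(True))    # 1
print(pass_prob(False))   # 2/3
```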

(As one additional remark, our 2NLin Test is basically the composition of the 4NAT Test with the gadget from Lemma 3.1. In this test, if we instead performed the $f(\boldsymbol{x})\neq h(\boldsymbol{z})$ test with probability $\frac{1}{3}$ and the $g(\boldsymbol{y})\neq h(\boldsymbol{z})$ test with probability $\frac{2}{3}$, then the resulting test would basically be the composition of a 3NLin test with a suitable 3NLin-to-2NLin gadget.)

3.2 Analysis of 4NAT Test

Let ω=e2πi/3\omega=e^{2\pi i/3}, and set U3={ω0,ω1,ω2}U_{3}=\{\omega^{0},\omega^{1},\omega^{2}\}. In what follows, we identify ff and gg with the functions ωf\omega^{f} and ωg\omega^{g}, respectively, whose range is U3U_{3} rather than 3\mathbbm{Z}_{3}. Set L=dKL=dK. The remainder of this section is devoted to the proof of the following lemma:

Lemma 3.4.

Let f:3KU3f:\mathbbm{Z}_{3}^{K}\rightarrow U_{3} and g:3dKU3g:\mathbbm{Z}_{3}^{dK}\rightarrow U_{3}. Then

𝐄[4NAT(f(𝒙),g(𝒚),g(𝒛),g(𝒘))]23+23α3L|f^(π3(α))||g^(α)|2(1/2)#α\mathop{\bf E\/}[\textsf{4NAT}(f({\boldsymbol{x}}),g(\boldsymbol{y}),g(\boldsymbol{z}),g(\boldsymbol{w}))]\leq\tfrac{2}{3}+\tfrac{2}{3}\sum_{\alpha\in\mathbbm{Z}_{3}^{L}}|\hat{f}(\pi_{3}(\alpha))|\cdot|\hat{g}(\alpha)|^{2}\cdot(1/2)^{\#\alpha}

The first step is to “arithmetize” the 4NAT predicate. It is not hard to verify that

4NAT(a1,a2,a3,a4)\displaystyle\textsf{4NAT}(a_{1},a_{2},a_{3},a_{4}) =59+19ijωaiω¯aj19i<j<kωaiωajωak19i<j<kω¯aiω¯ajω¯ak\displaystyle=\frac{5}{9}+\frac{1}{9}\sum_{i\neq j}\omega^{a_{i}}\overline{\omega}^{a_{j}}-\frac{1}{9}\sum_{i<j<k}\omega^{a_{i}}\omega^{a_{j}}\omega^{a_{k}}-\frac{1}{9}\sum_{i<j<k}\overline{\omega}^{a_{i}}\overline{\omega}^{a_{j}}\overline{\omega}^{a_{k}}
=59+29i<j[ωaiω¯aj]29i<j<k[ωaiωajωak].\displaystyle=\frac{5}{9}+\frac{2}{9}\sum_{i<j}\Re[\omega^{a_{i}}\overline{\omega}^{a_{j}}]-\frac{2}{9}\sum_{i<j<k}\Re[\omega^{a_{i}}\omega^{a_{j}}\omega^{a_{k}}].
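This identity can be checked mechanically. The short Python check below is ours, with 4NAT taken to hold iff not all three elements of Z_3 appear among the four values; it confirms the arithmetization on all 81 inputs.

```python
import itertools
import cmath

w = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity

def four_nat(a):
    return len(set(a)) <= 2  # not all of Z_3 appears among a_1, ..., a_4

for a in itertools.product(range(3), repeat=4):
    pairs = sum((w ** (a[i] - a[j])).real
                for i in range(4) for j in range(4) if i != j)
    triples = sum((w ** (a[i] + a[j] + a[k])).real
                  for i, j, k in itertools.combinations(range(4), 3))
    # first line of the arithmetization, with conjugate pairs of sums
    # combined into real parts
    assert abs(5/9 + pairs/9 - 2*triples/9 - four_nat(a)) < 1e-9
```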

Using the symmetry between 𝒚\boldsymbol{y}, 𝒛\boldsymbol{z}, and 𝒘\boldsymbol{w}, we deduce

𝐄[4NAT(f(𝒙),g(𝒚),g(𝒛),g(𝒘))]=59+23𝐄[f(𝒙)g(𝒚)¯]+23𝐄[g(𝒚)g(𝒛)¯]23𝐄[f(𝒙)g(𝒚)g(𝒛)]29𝐄[g(𝒚)g(𝒛)g(𝒘)].\mathop{\bf E\/}[\textsf{4NAT}(f({\boldsymbol{x}}),g(\boldsymbol{y}),g(\boldsymbol{z}),g(\boldsymbol{w}))]\\ =\tfrac{5}{9}+\tfrac{2}{3}\Re\mathop{\bf E\/}[f({\boldsymbol{x}})\overline{g(\boldsymbol{y})}]+\tfrac{2}{3}\Re\mathop{\bf E\/}[g(\boldsymbol{y})\overline{g(\boldsymbol{z})}]-\tfrac{2}{3}\Re\mathop{\bf E\/}[f({\boldsymbol{x}})g(\boldsymbol{y})g(\boldsymbol{z})]-\tfrac{2}{9}\Re\mathop{\bf E\/}[g(\boldsymbol{y})g(\boldsymbol{z})g(\boldsymbol{w})]. (2)

For the second term on the RHS of (2) we in fact have 𝐄[f(𝒙)g(𝒚)¯]=0\mathop{\bf E\/}[f({\boldsymbol{x}})\overline{g(\boldsymbol{y})}]=0. This is because 𝒙{\boldsymbol{x}} and 𝒚\boldsymbol{y} are independent, and hence 𝐄[f(𝒙)g(𝒚)¯]=𝐄[f(𝒙)]𝐄[g(𝒚)¯]=00\mathop{\bf E\/}[f({\boldsymbol{x}})\overline{g(\boldsymbol{y})}]=\mathop{\bf E\/}[f({\boldsymbol{x}})]\mathop{\bf E\/}[\overline{g(\boldsymbol{y})}]=0\cdot 0 since ff and gg are folded. The third term on the RHS of (2) also turns out to be 0, by virtue of gg being folded. This can be proven using a Fourier-analytic argument; we present here an alternative combinatorial argument:

Lemma 3.5.

𝐄[g(𝒚)g(𝒛)¯]=0\mathop{\bf E\/}[g(\boldsymbol{y})\overline{g(\boldsymbol{z})}]=0.

Proof.

Fix any value y3Ly\in\mathbbm{Z}_{3}^{L} for 𝒚\boldsymbol{y}. Consider the function t:3K×3L3K×3Lt:\mathbbm{Z}_{3}^{K}\times\mathbbm{Z}_{3}^{L}\rightarrow\mathbbm{Z}_{3}^{K}\times\mathbbm{Z}_{3}^{L} defined as t(x,z)=(x+1,z1)t(x,z)=(x+1,z-1), where all arithmetic is performed modulo 3. Note that tt has order 3, meaning that t(t(t(x,z)))=(x,z)t(t(t(x,z)))=(x,z). This allows us to group values for 𝒙{\boldsymbol{x}} and 𝒛\boldsymbol{z} into sets of size three as follows: put (x,z)3K×3L(x,z)\in\mathbbm{Z}_{3}^{K}\times\mathbbm{Z}_{3}^{L} into the set T(x,z)={(x,z),t(x,z),t(t(x,z))}T(x,z)=\{(x,z),t(x,z),t(t(x,z))\}. Because tt is invertible and of order 3, each pair (x,z)(x,z) is a member of only one set: T(x,z)T(x,z).

Conditioned on 𝒚=y\boldsymbol{y}=y, if (x,z)(x,z) is in the support of the test, then all (x,z)T(x,z)(x^{\prime},z^{\prime})\in T(x,z) are also in the support of the test. This is because the strings which are in the support of the test are exactly the strings xx and zz for which the set {(xπ)i,yi,zi}3\{(x_{\pi})_{i},y_{i},z_{i}\}\subseteq\mathbbm{Z}_{3} is of size 2, for all i[L]i\in[L]. These strings, in turn, are exactly those for which xπ+y+z0(mod3)x_{\pi}+y+z\not\equiv 0\pmod{3}. But if (x,z)=t(x,z)(x^{\prime},z^{\prime})=t(x,z), then

xπ+y+z(xπ+1)+y+(z1)xπ+y+z0(mod3).x^{\prime}_{\pi}+y+z^{\prime}\equiv(x_{\pi}+1)+y+(z-1)\equiv x_{\pi}+y+z\not\equiv 0\pmod{3}.

This shows that t(x,z)t(x,z) is in the support of the test, conditioned on 𝒚=y\boldsymbol{y}=y. As T(x,z)=T(x,z)T(x^{\prime},z^{\prime})=T(x,z), the same holds for t(t(x,z))t(t(x,z)).

When conditioned on 𝒚=y\boldsymbol{y}=y, each pair (x,z)(x,z) in the support of the test occurs with equal probability. To see this, first note that 𝒙{\boldsymbol{x}} is pairwise independent from 𝒚\boldsymbol{y}. In other words, any value xx for 𝒙{\boldsymbol{x}} is equally likely, regardless of yy. Then, conditioned on 𝒙=x{\boldsymbol{x}}=x and 𝒚=y\boldsymbol{y}=y, there are exactly two possibilities for each index of 𝒛\boldsymbol{z}, both of which occur with half probability. Thus, the event (x,z)(x,z) occurs with the same probability, no matter the values of xx or zz.

Consider an arbitrary set T(x,z)T(x,z). Conditioned on (𝒙,𝒛)({\boldsymbol{x}},\boldsymbol{z}) falling in T(x,z)T(x,z), the value of (𝒙,𝒛)({\boldsymbol{x}},\boldsymbol{z}) is a uniformly random element of this set. This means that 𝒛\boldsymbol{z} is equally likely to be zz, z1z-1, or z2z-2. By the folding of gg, g(𝒛)g(\boldsymbol{z}) is therefore equally likely to be one of ω0,ω1\omega^{0},\omega^{1}, or ω2\omega^{2}. As this happens for any choice of the set T(x,z)T(x,z), g(𝒛)g(\boldsymbol{z}) is uniform on U3U_{3}, even when conditioned on 𝒚=y\boldsymbol{y}=y. Thus, 𝐄[g(𝒚)g(𝒛)¯]=0\mathop{\bf E\/}[g(\boldsymbol{y})\overline{g(\boldsymbol{z})}]=0 as desired. ∎
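For the smallest case K = d = 1, the orbit argument can be checked directly. The Python check below is ours: it enumerates the 18 tuples in the support of the test and confirms that, conditioned on any value of 𝒚, the value 𝒛 is uniform on Z_3; since a folded g is in particular balanced, 𝐄[g(𝒚)g(𝒛)¯] = 0 follows. (For larger parameters the orbit argument yields the analogous conclusion without needing 𝒛 itself to be coordinatewise uniform.)

```python
import itertools
from collections import Counter

def two_pair(t):
    return sorted(Counter(t).values()) == [2, 2]

# Support of the K = d = 1 test: tuples (x, y, z, w) satisfying TwoPair,
# each occurring with probability 1/18 (x uniform, then one of 6 columns).
support = [t for t in itertools.product(range(3), repeat=4) if two_pair(t)]
assert len(support) == 18

for y0 in range(3):
    z_counts = Counter(z for (x, y, z, w) in support if y == y0)
    # conditioned on y, the value z is uniform on Z_3
    assert z_counts == Counter({0: 2, 1: 2, 2: 2})
```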

Equation (2) has now been reduced to

(2)=5923𝐄[f(𝒙)g(𝒚)g(𝒛)]29𝐄[g(𝒚)g(𝒛)g(𝒘)].\eqref{eq:bigexpansion}=\tfrac{5}{9}-\tfrac{2}{3}\Re\mathop{\bf E\/}[f({\boldsymbol{x}})g(\boldsymbol{y})g(\boldsymbol{z})]-\tfrac{2}{9}\Re\mathop{\bf E\/}[g(\boldsymbol{y})g(\boldsymbol{z})g(\boldsymbol{w})]. (3)

As g(𝒚)g(𝒛)g(𝒘)g(\boldsymbol{y})g(\boldsymbol{z})g(\boldsymbol{w}) is always in U3U_{3}, 𝐄[g(𝒚)g(𝒛)g(𝒘)]\Re\mathop{\bf E\/}[g(\boldsymbol{y})g(\boldsymbol{z})g(\boldsymbol{w})] is always at least 12-\frac{1}{2}. Therefore,

(3)2323𝐄[f(𝒙)g(𝒚)g(𝒛)].\eqref{eq:smallexpansion}\leq\tfrac{2}{3}-\tfrac{2}{3}\Re\mathop{\bf E\/}[f({\boldsymbol{x}})g(\boldsymbol{y})g(\boldsymbol{z})]. (4)

It remains to handle the 𝐄[f(𝒙)g(𝒚)g(𝒛)]\mathop{\bf E\/}[f({\boldsymbol{x}})g(\boldsymbol{y})g(\boldsymbol{z})] term, which is the subject of our next lemma. This is done through a standard argument in the style of Håstad [Hås01].

Lemma 3.6.

𝐄[f(𝒙)g(𝒚)g(𝒛)]=α3Lf^(π3(α))g^(α)2(12)#α\mathop{\bf E\/}[f({\boldsymbol{x}})g(\boldsymbol{y})g(\boldsymbol{z})]=\sum_{\alpha\in\mathbbm{Z}_{3}^{L}}\hat{f}(\pi_{3}(\alpha))\hat{g}(\alpha)^{2}\left(-\frac{1}{2}\right)^{\#\alpha}.

Proof.

Begin by expanding out 𝐄[f(𝒙)g(𝒚)g(𝒛)]\mathop{\bf E\/}[f({\boldsymbol{x}})g(\boldsymbol{y})g(\boldsymbol{z})]:

𝐄[f(𝒙)g(𝒚)g(𝒛)]=α3K,β,γ3L|α||β||γ|1(mod3)f^(α)g^(β)g^(γ)𝐄[χα(𝒙)χβ(𝒚)χγ(𝒛)].\mathop{\bf E\/}[f({\boldsymbol{x}})g(\boldsymbol{y})g(\boldsymbol{z})]=\sum_{\begin{subarray}{c}\alpha\in\mathbbm{Z}_{3}^{K},\beta,\gamma\in\mathbbm{Z}_{3}^{L}\\ |\alpha|\equiv|\beta|\equiv|\gamma|\equiv 1\pmod{3}\end{subarray}}\hat{f}(\alpha)\hat{g}(\beta)\hat{g}(\gamma)\mathop{\bf E\/}[\chi_{\alpha}({\boldsymbol{x}})\chi_{\beta}(\boldsymbol{y})\chi_{\gamma}(\boldsymbol{z})]. (5)

We focus on the products of the Fourier characters:

𝐄[χα(𝒙)χβ(𝒚)χγ(𝒛)]=i[K]𝐄[χαi(𝒙i)χβ[i](𝒚[i])χγ[i](𝒛[i])]\mathop{\bf E\/}[\chi_{\alpha}({\boldsymbol{x}})\chi_{\beta}(\boldsymbol{y})\chi_{\gamma}(\boldsymbol{z})]=\prod_{i\in[K]}\mathop{\bf E\/}[\chi_{\alpha_{i}}({\boldsymbol{x}}_{i})\chi_{\beta[i]}(\boldsymbol{y}[i])\chi_{\gamma[i]}(\boldsymbol{z}[i])] (6)

We can attend to each block separately:

𝐄[χαi(𝒙i)χβ[i](𝒚[i])χγ[i](𝒛[i])]=\displaystyle\mathop{\bf E\/}[\chi_{\alpha_{i}}({\boldsymbol{x}}_{i})\chi_{\beta[i]}(\boldsymbol{y}[i])\chi_{\gamma[i]}(\boldsymbol{z}[i])]= 𝐄[ωαi𝒙i+β[i]𝒚[i]+γ[i]𝒛[i]]\displaystyle\mathop{\bf E\/}\left[\omega^{\alpha_{i}\cdot{\boldsymbol{x}}_{i}+\beta[i]\cdot\boldsymbol{y}[i]+\gamma[i]\cdot\boldsymbol{z}[i]}\right]
=\displaystyle= 𝐄𝒙[ωαiaj:π(j)=i𝐄𝒚,𝒛[ωβj𝒚j+γj𝒛j𝒙i=a]()].\displaystyle\mathop{\bf E\/}_{{\boldsymbol{x}}}\left[\omega^{\alpha_{i}\cdot a}\prod_{j:\pi(j)=i}\underbrace{\mathop{\bf E\/}_{\boldsymbol{y},\boldsymbol{z}}\left[\omega^{\beta_{j}\boldsymbol{y}_{j}+\gamma_{j}\boldsymbol{z}_{j}}\mid{\boldsymbol{x}}_{i}=a\right]}_{(*)}\right]. (7)

Now consider the expectation ()(*), where aa denotes the value taken by 𝒙i{\boldsymbol{x}}_{i}. The distribution on the values for (𝒚j,𝒛j)(\boldsymbol{y}_{j},\boldsymbol{z}_{j}) is uniform on the six possibilities (a+1,a+1)(a+1,a+1), (a+2,a+2)(a+2,a+2), (a,a+1)(a,a+1), (a,a+2)(a,a+2), (a+1,a)(a+1,a), and (a+2,a)(a+2,a). We claim that ()(*) is nonzero if and only if βjγj(mod3)\beta_{j}\equiv\gamma_{j}\pmod{3}. Suppose first that βjγj(mod3)\beta_{j}\not\equiv\gamma_{j}\pmod{3}. Then either exactly one of βj\beta_{j} and γj\gamma_{j} is zero, or neither is zero and βjγj(mod3)-\beta_{j}\equiv\gamma_{j}\pmod{3}. In the first case, the expectation is either 𝐄[ωβj𝒚j𝒙i=a]\mathop{\bf E\/}[\omega^{\beta_{j}\boldsymbol{y}_{j}}\mid{\boldsymbol{x}}_{i}=a] or 𝐄[ωγj𝒛j𝒙i=a]\mathop{\bf E\/}[\omega^{\gamma_{j}\boldsymbol{z}_{j}}\mid{\boldsymbol{x}}_{i}=a] for a nonzero βj\beta_{j} or a nonzero γj\gamma_{j}, respectively. Both of these expectations are zero, as both 𝒚j\boldsymbol{y}_{j} and 𝒛j\boldsymbol{z}_{j} are uniform on 3\mathbbm{Z}_{3}. In the second case,

𝐄[ωβj𝒚j+γj𝒛j𝒙i=a]=\displaystyle\mathop{\bf E\/}[\omega^{\beta_{j}\boldsymbol{y}_{j}+\gamma_{j}\boldsymbol{z}_{j}}\mid{\boldsymbol{x}}_{i}=a]= 𝐄[ωβj𝒚jβj𝒛j𝒙i=a]\displaystyle\mathop{\bf E\/}[\omega^{\beta_{j}\boldsymbol{y}_{j}-\beta_{j}\boldsymbol{z}_{j}}\mid{\boldsymbol{x}}_{i}=a]
=\displaystyle= 𝐄[ωβj(𝒚j𝒛j)𝒙i=a],\displaystyle\mathop{\bf E\/}[\omega^{\beta_{j}(\boldsymbol{y}_{j}-\boldsymbol{z}_{j})}\mid{\boldsymbol{x}}_{i}=a],

which is zero, because βj\beta_{j} is nonzero, and 𝒚j𝒛j\boldsymbol{y}_{j}-\boldsymbol{z}_{j} is uniformly distributed on 3\mathbbm{Z}_{3}.

Thus, when ()(*) and Equation (6) are nonzero, βγ(mod3)\beta\equiv\gamma\pmod{3}. This means that ()=𝐄[ωβj(𝒚j+𝒛j)𝒙i=a](*)=\mathop{\bf E\/}[\omega^{\beta_{j}(\boldsymbol{y}_{j}+\boldsymbol{z}_{j})}\mid{\boldsymbol{x}}_{i}=a]. When βj=0\beta_{j}=0, this is clearly 11. Otherwise, as either 𝒚j+𝒛j2a+1(mod3)\boldsymbol{y}_{j}+\boldsymbol{z}_{j}\equiv 2a+1\pmod{3} or 𝒚j+𝒛j2a+2(mod3)\boldsymbol{y}_{j}+\boldsymbol{z}_{j}\equiv 2a+2\pmod{3}, each with probability half, this is equal to

()=12(ωβj(2a+1)+ωβj(2a+2))=ω2aβj2(ωβj+ω2βj)=ω2aβj2,(*)=\frac{1}{2}\left(\omega^{\beta_{j}(2a+1)}+\omega^{\beta_{j}(2a+2)}\right)=\frac{\omega^{2a\beta_{j}}}{2}\left(\omega^{\beta_{j}}+\omega^{2\beta_{j}}\right)=-\frac{\omega^{2a\beta_{j}}}{2}, using that ωβj+ω2βj=ω+ω2=1\omega^{\beta_{j}}+\omega^{2\beta_{j}}=\omega+\omega^{2}=-1 since βj{1,2}\beta_{j}\in\{1,2\}.

In summary, when βj=γj\beta_{j}=\gamma_{j}, ()=(12)#βjω2aβj(*)=\left(-\frac{1}{2}\right)^{\#\beta_{j}}\omega^{2a\beta_{j}}.
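The computation of ()(*) can also be verified numerically. The snippet below is ours: it averages ω^{β_j(y_j+z_j)} over the six conditional possibilities for (y_j, z_j) and checks the result against −ω^{2aβ_j}/2 for every a and every nonzero β_j.

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)

for a in range(3):
    # the six equally likely values of (y_j, z_j) conditioned on x_i = a
    pairs = [(a + 1, a + 1), (a + 2, a + 2), (a, a + 1),
             (a, a + 2), (a + 1, a), (a + 2, a)]
    for beta in (1, 2):
        star = sum(w ** (beta * (y + z)) for y, z in pairs) / 6
        # (*) = -w^{2a*beta}/2
        assert abs(star + w ** (2 * a * beta) / 2) < 1e-9
```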

We can now rewrite Equation (7) as

(7)=𝐄𝒙[ωαiaj:π(j)=i(12)#βjω2aβj]=𝐄𝒙[(12)#β[i]ω(αi+2|β[i]|)a].\eqref{eq:blockcharacters}=\mathop{\bf E\/}_{{\boldsymbol{x}}}\left[\omega^{\alpha_{i}\cdot a}\prod_{j:\pi(j)=i}\left(-\frac{1}{2}\right)^{\#\beta_{j}}\omega^{2a\beta_{j}}\right]=\mathop{\bf E\/}_{{\boldsymbol{x}}}\left[\left(-\frac{1}{2}\right)^{\#\beta[i]}\omega^{(\alpha_{i}+2|\beta[i]|)a}\right].

Note that the exponent of ω\omega, (αi+2|β[i]|)a(\alpha_{i}+2|\beta[i]|)a, is zero if αi|β[i]|(mod3)\alpha_{i}\equiv|\beta[i]|\pmod{3}, in which case the expectation is just the constant (1/2)#β[i](-1/2)^{\#\beta[i]}. This occurs for all i[K]i\in[K] exactly when α=π3(β)\alpha=\pi_{3}(\beta). If, on the other hand, αi+2|β[i]|\alpha_{i}+2|\beta[i]| is nonzero, then the entire expectation is zero because aa, the value of 𝒙i{\boldsymbol{x}}_{i}, is uniformly random from 3\mathbbm{Z}_{3}. Thus, Equation (6) is nonzero only when α=π3(β)\alpha=\pi_{3}(\beta) and β=γ\beta=\gamma, in which case it equals

(6)=(12)#β.\eqref{eq:fcharacters}=\left(-\frac{1}{2}\right)^{\#\beta}.

We may therefore conclude with

(5)=α3Lf^(π3(α))g^(α)2(12)#α.\eqref{eq:fourierexpanded}=\sum_{\alpha\in\mathbbm{Z}_{3}^{L}}\hat{f}(\pi_{3}(\alpha))\hat{g}(\alpha)^{2}\left(-\frac{1}{2}\right)^{\#\alpha}.\qed
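Lemma 3.6 can likewise be confirmed numerically for small parameters. The following check is ours: it takes K = 1 and d = 2 (so both coordinates project to the single block), a folded dictator f and an arbitrarily chosen folded g, computes the left-hand side by exact enumeration of the test distribution, and compares it with the Fourier-side formula, where π₃(α) is the sum of each block of α modulo 3.

```python
import itertools
import cmath

w = cmath.exp(2j * cmath.pi / 3)
Z3 = (0, 1, 2)

def two_pair(*t):
    v = sorted(t)
    return v[0] == v[1] != v[2] == v[3]

# K = 1, d = 2: x in Z_3; y, z in Z_3^2, with both coordinates mapping to x.
COLS = {a: [t for t in itertools.product(Z3, repeat=3) if two_pair(a, *t)]
        for a in Z3}

f = lambda x: w ** x                               # folded dictator
h0 = {0: 0, 1: 2, 2: 2}                            # arbitrary coset values
g = lambda y: w ** (h0[(y[1] - y[0]) % 3] + y[0])  # some folded g

# Left-hand side: exact expectation over the test distribution.
outcomes = [(x, c1, c2) for x in Z3 for c1 in COLS[x] for c2 in COLS[x]]
lhs = sum(f(x) * g((c1[0], c2[0])) * g((c1[1], c2[1]))
          for x, c1, c2 in outcomes) / len(outcomes)

# Right-hand side: the Fourier formula, with chi_alpha(y) = w^{alpha . y}.
fhat = {b: sum(f(x) * w ** (-b * x) for x in Z3) / 3 for b in Z3}
ghat = {a: sum(g(y) * w ** (-(a[0] * y[0] + a[1] * y[1]))
               for y in itertools.product(Z3, repeat=2)) / 9
        for a in itertools.product(Z3, repeat=2)}
rhs = sum(fhat[(a[0] + a[1]) % 3]             # f-hat at pi_3(alpha)
          * ghat[a] ** 2
          * (-0.5) ** sum(c != 0 for c in a)  # (-1/2)^{# alpha}
          for a in ghat)

assert abs(lhs - rhs) < 1e-9
```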

Substituting this result into (4) yields

𝐄[4NAT(f(𝒙),g(𝒚),g(𝒛),g(𝒘))]\displaystyle\mathop{\bf E\/}[\textsf{4NAT}(f({\boldsymbol{x}}),g(\boldsymbol{y}),g(\boldsymbol{z}),g(\boldsymbol{w}))] 2323α3Lf^(π3(α))g^(α)2(12)#α\displaystyle\leq\tfrac{2}{3}-\tfrac{2}{3}\Re\sum_{\alpha\in\mathbbm{Z}_{3}^{L}}\hat{f}(\pi_{3}(\alpha))\hat{g}(\alpha)^{2}\left(-\frac{1}{2}\right)^{\#\alpha}
23+23α3L|f^(π3(α))||g^(α)|2(1/2)#α,\displaystyle\leq\tfrac{2}{3}+\tfrac{2}{3}\sum_{\alpha\in\mathbbm{Z}_{3}^{L}}|\hat{f}(\pi_{3}(\alpha))|\cdot|\hat{g}(\alpha)|^{2}\cdot(1/2)^{\#\alpha},

completing the proof of Lemma 3.4.

4 Hardness of 4NAT

In this section, we show the following theorem:

Theorem 1.4 (detailed).

For all ϵ>0\epsilon>0, it is 𝖭𝖯\mathsf{NP}-hard to (1,23+ϵ)(1,\frac{2}{3}+\epsilon)-decide the 4NAT problem. In fact, in the “yes case”, all 4NAT constraints can be satisfied by TwoPair assignments.

Combining this with Lemma 3.1 yields Theorem 1.3, and combining it with Corollary 3.3 yields Theorem 1.2. It is not clear whether this gives optimal hardness assuming perfect completeness. The 4NAT predicate is satisfied by a uniformly random input with probability 59\frac{5}{9}, and by the method of conditional expectations this gives a deterministic algorithm which (1,59)(1,\frac{5}{9})-approximates the 4NAT CSP. This leaves a gap of 19\frac{1}{9} in the soundness, and to our knowledge no better algorithms are known.

On the hardness side, consider a uniformly random satisfying assignment to the TwoPair predicate. It is easy to see that each of the four variables is assigned a uniformly random value from 3\mathbbm{Z}_{3}, and also that the variables are pairwise independent. As any satisfying assignment to the TwoPair predicate also satisfies the 4NAT predicate, the work of Austrin and Mossel [AM09] immediately implies that (1ϵ,59+ϵ)(1-\epsilon,\frac{5}{9}+\epsilon)-approximating the 4NAT problem is 𝖭𝖯\mathsf{NP}-hard under the Unique Games conjecture. Thus, if we are willing to sacrifice a small amount in the completeness, we can improve the soundness parameter in Theorem 1.4. Whether we can improve upon the soundness without sacrificing perfect completeness is open.
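The properties of TwoPair used here are easy to verify by enumeration. The check below is ours: it lists the 18 satisfying assignments of TwoPair and confirms that each variable is uniform on Z_3 and that every pair of variables is uniform on Z_3², i.e., the variables are pairwise independent.

```python
import itertools
from collections import Counter

# TwoPair assignments: exactly two distinct values, each appearing twice
sat = [t for t in itertools.product(range(3), repeat=4)
       if sorted(Counter(t).values()) == [2, 2]]
assert len(sat) == 18

for i in range(4):  # uniform marginals
    assert Counter(t[i] for t in sat) == Counter({0: 6, 1: 6, 2: 6})

for i, j in itertools.combinations(range(4), 2):  # pairwise independence
    counts = Counter((t[i], t[j]) for t in sat)
    assert all(counts[p] == 2 for p in itertools.product(range(3), repeat=2))
```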

We now arrive at the proof of Theorem 1.4. The proof is entirely standard, and proceeds by reduction from dd-to-1 Label Cover. It makes use of our analysis of the 4NAT Test, which is presented in Section 3.2. One preparatory note: most of the proof concerns functions f:3K3f:\mathbbm{Z}_{3}^{K}\rightarrow\mathbbm{Z}_{3} and g:3dK3g:\mathbbm{Z}_{3}^{dK}\rightarrow\mathbbm{Z}_{3}. However, we will also be making use of Fourier-analytic notions defined in Section 2.3, and this requires dealing with functions whose range is U3U_{3} rather than 3\mathbbm{Z}_{3}. Thus, we associate ff and gg with the functions ωf\omega^{f} and ωg\omega^{g}, and whenever Fourier analysis is used it will actually be with respect to the latter two functions.

Proof.

Let G=(UV,E)G=(U\cup V,E) be a dd-to-1 Label Cover instance with alphabet size KK and dd-to-1 maps πe:[dK][K]\pi_{e}:[dK]\rightarrow[K] for each edge eEe\in E. We construct a 4NAT instance by replacing each vertex in GG with its Long Code and placing constraints on adjacent Long Codes corresponding to the tests made in the 4NAT Test. Thus, each uUu\in U is replaced by a copy of the hypercube 3K\mathbbm{Z}_{3}^{K} and labeled by the function fu:3K3f_{u}:\mathbbm{Z}_{3}^{K}\rightarrow\mathbbm{Z}_{3}. Similarly, each vVv\in V is replaced by a copy of the hypercube 3dK\mathbbm{Z}_{3}^{dK} and labeled by the function gv:3dK3g_{v}:\mathbbm{Z}_{3}^{dK}\rightarrow\mathbbm{Z}_{3}. Finally, for each edge {u,v}E\{u,v\}\in E, a set of 4NAT constraints is placed between fuf_{u} and gvg_{v} corresponding to the constraints made in the 4NAT Test, and given a weight equal to the probability the constraint is tested in the 4NAT Test multiplied by the weight of {u,v}\{u,v\} in GG. This produces a 4NAT instance whose weights sum to 1 and which is equivalent to the following test:

  • Pick an edge e=(u,v)Ee=(u,v)\in E uniformly at random.

  • Reorder the indices of gvg_{v} so that the kkth group of dd indices corresponds to πe1(k)\pi_{e}^{-1}(k).

  • Run the 4NAT Test on fuf_{u} and gvg_{v}, and accept iff the test accepts.

Completeness

If the original Label Cover instance is fully satisfiable, then there is an assignment FF, with F(u)[K]F(u)\in[K] for each uUu\in U and F(v)[dK]F(v)\in[dK] for each vVv\in V, for which 𝗏𝖺𝗅(F)=1\mathsf{val}(F)=1. Set each fuf_{u} to the dictator assignment fu(x)=xF(u)f_{u}(x)=x_{F(u)} and each gvg_{v} to the dictator assignment gv(y)=yF(v)g_{v}(y)=y_{F(v)}. Let e={u,v}Ee=\{u,v\}\in E. Because FF satisfies the constraint πe\pi_{e}, F(u)=πe(F(v))F(u)=\pi_{e}(F(v)). Thus, fuf_{u} and gvg_{v} correspond to “matching dictator” assignments, and above we saw that matching dictators pass the 4NAT Test with probability 1. As this applies to every edge in EE, the 4NAT instance is fully satisfiable.

Soundness

Assume that there are functions {fu}uU\{f_{u}\}_{u\in U} and {gv}vV\{g_{v}\}_{v\in V} which satisfy at least a 23+ϵ\frac{2}{3}+\epsilon fraction of the 4NAT constraints. Then there is at least an ϵ/2\epsilon/2 fraction of the edges e={u,v}Ee=\{u,v\}\in E for which fuf_{u} and gvg_{v} pass the 4NAT Test with probability at least 23+ϵ/2\frac{2}{3}+\epsilon/2. This is because otherwise the fraction of 4NAT constraints satisfied would be at most

(1ϵ2)(23+ϵ2)+ϵ2(1)=23+2ϵ3ϵ24<23+ϵ.\left(1-\frac{\epsilon}{2}\right)\left(\frac{2}{3}+\frac{\epsilon}{2}\right)+\frac{\epsilon}{2}(1)=\frac{2}{3}+\frac{2\epsilon}{3}-\frac{\epsilon^{2}}{4}<\frac{2}{3}+\epsilon.

Let EE^{\prime} be the set of such edges, and consider {u,v}E\{u,v\}\in E^{\prime}. Set L=dKL=dK. By Lemma 3.4,

23+ϵ2𝐏𝐫[fu and gv pass the 4NAT test]23+23(α3L|f^u(π3(α))||g^v(α)|2(12)#α),\frac{2}{3}+\frac{\epsilon}{2}\leq\mathop{\bf Pr\/}[\text{$f_{u}$ and $g_{v}$ pass the $\textsf{4NAT}$ test}]\leq\frac{2}{3}+\frac{2}{3}\left(\sum_{\alpha\in\mathbbm{Z}_{3}^{L}}\left|\hat{f}_{u}(\pi_{3}(\alpha))\right|\left|\hat{g}_{v}(\alpha)\right|^{2}\left(\frac{1}{2}\right)^{\#\alpha}\right),

meaning that

3ϵ4α3L|f^u(π3(α))||g^v(α)|2(12)#α.\frac{3\epsilon}{4}\leq\sum_{\alpha\in\mathbbm{Z}_{3}^{L}}\left|\hat{f}_{u}(\pi_{3}(\alpha))\right|\left|\hat{g}_{v}(\alpha)\right|^{2}\left(\frac{1}{2}\right)^{\#\alpha}. (8)

Parseval’s equation tells us that α3L|g^v(α)|2=1\sum_{\alpha\in\mathbbm{Z}_{3}^{L}}|\hat{g}_{v}(\alpha)|^{2}=1. The squared coefficients |g^v(α)|2|\hat{g}_{v}(\alpha)|^{2} therefore induce a probability distribution on the elements of 3L\mathbbm{Z}_{3}^{L}. As a result, we can rewrite Equation (8) as

3ϵ4𝐄αg^v[|f^u(π3(α))|(12)#α].\frac{3\epsilon}{4}\leq\mathop{\bf E\/}_{\alpha\sim\hat{g}_{v}}\left[\left|\hat{f}_{u}(\pi_{3}(\alpha))\right|\left(\frac{1}{2}\right)^{\#\alpha}\right]. (9)

As previously noted, |f^u(π3(α))||\hat{f}_{u}(\pi_{3}(\alpha))| is at most 1 for all α\alpha, so the expression inside this expectation is never greater than 1. We can thus conclude that

3ϵ8𝐏𝐫αg^v[|f^u(π3(α))|(12)#α3ϵ8]𝖦𝖮𝖮𝖣α,\frac{3\epsilon}{8}\leq\mathop{\bf Pr\/}_{\alpha\sim\hat{g}_{v}}\underbrace{\left[\left|\hat{f}_{u}(\pi_{3}(\alpha))\right|\left(\frac{1}{2}\right)^{\#\alpha}\geq\frac{3\epsilon}{8}\right]}_{\mathsf{GOOD}_{\alpha}},

as otherwise the expectation in Equation (9) would be less than 3ϵ/43\epsilon/4. Call the event in the probability 𝖦𝖮𝖮𝖣α\mathsf{GOOD}_{\alpha}. When 𝖦𝖮𝖮𝖣α\mathsf{GOOD}_{\alpha} occurs, the following happens:

  • |f^u(π3(α))|29ϵ2/64|\hat{f}_{u}(\pi_{3}(\alpha))|^{2}\geq 9\epsilon^{2}/64.

  • #αlog2(8/3ϵ)\#\alpha\leq\log_{2}(8/3\epsilon). Furthermore, as fuf_{u} is folded, #α>0\#\alpha>0.

This suggests the following randomized decoding procedure for each uUu\in U: pick an element β3K\beta\in\mathbbm{Z}_{3}^{K} with probability |f^u(β)|2|\hat{f}_{u}(\beta)|^{2} and choose one of its nonzero coordinates uniformly at random. Similarly, for each vVv\in V, pick an element α3L\alpha\in\mathbbm{Z}_{3}^{L} with probability |g^v(α)|2|\hat{g}_{v}(\alpha)|^{2} and choose one of its nonzero coordinates uniformly at random. In both cases, nonzero coordinates are guaranteed to exist because all the fuf_{u}’s and gvg_{v}’s are folded.
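The decoding procedure is straightforward to express in code. The sketch below is ours and purely illustrative: `coeffs` is a hypothetical table of Fourier coefficients (by folding, all weight sits on characters with at least one nonzero coordinate, and by Parseval the squared magnitudes sum to 1).

```python
import random

def decode(coeffs):
    """Sample a character alpha with probability |coeffs[alpha]|^2, then
    return a uniformly random nonzero coordinate of alpha."""
    alphas = list(coeffs)
    weights = [abs(coeffs[a]) ** 2 for a in alphas]
    alpha = random.choices(alphas, weights=weights)[0]
    nonzero = [i for i, c in enumerate(alpha) if c != 0]
    return random.choice(nonzero)  # nonempty by folding

# hypothetical coefficients over Z_3^3, concentrated on two characters
label = decode({(0, 1, 0): 0.9, (1, 0, 2): 0.44})
```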

Now we analyze how well this decoding scheme performs for the edges e={u,v}Ee=\{u,v\}\in E^{\prime} (we may assume the other edges are unsatisfied). Suppose that when the elements of 3K\mathbbm{Z}_{3}^{K} and 3L\mathbbm{Z}_{3}^{L} were randomly chosen, the element α\alpha drawn for gvg_{v} satisfied 𝖦𝖮𝖮𝖣α\mathsf{GOOD}_{\alpha}, and the element β\beta drawn for fuf_{u} equaled π3(α)\pi_{3}(\alpha). Then, as #αlog2(8/3ϵ)\#\alpha\leq\log_{2}(8/3\epsilon), and each label in π3(α)\pi_{3}(\alpha) has at least one label in α\alpha which maps to it, the probability that matching labels are drawn is at least 1/log2(8/3ϵ)1/\log_{2}(8/3\epsilon). Next, the probability that such an α\alpha and β\beta are drawn is

α𝖦𝖮𝖮𝖣|f^u(π3(α))|2|g^v(α)|29ϵ264α𝖦𝖮𝖮𝖣|g^v(α)|29ϵ2643ϵ8=27ϵ3512.\sum_{\alpha\in\mathsf{GOOD}}|\hat{f}_{u}(\pi_{3}(\alpha))|^{2}|\hat{g}_{v}(\alpha)|^{2}\geq\frac{9\epsilon^{2}}{64}\sum_{\alpha\in\mathsf{GOOD}}|\hat{g}_{v}(\alpha)|^{2}\geq\frac{9\epsilon^{2}}{64}\frac{3\epsilon}{8}=\frac{27\epsilon^{3}}{512}.

Combining these, the probability that this edge is satisfied is at least 27ϵ3/512log2(8/3ϵ)27\epsilon^{3}/512\log_{2}(8/3\epsilon). Thus, the decoding scheme satisfies at least

27ϵ3512log2(8/3ϵ)|E||E|27ϵ41024log2(8/3ϵ)\frac{27\epsilon^{3}}{512\log_{2}(8/3\epsilon)}\cdot\frac{|E^{\prime}|}{|E|}\geq\frac{27\epsilon^{4}}{1024\log_{2}(8/3\epsilon)}

fraction of the Label Cover edges in expectation. By the probabilistic method, an assignment to the Label Cover instance must therefore exist which satisfies at least this fraction of the edges.

We now apply Theorem 2.1, setting the soundness value in that theorem equal to O(ϵ5)O(\epsilon^{5}), which concludes the proof. ∎

References

  • [ABS10] Sanjeev Arora, Boaz Barak, and David Steurer. Subexponential algorithms for Unique Games and related problems. In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science, pages 563–572, 2010.
  • [AM09] Per Austrin and Elchanan Mossel. Approximation resistant predicates from pairwise independence. Computational Complexity, 18(2):249–271, 2009.
  • [BRS11] Boaz Barak, Prasad Raghavendra, and David Steurer. Rounding semidefinite programming hierarchies via global correlation. In Proceedings of the 52nd Annual IEEE Symposium on Foundations of Computer Science, 2011.
  • [DMR09] Irit Dinur, Elchanan Mossel, and Oded Regev. Conditional hardness for approximate coloring. SIAM Journal on Computing, 39(3):843–873, 2009.
  • [FJ97] Alan Frieze and Mark Jerrum. Improved approximation algorithms for MAX k-CUT and MAX BISECTION. Algorithmica, 18(1):67–81, 1997.
  • [FK94] Uriel Feige and Joe Kilian. Two prover protocols: low error at affordable rates. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing, pages 172–183, 1994.
  • [GKO+10] Venkatesan Guruswami, Subhash Khot, Ryan O’Donnell, Preyas Popat, Madhur Tulsiani, and Yi Wu. SDP gaps for 2-to-1 and other Label-Cover variants. In Proceedings of the 37th Annual International Colloquium on Automata, Languages and Programming, pages 617–628, 2010.
  • [GLST98] Venkatesan Guruswami, Daniel Lewin, Madhu Sudan, and Luca Trevisan. A tight characterization of NP with 3 query PCPs. In Proceedings of the 39th Annual IEEE Symposium on Foundations of Computer Science, pages 8–17, 1998.
  • [GS09] Venkatesan Guruswami and Ali Kemal Sinop. Improved inapproximability results for Maximum k-Colorable Subgraph. In Proceedings of the 12th Annual International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, pages 163–176, 2009.
  • [GS11] Venkatesan Guruswami and Ali Sinop. Lasserre hierarchy, higher eigenvalues, and approximation schemes for quadratic integer programming with PSD objectives. In Proceedings of the 52nd Annual IEEE Symposium on Foundations of Computer Science, 2011.
  • [GW04] Michel X. Goemans and David P. Williamson. Approximation algorithms for MAX-3-CUT and other problems via complex semidefinite programming. J. Comput. Syst. Sci., 68(2):442–470, 2004.
  • [Hås01] Johan Håstad. Some optimal inapproximability results. Journal of the ACM, 48(4):798–859, 2001.
  • [IP01] Russell Impagliazzo and Ramamohan Paturi. On the complexity of k-SAT. Journal of Computer and System Sciences, 62(2):367–375, 2001.
  • [Kho02] Subhash Khot. On the power of unique 2-prover 1-round games. In Proc. 34th ACM Symposium on Theory of Computing, pages 767–775, 2002.
  • [KKMO07] Subhash Khot, Guy Kindler, Elchanan Mossel, and Ryan O’Donnell. Optimal inapproximability results for Max-Cut and other 22-variable CSPs? SIAM Journal on Computing, 37(1):319–357, 2007.
  • [KR03] Subhash Khot and Oded Regev. Vertex Cover might be hard to approximate to within 2ϵ2-\epsilon. In Proc. 18th IEEE Conference on Computational Complexity, pages 379–386, 2003.
  • [MR10] Dana Moshkovitz and Ran Raz. Two-query PCP with subconstant error. Journal of the ACM, 57(5):29, 2010.
  • [OW09] Ryan O’Donnell and Yi Wu. Conditional hardness for satisfiable 33-CSPs. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, pages 493–502, 2009.
  • [OW12] Ryan O’Donnell and John Wright. A new point of NP-hardness for Unique-Games. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing, 2012.
  • [Rag08] Prasad Raghavendra. Optimal algorithms and inapproximability results for every CSP? In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, pages 245–254, 2008.
  • [Raz95] Ran Raz. A parallel repetition theorem. In Proceedings of the 27th Annual ACM Symposium on Theory of Computing, pages 447–456, 1995.
  • [Ste10] David Steurer. Subexponential algorithms for d-to-1 two-prover games and for certifying almost perfect expansion. Available at the author’s website, 2010.
  • [TSSW00] Luca Trevisan, Gregory Sorkin, Madhu Sudan, and David Williamson. Gadgets, approximation, and linear programming. SIAM Journal on Computing, 29(6):2074–2097, 2000.
  • [Zwi98] Uri Zwick. Approximation algorithms for constraint satisfaction problems involving at most three variables per constraint. In Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 201–210, 1998.