
Simple Combinatorial Construction of the $k^{o(1)}$-Lower Bound for Approximating the Parameterized $k$-Clique

Yijia Chen (Shanghai Jiao Tong University), Yi Feng (Shanghai University of Finance and Economics), Bundit Laekhanukit (Independent Researcher), Yanlin Liu (Ocean University of China)
Abstract

In the parameterized $k$-clique problem, or $k$-Clique for short, we are given a graph $G$ and a parameter $k\geq 1$. The goal is to decide whether there exist $k$ vertices in $G$ that induce a complete subgraph (i.e., a $k$-clique). This problem plays a central role in the theory of parameterized intractability as one of the first W[1]-complete problems. Existing research has shown that even an FPT-approximation algorithm for $k$-Clique with arbitrary ratio does not exist, assuming the Gap-Exponential-Time Hypothesis (Gap-ETH) [Chalermsook et al., FOCS’17 and SICOMP]. However, whether this inapproximability result can be based on the standard assumption of $\text{W[1]}\neq\text{FPT}$ remains unclear.

In a breakthrough work of Bingkai Lin [STOC’21], any constant-factor approximation of $k$-Clique is shown to be W[1]-hard, and subsequently, the inapproximation ratio is improved to $k^{o(1)}$ in the work of Karthik C.S. and Khot [CCC’22], and independently in [Lin, Ren, Sun, Wang; ICALP’22] (under the apparently stronger complexity assumption ETH). All the work along this line follows the framework developed by Lin, which starts from the $k$-vector-sum problem and requires some involved algebraic techniques.

This paper presents an alternative framework for proving the W[1]-hardness of the $k^{o(1)}$-FPT-approximation of $k$-Clique. Using this framework, we obtain a gap-producing self-reduction of $k$-Clique without any intermediate algebraic problem. More precisely, we reduce from $(k,k-1)$-Gap Clique to $(q^{k},q^{k-1})$-Gap Clique, for any function $q$ depending only on the parameter $k$, thus implying the $k^{o(1)}$-inapproximability result when $q$ is sufficiently large. Our proof is relatively simple and mostly combinatorial. At the core of our construction is a novel encoding of $k$-element subsets stemming from the theory of network coding and a (linear) Sidon set representation of a graph.

1 Introduction

In the $k$-Clique problem, we are given an $n$-vertex graph $G$ and an integer $k\geq 1$. The goal is to determine whether $G$ has a clique of size $k$. This problem has been a major point of interest in computer science and is one of Karp’s 21 NP-complete problems [Kar72]. The computational intractability of $k$-Clique has also been studied in the context of approximation algorithms. In the wake of the celebrated PCP theorem, a series of works [FGL+96, Hås01] have been dedicated to the construction of PCPs that capture the inapproximability of $k$-Clique (or more precisely, the Maximum Clique problem). To date, we know that even finding a clique of size $n^{\epsilon}$ from a graph $G$ promised to have a clique of size $n^{1-\epsilon}$, for any constant $\epsilon>0$, is NP-hard. The intractability of $k$-Clique appears again in the context of Parameterized Complexity. Here, we parameterize the $k$-Clique problem by the number $k$ with the expectation that $k$ is much smaller than the size of $G$. Abusing the notation, we call the resulting parameterized problem again $k$-Clique. It is well known that $k$-Clique is a W[1]-complete problem, hence admits no FPT-algorithms, i.e., algorithms with running time $t(k)\mathrm{poly}(n)$ for a computable function $t$, unless $\text{FPT}=\text{W[1]}$. Consequently, there is no FPT-algorithm that, on an input graph $G$ promised to have a $k$-clique, computes a clique of size $k$. A natural question arises whether there is an FPT-algorithm that computes a clique of size $k/f(k)$. Here, $f$ is a computable function such that $k/f(k)$ is unbounded and non-decreasing. Such an algorithm is called an FPT-approximation of $k$-Clique with approximation ratio $f$. Under an assumption apparently stronger than $\text{W[1]}\neq\text{FPT}$, namely the Gap Exponential-Time Hypothesis (Gap-ETH), the existence of any FPT-approximation algorithm for $k$-Clique has also been ruled out [CCK+20].
That is, $k$-Clique is totally FPT-inapproximable (there exists no $f(k)$-approximation algorithm for the parameterized $k$-Clique problem that runs in $t(k)\mathrm{poly}(n)$-time, for any computable functions $f$ and $t$, depending only on $k$ with $k/f(k)$ unbounded and non-decreasing) unless Gap-ETH is false. It has been a major open problem whether the standard parameterized intractability assumption (i.e., $\mathrm{W}[1]\neq\mathrm{FPT}$) also implies that the $k$-Clique problem is totally FPT-inapproximable.

In a breakthrough work [Lin21], Bingkai Lin shows that it is W[1]-hard to approximate the parameterized $k$-Clique problem within any constant factor. Later on, Karthik C.S. and Khot [KK22] improve the inapproximation ratio to $k^{o(1)}$ based on the same framework of Lin. The same lower bound is also recently obtained under ETH by Lin, Ren, Sun, and Wang [LRSW22]. All these results start from the $k$-vector-sum problem ($k$-Vector Sum for short), which might be viewed as an algebraization of $k$-Clique. Then $k$-Vector Sum is reduced to an appropriate CSP problem where a gap between Yes- and No-instances emerges through some tricky algebraic techniques. Finally, by applying the renowned FGLSS reduction [FGL+96], we obtain $k$-Clique instances with the desired gap.

In this paper, we show the same $k^{o(1)}$-approximation lower bound for $k$-Clique by a relatively simple proof. Although our gap-creating reduction is inspired by the framework of Lin [Lin21], we deviate from Lin by re-interpreting his construction as the composition of Sidon Sets and Network Coding applied directly to the $k$-Clique problem.

More precisely, a Sidon set is a set $S$ of vectors such that the sums of distinct pairs of vectors in $S$ are pairwise distinct. This allows us to “label” vertices in a graph with vectors so that the sum of any two adjacent vertices yields a unique “edge-label.” The second ingredient is the network coding technique that we use to compress the information of any $k$ vertices of the input graph as linear combinations of vectors from the Sidon set. Each linear combination becomes a “node” in the resulting graph. Now, if two linear combinations have coefficients that differ in one or two positions, then we can subtract them to determine the vertex or edge they encode. As every node encodes $k$ vertices of the input graph, we can arbitrarily pick each node as the center of the group and determine the validity of the encoding. This enables us to show the existence of some constant gap between Yes- and No-instances. In order to achieve a super-constant gap, we use the linear form of Sidon sets, i.e., linear Sidon sets (see Section 2.3 for more details). Besides linear Sidon sets and network coding, other parts of our proof are combinatorial.

As already mentioned, our construction gives the same inapproximation ratio as that in [KK22] by Karthik C.S. and Khot. However, our reduction has better parameters, which might be important for other applications. In fact, [KK22] presents a reduction from $k$-Vector Sum to $(q^{2k^{2}},q^{2k^{2}-1/k})$-Gap Clique (see Section 2 for the precise definition of Gap Clique), where $q$ is a prime greater than $2^{12k}$. Composing it with the reduction from $k$-Clique to $k$-Vector Sum [Lin21], which incurs a quadratic blowup in the parameter, we obtain a self-reduction from $(k,k-1)$-Gap Clique to $(q^{\Theta(k^{4})},q^{\Theta(k^{4}-1/k^{2})})$-Gap Clique, where $q=2^{\Omega(k^{2})}$ is a prime. On the other hand, our self-reduction from $k$-Clique creates instances of $(q^{k},q^{k-1})$-Gap Clique, where $q$ is an arbitrary prime power. As a by-product, this implies another result in [LRSW22] that there is no $t(k)n^{o(\log k)}$-time approximation algorithm for $k$-Clique with constant approximation ratio unless ETH fails. Furthermore, we believe that our reduction has the potential to be applied recursively to obtain an arbitrarily large gap. Very recently, Lin et al. [LRSW23b] proposed a new technique for obtaining constant inapproximability with a polynomial parameter blow-up under ETH. However, the constraint satisfaction problem (CSP) that their technique produces requires arity at least three, which means that it is not applicable to proving the inapproximability result under $\mathrm{W}[1]$-hardness, due to a specific technicality. Moreover, the gap amplification is limited to $O(\log k/\log\log k)$.

The comparison of parameter transformation in all the known reductions is shown in Table 1.

| Work | Hardness Source | Parameter Change | Inapprox Ratio | Running Time |
|---|---|---|---|---|
| [Lin21] | $k$-Vector Sum (from $k'$-Clique, $k=\binom{k'}{2}$) | $K=2^{\Omega(k^{3})}$ for $c>1$ | $O(1)$ ($2^{ck^{3}}$ vs $2^{ck^{3}-1}$) | $n^{\Omega(\sqrt[6]{\log K})}$ |
| [KK22] | $k$-Vector Sum (from $k'$-Clique, $k=\binom{k'}{2}$) | $K=q^{2k^{2}}$ for prime $q>2^{12k}$ | $K^{o(1)}$ ($q^{2k^{2}}$ vs $q^{2k^{2}-1/k}$) | $n^{\Omega\left(\frac{\log K}{\log q}\right)}$ |
| [LRSW22] | $k$-Vector Sum (from $3$-SAT) | $K=q^{2k}$ for $c>1$, $t=o(\log k)$ | $K^{o(1)}$ ($2^{ckt}$ vs $2^{(ck-1)t}$) | $n^{\Omega\left(\frac{\log K}{t}\right)}$ ($t=\Theta(\log q)$ for $q=2^{ct}$) |
| [LRSW23b] | $k$-var Vector $2$-CSP (from $3$-SAT) | $K=k^{\frac{\log k}{\log\log k}}$ for $c>1$, $t=o(\log k/\log\log k)$ | $K^{\Omega(1-\frac{1}{c\log K})}$ ($k^{t}$ vs $k^{ct}$) | $n^{\Omega\left(\frac{\log K}{\log\log K}\right)}$ ($t=\Theta(\log q)$ for $q=2^{ct}$) |
| This paper | $k$-Clique (self-reduction) | $K=q^{k}$ for any prime $q\geq 2$ | $q$ ($q^{k}$ vs $q^{k-1}$) | $n^{\Omega\left(\frac{\log K}{\log q}\right)}$ for any prime $q\geq 2$ |

Table 1: Comparison of Parameter Transformation and Running Time Lower Bound under ETH.

2 Preliminaries

We use standard terminology from graph theory. Let $G=(V,E)$ be a graph, where $V=V(G)$ is the vertex set and $E=E(G)$ is the edge set. An edge in $G$ between two distinct vertices $u,v\in V$ is denoted by either $uv\in E$ or $\{u,v\}\in E$, whichever is convenient. A $k$-clique is a complete subgraph of $G$ on $k$ vertices. We may refer to a clique in $G$ using a subset of vertices $S\subseteq V(G)$ that induces a complete subgraph.

In the $k$-Clique problem, we are given a graph $G$ and an integer $k\geq 1$, and the goal is to determine whether $G$ has a clique of size at least $k$. The Maximum Clique problem is a maximization variant of $k$-Clique, where we are asked to find a maximum-size clique in $G$. However, we will mostly abuse the name $k$-Clique also for its maximization variant, and here $k$ denotes the optimal solution, i.e., the size of the maximum clique in $G$. [Footnote 1: As far as FPT-approximation algorithms are concerned, the two versions are indeed equivalent [CGG07].] When we refer to the $k$-Clique problem in the context of parameterized complexity, we mean the $k$-Clique problem parameterized by $k$, i.e., the parameter $k$ is a small integer independent of the size of $G$.

The $(k,k')$-Gap Clique problem is a promise version of $k$-Clique that asks to decide whether the graph $G$ has a clique of size $k$ or every clique in $G$ has size at most $k'$. Again, $k$ is the standard parameter used when discussing parameterized algorithms and parameterized complexity.
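To make the promise concrete, here is a minimal brute-force sketch of deciding $(k,k')$-Gap Clique on tiny graphs. The function names `max_clique_size` and `gap_clique` are hypothetical and introduced only for illustration; the exhaustive search takes exponential time and is unrelated to the reductions discussed in this paper.

```python
from itertools import combinations

def max_clique_size(vertices, edges):
    """Brute-force maximum clique size (exponential time; tiny graphs only)."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(len(vertices), 0, -1):
        for cand in combinations(vertices, size):
            if all(frozenset(p) in edge_set for p in combinations(cand, 2)):
                return size
    return 0

def gap_clique(vertices, edges, k, k_prime):
    """Classify an instance of (k, k')-Gap Clique.

    Returns "yes" if some clique has size >= k, "no" if every clique has
    size <= k', and "promise violated" for instances in between.
    """
    omega = max_clique_size(vertices, edges)
    if omega >= k:
        return "yes"
    if omega <= k_prime:
        return "no"
    return "promise violated"

# A triangle with a pendant vertex, and a path on three vertices.
triangle_plus = ([0, 1, 2, 3], [(0, 1), (1, 2), (0, 2), (2, 3)])
path = ([0, 1, 2], [(0, 1), (1, 2)])
print(gap_clique(*triangle_plus, k=3, k_prime=2))  # yes
print(gap_clique(*path, k=3, k_prime=2))           # no
```

An algorithm for the gap problem may answer arbitrarily on instances that violate the promise, which is why the third return value is reported separately here.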

For a prime power $q\in\mathbb{N}$ we use $\mathbb{F}_{q}$ to denote the finite field with $q$ elements. Let $d\geq 1$. As usual, $\mathbb{F}_{q}^{d}$ is the vector space over $\mathbb{F}_{q}$, which consists of all vectors of the form

$$r=(r_{1},\ldots,r_{d})$$

where $r_{i}\in\mathbb{F}_{q}$ for every $i\in[d]$. We also use $r[i]$ to denote the $i$-th coordinate of $r$, i.e., $r[i]=r_{i}$. Recall that the standard unit vector $e_{i}\in\mathbb{F}_{q}^{d}$ is defined by

$$e_{i}[i']=\begin{cases}1&\text{if $i'=i$}\\ 0&\text{otherwise}.\end{cases}\tag{1}$$

Let $r,r'\in\mathbb{F}_{q}^{d}$. We define the set of coordinates (or positions) on which $r$ and $r'$ differ by

$$\mathrm{diff}(r,r')\coloneqq\big\{i\in[d]\bigm|r[i]\neq r'[i]\big\}.$$

Then the Hamming distance between $r$ and $r'$ is

$$\mathrm{Hamming}(r,r')\coloneqq\big|\mathrm{diff}(r,r')\big|.$$
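As a small illustration of these two definitions, the following sketch computes $\mathrm{diff}$ and the Hamming distance for vectors given as tuples of field elements. Representing elements of $\mathbb{F}_{q}$ by integers is an assumption made for the example; positions are reported 1-based to match $[d]$.

```python
def diff(r, r_prime):
    """Return the set of 1-based positions on which r and r_prime differ."""
    if len(r) != len(r_prime):
        raise ValueError("vectors must have the same length")
    return {i + 1 for i, (a, b) in enumerate(zip(r, r_prime)) if a != b}

def hamming(r, r_prime):
    """Hamming distance: the number of positions on which the vectors differ."""
    return len(diff(r, r_prime))

r = (1, 0, 2, 4)
r_prime = (1, 3, 2, 0)
print(sorted(diff(r, r_prime)))  # [2, 4]
print(hamming(r, r_prime))       # 2
```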

2.1 Parameterized Complexity

We follow the definitions and notations from [FG06]. In the context of computational complexity, a decision problem is defined as a set of strings over a finite alphabet $\Sigma$, sometimes called a language, say $\Pi\subseteq\Sigma^{*}$. A parameterization of a problem is a polynomial-time computable function $\kappa:\Sigma^{*}\rightarrow\mathbb{N}$. A parameterized decision problem is then defined as a pair $(\Pi,\kappa)$, where $\Pi\subseteq\Sigma^{*}$ is an arbitrary decision problem, and $\kappa$ is its parameterization.

The parameterization $\kappa$ gives a characterization of an instance of the designated problem that depends on the instance’s property but not on the input size. That is, the function $\kappa$ maps an instance of a problem to a small positive integer, and we may think of a parameterized decision problem as a problem where each input instance $x\in\Sigma^{*}$ is associated with a number $\kappa(x)$ which is typically much smaller than the length $|x|$ of $x$.

We say that a parameterized problem $(\Pi,\kappa)$ is fixed-parameter tractable (FPT) if there is an algorithm that, given a string $x\in\Sigma^{*}$, decides whether $x\in\Pi$ in time $t(\kappa(x))\mathrm{poly}(|x|)$, where $\mathrm{poly}(n)=\bigcup_{c>0}n^{c}$ and $t(k)$ is a computable function that depends only on the parameter $k=\kappa(x)$. A running time of the form $t(\kappa(x))\mathrm{poly}(|x|)$ is called FPT-time, and an algorithm that decides a parameterized problem in FPT-time is called an FPT algorithm. The complexity class FPT is the class of all parameterized problems that admit FPT algorithms. Similar to the polynomial hierarchy in the theory of NP-completeness, there exist the classes of the W-hierarchy such that

$$\text{FPT}\subseteq\text{W[1]}\subseteq\text{W[2]}\subseteq\cdots\subseteq\text{W[i]}\subseteq\cdots.$$

The classes W[1] and W[2] contain many natural complete problems, in particular the $k$-Clique problem and the $k$-Dominating Set problem, respectively. Thus, $k$-Clique admits no FPT algorithm unless $\text{W[1]}=\text{FPT}$, and similarly, $k$-Dominating Set admits no FPT algorithm unless $\text{W[2]}=\text{FPT}$ (which thus implies $\text{W[2]}=\text{W[1]}=\text{FPT}$).

FPT algorithms have been one of the tools for coping with NP-hard problems because, when the parameter $\kappa(x)$ is much smaller than $|x|$, an FPT algorithm is simply a polynomial-time algorithm. However, as mentioned, assuming $\text{W[1]}\neq\text{FPT}$, many important optimization problems like the maximum clique problem and the minimum dominating set problem admit no FPT algorithm. This motivates the search for an $f(k)$-FPT-approximation algorithm for those optimization problems, where $f$ is a computable function. That is, an FPT-time algorithm that, given an instance $x$ with the additional parameter $k$ such that the cost of an optimal solution, denoted by $\mathrm{opt}(x)$, satisfies

$$\begin{cases}\mathrm{opt}(x)\geq k&\text{for maximization problems}\\ \mathrm{opt}(x)\leq k&\text{for minimization problems},\end{cases}$$

produces a feasible solution $y$ with

$$\begin{cases}\mathrm{cost}(y)\geq\frac{k}{f(k)}&\text{for maximization problems}\\ \mathrm{cost}(y)\leq k\cdot f(k)&\text{for minimization problems}.\end{cases}$$

The function $f$ is known as the approximation ratio of the algorithm. It should be clear that, for maximization problems, we are only interested in ratios $f$ for which $k/f(k)$ is unbounded and non-decreasing.

The development of FPT-approximation algorithms has been at a fast pace in both upper bounds and lower bounds (i.e., the FPT-inapproximability results). See, e.g., [FSLM20] and the references therein. At present, two core problems in the area of parameterized complexity, the $k$-Clique problem and the $k$-Dominating Set problem, are known to be totally FPT-inapproximable, i.e., they admit no FPT-approximation algorithm parameterized by the size of the optimal solution $k$ for any ratio, unless Gap-ETH is false [CCK+20], and the total FPT-inapproximability of the $k$-Dominating Set problem under $\text{W[1]}\neq\text{FPT}$ was later proved by Karthik C.S., Laekhanukit, and Manurangsi in [SLM19]. There has been steady progress on the hardness of approximation for $k$-Clique and $k$-Dominating Set [LRSW23a]. Nevertheless, the question of whether $k$-Clique is totally FPT-inapproximable under $\text{W[1]}\neq\text{FPT}$ remains open.

To be formal, we say that a maximization (respectively, minimization) problem $\Pi$ parameterized by a standard parameter $k$ (e.g., the size of the optimal solution) is totally FPT-inapproximable if, for any computable non-decreasing functions $t(k)$ and $f(k)$, depending only on $k$, given an input of size $n$, there exists no algorithm running in time $t(k)\mathrm{poly}(n)$ that outputs a feasible solution with cost at least $k/f(k)$ (respectively, at most $f(k)\cdot k$ for minimization problems).

Open Problem 1 ($\mathrm{W}[1]$-hardness of $k$-Clique).

Does the total FPT-inapproximability of the $k$-Clique problem hold under $\mathrm{W}[1]\neq\mathrm{FPT}$?

A similar problem is also open for the $k$-Dominating Set problem.

Open Problem 2 ($\mathrm{W}[2]$-hardness of $k$-Dominating Set).

Does the total FPT-inapproximability of the $k$-Dominating Set problem hold under $\mathrm{W}[2]\neq\mathrm{FPT}$?

Lastly, another conjecture has been established as a generic tool for deriving FPT-inapproximability results, namely the Parameterized Intractability Hypothesis (PIH). The conjecture asks whether a $2$-Constraint Satisfaction problem ($2$-CSP) on $k$ variables, where $k$ is a constant and each variable, say $X_{i}$, takes a value from $[n]$, admits no constant-factor FPT-approximation algorithm. More formally, $2$-CSP is a constraint satisfaction problem in which each constraint involves exactly two variables, and it is hypothesized that there is no constant-factor approximation for such a $2$-CSP parameterized by $k$ in FPT-time:

Hypothesis 1 (Parameterized Intractability Hypothesis).

For some constant $\epsilon>0$, there is no $(1+\epsilon)$-factor FPT-approximation algorithm for $2$-CSP on $k$ variables parameterized by $k$, where each variable takes a value from $[n]$.

The result of Chalermsook et al. [CCK+20] implies that PIH holds under Gap-ETH, which has been improved to merely assuming ETH in a very recent breakthrough [GLR+24]. Nevertheless, it has been a major open problem whether PIH holds under $\mathrm{W}[1]$-hardness:

Open Problem 3 (Parameterized Intractability Hypothesis).

Does the Parameterized Intractability Hypothesis hold under $\mathrm{W}[1]\neq\mathrm{FPT}$?

2.2 Network Coding

Network coding is an information compression technique used in multi-input multicast networks. This approach was first introduced in [ACLY00], formalized in [HKM+03] for delay-free acyclic networks and then generalized to the general case in [HMS+03]. Please see [YLC06] for references therein.

In the model of random linear network coding, the data are transmitted from $k$ different source nodes to $k$ different destinations as $k$ vectors in the vector space $\mathbb{F}_{q}^{d}$, say $v_{1},\ldots,v_{k}\in\mathbb{F}^{d}_{q}$, where $q$ is a sufficiently large prime power. Whenever data packets meet at an intermediate node, the data are compressed and transmitted as a linear combination of the $k$ vectors with random coefficients, i.e.,

$$r_{1}v_{1}+r_{2}v_{2}+\ldots+r_{k}v_{k},$$

where $r_{i}\in\mathbb{F}_{q}$ for every $i\in[k]$.

Given that a sink node receives enough packets, it can decode the information correctly by solving the linear system. More precisely, in linear network coding with random coefficients, it was proved that $O(k)$ packets are enough to guarantee the existence of $k$ linearly independent vectors. Thus, there is a unique solution to the linear system, allowing the sink nodes to retrieve and correctly decode the information.
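The encoding and decoding steps just described can be sketched as follows. This is a minimal illustration under stated assumptions: the field is $\mathbb{F}_{p}$ for a prime $p$ (so arithmetic is plain integer arithmetic modulo $p$, which does not cover general prime powers), the helper names `encode` and `decode` are hypothetical, and decoding is done by ordinary Gauss-Jordan elimination on the coefficient matrix.

```python
def encode(coeffs, vectors, p):
    """Return the packet sum_i coeffs[i] * vectors[i] over F_p (p prime)."""
    d = len(vectors[0])
    return tuple(sum(c * v[j] for c, v in zip(coeffs, vectors)) % p
                 for j in range(d))

def decode(coeff_rows, packets, p):
    """Recover the k source vectors from k packets by Gauss-Jordan elimination mod p.

    Raises ValueError if the coefficient vectors are linearly dependent.
    """
    k = len(coeff_rows)
    # Augmented matrix [R | packets]; reducing R to the identity leaves the sources.
    aug = [list(row) + list(pkt) for row, pkt in zip(coeff_rows, packets)]
    for col in range(k):
        pivot = next((i for i in range(col, k) if aug[i][col] % p != 0), None)
        if pivot is None:
            raise ValueError("coefficient vectors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)  # modular inverse (Python 3.8+)
        aug[col] = [(x * inv) % p for x in aug[col]]
        for i in range(k):
            if i != col and aug[i][col] % p != 0:
                factor = aug[i][col]
                aug[i] = [(x - factor * y) % p for x, y in zip(aug[i], aug[col])]
    return [tuple(row[k:]) for row in aug]

p = 7
sources = [(1, 2, 3), (4, 5, 6), (0, 1, 6)]   # v_1, v_2, v_3 in F_7^3
coeffs = [(1, 2, 3), (0, 1, 4), (5, 6, 0)]    # three linearly independent coefficient vectors
packets = [encode(r, sources, p) for r in coeffs]
print(decode(coeffs, packets, p) == sources)  # True
```

With linearly dependent coefficient vectors the system has no unique solution, which the sketch reports by raising `ValueError`.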

The advantage of network coding is in the efficiency of the throughput, which is close to the optimum, and the amount of memory required to store the packets.

The connection between network coding and the hardness of approximating $k$-Clique was not known until the breakthrough result of Lin [Lin21], who implicitly applied the network coding approach to compress the information of $k$ vertices as one single vector. Once we have $k$ vectors with linearly independent coefficients, the information of the original $k$ vertices can be decoded uniquely, thus allowing one to check whether the encoded $k$ vertices form a $k$-clique or not. As every vector encodes information of $k$ vertices, every subset of $k$ linearly independent vectors gives a unique solution to the linear system, making the verifier reject a graph that has no $k$-clique by reading only a tiny portion of the vectors. This, consequently, gives an approximation hardness of the $k$-Clique problem through a very clever reduction from this specific probabilistic proof system to an instance of $k$-Clique.

Technically, network coding is the same as the Hadamard code, but it is more in line with our intuition of the self-reduction creating the gap for the $k$-Clique problem. See Section 3 for more detailed discussions.

2.3 Sidon Sets and Generalization

A Sidon set is a subset $S$ of an abelian group such that the sums of any two distinct elements in $S$ are all different, i.e., for any $x,y,x',y'\in S$ with $x\neq y$ and $x'\neq y'$, we have $x+y=x'+y'$ if and only if $\{x,y\}=\{x',y'\}$. Given a positive integer $n\geq 1$, a Sidon set $S\subseteq\mathbb{Z}/n\mathbb{Z}$ of size at least $n^{1/3}$ can be constructed in polynomial time using a greedy algorithm (the algorithm is attributed to Erdős): Start with a set $A=\emptyset$. Iteratively add to $A$ a number $x\in\{0,\ldots,n-1\}\setminus A$ such that $x$ cannot be written as $a+b-c$, for any $a,b,c\in A$, until no such number exists. Erdős and Turán [ET41] showed that a Sidon set of size $\sqrt{n}$ can also be constructed efficiently using quadratic residues. In particular, they showed that the set $S=\{2pk+(k^{2}):1\leq k\leq p\}$ is a Sidon set, where $p$ is a fixed prime, and all the operations are done under $\mathbb{F}_{p}$. Observe that if the vertices of a graph are from a Sidon set, then the endpoints of every edge sum to a unique element in the underlying abelian group. Thus, representing a graph’s vertices with a Sidon set makes it amenable to standard arithmetic operations, albeit over some finite field.
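Both constructions can be tried out on small inputs. The sketch below is illustrative only: the function names are hypothetical, `is_sidon` is a brute-force check of the definition above, and `erdos_turan_sidon` reads the formula as the integer set $\{2pk+(k^{2}\bmod p)\}$ for an odd prime $p$, which is an assumption about how the reduction modulo $p$ is applied.

```python
from itertools import combinations

def is_sidon(elements, modulus=None):
    """Check that sums of unordered pairs of distinct elements are pairwise distinct."""
    seen = set()
    for x, y in combinations(elements, 2):
        s = x + y if modulus is None else (x + y) % modulus
        if s in seen:
            return False
        seen.add(s)
    return True

def greedy_sidon(n):
    """Greedy construction in Z/nZ: add x unless x = a + b - c (mod n) for chosen a, b, c."""
    chosen = []
    forbidden = set()
    for x in range(n):
        if x in forbidden:
            continue
        chosen.append(x)
        # Taking a = c shows that every chosen element is itself forbidden.
        forbidden = {(a + b - c) % n for a in chosen for b in chosen for c in chosen}
    return chosen

def erdos_turan_sidon(p):
    """Integers 2*p*k + (k*k mod p) for 1 <= k <= p, with p assumed to be an odd prime."""
    return [2 * p * k + (k * k) % p for k in range(1, p + 1)]

print(is_sidon(greedy_sidon(50), modulus=50))  # True
print(is_sidon(erdos_turan_sidon(11)))         # True
print(is_sidon([0, 1, 2, 3]))                  # False, since 0 + 3 = 1 + 2
```

A single increasing pass suffices in `greedy_sidon` because the forbidden set only grows, so a number that is skipped once can never become eligible later.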

There are a few generalizations of Sidon sets [O’B04]. For our purposes, we require the linear form of Sidon sets studied by Ruzsa [Ruz93, Ruz95], which we call linear Sidon sets.

Definition 1.

Let $\mathbb{F}$ be a finite field and $d\geq 1$. A subset $S\subseteq\mathbb{F}^{d}$ is a linear Sidon set if for all $a,b\in\mathbb{F}^{*}$ $(=\mathbb{F}\setminus\{0\})$ and $x,y,x',y'\in S$ with $x\neq y$ and $x'\neq y'$ we have

$$ax+by=ax'+by'\quad\Longrightarrow\quad\{x,y\}=\{x',y'\}.$$

The above definition is a natural generalization of sum-free sets [CE16, Gre04] and of sets containing no three-term arithmetic progression. Our linear Sidon set, on the other hand, is a special case of the $(M,b)$-free set [Gre05, Sha09], defined as a set $S$ for which the linear system $Mx=b$ has no non-trivial solution with $x\in S^{\ell}$.
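Definition 1 can be checked by brute force on small examples. The sketch below assumes $\mathbb{F}=\mathbb{F}_{p}$ for a prime $p$, with vectors given as integer tuples; the name `is_linear_sidon` and the example sets are hypothetical and chosen only for illustration, not taken from the construction in Section 4.

```python
from itertools import permutations, product

def is_linear_sidon(S, p):
    """Brute-force check of Definition 1 for a set S of vectors over F_p (p prime)."""
    S = [tuple(v) for v in S]
    for a, b in product(range(1, p), repeat=2):   # all nonzero coefficients a, b
        seen = {}                                 # value of a*x + b*y -> pair {x, y}
        for x, y in permutations(S, 2):           # ordered pairs with x != y
            value = tuple((a * xi + b * yi) % p for xi, yi in zip(x, y))
            pair = frozenset((x, y))
            if seen.setdefault(value, pair) != pair:
                return False                      # same value from a different pair
    return True

# Points (t, t^2) on a parabola over F_13 with t in {0, 1, 3, 9}.
good = [(0, 0), (1, 1), (3, 9), (9, 3)]
# Three collinear points: 1*x + 6*y = 1*y + 6*z for x, y, z below (mod 7).
bad = [(0, 0), (1, 0), (2, 0)]
print(is_linear_sidon(good, 13))  # True
print(is_linear_sidon(bad, 7))    # False
```

The failing example reflects a general obstruction: three distinct collinear points always violate the definition for a suitable choice of $a$ and $b$.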

3 Overview of Our Proof

In this section, we outline the key ideas and steps in our construction. The input to our reduction is a graph $G$ and an integer $k\geq 1$. For technical reasons, we assume without loss of generality that $G$ is an instance of the multi-colored $k$-Clique problem, where

$$V(G)=\biguplus_{i\in[k]}V_{i},$$

and each vertex set $V_{i}$ is an independent set in $G$. Thus, any $k$-clique in $G$ must include exactly one vertex from each $V_{i}$. Our objective is to construct a graph $H$ from $G$ and $k$ such that there is a significant gap between the size of the maximum clique in $H$ for the Yes-Instance (where $G$ has a $k$-clique) and the No-Instance (where $G$ has no $k$-clique).

Informal Discussions.

Let us begin with an informal explanation of the intuition behind our reduction.

Generally, the gap between the Yes and No cases can be created using the $\ell$-wise graph product for some $\ell\geq 1$, generating the graph $G^{\ell}$ where each node in $G^{\ell}$ represents a clique of size $\ell$ in $G$. Thus, any $k$-clique in $G$ translates into a clique of size $k^{\ell}$ in $G^{\ell}$, creating a gap of $k^{\ell}$ versus $(k-1)^{\ell}$. However, achieving a constant gap requires $\ell$ to be at least $\Omega(k)$ (e.g., $\ell=k/2$), which results in a graph $G^{\ell}$ with size $n^{O(k)}$. This makes a standard graph product or self-reduction insufficient to achieve constant inapproximability under FPT-reduction.
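The linear dependence of $\ell$ on $k$ can be seen numerically. The following sketch computes the smallest $\ell$ for which $k^{\ell}$ versus $(k-1)^{\ell}$ reaches a target ratio; the function name `min_power_for_gap` and the target ratio of $2$ are assumptions chosen for illustration.

```python
def min_power_for_gap(k, gap=2):
    """Smallest ell >= 1 with k**ell >= gap * (k - 1)**ell, using exact integers."""
    if k < 2:
        raise ValueError("k must be at least 2")
    ell = 1
    while k ** ell < gap * (k - 1) ** ell:
        ell += 1
    return ell

for k in (2, 10, 100):
    print(k, min_power_for_gap(k))
# 2 1
# 10 7
# 100 69
```

For large $k$ the required $\ell$ is close to $k\ln 2\approx 0.69k$, since $(k/(k-1))^{\ell}\approx e^{\ell/k}$, so the product graph has $n^{\Theta(k)}$ nodes.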

To overcome this, we employ a technique from network coding theory to compress the representation of $G^{\ell}$. Specifically, we encode each subset of $k$ vertices in $G$ as a linear combination of $k$ vertices represented by vectors in $\mathbb{F}^{d}_{q}$. Each “node” in the new graph corresponds to a linear combination $\pi={\bf r}^{T}{\bf v}$, where ${\bf v}$ is a column vector of $k$ vertices.

The new graph consists of $q^{k}$ groups of nodes, or color classes, each corresponding to the coefficients of the linear combination. Each color class forms an independent set of nodes representing all possible linear combinations. Thus, any algorithm can select at most one node from each color class to form a clique. The edges between nodes from different color classes encode consistency between two linear combinations: an edge exists between two nodes if both encode the same “valid” information.

In network coding, the decoding is straightforward; the receiver node waits until $k$ linearly independent combinations arrive. Senders cannot deceive by sending inconsistent linear combinations since the linear system has a unique solution. However, in the construction of the $k$-Clique instance, each edge represents the decoding of information from only two linear combinations. In proof systems, the $k$-Clique instance can be viewed as a clique-test involving two-query tests performed $\binom{k}{2}$ times. Each two-query test decodes only two nodes and is effective only when the two linear combinations differ in at most two coordinates. If they differ in one coordinate, the subtraction yields a numerical encoding of a vertex in the original graph. If they differ in two coordinates, it produces the sum of two numbers, which encodes two different nodes and purportedly an edge. This creates a uniqueness issue, as a single equation with two variables may not have a unique solution. We resolve this by using a Sidon Set from additive combinatorics, which ensures that the addition or subtraction of any two elements in the set is unique, thereby addressing the non-uniqueness problem.

This approach departs from previous works [Lin21] and [KK22], which used a similar encoding but treated it as Hadamard codes. Karthik and Khot [KK22] resolved the uniqueness issue by leveraging properties of Hadamard codes, arguing that most two-query tests have unique solutions. In contrast, we rely on the Sidon Set’s properties to resolve these issues.

Our proof further diverges by viewing the encoding through the lens of network coding, using linear independence to argue consistency. Specifically, it is not guaranteed that two different nodes encode the same information, an issue addressed in locally testable codes through local codeword testing. However, we bypass local testing by arguing via linear independence, a technique applicable exclusively to the clique problem. For any node $v$, we treat it as the center of a ball, adding $k$ nodes whose coefficients differ in one position (within Hamming distance one). The $k$ nodes are linearly independent, ensuring a unique solution and preventing adversaries from feeding inconsistent information. This allows us to completely bypass local testing, an advantage unique to the clique problem.

More concretely, our argument proceeds as follows. If the input graph contains a $k$-clique, then there exists a clique of size $q^{k}$. Otherwise, let $Q$ be a clique provided by any algorithm. We know that each node in $Q$ is from a different color class. Selecting any node $v$ as a center, we consider the nodes around it within Hamming distance one (with respect to coordinates). As discussed, these nodes form a linearly independent set, yielding a unique solution. Moreover, any two non-center nodes that are in different directions from the center are at Hamming distance two, triggering an edge test. An edge exists between these two nodes if their subtraction decodes to an edge in the original graph.

Combining the consistency property with the uniqueness of decoding, we conclude that each ball has non-center nodes in all $k$ directions away from the center if and only if their decoding forms a clique in the original graph. Therefore, we establish the following: (1) there are $q^{k}$ centers, leading to $q^{k}$ balls, and (2) each ball misses at least one direction, resulting in the loss of $q$ balls in that direction. Using a two-term Sidon Set, we show that each of the missing $q$ balls can belong to at most $k$ balls, implying that if the original graph has no $k$-clique, the resulting graph can have a clique of size at most $q^{k}\cdot k/q=k\,q^{k-1}$. Finally, we will show that double counting does not occur if we employ a higher-order Sidon Set, say a four-term Sidon Set.

More formal description.

Now let us give a more formal description of our reduction at a high level. To begin with, we identify the vertex set $V(G)$ with a linear Sidon set $S$ in the vector space $\mathbb{F}_{q}^{d}$ for an appropriate integer $d\geq 1$. We will show in Section 4 that such $S=V(G)$ can be constructed with the size of $\mathbb{F}_{q}^{d}$ polynomially bounded by $|V(G)|$. As the second step, for a fixed vector $r=(r_{1},\ldots,r_{k})\in\mathbb{F}_{q}^{k}$ every $k$-element subset $\{v_{1},\ldots,v_{k}\}\subseteq V(G)$ with $v_{i}\in V_{i}$ for every $i\in[k]$ is associated with

$$\pi=r_{1}v_{1}+r_{2}v_{2}+\cdots+r_{k}v_{k},\tag{2}$$

exactly the same way as we transmit $k$ vectors $v_{1},\ldots,v_{k}$ by network coding using $r_{1},\ldots,r_{k}$ as coefficients. Thereby, for every $r\in\mathbb{F}_{q}^{k}$ we have a copy of $\mathbb{F}_{q}^{d}$ as the column $C_{r}$ of vertices in $H$ indexed by $r$.

Given that the size of 𝔽qd\mathbb{F}_{q}^{d} is polynomial in |V(G)||V(G)|, there are only a polynomially bounded number of linear combinations of kk vectors. However, the number of kk-element subsets of V(G)V(G) is Ω(nk)\Omega(n^{k}). This is simply because each vector π\pi obtained in (2) encodes many different kk-element subsets. However, if we have two π\pi and π\pi^{\prime} from two columns CrC_{r} and CrC_{r^{\prime}} with the Hamming distance between r=(r1,,rk)r=(r_{1},\ldots,r_{k}) and r=(r1,,rk)r^{\prime}=(r^{\prime}_{1},\ldots,r^{\prime}_{k}) being one, i.e., ririr_{i}\neq r^{\prime}_{i} for exactly one i[k]i\in[k], then ππ\pi-\pi^{\prime} will give us a unique vertex viViv_{i}\in V_{i} with

vi=(riri)1(ππ).v_{i}=(r_{i}-r^{\prime}_{i})^{-1}(\pi-\pi^{\prime}). (3)

Of course, it could happen that (riri)1(ππ)(r_{i}-r^{\prime}_{i})^{-1}(\pi-\pi^{\prime}) is not an element in the linear Sidon set S=V(G)S=V(G). Hence, we only add an edge between π\pi and π\pi^{\prime} in case (3) is really a vertex in GG. In other words, the edges between columns rr and rr^{\prime} test whether two encoding πCr\pi\in C_{r} and πCr\pi^{\prime}\in C_{r^{\prime}} are consistent on their ii-th positions where ririr_{i}\neq r^{\prime}_{i}. We might say that π\pi and π\pi^{\prime} pass the vertex test by the vertex viV(G)v_{i}\in V(G).
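The encoding (2) and the decoding step (3) can be tried out on a toy instance. The following is a minimal sketch, assuming a prime field $\mathbb{F}_p$ represented by integers modulo `P` and vectors stored as tuples; the names `encode` and `decode`, the prime `P = 5`, and the sample vectors are illustrative choices, not part of the paper.

```python
# Sketch only: arithmetic in F_p for a small prime p (an assumption for
# illustration; the paper allows any finite field F_q).
P = 5

def scale(a, v, p=P):
    """Multiply the vector v by the scalar a in F_p."""
    return tuple((a * x) % p for x in v)

def add(u, v, p=P):
    return tuple((x + y) % p for x, y in zip(u, v))

def sub(u, v, p=P):
    return tuple((x - y) % p for x, y in zip(u, v))

def encode(r, vs, p=P):
    """pi = r_1 v_1 + ... + r_k v_k, as in (2)."""
    pi = (0,) * len(vs[0])
    for ri, vi in zip(r, vs):
        pi = add(pi, scale(ri, vi, p), p)
    return pi

def decode(r, pi, r2, pi2, p=P):
    """Recover (i, v_i) from two encodings whose coefficient vectors
    differ in exactly one position i, as in (3)."""
    diff = [i for i in range(len(r)) if r[i] != r2[i]]
    assert len(diff) == 1, "coefficient vectors must have Hamming distance one"
    i = diff[0]
    inv = pow((r[i] - r2[i]) % p, -1, p)  # modular inverse, Python 3.8+
    return i, scale(inv, sub(pi, pi2, p), p)

# Hypothetical sample data: three vectors in F_5^3 and two coefficient
# vectors that differ only in their second position.
vs = [(1, 0, 2), (0, 1, 3), (4, 4, 1)]
r, r2 = (2, 3, 1), (2, 0, 1)
i, v = decode(r, encode(r, vs), r2, encode(r2, vs))
passes_vertex_test = v in vs  # the reduction would check v against V_i
print(i, v, passes_vertex_test)
```

Note that `decode` always returns some vector; the vertex test of the reduction corresponds to the additional membership check in the last lines.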

Now assume that, for $r\in\mathbb{F}^{k}_{q}$, we have for every $i\in[k]$ an $r^{i}\in\mathbb{F}_{q}^{k}$ which differs from $r$ only on the $i$-th position. Furthermore, for $\pi\in C_{r}$, assume that there are $\pi^{1}\in C_{r^{1}},\ldots,\pi^{k}\in C_{r^{k}}$ all adjacent to $\pi$ in $H$. By the above construction, we have $v_{1}\in V_{1},\ldots,v_{k}\in V_{k}$ with each pair $\pi$ and $\pi^{i}$ passing the vertex test by the vertex $v_{i}$. Then if $v_{i}$ and $v_{i'}$ are adjacent in $G$ for all $1\leq i<i'\leq k$, the original graph $G$ has a $k$-clique, i.e., $\{v_{1},\ldots,v_{k}\}$. On the other hand, it is easy to see that $r^{i}$ and $r^{i'}$ differ on exactly two positions, i.e., $i$ and $i'$. As $v_{i}$ and $v_{i'}$ are both from the linear Sidon set $S=V(G)$, they are uniquely determined by $\pi^{i}-\pi^{i'}$. Thus we add an edge between $\pi^{i}$ and $\pi^{i'}$ exactly when there is an edge between the corresponding $v_{i}$ and $v_{i'}$ in $G$. As a consequence,

if π,π1,,πk induce a clique in H, then G has a k-clique.\text{if $\pi,\pi^{1},\ldots,\pi^{k}$ induce a clique in $H$, then $G$ has a $k$-clique}. (4)

More generally, suppose we have two $r=(r_{1},\ldots,r_{k})$ and $r'=(r'_{1},\ldots,r'_{k})$ in $\mathbb{F}_{q}^{k}$ with Hamming distance two, i.e., $r_{i}=r'_{i}$ for all $i\in[k]$ except two positions $1\leq i_{1}<i_{2}\leq k$. Moreover, let $\pi\in C_{r}$ and $\pi'\in C_{r'}$ with

ππ=(ri1ri1)v+(ri2ri2)v.\pi-\pi^{\prime}=(r_{i_{1}}-r^{\prime}_{i_{1}})v+(r_{i_{2}}-r^{\prime}_{i_{2}})v^{\prime}. (5)

Then there is an edge between $\pi$ and $\pi'$ if and only if $v$ and $v'$ are adjacent in $G$. That is, $\pi$ and $\pi'$ pass the edge test with the edge $vv'\in E(G)$.

To summarize, the graph HH consists of columns CrC_{r} of vertices indexed by r𝔽qkr\in\mathbb{F}_{q}^{k}. Each column CrC_{r} is a copy of 𝔽qd\mathbb{F}_{q}^{d} which is supposed to encode a kk-element subset of V(G)V(G) by the linear combination of the form (2). The edges between two columns CrC_{r} and CrC_{r^{\prime}} depend on the Hamming distance between rr and rr^{\prime}. They correspond to the vertex test and the edge test if the distance between rr and rr^{\prime} is one or two, respectively. If it is more than two, then we will add all the edges between CrC_{r} and CrC_{r^{\prime}}. Figure 1 illustrates a part of the construction of HH.

[Figure 1, diagram not reproduced: the columns $C_{r}$, $C_{r^{1}}$, $C_{r^{2}}$, $C_{r^{\infty}}$ with one vertex $\pi$, $\pi^{1}$, $\pi^{2}$, $\pi^{\infty}$ in each.]
Figure 1: Let $r,r^{1},r^{2},r^{\infty}\in\mathbb{F}_{q}^{k}$ with $\mathrm{Hamming}(r,r^{1})=1$, $\mathrm{Hamming}(r,r^{2})=1$, $\mathrm{Hamming}(r^{1},r^{2})=2$, and $\mathrm{Hamming}(r,r^{\infty})\geq 3$. More precisely, say $r$ and $r^{1}$ differ on their first positions, and $r$ and $r^{2}$ on their second positions, hence $r^{1}$ and $r^{2}$ differ exactly on their first and second positions. As a consequence, the edge between $\pi\in C_{r}$ and $\pi^{1}$ means that $\pi-\pi^{1}=(r[1]-r^{1}[1])v_{1}$ for some $v_{1}\in V_{1}$, and similarly the edge $\pi\pi^{2}$ means that $\pi-\pi^{2}=(r[2]-r^{2}[2])v_{2}$ for some $v_{2}\in V_{2}$. Here, we use $r[1]$ to denote the first coordinate of the vector $r\in\mathbb{F}_{q}^{k}$, and similarly, $r^{1}[1]$ is the first coordinate of $r^{1}$. Furthermore, the edge between $\pi^{1}$ and $\pi^{2}$ implies that $v_{1}v_{2}$ is an edge in the original graph $G$. Finally, since $\mathrm{Hamming}(r,r^{\infty})\geq 3$, there is an edge between $\pi$ and any $\pi^{\infty}\in C_{r^{\infty}}$.

If $G$ has a $k$-clique with vertices $v_{1},\ldots,v_{k}$, then for every column $C_{r}$ we can pick $\pi\in C_{r}$ as in (2). Our construction ensures that these vertices form a clique in $H$ of size $q^{k}$. Otherwise, i.e., if every clique in $G$ has size at most $k-1$, we will argue that the size of a maximum clique $K$ in $H$ is on the order of $q^{k-1}$. The key observation is that, as implied by (4), for every $r\in\mathbb{F}_{q}^{k}$, if there is a $\pi\in K\cap C_{r}$, then there is at least one $i\in[k]$ such that for all $r'$ which differ from $r$ in exactly the $i$-th position, the column $C_{r'}$ contains no vertex in $K$.

4 Construction of Linear Sidon Sets

This section presents the construction of a linear Sidon set of a given size using a greedy algorithm similar to that of Erdös mentioned in the Preliminaries.

Fix a finite field 𝔽\mathbb{F} and d1d\geq 1. To ease the presentation, we introduce a further technical notion.

Definition 2.

A subset S𝔽dS\subseteq\mathbb{F}^{d} is 44-term linearly independent if every SSS^{\prime}\subseteq S with |S|4|S^{\prime}|\leq 4 is linearly independent. Equivalently, every xSx\in S is not a linear combination of three (not necessarily distinct) vectors in S{x}S\setminus\{x\}.

Lemma 1.

Let S𝔽dS\subseteq\mathbb{F}^{d} be 44-term linearly independent. Then SS is a linear Sidon set.

Proof.

Let $a,b\in\mathbb{F}^{*}=\mathbb{F}\setminus\{0\}$ and $x,y,x',y'\in S$ be such that $x\neq y$, $x'\neq y'$, and $ax+by=ax'+by'$. Towards a contradiction, assume that $\{x,y\}\neq\{x',y'\}$. By symmetry, we can further assume without loss of generality

x{y,x,y}.x\notin\{y,x^{\prime},y^{\prime}\}.

Note that

x=a1(ax+byby)=x+a1bya1by.x=a^{-1}(ax^{\prime}+by^{\prime}-by)=x^{\prime}+a^{-1}by^{\prime}-a^{-1}by.

Thus xx is a linear combination of x,y,yS{x}x^{\prime},y^{\prime},y\in S\setminus\{x\}, contradicting Definition 2. ∎

By induction, we will construct a sequence

S0S1Si𝔽dS_{0}\subsetneq S_{1}\subsetneq\ldots\subsetneq S_{i}\subsetneq\ldots\subsetneq\mathbb{F}^{d}

of 44-term linearly independent subsets of 𝔽d\mathbb{F}^{d}, hence linear Sidon sets by Lemma 1. We start with S0S_{0}\coloneqq\emptyset which vacuously satisfies Definition 2. Let i0i\geq 0 and assume that the 44-term linearly independent Si𝔽dS_{i}\subseteq\mathbb{F}^{d} has already been constructed with |Si|=i|S_{i}|=i. We define a set

\[ \mathrm{span}(S_{i})\coloneqq\big\{ax+by+cx'\bigm| a,b,c\in\mathbb{F} \text{ and } x,y,x'\in S_{i}\big\}. \]

It is easy to see

|span(Si)||Si|3|𝔽|3=i3|𝔽|3.|\mathrm{span}(S_{i})|\leq|S_{i}|^{3}\cdot|\mathbb{F}|^{3}=i^{3}\cdot|\mathbb{F}|^{3}.

Assume

i3|𝔽|3<|𝔽|d=|𝔽d|,i^{3}\cdot|\mathbb{F}|^{3}<|\mathbb{F}|^{d}=\left|\mathbb{F}^{d}\right|, (6)

then there is a

w𝔽dspan(Si).w\in\mathbb{F}^{d}\setminus\mathrm{span}(S_{i}).

Choose arbitrarily such a ww and let

Si+1Si{w}.S_{i+1}\coloneqq S_{i}\cup\{w\}.

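Then $S_{i+1}$ is again $4$-term linearly independent. Indeed, $w$ is not a linear combination of three vectors in $S_{i}$. Conversely, if some $x\in S_{i}$ were a linear combination $x=aw+by+cz$ with $y,z\in S_{i}\setminus\{x\}$, then $a\neq 0$ because $S_{i}$ is $4$-term linearly independent, and hence $w=a^{-1}(x-by-cz)$ would belong to $\mathrm{span}(S_{i})$. (For $i=0$ the set $\mathrm{span}(S_{0})$ is empty, so we additionally require $w\neq 0$.)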
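The greedy step can be sketched in a few lines of code. The following is a brute-force illustration, assuming a prime field $\mathbb{F}_p$ (integers modulo `p`) and tiny parameters; the function names are hypothetical, the candidate $w$ is taken in lexicographic order rather than arbitrarily, and the zero vector is always excluded by hand, since the set-builder definition of $\mathrm{span}(S_i)$ only contains it once $S_i$ is non-empty.

```python
from itertools import product

def span3(S, p):
    """All vectors a*x + b*y + c*z with a, b, c in F_p and x, y, z in S."""
    S = list(S)
    if not S:
        return set()
    d = len(S[0])
    out = set()
    for x, y, z in product(S, repeat=3):
        for a, b, c in product(range(p), repeat=3):
            out.add(tuple((a * x[t] + b * y[t] + c * z[t]) % p for t in range(d)))
    return out

def greedy_independent_set(n, p, d):
    """Greedily pick n vectors of F_p^d, each outside span3 of the earlier ones."""
    S = []
    for _ in range(n):
        forbidden = span3(S, p)
        forbidden.add((0,) * d)  # assumption: never pick the zero vector
        w = next((v for v in product(range(p), repeat=d) if v not in forbidden), None)
        if w is None:
            raise ValueError("no admissible vector left; increase d")
        S.append(w)
    return S

def is_4_term_independent(S, p):
    """Check the equivalent form in Definition 2: no x in S is a combination
    of three vectors of S minus {x} (and no x is the zero vector)."""
    return all(
        any(x) and x not in span3([y for y in S if y != x], p)
        for x in S
    )

S = greedy_independent_set(5, 3, 5)
print(S, is_4_term_independent(S, 3))
```

The enumeration of $\mathrm{span}(S_i)$ costs roughly $i^{3}\cdot|\mathbb{F}|^{3}$ steps, which mirrors the counting bound used in (6); it is meant for experimentation with small fields only.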

Theorem 1.

Let 𝔽q\mathbb{F}_{q} be a finite field and n1n\geq 1. We set

d3lognlogq+3.d\coloneqq\lceil\frac{3\log n}{\log q}+3\rceil.

Thus

nq(d3)/3.n\leq q^{(d-3)/3}.

Then we can construct a 4-term linearly independent set S𝔽qdS\subseteq\mathbb{F}_{q}^{d} with |S|=n|S|=n and |𝔽qd|q4n3|\mathbb{F}_{q}^{d}|\leq q^{4}\cdot n^{3} in time polynomial in n+qn+q. Observe that SS is also a linear Sidon set by Lemma 1.

Proof.

Following the greedy strategy as we described, we construct a sequence of 4-term linearly independent

S0S1Si𝔽d.S_{0}\subsetneq S_{1}\subsetneq\ldots\subsetneq S_{i}\subsetneq\ldots\subsetneq\mathbb{F}^{d}.

As

(n1)3|𝔽q|3<n3|𝔽q|3=n3q3qd3q3=qd=|𝔽q|d,(n-1)^{3}\cdot|\mathbb{F}_{q}|^{3}<n^{3}\cdot|\mathbb{F}_{q}|^{3}=n^{3}\cdot q^{3}\leq q^{d-3}\cdot q^{3}=q^{d}=|\mathbb{F}_{q}|^{d},

by (6) we conclude that SnS_{n} can be constructed. Thus we can take SSnS\coloneqq S_{n} with |S|=n|S|=n.

To see that the greedy algorithm runs in polynomial time, it suffices to observe that

|𝔽qd|=qd=q3lognlogq+3n3q4.|\mathbb{F}_{q}^{d}|=q^{d}=q^{\lceil\frac{3\log n}{\log q}+3\rceil}\leq n^{3}\cdot q^{4}.\qed
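The choice of $d$ and the two size bounds used in this proof are easy to check numerically. The sketch below is only an illustration: the helper name `dimension_for` is hypothetical, and it computes $d=\lceil 3\log n/\log q\rceil+3$ with integer arithmetic (as the least $d$ with $q^{d-3}\geq n^{3}$) to sidestep floating-point rounding when $n$ is an exact power of $q$.

```python
def dimension_for(n, q):
    """Least d with q**(d-3) >= n**3, i.e. d = ceil(3 log n / log q) + 3."""
    d = 3
    while q ** (d - 3) < n ** 3:
        d += 1
    return d

# Example values (chosen for illustration): n = 10 vertices over F_3.
n, q = 10, 3
d = dimension_for(n, q)
# Lower bound used to run the greedy argument, upper bound on the ambient space.
print(d, n ** 3 * q ** 3 <= q ** d, q ** d <= q ** 4 * n ** 3)
```

The first comparison is the inequality that lets the greedy construction reach $S_n$, and the second is the bound $|\mathbb{F}_q^d|\leq q^{4}\cdot n^{3}$ from the statement of Theorem 1.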

4.1 Multi-term linearly independent sets

For the construction in Subsection 5.4, we need to consider linear combinations of more than two vectors in a given set SS. This leads to the following definition.

Definition 3.

Let 𝔽\mathbb{F} be a finite field and t1t\geq 1. A subset S𝔽dS\subseteq\mathbb{F}^{d} is tt-term linearly independent if every SSS^{\prime}\subseteq S with |S|t|S^{\prime}|\leq t is linearly independent.

The construction of $t$-term linearly independent sets is a straightforward generalization of Theorem 1; we leave the details to the reader.

Theorem 2.

Let 𝔽q\mathbb{F}_{q} be a finite field, t1t\geq 1 and n1n\geq 1. We set

d(2t1)lognlogq+2t1.d\coloneqq\lceil\frac{(2t-1)\log n}{\log q}+2t-1\rceil.

Thus

nq(d(2t1))/(2t1).n\leq q^{(d-(2t-1))/(2t-1)}.

Then we can construct a tt-term linearly independent set S𝔽qdS\subseteq\mathbb{F}_{q}^{d} with |S|=n|S|=n in time polynomial in nt+qtn^{t}+q^{t}.

5 Reduction from (k,k1)(k,k-1)-Gap Clique to (qk,qk1)(q^{k},q^{k-1})-Gap Clique

In this section, we first present a reduction from kk-Clique, or equivalently (k,k1)(k,k-1)-Gap Clique, to (qk,kqk1)(q^{k},k\cdot q^{k-1})-Gap Clique. This in fact already implies that the Maximum kk-Clique problem admits no FPT-approximation algorithm with approximation ratio ko(1)k^{o(1)}, unless FPT=W[1]\text{FPT}=\text{W[1]}. Then we explain how to modify our construction to get a reduction from (k,k1)(k,k-1)-Gap Clique to (qk,qk1)(q^{k},q^{k-1})-Gap Clique. It is less transparent than the first reduction, but will enable us to obtain a lower bound in [LRSW22] under ETH.

5.1 The reduction

Let (G,k)(G,k) be an instance of the multi-colored kk-Clique problem. In particular,

V(G)=i[k]Vi,V(G)=\biguplus_{i\in[k]}V_{i}, (7)

and each ViV_{i} is an independent set in GG. We construct a graph HH as follows.

  • Let 𝔽q\mathbb{F}_{q} be a finite field where qq is to be determined later. Moreover, let n|V(G)|n\coloneqq|V(G)| and d3lognlogq+3d\coloneqq\lceil\frac{3\log n}{\log q}+3\rceil. So, by Theorem 1, we can assume without loss of generality that V(G)𝔽qdV(G)\subseteq\mathbb{F}_{q}^{d} is a linear Sidon set of size nn.

  • For every r𝔽qkr\in\mathbb{F}_{q}^{k} we define the column with index rr as

    Cr{Πr,π|π𝔽qd}.C_{r}\coloneqq\Big{\{}\Pi_{r,\pi}\Bigm{|}\pi\in\mathbb{F}_{q}^{d}\Big{\}}.

    Here Πr,π\Pi_{r,\pi} is a unique vertex associated with rr and π\pi. Then we set

    V(H)r𝔽qkCr.V(H)\coloneqq\bigcup_{r\in\mathbb{F}_{q}^{k}}C_{r}. (8)
  • We still need to define the edge set E(H)E(H) for the graph HH. Let r,r𝔽qkr,r^{\prime}\in\mathbb{F}_{q}^{k} and Πr,πCr\Pi_{r,\pi}\in C_{r}, Πr,πCr\Pi_{r^{\prime},\pi^{\prime}}\in C_{r^{\prime}}. We distinguish the following cases.

    1. (H1)

      If r=rr=r^{\prime}, then there is no edge between Πr,π\Pi_{r,\pi} and Πr,π\Pi_{r^{\prime},\pi^{\prime}}.

    2. (H2)

      If $\mathrm{Hamming}(r,r')=1$, say $r'=r+ae_{i}$ for some $a\in\mathbb{F}_{q}^{*}$ and $i\in[k]$ (recall that $e_{i}$ is the unit vector defined in (1)), then

      \[ \Pi_{r,\pi}\Pi_{r',\pi'}\in E(H) \iff \pi'-\pi=av \text{ for some } v\in V_{i}\subseteq V(G). \tag{9} \]

      That is, π\pi and π\pi^{\prime} pass vertex test with the vertex vv as described in Section 3.

    3. (H3)

      If $\mathrm{Hamming}(r,r')=2$, say $r'=r+ae_{i}+be_{j}$ for some $a,b\in\mathbb{F}_{q}^{*}$ and distinct $i,j\in[k]$, then

      \[ \Pi_{r,\pi}\Pi_{r',\pi'}\in E(H) \iff \pi'-\pi=au+bv \text{ for some } u\in V_{i} \text{ and } v\in V_{j} \text{ with } uv\in E(G). \tag{10} \]

      Hence, $\pi$ and $\pi'$ pass the edge test with the edge $uv$. Note that by Definition 1 the vertices $u$ and $v$, if they exist, are unique.

    4. (H4)

      If Hamming(r,r)3\mathrm{Hamming}(r,r^{\prime})\geq 3, then Πr,πΠr,πE(H)\Pi_{r,\pi}\Pi_{r^{\prime},\pi^{\prime}}\in E(H).

Lemma 2.

The graph HH can be constructed in time polynomial in |V(G)|+qk|V(G)|+q^{k}.

Proof.

First, we identify V(G)V(G) with a linear Sidon set S𝔽qdS\subseteq\mathbb{F}_{q}^{d}. By Theorem 1, this can be done in time polynomial in |V(G)|+q|V(G)|+q. Then we construct the vertex set of HH as (8), which takes time linear in the size of V(H)V(H). Note

|V(H)|=|𝔽qk||𝔽qd|=qkqdqk+4|V(G)|3.|V(H)|=|\mathbb{F}_{q}^{k}|\cdot|\mathbb{F}_{q}^{d}|=q^{k}\cdot q^{d}\leq q^{k+4}\cdot|V(G)|^{3}.

Finally, to construct the edge set $E(H)$ we go through each pair of vertices in $V(H)$ and check the conditions in (H1) – (H4), which requires first some simple arithmetic in $\mathbb{F}_{q}^{k}$ and $\mathbb{F}_{q}^{d}$, and then lookups in the vertex set $V(G)$ and the edge set $E(G)$. Thus, it can again be done in time polynomial in $q^{k}+|V(G)|$. ∎

5.2 The completeness

Lemma 3.

If GG has a kk-clique, then HH has a clique of size qkq^{k}.

Proof.

Let {v1,,vk}V(G)𝔽qd\{v_{1},\ldots,v_{k}\}\subseteq V(G)\subseteq\mathbb{F}_{q}^{d} be a kk-clique in GG. Then for every r𝔽qkr\in\mathbb{F}_{q}^{k} we define

πr(i[k]r[i]vi[1],,i[k]r[i]vi[d])𝔽qd.\pi_{r}\coloneqq\left(\sum_{i\in[k]}r[i]v_{i}[1],\ldots,\sum_{i\in[k]}r[i]v_{i}[d]\right)\in\mathbb{F}_{q}^{d}.

We claim that

K{Πr,πr|r𝔽qk}K\coloneqq\big{\{}\Pi_{r,\pi_{r}}\bigm{|}r\in\mathbb{F}_{q}^{k}\big{\}}

is a clique in $H$. Assume $r,r'\in\mathbb{F}_{q}^{k}$ with $r\neq r'$. We need to show that $\Pi_{r,\pi_{r}}$ and $\Pi_{r',\pi_{r'}}$ are adjacent in $H$.

  • Case 1: Hamming(r,r)=1\mathrm{Hamming}(r,r^{\prime})=1, say r=r+aeir^{\prime}=r+ae_{i} for some a𝔽qa\in\mathbb{F}_{q}^{*} and i[k]i\in[k]. Then

    \begin{align*}
    \pi_{r'}-\pi_{r} &= \Big(\sum_{\ell\in[k]}r'[\ell]v_{\ell}[1],\ldots,\sum_{\ell\in[k]}r'[\ell]v_{\ell}[d]\Big)-\Big(\sum_{\ell\in[k]}r[\ell]v_{\ell}[1],\ldots,\sum_{\ell\in[k]}r[\ell]v_{\ell}[d]\Big)\\
    &= \Big(\sum_{\ell\in[k]}(r'[\ell]-r[\ell])v_{\ell}[1],\ldots,\sum_{\ell\in[k]}(r'[\ell]-r[\ell])v_{\ell}[d]\Big)\\
    &= \big((r'[i]-r[i])v_{i}[1],\ldots,(r'[i]-r[i])v_{i}[d]\big)=\big(av_{i}[1],\ldots,av_{i}[d]\big)\\
    &= a\big(v_{i}[1],\ldots,v_{i}[d]\big)=av_{i}.
    \end{align*}

    We are done by (9).

  • Case 2: Hamming(r,r)=2\mathrm{Hamming}(r,r^{\prime})=2, in particular r=r+aei+bejr^{\prime}=r+ae_{i}+be_{j} for some a,b𝔽qa,b\in\mathbb{F}_{q}^{*} and i,j[k]i,j\in[k] with iji\neq j. It follows that

    \begin{align*}
    \pi_{r'}-\pi_{r} &= \Big(\sum_{\ell\in[k]}(r'[\ell]-r[\ell])v_{\ell}[1],\ldots,\sum_{\ell\in[k]}(r'[\ell]-r[\ell])v_{\ell}[d]\Big)\\
    &= \big((r'[i]-r[i])v_{i}[1]+(r'[j]-r[j])v_{j}[1],\ldots,(r'[i]-r[i])v_{i}[d]+(r'[j]-r[j])v_{j}[d]\big)\\
    &= a\big(v_{i}[1],\ldots,v_{i}[d]\big)+b\big(v_{j}[1],\ldots,v_{j}[d]\big)=av_{i}+bv_{j}.
    \end{align*}

    So (10) implies that Πr,πrΠr,πrE(H)\Pi_{r,\pi_{r}}\Pi_{r^{\prime},\pi_{r^{\prime}}}\in E(H).

  • Case 3: Hamming(r,r)3\mathrm{Hamming}(r,r^{\prime})\geq 3. This is trivial by (H4).

Clearly, |K|=qk|K|=q^{k}. This finishes our proof. ∎
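The edge rules (H1) – (H4) and the clique $K$ from this proof can be checked mechanically on a toy instance. The sketch below makes several simplifying assumptions that are not part of the paper: the field is $\mathbb{F}_3$, the vertices of $G$ are the six unit vectors of $\mathbb{F}_3^{6}$ (which are linearly independent, hence in particular a linear Sidon set) rather than the output of Theorem 1, and the names `adjacent`, `encode`, `parts`, and `edges` are hypothetical.

```python
from itertools import product

def hamming_diff(r, r2):
    """Positions where the two index vectors differ."""
    return [i for i in range(len(r)) if r[i] != r2[i]]

def adjacent(r, pi, r2, pi2, parts, edges, p):
    """Edge predicate of H between Pi_{r,pi} and Pi_{r2,pi2}.

    parts[i] is the colour class V_i (vectors over F_p as tuples),
    edges is a set of frozensets {u, v}, and p is a prime.
    """
    diff = hamming_diff(r, r2)
    delta = tuple((y - x) % p for x, y in zip(pi, pi2))  # pi' - pi
    if len(diff) == 0:                                   # (H1) same column
        return False
    if len(diff) == 1:                                   # (H2) vertex test
        i = diff[0]
        a = (r2[i] - r[i]) % p
        return any(tuple(a * x % p for x in v) == delta for v in parts[i])
    if len(diff) == 2:                                   # (H3) edge test
        i, j = diff
        a, b = (r2[i] - r[i]) % p, (r2[j] - r[j]) % p
        return any(
            frozenset((u, v)) in edges
            and tuple((a * x + b * y) % p for x, y in zip(u, v)) == delta
            for u in parts[i] for v in parts[j]
        )
    return True                                          # (H4) distance >= 3

def unit(i, d):
    return tuple(1 if t == i else 0 for t in range(d))

# Toy instance: k = 3 colour classes of two vertices each, one triangle.
p, k, d = 3, 3, 6
parts = [[unit(0, d), unit(1, d)], [unit(2, d), unit(3, d)], [unit(4, d), unit(5, d)]]
clique = [parts[0][0], parts[1][0], parts[2][0]]
edges = {frozenset((clique[i], clique[j])) for i in range(k) for j in range(i)}

def encode(r, vs):
    """pi_r = sum_i r[i] * v_i, computed coordinate-wise modulo p."""
    return tuple(sum(ri * v[t] for ri, v in zip(r, vs)) % p for t in range(d))

columns = list(product(range(p), repeat=k))
K = {r: encode(r, clique) for r in columns}
is_clique = all(
    adjacent(r, K[r], s, K[s], parts, edges, p)
    for r in columns for s in columns if r != s
)
print(len(K), is_clique)
```

The search over `parts[i]` and `parts[j]` in the edge test is deliberately naive; with a linear Sidon set the pair $(u,v)$ is unique when it exists, so a precomputed lookup table would serve equally well.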

5.3 The soundness

Lemma 4.

If GG has no kk-clique, then HH has no clique of size

kqk1+1.k\cdot q^{k-1}+1.
Proof.

Assume that $K\subseteq V(H)$ is a clique in $H$. For later use, let

R{r𝔽qk|KCr}.R\coloneqq\big{\{}r\in\mathbb{F}_{q}^{k}\bigm{|}K\cap C_{r}\neq\emptyset\big{\}}.

Then by (H1), we have

|KCr|=1|K\cap C_{r}|=1

for every rRr\in R, and otherwise |KCr|=0|K\cap C_{r}|=0 for every r𝔽qkRr\in\mathbb{F}_{q}^{k}\setminus R. In the former case, we use

Πr,πr\Pi_{r,\pi_{r}} (11)

to denote the unique element in KCrK\cap C_{r}. Thereby πr𝔽qd\pi_{r}\in\mathbb{F}_{q}^{d}.

Claim 1.

Let rRr\in R. Then there exists an i[k]i\in[k] such that for all a𝔽qa\in\mathbb{F}^{*}_{q} we have

r+aeiR.r+ae_{i}\notin R.
Proof of the claim..

Towards a contradiction we assume that for every i[k]i\in[k] there is an ai𝔽qa_{i}\in\mathbb{F}^{*}_{q} with

rir+aieiR.r^{i}\coloneqq r+a_{i}e_{i}\in R.

By (11) there is a unique

Πri,πriK.\Pi_{r^{i},\pi_{r^{i}}}\in K.

For the same reason, we have a unique Πr,πrK\Pi_{r,\pi_{r}}\in K. As KK is a clique, there is an edge between Πr,πr\Pi_{r,\pi_{r}} and Πri,πri\Pi_{r^{i},\pi_{r^{i}}} in the graph HH. By (H2) we have a vertex viViv_{i}\in V_{i} in the original graph GG such that

πriπr=aivi.\pi_{r^{i}}-\pi_{r}=a_{i}v_{i}.

Now we show that $\{v_{1},\ldots,v_{k}\}$ is a $k$-clique in $G$, contradicting our assumption that $G$ has no $k$-clique. So we need to demonstrate that $v_{i}v_{j}\in E(G)$ for every distinct $i,j\in[k]$. Observe that

πriπrj=(πriπr)(πrjπr)=aiviajvj=aivi+(aj)vj.\pi_{r^{i}}-\pi_{r^{j}}=(\pi_{r^{i}}-\pi_{r})-(\pi_{r^{j}}-\pi_{r})=a_{i}v_{i}-a_{j}v_{j}=a_{i}v_{i}+(-a_{j})v_{j}.

By (H3) we conclude vivjE(G)v_{i}v_{j}\in E(G) as desired. Let us emphasize that V(G)V(G) is a linear Sidon set, hence the above viv_{i} and vjv_{j} are uniquely determined. ∎

Of course for each rRr\in R we might have more than one i[k]i\in[k] satisfying the above claim. Nevertheless, we fix an arbitrary one and denote it by iri_{r}. Then we define

\[ T_{r}\coloneqq\Big\{r+a\cdot e_{i_{r}}\in\mathbb{F}_{q}^{k}\Bigm| a\in\mathbb{F}^{*}_{q}\Big\}. \tag{12} \]

Note |Tr|=q1|T_{r}|=q-1 and TrR=T_{r}\cap R=\emptyset.

Claim 2.

Every r𝔽qkRr\in\mathbb{F}_{q}^{k}\setminus R can occur in at most kk many different TrT_{r^{\prime}} for rRr^{\prime}\in R. More precisely,

|{rRrTr}|k.\big{|}\{r^{\prime}\in R\mid r\in T_{r^{\prime}}\}\big{|}\leq k.
Proof of the claim..

Assume rTrr\in T_{r^{\prime}} and rTr′′r\in T_{r^{\prime\prime}} for distinct r,r′′Rr^{\prime},r^{\prime\prime}\in R. We show

irir′′,i_{r^{\prime}}\neq i_{r^{\prime\prime}},

which immediately implies the claim by ir,ir′′[k]i_{r^{\prime}},i_{r^{\prime\prime}}\in[k]. By assumption for some a,b𝔽qa,b\in\mathbb{F}^{*}_{q} we have

\[ r=r'+ae_{i_{r'}} \quad\text{and}\quad r=r''+be_{i_{r''}}. \]

So if ir=ir′′i_{r^{\prime}}=i_{r^{\prime\prime}} we would have

r′′=r+(ab)eir.r^{\prime\prime}=r^{\prime}+(a-b)e_{i_{r^{\prime}}}.

Note ab0a-b\neq 0 as rr′′r^{\prime}\neq r^{\prime\prime}. Then by our definition of TrT_{r^{\prime}} (i.e.,  (12) where rrr\mapsto r^{\prime}) we conclude

r′′Tr.r^{\prime\prime}\in T_{r^{\prime}}.

On the other hand, TrR=T_{r^{\prime}}\cap R=\emptyset, contradicting r′′Rr^{\prime\prime}\in R. ∎

Now, let us continue the proof of Lemma 4. Putting all the pieces together, we have

\begin{align*}
& R\uplus\bigcup_{r\in R}T_{r}\subseteq\mathbb{F}_{q}^{k} && \text{(by $T_{r}\cap R=\emptyset$ for every $r\in R$)}\\
\Longrightarrow\ & |R|+\Big|\bigcup_{r\in R}T_{r}\Big|\leq q^{k}\\
\Longrightarrow\ & |R|+\frac{|R|\cdot(q-1)}{k}\leq q^{k} && \text{(by $|T_{r}|=q-1$ and Claim 2)}\\
\Longrightarrow\ & \frac{|R|\cdot(q-1+k)}{k}\leq q^{k}\\
\Longrightarrow\ & \frac{|R|\cdot q}{k}\leq q^{k} && \text{(by $k\geq 1$)}\\
\Longrightarrow\ & |R|\leq k\cdot q^{k-1}.
\end{align*}

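Since $|K\cap C_{r}|\leq 1$ for every $r\in\mathbb{F}_{q}^{k}$, we have $|K|=|R|\leq k\cdot q^{k-1}$. This finishes the proof. ∎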
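As a sanity check on the counting, the chain of inequalities can be instantiated with small numbers. The values $q=5$ and $k=3$ below are chosen purely for illustration.

```latex
\begin{align*}
q=5,\ k=3:\qquad
|R|+\frac{|R|\cdot 4}{3}\leq 5^{3}=125
&\ \Longrightarrow\ \frac{7\,|R|}{3}\leq 125
\ \Longrightarrow\ |R|\leq \Big\lfloor\frac{375}{7}\Big\rfloor=53,\\
\text{whereas the simplified bound gives}\qquad
|R|&\leq k\cdot q^{k-1}=3\cdot 25=75 .
\end{align*}
```

The difference between the two numbers comes only from the step that replaces $q-1+k$ by the smaller quantity $q$.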

5.4 Improvement to (qk,qk1)(q^{k},q^{k-1})-Gap Clique

Again we start from a multi-colored kk-Clique instance (G,k)(G,k) with V(G)V(G) satisfying  (7) and modify the construction of HH in Subsection 5.1 as follows.

  • We identify the vertex set $V(G)$ with an $8$-term linearly independent set in $\mathbb{F}_{q}^{d}$ as stated in Theorem 2.

  • Again the vertex set of HH is V(H)=r𝔽qkCrV(H)=\bigcup_{r\in\mathbb{F}_{q}^{k}}C_{r}, where Cr={Πr,π|π𝔽qd}C_{r}=\Big{\{}\Pi_{r,\pi}\Bigm{|}\pi\in\mathbb{F}_{q}^{d}\Big{\}} for every r𝔽qkr\in\mathbb{F}_{q}^{k}.

  • For the edge set E(H)E(H) of HH, let r,r𝔽qkr,r^{\prime}\in\mathbb{F}_{q}^{k} and Πr,πCr\Pi_{r,\pi}\in C_{r}, Πr,πCr\Pi_{r^{\prime},\pi^{\prime}}\in C_{r^{\prime}}. Set t:=Hamming(r,r)t:=\mathrm{Hamming}(r,r^{\prime}). There is an edge between Πr,π\Pi_{r,\pi} and Πr,π\Pi_{r^{\prime},\pi^{\prime}} if and only if one of the following conditions is satisfied.

    • Case 1. 1t41\leq t\leq 4. Assume diff(r,r)={i1,,it}\mathrm{diff}(r,r^{\prime})=\{i_{1},\ldots,i_{t}\}, then

      ππ=j[t](r[ij]r[ij])vj,\pi-\pi^{\prime}=\sum_{j\in[t]}\big{(}r[i_{j}]-r^{\prime}[i_{j}]\big{)}v_{j},

      where $v_{1}\in V_{i_{1}},\ldots,v_{t}\in V_{i_{t}}$ and $\{v_{1},\ldots,v_{t}\}$ is a $t$-clique in $G$. As $V(G)$ is $8$-term linearly independent and $t\leq 4$, the vertices $v_{1},\ldots,v_{t}$ are unique: two different representations would yield a nontrivial linear dependency among at most $2t\leq 8$ vectors of $V(G)$. For later use, we write

      vertexsetr,r(π,π):={v1,,vt}.\mathrm{vertexset}_{r,r^{\prime}}(\pi,\pi^{\prime}):=\{v_{1},\ldots,v_{t}\}. (13)
    • Case 2. t5t\geq 5.

We remark that the above graph HH is an induced subgraph of our original HH. Exactly as Lemma 2 we can show:

Lemma 5.

The graph HH can be constructed in time polynomial in |V(G)|+qk|V(G)|+q^{k}.

Now we are ready to prove the gap between qkq^{k} and qk1q^{k-1}.

Lemma 6.
  1. 1.

    Completeness. If GG has a kk-clique, then HH has a clique of size qkq^{k}.

  2. 2.

    Soundness. If GG has no kk-clique, then HH has no clique of size qk1+1q^{k-1}+1.

Proof.

The completeness case follows the same line as Lemma 3. For the soundness, some extra work is needed along the line of the proof of Lemma 4. Assume that GG does not have a kk-clique, and consider any clique KV(H)K\subseteq V(H) in HH. Again, let R{r𝔽qk|KCr}R\coloneqq\big{\{}r\in\mathbb{F}_{q}^{k}\bigm{|}K\cap C_{r}\neq\emptyset\big{\}}.

Claim 3.

Let rRr\in R. Then there exists an i[k]i\in[k] such that for all r𝔽qkr^{\prime}\in\mathbb{F}_{q}^{k} with Hamming(r,r)2\mathrm{Hamming}(r,r^{\prime})\leq 2 and idiff(r,r)i\in\mathrm{diff}(r,r^{\prime}) we have

rR.r^{\prime}\notin R.
Proof of the claim..

Assume that for every i[k]i\in[k] there is an riRr^{i}\in R with

\[ \mathrm{Hamming}(r,r^{i})\leq 2 \quad\text{and}\quad i\in\mathrm{diff}(r,r^{i}). \tag{14} \]

Fix such an rir^{i}. Since riRr^{i}\in R, there is a unique Πri,πriK\Pi_{r^{i},\pi_{r^{i}}}\in K. Similarly we have a unique Πr,πrK\Pi_{r,\pi_{r}}\in K. It follows that vertexsetr,ri(πr,πri)\mathrm{vertexset}_{r,r^{i}}(\pi_{r},\pi_{r^{i}}) as defined in (13), which has size at most 22, contains a vertex

viVi.v_{i}\in V_{i}.

Now let 1i<jk1\leq i<j\leq k. Then by (14) we have Hamming(ri,rj)4\mathrm{Hamming}(r^{i},r^{j})\leq 4. Furthermore, it is routine to verify that either {i,j}diff(ri,rj)\{i,j\}\subseteq\mathrm{diff}(r^{i},r^{j}), or {i,j}diff(r,ri)\{i,j\}\subseteq\mathrm{diff}(r,r^{i}), or {i,j}diff(r,rj)\{i,j\}\subseteq\mathrm{diff}(r,r^{j}), and all guarantee that vivjE(G)v_{i}v_{j}\in E(G) by the fact that (13) is a clique. Thus v1,,vkv_{1},\ldots,v_{k} induce a kk-clique in GG, which is a contradiction. ∎

For each rRr\in R we fix an ir[k]i_{r}\in[k] satisfying Claim 3 and let Tr{r+aeir𝔽qk|a𝔽q}T_{r}\coloneqq\Big{\{}r+a\cdot e_{i_{r}}\in\mathbb{F}_{q}^{k}\Bigm{|}a\in\mathbb{F}^{*}_{q}\Big{\}}, the same as (12).

Claim 4.

Every r𝔽qkRr\in\mathbb{F}_{q}^{k}\setminus R can occur in at most one TrT_{r^{\prime}} for rRr^{\prime}\in R, i.e.,

|{rRrTr}|1.\big{|}\{r^{\prime}\in R\mid r\in T_{r^{\prime}}\}\big{|}\leq 1.
Proof of the claim..

Towards a contradiction, assume rTrTr′′r\in T_{r^{\prime}}\cap T_{r^{\prime\prime}} for two distinct r,r′′Rr^{\prime},r^{\prime\prime}\in R. Hence

\[ \mathrm{diff}(r',r)=\{i_{r'}\} \quad\text{and}\quad \mathrm{diff}(r'',r)=\{i_{r''}\}. \]

Since rr′′r^{\prime}\neq r^{\prime\prime}, we conclude

\[ \mathrm{diff}(r',r'')=\big\{i_{r'},i_{r''}\big\}, \quad\text{and thus}\quad \mathrm{Hamming}(r',r'')\leq 2. \]

As $r'\in R$ and $i_{r'}\in\mathrm{diff}(r',r'')$, Claim 3 implies that $r''\notin R$. This is the desired contradiction. ∎

Finally, we conclude the proof of the soundness case of Lemma 6. The above two claims imply the following.

\begin{align*}
R\uplus\bigcup_{r\in R}T_{r}\subseteq\mathbb{F}_{q}^{k}\ &\Longrightarrow\ |R|+\Big|\bigcup_{r\in R}T_{r}\Big|\leq q^{k}\\
&\Longrightarrow\ |R|+|R|\cdot(q-1)\leq q^{k} && \text{(by $|T_{r}|=q-1$ and Claim 4)}\\
&\Longrightarrow\ |R|\cdot q\leq q^{k}\ \Longrightarrow\ |R|\leq q^{k-1}.
\end{align*}
Theorem 3.

There is an algorithm (i.e., a reduction) \mathbb{R} that on an input graph GG, k1k\geq 1, and a prime power qq, computes a graph H=H(G,k,q)H=H(G,k,q) satisfying the following conditions.

  1. (R1)

    If GG has a kk-clique, then HH has a clique of size qkq^{k}.

  2. (R2)

    If GG does not have a kk-clique, then HH has no clique of size qk1+1q^{k-1}+1.

Moreover, \mathbb{R} runs in time polynomial in |V(G)|+qk|V(G)|+q^{k}.

6 Sub-polynomial Approximation Lower Bounds

Equipped with Theorem 3 we are ready to derive the lower bounds for the FPT-approximation of $k$-Clique.

Theorem 4.

The kk-clique problem has no FPT-approximation with ratio ko(1)k^{o(1)}, unless FPT=W[1]\text{FPT}=\text{W[1]}.

Proof.

Towards a contradiction, assume that 𝔸\mathbb{A} is an FPT-approximation for the kk-Clique problem such that on any input graph GG and k1k\geq 1:

  1. (A)

    If GG has a kk-clique, then 𝔸\mathbb{A} outputs a clique of size at least k/kh(k)k/k^{h(k)}, where h:h:\mathbb{N}\to\mathbb{R} with

    limkh(k)=0.\lim_{k\to\infty}h(k)=0.

We will define a function q:q:\mathbb{N}\to\mathbb{N} such that for any k1k\geq 1, qq(k)q\coloneqq q(k) with qq being a prime, and kqkk^{\prime}\coloneqq q^{k} with

(k)h(k)<q.(k^{\prime})^{h(k^{\prime})}<q. (15)

This will give us the desired contradiction, since we would have an FPT algorithm deciding the $k$-Clique problem: on any input $(G,k)$, we first compute $q\coloneqq q(k)$. (Footnote 3: There is a small but annoying issue on how to compute $q$, but this is often ignored. For most natural functions $h$, this is easy. A more rigorous treatment requires the “little o” in $k^{o(1)}$ to be interpreted “effectively”; see [CG07] for a detailed discussion.) Then we apply the algorithm $\mathbb{R}$ in Theorem 3 to get a graph $H\coloneqq H(G,k,q)$. Finally we run the approximation algorithm $\mathbb{A}$ on $(H,k')$ with $k'\coloneqq q^{k}$.

  • If GG has a kk-clique, then HH has a kk^{\prime}-clique by (R1). It follows by (A) and (15) that the algorithm 𝔸\mathbb{A} will output a clique of size at least

    k(k)h(k)>qkq=qk1,\frac{k^{\prime}}{(k^{\prime})^{h(k^{\prime})}}>\frac{q^{k}}{q}=q^{k-1},

    i.e., at least qk1+1q^{k-1}+1.

  • If $G$ has no $k$-clique, then $H$ has no clique of size $q^{k-1}+1$ by (R2). Thus, the clique that the algorithm $\mathbb{A}$ outputs must have size at most

    qk1.q^{k-1}.

Therefore, we can decide whether the original graph GG has a kk-clique by checking whether the clique that the algorithm 𝔸\mathbb{A} computes has size at least qk1+1q^{k-1}+1.

It remains to show that we can define a function q:q:\mathbb{N}\to\mathbb{N} such that for qq(k)q\coloneqq q(k) the inequality (15) holds. Let kk\in\mathbb{N}. Since limxh(x)=0\lim_{x\to\infty}h(x)=0, there is an nkn_{k}\in\mathbb{N} such that for all nnkn\geq n_{k} we have

h(n)<1k.h(n)<\frac{1}{k}.

Define

q=q(k)min{p|p is a prime and pknk}.q=q(k)\coloneqq\min\Big{\{}p\in\mathbb{N}\Bigm{|}\text{$p$ is a prime and $p^{k}\geq n_{k}$}\Big{\}}.

Clearly, q(k)q(k) is well defined. In particular, for kqkk^{\prime}\coloneqq q^{k} we get knkk^{\prime}\geq n_{k}. Hence, by our choice of nkn_{k}

h(k)<1k.h(k^{\prime})<\frac{1}{k}.

It follows that

(k)h(k)<(k)1/k=(qk)1/k=q.(k^{\prime})^{h(k^{\prime})}<(k^{\prime})^{1/k}=\left(q^{k}\right)^{1/k}=q.

This is precisely (15). ∎
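The choice of $q(k)$ can be made concrete for a specific ratio function. The sketch below assumes, purely for illustration, $h(n)=1/\sqrt{\log_2 n}$; for this $h$ we have $h(n)<1/k$ exactly when $n>2^{k^{2}}$, so $n_k=2^{k^{2}}+1$ works. The function names are hypothetical, and the final comparison uses floating-point arithmetic, which is adequate only for small $k$.

```python
import math

def is_prime(m):
    """Trial division; sufficient for the small numbers used here."""
    if m < 2:
        return False
    return all(m % t for t in range(2, math.isqrt(m) + 1))

def h(n):
    """Hypothetical exponent function with h(n) -> 0 (an assumption)."""
    return 1.0 / math.sqrt(math.log2(n))

def n_threshold(k):
    """For this particular h: h(n) < 1/k for all n >= 2**(k*k) + 1."""
    return 2 ** (k * k) + 1

def choose_q(k):
    """Smallest prime p with p**k >= n_k, following the definition of q(k)."""
    p = 2
    while not (is_prime(p) and p ** k >= n_threshold(k)):
        p += 1
    return p

for k in range(1, 6):
    q = choose_q(k)
    k_prime = q ** k
    # Inequality (15): (k')^{h(k')} < q.
    print(k, q, k_prime ** h(k_prime) < q)
```

For a general $h$ no closed form for $n_k$ need exist, which is the computability issue mentioned in the footnote on computing $q$.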

The next lower bound was first proved in [LRSW22].

Theorem 5 (Simplified proof of [LRSW22]).

Assuming ETH, there is no constant-ratio approximation algorithm for the $k$-Clique problem with running time $t(k)n^{o(\log k)}$ for any computable function $t$.

Proof.

Towards a contradiction, let $c\geq 1$ and let $\mathbb{B}$ be an algorithm that finds a clique of size at least $k/c$ if the input graph $G$ contains a clique of size $k$. Moreover, assume that $\mathbb{B}$ runs in time $t(k)n^{o(\log k)}$, where $t$ is a computable function. We choose a (minimum) prime

qc+1.q\geq c+1.

Now let a graph $G$ and $k\geq 2$ be given. We first invoke the reduction $\mathbb{R}$ as stated in Theorem 3 on $G$, $k$, and $q$, which produces a graph $H$ satisfying (R1) and (R2). In particular, if $G$ has a $k$-clique, then $H$ has a clique of size

k:=qk.k^{\prime}:=q^{k}.

Otherwise, $H$ has no clique of size

qk1+1=qkqk1+q1q1<qkq1kc,q^{k-1}+1=\frac{q^{k}-q^{k-1}+q-1}{q-1}<\frac{q^{k}}{q-1}\leq\frac{k^{\prime}}{c},

where the first inequality is by k2k\geq 2 and the second by qc+1q\geq c+1. Note \mathbb{R} runs in time polynomial in

|V(G)|+qk=n+2O(k),|V(G)|+q^{k}=n+2^{O(k)},

which implies that

|V(H)|2O(k)poly(n).|V(H)|\leq 2^{O(k)}\mathrm{poly}(n).
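The gap used in this proof can be verified numerically for concrete constants. The sketch below assumes an integer ratio $c$ for simplicity (the statement allows any $c\geq 1$); the helper names are hypothetical.

```python
import math

def smallest_prime_at_least(m):
    """Smallest prime p >= m, by trial division."""
    p = max(m, 2)
    while any(p % t == 0 for t in range(2, math.isqrt(p) + 1)):
        p += 1
    return p

def gap_parameters(c, k):
    """Return (q, yes_size, no_bound) for ratio c and parameter k >= 2:
    a yes-instance yields a clique of size q**k in H, while a no-instance
    yields no clique of size no_bound + 1 = q**(k-1) + 1."""
    q = smallest_prime_at_least(c + 1)
    return q, q ** k, q ** (k - 1)

q, yes_size, no_bound = gap_parameters(c=4, k=3)
# The soundness threshold stays strictly below k'/c, written without division.
print(q, yes_size, no_bound, 4 * (no_bound + 1) < yes_size)
```

The comparison `c * (no_bound + 1) < yes_size` is the integer form of $q^{k-1}+1<k'/c$, so a $c$-approximation on $H$ separates the two cases.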

Next, we apply 𝔹\mathbb{B} on HH and kk^{\prime}. It follows that,

  • Completeness. If $G$ has a $k$-clique, then $H$ has a clique of size $k'=q^{k}$. Hence $\mathbb{B}$ outputs a clique of size at least $k'/c$.

  • Soundness. If GG does not have a kk-clique, then HH has no clique of size k/ck^{\prime}/c. Thus, 𝔹\mathbb{B} cannot output a clique of size k/ck^{\prime}/c.

This means that we can decide whether the original graph GG has a clique of size kk. Furthermore, observe that the running time of 𝔹\mathbb{B} is

t(k)|V(H)|o(logk)t(qk)(2O(k)poly(n))o(k)h(k)no(k)t(k^{\prime})|V(H)|^{o(\log k^{\prime})}\leq t(q^{k})\big{(}2^{O(k)}\mathrm{poly}(n)\big{)}^{o(k)}\leq h(k)n^{o(k)}

for some appropriate computable function hh. This contradicts ETH. ∎

7 Conclusion and Discussion

We presented a self-reduction from kk-Clique to (qk,qk1)(q^{k},q^{k-1})-Gap Clique. Our reduction is simple and almost combinatorial. We simply combine the technique from the network coding theory and the use of Sidon sets. Both techniques are well-known in the literature, and no heavy machinery is involved. Moreover, our reduction is a self-reduction, which gives an insight into a generic gap-producing FPT-reduction for W[1]-hard problems.

In fact, our ideal goal is to devise a self-reduction that transforms an instance of $(k,k')$-Gap Clique into an instance of $(q^{k},q^{k'})$-Gap Clique. If such a transformation exists, it will imply the following chain of reductions.

(k,k1)-Gap Clique(qk,qk1)-Gap Clique(σqk,σqk1)-Gap Clique\displaystyle\text{$(k,k-1)$-Gap Clique}\Rightarrow\text{$(q^{k},q^{k-1})$-Gap Clique}\Rightarrow\text{$(\sigma^{q^{k}},\sigma^{q^{k-1}})$-Gap Clique}

Setting $q=2$ and $\sigma=f(k)$ immediately rules out approximation ratios polynomial in $k$, say the gap of $K$ vs $K^{1/2}$. Since we can choose $\sigma$ to be an arbitrary function of $k$, for any computable non-decreasing function $f(K)$, we may choose $\sigma=\left(f^{-1}(K)\right)^{1/K}$ to rule out any $f(K)$-approximation algorithm that runs in FPT time. Roughly speaking, we may be just one step away from proving the total FPT-inapproximability of $k$-Clique under W[1]-hardness.

References

  • [ACLY00] R. Ahlswede, N. Cai, S.-Y.R. Li, and R.W. Yeung. Network information flow. IEEE Transactions on Information Theory, 46(4):1204–1216, 2000.
  • [CCK+20] P. Chalermsook, M. Cygan, G. Kortsarz, B. Laekhanukit, P. Manurangsi, D. Nanongkai, and L. Trevisan. From gap-exponential time hypothesis to fixed parameter tractable inapproximability: Clique, dominating set, and more. SIAM J. Comput., 49(4):772–810, 2020.
  • [CE16] P.J. Cameron and P. Erdős. On the Number of Sets of Integers With Various Properties, pages 61–80. De Gruyter, 2016.
  • [CG07] Y. Chen and M. Grohe. An isomorphism between subexponential and parameterized complexity theory. SIAM J. Comput., 37(4):1228–1258, 2007.
  • [CGG07] Y. Chen, M. Grohe, and M. Grüber. On parameterized approximability. Electron. Colloquium Comput. Complex., TR07-106, 2007.
  • [ET41] P. Erdős and P. Turán. On a problem of Sidon in additive number theory, and on some related problems. Journal of the London Mathematical Society, s1-16(4):212–215, 1941.
  • [FG06] J. Flum and M. Grohe. Parameterized Complexity Theory (Texts in Theoretical Computer Science. An EATCS Series). Springer-Verlag, Berlin, Heidelberg, 2006.
  • [FGL+96] U. Feige, S. Goldwasser, L. Lovász, S. Safra, and M. Szegedy. Interactive proofs and the hardness of approximating cliques. J. ACM, 43(2):268–292, 1996.
  • [FSLM20] A. E. Feldmann, Karthik C. S., E. Lee, and P. Manurangsi. A survey on approximation in parameterized complexity: Hardness and algorithms. Algorithms, 13(6), 2020.
  • [GLR+24] Venkatesan Guruswami, Bingkai Lin, Xuandi Ren, Yican Sun, and Kewen Wu. Parameterized inapproximability hypothesis under exponential time hypothesis. In Bojan Mohar, Igor Shinkar, and Ryan O’Donnell, editors, Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024, pages 24–35. ACM, 2024.
  • [Gre04] B. Green. The Cameron–Erdős conjecture. Bulletin of the London Mathematical Society, 36(6):769–778, 2004.
  • [Gre05] B. Green. A Szemerédi-type regularity lemma in abelian groups, with applications. Geometric & Functional Analysis GAFA, 15(2):340–376, 2005.
  • [Hås01] J. Håstad. Some optimal inapproximability results. J. ACM, 48(4):798–859, 2001.
  • [HKM+03] T. Ho, R. Koetter, M. Medard, D.R. Karger, and M. Effros. The benefits of coding over routing in a randomized setting. In IEEE International Symposium on Information Theory, 2003. Proceedings., pages 442–, 2003.
  • [HMS+03] T. Ho, M. Médard, J. Shi, M. Effros, and D. R. Karger. On randomized network coding. In 41st Allerton Annual Conference on Communication, Control, and Signal Processing, 2003. Invited paper.
  • [Kar72] R. M. Karp. Reducibility among combinatorial problems. In R. E. Miller and J. W. Thatcher, editors, Proceedings of a symposium on the Complexity of Computer Computations, held March 20-22, 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, The IBM Research Symposia Series, pages 85–103. Plenum Press, New York, 1972.
  • [KK22] Karthik C. S. and S. Khot. Almost polynomial factor inapproximability for parameterized k-Clique. In S. Lovett, editor, 37th Computational Complexity Conference, CCC 2022, July 20-23, 2022, Philadelphia, PA, USA, volume 234 of LIPIcs, pages 6:1–6:21. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
  • [Lin21] B. Lin. Constant approximating k-Clique is W[1]-hard. In Samir Khuller and Virginia Vassilevska Williams, editors, STOC ’21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Virtual Event, Italy, June 21-25, 2021, pages 1749–1756. ACM, 2021.
  • [LRSW22] B. Lin, X. Ren, Y. Sun, and X. Wang. On lower bounds of approximating parameterized k-Clique. In Mikolaj Bojanczyk, Emanuela Merelli, and David P. Woodruff, editors, 49th International Colloquium on Automata, Languages, and Programming, ICALP 2022, July 4-8, 2022, Paris, France, volume 229 of LIPIcs, pages 90:1–90:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
  • [LRSW23a] B. Lin, X. Ren, Y. Sun, and X. Wang. Constant approximating parameterized k-setcover is W[2]-hard. In Nikhil Bansal and Viswanath Nagarajan, editors, Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms, SODA 2023, Florence, Italy, January 22-25, 2023, pages 3305–3316. SIAM, 2023.
  • [LRSW23b] Bingkai Lin, Xuandi Ren, Yican Sun, and Xiuhan Wang. Improved hardness of approximating k-Clique under ETH. In 64th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2023, Santa Cruz, CA, USA, November 6 - November 9, 2023. IEEE Computer Society, 2023. Preprint available at https://arxiv.org/pdf/2304.02943.pdf.
  • [O’B04] K. O’Bryant. A complete annotated bibliography of work related to Sidon sequences. The Electronic Journal of Combinatorics, DS11:39 p., 2004.
  • [Ruz93] I. Z. Ruzsa. Solving a linear equation in a set of integers I. Acta Arithmetica, 65(3):259–282, 1993.
  • [Ruz95] I. Z. Ruzsa. Solving a linear equation in a set of integers II. Acta Arithmetica, 72(4):385–397, 1995.
  • [Sha09] A. Shapira. Green’s conjecture and testing linear-invariant properties. In M. Mitzenmacher, editor, Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 159–166. ACM, 2009.
  • [SLM19] Karthik C. S., B. Laekhanukit, and P. Manurangsi. On the parameterized complexity of approximating dominating set. J. ACM, 66(5):33:1–33:38, 2019.
  • [YLC06] Raymond W. Yeung, S-y Li, and N. Cai. Network Coding Theory (Foundations and Trends(R) in Communications and Information Theory). Now Publishers Inc., Hanover, MA, USA, 2006.