
Lifting Theorems Meet Information Complexity: Known and New Lower Bounds of Set-disjointness (Working Paper)

Guangxu Yang
Department of Computer Science
University of Southern California
guangxuy@usc.edu
Research supported by NSF CAREER award 2141536.
   Jiapeng Zhang
Department of Computer Science
University of Southern California
jiapengz@usc.edu
Research supported by NSF CAREER award 2141536.
Abstract

Set-disjointness is one of the most fundamental problems in communication complexity and has been extensively studied in past decades. Given its importance, many lower bound techniques have been introduced to prove communication lower bounds for set-disjointness.

Combining ideas from information complexity and query-to-communication lifting theorems, we introduce a density increment argument to prove communication lower bounds for set-disjointness:

  • We give a simple proof showing that a large rectangle cannot be 0-monochromatic for multi-party unique-disjointness.

  • We interpret the direct-sum argument as a density increment process and give an alternative proof of randomized communication lower bounds for multi-party unique-disjointness.

  • Avoiding full simulations in lifting theorems, we simplify and improve communication lower bounds for sparse unique-disjointness.

We also discuss potential applications that may be unified and improved by our density increment argument.

1 Introduction

Set-disjointness is one of the most important problems in communication complexity. Since the formulation of the communication model [Yao79], many researchers have made great efforts to understand the communication complexity, both upper and lower bounds, of set-disjointness problems in various communication models [BFS86, KS92, Raz92, Raz03, BYJKS04, JRS03, HW07, KW09, BEO+13, ST13, GW16, WW15, Gav16, BO17, BGK+18, KPW21, DOR21]. Building upon communication lower bounds for set-disjointness, applications in diverse areas have been studied. For example, it gives lower bounds for monotone circuit depth [GP18], streaming problems [AMS99, BYJKS04, KPW21], proof complexity [GP18], game theory [GNOR15, GS17], property testing [BBM11], data structure lower bounds [MNSW95], extension complexity [BM13, GW16], and more.

Given the importance of this problem, many techniques were invented specifically to understand communication lower bounds of set-disjointness. Some remarkable methods include the rank method [Gri85, HW07, RY20], the discrepancy method [RY15], the corruption bound [Raz92], the smooth rectangle bound [JK10, HJ13], and information complexity [CSWY01, BYJKS04, Gro09, Jay09]. Among all of these methods, the information complexity framework seems to provide the best results so far. We refer interested readers to [CP10] for a good survey of these results.

In this paper, we continue the study of set-disjointness. Inspired by simulation methods in query-to-communication lifting theorems [RM97, GPW18, LMM+22, YZ22], we present a proof of lower bounds for set-disjointness based on density increment arguments (sometimes also called the structure-vs-pseudorandomness approach). Based on this method, we give several new lower bounds for set-disjointness in different communication models. Our proof can be considered a combination of simulation methods and information complexity.

Compared with previous techniques, our proof is simpler and more general. It addresses some drawbacks of both simulation methods and information complexity methods. More details will be discussed in Section 1.2.

1.1 Our results

The main contribution of this work is "explicit proofs" of communication lower bounds, together with some new unique-disjointness lower bounds. We call our proofs explicit because our framework has several advantages over existing techniques:

  • It places fewer restrictions on communication models.

  • It allows us to use communication lower bound techniques in a non-black-box way.

  • It provides a method to analyze distributions with correlations between different coordinates.

In Section 1.3, we discuss three potential applications of these advantages; each of them corresponds to one of the advantages above.

Our proof builds on a combination of simulation techniques from lifting theorems and information complexity. Specifically, we abstract the core idea of the Raz-McKenzie simulation [RM97] and recast it as a density increment argument. To explain further connections and comparisons with previous techniques, we present three lower bounds for the unique-disjointness problem.

We first study the multi-party communication model ($k\text{-}\mathrm{UDISJ}$). In this setting, there are $k$ parties, where each party $j$ holds a set $x_j\in\{0,1\}^n$ (we use a binary string to represent a set). It is promised that either all sets are pairwise disjoint, or they share a unique common element. Formally, we define

  • $D_0:=\{(x_1,\dots,x_k)\in(\{0,1\}^n)^k:\forall i,\ x_1(i)+\dots+x_k(i)\leq 1\}$.

  • $D_*:=\{(x_1,\dots,x_k)\in(\{0,1\}^n)^k:\exists\ell,\ x_1(\ell)=\cdots=x_k(\ell)=1\text{ and }\forall i\neq\ell,\ x_1(i)+\dots+x_k(i)\leq 1\}$.

We use $D_0$ to refer to the no instances and $D_*$ to refer to the yes instances. In this setting, we prove a structure lemma: any large rectangle (with respect to $D_0$) must intersect $D_*$.

Theorem 1.1.

Let $R\subseteq(\{0,1\}^n)^k$ be a rectangle such that $|R\cap D_0|\geq 2^{-n/k}\cdot|D_0|$. Then $R\cap D_*\neq\emptyset$.

We note that Theorem 1.1 implies (and is stronger than) an $\Omega(n/k)$ deterministic communication lower bound for $k\text{-}\mathrm{UDISJ}$: for any protocol with $o(n/k)$ communication bits, we can always find a rectangle $R$ in the induced partition such that $|R\cap D_0|\geq 2^{-n/k}\cdot|D_0|$; Theorem 1.1 then tells us that $R$ is not disjoint from $D_*$.

Our proof is elementary, self-contained, and only two pages long. Furthermore, we do not even need notions like entropy or rank. The proof also reveals the main idea of query-to-communication lifting theorems. We discuss more details in Section 1.2.

Our second contribution is a new proof of randomized communication lower bounds for $k\text{-}\mathrm{UDISJ}$. This problem has been extensively studied for many years. Building on a series of great papers [AMS99, BYJKS04, CKS03], the tight randomized communication lower bound $\Omega(n/k)$ was finally obtained by [Gro09, Jay09] through the information complexity framework. In this paper, we reprove this theorem via the density increment argument.

Theorem 1.2.

For any $k\geq 2$, the randomized communication complexity of $k\text{-}\mathrm{UDISJ}$ is $\Omega(n/k)$.

We first note that Theorem 1.2 does not imply Theorem 1.1: Theorem 1.1 shows that every large rectangle (one containing many no instances) cannot be monochromatic, whereas Theorem 1.2 only proves randomized communication lower bounds.

Our proof of Theorem 1.2 is a mix of information complexity and query-to-communication simulations. Roughly speaking, the information complexity framework analyzes the information cost of each coordinate and then applies a direct-sum argument to merge them. In our density increment argument, we merge these costs by borrowing the projection operation from query-to-communication simulations. Hence, our density increment argument can be interpreted as an alternative direct-sum argument.

Several papers [CFK+19, GJPW18, MM22] pointed to research directions connecting information complexity and lifting theorems, and our proof has great potential to unify the two in this direction.

Our last result is a tight deterministic lower bound for (two-party) sparse unique-disjointness ($\mathrm{S}\text{-}\mathrm{UDISJ}$) for a large range of sparsity parameters. This problem, with sparsity parameter $s$, can be described as follows: Alice holds a set $A$ and Bob holds a set $B$ with $|A|,|B|\leq s$. It is promised that either $A\cap B=\emptyset$ or $|A\cap B|=1$, and Alice and Bob need to distinguish the two cases with deterministic communication.

Two extreme choices of $s$ correspond to two important problems in communication complexity. If $s=n$, this problem becomes the standard unique-disjointness problem (i.e., $k\text{-}\mathrm{UDISJ}$ with $k=2$). When $s=1$, the problem is essentially the EQUALITY problem. For $\mathrm{S}\text{-}\mathrm{UDISJ}$, we prove the following theorem.

Theorem 1.3.

Let $\epsilon>0$ be any small constant. For any $s\leq n^{1/2-\epsilon}$, the deterministic communication complexity of $s$-sparse unique-disjointness is $\Omega(s\cdot\log(n/s))$.

Prior to our work, Kushilevitz and Weinreb [KW09] proved the same lower bound for the smaller range $s\leq\frac{\log n}{\log\log n}$. Loff and Mukhopadhyay [LM19] then improved this range to $s\leq n^{1/101}$. Our Theorem 1.3 further pushes the range to $\approx n^{1/2}$.

Our proof of Theorem 1.3 builds on [LM19] with several differences; the main one is that we no longer fully simulate the communication tree by a decision tree. Instead, we aim to find a long path in the communication tree, an approach suggested by Yang and Zhang [YZ22]. We believe it is possible to further improve the range to all $s\geq 1$ and discuss more details in Section 5.

A similar task is to prove a deterministic lower bound for sparse set-disjointness without the uniqueness requirement. In this setting, Håstad and Wigderson [HW07] pointed out that the same $\Omega(s\cdot\log(n/s))$ bound can be proved via the rank method in [Juk11]. However, in the unique setting, [KW09] showed that the rank method cannot achieve such tight bounds.

We emphasize that Theorem 1.3 is a lower bound only for deterministic communication complexity. Allowing public randomness and constant error, there exists a protocol that costs only $O(s)$ bits [HW07]. Therefore, the $\log(n/s)$ factor is a separation between randomized and deterministic communication. This also implies that no lower bound technique that simultaneously yields randomized communication lower bounds, including information complexity approaches, can reprove our bound.

Furthermore, Braverman [Bra12] gave a zero-error protocol for $1$-sparse unique-disjointness with constant information cost, which can be extended to all $s\geq 1$.

Lemma 1.4.

For any $s>0$, there is a zero-error protocol for $s$-sparse unique-disjointness with information cost $O(s)$.

Overall, Theorem 1.3 demonstrates that the density increment argument places fewer restrictions on communication models and is able to circumvent the barriers faced by rank methods and information complexity.

1.2 Our techniques

Here we give an overview of our proof technique and discuss connections to lifting theorems and information complexity. We focus on Theorem 1.1. Recall that the no instances are

$$D_0:=\{(x_1,\dots,x_k)\in(\{0,1\}^n)^k:\forall i,\ x_1(i)+\dots+x_k(i)\leq 1\}.$$

The main idea behind Theorem 1.1 is a density increment argument. In this argument, we first define the density of a rectangle $R$ on $D_0$ by

$$E(R):=\frac{|R\cap D_0|}{|D_0|}=\frac{|R\cap D_0|}{(k+1)^n}.$$

It is clear that $E(R)\leq 1$ because $R\cap D_0\subseteq D_0$. Theorem 1.1 is now equivalent to saying that any rectangle $R$ with density $E(R)\geq 2^{-o(n/k)}$ cannot be 0-monochromatic.

Let $R$ be any monochromatic rectangle containing only no instances. We will perform a projection operation that converts $R$ into another monochromatic rectangle $R'\subseteq(\{0,1\}^{n-1})^k$ with larger density $E(R')\geq E(R)\cdot(1+1/k)$. Since $R'$ is still monochromatic, we can repeat this projection for $n$ rounds, where each round increases the density by a factor of $(1+1/k)$. Let $R^*$ be the rectangle after $n$ projections; we then have

$$E(R^*)\geq E(R)\cdot(1+1/k)^n.$$

Combining $E(R^*)\leq 1$ and $E(R)\geq 2^{-o(n/k)}$ gives a contradiction.
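Written out, using $\log(1+1/k)\geq 1/k$ (so $(1+1/k)^n\geq 2^{n/k}$), the contradiction is the one-line chain

$$1\geq E(R^*)\geq E(R)\cdot(1+1/k)^n\geq 2^{-o(n/k)}\cdot 2^{n/k}=2^{n/k-o(n/k)}>1.$$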

Now we briefly explain our projection process. Let $R=X_1\times\dots\times X_k\subseteq(\{0,1\}^n)^k$ be a monochromatic rectangle. For each party $j\in[k]$, the projection of $R$ on $j$ is a rectangle $\Pi_j(R)=X_1'\times\cdots\times X_k'\subseteq(\{0,1\}^{n-1})^k$ defined by:

  • For each party $j'\neq j$, $X_{j'}':=\{x'\in\{0,1\}^{n-1}:(x',0)\in X_{j'}\}$.

  • For the party $j$, $X_j':=\{x'\in\{0,1\}^{n-1}:\text{ either }(x',0)\in X_j\text{ or }(x',1)\in X_j\}$.

It is not hard to see that $\Pi_j(R)$ (for any $j$) preserves the monochromatic property of $R$. On the other hand, we show there exists a party $j\in[k]$ such that $\Pi_j(R)$ has a larger density compared to $R$. In fact, this density increment captures the communication cost of party $j$.
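To make the projection concrete, here is a minimal Python sketch (our own illustration with toy parameters; the names `density` and `project` are ours, not from the paper) that computes $E(R)$, applies $\Pi_j$ on the last coordinate, and checks the claimed density increment on a small 0-monochromatic rectangle:

```python
import itertools
from fractions import Fraction

def is_no_instance(xs):
    # A k-tuple of strings is a no instance iff each coordinate is
    # held by at most one party.
    return all(sum(col) <= 1 for col in zip(*xs))

def density(R, n, k):
    # E(R) = |R ∩ D_0| / (k+1)^n for R = X_1 x ... x X_k.
    hits = sum(is_no_instance(xs) for xs in itertools.product(*R))
    return Fraction(hits, (k + 1) ** n)

def project(R, j):
    # Projection Π_j on the last coordinate: party j keeps a string if
    # either extension is present; other parties keep only 0-extensions.
    return [{x[:-1] for x in X if jp == j or x[-1] == 0}
            for jp, X in enumerate(R)]

n, k = 3, 2
cube = list(itertools.product((0, 1), repeat=n))
# Alice's sets use only coordinate 0 and Bob's avoid it,
# so R contains no yes instance (0-monochromatic).
R = [{x for x in cube if x[1] == x[2] == 0},
     {x for x in cube if x[0] == 0}]
best = max(density(project(R, j), n - 1, k) for j in range(k))
assert best >= density(R, n, k) * Fraction(3, 2)  # factor (1 + 1/k)
```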

We give a full proof of Theorem 1.1 in Section 3. We suggest readers begin with Section 3, and then proceed to Section 4 and Section 5.

Connections to lifting theorems.

Query-to-communication lifting theorems are a generic method to lift the hardness of one-party functions (decision tree lower bounds) to two-party lower bounds in the communication model. This recent breakthrough has produced applications in diverse areas [Göö15, HN12, GPW18, GJW18, DRNV16, dR19, GGKS18, CKLM18, GR21, BR20, RPRC16, PR17, LRS15, CLRS16, KMR21, CMS20, GKY22]. The simulation method is widely used to prove such lifting theorems. During the simulation, one maintains certain pseudorandom properties that force the communication protocol to behave like a query protocol. In this process, a potential function is used to charge the number of communication bits against the pseudorandom property.

In this paper, we adopt the idea of the potential function argument and rephrase it as a density increment argument. In lifting theorems, the simulation is mainly used to maintain a good structure of rectangles, such as maintaining full-range rectangles. In contrast, our density increment argument is more flexible.

Connections to information complexity.

Information complexity is another important tool for proving communication lower bounds. It analyzes the mutual information between the communication transcript and the random inputs held by Alice and Bob. From an information-theoretic perspective, the randomized communication complexity is then lower bounded by this mutual information. A very useful tool for analyzing the mutual information is the direct-sum argument [CSWY01, BYJKS04, CKS03, Gro09, Jay09]. Roughly speaking, with this argument, we only need to analyze the mutual information between the communication transcript and each individual coordinate of Alice and Bob's input. This step significantly reduces the difficulty of the analysis.

In our proof (see Section 4 for an example), we also use this local-to-global strategy. The difference is that the information complexity paradigm [BYJKS04, CKS03, Gro09, Jay09] uses a direct-sum argument, a tool from information theory, whereas density increment arguments use a combinatorial operation called projection. The projection provides more flexibility in different applications. As [DOR21] pointed out, many applications in other settings are not amenable to the standard direct-sum argument, such as proving information-theoretic lower bounds for the number-on-forehead model. For density increment arguments, we do not see such barriers so far.

1.3 Potential applications of explicit proofs

We discuss some potential applications of our density increment arguments.

Different communication models.

As an active research area, many techniques have been invented to prove communication lower bounds in past decades. However, many of these techniques are specific to a single communication model. For example, the rank method mainly applies to deterministic communication; information complexity is usually used for randomized communication. We believe density increment arguments provide more flexibility, with less dependency on the communication model. For example, in Theorem 1.1, we prove a lower bound similar to (but not exactly) a corruption bound; in Theorem 1.2, we prove randomized communication lower bounds; in Theorem 1.3, we show deterministic lower bounds (with a separation from randomized communication). Overall, we demonstrate that (at least for set-disjointness problems) density increment arguments combine the advantages of both lifting theorems and information complexity. To date, some communication models are still not fully understood. For example, the $(\exists{-}1)$-game is an interesting communication model with applications in extension complexity [GJW18]; however, to the best of our knowledge, we still do not have a generic way to prove extension complexity lower bounds. Another example is the number-on-forehead model. It would be interesting to see whether density increment arguments give new applications in these communication models.

Streaming lower bounds.

The connection between communication complexity and streaming lower bounds was explored in the seminal work of Alon, Matias and Szegedy [AMS99], which proved a streaming lower bound for frequency moment estimation via a reduction from unique-disjointness lower bounds. Many subsequent works have made great efforts to improve the lower bounds for this problem [BYJKS04, CKS03, Gro09, CCM08, AMOP08, GH09]. As [CMVW16] also pointed out, any improved lower bound for frequency moment estimation automatically improves lower bounds for many other streaming problems.

However, the optimal bound for this fundamental problem is still not clear (we focus on the random-order streaming model; [GH09] claimed a tight bound, but [CMVW16] pointed out a flaw in [GH09]). To the best of our knowledge, all current lower bounds rely on (black-box) reductions from $k\text{-}\mathrm{UDISJ}$ lower bounds. As we discussed, the $\Theta(n/k)$ bound for randomized communication of $k\text{-}\mathrm{UDISJ}$ is already tight, so black-box reductions seem to be a dead end for achieving tight bounds for frequency moment estimation.

To overcome this barrier, we believe an important step is to open the black box; put differently, we should extend communication lower-bound techniques to streaming models. Since our proof for $k\text{-}\mathrm{UDISJ}$ has fewer restrictions on models, it is reasonable to try this argument in streaming settings. Concretely, could we prove a tight lower bound for frequency moment estimation by the density increment argument?

Coordinate-wise correlated hard distributions.

Many proofs of randomized communication lower bounds start with Yao's minimax theorem and design a hard distribution. In some important applications, the hard distribution has strong correlations between input coordinates. A good example is Tseitin problems, whose lower bounds can be converted into lower bounds in proof complexity [GP18], extension complexity [GJW18], and monotone computation [PR17]. However, the hard distribution for Tseitin has complicated coordinate-wise correlations, which makes the information complexity argument difficult to use. Known lower bounds for randomized communication [GP18, GJW18] all lose a $\log n$ factor (including one based on a black-box reduction from two-party unique-disjointness [GP18]). Again, it seems this loss cannot be avoided in black-box reductions, and it would be very interesting to see whether our density increment arguments can break this barrier.

Acknowledgements.

The authors thank Shachar Lovett and Xinyu Mao for helpful discussions. We are grateful to Kewen Wu for reading early versions of this paper and providing useful suggestions.

2 Preliminaries

For an integer $n\geq 0$, we use $[n]$ to denote the set $\{1,2,\ldots,n\}$. Throughout, $\log(\cdot)$ is the logarithm with base $2$. For a finite domain $X$, we use $x\sim X$ to denote a random variable $x$ uniformly distributed over $X$.

Definition 2.1 (Entropy).

Let $D$ be a random variable on $X$. The entropy of $D$ is defined by

$$\mathcal{H}(D):=\sum_x\Pr[D=x]\cdot\log(1/\Pr[D=x]).$$

Let $A$ and $B$ be two random variables on $X$ and $Y$ respectively. The conditional entropy of $A$ given $B$ is defined by

$$\mathcal{H}(A\mid B)=\sum_{y\in Y}\Pr[B=y]\cdot\sum_{x\in X}\Pr[A=x\mid B=y]\cdot\log(1/\Pr[A=x\mid B=y]).$$
Definition 2.2 (Mutual information).

Let $A$ and $B$ be two (possibly correlated) random variables on $X$ and $Y$ respectively. The mutual information of $A$ and $B$ is defined by

$$\mathcal{I}(A:B)=\mathcal{H}(A)-\mathcal{H}(A\mid B)=\mathcal{H}(B)-\mathcal{H}(B\mid A).$$

Let $C$ be a random variable on $Z$. The conditional mutual information of $A$ and $B$ given $C$ is defined by

$$\mathcal{I}(A:B\mid C)=\mathcal{H}(A\mid C)-\mathcal{H}(A\mid B,C)=\mathcal{H}(B\mid C)-\mathcal{H}(B\mid A,C).$$

We use several basic properties of entropy and mutual information.

Fact 2.3.

Let $A$ and $B$ be two (possibly correlated) random variables on $X$ and $Y$ respectively.

  1. Conditional entropy inequality: $\mathcal{H}(B\mid A)\leq\mathcal{H}(B)$.

  2. Chain rule: $\mathcal{H}(A,B)=\mathcal{H}(A)+\mathcal{H}(B\mid A)=\mathcal{H}(B)+\mathcal{H}(A\mid B)$.

  3. Nonnegativity: $\mathcal{I}(A:B)\geq 0$.

  4. $\mathcal{I}(A:B)\leq\min\{\mathcal{H}(A),\mathcal{H}(B)\}$.
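To make these definitions concrete, here is a small self-contained Python example (our own toy illustration, not from the paper) computing $\mathcal{H}$ and $\mathcal{I}$ for a pair of correlated bits, using the identity $\mathcal{I}(A:B)=\mathcal{H}(A)+\mathcal{H}(B)-\mathcal{H}(A,B)$ implied by the chain rule:

```python
from collections import defaultdict
from math import log2

def entropy(dist):
    # dist: dict mapping outcomes to probabilities.
    return sum(p * log2(1 / p) for p in dist.values() if p > 0)

def mutual_information(joint):
    # joint: dict mapping (a, b) pairs to probabilities.
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    # I(A:B) = H(A) + H(B) - H(A,B), a consequence of the chain rule.
    return entropy(pa) + entropy(pb) - entropy(joint)

# A is a uniform bit and B is a noisy copy (flipped with prob. 1/4).
joint = {(0, 0): 3/8, (0, 1): 1/8, (1, 0): 1/8, (1, 1): 3/8}
print(entropy({0: 1/2, 1: 1/2}))   # H(A) = 1.0
print(mutual_information(joint))   # I(A:B) ≈ 0.1887
```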

3 Deterministic lower bound for multi-party unique-disjointness

In this section, we give a simple proof of Theorem 1.1, based on a density increment argument. We first formally define the problem. Throughout, we use binary strings to represent sets: we associate a set $A\subseteq[n]$ with the string $x\in\{0,1\}^n$ defined by $x(i)=1$ iff $i\in A$.

Definition 3.1 ($k\text{-}\mathrm{UDISJ}$, deterministic version).

For each $k\geq 2$ and $n\geq 1$, we define $D_0$ (no instances) and $D_*$ (yes instances) as follows:

  • $D_0:=\{(x_1,\dots,x_k)\in(\{0,1\}^n)^k:\forall i,\ x_1(i)+\dots+x_k(i)\leq 1\}$.

  • $D_*:=\{(x_1,\dots,x_k)\in(\{0,1\}^n)^k:\exists\ell,\ x_1(\ell)=\cdots=x_k(\ell)=1\text{ and }\forall i\neq\ell,\ x_1(i)+\dots+x_k(i)\leq 1\}$.

A $k$-party deterministic communication protocol $C$ solves $k\text{-}\mathrm{UDISJ}$ if:

  • for all $(x_1,\dots,x_k)\in D_0$, $C(x_1,\dots,x_k)=0$;

  • for all $(x_1,\dots,x_k)\in D_*$, $C(x_1,\dots,x_k)=1$.

Since the projection may fix some coordinates, we also define the projected instances. For a set $I\subseteq[n]$, we define $D_0^I$ (no instances on $I$) as

$$D_0^I:=\{(x_1,\dots,x_k)\in(\{0,1\}^I)^k:\forall i\in I,\ x_1(i)+\dots+x_k(i)\leq 1\},$$

and define $D_*^I$ (yes instances on $I$) as

$$D_*^I:=\{(x_1,\dots,x_k)\in(\{0,1\}^I)^k:\exists i\in I,\ x_1(i)=\cdots=x_k(i)=1\text{ and }\forall i'\neq i,\ x_1(i')+\dots+x_k(i')\leq 1\}.$$

We also partition the yes instances as $D_*^I=\bigcup_{i\in I}D_i^I$, where

$$D_i^I:=\{(x_1,\dots,x_k)\in(\{0,1\}^I)^k:x_1(i)=\cdots=x_k(i)=1\text{ and }\forall i'\neq i,\ x_1(i')+\dots+x_k(i')\leq 1\}.$$

Now we define the density function.

Definition 3.2 (Density function).

For each $I\subseteq[n]$ and $R=X_1\times\dots\times X_k\subseteq(\{0,1\}^I)^k$, we define its density function as

$$E^I(R):=\log\left(\frac{|R\cap D_0^I|}{(k+1)^{|I|}}\right).$$

Note that $E^I(R)\leq 0$ for any rectangle $R$ because $|D_0^I|=(k+1)^{|I|}$: for each coordinate $i\in I$, either no party holds $i$ or exactly one of the $k$ parties does, giving $k+1$ choices per coordinate. We simplify the notation to $E(R)$ if $I$ is clear from the context. A crucial step in our argument is the projection operation.

Definition 3.3 (Projection).

Let $R=X_1\times\dots\times X_k\subseteq(\{0,1\}^I)^k$ be a rectangle. For an $i\in I$ and $j\in[k]$, the projection of $R$ on $(i,j)$ is a rectangle $\Pi_{i,j}(R)=X_1'\times\cdots\times X_k'\subseteq(\{0,1\}^{I\setminus\{i\}})^k$ defined by:

  • for each $j'\neq j$, $X_{j'}':=\{x'\in\{0,1\}^{I\setminus\{i\}}:(x',0)\in X_{j'}\}$,

  • for $j$, $X_j':=\{x'\in\{0,1\}^{I\setminus\{i\}}:\text{ either }(x',0)\in X_j\text{ or }(x',1)\in X_j\}$.

Here $(x',0)$ denotes the string in $\{0,1\}^I$ obtained by extending $x'\in\{0,1\}^{I\setminus\{i\}}$ with $x_i=0$.

The projection operation has two useful properties. The first one is that projection preserves the monochromatic property of the rectangle.

Fact 3.4.

Let $R$ be a rectangle such that $R\cap D_*^I=\emptyset$. Then for every $i\in I$ and $j\in[k]$, we have

$$\Pi_{i,j}(R)\cap D_*^{I\setminus\{i\}}=\emptyset.$$

The proof of Fact 3.4 follows from the definition and we omit it here. The next property is phrased as the following projection lemma.

Lemma 3.5 (Projection lemma).

Let $R=X_1\times\dots\times X_k\subseteq(\{0,1\}^I)^k$ be a rectangle. If there is a coordinate $i\in I$ such that $R\cap D_i^I=\emptyset$, then there is some $j\in[k]$ such that

$$E^{I\setminus\{i\}}(\Pi_{i,j}(R))\geq E^I(R)+1/k.$$

Given Lemma 3.5 and Fact 3.4, Theorem 1.1 becomes straightforward: we simply repeat the projection $n$ times, once for each $i\in[n]$, where each time we use Lemma 3.5 to choose a good $j$ for the projection and increase the density function by $1/k$. Now we prove Lemma 3.5.
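For completeness, the resulting chain of inequalities reads as follows: starting from a rectangle $R$ with $R\cap D_*=\emptyset$, Fact 3.4 keeps every projected rectangle 0-monochromatic, so Lemma 3.5 applies $n$ times and yields a final rectangle $R^*$ over $I=\emptyset$ with

$$0\geq E^{\emptyset}(R^*)\geq E^{[n]}(R)+n\cdot\frac{1}{k},\qquad\text{i.e.,}\qquad |R\cap D_0|\leq 2^{-n/k}\cdot(k+1)^n=2^{-n/k}\cdot|D_0|.$$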

Proof of Lemma 3.5.

Let $R$ be a rectangle such that $R\cap D_i^I=\emptyset$. Let $I':=I\setminus\{i\}$ and

$$L:=\{x'\in D_0^{I'}:\exists x\in R\cap D_0^I,\ x|_{I'}=x'\}.$$

Here $x|_{I'}$ is the restriction of $x$ to $I\setminus\{i\}$. Note that for all $j\in[k]$, $\Pi_{i,j}(R)\cap D_0^{I'}=\Pi_{i,j}(R)\cap L$, and our goal is to show that there is a $j$ such that $|\Pi_{i,j}(R)\cap L|$ is large.

For every $x'\in L$, define the extension set of $x'$ as $\mathrm{ext}(x'):=\{x\in R\cap D_0^I:x|_{I'}=x'\}$. Crucially, for every $x'\in L$, we have

$$|\mathrm{ext}(x')|\leq k. \tag{1}$$

Note that, without the condition $R\cap D_i^I=\emptyset$, this quantity can only be bounded by $k+1$. Inequality (1) is proved by contradiction: suppose there is an $x'=(x_1',\dots,x_k')\in L$ such that $|\mathrm{ext}(x')|=k+1$. Then we must have $(x_1',1)\in X_1,\dots,(x_k',1)\in X_k$; since $R$ is a rectangle, this contradicts $R\cap D_i^I=\emptyset$.

We now continue our proof. Partition $L$ into two parts:

$$A:=\{x'\in L:|\mathrm{ext}(x')|\geq 2\}\quad\text{and}\quad B:=\{x'\in L:|\mathrm{ext}(x')|=1\}.$$

First observe that for any $x'=(x_1',\ldots,x_k')\in A$, we have $(x_j',0)\in X_j$ for every $j\in[k]$, as $R$ is a rectangle. This implies $x'\in\Pi_{i,j}(R)$ for all $j\in[k]$. Hence

$$|A|=|A\cap\Pi_{i,j}(R)|,\quad\forall j\in[k].$$

Applying (1) to $x'\in A$, we have

$$k\cdot|A\cap\Pi_{i,j}(R)|=k\cdot|A|\geq|\{x\in R\cap D_0^I:x|_{I'}\in A\}|,\quad\forall j\in[k].$$

For $x'\in B$, since $|\mathrm{ext}(x')|=1$, we have

$$|\{x\in R\cap D_0^I:x|_{I'}\in B\}|=|B|.$$

On the other hand, for every $x'\in L$, there always exists some $j\in[k]$ such that $x'\in\Pi_{i,j}(R)$. By an averaging argument, there is at least one $j\in[k]$ such that

$$k\cdot|B\cap\Pi_{i,j}(R)|\geq|B|=|\{x\in R\cap D_0^I:x|_{I'}\in B\}|.$$

As a result, for this fixed $j$ we have

$$k\cdot|L\cap\Pi_{i,j}(R)|=k\cdot|B\cap\Pi_{i,j}(R)|+k\cdot|A\cap\Pi_{i,j}(R)|\geq|\{x\in R\cap D_0^I:x|_{I'}\in A\}|+|\{x\in R\cap D_0^I:x|_{I'}\in B\}|=|R\cap D_0^I|.$$

By the definition of the density function, we have

$$E^{I'}(\Pi_{i,j}(R))=\log\left(\frac{|\Pi_{i,j}(R)\cap D_0^{I'}|}{(k+1)^{|I'|}}\right)=\log\left(\frac{(k+1)\cdot|\Pi_{i,j}(R)\cap L|}{(k+1)^{|I|}}\right)\geq\log\left(\frac{(k+1)\cdot|R\cap D_0^I|}{k\cdot(k+1)^{|I|}}\right)=E^I(R)+\log(1+1/k)\geq E^I(R)+1/k,$$

where the last step uses $\log(1+1/k)\geq 1/k$ for all $k\geq 1$.

4 Randomized lower bound for multi-party unique-disjointness

In this section, we focus on randomized communication lower bounds. By Yao's minimax theorem, this is equivalent to identifying a distribution $\mathcal{P}$ that is hard on average for every deterministic communication protocol. We use the same notation $D_0,D_*,D_0^I,D_*^I$ as in the previous section (Definition 3.1).

Our hard distribution $\mathcal{P}$ is supported on $D_0\cup D_*$.

Definition 4.1.

For any $n,k\geq 1$, we define the hard distribution $\mathcal{P}$ on $(\{0,1\}^n)^k$ as follows.

  1. For every $i\in[n]$, uniformly and independently sample $\mathcal{W}_i\sim[k]$ and $\mathcal{A}_i\sim\{0,1\}$.

  2. For every $i\in[n]$ and $j\in[k]$: if $\mathcal{W}_i=j$ and $\mathcal{A}_i=1$, then set $x_j(i)=1$; otherwise set $x_j(i)=0$.

  3. Sample $\mathcal{B}\sim\{0,1\}$ and $\ell\sim[n]$ uniformly. If $\mathcal{B}=1$, then update $x_j(\ell)=1$ for all $j\in[k]$.

  4. Output $x=(x_1,\dots,x_k)$.

Given this hard distribution $\mathcal{P}$, we also define the distribution $\mathcal{Q}:=(\mathcal{P}\mid\mathcal{B}=0)$.
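For concreteness, the sampling procedure is easy to implement; below is a short Python sketch (our own illustration; the function names are ours) of one draw from $\mathcal{P}$, with $\mathcal{Q}=(\mathcal{P}\mid\mathcal{B}=0)$ obtained by rejection sampling:

```python
import random

def sample_P(n, k):
    # Steps 1-2: coordinate i is offered to a single party W_i, who
    # holds it iff A_i = 1.
    x = [[0] * n for _ in range(k)]
    for i in range(n):
        w = random.randrange(k)   # W_i uniform over the k parties
        a = random.randrange(2)   # A_i uniform over {0, 1}
        if a == 1:
            x[w][i] = 1
    # Step 3: with probability 1/2, plant a common element at l.
    b = random.randrange(2)       # B uniform over {0, 1}
    if b == 1:
        l = random.randrange(n)   # l uniform over the n coordinates
        for j in range(k):
            x[j][l] = 1
    return x, b

def sample_Q(n, k):
    # Q = (P | B = 0): rejection sampling until B = 0.
    while True:
        x, b = sample_P(n, k)
        if b == 0:
            return x
```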

Now we give some explanation of the random variables in this sampling process.

  • The bit $\mathcal{B}$ determines whether to output a yes instance or a no instance. In particular, for $\mathcal{B}=1$, we output a yes instance; hence we update $x_j(\ell)=1$ for all $j\in[k]$, where $\ell$ is sampled uniformly.

  • For every $i$, the variable $\mathcal{W}_i\in[k]$ captures which party ($\mathcal{W}_i$) may hold the $i$-th element.

  • For every $i$, $\mathcal{A}_i$ determines whether party $\mathcal{W}_i$ holds the $i$-th element or not.

It is well known that a deterministic protocol $C$ with communication complexity $c$ partitions the input domain into at most $2^c$ rectangles, where each rectangle corresponds to a leaf of the communication tree. We then define the following random variable $\mathcal{R}$, the rectangle of a random leaf induced by the input distribution $\mathcal{Q}$.

Definition 4.2.

For a fixed deterministic protocol $C$, we define a distribution $\mathcal{R}_C$ on leaf rectangles (of $C$) as follows.

  1. Randomly sample $x\sim\mathcal{Q}$.

  2. Output the rectangle $R$ of $C$ containing $x$.

We emphasize that $\mathcal{R}$ is defined with respect to $\mathcal{Q}$, not $\mathcal{P}$. Hence, for a protocol $C$ with a small error on $\mathcal{P}$, the random rectangle $R\sim\mathcal{R}$ should be biased towards $D_0$ with high probability.

If $C$ is clear from the context, we simply write $\mathcal{R}$ for $\mathcal{R}_C$. For any rectangle $R$, we also use $\mathcal{Q}_R$ to denote the distribution $(\mathcal{Q}\mid\mathcal{R}=R)$. Let $\mathcal{W}=(\mathcal{W}_1,\dots,\mathcal{W}_n)$; we use $(\mathcal{Q},\mathcal{W})$ to denote the joint distribution of $\mathcal{Q}$ and $\mathcal{W}$. We are now ready to state our theorem.

Theorem 4.3.

Let $0<\epsilon<0.0001$ be a constant. For any deterministic protocol $C$ with error $\epsilon$ under $\mathcal{P}$, i.e.,

$$\Pr_{x\sim\mathcal{P}}[C(x)=k\text{-}\mathrm{UDISJ}(x)]\geq 1-\epsilon,$$

we have

$$\mathcal{I}(\mathcal{Q},\mathcal{W}:\mathcal{R}_C)=\Omega(n/k).$$

We note that Theorem 4.3 implies Theorem 1.2 because the communication complexity of $C$ is lower bounded by $\mathcal{H}(\mathcal{R})$, and $\mathcal{H}(\mathcal{R})$ is an upper bound on $\mathcal{I}((\mathcal{Q},\mathcal{W}):\mathcal{R})$ (Fact 2.3).

A similar lower bound on $\mathcal{I}(\mathcal{Q}:\mathcal{R}\mid\mathcal{W})$ was previously obtained via the information complexity framework [Gro09]. We reprove it by a density increment argument. In what follows, we fix the protocol $C$. We first give a high-level view of our proof.

Sketch of the proof.

We first reinterpret the proof of the deterministic lower bound (Section 3) from an entropy perspective. Then we generalize it to randomized communication lower bounds.

Let $C$ be a deterministic communication protocol for $k\text{-}\mathrm{UDISJ}$. Every leaf of $C$ is a monochromatic rectangle. Let $R$ be any 0-monochromatic rectangle (i.e., $R\cap D_*=\emptyset$) of $C$. Then for every input $x^*\in R\cap D_0$ and $i\in[n]$,

$$\Pr_{x=(x_1,\dots,x_k)\sim\mathcal{P}}\left[x_1(i)=\cdots=x_k(i)=1\mid x\in R\text{ and }\forall i'\neq i,\ x(i')=x^*(i')\right]=0 \tag{2}$$

since $R$ is 0-monochromatic. Furthermore, since $R$ is a rectangle, there is a party $j$ such that

$$\Pr_{x=(x_1,\dots,x_k)\sim\mathcal{P}}\left[x_j(i)=1\mid x\in R\text{ and }\forall i'\neq i,\ x(i')=x^*(i')\right]=0.$$

Recall that $\mathcal{Q}=(\mathcal{P}\mid\mathcal{B}=0)$ samples no instances. Thus we also have

$$\Pr_{x=(x_1,\dots,x_k)\sim\mathcal{Q}}\left[x_j(i)=1\mid x\in R\text{ and }\forall i'\neq i,\ x(i')=x^*(i')\right]=0.$$

By the definition of $\mathcal{R}$, this is equivalent to (from now on, when $x\sim\mathcal{Q}$, we replace the notation $x\in R$ with $\mathcal{R}=R$)

$$\Pr_{x=(x_1,\dots,x_k)\sim\mathcal{Q}}\left[x_j(i)=1\mid\mathcal{R}=R\text{ and }\forall i'\neq i,\ x(i')=x^*(i')\right]=0.$$

In entropy language, recalling the definition of $\mathcal{W}=(\mathcal{W}_1,\dots,\mathcal{W}_n)$, this is equivalent to

$$\mathcal{H}\left(x_j(i)\mid\mathcal{R}=R,\ \mathcal{W}_i=j\text{ and }\forall i'\neq i,\ x(i')=x^*(i')\right)=0.$$

In contrast, if we do not condition on $R$, we have

$$\Pr_{x=(x_1,\dots,x_k)\sim\mathcal{Q}}\left[x_j(i)=1\mid\mathcal{W}_i=j\text{ and }\forall i'\neq i,\ x(i')=x^*(i')\right]=1/2,$$

which can be written as,

$$\mathcal{H}\left(x_j(i)\mid\mathcal{W}_i=j\text{ and }\forall i'\neq i,\ x(i')=x^*(i')\right)=1.$$

This gap captures the mutual information between $\mathcal{R}$ and $(\mathcal{Q},\mathcal{W})$ on the $i$-th coordinate.

For different choices of $x^*\in R\cap D_0$, different parties $j$ may witness the mutual information. But on average (the witnessing party $j$ is chosen by $\mathcal{W}_i$ with probability $1/k$, and the entropy is at most $1$ otherwise), we have

$$\mathbb{E}_j\left[\mathcal{H}\left(x_j(i)\mid\mathcal{R}=R,\ \mathcal{W}_i=j\text{ and }x(1),\dots,x(i-1),x(i+1),\dots,x(n)\right)\right]\leq 1-1/k.$$

In particular, there exists a $j\in[k]$ such that

$$\mathcal{H}\left(x_j(i)\mid\mathcal{R}=R,\ \mathcal{W}_i=j,\ x(1),\dots,x(i-1),x(i+1),\dots,x(n)\right)\leq 1-1/k.$$

Now we explain how the projection can be viewed as a decoupling process for this mutual information. We decompose the projection into two steps:

  1. Fix $\mathcal{W}_i=j$, i.e., update $(\tilde{\mathcal{Q}},\tilde{\mathcal{W}})\leftarrow(\mathcal{Q},\mathcal{W}\mid\mathcal{W}_i=j)$.

  2. Update the density function to

    $$\mathcal{H}\left(\tilde{\mathcal{Q}}_{[n]\setminus\{i\}},\tilde{\mathcal{W}}_{[n]\setminus\{i\}}\mid\mathcal{R}=R\right)-\mathcal{H}\left(\tilde{\mathcal{Q}}_{[n]\setminus\{i\}},\tilde{\mathcal{W}}_{[n]\setminus\{i\}}\right),$$

    or equivalently

    $$\mathcal{H}\left(\mathcal{Q}_{[n]\setminus\{i\}},\mathcal{W}_{[n]\setminus\{i\}}\mid\mathcal{R}=R,\mathcal{W}_i=j\right)-\mathcal{H}\left(\mathcal{Q}_{[n]\setminus\{i\}},\mathcal{W}_{[n]\setminus\{i\}}\mid\mathcal{W}_i=j\right),$$

    where $\mathcal{Q}_{[n]\setminus\{i\}}$ (resp. $\tilde{\mathcal{Q}}_{[n]\setminus\{i\}}$) is the marginal distribution of $\mathcal{Q}$ (resp. $\tilde{\mathcal{Q}}$) on $[n]\setminus\{i\}$.

In the first step, we pick the party $j$ that carries the mutual information. In the second step, we decouple the mutual information by simply removing it from the density function. The projection lemma (Lemma 3.5) captures how this decoupling step increases the density function. Another crucial fact is that, for any 0-monochromatic rectangle $R$, the distribution $(\mathcal{Q}_{[n]\setminus\{i\}}\mid\mathcal{R}=R,\mathcal{W}_i=j)$ is still supported on $D_0^{[n]\setminus\{i\}}$ (see Fact 3.4), which guarantees that we can keep increasing the density via projections on different coordinates.

Now we generalize this to the randomized communication setting, where the rectangle $R$ is not necessarily monochromatic. By the correctness of the protocol, most rectangles $R$ are biased towards either yes instances or no instances.

For a rectangle $R$ biased towards no instances, we expect an inequality similar to (2) to hold: for most $R\sim\mathcal{R}$, most no instances $x^*\sim(\mathcal{Q}\mid\mathcal{R}=R)$, and most $i\sim[n]$, it holds that

$$\Pr_{x=(x_1,\dots,x_k)\sim\mathcal{P}}\left[x_1(i)=\cdots=x_k(i)=1\mid x\in R\text{ and }\forall i'\neq i,\ x(i')=x^*(i')\right]\leq\delta,$$

where $\delta$ is a small constant depending on the error rate $\epsilon$ of the protocol.

On the other hand, we also need to argue that projections can be repeated. This part is slightly more complicated than the deterministic case, where we could simply fix $\mathcal{W}_i=j$ for some $j\in[k]$. In the randomized case, we cannot fix it because we have to preserve the bias. This is addressed by:

  • Bias lemma (Lemma 4.11), a randomized variant of Fact 3.4.

  • Projection lemma (Lemma 4.10), a randomized variant of Lemma 3.5.

4.1 Key definitions and lemmas

Now we introduce the key definitions and lemmas (bias lemma and projection lemma) needed for the randomized communication lower bound.

Definition 4.4 ($\rho$-restriction).

For $J\subseteq[n]$ and $w_J\in[k]^J$, we call $\rho=(J,w_J)$ a restriction, and denote by $(\mathcal{Q},\mathcal{W}|_\rho)$ the distribution $(\mathcal{Q},\mathcal{W}\mid\forall i\in J,\ \mathcal{W}_i=w_i)$.

Restrictions correspond to projections in the deterministic case: for $\rho=(J,w_J)$, each $i\in J$ corresponds to the projection $\Pi_{i,w_i}$. Now we define our new density function.

Definition 4.5 (Density function).

Let $R$ be a rectangle and $I\subseteq[n]$ a set of coordinates. For a restriction $\rho=(I^c,w_{I^c})$ with $I^c=[n]\setminus I$, the density is defined by

$$E^I(R,\rho):=\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\rho,\mathcal{R}=R)-\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\rho).$$

The average density is defined by

$$E^I:=\underset{(\rho,R)\sim(\mathcal{W}_{I^c},\mathcal{R})}{\mathbb{E}}\left[E^I(R,\rho)\right]=\underset{\rho}{\mathbb{E}}\left[-\mathcal{I}(\mathcal{Q}_I,\mathcal{W}_I:\mathcal{R}\mid\rho)\right]=-\mathcal{I}(\mathcal{Q}_I,\mathcal{W}_I:\mathcal{R}\mid\mathcal{W}_{I^c}).$$

In particular, $E^{[n]}=-\mathcal{I}(\mathcal{Q},\mathcal{W}:\mathcal{R})$.
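The last identity is a direct unfolding of the definitions: for $I=[n]$ the restriction $\rho$ is empty, so

$$E^{[n]}=\underset{R\sim\mathcal{R}}{\mathbb{E}}\left[\mathcal{H}(\mathcal{Q},\mathcal{W}\mid\mathcal{R}=R)\right]-\mathcal{H}(\mathcal{Q},\mathcal{W})=\mathcal{H}(\mathcal{Q},\mathcal{W}\mid\mathcal{R})-\mathcal{H}(\mathcal{Q},\mathcal{W})=-\mathcal{I}(\mathcal{Q},\mathcal{W}:\mathcal{R}).$$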

The main difference between the deterministic and randomized settings is that in the deterministic case we consider $E^I(R,\rho)$ for some fixed $R$ and $\rho$, whereas in the randomized case we have to consider $E^I$, which does not fix $R$ and $\rho$, because the projection lemma (Lemma 4.10) and the bias lemma (Lemma 4.11) are not preserved under fixed $R$ and $\rho$.

We also note that $-E^I(R,\rho)$ might be negative for some $R$ and $\rho$, but $-E^I$ is always nonnegative because it is a mutual information.

As mentioned before, in the randomized setting the leaves are no longer monochromatic but merely biased. We now define the following notion of bias to capture this (a randomized version of equation (2)).

Definition 4.6.

Let $R$ be a rectangle and $I\subseteq[n]$. Let $\rho=(I^c,w_{I^c})$ be a restriction. For any $i\in I$ and input $x^*\in D_0^{I\setminus\{i\}}$, the bias of $x^*$ on coordinate $i$ under $(R,\rho)$ is defined by

$$\gamma_{i,\rho,R}^I(x^*):=\Pr_{x=(x_1,\dots,x_k)\sim\mathcal{P}}\left[x_1(i)=\cdots=x_k(i)=1\ \middle|\ \rho,\ x\in R,\ x\notin\bigcup_{\ell\in I^c}D_\ell\text{ and }\forall i'\in I\setminus\{i\},\ x(i')=x^*(i')\right],$$

where $D_\ell\subseteq D_*$ is the set of yes instances whose intersection is witnessed at coordinate $\ell$, i.e., $D_\ell$ is the support of $(\mathcal{P}\mid\mathcal{B}=1,\ell)$. (Though $R$ is the leaf conditioned on an input from $\mathcal{Q}=(\mathcal{P}\mid\mathcal{B}=0)$, it is still possible that $\mathcal{P}(D_\ell\cap R)>0$ since the protocol is allowed to err; that is why $x\notin\bigcup_{\ell\in I^c}D_\ell$ is not implied by $\mathcal{R}=R$.) We then define the average bias of a rectangle $R$ on $i$ as

$$\gamma_{i,R}^I:=\underset{(x^*,\rho)\sim(\mathcal{Q}_{I\setminus\{i\}},\mathcal{W}_{I^c}\mid\mathcal{R}=R)}{\mathbb{E}}\left[\gamma_{i,\rho,R}^I(x^*)\right].$$

The overall bias on $i$ is defined by

$$\gamma_i^I:=\underset{R\sim\mathcal{R}}{\mathbb{E}}\left[\gamma_{i,R}^I\right].$$

Finally, we define the projection for randomized communication. Recall that in the deterministic case, the projection consisted of two steps. In the randomized case, since we average over $\rho$, we can remove the first step. The projection can then be defined as follows.

Definition 4.7 (Projection).

Let $I\subseteq[n]$ be the set of unrestricted coordinates. For any $i\in I$, the projection on $i$ updates the density function from $E^I$ to $E^{I\setminus\{i\}}$.

Remark 4.8.

We may use different projections for different communication problems. For example, the BPP lifting theorem [GPW17] uses a very different projection because it studies low-discrepancy gadgets. We define the projection in this way because we are working with AND gadgets. Given this flexibility, we believe density increment arguments may provide new applications beyond the information complexity framework.

Now we introduce three key lemmas in our proof.

Lemma 4.9.

Let $\epsilon\in(0,0.0001)$ be a constant and let $C$ be a deterministic protocol with error $\epsilon$ under the distribution $\mathcal{P}$. There is a constant $\delta\in(0,0.02)$ (depending only on $\epsilon$) and a set of coordinates $J\subseteq[n]$ with $|J|=\Omega(n)$ such that $\gamma_i^{[n]}\leq\delta$ holds for each $i\in J$.

Since $C$ is a protocol with a small error under $\mathcal{P}$ and $\mathcal{R}$ is sampled according to $\mathcal{Q}$ (no instances), for a random $R\sim\mathcal{R}$ it is very likely that $R$ is biased towards no instances. Lemma 4.9 then follows from an averaging argument. It generalizes the deterministic case, where $\gamma_i^{[n]}=0$ for all $i\in[n]$. The proof of Lemma 4.9 is deferred to Section A as part of the proof of Lemma 4.11.

Lemma 4.10 (Projection lemma).

Let $\delta\in(0,0.02)$ be a constant. For any $I\subseteq[n]$ and $i\in I$, if $\gamma_i^I\leq\delta$, then the projection on $i$ increases the density function by $\Omega(1/k)$, i.e.,

$$E^{I\setminus\{i\}}\geq E^I+\Omega(1/k).$$

The projection lemma shows that the density function increases if we do a projection on a biased coordinate. We prove it in Section 4.2.

Our last lemma shows that the bias is preserved during projections; it is the counterpart of Fact 3.4 in the deterministic case.

Lemma 4.11 (Bias lemma).

Let $\delta>0$ be the constant and $J\subseteq[n]$ the set from Lemma 4.9. For any $I\subseteq[n]$ and distinct $i,i'\in I\cap J$, we have

$$\gamma_{i'}^{I\setminus\{i\}}\leq\delta.$$

This lemma can be proved by a convexity inequality, and its proof is deferred to Section A. Now we summarize these three lemmas and complete the proof of Theorem 4.3.

  • Lemma 4.9 shows that, if $C$ is a communication protocol with a small error under $\mathcal{P}$, then $\gamma_i^{[n]}$ is very small for many coordinates $i$.

  • The projection lemma (Lemma 4.10) converts the bias on a coordinate $i$ into a density increment for the projection on $i$.

  • The bias lemma (Lemma 4.11) shows that a projection on a coordinate $i$ preserves the bias on the other coordinates $i'$, so the projection lemma can be applied many times.

Proof of Theorem 4.3.

Assume $C$ has error $\epsilon\in(0,0.0001)$ under the distribution $\mathcal{P}$. By Lemma 4.9, there is a constant $\delta\in(0,0.02)$ and a set of coordinates $J\subseteq[n]$ with $|J|=\Omega(n)$ such that $\gamma_i^{[n]}\leq\delta$ for every $i\in J$.

Let $I=[n]\setminus J$. Then, iteratively applying Lemma 4.10 and Lemma 4.11 to the coordinates in $J$, we have

$$E^I\geq E^{[n]}+\Omega(|J|/k)=E^{[n]}+\Omega(n/k).$$

By the definition of the density function, we know $E^{[n]}=-\mathcal{I}(\mathcal{Q},\mathcal{W}:\mathcal{R})$. Since $-E^I$ is always nonnegative, we have

$$\mathcal{I}(\mathcal{Q},\mathcal{W}:\mathcal{R})=-E^{[n]}\geq-E^I+\Omega(n/k)\geq\Omega(n/k).$$

4.2 Proof of the projection lemma

Now we prove Lemma 4.10. Recall that

$$E^I=\underset{(\rho,R)\sim(\mathcal{W}_{I^c},\mathcal{R})}{\mathbb{E}}\left[\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\rho,\mathcal{R}=R)-\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\rho)\right]$$

and

$$\gamma_i^I=\underset{R\sim\mathcal{R}}{\mathbb{E}}\left[\gamma_{i,R}^I\right].$$

We aim to show that if $\gamma_i^I\leq\delta$ for some $\delta\in(0,0.02)$, then

$$E^{I\setminus\{i\}}\geq E^I+\Omega(1/k).$$

In our proof, we borrow a useful lemma from [Gro09] and [Jay09], where it was used to analyze the information cost.

Lemma 4.12 ([Gro09, Theorem 3.16]).

Let $\delta<0.02$ be a constant and $I\subseteq[n]$. Fix a deterministic protocol $C$. If $\gamma_i^I\leq\delta$, then

$$\mathcal{H}(\mathcal{Q}_i\mid\mathcal{W}_i)-\mathcal{H}(\mathcal{Q}_i\mid\mathcal{R},\mathcal{Q}_{I\setminus\{i\}},\mathcal{W})=\Omega(1/k).$$

Though Lemma 4.12 is not stated exactly as in [Gro09, Jay09], the proof is similar and we omit it here. We will include the proof in our full version.

Proof of Lemma 4.10.

Recall

$$E^I=\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\mathcal{W}_{I^c},\mathcal{R})-\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\mathcal{W}_{I^c}).$$

Since $\mathcal{W}_{I^c}$ is independent of $(\mathcal{Q}_I,\mathcal{W}_I)$, we have

$$E^I=\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\mathcal{W}_{I^c},\mathcal{R})-\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I).$$

Similarly, $\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I)-\mathcal{H}(\mathcal{Q}_{I\setminus\{i\}},\mathcal{W}_{I\setminus\{i\}})=\mathcal{H}(\mathcal{Q}_i,\mathcal{W}_i)$, since $(\mathcal{Q}_i,\mathcal{W}_i)$ and $(\mathcal{Q}_{I\setminus\{i\}},\mathcal{W}_{I\setminus\{i\}})$ are independent. Hence,

$$E^{I\setminus\{i\}}-E^I=\mathcal{H}(\mathcal{Q}_i,\mathcal{W}_i)-\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\mathcal{W}_{I^c},\mathcal{R})+\mathcal{H}(\mathcal{Q}_{I\setminus\{i\}},\mathcal{W}_{I\setminus\{i\}}\mid\mathcal{W}_{I^c},\mathcal{W}_i,\mathcal{R}).$$

Applying the chain rule of entropy to $\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\mathcal{R},\mathcal{W}_{I^c})$, i.e.,

$$\mathcal{H}(\mathcal{Q}_I,\mathcal{W}_I\mid\mathcal{R},\mathcal{W}_{I^c})=\mathcal{H}(\mathcal{W}_i\mid\mathcal{R},\mathcal{W}_{I^c})+\mathcal{H}(\mathcal{Q}_{I\setminus\{i\}},\mathcal{W}_{I\setminus\{i\}}\mid\mathcal{R},\mathcal{W}_i,\mathcal{W}_{I^c})+\mathcal{H}(\mathcal{Q}_i\mid\mathcal{R},\mathcal{Q}_{I\setminus\{i\}},\mathcal{W}),$$

we have

$$E^{I\setminus\{i\}}-E^I=\mathcal{H}(\mathcal{Q}_i,\mathcal{W}_i)-\mathcal{H}(\mathcal{W}_i\mid\mathcal{R},\mathcal{W}_{I^c})-\mathcal{H}(\mathcal{Q}_i\mid\mathcal{R},\mathcal{Q}_{I\setminus\{i\}},\mathcal{W}).$$

By the chain rule $\mathcal{H}(\mathcal{Q}_i,\mathcal{W}_i)=\mathcal{H}(\mathcal{W}_i)+\mathcal{H}(\mathcal{Q}_i\mid\mathcal{W}_i)$ and the fact that $\mathcal{H}(\mathcal{W}_i)\geq\mathcal{H}(\mathcal{W}_i\mid\mathcal{R},\mathcal{W}_{I^c})$, we conclude that

$$E^{I\setminus\{i\}}-E^I\geq\mathcal{H}(\mathcal{Q}_i\mid\mathcal{W}_i)-\mathcal{H}(\mathcal{Q}_i\mid\mathcal{R},\mathcal{Q}_{I\setminus\{i\}},\mathcal{W}).$$

Finally, by Lemma 4.12 and the fact that $\gamma_i^I\leq\delta<0.02$, we have

$$E^{I\setminus\{i\}}-E^I\geq\Omega(1/k).$$

5 Deterministic lower bounds for sparse unique-disjointness

In this section, we discuss the sparse unique-disjointness problem.

Definition 5.1.

For each $s\geq 2$ and $n\geq 1$, the $s$-UDISJ problem is defined as follows:

  • No instances: $D_0^{(s)}:=\{(x,y):|x|,|y|\leq s\text{ and }\forall i,\ x(i)+y(i)\leq 1\}$.

  • Yes instances: $D_*^{(s)}:=\{(x,y):|x|,|y|\leq s\text{ and }\exists\ell,\ x(\ell)=y(\ell)=1\text{ and }\forall i\neq\ell,\ x(i)+y(i)\leq 1\}$.

Here $|x|$ is the Hamming weight of $x$.

Theorem 1.3 states that any deterministic communication protocol for $s$-UDISJ requires $\Omega(s\cdot\log(n/s))$ communication bits. To prove this theorem, we consider the following unique-equality problem [ST13, LM19].

Definition 5.2.

Let $s\geq 2$ and $n\geq 1$ be integers, and let $B$ be a set with $n/s$ elements. The $s$-UEQUAL problem is defined as follows:

  • No instances: $B_0^{(s)}:=\{(x,y)\in B^s\times B^s:\forall i\in[s],\ x_i\neq y_i\}$.

  • Yes instances: $B_*^{(s)}:=\{(x,y)\in B^s\times B^s:\exists\ell,\ x_\ell=y_\ell\text{ and }\forall i\in[s]\setminus\{\ell\},\ x_i\neq y_i\}$.

There is a simple reduction from $s$-UEQUAL to $s$-UDISJ [ST13], so it suffices to prove a communication lower bound for $s$-UEQUAL. In Theorem 1.3, we focus on the regime $s\leq n^{1/2-\epsilon}$ for an arbitrarily small constant $\epsilon>0$. Our goal is now to prove that the communication complexity of $s$-UEQUAL is $\Omega(s\cdot\log(n/s))=\Omega(s\cdot\log n)$.
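The reduction is presumably the standard encoding (our reconstruction; [ST13] may differ in details): Alice maps $x\in B^s$ to the $s$-element set $\{(i,x_i):i\in[s]\}$ over the universe $[s]\times B$ of size $s\cdot(n/s)=n$, Bob does the same with $y$, and the number of coordinates with $x_i=y_i$ equals the intersection size. In Python:

```python
def encode(v):
    # Map v in B^s (B identified with {0, ..., n/s - 1}) to an
    # s-element subset of the universe [s] x B, which has n elements.
    return {(i, b) for i, b in enumerate(v)}

# Agreement at a unique coordinate <=> a unique intersection:
x, y = [0, 2, 1], [1, 2, 0]                # s = 3, agree only at i = 1
assert len(encode(x) & encode(y)) == 1     # a yes instance of s-UDISJ
```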

We borrow the square idea from [LM19], but revise and simplify it since we do not need to fully simulate the protocol. See Section 5.2 for discussions.

Definition 5.3 (Square).

Let $R=X\times Y\subseteq B^s\times B^s$ be a rectangle. A square in $R$ consists of a set $I\subseteq[s]$, a set $S\subseteq B^I$, and, for every $i\in[s]\setminus I$, a set $A_i\subseteq B$. We denote the family of these $A_i$'s by $\mathcal{A}$.

Given $(I,S,\mathcal{A})$, we say it is a square in $R=X\times Y$ if, for every $z\in S$, there exist some $x\in X$ and $y\in Y$ such that:

  • $x|_I=z$ and, for all $i\in[s]\setminus I$, $x_i\in A_i$;

  • $y|_I=z$ and, for all $i\in[s]\setminus I$, $y_i\in B\setminus A_i$.

As in previous sections, we use the set $I$ to denote the unrestricted coordinates and $[s]\setminus I$ to denote the fixed coordinates. We remark that the definition above enforces $x_i\neq y_i$ (as $x_i\in A_i$ and $y_i\in B\setminus A_i$) for all $i\in[s]\setminus I$. Hence, the fixed coordinates do not reveal any information about whether the input is a yes instance or a no instance.

Similar to the Raz-McKenzie simulation, our proof also uses a notion of thickness.

Definition 5.4 (Thickness).

A set $S\subseteq B^I$ is $r$-thick if it is nonempty and, for every $i\in I$ and $x\in S$, we have

$$|\{x'\in S:\forall j\neq i,\ x_j'=x_j\}|\geq r.$$

We say that a square $(I,S,\mathcal{A})$ is $r$-thick if the set $S$ is $r$-thick.

In our proof, we always choose $r=10\cdot\log n$, and we sometimes abbreviate $r$-thick as thick. The following thickness-to-full-range lemma is a standard fact in query-to-communication simulations.

Lemma 5.5.

Let $S\subseteq B^I$ be a thick set. Then for every $z\in\{0,1\}^I$, there is a pair $x,y\in S$ such that

$$\forall i\in I,\quad z_i=1\ \text{ iff }\ x_i=y_i.$$

The proof of this lemma will be included in the full version. As a byproduct of this lemma, we have the following corollary.

Corollary 5.6.

Let $R$ be a rectangle containing a square $(I,S,\mathcal{A})$ such that $I\neq\emptyset$ and $S$ is thick. Then $R$ is not monochromatic.

Definition 5.7 (Average degree).

Let $S\subseteq B^I$. For each $i\in I$, we define the set $S_{-i}\subseteq B^{I\setminus\{i\}}$ as

$$S_{-i}:=\{x'\in B^{I\setminus\{i\}}:\exists x\in S,\ x|_{I\setminus\{i\}}=x'\}.$$

We say that the average degree of $S$ is $\alpha$ if $|S|\geq\alpha\cdot|S_{-i}|$ holds for all $i$. We say that a square $(I,S,\mathcal{A})$ has average degree $\alpha$ if the average degree of $S$ is $\alpha$.

Regarding the average degree, we have a simple but useful fact.

Fact 5.8.

Let $\alpha,\beta>0$, and let $S$ be a set with average degree $\alpha$. Then any subset $S'\subseteq S$ of size $|S'|\geq\beta\cdot|S|$ has average degree $\alpha\cdot\beta$.

A crucial component of the Raz-McKenzie simulation, connecting thickness and average degree, is the thickness lemma. In our proof, we borrow a version from [LM19].

Lemma 5.9 (Thickness lemma [LM19]).

Let $\alpha,\delta>0$ be parameters, let $\emptyset\neq I\subseteq[s]$, and let $S\subseteq B^I$. If $S$ has average degree $\alpha$, then there is a $(\delta\cdot\alpha/s)$-thick set $S'\subseteq S$ of size $|S'|\geq(1-\delta)\cdot|S|$.
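Such lemmas are typically proved by iteratively deleting thin fibers; the following Python sketch (our own illustration of that standard pruning idea, not the proof from [LM19]) makes the process explicit. The size guarantee $|S'|\geq(1-\delta)\cdot|S|$ is what the lemma's counting argument establishes; the code only exhibits the pruning itself.

```python
from collections import defaultdict

def prune_to_thick(S, r):
    # S: a set of equal-length tuples over B. Repeatedly delete every
    # fiber (all coordinates fixed except one) of size below r; the
    # surviving set, if nonempty, is r-thick by construction.
    S = set(S)
    dims = len(next(iter(S)))
    changed = True
    while changed:
        changed = False
        for i in range(dims):
            fibers = defaultdict(list)
            for x in S:
                fibers[x[:i] + x[i + 1:]].append(x)
            for fiber in fibers.values():
                if len(fiber) < r:
                    S -= set(fiber)
                    changed = True
    return S
```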

We fix $\delta=1/2$ and $\alpha=\sqrt{n}$, together with $r=10\cdot\log n$. Recall that $s\leq n^{1/2-\epsilon}$ for some $\epsilon>0$. In this regime of parameters, we have $\delta\cdot\alpha/s\geq n^\epsilon/2\geq 10\log n=r$ (for sufficiently large $n$). Hence, as long as we maintain a square with average degree $\alpha$, we are able to apply the thickness lemma.

Lemma 5.10 (Projection lemma).

Let R = X × Y be a rectangle and let Q = (I, S, 𝒜) be a thick square in R. If the set S has size more than (3α)^{|I|}, then there is a square Q′ = (I′, S′, 𝒜′) in R such that

  • I′ ⊆ I and I′ ≠ ∅,

  • S′ has average degree 2α,

  • |S′| ≥ 0.9·(3α)^{|I′|−|I|}·|S|.

Proof sketch.

We prove this lemma by a standard structure-vs-pseudorandomness approach. We first describe the process (Algorithm 1) that finds the set I′ and the set S̃.

1:  Input: A set I ⊆ [s] and a set S ⊆ B^I of size |S| > (3α)^{|I|}
2:  Let I′ ← I and S̃ ← S
3:  Let t ← 0
4:  while ∃ i ∈ I′ with |S̃| ≤ 3α·|S̃_{-i}| do
5:     I′ ← I′ ∖ {i}
6:     S̃ ← S̃_{-i}
7:     t ← t + 1
8:     i_t ← i and S_t ← S̃
9:  end while
10:  return (I′, S̃) and L = (i_1, …, i_t)
Algorithm 1 Finding a set I′
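For concreteness, here is a direct Python transcription of Algorithm 1 (our sketch; representing S as a set of tuples whose position k holds the coordinate labelled I[k] is an implementation choice, not from the paper):

def find_I_prime(I, S, alpha):
    """Algorithm 1: drop coordinates of low average degree one at a time."""
    I_prime, S_tilde, dropped = list(I), set(S), []
    while True:
        for pos, label in enumerate(I_prime):
            # S~_{-i}: projections of S~ with position pos deleted.
            proj = {x[:pos] + x[pos + 1:] for x in S_tilde}
            if len(S_tilde) <= 3 * alpha * len(proj):
                I_prime.pop(pos)
                S_tilde = proj
                dropped.append(label)
                break
        else:
            # No coordinate violates the condition: every remaining i has
            # |S~| > 3*alpha*|S~_{-i}|, i.e. average degree above 3*alpha.
            return I_prime, S_tilde, dropped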

We note that the average degree of S̃ is at least 3α; otherwise, the algorithm would not have stopped. Following this algorithm, it is also clear that |S̃| ≥ |S|·(3α)^{|I′|−|I|}. This implies I′ ≠ ∅: if I′ were empty, then S̃ ⊆ B^∅ would have size 1, forcing |S| ≤ (3α)^{|I|}, which contradicts |S| > (3α)^{|I|}.

Now, for each i ∈ I ∖ I′, we randomly pick a set A_i ⊆ B by independently including each element with probability 1/2. Let 𝒜′ = 𝒜 ∪ {A_i : i ∈ I ∖ I′}, and let S′ ⊆ S̃ consist of those strings z ∈ S̃ for which there exist inputs x ∈ X and y ∈ Y such that:

  • x|_{I′} = z and, for all i ∈ [s] ∖ I′, x_i ∈ A_i;

  • y|_{I′} = z and, for all i ∈ [s] ∖ I′, y_i ∈ B ∖ A_i.

We show that, with high probability, the square (I′, S′, 𝒜′) is a witness for this lemma. We already argued that |S̃| ≥ |S|·(3α)^{|I′|−|I|}; we now show that, for every z ∈ S̃,

Pr_{{A_i}_{i∈I∖I′}}[z ∈ S′] ≥ 1 − O(1/n).

This inequality uses the fact that S is (10 log n)-thick, followed by a Chernoff bound for each A_i and a union bound over all i ∈ I ∖ I′. We omit the details here and will include them in the full version.

Once this is established, an averaging argument gives a choice of {A_{i_1}, …, A_{i_t}} such that

  • |S′| ≥ (1 − O(1/n))·|S̃| ≥ 0.9·|S|·(3α)^{|I′|−|I|};

  • S′ has average degree 2α, by Fact 5.8, since S̃ has average degree 3α and |S′| ≥ 0.9·|S̃|. ∎

Now we are ready to explain how to find a long path in the communication tree.

5.1 Finding a long path in a communication tree

Before presenting our algorithm, we first fix some notation.

Definition 5.11.

Let Q = (I, S, 𝒜) be a square in a rectangle R. For any sub-rectangle R′ = X′ × Y′ of R, the sub-square Q|_{R′} = (I′, S′, 𝒜′) is defined as follows:

  • Keep I′ = I and 𝒜′ = 𝒜 the same.

  • S′ ⊆ S contains all those z ∈ S for which there exist inputs x ∈ X′ and y ∈ Y′ such that

    x|_{I′} = z and, for all i ∈ [s] ∖ I′, x_i ∈ A_i;

    y|_{I′} = z and, for all i ∈ [s] ∖ I′, y_i ∈ B ∖ A_i.

Definition 5.12 (Density function).

For a square Q = (I, S, 𝒜), we define its density as

E(Q) = log(|S| / |B|^{|I|}).
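Two immediate consequences of this definition, used implicitly in the accounting below: since S ⊆ B^I, the density satisfies E(Q) ≤ 0 for every square, and the initial square Q_0 = ([s], B^s, ∅) has

E(Q_0) = log(|B|^s / |B|^s) = 0.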
1:  Initialize v ← root of the communication tree Π and the square Q_0 ← ([s], B^s, ∅)
2:  Set t ← 0
3:  while R_v is not a monochromatic rectangle do
4:     Let Q_t = (I_t, S_t, 𝒜_t) be the currently maintained square.
5:     Let v_0, v_1 be the children of v in Π.
6:     if Alice sends a bit at v then
7:        Let X_{v_0}, X_{v_1} be the partition of X_v according to Alice's partition.
8:        Let R_{v_0} ← X_{v_0} × Y_v and R_{v_1} ← X_{v_1} × Y_v.
9:     end if
10:     if Bob sends a bit at v then
11:        Let Y_{v_0}, Y_{v_1} be the partition of Y_v according to Bob's partition.
12:        Let R_{v_0} ← X_v × Y_{v_0} and R_{v_1} ← X_v × Y_{v_1}.
13:     end if
14:     if E(Q_t|_{R_{v_0}}) ≥ E(Q_t|_{R_{v_1}}) then
15:        Update v ← v_0 and Q′_t ← Q_t|_{R_{v_0}}
16:     else
17:        Update v ← v_1 and Q′_t ← Q_t|_{R_{v_1}}
18:     end if
19:     Let Q̃_t be an r-thick square obtained by applying Lemma 5.9 to Q′_t
20:     if the average degree of Q̃_t is smaller than 2α then
21:        Let Q_{t+1} be the square obtained by applying Lemma 5.10 to Q̃_t
22:     else
23:        Let Q_{t+1} ← Q̃_t
24:     end if
25:     Update t ← t + 1
26:  end while
Algorithm 2 Finding a Long Path

Now we describe how to find a long path in the communication tree. Recall that every node of a communication tree has an associated rectangle. Starting from the root, we find a path as follows:

  1. We maintain a square at each intermediate node.

  2. At each intermediate node, the path always visits the child (left or right) whose associated rectangle maximizes the density.

The pseudo-code is given in Algorithm 2.
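To show how the pieces fit together, here is a minimal Python skeleton of Algorithm 2 (our sketch; the functions restrict, make_thick, average_degree, and project are our stand-ins for Definition 5.11, Lemma 5.9, Definition 5.7, and Lemma 5.10, and the node fields are assumed, not given in the paper):

def find_long_path(root, Q_init, E, restrict, make_thick, average_degree,
                   project, alpha):
    """Density-greedy walk down the communication tree (Algorithm 2).

    root: tree node with fields .is_monochromatic, .children, .rects
    (the rectangles induced at its two children by the bit sent there);
    Q_init: the initial square ([s], B^s, {}); E: the density function.
    """
    v, Q, t = root, Q_init, 0
    while not v.is_monochromatic:
        (v0, R0), (v1, R1) = zip(v.children, v.rects)
        # Step to the child whose restricted square is denser (Line 14).
        Qa, Qb = restrict(Q, R0), restrict(Q, R1)
        v, Q_prime = (v0, Qa) if E(Qa) >= E(Qb) else (v1, Qb)
        Q_tilde = make_thick(Q_prime)            # Lemma 5.9 (thickness)
        if average_degree(Q_tilde) < 2 * alpha:  # Lines 20-24
            Q = project(Q_tilde)                 # Lemma 5.10 (projection)
        else:
            Q = Q_tilde
        t += 1
    return t  # length of the path found: a communication lower bound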

Proof sketch of Theorem 1.3.

Let t* be the value of t when the algorithm terminates. Note that t* is the length of our path, which lower-bounds the deterministic communication complexity. We now lower-bound t* by analyzing the changes to the density function in Algorithm 2. We consider two types of density changes, called simulation and projection respectively.

  • Simulation. In each round t, we obtain the square Q̃_t from the square Q_t. For every t, we have |S̃_t| ≥ |S′_t|/2 by Lemma 5.9 and |S′_t| ≥ |S_t|/2 by the choice on Line 14. Hence,

    E(Q̃_t) ≥ E(Q_t) − 2.

  • Projection. Line 21 is a projection step. For every t, if z = |I_t| − |I_{t+1}| > 0, then (see the calculation below)

    E(Q_{t+1}) ≥ E(Q̃_t) + z·(log(|B|/(3α)) − 2) ≥ E(Q̃_t) + z·Ω(log n).
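The last inequality uses the parameter regime fixed after Lemma 5.9; we record the routine calculation. Since s·|B| = n and s ≤ n^{1/2−ε}, we have |B| ≥ n^{1/2+ε}, and with α = √n,

log(|B|/(3α)) ≥ log(n^{1/2+ε} / (3·n^{1/2})) = ε·log n − log 3 = Ω(log n).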

Note that, to apply Lemma 5.9, we need control over the average degree; to apply Lemma 5.10, we need control over the thickness. Indeed, we will inductively show that the following properties P_1(t), P_2(t), P_3(t) hold for all t ≥ 0:

  • P_1(t): the average degree of Q_t = (I_t, S_t, 𝒜_t) is at least 2α.

  • P_2(t): the average degree of Q′_t = (I′_t, S′_t, 𝒜′_t) is at least α.

  • P_3(t): Q̃_t = (Ĩ_t, S̃_t, 𝒜̃_t) is r-thick.

The base case P_1(0) holds because S_0 = B^s and |B| ≥ 2α. The remaining steps can be proved by applying the thickness lemma and the projection lemma alternately. We skip the proof here and will include it in our full version.

Finally, we observe that we must have I_{t*} = ∅ when the algorithm terminates at step t*; otherwise, by Corollary 5.6, the final rectangle would not be monochromatic.

We note that the total density decrease before the algorithm terminates is at most 2·t*, while the total density increase is at least s·Ω(log n), since the projections remove all s coordinates in total. Since the density starts at 0 and never exceeds 0, the total increase is bounded by the total decrease. This implies

2·t* ≥ s·Ω(log n),

and the result follows. ∎

5.2 Discussions and open problems

A very interesting follow-up open problem is to prove s-UEQUAL lower bounds for s ≳ n^{1/2}. In our proof, the main bottleneck is Lemma 5.9 (the thickness lemma), which requires s ≤ α. Note that α ≤ |B| and s·|B| = n; hence, Lemma 5.9 only applies in the range s ≤ n^{1/2}. In fact, the thickness lemma (or similar lemmas) is also the main barrier in query-to-communication lifting theorems. Lifting theorems usually require a full-range lemma (something similar to Lemma 5.5) to maintain a full simulation on the communication tree. We use the term full simulation for proofs that aim to construct a decision tree exactly computing the Boolean function.

In contrast, we only attempt to find a long path in the communication tree. This approach was suggested by Yang and Zhang [YZ22]. In our analysis, only Corollary 5.6 (a direct corollary of the full-range lemma) is needed, which is much weaker than the full-range requirement. Recall that the full-range lemma shows the following: for every z ∈ {0,1}^I, there is a pair x, y ∈ S such that

for all i ∈ I, z_i = 1 if and only if x_i = y_i.

For the s-UEQUAL problem, we only care about the subset {0^I, e_1, …, e_{|I|}} of {0,1}^I, where e_i ∈ {0,1}^I is the i-th indicator vector. This observation may give a way to avoid the full-range barrier and to obtain tight lower bounds for all s ≥ 1.

Overall, we believe that the long-path paradigm may provide more applications beyond the full-simulation paradigm.

References

  • [AMOP08] Alexandr Andoni, Andrew McGregor, Krzysztof Onak, and Rina Panigrahy. Better bounds for frequency moments in random-order streams. arXiv preprint arXiv:0808.2222, 2008.
  • [AMS99] Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and system sciences, 58(1):137–147, 1999.
  • [BBM11] Eric Blais, Joshua Brody, and Kevin Matulef. Property testing lower bounds via communication complexity. In 2011 IEEE 26th Annual Conference on Computational Complexity, pages 210–220, 2011.
  • [BEO+13] Mark Braverman, Faith Ellen, Rotem Oshman, Toniann Pitassi, and Vinod Vaikuntanathan. A tight bound for set disjointness in the message-passing model. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 668–677. IEEE, 2013.
  • [BFS86] László Babai, Peter Frankl, and Janos Simon. Complexity classes in communication complexity theory (preliminary version). In FOCS 1986, 1986.
  • [BGK+18] Mark Braverman, Ankit Garg, Young Kun Ko, Jieming Mao, and Dave Touchette. Near-optimal bounds on the bounded-round quantum communication complexity of disjointness. SIAM Journal on Computing, 47(6):2277–2314, 2018.
  • [BM13] Mark Braverman and Ankur Moitra. An information complexity approach to extended formulations. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 161–170, 2013.
  • [BO17] Mark Braverman and Rotem Oshman. A rounds vs. communication tradeoff for multi-party set disjointness. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 144–155. IEEE, 2017.
  • [BR20] Yakov Babichenko and Aviad Rubinstein. Communication complexity of nash equilibrium in potential games. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 1439–1445. IEEE, 2020.
  • [Bra12] Mark Braverman. Interactive information complexity. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing, pages 505–524, 2012.
  • [BYJKS04] Ziv Bar-Yossef, Thathachar S Jayram, Ravi Kumar, and D Sivakumar. An information statistics approach to data stream and communication complexity. Journal of Computer and System Sciences, 68(4):702–732, 2004.
  • [CCM08] Amit Chakrabarti, Graham Cormode, and Andrew McGregor. Robust lower bounds for communication and stream computation. In Proceedings of the fortieth annual ACM symposium on Theory of computing, pages 641–650, 2008.
  • [CFK+19] Arkadev Chattopadhyay, Yuval Filmus, Sajin Koroth, Or Meir, and Toniann Pitassi. Query-to-communication lifting using low-discrepancy gadgets. arXiv preprint arXiv:1904.13056, 2019.
  • [CKLM18] Arkadev Chattopadhyay, Michal Kouckỳ, Bruno Loff, and Sagnik Mukhopadhyay. Simulation beats richness: New data-structure lower bounds. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 1013–1020, 2018.
  • [CKS03] Amit Chakrabarti, Subhash Khot, and Xiaodong Sun. Near-optimal lower bounds on the multi-party communication complexity of set disjointness. In IEEE Conference on Computational Complexity, pages 107–117, 2003.
  • [CLRS16] Siu On Chan, James R Lee, Prasad Raghavendra, and David Steurer. Approximate constraint satisfaction requires large lp relaxations. Journal of the ACM (JACM), 63(4):1–22, 2016.
  • [CMS20] Arkadev Chattopadhyay, Nikhil S Mande, and Suhail Sherif. The log-approximate-rank conjecture is false. Journal of the ACM (JACM), 67(4):1–28, 2020.
  • [CMVW16] Michael Crouch, Andrew McGregor, Gregory Valiant, and David P Woodruff. Stochastic streams: Sample complexity vs. space complexity. In 24th Annual European Symposium on Algorithms (ESA 2016). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016.
  • [CP10] Arkadev Chattopadhyay and Toniann Pitassi. The story of set disjointness. ACM SIGACT News, 41(3):59–85, 2010.
  • [CSWY01] Amit Chakrabarti, Yaoyun Shi, Anthony Wirth, and Andrew Yao. Informational complexity and the direct sum problem for simultaneous message complexity. In Proceedings 42nd IEEE Symposium on Foundations of Computer Science, pages 270–278. IEEE, 2001.
  • [DOR21] Nachum Dershowitz, Rotem Oshman, and Tal Roth. The communication complexity of multiparty set disjointness under product distributions. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2021, page 1194–1207, New York, NY, USA, 2021. Association for Computing Machinery.
  • [dR19] Susanna F de Rezende. Lower Bounds and Trade-offs in Proof Complexity. PhD thesis, KTH Royal Institute of Technology, 2019.
  • [DRNV16] Susanna F De Rezende, Jakob Nordström, and Marc Vinyals. How limited interaction hinders real communication (and what it means for proof and circuit complexity). In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 295–304. IEEE, 2016.
  • [Gav16] Dmitry Gavinsky. Communication complexity of inevitable intersection. arXiv preprint arXiv:1611.08842, 2016.
  • [GGKS18] Ankit Garg, Mika Göös, Pritish Kamath, and Dmitry Sokolov. Monotone circuit lower bounds from resolution. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 902–911, 2018.
  • [GH09] Sudipto Guha and Zhiyi Huang. Revisiting the direct sum theorem and space lower bounds in random order streams. In International Colloquium on Automata, Languages, and Programming, pages 513–524. Springer, 2009.
  • [GJPW18] Mika Göös, T. S. Jayram, Toniann Pitassi, and Thomas Watson. Randomized communication versus partition number. ACM Trans. Comput. Theory, 10(1), jan 2018.
  • [GJW18] Mika Göös, Rahul Jain, and Thomas Watson. Extension complexity of independent set polytopes. SIAM Journal on Computing, 47(1):241–269, 2018.
  • [GKY22] Mika Göös, Stefan Kiefer, and Weiqiang Yuan. Lower bounds for unambiguous automata via communication complexity. Leibniz International Proceedings in Informatics, 229(1), 2022.
  • [GNOR15] Yannai A. Gonczarowski, Noam Nisan, Rafail Ostrovsky, and Will Rosenbaum. A stable marriage requires communication. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’15, page 1003–1017, USA, 2015. Society for Industrial and Applied Mathematics.
  • [Göö15] Mika Göös. Lower bounds for clique vs. independent set. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, pages 1066–1076. IEEE, 2015.
  • [GP18] Mika Göös and Toniann Pitassi. Communication lower bounds via critical block sensitivity. SIAM Journal on Computing, 47(5):1778–1806, 2018.
  • [GPW17] Mika Göös, Toniann Pitassi, and Thomas Watson. Query-to-communication lifting for bpp. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 132–143. IEEE, 2017.
  • [GPW18] Mika Göös, Toniann Pitassi, and Thomas Watson. Deterministic communication vs. partition number. SIAM Journal on Computing, 47(6):2435–2450, 2018.
  • [GR21] Mika Göös and Aviad Rubinstein. Near-optimal communication lower bounds for approximate nash equilibria. SIAM Journal on Computing, pages FOCS18–316, 2021.
  • [Gri85] Dima Grigoriev. Lower bounds in algebraic computational complexity. Journal of Soviet Mathematics, 1985.
  • [Gro09] Andre Gronemeier. Asymptotically optimal lower bounds on the nih-multi-party information complexity of the and-function and disjointness. In in Proc. of the 26th International Symposium on Theoretical Aspects of Computer Science, STACS, pages 505–516, 2009.
  • [GS17] Anat Ganor and Karthik C. S. Communication complexity of correlated equilibrium in two-player games. arXiv preprint arXiv:1704.01104, 2017.
  • [GW16] Mika Göös and Thomas Watson. Communication complexity of set-disjointness for all probabilities. Theory of Computing, 12(1):1–23, 2016.
  • [HJ13] Prahladh Harsha and Rahul Jain. A strong direct product theorem for the tribes function via the smooth-rectangle bound. arXiv preprint arXiv:1302.0275, 2013.
  • [HN12] Trinh Huynh and Jakob Nordström. On the virtue of succinct proofs: Amplifying communication complexity hardness to time-space trade-offs in proof complexity. In Proceedings of the forty-fourth annual ACM symposium on Theory of computing, pages 233–248, 2012.
  • [HW07] Johan Håstad and Avi Wigderson. The randomized communication complexity of set disjointness. Theory of Computing, 3(1):211–219, 2007.
  • [Jay09] T. S. Jayram. Hellinger strikes back: A note on the multi-party information complexity of and. APPROX ’09 / RANDOM ’09, page 562–573, Berlin, Heidelberg, 2009. Springer-Verlag.
  • [JK10] Rahul Jain and Hartmut Klauck. The partition bound for classical communication complexity and query complexity. In 2010 IEEE 25th Annual Conference on Computational Complexity, pages 247–258. IEEE, 2010.
  • [JRS03] Rahul Jain, Jaikumar Radhakrishnan, and Pranab Sen. A lower bound for the bounded round quantum communication complexity of set disjointness. In 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings., pages 220–229. IEEE, 2003.
  • [Juk11] Stasys Jukna. Extremal combinatorics: with applications in computer science, volume 571. Springer, 2011.
  • [KMR21] Pravesh K Kothari, Raghu Meka, and Prasad Raghavendra. Approximating rectangles by juntas and weakly exponential lower bounds for lp relaxations of csps. SIAM Journal on Computing, pages STOC17–305, 2021.
  • [KPW21] Akshay Kamath, Eric Price, and David P Woodruff. A simple proof of a new set disjointness with applications to data streams. arXiv preprint arXiv:2105.11338, 2021.
  • [KS92] Bala Kalyanasundaram and Georg Schnitger. The probabilistic communication complexity of set intersection. SIAM J. Discret. Math., 5(4):545–557, nov 1992.
  • [KW09] Eyal Kushilevitz and Enav Weinreb. The communication complexity of set-disjointness with small sets and 0-1 intersection. In 2009 50th Annual IEEE Symposium on Foundations of Computer Science, pages 63–72, 2009.
  • [LM19] Bruno Loff and Sagnik Mukhopadhyay. Lifting theorems for equality. In 36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2019.
  • [LMM+22] Shachar Lovett, Raghu Meka, Ian Mertz, Toniann Pitassi, and Jiapeng Zhang. Lifting with sunflowers. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2022.
  • [LRS15] James R Lee, Prasad Raghavendra, and David Steurer. Lower bounds on the size of semidefinite programming relaxations. In Proceedings of the forty-seventh annual ACM symposium on Theory of computing, pages 567–576, 2015.
  • [MM22] Yahel Manor and Or Meir. Lifting with inner functions of polynomial discrepancy. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2022). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2022.
  • [MNSW95] Peter Bro Miltersen, Noam Nisan, Shmuel Safra, and Avi Wigderson. On data structures and asymmetric communication complexity. In Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’95, page 103–111, New York, NY, USA, 1995. Association for Computing Machinery.
  • [PR17] Toniann Pitassi and Robert Robere. Strongly exponential lower bounds for monotone computation. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, pages 1246–1255, 2017.
  • [Raz92] Aleksandr Razborov. On the distributional complexity of set disjointness. Theoretical Computer Science, 106:385–390, 1992.
  • [Raz03] Alexander A Razborov. Quantum communication complexity of symmetric predicates. Izvestiya: Mathematics, 67(1):145, 2003.
  • [RM97] Ran Raz and Pierre McKenzie. Separation of the monotone nc hierarchy. In Proceedings 38th Annual Symposium on Foundations of Computer Science, pages 234–243. IEEE, 1997.
  • [RPRC16] Robert Robere, Toniann Pitassi, Benjamin Rossman, and Stephen A. Cook. Exponential lower bounds for monotone span programs. In Proceedings of the 57th Symposium on Foundations of Computer Science (FOCS), pages 406–415. IEEE Computer Society, 2016.
  • [RY15] Anup Rao and Amir Yehudayoff. Simplified lower bounds on the multiparty communication complexity of disjointness. In 30th Conference on Computational Complexity (CCC 2015). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.
  • [RY20] Anup Rao and Amir Yehudayoff. Communication Complexity: and Applications. Cambridge University Press, 2020.
  • [ST13] Mert Saglam and Gábor Tardos. On the communication complexity of sparse set disjointness and exists-equal problems. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 678–687, 2013.
  • [WW15] Omri Weinstein and David P Woodruff. The simultaneous communication of disjointness with applications to data streams. In International Colloquium on Automata, Languages, and Programming, pages 1082–1093. Springer, 2015.
  • [Yao79] Andrew Chi-Chih Yao. Some complexity questions related to distributive computing (preliminary report). In Proceedings of the eleventh annual ACM symposium on Theory of computing, pages 209–213, 1979.
  • [YZ22] Guangxu Yang and Jiapeng Zhang. Simulation methods in communication lower bounds, revisited. Electron. Colloquium Comput. Complex., TR22-019, 2022.

Appendix A Missing proofs in Section 4

In this section, we give a proof of the bias lemma (Lemma 4.11). We first recall the lemma below.

Lemma A.1 (Lemma 4.11 restated).

Let δ > 0 be the constant and J ⊆ [n] be the set from Lemma 4.9. For any I ⊆ [n] and distinct i, i′ ∈ I ∩ J, we have

γ_{i′}^{I∖{i}} ≤ δ.

The proof of Lemma A.1 relies on the following inequality from [YZ22].

Lemma A.2 ([YZ22]).

For any x_1, …, x_n ≥ 0 and y_1, …, y_n ≥ 0,

(1/∑_{j=1}^{n} y_j) · ∑_{j=1}^{n} (y_j·x_j)/(x_j + y_j) ≤ (∑_{j=1}^{n} x_j)/(∑_{j=1}^{n} (x_j + y_j)).
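As a quick numerical sanity check (ours, not part of the paper), the inequality can be tested on random non-negative inputs:

import random

def sides(xs, ys):
    """Evaluate the two sides of Lemma A.2."""
    lhs = sum(y * x / (x + y) for x, y in zip(xs, ys)) / sum(ys)
    rhs = sum(xs) / sum(x + y for x, y in zip(xs, ys))
    return lhs, rhs

for _ in range(100000):
    n = random.randint(1, 8)
    xs = [random.uniform(0.0, 1.0) for _ in range(n)]
    ys = [random.uniform(0.001, 1.0) for _ in range(n)]  # keep sums positive
    lhs, rhs = sides(xs, ys)
    assert lhs <= rhs + 1e-9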
Proof of Lemma A.1.

We first recall the random variable ℓ in the definition of 𝒫 (Definition 4.1). For any i ∈ [n], we have

Pr_{x∼𝒫}[x ∈ D_0 | ℓ = i] = Pr_{x∼𝒫}[x ∈ D_i | ℓ = i] = 1/2.

For i ∈ [n], we call a rectangle R ∈ ℛ good for i if

Pr_{x∼𝒫}[x ∈ D_i | x ∈ R, ℓ = i] ≤ 0.01.

Since the error of the deterministic protocol C under 𝒫 is at most ε, we have

Pr_{x∼𝒫}[C(x) ≠ k-UDISJ(x)] ≤ ε.

Let ℛ_0 be the set of leaf rectangles on which protocol C outputs 0 and ℛ_1 be the set of leaf rectangles on which C outputs 1. Then

Pr_{x∼𝒫}[C(x) ≠ k-UDISJ(x)] = Pr_{x∼𝒫}[x ∈ ℛ_0, x ∈ D_*] + Pr_{x∼𝒫}[x ∈ ℛ_1, x ∈ D_0] ≤ ε.

Since Pr_{x∼𝒬}[x ∈ R] = Pr_{x∼𝒫}[x ∈ R | x ∈ D_0] ≤ 2·Pr_{x∼𝒫}[x ∈ R, x ∈ D_0], we have

∑_{R∈ℛ_1} Pr_{x∼𝒬}[x ∈ R]·Pr_{x∼𝒫}[x ∈ D_* | x ∈ R] ≤ ∑_{R∈ℛ_1} 2·Pr_{x∼𝒫}[x ∈ R, x ∈ D_0] = 2·Pr_{x∼𝒫}[x ∈ ℛ_1, x ∈ D_0].

Similarly, since Pr_{x∼𝒬}[x ∈ R] = Pr_{x∼𝒫}[x ∈ R | x ∈ D_0] ≤ 2·Pr_{x∼𝒫}[x ∈ R], we have

∑_{R∈ℛ_0} Pr_{x∼𝒬}[x ∈ R]·Pr_{x∼𝒫}[x ∈ D_* | x ∈ R] ≤ 2·∑_{R∈ℛ_0} Pr_{x∼𝒫}[x ∈ R, x ∈ D_*] = 2·Pr_{x∼𝒫}[x ∈ ℛ_0, x ∈ D_*].

Thus,

∑_{R∈ℛ} Pr_{x∼𝒬}[x ∈ R]·Pr_{x∼𝒫}[x ∈ D_* | x ∈ R]
  = ∑_{R∈ℛ_0} Pr_{x∼𝒬}[x ∈ R]·Pr_{x∼𝒫}[x ∈ D_* | x ∈ R] + ∑_{R∈ℛ_1} Pr_{x∼𝒬}[x ∈ R]·Pr_{x∼𝒫}[x ∈ D_* | x ∈ R]
  ≤ 2·Pr_{x∼𝒫}[x ∈ ℛ_0, x ∈ D_*] + 2·Pr_{x∼𝒫}[x ∈ ℛ_1, x ∈ D_0]
  ≤ 2·ε.

Since

∑_{R∈ℛ} Pr_{x∼𝒬}[x ∈ R]·Pr_{x∼𝒫}[ℓ = i | x ∈ R]·Pr_{x∼𝒫}[x ∈ D_i | x ∈ R, ℓ = i] = ∑_{R∈ℛ} Pr_{x∼𝒬}[x ∈ R]·Pr_{x∼𝒫}[x ∈ D_* | x ∈ R] ≤ 2·ε,

an averaging argument gives a set of coordinates J ⊆ [n] with |J| = Ω(n) such that, for any i ∈ J,

Pr_{R∼ℛ}[R is good for i] ≥ 1 − 0.01.

For any (x*, ρ*, j*), define

p(x*, ρ*, j*) := Pr_{(x′, ρ′, j′) ∼ (𝒬_{I∖{i′,i}}, 𝒲_{I^c}, 𝒲_i | R)}[(x′, ρ′, j′) = (x*, ρ*, j*)],

r(x*, ρ*, j*) := Pr_{(x′, ρ′, j′) ∼ (𝒫, 𝒲_{I^c}, 𝒲_i | R)}[x′ ∈ D_0, x′|_{I∖{i,i′}} = x*, ρ′ = ρ*, j′ = j* | x′ ∉ ∪_{l∈I^c} D_l],

and

s(x*, ρ*, j*) := Pr_{(x′, ρ′, j′) ∼ (𝒫, 𝒲_{I^c}, 𝒲_i | R)}[x′ ∈ D_{i′}, x′|_{I∖{i,i′}} = x*, ρ′ = ρ*, j′ = j* | x′ ∉ ∪_{l∈I^c} D_l].

Intuitively, p(x*, ρ*, j*) is the probability that (x*, ρ*, j*) occurs under the distribution (𝒬_{I∖{i′,i}}, 𝒲_{I^c}, 𝒲_i | R); r(x*, ρ*, j*) is the probability that x′ ∈ D_0 and (x*, ρ*, j*) occurs under (𝒫, 𝒲_{I^c}, 𝒲_i | R); and s(x*, ρ*, j*) is the probability that x′ ∈ D_{i′} and (x*, ρ*, j*) occurs under (𝒫, 𝒲_{I^c}, 𝒲_i | R).

We recall the connections between r, s, and p. Writing ρ*_{j*} = (ρ*, w_i = j*), we have

p(x*, ρ*, j*) = r(x*, ρ*, j*) / ∑ r(x*, ρ*, j*),

γ_{i′, ρ*_{j*}, R}^{I∖{i}}(x*) = s(x*, ρ*, j*) / (s(x*, ρ*, j*) + r(x*, ρ*, j*)),

and

Pr_{x∼μ}[x ∈ D_{i′} | x ∈ R, ℓ = i] = ∑ s(x*, ρ*, j*) / ∑ (s(x*, ρ*, j*) + r(x*, ρ*, j*)),

where the sums range over all triples (x*, ρ*, j*).

By Lemma A.2 (applied with x_j = s(x*, ρ*, j*) and y_j = r(x*, ρ*, j*)), we have

γ_{i′,R}^{I∖{i}} = ∑_{(x*, ρ*, j*)} p(x*, ρ*, j*)·γ_{i′, ρ*_{j*}, R}^{I∖{i}}(x*) ≤ Pr_{x∼μ}[x ∈ D_{i′} | x ∈ R, ℓ = i] ≤ 0.01,

where the last inequality holds whenever R is good for i′. Since i′ ∈ J, we have Pr[ℛ is good for i′] ≥ 1 − 0.01. Thus,

γ_{i′}^{I∖{i}} = ∑_R Pr[ℛ = R]·γ_{i′,R}^{I∖{i}} ≤ 1·0.01 + 0.01·(1 − 0.01) = δ.

We can also prove that, for any i′ ∈ J, γ_{i′}^{[n]} ≤ δ by replacing I∖{i} with [n] in the proof above. Thus, J also satisfies Lemma 4.9. ∎