
An $O(k\log n)$ Time Fourier Set Query Algorithm

Yeqi Gao Department of CS
University of Washington
Seattle, USA
a916755226@gmail.com
   Zhao Song Adobe Research
Adobe
San Jose, USA
zsong@adobe.com
   Baocheng Sun Department of CS & Applied Math
Weizmann Institute of Science
Rehovot, Israel
woafrnraetns@gmail.com
Abstract

Fourier transformation is an extensively studied problem in many research fields. It has many applications in machine learning, signal processing, compressed sensing, and so on. In many real-world applications, an approximate Fourier transform is sufficient, and we only need the Fourier transform on a subset of the coordinates. Given a vector $x\in\mathbb{C}^{n}$, an approximation parameter $\epsilon$ and a query set $S\subset[n]$ of size $k$, we propose an algorithm that uses $O(\epsilon^{-1}k\log(n/\delta))$ Fourier measurements, runs in $O(\epsilon^{-1}k\log(n/\delta))$ time, and outputs a vector $x'$ such that $\|(x'-\widehat{x})_{S}\|_{2}^{2}\leq\epsilon\|\widehat{x}_{\bar{S}}\|_{2}^{2}+\delta\|\widehat{x}\|_{1}^{2}$ holds with probability at least $9/10$.

Index Terms:
Sparse Recovery, Fourier Transform, Set Query.

I Introduction

The Fourier transform is ubiquitous in image and audio processing, telecommunications, and so on. The classical Fast Fourier Transform (FFT) algorithm proposed by Cooley and Tukey [1] runs in $O(n\log n)$ time. Optics imaging [2, 3], magnetic resonance imaging (MRI) [4], and physics [5] all benefit from this algorithm. The algorithm of Cooley and Tukey [1] takes $O(n)$ samples to compute the Fourier transform.

The number of samples taken is an important factor. For example, it influences the amount of ionizing radiation that a patient is exposed to during CT scans. The amount of time a patient spends within the scanner can also be reduced by taking fewer samples. Thus, we consider two computational aspects of the Fourier transform problem. The first is the reconstruction time, i.e., the time to decode the signal from the measurements. The second is the sample complexity, i.e., the number of noisy samples required by the algorithm. There is a long line of work optimizing the time and sample complexity of the Fourier transform in the fields of signal processing and TCS [1, 5, 4, 2, 6, 7].

As a result, we can anticipate that algorithms that leverage sparsity assumptions about the input and outperform the FFT in applications will be of significant practical utility. In general, the two most important quantities to optimize are the sample complexity and the time complexity of obtaining the Fourier transform result.

In many real-world applications, computing the approximate Fourier transform on a set of selected coordinates is sufficient, and we can leverage the approximation guarantee to accelerate the computation. The set query problem was originally proposed by [8]; the original definition places no restriction on Fourier measurements. Later, [9] generalized the classical set query definition [8] to the Fourier setting. In this paper we consider the set estimation problem based on Fourier measurements (defined by [9]): given a vector $x\in\mathbb{C}^{n}$, approximation parameters $\epsilon,\delta\in(0,1)$ and a query set $S\subset[n]$ with $|S|=k$, we want to compute, with sublinear time and sample complexity, an approximate Fourier transform result $x'$ such that, compared with the true Fourier transform $\widehat{x}$, the following approximation guarantee holds:

$$\|(x'-\widehat{x})_{S}\|_{2}^{2}\leq\epsilon\|\widehat{x}_{\bar{S}}\|_{2}^{2}+\delta\|\widehat{x}\|_{1}^{2}$$

with probability at least $9/10$. For a set $S\subseteq[n]$ and a vector $x\in\mathbb{R}^{n}$, we define $x_{S}$ by $(x_{S})_{i}=x_{i}$ if $i\in S$ and $(x_{S})_{i}=0$ otherwise.

References  Samples  Time
[6]  $\epsilon^{-1}k\log^{2}(n)$  $\epsilon^{-1}k\log^{2}(n)$
[9]  $\epsilon^{-1}k$  $\epsilon^{-1}k\log^{2.1}(n)\log(R^{*})$
Ours  $\epsilon^{-1}k\log(n)$  $\epsilon^{-1}k\log(n)$
TABLE I: Summary of the history of results

For this Fourier set query problem, there are two major prior works, [9] and [6]. [9] studies the problem explicitly, while [6] implicitly provides a solution to Fourier set query; we provide more details in the following paragraphs.

The work [9] first explicitly defined and studied the Fourier set query problem. [9] obtains an algorithm with sample complexity $O(k/\epsilon)$ and running time $O(\epsilon^{-1}k\log^{2.1}(n)\log(R^{*}))$ for $\ell_{2}/\ell_{2}$ Fourier set query. Here, $R^{*}$ is an upper bound on the $\|\cdot\|_{\infty}$ norm of the vector; in most applications, $R^{*}$ is considered $\operatorname{poly}(n)$. Our approach gives an algorithm with $O(\epsilon^{-1}k\log(n))$ running time. The running time of our result has no dependence on $\log R^{*}$, but our result does not achieve the optimal sample complexity.

The work [6] did not study the Fourier set query problem; instead, it studies the Fourier sparse recovery problem. However, applying the algorithm of [6] to Fourier set query yields an algorithm with time complexity $O(\epsilon^{-1}k\log^{2}(n))$ and sample complexity $O(\epsilon^{-1}k\log^{2}(n))$.

Our main contributions are summarized as follows:

  • We present an efficient algorithm for the Fourier set query problem.

  • We provide comprehensive theoretical guarantees showing the advantage of our algorithm over existing algorithms.

Roadmap. We first present related work on the discrete Fourier transform, the continuous Fourier transform, and some applications of the Fourier transform in Section II. We define our problem and present our main theorem in Section III. We present a high-level overview of our techniques in Section IV. We provide definitions, notations, and technical tools in Section V. As the main result of this paper, our algorithm (see Algorithm 1) and the analysis of its correctness and complexity are given in Section VI. Finally, we conclude the paper in Section VII.

II Related Work

Discrete Fourier Transform

The discrete Fourier transform (DFT) is among the most crucial and frequently employed algorithms in computational tasks. There is a long line of work focusing on sparse discrete Fourier transforms. The results can be divided into two kinds. The first kind chooses sublinear measurements and achieves sublinear or linear recovery time; this line includes [10, 6, 11, 12, 13, 14, 15, 9, 16]. The second kind randomly chooses measurements and proves that a generic recovery algorithm succeeds with high probability. A common generic recovery algorithm used by these works is $\ell_{1}$ minimization. These results prove the Restricted Isometry Property [17, 18, 19]. Currently, the first kind of solutions has better theoretical guarantees in sample and time complexity. However, the second kind has higher success probabilities and better capability in practice.

Continuous Fourier Transform

[20] studies sparse Fourier transforms on continuous signals. They apply a discrete sparse Fourier transform algorithm, followed by a hill-climbing method to refine their solution into a reasonable range. [21] presents an algorithm whose sample complexity is only linear in $k$ and logarithmic in the signal-to-noise ratio. Their frequency resolution is suitable for robustly computing sparse continuous Fourier transforms. [22] generalizes [21] to the high-dimensional setting. [23] provides an algorithm that supports the reconstruction of a signal without a frequency gap. They present a solution that approximates the signal with constant-factor noise growth and takes a number of samples polynomial in $k$ and logarithmic in the signal-to-noise ratio. Recently, [24] improved the approximation ratio of [23].

Application of Fourier Transform

The Fourier transform has wide applications in many fields, including physics, mathematics, signal processing, probability theory, statistics, acoustics, cryptography, and so on.

Solving partial differential equations is one of the most important applications of the Fourier transform. Some differential equations are simpler to analyze in the frequency domain, because differentiation in the time domain corresponds to multiplication by the frequency. Additionally, frequency-domain multiplication is equivalent to convolution in the time domain [25], [26], [27].

Various applications of the Fourier transform include nuclear magnetic resonance (NMR) [28], [29], [30] and other types of spectroscopy, such as infrared (FTIR) [31]. In NMR, a free induction decay (FID) signal with an exponential shape is recorded in the time domain and Fourier-transformed into a Lorentzian line shape in the frequency domain. Mass spectrometry and magnetic resonance imaging (MRI) both employ the Fourier transform. The Fourier transform is also used in quantum mechanics [32].

The Fourier transform is also employed in the spectral analysis of time series [33], [34]. In the context of statistical signal processing, however, the Fourier transform is often not applied to the signal itself. Although a genuine signal is in fact transient, it has been found in practice best to model a signal by a function (or, alternatively, a stochastic process) that is stationary in the sense that its characteristic properties are constant over all time. The Fourier transform of such a function does not exist in the conventional sense, so it has been found more useful for the analysis of signals to take the Fourier transform of the function's autocorrelation function.

III Fourier set query

In Section III-A, we define the problem we focus on. In Section III-B, we provide our main result.

III-A Fourier set query problem

In this section, we give a formal definition of the main problem we focus on.

Definition III.1 (Sample Complexity).

Given a vector $x\in\mathbb{C}^{n}$, we say the sample complexity of an algorithm is $c$ (the algorithm takes $c$ samples) when $c$ is the number of coordinates of $x$ it accesses, where $c\leq n$.

Definition III.2 (Main problem).

Given a vector $x\in\mathbb{C}^{n}$ with $\widehat{x}$ its exact Fourier transform, then for every $\epsilon,\delta\in(0,1)$, $k\geq 1$, and any $S\subseteq[n]$ with $|S|=k$, the goal is to design an algorithm that

  • takes samples from $x\in\mathbb{C}^{n}$ (note that we treat one entry of $x$ as one sample), and

  • outputs, after some running time, a vector $x'\in\mathbb{C}^{n}$ such that

    $$\|(x'-\widehat{x})_{S}\|_{2}^{2}\leq\epsilon\|\widehat{x}_{\bar{S}}\|_{2}^{2}+\delta\|\widehat{x}\|_{1}^{2}.$$

We want to optimize both the sample complexity (the number of coordinates of $x$ we need to access) and the running time.
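To make the guarantee concrete, the following numpy sketch checks the set-query bound of Definition III.2 on a small instance. It uses the exact (unitary) FFT as a trivial stand-in "algorithm": setting $x'=\widehat{x}$ on $S$ meets the bound with zero error. The set $S$ and the parameters are arbitrary choices for illustration, not values from the paper.

```python
import numpy as np

# Sanity check of the set-query guarantee (Definition III.2). The
# normalization matches the paper's unitary convention:
# x_hat_i = (1/sqrt(n)) * sum_j x_j e^{-2*pi*i*ij/n}.
n, k = 16, 4
rng = np.random.default_rng(0)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x_hat = np.fft.fft(x) / np.sqrt(n)

S = np.array([1, 3, 5, 8])           # query set, |S| = k
x_prime = np.zeros(n, dtype=complex)
x_prime[S] = x_hat[S]                # exact values on S (trivial "algorithm")

eps, delta = 0.1, 0.01
S_bar = np.setdiff1d(np.arange(n), S)
lhs = np.linalg.norm((x_prime - x_hat)[S]) ** 2
rhs = eps * np.linalg.norm(x_hat[S_bar]) ** 2 + delta * np.linalg.norm(x_hat, 1) ** 2
guarantee_holds = lhs <= rhs
```

The point of the actual algorithm is to achieve a bound of this form while reading far fewer than $n$ entries of $x$; the exact FFT here reads all of them.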

III-B Our Result

We present our main theorem as follows:

Theorem III.3 (Main result).

Given a vector $x\in\mathbb{C}^{n}$ with $\widehat{x}$ its exact Fourier transform, then for every $\epsilon,\delta\in(0,1)$, $k\geq 1$, and any $S\subseteq[n]$ with $|S|=k$, there exists an algorithm (Algorithm 1) that takes

$$O(\epsilon^{-1}k\log(n/\delta))$$

samples from $x$, runs in

$$O(\epsilon^{-1}k\log(n/\delta))$$

time, and outputs a vector $x'\in\mathbb{C}^{n}$ such that

$$\|(x'-\widehat{x})_{S}\|_{2}^{2}\leq\epsilon\|\widehat{x}_{\bar{S}}\|_{2}^{2}+\delta\|\widehat{x}\|_{1}^{2}$$

holds with probability at least $9/10$.

IV Technique Overview

In this section, we give an overview of the techniques used in the proof of our main result and the analysis of its time and sample complexity (see Definition III.1). First, we introduce the main subroutines, their time complexity, and other properties used in our algorithm. Based on these subroutines, we then analyze the correctness of our algorithm: with probability at least $9/10$, it produces an $x'$ which satisfies

$$\|(x'-\widehat{x})_{S}\|_{2}^{2}\leq\epsilon\|\widehat{x}_{\bar{S}}\|_{2}^{2}+\delta\|\widehat{x}\|_{1}^{2}.$$

The analysis of the total complexity comes last, with $O(\epsilon^{-1}k\log(n/\delta))$ sample complexity (see Definition III.1) and $O(\epsilon^{-1}k\log(n/\delta))$ time complexity. This confirms that the algorithm solves the problem (see Definition III.2) with better performance than the prior works [9] and [6] (see details in Table I).

Technique I: HashToBins

We use the same function HashToBins as in [6], which is a key part of the function EstimateValues. It computes a vector $\widehat{u}$ whose entries $\widehat{u}_{j}$ satisfy the following equation:

$$\widehat{u}_{j}=\sum_{h_{\sigma,b}(i)=j}\widehat{(x-z)}_{i}(\widehat{G'_{B,\delta,\alpha}})_{-o_{\sigma,b}(i)}\omega^{a\sigma i}\pm\delta\|\widehat{x}\|_{1}.$$

For the time complexity analysis of Algorithm 1, by Lemma V.15, the running time of this function is $O(\frac{B}{\alpha}\log(n/\delta)+\|\widehat{z}\|_{0}+\zeta\log(n/\delta))$ with

$$\zeta=|\{i\in\operatorname{supp}(\widehat{z})~|~E_{\mathrm{off}}(i)\}|.$$

Technique II: EstimateValues

EstimateValues is a key function in the main loop (see Section VI-A). Using this function, we obtain the new set $T_{i}$ and the new vector $\widehat{w}^{(i)}$, and update $S_{i}$ by

$$S_{i+1}\leftarrow S_{i}\setminus T_{i},$$

and $\widehat{z}^{(i+1)}$ by

$$\widehat{z}^{(i+1)}\leftarrow\widehat{z}^{(i)}+\widehat{w}^{(i)}.$$
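The update rule above can be sketched as a peeling loop. In the sketch below, `estimate_values` is a hypothetical stand-in for the paper's EstimateValues (which works from bucketed Fourier measurements); here it simply returns exact values on part of $S_i$, just to illustrate the two updates $S_{i+1}\leftarrow S_{i}\setminus T_{i}$ and $\widehat{z}^{(i+1)}\leftarrow\widehat{z}^{(i)}+\widehat{w}^{(i)}$.

```python
import numpy as np

# Minimal sketch of the peeling loop around EstimateValues.
# estimate_values is a mock: it returns exact residuals on a subset T_i
# of S_i, standing in for the sublinear estimator of the paper.
def estimate_values(x_hat, z_hat, S_i):
    T_i = set(list(S_i)[: max(1, len(S_i) // 2)])  # pretend half get good estimates
    w_hat = np.zeros_like(x_hat)
    for t in T_i:
        w_hat[t] = x_hat[t] - z_hat[t]             # exact residual (mock)
    return T_i, w_hat

n = 8
x_hat = np.arange(1, n + 1, dtype=float)           # stand-in spectrum
z_hat = np.zeros(n)
S = {0, 2, 4, 6}

S_i = set(S)
while S_i:
    T_i, w_hat = estimate_values(x_hat, z_hat, S_i)
    S_i = S_i - T_i         # S_{i+1} <- S_i \ T_i
    z_hat = z_hat + w_hat   # z_hat^{(i+1)} <- z_hat^{(i)} + w_hat^{(i)}
```

With an exact mock estimator, the loop drains $S$ and $\widehat{z}$ ends up agreeing with $\widehat{x}$ on $S$; the real analysis shows the same shrinkage happens with the noisy sublinear estimator.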

Technique III: Query Set SS

We use $S$ as the query set, and $S_{i}$ is the set obtained by updating $S$ over $i-1$ iterations. We use $k_{i}=k\gamma^{i-1}$, where $\gamma\leq\frac{1}{1000}$ and $k\geq 1$.

We demonstrate that $S_{i}$ shrinks quickly, namely $|S_{i}|\leq k_{i}$. Since $S_{i}$ is a query set, this means that with a large enough number of iterations we can finish querying all the elements of $S$.

In the proof of the above statement, we use some events associated with a coordinate $t$ (see details in Definition V.9):

  1. “Collision”

  2. “Large offset”

  3. “Large noise”

Given a vector $x$ and a coordinate $t\in[n]$, we define “well-isolated” based on the events above. We then prove that, with probability at least $1-a_{i}$, $t$ is “well-isolated”.

Based on the statement above, with a large enough number of rounds $R$, we obtain a small enough $|S_{i}|$ via $|S_{i}|\leq k_{i}$.

Technique IV: Correctness and Complexity

Using the upper bound on $\|\widehat{x}_{\overline{S}_{i+1}}^{(i+1)}\|_{2}^{2}$ obtained in Section VI-A, we can show that the error satisfies the requirement of the problem. With probability at least $1-10a_{i}/\gamma$, we have

$$\|\widehat{x}_{\overline{S}_{i+1}}^{(i+1)}\|_{2}^{2}\leq(1+\epsilon_{i})\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\epsilon_{i}\delta^{2}n\|\widehat{x}\|_{1}^{2}.$$

Then we can demonstrate

$$\|\widehat{x}_{S}-\widehat{z}^{(R+1)}\|_{2}^{2}\leq\epsilon(\|\widehat{x}_{\overline{S}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2}).$$

Notice that $\widehat{z}^{(R+1)}$ is the output of Algorithm 1, which is also the $x'$ in our problem (see Definition III.2). The above inequalities demonstrate that Algorithm 1 outputs an $x'$ which satisfies

$$\|(\widehat{x}-x')_{S}\|_{2}^{2}\leq\epsilon\|\widehat{x}_{\bar{S}}\|_{2}^{2}+\delta\|\widehat{x}\|_{1}^{2}$$

with success probability $9/10$. We obtain the sample complexity and time complexity from

$$\sum_{i=1}^{R}(B_{i}/\alpha_{i})\log(n/\delta)=O(\epsilon^{-1}k\log(n/\delta)).$$
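The sum above converges because the per-round cost $B_{i}/\alpha_{i}$ shrinks geometrically under the parameter schedule $k_{i}=k\gamma^{i-1}$, $\epsilon_{i}=\epsilon(10\gamma)^{i}$, $\alpha_{i}=1/(200i^{3})$, $B_{i}=Ck_{i}/(\alpha_{i}^{2}\epsilon_{i})$ (the schedule from Lemmas VI.2 and VI.3). A quick numerical check, with arbitrary illustrative values of $C$, $\gamma$, $k$, $\epsilon$:

```python
import numpy as np

# Per-round cost B_i/alpha_i under the paper's parameter schedule.
# The series is a convergent multiple of k/eps (terms decay like i^9/10^i),
# so extra rounds only add a constant factor to the total.
C, gamma, k, eps = 1000.0, 1.0 / 1000.0, 8.0, 0.1

def cost(i):
    k_i = k * gamma ** (i - 1)
    eps_i = eps * (10.0 * gamma) ** i
    alpha_i = 1.0 / (200.0 * i ** 3)
    B_i = C * k_i / (alpha_i ** 2 * eps_i)
    return B_i / alpha_i

partial = np.cumsum([cost(i) for i in range(1, 41)])
ratio = partial[-1] / (k / eps)   # total cost measured in units of k/eps
converged = partial[-1] - partial[19] < 1e-6 * partial[-1]
```

The constant factor `ratio` is large here because $C$ and the $\alpha_i$ powers are chosen for the analysis, not tuned for practice; the point is only that the total is $O(k/\epsilon)$ per $\log(n/\delta)$ factor.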

V Preliminary

In this section, we first present some definitions and background for the Fourier transform in Section V-A. We introduce some technical tools in Section V-B. Then we introduce spectrum permutations and filter functions in Section V-C; they are used as hashing schemes in the Fourier transform literature. In Section V-D, we introduce collision events, large offset events, and large noise events.

V-A Notations

We use $\mathbf{i}$ to denote $\sqrt{-1}$. Note that $e^{\mathbf{i}\theta}=\cos(\theta)+\mathbf{i}\sin(\theta)$. Any complex number $z\in\mathbb{C}$ can be written as $z=a+\mathbf{i}b$, where $a,b\in\mathbb{R}$. We define the conjugate of $z$ as $\overline{z}=a-\mathbf{i}b$, and $|z|=\sqrt{z\overline{z}}=\sqrt{a^{2}+b^{2}}$. For any complex vector $x\in\mathbb{C}^{n}$, we use $\operatorname{supp}(x)$ to denote the support of $x$, so that $\|x\|_{0}=|\operatorname{supp}(x)|$. We define $\omega=e^{2\pi\mathbf{i}/n}$, which is the $n$-th root of unity, i.e., $\omega^{n}=1$.

The discrete convolution of functions ff and gg is given by,

$$(f*g)[n]=\sum_{m=-\infty}^{+\infty}f[m]g[n-m].$$

For a complex vector $x\in\mathbb{C}^{n}$, we use $\widehat{x}\in\mathbb{C}^{n}$ to denote its Fourier spectrum,

$$\widehat{x}_{i}=\frac{1}{\sqrt{n}}\sum_{j=1}^{n}e^{-2\pi\mathbf{i}ij/n}x_{j},\quad\forall i\in[n].$$

Then the inverse transform is

$$x_{j}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}e^{2\pi\mathbf{i}ij/n}\widehat{x}_{i},\quad\forall j\in[n].$$
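This is the unitary convention: a $1/\sqrt{n}$ factor in both directions, so the transform preserves the $\ell_2$ norm (Parseval). With numpy's unnormalized `fft` this is `fft(x)/sqrt(n)`, or equivalently `np.fft.fft(x, norm="ortho")`:

```python
import numpy as np

# The paper's Fourier convention, checked against numpy. Indices are taken
# mod n, so the 1..n sums above match numpy's 0-indexed DFT.
n = 8
rng = np.random.default_rng(1)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

x_hat = np.fft.fft(x) / np.sqrt(n)      # forward transform, unitary scaling
x_back = np.fft.ifft(x_hat) * np.sqrt(n)  # inverse transform

unitary = np.isclose(np.linalg.norm(x_hat), np.linalg.norm(x))  # Parseval
```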

We define

$$\operatorname{Err}(x,k):=\min_{k\text{-sparse}~y}\|x-y\|_{2}.$$

For a vector $x\in\mathbb{R}^{n}$ and a set $S\subseteq[n]$, we define $x_{S}$ as the vector with $(x_{S})_{i}=x_{i}$ if $i\in S$ and $(x_{S})_{i}=0$ otherwise.
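Both notions are easy to compute directly; the sketch below implements the restriction $x_S$ and $\operatorname{Err}(x,k)$. The minimizing $k$-sparse $y$ keeps the $k$ largest-magnitude entries of $x$, so $\operatorname{Err}(x,k)$ is the $\ell_2$ norm of the remaining "tail". The helper names are ours, for illustration.

```python
import numpy as np

def restrict(x, S):
    # x_S: keep coordinates in S, zero out the rest
    out = np.zeros_like(x)
    out[list(S)] = x[list(S)]
    return out

def err(x, k):
    # best k-sparse y keeps the k largest-magnitude entries of x,
    # so ||x - y||_2 is the norm of the remaining tail
    tail = np.sort(np.abs(x))[:-k] if k > 0 else np.abs(x)
    return np.linalg.norm(tail)

x = np.array([5.0, -1.0, 3.0, 0.5, -4.0])
xS = restrict(x, {0, 2})
e1 = err(x, 2)   # drops 5 and -4, leaving the tail [-1, 3, 0.5]
```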

V-B Technical Tools

In this section, we present several technical tools and lemmas from prior works that we use.

Lemma V.1 (Markov’s inequality).

If $X$ is a nonnegative random variable and $a>0$, then the probability that $X$ is at least $a$ is at most the expectation of $X$ divided by $a$:

$$\Pr[X\geq a]\leq\frac{\mathbb{E}[X]}{a}.$$

Let $a=\widetilde{a}\cdot\mathbb{E}[X]$ (where $\widetilde{a}>0$); then we can rewrite the previous inequality as

$$\Pr[X\geq\widetilde{a}\cdot\mathbb{E}[X]]\leq\frac{1}{\widetilde{a}}.$$
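Markov's inequality can be checked exactly on any small finite distribution; the values and probabilities below are an arbitrary example:

```python
import numpy as np

# Exact check of Markov's inequality for a nonnegative X on a
# four-point distribution: Pr[X >= a] <= E[X]/a for every a > 0.
values = np.array([0.0, 1.0, 2.0, 10.0])
probs = np.array([0.4, 0.3, 0.2, 0.1])
EX = float(values @ probs)              # E[X] = 1.7

def pr_at_least(a):
    return float(probs[values >= a].sum())

markov_ok = all(pr_at_least(a) <= EX / a + 1e-12
                for a in [0.5, 1.0, 2.0, 5.0, 10.0])
```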

The following two lemmas about complex numbers are standard. We prove them for completeness.

Lemma V.2.

Let $x\in\mathbb{R}^{n}$ be a fixed vector and let $\sigma_{1},\dots,\sigma_{n}$ be pairwise independent random variables with $\sigma_{i}=\pm 1$, each with probability $1/2$. Then we have:

$$\mathbb{E}_{\sigma}\Big[\big(\sum_{i=1}^{n}\sigma_{i}x_{i}\big)^{2}\Big]=\|x\|_{2}^{2}.$$
Proof.

We have:

\begin{align*}
\mathbb{E}_{\sigma}\Big[\big(\sum_{i=1}^{n}\sigma_{i}x_{i}\big)^{2}\Big]
=&~\mathbb{E}\Big[\sum_{i=1}^{n}\sigma_{i}^{2}x_{i}^{2}\Big]+\mathbb{E}\Big[\sum_{i\neq j}\sigma_{i}x_{i}\sigma_{j}x_{j}\Big] \\
=&~\mathbb{E}\Big[\sum_{i=1}^{n}\sigma_{i}^{2}x_{i}^{2}\Big]+\sum_{i\neq j}\mathbb{E}[\sigma_{i}\sigma_{j}]x_{i}x_{j} \\
=&~\mathbb{E}\Big[\sum_{i=1}^{n}\sigma_{i}^{2}x_{i}^{2}\Big]+\sum_{i\neq j}\mathbb{E}[\sigma_{i}]\cdot\mathbb{E}[\sigma_{j}]x_{i}x_{j} \\
=&~\mathbb{E}\Big[\sum_{i=1}^{n}\sigma_{i}^{2}x_{i}^{2}\Big]+0 \\
=&~\|x\|_{2}^{2},
\end{align*}

where the first step comes from expanding the square and the linearity of expectation, the second step follows from the linearity of expectation, the third step follows from the pairwise independence of the $\sigma_{i}$, the fourth step follows from $\mathbb{E}[\sigma_{i}]=0$, and the final step comes from the definition of $\|\cdot\|_{2}$ and $\sigma_{i}^{2}=1$. ∎
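Lemma V.2 can be verified exactly for small $n$ by averaging over all $2^n$ sign patterns (i.i.d. uniform signs are in particular pairwise independent):

```python
import itertools
import numpy as np

# Exact verification of Lemma V.2: the average of (sum_i sigma_i x_i)^2
# over all sign patterns equals ||x||_2^2.
x = np.array([1.0, -2.0, 0.5, 3.0])
n = len(x)
total = 0.0
for signs in itertools.product([-1.0, 1.0], repeat=n):
    total += np.dot(signs, x) ** 2
expectation = total / 2 ** n   # uniform average over all 2^n patterns
```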

Lemma V.3.

Let $a\sim[n]$ uniformly at random, let $x\in\mathbb{C}^{n}$ be a fixed vector, and let $\sigma$ be invertible modulo $n$. Then we have:

$$\mathbb{E}_{a}\Big[\big|\sum_{i=1}^{n}x_{i}\omega^{\sigma ai}\big|^{2}\Big]=\|x\|_{2}^{2}.$$
Proof.

For any fixed $i\in[n]$ with $\omega^{i}\neq 1$, we have

$$\mathbb{E}_{a}[\omega^{ai}]=\frac{1}{n}\sum_{a=1}^{n}\omega^{ai}=\frac{1}{n}\cdot\omega^{i}\cdot\frac{1-\omega^{ni}}{1-\omega^{i}}=0, \qquad (1)$$

where the first step comes from the geometric sum formula, and the second step comes from $\omega^{ni}=1$. We then have:

\begin{align*}
\mathbb{E}_{a}\Big[\big|\sum_{i=1}^{n}x_{i}\omega^{\sigma ai}\big|^{2}\Big]
=&~\mathbb{E}_{a}\Big[\big(\sum_{i=1}^{n}x_{i}\omega^{\sigma ai}\big)\big(\sum_{i=1}^{n}\bar{x}_{i}\omega^{-\sigma ai}\big)\Big] \\
=&~\mathbb{E}_{a}\Big[\sum_{i=1}^{n}x_{i}\bar{x}_{i}\Big]+\mathbb{E}_{a}\Big[\sum_{i\neq j}x_{i}\omega^{\sigma ai}\bar{x}_{j}\omega^{-\sigma aj}\Big] \\
=&~\mathbb{E}_{a}\Big[\sum_{i=1}^{n}x_{i}\bar{x}_{i}\Big]+\sum_{i\neq j}\mathbb{E}_{a}[\omega^{\sigma a(i-j)}]x_{i}\bar{x}_{j} \\
=&~\mathbb{E}_{a}\Big[\sum_{i=1}^{n}x_{i}\bar{x}_{i}\Big]+0 \\
=&~\|x\|_{2}^{2},
\end{align*}

where the first step follows from $|z|^{2}=z\bar{z}$ for a complex number $z$, the second step follows from expanding the product, the third step follows from the linearity of expectation, the fourth step follows from Eq. (1), and the final step comes from the definition of $\|\cdot\|_{2}$. ∎
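Lemma V.3 can also be verified exactly for small $n$, since the expectation over $a$ is a finite average over $n$ values. Here $\sigma=3$ with $n=8$ is an arbitrary invertible choice:

```python
import numpy as np

# Exact verification of Lemma V.3: with omega = e^{2*pi*i/n} and a uniform
# over [n], the average of |sum_i x_i omega^{sigma*a*i}|^2 equals ||x||_2^2.
n = 8
sigma = 3                      # gcd(3, 8) = 1, so sigma is invertible mod n
omega = np.exp(2j * np.pi / n)
rng = np.random.default_rng(2)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

i = np.arange(n)
vals = [abs(np.sum(x * omega ** (sigma * a * i))) ** 2 for a in range(n)]
avg = np.mean(vals)            # finite average = exact expectation over a
```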

V-C Permutation and filter function

We use the same (pseudorandom) spectrum permutation as [6],

Definition V.4.

Suppose $\sigma^{-1}$ exists modulo $n$. We define the permutation $P_{\sigma,a,b}$ by

$$(P_{\sigma,a,b}x)_{i}=x_{\sigma(i-a)}e^{-2\pi\mathbf{i}\sigma bi/n}.$$

We also define $\pi_{\sigma,b}(i)=\sigma(i-b)\pmod{n}$. Then we have

Claim V.5.

We have that

$$\widehat{P_{\sigma,a,b}x}_{\pi_{\sigma,b}(i)}=\widehat{x}_{i}e^{-2\pi\mathbf{i}\sigma ai/n}.$$
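Claim V.5 says that permuting time by $\sigma$, shifting by $a$, and modulating by $b$ permutes and modulates the spectrum according to $\pi_{\sigma,b}$. This is easy to check numerically with the unitary DFT convention; $\sigma$, $a$, $b$ below are arbitrary (with $\sigma$ invertible mod $n$):

```python
import numpy as np

# Numerical check of Claim V.5.
n = 16
sigma, a, b = 3, 5, 7          # sigma must satisfy gcd(sigma, n) = 1
rng = np.random.default_rng(3)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

i = np.arange(n)
# (P_{sigma,a,b} x)_i = x_{sigma*(i-a) mod n} * e^{-2*pi*i*sigma*b*i/n}
Px = x[(sigma * (i - a)) % n] * np.exp(-2j * np.pi * sigma * b * i / n)

x_hat = np.fft.fft(x, norm="ortho")
Px_hat = np.fft.fft(Px, norm="ortho")

pi = (sigma * (i - b)) % n     # pi_{sigma,b}(i)
claim_holds = np.allclose(
    Px_hat[pi], x_hat * np.exp(-2j * np.pi * sigma * a * i / n))
```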

We define $h_{\sigma,b}(i)$ as the “bin” that the frequency $i$ is mapped into, and $o_{\sigma,b}(i)$ as the “offset” within it. We formally define them as follows:

Definition V.6.

Let the hash function be defined as

$$h_{\sigma,b}(i):=\mathrm{round}\Big(\frac{\pi_{\sigma,b}(i)B}{n}\Big).$$
Definition V.7.

Let the offset function be defined as

$$o_{\sigma,b}(i):=\pi_{\sigma,b}(i)-h_{\sigma,b}(i)\cdot\frac{n}{B}.$$
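By construction, the offset is just the rounding error of $\pi_{\sigma,b}(i)B/n$ scaled by $n/B$, so $|o_{\sigma,b}(i)|\leq n/(2B)$ always. A small check, with arbitrary $\sigma$, $b$:

```python
import numpy as np

# Hash and offset functions from Definitions V.6 and V.7, plus a check of
# the structural bound |o_{sigma,b}(i)| <= n/(2B).
n, B = 64, 8
sigma, b = 5, 11               # sigma invertible mod n

def pi_sb(i):
    return (sigma * (i - b)) % n

def h(i):
    return int(round(pi_sb(i) * B / n))

def o(i):
    return pi_sb(i) - h(i) * n / B

offsets = [o(i) for i in range(n)]
max_off = max(abs(v) for v in offsets)
bound_holds = max_off <= n / (2 * B)
```

Note that since $\sigma$ is invertible, $\pi_{\sigma,b}$ is a bijection on $[n]$, so the $B$ bins receive $n/B$ frequencies each.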

We use the same filter function as [6, 21, 23],

Definition V.8.

Given parameters $B\geq 1$, $\delta>0$, $\alpha>0$. We say that $(G,\widehat{G}')=(G_{B,\delta,\alpha},\widehat{G}'_{B,\delta,\alpha})\in\mathbb{R}^{n}$ is a filter function if it satisfies the following properties:

  1. $|\operatorname{supp}(G)|=O(\alpha^{-1}B\log(n/\delta))$.

  2. $\widehat{G}'_{i}=1$ if $|i|\leq(1-\alpha)n/(2B)$.

  3. $\widehat{G}'_{i}=0$ if $|i|\geq n/(2B)$.

  4. $\widehat{G}'_{i}\in[0,1]$ for all $i$.

  5. $\|\widehat{G}'-\widehat{G}\|_{\infty}<\delta$.

V-D Collision event, large offset event, and large noise event

We use three types of events defined in [6] as basic building blocks for analyzing Fourier set query algorithms. For any $i\in S$, we define three types of events associated with $i$ and $S$, defined over the probability space induced by $\sigma$ and $b$:

Definition V.9 (Collision, large offset, large noise).

The definitions of the three events are given as follows:

  • We say the “Large offset” event $E_{\mathrm{off}}(i)$ holds if

    $$|o_{\sigma,b}(i)|\geq n(1-\alpha)/(2B).$$

  • We say the “Large noise” event $E_{\mathrm{noise}}(i)$ holds if

    $$\mathbb{E}\Big[\big\|\widehat{x}'_{h^{-1}_{\sigma,b}(h_{\sigma,b}(i))\setminus S}\big\|_{2}^{2}\Big]\geq(\alpha B)^{-1}\cdot\operatorname{Err}^{2}(\widehat{x}',k).$$

  • We say the “Collision” event $E_{\mathrm{coll}}(i)$ holds if

    $$h_{\sigma,b}(i)\in h_{\sigma,b}(S\setminus\{i\}).$$
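The collision event is the easiest of the three to probe directly. On a tiny instance we can enumerate every valid $(\sigma,b)$ pair exactly and confirm that a fixed $i\in S$ collides with the rest of $S$ with probability well below the $4|S|/B$ bound of Claim V.11 (the instance below is an arbitrary illustration; bins are taken mod $B$):

```python
import numpy as np
from math import gcd

# Exact enumeration of the collision probability for one i in S,
# over all invertible sigma and all shifts b.
n, B = 64, 16
S = [0, 9, 37]
i0 = S[0]

def h(i, sigma, b):
    return int(round(((sigma * (i - b)) % n) * B / n)) % B

count = total = 0
for sigma in range(1, n):
    if gcd(sigma, n) != 1:
        continue                      # sigma must be invertible mod n
    for b in range(n):
        total += 1
        other_bins = {h(j, sigma, b) for j in S if j != i0}
        count += h(i0, sigma, b) in other_bins
collision_prob = count / total
bound_holds = collision_prob <= 4 * len(S) / B
```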
Definition V.10 (Well-isolated).

For a vector $x\in\mathbb{R}^{n}$, we say a coordinate $t\in[n]$ is “well isolated” when none of the “Collision”, “Large offset”, and “Large noise” events holds.

Claim V.11 (Claim 3.1 in [6]).

For all $i\in S$, we have

$$\Pr[E_{\mathrm{coll}}(i)]\leq 4\frac{|S|}{B}.$$
Claim V.12 (Claim 3.2 in [6]).

For all $i\in S$, we have

$$\Pr[E_{\mathrm{off}}(i)]\leq\alpha.$$
Claim V.13 (Claim 4.1 in [6]).

For any $i\in S$, the event $E_{\mathrm{noise}}(i)$ holds with probability at most $4\alpha$:

$$\Pr[E_{\mathrm{noise}}(i)]\leq 4\alpha.$$
Lemma V.14 (Lemma 4.2 in [6]).

Suppose $B$ divides $n$, and $a$ is sampled uniformly from $[n]$ (with no restriction on the other parameters) in

$$\widehat{u}=\textsc{HashToBins}(P_{\sigma,a,b},\alpha,\widehat{z},B,\delta,x).$$

If none of $E_{\mathrm{off}}(i)$, $E_{\mathrm{coll}}(i)$, and $E_{\mathrm{noise}}(i)$ holds and $j=h_{\sigma,b}(i)$, then for all $i\in[n]$,

$$\mathbb{E}\Big[\big|\widehat{u}_{j}-\widehat{x}'_{i}e^{-\frac{2\pi\mathbf{i}}{n}a\sigma i}\big|^{2}\Big]\leq\frac{2\rho^{2}}{\alpha B}.$$
Lemma V.15 (Lemma 3.3 in [6]).

Suppose $B$ divides $n$. The output $\widehat{u}$ of HashToBins satisfies

$$\widehat{u}_{j}=\sum_{h_{\sigma,b}(i)=j}\widehat{(x-z)}_{i}(\widehat{G'_{B,\delta,\alpha}})_{-o_{\sigma,b}(i)}\omega^{a\sigma i}\pm\delta\|\widehat{x}\|_{1}.$$

Let

$$\zeta:=|\{i\in\operatorname{supp}(\widehat{z})~|~E_{\mathrm{off}}(i)\}|.$$

The running time of HashToBins is

$$O\Big(\frac{B}{\alpha}\log(n/\delta)+\|\widehat{z}\|_{0}+\zeta\log(n/\delta)\Big).$$

VI Analysis on Fourier Set Query Algorithm

In this section, we give a complete analysis of Algorithm 1. First, we provide the iterative loop analysis, which is the main part of our main function FourierSetQuery, in Section VI-A. Using this analysis, we demonstrate an important property of Algorithm 1 in Section VI-B. In Section VI-C, we prove the correctness of the algorithm and analyze its sample and time complexity. This gives a satisfying answer to the problem (see Definition III.2): Algorithm 1 performs better in sample and time complexity than the prior works (see Table I).

VI-A Iterative loop analysis

Iterative loop analysis for Fourier set query is trickier than for the classic set query, because in the Fourier case hashing is not perfect: using a spectrum permutation and a filter function (as the counterpart of hashing techniques), one coordinate can contribute non-trivially to multiple bins. We give the iterative loop induction in Lemma VI.4.

Lemma VI.1.

Given a vector $x\in\mathbb{R}^{n}$, $\gamma\leq 1/1000$, and $\alpha_{i}=1/(200i^{3})$, for a coordinate $t\in[n]$ and each $i\in[R]$, with probability at least $1-6\alpha_{i}$, $t$ is “well isolated” (see Definition V.10).

Proof.

Collision. Using Claim V.11, for any $t\in S_{i}$, the event $E_{\mathrm{coll}}(t)$ holds with probability at most

\begin{align*}
4|S_{i}|/B_{i}\leq&~\frac{4k_{i}}{Ck_{i}/(\alpha_{i}^{2}\epsilon_{i})} \\
=&~4\alpha_{i}^{2}\epsilon_{i}/C \\
\leq&~\alpha_{i},
\end{align*}

where the first step follows from the definition of $B_{i}$ and the assumption on $|S_{i}|$, the second step is straightforward, and the third step follows from the definitions of $\epsilon_{i}$, $\alpha_{i}$, and $C$.

It means

$$\Pr_{\sigma,b}[E_{\mathrm{coll}}(t)]\leq\alpha_{i}.$$

Large offset. Using Claim V.12, for any $t\in S_{i}$, the event $E_{\mathrm{off}}(t)$ holds with probability at most $\alpha_{i}$, i.e.

$$\Pr_{\sigma,b}[E_{\mathrm{off}}(t)]\leq\alpha_{i}.$$

Large noise. Using Claim V.13, for any $t\in S_{i}$,

$$\Pr_{\sigma,b}[E_{\mathrm{noise}}(t)]\leq 4\alpha_{i}.$$

By a union bound over the above three events, $t$ is “well isolated” with probability at least $1-6\alpha_{i}$. ∎

Lemma VI.2.

Given parameters $C\geq 1000$ and $\gamma\leq 1/1000$. For any $k\geq 1$, $\epsilon\in(0,1)$, $R\geq 1$, and each $i\in[R]$, we define

\begin{align*}
k_{i}:=&~k\gamma^{i-1}, \\
\epsilon_{i}:=&~\epsilon(10\gamma)^{i}, \\
\alpha_{i}:=&~1/(200i^{3}), \\
B_{i}:=&~C\cdot k_{i}/(\alpha_{i}^{2}\epsilon_{i}).
\end{align*}

For each $i\in[R]$: if for all $j\in[i-1]$ we have

  1. $\operatorname{supp}(\widehat{w}^{(j)})\subseteq S_{j}$.

  2. $|S_{j+1}|\leq k_{j+1}$.

  3. $\widehat{z}^{(j+1)}=\widehat{z}^{(j)}+\widehat{w}^{(j)}$.

  4. $\widehat{x}^{(j+1)}=\widehat{x}-\widehat{z}^{(j+1)}$.

  5. $\|\widehat{x}_{\overline{S}_{j+1}}^{(j+1)}\|_{2}^{2}\leq(1+\epsilon_{j})\|\widehat{x}_{\overline{S}_{j}}^{(j)}\|_{2}^{2}+\epsilon_{j}\delta^{2}n\|\widehat{x}\|_{1}^{2}$.

Then, with probability $1-10\alpha_{i}/\gamma$, we have

$$|S_{i+1}|\leq k_{i+1}.$$
Proof.

We consider a particular step $i$. We can condition on $|S_{i}|\leq k_{i}$.

By Lemma VI.1, each $t$ is “well isolated” with probability at least $1-6\alpha_{i}$.

Therefore, each $t\in S_{i}$ lies in $T_{i}$ with probability at least $1-6\alpha_{i}$. Then by Markov's inequality (see Lemma V.1) and the assumption in the statement, we have

$$|S_{i}\setminus T_{i}|\leq\gamma k_{i} \qquad (2)$$

with probability $1-6\alpha_{i}/\gamma$. Then we know that

\begin{align*}
|S_{i+1}|=&~|S_{i}\setminus T_{i}| \\
\leq&~\gamma k_{i} \\
\leq&~k_{i+1},
\end{align*}

where the first step follows from the definition $S_{i+1}=S_{i}\backslash T_{i}$ (Line 6 of Algorithm 1), the second step follows from Eq. (2), and the third step follows from the definitions of $k_{i}$ and $k_{i+1}$ (since $k_{i+1}=\gamma k_{i}$). ∎
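For intuition, the parameter schedule above can be tabulated directly. The sketch below uses illustrative values of $k$ and $\epsilon$ (not from the paper); $C$ and $\gamma$ are the paper's choices. It checks that the surviving-set bound $\gamma k_i$ equals $k_{i+1}$, i.e., the live set shrinks geometrically.

```python
# Sketch of the parameter schedule in Lemma VI.2; k and eps are
# illustrative assumptions, C and gamma are the paper's choices.
C, gamma = 1000, 1 / 1000
k, eps = 1000, 0.5

def schedule(i):
    """Return (k_i, eps_i, alpha_i, B_i) for round i (1-indexed)."""
    k_i = k * gamma ** (i - 1)
    eps_i = eps * (10 * gamma) ** i
    alpha_i = 1 / (200 * i ** 3)
    B_i = C * k_i / (alpha_i ** 2 * eps_i)
    return k_i, eps_i, alpha_i, B_i

# The bound |S_{i+1}| <= gamma * k_i coincides with k_{i+1},
# so the live set shrinks by a factor gamma each round.
for i in range(1, 4):
    assert abs(schedule(i + 1)[0] - gamma * schedule(i)[0]) < 1e-9
```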

Lemma VI.3.

Given parameters C1000C\geq 1000, γ1/1000\gamma\leq 1/1000. For any k1,ϵ(0,1)k\geq 1,\epsilon\in(0,1), R1R\geq 1. For each i[R]i\in[R], we define

ki:=\displaystyle k_{i}:= kγi1,\displaystyle~{}k\gamma^{i-1},
ϵi:=\displaystyle\epsilon_{i}:= ϵ(10γ)i,\displaystyle~{}\epsilon(10\gamma)^{i},
αi:=\displaystyle\alpha_{i}:= 1/(200i3),\displaystyle~{}1/(200i^{3}),
Bi:=\displaystyle B_{i}:= Cki/(αi2ϵi).\displaystyle~{}C\cdot k_{i}/(\alpha_{i}^{2}\epsilon_{i}).

For each $i\in[R]$: if for all $j\in[i-1]$ we have

  1. $\operatorname{supp}(\widehat{w}^{(j)})\subseteq S_{j}$.

  2. $|S_{j+1}|\leq k_{j+1}$.

  3. $\widehat{z}^{(j+1)}=\widehat{z}^{(j)}+\widehat{w}^{(j)}$.

  4. $\widehat{x}^{(j+1)}=\widehat{x}-\widehat{z}^{(j+1)}$.

  5. $\|\widehat{x}_{\overline{S}_{j+1}}^{(j+1)}\|_{2}^{2}\leq(1+\epsilon_{j})\|\widehat{x}_{\overline{S}_{j}}^{(j)}\|_{2}^{2}+\epsilon_{j}\delta^{2}n\|\widehat{x}\|_{1}^{2}$.

Then, with probability 110αi/γ1-10\alpha_{i}/\gamma, we have

Pr[x^Ti(i)w^(i)22ϵi20(x^S¯i(i)22+δ2nx^12)]1αi.\displaystyle\Pr\left[\left\|\widehat{x}^{(i)}_{T_{i}}-\widehat{w}^{(i)}\right\|_{2}^{2}\leq\frac{\epsilon_{i}}{20}(\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2})\right]\geq 1-\alpha_{i}.
Proof.

We define ρ(i)\rho^{(i)} and μ(i)\mu^{(i)} as follows

ρ(i)=\displaystyle\rho^{(i)}= x^S¯i(i)22+δ2nx^12,\displaystyle~{}\left\|\widehat{x}^{(i)}_{\overline{S}_{i}}\right\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2},
μ(i)=\displaystyle\mu^{(i)}= ϵiki(x^S¯i(i)22+δ2nx^12).\displaystyle~{}\frac{\epsilon_{i}}{k_{i}}\left(\left\|\widehat{x}^{(i)}_{\overline{S}_{i}}\right\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2}\right). (3)

For a fixed tSit\in S_{i}, let j=hσ,b(t)j=h_{\sigma,b}(t). By Lemma V.15, we have

u^jx^t(i)ωaσt=tTiG^oσ(t)x^t(i)ωaσt±δx^1\displaystyle\widehat{u}_{j}-\widehat{x}_{t}^{(i)}\omega^{a\sigma t}=\sum_{t^{\prime}\in T_{i}}\widehat{G}^{\prime}_{-o_{\sigma}(t^{\prime})}\widehat{x}_{t^{\prime}}^{(i)}\omega^{a\sigma t^{\prime}}\pm\delta\|\widehat{x}\|_{1} (4)

For each tSit\in S_{i}, we define set Qi,t=hσ,b1(j)\{t}Q_{i,t}=h^{-1}_{\sigma,b}(j)\backslash\{t\}. Let TiT_{i} be the set of coordinates tSit\in S_{i} such that Qi,tSi=Q_{i,t}\cap S_{i}=\emptyset. Then it is easy to observe that

tTi|tQi,tG^oσ(t)x^t(i)ωaσt|2\displaystyle~{}\sum_{t\in T_{i}}\left|\sum_{t^{\prime}\in Q_{i,t}}\widehat{G}^{\prime}_{-o_{\sigma}(t^{\prime})}\widehat{x}_{t^{\prime}}^{(i)}\omega^{a\sigma t^{\prime}}\right|^{2}
=\displaystyle= tTi|tQi,t\SiG^oσ(t)x^t(i)ωaσt|2\displaystyle~{}\sum_{t\in T_{i}}\left|\sum_{t^{\prime}\in Q_{i,t}\backslash S_{i}}\widehat{G}^{\prime}_{-o_{\sigma}(t^{\prime})}\widehat{x}_{t^{\prime}}^{(i)}\omega^{a\sigma t^{\prime}}\right|^{2}
\displaystyle\leq tSi|tQi,t\SiG^oσ(t)x^t(i)ωaσt|2\displaystyle~{}\sum_{t\in S_{i}}\left|\sum_{t^{\prime}\in Q_{i,t}\backslash S_{i}}\widehat{G}^{\prime}_{-o_{\sigma}(t^{\prime})}\widehat{x}_{t^{\prime}}^{(i)}\omega^{a\sigma t^{\prime}}\right|^{2}

where the first step comes from $Q_{i,t}\cap S_{i}=\emptyset$ for every $t\in T_{i}$, and the second step follows from $T_{i}\subseteq S_{i}$.

We can calculate the expectation of x^Ti(i)w^(i)22\|\widehat{x}^{(i)}_{T_{i}}-\widehat{w}^{(i)}\|_{2}^{2}.

We first demonstrate that

𝔼σ,a,b[x^Ti(i)w^(i)22]=𝔼σ,a,b[tTi|x^t(i)u^hσ,b(t)ωaσt|2].\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\left\|\widehat{x}^{(i)}_{T_{i}}-\widehat{w}^{(i)}\right\|_{2}^{2}\right]=\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\sum_{t\in T_{i}}|\widehat{x}_{t}^{(i)}-\widehat{u}_{h_{\sigma,b}(t)}\omega^{-a\sigma t}|^{2}\right].

We then upper bound

$\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\sum_{t\in T_{i}}|\widehat{x}_{t}^{(i)}-\widehat{u}_{h_{\sigma,b}(t)}\omega^{-a\sigma t}|^{2}\right].$

We have

𝔼σ,a,b[x^Ti(i)w^(i)22]=\displaystyle\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\left\|\widehat{x}^{(i)}_{T_{i}}-\widehat{w}^{(i)}\right\|_{2}^{2}\right]= 𝔼σ,a,b[tTi|x^t(i)w^t(i)|2]\displaystyle~{}\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\sum_{t\in T_{i}}|\widehat{x}_{t}^{(i)}-\widehat{w}_{t}^{(i)}|^{2}\right]
=\displaystyle= 𝔼σ,a,b[tTi|x^t(i)u^hσ,b(t)ωaσt|2]\displaystyle~{}\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\sum_{t\in T_{i}}|\widehat{x}_{t}^{(i)}-\widehat{u}_{h_{\sigma,b}(t)}\omega^{-a\sigma t}|^{2}\right]
=\displaystyle= 𝔼σ,a,b[tTi|x^t(i)ωaσtu^hσ,b(t)|2]\displaystyle~{}\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\sum_{t\in T_{i}}|\widehat{x}_{t}^{(i)}\omega^{a\sigma t}-\widehat{u}_{h_{\sigma,b}(t)}|^{2}\right]

where the first step expands the norm as a summation over $T_{i}$, the second step comes from the definition of $\widehat{w}_{t}^{(i)}$ (Line 19 in Algorithm 1), and the third step follows from

$|\widehat{x}_{t}^{(i)}-\widehat{u}_{h_{\sigma,b}(t)}\omega^{-a\sigma t}|=|\omega^{-a\sigma t}|\cdot|\widehat{x}_{t}^{(i)}\omega^{a\sigma t}-\widehat{u}_{h_{\sigma,b}(t)}|$

together with $|\omega^{-a\sigma t}|=1$.

Then we have

𝔼σ,a,b[x^Ti(i)w^(i)22]\displaystyle~{}\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\left\|\widehat{x}^{(i)}_{T_{i}}-\widehat{w}^{(i)}\right\|_{2}^{2}\right]
=\displaystyle= 𝔼σ,a,b[tTi|x^t(i)ωaσtu^hσ,b(t)|2]\displaystyle~{}\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\sum_{t\in T_{i}}|\widehat{x}_{t}^{(i)}\omega^{a\sigma t}-\widehat{u}_{h_{\sigma,b}(t)}|^{2}\right]
\displaystyle\leq tSi2𝔼σ,a,b[|tQi,t\SiG^oσ(t)x^t(i)ωaσt|2]+δ2x^12\displaystyle~{}\sum_{t\in S_{i}}2\operatorname*{{\mathbb{E}}}_{\sigma,a,b}\left[\left|\sum_{t^{\prime}\in Q_{i,t}\backslash S_{i}}\widehat{G}^{\prime}_{-o_{\sigma}(t^{\prime})}{\widehat{x}_{t^{\prime}}^{(i)}}\omega^{a\sigma t^{\prime}}\right|^{2}\right]+\delta^{2}\|\widehat{x}\|_{1}^{2}
\displaystyle\leq tSi2𝔼σ,b[tQi,t\Si|G^oσ(t)x^t(i)|2]+δ2x^12\displaystyle~{}\sum_{t\in S_{i}}2\operatorname*{{\mathbb{E}}}_{\sigma,b}\left[\sum_{t^{\prime}\in Q_{i,t}\backslash S_{i}}\left|\widehat{G}^{\prime}_{-o_{\sigma}(t^{\prime})}{\widehat{x}_{t^{\prime}}^{(i)}}\right|^{2}\right]+\delta^{2}\|\widehat{x}\|_{1}^{2}
=\displaystyle= tSi2𝔼σ,b[tS¯i1(tQi,t\Si)|G^oσ(t)x^t(i)|2]+δ2x^12\displaystyle~{}\sum_{t\in S_{i}}2\operatorname*{{\mathbb{E}}}_{\sigma,b}\left[\sum_{t^{\prime}\in\bar{S}_{i}}\textbf{1}(t^{\prime}\in Q_{i,t}\backslash S_{i})\cdot\left|\widehat{G}^{\prime}_{-o_{\sigma}(t^{\prime})}{\widehat{x}_{t^{\prime}}^{(i)}}\right|^{2}\right]+\delta^{2}\|\widehat{x}\|_{1}^{2}
\displaystyle\leq tSi(1Bix^S¯i(i)22+δ2x^12)\displaystyle~{}\sum_{t\in S_{i}}(\frac{1}{B_{i}}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}\|\widehat{x}\|_{1}^{2})
\displaystyle\leq |Si|Bix^S¯i(i)22+δ2|Si|x^12\displaystyle~{}\frac{|S_{i}|}{B_{i}}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}|S_{i}|\cdot\|\widehat{x}\|_{1}^{2}
\displaystyle\leq ϵiαi2Cx^S¯i(i)22+δ2|Si|x^12,\displaystyle~{}\frac{\epsilon_{i}\alpha_{i}^{2}}{C}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}|S_{i}|\cdot\|\widehat{x}\|_{1}^{2},

where the first step follows from the equation above, the second step follows from Eq. (4) and Lemma V.3, the third step follows from expanding the squared sum, the fourth step follows from the fact that if $A_{1}\subseteq A_{2}$, then

$\sum_{i\in A_{1}}f(i)=\sum_{i\in A_{2}}\textbf{1}(i\in A_{1})f(i),$

the fifth step follows since, for two pairwise independent coordinates $t$ and $t^{\prime}$, $h_{\sigma,b}(t)=h_{\sigma,b}(t^{\prime})$ holds with probability at most $1/B_{i}$, the sixth step comes from the summation over $S_{i}$, and the last step follows from $|S_{i}|\leq k_{i}$ and $B_{i}=C\cdot k_{i}/(\alpha_{i}^{2}\epsilon_{i})$.
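The collision bound used in the fifth step can be sanity-checked numerically. The hash below is a simplified stand-in for $h_{\sigma,b}$ (permute by a random odd multiplier, shift, then bin into $B$ buckets); the concrete $n$, $B$, and coordinate values are illustrative assumptions, not the paper's exact construction.

```python
import random

random.seed(0)
n, B = 1024, 32
t1, t2 = 3, 700                         # two fixed distinct coordinates

def h(sigma, b, u):
    # simplified stand-in for the paper's hash h_{sigma,b}:
    # permute by an odd multiplier, shift by b, then bin into B buckets
    return ((sigma * u + b) % n) * B // n

trials = 20000
collisions = 0
for _ in range(trials):
    sigma = random.randrange(1, n, 2)   # random odd multiplier
    b = random.randrange(n)
    if h(sigma, b, t1) == h(sigma, b, t2):
        collisions += 1

rate = collisions / trials
# the pairwise collision probability is O(1/B); 4/B is a safe margin
assert 0 < rate <= 4 / B
```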

Then, using Markov’s inequality, we have,

Pr[x^Ti(i)w^(i)22ϵiαiCx^S¯i(i)22+δ2|Si|αix^12]αi.\displaystyle\Pr\left[\left\|\widehat{x}^{(i)}_{T_{i}}-\widehat{w}^{(i)}\right\|_{2}^{2}\geq\frac{\epsilon_{i}\alpha_{i}}{C}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}\frac{|S_{i}|}{\alpha_{i}}\|\widehat{x}\|_{1}^{2}\right]\leq\alpha_{i}.

Note that

ϵiαiCx^S¯i(i)22+δ2|Si|αix^12\displaystyle\frac{\epsilon_{i}\alpha_{i}}{C}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}\frac{|S_{i}|}{\alpha_{i}}\|\widehat{x}\|_{1}^{2}\leq ϵiCx^S¯i(i)22+δ2|Si|αix^12\displaystyle~{}\frac{\epsilon_{i}}{C}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}\frac{|S_{i}|}{\alpha_{i}}\|\widehat{x}\|_{1}^{2}
\displaystyle\leq ϵiCx^S¯i(i)22+ϵiCδ2Bix^12\displaystyle~{}\frac{\epsilon_{i}}{C}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\frac{\epsilon_{i}}{C}\delta^{2}B_{i}\|\widehat{x}\|_{1}^{2}
\displaystyle\leq ϵiCx^S¯i(i)22+ϵiCδ2nx^12\displaystyle~{}\frac{\epsilon_{i}}{C}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\frac{\epsilon_{i}}{C}\delta^{2}n\|\widehat{x}\|_{1}^{2}
\displaystyle\leq ϵi20(x^S¯i(i)22+δ2nx^12),\displaystyle~{}\frac{\epsilon_{i}}{20}(\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2}),

where the first step follows by αi1\alpha_{i}\leq 1, the second step follows by |Si|ki=ϵiBiαi2/C|S_{i}|\leq k_{i}=\epsilon_{i}B_{i}\alpha_{i}^{2}/C, the third step follows by BinB_{i}\leq n, the last step follows by C1000C\geq 1000.

Thus, we have

$\Pr\left[\left\|\widehat{x}^{(i)}_{T_{i}}-\widehat{w}^{(i)}\right\|_{2}^{2}\leq\frac{\epsilon_{i}}{20}(\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2})\right]\geq 1-\alpha_{i}.$ ∎

Lemma VI.4.

Given parameters C1000C\geq 1000, γ1/1000\gamma\leq 1/1000. For any k1,ϵ(0,1)k\geq 1,\epsilon\in(0,1), R1R\geq 1. For each i[R]i\in[R], we define

ki:=\displaystyle k_{i}:= kγi1,\displaystyle~{}k\gamma^{i-1},
ϵi:=\displaystyle\epsilon_{i}:= ϵ(10γ)i,\displaystyle~{}\epsilon(10\gamma)^{i},
αi:=\displaystyle\alpha_{i}:= 1/(200i3),\displaystyle~{}1/(200i^{3}),
Bi:=\displaystyle B_{i}:= Cki/(αi2ϵi).\displaystyle~{}C\cdot k_{i}/(\alpha_{i}^{2}\epsilon_{i}).

For each $i\in[R]$: if for all $j\in[i-1]$ we have

  1. $\operatorname{supp}(\widehat{w}^{(j)})\subseteq S_{j}$.

  2. $|S_{j+1}|\leq k_{j+1}$.

  3. $\widehat{z}^{(j+1)}=\widehat{z}^{(j)}+\widehat{w}^{(j)}$.

  4. $\widehat{x}^{(j+1)}=\widehat{x}-\widehat{z}^{(j+1)}$.

  5. $\|\widehat{x}_{\overline{S}_{j+1}}^{(j+1)}\|_{2}^{2}\leq(1+\epsilon_{j})\|\widehat{x}_{\overline{S}_{j}}^{(j)}\|_{2}^{2}+\epsilon_{j}\delta^{2}n\|\widehat{x}\|_{1}^{2}$.

Then, with probability 110αi/γ1-10\alpha_{i}/\gamma, we have

  1. $\operatorname{supp}(\widehat{w}^{(i)})\subseteq S_{i}$.

  2. $|S_{i+1}|\leq k_{i+1}$.

  3. $\widehat{z}^{(i+1)}=\widehat{z}^{(i)}+\widehat{w}^{(i)}$.

  4. $\widehat{x}^{(i+1)}=\widehat{x}-\widehat{z}^{(i+1)}$.

  5. $\|\widehat{x}_{\overline{S}_{i+1}}^{(i+1)}\|_{2}^{2}\leq(1+\epsilon_{i})\|\widehat{x}_{\overline{S}_{i}}^{(i)}\|_{2}^{2}+\epsilon_{i}\delta^{2}n\|\widehat{x}\|_{1}^{2}$.

Proof.

We will prove the five results one by one.

Part 1.

It follows from Line 19 of Algorithm 1 that

supp(w^(i))Si.\displaystyle\operatorname{supp}(\widehat{w}^{(i)})\subseteq S_{i}.

Part 2.

By Lemma VI.2, we have that

$|S_{i+1}|\leq k_{i+1}.$

Part 3.

It follows from Line 7 of Algorithm 1 that

z^(i+1)=z^(i)+w^(i).\displaystyle\widehat{z}^{(i+1)}=\widehat{z}^{(i)}+\widehat{w}^{(i)}.

Part 4.

It follows from Line 28 of Algorithm 1 that

x^(i+1)=x^z^(i+1).\displaystyle\widehat{x}^{(i+1)}=\widehat{x}-\widehat{z}^{(i+1)}.

Part 5.

By Lemma VI.3, we have that

Pr[x^Ti(i)w^(i)22ϵi20(x^S¯i(i)22+δ2nx^12)]1αi.\displaystyle\Pr\left[\left\|\widehat{x}^{(i)}_{T_{i}}-\widehat{w}^{(i)}\right\|_{2}^{2}\leq\frac{\epsilon_{i}}{20}(\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2})\right]\geq 1-\alpha_{i}. (5)

Recall that

w^(i)=z^(i+1)z^(i)=x^(i)x^(i+1).\displaystyle\widehat{w}^{(i)}=\widehat{z}^{(i+1)}-\widehat{z}^{(i)}=\widehat{x}^{(i)}-\widehat{x}^{(i+1)}.

It is obvious that

supp(w^(i))Ti.\displaystyle\operatorname{supp}(\widehat{w}^{(i)})\subseteq T_{i}.

Conditioning on the event that all coordinates in $T_{i}$ are well isolated and that Eq. (5) holds, we have

x^S¯i+1(i+1)22=\displaystyle\|\widehat{x}_{\overline{S}_{i+1}}^{(i+1)}\|_{2}^{2}= (x^(i)w^(i))S¯i+122\displaystyle~{}\|(\widehat{x}^{(i)}-\widehat{w}^{(i)})_{\overline{S}_{i+1}}\|_{2}^{2}
=\displaystyle= x^S¯i+1(i)w^S¯i+1(i)22\displaystyle~{}\|\widehat{x}^{(i)}_{\overline{S}_{i+1}}-\widehat{w}^{(i)}_{\overline{S}_{i+1}}\|_{2}^{2}
=\displaystyle= x^S¯i+1(i)w^(i)22\displaystyle~{}\|\widehat{x}^{(i)}_{\overline{S}_{i+1}}-\widehat{w}^{(i)}\|_{2}^{2}
=\displaystyle= x^S¯iTi(i)w^(i)22\displaystyle~{}\|\widehat{x}^{(i)}_{\overline{S}_{i}\cup T_{i}}-\widehat{w}^{(i)}\|_{2}^{2}
=\displaystyle= x^S¯i(i)22+x^Ti(i)w^(i)22\displaystyle~{}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\|\widehat{x}^{(i)}_{T_{i}}-\widehat{w}^{(i)}\|_{2}^{2}
\displaystyle\leq x^S¯i(i)22+ϵi(x^S¯i(i)22+δ2nx^12)\displaystyle~{}\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\epsilon_{i}(\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2})
=\displaystyle= (1+ϵi)x^S¯i(i)22+ϵiδ2nx^12.\displaystyle~{}(1+\epsilon_{i})\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}+\epsilon_{i}\delta^{2}n\|\widehat{x}\|_{1}^{2}.

where the first step comes from $\widehat{x}^{(i+1)}=\widehat{x}^{(i)}-\widehat{w}^{(i)}$, the second step is due to rearranging the terms, the third step is due to $\widehat{w}^{(i)}=\widehat{w}^{(i)}_{\overline{S}_{i+1}}$, the fourth step comes from $S_{i}=T_{i}\cup S_{i+1}$, the fifth step is due to rearranging the terms, the sixth step comes from Eq. (5), and the final step comes from merging the $\|\widehat{x}^{(i)}_{\overline{S}_{i}}\|_{2}^{2}$ terms. ∎

VI-B Induction over all iterations

For completeness, we give the result induced over all iterations ($i\in[R]$). With the following lemma at hand, we can prove the theorem in Section VI-C.

Lemma VI.5.

Given parameters C1000C\geq 1000, γ1/1000\gamma\leq 1/1000. For any k1,ϵ(0,1)k\geq 1,\epsilon\in(0,1), R1R\geq 1. For each i[R]i\in[R], we define

ki:=\displaystyle k_{i}:= kγi1,\displaystyle~{}k\gamma^{i-1},
ϵi:=\displaystyle\epsilon_{i}:= ϵ(10γ)i,\displaystyle~{}\epsilon(10\gamma)^{i},
αi:=\displaystyle\alpha_{i}:= 1/(200i3),\displaystyle~{}1/(200i^{3}),
Bi:=\displaystyle B_{i}:= Cki/(αi2ϵi).\displaystyle~{}C\cdot k_{i}/(\alpha_{i}^{2}\epsilon_{i}).

For each $i\in[R]$, with probability $1-10\alpha_{i}/\gamma$, we have

|Si+1|ki\displaystyle|S_{i+1}|\leq k_{i}

and

x^S¯i+1(i+1)22(1+ϵi)x^S¯i(i)22+ϵiδ2nx^12\displaystyle\|\widehat{x}_{\overline{S}_{i+1}}^{(i+1)}\|_{2}^{2}\leq(1+\epsilon_{i})\|\widehat{x}_{\overline{S}_{i}}^{(i)}\|_{2}^{2}+\epsilon_{i}\delta^{2}n\|\widehat{x}\|_{1}^{2}
Proof.

Our proof consists of two parts. First, we verify that the inequalities above hold for $i=1$. Then, based on Lemma VI.4 and induction over $i\in[R]$, the proof is complete.

By Lemma VI.1, with probability $1-6\alpha_{1}$, each $t$ is well isolated (see Definition V.10).

Part 1.

We have $|S_{1}|=|S|\leq k=k_{1}$ (see Definition III.2). Then by Lemma VI.2, for each $i\in[R]$ we have $|S_{i+1}|\leq k_{i+1}\leq k_{i}$.

Part 2. Given that all coordinates in $T_{1}$ are well isolated, with probability at least $1-10\alpha_{1}/\gamma$, we have

x^S¯2(2)22=\displaystyle\|\widehat{x}_{\overline{S}_{2}}^{(2)}\|_{2}^{2}= (x^(1)w^(1))S¯222\displaystyle~{}\|(\widehat{x}^{(1)}-\widehat{w}^{(1)})_{\overline{S}_{2}}\|_{2}^{2}
=\displaystyle= x^S¯2(1)w^S¯2(1)22\displaystyle~{}\|\widehat{x}^{(1)}_{\overline{S}_{2}}-\widehat{w}^{(1)}_{\overline{S}_{2}}\|_{2}^{2}
=\displaystyle= x^S¯2(1)w^(1)22\displaystyle~{}\|\widehat{x}^{(1)}_{\overline{S}_{2}}-\widehat{w}^{(1)}\|_{2}^{2}
=\displaystyle= x^S¯1T1(1)w^(1)22\displaystyle~{}\|\widehat{x}^{(1)}_{\overline{S}_{1}\cup T_{1}}-\widehat{w}^{(1)}\|_{2}^{2}
=\displaystyle= x^S¯1(1)22+x^T1(1)w^(1)22\displaystyle~{}\|\widehat{x}^{(1)}_{\overline{S}_{1}}\|_{2}^{2}+\|\widehat{x}^{(1)}_{T_{1}}-\widehat{w}^{(1)}\|_{2}^{2}
\displaystyle\leq x^S¯1(1)22+ϵ1(x^S¯1(1)22+δ2nx^12)\displaystyle~{}\|\widehat{x}^{(1)}_{\overline{S}_{1}}\|_{2}^{2}+\epsilon_{1}(\|\widehat{x}^{(1)}_{\overline{S}_{1}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2})
=\displaystyle= (1+ϵ1)x^S¯1(1)22+ϵ1δ2nx^12.\displaystyle~{}(1+\epsilon_{1})\|\widehat{x}^{(1)}_{\overline{S}_{1}}\|_{2}^{2}+\epsilon_{1}\delta^{2}n\|\widehat{x}\|_{1}^{2}.

where the first step comes from $\widehat{x}^{(2)}=\widehat{x}^{(1)}-\widehat{w}^{(1)}$, the second step is due to rearranging the terms, the third step is due to $\widehat{w}^{(1)}=\widehat{w}^{(1)}_{\overline{S}_{2}}$, the fourth step comes from $S_{1}=T_{1}\cup S_{2}$, the fifth step is due to rearranging the terms, the sixth step comes from Lemma VI.3 with $i=1$, and the final step comes from merging the $\|\widehat{x}^{(1)}_{\overline{S}_{1}}\|_{2}^{2}$ terms.

By Lemma VI.4 and induction over $i$, for all $i\in[R]$ we have

$\|\widehat{x}_{\overline{S}_{i+1}}^{(i+1)}\|_{2}^{2}\leq(1+\epsilon_{i})\|\widehat{x}_{\overline{S}_{i}}^{(i)}\|_{2}^{2}+\epsilon_{i}\delta^{2}n\|\widehat{x}\|_{1}^{2}.$ ∎

VI-C Main result

Algorithm 1 Fourier set query algorithm
1:procedure FourierSetQuery(x,S,ϵ,kx,S,\epsilon,k)\triangleright Theorem VI.6
2:     $\gamma\leftarrow 1/1000$, $C\leftarrow 1000$, $R\leftarrow\log k$, $\widehat{z}^{(1)}\leftarrow 0$, $S_{1}\leftarrow S$
3:     for  i=1Ri=1\to R do
4:         $k_{i}\leftarrow k\gamma^{i-1}$, $\epsilon_{i}\leftarrow\epsilon(10\gamma)^{i}$, $\alpha_{i}\leftarrow 1/(200i^{3})$, $B_{i}\leftarrow C\cdot k_{i}/(\alpha_{i}^{2}\epsilon_{i})$
5:         w^(i),TiEstimateValues(x,z^(i),Si,Bi,δ,αi)\widehat{w}^{(i)},T_{i}\leftarrow\textsc{EstimateValues}(x,\widehat{z}^{(i)},S_{i},B_{i},\delta,\alpha_{i}) \triangleright w^(i)\widehat{w}^{(i)} is |Ti||T_{i}|-sparse
6:         Si+1Si\TiS_{i+1}\leftarrow S_{i}\backslash T_{i}
7:         z^(i+1)z^(i)+w^(i)\widehat{z}^{(i+1)}\leftarrow\widehat{z}^{(i)}+\widehat{w}^{(i)}
8:     end for
9:     return z^(R+1)\widehat{z}^{(R+1)}
10:end procedure
11:procedure EstimateValues(x,z^,S,B,δ,αx,\widehat{z},S,B,\delta,\alpha) \triangleright Lemma VI.4
12:     Choose a,b[n]a,b\in[n] uniformly at random
13:     Choose σ\sigma uniformly at random from the set of odd numbers in [n][n]
14:     u^HashToBins(Pσ,a,b,α,z^,B,δ,x)\widehat{u}\leftarrow\textsc{HashToBins}(P_{\sigma,a,b},\alpha,\widehat{z},B,\delta,x)
15:     w^0\widehat{w}\leftarrow 0, TT\leftarrow\emptyset
16:     for tSt\in S do
17:         if tt is isolated from other coordinates of SS then \triangleright hσ,b(t)hσ,b(S\{t})h_{\sigma,b}(t)\notin h_{\sigma,b}(S\backslash\{t\})
18:              if no large offset then \triangleright n(1α)/(2B)>|oσ,b(t)|n(1-\alpha)/(2B)>|o_{\sigma,b}(t)|
19:                   w^tu^hσ,b(t)e2π𝐢nσat\widehat{w}_{t}\leftarrow\widehat{u}_{h_{\sigma,b}(t)}e^{-\frac{2\pi\mathbf{i}}{n}\sigma at}
20:                  TT{t}T\leftarrow T\cup\{t\}
21:              end if
22:         end if
23:     end for
24:     return w^,T\widehat{w},T
25:end procedure
26:procedure HashToBins(Pσ,a,b,α,z^,B,δ,xP_{\sigma,a,b},\alpha,\widehat{z},B,\delta,x)
27:     Compute y^jn/B\widehat{y}_{jn/B} for j[B]j\in[B], where y=GB,α,δ(Pσ,a,bx)y=G_{B,\alpha,\delta}\cdot(P_{\sigma,a,b}x)
28:     Compute y^jn/B=y^jn/B(GB,α,δ^Pσ,a,bz^)jn/B\widehat{y}^{\prime}_{jn/B}=\widehat{y}_{jn/B}-(\widehat{G^{\prime}_{B,\alpha,\delta}}*\widehat{P_{\sigma,a,b}z})_{jn/B}
29:     return u^j=y^jn/B\widehat{u}_{j}=\widehat{y}^{\prime}_{jn/B}
30:end procedure
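The isolation test inside EstimateValues (lines 16–22) can be sketched in a few lines: a coordinate of $S$ is estimated only if no other coordinate of $S$ hashes to its bin. The hash here is a simplified stand-in for $h_{\sigma,b}$ (an assumption, not the paper's exact permutation), and $n$, $B$, and $S$ are illustrative values.

```python
import random

random.seed(1)
n, B = 1024, 64
S = random.sample(range(n), 8)           # query set, |S| = k = 8

sigma = random.randrange(1, n, 2)        # random odd multiplier
b = random.randrange(n)

def h(u):
    # simplified stand-in for h_{sigma,b}: permute, shift, then bin
    return ((sigma * u + b) % n) * B // n

# t is "isolated" iff no other coordinate of S lands in its bin;
# only isolated coordinates are estimated and moved out of S
T = [t for t in S if all(h(t) != h(u) for u in S if u != t)]
S_next = [t for t in S if t not in T]     # S_{i+1} = S_i \ T_i

assert set(T) | set(S_next) == set(S)
```

Each round re-draws $(\sigma, b)$, so coordinates that collided in one round are likely isolated in a later one; this is what drives the geometric shrinkage of $S_i$.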

In this subsection, we give the main result as the following theorem.

Theorem VI.6 (Main result).

Given a vector $x\in\mathbb{C}^{n}$ with Fourier transform $\widehat{x}$, for every $\epsilon,\delta\in(0,1)$ and $k\geq 1$, and any $S\subseteq[n]$ with $|S|=k$, there exists an algorithm (Algorithm 1) that takes

O(ϵ1klog(n/δ))\displaystyle O(\epsilon^{-1}k\log(n/\delta))

samples, runs in

O(ϵ1klog(n/δ))\displaystyle O(\epsilon^{-1}k\log(n/\delta))

time, and outputs a vector xnx^{\prime}\in\operatorname*{\mathbb{C}}^{n} such that

(xx^)S22ϵx^S¯22+δx^12\displaystyle\|(x^{\prime}-\widehat{x})_{S}\|_{2}^{2}\leq\epsilon\|\widehat{x}_{\bar{S}}\|_{2}^{2}+\delta\|\widehat{x}\|_{1}^{2}

holds with probability at least 9/109/10.

Proof.

By the parameter setting in Algorithm 1, the assumptions of Lemma VI.4 hold. Then, by induction on Lemma VI.4 with the parameters

ki:=kγi1,\displaystyle k_{i}:=k\gamma^{i-1},
ϵi:=ϵ(10γ)i,\displaystyle\epsilon_{i}:=\epsilon(10\gamma)^{i},
αi=1/(200i3),\displaystyle\alpha_{i}=1/(200i^{3}),
Bi:=Cki/(αi2ϵi),\displaystyle B_{i}:=C\cdot k_{i}/(\alpha_{i}^{2}\epsilon_{i}),

for each $i\in[R]$, with probability $1-10\alpha_{i}/\gamma$, we have

  1. $\operatorname{supp}(\widehat{w}^{(i)})\subseteq S_{i}$.

  2. $|S_{i+1}|\leq k_{i+1}$.

  3. $\widehat{z}^{(i+1)}=\widehat{z}^{(i)}+\widehat{w}^{(i)}$.

  4. $\widehat{x}^{(i+1)}=\widehat{x}-\widehat{z}^{(i+1)}$.

  5. $\|\widehat{x}_{\overline{S}_{i+1}}^{(i+1)}\|_{2}^{2}\leq(1+\epsilon_{i})\|\widehat{x}_{\overline{S}_{i}}^{(i)}\|_{2}^{2}+\epsilon_{i}\delta^{2}n\|\widehat{x}\|_{1}^{2}$.

By Lemma VI.5, we conclude that after $R=\log k$ iterations we attain the desired result. Next, we analyze the sample complexity and the time complexity.

Proof of Sample Complexity.

From the analysis above, the number of samples needed in iteration $i$ is $O((B_{i}/\alpha_{i})\log(n/\delta))$.

The total sample complexity of the estimation is therefore

i=1R(Bi/αi)log(n/δ)=O(ϵ1klog(n/δ)).\displaystyle\sum_{i=1}^{R}(B_{i}/\alpha_{i})\log(n/\delta)=O(\epsilon^{-1}k\log(n/\delta)).
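The sum above converges because the polynomial growth of $1/\alpha_{i}^{3}$ is crushed by the geometric decay of $k_{i}/\epsilon_{i}$: each term is proportional to $(k/\epsilon)\cdot i^{9}/10^{i}$. A quick numeric check, with illustrative $k$ and $\epsilon$ (the constant is enormous but independent of $n$, $k$, and $\epsilon$):

```python
# Each term is B_i/alpha_i = C * k_i / (alpha_i^3 * eps_i); with the
# schedule above this is proportional to (k/eps) * i^9 / 10^i, so the
# series converges.  k and eps are illustrative assumptions.
C, gamma = 1000, 1 / 1000
k, eps = 100, 0.1

def term(i):
    k_i = k * gamma ** (i - 1)
    eps_i = eps * (10 * gamma) ** i
    alpha_i = 1 / (200 * i ** 3)
    return C * k_i / (alpha_i ** 2 * eps_i) / alpha_i   # B_i / alpha_i

total = sum(term(i) for i in range(1, 60))
tail = sum(term(i) for i in range(30, 60))
assert tail < 1e-6 * total         # dominated by the first few terms
assert 0 < total <= 1e16 * k / eps  # hence sum_i B_i/alpha_i = O(k/eps)
```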

The time per iteration mainly comes from two parts: the EstimateValues and HashToBins procedures. The running time of EstimateValues is dominated by its loop, whose number of iterations can be bounded by $O((B_{i}/\alpha_{i})\log(n/\delta))$.

By Lemma V.15, the time complexity of HashToBins is bounded by $O((B_{i}/\alpha_{i})\log(n/\delta))$. This procedure is called only once per iteration.

Proof of Time Complexity.

With $R=\log k$, the total time complexity of the estimation is

i=1R(Bi/αi)log(n/δ)=O(ϵ1klog(n/δ)).\displaystyle\sum_{i=1}^{R}(B_{i}/\alpha_{i})\log(n/\delta)=O(\epsilon^{-1}k\log(n/\delta)).

Proof of Success Probability.

The failure probability is i=1R10αi/γ<1/10.\sum_{i=1}^{R}10\alpha_{i}/\gamma<1/10.

Upper bound x^S¯i(i)22\|\widehat{x}_{\overline{S}_{i}}^{(i)}\|_{2}^{2}.

By Lemma VI.4, we have that

$\|\widehat{x}^{(i+1)}_{\overline{S}_{i+1}}\|_{2}^{2}\leq (1+\epsilon_{i})\|\widehat{x}_{\overline{S}_{i}}^{(i)}\|_{2}^{2}+\epsilon_{i}\delta^{2}n\|\widehat{x}\|_{1}^{2}$
$\leq (1+\epsilon_{i})(1+\epsilon_{i-1})\|\widehat{x}_{\overline{S}_{i-1}}^{(i-1)}\|_{2}^{2}+((1+\epsilon_{i})\epsilon_{i-1}+\epsilon_{i})\delta^{2}n\|\widehat{x}\|_{1}^{2}$
$\leq \prod_{j=1}^{i}(1+\epsilon_{j})\|\widehat{x}_{\overline{S}}\|_{2}^{2}+\sum_{j=1}^{i}\epsilon_{j}\delta^{2}n\|\widehat{x}\|_{1}^{2}\prod_{l=j+1}^{i}(1+\epsilon_{l})$
$\leq 8(\|\widehat{x}_{\overline{S}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2}),$ (6)

where the first step comes from Lemma VI.4, the second step applies Lemma VI.4 again at step $i-1$, the third step recursively applies the same bound down to the first round (using $\widehat{x}^{(1)}=\widehat{x}$ and $\overline{S}_{1}=\overline{S}$), and the last step follows from a geometric sum.
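The geometric-sum step can be verified numerically: since $\epsilon_{j}=\epsilon(10\gamma)^{j}$ with $10\gamma\leq 1/100$, the product $\prod_{j}(1+\epsilon_{j})$ stays far below the constant $8$ even at the worst case $\epsilon\to 1$. A minimal check:

```python
import math

gamma, eps = 1 / 1000, 1.0   # eps at the top of its allowed range
prod = 1.0
for j in range(1, 200):
    prod *= 1 + eps * (10 * gamma) ** j

# 1 + x <= e^x, so prod <= exp(sum_j eps_j) = exp(eps/99) ~ 1.0102
assert prod <= math.exp(eps * (10 * gamma) / (1 - 10 * gamma)) + 1e-9
assert prod <= 8
```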

Proof of Final Error. We can bound the query error by:

x^Sz^(R+1)22=\displaystyle\|\widehat{x}_{S}-\widehat{z}^{(R+1)}\|_{2}^{2}= i=1Rx^Ti(i)w^(i)22\displaystyle~{}\sum_{i=1}^{R}\|\widehat{x}_{T_{i}}^{(i)}-\widehat{w}^{(i)}\|_{2}^{2}
\displaystyle\leq i=1Rkiμ(i)/20\displaystyle~{}\sum_{i=1}^{R}k_{i}\mu^{(i)}/20
\displaystyle\leq i=1Rϵi(x^S¯i(i)22+δ2nx^12)/20\displaystyle~{}\sum_{i=1}^{R}\epsilon_{i}(\|\widehat{x}_{\overline{S}_{i}}^{(i)}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2})/20
\displaystyle\leq i=1Rϵi10(x^S¯22+δ2nx^12)/20\displaystyle~{}\sum_{i=1}^{R}\epsilon_{i}\cdot 10(\|\widehat{x}_{\overline{S}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2})/20
\displaystyle\leq ϵ(x^S¯22+δ2nx^12).\displaystyle~{}\epsilon(\|\widehat{x}_{\overline{S}}\|_{2}^{2}+\delta^{2}n\|\widehat{x}\|_{1}^{2}).

where the first step follows from the fact that $T_{i}$ is well isolated (see Definition V.10) and $\widehat{w}^{(i)}=\widehat{z}^{(i+1)}-\widehat{z}^{(i)}$, the second step is by Eq. (5), the third step comes from the definition of $\mu^{(i)}$ in Eq. (3), the fourth step follows from Eq. (6), and the final step follows from the geometric sum, $\epsilon_{i}=\epsilon(10\gamma)^{i}$ and $\gamma\leq 1/1000$. ∎
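For concreteness, the geometric sum in the final step can be spelled out (using $\epsilon_{i}=\epsilon(10\gamma)^{i}$ and $\gamma\leq 1/1000$):

```latex
\sum_{i=1}^{R}\frac{10\,\epsilon_{i}}{20}
 = \frac{\epsilon}{2}\sum_{i=1}^{R}(10\gamma)^{i}
 \le \frac{\epsilon}{2}\cdot\frac{10\gamma}{1-10\gamma}
 \le \frac{\epsilon}{2}\cdot\frac{1}{99}
 \le \epsilon .
```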

VII Conclusion

Fourier transformation is an intensively researched topic in a variety of scientific disciplines. Numerous applications exist within machine learning, signal processing, compressed sensing, etc. In this paper, we study the problem of Fourier set query. With an approximation parameter ϵ\epsilon, a vector xnx\in\mathbb{C}^{n} and a query set S[n]S\subset[n] of size kk, our algorithm uses O(ϵ1klog(n/δ))O(\epsilon^{-1}k\log(n/\delta)) Fourier measurements, runs in O(ϵ1klog(n/δ))O(\epsilon^{-1}k\log(n/\delta)) time and outputs a vector xx^{\prime} such that (xx^)S22ϵx^S¯22+δx^12\|(x^{\prime}-\widehat{x})_{S}\|_{2}^{2}\leq\epsilon\|\widehat{x}_{\bar{S}}\|_{2}^{2}+\delta\|\widehat{x}\|_{1}^{2} with probability of at least 9/109/10.

References

  • [1] J. W. Cooley and J. W. Tukey, “An algorithm for the machine calculation of complex Fourier series,” Mathematics of computation, vol. 19, no. 90, pp. 297–301, 1965.
  • [2] D. G. Voelz, Computational fourier optics: a MATLAB tutorial.   SPIE press Bellingham, Washington, 2011.
  • [3] J. Goodman, Introduction to Fourier Optics.   W. H. Freeman, 2017. [Online]. Available: https://books.google.com/books?id=9zY8DwAAQBAJ
  • [4] A. M. Aibinu, M.-J. E. Salami, A. A. Shafie, and A. R. Najeeb, “Mri reconstruction using discrete fourier transform: a tutorial,” 2008.
  • [5] G. O. Reynolds, The New Physical Optics Notebook: Tutorials in Fourier Optics.   ERIC, 1989.
  • [6] H. Hassanieh, P. Indyk, D. Katabi, and E. Price, “Nearly optimal sparse fourier transform,” in Proceedings of the forty-fourth annual ACM symposium on Theory of computing.   ACM, 2012, pp. 563–578.
  • [7] B. Boashash, Time-frequency signal analysis and processing: a comprehensive reference.   Academic press, 2015.
  • [8] E. Price, “Efficient sketches for the set query problem,” in Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms.   Society for Industrial and Applied Mathematics, 2011, pp. 41–56.
  • [9] M. Kapralov, “Sample efficient estimation and recovery in sparse fft via isolation on average,” in Foundations of Computer Science, 2017. FOCS’17. IEEE 58th Annual IEEE Symposium on.   https://arxiv.org/pdf/1708.04544, 2017.
  • [10] A. C. Gilbert, S. Muthukrishnan, and M. Strauss, “Improved time bounds for near-optimal sparse Fourier representations,” in Optics & Photonics 2005.   International Society for Optics and Photonics, 2005, pp. 59 141A–59 141A.
  • [11] H. Hassanieh, P. Indyk, D. Katabi, and E. Price, “Simple and practical algorithm for sparse Fourier transform,” in Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms.   SIAM, 2012, pp. 1183–1194.
  • [12] M. A. Iwen, “Improved approximation guarantees for sublinear-time Fourier algorithms,” Applied And Computational Harmonic Analysis, vol. 34, no. 1, pp. 57–82, 2013.
  • [13] P. Indyk, M. Kapralov, and E. Price, “(Nearly) Sample-optimal sparse Fourier transform,” in Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms.   SIAM, 2014, pp. 480–499.
  • [14] P. Indyk and M. Kapralov, “Sample-optimal fourier sampling in any constant dimension,” in Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on.   IEEE, 2014, pp. 514–523.
  • [15] M. Kapralov, “Sparse Fourier transform in any constant dimension with nearly-optimal sample complexity in sublinear time,” in Symposium on Theory of Computing Conference, STOC’16, Cambridge, MA, USA, June 19-21, 2016, 2016.
  • [16] V. Nakos, Z. Song, and Z. Wang, “(nearly) sample-optimal sparse fourier transform in any dimension; ripless and filterless,” in 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS).   IEEE, 2019, pp. 1568–1577.
  • [17] E. J. Candes, J. K. Romberg, and T. Tao, “Stable signal recovery from incomplete and inaccurate measurements,” Communications on pure and applied mathematics, vol. 59, no. 8, pp. 1207–1223, 2006.
  • [18] M. Rudelson and R. Vershynin, “On sparse reconstruction from fourier and gaussian measurements,” Communications on Pure and Applied Mathematics, vol. 61, no. 8, pp. 1025–1045, 2008.
  • [19] J. Bourgain, “An improved estimate in the restricted isometry problem,” in Geometric Aspects of Functional Analysis.   Springer, 2014, pp. 65–70.
  • [20] L. Shi, O. Andronesi, H. Hassanieh, B. Ghazi, D. Katabi, and E. Adalsteinsson, “Mrs sparse-fft: Reducing acquisition time and artifacts for in vivo 2d correlation spectroscopy,” in ISMRM13, Int. Society for Magnetic Resonance in Medicine Annual Meeting and Exhibition, 2013.
  • [21] E. Price and Z. Song, “A robust sparse Fourier transform in the continuous setting,” in Foundations of Computer Science (FOCS), 2015 IEEE 56th Annual Symposium on.   IEEE, 2015, pp. 583–600.
  • [22] Y. Jin, D. Liu, and Z. Song, “A robust multi-dimensional sparse fourier transform in the continuous setting,” arXiv preprint arXiv:2005.06156, 2020.
  • [23] X. Chen, D. M. Kane, E. Price, and Z. Song, “Fourier-sparse interpolation without a frequency gap,” in Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on.   IEEE, 2016, pp. 741–750.
  • [24] Z. Song, B. Sun, O. Weinstein, and R. Zhang, “Sparse fourier transform over lattices: A unified approach to signal reconstruction.”   http://arxiv.org/abs/2205.00658, 2022.
  • [25] C. D. McGillem and G. R. Cooper, Continuous and discrete signal and system analysis.   Harcourt School, 1991.
  • [26] J. G. Proakis, Digital signal processing: principles algorithms and applications.   Pearson Education India, 2001.
  • [27] F. G. Friedlander, M. S. Joshi, M. Joshi, and M. C. Joshi, Introduction to the Theory of Distributions.   Cambridge University Press, 1998.
  • [28] D. I. Hoult and B. Bhakar, “Nmr signal reception: Virtual photons and coherent spontaneous emission,” Concepts in Magnetic Resonance: An Educational Journal, vol. 9, no. 5, pp. 277–297, 1997.
  • [29] I. I. Rabi, J. R. Zacharias, S. Millman, and P. Kusch, “A new method of measuring nuclear magnetic moment,” Physical review, vol. 53, no. 4, p. 318, 1938.
  • [30] K. Schmidt-Rohr and H. W. Spiess, Multidimensional solid-state NMR and polymers.   Elsevier, 2012.
  • [31] P. R. Griffiths, “Fourier transform infrared spectrometry,” Science, vol. 222, no. 4621, pp. 297–302, 1983.
  • [32] M. M. Wilde, Quantum information theory.   Cambridge University Press, 2013.
  • [33] P. J. Schreier and L. L. Scharf, Statistical signal processing of complex-valued data: the theory of improper and noncircular signals.   Cambridge university press, 2010.
  • [34] L. L. Scharf and C. Demeure, Statistical signal processing: detection, estimation, and time series analysis.   Prentice Hall, 1991.