Sublinear Time Nearest Neighbor Search over Generalized Weighted Manhattan Distance
Abstract
Nearest Neighbor Search (NNS) over generalized weighted distances is fundamental to a wide range of applications. The problem of NNS over the generalized weighted square Euclidean distance has been studied in previous work. However, numerous studies have shown that the Manhattan distance could be more effective than the Euclidean distance for high-dimensional NNS, which indicates that the generalized weighted Manhattan distance is possibly more practical than the generalized weighted square Euclidean distance in high dimensions. To the best of our knowledge, no prior work solves the problem of NNS over the generalized weighted Manhattan distance in sublinear time. This paper achieves that goal by proposing two novel hashing schemes, $(d_w^{l_1}, l_2)$-ALSH and $(d_w^{l_1}, \theta)$-ALSH, where $d_w^{l_1}$ denotes the generalized weighted Manhattan distance.
1 Introduction
Nearest Neighbor Search (NNS) over generalized weighted distances is fundamental to a wide variety of applications, such as personalized recommendation [11, 14, 18] and kNN classification [3, 19]. Given a set of data points $D \subset \mathbb{R}^d$ and a query point $q \in \mathbb{R}^d$ with a weight vector $w \in \mathbb{R}^d$, NNS over a generalized weighted distance, denoted by $d_w(\cdot, \cdot)$, is to find the point $o^* \in D$ that is closest to $q$ under $d_w$. Formally, the goal of NNS over $d_w$ is to return
$o^* = \arg\min_{o \in D} d_w(o, q).$   (1)
Note that the weight vector $w$ is specified along with the query $q$ rather than pre-specified. Moreover, each element of $w$ can be either positive or non-positive.
The generalized weighted Manhattan distance, denoted by $d_w^{l_1}$, and the generalized weighted square Euclidean distance, denoted by $d_w^{l_2^2}$, are two typical generalized weighted distances, derived from the Manhattan distance and the Euclidean distance, respectively. For any two points $x, q \in \mathbb{R}^d$, the distances $d_w^{l_1}(x, q)$ and $d_w^{l_2^2}(x, q)$ are respectively computed as follows:
$d_w^{l_1}(x, q) = \sum_{i=1}^{d} w_i\,|x_i - q_i|, \qquad d_w^{l_2^2}(x, q) = \sum_{i=1}^{d} w_i\,(x_i - q_i)^2,$   (2)
where $w = (w_1, w_2, \ldots, w_d)$. A recent paper [16] studied the problem of NNS over $d_w^{l_2^2}$ and provided two sublinear time solutions for it. However, to the best of our knowledge, there is no prior work that solves the problem of NNS over $d_w^{l_1}$ in sublinear time. In fact, plenty of studies [1, 12] have shown that the Manhattan distance can be more effective than the Euclidean distance for producing meaningful NNS results in high-dimensional spaces, which indicates that NNS over $d_w^{l_1}$ is possibly more practical than NNS over $d_w^{l_2^2}$ in many real scenarios. In this paper, we aim to propose sublinear time methods for efficiently solving the problem of NNS over $d_w^{l_1}$.
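As a concrete instance of Equation 2, take $x = (1, 3)$, $q = (2, 1)$ and $w = (0.5, -1)$: then $d_w^{l_1}(x, q) = 0.5 \cdot |1 - 2| + (-1) \cdot |3 - 1| = -1.5$, which illustrates that the generalized weighted Manhattan distance can even be negative when some weights are non-positive.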
As a matter of fact, existing methods cannot handle NNS over $d_w^{l_1}$ well. Specifically, the brute-force linear scan scales linearly with the data size and thus may yield unsatisfactory performance. The conventional spatial index-based methods [2, 6, 22] only perform well for NNS in low dimensions due to the "curse of dimensionality" [5]. Locality-Sensitive Hashing (LSH) [27] is a popular approach for approximate NNS and exhibits good performance in high-dimensional cases. In the literature, a number of efficient LSH schemes [7, 9, 10, 13, 15, 17, 26, 28] have been proposed based on LSH families, and some of them can answer NNS queries even in sublinear time. Unfortunately, they cannot be applied to answer NNS queries over $d_w^{l_1}$ unless $w$ is fixed to an all-1 vector.
Recently, Asymmetric Locality-Sensitive Hashing (ALSH) was extended from LSH so that the problems of Maximum Inner Product Search (MIPS) and NNS over $d_w^{l_2^2}$ can be addressed in sublinear time [16, 20, 21, 23, 24]. An ALSH scheme relies on an ALSH family. As far as we know, no ALSH family has been proposed for NNS over $d_w^{l_1}$ in previous work. To provide sublinear time solutions for NNS over $d_w^{l_1}$ in this paper, we follow the ALSH approach and propose ALSH schemes built on ALSH families that are suitable for NNS over $d_w^{l_1}$.
Outline. In Section 2, we review the approaches of LSH and ALSH. In Section 3, we show that there is no LSH or ALSH family for NNS over $d_w^{l_1}$ over the entire space $\mathbb{R}^d$, and that there is no LSH family for NNS over $d_w^{l_1}$ over bounded spaces in $\mathbb{R}^d$. Then we seek ALSH families for NNS over $d_w^{l_1}$ over bounded spaces in $\mathbb{R}^d$. As a result, we propose two suitable ALSH families and further obtain two sublinear time ALSH schemes, $(d_w^{l_1}, l_2)$-ALSH and $(d_w^{l_1}, \theta)$-ALSH, in Section 4.
2 Preliminaries
Before introducing our proposed solutions to the problem of NNS over $d_w^{l_1}$, we first present the preliminaries on LSH and ALSH.
2.1 Locality-Sensitive Hashing
Let $dist(\cdot, \cdot)$ be a distance function and $\mathcal{X} \times \mathcal{Y}$ be the space where $dist$ is defined. Assume that data points and query points are located in $\mathcal{X}$ and $\mathcal{Y}$, respectively. Then, an LSH family is formally defined as follows.
Definition 1 (LSH Family)
An LSH family $\mathcal{H} = \{h : \mathcal{X} \cup \mathcal{Y} \to U\}$ is called $(r_1, r_2, p_1, p_2)$-sensitive if for any $x \in \mathcal{X}$ and $q \in \mathcal{Y}$, the following conditions are satisfied:
• If $dist(x, q) \le r_1$, then $\Pr[h(x) = h(q)] \ge p_1$;
• If $dist(x, q) \ge r_2$, then $\Pr[h(x) = h(q)] \le p_2$;
• $r_1 < r_2$ and $p_1 > p_2$.
As we can see from Definition 1, an LSH family is essentially a set of hash functions that hash closer points into the same bucket with higher probability. Thus, the basic idea of an LSH scheme is to use an LSH family to hash points such that only the data points that have the same hash code as the query point are likely to be retrieved to find approximate nearest neighbors. In the following, we review two popular LSH families that were proposed for the $l_2$ distance (a.k.a. the Euclidean distance) and the Angular distance, respectively.
The $l_2$ distance between any two points $x, y \in \mathbb{R}^d$ is computed as $l_2(x, y) = \|x - y\|_2$, where $\|\cdot\|_2$ is the $l_2$-norm of a vector. The LSH family proposed for the $l_2$ distance in [7] is $\mathcal{H}_{l_2} = \{h_{a,b}^{l_2}\}$, where
$h_{a,b}^{l_2}(x) = \left\lfloor \frac{a \cdot x + b}{r} \right\rfloor.$   (3)
Here, $a$ is a $d$-dimensional vector where each element is chosen independently from the standard normal distribution, $b$ is a real number chosen uniformly at random from $[0, r]$, and $r$ is a user-specified positive constant. Let $s = l_2(x, y)$. The collision probability function is
$p^{l_2}(s) = \Pr[h_{a,b}^{l_2}(x) = h_{a,b}^{l_2}(y)] = 1 - 2\Phi(-r/s) - \frac{2s}{\sqrt{2\pi}\,r}\left(1 - e^{-r^2/(2s^2)}\right),$   (4)
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution [7].
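For concreteness, here is a minimal Python/NumPy sketch of one hash function from this family; the function names and parameter choices are illustrative, not from the paper.

```python
import numpy as np

def make_l2_lsh(dim, r, rng=np.random.default_rng(0)):
    """Sample one hash function h(x) = floor((a.x + b) / r) from the l2 LSH family [7]."""
    a = rng.standard_normal(dim)   # each element drawn from the standard normal distribution
    b = rng.uniform(0.0, r)        # offset chosen uniformly at random from [0, r]
    return lambda x: int(np.floor((a @ x + b) / r))

# Example: nearby points usually land in the same bucket, distant ones usually do not.
h = make_l2_lsh(dim=8, r=4.0)
x = np.ones(8)
print(h(x), h(x + 0.1), h(x + 100.0))
```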
The Angular distance between any two points $x$ and $y$ is computed as $\theta(x, y) = \arccos\left(\frac{x \cdot y}{\|x\|_2 \|y\|_2}\right)$. The LSH family proposed for the Angular distance in [4] is $\mathcal{H}_{\theta} = \{h_{a}^{\theta}\}$, where
$h_{a}^{\theta}(x) = \mathrm{sign}(a \cdot x),$   (5)
and $a$ is a $d$-dimensional vector where each element is chosen independently from the standard normal distribution. Let $s = \theta(x, y)$. The collision probability function is
$p^{\theta}(s) = \Pr[h_{a}^{\theta}(x) = h_{a}^{\theta}(y)] = 1 - \frac{s}{\pi}.$   (6)
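Similarly, a minimal sketch of this family, with a Monte Carlo check of the collision probability in Equation 6; the names and the sample size are illustrative.

```python
import numpy as np

def make_angular_lsh(dim, rng=np.random.default_rng(0)):
    """Sample one hash function h(x) = sign(a.x) from the Angular-distance LSH family [4]."""
    a = rng.standard_normal(dim)
    return lambda x: 1 if a @ x >= 0 else -1

# Empirical collision rate should be close to 1 - theta/pi (Equation 6).
rng = np.random.default_rng(1)
x, y = rng.standard_normal(16), rng.standard_normal(16)
theta = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
hashes = [make_angular_lsh(16, np.random.default_rng(s)) for s in range(2000)]
emp = np.mean([h(x) == h(y) for h in hashes])
print(emp, 1 - theta / np.pi)
```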
2.2 Asymmetric Locality-Sensitive Hashing
Recent studies have shown that ALSH is an effective approach for solving the problems of MIPS and NNS over $d_w^{l_2^2}$ [16, 21, 23, 24]. An ALSH scheme processes NNS queries in a similar way to an LSH scheme and relies on an ALSH family. Formally, the definition of an ALSH family is as follows.
Definition 2 (ALSH Family)
An ALSH family $\mathcal{H} = \{(f, g)\}$ is called $(r_1, r_2, p_1, p_2)$-sensitive if for any data point $x \in \mathcal{X}$ and query point $q \in \mathcal{Y}$, the following conditions are satisfied:
• If $dist(x, q) \le r_1$, then $\Pr[f(x) = g(q)] \ge p_1$;
• If $dist(x, q) \ge r_2$, then $\Pr[f(x) = g(q)] \le p_2$;
• $r_1 < r_2$ and $p_1 > p_2$.
From Definition 2, we can see that an ALSH family consists of a set of hash functions $f$ for data points and a set of hash functions $g$ for query points, and it ensures that each query point collides with closer data points with higher probability. In practice, $(f, g)$ is often implemented with an LSH family $\{h\}$ and two vector functions $P(\cdot)$ and $Q(\cdot)$, called the Preprocessing Transformation and the Query Transformation, respectively [16, 21, 23, 24]. Thus, the hash value of each data point $x$ is computed as $f(x) = h(P(x))$ and the hash value of each query point $q$ is computed as $g(q) = h(Q(q))$, as sketched below.
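The asymmetric pattern can be sketched as follows; P_example and Q_example are toy placeholder transformations (not the ones proposed later in this paper), included only to show how a single LSH function h is shared between the data side and the query side.

```python
import numpy as np

def alsh_pair(h, P, Q):
    """Build the asymmetric pair (f, g): data points are hashed as h(P(x)),
    query points as h(Q(q)), with the same underlying LSH function h."""
    f = lambda x: h(P(x))   # preprocessing transformation, applied offline to the data
    g = lambda q: h(Q(q))   # query transformation, applied at query time
    return f, g

# Illustrative placeholders only: pad data and query vectors differently.
P_example = lambda x: np.concatenate([x, [np.dot(x, x)]])
Q_example = lambda q: np.concatenate([q, [0.0]])
```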
Fundamentally, both LSH and ALSH schemes obtain approximate nearest neighbors by efficiently solving the $(r_1, r_2)$-Near Neighbor Search ($(r_1, r_2)$-NNS) problem defined as follows.
Definition 3 ($(r_1, r_2)$-NNS)
Given a distance function $dist(\cdot, \cdot)$, two distance thresholds $r_1$ and $r_2$ ($r_1 < r_2$) and a data set $D$, for any query point $q$, the $(r_1, r_2)$-NNS problem is to return a point $o' \in D$ satisfying $dist(o', q) \le r_2$ if there exists a point $o \in D$ satisfying $dist(o, q) \le r_1$.
The theorem below indicates that the $(r_1, r_2)$-NNS problem can be solved with an LSH or ALSH scheme in sublinear time.

Theorem 1 (see, e.g., [15, 21])

Given an $(r_1, r_2, p_1, p_2)$-sensitive LSH or ALSH family, there exists a scheme that solves the $(r_1, r_2)$-NNS problem over $n$ data points with $O(n^{\rho} \log n)$ query time and $O(n^{1+\rho})$ space, where $\rho = \frac{\ln(1/p_1)}{\ln(1/p_2)} < 1$.
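For intuition, consider a hypothetical family (not one from this paper) with $p_1 = 1/2$ and $p_2 = 1/4$: then $\rho = \ln(1/p_1)/\ln(1/p_2) = \ln 2/\ln 4 = 0.5$, so the query time is $O(\sqrt{n}\log n)$, which is sublinear in the data size $n$.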
3 Negative Results
In this section, we present some negative results on the existence of LSH and ALSH families for NNS over $d_w^{l_1}$.
The following theorem indicates that it is impossible to find an LSH or ALSH family for NNS over $d_w^{l_1}$ over the entire space $\mathbb{R}^d$ ($d \ge 3$).
Theorem 2
For any $r_1 < r_2$, $p_1 > p_2$ and $d \ge 3$, there is no $(r_1, r_2, p_1, p_2)$-sensitive LSH or ALSH family for NNS over $d_w^{l_1}$ over $\mathbb{R}^d$.
Proof. An LSH (or ALSH) family for NNS over $d_w^{l_1}$ over $\mathbb{R}^d$ ($d \ge 3$) is also an LSH (or ALSH) family for NNS over $d_w^{l_1}$ over a three-dimensional subspace, i.e., over $\mathbb{R}^3$. Hence, we only need to prove that there is no LSH or ALSH family for NNS over $d_w^{l_1}$ over $\mathbb{R}^3$. Assume by contradiction that for some $r_1 < r_2$ and $p_1 > p_2$ there exists an $(r_1, r_2, p_1, p_2)$-sensitive LSH family or ALSH family for NNS over $d_w^{l_1}$ over $\mathbb{R}^3$. Consider a set of $n$ data points $\{x_1, \ldots, x_n\}$ and a set of $n$ query points $\{q_1, \ldots, q_n\}$, where for $1 \le i \le n$,
(7)
The weight vector $w_j$ specified along with each query point $q_j$ is set as follows:
(8)
Thus, the distance $d_{w_j}^{l_1}(x_i, q_j)$ is determined for every $1 \le i, j \le n$. As can be seen, $d_{w_j}^{l_1}(x_i, q_j) \le r_1$ if $i \le j$ and $d_{w_j}^{l_1}(x_i, q_j) \ge r_2$ if $i > j$. Let $S$ be the $n \times n$ sign matrix where each element is

$S_{ij} = +1$ if $d_{w_j}^{l_1}(x_i, q_j) \le r_1$, and $S_{ij} = -1$ if $d_{w_j}^{l_1}(x_i, q_j) \ge r_2$.   (9)

Obviously, $S$ is triangular with +1 on and above the diagonal and -1 below it. Consider also the $n \times n$ matrix $P$ of collision probabilities, $P_{ij} = \Pr[h(x_i) = h(q_j)]$ (for an LSH family) or $P_{ij} = \Pr[f(x_i) = g(q_j)]$ (for an ALSH family). Let $t = \frac{p_1 + p_2}{2}$ and $\epsilon = \frac{p_1 - p_2}{2}$. It is easy to see that $S_{ij}(P_{ij} - t) \ge \epsilon$ for every $1 \le i, j \le n$. That is,

$S \circ (P - tJ) \ge \epsilon J,$   (10)

where $\circ$ denotes the Hadamard (element-wise) product and $J$ is the $n \times n$ all-ones matrix. From [25], the margin complexity of the sign matrix $S$ is $mc(S) = \min_{X : S \circ X \ge J} \|X\|_{\max}$, where $\|\cdot\|_{\max}$ is the max-norm of a matrix. Since $S$ is an $n \times n$ triangular matrix, the margin complexity of $S$ is $\Omega(\log n)$ according to [8]. Therefore, from Equation 10, we can obtain

$mc(S) \le \frac{1}{\epsilon}\,\|P - tJ\|_{\max}.$   (11)

Since $P$ is a collision probability matrix, the max-norm of $P$ satisfies $\|P\|_{\max} \le 1$ [20]. Shifting $P$ by $tJ$ changes the max-norm by at most $t \le 1$. Thus, we have

$\|P - tJ\|_{\max} \le 2.$   (12)

Combining Equations 11 and 12, we can easily derive that $\Omega(\log n) \le \frac{4}{p_1 - p_2}$. For any fixed $p_1 > p_2$, we get a contradiction by selecting a large enough $n$.
The proof of Theorem 2 is similar to that of Theorem 3.1 in [21]. Due to space limitations, for the details of the max-norm and margin complexity involved in the proof of Theorem 2, please refer to http://proceedings.mlr.press/v37/neyshabur15-supp.pdf.
Actually, in real scenarios data points and query points are usually located in bounded spaces. Consider the typical case of $\mathcal{X} = \mathcal{Y} = \mathcal{B}$, where $\mathcal{B} \subset \mathbb{R}^d$ is a bounded space. The following theorem shows the nonexistence of an LSH family for NNS over $d_w^{l_1}$ over $\mathcal{B}$.
Theorem 3
For any $r_1 < r_2$ and $p_1 > p_2$, there is no $(r_1, r_2, p_1, p_2)$-sensitive LSH family for NNS over $d_w^{l_1}$ over $\mathcal{B}$.
Proof. Assume by contradiction that for some $r_1 < r_2$ and $p_1 > p_2$ there exists an $(r_1, r_2, p_1, p_2)$-sensitive LSH family for NNS over $d_w^{l_1}$ over $\mathcal{B}$. Let $x$ and $y$ be two distinct points in $\mathcal{B}$ (we ignore the trivial case that $\mathcal{B}$ contains only a single point). As $x \neq y$, we can always set $w$ to a value such that $d_w^{l_1}(x, y) \le r_1$ and thus $\Pr[h(x) = h(y)] \ge p_1$. Moreover, we can always set $w$ to another value such that $d_w^{l_1}(x, y) \ge r_2$ and thus $\Pr[h(x) = h(y)] \le p_2$. However, since data points should be hashed before queries arrive, $h$ cannot involve $w$. So $\Pr[h(x) = h(y)]$ is not affected by $w$, which leads to a contradiction because $p_1 > p_2$.
Due to the negative results in Theorems 2 and 3, we seek to propose ALSH families for NNS over $d_w^{l_1}$ over bounded spaces in Section 4. Notice that if an ALSH family is suitable for NNS over $d_w^{l_1}$ over a bounded space $\mathcal{B}$, it must also be suitable for NNS over $d_w^{l_1}$ over any bounded space contained in $\mathcal{B}$. Thus, it is sufficient to deal with the case where the bounded space contains all data and query points. Further, suppose that all point coordinates are non-negative; otherwise, this can be satisfied by shifting all points by the same vector without changing the results of NNS over $d_w^{l_1}$.
4 Our Solutions
Let $\mathcal{N}_M = \{0, 1, \ldots, M\}^d$ ($M$ is a positive integer). The following Observation 1 indicates that if we find an ALSH family for NNS over $d_w^{l_1}$ over $\mathcal{N}_M$, a similar ALSH family can be immediately obtained for NNS over $d_w^{l_1}$ over other bounded spaces. Thus, we only need to consider NNS over $d_w^{l_1}$ over $\mathcal{N}_M$ in the rest of the paper. Note that $M$ in our solutions can be an arbitrary positive integer.
Observation 1
Define a vector function $\phi(\cdot)$ that rescales each point of a bounded space into $\mathcal{N}_M$. For any $r_1 < r_2$ and $p_1 > p_2$, if $\mathcal{H} = \{(f, g)\}$ is an $(r_1, r_2, p_1, p_2)$-sensitive ALSH family for NNS over $d_w^{l_1}$ over $\mathcal{N}_M$, then $\{(f \circ \phi, g \circ \phi)\}$ must be an $(r_1', r_2', p_1, p_2)$-sensitive ALSH family for NNS over $d_w^{l_1}$ over the bounded space, where $r_1'$ and $r_2'$ are the thresholds $r_1$ and $r_2$ rescaled accordingly.
Proof. Let $x$ and $q$ be a data point and a query point in the bounded space. A simple calculation shows that $d_w^{l_1}(\phi(x), \phi(q))$ is proportional to $d_w^{l_1}(x, q)$. Thus, $d_w^{l_1}(\phi(x), \phi(q)) \le r_1$ if $d_w^{l_1}(x, q) \le r_1'$ and $d_w^{l_1}(\phi(x), \phi(q)) \ge r_2$ if $d_w^{l_1}(x, q) \ge r_2'$. Further, since $\mathcal{H}$ is $(r_1, r_2, p_1, p_2)$-sensitive over $\mathcal{N}_M$, we have $\Pr[f(\phi(x)) = g(\phi(q))] \ge p_1$ if $d_w^{l_1}(x, q) \le r_1'$ and $\Pr[f(\phi(x)) = g(\phi(q))] \le p_2$ if $d_w^{l_1}(x, q) \ge r_2'$. As a result, $\{(f \circ \phi, g \circ \phi)\}$ is an $(r_1', r_2', p_1, p_2)$-sensitive ALSH family for NNS over $d_w^{l_1}$ over the bounded space (note: $r_1' < r_2'$ always holds since $r_1 < r_2$).
4.1 From NNS over $d_w^{l_1}$ to MIPS
In the following, we take two steps to convert the problem of NNS over $d_w^{l_1}$ into the problem of MIPS. As a result of these steps, a novel preprocessing transformation and a novel query transformation are introduced for data points and query points, respectively. The two transformations are essential to our solutions.
Step 1: Convert NNS over $d_w^{l_1}$ into NNS over $d_w^{H}$
The generalized weighted Hamming distance, denoted by $d_w^{H}$, is defined on the Hamming space and computed in the same way as the generalized weighted Manhattan distance $d_w^{l_1}$. That is, $d_w^{H}(y, z) = \sum_{i} w_i\,|y_i - z_i|$ for any $y, z \in \{0, 1\}^{d'}$ and $w \in \mathbb{R}^{d'}$.
Inspired by [10], we complete this step by applying unary coding. Specifically, each point $x = (x_1, x_2, \ldots, x_d) \in \mathcal{N}_M$ is mapped into a binary vector $u(x) = (u(x_1); u(x_2); \ldots; u(x_d))$, where (;) is the concatenation and each $u(x_i)$ is the unary representation of $x_i$, i.e., a sequence of $x_i$ 1's followed by $(M - x_i)$ 0's. Then $u(x) \in \{0, 1\}^{dM}$ for any $x \in \mathcal{N}_M$. Moreover, the weight vector $w = (w_1, w_2, \ldots, w_d)$ is mapped into $\bar{w} = (\bar{w}_1; \bar{w}_2; \ldots; \bar{w}_d)$, where each $\bar{w}_i$ is a sequence of $M$ copies of $w_i$. As a result, we have
$d_w^{l_1}(x, q) = d_{\bar{w}}^{H}(u(x), u(q)),$   (13)
where $x, q \in \mathcal{N}_M$, $w \in \mathbb{R}^d$, $\bar{w} \in \mathbb{R}^{dM}$ and $u(x), u(q) \in \{0, 1\}^{dM}$. Equation 13 indicates that through the above mappings NNS over $d_w^{l_1}$ over $\mathcal{N}_M$ is converted into NNS over $d_{\bar{w}}^{H}$ over $\{0, 1\}^{dM}$.
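A small Python sketch of the unary-coding step; the helper names are illustrative, and the assertion checks the equality stated in Equation 13 on a random instance.

```python
import numpy as np

def unary(x, M):
    """Map an integer vector x in {0,...,M}^d to u(x) in {0,1}^(d*M):
    each coordinate x_i becomes x_i ones followed by (M - x_i) zeros."""
    return np.concatenate([[1] * int(xi) + [0] * int(M - xi) for xi in x])

def repeat_weights(w, M):
    """Map w in R^d to w_bar in R^(d*M) by repeating each w_i M times."""
    return np.repeat(w, M)

# Check Equation 13 on a random instance (domain assumed to be {0,...,M}^d).
rng = np.random.default_rng(0)
d, M = 5, 7
x, q = rng.integers(0, M + 1, d), rng.integers(0, M + 1, d)
w = rng.standard_normal(d)
lhs = np.sum(w * np.abs(x - q))                                        # d_w^{l1}(x, q)
rhs = np.sum(repeat_weights(w, M) * np.abs(unary(x, M) - unary(q, M)))  # weighted Hamming
assert np.isclose(lhs, rhs)
```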
Step 2: Convert NNS over $d_{\bar{w}}^{H}$ into MIPS
This step is based on the following observation.
Observation 2
For any $y, z \in \{0, 1\}$, the equation $|y - z| = y(1 - z) + (1 - y)z$ always holds.
Proof. We only need to check two cases. Case 1: If $y = z$, then $|y - z| = 0 = y(1 - z) + (1 - y)z$. Case 2: If $y \ne z$, then $|y - z| = 1 = y(1 - z) + (1 - y)z$.
For any $y \in \{0, 1\}^{dM}$ and $\bar{w} \in \mathbb{R}^{dM}$, we define the preprocessing and query vector functions as follows:

(14)
(15)

where

(16)
(17)

According to Observation 2, we have

(18)

where $\langle \cdot, \cdot \rangle$ is the inner product of two vectors, and the transformed data and query vectors in Equation 18 are respectively as follows:

(19)
(20)

From Equation 18, we can see that NNS over $d_{\bar{w}}^{H}$ over $\{0, 1\}^{dM}$ can be converted into MIPS over the transformed data and query vectors.
To sum up, after Steps 1 and 2, we convert NNS over $d_w^{l_1}$ into MIPS by using two composite functions that respectively map data points and query points from $\mathcal{N}_M$ into two higher-dimensional spaces. Let $P(\cdot)$ denote the composite function for data points and $Q(\cdot)$ the one for query points. The vector functions $P(\cdot)$ and $Q(\cdot)$ are respectively the preprocessing and query transformations for the ALSH families introduced later.
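Because Equations 14-20 are not reproduced above, the following sketch shows one concrete way Steps 1 and 2 can be combined; the transformations P_vec and Q_vec below are an assumption built on the binary identity in Observation 2, not necessarily the paper's exact definitions.

```python
import numpy as np

def unary(x, M):
    """Unary-code an integer vector: x_i ones then (M - x_i) zeros per coordinate."""
    return np.concatenate([[1] * int(xi) + [0] * int(M - xi) for xi in x])

def P_vec(x, M):
    """Assumed preprocessing transformation: u(x) concatenated with its complement."""
    ux = unary(x, M)
    return np.concatenate([ux, 1 - ux])

def Q_vec(q, w, M):
    """Assumed query transformation: weighted parts of u(q), negated, so that
    <P_vec(x), Q_vec(q, w)> = -d_w^{l1}(x, q)."""
    uq, wbar = unary(q, M), np.repeat(w, M)
    return -np.concatenate([wbar * (1 - uq), wbar * uq])

# Minimizing d_w^{l1}(x, q) is then the same as maximizing the inner product,
# i.e., a Maximum Inner Product Search (MIPS) instance.
rng = np.random.default_rng(0)
d, M = 4, 5
x, q, w = rng.integers(0, M + 1, d), rng.integers(0, M + 1, d), rng.standard_normal(d)
assert np.isclose(P_vec(x, M) @ Q_vec(q, w, M), -np.sum(w * np.abs(x - q)))
```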
4.2 ALSH Schemes for NNS over $d_w^{l_1}$
Next, we formally present two ALSH schemes for NNS over $d_w^{l_1}$: the first one is called $(d_w^{l_1}, l_2)$-ALSH and the second one is called $(d_w^{l_1}, \theta)$-ALSH. $(d_w^{l_1}, l_2)$-ALSH solves the problem of NNS over $d_w^{l_1}$ by reducing it to the problem of NNS over the $l_2$ distance, while $(d_w^{l_1}, \theta)$-ALSH solves it by reducing it to the problem of NNS over the Angular distance.
4.2.1 $(d_w^{l_1}, l_2)$-ALSH
Based on the transformations $P(\cdot)$ and $Q(\cdot)$ and the LSH family $\mathcal{H}_{l_2}$ introduced in Section 2.1, $(d_w^{l_1}, l_2)$-ALSH uses the ALSH family $\mathcal{H}_1 = \{(f, g)\}$, where $f(x) = h_{a,b}^{l_2}(P(x))$ and $g(q) = h_{a,b}^{l_2}(Q(q))$ for $h_{a,b}^{l_2} \in \mathcal{H}_{l_2}$. Combining Equations 13 and 18 we obtain
(21)
It is easy to know
(22)
(23) |
Thus, we have
(24) |
Let $s = l_2(P(x), Q(q))$. According to Equations 4 and 24, the collision probability function with respect to $d_w^{l_1}(x, q)$ is
(25) |
Since $p^{l_2}(\cdot)$ is a decreasing function, the collision probability in Equation 25 decreases as $d_w^{l_1}(x, q)$ increases, and hence $p_1 > p_2$ holds for any $r_1 < r_2$, where $p_1$ and $p_2$ denote the collision probabilities in Equation 25 at distances $r_1$ and $r_2$. Therefore, we obtain the following Lemma 1.
Lemma 1
$\mathcal{H}_1$ is $(r_1, r_2, p_1, p_2)$-sensitive for any $r_1 < r_2$, where $p_1$ and $p_2$ are the collision probabilities given by Equation 25 at distances $r_1$ and $r_2$, respectively.
Theorem 4
$(d_w^{l_1}, l_2)$-ALSH can solve the problem of $(r_1, r_2)$-NNS over $d_w^{l_1}$ with $O(n^{\rho_1} \log n)$ query time and $O(n^{1+\rho_1})$ space, where $\rho_1 = \frac{\ln(1/p_1)}{\ln(1/p_2)} < 1$ and $p_1$, $p_2$ are as in Lemma 1.
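To make the reduction concrete, here is a small hedged sketch that inlines the assumed transformations from the Section 4.1 sketch (so the snippet is self-contained); it checks numerically that the $l_2$ distance after transformation never decreases as $d_w^{l_1}(x, q)$ grows, which is the kind of monotone relationship the scheme relies on.

```python
import numpy as np

rng = np.random.default_rng(2)
d, M = 4, 6
u = lambda v: (np.arange(M) < np.asarray(v)[:, None]).ravel().astype(float)  # unary coding
P = lambda x: np.concatenate([u(x), 1 - u(x)])                    # assumed data transform
Q = lambda q, w: -np.concatenate([np.repeat(w, M) * (1 - u(q)), np.repeat(w, M) * u(q)])

q, w = rng.integers(0, M + 1, d), rng.standard_normal(d)
pairs = sorted((np.sum(w * np.abs(x - q)), np.linalg.norm(P(x) - Q(q, w)))
               for x in (rng.integers(0, M + 1, d) for _ in range(200)))
l2s = [l2 for _, l2 in pairs]
# Sorted by d_w^{l1}(x, q), the l2 distances after transformation never decrease,
# so the l2 LSH family of Section 2.1 can be applied to the transformed vectors.
assert all(a <= b + 1e-9 for a, b in zip(l2s, l2s[1:]))
```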
4.2.2 $(d_w^{l_1}, \theta)$-ALSH
Now we introduce the scheme of $(d_w^{l_1}, \theta)$-ALSH. Based on the transformations $P(\cdot)$ and $Q(\cdot)$ and the LSH family $\mathcal{H}_{\theta}$ introduced in Section 2.1, $(d_w^{l_1}, \theta)$-ALSH uses the ALSH family $\mathcal{H}_2 = \{(f, g)\}$, where $f(x) = h_{a}^{\theta}(P(x))$ and $g(q) = h_{a}^{\theta}(Q(q))$ for $h_{a}^{\theta} \in \mathcal{H}_{\theta}$. According to Equations 21, 22 and 23, the relationship between $\theta(P(x), Q(q))$ and $d_w^{l_1}(x, q)$ is as follows:
(26) |
Let $s = \theta(P(x), Q(q))$. From Equations 6 and 26, it can be seen that the collision probability function with respect to $d_w^{l_1}(x, q)$ is
(27) |
It is easy to know that the collision probability in Equation 27 is a decreasing function of $d_w^{l_1}(x, q)$. Thus, $p_1 > p_2$ holds for any $r_1 < r_2$, where $p_1$ and $p_2$ denote the collision probabilities in Equation 27 at distances $r_1$ and $r_2$. Then we obtain the following Lemma 2.
Lemma 2
$\mathcal{H}_2$ is $(r_1, r_2, p_1, p_2)$-sensitive for any $r_1 < r_2$, where $p_1$ and $p_2$ are the collision probabilities given by Equation 27 at distances $r_1$ and $r_2$, respectively.
Theorem 5
$(d_w^{l_1}, \theta)$-ALSH can solve the problem of $(r_1, r_2)$-NNS over $d_w^{l_1}$ with $O(n^{\rho_2} \log n)$ query time and $O(n^{1+\rho_2})$ space, where $\rho_2 = \frac{\ln(1/p_1)}{\ln(1/p_2)} < 1$ and $p_1$, $p_2$ are as in Lemma 2.
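As a hedged illustration, under the assumed transformations from the Section 4.1 sketch, $\|P(x)\|_2 = \sqrt{dM}$ and $\|Q(q)\|_2 = \sqrt{M}\,\|w\|_2$ are constants for a fixed query, so $\theta(P(x), Q(q)) = \arccos\!\left(\frac{-d_w^{l_1}(x, q)}{M\sqrt{d}\,\|w\|_2}\right)$, which increases monotonically with $d_w^{l_1}(x, q)$; this is the kind of monotone relationship that Equation 26 captures and that makes the Angular LSH family applicable to the transformed vectors.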
4.2.3 Implementation Details of $(d_w^{l_1}, l_2)$-ALSH and $(d_w^{l_1}, \theta)$-ALSH
The scheme of $(d_w^{l_1}, l_2)$-ALSH (or $(d_w^{l_1}, \theta)$-ALSH) needs to compute the hash values $f(x) = h_{a,b}^{l_2}(P(x))$ and $g(q) = h_{a,b}^{l_2}(Q(q))$ (or $f(x) = h_{a}^{\theta}(P(x))$ and $g(q) = h_{a}^{\theta}(Q(q))$). The running time of computing $f(x)$ is dominated by the time cost of obtaining $a \cdot P(x)$, and the running time of computing $g(q)$ is dominated by the time cost of obtaining $a \cdot Q(q)$, where $a$ is a vector of the same dimensionality as $P(x)$ whose entries are chosen independently from the standard normal distribution. The naive approach to obtain $a \cdot P(x)$ or $a \cdot Q(q)$ is to compute the inner product of the two corresponding vectors directly. However, this requires $O(dM)$ multiplications and additions, which is expensive when $M$ is large.
Next, we show how to obtain $a \cdot P(x)$ with only $O(d)$ additions and obtain $a \cdot Q(q)$ with only $O(d)$ additions and $d$ multiplications. Suppose $a = (a_1; a_2; \ldots)$, where each $a_i \in \mathbb{R}^M$. According to Equations 14-17, 19 and 20, $a \cdot P(x)$ and $a \cdot Q(q)$ decompose into one term per block $a_i$. Since the corresponding block of $P(x)$ is a sequence of 0's followed by 1's and the corresponding block of $Q(q)$ is built from a sequence of 1's followed by 0's, the term of $a \cdot P(x)$ for block $a_i$ is the sum of the last elements of $a_i$, and the term of $a \cdot Q(q)$ for block $a_i$ is, up to the weight $w_i$, the sum of the first elements of $a_i$. Thus, we preprocess the vector $a$ to obtain its per-block prefix sums $A = (A_1; A_2; \ldots)$, where
$A_i[k] = \sum_{j=1}^{k} a_i[j], \quad k = 0, 1, \ldots, M.$   (28)
Then $a \cdot P(x)$ can be written as a sum of $d$ such block terms read off from $A$, so it can be obtained with $O(d)$ additions by using $A$. Similarly, $a \cdot Q(q)$ can be written as a weighted sum of $d$ block terms read off from $A$, so it can be obtained with $O(d)$ additions and $d$ multiplications by using $A$, as sketched below.
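A hedged Python/NumPy sketch of this precomputation, under the same assumed P/Q layout as the earlier sketches (the projection vector a is assumed to have one length-M block per coordinate and per half of the transformed vector); the function names are illustrative, and the final check compares against the naive inner products.

```python
import numpy as np

def precompute_prefix_sums(a, d, M):
    """Per-block prefix sums of the projection vector a (assumed length 2*d*M):
    csum[i, k] = sum of the first k entries of block i."""
    blocks = a.reshape(2 * d, M)
    return np.concatenate([np.zeros((2 * d, 1)), np.cumsum(blocks, axis=1)], axis=1)

def proj_P(csum, x, d, M):
    """a . P(x) with O(d) additions: first-x_i sums plus last-(M - x_i) sums."""
    idx = np.arange(d)
    return np.sum(csum[idx, x]) + np.sum(csum[d + idx, M] - csum[d + idx, x])

def proj_Q(csum, q, w, d, M):
    """a . Q(q, w) with O(d) additions and d multiplications by the weights."""
    idx = np.arange(d)
    per_coord = (csum[idx, M] - csum[idx, q]) + csum[d + idx, q]
    return -np.sum(w * per_coord)

# Consistency check against the naive O(d*M) inner products.
rng = np.random.default_rng(3)
d, M = 4, 6
a = rng.standard_normal(2 * d * M)
u = lambda v: (np.arange(M) < np.asarray(v)[:, None]).ravel().astype(float)
P = lambda x: np.concatenate([u(x), 1 - u(x)])
Q = lambda q, w: -np.concatenate([np.repeat(w, M) * (1 - u(q)), np.repeat(w, M) * u(q)])
x, q, w = rng.integers(0, M + 1, d), rng.integers(0, M + 1, d), rng.standard_normal(d)
csum = precompute_prefix_sums(a, d, M)
assert np.isclose(proj_P(csum, x, d, M), a @ P(x))
assert np.isclose(proj_Q(csum, q, w, d, M), a @ Q(q, w))
```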
5 Conclusion
This paper studies the fundamental problem of Nearest Neighbor Search (NNS) over the generalized weighted Manhattan distance $d_w^{l_1}$. As far as we know, there is no prior work that solves this problem in sublinear time. In this paper, we first prove that there is no LSH or ALSH family for $d_w^{l_1}$ over the entire space $\mathbb{R}^d$. Then, we prove that there is still no LSH family suitable for $d_w^{l_1}$ over a bounded space. After that, we propose two ALSH families for $d_w^{l_1}$ over a bounded space. Based on these ALSH families, two ALSH schemes, $(d_w^{l_1}, l_2)$-ALSH and $(d_w^{l_1}, \theta)$-ALSH, are proposed for solving NNS over $d_w^{l_1}$ in sublinear time.
References
- [1] Charu C. Aggarwal, Alexander Hinneburg, and Daniel A. Keim. On the surprising behavior of distance metrics in high dimensional spaces. In ICDT, 2001.
- [2] Jon Louis Bentley. Multidimensional binary search trees used for associative searching. Commun. ACM, 1975.
- [3] Gautam Bhattacharya, Koushik Ghosh, and Ananda S. Chowdhury. Granger causality driven AHP for feature weighted knn. Pattern Recognit., 2017.
- [4] Moses Charikar. Similarity estimation techniques from rounding algorithms. In STOC, 2002.
- [5] Lei Chen. Curse of dimensionality. In Encyclopedia of Database Systems. 2009.
- [6] King Lum Cheung and Ada Wai-Chee Fu. Enhanced nearest neighbour search on the r-tree. SIGMOD Rec., 1998.
- [7] Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In SCG, 2004.
- [8] Jürgen Forster, Niels Schmitt, Hans Ulrich Simon, and Thorsten Suttorp. Estimating the optimal margins of embeddings in euclidean half spaces. Mach. Learn., 2003.
- [9] Junhao Gan, Jianlin Feng, Qiong Fang, and Wilfred Ng. Locality-sensitive hashing scheme based on dynamic collision counting. In SIGMOD, 2012.
- [10] Aristides Gionis, Piotr Indyk, and Rajeev Motwani. Similarity search in high dimensions via hashing. In VLDB, 1999.
- [11] Yupeng Gu, Bo Zhao, David Hardtke, and Yizhou Sun. Learning global term weights for content-based recommender systems. In WWW, 2016.
- [12] Alexander Hinneburg, Charu C. Aggarwal, and Daniel A. Keim. What is the nearest neighbor in high dimensional spaces? In VLDB, 2000.
- [13] Qiang Huang, Jianlin Feng, Qiong Fang, Wilfred Ng, and Wei Wang. Query-aware locality-sensitive hashing scheme for $l_p$ norm. VLDB J., 2017.
- [14] Chein-Shung Hwang, Yi-Ching Su, and Kuo-Cheng Tseng. Using genetic algorithms for personalized recommendation. In ICCCI, Lecture Notes in Computer Science. Springer, 2010.
- [15] Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In STOC, 1998.
- [16] Yifan Lei, Qiang Huang, Mohan S. Kankanhalli, and Anthony K. H. Tung. Sublinear time nearest neighbor search over generalized weighted space. In ICML, 2019.
- [17] Kejing Lu, Hongya Wang, Wei Wang, and Mineichi Kudo. VHP: approximate nearest neighbor search via virtual hypersphere partitioning. Proc. VLDB Endow., 2020.
- [18] Julian J. McAuley, Christopher Targett, Qinfeng Shi, and Anton van den Hengel. Image-based recommendations on styles and substitutes. In SIGIR, 2015.
- [19] Alejandro Moreo, Andrea Esuli, and Fabrizio Sebastiani. Learning to weight for text classification. IEEE Trans. Knowl. Data Eng., 2020.
- [20] Behnam Neyshabur, Yury Makarychev, and Nathan Srebro. Clustering, hamming embedding, generalized LSH and the max norm. In ALT, 2014.
- [21] Behnam Neyshabur and Nathan Srebro. On symmetric and asymmetric lshs for inner product search. In ICML, 2015.
- [22] Hanan Samet. Foundations of multidimensional and metric data structures. Morgan Kaufmann, 2006.
- [23] Anshumali Shrivastava and Ping Li. Asymmetric LSH (ALSH) for sublinear time maximum inner product search (MIPS). In NeurIPS, 2014.
- [24] Anshumali Shrivastava and Ping Li. Improved asymmetric locality sensitive hashing (ALSH) for maximum inner product search (MIPS). In UAI, 2015.
- [25] Nathan Srebro and Adi Shraibman. Rank, trace-norm and max-norm. In COLT, Lecture Notes in Computer Science. Springer, 2005.
- [26] Yufei Tao, Ke Yi, Cheng Sheng, and Panos Kalnis. Efficient and accurate nearest neighbor and closest pair search in high-dimensional space. TODS, 2010.
- [27] Jingdong Wang, Heng Tao Shen, Jingkuan Song, and Jianqiu Ji. Hashing for similarity search: A survey. CoRR, 2014.
- [28] Bolong Zheng, Xi Zhao, Lianggui Weng, Nguyen Quoc Viet Hung, Hang Liu, and Christian S. Jensen. PM-LSH: A fast and accurate LSH framework for high-dimensional approximate NN search. Proc. VLDB Endow., 2020.