
Generation Matrix: An Embeddable Matrix Representation for Hierarchical Trees

Jianping Cai jpingcai@163.com Ximeng Liu nbnix@qq.com Jiayin Li lijiayin2019@gmail.com Shuangyue Zhang zhangsy@hxxy.edu.cn College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China College of Information and Smart Electromechanical Engineering, Xiamen Huaxia University, Xiamen 361024, China
Abstract

Starting from local structures to study hierarchical trees is a common research method. However, the cumbersome analysis and description make the naive method challenging to adapt to the increasingly complex hierarchical tree problems. To improve the efficiency of hierarchical tree research, we propose an embeddable matrix representation for hierarchical trees, called Generation Matrix. It transforms the abstract hierarchical tree into a concrete matrix representation, allowing us to study the hierarchical tree as a whole, which dramatically reduces the complexity of research. Mathematical analysis shows that Generation Matrix can simulate various recursive algorithms without accessing local structures and provides a variety of interpretable matrix operations to support the research of hierarchical trees. Applying Generation Matrix to differential privacy hierarchical tree release, we propose a Generation Matrix-based optimally consistent release algorithm (GMC). It provides an exceptionally concise process description, so we can describe its core steps as a simple matrix expression rather than multiple complicated recursive processes like existing algorithms. Our experiments show that GMC takes only a few seconds to complete a release for large-scale datasets with more than 10 million nodes. The calculation efficiency is increased by up to 100 times compared with the state-of-the-art schemes.

keywords:
Generation Matrix, Matrix Representation, Hierarchical Tree, Differential Privacy, Consistency
MSC:
[2020] 05C62, 05C05
Funding: This work was supported in part by the National Natural Science Foundation of China (project numbers 62072109 and U1804263).

1 Introduction

As a fundamental data structure, hierarchical trees are widely used in different areas, including file systems [1], census [2], evolution [3], etc. For example, in the U.S. Census Bureau’s plan to apply differential privacy to protect privacy [2], designing a novel hierarchical tree releasing algorithm is one of the important challenges [4]. The scale of the census data is so large that we can organize it into a hierarchical tree with more than 10 million nodes. Hence, the hierarchical tree release algorithms must be specially designed and highly efficient to ensure the timely release of such large-scale data. However, the design of efficient algorithms usually requires a large amount of hierarchical tree research as a theoretical basis.

Most hierarchical tree research works naturally regard the hierarchical tree as a collection of nodes and relationships. Studying hierarchical trees from the perspective of individual nodes and local relationships is a common research method, called the Naive Research Method. Empirically, the Naive Research Method often leads to overly cumbersome analysis and abstract algorithm descriptions, mainly reflected in two aspects. On the one hand, hierarchical trees contain rich relationships, such as father, son, ancestor, descendant, sibling, or cousin, but these relationships usually lack a sufficiently concrete description. When multiple node relationships occur in an algorithm simultaneously, the intricate node relationships make the algorithm challenging to understand. On the other hand, the Naive Research Method focuses on the local structure of hierarchical trees rather than the overall structure. The local structure is part of the overall structure, but studies that fall into local structures may prevent researchers from solving problems from a macro perspective. Worse, the additional auxiliary symbols or indexes for describing relationships and local structures pose a significant challenge to the researchers’ data analysis capabilities. Imagine a researcher facing a half-page expression with various randomly labeled symbols and subscripts; how should the analysis proceed? Therefore, the Naive Research Method is too cumbersome to solve the increasingly complex hierarchical tree problems effectively.

Considering the complexity of the Naive Research Method, we adopt a Matrixing Research Method for hierarchical tree problems. Its core idea is to transform the hierarchical tree into a specific matrix representation and then embed it into research works or algorithm designs, making the initially abstract and indescribable hierarchical tree concrete and analyzable. As a more advanced research method, similar ideas widely exist in many fields such as graph theory [5, 6, 7, 8], group theory [9], and deep learning [10, 11]. Unlike the Naive Research Method, the research object of the Matrixing Research Method is the hierarchical tree itself rather than the local structure. It emphasizes avoiding visits to individual nodes as much as possible, implementing operations on the hierarchical tree through its matrix representation instead. Therefore, the matrix representation design is critical and directly determines whether the research can proceed smoothly. The challenges of matrix representation design are as follows.

  • 1)

    A non-negligible problem is the universality of recursion in hierarchical tree algorithms, where access to individual nodes and descriptions of local relationships are almost inevitable. This conflicts with the core idea of the Matrixing Research Method. Some matrix representations [5, 6, 7, 8] have been used in the spectral theory of trees, but no result shows that the existing matrix representations can implement recursion without accessing local structures. Hence, whether a matrix representation supports recursion is a critical question.

  • 2)

    We hope that the matrix representation can directly serve algorithm design, not just theoretical analysis. Therefore, the matrix representation should be succinct to ensure the efficiency of the resulting algorithms. Specifically, the space overhead per hierarchical tree node should be constant, unlike dense matrices such as the Distance Matrix [7] or the Ancestral Matrix [8].

1.1 Our contributions

Considering the challenges above, we propose an embeddable matrix representation called Generation Matrix. Generation Matrix is a lower triangular matrix containing only $2n-1$ non-zero elements (i.e., the weights of nodes and edges). Applying sparse storage technologies [12], we only need to store the non-zero elements, which satisfies succinctness. Compared with others [5, 6, 7, 8], Generation Matrix emphasizes application in hierarchical tree algorithms. Our analysis of its properties shows that many calculations on Generation Matrix have specific mathematical meanings. We can explain them and combine them to design complex hierarchical tree algorithms. More importantly, we demonstrate that the inverse of the Generation Matrix contains the inherent logic of recursion. Therefore, we can use Generation Matrix to simulate top-down and bottom-up recursions without accessing local structures. Besides, we study the relationship between Generation Matrix and some existing matrix representations and find that it can be easily converted to the others. This implies that Generation Matrix can be combined with the theories of other matrix representations to solve hierarchical tree problems.

To demonstrate the practicability of Generation Matrix, we introduce an application to differentially private hierarchical tree release. Considering the consistency problem [13, 14], we design a Generation Matrix-based optimally consistent release algorithm (GMC) for differentially private hierarchical trees. To our knowledge, GMC is the first solution to this problem based on matrix theory. It has an exceptionally concise process description, such that a simple matrix expression can summarize the core process. GMC embodies the advantages of Generation Matrix in solving cross-domain problems. Therefore, Generation Matrix has positive significance for promoting the development of hierarchical tree-related research and for applying matrix theories to solve hierarchical tree problems.

1.2 Organization of paper

The rest of the paper is organized as follows. Section 2 reviews the related works of existing hierarchical tree representations and differentially private hierarchical tree release. Section 3 introduces the preliminaries of hierarchical trees and the optimally consistent release of differentially private hierarchical trees. Section 4 defines Generation Matrix, then analyzes its mathematical properties and the conversion relationship with other matrix representations. In Section 5, we show the application of Generation Matrix on differentially private hierarchical tree release and design GMC. Finally, Section 6 compares GMC with the existing technology through experiments and demonstrates its efficiency.

2 Related Works

Research on hierarchical tree representations mainly concentrates on the fields of data structures and graph theory. In the data structure field, researchers have achieved better performance in storage [15], query [16], or structure updating [17]. However, these representations are mainly for storage in computers and do not support mathematical analysis; we cannot symbolize them and use them as tools for hierarchical tree research. In the graph theory field, many works adopt matrix representations to represent trees, including the Adjacency Matrix [5], Laplacian Matrix [3, 6], Distance Matrix [7], etc. Among them, the Adjacency Matrix only describes the edge information, making it challenging to support complex model analysis. The Laplacian Matrix and Distance Matrix are two meaningful matrix representations widely used in spectral graph theory. However, they are designed for undirected graphs or trees and are not suitable for representing rooted hierarchical trees. To summarize, none of the three matrix representations is the best choice for representing hierarchical trees. Subsequently, Eric et al. [8] proposed a new matrix representation for rooted trees (i.e., hierarchical trees), named the Ancestral Matrix. It represents the structure of a hierarchical tree by describing the number of overlapping edges on the paths from any two leaves to the root. Studying the Ancestral Matrix, Eric et al. [8] obtained many essential conclusions, such as the maximum spectral radius and the determinants of the characteristic polynomial. However, it is also not the best choice for calculations on hierarchical trees. First, the dense Ancestral Matrix is not succinct enough. Second, the Ancestral Matrix is a highly summarized feature representation. Although it can deterministically express the structure of a hierarchical tree only by describing the leaves, it is very unintuitive and cannot simulate the operations of hierarchical trees. Therefore, the existing matrix representations are not suitable for the analysis and calculation of hierarchical tree models. Significantly, the broad application of the Laplacian Matrix in deep learning [10, 11] in recent years implies that matrix representation has essential value for solving complex scientific problems. This motivates us to design a new matrix representation to solve hierarchical tree problems and design algorithms.

Differentially private hierarchical tree release is a data releasing technology that organizes data into a hierarchical tree and applies differential privacy (DP) [18] to protect individual privacy. It is widely used in many scenarios, such as histogram publishing [13, 19], location privacy release based on spatial partitioning [20], trajectory data publishing [21], and frequent term discovery [14]. By adding random noise to the data, DP provides a provable and quantifiable guarantee of individual privacy. However, the random noise destroys the consistency that the hierarchical tree should satisfy, i.e., “the sum of the children’s values equals the value at the parent” [13]. Therefore, ensuring that the released results meet consistency while obtaining higher accuracy is one of the leading research goals. Hay et al. [13] first applied a hierarchical tree to improve the accuracy of range queries and designed Boosting for the consistency of histogram release. However, Boosting can only support complete $k$-ary trees, which significantly limits its application. Moreover, Hay et al.’s error analysis [13] of the released results is rough, and only qualitative error results are obtained. Subsequently, Wahlbeh et al. analyzed the error of Boosting and designed an algorithm to calculate the error; however, it also can only support complete $k$-ary trees. In the differentially private frequent term discovery problem studied by Ning et al. [14], the hierarchical tree is arbitrary, so it is impossible to apply Boosting. For this reason, Ning et al. designed an optimally consistent release algorithm for arbitrary hierarchical trees in their proposed algorithm PrivTrie [14]. Its implementation is based on multiple complex recursions, which are not easy to understand, and its large number of function calls results in significant additional computational overhead. Applying the idea of maximum likelihood estimation, Lee et al. [22] proposed a general solution for differentially private optimally consistent release. It solves the optimally consistent release by establishing a quadratic programming equation and has a closed-form matrix expression. Theoretically, it can apply to an arbitrary optimally consistent release, but the computational overhead is so significant that it can only process small-scale releases. Nevertheless, Lee et al.’s research work [22] is inspiring. It motivates us to analyze the issues of differentially private hierarchical tree release from the perspective of matrix analysis.

Under the representation of Generation Matrix, we can introduce many matrix analysis methods to solve hierarchical tree problems. One of them is QR decomposition [23]. In QR decomposition, we can transform any matrix into a triangular matrix by orthogonal transformation. Compared with the original form, the triangular matrix is simpler and has many useful properties [24]. The orthogonal transformation methods commonly used for QR decomposition include the Householder transformation [25], Gram-Schmidt orthogonalization [26], Givens rotation [27], etc. Among them, the Householder transformation is the simplest and the most suitable for sparse matrices.

3 Preliminaries

In this section, we describe some preliminaries of our work. Before the formal description, we first introduce the main notation definitions, shown in Table 1.

Table 1: Notation Descriptions in Our Work
$\mathcal{T}$ : Hierarchical tree with arbitrary structure
$\mathcal{T}^{(k)}$ : $k$-order subtree of $\mathcal{T}$
$f_i$, $\mathcal{C}_i$ : Parent of node $i$; the set of children of node $i$
$n$, $n_k$ : The number of nodes of the hierarchical tree; the number of nodes of the $k$-order subtree
$h$, $h_i$ : The height of the hierarchical tree; the height of node $i$
$m$ : The number of unit counts
$\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}\in\mathbb{R}^{n\times n}$ : The Generation Matrix defined by a $\mathcal{T}$ with the node weights $\mathbf{w}_{node}$ and the edge weights $\mathbf{w}_{edge}$
$\mathbf{G}_{\mathcal{T}}\in\mathbb{R}^{n\times n}$ : The structure matrix of $\mathcal{T}$
$\mathbf{M}_{\mathcal{T}}\in\mathbb{R}^{n\times n_1}$ : The consistency constraint matrix of $\mathcal{T}$
$\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}\in\mathbb{R}^{n_1\times n_1}$ : The Generation Matrix inner-product equivalent to $\mathbf{M}_{\mathcal{T}}$
$\mathbf{A}_{\mathcal{T}},\mathbf{L}_{\mathcal{T}},\mathbf{D}_{\mathcal{T}}\in\mathbb{R}^{n\times n}$, $\mathbf{C}_{\mathcal{T}}\in\mathbb{R}^{m\times m}$ : The Adjacency Matrix, Laplacian Matrix, Distance Matrix, and Ancestral Matrix of $\mathcal{T}$
$\mathbf{x}\in\mathbb{R}^{m\times 1}$ : The vector composed of the unit counts $x_i$
$\mathbf{v}\in\mathbb{R}^{n\times 1}$ : The vector composed of the values of the nodes of the hierarchical tree, arranged in order
$\widetilde{\mathbf{v}}\in\mathbb{R}^{n\times 1}$, $\widetilde{\mathbf{x}}\in\mathbb{R}^{m\times 1}$ : The noisy $\mathbf{v}$ and $\mathbf{x}$ satisfying DP
$\overline{\mathbf{v}}\in\mathbb{R}^{n\times 1}$, $\overline{\mathbf{x}}\in\mathbb{R}^{m\times 1}$ : The optimally consistent release after post-processing, and the vector restored from $\overline{\mathbf{v}}$
$\mathbf{H}_{\hbar}\in\{0,1\}^{m\times n}$ : The mapping matrix representing the mapping relationship between $x_i$ and $v_i$

3.1 Hierarchical Tree

Figure 1: Hierarchical Tree and Its $k$-Order Subtree

We first recall the definition of the hierarchical tree.

Definition 1 (Hierarchical Tree[28]).

The hierarchical tree $\mathcal{T}$ is a collection of nodes numbered $1,2,\ldots,n$, which satisfies

  • 1)

    $\mathcal{T}$ contains a specially designated node called the root.

  • 2)

    The remaining nodes are divided into several non-empty collections called the subtrees of the root.

In the hierarchical tree, we denote the parent-child relationship between nodes as $i\to j$, indicating that node $j$ is the parent of node $i$. Fig. 1 shows a weighted hierarchical tree with 10 nodes. The node weights and edge weights are denoted as $w_i$ and $w_{i\to j}$, respectively. The height of each node induces a useful quasi-ordering, which is defined as follows.

Definition 2 (Node Height).

The height of a node in a hierarchical tree refers to the height of the subtree rooted at that node. Let the height of node $i$ be denoted as $h_i$; then $h_i$ can be calculated by the following recursive expression:

$$h_i=\begin{cases}1, & \text{node } i \text{ is a leaf}\\ \max_{j\in\mathcal{C}_i}h_j+1, & \text{otherwise.}\end{cases} \quad (1)$$

By the height of the nodes, we define a kind of induced subtree of hierarchical trees, called the $k$-Order Subtree.

Definition 3 ($k$-Order Subtree).

For a hierarchical tree $\mathcal{T}$ under the descending order of height, the $k$-Order Subtree $\mathcal{T}^{(k)}$ is defined as the induced subtree retained after deleting all leaves of $\mathcal{T}$ $k$ times. $\mathcal{T}^{(k)}$ satisfies

$$\mathcal{T}^{(k)}=\left\{i \mid i\in\mathcal{T}\wedge h_i>k\right\}. \quad (2)$$

As a bottom-up induced subtree, $\mathcal{T}^{(k)}$ satisfies transitivity, i.e., $(\mathcal{T}^{(a)})^{(b)}=\mathcal{T}^{(a+b)}$. This helps us determine whether subtrees obtained in different ways are equivalent. In subsequent applications, we use the concept of the $k$-Order Subtree to simplify the description of the tree structure. In Fig. 1, we use dotted circles to mark the subtrees of $\mathcal{T}$ from order 0 to 2. It can be seen that the 0-Order Subtree is $\mathcal{T}$ itself; $\mathcal{T}^{(1)}$ is the subtree composed of the non-leaf nodes of $\mathcal{T}$. Let $n_k$ denote the number of nodes contained in $\mathcal{T}^{(k)}$; then $n_0\equiv n$, and $n_1$ equals the number of non-leaf nodes of $\mathcal{T}$.
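
To make the two definitions concrete, the following sketch (our illustration, not code from the paper) computes the node heights of Eq. (1) and the $k$-order subtree of Eq. (2) from a parent array; nodes are numbered from 0 here with parent[i] < i, whereas the paper numbers them from 1.

```python
import numpy as np

def node_heights(parent):
    """Heights h_i of Eq. (1); parent[0] = -1 marks the root, parent[i] < i."""
    n = len(parent)
    h = np.ones(n, dtype=int)          # every leaf has height 1
    for i in range(n - 1, 0, -1):      # children are visited before parents
        h[parent[i]] = max(h[parent[i]], h[i] + 1)
    return h

def k_order_subtree(parent, k):
    """Node set of T^(k) in Eq. (2): all nodes with h_i > k."""
    h = node_heights(parent)
    return [i for i in range(len(parent)) if h[i] > k]

parent = [-1, 0, 0, 1, 1, 2, 3, 3, 4, 5]   # a 10-node toy tree, root = 0
print(node_heights(parent))                # [4 3 3 2 2 2 1 1 1 1]
print(k_order_subtree(parent, 1))          # T^(1): the non-leaf nodes [0..5]
```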

3.2 Optimally Consistent Release of Differentially Private Hierarchical Tree

Before describing the optimally consistent release of the differentially private hierarchical tree, we first recall the hierarchical tree releasing model. Consider a set of unit counts $x_i:\mathcal{D}\to\mathbb{N}$ $(1\le i\le m)$ for a private dataset $\mathcal{D}$, where $x_i$ indicates the number of records in $\mathcal{D}$ that satisfy the mutually exclusive unit condition $\varphi_i$. The unit count $x_i$ satisfies

$$x_i=\left|\left\{t\in\mathcal{D} \mid \varphi_i(t)=\text{True}\right\}\right|. \quad (3)$$

Since the $\varphi_i$ are mutually exclusive, any $t\in\mathcal{D}$ satisfies one and only one $\varphi_i$. Therefore, organizing the $x_i$ into a vector, we get $\mathbf{x}=[x_1,x_2,\ldots,x_m]^{T}$.

Figure 2: The Process of Hierarchical Tree Release

As shown in Fig. 2, each leaf corresponds to an $x_i$. The value of a non-leaf node equals the sum of the values of the leaves of the subtree rooted at that node. Therefore, the hierarchical tree meets consistency, i.e., “the sum of the values of the child nodes is equal to the value of the parent node”. In Fig. 2, we denote the value of the $i$-th node as $v_i$. Organizing the $v_i$ in turn into the vector $\mathbf{v}=[v_1,v_2,\ldots,v_n]^{T}$, we get the to-be-released data $\mathbf{v}$.

However, several works [13, 19, 20, 21, 14] have demonstrated that releasing an unprotected hierarchical tree will result in privacy disclosure. To protect individual privacy, Dwork et al. [18] proposed differential privacy, defined as follows.

Definition 4 ($\varepsilon$-Differential Privacy [18]).

A random algorithm $\mathcal{M}$ satisfies $\varepsilon$-differential privacy if, for any two neighboring datasets $\mathcal{D}$ and $\mathcal{D}'$, every output $O\in\operatorname{Range}(\mathcal{M})$ satisfies

$$\Pr\left(\mathcal{M}(\mathcal{D})=O\right)\le e^{\varepsilon}\Pr\left(\mathcal{M}(\mathcal{D}')=O\right). \quad (4)$$

Under differential privacy, the process of hierarchical tree releasing can be described as

$$\widetilde{\mathbf{v}}=\mathbf{v}+\boldsymbol{\xi}, \quad (5)$$

where $\widetilde{\mathbf{v}}$ is $\mathbf{v}$ after noise addition, which satisfies differential privacy, and $\boldsymbol{\xi}$ is the random vector for the noise addition. Each element $\xi_i$ is i.i.d. and satisfies $\xi_i\sim\operatorname{Lap}(\Delta/\varepsilon)$, where $\operatorname{Lap}$ represents a Laplacian distribution and $\Delta$ is the data sensitivity. In hierarchical tree releasing, $\Delta$ equals the height of $\mathcal{T}$ [13].
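
As a small illustration of Eq. (5) (our sketch; only numpy is assumed), the noise addition draws i.i.d. Laplace noise with scale $\Delta/\varepsilon = h/\varepsilon$:

```python
import numpy as np

def dp_release(v, height, epsilon, rng=None):
    """Eq. (5): add i.i.d. Laplace(h/epsilon) noise to every node value."""
    rng = rng or np.random.default_rng()
    return v + rng.laplace(scale=height / epsilon, size=v.shape)

v_true = np.array([100., 60., 40., 35., 25., 40.])    # toy node values
v_noisy = dp_release(v_true, height=3, epsilon=1.0)   # noisy release
```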

To restore the consistency of the hierarchical tree after adding noise, we can obtain the optimally consistent release $\overline{\mathbf{v}}$ from the following optimization equation, according to the maximum likelihood post-processing proposed by Lee et al. [22]:

$$\min_{\overline{\mathbf{v}}} \left\|\overline{\mathbf{v}}-\widetilde{\mathbf{v}}\right\| \quad \text{s.t.} \quad \mathbf{M}_{\mathcal{T}}^{T}\overline{\mathbf{v}}=\mathbf{0}, \quad (6)$$

where $\mathbf{M}_{\mathcal{T}}$ is the consistency constraint matrix of the hierarchical tree, defined as follows.

Definition 5 (Consistency Constraint Matrix of Hierarchical Tree).

Given a hierarchical tree $\mathcal{T}$ containing $n$ nodes, let $n_1$ denote the number of non-leaf nodes in $\mathcal{T}$. The value $m_{ij}$ in row $i$ and column $j$ of the consistency constraint matrix $\mathbf{M}_{\mathcal{T}}\in\mathbb{R}^{n\times n_1}$ is defined as follows:

$$m_{ij}=\begin{cases}1, & i=j\\ -1, & j=f_i\\ 0, & \text{otherwise,}\end{cases} \quad (7)$$

where $f_i$ is the parent of node $i$.

The optimization equation (6) has the following closed-form expression:

$$\overline{\mathbf{v}}=\widetilde{\mathbf{v}}-\mathbf{M}_{\mathcal{T}}\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\mathbf{M}_{\mathcal{T}}^{T}\widetilde{\mathbf{v}}. \quad (8)$$

Since Formula (8) involves matrix inner products and inversion, the time complexity of the direct solution is as high as $O(n^3)$. The amount of calculation is too large to obtain an efficient algorithm directly from the expression. On the surface, Formula (8) is not a good choice for solving the optimally consistent release, but under the theory of Generation Matrix, we can convert it into another form and apply the properties of Generation Matrix to obtain an efficient algorithm.
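
For reference, the naive dense evaluation of Formula (8) can be sketched as follows (our illustration of the $O(n^3)$ baseline, not the algorithm proposed later); the toy $\mathbf{M}_{\mathcal{T}}$ encodes a root with two children:

```python
import numpy as np

def naive_consistent_release(M, v_tilde):
    """Formula (8): v_bar = v_tilde - M (M^T M)^{-1} M^T v_tilde, dense O(n^3)."""
    return v_tilde - M @ np.linalg.solve(M.T @ M, M.T @ v_tilde)

M = np.array([[1.], [-1.], [-1.]])            # Def. 5 for a root with two children
v_tilde = np.array([10., 4., 5.])
print(naive_consistent_release(M, v_tilde))   # [9.67 4.33 5.33]: v1 = v2 + v3
```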

4 Generation Matrix Model for Hierarchical Tree

4.1 Generation Matrix

Before defining Generation Matrix, we first number the nodes of $\mathcal{T}$ in descending order of height.

Definition 6 (Descending Order of Height).

Let $h_i$ denote the height of node $i$ defined by Def. 2. If any two nodes $i$ and $j$ in $\mathcal{T}$ satisfy

$$i<j\Rightarrow h_i\geq h_j, \quad (9)$$

we say that $\mathcal{T}$ satisfies Descending Order of Height.

Under the descending order of height, we define Generation Matrix for 𝒯\mathcal{T}.

Figure 3: Generation Matrix and Its $k$-Order Submatrix
Definition 7 (Generation Matrix).

Consider a non-zero weighted hierarchical tree $\mathcal{T}$ under descending order of height, and let $w_i$ and $w_{i\to f_i}$ $(w_i,w_{i\to f_i}\neq 0)$ denote the weights of node $i$ and of the edge $i\to f_i$, respectively. Organizing them into the vectors $\mathbf{w}_{node}$ and $\mathbf{w}_{edge}$, the Generation Matrix is denoted as $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}\in\mathbb{R}^{n\times n}$, whose element $g_{i,j}$ in row $i$ and column $j$ is defined as follows:

$$g_{i,j}=\begin{cases}w_i, & i=j\\ -w_{i\to f_i}, & j=f_i\\ 0, & \text{otherwise.}\end{cases} \quad (10)$$

As shown in Fig. 3, since $\mathcal{T}$ satisfies Descending Order of Height, the number of any non-root node $i$ in $\mathcal{T}$ is always larger than that of its parent. This ensures that a Generation Matrix is always a lower triangular matrix. According to Def. 7, there is a one-to-one mapping between an arbitrary non-zero weighted hierarchical tree and $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}\in\mathbb{R}^{n\times n}$, i.e., the matrix representation of a non-zero weighted hierarchical tree is unique. When we only need to describe the structure of the hierarchical tree, we can represent it by a Generation Matrix with all weights equal to 1, i.e., $\mathbf{G}_{\mathcal{T}}^{(\mathbf{1},\mathbf{1})}$. We call $\mathbf{G}_{\mathcal{T}}^{(\mathbf{1},\mathbf{1})}$ the Structure Matrix, abbreviated as $\mathbf{G}_{\mathcal{T}}$. If two hierarchical trees have the same structure and arrangement, the positions of the non-zero elements in their Generation Matrices are always the same, and we call them Similar Generation Matrices.

Definition 8 (Similar Generation Matrices).

If $\mathbf{G}_1$ and $\mathbf{G}_2$ are two Generation Matrices defined by hierarchical trees with the same structure and arrangement (or by the same tree), we call them similar Generation Matrices, denoted as $\mathbf{G}_1\sim\mathbf{G}_2$.

Any two Generation Matrices defined by the same hierarchical tree are always similar. We can use $\mathbf{G}\sim\mathbf{G}_{\mathcal{T}}$ as a sufficient condition to judge whether the hierarchical tree represented by $\mathbf{G}$ has the same structure as $\mathcal{T}$.

According to Def. 3, $\mathcal{T}^{(k)}$ is the induced subtree composed of the nodes $i$ with $h_i>k$ in $\mathcal{T}$. Under Descending Order of Height, these nodes are always ranked first. In Fig. 3, the $k$-Order Submatrix representing the $k$-Order Subtree is denoted as $\mathbf{G}_{\mathcal{T}^{(k)}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$. We can obtain it by taking the $n_k$-order leading principal submatrix of $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ (i.e., the elements in rows and columns 1 to $n_k$). In particular, $\mathbf{G}_{\mathcal{T}^{(1)}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ represents the subtree composed of the non-leaf nodes of $\mathcal{T}$.

In specific applications, one problem we may encounter is that the nodes of the hierarchical tree are numbered, but not in Descending Order of Height. Under the matrix representation, this problem is easy to solve: we can adopt a sparse mapping matrix to convert the original numbering into Descending Order of Height.

Definition 9 (Mapping Matrix).

Given an ordered set $\mathcal{S}=\langle s_1,s_2,\ldots,s_m\rangle$ representing a mapping between integers and satisfying $s_i\in\mathbb{N}^{+}\wedge s_i\le n$, the mapping matrix defined by $\mathcal{S}$ is denoted as $\mathbf{H}_{\mathcal{S}}\in\{0,1\}^{m\times n}$. The value $h_{i,j}$ in row $i$ and column $j$ of $\mathbf{H}_{\mathcal{S}}$ satisfies

$$h_{i,j}=\begin{cases}1, & s_i=j\\ 0, & \text{otherwise.}\end{cases} \quad (11)$$

By the definition of the mapping matrix, $\mathcal{S}$ always represents an injection and satisfies $\mathbf{H}_{\mathcal{S}}\mathbf{H}_{\mathcal{S}}^{T}=\mathbf{I}$. If $\mathcal{S}$ represents a bijection, $\mathbf{H}_{\mathcal{S}}$ is a permutation matrix. When the basis vector $\mathbf{e}_i$ acts on $\mathbf{H}_{\mathcal{S}}$, the following equations hold:

$$\mathbf{H}_{\mathcal{S}}^{T}\mathbf{e}_i=\mathbf{e}_{s_i}, \quad (12)$$
$$\mathbf{H}_{\mathcal{S}}\mathbf{e}_i=\begin{cases}\mathbf{e}_{s_i^{-1}}, & i\in\mathcal{S}\\ \mathbf{0}, & i\notin\mathcal{S},\end{cases} \quad (13)$$

where $s_i^{-1}$ represents the inverse mapping of $s_i$, which satisfies $s_{s_i}^{-1}=i$.

To describe the mapping between the node numbers before and after sorting by Descending Order of Height, we only need to define the ordered set $\mathcal{S}=\langle s_1,s_2,\ldots,s_n\rangle$, where $s_i$ represents the sorted number of the node originally numbered $i$. Then $\mathbf{H}_{\mathcal{S}}\mathbf{v}$ recovers the vector before sorting from the sorted vector $\mathbf{v}$.

In addition, the mapping matrix can be used to represent other mapping relationships, such as the mapping of $v_i$ to $x_i$. For example, in Fig. 1, if we use an ordered set $\mathcal{H}=\langle\hbar_1,\hbar_2,\ldots,\hbar_m\rangle$ to represent the mapping relationship between $v_i$ and $x_i$, then $\hbar_i=i+5$, and the mapping matrix $\mathbf{H}_{\hbar}$ represents their mapping relationship.

4.2 Properties of Generation Matrix

Our research shows that Generation Matrix has many noteworthy mathematical properties, which can help us solve various problems in the analysis and calculation of hierarchical trees. According to Def. 7, it is not difficult to find that Generation Matrix satisfies sparsity.

Property 1 (Sparsity).

Considering a hierarchical tree $\mathcal{T}$ consisting of $n$ nodes, $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ has exactly $2n-1$ non-zero elements. Its first row has only 1 non-zero element, and each of the remaining $n-1$ rows has 2 non-zero elements.

Due to the sparsity of Generation Matrix, we can apply various sparse matrix technologies, such as COO (Coordinate Format) and CSR (Compressed Sparse Row), to calculate hierarchical tree models efficiently. Under sparse representations, the storage and calculation costs of Generation Matrix are both only $O(n)$. Therefore, applications based on Generation Matrix do not cause more computational overhead. The computing technologies for sparse matrices are mature and widely used on various high-performance platforms [29, 30, 31].
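
A minimal sketch of this construction (ours, assuming scipy) assembles a Generation Matrix in COO form from a parent array and converts it to CSR; the node numbering is 0-based and assumed to already satisfy descending order of height:

```python
import numpy as np
import scipy.sparse as sp

def generation_matrix(parent, w_node=None, w_edge=None):
    """Assemble G_T^(w_node, w_edge) of Def. 7 in COO form, then convert to CSR.

    parent[i] is the parent of node i, parent[0] = -1 marks the root, and
    nodes are numbered in descending order of height (so parent[i] < i).
    """
    n = len(parent)
    w_node = np.ones(n) if w_node is None else np.asarray(w_node, dtype=float)
    w_edge = np.ones(n) if w_edge is None else np.asarray(w_edge, dtype=float)
    rows = list(range(n)) + list(range(1, n))      # n diagonal + (n-1) edge entries
    cols = list(range(n)) + [parent[i] for i in range(1, n)]
    vals = list(w_node) + [-w_edge[i] for i in range(1, n)]
    return sp.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()

parent = [-1, 0, 0, 1, 1, 2, 3, 3, 4, 5]   # the 10-node toy tree from earlier
G = generation_matrix(parent)              # structure matrix G_T
assert G.nnz == 2 * len(parent) - 1        # Prop. 1: exactly 2n-1 non-zeros
```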

One of the fundamental properties of Generation Matrices is invertibility. By solving the equations $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})T}\mathbf{z}=\mathbf{v}$ and $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}\mathbf{z}=\mathbf{v}$ for $\mathbf{z}$, we find two interesting and important mathematical properties of Generation Matrices, which we collectively call the propagation of Generation Matrix.

Property 2 (Upward Propagation).

Let $g_{i,j}$ denote the element in row $i$ and column $j$ of $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$, and let $v_i$ be the value of node $i$ of $\mathcal{T}$. Organizing the $v_i$ into the vector $\mathbf{v}=[v_1,v_2,\ldots,v_n]^{T}$, then $\mathbf{z}=\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})-T}\mathbf{v}$ is an upward propagation on $\mathbf{v}$. The value $z_i$ of $\mathbf{z}$ satisfies

$$z_i=\begin{cases}v_i/g_{i,i}, & \text{node } i \text{ is a leaf}\\ \left(v_i-\sum_{j\in\mathcal{C}_i}g_{j,i}z_j\right)/g_{i,i}, & \text{otherwise.}\end{cases} \quad (14)$$
Property 3 (Downward Propagation).

If $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ and $\mathbf{v}$ are defined exactly as in Prop. 2, then $\mathbf{z}=\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})-1}\mathbf{v}$ is a downward propagation on $\mathbf{v}$. The value $z_i$ of $\mathbf{z}$ satisfies

$$z_i=\begin{cases}v_i/g_{i,i}, & \text{node } i \text{ is the root}\\ \left(v_i-g_{i,f_i}z_{f_i}\right)/g_{i,i}, & \text{otherwise.}\end{cases} \quad (15)$$

Prop. 2 and Prop. 3 show that Generation Matrix can simulate multiple recursive operations on hierarchical trees. According to them, Cor. 1 analyzes the elements affected by the propagations of Generation Matrix.

Corollary 1.

For the upward propagation, the elements $z_j$ affected by $v_i$ satisfy $j=i$, or $j$ is an ancestor of $i$ in $\mathcal{T}$; for the downward propagation, the elements $z_j$ affected by $v_i$ satisfy $j=i$, or $j$ is a descendant of $i$ in $\mathcal{T}$.
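
In code, both propagations reduce to sparse triangular solves, as the following sketch shows (reusing generation_matrix() and parent from the sketch above); the printed results illustrate Cor. 1:

```python
import numpy as np
from scipy.sparse.linalg import spsolve_triangular

G = generation_matrix(parent)          # structure matrix in CSR form

def propagate_up(G, v):
    """z = G^{-T} v: the bottom-up recursion of Eq. (14)."""
    return spsolve_triangular(G.T.tocsr(), v, lower=False)

def propagate_down(G, v):
    """z = G^{-1} v: the top-down recursion of Eq. (15)."""
    return spsolve_triangular(G, v, lower=True)

# Per Cor. 1, a value at node i only influences its ancestors (upward)
# or its descendants (downward).
v = np.zeros(G.shape[0]); v[3] = 1.0
print(propagate_up(G, v))     # non-zeros only at node 3 and its ancestors 1, 0
print(propagate_down(G, v))   # non-zeros only at node 3 and its children 6, 7
```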

Combining the propagations, we further study a variety of matrix operations of Generation Matrices. They have strong interpretability and provide crucial theoretical support for the research of hierarchical trees.

Property 4.

Let the vector $\mathbf{z}=\left(\mathbf{I}-\mathbf{G}_{\mathcal{T}}^{T}\right)\mathbf{1}$; then the $i$-th element $z_i$ of $\mathbf{z}$ represents the number of children of node $i$, i.e., $z_i=|\mathcal{C}_i|$.

Property 5.

If the vector $\mathbf{z}=\mathbf{G}_{\mathcal{T}}^{-T}\mathbf{1}$, the $i$-th element $z_i$ of $\mathbf{z}$ represents the number of nodes contained in the subtree rooted at node $i$.

Property 6.

Let the vector $\mathbf{z}=\mathbf{G}_{\mathcal{T}}^{-1}\mathbf{1}$; then the $i$-th element $z_i$ of $\mathbf{z}$ represents the depth of node $i$, where the depth of the root is 1.
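
These three properties can be checked numerically in a few lines (continuing the sketches above):

```python
import numpy as np

ones = np.ones(G.shape[0])
num_children  = ones - G.T @ ones        # Prop. 4: z = (I - G^T) 1
subtree_sizes = propagate_up(G, ones)    # Prop. 5: z = G^{-T} 1
depths        = propagate_down(G, ones)  # Prop. 6: z = G^{-1} 1
print(num_children, subtree_sizes, depths, sep="\n")
```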

The properties above indicate that Generation Matrix is an effective and easy-to-use tool for various hierarchical tree analyses. In addition to the conclusions about vectors discussed above, there are some conclusions about matrices, as follows. Compared with the conclusions about vectors, they focus on describing the relationships between nodes.

Property 7.

Let $g_{ij}^{(-1)}$ denote the element in row $i$ and column $j$ of $\mathbf{G}_{\mathcal{T}}^{-1}$; then $g_{ij}^{(-1)}$ satisfies

$$g_{ij}^{(-1)}=\begin{cases}1, & i=j \vee j \text{ is an ancestor of } i\\ 0, & \text{otherwise.}\end{cases} \quad (16)$$
Proof.

See Appendix A.1. ∎

Prop. 7 shows that $\mathbf{G}_{\mathcal{T}}^{-1}$ is a matrix indicating the ancestor-descendant relationships. Although $\mathbf{G}_{\mathcal{T}}^{-1}$ is denser than $\mathbf{G}_{\mathcal{T}}$, in most cases $\mathbf{G}_{\mathcal{T}}^{-1}$ is still sparse.

Property 8.

Let $\mathbf{M}=\mathbf{G}_{\mathcal{T}}\mathbf{G}_{\mathcal{T}}^{T}$ and let $m_{i,j}$ denote the element in row $i$ and column $j$ of $\mathbf{M}$. Except for $m_{11}=1$, the other elements satisfy “$m_{ij}=1$ $\Leftrightarrow$ $i$ and $j$ are sibling nodes”.

Property 9.

Let $\mathbf{M}=\left(\mathbf{G}_{\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}}\right)^{-1}$ and let $m_{ij}$ be the element in row $i$ and column $j$ of $\mathbf{M}$; then the value of $m_{ij}$ represents the number of common ancestors of the node pair $i$ and $j$, and $m_{ii}$ represents the depth of $i$.

Proof.

See Appendix A.2. ∎

Similar to Prop. 7, Prop. 8 can also be used as an indicator matrix describing the relationships between nodes. Prop. 9 is an important property that describes an effective method of calculating common ancestors. As an essential feature describing the correlation between nodes, the number of common ancestors has important application value for hierarchical tree analyses [8].

In the study of spectral graph theory, feature analysis is usually indispensable. Our research shows that the eigenvalues and eigenvectors of a Generation Matrix satisfy the following properties.

Property 10 (Eigenvalues and Eigenvectors).

Let $\boldsymbol{\lambda}=[\lambda_1,\lambda_2,\ldots,\lambda_n]^{T}$ denote the eigenvalues of $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$; then the $i$-th eigenvalue $\lambda_i$ is $w_i$.

Let the left and right eigenvectors corresponding to the $i$-th eigenvalue be denoted as $\mathbf{u}_i$ and $\mathbf{v}_i$, respectively. The premise for the existence of $\mathbf{u}_i$ is that each ancestor $j$ of $i$ satisfies $w_i\ne w_j$, and the premise for the existence of $\mathbf{v}_i$ is that each descendant $j$ of $i$ satisfies $w_i\ne w_j$.

Let the $j$-th elements of $\mathbf{u}_i$ and $\mathbf{v}_i$ be denoted as $u_j^{(i)}$ and $v_j^{(i)}$, respectively. If $\mathbf{u}_i$ exists, then $u_j^{(i)}=0$ for any $j>i$. Let $u_i^{(i)}=1$. The remaining elements $u_j^{(i)}$ $(j<i)$ can be obtained by

$$u_j^{(i)}=\begin{cases}\frac{\sum_{k\in\mathcal{C}_j}w_{k\to j}u_k^{(i)}}{w_j-w_i}, & j \text{ is an ancestor of } i\\ 0, & \text{otherwise.}\end{cases} \quad (17)$$

If $\mathbf{v}_i$ exists, then $v_j^{(i)}=0$ for any $j<i$. Let $v_i^{(i)}=1$. The remaining elements $v_j^{(i)}$ $(j>i)$ can be obtained by

$$v_j^{(i)}=\begin{cases}\frac{w_{j\to f_j}v_{f_j}^{(i)}}{w_j-w_i}, & j \text{ is a descendant of } i\\ 0, & \text{otherwise.}\end{cases} \quad (18)$$

Prop. 10 shows that the eigenvalues and eigenvectors of Generation Matrix have many interesting properties. For example, the eigenvalues of Generation Matrix are the weights of the nodes, which are much easier to obtain than those of other matrix representations; the eigenvectors also satisfy propagation properties similar to Props. 2 and 3. Notably, the eigenvectors exist only conditionally, which means that Generation Matrix is not always diagonalizable. Some aspects of Generation Matrices, especially the eigenvectors of $\mathbf{G}_{\mathcal{T}}$, still leave many problems to be studied. Although feature analysis is not the main focus of this paper, Prop. 10 provides some valuable references for subsequent research works.

Considering the relationship between $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ and the corresponding $\mathbf{G}_{\mathcal{T}}$, we find that $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ satisfies a particular decomposition form, which we call the diagonal decomposition of Generation Matrix.

Property 11 (Diagonal Decomposition).

Given an arbitrary $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$, there is always a pair of vectors $\boldsymbol{\alpha},\boldsymbol{\beta}\in\mathbb{R}^{n}$ making the following decomposition hold for $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ and the structure matrix $\mathbf{G}_{\mathcal{T}}$:

$$\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}=\operatorname{diag}(\boldsymbol{\beta})\,\mathbf{G}_{\mathcal{T}}\,\operatorname{diag}(\boldsymbol{\alpha}). \quad (19)$$

Let $\alpha_i$ and $\beta_i$ denote their $i$-th elements, respectively. Then a pair of legal $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$ can be obtained by

$$\begin{cases}\boldsymbol{\alpha}=\exp\left(\mathbf{G}_{\mathcal{T}}^{-1}\left(\ln(\mathbf{w}_{node})-\ln(\mathbf{w}_{edge})\right)\right)\\ \boldsymbol{\beta}=\mathbf{w}_{node}\oslash\boldsymbol{\alpha},\end{cases} \quad (20)$$

where “$\oslash$” denotes the element-wise division of vectors. Note that since the root, numbered 1, has no parent, we set $w_{1\to\varnothing}=1$ as the first element of $\mathbf{w}_{edge}$; for the remaining elements, $w_{i\to f_i}$ is the $i$-th element of $\mathbf{w}_{edge}$.

Proof.

See Appendix A.3. ∎

By the diagonal decomposition of Generation Matrix, we can express an arbitrary $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ as an expression in $\mathbf{G}_{\mathcal{T}}$. Using Prop. 11, we can extend some mathematical properties of $\mathbf{G}_{\mathcal{T}}$ to $\mathbf{G}_{\mathcal{T}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ to solve more problems effectively.

4.3 The Conversion between Generation Matrix and Other Matrix Representations

Figure 4: Conversion Relationships from Generation Matrix to Other Matrix Representations

Our research shows that Generation Matrix is not a matrix representation isolated from the others. Through proper operations, we can convert Generation Matrix into other matrix representations. Fig. 4 shows the four matrix representations that can be obtained from the Generation Matrix constructed from the hierarchical tree in Fig. 1, including the Adjacency Matrix, Laplacian Matrix, Distance Matrix, and Ancestral Matrix.

Theorem 1.

Let $\mathbf{A}_{\mathcal{T}}$ be the Adjacency Matrix of $\mathcal{T}$; then $\mathbf{A}_{\mathcal{T}}$ can be obtained by the following expression of $\mathbf{G}_{\mathcal{T}}$:

$$\mathbf{A}_{\mathcal{T}}=\mathbf{I}-\mathbf{G}_{\mathcal{T}}. \quad (21)$$
Theorem 2.

Let $\mathbf{L}_{\mathcal{T}}$ be the Laplacian Matrix of $\mathcal{T}$; then $\mathbf{L}_{\mathcal{T}}$ can be obtained by the following expression of $\mathbf{G}_{\mathcal{T}}$:

$$\mathbf{L}_{\mathcal{T}}=\mathbf{G}_{\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}}-\mathbf{e}_1\mathbf{e}_1^{T}. \quad (22)$$
Theorem 3.

Let $\mathbf{D}_{\mathcal{T}}$ be the Distance Matrix of $\mathcal{T}$; then $\mathbf{D}_{\mathcal{T}}$ can be obtained by the following expression of $\mathbf{G}_{\mathcal{T}}$, where $\mathbf{1}$ denotes the all-ones vector:

$$\mathbf{D}_{\mathcal{T}}=\mathbf{G}_{\mathcal{T}}^{-1}\mathbf{1}\mathbf{1}^{T}+\mathbf{1}\mathbf{1}^{T}\mathbf{G}_{\mathcal{T}}^{-T}-2\left(\mathbf{G}_{\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}}\right)^{-1}. \quad (23)$$
Theorem 4.

Let $\mathbf{C}_{\mathcal{T}}$ be the Ancestral Matrix [8] of $\mathcal{T}$; then $\mathbf{C}_{\mathcal{T}}$ can be obtained by the following expression of $\mathbf{G}_{\mathcal{T}}$:

$$\mathbf{C}_{\mathcal{T}}=\mathbf{H}_{\mathcal{H}}\left(\mathbf{G}_{\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}}\right)^{-1}\mathbf{H}_{\mathcal{H}}^{T}-\mathbf{1}\mathbf{1}^{T}. \quad (24)$$
Proof.

The proofs of Thms. 2-4 are given in Appendices A.4 to A.6. ∎

It can be seen from the theorems above that we can convert Generation Matrix into other matrix representations through simple matrix expressions. However, the reverse is not easy: except for the Adjacency Matrix, the other matrix representations cannot be directly converted back to Generation Matrix. Therefore, we can use Generation Matrix to construct the other matrix representations. Besides, this implies that the theory of Generation Matrix has a particular internal connection with the matrices it can represent; we can combine the theories of Generation Matrix and other matrix representations to solve more problems about hierarchical trees.
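
As an illustration, the conversions of Thms. 1-3 can be written directly as matrix expressions (our sketch, reusing G from the earlier sketches; dense inverses are used only because the toy tree is small):

```python
import numpy as np

Gd = G.toarray()
n = Gd.shape[0]
one = np.ones((n, 1))
e1 = np.zeros((n, 1)); e1[0] = 1.0

A = np.eye(n) - Gd                                # Thm. 1: Adjacency Matrix
L = Gd.T @ Gd - e1 @ e1.T                         # Thm. 2: Laplacian Matrix
Ginv = np.linalg.inv(Gd)
D = Ginv @ one @ one.T + one @ one.T @ Ginv.T \
    - 2 * np.linalg.inv(Gd.T @ Gd)                # Thm. 3: Distance Matrix
```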

5 The Application on Differentially Private Hierarchical Tree Release

5.1 Hierarchical Tree Release Based on Generation Matrix

In this section, we introduce how to apply Generation Matrix to achieve an optimally consistent release for differentially private hierarchical trees efficiently and concisely. Since each leaf of $\mathcal{T}$ corresponds to an $x_i$, we use the mapping matrix $\mathbf{H}_{\mathcal{H}}$ to represent the mapping relationship between leaf nodes and hierarchical tree nodes. Using Prop. 2, the hierarchical tree building process $\operatorname{BuildTree}_{\mathcal{T}}$ can be described as

$$\mathbf{v}=\operatorname{BuildTree}_{\mathcal{T}}(\mathbf{x})=\mathbf{G}_{\mathcal{T}}^{-T}\mathbf{H}_{\mathcal{H}}^{T}\mathbf{x}. \quad (25)$$

At the same time, we define the inverse tree-building process $\operatorname{BuildTree}_{\mathcal{T}}^{-1}$ as the following expression, which takes the leaves of $\mathcal{T}$ and restores them to $\mathbf{x}$:

$$\mathbf{x}=\operatorname{BuildTree}_{\mathcal{T}}^{-1}(\mathbf{v})=\mathbf{H}_{\mathcal{H}}\mathbf{v}. \quad (26)$$
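
A sketch of both processes (ours, continuing the earlier code; leaf_nodes plays the role of the ordered set $\mathcal{H}$, listing the node index of each $x_i$):

```python
import numpy as np
import scipy.sparse as sp

def build_tree(G, leaf_nodes, x):
    """Eq. (25): scatter x to the leaf rows with H^T, then propagate upward."""
    m, n = len(x), G.shape[0]
    H = sp.coo_matrix((np.ones(m), (np.arange(m), leaf_nodes)),
                      shape=(m, n)).tocsr()
    return propagate_up(G, H.T @ x), H     # v = G^{-T} H^T x

x = np.array([7., 3., 5., 5.])             # unit counts at the 4 leaves
v, H = build_tree(G, leaf_nodes=[6, 7, 8, 9], x=x)
assert np.allclose(H @ v, x)               # Eq. (26): x = H v
```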

Although most works [13, 19, 20, 21, 14] do not take matrix analysis as the theoretical basis of optimally consistent release, applying matrix analysis has many advantages. One of them is error analysis. Using matrix analysis, we can quickly calculate the overall mean square error of node queries after the post-processing for optimally consistent release.

Theorem 5.

Given the privacy budget $\varepsilon$ and a to-be-released hierarchical tree $\mathcal{T}$ of height $h$ containing $n$ nodes and $m$ leaves, the overall mean square errors before and after post-processing, $\operatorname{mse}(\widetilde{\mathbf{v}})$ and $\operatorname{mse}(\overline{\mathbf{v}})$, satisfy

$$\operatorname{mse}(\widetilde{\mathbf{v}})=\sum\nolimits_{i=1}^{n}\mathbb{E}\left(\left(\widetilde{v}_i-v_i\right)^2\right)=2nh^2/\varepsilon^2, \quad (27)$$
$$\operatorname{mse}(\overline{\mathbf{v}})=\sum\nolimits_{i=1}^{n}\mathbb{E}\left(\left(\overline{v}_i-v_i\right)^2\right)=2mh^2/\varepsilon^2. \quad (28)$$
Proof.

See Appendix A.7. ∎

According to Thm. 5, after post-processing the overall mean square error depends on the number of leaves. Since the number of leaves $m$ is always less than the number of nodes $n$, post-processing significantly reduces the error. As the proof in Appendix A.7 shows, we applied the properties of the matrix trace to obtain a concrete and concise demonstration, which embodies the great potential of matrix analysis for solving such problems.

Next, we introduce how to apply Generation Matrix to achieve a sufficiently efficient algorithm.

5.2 “LO”-QR Decomposition on $\mathbf{M}_{\mathcal{T}}$

To obtain an efficient release algorithm, we apply QR decomposition to analyze Formula (8). Unfortunately, the traditional QR decomposition [23] cannot meet our analysis requirements, so we propose another QR decomposition form, namely the “LO”-QR decomposition.

Definition 10 (“LO”-QR Decomposition).

For a matrix $\mathbf{M}\in\mathbb{R}^{n\times m}$ $(n\ge m)$, “LO”-QR decomposition looks for an orthogonal matrix $\mathbf{Q}$ that converts $\mathbf{M}$ into a form composed of a lower triangular matrix $\mathbf{L}$ and a zero matrix, i.e.,

$$\mathbf{M}=\mathbf{Q}\left[\begin{matrix}\mathbf{L}\\ \mathbf{O}\end{matrix}\right]. \quad (29)$$

Correspondingly, we call the traditional QR decomposition the “UO”-QR decomposition, which decomposes a matrix into an upper triangular matrix and a zero matrix. Although the two have different forms, both are achieved through a series of basic orthogonal transformations. Among the various orthogonal transformation techniques, the Householder transformation is the most widely used because of its high efficiency, easy implementation, and applicability to sparse matrices. Completing a QR decomposition requires $m$ Householder transformations. To describe the QR decomposition process in more detail, we define the $(\mathcal{S},j)$-Householder transformation to describe each transformation.

Definition 11 ($(\mathcal{S},j)$-Householder Transformation).

For a matrix $\mathbf{M}$, given an ordered set $\mathcal{S}=\langle s_1,s_2,\ldots,s_r\rangle$ and a column number $j$, the $(\mathcal{S},j)$-Householder transformation selects the rows $s_1,s_2,\ldots,s_r$ and column $j$ of $\mathbf{M}$ as the reference for the Householder transformation $\mathbf{Y}=\mathbf{QM}$. The transformation makes $\mathbf{Y}$ satisfy $y_{s_1,j}=\sqrt{\sum_{s\in\mathcal{S}}m_{s,j}^2}$ and $y_{s_i,j}=0$ for $2\le i\le r$, where $\mathbf{Q}$ satisfies the following expression:

$$\begin{cases}\mathbf{m}=\mathbf{H}_{\mathcal{S}}\mathbf{M}\mathbf{e}_j\\ \boldsymbol{\omega}=\mathbf{m}-\|\mathbf{m}\|\mathbf{e}_1\\ \mathbf{Q}=\mathbf{I}-2\mathbf{H}_{\mathcal{S}}^{T}\frac{\boldsymbol{\omega}\boldsymbol{\omega}^{T}}{\boldsymbol{\omega}^{T}\boldsymbol{\omega}}\mathbf{H}_{\mathcal{S}}.\end{cases} \quad (30)$$

For $\mathbf{M}$, we denote the $(\mathcal{S},j)$-Householder transformation as $\mathbf{Y}=\operatorname{House}_{\mathcal{S},j}(\mathbf{M})$. The QR decomposition based on the Householder transformation can be expressed as

$$\mathbf{R}=\operatorname{House}_{\varsigma_m}\centerdot\cdots\centerdot\operatorname{House}_{\varsigma_2}\centerdot\operatorname{House}_{\varsigma_1}(\mathbf{M}), \quad (31)$$

where $\mathbf{R}$ is the result of the QR decomposition and “$\centerdot$” represents function composition, i.e., $f_2\centerdot f_1(x)=f_2(f_1(x))$. For “UO”-QR decomposition, the parameter $\varsigma_i=\left(\langle i,\cdots,n\rangle,i\right)$; for “LO”-QR decomposition, the parameter $\varsigma_i=\left(\langle m-i+1,1,\cdots,m-i,m+1,\cdots,n\rangle,m-i+1\right)$.

Next, we apply the “LO”-QR decomposition to $\mathbf{M}_{\mathcal{T}}$. To understand how “LO”-QR decomposition affects $\mathbf{M}_{\mathcal{T}}$, we partition $\mathbf{M}_{\mathcal{T}}$ into the following form:

$$\mathbf{M}_{\mathcal{T}}\equiv\left[\begin{matrix}\mathbf{M}_{\uparrow}\\ \mathbf{M}_{\downarrow}\end{matrix}\right]. \quad (32)$$

The upper half $\mathbf{M}_{\uparrow}\in\mathbb{R}^{n_1\times n_1}$ of $\mathbf{M}_{\mathcal{T}}$ consists of the first $n_1$ rows; the lower half $\mathbf{M}_{\downarrow}\in\mathbb{R}^{m\times n_1}$ consists of the remaining $m$ rows. According to Def. 5, we have $\mathbf{M}_{\uparrow}=\mathbf{G}_{\mathcal{T}^{(1)}}$ and $\mathbf{M}_{\downarrow}=-\mathbf{H}_{\mathcal{S}^{(f)}}$, where the $i$-th element of the ordered set $\mathcal{S}^{(f)}$ satisfies $s_i^{(f)}=f_{i+n_1}$. Considering the effects of the Householder transformations on $\mathbf{M}_{\mathcal{T}}$, we denote the matrix obtained after the $k$-th Householder transformation as $\mathbf{M}_{\mathcal{T}}^{(k)}$ $(0\le k\le n_1)$, whose upper and lower halves are denoted as $\mathbf{M}_{\uparrow}^{(k)}$ and $\mathbf{M}_{\downarrow}^{(k)}$. $\mathbf{M}_{\mathcal{T}}^{(0)}$ is the form before any Householder transformation, satisfying $\mathbf{M}_{\mathcal{T}}^{(0)}=\mathbf{M}_{\mathcal{T}}$; $\mathbf{M}_{\mathcal{T}}^{(n_1)}$ is the result after the “LO”-QR decomposition. Thm. 6 demonstrates that $\mathbf{M}_{\mathcal{T}}^{(k)}$ keeps some properties invariant during the Householder transformations.

Figure 5: Householder Transformations in “LO”-QR Decomposition on $\mathbf{M}_{\mathcal{T}}$
Theorem 6.

For the “LO”-QR decomposition on $\mathbf{M}_{\mathcal{T}}$, after $k$ $(0\le k\le n_1-1)$ Householder transformations, $\mathbf{M}_{\mathcal{T}}^{(k)}$ always keeps the following three properties unchanged:

  • a)

    $\mathbf{M}_{\uparrow}^{(k)}\sim\mathbf{G}_{\mathcal{T}^{(1)}}$;

  • b)

    each row of $\mathbf{M}_{\downarrow}^{(k)}$ has one and only one non-zero value;

  • c)

    the last $k$ columns of $\mathbf{M}_{\downarrow}^{(k)}$ (i.e., the columns from $n_1-k+1$ to $n_1$) are all 0.

Proof.

See Appendix A.8. ∎

As shown in Fig. 5, as the Householder transformations proceed, the non-zero elements in $\mathbf{M}_{\downarrow}^{(k)}$ shift from right to left, and the positions of the non-zero elements of $\mathbf{M}_{\uparrow}^{(k)}$ remain unchanged. Specifically, in the $k$-th Householder transformation, all non-zero elements in the $(n_1-k+1)$-th column of $\mathbf{M}_{\downarrow}^{(k)}$ are transferred to the $f_{n_1-k+1}$-th column. Combining Thm. 6, we can infer the specific form of $\mathbf{M}_{\mathcal{T}}^{(n_1-1)}$ after $n_1-1$ Householder transformations. After the last Householder transformation, the result satisfies the following theorem.

Theorem 7.

After the “LO”-QR decomposition, there exist a $\mathbf{G}_{\mathcal{T}^{(1)}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ and an orthogonal matrix $\mathbf{Q}$ such that $\mathbf{M}_{\mathcal{T}}$ satisfies:

$$\mathbf{M}_{\mathcal{T}}=\mathbf{Q}\left[\begin{matrix}\mathbf{G}_{\mathcal{T}^{(1)}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}\\ \mathbf{O}\end{matrix}\right]. \quad (33)$$
Proof.

See Appendix A.9. ∎

Thm. 7 implies that the calculation of $\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}$ in expression (8) can be replaced by a calculation on a Generation Matrix of $\mathcal{T}^{(1)}$. Since this matrix is related to both $\mathcal{T}$ and $\mathcal{T}^{(1)}$, we denote it as $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$. Thm. 8 shows that $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ and $\mathbf{M}_{\mathcal{T}}$ are inner-product-equivalent.

Theorem 8.

For an arbitrary $\mathbf{M}_{\mathcal{T}}$, there is a $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ inner-product-equivalent to it, which satisfies

\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}=\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}, (34)

where $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ is the upper half $\mathbf{M}_{\uparrow}^{(n_1)}$ of $\mathbf{M}_{\mathcal{T}}$ after the “LO”-QR decomposition.

Proof.

See Appendix A.10. ∎
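To see the mechanism of Thm. 8 concretely, note that any QR factorization already provides inner-product equivalence between a matrix and its triangular factor; the theorem specializes this to $\mathbf{M}_{\mathcal{T}}$, whose “LO”-QR factor is itself a Generation Matrix. The following minimal numerical check (our illustration, using NumPy's standard QR rather than the “LO” variant) verifies the identity:

import numpy as np

# For any M with QR factorization M = Q R (Q orthogonal),
# M^T M = R^T (Q^T Q) R = R^T R, i.e., M and R are inner-product-equivalent.
rng = np.random.default_rng(0)
M = rng.standard_normal((7, 4))       # stand-in for M_T (n x n1)
_, R = np.linalg.qr(M)                # NumPy computes an upper-triangular R
assert np.allclose(M.T @ M, R.T @ R)  # the inner-product equivalence (34)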

5.3 Generation Matrix-based Optimally Consistent Release Algorithm

The property of inner-product equivalence provides a vital optimization idea for the optimal release, as shown in Cor. 2. Using matrix analysis, we convert expression (8) into an expression about $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ and then use the mathematical properties of Generation Matrix to improve the efficiency of the optimal release.

Corollary 2.

The expression $\mathbf{y}=\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\mathbf{x}$ can be evaluated by performing an upward propagation and a downward propagation successively with $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$. That is,

\mathbf{y}=\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\mathbf{x}=\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{-1}\left(\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{-T}\mathbf{x}\right). (35)
Proof.

See Appendix A.11. ∎

According to Cor. 2, we obtain another form of the optimal release as follows:

\overline{\mathbf{v}}=\widetilde{\mathbf{v}}-\mathbf{M}_{\mathcal{T}}\left(\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{-1}\left(\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{-T}\left(\mathbf{M}_{\mathcal{T}}^{T}\widetilde{\mathbf{v}}\right)\right)\right). (36)

The parentheses indicate that Formula (36) is evaluated from right to left, which ensures that every multiplication and every linear-system solve involves only a matrix and a vector. Due to the sparsity of $\mathbf{M}_{\mathcal{T}}$, the time complexity of the $\mathbf{M}_{\mathcal{T}}$-related multiplications in Formula (36) is $O(n)$. Besides, according to Prop. 14 and Prop. 15, the time complexity of solving a linear system with a Generation Matrix is $O(n_1)$. Therefore, the overall time complexity of Formula (36) is $O(n)$. Formula (36) completely summarizes the core process of optimally consistent release: once $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ is constructed, executing the formula directly yields the optimally consistent release efficiently. The algorithm description with Generation Matrices is very concise and easy to implement.
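As an illustration of how directly Formula (36) maps onto standard sparse routines, the following sketch (ours; it assumes $\mathbf{M}_{\mathcal{T}}$ and $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ are available as scipy.sparse matrices, and the function and variable names are hypothetical) evaluates it with two sparse matrix-vector products and two triangular solves:

from scipy.sparse.linalg import spsolve_triangular

def optimally_consistent_release(v_noisy, M, G):
    # Formula (36), evaluated right to left: two sparse matrix-vector
    # products and two triangular solves, so the total cost is O(n).
    # M: consistency constraint matrix M_T (n x n1, scipy.sparse);
    # G: Generation Matrix G_{T^(1) <- T} (n1 x n1, sparse lower triangular).
    x = M.T @ v_noisy                                    # M_T^T v~
    y = spsolve_triangular(G.T.tocsr(), x, lower=False)  # G^{-T} x
    y = spsolve_triangular(G.tocsr(), y, lower=True)     # G^{-1} y
    return v_noisy - M @ y                               # v~ - M_T (...)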

However, the whole process is efficient only if the construction of $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ is also highly efficient. Therefore, we further propose Thm. 9 to calculate it.

Theorem 9.

Let $w_i$ $(1\leq i\leq n_1)$ and $w_{i\to f_i}$ $(2\leq i\leq n_1)$ be the weights of the nodes and edges in $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$, respectively. They satisfy

\left\{\begin{aligned}&w_i=\sqrt{1+\theta_i}\\&w_{i\to f_i}=w_i^{-1}\end{aligned}\right., (37)

where $\theta_i$ $(1\leq i\leq n_1)$ satisfies

\theta_i=\left|\mathcal{C}_i\right|-\sum_{j\in\mathcal{C}_i\wedge j\leq n_1}\left(1+\theta_j\right)^{-1}. (38)
Proof.

See Appendix A.12. ∎

Thm. 9 shows that $w_i$ and $w_{i\to f_i}$ can be calculated directly once $\theta_i$ is known, and calculating all $\theta_i$ requires only a single traversal from $n_1$ down to $1$. Since this process also involves computing $|\mathcal{C}_i|$, the overall time complexity of constructing $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ is $O(n)$. The specific construction process is shown in Alg. 1.

Algorithm 1 Construct $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ from $\mathbf{M}_{\mathcal{T}}$
Input: consistency constraint matrix $\mathbf{M}_{\mathcal{T}}$
Output: the inner-product-equivalent $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$
1: Initialize $\boldsymbol{\theta}\in\mathbb{R}^{n_1\times 1}$, satisfying $\theta_i=|\mathcal{C}_i|$;
2: for $i=n_1$ to $2$ do $\theta_{f_i}\leftarrow\theta_{f_i}-(1+\theta_i)^{-1}$;
3: for $i=1$ to $n_1$ do let $w_i=\sqrt{1+\theta_i}$;
4: for $i=2$ to $n_1$ do let $w_{i\to f_i}=w_i^{-1}$;
5: Construct $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ from $w_i$ and $w_{i\to f_i}$;
6: return $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$;
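A direct transcription of Alg. 1 might look as follows (a sketch under assumed index conventions: arrays are 1-based with index 0 unused, f[i] is the parent of node i, and num_children[i] = $|\mathcal{C}_i|$ is precomputed from $\mathbf{M}_{\mathcal{T}}$):

import numpy as np

def construct_weights(f, num_children, n1):
    # Steps 1-4 of Alg. 1. Arrays are 1-based (index 0 unused):
    # f[i] is the parent of node i; num_children[i] = |C_i|.
    theta = num_children[:n1 + 1].astype(float)  # Step 1: theta_i = |C_i|
    for i in range(n1, 1, -1):                   # Step 2: i = n1, ..., 2
        theta[f[i]] -= 1.0 / (1.0 + theta[i])
    w_node = np.sqrt(1.0 + theta[1:])            # Step 3: w_i, i = 1..n1
    w_edge = 1.0 / w_node[1:]                    # Step 4: w_{i->f_i}, i = 2..n1
    return w_node, w_edge                        # Step 5 assembles G from these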

In summary, we propose a Generation Matrix-based optimally consistent release algorithm (GMC) for differentially private hierarchical trees, described as Alg. 2. Note that Steps 1 to 3 in Alg. 2 are the normal hierarchical tree release process, and Step 4 is the call to Alg. 1. Step 5 is the core step, which uses Formula (36) to achieve the optimally consistent release. Since the time complexity of Formula (36) has been proved to be $O(n)$, the overall time complexity of GMC is also $O(n)$. Besides, GMC is a two-stage algorithm, i.e., the construction of $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ followed by the post-processing. If the same hierarchical tree is used for multiple releases, we only need to construct $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ once.

Algorithm 2 Generation Matrix-based Optimally Consistent Release Algorithm (GMC)
Input: hierarchical tree $\mathcal{T}$, dataset $\mathcal{D}$, privacy parameter $\varepsilon$
Output: the optimally consistent release $\overline{\mathbf{v}}$
1: Construct a vector $\mathbf{x}$ from $\mathcal{D}$ satisfying $x_i=\phi_i(\mathcal{D})$;
2: Build the hierarchical tree values with $\mathbf{v}=\mathbf{G}_{\mathcal{T}}^{-T}\mathbf{H}_{\mathcal{H}}^{T}\mathbf{x}$;
3: Calculate the height $h$ of $\mathcal{T}$, then obtain $\widetilde{\mathbf{v}}$ by adding noise to $\mathbf{v}$, where $\widetilde{v}_i=v_i+\xi_i$, $\xi_i\sim\operatorname{Lap}(h/\varepsilon)$;
4: Construct $\mathbf{M}_{\mathcal{T}}$, then substitute it into Alg. 1 to obtain $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$;
5: Calculate $\overline{\mathbf{v}}=\widetilde{\mathbf{v}}-\mathbf{M}_{\mathcal{T}}\left(\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{-1}\left(\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{-T}\left(\mathbf{M}_{\mathcal{T}}^{T}\widetilde{\mathbf{v}}\right)\right)\right)$;
6: return $\overline{\mathbf{v}}$;
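Combining the sketches above, Steps 3 to 5 of Alg. 2 could be summarized as follows (again a sketch; gmc_release and its arguments are our own names):

import numpy as np

def gmc_release(v, M, G, h, eps, seed=None):
    # Steps 3-5 of Alg. 2: perturb v with Laplace noise of scale h/eps,
    # then post-process with Formula (36).
    rng = np.random.default_rng(seed)
    v_noisy = v + rng.laplace(scale=h / eps, size=v.shape)
    return optimally_consistent_release(v_noisy, M, G)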

In addition, we can optimize the algorithm further. Considering that the construction of $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ involves square root extraction, which may cause extra calculation overhead, we propose an improved version of Alg. 2 that avoids square roots entirely. The main idea is shown as follows.

\overline{\mathbf{v}}=\widetilde{\mathbf{v}}-\mathbf{M}_{\mathcal{T}}\left({\mathbf{G}_{\mathcal{T}^{(1)}}^{(\boldsymbol{\theta}^{\prime},\mathbf{1})}}^{-1}\left(\boldsymbol{\theta}^{\prime}*\left({\mathbf{G}_{\mathcal{T}^{(1)}}^{(\boldsymbol{\theta}^{\prime},\mathbf{1})}}^{-T}\left(\mathbf{M}_{\mathcal{T}}^{T}\widetilde{\mathbf{v}}\right)\right)\right)\right). (39)

Here the vector $\boldsymbol{\theta}^{\prime}=\left[\theta_1+1,\theta_2+1,\ldots,\theta_{n_1}+1\right]^{T}$, and “$*$” denotes the Hadamard (element-wise) product [32]. The version without square roots slightly improves the calculation efficiency and shows that algorithms under the matrix description are highly extensible: the improvement only requires slight modifications to Formula (36). Furthermore, we can directly extend the existing model to other hierarchical tree problems, such as hierarchical tree release with non-uniform privacy budgets.
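Under the same assumptions as the earlier sketch, the square-root-free variant only changes the middle steps: the triangular solves now use $\mathbf{G}_{\mathcal{T}^{(1)}}^{(\boldsymbol{\theta}^{\prime},\mathbf{1})}$ (node weights $1+\theta_i$, unit edge weights), with a Hadamard product by $\boldsymbol{\theta}^{\prime}$ inserted between them:

from scipy.sparse.linalg import spsolve_triangular

def release_no_sqrt(v_noisy, M, Gp, theta_p):
    # Formula (39). Gp is the Generation Matrix with node weights
    # theta' = 1 + theta and unit edge weights; theta_p is the vector theta'.
    x = M.T @ v_noisy
    y = spsolve_triangular(Gp.T.tocsr(), x, lower=False)
    y = theta_p * y                       # Hadamard product with theta'
    y = spsolve_triangular(Gp.tocsr(), y, lower=True)
    return v_noisy - M @ y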

6 Experiments

We conducted experiments to verify the performance of our proposed GMC for differentially private hierarchical tree release. All experiments ran on a computer with dual 4-core 3.9 GHz AMD Ryzen CPUs, 32 GB RAM, and MATLAB. To improve the reliability of the experimental results, we ran each experimental setup 100 times and report the average as the final result. In addition, we denote the output of an algorithm by $\mathbf{v}^{(out)}$ or $\mathbf{x}^{(out)}$, whose $i$-th elements are denoted as $v_i^{(out)}$ and $x_i^{(out)}$, respectively.

6.1 Datasets and Comparison Algorithms

Our experiments run on 3 large datasets with more than 10 million nodes: Census2010, NYCTaxi, and SynData, with the following details.

Census2010 [33]: Due to the U.S. Census Bureau's announced plan to adopt differential privacy [2], we take the 2010 U.S. Census dataset as our first dataset, which contains the demographic information of all Americans. The statistical results are divided into 8 levels by geographic component, i.e., “United States - State - County - County Subdivision - Place/Remainder - Census Tract - Block Group - Block”. The dataset contains 312,471,327 individuals, which form a hierarchical tree containing 11,802,162 nodes. One of its typical applications is to answer users' queries on the population of an area of interest, called “Node Query”. For example, a user submits the query “What is the population of Albany County in New York, USA?”, and the system returns the value at the node “USA - New York - Albany County”. After verification, we ensured that the data before adding noise satisfies consistency.

NYCTaxi [34]: This dataset comes from taxi ride records in New York City in 2013. To ensure the uniqueness of the data, we selected the data provided by Creative Mobile Technologies. According to the start time of each ride, we constructed a statistical histogram of taxi-riding frequency by the second and provide range queries on it. The process of building the hierarchical tree is randomized, which ensures that our algorithm can handle any hierarchical tree structure. The data contains 86,687,775 individuals. Since we count the ride frequency by the second, the number of histogram bins (leaf nodes) totals 31,536,000. The fan-out of nodes is random: in our experiments, we set the proportions of nodes with fan-outs of 2, 3, 4, and 5 to $\{40\%,30\%,20\%,10\%\}$. In this way, the hierarchical tree we build contains approximately 45 to 50 million nodes.

SynData: To test hierarchical trees with special structures, we adopt a randomly synthesized dataset. In the synthetic data, the hierarchical tree structure is a complete binary tree, and the number of nodes $n$ is controlled by the tree height $h$, where $n=2^h-1$. We generate each $x_i$ from a Poisson distribution, i.e., $x_i\sim\operatorname{Poi}(\lambda)$. In our experiments, we set $\lambda=100$ and take $h=24$ as the complete dataset, containing 16,777,215 nodes. Like NYCTaxi, we focus on the “Range Query” on SynData.
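For reproducibility, SynData can be generated along the following lines (a sketch; the level-order parent rule is our own indexing choice):

import numpy as np

def gen_syndata(h, lam=100, seed=0):
    # Complete binary tree of height h: n = 2**h - 1 nodes. Under 1-based
    # level ordering, the parent of node i (i >= 2) is i // 2.
    n = 2 ** h - 1
    rng = np.random.default_rng(seed)
    x = rng.poisson(lam, size=2 ** (h - 1))  # leaf values x_i ~ Poi(lambda)
    f = np.arange(n + 1) // 2                # f[i] = i // 2; f[1] = 0 (root)
    return x, f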

Table 2: Description of the Algorithms

Name         Description
Processless  No processing after adding noise.
Boosting     Boosting [13], the classic post-processing method designed only for complete trees. For an arbitrary structure, we take the fan-out of the root as the fan-out of the algorithm.
PrivTrie     The post-processing in PrivTrie [14], which has been proven to achieve the optimally consistent release for arbitrary hierarchical trees.
GMC          Our post-processing for arbitrary hierarchical trees, which achieves the optimally consistent release based on Generation Matrix.

Our experiments compare 4 different algorithm settings; their details are shown in Tab. 2. Because our experiments target large-scale hierarchical trees with more than 10 million nodes, all of the selected algorithms have $O(n)$ time complexity.

6.2 Verifying the Effectiveness of GMC

In the first experiment, we verify the effectiveness of GMC in two main aspects, i.e., error and consistency.

The error is measured by the root mean square error ($rmse$), whose calculation differs between “Node Query” and “Range Query”. We denote the $rmse$ of “Node Query” (for Census2010) by $rmse_{NQ}$, with the formula

rmse_{NQ}=\sqrt{\frac{1}{n}\sum\nolimits_{i=1}^{n}\left(v_i^{(out)}-v_i\right)^2}. (40)

The $rmse$ of “Range Query” (for NYCTaxi and SynData) is denoted as $rmse_{RQ}$, calculated by

rmse_{RQ}=\sqrt{\frac{1}{q}\sum\nolimits_{i=1}^{q}\left(\sum\nolimits_{j=a_i}^{b_i}x_j^{(out)}-\sum\nolimits_{j=a_i}^{b_i}x_j\right)^2}, (41)

where $q$ is the number of range queries selected randomly and without repetition, and the range of the $i$-th query is denoted as $[a_i,b_i]$. The main reason for random sampling is that the number of possible range queries is up to $C_m^2$, so we cannot test them all. In our experiments, we take $q=10^5$.
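Answering $10^5$ range queries naively would cost $O(qm)$; an implementation would typically precompute prefix sums so that each query costs $O(1)$. A sketch (ours; it samples query endpoints with replacement for simplicity):

import numpy as np

def rmse_rq(x_out, x_true, q=10**5, seed=0):
    # Monte-Carlo estimate of rmse_RQ; prefix sums make each query O(1).
    rng = np.random.default_rng(seed)
    m = len(x_true)
    s_out = np.concatenate(([0.0], np.cumsum(x_out)))
    s_true = np.concatenate(([0.0], np.cumsum(x_true)))
    a = rng.integers(0, m, size=q)
    b = rng.integers(0, m, size=q)
    lo, hi = np.minimum(a, b), np.maximum(a, b) + 1   # query range [lo, hi)
    err = (s_out[hi] - s_out[lo]) - (s_true[hi] - s_true[lo])
    return np.sqrt(np.mean(err ** 2))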

The consistency of the outputs is measured by the consistency bias. Let $\mathbf{\Delta}=\mathbf{M}_{\mathcal{T}}^{T}\mathbf{v}^{(out)}$ and $\Delta_i$ be the $i$-th element of $\mathbf{\Delta}$; then $bias$ satisfies

bias=\sqrt{\sum\nolimits_{i=1}^{n_1}\Delta_i^2/n_1}. (42)

Besides, we adopt the complete datasets in this experiment and take $\varepsilon=1$.
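The consistency bias is then a one-liner once $\mathbf{M}_{\mathcal{T}}$ is available (our sketch):

import numpy as np

def consistency_bias(v_out, M):
    # bias = sqrt(sum_i Delta_i^2 / n1), where Delta = M_T^T v_out
    # has one entry per non-leaf node (n1 entries in total).
    delta = M.T @ v_out
    return float(np.sqrt(np.mean(delta ** 2)))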

Table 3: Experimental Results on Multiple Algorithm Settings and Large-scale Datasets

             Census2010        NYCTaxi           SynData
             rmse    bias      rmse     bias     rmse     bias
Processless  11.32   49.61     138.42   48.38    166.37   61.24
Boosting     11.30   159.01    120.75   22.42    74.94    0.00
PrivTrie     11.00   0.00      68.69    0.00     74.94    0.00
GMC          11.00   0.00      68.69    0.00     74.94    0.00

Since the optimally consistent release problem of the differentially private hierarchical tree is a convex optimization problem, its solution is unique. Hence, if the outputs of GMC are also optimally consistent, they should coincide with those of PrivTrie. The results in Tab. 3 confirm this point and show that GMC is effective and correct. However, this does not mean that PrivTrie and GMC are entirely equivalent: their implementations differ completely, so we further analyze the algorithms' performance in the subsequent experiments.

In addition, the results show the major drawback of Boosting, i.e., it can only guarantee the consistency of complete trees. If the fan-outs of nodes differ, Boosting cannot guarantee consistent results. Especially for Census2010, where the fan-outs of nodes vary widely, Boosting leaves a significant consistency bias even after post-processing.

6.3 Performance Testing

In this section, we focus on the algorithms' performance and test their running time from small-scale to large-scale data. To construct hierarchical trees with different scales, we adopt different methods according to the characteristics of each dataset. For Census2010, we obtain smaller hierarchical trees by taking $k$-order subtrees; for example, in its 4-order subtree, the leaves represent the “County Subdivision” level. For NYCTaxi, we divide the 2013 data into 12 months and take the data of the first $k$ months as the $k$-th subset. For SynData, we control the data scale by directly setting the tree height; in the experiment, we use five different tree heights $\{8,12,16,20,24\}$ to generate hierarchical trees of different scales. Since Processless does not perform any processing for consistency, we omit it in this experiment.

Figure 6: The Comparison of Running Time for Various Data Scales: (a) Census2010, (b) NYCTaxi, (c) SynData

In Fig. 6, the experimental results show that all three algorithms complete the post-processing in an acceptable time, but their running times differ greatly. Boosting and PrivTrie need more than 200 seconds to process the hierarchical trees with tens of millions of nodes, while GMC only needs about 2 seconds for the same data (see Fig. 6a). The performance gap is up to 100 times. It shows that even though Boosting and PrivTrie have reached the lowest time complexity of $O(n)$, they still leave much room for performance improvement. Besides their relatively inefficient recursive implementations, another important reason is that GMC completes the post-processing with standard matrix operations once the matrix model is established. These standard matrix operations introduce many optimization techniques in their underlying design, which make full use of computer resources and significantly improve computing efficiency. Although the results above do not deny that Boosting and PrivTrie can handle large-scale hierarchical trees, applying them to scenarios such as real-time data updates or more complex models may cause considerable challenges in computational efficiency.

Figure 7: Running Time for the Two Stages of GMC with Complete Datasets

Finally, we test the running time of the two stages of GMC, i.e., the Generation Matrix construction (Stage 1) and the post-processing (Stage 2). In Fig. 7, we can see that the time overhead of Stage 1 is much more significant than that of Stage 2. The reason is that the Generation Matrix construction includes sorting the nodes by descending order of height, counting the numbers of children, and some high-cost operations such as division and square roots (in calculating $\theta_i$). It is worth noting that the Generation Matrix construction is independent of the data to be released and does not involve individual privacy. When the same hierarchical tree structure is reused in multiple releases, we only need to perform the construction once, further reducing the overall running time of the optimally consistent releases.

7 Conclusions and Future Work

In this paper, we defined Generation Matrix and demonstrated many of its critical mathematical properties. Using Generation Matrix, we can implement various hierarchical tree operations without accessing the local structure, which provides crucial theoretical support for the Matrixing Research Method of hierarchical trees. The application to differentially private hierarchical tree release reflects the practicability of Generation Matrix. The proposed GMC provides a concise algorithm for optimally consistent release. Our experiments show that GMC achieves a significant performance improvement of up to 100 times compared with the state-of-the-art schemes.

The scientific problem solved in this paper is not very complicated, but it is classic and well suited as an example to show the practicability of Generation Matrix. However, the hierarchical tree problems that Generation Matrix can solve go far beyond this application. We can use it to explore more complex hierarchical tree release problems, and even problems that have not been solved so far. For example, the non-negative consistent release of hierarchical trees currently has no closed-form solution with $O(n)$ time complexity, which makes it very difficult to obtain the optimal release satisfying both consistency and non-negativity for large-scale data. Nonetheless, Generation Matrix provides us with a critical analysis tool for challenging the problem.

Appendix A Partial Proofs

A.1 Proof of Property 16

Proof.

According to the basic properties of triangular matrices [24], the inverse $\mathbf{G}_{\mathcal{T}}^{-1}$ of the lower triangular matrix $\mathbf{G}_{\mathcal{T}}$ is also lower triangular.

Consider the $j$-th column vector $\mathbf{g}_j^{(-1)}=\mathbf{G}_{\mathcal{T}}^{-1}\mathbf{e}_j$ of $\mathbf{G}_{\mathcal{T}}^{-1}$, whose $i$-th element is denoted as $g_{ij}^{(-1)}$; lower triangularity gives $g_{ij}^{(-1)}=0$ for $i<j$. Since only the $j$-th element of $\mathbf{e}_j$ is $1$ and the rest are $0$, $g_{ij}^{(-1)}$ for $i\geq j$ satisfies

g_{ij}^{(-1)}=\left\{\begin{array}{ll}1&,i=j\\ g_{f_i,j}^{(-1)}&,i>j\end{array}\right.. (43)

Therefore, $g_{jj}^{(-1)}=1$. Next, we prove by contradiction. Suppose $j$ is an ancestor of $i$ but $g_{ij}^{(-1)}\neq 1$.

Let $t_0^{(i)},t_1^{(i)},\ldots,t_p^{(i)}$ denote node $i$ and its $p$ ancestors, respectively, where $t_0^{(i)}=i$.

Since $j$ is an ancestor of $i$, there is a $q$ such that $j=t_q^{(i)}$. Therefore, by the recurrence formula (43), we have $g_{jj}^{(-1)}=g_{t_q^{(i)}j}^{(-1)}=\cdots=g_{t_1^{(i)}j}^{(-1)}=g_{ij}^{(-1)}\neq 1$.

This conclusion contradicts $g_{jj}^{(-1)}=1$. Therefore, when $j$ is an ancestor of $i$, $g_{ij}^{(-1)}=1$.

If $j$ is not an ancestor of $i$ (and $j\neq i$), then obviously $j$ is not the root, i.e., $j>1$. According to formula (43), we have $g_{1,j}^{(-1)}=g_{t_p^{(i)}j}^{(-1)}=\cdots=g_{t_1^{(i)}j}^{(-1)}=g_{ij}^{(-1)}$. Since $g_{1,j}^{(-1)}=0$ for $j>1$ by lower triangularity, we obtain $g_{ij}^{(-1)}=0$. ∎

A.2 Proof of Property 9

Proof.

Considering the $i$-th row of $\mathbf{G}_{\mathcal{T}}^{-1}$, we denote $U_i=\left\{k\,\middle|\,g_{ik}^{(-1)}\neq 0\right\}$ as the set of column subscripts of the non-zero elements in the $i$-th row. According to $\mathbf{M}=\left(\mathbf{G}_{\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}}\right)^{-1}=\mathbf{G}_{\mathcal{T}}^{-1}\mathbf{G}_{\mathcal{T}}^{-T}$, $m_{ij}$ satisfies

m_{ij}=\sum\nolimits_{k=1}^{n}g_{ik}^{(-1)}g_{jk}^{(-1)}=\sum\nolimits_{k\in U_i\cap U_j}g_{ik}^{(-1)}g_{jk}^{(-1)}=\left|U_i\cap U_j\right|. (44)

According to Prop. 16, $U_i=\left\{k\,\middle|\,k\text{ is }i\text{ or an ancestor of }i\right\}$. Therefore, $U_i\cap U_j$ is the set of common ancestors of $i$ and $j$, and $m_{ij}$ records the number of their common ancestors.

Specifically, if $i=j$, then $m_{ii}=\left|U_i\right|$, i.e., the number of ancestors of $i$ plus $1$ (itself). Letting the depth of the root be $1$, $m_{ii}$ is exactly the depth of $i$. ∎
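As a small illustration (ours, not part of the original proof), consider the path tree $1\to 2\to 3$ with $f_2=1$ and $f_3=2$. Then

\mathbf{G}_{\mathcal{T}}^{-1}=\begin{bmatrix}1&0&0\\1&1&0\\1&1&1\end{bmatrix},\qquad \left(\mathbf{G}_{\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}}\right)^{-1}=\mathbf{G}_{\mathcal{T}}^{-1}\mathbf{G}_{\mathcal{T}}^{-T}=\begin{bmatrix}1&1&1\\1&2&2\\1&2&3\end{bmatrix},

where, e.g., $m_{23}=2=\left|\{1,2\}\right|$ counts the common ancestors of nodes $2$ and $3$, and the diagonal entries $1,2,3$ are the node depths.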

A.3 Proof of Property 11

Proof.

Let $\boldsymbol{\alpha}=\left[\alpha_1,\alpha_2,\ldots,\alpha_n\right]^T$ and $\boldsymbol{\beta}=\left[\beta_1,\beta_2,\ldots,\beta_n\right]^T$. According to Equation (19), we have $w_i=\alpha_i\beta_i$ and $w_{i\to f_i}=\alpha_{f_i}\beta_i$, i.e., $\ln w_i=\ln\alpha_i+\ln\beta_i$ and $\ln w_{i\to f_i}=\ln\alpha_{f_i}+\ln\beta_i$. Therefore, the following matrix equation holds:

\left[\begin{matrix}\mathbf{I}&\mathbf{I}\\ \mathbf{I}-\mathbf{G}&\mathbf{I}\end{matrix}\right]\left[\begin{matrix}\ln\boldsymbol{\alpha}\\ \ln\boldsymbol{\beta}\end{matrix}\right]=\left[\begin{matrix}\ln\mathbf{w}_{node}\\ \ln\mathbf{w}_{edge}\end{matrix}\right]. (45)

According to the property of Block Matrix Inversion, we have

{\left[\begin{matrix}\mathbf{I}&\mathbf{I}\\ \mathbf{I}-\mathbf{G}&\mathbf{I}\end{matrix}\right]}^{-1}=\left[\begin{matrix}\mathbf{G}^{-1}&-\mathbf{G}^{-1}\\ \mathbf{I}-\mathbf{G}^{-1}&\mathbf{G}^{-1}\end{matrix}\right]. (46)

And then,

\left[\begin{matrix}\ln\boldsymbol{\alpha}\\ \ln\boldsymbol{\beta}\end{matrix}\right]=\left[\begin{matrix}\mathbf{G}_{\mathcal{T}}^{-1}\left(\ln\mathbf{w}_{node}-\ln\mathbf{w}_{edge}\right)\\ \ln\mathbf{w}_{node}+\mathbf{G}_{\mathcal{T}}^{-1}\left(\ln\mathbf{w}_{edge}-\ln\mathbf{w}_{node}\right)\end{matrix}\right]=\left[\begin{matrix}\ln\boldsymbol{\alpha}\\ \ln\mathbf{w}_{node}-\ln\boldsymbol{\alpha}\end{matrix}\right]. (47)

Applying the exponential operation $\exp(\cdot)$ element-wise to both sides of equation (47), we obtain formula (20). ∎

A.4 Proof of Theorem 22

Proof.

Let $l_{ij}$ denote the element in row $i$ and column $j$ of $\mathbf{L}_{\mathcal{T}}$. According to the definition of the Laplacian matrix [6], the diagonal element $l_{ii}$ is the number of nodes adjacent to $i$; if $j$ is adjacent to $i$, i.e., $j=f_i$ or $i=f_j$, then $l_{ij}=-1$; otherwise, $l_{ij}=0$.

According to the expression (22), we have

l_{ij}=\left\{\begin{array}{ll}\left(\sum\nolimits_{k\in U_1}g_{k1}g_{k1}\right)-1&,i=j=1\\ \sum\nolimits_{k\in U_i\cap U_j}g_{ki}g_{kj}&,\text{otherwise}\end{array}\right. (48)

where $U_j=\left\{k\,\middle|\,g_{kj}\neq 0\right\}=\{j\}\cup\mathcal{C}_j$ denotes the set of row subscripts of the non-zero elements in the $j$-th column of $\mathbf{G}_{\mathcal{T}}$.

Next, we discuss the following situations:

For $i=j=1$: under the definition of the Laplacian matrix, $l_{11}$ should equal the number of children of the root, i.e., $l_{11}=\left|\mathcal{C}_1\right|$. According to expression (22), $k\in\{1\}\cup\mathcal{C}_1$ and $l_{11}=\left|\{1\}\cup\mathcal{C}_1\right|-1=\left|\mathcal{C}_1\right|$, which conforms to the definition of the Laplacian matrix.

For $i=j>1$: under the definition of the Laplacian matrix, $l_{ii}$ should equal the number of $i$'s children plus its parent, i.e., $l_{ii}=\left|\mathcal{C}_i\cup\{f_i\}\right|=\left|\mathcal{C}_i\right|+1$. According to expression (22), $l_{ii}=\sum\nolimits_{k\in U_i}g_{ki}g_{ki}=\left|\{i\}\cup\mathcal{C}_i\right|=\left|\mathcal{C}_i\right|+1$, which conforms to the definition.

For $i\neq j$ with $i\in\mathcal{C}_j$: under the definition, $l_{ij}=-1$. According to expression (22), $U_i\cap U_j=\left(\{i\}\cup\mathcal{C}_i\right)\cap\left(\{j\}\cup\mathcal{C}_j\right)=\{i\}$, so $l_{ij}=g_{ii}g_{ij}=-1$, which again conforms to the definition. Similarly, if $i\neq j$ and $j\in\mathcal{C}_i$, then $l_{ij}=-1$. Finally, when $i\neq j$ and $i$ and $j$ have no parent-child relationship, $U_i\cap U_j=\varnothing$ and $l_{ij}=0$, which also conforms to the definition.

Therefore, in all cases, expression (22) conforms to the definition of the Laplacian matrix. ∎

A.5 Proof of Theorem 23

Proof.

Let $d_{ij}$ denote the element in row $i$ and column $j$ of $\mathbf{D}_{\mathcal{T}}$.

Figure 8: The Distance of Nodes $i$, $j$ and Their Nearest Common Ancestor

According to the definition of the distance matrix [7], $d_{ij}$ is the distance from node $i$ to node $j$. As shown in Fig. 8, $c$ (the red node) is the nearest common ancestor of nodes $i$ and $j$; $d_c$ denotes the depth of node $c$ (i.e., the distance from $c$ to the root (the orange node) plus $1$); $d_1$ denotes the distance from $c$ to $i$; $d_2$ denotes the distance from $c$ to $j$. Obviously, the distance $d_{ij}=d_1+d_2$.

Considering expression (23), let $\mathbf{M}_1=\mathbf{G}_{\mathcal{T}}^{-1}\mathbf{I}\mathbf{I}^{T}$, $\mathbf{M}_2=\mathbf{I}\mathbf{I}^{T}\mathbf{G}_{\mathcal{T}}^{-T}$, and $\mathbf{M}_3=\left(\mathbf{G}_{\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}}\right)^{-1}$, and let $m_{ij}^{(k)}$ denote the element in row $i$ and column $j$ of $\mathbf{M}_k$. According to Prop. 6, $m_{ij}^{(1)}$ and $m_{ji}^{(2)}$ equal the depth of node $i$, i.e., $m_{ij}^{(1)}=d_1+d_c$ and $m_{ij}^{(2)}=d_2+d_c$. According to Prop. 9, $m_{ij}^{(3)}$ is the number of common ancestors of $i$ and $j$ (the depth of $c$), i.e., $m_{ij}^{(3)}=d_c$.

Substituting them into expression (23), we have

d_{ij}=m_{ij}^{(1)}+m_{ij}^{(2)}-2m_{ij}^{(3)}=\left(d_1+d_c\right)+\left(d_2+d_c\right)-2d_c=d_1+d_2. (49)

Expression (49) is consistent with the definition, so $\mathbf{D}_{\mathcal{T}}$ can be calculated by expression (23). ∎

A.6 Proof of Theorem 24

Proof.

Let $c_{ij}$ denote the element in row $i$ and column $j$ of $\mathbf{C}_{\mathcal{T}}$.

According to the definition of the Ancestral Matrix [8], $c_{ij}$ represents the distance from the nearest common ancestor of leaves $i$ and $j$ to the root.

Note that Prop. 9 has proved that the elements of $\left(\mathbf{G}_{\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}}\right)^{-1}$ are the numbers of common ancestors, i.e., the distance from the nearest common ancestor to the root plus $1$.

Therefore, we can obtain $\mathbf{C}_{\mathcal{T}}$ by taking the sub-matrix of $\left(\mathbf{G}_{\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}}\right)^{-1}$ corresponding to the leaves and then subtracting $1$. ∎

A.7 Proof of Theorem 28

Proof.

According to formula (5), the noise added to each element of $\widetilde{\mathbf{v}}$ is i.i.d. and satisfies $\operatorname{Lap}(h/\varepsilon)$. Since the variance of $\operatorname{Lap}(b)$ is $2b^2$, the covariance matrix of $\widetilde{\mathbf{v}}$ is $D(\widetilde{\mathbf{v}})=2\left(h^2/\varepsilon^2\right)\mathbf{I}$.

According to the mean square error analysis of differential privacy, we have

\operatorname{mse}\left(\widetilde{\mathbf{v}}\right)=\operatorname{trace}\left(D\left(\widetilde{\mathbf{v}}\right)\right)=2\left(h^2/\varepsilon^2\right)\operatorname{trace}\left(\mathbf{I}\right)=2nh^2/\varepsilon^2. (50)

In addition, according to formula (8),

D\left(\overline{\mathbf{v}}\right)=D\left(\left(\mathbf{I}-\mathbf{M}_{\mathcal{T}}\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\mathbf{M}_{\mathcal{T}}^{T}\right)\widetilde{\mathbf{v}}\right)=2\left(h^2/\varepsilon^2\right)\left(\mathbf{I}-\mathbf{M}_{\mathcal{T}}\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\mathbf{M}_{\mathcal{T}}^{T}\right). (51)

Then,

\operatorname{mse}\left(\overline{\mathbf{v}}\right)=\operatorname{trace}\left(D\left(\overline{\mathbf{v}}\right)\right)=2\left(h^2/\varepsilon^2\right)\operatorname{trace}\left(\mathbf{I}-\mathbf{M}_{\mathcal{T}}\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\mathbf{M}_{\mathcal{T}}^{T}\right)=2\left(h^2/\varepsilon^2\right)\left(n-\operatorname{trace}\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\right)\right). (52)

Since $\mathbf{M}_{\mathcal{T}}\in\mathbb{R}^{n\times n_1}$, we have

\operatorname{trace}\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\right)=\operatorname{trace}\left(\mathbf{I}\right)=n_1. (53)

Substituting it into (52), we have

\operatorname{mse}\left(\overline{\mathbf{v}}\right)=2\left(h^2/\varepsilon^2\right)\left(n-n_1\right)=2mh^2/\varepsilon^2. (54) ∎

A.8 Proof of Theorem 6

Proof.

By Def. 29, the Householder transformation process obviously satisfies property c), because the “LO”-QR decomposition sets columns to $0$ one by one from the last column to the first.

According to Def. 5, $\mathbf{M}_{\mathcal{T}}^{(0)}$ satisfies properties a), b), and c).

Assume that $\mathbf{M}_{\mathcal{T}}^{(k-1)}$ satisfies properties a), b), and c). By formula (31), the expression of the $k$-th Householder transformation is as follows:

\mathbf{M}_{\mathcal{T}}^{(k)}=\operatorname{House}_{\mathcal{S}_k,t_k}\left(\mathbf{M}_{\mathcal{T}}^{(k-1)}\right), (55)

where $t_k=n_1-k+1$. Considering that $\mathbf{M}_{\mathcal{T}}^{(k-1)}$ satisfies property a) and its upper half is lower triangular, the choice of $\mathcal{S}_k$ can be simplified by taking $\mathcal{S}_k=\left\langle t_k,n_1+1,\ldots,n\right\rangle$.

According to Def. 30, during the transformation process, $\mathbf{m}_k$ and $\boldsymbol{\omega}_k$ satisfy

\left\{\begin{aligned}&\mathbf{m}_k=\left[m_{t_k,t_k}^{(k-1)},m_{n_1+1,t_k}^{(k-1)},m_{n_1+2,t_k}^{(k-1)},\ldots,m_{n,t_k}^{(k-1)}\right]^T\\&\boldsymbol{\omega}_k=\left[m_{t_k,t_k}^{(k-1)}-\rho_k,m_{n_1+1,t_k}^{(k-1)},m_{n_1+2,t_k}^{(k-1)},\ldots,m_{n,t_k}^{(k-1)}\right]^T\end{aligned}\right., (56)

where $\rho_k=\left\|\mathbf{m}_k\right\|=\sqrt{\sum_{s\in\mathcal{S}_k}{m_{s,t_k}^{(k-1)}}^2}$. Let $\mathbf{m}_j$ denote the entries of the $j$-th column restricted to the rows in $\mathcal{S}_k$; calculating $\boldsymbol{\omega}_k^{T}\mathbf{m}_j$, we have

\boldsymbol{\omega}_k^{T}\mathbf{m}_j=\left(m_{t_k,t_k}^{(k-1)}-\rho_k\right)m_{t_k,j}^{(k-1)}+\sum\limits_{s=n_1+1}^{n}m_{s,t_k}^{(k-1)}m_{s,j}^{(k-1)}. (57)

Since $\mathbf{M}_{\mathcal{T}}^{(k-1)}$ satisfies property b), for the same row $s>n_1$, at most one of $m_{s,t_k}^{(k-1)}$ and $m_{s,j}^{(k-1)}$ is non-zero. Therefore,

\boldsymbol{\omega}_k^{T}\mathbf{m}_j=\left(m_{t_k,t_k}^{(k-1)}-\rho_k\right)m_{t_k,j}^{(k-1)}. (58)

Simplifying according to formula (30), we obtain $\mathbf{M}_{\mathcal{T}}^{(k)}$ after the $k$-th Householder transformation, whose element $m_{i,j}^{(k)}$ in row $i$ and column $j$ satisfies

m_{i,j}^{(k)}=\left\{\begin{array}{ll}\rho_k&,i=t_k,\ j=t_k\\ 0&,n_1+1\leq i\leq n,\ j=t_k\\ m_{t_k,j}^{(k-1)}m_{t_k,t_k}^{(k-1)}/\rho_k&,i=t_k,\ j\neq t_k\\ m_{i,j}^{(k-1)}+m_{i,t_k}^{(k-1)}m_{t_k,j}^{(k-1)}/\rho_k&,n_1+1\leq i\leq n,\ j\neq t_k\\ m_{i,j}^{(k-1)}&,\text{otherwise}\end{array}\right. (59)

According to the recursive expression (59), we examine the properties of $\mathbf{M}_{\mathcal{T}}^{(k)}$. First, consider property a). Since $\mathbf{M}_{\uparrow}^{(k-1)}\sim\mathbf{G}_{\mathcal{T}^{(1)}}$ and only row $t_k=n_1-k+1$ among the first $n_1$ rows of $\mathbf{M}_{\mathcal{T}}^{(k-1)}$ is affected, the sparsity of Generation Matrix implies that the $t_k$-th row of $\mathbf{M}_{\mathcal{T}}^{(k-1)}$ satisfies $m_{t_k,j}^{(k-1)}\neq 0$ if and only if $j\in\{t_k,f_{t_k}\}$.

Since $m_{t_k,t_k}^{(k-1)}/\rho_k\neq 0$ and $m_{t_k,j}^{(k)}=m_{t_k,j}^{(k-1)}m_{t_k,t_k}^{(k-1)}/\rho_k$ for $j\neq t_k$, we have $m_{t_k,j}^{(k)}\neq 0\Leftrightarrow m_{t_k,j}^{(k-1)}\neq 0$.

Therefore, no conversion between non-zero and zero elements occurs in the only affected row $t_k$, and $\mathbf{M}_{\mathcal{T}}^{(k)}$ satisfies property a).

Next, consider property b). For rows $n_1+1$ to $n$, property c) gives that all values in the $t_k$-th column of $\mathbf{M}_{\downarrow}^{(k)}$ are $0$. Considering the $j$-th column $(j\neq t_k)$ of $\mathbf{M}_{\downarrow}^{(k)}$, we have

m_{i,j}^{(k)}=m_{i,j}^{(k-1)}+m_{i,t_k}^{(k-1)}m_{t_k,j}^{(k-1)}/\rho_k. (60)

If $j\neq f_{t_k}$, then $m_{t_k,j}^{(k-1)}=0$, i.e., $m_{i,j}^{(k)}=m_{i,j}^{(k-1)}$; if $j=f_{t_k}$, then $m_{t_k,j}^{(k-1)}\neq 0$, i.e., $m_{i,j}^{(k)}\neq 0\Leftrightarrow m_{i,j}^{(k-1)}\neq 0\vee m_{i,t_k}^{(k-1)}\neq 0$.

Therefore, the $k$-th Householder transformation is equivalent to transferring the non-zero elements of the $t_k$-th column of $\mathbf{M}_{\downarrow}^{(k-1)}$ to its $f_{t_k}$-th column. This process keeps property b).

In summary, $\mathbf{M}_{\mathcal{T}}^{(k)}$ also satisfies properties a), b), and c). Since $\mathbf{M}_{\mathcal{T}}^{(0)}$ satisfies them, by induction all $\mathbf{M}_{\mathcal{T}}^{(k)}$ $(0\leq k\leq n_1-1)$ satisfy properties a), b), and c). ∎

A.9 Proof of Theorem 7

Proof.

By Thm. 6, after $n_1-1$ Householder transformations, $\mathbf{M}_{\mathcal{T}}^{(n_1-1)}$ satisfies properties a), b), and c).

Consider the $n_1$-th Householder transformation $\mathbf{M}_{\mathcal{T}}^{(n_1)}=\operatorname{House}_{\mathcal{S}_{n_1},1}\left(\mathbf{M}_{\mathcal{T}}^{(n_1-1)}\right)$. Since $\mathbf{M}_{\uparrow}^{(n_1-1)}\sim\mathbf{G}_{\mathcal{T}^{(1)}}$, only $m_{1,1}^{(n_1-1)}$ in the first row of $\mathbf{M}_{\uparrow}^{(n_1-1)}$ is non-zero, and it is the only element of $\mathbf{M}_{\uparrow}^{(n_1-1)}$ affected by the last transformation. The transformed $m_{1,1}^{(n_1)}$ satisfies

m_{1,1}^{(n_1)}=\sqrt{{m_{1,1}^{(n_1-1)}}^2+\sum\limits_{s=n_1+1}^{n}{m_{s,1}^{(n_1-1)}}^2}. (61)

Therefore, $\mathbf{M}_{\uparrow}^{(n_1)}\sim\mathbf{G}_{\mathcal{T}^{(1)}}$ still holds after the last transformation, i.e., there is a $\mathbf{G}_{\mathcal{T}^{(1)}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}$ satisfying $\mathbf{G}_{\mathcal{T}^{(1)}}^{(\mathbf{w}_{node},\mathbf{w}_{edge})}=\mathbf{M}_{\uparrow}^{(n_1)}$. Besides, according to Def. 30 and property c), $\mathbf{M}_{\downarrow}^{(n_1)}=\mathbf{O}$ after the last transformation. ∎

A.10 Proof of Theorem 8

Proof.

Let $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ be the upper half of $\mathbf{M}_{\mathcal{T}}$ after the “LO”-QR decomposition. According to Thm. 7, we have

\mathbf{M}_{\mathcal{T}}=\mathbf{Q}\left[\begin{matrix}\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}\\ \mathbf{O}\end{matrix}\right]. (62)

Substituting it into $\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}$, we have

\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}=\left[\begin{matrix}\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{T}&\mathbf{O}\end{matrix}\right]\mathbf{Q}^{T}\mathbf{Q}\left[\begin{matrix}\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}\\ \mathbf{O}\end{matrix}\right]=\left[\begin{matrix}\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{T}&\mathbf{O}\end{matrix}\right]\left[\begin{matrix}\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}\\ \mathbf{O}\end{matrix}\right]=\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{T}\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}. (63) ∎

A.11 Proof of Corollary 2

Proof.

Since $\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}$ is invertible, according to Thm. 8, substituting formula (34) into $\mathbf{y}=\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\mathbf{x}$ yields

\mathbf{y}=\left(\mathbf{M}_{\mathcal{T}}^{T}\mathbf{M}_{\mathcal{T}}\right)^{-1}\mathbf{x}=\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{-1}\left(\mathbf{G}_{\mathcal{T}^{(1)}\leftarrow\mathcal{T}}^{-T}\mathbf{x}\right). (64) ∎

A.12 Proof of Theorem 9

Proof.

Define a sequence $\theta_i$ $(1\leq i\leq n_1)$ satisfying

\theta_i=\sum\limits_{s=n_1+1}^{n}{m_{s,i}^{(t_i)}}^2, (65)

where $t_i=n_1-i$, so that the $(t_i+1)$-th Householder transformation is the one acting on the $i$-th column. Let $j$ denote a child of $i$, i.e., $j\in\mathcal{C}_i$.

By the recursive expression (59), among the first $t_i$ Householder transformations, only the transformation on the $j$-th column (i.e., the $(t_j+1)$-th transformation) can change the values $m_{s,i}$ $(n_1+1\leq s\leq n)$. Therefore, $m_{s,i}^{(t_i)}$ satisfies

m_{s,i}^{(t_i)}=m_{s,i}^{(0)}+\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}\left(m_{s,j}^{(t_j)}m_{j,i}^{(t_j)}/m_{j,j}^{(t_j+1)}\right). (66)

For each non-leaf child $j$ of $i$ (i.e., $j\in\mathcal{C}_i$ with $j\leq n_1$), the values of $m_{j,i}^{(t_j)}$ and $m_{j,j}^{(t_j)}$ have not changed during the first $t_j$ Householder transformations, so $m_{j,j}^{(t_j)}=1$ and $m_{j,i}^{(t_j)}=-1$. Therefore, $m_{j,j}^{(t_j+1)}$ satisfies

m_{j,j}^{(t_j+1)}=\sqrt{{m_{j,j}^{(t_j)}}^2+\sum\limits_{s=n_1+1}^{n}{m_{s,j}^{(t_j)}}^2}=\sqrt{1+\theta_j}. (67)

Therefore, for $n_1+1\leq s\leq n$, we have

m_{s,i}^{(t_i)}=m_{s,i}^{(0)}-\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}m_{s,j}^{(t_j)}/\sqrt{1+\theta_j}. (68)

According to formula (65), $\theta_i$ satisfies

\theta_i=\sum\limits_{s=n_1+1}^{n}{m_{s,i}^{(t_i)}}^2=\sum\limits_{s=n_1+1}^{n}\left(m_{s,i}^{(0)}-\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}m_{s,j}^{(t_j)}/\sqrt{1+\theta_j}\right)^2. (69)

According to Thm. 6, property b) holds throughout the “LO”-QR decomposition, so for each $s$ at most one term in $m_{s,i}^{(0)}-\sum_{j\in\mathcal{C}_i\wedge j\leq n_1}m_{s,j}^{(t_j)}/\sqrt{1+\theta_j}$ is non-zero, and we have

\left(m_{s,i}^{(0)}-\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}m_{s,j}^{(t_j)}/\sqrt{1+\theta_j}\right)^2={m_{s,i}^{(0)}}^2+\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}{m_{s,j}^{(t_j)}}^2/\left(1+\theta_j\right). (70)

Therefore,

\theta_i=\sum\limits_{s=n_1+1}^{n}{m_{s,i}^{(0)}}^2+\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}\left(\sum\limits_{s=n_1+1}^{n}{m_{s,j}^{(t_j)}}^2/\left(1+\theta_j\right)\right)=\sum\limits_{s=n_1+1}^{n}{m_{s,i}^{(0)}}^2+\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}\theta_j/\left(1+\theta_j\right)=\sum\limits_{s=n_1+1}^{n}{m_{s,i}^{(0)}}^2+\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}\left(1-\left(1+\theta_j\right)^{-1}\right). (71)

According to the definition of $\mathbf{M}_{\downarrow}$, for $n_1+1\leq s\leq n$, if $s\in\mathcal{C}_i$ then $m_{s,i}^{(0)}=-1$; otherwise $m_{s,i}^{(0)}=0$. Finally, $\theta_i$ reduces to

\theta_i=\sum\limits_{j\in\mathcal{C}_i\wedge j>n_1}1+\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}1-\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}\left(1+\theta_j\right)^{-1}=\left|\mathcal{C}_i\right|-\sum\limits_{j\in\mathcal{C}_i\wedge j\leq n_1}\left(1+\theta_j\right)^{-1}. (72)

According to formula (67), we have

w_i=m_{i,i}^{(t_i+1)}=\sqrt{1+\theta_i}. (73)

Then, according to formula (59), we have

w_{i\to f_i}=-m_{i,f_i}^{(t_i+1)}=-m_{i,f_i}^{(t_i)}m_{i,i}^{(t_i)}/m_{i,i}^{(t_i+1)}=-m_{i,f_i}^{(0)}m_{i,i}^{(0)}/m_{i,i}^{(t_i+1)}=-\left(-1\times 1\right)/w_i=w_i^{-1}. (74) ∎

References

  • [1] S. Niazi, M. Ismail, S. Grohsschmiedt, M. Ronström, S. Haridi, J. Dowling, HopsFS: Scaling hierarchical file system metadata using NewSQL databases.
  • [2] J. Abowd, The U.S. Census Bureau adopts differential privacy, 2018, pp. 2867–2867. doi:10.1145/3219819.3226070.
  • [3] T. Lima, M. de Aguiar, Laplacian matrices for extremely balanced and unbalanced phylogenetic trees (08 2020).
  • [4] S. Garfinkel, J. Abowd, S. Powazek, Issues encountered deploying differential privacy (09 2018). doi:10.1145/3267323.3268949.
  • [5] X. Li, Z. Wang, Trees with extremal spectral radius of weighted adjacency matrices among trees weighted by degree-based indices, Linear Algebra and its Applications 620. doi:10.1016/j.laa.2021.02.023.
  • [6] S. Ganesh, S. Mohanty, Trees with matrix weights: Laplacian matrix and characteristic-like vertices (09 2020).
  • [7] R. Bapat, S. Sivasubramanian, Squared distance matrix of a tree: Inverse and inertia, Linear Algebra and its Applications 491. doi:10.1016/j.laa.2015.09.008.
  • [8] E. Andriantiana, K. Dadedzi, S. Wagner, The ancestral matrix of a rooted tree, Linear Algebra and Its Applications 575 (2019) 35–65. doi:10.1016/j.laa.2019.04.004.
  • [9] W. Fulton, J. Harris, Graduate texts in mathematics, Representation Theory. A First Course, Readings in Mathematics 129.
  • [10] J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, M. Sun, Graph neural networks: A review of methods and applications, AI Open 1 (2020) 57–81. doi:10.1016/j.aiopen.2021.01.001.
  • [11] X. A. Chen, Understanding spectral graph neural network.
  • [12] M. Deveci, C. Trott, S. Rajamanickam, Multi-threaded sparse matrix-matrix multiplication for many-core and gpu architectures, Parallel Computing 78. doi:10.1016/j.parco.2018.06.009.
  • [13] M. Hay, V. Rastogi, G. Miklau, D. Suciu, Boosting the accuracy of differentially-private histograms through consistency, Proceedings of the VLDB Endowment 3. doi:10.14778/1920841.1920970.
  • [14] N. Wang, X. Xiao, Y. Yang, T. Hoang, H. Shin, J. Shin, G. Yu, Privtrie: Effective frequent term discovery under local differential privacy, 2018, pp. 821–832. doi:10.1109/ICDE.2018.00079.
  • [15] J. Jansson, K. Sadakane, W.-K. Sung, Ultra-succinct representation of ordered trees with applications, J. Comput. Syst. Sci. 78 (2012) 619–631. doi:10.1016/j.jcss.2011.09.002.
  • [16] D. Tsur, Succinct representation of labeled trees, Theoretical Computer Science 562. doi:10.1016/j.tcs.2014.10.006.
  • [17] A. Farzan, J. Munro, Succinct representation of dynamic trees, Theor. Comput. Sci. 412 (2011) 2668–2678. doi:10.1016/j.tcs.2010.10.030.
  • [18] C. Dwork, F. McSherry, K. Nissim, A. Smith, Calibrating noise to sensitivity in private data analysis, Journal of Privacy and Confidentiality 7 (2017) 17–51. doi:10.29012/jpc.v7i3.405.
  • [19] W. Qardaji, W. Yang, N. Li, Understanding hierarchical methods for differentially private histograms, Proceedings of the VLDB Endowment 6 (2013) 1954–1965. doi:10.14778/2556549.2556576.
  • [20] G. Cormode, M. Procopiuc, E. Shen, D. Srivastava, T. Yu, Differentially private spatial decompositions, Computing Research Repository (CoRR). doi:10.1109/ICDE.2012.16.
  • [21] S. Yuan, D. Pi, X. Zhao, M. Xu, Differential privacy trajectory data protection scheme based on r-tree, Expert Systems with Applications 182 (2021) 115215. doi:10.1016/j.eswa.2021.115215.
  • [22] J. Lee, Y. Wang, D. Kifer, Maximum likelihood postprocessing for differential privacy under consistency constraints, 2015, pp. 635–644. doi:10.1145/2783258.2783366.
  • [23] G. Strang, Linear algebra and learning from data, Wellesley-Cambridge Press Cambridge, 2019.
  • [24] G. Birkenmeier, H. Heatherly, J. Kim, J. Park, Triangular matrix representations, Journal of Algebra 230 (2000) 558–595. doi:10.1006/jabr.2000.8328.
  • [25] L. Minah, A. Fox, G. Sanders, Rounding error analysis of mixed precision block householder qr algorithms, SIAM Journal on Scientific Computing 43 (2021) A1723–A1753. doi:10.1137/19M1296367.
  • [26] P. Desai, S. Aslan, J. Saniie, Fpga implementation of gram-schmidt qr decomposition using high level synthesis, 2017, pp. 482–487. doi:10.1109/EIT.2017.8053410.
  • [27] W. Fam, A. Alimohammad, Givens rotation-based qr decomposition for mimo systems, IET Communications 11. doi:10.1049/iet-com.2016.0789.
  • [28] R. Stanley, Enumerative combinatorics — volume 1.
  • [29] C. Moler, MATLAB incorporates LAPACK: Increasing the speed and capabilities of matrix computation, MathWorks [Online] (2000).
  • [30] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, L. Kaiser, M. Kudlur, J. Levenberg, X. Zheng, Tensorflow : Large-scale machine learning on heterogeneous distributed systems (01 2015).
  • [31] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, T. Darrell, Caffe: Convolutional architecture for fast feature embedding, MM 2014 - Proceedings of the 2014 ACM Conference on Multimedia. doi:10.1145/2647868.2654889.
  • [32] J. Magnus, Matrix differential calculus with applications in statistics and econometrics. doi:10.1002/9781119541219.
  • [33] U. C. Bureau, 2010 census summary file 1, https://www.census.gov/prod/cen2010/doc/sf1.pdf (2012).
  • [34] New York City taxi data, http://www.nyc.gov/html/tlc/html/about/trip/record/data.shtml.