Generation Matrix: An Embeddable Matrix Representation for Hierarchical Trees
Abstract
Starting from local structures to study hierarchical trees is a common research method. However, the cumbersome analysis and abstract descriptions make this naive method difficult to adapt to the increasingly complex hierarchical tree problems. To improve the efficiency of hierarchical tree research, we propose an embeddable matrix representation for hierarchical trees, called Generation Matrix. It transforms an abstract hierarchical tree into a concrete matrix representation so that the hierarchical tree can be studied as a whole, which dramatically reduces the complexity of research. Mathematical analysis shows that Generation Matrix can simulate various recursive algorithms without accessing local structures and provides a variety of interpretable matrix operations to support the research of hierarchical trees. Applying Generation Matrix to differentially private hierarchical tree release, we propose a Generation Matrix-based optimally consistent release algorithm (GMC). It provides an exceptionally concise process description, so we can describe its core steps as a simple matrix expression rather than multiple complicated recursive processes like existing algorithms. Our experiments show that GMC takes only a few seconds to complete a release for large-scale datasets with more than 10 million nodes, improving calculation efficiency by more than an order of magnitude compared with the state-of-the-art schemes.
Keywords: Generation Matrix, Matrix Representation, Hierarchical Tree, Differential Privacy, Consistency

MSC[2020]: 05C62, 05C05

1 Introduction
As a fundamental data structure, hierarchical trees are widely used in different areas, including file systems [1], census [2], evolution [3], etc. For example, in the U.S. Census Bureau's plan to apply differential privacy to protect privacy [2], designing a novel hierarchical tree releasing algorithm is one of the important challenges [4]. The scale of the census data is so large that we can organize it into a hierarchical tree with more than 10 million nodes. Hence, hierarchical tree release algorithms must be specially designed and highly efficient to ensure the timely release of such large-scale data. However, the design of efficient algorithms usually requires a large amount of hierarchical tree research as a theoretical basis.
Most hierarchical tree research works naturally regard the hierarchical tree as a collection of nodes and relationships. Starting from the perspective of individual nodes and local relationships to study hierarchical trees is a common research method, which we call the Naive Research Method. Empirically, the Naive Research Method often leads to overly cumbersome analysis and abstract algorithm descriptions, mainly reflected in two aspects. On the one hand, hierarchical trees contain rich relationships, such as parent, child, ancestor, descendant, sibling, or cousin, but these relationships usually lack a sufficiently concrete description. When multiple node relationships occur in an algorithm simultaneously, the intricate node relationships make the algorithm challenging to understand. On the other hand, the Naive Research Method focuses on the local structure of hierarchical trees rather than the overall structure. The local structure is part of the overall structure, but studies that get mired in local structures may prevent researchers from solving problems from a macro perspective. Worse, the additional auxiliary symbols or indexes for describing relationships and local structures pose a significant challenge to researchers' data analysis capabilities. Imagine a researcher facing a half-page expression with various randomly labeled symbols and subscripts; how should the next step go? Therefore, the Naive Research Method is too cumbersome to solve the increasingly complex hierarchical tree problems effectively.
Considering the complexity of the Naive Research Method, we adopt a Matrixing Research Method for hierarchical tree problems. Its core idea is to transform the hierarchical tree into a specific matrix representation and then embed it into research works or algorithm designs, making the initially abstract and indescribable hierarchical tree concrete and analyzable. As a more advanced research method, similar ideas widely exist in many fields, such as graph theory [5, 6, 7, 8], group theory [9], and deep learning [10, 11]. Unlike the Naive Research Method, the research object of the Matrixing Research Method is the hierarchical tree itself rather than the local structure. It emphasizes avoiding visits to individual nodes as much as possible and implementing the operations of the hierarchical tree through the matrix representation. Therefore, the design of the matrix representation is critical and directly determines whether the research can proceed smoothly. The challenges of matrix representation design are as follows.
1) A non-negligible problem is the universality of recursion in hierarchical tree algorithms, while accessing individual nodes and describing local relationships are almost inevitable in recursion. This violates the core idea of the Matrixing Research Method. Some matrix representations [5, 6, 7, 8] have been used in the spectral theory of trees, but no existing result shows that these matrix representations can implement recursion without accessing local structures. Hence, whether a matrix representation supports recursion is a critical design criterion.

2) We hope that the matrix representation can directly serve algorithm designs, not just act as a theoretical analysis tool. Therefore, the matrix representation should be succinct to ensure the efficiency of algorithms. Specifically, the space overhead per hierarchical tree node should be constant, unlike dense matrices such as the Distance Matrix [7] or the Ancestral Matrix [8].
1.1 Our contributions
Considering the challenges above, we propose an embeddable matrix representation called Generation Matrix. Generation Matrix is a lower triangular matrix containing only $2n-1$ non-zero elements (i.e., the weights of the nodes and edges). Applying sparse storage technologies [12], we only need to store the non-zero elements, satisfying succinctness. Compared with others [5, 6, 7, 8], Generation Matrix emphasizes application in hierarchical tree algorithms. Our analysis of its properties shows that many calculations on Generation Matrix have specific mathematical meanings; we can interpret them and combine them to design complex hierarchical tree algorithms. More importantly, we demonstrate that the inverse of the Generation Matrix contains the inherent logic of recursion. Therefore, we can use Generation Matrix to simulate top-down and bottom-up recursions without accessing the local structures. Besides, we study the relationship between Generation Matrix and some existing matrix representations and find that it can easily be converted to the others. It implies that Generation Matrix can be combined with the theories of other matrix representations to solve hierarchical tree problems.
To demonstrate the practicability of Generation Matrix, we introduce the application to differentially private hierarchical tree release mentioned above. Considering the consistency problem [13, 14], we design a Generation Matrix-based optimally consistent release algorithm (GMC) for differentially private hierarchical trees. To our knowledge, GMC is the first solution to this problem that uses matrix theory. It has an exceptionally concise process description, so a simple matrix expression can summarize the core process. GMC embodies the advantages of Generation Matrix in solving cross-domain problems. Therefore, Generation Matrix has positive significance for promoting the development of hierarchical tree-related research and for applying matrix theories to solve hierarchical tree problems.
1.2 Organization of paper
The rest of the paper is organized as follows. Section 2 reviews the related works of existing hierarchical tree representations and differentially private hierarchical tree release. Section 3 introduces the preliminaries of hierarchical trees and the optimally consistent release of differentially private hierarchical trees. Section 4 defines Generation Matrix, then analyzes its mathematical properties and the conversion relationship with other matrix representations. In Section 5, we show the application of Generation Matrix on differentially private hierarchical tree release and design GMC. Finally, Section 6 compares GMC with the existing technology through experiments and demonstrates its efficiency.
2 Related Works
The research on hierarchical tree representations mainly concentrates on the data structure and graph theory fields. In the data structure field, researchers have achieved better performance in storage [15], query [16], or structure updating [17]. However, these representations are mainly designed for storage in computers and do not support mathematical analysis; we cannot symbolize them and use them as tools for hierarchical tree research. In the graph theory field, many works adopt matrix representations for trees, including the Adjacency Matrix [5], Laplacian Matrix [3, 6], Distance Matrix [7], etc. Among them, the Adjacency Matrix only describes the edge information, so it is challenging to use it for complex model analysis. The Laplacian Matrix and Distance Matrix are two meaningful matrix representations widely used in spectral graph theory. However, they target undirected graphs or trees and are not suitable for representing rooted hierarchical trees. To summarize, none of the three matrix representations is the best choice for representing hierarchical trees. Subsequently, Eric et al. [8] proposed a new matrix representation for rooted trees (i.e., hierarchical trees), named the Ancestral Matrix. It represents the structure of a hierarchical tree by describing the number of overlapping edges on the paths from any two leaves to the root. Studying the Ancestral Matrix, Eric et al. [8] obtained many essential conclusions, such as the maximum spectral radius and the determinant of the characteristic polynomial. However, it is also not the best choice for calculations on hierarchical trees. First, the dense Ancestral Matrix is not succinct enough. Second, the Ancestral Matrix is a matrix with a high degree of feature summary: although it can deterministically express the structure of a hierarchical tree only by describing the leaves, it is very unintuitive and cannot simulate the operations of hierarchical trees. Therefore, the existing matrix representations are not suitable for the analysis and calculation of hierarchical tree models. Significantly, the broad application of the Laplacian Matrix in deep learning [10, 11] in recent years implies that matrix representation has essential value for solving complex scientific problems. It motivates us to design a new matrix representation to solve hierarchical tree problems and design algorithms.
Differentially private hierarchical tree release is a data releasing technology that organizes the data into a hierarchical tree and applies differential privacy (DP) [18] to protect individual privacy. It is widely used in many scenarios, such as histogram publishing [13, 19], location privacy release based on spatial partitioning [20], trajectory data publishing [21], and frequent term discovery [14]. By adding random noise to the data, DP provides a provable and quantifiable guarantee of individual privacy. However, the random noise destroys the consistency that the hierarchical tree should satisfy, i.e., "the sum of the children's values equals the value at the parent" [13]. Therefore, ensuring that the released results meet consistency while obtaining higher accuracy is one of the leading research goals. Hay et al. [13] first applied a hierarchical tree to improve the accuracy of range queries and designed Boosting for the consistency of histogram release. However, Boosting can only support complete $k$-ary trees, which significantly limits its application. Moreover, Hay et al.'s error analysis [13] of the released results is rough, and only qualitative error results are obtained. Subsequently, Wahbeh et al. [19] analyzed the error of Boosting and designed an algorithm to calculate the error; however, it can also support only complete $k$-ary trees. In the differentially private frequent term discovery problem studied by Ning et al. [14], the hierarchical tree is arbitrary, so Boosting is inapplicable. For this reason, Ning et al. designed an optimally consistent release algorithm for arbitrary hierarchical trees in their proposed PrivTrie [14]. Its implementation is based on multiple complex recursions, which are not easy to understand, and the large number of function calls results in significant additional computational overhead. Applying the idea of maximum likelihood estimation, Lee et al. [22] proposed a general solution for differentially private optimally consistent release. It solves the optimally consistent release by establishing a quadratic programming equation and has a closed-form matrix expression. Theoretically, it applies to arbitrary optimally consistent releases, but the computational overhead is so significant that it can only process small-scale releases. Nevertheless, Lee et al.'s work [22] is inspiring: it motivates us to analyze differentially private hierarchical tree release from the perspective of matrix analysis.
Under the representation of Generation Matrix, we can introduce many matrix analysis methods to solve hierarchical tree problems. One of them is QR decomposition [23], which transforms any matrix into a triangular matrix by orthogonal transformations. Compared with the original form, the triangular matrix is simpler and has many interesting properties [24]. The orthogonal transformation methods commonly used for QR decomposition include the Householder transformation [25], Gram-Schmidt orthogonalization [26], Givens rotation [27], etc. Among them, the Householder transformation is the simplest and the most suitable for sparse matrices.
3 Preliminaries
In this section, we describe some preliminaries of our work. Before the formal description, we first introduce the main notation definitions, shown in Tab. 1.
Notations | Descriptions
---|---
$T$ | Hierarchical tree with arbitrary structure
$T^{(k)}$ | The $k$-order subtree of $T$
$f_i$, $\mathrm{child}(i)$ | The parent of node $i$; the set of children of node $i$
$n$, $n^{(k)}$ | The number of nodes of the hierarchical tree; of the $k$-order subtree
$h$, $h_i$ | The height of the hierarchical tree; the height of node $i$
$m$ | The number of unit counts
$\boldsymbol G_{\boldsymbol a,\boldsymbol b}$ | The Generation Matrix defined by a $T$ with the node weights $\boldsymbol a$ and the edge weights $\boldsymbol b$
$\boldsymbol G_T$ | The structure matrix of $T$
$\boldsymbol C$ | The consistency constraint matrix of $T$
$\boldsymbol G_C$ | The Generation Matrix inner-product equivalent to $\boldsymbol C$
$\boldsymbol A_T$, $\boldsymbol L_T$, $\boldsymbol D_T$, $\boldsymbol C_T$ | The Adjacency Matrix, Laplacian Matrix, Distance Matrix and Ancestral Matrix of $T$
$\boldsymbol x$ | The vector composed of unit counts
$\boldsymbol v$ | The vector composed of the values of the nodes of the hierarchical tree arranged in order
$\tilde{\boldsymbol x}$, $\tilde{\boldsymbol v}$ | The noisy $\boldsymbol x$ and $\boldsymbol v$ satisfying DP
$\bar{\boldsymbol v}$, $\bar{\boldsymbol x}$ | The optimally consistent release after post-processing and the vector restored from $\bar{\boldsymbol v}$
$\boldsymbol P$ | The mapping matrix representing the mapping relationship between $\boldsymbol x$ and $\boldsymbol v$
3.1 Hierarchical Tree
We first recall the definition of the hierarchical tree.
Definition 1 (Hierarchical Tree [28]).
A hierarchical tree $T$ is a collection of $n$ nodes, numbered $1, 2, \dots, n$, which satisfies:

1) $T$ contains a specially designated node called the root;

2) the remaining nodes are divided into several non-empty collections, called the subtrees of the root.

In the hierarchical tree, we denote the parent-child relationships between the nodes as $f_j = i$, indicating that node $i$ is the parent of node $j$. Fig. 1 shows a weighted hierarchical tree, whose node weights and edge weights are denoted as $a_i$ and $b_i$, respectively. By the height of each node, we can obtain a good quasi-ranking, which is defined as follows.
Definition 2 (Node Height).
The node height of a hierarchical tree refers to the height of the subtree rooted at the current node. Let the height of node $i$ be denoted as $h_i$; then $h_i$ can be calculated by the following recursive expression:
$$h_i = \begin{cases} 0, & \mathrm{child}(i) = \varnothing,\\ 1 + \max_{j\in\mathrm{child}(i)} h_j, & \text{otherwise.}\end{cases} \qquad (1)$$
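To make the recursion concrete, the following minimal Python sketch (our own illustration, not code from the paper) computes all node heights in one bottom-up pass; the parent-array encoding and the function name are our assumptions.

```python
def node_heights(parent):
    """Node heights per Def. 2, assuming leaves have height 0.

    parent[i] is the parent of node i, with parent[0] = -1 for the root;
    the numbering is assumed to satisfy parent[i] < i.
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    for i in range(1, n):
        children[parent[i]].append(i)
    h = [0] * n
    # Visiting nodes in reverse order processes every child before its
    # parent, so the recursion of Eq. (1) unrolls without function calls.
    for i in range(n - 1, -1, -1):
        if children[i]:
            h[i] = 1 + max(h[j] for j in children[i])
    return h

# A made-up 7-node tree: root 0 with children 1, 2; each of them has 2 leaves.
print(node_heights([-1, 0, 0, 1, 1, 2, 2]))  # [2, 1, 1, 0, 0, 0, 0]
```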
By the height of the nodes, we define a kind of induced subtree of hierarchical trees, called Order Subtree.
Definition 3 (Order Subtree).
For a hierarchical tree $T$ under Descending Order of Height, the $k$-order subtree $T^{(k)}$ is defined as the induced subtree retained after $T$ deletes all its leaves $k$ times. $T^{(k)}$ satisfies
$$T^{(k)} = \big\{\, i \in T \mid h_i \ge k \,\big\}. \qquad (2)$$
As a bottom-up induced subtree, $T^{(k)}$ satisfies transitivity, i.e., $\big(T^{(k)}\big)^{(l)} = T^{(k+l)}$. It can help us determine whether the subtrees obtained in different ways are equivalent. In subsequent applications, we use the concept of the order subtree to simplify the description of tree structures. In Fig. 1, we use dotted circles to mark the subtrees of $T$ from low to high orders. It can be seen that the $0$-order subtree is $T$ itself, while $T^{(1)}$ is the subtree composed of the non-leaf nodes of $T$. Let $n^{(k)}$ denote the number of nodes contained in $T^{(k)}$; then $n^{(0)} = n$, and $n^{(1)}$ equals the number of non-leaf nodes of $T$.
3.2 Optimally Consistent Release of Differentially Private Hierarchical Tree
Before describing the optimally consistent release of the differentially private hierarchical tree, we first recall the hierarchical tree releasing model. Consider a set of unit counts $\{x_1,\dots,x_m\}$ for a private dataset $D$, where $x_i$ indicates the number of records in $D$ that satisfy the mutually exclusive unit condition $c_i$. The unit count satisfies
$$x_i = \big|\{\, r \in D \mid r \text{ satisfies } c_i \,\}\big|. \qquad (3)$$
Since the conditions are mutually exclusive, any record $r \in D$ satisfies one and only one $c_i$. Therefore, organizing the unit counts into the form of a vector, we get $\boldsymbol x = (x_1,\dots,x_m)^{\mathsf T}$.
As shown in Fig. 2, each leaf corresponds to a unit count $x_i$. A non-leaf node's value equals the sum of the values of the leaves of the subtree rooted at that node. Therefore, the results of the hierarchical tree meet consistency, i.e., "the sum of the values of the child nodes is equal to the value of the parent node". In Fig. 2, we denote the value of the $i$-th node as $v_i$. Then, organizing the $v_i$ into the form of a vector in turn, we get the to-be-released data $\boldsymbol v$.
However, several works [13, 19, 20, 21, 14] have demonstrated that releasing an unprotected hierarchical tree will result in privacy disclosure. To protect individual privacy, Dwork et al. [18] proposed differential privacy, defined as follows.
Definition 4 (Differential Privacy [18]).
A random algorithm $\mathcal M$ satisfies $\epsilon$-differential privacy if, for any two neighboring datasets $D$ and $D'$, all sets of outputs $S$ satisfy
$$\Pr[\mathcal M(D) \in S] \le e^{\epsilon}\,\Pr[\mathcal M(D') \in S]. \qquad (4)$$
Under differential privacy, the process of hierarchical tree releasing can be described as
$$\tilde{\boldsymbol v} = \boldsymbol v + \boldsymbol z, \qquad (5)$$
where $\tilde{\boldsymbol v}$ is the $\boldsymbol v$ after noise addition, which satisfies $\epsilon$-differential privacy, and $\boldsymbol z$ is the random vector for the noise addition. Each element of $\boldsymbol z$ is i.i.d. and satisfies $z_i \sim \mathrm{Lap}(\Delta/\epsilon)$, where $\mathrm{Lap}(\cdot)$ represents the Laplacian distribution and $\Delta$ is the data sensitivity. In hierarchical tree releasing, $\Delta$ equals the height $h$ of $T$ [13].
To restore the consistency of the hierarchical tree after adding noise, we can get the optimally consistent release by the following optimization equation, according to the maximum likelihood post-processing proposed by Lee et al. [22]:
$$\bar{\boldsymbol v} = \arg\min_{\boldsymbol v'} \big\|\boldsymbol v' - \tilde{\boldsymbol v}\big\|_2^2 \quad \text{s.t.} \quad \boldsymbol C\boldsymbol v' = \boldsymbol 0, \qquad (6)$$
where $\boldsymbol C$ is the consistency constraint matrix of the hierarchical tree, defined as follows.
Definition 5 (Consistency Constraint Matrix of Hierarchical Tree).
Given a hierarchical tree $T$ containing $n$ nodes, let $n^{(1)}$ denote the number of non-leaf nodes in $T$. The value in row $i$ and column $j$ of the consistency constraint matrix $\boldsymbol C \in \mathbb R^{n^{(1)}\times n}$ is defined as follows:
$$C_{ij} = \begin{cases} 1, & j = i,\\ -1, & f_j = i,\\ 0, & \text{otherwise,}\end{cases} \qquad (7)$$
where $f_j$ is the parent of node $j$.
The optimization equation (6) has the following closed-form expression:
$$\bar{\boldsymbol v} = \tilde{\boldsymbol v} - \boldsymbol C^{\mathsf T}\big(\boldsymbol C\boldsymbol C^{\mathsf T}\big)^{-1}\boldsymbol C\,\tilde{\boldsymbol v}. \qquad (8)$$
Since Formula (8) involves matrix inner products and inversion, the time complexity of the direct solution is as high as $O(n^3)$. The amount of calculation is too large to obtain an efficient algorithm directly from the expression. On the surface, Formula (8) is not a good choice for solving the optimally consistent release, but under the theories of Generation Matrix, we can convert it into another form and apply the properties of Generation Matrix to obtain an efficient algorithm.
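As a small numerical illustration of expression (8) (ours, with made-up numbers), the following snippet projects a noisy 3-node release back onto the single consistency constraint "root = sum of two leaves"; the dense inverse is what makes the direct approach $O(n^3)$ in general.

```python
import numpy as np

C = np.array([[1.0, -1.0, -1.0]])      # constraint: v1 = v2 + v3
v_noisy = np.array([10.0, 4.0, 5.0])   # noisy values; 10 != 4 + 5

# Expression (8): v_bar = v_noisy - C^T (C C^T)^{-1} C v_noisy,
# i.e., the orthogonal projection of v_noisy onto {v : C v = 0}.
v_bar = v_noisy - C.T @ np.linalg.solve(C @ C.T, C @ v_noisy)
print(v_bar)        # [9.667 4.333 5.333]
print(C @ v_bar)    # ~0: consistency restored
```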
4 Generation Matrix Model for Hierarchical Tree
4.1 Generation Matrix
Before defining Generation Matrix, we first number the nodes of $T$ by descending order of height.
Definition 6 (Descending Order of Height).
Let $h_i$ denote the height of node $i$ defined by Def. 2. If any two nodes $i$ and $j$ in $T$ satisfy
$$i < j \;\Rightarrow\; h_i \ge h_j, \qquad (9)$$
we say that $T$ satisfies Descending Order of Height.
Under Descending Order of Height, we define the Generation Matrix for $T$.
Definition 7 (Generation Matrix).
Considering a non-zero weighted hierarchical tree $T$ under Descending Order of Height, let $a_i$ and $b_i$ denote the weight values of node $i$ and the edge $(f_i, i)$, respectively. Organizing them into vectors denoted as $\boldsymbol a$ and $\boldsymbol b$, the Generation Matrix is denoted as $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$, whose element in row $i$ and column $j$ is defined as follows:
$$\big(\boldsymbol G_{\boldsymbol a,\boldsymbol b}\big)_{ij} = \begin{cases} a_i, & i = j,\\ -b_i, & j = f_i,\\ 0, & \text{otherwise.}\end{cases} \qquad (10)$$
As shown in Fig. 3, since $T$ satisfies Descending Order of Height, the number of any non-root node in $T$ is always bigger than that of its parent. It ensures that the Generation Matrix is always a lower triangular matrix. According to Def. 7, there is a one-to-one mapping between an arbitrary non-zero weighted hierarchical tree and $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$, i.e., the matrix representation of a non-zero weighted hierarchical tree is unique. When we only need to describe the structure of the hierarchical tree, we can use a Generation Matrix with all weights of $1$ to represent it, i.e., $\boldsymbol G_T = \boldsymbol G_{\boldsymbol 1,\boldsymbol 1}$, which we call the Structure Matrix. If two hierarchical trees have the same structure and arrangement, the positions of the non-zero elements in their Generation Matrices are always the same, and we call them Similar Generation Matrices.
Definition 8 (Similar Generation Matrices).
If $\boldsymbol G_1$ and $\boldsymbol G_2$ are two Generation Matrices defined by hierarchical trees with the same structure and arrangement (or by the same tree), we call them Similar Generation Matrices, denoted as $\boldsymbol G_1 \simeq \boldsymbol G_2$.
Generation Matrices from the same hierarchical tree are always similar. We can use $\boldsymbol G_{\boldsymbol a,\boldsymbol b} \simeq \boldsymbol G_T$ as a sufficient condition to judge whether the hierarchical tree represented by $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$ has the same structure as $T$.
According to Def. 3, $T^{(k)}$ is the induced subtree composed of the nodes with $h_i \ge k$ in $T$. Under Descending Order of Height, these nodes are always ranked first. In Fig. 3, the $k$-order submatrix that represents the $k$-order subtree is denoted as $\boldsymbol G^{(k)}$. We can obtain it by taking the $n^{(k)}$-order leading principal submatrix of $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$ (i.e., the elements in the rows and columns from $1$ to $n^{(k)}$). In particular, $\boldsymbol G^{(1)}$ represents the subtree composed of the non-leaf nodes of $T$.
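A minimal sketch (ours) of Definition 7 with sparse storage follows; the 0-indexed parent array, the sign convention of Eq. (10) as reconstructed above, and the helper name `generation_matrix` are our assumptions, reused by the later sketches.

```python
import numpy as np
from scipy.sparse import coo_matrix

def generation_matrix(parent, a=None, b=None):
    """Build G_{a,b} in COO form; all weights default to 1 (structure matrix).

    parent[i] is the parent of node i (parent[0] = -1 for the root), and the
    numbering is assumed to follow Descending Order of Height (parent[i] < i).
    b[i] is the weight of the edge (parent[i], i); b[0] is unused.
    """
    n = len(parent)
    a = np.ones(n) if a is None else np.asarray(a, dtype=float)
    b = np.ones(n) if b is None else np.asarray(b, dtype=float)
    rows = list(range(n)) + list(range(1, n))
    cols = list(range(n)) + [parent[i] for i in range(1, n)]
    vals = list(a) + [-b[i] for i in range(1, n)]   # -b_i below the diagonal
    return coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()

parent = [-1, 0, 0, 1, 1, 2, 2]     # the toy tree from the earlier sketch
G = generation_matrix(parent)       # structure matrix G_T
print(G.toarray())
```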
Considering a specific application, one problem we may encounter is that the nodes of the hierarchical tree are numbered, but not in Descending Order of Height. Under the matrix representation, the problem is easy to solve: we can adopt a sparse mapping matrix to convert the original numbering into Descending Order of Height.
Definition 9 (Mapping Matrix).
Given an ordered set $S = (s_1, s_2, \dots, s_m)$ representing a mapping between integers, satisfying $s_j \in \{1,\dots,n\}$ with the $s_j$ mutually distinct, the mapping matrix defined by $S$ is denoted as $\boldsymbol P_S \in \mathbb R^{n\times m}$. The value in row $i$ and column $j$ of $\boldsymbol P_S$ satisfies
$$\big(\boldsymbol P_S\big)_{ij} = \begin{cases}1, & i = s_j,\\ 0, & \text{otherwise.}\end{cases} \qquad (11)$$
By the definition of the mapping matrix, $S$ always represents an injection, and $\boldsymbol P_S$ satisfies $\boldsymbol P_S^{\mathsf T}\boldsymbol P_S = \boldsymbol I$. If $S$ represents a bijection, $\boldsymbol P_S$ will be a permutation matrix. When the basis vectors act on $\boldsymbol P_S$, the following equations hold:
$$\boldsymbol P_S\,\boldsymbol e_j = \boldsymbol e_{s_j}, \qquad (12)$$
$$\boldsymbol P_S^{\mathsf T}\,\boldsymbol e_{s_j} = \boldsymbol e_j, \qquad (13)$$
where the subscripts relate through the inverse mapping $s^{-1}$ of $S$, which satisfies $s^{-1}(s_j) = j$.
To describe the mapping relationship between the nodes before and after sorting by Descending Order of Height, we only need to define the ordered set $S$, where $s_j$ represents the sorted number of the node originally numbered $j$. Then, we can use $\boldsymbol P_S$ to get the vector before sorting from the vector after sorting.
In addition, the mapping matrix can be used to represent other mapping relationships, such as the mapping of $\boldsymbol x$ to $\boldsymbol v$. For example, in Fig. 1, if we use an ordered set $S$ to represent the mapping relationship between $\boldsymbol x$ and $\boldsymbol v$, where $s_j$ is the node number of the leaf corresponding to $x_j$, then we have the mapping matrix $\boldsymbol P_S$ to represent their mapping relationship.
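The sketch below (ours) builds the mapping matrix of Definition 9 as a sparse matrix; indices are 0-based, and the ordered set `S` lists, for each unit count, the node number of its leaf in the toy tree above.

```python
import numpy as np
from scipy.sparse import coo_matrix

def mapping_matrix(S, n):
    """n x m matrix with a single 1 per column, at row s_j (Eq. (11))."""
    m = len(S)
    return coo_matrix((np.ones(m), (S, list(range(m)))), shape=(n, m)).tocsr()

S = [3, 4, 5, 6]                   # leaves of the 7-node toy tree
P = mapping_matrix(S, 7)
x = np.array([2.0, 1.0, 4.0, 3.0])
print(P @ x)          # scatters x to leaf positions: [0 0 0 2 1 4 3]
print(P.T @ (P @ x))  # gathers them back (P^T P = I): [2 1 4 3]
```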
4.2 Properties of Generation Matrix
Our research shows that Generation Matrix has many mathematical properties that deserve attention. These properties can help us solve various problems in the analysis and calculation of hierarchical trees. According to Def. 7, it is not difficult to find that the Generation Matrix satisfies sparsity.
Property 1 (Sparsity).
Considering a hierarchical tree consisting of $n$ nodes, $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$ has one and only has $2n-1$ non-zero elements. Its first row has only one non-zero element, and each of the remaining rows has two non-zero elements.
Due to the sparsity of the Generation Matrix, we can apply various sparse matrix technologies, such as COO (Coordinate Format) and CSR (Compressed Sparse Row), to calculate hierarchical tree models efficiently. Under the sparse representations, the storage and calculation costs of the Generation Matrix are both only $O(n)$. Therefore, applications based on Generation Matrix do not cause more computational overhead. Currently, the computing technologies of sparse matrices are very mature and widely used in various high-performance platforms [29, 30, 31].
One of the fundamental properties of Generation Matrices is invertibility. By solving the equations $\boldsymbol G_{\boldsymbol a,\boldsymbol b}\boldsymbol x = \boldsymbol v$ and $\boldsymbol G_{\boldsymbol a,\boldsymbol b}^{\mathsf T}\boldsymbol x = \boldsymbol v$ for $\boldsymbol x$, we find two interesting and important mathematical properties of Generation Matrices, which we collectively call the propagation of the Generation Matrix.
Property 2 (Upward Propagation).
Let $v_i$ denote the value of node $i$ of $T$, and organize the $v_i$ into the vector $\boldsymbol v$; then $\boldsymbol x = \big(\boldsymbol G_{\boldsymbol a,\boldsymbol b}^{\mathsf T}\big)^{-1}\boldsymbol v$ is an upward propagation on $T$. The value of $x_i$ satisfies
$$x_i = \frac{1}{a_i}\Big(v_i + \sum_{j\in\mathrm{child}(i)} b_j\, x_j\Big). \qquad (14)$$
Property 3 (Downward Propagation).
If $\boldsymbol x$ and $\boldsymbol v$ have the same definitions as in Prop. 2, then $\boldsymbol x = \boldsymbol G_{\boldsymbol a,\boldsymbol b}^{-1}\boldsymbol v$ is a downward propagation on $T$. The value of $x_i$ satisfies
$$x_i = \begin{cases}\dfrac{v_1}{a_1}, & i = 1,\\[4pt] \dfrac{v_i + b_i\, x_{f_i}}{a_i}, & i > 1.\end{cases} \qquad (15)$$
Prop. 2 and Prop. 3 show that Generation Matrix can simulate multiple recursive operations of hierarchical trees. According to them, Cor. 1 analyzes the elements affected by the propagations of the Generation Matrix.
Corollary 1.
For the upward propagation, the $x_j$ affected by $v_i$ satisfy $j = i$ or "$j$ is an ancestor of $i$ in $T$"; for the downward propagation, the $x_j$ affected by $v_i$ satisfy $j = i$ or "$j$ is a descendant of $i$ in $T$".
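The following sketch (ours) exercises Prop. 2 and Prop. 3 on the structure matrix using sparse triangular solves, reusing `generation_matrix` from the earlier sketch; each solve costs time linear in the number of non-zeros.

```python
import numpy as np
from scipy.sparse.linalg import spsolve_triangular

parent = [-1, 0, 0, 1, 1, 2, 2]
G = generation_matrix(parent)            # structure matrix G_T

# Upward propagation (Prop. 2): solve G^T x = v; every node accumulates
# its children's values, so x holds subtree sums.
v = np.array([0.0, 0, 0, 2, 1, 4, 3])    # values on the leaves only
print(spsolve_triangular(G.T.tocsr(), v, lower=False))
# -> [10. 3. 7. 2. 1. 4. 3.]

# Downward propagation (Prop. 3): solve G x = e_1; every node accumulates
# its parent's value, so the indicator of node 1 marks node 1's subtree.
e1 = np.zeros(7); e1[1] = 1.0
print(spsolve_triangular(G, e1, lower=True))
# -> [0. 1. 0. 1. 1. 0. 0.]
```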
Combining the propagations, we further study a variety of matrix operations of Generation Matrices. They have strong interpretability and provide crucial theoretical support for the research of hierarchical trees.
Property 4.
Let the vector $\boldsymbol c = \boldsymbol 1 - \boldsymbol G_T^{\mathsf T}\boldsymbol 1$; then the $i$-th element of $\boldsymbol c$ represents the number of children of node $i$, i.e., $c_i = |\mathrm{child}(i)|$.
Property 5.
If the vector $\boldsymbol s = \big(\boldsymbol G_T^{\mathsf T}\big)^{-1}\boldsymbol 1$, the $i$-th element of $\boldsymbol s$ represents the number of nodes contained in the subtree rooted at node $i$.
Property 6.
Let the vector $\boldsymbol d = \boldsymbol G_T^{-1}\boldsymbol 1$; then the $i$-th element of $\boldsymbol d$ represents the depth of node $i$, where the depth of the root is $1$.
The properties above indicate that Generation Matrix is an effective and easy-to-use tool for various hierarchical tree analyses. In addition to the conclusions about vectors discussed above, there are some conclusions about matrices as follows. Compared with conclusions about vectors, they focus on describing the characteristics between nodes.
Property 7.
Let $r_{ij}$ denote the element in row $i$ and column $j$ of $\boldsymbol G_T^{-1}$; then $r_{ij}$ satisfies
$$r_{ij} = \begin{cases}1, & i = j \text{ or } j \text{ is an ancestor of } i,\\ 0, & \text{otherwise.}\end{cases} \qquad (16)$$
Proof.
See Appendix A.1. ∎
Prop. 7 shows that $\boldsymbol G_T^{-1}$ is a matrix indicating the relationship between ancestors and descendants. Although $\boldsymbol G_T^{-1}$ is denser than $\boldsymbol G_T$, in most cases it is still sparse.
Property 8.
Let $\boldsymbol M = \boldsymbol G_T\boldsymbol G_T^{\mathsf T}$ and $m_{ij}$ denote the element in row $i$ and column $j$ of $\boldsymbol M$. Except for the diagonal elements and the elements where $i$ and $j$ are in a parent-child relationship, the non-zero elements satisfy "$i$ and $j$ are sibling nodes".
Property 9.
Let $\boldsymbol N = \boldsymbol G_T^{-1}\big(\boldsymbol G_T^{-1}\big)^{\mathsf T}$ and $n_{ij}$ be the element in row $i$ and column $j$ of $\boldsymbol N$; then the value of $n_{ij}$ represents the number of common ancestors of the node pair $i$ and $j$ (each node counted as an ancestor of itself), and $n_{ii}$ represents the depth of node $i$.
Proof.
See Appendix A.2. ∎
Similar to Prop. 7, Prop. 8 can also be used as an indicator matrix to describe the relationship between nodes. Prop. 9 is an important property, which describes an effective method of calculating common ancestors. As an essential feature describing the correlation between nodes, the number of common ancestors has important application value for hierarchical tree analyses [8].
In the study of spectral graph theory, feature analysis is usually indispensable. Our research shows that the eigenvalues and eigenvectors of a Generation Matrix satisfy the following properties.
Property 10 (Eigenvalues and Eigenvectors).
Let $\lambda_1,\dots,\lambda_n$ denote the eigenvalues of $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$; then the $k$-th eigenvalue is $\lambda_k = a_k$.
Let the left eigenvector and the right eigenvector corresponding to the $k$-th eigenvalue be denoted as $\boldsymbol y_k$ and $\boldsymbol x_k$, respectively. The premise of the existence of $\boldsymbol y_k$ is that each ancestor $j$ of $k$ satisfies $a_j \ne a_k$, and the premise of the existence of $\boldsymbol x_k$ is that each descendant $i$ of $k$ satisfies $a_i \ne a_k$.
Let the $i$-th elements of $\boldsymbol y_k$ and $\boldsymbol x_k$ be denoted as $y_{k,i}$ and $x_{k,i}$, respectively. If $\boldsymbol y_k$ exists, then $y_{k,i} = 0$ for any $i$ that is neither $k$ nor an ancestor of $k$. Let $y_{k,k} = 1$. The remaining elements can be obtained by
$$y_{k,j} = \frac{\sum_{i\in\mathrm{child}(j)} b_i\, y_{k,i}}{a_j - a_k}. \qquad (17)$$
If $\boldsymbol x_k$ exists, then $x_{k,i} = 0$ for any $i$ that is neither $k$ nor a descendant of $k$. Let $x_{k,k} = 1$. The remaining elements can be obtained by
$$x_{k,i} = \frac{b_i\, x_{k,f_i}}{a_i - a_k}. \qquad (18)$$
Prop. 10 shows that the eigenvalues and eigenvectors of the Generation Matrix have many interesting properties. For example, the eigenvalues of a Generation Matrix are the weights of the nodes, which are much easier to solve than those of other matrix representations; the eigenvectors also satisfy some propagation properties similar to Prop. 2 and 3. Notably, the eigenvectors are conditional, which means that a Generation Matrix is not always diagonalizable. Some Generation Matrices, especially their eigenvectors, still leave many problems waiting to be studied. Although feature analysis is not the main focus of this paper, Prop. 10 still provides some valuable references for subsequent research works.
Considering the relationship between $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$ and the corresponding $\boldsymbol G_T$, we find that $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$ satisfies a particular decomposition form, which we call the diagonal decomposition of the Generation Matrix.
Property 11 (Diagonal Decomposition).
Given an arbitrary $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$, there is always a pair of vectors $\boldsymbol p, \boldsymbol q$ making the following decomposition hold for $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$ and the structure matrix $\boldsymbol G_T$:
$$\boldsymbol G_{\boldsymbol a,\boldsymbol b} = \mathrm{diag}(\boldsymbol p)\,\boldsymbol G_T\,\mathrm{diag}(\boldsymbol q). \qquad (19)$$
Let $p_i$ and $q_i$ denote the $i$-th elements of $\boldsymbol p$ and $\boldsymbol q$, respectively. Then a pair of legal $\boldsymbol p$ and $\boldsymbol q$ can be obtained by
$$q_1 = 1,\qquad q_i = \frac{a_i}{b_i}\,q_{f_i}\ (i>1),\qquad \boldsymbol p = \boldsymbol a \oslash \boldsymbol q, \qquad (20)$$
where "$\oslash$" denotes the element-wise division of vectors. Note that since the root (numbered $1$) has no parent, we set $q_1 = 1$ directly.
Proof.
See Appendix A.3. ∎
By the diagonal decomposition of the Generation Matrix, we can express an arbitrary $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$ as an expression involving $\boldsymbol G_T$. Using Prop. 11, we can extend some mathematical properties of $\boldsymbol G_T$ to $\boldsymbol G_{\boldsymbol a,\boldsymbol b}$ to solve more problems effectively.
4.3 The Conversion between Generation Matrix and Other Matrix Representations
Our research shows that Generation Matrix is not a matrix representation isolated from the others. Through proper operations, we can convert Generation Matrix into other matrix representations. Fig. 4 shows the four matrix representations that can be obtained by transforming the Generation Matrix constructed from the hierarchical tree in Fig. 1, including the Adjacency Matrix, Laplacian Matrix, Distance Matrix, and Ancestral Matrix.
Theorem 1.
Let $\boldsymbol A_T$ be the Adjacency Matrix of $T$; then $\boldsymbol A_T$ can be obtained by the following expression of $\boldsymbol G_T$:
$$\boldsymbol A_T = 2\boldsymbol I - \boldsymbol G_T - \boldsymbol G_T^{\mathsf T}. \qquad (21)$$
Theorem 2.
Let $\boldsymbol L_T$ be the Laplacian Matrix of $T$; then $\boldsymbol L_T$ can be obtained by the following expression of $\boldsymbol G_T$:
$$\boldsymbol L_T = \boldsymbol G_T^{\mathsf T}\boldsymbol G_T - \boldsymbol e_1\boldsymbol e_1^{\mathsf T}. \qquad (22)$$
Theorem 3.
Let $\boldsymbol D_T$ be the Distance Matrix of $T$; then $\boldsymbol D_T$ can be obtained by the following expression of $\boldsymbol G_T$:
$$\boldsymbol D_T = \boldsymbol d\,\boldsymbol 1^{\mathsf T} + \boldsymbol 1\,\boldsymbol d^{\mathsf T} - 2\,\boldsymbol G_T^{-1}\big(\boldsymbol G_T^{-1}\big)^{\mathsf T},\qquad \boldsymbol d = \boldsymbol G_T^{-1}\boldsymbol 1. \qquad (23)$$
Theorem 4.
Let $\boldsymbol C_T$ be the Ancestral Matrix [8] of $T$; then $\boldsymbol C_T$ can be obtained by the following expression of $\boldsymbol G_T$:
$$\boldsymbol C_T = \boldsymbol P_\ell^{\mathsf T}\,\boldsymbol G_T^{-1}\big(\boldsymbol G_T^{-1}\big)^{\mathsf T}\boldsymbol P_\ell - \boldsymbol J, \qquad (24)$$
where $\boldsymbol P_\ell$ is the mapping matrix of the leaves of $T$ and $\boldsymbol J$ is the all-ones matrix.
Proof.
The proofs of Thm. 2-4 are given in Appendix A.4 to A.6. ∎
It can be seen from the theorems above that we can convert Generation Matrix into other matrix representations by simple matrix expressions. However, the reverse is not easy: except for the Adjacency Matrix, the other matrix representations cannot be directly converted back to Generation Matrix. Therefore, we can use Generation Matrix to construct the other matrix representations. Besides, it also implies that the theories of Generation Matrix have a particular internal connection with the matrices above. We can combine the theories of Generation Matrix and other matrix representations to solve more problems about hierarchical trees.
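The conversions of Thm. 1-3 can be checked numerically; the snippet below (ours) uses dense algebra on the toy tree for clarity, reusing `G` from the earlier sketches.

```python
import numpy as np

Gd = G.toarray()                       # structure matrix, densified
I = np.eye(7)
R = np.linalg.inv(Gd)                  # ancestor indicator matrix (Prop. 7)
N = R @ R.T                            # common-ancestor counts (Prop. 9)
d = np.diag(N)                         # node depths

A = 2 * I - Gd - Gd.T                  # Thm. 1: adjacency matrix
e1 = I[:, [0]]
L = Gd.T @ Gd - e1 @ e1.T              # Thm. 2: Laplacian matrix
D = d[:, None] + d[None, :] - 2 * N    # Thm. 3: distance matrix

print(np.allclose(np.diag(L), A.sum(axis=0)))  # True: diagonal = degrees
print(D[3, 5])                                 # 4.0: leaf-to-leaf distance
```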
5 The Application on Differentially Private Hierarchical Tree Release
5.1 Hierarchical Tree Release Based on Generation Matrix
In this section, we introduce how to apply Generation Matrix to achieve an optimally consistent release for differentially private hierarchical trees efficiently and concisely. Since each leaf of $T$ corresponds to a unit count $x_i$, we use the mapping matrix $\boldsymbol P$ to represent the mapping relationship between the unit counts and the hierarchical tree nodes. Using Prop. 2, the hierarchical tree building process can be described as
$$\boldsymbol v = \big(\boldsymbol G_T^{\mathsf T}\big)^{-1}\boldsymbol P\boldsymbol x. \qquad (25)$$
At the same time, we also define the inverse tree-building process as the following expression, which takes the leaves of $\boldsymbol v$ and restores them to $\boldsymbol x$:
$$\boldsymbol x = \boldsymbol P^{\mathsf T}\boldsymbol v. \qquad (26)$$
Although most works [13, 19, 20, 21, 14] do not take matrix analysis as the theoretical basis for optimally consistent release, applying matrix analysis has many advantages. One of them is error analysis: using matrix analysis, we can quickly calculate the overall mean square error of "Node Query" after the post-processing for the optimally consistent release.
Theorem 5.
Given the privacy budget $\epsilon$ and the to-be-released hierarchical tree $T$ containing $n$ nodes and $m$ leaves, whose height is $h$, the overall mean square errors before and after post-processing satisfy
$$\mathrm{MSE}\big[\tilde{\boldsymbol v}\big] = \frac{2nh^2}{\epsilon^2}, \qquad (27)$$
$$\mathrm{MSE}\big[\bar{\boldsymbol v}\big] = \frac{2mh^2}{\epsilon^2}. \qquad (28)$$
Proof.
See Appendix A.7. ∎
According to Thm. 5, the overall mean square error after post-processing depends on the number of leaves $m$. Generally, $m$ is much less than the number of nodes $n$, so post-processing will significantly reduce the error. As the proof in Appendix A.7 shows, we applied the properties of the matrix trace to obtain a concrete and concise demonstration, which embodies the great potential of matrix analysis for solving problems.
Next, we will introduce how to apply Generation Matrix to achieve an efficient enough algorithm.
5.2 QL Decomposition of $\boldsymbol C^{\mathsf T}$
To obtain an efficient release algorithm, we apply a QR-type decomposition to analyze Formula (8). Unfortunately, the traditional QR decomposition [23] cannot meet our analysis requirements, so we propose another decomposition form, namely the QL decomposition.
Definition 10 (QL Decomposition).
For a matrix $\boldsymbol M \in \mathbb R^{n\times m}$ with $n \ge m$, the QL decomposition looks for an orthogonal matrix $\boldsymbol Q$ which converts $\boldsymbol M$ into a form composed of a lower triangular matrix $\boldsymbol L$ and a zero matrix, i.e.,
$$\boldsymbol Q\boldsymbol M = \begin{bmatrix}\boldsymbol L\\ \boldsymbol 0\end{bmatrix}. \qquad (29)$$
Correspondingly, we call the traditional decomposition, which converts a matrix into an upper triangular matrix, the QR decomposition. Although the two have different forms, they both achieve the decomposition through a series of basic orthogonal transformations. Among the various orthogonal transformation techniques, the Householder transformation is the most widely used because of its high efficiency, easy implementation, and applicability to sparse matrices. Completing a QL decomposition requires one Householder transformation per column. To describe the QL decomposition process in more detail, we define the Householder transformation as follows.
Definition 11 (Householder Transformation).
For a matrix $\boldsymbol M$, given an ordered set of rows $R$ and a column number $c$, the Householder transformation takes the entries $\boldsymbol u = \boldsymbol M_{R,c}$ as the reference of the transformation. The transformation result makes all but one entry of $\boldsymbol u$ become $0$, while the rows outside $R$ stay unchanged, where the reflector $\boldsymbol H$ satisfies the following expression:
$$\boldsymbol H = \boldsymbol I - \frac{2\,\boldsymbol w\boldsymbol w^{\mathsf T}}{\boldsymbol w^{\mathsf T}\boldsymbol w}. \qquad (30)$$
For the parameters $(R, c)$, we denote the Householder transformation as $\mathcal H_{R,c}$. The QL decomposition based on the Householder transformation can be expressed as
$$\boldsymbol Q\boldsymbol M = \big(\mathcal H_{R_m,1}\circ\cdots\circ\mathcal H_{R_1,m}\big)(\boldsymbol M), \qquad (31)$$
where "$\circ$" represents the operation of function composition, such as $(f\circ g)(x) = f(g(x))$. For the QR decomposition, the columns are processed from the first to the last; for the QL decomposition, the columns are processed from the last to the first.
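Standard numerical libraries ship only the QR form, but a QL factorization can be obtained from it by reversing row and column order. The sketch below (ours, a generic numerical trick rather than the paper's Householder-based sparse procedure) shows the reduced variant; note that it arranges the factors as $\boldsymbol M = \boldsymbol Q\boldsymbol L$, whereas Def. 10 places the zero block explicitly in the full form.

```python
import numpy as np

def ql(A):
    """Economic QL: A = Q @ Lo with Lo lower triangular.

    If flip(A) = Q' R, then A = flip(Q') flip(R), and flip(R) is lower
    triangular, where flip reverses both the row and the column order.
    """
    Qf, Rf = np.linalg.qr(A[::-1, ::-1])
    return Qf[::-1, ::-1], Rf[::-1, ::-1]

A = np.random.randn(5, 3)
Q, Lo = ql(A)
print(np.allclose(Q @ Lo, A))        # True: valid factorization
print(np.allclose(np.tril(Lo), Lo))  # True: the factor is lower triangular
```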
Next, we apply the QL decomposition to $\boldsymbol C^{\mathsf T}$. To understand how the QL decomposition affects $\boldsymbol C^{\mathsf T}$, we divide it into the following form:
$$\boldsymbol C^{\mathsf T} = \begin{bmatrix}\boldsymbol G_T^{(1)}\\ -\boldsymbol P_F^{\mathsf T}\end{bmatrix}. \qquad (32)$$
The upper half of $\boldsymbol C^{\mathsf T}$ consists of the first $n^{(1)}$ rows, and the lower half consists of the remaining rows. According to Def. 5, the upper half equals $\boldsymbol G_T^{(1)}$ and the lower half equals $-\boldsymbol P_F^{\mathsf T}$, where the $j$-th element of the ordered set $F$ satisfies $F_j = f_{n^{(1)}+j}$, i.e., the parent of the $j$-th leaf. Considering the effects of the Householder transformations on $\boldsymbol C^{\mathsf T}$, we denote the matrix obtained after the $t$-th Householder transformation as $\boldsymbol M^{(t)}$, whose upper and lower halves are denoted as $\boldsymbol U^{(t)}$ and $\boldsymbol B^{(t)}$. $\boldsymbol M^{(0)}$ is the form before the Householder transformations, satisfying $\boldsymbol M^{(0)} = \boldsymbol C^{\mathsf T}$; $\boldsymbol M^{(n^{(1)})}$ is the result after QL decomposition. Thm. 6 demonstrates that, in the process of the Householder transformations, $\boldsymbol M^{(t)}$ keeps some invariant properties.
Theorem 6.
For the QL decomposition of $\boldsymbol C^{\mathsf T}$, after $t$ Householder transformations, $\boldsymbol M^{(t)}$ always keeps the following three properties unchanged:

a) $\boldsymbol U^{(t)} \simeq \boldsymbol G_T^{(1)}$;

b) each row of $\boldsymbol B^{(t)}$ has one and only one non-zero value;

c) the last $t$ columns of $\boldsymbol B^{(t)}$ (i.e., the columns from $n^{(1)}-t+1$ to $n^{(1)}$) are all $\boldsymbol 0$.
Proof.
See Appendix A.8. ∎
As shown in Fig. 5, as the Householder transformations proceed, the non-zero elements in $\boldsymbol B^{(t)}$ shift from right to left, and the positions of the non-zero elements of $\boldsymbol U^{(t)}$ remain unchanged. Specifically, in the $t$-th Householder transformation, all non-zero elements in the $c$-th column of $\boldsymbol B^{(t-1)}$, where $c = n^{(1)}-t+1$, are transferred to the $f_c$-th column. Combining Thm. 6, we can infer the specific form of $\boldsymbol M^{(t)}$ after $t$ Householder transformations. After the last Householder transformation, the result satisfies the following theorem.
Theorem 7.
After QL decomposition, there is a Generation Matrix $\boldsymbol G_C \simeq \boldsymbol G_T^{(1)}$ and an orthogonal matrix $\boldsymbol Q$ which satisfy:
$$\boldsymbol Q\,\boldsymbol C^{\mathsf T} = \begin{bmatrix}\boldsymbol G_C\\ \boldsymbol 0\end{bmatrix}. \qquad (33)$$
Proof.
See Appendix A.9. ∎
Thm. 7 implies that the calculation about $\boldsymbol C\boldsymbol C^{\mathsf T}$ in expression (8) can be replaced by a calculation about the triangular factor. Since this factor is related to $\boldsymbol C$ and the Generation Matrix simultaneously, we denote it as $\boldsymbol G_C$. Thm. 8 shows that $\boldsymbol G_C$ and $\boldsymbol C$ are inner-product-equivalent.
Theorem 8.
For an arbitrary $\boldsymbol C$, there is a $\boldsymbol G_C$ inner-product-equivalent to it, which satisfies
$$\boldsymbol G_C^{\mathsf T}\boldsymbol G_C = \boldsymbol C\boldsymbol C^{\mathsf T}, \qquad (34)$$
where $\boldsymbol G_C$ is the upper half of $\boldsymbol Q\boldsymbol C^{\mathsf T}$ after QL decomposition.
Proof.
See Appendix A.10. ∎
5.3 Generation Matrix-based Optimally Consistent Release Algorithm
The property of inner-product equivalence provides us with a vital optimization idea for the optimal release, as shown in Cor. 2. Using matrix analysis, we convert expression (8) into an expression about $\boldsymbol G_C$ and then use the mathematical properties of the Generation Matrix to improve the efficiency of the optimal release.
Corollary 2.
The expression $(\boldsymbol C\boldsymbol C^{\mathsf T})^{-1}\boldsymbol u$ can be obtained by performing an upward propagation and a downward propagation successively about $\boldsymbol G_C$. That is,
$$(\boldsymbol C\boldsymbol C^{\mathsf T})^{-1}\boldsymbol u = \boldsymbol G_C^{-1}\big(\boldsymbol G_C^{-\mathsf T}\boldsymbol u\big). \qquad (35)$$
Proof.
See Appendix A.11. ∎
According to Cor. 2, we get another form of the optimal release as follows:
$$\bar{\boldsymbol v} = \tilde{\boldsymbol v} - \boldsymbol C^{\mathsf T}\Big(\boldsymbol G_C^{-1}\big(\boldsymbol G_C^{-\mathsf T}(\boldsymbol C\tilde{\boldsymbol v})\big)\Big). \qquad (36)$$
The parentheses indicate that Formula (36) is calculated from right to left, ensuring that all multiplications and equation solves involve only a matrix and a vector. According to the sparsity of $\boldsymbol C$, the time complexity of the $\boldsymbol C$-related multiplications in Formula (36) is $O(n)$. Besides, according to Prop. 2 and Prop. 3, the time complexity of solving a linear system of a Generation Matrix is also $O(n)$. Therefore, the overall time complexity of Formula (36) is $O(n)$. Formula (36) completely summarizes the core process of the optimally consistent release: so long as we execute the formula directly after constructing $\boldsymbol G_C$, we can efficiently obtain the optimally consistent release. The algorithm description with Generation Matrices is very concise and easy to implement.
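The core step translates directly into code. The sketch below (ours) evaluates Formula (36) right to left with two sparse triangular solves, assuming the sparse constraint matrix `C` and the factor `G_C` (whose construction follows in Thm. 9) are given.

```python
import numpy as np
from scipy.sparse.linalg import spsolve_triangular

def gmc_postprocess(C, G_C, v_noisy):
    """Optimally consistent release via Formula (36); O(n) overall."""
    u = C @ v_noisy                                        # constraint residuals
    y = spsolve_triangular(G_C.T.tocsr(), u, lower=False)  # upward propagation
    z = spsolve_triangular(G_C, y, lower=True)             # downward propagation
    return v_noisy - C.T @ z
```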
However, the whole process is efficient only if the calculation of $\boldsymbol G_C$ is also highly efficient. So, we further propose Thm. 9 to calculate it.
Theorem 9.
Let $\boldsymbol a'$ and $\boldsymbol b'$ be the weights of the nodes and edges in $\boldsymbol G_C$, respectively. They satisfy
$$\boldsymbol b' = \boldsymbol 1 \oslash \boldsymbol a', \qquad (37)$$
where $a_i'$ satisfies
$$a_i' = \sqrt{\,1 + \big|\mathrm{child}(i)\big| - \sum_{j\in\mathrm{child}(i)\cap T^{(1)}} \frac{1}{a_j'^{\,2}}\,}. \qquad (38)$$
Proof.
See Appendix A.12. ∎
Thm. 9 shows that $\boldsymbol a'$ and $\boldsymbol b'$ can be calculated directly, and the calculation of $\boldsymbol a'$ only needs to traverse the nodes of $T^{(1)}$ from $n^{(1)}$ down to $1$ once. Since this process involves the calculation of $|\mathrm{child}(i)|$, the overall time complexity of constructing $\boldsymbol G_C$ is $O(n)$. The specific construction process is shown in Alg. 1.
In summary, we propose a Generation Matrix-based optimally consistent release algorithm (GMC) for differentially private hierarchical trees, described as Alg. 2. Note that Steps 1 to 3 in Alg. 2 are the normal hierarchical tree release process, and Step 4 is the call of Alg. 1. Only Step 5 is the core step, which uses Formula (36) to achieve the optimally consistent release. As proved above, the time complexity of Formula (36) is $O(n)$; therefore, the overall time complexity of GMC is also $O(n)$. Besides, GMC is a two-stage algorithm, i.e., the construction of $\boldsymbol G_C$ and the post-processing. If the same hierarchical tree is used for multiple releases, we only need to construct $\boldsymbol G_C$ once.
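Putting the pieces together, the following end-to-end sketch (ours) implements the construction of $\boldsymbol G_C$ under the recursion reconstructed in Thm. 9, computed in one bottom-up pass, and can then be combined with the post-processing sketch above; the helper names are our assumptions.

```python
import numpy as np

def build_gc(parent, n1):
    """Construct G_C (Alg. 1 sketch); nodes 0..n1-1 are the non-leaf nodes."""
    n = len(parent)
    t = np.ones(n1)                    # t_i accumulates 1 + |child(i)| ...
    for i in range(1, n):
        t[parent[i]] += 1.0            # ... one per child in T
    for i in range(n1 - 1, 0, -1):     # bottom-up over T^(1), root excluded
        t[parent[i]] -= 1.0 / t[i]     # ... minus 1/a'_j^2 per non-leaf child
    a = np.sqrt(t)                     # node weights a' (Eq. (38))
    return generation_matrix(parent[:n1], a, 1.0 / a)  # b' = 1 / a' (Eq. (37))

# Usage on the toy tree (3 non-leaf nodes); one can verify numerically that
# G_C^T G_C equals C C^T, as Thm. 8 requires.
G_C = build_gc([-1, 0, 0, 1, 1, 2, 2], 3)
```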
In addition, we can further optimize the algorithm. Considering that the construction of $\boldsymbol G_C$ involves square root extraction, which may cause more calculation overhead, we propose an improved version of Alg. 2 that avoids any square root. The main idea is shown as follows:
$$(\boldsymbol C\boldsymbol C^{\mathsf T})^{-1} = \boldsymbol G_{\boldsymbol t,\boldsymbol 1}^{-1}\,\mathrm{diag}(\boldsymbol t)\,\boldsymbol G_{\boldsymbol t,\boldsymbol 1}^{-\mathsf T}, \qquad (39)$$
where the vector $\boldsymbol t = \boldsymbol a'\circ\boldsymbol a'$, and "$\circ$" is the Hadamard product [32]. The new version without square root extraction can slightly improve the calculation efficiency, which shows that the algorithm under the matrix description has high scalability. The improvement only requires slight modifications to Formula (36). Furthermore, we can directly extend the existing model to solve other hierarchical tree problems, such as hierarchical tree release with a non-uniform privacy budget.
6 Experiment
We conducted experiments to verify the performance of our proposed GMC for differentially private hierarchical tree release. All our experiments run on a desktop computer with a dual-core AMD Ryzen CPU, using MATLAB as the development software. To improve the reliability of our experimental results, we repeatedly run the same experimental setup multiple times and take the average of the results as the final result. In addition, we denote the outputs of the algorithms by $\bar{\boldsymbol v}$ or $\bar{\boldsymbol x}$, and the outputs of the $k$-th repetition are denoted as $\bar{\boldsymbol v}^{(k)}$ and $\bar{\boldsymbol x}^{(k)}$, respectively.
6.1 Datasets and Comparison Algorithms
Our experiments run on large datasets with more than ten million nodes: Census2010, NYCTaxi, and SynData, with the following details.
Census2010 [33]: Since the U.S. Census Bureau announced the plan to adopt differential privacy [2], we take the 2010 U.S. Census dataset as our first dataset, which contains the demographic information of all Americans. The statistical results are divided into eight levels by geographic component, i.e., "United States - State - County - County Subdivision - Place/Remainder - Census Tract - Block Group - Block", constructing a hierarchical tree with more than ten million nodes. One of its typical applications is to provide users with queries on the population of an area of interest, called "Node Query". For example, a user submits the query "What is the population of Albany County in New York, USA?", and the system returns the value at the node "USA - New York - Albany County". After verification, we ensured that the data before adding noise satisfies consistency.
NYCTaxi [34]: This dataset comes from taxi ride records in New York City in 2013. To ensure the uniqueness of the data, we selected the data provided by Creative Mobile Technologies. According to the start time of each ride, we constructed a statistical histogram of taxi-ride frequency counted by the second and provide range queries over it. The process of building the hierarchical tree is randomized, which ensures that our algorithm can handle arbitrary hierarchical tree structures: the fan-out of each node is random, with fixed proportions of nodes taking different fan-outs. In this way, the hierarchical tree we build contains tens of millions of nodes.
SynData: To test hierarchical trees with special structures, we adopt a randomly synthesized dataset for our experiments. In the synthetic data, the hierarchical tree is a complete binary tree, and the number of nodes is controlled by the tree height $h$, i.e., $n = 2^h - 1$. We generate the unit counts by a Poisson distribution, i.e., $x_i \sim \mathrm{Poisson}(\lambda)$. In our experiments, we take the tree with the largest height as the complete dataset. Like NYCTaxi, we focus on "Range Query" for SynData.
Name | Description
---|---
Processless | No processing after adding noise.
Boosting | Boosting [13], the classic post-processing method, designed only for complete trees. For an arbitrary structure, we take the fan-out of the root as the fan-out parameter of the algorithm.
PrivTrie | The post-processing in PrivTrie [14], which has been proven to achieve the optimally consistent release for arbitrary hierarchical trees.
GMC | Our post-processing for arbitrary hierarchical trees, which achieves the optimally consistent release based on Generation Matrix.
Our experiments compare the algorithm settings detailed in Tab. 2. Because our experiments target large-scale hierarchical trees with more than ten million nodes, all the selected algorithms have $O(n)$ time complexity.
6.2 Verifying the Effectiveness for GMC
In the first experiment, we verify the effectiveness of GMC in two main aspects, i.e., error and consistency.
The error is measured by the root mean square error (RMSE), whose calculation methods are different for "Node Query" and "Range Query". We denote the RMSE of "Node Query" (for Census2010) by $\mathrm{RMSE}_N$, and its formula is
$$\mathrm{RMSE}_N = \sqrt{\frac{1}{Kn}\sum_{k=1}^{K}\sum_{i=1}^{n}\big(\bar v_i^{(k)} - v_i\big)^2}, \qquad (40)$$
where $K$ is the number of repetitions. The RMSE of "Range Query" (for NYCTaxi and SynData) is denoted as $\mathrm{RMSE}_R$, calculated by
$$\mathrm{RMSE}_R = \sqrt{\frac{1}{KQ}\sum_{k=1}^{K}\sum_{q=1}^{Q}\Big(\sum_{i\in R_q}\bar x_i^{(k)} - \sum_{i\in R_q} x_i\Big)^2}, \qquad (41)$$
where $Q$ is the number of range queries selected randomly and without repetition, and the range of the $q$-th query is recorded as $R_q$. The main reason for random sampling is that the number of all possible range queries is up to $m(m+1)/2$, so we cannot test all of them. In our experiment, we take a fixed $Q$ for all settings.
The consistency of the outputs is measured by the consistency bias. Let $\bar v_i^{(k)}$ be the $i$-th element of $\bar{\boldsymbol v}^{(k)}$; then the consistency bias satisfies
$$\mathrm{Bias}\big(\bar{\boldsymbol v}\big) = \frac{1}{K}\sum_{k=1}^{K}\big\|\boldsymbol C\,\bar{\boldsymbol v}^{(k)}\big\|_1, \qquad (42)$$
i.e., the total violation of "the sum of the children's values equals the value at the parent".
Besides, we adopt the complete datasets in this experiment and fix the privacy budget $\epsilon$.
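For reference, the snippet below (ours) sketches the three metrics; the exact normalizations in (40)-(42) are our reconstruction, and the prefix-sum trick makes the sampled range queries cheap.

```python
import numpy as np

rng = np.random.default_rng(0)

def rmse_node(v_bar, v):
    return np.sqrt(np.mean((v_bar - v) ** 2))

def rmse_range(x_bar, x, num_queries=1000):
    m = len(x)
    cs_bar = np.r_[0.0, np.cumsum(x_bar)]   # prefix sums: range sum in O(1)
    cs = np.r_[0.0, np.cumsum(x)]
    lo = rng.integers(0, m, num_queries)
    hi = rng.integers(0, m, num_queries)
    lo, hi = np.minimum(lo, hi), np.maximum(lo, hi) + 1
    err = (cs_bar[hi] - cs_bar[lo]) - (cs[hi] - cs[lo])
    return np.sqrt(np.mean(err ** 2))

def consistency_bias(C, v_bar):
    return np.abs(C @ v_bar).sum()   # total violation of parent-children sums
```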
Algorithm | Census2010 RMSE | Census2010 Bias | NYCTaxi RMSE | NYCTaxi Bias | SynData RMSE | SynData Bias
---|---|---|---|---|---|---
Processless | 11.32 | 49.61 | 138.42 | 48.38 | 166.37 | 61.24
Boosting | 11.30 | 159.01 | 120.75 | 22.42 | 74.94 | 0.00
PrivTrie | 11.00 | 0.00 | 68.69 | 0.00 | 74.94 | 0.00
GMC | 11.00 | 0.00 | 68.69 | 0.00 | 74.94 | 0.00
Since the optimally consistent release problem of the differentially private hierarchical tree is a convex optimization problem, its solution is unique. If the outputs of GMC are also optimally consistent, they should be the same as those of PrivTrie. The results in Tab. 3 confirm this point and show that GMC is effective and correct. However, this does not mean that PrivTrie and GMC are entirely equivalent: their implementations are completely different, so we further analyze the algorithms' performance in the subsequent experiments.
In addition, the results also show the major drawback of Boosting, i.e., it can only guarantee the consistency of complete trees. If the fan-outs of the nodes differ, Boosting cannot guarantee consistent results. Especially for Census2010, where the fan-outs of the nodes are highly diverse, Boosting even results in a larger consistency bias after post-processing.
6.3 Performance Testing
In this section, we focus on the algorithms' performance and test the running time of the algorithms above from small-scale to large-scale data. To construct hierarchical trees with different scales, we adopt different methods according to the characteristics of each dataset. For Census2010, we obtain smaller hierarchical trees by taking order subtrees; for example, in its 4-order subtree, the leaves represent the "County Subdivision" level. For NYCTaxi, we divide the data of 2013 by month, and the data of the first $k$ months forms the $k$-th subset. For SynData, we control the data scale by directly setting the tree height; in the experiment, we used six different tree heights to generate hierarchical trees with different scales. Since Processless does not perform any processing for consistency, we omit it in this experiment.
In Fig. 6, the experimental results show that all three algorithms can complete the post-processing in an acceptable time, but their running times are quite different. Boosting and PrivTrie need more than 200 seconds to process the hierarchical trees with tens of millions of nodes, while GMC needs only a few seconds to process the same data (see Fig. 6a), a performance gap of well over an order of magnitude. It shows that even though Boosting and PrivTrie have reached the lowest time complexity of $O(n)$, they still leave much room for performance improvement. In addition to their relatively inefficient recursive implementations, another important reason is that GMC uses standard matrix operations to complete the post-processing after establishing the matrix model. These standard matrix operations introduce many optimization techniques in the underlying design, which make full use of computer resources and significantly improve computing efficiency. Although the results above do not deny that Boosting and PrivTrie can handle large-scale hierarchical trees, applying them to some scenarios, such as real-time data updates or more complex models, may cause considerable challenges in computational efficiency.
Finally, we test the running time of the two stages of GMC, i.e., the Generation Matrix construction (Stage 1) and the post-processing (Stage 2). In Fig. 7, we can see that the time overhead of Stage 1 is much more significant than that of Stage 2. The reason is that the Generation Matrix construction includes sorting by Descending Order of Height, counting the numbers of children, and some high-cost operations such as division and square roots (in the process of calculating $\boldsymbol a'$). It is worth noting that the Generation Matrix construction is independent of the data to be released and does not involve individual privacy. When the same hierarchical tree structure is reused in multiple releases, we only need to perform the construction process once, further reducing the overall running time of the optimally consistent releases.
7 Conclusions and Future Work
In this paper, we defined Generation Matrix and demonstrated many of its critical mathematical properties. Using Generation Matrix, we can implement various hierarchical tree operations without accessing the local structure, which provides crucial theoretical support for the Matrixing Research Method of hierarchical trees. The application to differentially private hierarchical tree release reflects the practicability of Generation Matrix. The proposed GMC provides a concise algorithm for the optimally consistent release. Our experiments show that GMC achieves a significant performance improvement of more than an order of magnitude compared with the state-of-the-art schemes.
The scientific problem that we solve in this paper is not very complicated, but it is classic and suitable as an example to show the practicability of Generation Matrix. However, the hierarchical tree problems that Generation Matrix can solve go far beyond this application. We can use it to explore more complex hierarchical tree release problems, even problems that have not been solved so far. For example, for the non-negative consistent release of hierarchical trees, there is currently no closed-form solution with $O(n)$ time complexity, which makes it very difficult to obtain the optimal release satisfying both consistency and non-negativity for large-scale data. Nonetheless, Generation Matrix provides us with a critical analysis tool to challenge such problems.
Appendix A Partial Proofs
A.1 Proof of Property 7
Proof.
According to the basic properties of lower triangular matrices [24], the inverse of a lower triangular matrix is also a lower triangular matrix.
Considering the $j$-th column vector $\boldsymbol r_j$ of $\boldsymbol G_T^{-1}$, whose $i$-th element is denoted as $r_{ij}$, we have $\boldsymbol G_T\boldsymbol r_j = \boldsymbol e_j$. Since only the $j$-th element of $\boldsymbol e_j$ is $1$ while the rest are all $0$, we get $r_{jj} = 1$, and for $i \ne j$, $r_{ij}$ satisfies
$$r_{ij} = r_{f_i,j}. \qquad (43)$$
Next, we adopt the contradiction method to prove. Suppose $j$ is an ancestor of $i$, but $r_{ij} = 0$.
Let $i, f_i, f_{f_i}, \dots$ denote node $i$ and its ancestors, respectively.
Since $j$ is an ancestor of $i$, there is a node $k$ in this chain such that $f_k = j$. Therefore, we have $r_{ij} = r_{kj} = r_{jj} = 1$ according to the recurrence formula (43).
The conclusion contradicts $r_{ij} = 0$. Therefore, when $j$ is an ancestor of $i$, $r_{ij} = 1$.
If $j$ is neither $i$ nor an ancestor of $i$, then $j$ is obviously not the root, i.e., $j \ne 1$. Following the recurrence (43) along the ancestors of $i$ up to the root, we have $r_{ij} = r_{1j}$. Since $\boldsymbol G_T^{-1}$ is lower triangular, $r_{1j} = 0$ for $j \ne 1$; thus $r_{ij} = 0$. ∎
A.2 Proof of Property 9
Proof.
Considering the $i$-th row of $\boldsymbol G_T^{-1}$, we denote $S_i$ as the set formed by the column subscripts of the non-zero elements in the $i$-th row. According to $\boldsymbol N = \boldsymbol G_T^{-1}\big(\boldsymbol G_T^{-1}\big)^{\mathsf T}$, we have that $n_{ij}$ satisfies
$$n_{ij} = \sum_{k} r_{ik}\,r_{jk} = \big|S_i \cap S_j\big|. \qquad (44)$$
According to Prop. 7, $S_i$ consists of $i$ and its ancestors. Therefore, $S_i\cap S_j$ is the set of common ancestors of $i$ and $j$, and $n_{ij}$ records the number of their common ancestors.
Specifically, if $i = j$, then $n_{ii} = |S_i|$, i.e., the number of ancestors of $i$ plus $1$ (itself). Letting the depth of the root be $1$, $n_{ii}$ is the depth of $i$. ∎
A.3 Proof of Property 11
Proof.
Let $p_i$ and $q_i$ denote the elements of $\boldsymbol p$ and $\boldsymbol q$. According to Equation (19), we need $p_i q_i = a_i$ for every node $i$ and $p_i q_{f_i} = b_i$ for every non-root node $i$. Therefore, the following matrix equation holds.
(45) |
According to the property of Block Matrix Inversion, we have
(46) |
And then,
(47) |
A.4 Proof of Theorem 2
Proof.
Let $l_{ij}$ denote the element in row $i$ and column $j$ of $\boldsymbol L_T$. According to the definition of the Laplacian Matrix [6], the diagonal element $l_{ii}$ represents the number of nodes adjacent to $i$; and if $j$ is adjacent to $i$, i.e., $f_i = j$ or $f_j = i$, we have $l_{ij} = -1$; otherwise, $l_{ij} = 0$.
According to expression (22), we have
$$l_{ij} = \sum_{k\in K_i\cap K_j} g_{ki}\,g_{kj} - \delta_{i1}\delta_{j1}, \qquad (48)$$
where $g_{ki}$ is the element of $\boldsymbol G_T$ and $K_j$ denotes the set formed by the row subscripts of the non-zero elements in the $j$-th column of $\boldsymbol G_T$, i.e., $K_j = \{j\}\cup\mathrm{child}(j)$.
Next, we discuss the following situations:
For $i = j = 1$, under the definition of the Laplacian Matrix, $l_{11}$ should be equal to the number of children of the root, i.e., $|\mathrm{child}(1)|$. According to expression (22), there is $\sum_{k\in K_1} g_{k1}^2 = 1 + |\mathrm{child}(1)|$, and subtracting $\delta_{11}\delta_{11} = 1$ leaves $|\mathrm{child}(1)|$. It conforms to the definition of the Laplacian Matrix.
For $i = j \ne 1$, under the definition of the Laplacian Matrix, $l_{ii}$ should be equal to the number of $i$'s children plus its parent, i.e., $|\mathrm{child}(i)| + 1$. According to expression (22), $l_{ii} = \sum_{k\in K_i} g_{ki}^2 = 1 + |\mathrm{child}(i)|$. It conforms to the definition of the Laplacian Matrix.
For $f_i = j$ but $i \ne j$, under the definition of the Laplacian Matrix, $l_{ij} = -1$. According to expression (22), $K_i\cap K_j = \{i\}$, so we have $l_{ij} = g_{ii}\,g_{ij} = -1$. Again, it conforms to the definition of the Laplacian Matrix. Similarly, if $f_j = i$ and $i \ne j$, $l_{ij} = -1$. Finally, when $i \ne j$ and $i$ and $j$ do not have a parent-child relationship, $K_i\cap K_j = \varnothing$, so $l_{ij} = 0$, which also conforms to the definition of the Laplacian Matrix.
Therefore, in any case, expression (22) always conforms to the definition of the Laplacian Matrix. ∎
A.5 Proof of Theorem 3
Proof.
Let $d_{ij}$ denote the element in row $i$ and column $j$ of $\boldsymbol D_T$.
According to the definition of the distance matrix [7], $d_{ij}$ is the distance from node $i$ to node $j$. As shown in Fig. 8, let $C$ (the red node) be the nearest common ancestor of nodes $i$ and $j$, and let $\delta_C$ denote the depth of node $C$ (i.e., the distance from node $C$ to the root (the orange node) plus $1$); $\delta_i$ and $\delta_j$ denote the depths of $i$ and $j$. Obviously, the distance satisfies
$$d_{ij} = (\delta_i - \delta_C) + (\delta_j - \delta_C). \qquad (49)$$
Since Prop. 9 gives $\delta_C = n_{ij}$, $\delta_i = n_{ii}$ and $\delta_j = n_{jj}$, expression (23) follows. ∎
A.6 Proof of Theorem 4
Proof.
Let $c_{ij}$ denote the element in row $i$ and column $j$ of $\boldsymbol C_T$.
According to the definition of the Ancestral Matrix [8], $c_{ij}$ represents the distance from the nearest common ancestor of leaves $i$ and $j$ to the root.
Note that Prop. 9 has proved that the element $n_{ij}$ of $\boldsymbol G_T^{-1}\big(\boldsymbol G_T^{-1}\big)^{\mathsf T}$ is the number of common ancestors, i.e., the distance from the nearest common ancestor to the root node plus $1$.
Therefore, we can get $\boldsymbol C_T$ by taking the sub-matrix corresponding to the leaves of $\boldsymbol G_T^{-1}\big(\boldsymbol G_T^{-1}\big)^{\mathsf T}$ and then subtracting $1$ from every element. ∎
A.7 Proof of Theorem 5
Proof.
According to Formula (5), the noise we add to each element of $\boldsymbol v$ is i.i.d. and satisfies $\mathrm{Lap}(h/\epsilon)$. Since the variance of $\mathrm{Lap}(h/\epsilon)$ is $2h^2/\epsilon^2$, the covariance matrix of $\boldsymbol z$ is $\frac{2h^2}{\epsilon^2}\boldsymbol I_n$.
According to the mean square error analysis of differential privacy, we have
$$\mathrm{MSE}\big[\tilde{\boldsymbol v}\big] = \mathbb E\big[\|\boldsymbol z\|_2^2\big] = \mathrm{tr}\Big(\frac{2h^2}{\epsilon^2}\boldsymbol I_n\Big) = \frac{2nh^2}{\epsilon^2}. \qquad (50)$$
For the post-processed release, $\bar{\boldsymbol v} - \boldsymbol v$ is the orthogonal projection of $\boldsymbol z$ onto the null space of $\boldsymbol C$, whose dimension is $n - n^{(1)} = m$. Since the trace of an orthogonal projector equals the dimension of its range, $\mathrm{MSE}\big[\bar{\boldsymbol v}\big] = \frac{2mh^2}{\epsilon^2}$, which completes the proof. ∎
A.8 Proof of Theorem 6
Proof.
By Def. 10, it is obvious that the process of the Householder transformations satisfies property c), because the process of QL decomposition sets the columns to zero one by one, from the last column to the first column.
According to Def. 5, $\boldsymbol M^{(0)} = \boldsymbol C^{\mathsf T}$ satisfies properties a), b) and c).
Assume that $\boldsymbol M^{(t-1)}$ satisfies properties a), b) and c). By Formula (31), the expression of the $t$-th Householder transformation is as follows:
(55)
Considering that $\boldsymbol U^{(t-1)}$ satisfies property a) and is a lower triangular matrix, the reference vector of the transformation can be simplified.
According to Def. 11, during the transformation process, $\boldsymbol U^{(t)}$ and $\boldsymbol B^{(t)}$ satisfy
(56) |
By direct calculation, we have
(57) |
Since $\boldsymbol M^{(t-1)}$ satisfies property b), for the same row, at most one of the two entries involved is non-zero. Therefore,
(58) |
Simplifying according to Formula (30), we can get $\boldsymbol M^{(t)}$ after the $t$-th Householder transformation, whose element in row $i$ and column $j$ satisfies
(59) |
According to the recursive expression (59), we consider the properties of $\boldsymbol M^{(t)}$. First, consider property a). Only one row among the first $n^{(1)}$ rows is affected by the transformation, and according to the sparsity of the Generation Matrix, there is no mutual conversion between the non-zero and zero elements in the affected row. Hence $\boldsymbol M^{(t)}$ satisfies property a).
Next, consider property b). Let $c = n^{(1)}-t+1$ be the column processed by the $t$-th transformation. According to property c), the columns after $c$ in $\boldsymbol B^{(t-1)}$ are already $\boldsymbol 0$. Considering the $c$-th column of $\boldsymbol B^{(t)}$, we have
(60)
Therefore, the $t$-th Householder transformation is equivalent to transferring the non-zero elements from the $c$-th column of $\boldsymbol B^{(t-1)}$ to the $f_c$-th column. The process keeps property b) unchanged.
In summary, $\boldsymbol M^{(t)}$ also satisfies properties a), b) and c). Since $\boldsymbol M^{(0)}$ satisfies properties a), b) and c), by induction all $\boldsymbol M^{(t)}$ satisfy properties a), b) and c). ∎
A.9 Proof of Theorem 7
Proof.
By Thm. 6, after the first $n^{(1)}-1$ Householder transformations, $\boldsymbol M^{(n^{(1)}-1)}$ satisfies properties a), b) and c).
Consider the last ($n^{(1)}$-th) Householder transformation, which processes the first column. Only the first row of the upper half is non-zero in this column, and it is the only element affected by the last transformation. The transformed element satisfies
(61)
Therefore, the upper half is still similar to $\boldsymbol G_T^{(1)}$ after the last transformation, i.e., there is a Generation Matrix $\boldsymbol G_C \simeq \boldsymbol G_T^{(1)}$ satisfying $\boldsymbol U^{(n^{(1)})} = \boldsymbol G_C$. Besides, according to Def. 11 and property c), $\boldsymbol B^{(n^{(1)})} = \boldsymbol 0$ after the last transformation. ∎
A.10 Proof of Theorem 8
Proof.
Let $\boldsymbol G_C$ be the upper half of $\boldsymbol Q\boldsymbol C^{\mathsf T}$ after the QL decomposition. According to Thm. 7, we have
$$\boldsymbol C^{\mathsf T} = \boldsymbol Q^{\mathsf T}\begin{bmatrix}\boldsymbol G_C\\ \boldsymbol 0\end{bmatrix}. \qquad (62)$$
Substituting it into the expression $\boldsymbol C\boldsymbol C^{\mathsf T}$, we have
$$\boldsymbol C\boldsymbol C^{\mathsf T} = \begin{bmatrix}\boldsymbol G_C^{\mathsf T} & \boldsymbol 0\end{bmatrix}\boldsymbol Q\boldsymbol Q^{\mathsf T}\begin{bmatrix}\boldsymbol G_C\\ \boldsymbol 0\end{bmatrix} = \boldsymbol G_C^{\mathsf T}\boldsymbol G_C. \qquad (63)$$
∎
A.11 Proof of Corollary 2
A.12 Proof of Theorem 9
Proof.
Define a sequence that satisfies
(65) |
where the sequence is indexed over the nodes of $T^{(1)}$. The $t$-th Householder transformation is the Householder transformation on the $(n^{(1)}-t+1)$-th column. We let $\mathrm{child}(i)$ denote the set of children of node $i$.
By recursive expression (59), in the first Householder transformations, only the Householder transformation on the -th column (i.e., the -th transformation) will cause the value of to change. Therefore, satisfies
(66) |
For a non-leaf node , since the values of and haven’t changed in the first Householder transformations, there are and . Therefore, satisfies
(67) |
Therefore, for , we have
(68) |
According to formula (65), satisfies
(69) |
According to Thm. 6, the process of QL decomposition always satisfies property b), so at most one item in the sum is non-zero, and we have
(70) |
Therefore,
(71)
References
- [1] S. Niazi, M. Ismail, S. Grohsschmiedt, M. Ronström, S. Haridi, J. Dowling, HopsFS: Scaling hierarchical file system metadata using NewSQL databases.
- [2] J. Abowd, The u.s. census bureau adopts differential privacy, 2018, pp. 2867–2867. doi:10.1145/3219819.3226070.
- [3] T. Lima, M. de Aguiar, Laplacian matrices for extremely balanced and unbalanced phylogenetic trees (08 2020).
- [4] S. Garfinkel, J. Abowd, S. Powazek, Issues encountered deploying differential privacy (09 2018). doi:10.1145/3267323.3268949.
- [5] X. Li, Z. Wang, Trees with extremal spectral radius of weighted adjacency matrices among trees weighted by degree-based indices, Linear Algebra and its Applications 620. doi:10.1016/j.laa.2021.02.023.
- [6] S. Ganesh, S. Mohanty, Trees with matrix weights: Laplacian matrix and characteristic-like vertices (09 2020).
- [7] R. Bapat, S. Sivasubramanian, Squared distance matrix of a tree: Inverse and inertia, Linear Algebra and its Applications 491. doi:10.1016/j.laa.2015.09.008.
- [8] E. Andriantiana, K. Dadedzi, S. Wagner, The ancestral matrix of a rooted tree, Linear Algebra and Its Applications 575 (2019) 35–65. doi:10.1016/j.laa.2019.04.004.
- [9] W. Fulton, J. Harris, Graduate texts in mathematics, Representation Theory. A First Course, Readings in Mathematics 129.
- [10] J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, M. Sun, Graph neural networks: A review of methods and applications, AI Open 1 (2020) 57–81. doi:10.1016/j.aiopen.2021.01.001.
- [11] X. A. Chen, Understanding spectral graph neural network.
- [12] M. Deveci, C. Trott, S. Rajamanickam, Multi-threaded sparse matrix-matrix multiplication for many-core and gpu architectures, Parallel Computing 78. doi:10.1016/j.parco.2018.06.009.
- [13] M. Hay, V. Rastogi, G. Miklau, D. Suciu, Boosting the accuracy of differentially-private histograms through consistency, Proceedings of the VLDB Endowment 3. doi:10.14778/1920841.1920970.
- [14] N. Wang, X. Xiao, Y. Yang, T. Hoang, H. Shin, J. Shin, G. Yu, Privtrie: Effective frequent term discovery under local differential privacy, 2018, pp. 821–832. doi:10.1109/ICDE.2018.00079.
- [15] J. Jansson, K. Sadakane, W.-K. Sung, Ultra-succinct representation of ordered trees with applications, J. Comput. Syst. Sci. 78 (2012) 619–631. doi:10.1016/j.jcss.2011.09.002.
- [16] D. Tsur, Succinct representation of labeled trees, Theoretical Computer Science 562. doi:10.1016/j.tcs.2014.10.006.
- [17] A. Farzan, J. Munro, Succinct representation of dynamic trees, Theor. Comput. Sci. 412 (2011) 2668–2678. doi:10.1016/j.tcs.2010.10.030.
- [18] C. Dwork, F. McSherry, K. Nissim, A. Smith, Calibrating noise to sensitivity in private data analysis, Journal of Privacy and Confidentiality 7 (2017) 17–51. doi:10.29012/jpc.v7i3.405.
- [19] W. Qardaji, W. Yang, N. Li, Understanding hierarchical methods for differentially private histograms, Proceedings of the VLDB Endowment 6 (2013) 1954–1965. doi:10.14778/2556549.2556576.
- [20] G. Cormode, M. Procopiuc, E. Shen, D. Srivastava, T. Yu, Differentially private spatial decompositions, Computing Research Repository - CORRdoi:10.1109/ICDE.2012.16.
- [21] S. Yuan, D. Pi, X. Zhao, M. Xu, Differential privacy trajectory data protection scheme based on r-tree, Expert Systems with Applications 182 (2021) 115215. doi:10.1016/j.eswa.2021.115215.
- [22] J. Lee, Y. Wang, D. Kifer, Maximum likelihood postprocessing for differential privacy under consistency constraints, 2015, pp. 635–644. doi:10.1145/2783258.2783366.
- [23] G. Strang, Linear algebra and learning from data, Wellesley-Cambridge Press Cambridge, 2019.
- [24] G. Birkenmeier, H. Heatherly, J. Kim, J. Park, Triangular matrix representations, Journal of Algebra 230 (2000) 558–595. doi:10.1006/jabr.2000.8328.
- [25] L. Minah, A. Fox, G. Sanders, Rounding error analysis of mixed precision block householder qr algorithms, SIAM Journal on Scientific Computing 43 (2021) A1723–A1753. doi:10.1137/19M1296367.
- [26] P. Desai, S. Aslan, J. Saniie, Fpga implementation of gram-schmidt qr decomposition using high level synthesis, 2017, pp. 482–487. doi:10.1109/EIT.2017.8053410.
- [27] W. Fam, A. Alimohammad, Givens rotation-based qr decomposition for mimo systems, IET Communications 11. doi:10.1049/iet-com.2016.0789.
- [28] R. Stanley, Enumerative combinatorics — volume 1.
- [29] The MathWorks, MATLAB incorporates LAPACK: Increasing the speed and capabilities of matrix computation, [Online] (2000).
- [30] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, L. Kaiser, M. Kudlur, J. Levenberg, X. Zheng, Tensorflow : Large-scale machine learning on heterogeneous distributed systems (01 2015).
- [31] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, T. Darrell, Caffe: Convolutional architecture for fast feature embedding, MM 2014 - Proceedings of the 2014 ACM Conference on Multimediadoi:10.1145/2647868.2654889.
- [32] J. Magnus, Matrix differential calculus with applications in statistics and econometricsdoi:10.1002/9781119541219.
- [33] U. C. Bureau, 2010 census summary file 1, https://www.census.gov/prod/cen2010/doc/sf1.pdf (2012).
- [34] New york city taxi data, http://www.nyc.gov/html/tlc/html/about/trip/record/data.shtml.