Point Cloud Attribute Compression via Successive Subspace Graph Transform
978-1-7281-8068-7/20/$31.00 ©2020 IEEE
Abstract
Inspired by the recently proposed successive subspace learning (SSL) principles, we develop a successive subspace graph transform (SSGT) to address point cloud attribute compression. The octree geometry structure is used to partition the point cloud, where every node of the octree represents a point cloud subspace of a certain spatial size. We design a weighted graph with self-loops to describe the subspace and define a graph Fourier transform based on the normalized graph Laplacian. The transforms are applied to large point clouds recursively from the leaf nodes to the root node of the octree, while the represented subspace expands successively from the smallest one to the whole point cloud. Experimental results show that the proposed SSGT method offers better rate-distortion (R-D) performance than the previous Region Adaptive Hierarchical Transform (RAHT) method.
Index Terms:
Point cloud compression, graph Laplacian, successive subspace learning, successive subspace graph transform
I Introduction
Three-dimensional (3D) object processing has received a lot of attention in recent years due to rapid developments in autonomous driving, gaming, and remote sensing. The point cloud is one of the most popular data formats for representing a 3D object in many applications because it is easy to acquire and gives a complete description of the 3D space. Point cloud models are typically composed of a large number of points associated with attributes (e.g., color), leading to large storage requirements and high computational cost. This makes point cloud compression a critical and valuable research topic.
Point cloud compression has been widely studied in the research community. The octree structure is an effective representation of point clouds and has been adopted in point cloud geometry coding, e.g., [1][2]. In this paper, we use this well-developed octree method for geometry coding. As for attribute compression, many efforts have been made to explore the relationships among points based on the octree structure. Zhang et al. [3] designed graphs at a certain level of the octree, using a subset of points as the vertex set and connecting points within a threshold distance. A graph transform is then applied to encode the point cloud attributes. This scheme performed better than the traditional discrete cosine transform (DCT) in point cloud compression. However, their graph construction creates isolated sub-graphs when point clouds are sparse. To address this problem, Cohen et al. [4] proposed a K-nearest neighbor (KNN) method that adds edges between more distant points in the graph.
For these one-stage graph transform methods, better performance is reported when a larger graph is constructed at a higher level of the octree, but at a higher computational cost. De Queiroz and Chou proposed the region adaptive hierarchical transform (RAHT) [5] to achieve more efficient attribute compression via a multi-stage transform scheme based on the octree structure. However, each stage applies the transform to only two nodes, which is too weak to fully represent the spatial correlations among points.
Motivated by the successive subspace learning (SSL) framework [6], we design a new hierarchical transform method, called the successive subspace graph transform (SSGT), for point cloud attribute compression. Existing SSL-based works [7, 8] address image and point cloud classification. In these works, effective features of successively growing subspaces (points or patches) are extracted through the multi-stage Saab (subspace approximation with adjusted bias) transform [7] and used for classification. Following SSL principles, the proposed SSGT takes advantage of the “successive subspace growing” process to handle large point clouds. The graph of a larger subspace is built upon those of its smaller constituent subspaces, offering an effective description of the relationships among points.
There are two major contributions of this work. First, we propose a novel way to construct a weighted graph with self-loops to represent a subspace of the point cloud, and we use the graph Fourier transform defined on the normalized Laplacian to encode the color attributes. Second, we adopt the SSL framework for point cloud attribute compression and introduce the successive subspace graph transform scheme. It provides a more efficient and effective way to represent the spatial correlations among a large number of points.
II Proposed SSGT method
II-A Framework overview
An overview of the proposed SSGT method is shown in Fig. 1. To transform a point cloud model consisting of a large number of points, we first apply the octree decomposition: the point cloud is recursively decomposed into eight octants to form an octree whose root is the whole point cloud and whose occupied leaves are individual points. Occupied nodes at different levels represent subspaces of the point cloud model with different spatial sizes. We design a weighted graph with self-loops to describe each subspace and define the graph Fourier transform based on the symmetric normalized Laplacian. The attribute signal (e.g., color) can then be transformed to the graph spectral domain, which has the energy compaction property. The first-stage graph transform is applied at the bottom of the octree. Then, multi-stage transforms are applied from child nodes to their parents, stage by stage, until the root node is reached. The proposed SSGT framework successively learns the point cloud subspace and offers an efficient and effective way to compress point clouds.
II-B Graph Construction
Problem Formulation. We define a point cloud of points as , where . Assume occupies the space of size , and can be partitioned using the octree decomposition with depth . The octree will contain leaf nodes with occupied ones and the spatial size of the leaf node is .
At the level , the th node represents a subspace of the point cloud of size . We can construct a weighted graph with self-loops, , to describe it, where the vertex set is composed of all occupied child nodes. The designed graph models the relationships among the nodes, and the corresponding graph Fourier transform, , is calculated based on the symmetric normalized graph Laplacian. The multi-stage transforms can be applied from level to the root level, and the represented subspace expands from size to accordingly.
Designed Graph. The graph design is illustrated in Fig. 2. For the th node at the th level, we can construct a weighted graph with self-loops, , where denotes a set of vertices, which contains all occupied child nodes at level . The set of edges is defined by . denotes an edge weight matrix and denotes the number of vertices.
The spatial position of the is denoted as and the number of the points in the is denoted as . The weight matrix is defined by
(1) |
(2) |
To capture the relationships among vertices, the edge weights are designed around two aspects: 1) the number of points (i.e., ) within each vertex, and 2) the distances between vertex pairs. For the first aspect, we assume that each point in the subspace is related to all other points, which is represented by sub-edges connecting every pair of points. In th vertex, there are sub-edges among points. From th vertex to th vertex, there are sub-edges among points. The number of sub-edges is introduced into Eq. 1 to represent the impact of the different numbers of points in the vertices. For the second aspect, the distances between vertex pairs are used as expressed in Eq. 2, where the parameter controls how fast the edge weight decays. The same is adopted over all transform stages. In the experiments, we simply use the voxelized positions of the vertices, and the voxel grid size increases from to when applying the transform recursively from the level to the root node.
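The two design aspects above can be illustrated with a small sketch. Since Eqs. (1) and (2) are not reproduced here, the weight form below is a hypothetical stand-in and not the paper's definition: it assumes a sub-edge count of n_i * n_j between vertices i and j (including i = j for the self-loop) and a Gaussian-style decay exp(-dist^2 / sigma^2). The function name `build_weight_matrix` and the parameter name `sigma` are likewise assumptions.

```python
import numpy as np

def build_weight_matrix(positions, counts, sigma=1.0):
    """Hypothetical self-loop graph weights (NOT the paper's Eqs. 1-2).

    positions: (M, 3) voxelized positions of the occupied child nodes.
    counts:    (M,) number of points contained in each child node.
    sigma:     assumed decay parameter for the distance term.
    """
    p = np.asarray(positions, dtype=float)
    n = np.asarray(counts, dtype=float)
    # Squared Euclidean distance between every pair of vertices (0 on the diagonal).
    diff = p[:, None, :] - p[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # Aspect 1 (assumed form): number of sub-edges between vertices i and j.
    sub_edges = np.outer(n, n)
    # Aspect 2 (assumed form): weights decay with distance; the diagonal keeps
    # the full sub-edge count and acts as the self-loop weight.
    return sub_edges * np.exp(-dist2 / sigma ** 2)

# Example: three occupied child nodes holding 1, 2 and 3 points.
W_example = build_weight_matrix([[0, 0, 0], [1, 0, 0], [0, 1, 1]], [1, 2, 3], sigma=2.0)
```

Under these assumptions the matrix is symmetric, nonnegative, and has a nonzero diagonal, which is all the later transform steps require.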
Graph Fourier transforms. Based on the designed graph, we can compute the degree matrix , where . The symmetric normalized Laplacian matrix [9] is expressed as
(3) |
(4) |
The eigenvalue decomposition of :
(5) |
where is the matrix whose columns are the eigenvectors of and is the matrix whose diagonal terms are the corresponding eigenvalues, sorted in increasing order. The graph signal can be transformed by into the graph spectral domain.
The normalized Laplacian matrix is used so that the self-loops in the graph affect the transform. Based on the properties of the symmetric normalized Laplacian matrix [10], we have and its eigenvector . We can treat as the DC transform kernel, and the DC coefficient is the weighted average of the input graph signal. The other eigenvectors serve as the AC transform kernels and describe the high-frequency details. In the proposed SSGT method, only the DC coefficients are propagated to the next stage.
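The transform described above can be sketched in NumPy for an arbitrary symmetric weight matrix with self-loops. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `graph_fourier_basis`, the example weights, and the sign convention for the DC kernel are all hypothetical. It relies on the standard property that the symmetric normalized Laplacian of a connected graph has a single zero eigenvalue whose eigenvector is proportional to the square roots of the vertex degrees.

```python
import numpy as np

def graph_fourier_basis(W):
    """Eigen-decompose the symmetric normalized Laplacian of a graph with self-loops."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                    # degrees, self-loop weights included
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.diag(d) - W                   # combinatorial Laplacian
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    L_sym = 0.5 * (L_sym + L_sym.T)      # remove round-off asymmetry
    eigvals, U = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    if U[:, 0].sum() < 0:                # assumed convention: positive DC kernel
        U[:, 0] = -U[:, 0]
    return eigvals, U

# Example weights (arbitrary values chosen for illustration only).
W = np.array([[4.0, 1.0, 0.5],
              [1.0, 1.0, 2.0],
              [0.5, 2.0, 9.0]])
eigvals, U = graph_fourier_basis(W)

x = np.array([10.0, 12.0, 11.0])         # graph signal, e.g. luminance values
coeffs = U.T @ x                          # coeffs[0] is DC, coeffs[1:] are AC

# The DC coefficient is a degree-weighted combination of the signal.
d = W.sum(axis=1)
dc_expected = np.sqrt(d) @ x / np.sqrt(d.sum())
```

Because the self-loop weights enter the degrees, they change the DC kernel even though they cancel in the combinatorial Laplacian, which is consistent with the motivation given above for choosing the normalized form.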
II-C Multi-stage transforms
To transform large point clouds, we cascade multiple subspace graph transforms so that the subspace size expands gradually. The subspace transform starts at level of the octree and is applied to the occupied child nodes in the th level, each of which is a point. The attribute signal defined on the points can be transformed accordingly. The DC coefficients are fed into the upper level as a new signal defined on the nodes at level . This process is repeated stage by stage until the root node is reached.
The speed of the subspace expansion can be adjusted in the proposed framework. If we learn the graph transforms at every level of the octree, the subspace represented by a parent node is eight times larger than that of its child nodes. We can increase the expansion speed by applying the subspace transform at every two levels, which makes the expansion eight times faster than in the previous setting. However, if the expansion is too fast, the graph will contain too many vertices and its computational cost will increase. Conversely, a small graph has limited capacity to model the subspace and yields worse compression performance.
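The cascade and the adjustable expansion speed can be sketched as follows. This is an illustrative sketch under several assumptions rather than the authors' code: the edge weights use a hypothetical form (n_i * n_j * exp(-dist^2 / sigma^2)) because Eqs. (1) and (2) are not reproduced here, vertex positions are taken as integer voxel coordinates at the current level, and the names `block_transform`, `ssgt_forward`, and `levels_per_stage` are invented for the example.

```python
import numpy as np

def block_transform(pos, counts, x, sigma):
    """One subspace graph transform; returns (DC coefficient, AC coefficients)."""
    dist2 = np.sum((pos[:, None, :] - pos[None, :, :]) ** 2, axis=-1)
    W = np.outer(counts, counts) * np.exp(-dist2 / sigma ** 2)  # hypothetical weights
    d = W.sum(axis=1)
    L_sym = np.eye(len(d)) - W / np.sqrt(np.outer(d, d))
    _, U = np.linalg.eigh(0.5 * (L_sym + L_sym.T))
    if U[:, 0].sum() < 0:                  # assumed convention: positive DC kernel
        U[:, 0] = -U[:, 0]
    c = U.T @ x
    return c[0], c[1:]

def ssgt_forward(coords, signal, depth, levels_per_stage=2, sigma=2.0):
    """Cascade the transforms from the leaves to the root of an octree.

    coords: (N, 3) integer voxel coordinates in [0, 2**depth).
    signal: (N,) attribute values defined on the points.
    Returns the root DC coefficient and all AC coefficients.
    """
    pos = np.asarray(coords, dtype=np.int64)
    x = np.asarray(signal, dtype=float)
    counts = np.ones(len(x))               # each leaf holds one point
    ac_all = []
    remaining = depth
    while remaining > 0:
        step = min(levels_per_stage, remaining)
        parents = pos >> step               # parent block index of every vertex
        keys, inverse = np.unique(parents, axis=0, return_inverse=True)
        inverse = inverse.reshape(-1)
        new_x = np.empty(len(keys))
        new_counts = np.empty(len(keys))
        for k in range(len(keys)):
            idx = np.flatnonzero(inverse == k)
            dc, ac = block_transform(pos[idx].astype(float), counts[idx], x[idx], sigma)
            new_x[k] = dc                   # only the DC coefficient moves up
            new_counts[k] = counts[idx].sum()
            ac_all.append(ac)
        pos, x, counts = keys, new_x, new_counts
        remaining -= step
    return x[0], np.concatenate(ac_all)

# Usage: four points in an 8x8x8 grid (depth 3), merging two levels per stage.
pts = np.array([[0, 0, 0], [1, 0, 0], [3, 2, 1], [7, 7, 7]])
vals = np.array([100.0, 102.0, 98.0, 50.0])
root_dc, ac_coeffs = ssgt_forward(pts, vals, depth=3, levels_per_stage=2)
```

With `levels_per_stage=1` each parent block covers up to 2×2×2 children, and with `levels_per_stage=2` up to 4×4×4, which mirrors the trade-off between graph size and computational cost discussed above. Since every block transform is orthonormal, the total energy of the output coefficients equals that of the input signal.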
II-D Attribute Compression
The point cloud attributes can be treated as the graph signal defined on points. In a subspace, we can define the signal on the constructed graph as . The signal can be transformed to . is the DC coefficient, and are all AC coefficients. Only will be propagated to the upper level as a new graph signal.
For example, when encoding the color information of point clouds, at level , can be the color values (e.g., luminance values) at points in the subspace represented by the th node. Then the DC coefficients of would be a new signal defined on the th node and will be further encoded in the next stage. Finally, all AC coefficients and a DC coefficient of the root node are quantized and entropy-coded.
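The final quantization and coding step can be illustrated with a short sketch. The text does not specify the quantizer or the entropy coder, so the code below substitutes a uniform scalar quantizer and a zeroth-order empirical entropy as a rough bit-count estimate; both choices, and the names `quantize`, `dequantize`, and `empirical_entropy_bits`, are assumptions made for illustration.

```python
import numpy as np

def quantize(coeffs, step):
    """Uniform scalar quantization: map coefficients to integer indices."""
    return np.round(np.asarray(coeffs, dtype=float) / step).astype(np.int64)

def dequantize(indices, step):
    """Reconstruct coefficient values from quantization indices."""
    return np.asarray(indices).astype(float) * step

def empirical_entropy_bits(indices):
    """Zeroth-order entropy of the index stream, in total bits.

    This is only a stand-in estimate for the output size of an entropy coder.
    """
    indices = np.asarray(indices)
    _, freq = np.unique(indices, return_counts=True)
    p = freq / freq.sum()
    return float(-(p * np.log2(p)).sum() * indices.size)

# Example: quantize a few transform coefficients with step size 15.
coeffs = np.array([0.4, 14.9, -22.6, 310.0])
q = quantize(coeffs, step=15)
rec = dequantize(q, step=15)
```

A larger step size maps more coefficients to the same index, which lowers the estimated bit count and raises the reconstruction error.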
II-E Comparison with RAHT
RAHT is a special case of the proposed SSGT scheme. At every level of the octree, the subspace is further partitioned along the x, y, and z directions, and the created graphs contain only two vertices, denoted as and . The corresponding degree matrix, Laplacian matrix, and normalized Laplacian matrix are expressed as follows:
(6) |
(7) |
(8) |
where and represent the number of points in and respectively. Computing the eigenvalue decomposition of yields the graph transform matrix and the eigenvalue matrix
(9) |
(10) |
is the same as the transform matrix in RAHT. Adopting only two-vertex transforms, as RAHT does, has two limitations: 1) it treats the x, y, and z directions differently, which causes the performance to vary when the order of the transforms along the x, y, and z axes is changed [11], and 2) this simple graph neglects the pairwise distances among points, leading to weak modeling of the point relationships in the subspace.
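The equivalence with RAHT can be checked numerically. Because Eqs. (6) to (10) are not reproduced here, the sketch below assumes a two-vertex weight matrix whose entries are n_i * n_j with no distance term, so that each vertex degree is proportional to its point count. That weight form and the function names are assumptions. The RAHT butterfly used for comparison is the standard form from [5], with rows (sqrt(n1), sqrt(n2)) and (-sqrt(n2), sqrt(n1)) scaled by 1/sqrt(n1 + n2).

```python
import numpy as np

def raht_matrix(n1, n2):
    """Standard RAHT butterfly for two nodes holding n1 and n2 points."""
    a, b = np.sqrt(n1), np.sqrt(n2)
    return np.array([[a, b], [-b, a]]) / np.sqrt(n1 + n2)

def two_vertex_gft(n1, n2):
    """Graph Fourier transform of a two-vertex graph with self-loops.

    Assumed weights: W[i, j] = n_i * n_j (distance term ignored).
    Returns the eigenvalues and the transform matrix with kernels as rows.
    """
    n = np.array([n1, n2], dtype=float)
    W = np.outer(n, n)
    d = W.sum(axis=1)                        # d_i = n_i * (n1 + n2)
    L_sym = np.eye(2) - W / np.sqrt(np.outer(d, d))
    eigvals, U = np.linalg.eigh(L_sym)
    T = U.T.copy()
    # Eigenvectors are defined up to sign; align them with the RAHT convention.
    if T[0, 0] < 0:
        T[0] = -T[0]
    if T[1, 1] < 0:
        T[1] = -T[1]
    return eigvals, T

eigvals_2, T = two_vertex_gft(3, 5)
R = raht_matrix(3, 5)
```

Under this assumption the eigenvalues are 0 and 1, and the transform matrix matches the RAHT butterfly up to the sign of each row.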
III Experiments
We test the proposed SSGT method for color attribute compression using frames extracted from four sequences, “longdress”, “redandblack”, “loot” and “soldier”, in the 8i dynamic point cloud dataset [12]. Each point cloud model is voxelized at two different resolutions and decomposed into an octree with depth 10 and 12, respectively. The color attributes are first converted to the YCbCr color space, and each color channel is processed independently. We use R-D performance to evaluate the compression algorithms, where the rate is reported in bits per point (bpp) for encoding the Y, Cb, and Cr channels, and the distortion is reported as the peak signal-to-noise ratio (PSNR) of the luminance Y. Fig. 4 shows the rendering results of the proposed SSGT method on the “longdress” frame with different quantization steps.
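The two reported metrics can be computed as in the following sketch. It assumes 8-bit luminance with a peak value of 255, which the text does not state explicitly, and the function names `y_psnr` and `bits_per_point` are chosen for illustration.

```python
import numpy as np

def y_psnr(y_ref, y_rec, peak=255.0):
    """PSNR in dB between reference and reconstructed luminance values."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_rec = np.asarray(y_rec, dtype=float)
    mse = np.mean((y_ref - y_rec) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))

def bits_per_point(total_bits, num_points):
    """Bitrate in bits per point (bpp) for the coded attribute stream."""
    return total_bits / num_points

# Example: a unit error on every point gives MSE = 1.
ref = np.array([100.0, 110.0, 120.0, 130.0])
rec = ref + np.array([1.0, -1.0, 1.0, -1.0])
psnr_db = y_psnr(ref, rec)
rate = bits_per_point(total_bits=800, num_points=1000)
```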
Table I
Test Data | SSGT Y-PSNR (dB) | SSGT Bitrate (bpp) | RAHT Y-PSNR (dB) | RAHT Bitrate (bpp)
loot_vox10 | 38.92 | 0.42 | 37.96 | 0.42
redandblack_vox10 | 38.50 | 0.80 | 37.74 | 0.88
soldier_vox10 | 37.28 | 0.69 | 36.07 | 0.70
longdress_vox10 | 35.34 | 1.63 | 34.67 | 1.72
Table II
Test Data | SSGT Y-PSNR (dB) | SSGT Bitrate (bpp) | RAHT Y-PSNR (dB) | RAHT Bitrate (bpp)
loot_vox10 | 41.02 | 0.23 | 40.09 | 0.23
redandblack_vox10 | 39.95 | 0.50 | 39.30 | 0.54
soldier_vox10 | 38.54 | 0.43 | 37.67 | 0.44
longdress_vox10 | 36.42 | 1.05 | 35.77 | 1.12
Table I and Table II compare the R-D performance of the proposed SSGT and RAHT methods at similar distortion levels. For the proposed SSGT method, we apply the graph transforms at every two levels of the octree, where each corresponding subspace is partitioned into 4×4×4 voxels. Overall, the proposed SSGT method provides consistent performance gains over RAHT on the four test frames. In Fig. 3, we compare the R-D performance on the “longdress” frame using five quantization step sizes: 15, 20, 25, 30, and 35. The proposed SSGT method outperforms RAHT at all quantization steps. We also compare two subspace expansion speeds, in which the parent subspace is 8 times or 64 times the size of its child subspace and the voxelized subspace contains 2×2×2 or 4×4×4 voxels, respectively. The results show that the larger graph offers better compression performance and a better description of the point relationships, as discussed in Sec. II-C.
IV Conclusion and Future work
A point cloud attribute compression method based on the successive subspace graph transform (SSGT) was proposed in this paper. Experimental results demonstrate that the proposed method provides better attribute compression performance than the RAHT method in terms of rate and distortion. Moreover, the proposed method has a more flexible hierarchical transform structure than RAHT and has the potential to compress larger point clouds effectively. In the near future, we would like to explore other geometry structures of the point cloud and better graph designs under the SSL framework.
References
- [1] Jingliang Peng, Chang-Su Kim, and C-C Jay Kuo, “Technologies for 3d mesh compression: A survey,” Journal of Visual Communication and Image Representation, vol. 16, no. 6, pp. 688–733, 2005.
- [2] Ruwen Schnabel and Reinhard Klein, “Octree-based point-cloud compression,” SPBG, vol. 6, pp. 111–120, 2006.
- [3] Cha Zhang, Dinei Florencio, and Charles Loop, “Point cloud attribute compression with graph transform,” in 2014 IEEE International Conference on Image Processing (ICIP). IEEE, 2014, pp. 2066–2070.
- [4] Robert A Cohen, Dong Tian, and Anthony Vetro, “Attribute compression for sparse point clouds using graph transforms,” in 2016 IEEE International Conference on Image Processing (ICIP). IEEE, 2016, pp. 1374–1378.
- [5] Ricardo L De Queiroz and Philip A Chou, “Compression of 3d point clouds using a region-adaptive hierarchical transform,” IEEE Transactions on Image Processing, vol. 25, no. 8, pp. 3947–3956, 2016.
- [6] Yueru Chen and C-C Jay Kuo, “Pixelhop: A successive subspace learning (ssl) method for object recognition,” Journal of Visual Communication and Image Representation, p. 102749, 2020.
- [7] C-C Jay Kuo, Min Zhang, Siyang Li, Jiali Duan, and Yueru Chen, “Interpretable convolutional neural networks via feedforward design,” Journal of Visual Communication and Image Representation, vol. 60, pp. 346–359, 2019.
- [8] Min Zhang, Haoxuan You, Pranav Kadam, Shan Liu, and C-C Jay Kuo, “Pointhop: An explainable machine learning method for point cloud classification,” IEEE Transactions on Multimedia, 2020.
- [9] Fan R. K. Chung, Spectral Graph Theory, Number 92. American Mathematical Soc., 1997.
- [10] Ulrike Von Luxburg, “A tutorial on spectral clustering,” Statistics and computing, vol. 17, no. 4, pp. 395–416, 2007.
- [11] Sujun Zhang, Wei Zhang, Fuzheng Yang, and Junyan Huo, “A 3d haar wavelet transform for point cloud attribute compression based on local surface analysis,” in 2019 Picture Coding Symposium (PCS). IEEE, 2019, pp. 1–5.
- [12] E d’Eon, Bob Harrison, Taos Myers, and Philip A Chou, “8i voxelized full bodies, version 2–a voxelized point cloud dataset,” ISO/IEC JTC1/SC29 Joint WG11/WG1 (MPEG/JPEG) input document m40059/M74006, 2017.