Non-uniform Point Cloud Upsampling via Local Manifold Distribution
Abstract.
Existing learning-based point cloud upsampling methods often overlook the intrinsic data distribution characteristics of point clouds, leading to suboptimal results when handling sparse and non-uniform point clouds. We propose a novel approach to point cloud upsampling by imposing constraints from the perspective of manifold distributions. Leveraging the strong fitting capability of Gaussian functions, our method employs a network to iteratively optimize Gaussian components and their weights, accurately representing local manifolds. By utilizing the probabilistic distribution properties of Gaussian functions, we construct a unified statistical manifold to impose distribution constraints on the point cloud. Experimental results on multiple datasets demonstrate that our method generates higher-quality and more uniformly distributed dense point clouds when processing sparse and non-uniform inputs, outperforming state-of-the-art point cloud upsampling techniques.
1. Introduction
Point clouds are a specialized representation of objects in three-dimensional space. In recent years, significant advancements have been made in 3D sensing technologies, leading to their widespread use as input for various 3D applications, such as autonomous driving (Wang et al., 2019; Lang et al., 2019), 3D city reconstruction (Lafarge and Mallet, 2012; Musialski et al., 2013), and virtual/augmented reality (Held et al., 2012; Santana et al., 2017). However, due to practical limitations, raw point clouds generated through 3D scanning are typically noisy, sparse, and unevenly distributed. To ensure the smooth execution of subsequent tasks involving point clouds, it is crucial to perform upsampling to obtain dense, complete, and clean point clouds, while preserving their smoothness and accurately recovering geometric structures.
The goal of point cloud upsampling extends beyond simply generating dense point sets from sparse inputs. More importantly, the generated points should closely approximate the underlying surface and be evenly distributed. Early optimization-based point cloud upsampling methods (Alexa et al., 2003; Lipman et al., 2007; Huang et al., 2009, 2013) employed various shape priors to constrain point cloud generation, which worked well for simple, smooth surfaces. However, these methods exhibit poor robustness when dealing with complex structures in point clouds. With the rise of deep learning, many deep learning-based methods (Yu et al., 2018b; Li et al., 2019; Qian et al., 2021, 2020) have been applied to point cloud upsampling tasks, effectively extracting features. However, when dealing with highly sparse and non-uniform point clouds, these networks are unable to effectively learn the geometric structure of the point clouds, resulting in the loss of fine details or structural inaccuracies.
For point cloud upsampling tasks, existing methods typically generate new points by extracting features based solely on spatial positions, without considering the distribution characteristics of the point cloud itself. This limitation results in non-uniform upsampling when applied to sparse and non-uniform point clouds. In this paper, we propose a method that fits a Gaussian representation to each local neighborhood, assembles these local representations into a unified statistical manifold, and enforces distribution constraints on this manifold to achieve uniform point cloud upsampling.
The contributions of this paper are summarized as follows:
• We propose a novel point cloud representation that interprets the point cloud as samples drawn from a probability distribution function, establishing a statistical manifold framework for point clouds. The distribution characteristics reflect the intrinsic patterns of data generation. By constructing a statistical manifold, we can better simulate and understand this generative process and model the distribution properties of point clouds, such as point density, clustering regions, and directionality. These distributional details capture the underlying laws and characteristics of the point cloud data, which are crucial for guiding the upsampling process.
• We use Gaussian functions to fit the local neighborhoods of the point cloud, so that each point on the manifold represents a probability distribution. The overall structure and geometric shape of the point cloud can then be described by a statistical model, with the geometric information of the point cloud encoded in the Gaussian functions, providing accurate local geometric guidance for subsequent upsampling.
• We minimize the geodesic distance between corresponding points on the statistical manifold to constrain the distribution of the point cloud, ensuring that the upsampled point cloud better recovers and preserves the details and shapes of the original data. This constraint maintains shape and detail consistency while generating a more uniform point distribution.
2. Related Work
2.1. Optimization-based Point Upsampling
Optimization-based methods typically require geometric prior information of the point cloud, such as edges and normals, and their performance tends to degrade for complex shapes. Alexa et al. (Alexa et al., 2003) first introduced a point cloud upsampling algorithm, which adds new points at the vertices of the Voronoi diagram in the local tangent space to achieve point cloud upsampling. Subsequently, Lipman et al. (Lipman et al., 2007) proposed a non-parametric method based on the Local Optimal Projection (LOP) operator, which resamples points based on the L1 norm. While this method is robust to noise and outliers, its performance on complex geometries is relatively poor. Later, Huang et al. (Huang et al., 2009) developed an improved weighted LOP, which performs upsampling in an edge-aware manner, reducing the quality degradation at sharp edges and corners. Huang et al. (Huang et al., 2013) also proposed an Edge-Aware Resampling (EAR) method, which preserves edges during point cloud upsampling but heavily depends on the provided normal information and parameter tuning. Overall, these methods heavily rely on prior information and place significant demands on the original point cloud.
2.2. Deep learning-based Point Upsampling
With the development of deep learning techniques and inspired by the success of PointNet (Qi et al., 2017a), many deep learning methods have been proposed for various point cloud processing tasks such as classification, completion, denoising, and other applications. For point cloud upsampling, PU-Net (Yu et al., 2018b) was the first to apply neural networks to point cloud upsampling, using PointNet++ (Qi et al., 2017b) to extract point features and extending those features through a multi-branch MLP to generate upsampled point clouds. However, this approach does not consider the spatial relationships between points, and thus cannot guarantee the generation of uniformly distributed points. EC-Net (Yu et al., 2018a) employs a joint loss of point-edge distances, allowing for the generation of sharp new points at edges. MPU (Yifan et al., 2019) is a progressive method for upsampling point patches, but it struggles to generate new points in missing regions for non-uniform point clouds. PU-GAN (Li et al., 2019) was the first to apply Generative Adversarial Networks (GANs) to point cloud upsampling, introducing uniformity loss and becoming the first method capable of generating uniformly distributed upsampled point clouds. PU-GCN (Qian et al., 2021) uses graph convolutional networks for feature extraction and expansion, proposing a novel point cloud upsampling module called NodeShuffle. PUGeo (Qian et al., 2020) is the first method to utilize geometric approaches for point cloud upsampling, improving upsampling results by employing local differential geometry constraints. Dis-PU (Li et al., 2021) defines two cascaded subnetworks that complete point cloud upsampling in stages. NP (Feng et al., 2022) introduces the use of neural networks to continuously represent geometric shapes, enabling sampling at arbitrary resolutions. Recently, GeoUDF (Ren et al., 2023) introduced a local geometric representation that approximates local shapes using quadratic surfaces. RepKPU (Rong et al., 2024) proposed a new paradigm, kernel-to-displacement generation, for point generation. However, existing learning-based methods often fail to achieve good performance when handling highly sparse and non-uniform datasets. Drawing inspiration from LGSur-Net (Xiao et al., 2024), our approach performs Gaussian fitting on the local patches of point clouds and applies further constraints on the statistical manifold constructed from the parameters. Our method generates more uniform results for sparse and non-uniform point clouds.

3. Method
Given a sparse, non-uniform point cloud $\mathcal{P}=\{p_i\}_{i=1}^{N}$ containing $N$ points, traditional point cloud representations capture only a single 3D position per point, which fails to adequately preserve the underlying distribution characteristics when generating new points. For a given upsampling factor $r$, our goal is to obtain a dense and uniformly distributed upsampled point cloud by starting from each local manifold of the point cloud. Specifically, we perform the following steps. First, a certain number of query points are selected from the point cloud; for each query point $q$, an overlapping local neighborhood $\mathcal{P}_q$ is constructed to define a local coordinate system, and an attention network is then employed to generate Gaussian parameters and combination weights, thereby enabling local Gaussian fitting for $\mathcal{P}_q$. Next, resampling is performed on the projection plane to generate the required number of upsampled points, which are mapped back to the original point cloud through the local Gaussian representation. Finally, each local Gaussian function is integrated into a unified manifold, and further constraints are applied using the distribution distance between corresponding points. Figure 1 illustrates the overall pipeline of our method. Below, we describe each step in detail.
3.1. Statistical Manifold Representation of Point Clouds
Point cloud data typically originate from the surfaces of objects in three-dimensional space, and these surfaces can locally be approximated as manifolds. An n-dimensional topological manifold is a locally Euclidean, second-countable Hausdorff space; in other words, every point of the manifold has a neighborhood homeomorphic to Euclidean space. We therefore represent each local neighborhood of a point cloud as a manifold. For the choice of basis functions used to fit the surface, the fitting capability of Gaussian functions has been widely demonstrated: owing to their excellent linear-combination properties, Gaussian basis functions can form complex functions through weighted summation and thus express surfaces of various shapes, and the shape of each Gaussian is well captured by its covariance matrix. Furthermore, inspired by (Yu and Turk, 2013), we incorporate anisotropic kernels to accommodate irregular point distributions; anisotropic kernel regression and covariance analysis have been shown to accurately capture local distribution characteristics, e.g., of fluid surfaces. Considering that a point cloud can be viewed as samples drawn from some probability distribution, i.e., the local point cloud data can be regarded as realizations of a probability distribution, the Gaussian distribution is a natural choice: through its mean and covariance matrix, it effectively and simply describes the central tendency and spread directions of the data. Owing to the uniformity of the basis functions, we construct a unified manifold representation for the entire point cloud. Given the distributional properties of Gaussian functions, we use a statistical model to describe the structure and geometric shape of the manifold, so the statistical manifold can be defined by the Gaussian functions themselves:
(1) $\mathcal{M} = \{\, G(\cdot\,;\mu,\Sigma) \mid \mu \in \mathbb{R}^{d},\ \Sigma \in \mathrm{Sym}^{+}(d) \,\}$
where $G(\cdot\,;\mu,\Sigma)$ is the Gaussian function with mean $\mu$ and covariance matrix $\Sigma$.
That is, we map the manifold representation of each neighborhood in the point cloud to a unified manifold. Since each point on the manifold represents a probability distribution, it is referred to as a statistical manifold.
3.2. Local Gaussian Fitting
Considering the local manifold properties of point clouds, we fit a representation to each local neighborhood. Let the input point cloud be denoted as $\mathcal{P}$. To ensure the overlap of neighborhood regions and prevent gaps during surface fitting, we first randomly select a certain number of points from the original point cloud as initial query points; this selection ensures a uniform distribution, avoiding excessive local concentration or sparsity. Subsequently, each local patch $\mathcal{P}_q$ is updated by incorporating points from nearest-neighbor neighborhoods. Finally, the centroid of each patch is computed and designated as the final query point. For each query point $q$, we construct a local coordinate system whose $xy$-plane serves as the projection plane, with associated 2D parameter domain $D$. The transformation of coordinates onto the projection plane is given below:
(2) $(u_i, v_i, h_i)^{\top} = R_q^{\top}\,(p_i - q)$
where $p_i \in \mathcal{P}_q$ and $(u_i, v_i) \in D$ are the coordinates on the projection plane. The columns of the matrix $R_q$ represent the directions of the x-axis, y-axis, and z-axis of the local coordinate system.
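To make the projection in Eq. (2) concrete, below is a minimal PyTorch sketch. The PCA-derived frame follows the local-coordinate normalization described in Section 4.1; the function name and the toy usage at the end are illustrative, not part of our released implementation.

```python
import torch

def local_frame_projection(patch: torch.Tensor, q: torch.Tensor):
    """Project a local patch (K, 3) into the frame of its query point q (3,), cf. Eq. (2).

    The frame is taken from a PCA of the centered patch, so the first two
    eigenvectors span the projection plane and the third approximates the normal.
    """
    centered = patch - q                          # translate so that q is the origin
    cov = centered.T @ centered / patch.shape[0]  # 3x3 covariance of the patch
    _, eigvecs = torch.linalg.eigh(cov)           # eigenvalues ascending, eigenvectors as columns
    R_q = eigvecs[:, [2, 1, 0]]                   # columns: x-, y-axis (tangents), z-axis (normal)
    local = centered @ R_q                        # row i is (u_i, v_i, h_i) = R_q^T (p_i - q)
    return R_q, local[:, :2], local[:, 2]         # frame, 2D parameter coords, heights

# Toy usage: a random patch whose centroid serves as the query point.
patch = torch.randn(32, 3)
q = patch.mean(dim=0)
R_q, uv, h = local_frame_projection(patch, q)
```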
We treat the local neighborhood of point $q$ as a 2D manifold in 3D space. Since the manifold surface in 3D space is locally homeomorphic to a 2D region, for each point and its local neighborhood on the manifold there exists a local coordinate system such that this neighborhood can be mapped to 2D Euclidean space via a homeomorphic mapping. Therefore, a 3D local neighborhood patch is homeomorphic to the 2D parameter domain $D$. To eliminate the complexity caused by surface tilt in the original global coordinates, we construct the 2D parameter domain on the projection plane. This means that we can establish a continuous mapping between the original local neighborhood $\mathcal{P}_q$ and $D$:
(3) $\varphi : \mathcal{P}_q \to D, \qquad \varphi(p_i) = (u_i, v_i)$
Considering the challenges of directly fitting the point cloud surface in three-dimensional space, and given that local surfaces are inherently two-dimensional, it is unnecessary to use three-dimensional Gaussian functions to describe the redundant dimension. Instead, we perform surface function fitting in a two-dimensional parameter domain and transform the parameters back to represent the original point cloud surface. In the local coordinate region, we define a series of Gaussian kernels centered at the base points.
(4) $g_j(u, v) = \exp\!\Big(\!-\tfrac{1}{2}\,\big((u,v)^{\top} - c_j\big)^{\top} \Sigma_j^{-1} \big((u,v)^{\top} - c_j\big)\Big)$
where $(u, v) \in D$, $j = 1, \dots, M$, $\Sigma_j$ is the 2D covariance matrix, and $c_j$ represents the center of the $j$-th Gaussian function we select.
To ensure fitting accuracy for sparse and non-uniform point clouds, we choose the number of Gaussian functions $M$ based on the number of points within the neighborhood. Furthermore, drawing inspiration from the data-adaptive kernel regression method proposed in (Takeda et al., 2007), we capture the local characteristics of point clouds by adaptively adjusting the shape and size of the kernel functions. To reduce the optimization difficulty, we fix the Gaussian centers $c_j$ and only optimize the covariance matrix of each function. Through this design, the manifold of each local neighborhood in the point cloud is described as a weighted sum of multiple Gaussian functions, with the local point cloud distribution represented by the covariance matrices of the Gaussian components. Considering the statistical properties of the covariance matrix, we adopt the concept of Gaussian splatting (Kerbl et al., 2023): to ensure the positive definiteness of each covariance matrix, we decompose it into the product of a rotation matrix $R_j$ and a scaling matrix $S_j$, as expressed by the following equation:
(5) $\Sigma_j = R_j S_j S_j^{\top} R_j^{\top}$
During the entire optimization process, we refine the rotation matrix and scaling matrix to maintain the positive definiteness of the covariance matrix. Finally, we fit the local neighborhood manifold of the point cloud in the 2D parametric domain, expressed as follows:
(6) $f(u, v) = \sum_{j=1}^{M} w_j\, g_j(u, v)$
where $w_j$ is the weight corresponding to the $j$-th Gaussian element.
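The following is a compact sketch of Eqs. (4)-(6): building positive-definite 2D covariances from the decomposition of Eq. (5) and evaluating the weighted Gaussian height field. The rotation-angle and per-axis-scale parameterization of $R_j$ and $S_j$ is an assumed choice for illustration, not prescribed by the text.

```python
import torch

def covariance_2d(theta: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Sigma_j = R_j S_j S_j^T R_j^T (Eq. 5) for M Gaussians.

    theta: (M,) rotation angles; scale: (M, 2) positive per-axis scales.
    """
    c, s = torch.cos(theta), torch.sin(theta)
    R = torch.stack([torch.stack([c, -s], dim=-1),
                     torch.stack([s,  c], dim=-1)], dim=-2)   # (M, 2, 2) rotations
    S = torch.diag_embed(scale)                               # (M, 2, 2) scalings
    return R @ S @ S.transpose(-1, -2) @ R.transpose(-1, -2)  # positive definite by construction

def gaussian_height(uv: torch.Tensor, centers: torch.Tensor,
                    cov: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """f(u, v) = sum_j w_j g_j(u, v) (Eqs. 4 and 6) at K parameter-domain points."""
    diff = uv[:, None, :] - centers[None, :, :]                # (K, M, 2) offsets to centers
    maha = torch.einsum('kmi,mij,kmj->km', diff, torch.linalg.inv(cov), diff)
    g = torch.exp(-0.5 * maha)                                 # (K, M) Gaussian kernels
    return g @ weights                                         # (K,) fitted heights
```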
In the weighted combination of Gaussian basis functions, the basis functions themselves define part of the shape of the local surface but do not directly define its position. That is, the weighted Gaussian basis functions in Eq. (6) capture the shape features of the surface, but they do not specify where these shapes lie in three-dimensional space. Therefore, we design the local surface fitting for the local patch near a query point $q$ as follows:
(7) $\mathcal{S}_q(u, v) = q + R_q\,\big(u,\ v,\ f(u, v)\big)^{\top}$
This is our local Gaussian representation. Although it inherently possesses strong approximation capabilities, extracting a reasonable Gaussian representation from information-deficient, sparse point clouds presents a significant challenge. During the optimization of the above formulation, we therefore leverage a generative model to learn a shape prior. Specifically, we incorporate modules such as EdgeConv and cross-attention to account for the geometric structure of local neighborhoods and to facilitate feature aggregation. The covariance matrices are optimized to model the Gaussian distributions, while the optimized Gaussian weights yield the best parameter approximation, ultimately producing a Gaussian representation of the local neighborhood. Our design explicitly considers the interactions between shallow and deep features, leveraging the capabilities of neural networks to obtain a more nuanced and detailed representation.
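Resampled parameter-domain points are lifted back to 3D with the local frame, mirroring Eq. (7) as the inverse of the projection in Eq. (2). A minimal sketch reusing the conventions of the previous snippets:

```python
import torch

def lift_to_3d(uv_new: torch.Tensor, heights: torch.Tensor,
               R_q: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Map resampled (u, v) coordinates and fitted heights f(u, v) back to 3D, cf. Eq. (7).

    uv_new: (K, 2), heights: (K,), R_q: (3, 3) local frame, q: (3,) query point.
    """
    local = torch.cat([uv_new, heights[:, None]], dim=1)  # (K, 3) local coordinates
    return q + local @ R_q.T                              # invert the projection of Eq. (2)
```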
3.3. Upsampling with Manifold Distribution Constraints
To better generate upsampled point clouds with a uniform distribution, we first resample a sufficient number of coordinate points in the 2D parameter domain of each local neighborhood and obtain the upsampled point cloud through local Gaussian representation mapping. Additionally, current point cloud upsampling methods fail to leverage the distribution information of the point cloud to supervise the generation of upsampled points. From the statistical manifold constructed by the Gaussian representations, we propose a novel method for correcting the distribution of the point cloud.
Since each point on the manifold represents a probability distribution, to ensure that the upsampled point cloud better aligns with the true distribution, the problem can be reformulated such that the distance between corresponding points on the statistical manifold is minimized. Here, we choose the Fisher-Rao distance between two distributions. The Fisher-Rao distance between two probability distributions on the statistical manifold is the length of a geodesic, which is the local shortest distance between the two points. This geodesic is defined under the Fisher information metric between the distributions. The elements of the Fisher information matrix and the classic Fisher-Rao distance are defined as follows:
(8) $\mathcal{I}_{jk}(\theta) = \mathbb{E}_{x \sim p(x\mid\theta)}\!\left[ \dfrac{\partial \ln p(x\mid\theta)}{\partial \theta_j}\, \dfrac{\partial \ln p(x\mid\theta)}{\partial \theta_k} \right]$
(9) $d_{FR}(\theta_1, \theta_2) = \int_{0}^{1} \sqrt{\dot{\gamma}(t)^{\top}\, \mathcal{I}\big(\gamma(t)\big)\, \dot{\gamma}(t)}\; \mathrm{d}t$
where $\gamma(t)$ is the geodesic connecting $\theta_1$ and $\theta_2$, $\mathcal{I}$ is the Fisher information matrix, and $\dot{\gamma}(t)$ is the tangent vector along the geodesic at each point.
Since our local Gaussian representation consists of multiple Gaussian functions (normal distributions), the Fisher-Rao distance between two multivariate normal distributions can be expressed as:
(10) $d_{FR}\big(\mathcal{N}(\mu_1,\Sigma_1),\, \mathcal{N}(\mu_2,\Sigma_2)\big) = \Big( \sum_{i} d_{FR}^{\,2}\big(\mathcal{N}(\mu_{1,i},\sigma_{1,i}^{2}),\, \mathcal{N}(\mu_{2,i},\sigma_{2,i}^{2})\big) \Big)^{1/2}$
where $\mu_{k,i}$ and $\sigma_{k,i}^{2}$ denote the $i$-th marginal mean and variance of the $k$-th distribution ($k = 1, 2$), i.e., the multivariate distance decomposes into univariate components.
Following the approach in (Costa et al., 2015), we compute the Fisher-Rao distance for a univariate Gaussian probability distribution. The hyperbolic distance between two points $z_1 = (x_1, y_1)$ and $z_2 = (x_2, y_2)$ in the Poincaré half-plane model is given by:
(11) $d_H(z_1, z_2) = \ln \dfrac{\lVert z_1 - \bar{z}_2 \rVert + \lVert z_1 - z_2 \rVert}{\lVert z_1 - \bar{z}_2 \rVert - \lVert z_1 - z_2 \rVert}, \qquad \bar{z}_2 = (x_2, -y_2)$
where $d_H$ is the hyperbolic distance, and $\lVert \cdot \rVert$ is the standard Euclidean norm.
We derive and establish the relationship between the Fisher-Rao distance for a univariate Gaussian probability distribution and the Poincaré distance within the Poincaré half-plane model in hyperbolic geometry:
(12) $d_{FR}\big(\mathcal{N}(\mu_1,\sigma_1^{2}),\, \mathcal{N}(\mu_2,\sigma_2^{2})\big) = \sqrt{2}\; d_H\!\Big(\big(\tfrac{\mu_1}{\sqrt{2}}, \sigma_1\big),\, \big(\tfrac{\mu_2}{\sqrt{2}}, \sigma_2\big)\Big)$
Thus, the final computation formula is given as follows:
(13) $d_{FR} = \sqrt{2}\, \ln \dfrac{\big\lVert\big(\tfrac{\mu_1}{\sqrt{2}}, \sigma_1\big) - \big(\tfrac{\mu_2}{\sqrt{2}}, -\sigma_2\big)\big\rVert + \big\lVert\big(\tfrac{\mu_1}{\sqrt{2}}, \sigma_1\big) - \big(\tfrac{\mu_2}{\sqrt{2}}, \sigma_2\big)\big\rVert}{\big\lVert\big(\tfrac{\mu_1}{\sqrt{2}}, \sigma_1\big) - \big(\tfrac{\mu_2}{\sqrt{2}}, -\sigma_2\big)\big\rVert - \big\lVert\big(\tfrac{\mu_1}{\sqrt{2}}, \sigma_1\big) - \big(\tfrac{\mu_2}{\sqrt{2}}, \sigma_2\big)\big\rVert}$
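A small numerical sketch of Eqs. (11)-(13), computing the univariate Fisher-Rao distance through the Poincaré half-plane distance. The epsilon guard and the example values are implementation details added here for illustration, not part of the formulation.

```python
import math
import torch

def fisher_rao_univariate(mu1: torch.Tensor, sigma1: torch.Tensor,
                          mu2: torch.Tensor, sigma2: torch.Tensor) -> torch.Tensor:
    """Fisher-Rao distance between N(mu1, sigma1^2) and N(mu2, sigma2^2), cf. Eq. (13)."""
    z1 = torch.stack([mu1 / math.sqrt(2.0), sigma1], dim=-1)       # embed in the half-plane
    z2 = torch.stack([mu2 / math.sqrt(2.0), sigma2], dim=-1)
    z2_bar = torch.stack([mu2 / math.sqrt(2.0), -sigma2], dim=-1)  # reflected point
    a = torch.linalg.norm(z1 - z2_bar, dim=-1)                     # ||z1 - z2_bar||
    b = torch.linalg.norm(z1 - z2, dim=-1)                         # ||z1 - z2||
    return math.sqrt(2.0) * torch.log((a + b) / (a - b + 1e-12))   # sqrt(2) * d_H, Eq. (12)

# Example: two components with equal means and sigma = 1 vs. 2 give sqrt(2) * ln(2).
d = fisher_rao_univariate(torch.tensor(0.0), torch.tensor(1.0),
                          torch.tensor(0.0), torch.tensor(2.0))
```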
Now, to constrain the distribution of the point cloud, the objective is to minimize:
(14) $\mathcal{L}_{dis} = \sum_{k} d_{FR}\big(\mathcal{G}_k,\, \widehat{\mathcal{G}}_k\big)$
where $\mathcal{G}_k$ and $\widehat{\mathcal{G}}_k$ are corresponding points (Gaussian distributions) on the statistical manifolds constructed from the upsampled point cloud and the ground truth, respectively.
3.4. Loss Function
In this section, we introduce the design of the loss function, which includes constraints on the quality of the point cloud, the similarity between distributions, and the smoothness of the surface shape. First, considering the importance of point cloud quality for point cloud upsampling, our method should encourage the generated points to be closer to the real point cloud. Therefore, a reconstruction loss function is needed to evaluate the similarity between the two point clouds. Here, we use the Earth Mover's Distance (EMD) as the reconstruction loss:
(15) $\mathcal{L}_{rec} = \mathrm{EMD}(\mathcal{Q}, \mathcal{Q}_{gt}) = \min_{\phi: \mathcal{Q} \to \mathcal{Q}_{gt}} \dfrac{1}{|\mathcal{Q}|} \sum_{x \in \mathcal{Q}} \big\lVert x - \phi(x) \big\rVert_2$
where $\mathcal{Q}$ is the upsampled point cloud, $\mathcal{Q}_{gt}$ is the ground truth, and $\phi$ is a bijection between them.
Then, during the minimization process of Eq. (14), to prevent interference from outliers in some distributions, we impose a first-order differential constraint on the surface function, as shown in Eq. (16). This constraint limits the gradient variation of the surface, promoting smooth transitions in adjacent regions and enhancing surface continuity. Additionally, smoothing the gradient also contributes to improving the uniformity of the point cloud distribution.
(16) $\mathcal{L}_{grad} = \dfrac{1}{K} \sum_{k=1}^{K} \Big( f_u(u_k, v_k)^{2} + f_v(u_k, v_k)^{2} \Big)$
where $f_u = \partial f / \partial u$, $f_v = \partial f / \partial v$, evaluated at the $K$ resampled parameter-domain points $(u_k, v_k)$.
Finally, based on the above description, the final combined loss function is as follows:
(17) $\mathcal{L} = \mathcal{L}_{rec} + \lambda_{1}\, \mathcal{L}_{dis} + \lambda_{2}\, \mathcal{L}_{grad}$
where $\lambda_1$ and $\lambda_2$ are balancing weights.
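The smoothness term of Eq. (16) can be obtained with automatic differentiation of the fitted height field. The sketch below assumes `f` is any differentiable callable (e.g., a closure over `gaussian_height` above); the λ default values are placeholders, not the paper's settings, and the EMD and manifold terms are passed in precomputed.

```python
import torch

def gradient_smoothness(f, uv: torch.Tensor) -> torch.Tensor:
    """First-order regularizer of Eq. (16): mean squared gradient of f over sampled (u, v)."""
    uv = uv.clone().requires_grad_(True)
    heights = f(uv)                                                    # (K,) surface heights
    grad, = torch.autograd.grad(heights.sum(), uv, create_graph=True)  # (K, 2) = (f_u, f_v)
    return (grad ** 2).sum(dim=1).mean()

def total_loss(l_rec: torch.Tensor, l_dis: torch.Tensor, l_grad: torch.Tensor,
               lam1: float = 0.1, lam2: float = 0.01) -> torch.Tensor:
    """Combined objective of Eq. (17); lam1/lam2 are illustrative placeholders."""
    return l_rec + lam1 * l_dis + lam2 * l_grad
```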
4. Experiment
4.1. Datasets and Implementation Details
Compared to the traditional PU-GAN (Li et al., 2019) dataset, the PU1K dataset (Qian et al., 2021) is more challenging because it contains a larger volume of data and more diverse categories. This dataset, introduced by PU-GCN, includes large objects with complex shapes and is collected from both PU-GAN and ShapeNetCore (Xu et al., 2019). Since our method is designed for upsampling non-uniform point clouds, we processed the PU1K dataset to introduce non-uniformity. During the training phase, we first sample a dense point cloud from the original mesh as ground truth, and for each training patch we apply non-uniform sampling to obtain the input data.
For the testing dataset, we selected different resolutions of the PU1K dataset for evaluation. Considering the testing requirements of our method for upsampling sparse, non-uniform point clouds, we applied non-uniform processing to the original dataset as well. Additionally, to validate the generalization of our method and assess performance at different upsampling scales, we also evaluated the Sketchfab dataset (Qian et al., 2020) and the KITTI dataset (Geiger et al., 2013).
All of our experiments are implemented using PyTorch. We set the cross-attention and graph convolution modules to three layers, and the number of Gaussian functions per patch is chosen according to the neighborhood size, as described in Section 3.2. We use the Adam optimizer to train our model on an RTX 3090 GPU, with a batch size of 64, an upsampling factor of 4, and 400 epochs. The initial learning rate is set to 0.001, decaying by a factor of 0.7 every 80 iterations. Additionally, to eliminate unnecessary degrees of freedom in the input data space and reduce the learning difficulty, we normalize each point's coordinates by its patch radius and rotate the points into a local coordinate system defined by PCA. To avoid overfitting during training, we augment the network inputs with random rotations, scaling, and Gaussian noise perturbations.
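These optimizer settings map directly onto standard PyTorch components; a minimal sketch follows, in which `model` is a stand-in placeholder for our network and the decay step is applied per epoch (the text's "every 80 iterations" may instead refer to optimizer steps).

```python
import torch

model = torch.nn.Linear(3, 3)   # placeholder for the upsampling network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=80, gamma=0.7)

for epoch in range(400):
    # One pass over batches of 64 patches with an upsampling factor of 4 would go here,
    # minimizing the combined loss of Eq. (17).
    optimizer.step()            # placeholder update so the scheduler ordering is valid
    scheduler.step()
```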


Table 1. Quantitative comparison on sparse, non-uniform point clouds with 256, 1024, and 4096 input points (lower is better for all metrics).

| Input | Method | CD | HD | P2F | JSD | UNI |
|---|---|---|---|---|---|---|
| 256 | PUGeo (Qian et al., 2020) | 15.372 | 6.911 | 6.383 | 0.476 | 11.327 |
| | PUCRN (Du et al., 2022) | 11.076 | 6.780 | 5.761 | 0.484 | 9.347 |
| | APUNet (Zhao et al., 2023) | 9.960 | 6.143 | 5.683 | 0.375 | 7.325 |
| | Grad-PU (He et al., 2023) | 9.023 | 5.679 | 7.004 | 0.301 | 18.682 |
| | RepKPU (Rong et al., 2024) | 8.747 | 5.443 | 6.769 | 0.287 | 17.126 |
| | Ours | 8.538 | 5.910 | 6.337 | 0.218 | 5.642 |
| 1024 | PUGeo (Qian et al., 2020) | 3.795 | 4.211 | 2.363 | 0.411 | 5.723 |
| | PUCRN (Du et al., 2022) | 3.382 | 4.076 | 2.727 | 0.457 | 3.697 |
| | APUNet (Zhao et al., 2023) | 2.604 | 3.726 | 2.283 | 0.302 | 3.018 |
| | Grad-PU (He et al., 2023) | 4.102 | 3.817 | 2.209 | 0.243 | 8.012 |
| | RepKPU (Rong et al., 2024) | 3.626 | 3.762 | 2.421 | 0.211 | 7.935 |
| | Ours | 2.742 | 3.540 | 2.059 | 0.163 | 2.791 |
| 4096 | PUGeo (Qian et al., 2020) | 2.863 | 2.456 | 1.679 | 0.325 | 3.413 |
| | PUCRN (Du et al., 2022) | 2.238 | 2.154 | 0.937 | 0.364 | 1.039 |
| | APUNet (Zhao et al., 2023) | 1.396 | 1.496 | 0.364 | 0.213 | 0.591 |
| | Grad-PU (He et al., 2023) | 2.963 | 1.539 | 0.367 | 0.175 | 4.537 |
| | RepKPU (Rong et al., 2024) | 2.723 | 1.442 | 0.232 | 0.134 | 4.261 |
| | Ours | 1.014 | 1.238 | 0.038 | 0.097 | 0.128 |
Table 2. Efficiency comparison: FLOPs, training time, and inference time.

| Method | FLOPs (G) | Training (h) | Inference (s) |
|---|---|---|---|
| PUGeo (Qian et al., 2020) | 8.782 | 2.1 | 0.325 |
| PUCRN (Du et al., 2022) | 4.315 | 5.8 | 0.278 |
| APUNet (Zhao et al., 2023) | 5.713 | 6.3 | 0.536 |
| Grad-PU (He et al., 2023) | 2.718 | 4.8 | 0.269 |
| RepKPU (Rong et al., 2024) | 2.062 | 4.3 | 0.187 |
| Ours | 3.207 | 1.3 | 0.201 |
4.2. Evaluation Metrics and Comparisons
Similar to recent point cloud upsampling works, for quantitative evaluation we use three common metrics: Chamfer Distance (CD), Hausdorff Distance (HD), and Point-to-Surface Distance (P2F). These metrics reflect the reconstruction quality of the upsampled point cloud. Additionally, since our method considers the distribution similarity of each patch, we include the Jensen-Shannon Divergence (JSD) metric, which measures the similarity between point cloud distributions. Finally, we introduce the Uniformity (UNI) metric proposed in PU-GAN (Li et al., 2019), which reflects the uniformity of the point cloud. For all metrics, smaller values indicate better performance.
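For reference, two of the reported metrics can be computed as below. Conventions vary across papers (squared vs. unsquared distances, scaling), so this is an illustrative sketch rather than the exact evaluation code, and P2F, JSD, and UNI require mesh and uniformity machinery omitted here.

```python
import torch

def chamfer_and_hausdorff(pred: torch.Tensor, gt: torch.Tensor):
    """CD and HD between an upsampled cloud pred (N, 3) and ground truth gt (M, 3)."""
    d = torch.cdist(pred, gt)               # (N, M) pairwise Euclidean distances
    d_pg = d.min(dim=1).values              # pred -> gt nearest-neighbor distances
    d_gp = d.min(dim=0).values              # gt -> pred nearest-neighbor distances
    cd = d_pg.mean() + d_gp.mean()          # symmetric Chamfer Distance
    hd = torch.max(d_pg.max(), d_gp.max())  # symmetric Hausdorff Distance
    return cd, hd
```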
We compare the method proposed in this paper with five existing point cloud upsampling methods: PUGeo (Qian et al., 2020), PUCRN (Du et al., 2022), APUNet (Zhao et al., 2023), Grad-PU (He et al., 2023), and RepKPU (Rong et al., 2024). For a fair comparison, we retrained these methods on our dataset in the same experimental environment, using the official code and recommended settings.
4.3. Results on synthetic dataset
The quantitative comparison results are shown in Table 1. We performed a quantitative analysis on sparse, non-uniform point clouds with 256, 1024, and 4096 input points. From the table, we can see that our method achieves superior results in upsampling sparse, non-uniform point clouds compared to most other methods. Although, for 256 input points, some metrics such as HD are not as good as RepKPU's, the visual results in Figure 3 show that RepKPU produces highly uneven results. This is because, when dealing with extremely sparse point clouds, anomalies can occur during point generation, strongly affecting some metrics. In contrast, other methods tend to generate highly clustered points, which may yield better results on certain metrics. However, the visual results clearly demonstrate that our method generates points that most closely align with the original features.
Table 3. Quantitative comparison under different noise levels (lower is better for all metrics).

| Noise Level | Method | CD | HD | P2F | JSD | UNI |
|---|---|---|---|---|---|---|
| 0.5% | PUGeo (Qian et al., 2020) | 2.961 | 2.447 | 1.702 | 0.412 | 3.437 |
| | PUCRN (Du et al., 2022) | 2.269 | 2.170 | 1.003 | 0.395 | 1.219 |
| | APUNet (Zhao et al., 2023) | 1.418 | 1.489 | 0.387 | 0.276 | 0.613 |
| | Grad-PU (He et al., 2023) | 2.901 | 1.543 | 0.397 | 0.231 | 4.576 |
| | RepKPU (Rong et al., 2024) | 2.736 | 1.441 | 0.242 | 0.215 | 4.432 |
| | Ours | 1.009 | 1.245 | 0.039 | 0.103 | 0.125 |
| 1% | PUGeo (Qian et al., 2020) | 3.107 | 2.491 | 1.875 | 0.435 | 3.764 |
| | PUCRN (Du et al., 2022) | 2.364 | 2.202 | 1.127 | 0.412 | 1.493 |
| | APUNet (Zhao et al., 2023) | 1.739 | 1.532 | 0.291 | 0.298 | 0.836 |
| | Grad-PU (He et al., 2023) | 3.046 | 1.594 | 0.406 | 0.265 | 4.731 |
| | RepKPU (Rong et al., 2024) | 2.938 | 1.479 | 0.325 | 0.251 | 4.682 |
| | Ours | 1.225 | 1.301 | 0.049 | 0.127 | 0.135 |
| 2% | PUGeo (Qian et al., 2020) | 3.208 | 2.553 | 1.938 | 0.467 | 4.024 |
| | PUCRN (Du et al., 2022) | 2.567 | 2.239 | 1.183 | 0.422 | 1.992 |
| | APUNet (Zhao et al., 2023) | 1.962 | 1.633 | 0.579 | 0.310 | 1.267 |
| | Grad-PU (He et al., 2023) | 3.298 | 1.607 | 0.711 | 0.288 | 6.931 |
| | RepKPU (Rong et al., 2024) | 3.125 | 1.532 | 0.653 | 0.267 | 6.637 |
| | Ours | 1.312 | 1.332 | 0.067 | 0.145 | 0.147 |
In addition to the quantitative results, we also present point cloud upsampling and surface reconstruction results for some models in Figure 2. Here, we use Poisson reconstruction (Kazhdan and Hoppe, 2013). As seen in the figure, when dealing with complex data, such as the elephant, our method produces smoother and more uniformly distributed points compared to other methods, providing excellent guidance for subsequent reconstruction. Our upsampling results effectively fill in gaps, while other methods tend to generate more noise and uneven point sets.
Additionally, to evaluate runtime performance, we compare these methods in terms of floating-point operations (FLOPs), training time, and inference time. As shown in Table 2, our method has the shortest training time while remaining competitive in FLOPs and inference efficiency.


Table 4. Ablation study on the PU1K dataset (lower is better for all metrics).

| Ablation | CD | HD | P2F | JSD | UNI |
|---|---|---|---|---|---|
| I | 1.022 | 1.240 | 0.041 | 0.103 | 0.132 |
| II | 1.016 | 1.239 | 0.042 | 0.108 | 0.130 |
| III | 1.025 | 1.241 | 0.044 | 0.099 | 0.130 |
| Ours | 1.014 | 1.238 | 0.038 | 0.097 | 0.128 |



4.4. Results on noise dataset
To validate the robustness of our method to noise, we performed upsampling on point clouds with noise levels of 0.5%, 1%, and 2%. As shown in Table 3, our method provides the best results across all noise levels. From Figure 4, we can observe that our method effectively constrains noisy outliers to lie closer to the surface, and it also achieves the best performance in terms of uniformity.
4.5. Results on real-scanned point clouds
To validate the effectiveness of our method in real-world scenarios, we conducted experiments on the KITTI dataset. Since there is no ground truth point cloud, we provide a visual comparison in Figure 5. As seen, even when facing real scanned point clouds, our method is still able to fill in gaps and output a more uniform point distribution.
4.6. Ablation Study
To verify the effectiveness of each module, we conduct an ablation study to show how each component affects the final results, using the PU1K dataset as a benchmark. We focus on the selection of Gaussian functions (I), the distribution constraints on the manifold (II), and the design of the first-order derivative loss (III). For ablation I, we replace our Gaussian functions with the quadratic surface representation provided in GeoUDF (Ren et al., 2023). The quantitative results of the ablation study are shown in Table 4. It can be observed that each component we designed improves the experimental performance. From Figure 6, it is evident that without the manifold distribution constraint, the points become more clustered, while the absence of the first-order derivative loss results in interference from anomalous points. The experimental results support the effectiveness and necessity of each strategy we employed.
5. Conclusion
We take an innovative approach by starting from the perspective of statistical manifolds. In the local coordinate system of point cloud neighborhoods, we construct Gaussian functions, and through continuous optimization of function parameters via a neural network, we fit local surfaces. The parameters of each local function are treated as a point on the statistical manifold, and point cloud upsampling is performed through distribution constraints on the manifold. Our method effectively learns from sparse and non-uniform datasets and generates more uniformly distributed points than existing methods. Extensive experiments have validated the strong representational power, robustness, and generalization ability of our approach, significantly improving the efficiency of subsequent reconstruction tasks.
Acknowledgements.
This work was supported in part by the Beijing Municipal Science and Technology Commission and the Zhongguancun Science Park Management Committee under Grant Z221100002722020, in part by the National Natural Science Foundation of China under Grant 62072045, and in part by the Natural Science Foundation of Beijing under Grant 7242167.

References
- Alexa et al. (2003) Marc Alexa, Johannes Behr, Daniel Cohen-Or, Shachar Fleishman, David Levin, and Claudio T. Silva. 2003. Computing and rendering point set surfaces. IEEE Transactions on Visualization and Computer Graphics 9, 1 (2003), 3–15.
- Costa et al. (2015) Sueli IR Costa, Sandra A Santos, and Joao E Strapasson. 2015. Fisher information distance: A geometrical reading. Discrete Applied Mathematics 197 (2015), 59–69.
- Du et al. (2022) Hang Du, Xuejun Yan, Jingjing Wang, Di Xie, and Shiliang Pu. 2022. Point cloud upsampling via Cascaded Refinement Network. In Proceedings of the Asian Conference on Computer Vision. 586–601.
- Feng et al. (2022) Wanquan Feng, Jin Li, Hongrui Cai, Xiaonan Luo, and Juyong Zhang. 2022. Neural Points: Point Cloud Representation with Neural Fields for Arbitrary Upsampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 18633–18642.
- Geiger et al. (2013) Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. 2013. Vision meets robotics: The KITTI dataset. The International Journal of Robotics Research 32, 11 (2013), 1231–1237.
- He et al. (2023) Yun He, Danhang Tang, Yinda Zhang, Xiangyang Xue, and Yanwei Fu. 2023. Grad-PU: Arbitrary-Scale Point Cloud Upsampling via Gradient Descent with Learned Distance Functions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5354–5363.
- Held et al. (2012) Robert Held, Ankit Gupta, Brian Curless, and Maneesh Agrawala. 2012. 3D puppetry: a Kinect-based interface for 3D animation.. In UIST, Vol. 12. 423–434.
- Huang et al. (2009) Hui Huang, Dan Li, Hao Zhang, Uri Ascher, and Daniel Cohen-Or. 2009. Consolidation of unorganized point clouds for surface reconstruction. ACM Transactions on Graphics (TOG) 28, 5 (2009), 1–7.
- Huang et al. (2013) Hui Huang, Shihao Wu, Minglun Gong, Daniel Cohen-Or, Uri Ascher, and Hao Zhang. 2013. Edge-Aware Point Set Resampling. ACM transactions on Graphics (TOG) 32, 1 (2013), 1–12.
- Kazhdan and Hoppe (2013) Michael Kazhdan and Hugues Hoppe. 2013. Screened Poisson surface reconstruction. ACM Transactions on Graphics (ToG) 32, 3 (2013), 1–13.
- Kerbl et al. (2023) Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 2023. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Trans. Graph. 42, 4 (2023), 139–1.
- Lafarge and Mallet (2012) Florent Lafarge and Clément Mallet. 2012. Creating Large-Scale City Models from 3D-Point Clouds: A Robust Approach with Hybrid Representation. International Journal of Computer Vision 99 (2012), 69–85.
- Lang et al. (2019) Alex H Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, and Oscar Beijbom. 2019. PointPillars: Fast Encoders for Object Detection from Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12697–12705.
- Li et al. (2019) Ruihui Li, Xianzhi Li, Chi-Wing Fu, Daniel Cohen-Or, and Pheng-Ann Heng. 2019. PU-GAN: A Point Cloud Upsampling Adversarial Network. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 7203–7212.
- Li et al. (2021) Ruihui Li, Xianzhi Li, Pheng-Ann Heng, and Chi-Wing Fu. 2021. Point Cloud Upsampling via Disentangled Refinement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 344–353.
- Lipman et al. (2007) Yaron Lipman, Daniel Cohen-Or, David Levin, and Hillel Tal-Ezer. 2007. Parameterization-free projection for geometry reconstruction. ACM Transactions on Graphics (ToG) 26, 3 (2007), 22–es.
- Musialski et al. (2013) Przemyslaw Musialski, Peter Wonka, Daniel G Aliaga, Michael Wimmer, Luc Van Gool, and Werner Purgathofer. 2013. A survey of urban reconstruction. In Computer Graphics Forum, Vol. 32. Wiley Online Library, 146–177.
- Qi et al. (2017a) Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. 2017a. Pointnet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 652–660.
- Qi et al. (2017b) Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J Guibas. 2017b. Pointnet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. Advances in Neural Information Processing Systems 30 (2017).
- Qian et al. (2021) Guocheng Qian, Abdulellah Abualshour, Guohao Li, Ali Thabet, and Bernard Ghanem. 2021. PU-GCN: Point Cloud Upsampling using Graph Convolutional Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 11683–11692.
- Qian et al. (2020) Yue Qian, Junhui Hou, Sam Kwong, and Ying He. 2020. PUGeo-Net: A Geometry-centric Network for 3D Point Cloud Upsampling. In Proceedings of the European Conference on Computer Vision. Springer, 752–769.
- Ren et al. (2023) Siyu Ren, Junhui Hou, Xiaodong Chen, Ying He, and Wenping Wang. 2023. GeoUDF: Surface Reconstruction from 3D Point Clouds via Geometry-guided Distance Representation. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 14214–14224.
- Rong et al. (2024) Yi Rong, Haoran Zhou, Kang Xia, Cheng Mei, Jiahao Wang, and Tong Lu. 2024. RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 21050–21060.
- Santana et al. (2017) José Miguel Santana, Jochen Wendel, Agustín Trujillo, José Pablo Suárez, Alexander Simons, and Andreas Koch. 2017. Multimodal location based services—semantic 3D city data as virtual and augmented reality. In Progress in location-based services 2016. Springer, 329–353.
- Takeda et al. (2007) Hiroyuki Takeda, Sina Farsiu, and Peyman Milanfar. 2007. Kernel Regression for Image Processing and Reconstruction. IEEE Transactions on Image Processing 16, 2 (2007), 349–366.
- Wang et al. (2019) Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell, and Kilian Q Weinberger. 2019. Pseudo-lidar from visual depth estimation: Bridging the gap in 3d object detection for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8445–8453.
- Xiao et al. (2024) Zijian Xiao, Tianchen Zhou, and Li Yao. 2024. LGSur-Net: A Local Gaussian Surface Representation Network for Upsampling Highly Sparse Point Cloud. In Computer Graphics Forum. Wiley Online Library, e15257.
- Xu et al. (2019) Qiangeng Xu, Weiyue Wang, Duygu Ceylan, Radomir Mech, and Ulrich Neumann. 2019. DISN: Deep Implicit Surface Network for High-quality Single-view 3D Reconstruction. Advances in Neural Information Processing Systems 32 (2019).
- Yifan et al. (2019) Wang Yifan, Shihao Wu, Hui Huang, Daniel Cohen-Or, and Olga Sorkine-Hornung. 2019. Patch-based progressive 3d point set upsampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5958–5967.
- Yu and Turk (2013) Jihun Yu and Greg Turk. 2013. Reconstructing surfaces of particle-based fluids using anisotropic kernels. ACM Transactions on Graphics (TOG) 32, 1 (2013), 1–12.
- Yu et al. (2018a) Lequan Yu, Xianzhi Li, Chi-Wing Fu, Daniel Cohen-Or, and Pheng-Ann Heng. 2018a. EC-Net: an Edge-aware Point set Consolidation Network. In Proceedings of the European Conference on Computer Vision. 386–402.
- Yu et al. (2018b) Lequan Yu, Xianzhi Li, Chi-Wing Fu, Daniel Cohen-Or, and Pheng-Ann Heng. 2018b. PU-Net: Point Cloud Upsampling Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2790–2799.
- Zhao et al. (2023) Tianming Zhao, Linfeng Li, Tian Tian, Jiayi Ma, and Jinwen Tian. 2023. APUNet: Attention-guided upsampling network for sparse and non-uniform point cloud. Pattern Recognition 143 (2023), 109796.