
University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

Abstract Rendering: Computing All that is Seen in Gaussian Splat Scenes

Yangge Li    Chenxi Ji    Xiangru Zhong    Huan Zhang    Sayan Mitra
Abstract

We introduce abstract rendering, a method for computing a set of images by rendering a scene from a continuously varying range of camera positions. The resulting abstract image—which encodes an infinite collection of possible renderings—is represented using constraints on the image matrix, enabling rigorous uncertainty propagation through the rendering process. This capability is particularly valuable for the formal verification of vision-based autonomous systems and other safety-critical applications. Our approach operates on Gaussian splat scenes, an emerging representation in computer vision and robotics. We leverage efficient piecewise linear bound propagation to abstract fundamental rendering operations, while addressing key challenges that arise in matrix inversion and depth sorting—two operations not directly amenable to standard approximations. To handle these, we develop novel linear relational abstractions that maintain precision while ensuring computational efficiency. These abstractions not only power our abstract rendering algorithm but also provide broadly applicable tools for other rendering problems. Our implementation, AbstractSplat, is optimized for scalability, handling up to 750k Gaussians while allowing users to balance memory and runtime through tile and batch-based computation. Compared to the only existing abstract image method for mesh-based scenes, AbstractSplat achieves $2$–$14\times$ speedups while preserving precision. Our results demonstrate that continuous camera motion, rotations, and scene variations can be rigorously analyzed at scale, making abstract rendering a powerful tool for uncertainty-aware vision applications.

Keywords:
Abstract Rendering · Gaussian Splat · Abstract Image

1 Abstract Rendering for Verification and Challenges

Rendering computes an image from the description of a scene and a camera. In this paper, we introduce abstract rendering, which computes the set of all images that can be rendered from a set of scenes and a continuously varying range of camera poses. The resulting infinite collection of images, called an abstract image, is represented by constraints on the image matrix, such as using interval matrices. An example abstract image generated from the continuous movement of a camera is shown in Figure 1. Abstract rendering provides a powerful tool for the formal verification of autonomous systems, virtual reality environments, robotics, and other safety-critical applications that rely on computer vision. For instance, when verifying a vision-based autonomous system, uncertainties in the agent’s state naturally translate into a set of camera positions. These uncertainties must be systematically propagated through the rendering process to analyze their impact on perception, control, and overall system safety. By enabling rigorous uncertainty propagation, abstract rendering supports verification of statements such as “No object is visible to the camera as it pans through a specified angle.”

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 1: Renderings of a Gaussian splat scene with a Lego bulldozer (from [9]). Rendered image before (Left) and after (Center left) the camera shifts horizontally by 10 cm. The lower (Center right) and upper (Right) bounds on the abstract image computed by our abstract rendering method for the given camera movement. For every pixel, the color in any image generated by the camera movement is guaranteed to lie between the corresponding pixel values in these lower and upper bound images.

Our abstract rendering algorithm operates on scenes represented by Gaussian splats—an efficient and differentiable representation for 3D environments [7]. A scene consists of a collection of 3D Gaussians, each with a distinct color and opacity. The rendering algorithm projects these splats onto the 2D image plane of a camera and blends them to determine the final color of each pixel. Gaussian splats are gaining prominence in computer vision and robotics due to their ability to be effectively learned from camera images and their capacity to render high-quality images in near real time [24]. They have transformed virtual reality applications by enabling the rapid creation of detailed 3D scenes [5, 11, 1]. In robotics and autonomous systems, Gaussian splatting has proven valuable for Simultaneous Localization and Mapping (SLAM) and scene reconstruction [17].

To rigorously propagate uncertainty in such settings, we represent all inputs—such as camera parameters and scene descriptions—as linear sets (i.e., vectors or matrices defined by linear constraints). This abstraction allows us to systematically propagate uncertainty through each rendering operation.

The rendering algorithm for Gaussian splats computes the color of each pixel $\mathsf{u}$ in four steps. First, each 3D Gaussian is transformed from world coordinates to camera coordinates, and then to 2D pixel coordinates. The third step computes the effective opacity of each Gaussian at the pixel $\mathsf{u}$ by evaluating its probability density. Finally, the colors of all Gaussians are blended, weighted by their effective opacities and sorted by their distance to the image plane.

The first two steps in the rendering process involve continuous transformations that can be precisely bounded above and below by piecewise linear functions. Although it is mathematically straightforward to propagate linear sets through these operations, achieving accurate bounds for high-dimensional objects (e.g., scenes with over 100k Gaussians and images with over 10k pixels) requires a GPU-optimized linear bound propagation method such as the ones implemented in CROWN [21, 23].

The third step evaluates Gaussian distributions over a linear set, which requires computing an interval matrix containing the inverses of multiple covariance matrices. Standard inversion methods (e.g., the adjugate approach) become numerically unstable near singularities and yield overly conservative bounds. We mitigate this by using a Taylor expansion around a stable reference, such as the midpoint of the input set; by carefully tuning the expansion order and bounding the truncation error, we obtain inverse bounds that are sufficiently tight for abstract rendering.

The final step blends Gaussian colors based on depth. In abstract rendering, uncertain depths allow multiple possible sorting orders, and merging these orders leads to overly conservative bounds. For example, if several Gaussians with different colors can be frontmost, the resulting bound may unrealistically turn white. To address this, we replace depth-sorting with an index-free blending formulation that uses an indicator function to directly model pairwise occlusion, ensuring each Gaussian’s contribution is bounded independently.

Our abstract rendering algorithm, AbstractSplat, integrates these techniques and supports tunable time–memory tradeoffs for diverse hardware. Our experiments show that AbstractSplat scales from single-object scenes to those with over 750k Gaussians—handling both camera translations and rotations—and achieves $2$–$14\times$ speedups over the only existing abstract image method for mesh-based scenes [4] while preserving precision. Tile and batch optimizations enable users to balance GPU memory usage (which scales quadratically with tile size) against runtime (which scales linearly with batch size). Moreover, our experiments confirm that regularizing near-singular Gaussians is critical for reducing over-approximation. Together, these contributions significantly advance rigorous uncertainty quantification in rendering.

In summary, the key contributions of our paper are: (1) We present the first abstract rendering algorithm for Gaussian splats. It can rigorously bound the output of 3D Gaussian splatting under scene and camera pose uncertainty in both position and orientation. This enables formal analysis of vision-based systems in dynamic environments. (2) We have developed a mathematical and a software framework for abstracting the key operations needed in rendering algorithms. While many of these approximations can be directly handled by a library like CROWN [21], we developed new methods for handling challenging operations like matrix inversion and sorting. We believe this framework can be useful beyond Gaussian splats for abstracting other rendering algorithms. (3) We present a carefully engineered implementation that leverages tiled rendering and batched computation to achieve excellent scalability of abstract rendering. This design enables a tunable performance–memory trade-off, allowing our implementation to adapt effectively to diverse hardware configurations.

Relationship to Existing Research

The only closely related work computes abstract images for triangular mesh scenes [4]. To handle camera-pose uncertainty, that method collects each pixel’s RGB values into intervals over all possible viewpoints. In practice, this amounts to computing partial interval depth-images for each triangle and merging them using a depth-based union, resulting in an interval image that over-approximates all concrete images for that region of poses. This work considers only camera translations, not rotations. Our method works with Gaussian splats which involve more complex operations and are learnable. Our algorithm also has certain performance and accuracy advantages as discussed in Section 4.3.

Differentiable rendering [16] algorithms are based on ray-casting [8, 9, 10] or rasterization [4, 6, 7]. Although these techniques are amenable to gradient-based sensitivity analysis, they are not concerned with rigorous uncertainty propagation through the rendering process. We address this gap for Gaussian splatting. Our mathematical framework could be extended to other rasterization pipelines and even to ray-casting methods where similar numerical challenges arise.

Abstract interpretation [2, 14] is effective for verifying large-scale software systems; however, applying it directly to rendering presents unique challenges. Rendering involves high-dimensional geometric transformations (e.g., perspective projection, matrix inversion) and discontinuous operations (e.g., sorting) that exceed the typical scope of tools designed for control-flow analysis [3, 13] or neural network verification [12, 21]. Our method adapts abstract interpretation principles specifically to the rendering domain, providing a tailored approach for bounding rendering-specific operations while balancing precision and scalability.

Finally, although neural network robustness verification [12, 18, 21] analyzes how fixed image perturbations affect network outputs, our work operates upstream by generating the complete set of images a system may encounter under scene and camera uncertainty. This distinction is critical for safety-critical applications, where environmental variations—such as lighting and object motion—must be rigorously modeled before perception occurs.

2 Background: Linear Approximations and Rendering

Our abstract rendering algorithm in Section 3 computes linear bounds on pixel colors derived from bounds on the scene and the camera parameters. In this section, we give an overview of the standard rendering algorithm for 3D Gaussian splat scenes and introduce linear over-approximations of functions.

2.1 Linear Over-approximations of Basic Operations

For a vector $x\in\mathbb{R}^{n}$, we denote its $i^{\mathit{th}}$ element by $x_{i}$. Similarly, for a matrix $A\in\mathbb{R}^{m\times n}$, $A_{ij}$ is the element in the $i^{\mathit{th}}$ row and $j^{\mathit{th}}$ column. For two vectors (or matrices) $x,y$ of the same shape, $x<y$ denotes element-wise comparison. For both a vector and a matrix $x$, we denote by $\|x\|$ its Frobenius norm. We use boldface to distinguish set-valued variables from their fixed-valued counterparts (e.g., $\mathbf{uc}$ vs. $\mathsf{uc}$). A linear set or a polytope is a set $\{x\in\mathbb{R}^{n}\mid Ax\leq b\}$, where $Ax\leq b$ are the constraints defining the set. An interval is a special linear set. A piecewise linear relation $R$ on $\mathbb{R}^{n}\times\mathbb{R}^{m}$ is a relation of the form $\{\langle x,y\rangle\in\mathbb{R}^{n}\times\mathbb{R}^{m}\mid\underline{A}x+\underline{b}\leq y\leq\overline{A}x+\overline{b},\ Ax\leq b\}$, where $\langle\underline{A},\underline{b},\overline{A},\overline{b}\rangle$ defines linear constraints on $y$ with respect to $x$. A constant relation is a special piecewise linear relation where $\underline{A}$ and $\overline{A}$ are zero matrices. We define a class of functions that can be approximated arbitrarily precisely with piecewise linear lower and upper bounds over any compact input set.

Definition 1

A function $f:X\to\mathbb{R}^{m}$ is called linearly over-approximable if, for any compact $B\subseteq X$ and $\varepsilon>0$, there exist piecewise linear maps $\ell_{B},u_{B}:B\to\mathbb{R}^{m}$ such that for all $x\in B$, $\ell_{B}(x)\leq f(x)\leq u_{B}(x)$ and $|u_{B}(x)-\ell_{B}(x)|<\varepsilon$.

For any linearly over-approximable function $y=f(x)$ and any linear set of inputs $B\subseteq X$, $R=\{(x,y)\mid\ell_{B}(x)\leq y\leq u_{B}(x),\ x\in B\}$ is a piecewise linear relation that over-approximates the function $f$ over $B$.

Proposition 1

Any continuous function is linearly over-approximable.

Although Proposition 1 implies that piecewise linear bounds on continuous functions can be made arbitrarily tight by shrinking the neighborhood, finding accurate bounds over larger neighborhoods remains a significant challenge. Our implementation of abstract rendering uses CROWN [21] to compute such bounds. Except for $\mathrm{Ind}$, $\mathrm{Inv}$, and $\mathrm{Sort}$, all the basic operations listed in Table 1 are continuous and therefore linearly over-approximable. Moreover, since $\mathrm{Ind}$ is a piecewise constant function, it can be bounded tightly on each subdomain by partitioning at $x=0$, allowing it to be treated as a linearly over-approximable function. This is formalized in Corollary 1.

Corollary 1

All operations in Table 1, except for $\mathrm{Inv}$ and $\mathrm{Sort}$, are linearly over-approximable.

Table 1: Operations. The conditional $A?B:C$ returns $B$ if $A$ is true and $C$ otherwise. Permuting $x$ based on $y$ means reordering the elements of $x$ according to the indices that sort $y$ in ascending order; e.g., permuting $(9,3,7)$ based on $(5,13,8)$ results in $(9,7,3)$.

Operation | Inputs | Output | Math Representation
Element-wise add ($\mathrm{Add}$) | $x,y\in\mathbb{R}^{n\times m}$ | $z\in\mathbb{R}^{n\times m}$ | $z_{ij}=x_{ij}+y_{ij}$
Element-wise multiply ($\mathrm{Mul}$) | $x,y\in\mathbb{R}^{n\times m}$ | $z\in\mathbb{R}^{n\times m}$ | $z_{ij}=x_{ij}*y_{ij}$
Division ($\mathrm{Div}$) | $x\in\mathbb{R}_{>0}$ | $z\in\mathbb{R}_{>0}$ | $z=\frac{1}{x}$
Matrix multiplication ($\mathrm{Mmul}$) | $x\in\mathbb{R}^{n\times m},\ y\in\mathbb{R}^{m\times k}$ | $z\in\mathbb{R}^{n\times k}$ | $z=x\times y$
Matrix inverse ($\mathrm{Inv}$) | $x\in\mathbb{R}^{n\times n}$ | $z\in\mathbb{R}^{n\times n}$ | $z=x^{-1}$
Matrix power ($\mathrm{Pow}$) | $x\in\mathbb{R}^{n\times n},\ k\in\mathbb{N}$ | $z\in\mathbb{R}^{n\times n}$ | $z=x\times x\times\cdots\times x$ ($k$ times)
Summation ($\mathrm{Sum}$) | $x\in\mathbb{R}^{n},\ k\in\mathbb{N}$ | $z\in\mathbb{R}$ | $z=\sum_{i=1}^{k}x_{i}$
Product ($\mathrm{Prod}$) | $x\in\mathbb{R}^{n},\ k\in\mathbb{N}$ | $z\in\mathbb{R}$ | $z=\prod_{i=1}^{k}x_{i}$
Matrix transpose ($\top$) | $x\in\mathbb{R}^{n\times m}$ | $z\in\mathbb{R}^{m\times n}$ | $z_{ij}=x_{ji}$
Element-wise exponential ($\mathrm{Exp}$) | $x\in\mathbb{R}^{n\times m}$ | $z\in\mathbb{R}^{n\times m}$ | $z_{ij}=e^{x_{ij}}$
Frobenius norm ($\mathrm{Norm}$) | $x\in\mathbb{R}^{n\times m}$ | $z\in\mathbb{R}$ | $z=\|x\|$
Element-wise indicator ($\mathrm{Ind}$) | $x\in\mathbb{R}^{n\times m}$ | $z\in\mathbb{R}^{n\times m}$ | $z_{ij}=(x_{ij}>0)\,?\,1:0$
Sorting ($\mathrm{Sort}$) | $x\in\mathbb{R}^{n},\ y\in\mathbb{R}^{n}$ | $z\in\mathbb{R}^{n}$ | permute $x$ based on $y$

Given an input interval $X=[\underline{x},\overline{x}]\subseteq\mathbb{R}$, piecewise linear bounds and the corresponding piecewise linear relations for the operators $\mathrm{Exp}$, $\mathrm{Div}$, and $\mathrm{Ind}$ are shown in Table 2. These bounds confirm that these operations are linearly over-approximable.

Table 2: Bounds for the operations $\mathrm{Exp}$, $\mathrm{Div}$, and $\mathrm{Ind}$. The corresponding piecewise linear relations can be written as $R=\{(x,f(x))\mid\ell_{X}(x)\leq f(x)\leq u_{X}(x),\ \underline{x}\leq x\leq\overline{x}\}$.
Operation $f(x)$ | Linear lower bound $\ell_{X}$ | Linear upper bound $u_{X}$
$\mathrm{Exp}(x)$ | $\exp(\underline{x})(x-\underline{x})+\exp(\underline{x})$ | $\frac{\exp(\overline{x})-\exp(\underline{x})}{\overline{x}-\underline{x}}(x-\underline{x})+\exp(\underline{x})$
$\mathrm{Div}(x)$ | $-\frac{1}{\underline{x}^{2}}(x-\underline{x})+\frac{1}{\underline{x}}$ | $-\frac{1}{\underline{x}\overline{x}}(x-\underline{x})+\frac{1}{\underline{x}}$
$\mathrm{Ind}(x)$ | $0$ if $x\leq 0$; $1$ if $x>0$ | $0$ if $x\leq 0$; $1$ if $x>0$
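To make the bounds in Table 2 concrete, the following NumPy sketch (a standalone illustration, not part of our CROWN-based implementation) evaluates the linear lower and upper bounds for $\mathrm{Exp}$ and $\mathrm{Div}$ on an example interval and checks that they enclose the corresponding functions.

import numpy as np

def exp_bounds(xl, xu):
    # exp is convex: the tangent at xl is a lower bound, the chord is an upper bound
    lo = lambda x: np.exp(xl) * (x - xl) + np.exp(xl)
    hi = lambda x: (np.exp(xu) - np.exp(xl)) / (xu - xl) * (x - xl) + np.exp(xl)
    return lo, hi

def div_bounds(xl, xu):
    # 1/x is convex on a positive interval: tangent at xl below, chord above
    lo = lambda x: -1.0 / xl**2 * (x - xl) + 1.0 / xl
    hi = lambda x: -1.0 / (xl * xu) * (x - xl) + 1.0 / xl
    return lo, hi

# Sanity check: the bounds enclose the functions on the whole interval.
xl, xu = 0.5, 2.0
xs = np.linspace(xl, xu, 1000)
lo, hi = exp_bounds(xl, xu)
assert np.all(lo(xs) <= np.exp(xs) + 1e-9) and np.all(np.exp(xs) <= hi(xs) + 1e-9)
lo, hi = div_bounds(xl, xu)
assert np.all(lo(xs) <= 1.0 / xs + 1e-9) and np.all(1.0 / xs <= hi(xs) + 1e-9)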

2.2 Rendering Gaussian Splat Scenes

A Gaussian splat scene models a three dimensional scene as a collection of Gaussians with different means, covariances, colors, and transparencies. The rendering algorithm for Gaussian splat scenes projects these 3D Gaussians as 2D splats on the camera’s image plane [7].

Given a world coordinate frame $w$, a 3D Gaussian is a tuple $\langle\mathsf{uw},\mathsf{Mw},\mathsf{o},\mathsf{c}\rangle$ where $\mathsf{uw}\in\mathbb{R}^{3}$ is its mean position in world coordinates, $\mathsf{Mw}\in\mathbb{R}^{3\times 3}$ is its covariance matrix (the covariance $\mathsf{Cov}$ is represented by its Cholesky factor $\mathsf{Mw}$, i.e., $\mathsf{Cov}=\mathsf{Mw}(\mathsf{Mw})^{T}$), $\mathsf{o}\in[0,1]$ is its opacity, and $\mathsf{c}\in[0,1]^{3}$ is its color in RGB channels. A scene $\mathsf{Sc}$ is a collection of Gaussians in the world coordinate frame, indexed by a finite set $\mathsf{I}$. A visualization of a scene with three Gaussians is shown on the right side of Figure 2.

Refer to caption
Figure 2: Visualization of a 3D Gaussian scene: Gaussians are defined in world coordinates (right), transformed into camera coordinates (middle), and projected onto the image plane/pixel coordinates (left). Here, $f$ denotes the focal length, and $\mathsf{uw}[1]$, $\mathsf{uc}[1]$, and $\mathsf{up}[1]$ represent the red Gaussian's mean in the respective coordinate systems.

A camera $\mathsf{C}$ is defined by a translation $\mathsf{t}\in\mathbb{R}^{3}$ and a rotation matrix $\mathsf{R}\in\mathbb{R}^{3\times 3}$ relative to the world coordinates, a focal length $(\mathsf{f_{x}},\mathsf{f_{y}})\in\mathbb{R}^{2}$, and an offset $(\mathsf{c_{x}},\mathsf{c_{y}})\in\mathbb{R}^{2}$. The position and the rotation define the extrinsic matrix of the camera, and the latter parameters define the intrinsic matrix. Together, they define how 3D objects in the world are transformed to 2D pixels on the camera's image plane. An image of size $\mathsf{W}\times\mathsf{H}$ (denoted $\mathsf{WH}$) is defined by a collection of pixel colors, where each pixel color $\mathsf{pc}\in[0,1]^{3}$.
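As a concrete illustration of how these parameters act, the following minimal Python sketch applies the extrinsic and intrinsic parameters to project a single world-frame point onto the image plane; the numerical values are illustrative only and are not taken from any scene in this paper.

import numpy as np

R = np.eye(3)                   # camera rotation (world -> camera), illustrative
t = np.array([0.0, 0.0, -5.0])  # camera translation, illustrative
fx, fy, cx, cy = 100.0, 100.0, 32.0, 32.0   # focal lengths and offsets

uw = np.array([0.5, -0.2, 1.0])             # a point (Gaussian mean) in world frame
uc = R @ (uw - t)                           # camera coordinates
d = uc[2]                                   # depth to the image plane
px = fx * uc[0] / d + cx                    # pixel coordinates via the intrinsics
py = fy * uc[1] / d + cy
print(d, px, py)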

Given a scene $\mathsf{Sc}$ and a camera $\mathsf{C}$, the image seen by the camera is obtained by projecting the 3D Gaussians in the scene onto the camera's image plane and blending their colors. Each pixel's color is blended based on the distance between the pixel's position and the 2D Gaussians' means, as well as the depth ordering of the 3D Gaussians from the image plane. In detail, the rendering algorithm GaussianSplat (shown in Listing 1) computes the color of each pixel $\mathsf{u}\in\mathsf{WH}$ in the following four steps. The first three steps are performed per Gaussian, and the final step operates on the aggregate.

Step 1.

Transforms each 3D Gaussian $\mathsf{i}\in\mathsf{I}$ in the scene $\mathsf{Sc}$ from world coordinates ($w$) into camera coordinates ($c$) (Lines 2–3 of Listing 1). This is done by applying the rotation $\mathsf{R}$ and the translation $\mathsf{t}$ to the mean $\mathsf{uw}[\mathsf{i}]$ and the rotation to the covariance $\mathsf{Mw}[\mathsf{i}]$.

Step 2.

Transforms each Gaussian $\mathsf{i}$ from 3D camera coordinates ($c$) to 2D pixel coordinates ($p$) on the image plane (Lines 4–7, Figure 2) by applying the intrinsic matrix $\mathsf{K}=\begin{bmatrix}\mathsf{f_{x}}&0&\mathsf{c_{x}}\\ 0&\mathsf{f_{y}}&\mathsf{c_{y}}\end{bmatrix}$ to $\mathsf{uc}[\mathsf{i}]$ and the projective transformation $\mathsf{J}[\mathsf{i}]$ to $\mathsf{Mc}[\mathsf{i}]$. It also computes the depth $\mathsf{d}[\mathsf{i}]$ of Gaussian $\mathsf{i}$ to the image plane, simply as the third coordinate of $\mathsf{uc}[\mathsf{i}]$ (as the camera axis is orthogonal to the image plane).

Step 3.

Next, computes the inverse covariance $\mathsf{Conic}[\mathsf{i}]$ of the $\mathsf{i}^{\mathit{th}}$ 2D Gaussian from Step 2 (Line 8). The effective opacity $\mathsf{a}[\mathsf{i}]$ of each Gaussian at pixel position $\mathsf{u}$ is computed as the opacity $\mathsf{o}[\mathsf{i}]$ weighted by the probability density of the $\mathsf{i}^{\mathit{th}}$ 2D Gaussian at pixel $\mathsf{u}$ (Lines 9–10). The complicated expression for $\mathsf{a}[\mathsf{i}]$ is the familiar definition of the Gaussian distribution with covariance $\mathsf{Mp}[\mathsf{i}]$ written using the operators in Table 1.

Step 4.

Finally, the color of the pixel $\mathsf{pc}$ is determined by blending the colors of all the Gaussians, weighted by their effective opacities $\mathsf{a}$ and adjusted according to their depths $\mathsf{d}$ to the image plane (Line 11).

1   for $\forall\mathsf{i}\in\mathsf{I}$ do
2       $\mathsf{uc}[\mathsf{i}]\leftarrow\mathrm{Mmul}(\mathsf{R},\mathrm{Add}(\mathsf{uw}[\mathsf{i}],-\mathsf{t}))$;
3       $\mathsf{Mc}[\mathsf{i}]\leftarrow\mathrm{Mmul}(\mathsf{R},\mathsf{Mw}[\mathsf{i}])$;
4       $\mathsf{J}[\mathsf{i}]\leftarrow\begin{bmatrix}\mathrm{Mul}(\mathsf{f_{x}},\mathsf{uc}[\mathsf{i},2])&0&-\mathrm{Mul}(\mathsf{f_{x}},\mathsf{uc}[\mathsf{i},0])\\ 0&\mathrm{Mul}(\mathsf{f_{y}},\mathsf{uc}[\mathsf{i},2])&-\mathrm{Mul}(\mathsf{f_{y}},\mathsf{uc}[\mathsf{i},1])\end{bmatrix}$;
5       $\mathsf{up}[\mathsf{i}]\leftarrow\mathrm{Mmul}(\mathsf{K},\mathsf{uc}[\mathsf{i}])$;
6       $\mathsf{Mp}[\mathsf{i}]\leftarrow\mathrm{Mmul}(\mathsf{J}[\mathsf{i}],\mathsf{Mc}[\mathsf{i}])$;
7       $\mathsf{d}[\mathsf{i}]\leftarrow\mathsf{uc}[\mathsf{i},2]$;
8       $\mathsf{Conic}[\mathsf{i}]\leftarrow\mathrm{Inv}(\mathrm{Mmul}(\mathsf{Mp}[\mathsf{i}],\mathsf{Mp}[\mathsf{i}]^{\top}))$;
9       $\mathsf{q}[\mathsf{i}]\leftarrow\mathrm{Mmul}(\mathrm{Add}(\mathrm{Mul}(\mathsf{d}[\mathsf{i}],\mathsf{d}[\mathsf{i}],\mathsf{u}),-\mathrm{Mul}(\mathsf{d}[\mathsf{i}],\mathsf{up}[\mathsf{i}])),\mathsf{Conic}[\mathsf{i}],\mathsf{Mp}[\mathsf{i}])$;
10      $\mathsf{a}[\mathsf{i}]\leftarrow\mathrm{Mmul}(\mathsf{o}[\mathsf{i}],\mathrm{Exp}(\mathrm{Mmul}(-\frac{1}{2},\mathsf{q}[\mathsf{i}],\mathsf{q}[\mathsf{i}]^{\top})))$;
11  $\mathsf{pc}\leftarrow\textsc{BlendSort}(\mathsf{a},\mathsf{c},\mathsf{d})$;
12  return $\mathsf{pc}$;
Listing 1: $\textsc{GaussianSplat}(\mathsf{Sc},\mathsf{C},\mathsf{u})$

This last step is implemented in the BlendSort function (shown in Listing 2): it sorts the Gaussians based on their depth $\mathsf{d}$ to the image plane, resulting in reordered effective opacities $\mathsf{as}$ and Gaussian colors $\mathsf{cs}$ (Lines 1–2). Then, for each Gaussian, its transparency $\mathsf{vs}[\mathsf{i}]$ is computed as $1-\mathsf{as}[\mathsf{i}]$ (Line 4). The transmittance of the $\mathsf{i}^{\mathit{th}}$ Gaussian is obtained as the cumulative product of the transparencies of all Gaussians in front of it (Line 5). Finally, the pixel color $\mathsf{pc}$ is determined by a summation of all Gaussian colors $\mathsf{cs}$, weighted by their respective transmittances $\mathsf{Ts}$ and effective opacities $\mathsf{as}$ (Line 6).

For rendering the whole image, GaussianSplat is executed for each pixel. Although this algorithm looks involved, we note that Steps 1 and 2 only use the $\mathrm{Mmul}$, $\mathrm{Add}$, and $\mathrm{Mul}$ operations from Table 1. Step 3 involves computing the inverse of the covariance matrix, and Step 4 requires sorting by depth.

1   $\mathsf{as}\leftarrow\mathrm{Sort}(\mathsf{a},\mathsf{d})$;
2   $\mathsf{cs}\leftarrow\mathrm{Sort}(\mathsf{c},\mathsf{d})$;
3   for $\forall\mathsf{i}\in\mathsf{I}$ do
4       $\mathsf{vs}[\mathsf{i}]\leftarrow\mathrm{Add}(1,-\mathsf{as}[\mathsf{i}])$;
5       $\mathsf{Ts}[\mathsf{i}]\leftarrow\mathrm{Prod}(\mathsf{vs},\mathsf{i}-1)$;
6   $\mathsf{pc}\leftarrow\mathrm{Sum}(\mathrm{Mul}(\mathsf{Ts},\mathsf{as},\mathsf{cs}))$;
7   return $\mathsf{pc}$;
Listing 2: $\textsc{BlendSort}(\mathsf{a},\mathsf{c},\mathsf{d})$
1   for $\forall\mathsf{i}\in\mathsf{I}$ do
2       for $\forall\mathsf{j}\in\mathsf{I}$ do
3           $\mathsf{V}[\mathsf{i},\mathsf{j}]\leftarrow\mathrm{Add}(1,-\mathrm{Mul}(\mathsf{a}[\mathsf{j}],\mathrm{Ind}(\mathsf{d}[\mathsf{i}]-\mathsf{d}[\mathsf{j}])))$;
4       $\mathsf{T}[\mathsf{i}]\leftarrow\mathrm{Prod}(\mathsf{V}[\mathsf{i}],\mathrm{N})$;
5   $\mathsf{pc}\leftarrow\mathrm{Sum}(\mathrm{Mul}(\mathsf{T},\mathsf{a},\mathsf{c}),\mathrm{N})$;
6   return $\mathsf{pc}$;
Listing 3: $\textsc{BlendInd}(\mathsf{a},\mathsf{c},\mathsf{d})$

Abstract rendering problem.

Our goal is to design an algorithm that, for every pixel $\mathsf{u}\in\mathsf{WH}$, takes as input a linear set of scenes $\mathbf{Sc}$ and cameras $\mathbf{C}$, and outputs a linear set of pixel colors $\mathbf{pc}$ such that $\textsc{GaussianSplat}(\mathsf{Sc},\mathsf{C},\mathsf{u})\in\mathbf{pc}$ for each $\mathsf{Sc}\in\mathbf{Sc}$ and $\mathsf{C}\in\mathbf{C}$.

3 Abstract Rendering

Our approach for creating the abstract rendering algorithm, AbstractSplat, takes a linear set of scenes and cameras as input and step-by-step derives the piecewise linear relation between each intermediate variable and input through GaussianSplat. This process continues until the final step, where the piecewise linear relation between pixel values and inputs is established. The final piecewise linear relation is then used to compute linear bounds on each pixel, and by iterating over all pixels in the image, we obtain linear bounds for the entire image.

In GaussianSplat, the first two steps involve linear operations, making it straightforward to derive the linear relation at each step as a combination of the current relation and the linear bounds of the ongoing operations. However, matrix inversion in Step 3, using the adjugate method (in which the inverse is computed by dividing the adjugate matrix by the determinant), can introduce large over-approximation errors due to uncertainty in the denominator, particularly when the matrix approaches singularity. Step 4 further exacerbates these errors with sorting, as depth order uncertainty may cause a single Gaussian to be mapped to multiple indices, leading to over-counting in the pixel color summation and potentially causing pixel color overflow in extreme cases.

To address these issues, we first introduce a Taylor expansion-based algorithm, MatrixInv, in Section 3.1, which can make the uncertainty introduced by division arbitrarily small by choosing a sufficiently high Taylor order. Then, in Section 3.2, we propose BlendInd (shown in Listing 3) as a replacement for BlendSort; it eliminates index uncertainty and improves bound tightness by implementing Gaussian color blending without the $\mathrm{Sort}$ operation. Finally, we present a detailed explanation of AbstractSplat and prove its soundness in Section 3.3.

3.1 Abstracting Matrix Inverse by Taylor Expansion

Linear bounds on division often lead to greater over-approximation errors than those on addition or multiplication. Thus, we propose a Taylor expansion-based algorithm, MatrixInv (shown in Listing 4), which expresses the matrix inverse as a series involving only addition and multiplication. Although division is still used to bound the remainder term, its impact can be made negligible by increasing the Taylor order, which drives the remainder toward zero.

1   $\mathsf{IXX0}\leftarrow\mathrm{Add}(I,-\mathrm{Mmul}(\mathsf{X},\mathsf{X0}))$;
2   Assert $\mathrm{Norm}(\mathsf{IXX0})<1$;
3   for $\forall\mathsf{i}\in\{0,1,\cdots,\mathsf{k}\}$ do
4       $\mathsf{Xa}[\mathsf{i}]\leftarrow\mathrm{Mmul}(\mathsf{X0},\mathrm{Pow}(\mathsf{IXX0},\mathsf{i}))$;
5   $\mathsf{Xp}\leftarrow\mathrm{Sum}(\mathsf{Xa},\mathsf{k}+1)$;
6   $\mathsf{Eps}\leftarrow\mathrm{Mul}(\mathrm{Norm}(\mathsf{X0}),\mathrm{Pow}(\mathrm{Norm}(\mathsf{IXX0}),\mathsf{k}+1),\mathrm{Div}(\mathrm{Add}(1,-\mathrm{Norm}(\mathsf{IXX0}))))$;
7   $\mathsf{lXinv}\leftarrow\mathrm{Add}(\mathsf{Xp},-\mathsf{Eps})$;
8   $\mathsf{uXinv}\leftarrow\mathrm{Add}(\mathsf{Xp},\mathsf{Eps})$;
9   return $\langle\mathsf{lXinv},\mathsf{uXinv}\rangle$;
Listing 4: $\textsc{MatrixInv}(\mathsf{X};\mathsf{X0},\mathsf{k})$

The MatrixInv function takes a non-singular matrix $\mathsf{X}$ as input and uses a reference matrix $\mathsf{X0}$ along with a Taylor expansion order $\mathsf{k}$ as parameters. It outputs $\mathsf{lXinv}$ and $\mathsf{uXinv}$, the lower and upper bounds on the $\mathrm{Inv}$ operation. Given a fixed input $\mathsf{X}$, MatrixInv operates as follows. First, it calculates the deviation from the reference matrix, denoted $\mathsf{IXX0}$ (Line 1). Next, it checks the norm of $\mathsf{IXX0}$ to ensure that the Taylor expansion does not diverge as the order increases (Line 2). Then, it computes the terms $\mathsf{Xa}$ of the $\mathsf{k}^{\mathit{th}}$ order Taylor approximation and sums them to obtain the $\mathsf{k}^{\mathit{th}}$ order Taylor polynomial $\mathsf{Xp}$ (Lines 3–5), and it computes an upper bound $\mathsf{Eps}$ on the norm of the remainder (Line 6). Finally, the lower (upper) bound of the matrix inverse, $\mathsf{lXinv}$ ($\mathsf{uXinv}$), is obtained by subtracting (adding) $\mathsf{Eps}$ from (to) $\mathsf{Xp}$ (Lines 7–8).

Lemma 1

Given any non-singular matrix $\mathsf{X}$, the output $\langle\mathsf{lXinv},\mathsf{uXinv}\rangle$ of Algorithm MatrixInv satisfies:

$\mathsf{lXinv}\leq\mathrm{Inv}(\mathsf{X})\leq\mathsf{uXinv}$

Lemma 1 shows that MatrixInv produces valid bounds on the $\mathrm{Inv}$ operation for a fixed input. When applied to a linear set of input matrices, the linear bounds on $\mathsf{lXinv}$ and $\mathsf{uXinv}$ are derived through line-by-line propagation of linear relations. Together, these bounds define the overall linear bounds for the $\mathrm{Inv}$ operation on the given input set. To mitigate the computational overhead of excessive matrix multiplications, we make the following adjustments to obtain tighter bounds while keeping $\mathsf{k}$ small: (1) select $\mathsf{X0}$ as the inverse of the center matrix of the input bounds; (2) iteratively subdivide the input range until the assertion at Line 2 is satisfied; and (3) increase $\mathsf{k}$ until $\mathsf{Eps}$ falls below a given tolerance. Example 1 highlights the benefits of using MatrixInv for matrix inversion.

Example 1

Given the constant lower and upper bounds of input matrices,

$\underline{X}=\begin{bmatrix}0.60&-0.20\\ -0.02&0.90\end{bmatrix},\quad\overline{X}=\begin{bmatrix}0.90&0.02\\ 0.02&1.30\end{bmatrix},$

we combine the lower bound of $\mathsf{lXinv}$ and the upper bound of $\mathsf{uXinv}$ as the overall bounds for the $\mathrm{Inv}$ operation, and take the norm difference $\|\mathsf{uXinv}-\mathsf{lXinv}\|$ as a measure of bound tightness. Using the adjugate method, we get $1.22$, while MatrixInv improves this to $0.70$, which closely matches the empirical value of $0.66$.
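The following NumPy sketch illustrates the fixed-input case of Lemma 1 on this example: it evaluates the Taylor (Neumann-series) approximation of Listing 4 for one corner of the interval above, using the inverse of the interval's center matrix as the reference $\mathsf{X0}$, and checks that the resulting bounds contain the exact inverse. Our actual implementation propagates linear sets through these operations rather than fixed matrices.

import numpy as np

def matrix_inv_bounds(X, X0, k):
    # Sketch of MatrixInv (Listing 4) for a fixed matrix X: k-th order
    # Taylor/Neumann expansion around the reference X0 plus a remainder bound.
    IXX0 = np.eye(X.shape[0]) - X @ X0
    r = np.linalg.norm(IXX0)                       # Frobenius norm
    assert r < 1.0, "series would diverge; shrink the input set or change X0"
    Xp = sum(X0 @ np.linalg.matrix_power(IXX0, i) for i in range(k + 1))
    eps = np.linalg.norm(X0) * r ** (k + 1) / (1.0 - r)
    return Xp - eps, Xp + eps                      # element-wise bounds on X^{-1}

# One corner of the interval from Example 1, with X0 = inverse of the interval's center.
Xlow = np.array([[0.60, -0.20], [-0.02, 0.90]])
Xmid = np.array([[0.75, -0.09], [0.00, 1.10]])
lX, uX = matrix_inv_bounds(Xlow, np.linalg.inv(Xmid), k=8)
Xinv = np.linalg.inv(Xlow)
assert np.all(lX <= Xinv) and np.all(Xinv <= uX)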

3.2 Abstracting Color Blending by Replacing Sorting Operator

In BlendSort, the $\mathrm{Sort}$ operation orders Gaussians by increasing depth so that those closer to the camera receive lower indices, enabling cumulative occlusion computation. However, we observe that instead of sorting, one can directly identify occluding Gaussians through pairwise depth comparisons. Our algorithm, BlendInd, iterates over Gaussian pairs and uses the boolean operator $\mathrm{Ind}$ to select those with smaller depths relative to a given Gaussian. This approach avoids the complications associated with index-based accumulation (for example, the potential for counting the occlusion effect of the same Gaussian multiple times) and ultimately yields significantly tighter bounds on the pixel colors.

Algorithm BlendInd takes as input the effective opacities $\mathsf{a}$, Gaussian colors $\mathsf{c}$, and Gaussian depths $\mathsf{d}$ relative to the image plane, and outputs the pixel color $\mathsf{pc}$. It proceeds as follows. First, it computes the transparency between each pair of Gaussians, stored in the matrix $\mathsf{V}$ (Line 3). If the $\mathsf{i}^{\mathit{th}}$ Gaussian is behind the $\mathsf{j}^{\mathit{th}}$ ($\mathsf{d}[\mathsf{i}]-\mathsf{d}[\mathsf{j}]>0$), then $\mathrm{Ind}(\mathsf{d}[\mathsf{i}]-\mathsf{d}[\mathsf{j}])$ evaluates to $1$ and $\mathsf{V}[\mathsf{i},\mathsf{j}]$ is less than $1$, indicating that the contribution of the $\mathsf{i}^{\mathit{th}}$ Gaussian's color is attenuated; otherwise, $\mathrm{Ind}(\mathsf{d}[\mathsf{i}]-\mathsf{d}[\mathsf{j}])$ evaluates to $0$ and $\mathsf{V}[\mathsf{i},\mathsf{j}]$ is exactly $1$, meaning the contribution of the $\mathsf{i}^{\mathit{th}}$ Gaussian's color is unaffected by the $\mathsf{j}^{\mathit{th}}$. Next, the combined transmittance $\mathsf{T}$ of all Gaussians is computed (Line 4). Finally, the pixel color $\mathsf{pc}$ is obtained by aggregating the colors of all Gaussians, weighted accordingly (Line 5).

Lemma 2

For any given inputs (effective opacities $\mathsf{a}$, colors $\mathsf{c}$, and depths $\mathsf{d}$):

$\textsc{BlendSort}(\mathsf{a},\mathsf{c},\mathsf{d})=\textsc{BlendInd}(\mathsf{a},\mathsf{c},\mathsf{d})$

Lemma 2 shows that BlendSort and BlendInd are equivalent, i.e., replacing BlendSort with BlendInd in GaussianSplat yields the same output values for the same inputs. Furthermore, under input perturbation, BlendInd produces tighter linear bounds than BlendSort, as shown in Example 2.
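The following NumPy sketch checks Lemma 2 on fixed inputs: it implements BlendSort and BlendInd directly from Listings 2 and 3 for a small set of Gaussians with distinct depths and verifies that both produce the same pixel color. The specific opacities, colors, and depths are illustrative.

import numpy as np

def blend_sort(a, c, d):
    # BlendSort (Listing 2): sort by depth, then alpha-composite front to back.
    order = np.argsort(d)
    a_s, c_s = a[order], c[order]
    T = np.cumprod(np.concatenate(([1.0], 1.0 - a_s[:-1])))   # transmittances
    return (T[:, None] * a_s[:, None] * c_s).sum(axis=0)

def blend_ind(a, c, d):
    # BlendInd (Listing 3): pairwise indicator instead of sorting.
    occl = (d[:, None] - d[None, :]) > 0            # Ind(d[i] - d[j])
    V = 1.0 - a[None, :] * occl                     # V[i, j]
    T = V.prod(axis=1)                              # transmittance of Gaussian i
    return (T[:, None] * a[:, None] * c).sum(axis=0)

a = np.array([0.8, 0.5, 0.3])                            # effective opacities
c = np.array([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0]])    # RGB colors
d = np.array([2.0, 1.0, 3.0])                            # distinct depths
assert np.allclose(blend_sort(a, c, d), blend_ind(a, c, d))   # Lemma 2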

Example 2

Consider a scene with three Gaussians (red, green, blue) and a camera at the origin facing upward, with a perturbation of ±0.3 applied to each coordinate. At pixel (10,2) in a 20×20 image, the red channel’s upper bound is 0.6526 using BlendSort, 0.575 using BlendInd, and 0.574 via empirical sampling. Similar results are observed for the green and blue channels. Figure 3 visualizes the upper bounds on the RGB channels for the entire image.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 3: Visualization of the scene and upper bounds from Example 2: left, the scene and camera; mid left, bounds computed with BlendSort; mid right, bounds from BlendInd; and right, empirical sampling bounds. Note that the bounds from BlendInd are tighter (darker) and closely match the sampled bounds.

3.3 Abstract Rendering Algorithm AbstractSplat

Our abstract rendering algorithm, AbstractSplat, is derived from modifications to GaussianSplat, as follows:

(1)

All input arguments and variables in GaussianSplat are replaced with their linear set-valued counterparts (with boldface font).

(2)

All the operations in Lines 2–7 and Lines 9–10 of Listing 1 are replaced by their corresponding linear relational operations.

(3)

Line 8 of Listing 1 is replaced by

$\mathbf{Conic}[\mathsf{i}]\leftarrow\textsc{MatrixInv}(\mathrm{Mmul}(\mathbf{Mp}[\mathsf{i}],\mathbf{Mp}[\mathsf{i}]^{\top});\mathsf{Conic0},\mathsf{k}),$

where both $\mathbf{Conic}[\mathsf{i}]$ and $\mathbf{Mp}[\mathsf{i}]$ are linear set-valued variables, and the matrix $\mathsf{Conic0}$ and the Taylor order $\mathsf{k}$ are two parameters ($\mathsf{Conic0}$ is selected as the inverse of the center matrix of the linear set $\mathrm{Mmul}(\mathbf{Mp}[\mathsf{i}],\mathbf{Mp}[\mathsf{i}]^{\top})$, and $\mathsf{k}$ defaults to 8; a detailed explanation of the selection of $\mathsf{Conic0}$ and $\mathsf{k}$ is provided in Section 3.1).

(4)

Line 11 of Listing 1 is modified to

$\mathbf{pc}\leftarrow\textsc{BlendInd}(\mathbf{a},\mathbf{c},\mathbf{d}),$

where all involved variables are linear set-valued.

(5)

The final output of AbstractSplat is a linear set of colors $\mathbf{pc}\subseteq[0,1]^{3}$ representing the set of possible colors for pixel $\mathsf{u}$.

The complete abstract rendering algorithm computes the set of possible colors for every pixel in the image. The next theorem establishes the soundness of AbstractSplat, which follows from the fact that all the operations involved in AbstractSplat are linear relational operations of linearly over-approximable functions.

Theorem 3.1

AbstractSplat computes sound piecewise linear over-approximations of GaussianSplat for any linear sets of inputs 𝐒𝐜\mathbf{Sc} and 𝐂\mathbf{C}.

Proof

Let us fix the input linear sets of scenes $\mathbf{Sc}$ and cameras $\mathbf{C}$, and a pixel $\mathsf{u}$. Consider any specific scene $\mathsf{Sc}\in\mathbf{Sc}$ and camera $\mathsf{C}\in\mathbf{C}$. At Line 2, $\mathsf{uc}[\mathsf{i}]\in\mathbf{uc}[\mathsf{i}]$, because Line 2 of GaussianSplat contains only linearly over-approximable operations (e.g., $\mathrm{Add}$, $\mathrm{Mul}$), and $\mathbf{uc}[\mathsf{i}]$ is generated by the corresponding linear relational operations. Similar containment relationships continue through Line 7, where $\mathsf{d}[\mathsf{i}]\in\mathbf{d}[\mathsf{i}]$.

At Line 8, by Lemma 1, we establish that $\mathsf{Conic}[\mathsf{i}]\in[\mathsf{lXinv}[\mathsf{i}],\mathsf{uXinv}[\mathsf{i}]]$. Since each operation involved in MatrixInv is linearly over-approximable, it follows that $\mathsf{lXinv}[\mathsf{i}]\in\mathbf{lXinv}[\mathsf{i}]$ and $\mathsf{uXinv}[\mathsf{i}]\in\mathbf{uXinv}[\mathsf{i}]$. Thus, for the set $\mathbf{Xinv}[\mathsf{i}]$, defined as the union of $\mathbf{lXinv}[\mathsf{i}]$ and $\mathbf{uXinv}[\mathsf{i}]$, it follows that $\mathsf{Conic}[\mathsf{i}]\in\mathbf{Xinv}[\mathsf{i}]$.

Similarly, at Lines 9 and 10, corresponding containment relationships hold for the same reason as at Lines 2–7. At Line 11, by Lemma 2, we have $\mathsf{pc}=\textsc{BlendInd}(\mathsf{a},\mathsf{c},\mathsf{d})$, and since all operations involved in BlendInd are also linearly over-approximable, it follows that $\textsc{BlendInd}(\mathsf{a},\mathsf{c},\mathsf{d})\in\mathbf{pc}$, and therefore $\mathsf{pc}\in\mathbf{pc}$.

As the result of AbstractSplat over-approximates that of GaussianSplat line by line, it follows that for any specific scene $\mathsf{Sc}\in\mathbf{Sc}$ and camera $\mathsf{C}\in\mathbf{C}$ we have $\mathsf{pc}\in\mathbf{pc}$, completing the proof.

4 Experiments with Abstract Rendering

We implemented the abstract rendering algorithm AbstractSplat described in Section 3. As mentioned earlier, the linear approximation of the continuous operations in Table 1 is implemented using CROWN [21].

AbstractSplat on GPUs.

GPUs excel at parallelizing tasks that apply the same operation simultaneously across large datasets, such as numerical computations and rendering. Consider a $\mathsf{W}\times\mathsf{H}$ pixel image of a scene with $\mathsf{N}$ Gaussians. A naive per-pixel implementation requires $\mathsf{WH}\times\mathsf{N}$ computations for the effective opacities $\mathsf{a}$, while BlendInd requires $O(\mathsf{WH}\times\mathsf{N}^{2})$ operations. For realistic values ($\mathsf{N}>10\mathrm{k}$), this becomes prohibitive. Our implementation adopts a tile-based approach, similar to the original Gaussian splatting [7], where the whole image is partitioned into small tiles and the computation of the effective opacities $\mathsf{a}$ is batched per tile, allowing independent parallel processing.

Batching within each tile increases memory usage. Let $\mathsf{TS}$ denote the tile size and $\mathsf{BS}$ the batch size for processing Gaussians; then storing the linear bounds for $\mathsf{TS}^{2}$ pixels and $\mathsf{BS}$ Gaussians scales as $O(\mathsf{BS}\times\mathsf{TS}^{2})$. Similarly, directly computing depth-based occlusion for all $\mathsf{N}^{2}$ Gaussian pairs in BlendInd is infeasible, so we batch the inner loop only, fixing a batch size for the outer loop to balance throughput and memory consumption.

Thus, the choices of tile size $\mathsf{TS}$ and batch size $\mathsf{BS}$ provide two tunable knobs for balancing performance and memory. Larger $\mathsf{BS}$ and $\mathsf{TS}$ yield faster computation but higher memory usage, while smaller values conserve memory at the cost of speed. By adjusting these parameters, users can optimize AbstractSplat for their available computing resources.
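The following PyTorch sketch shows the loop structure of this tile- and batch-based scheme; the helper callables bound_opacity and blend_bounds are placeholders for the CROWN-based bound computations and are not the actual AbstractSplat API.

import torch

# Illustrative loop structure of the tile/batch scheme (names are ours, not the
# actual AbstractSplat API): each TSxTS tile of pixels is processed independently,
# and within a tile Gaussians are consumed in chunks of BS to cap peak memory.
def abstract_render_tiled(bound_opacity, blend_bounds, N, W, H, TS=8, BS=10000):
    lower = torch.zeros(H, W, 3)
    upper = torch.ones(H, W, 3)
    for ty in range(0, H, TS):
        for tx in range(0, W, TS):
            tile = (slice(ty, ty + TS), slice(tx, tx + TS))
            a_lo, a_hi = [], []
            for start in range(0, N, BS):          # batch over Gaussians
                lo, hi = bound_opacity(tile, slice(start, start + BS))
                a_lo.append(lo)                    # bounds on effective opacities
                a_hi.append(hi)
            # blend_bounds applies the (batched) BlendInd relaxation per pixel
            lower[tile], upper[tile] = blend_bounds(torch.cat(a_lo), torch.cat(a_hi), tile)
    return lower, upper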

4.1 Benchmarks and Evaluation Plan

We evaluate AbstractSplat on six different Gaussian-splat scenes chosen to cover a variety of scales and scene complexities. These scenes are as follows: BullDozer [9] is a scene with a LEGO bulldozer against an empty background; this scene makes it apparent how camera pose changes influence the rendered image. PineTree and OSM are scenes from [4] originally described by triangular meshes. We converted them into Gaussian splats by training on images rendered from the mesh scenes; the resulting 3D Gaussian scenes are visually indistinguishable from the original meshes. Airport is a large-scale scene created from an airport environment in the Gazebo simulator. Garden and Stump [7] are real-world scenes with a large number of Gaussians.

All the scenes were trained using the Splatfacto implementation of Gaussian splatting from Nerfstudio [15]. For each scene, we generate multiple test cases $\mathsf{Tc}$ by varying the nominal (unperturbed) camera pose. Table 3 summarizes the details of these scenes.

Table 3: Gaussian splat scenes with their nominal (unperturbed) camera poses. For each scene, the camera pose is defined by the rotation $\mathsf{R}$ (given as XYZ Euler angles) and the translation $\mathsf{t}$. Multiple nominal camera poses are used for each scene.
$\mathsf{Sc}$ $\mathsf{N}$ $\mathsf{Tc}$ $\mathsf{R}$ $\mathsf{t}$
BullDozer 56989 BD-1 [-0.68,0.08,3.12] [-0.3,5.2,2.7]
PineTree 113368 PT-1 [0.0,1.57,0.0] [194.5, -0.1, 4.5]
PT-2 [0.0,1.57,0.0] [135.0, -0.1, 4.5]
PT-3 [0.0,0.0,0.0] [80.0, 12.0, 40.0]
OSM 144336 OSM-1 [0.0,1.57,0.0] [160.0,-0.1,20.0]
OSM-2 [0.0,1.57,0.0] [160.0,-0.1,4.5]
OSM-3 [0.0,-1.57,0.0] [-160.0,0.0,10.0]
Airport 878217 AP-1 [-1.71, -1.12, -1.73] [-2914.7, 676.5, -97.5]
AP-2 [-1.36, -1.50, -1.36] [-2654.9, 122.7, 26.7]
Garden 524407 GD-1 [-0.72, 0.31, 2.99] [1.7, 3.7, 1.4]
Stump 751333 ST-1 [1.96, 0.10, 0.09] [0.6, -1.8,-2.6]

We define two metrics to quantify the precision of abstract images: (1) Mean Pixel Gap ($\mathsf{MPG}$)

$\mathsf{MPG}=\frac{1}{\mathsf{WH}}\sum_{j\in\mathsf{WH}}\|\overline{\mathsf{pc}_{j}}-\underline{\mathsf{pc}_{j}}\|_{2}$

computes the average Euclidean distance across the RGB channels between the upper and lower bounds. This gives an overall measure of bound tightness across the entire image. (2) Max Pixel Gap ($\mathsf{XPG}$)

$\mathsf{XPG}=\max_{j\in\mathsf{WH}}\|\overline{\mathsf{pc}_{j}}-\underline{\mathsf{pc}_{j}}\|_{2}$

measures the largest pixel-wise $L_{2}$ norm, indicating the worst-case loss in any single pixel's bound. Smaller $\mathsf{MPG}$ or $\mathsf{XPG}$ values correspond to more precise abstract rendering.
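A minimal NumPy sketch of these two metrics, applied to lower and upper bound images of shape (H, W, 3); the example arrays are illustrative.

import numpy as np

def pixel_gap_metrics(pc_lower, pc_upper):
    # Per-pixel L2 gap over the RGB channels, then mean (MPG) and max (XPG).
    gaps = np.linalg.norm(pc_upper - pc_lower, axis=-1)
    return gaps.mean(), gaps.max()

lower = np.zeros((4, 4, 3))
upper = np.full((4, 4, 3), 0.5)
mpg, xpg = pixel_gap_metrics(lower, upper)   # both equal 0.5 * sqrt(3) here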

4.2 AbstractSplat is Sound

We applied AbstractSplat to the scenes in Table 4 under various camera poses and perturbations. In these experiments, the input to AbstractSplat is a set of cameras $\mathbf{C}=B_{\epsilon_{\mathsf{t}},\epsilon_{\mathsf{R}}}(\mathsf{C}_{0})$, where $\epsilon_{\mathsf{t}}$ and $\epsilon_{\mathsf{R}}$ denote the perturbation radii for position and orientation, and $\mathsf{C}_{0}$ is the nominal pose. We recorded two metrics, $\mathsf{MPG}$ and $\mathsf{XPG}$, and compared them against empirical estimates obtained by randomly sampling poses from $\mathbf{C}$. Different scenes were rendered at varying resolutions. For some test cases, we further partitioned $\mathbf{C}$ into smaller subsets. We then computed bounds on the pixel colors for each subset and took their union. This practice improves the overall bound, as the relaxation becomes tighter in each subset. Table 4 summarizes the results from these experiments.

Table 4: AbstractSplat results. Test case $\mathsf{Tc}$, position perturbation $\epsilon_{\mathsf{t}}$, orientation perturbation $\epsilon_{\mathsf{R}}$, number of partitions $\#\mathsf{part}$, resolution $\mathsf{res}$, runtime $\mathsf{Rt}$, and the metrics $\mathsf{MPG}$ and $\mathsf{XPG}$ under our method and via empirical sampling. A scalar $\epsilon_{\mathsf{t}}$ or $\epsilon_{\mathsf{R}}$ perturbs all dimensions, while a vector perturbs only specific ones.
$\mathsf{Tc}$ $\epsilon_{\mathsf{t}}$ $\epsilon_{\mathsf{R}}$ $\#\mathsf{part}$ $\mathsf{res}$ $\mathsf{Rt}$ $\mathsf{MPG}$ (Ours) $\mathsf{XPG}$ (Ours) $\mathsf{MPG}$ (Emp.) $\mathsf{XPG}$ (Emp.)
PT-3 [2,0,0] N/A 10 $72\times 72$ 246s 0.27 1.73 0.06 1.37
PT-3 [10,0,0] N/A 100 $72\times 72$ 48min 0.47 1.73 0.18 1.37
PT-3 [10,0,0] N/A 200 $72\times 72$ 82min 0.41 1.73 0.18 1.37
BD-1 0.1 N/A 20 $80\times 80$ 45min 0.85 1.73 0.31 1.69
BD-1 0.2 N/A 40 $80\times 80$ 74min 1.01 1.73 0.67 1.67
BD-1 N/A [0,0,0.1] 10 $80\times 80$ 22min 0.51 1.73 0.22 1.57
BD-1 N/A [0,0,0.3] 60 $80\times 80$ 138min 0.64 1.73 0.47 1.67
AP-1 0.5 N/A 1 $160\times 160$ 20min 0.64 1.72 0.07 1.27
AP-2 N/A [0.001,0,0] 1 $160\times 160$ 20min 0.40 1.64 0.02 0.62
GD-1 [0.05,0.0,0.0] N/A 1 $100\times 100$ 306s 1.10 1.73 0.08 0.83
ST-1 0.002 N/A 1 $200\times 200$ 10min 0.12 1.53 0.01 0.41
OSM-3 0.01 0.001 1 $48\times 48$ 143s 0.69 1.73 0.04 1.01
OSM-3 0.02 0.002 1 $48\times 48$ 145s 1.19 1.73 0.07 1.20

First, the pixel-color bounds produced by AbstractSplat are indeed sound. This is not surprising given Theorem 3.1, but it provides a good sanity check for our code. In all cases, the computed intervals strictly contain the empirically observed range of pixel colors from random sampling, confirming that our over-approximation is indeed conservative.

AbstractSplat handles a variety of 3D Gaussian scenes—from single-object (e.g., BullDozer) and synthetic (PineTree, OSM) to large-scale realistic scenes with hundreds of thousands of Gaussians (Airport, Garden, Stump)—with reasonable runtime even at $200\times 200$ resolution. Moreover, it accommodates both positional $(x,y,z)$ and angular (roll, pitch, yaw) camera perturbations.

Figure 1 visualizes the computed lower and upper pixel-color bounds for row 4 in Table 4, corresponding to a 10 cm horizontal shift in the camera position. Every pixel color from renderings within this range falls within our bounds. Our upper bound is brighter and the lower bound darker compared to the empirical bounds. The over-approximation is largely due to repeated $\mathrm{Mul}$ operations in BlendInd.

As the size of $\mathbf{C}$ increases, the pixel-color bounds naturally slacken. For example, in rows 12–13 of Table 4, a larger perturbation range yields higher $\mathsf{MPG}$ and $\mathsf{XPG}$ values and a wider gap from empirical estimates. Conversely, partitioning the input into finer subsets tightens the bounds: in rows 2–3 (PT-3), increasing the number of partitions from 100 to 200 noticeably reduces $\mathsf{MPG}$, and a similar trend is observed in rows 6–7 for yaw perturbations.

Overall, these experiments confirm that our method yields sound over-approximations of GaussianSplat and can flexibly trade runtime for tighter bounds by adjusting the input partition granularity.
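A sketch of this partition-and-union refinement, assuming a one-dimensional camera-translation range and a placeholder abstract_splat_bounds routine that returns lower and upper bound images for a sub-range:

import numpy as np

# Illustrative partition-and-union refinement over the camera set (Section 4.2).
# abstract_splat_bounds is a placeholder, not the actual AbstractSplat API.
def refined_bounds(abstract_splat_bounds, t_lo, t_hi, n_parts):
    edges = np.linspace(t_lo, t_hi, n_parts + 1)
    lowers, uppers = [], []
    for k in range(n_parts):
        lo_img, up_img = abstract_splat_bounds(edges[k], edges[k + 1])
        lowers.append(lo_img)
        uppers.append(up_img)
    # The union of the per-subset bounds covers the whole camera set and is
    # at least as tight as running once on the full range.
    return np.minimum.reduce(lowers), np.maximum.reduce(uppers)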

4.3 Comparing with Image Abstraction Method of [4]

We compare AbstractSplat with the baseline approach from [4], which computes interval images from triangular meshes under a set of camera positions. We selected four test cases from the scenes in [4] and introduced various sets of cameras $\mathbf{C}$. We recorded the two metrics, $\mathsf{MPG}$ and $\mathsf{XPG}$, under both our approach and the baseline. To match [4], all images are rendered at $49\times 49$ pixels. Table 5 summarizes our results.

Table 5: Comparison of [4] and our method. Test case $\mathsf{Tc}$, runtime $\mathsf{Rt}$ for our method and the baseline, and the two metrics measured for our method and the baseline.
$\mathsf{Tc}$ $\epsilon_{\mathsf{t}}$ $\epsilon_{\mathsf{R}}$ $\mathsf{Rt}$ (Ours) $\mathsf{MPG}$ (Ours) $\mathsf{XPG}$ (Ours) $\mathsf{Rt}$ (Baseline) $\mathsf{MPG}$ (Baseline) $\mathsf{XPG}$ (Baseline)
PT-1 0.01 N/A 125s 0.07 0.74 222s 0.05 1.25
PT-1 0.02 N/A 124s 0.24 1.54 251s 0.05 1.25
PT-1 N/A [0,0,0.01] 67s 0.33 1.72 N/A N/A N/A
PT-2 0.01 N/A 84s 0.07 1.36 110s 0.03 1.62
PT-2 0.02 N/A 88s 0.14 1.60 120s 0.03 1.62
PT-2 N/A [0,0,0.01] 35s 0.08 1.52 N/A N/A N/A
OSM-1 0.01 N/A 105s 0.11 0.66 26min 0.11 1.59
OSM-1 0.03 N/A 111s 0.32 1.54 29min 0.14 1.65
OSM-1 N/A [0,0.001,0] 53s 0.51 1.70 N/A N/A N/A
OSM-2 0.01 N/A 108s 0.11 0.84 26min 0.10 1.60
OSM-2 0.03 N/A 104s 0.30 1.63 29min 0.12 1.65
OSM-2 N/A [0,0.001,0] 59s 0.51 1.73 N/A N/A N/A

Overall, our algorithm achieves faster runtimes than the baseline in most tests. For PT-1, our method is about $2\times$ faster, while in the OSM cases it is roughly $14\times$ faster. In terms of $\mathsf{XPG}$, our approach often yields tighter worst-case bounds, especially for small input perturbations, indicating a more precise upper bound on pixel colors. However, as the perturbation grows, our $\mathsf{MPG}$ can become larger (i.e., less tight) than the baseline's. Because our method runs faster, we can split $\mathbf{C}$ into smaller subsets to refine these bounds when needed, as discussed in Section 4.2. Lastly, our method also supports perturbations in camera orientation, which the baseline does not handle. This flexibility allows us to cover a broader range of real-world camera variations while retaining reasonable performance. Note that the rendering algorithm discussed in [4] could potentially be abstracted by our method, since it can be written using the operations in Table 1.

4.4 Scalability of AbstractSplat

We evaluate the scalability of AbstractSplat by varying the tile size ($\mathsf{TS}$), batch size ($\mathsf{BS}$), and image resolution ($\mathsf{res}$) while keeping the same scene and camera perturbations (see Table 6). The table reports the number of Gaussians processed and the peak GPU memory usage.

Increasing the tile size from $4\times 4$ to $8\times 8$ reduces computation time but increases memory usage roughly quadratically with the tile dimension. Similarly, as $\mathsf{BS}$ increases, runtime drops while memory consumption grows proportionally; however, beyond a certain batch size (roughly the number of Gaussians per tile) the performance gains diminish, and too small a batch size negates memory savings due to overhead in BlendInd. Additionally, processing more Gaussians and higher resolutions linearly increases both runtime and memory demands.

Overall, these results highlight a clear trade-off between speed and memory usage. By adjusting $\mathsf{TS}$, $\mathsf{BS}$, and $\mathsf{res}$, users can balance performance and resource requirements to best suit their hardware.

Table 6: Scalability results for PT-1 under varying tile and batch sizes.
$\mathsf{Tc}$ $\mathsf{res}$ $\mathsf{TS}$ $\mathsf{BS}$ $\#\mathsf{Gauss}$ $\mathsf{Rt}$ GPU Mem (MB)
PT-1 $48\times 48$ $4\times 4$ 10000 317243 283s 2166
PT-1 $48\times 48$ $8\times 8$ 2000 189054 256s 9968
PT-1 $48\times 48$ $8\times 8$ 4000 189054 162s 10838
PT-1 $48\times 48$ $8\times 8$ 10000 189054 110s 14856
PT-1 $48\times 48$ $8\times 8$ 30000 189054 91s 21452
PT-1 $48\times 48$ $8\times 8$ 30000 216671 107s 19384
PT-1 $96\times 96$ $8\times 8$ 30000 232509 256s 12694

4.5 Long, Spiky Gaussians Cause Coarse Analysis

An interesting observation from our experiments is that long, narrow Gaussians in a scene can lead to significantly more conservative bounds in our method. Figure 4 illustrates this phenomenon using the same test case (BD-1) under a small yaw perturbation of $-0.005$ to $0$ rad.

In the left images, the bounds are visibly too loose because the scene contains many "spiky" Gaussians with near-singular covariance matrices. When such covariances are inverted, even small perturbations can cause very large ranges in the conic form, as discussed in Section 3.1. For example, consider one Gaussian in the image coordinate frame with tight covariance bounds:

$\underline{\mathsf{Cov}}=\begin{bmatrix}0.3158&0.3203\\ 0.3203&0.3500\end{bmatrix},\quad\overline{\mathsf{Cov}}=\begin{bmatrix}0.3196&0.3226\\ 0.3226&0.3508\end{bmatrix}$

After inversion, this becomes:

$\underline{\mathsf{Conic}}=\begin{bmatrix}37.9340&-47.8139\\ -47.8381&34.5632\end{bmatrix},\quad\overline{\mathsf{Conic}}=\begin{bmatrix}51.9298&-34.7045\\ -34.6626&46.9024\end{bmatrix}$

Such drastic changes in the inverse translate into large ranges in the evaluated Gaussian densities, overestimated effective opacities $\mathsf{a}$, and, ultimately, loose pixel-color bounds.
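The following NumPy sketch illustrates this sensitivity empirically: it samples symmetric covariance matrices inside the interval above and inverts them, showing how widely the conic entries vary. The sampled spread is necessarily narrower than the conservative bounds reported above, but the magnitude of the variation is already apparent.

import numpy as np

cov_lo = np.array([[0.3158, 0.3203], [0.3203, 0.3500]])
cov_hi = np.array([[0.3196, 0.3226], [0.3226, 0.3508]])

rng = np.random.default_rng(0)
conics = []
for _ in range(10000):
    s = rng.uniform(size=(2, 2))
    cov = cov_lo + s * (cov_hi - cov_lo)
    cov = (cov + cov.T) / 2            # keep each sample symmetric
    conics.append(np.linalg.inv(cov))  # near-singular, so the inverse is sensitive
conics = np.stack(conics)
print(conics.min(axis=0))              # entry-wise spread of the sampled inverses
print(conics.max(axis=0))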

To mitigate this issue, we retrained the scene while penalizing thin Gaussians to avoid near-singular covariances. The resulting bounds (shown in the right images of Fig. 4) are significantly tighter. Thus, the structure of the 3D Gaussian scene itself can substantially impact the precision of our analysis, and regularizing or penalizing long, thin Gaussians provides a practical way to reduce over-approximation.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 4: Left and center left: Lower and upper pixel-color bounds before penalizing thin Gaussians. Center right and right: Bounds after penalizing thin Gaussians.

4.6 Abstract Rendering of Sets of Scenes

While prior experiments focused on camera pose uncertainty, AbstractSplat also supports propagating sets of 3D Gaussian scene parameters (e.g., color, mean position, opacity). This capability enables analyzing environmental variations or dynamic scenes represented as evolving Gaussians. We validate this with three experiments on PT-1, fixing the camera pose and perturbing specific Gaussian attributes.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 5: Lower (Top) and upper (Bottom) bounds under Gaussian color (Left), mean (Mid), and opacity (Right) perturbations.

In the first experiment, we perturb the red channel of the Gaussians representing the two trees by 0–0.5. The resulting pixel-color bounds (Fig. 5, Left) fully encompass the empirical range, confirming that our method captures chromatic uncertainty. In the second experiment, a 1 m perturbation along the +y axis is applied to the tree Gaussians' mean positions; the bounds (Fig. 5, Mid) capture both spatial displacement and occlusion effects, demonstrating that AbstractSplat effectively propagates geometric uncertainty. Finally, by scaling the original opacities by 0.1 (capping them) and adding a +0.1 perturbation, we test transparency variations. The resulting bounds (Fig. 5, Right) remain sound, though somewhat looser due to multiplicative interactions in BlendInd.

These experiments highlight the flexibility of AbstractSplat: by abstracting Gaussian parameters, we can model environmental dynamics such as object motion, material fading, or scene evolution. This capability aligns with recent work on dynamic 3D Gaussians [19, 22, 20] and underscores the potential for robustness analysis in vision systems operating in non-static environments.

5 Discussion and Future Work

We presented the first method for rigorously bounding the output of 3D Gaussian splatting under camera pose uncertainty, enabling formal analysis of vision-based systems operating in dynamic environments. By reformulating depth-sorting as an index-free blending process and bounding matrix inverses via Taylor expansions, our approach avoids combinatorial over-approximation while maintaining computational tractability. Experiments demonstrate scalability to scenes with over 750k Gaussians.

Potential directions include refining the blending process to reduce error accumulation, exploring higher-order bounds or symbolic representations to capture pixel correlations, and extending the framework to closed-loop system verification. Adapting our approach to other rendering paradigms (e.g., NeRF, mesh-based rendering) could further unify uncertainty-aware analysis across computer vision pipelines. Abstract rendering under scene variations also merits further investigation.

References

  • [1] Anonymous: GS-VTON: Controllable 3d virtual try-on with gaussian splatting (2025), https://openreview.net/forum?id=8eenzfwKqU
  • [2] Bouissou, O., Conquet, E., Cousot, P., Cousot, R., Feret, J., Ghorbal, K., Goubault, E., Lesens, D., Mauborgne, L., Miné, A., Putot, S., Rival, X., Turin, M.: Space Software Validation using Abstract Interpretation. In: The International Space System Engineering Conference : Data Systems in Aerospace - DASIA 2009. vol. 1, pp. 1–7. EUROSPACE, European Space Agency, Istanbul, Turkey (May 2009), https://inria.hal.science/inria-00528590
  • [3] Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Monniaux, D., Rival, X.: The Astrée analyzer. In: Sagiv, M. (ed.) Programming Languages and Systems. pp. 21–30. Springer Berlin Heidelberg, Berlin, Heidelberg (2005)
  • [4] Habeeb, P., D’Souza, D., Lodaya, K., Prabhakar, P.: Interval image abstraction for verification of camera-based autonomous systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43(11), 4310–4321 (2024). https://doi.org/10.1109/TCAD.2024.3448306
  • [5] Jiang, Y., Yu, C., Xie, T., Li, X., Feng, Y., Wang, H., Li, M., Lau, H., Gao, F., Yang, Y., Jiang, C.: Vr-gs: A physical dynamics-aware interactive gaussian splatting system in virtual reality. arXiv preprint arXiv:2401.16663 (2024)
  • [6] Johnson, J., Ravi, N., Reizenstein, J., Novotny, D., Tulsiani, S., Lassner, C., Branson, S.: Accelerating 3d deep learning with pytorch3d. In: SIGGRAPH Asia 2020 Courses. SA ’20, Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3415263.3419160
  • [7] Kerbl, B., Kopanas, G., Leimkuehler, T., Drettakis, G.: 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph. 42(4) (Jul 2023). https://doi.org/10.1145/3592433
  • [8] Lombardi, S., Simon, T., Saragih, J., Schwartz, G., Lehrmann, A., Sheikh, Y.: Neural volumes: Learning dynamic renderable volumes from images. ACM Trans. Graph. 38(4), 65:1–65:14 (Jul 2019)
  • [9] Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: Nerf: Representing scenes as neural radiance fields for view synthesis. In: ECCV (2020)
  • [10] Rematas, K., Ferrari, V.: Neural voxel renderer: Learning an accurate and controllable rendering tool. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 5416–5426 (2020). https://doi.org/10.1109/CVPR42600.2020.00546
  • [11] Schieber, H., Young, J., Langlotz, T., Zollmann, S., Roth, D.: Semantics-controlled gaussian splatting for outdoor scene reconstruction and rendering in virtual reality (2024), https://arxiv.org/abs/2409.15959
  • [12] Singh, G., Gehr, T., Püschel, M., Vechev, M.: An abstract domain for certifying neural networks. Proc. ACM Program. Lang. 3(POPL) (Jan 2019). https://doi.org/10.1145/3290354
  • [13] Singh, G., Püschel, M., Vechev, M.: Making numerical program analysis fast. In: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. p. 303–313. PLDI ’15, Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2737924.2738000
  • [14] Souyris, J., Delmas, D.: Experimental assessment of Astrée on safety-critical avionics software. In: Saglietti, F., Oster, N. (eds.) Computer Safety, Reliability, and Security. pp. 479–490. Springer Berlin Heidelberg, Berlin, Heidelberg (2007)
  • [15] Tancik, M., Weber, E., Ng, E., Li, R., Yi, B., Kerr, J., Wang, T., Kristoffersen, A., Austin, J., Salahi, K., Ahuja, A., McAllister, D., Kanazawa, A.: Nerfstudio: A modular framework for neural radiance field development. In: ACM SIGGRAPH 2023 Conference Proceedings. SIGGRAPH ’23 (2023)
  • [16] Tewari, A., Thies, J., Mildenhall, B., Srinivasan, P., Tretschk, E., Yifan, W., Lassner, C., Sitzmann, V., Martin-Brualla, R., Lombardi, S., Simon, T., Theobalt, C., Nießner, M., Barron, J.T., Wetzstein, G., Zollhöfer, M., Golyanik, V.: Advances in neural rendering. Computer Graphics Forum 41(2), 703–735 (2022). https://doi.org/10.1111/cgf.14507, https://onlinelibrary.wiley.com/doi/abs/10.1111/cgf.14507
  • [17] Tosi, F., Zhang, Y., Gong, Z., Sandström, E., Mattoccia, S., Oswald, M.R., Poggi, M.: How nerfs and 3d gaussian splatting are reshaping slam: a survey (2024), https://arxiv.org/abs/2402.13255
  • [18] Tran, H.D., Bak, S., Xiang, W., Johnson, T.T.: Verification of deep convolutional neural networks using imagestars. In: 32nd International Conference on Computer-Aided Verification (CAV). Springer (July 2020). https://doi.org/10.1007/978-3-030-53288-8_2
  • [19] Wu, G., Yi, T., Fang, J., Xie, L., Zhang, X., Wei, W., Liu, W., Tian, Q., Wang, X.: 4d gaussian splatting for real-time dynamic scene rendering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 20310–20320 (June 2024)
  • [20] Xu, C., Kerr, J., Kanazawa, A.: Splatfacto-w: A nerfstudio implementation of gaussian splatting for unconstrained photo collections (2024), https://arxiv.org/abs/2407.12306
  • [21] Xu, K., Shi, Z., Zhang, H., Wang, Y., Chang, K.W., Huang, M., Kailkhura, B., Lin, X., Hsieh, C.J.: Automatic perturbation analysis for scalable certified robustness and beyond. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. NIPS ’20, Curran Associates Inc., Red Hook, NY, USA (2020)
  • [22] Yang, Z., Yang, H., Pan, Z., Zhang, L.: Real-time photorealistic dynamic scene representation and rendering with 4d gaussian splatting. In: International Conference on Learning Representations (ICLR) (2024)
  • [23] Zhang, H., Weng, T.W., Chen, P.Y., Hsieh, C.J., Daniel, L.: Efficient neural network robustness certification with general activation functions. Advances in Neural Information Processing Systems 31, 4939–4948 (2018), https://arxiv.org/pdf/1811.00866.pdf
  • [24] Zhu, S., Wang, G., Kong, X., Kong, D., Wang, H.: 3d gaussian splatting in robotics: A survey (2024), https://arxiv.org/abs/2410.12262

Appendix 0.A Appendix

0.A.1 Proofs of Lemmas used for Abstract Rendering

Proposition 2

Any continuous function is linearly over-approximable.

Proof

We present a simple proof for a scalar function $f$; the argument extends to higher dimensions. Fix $x_{0}\in\mathbb{R}$ and a finite slope $k\in\mathbb{R}$. Since $f(x)-k(x-x_{0})$ is continuous, for any $\varepsilon>0$ there exists an open neighborhood $U(x_{0})$ around $x_{0}$ such that $\forall x\in U(x_{0})$,

|f(x)-f(x_{0})-k(x-x_{0})|<\frac{\varepsilon}{3}. (1)

Equivalently,

f(x_{0})+k(x-x_{0})-\frac{\varepsilon}{3}<f(x)<f(x_{0})+k(x-x_{0})+\frac{\varepsilon}{3}. (2)

We construct candidate linear upper and lower bounds over $U$ as:

l_{U}(x)=k(x-x_{0})+f(x_{0})-\frac{\varepsilon}{3},\quad u_{U}(x)=k(x-x_{0})+f(x_{0})+\frac{\varepsilon}{3}. (3)

We can check that these linear functions $l_{U}(x)$ and $u_{U}(x)$ are valid lower and upper bounds for $f(x)$ over $U$, because $\forall x\in U(x_{0})$:

f(x)-l_{U}(x)=f(x)-\left(k(x-x_{0})+f(x_{0})-\frac{\varepsilon}{3}\right)>-\frac{\varepsilon}{3}+\frac{\varepsilon}{3}=0, (4)
f(x)-u_{U}(x)=f(x)-\left(k(x-x_{0})+f(x_{0})+\frac{\varepsilon}{3}\right)<\frac{\varepsilon}{3}-\frac{\varepsilon}{3}=0. (5)

Additionally, the difference between $l_{U}(x)$ and $u_{U}(x)$ is tightly bounded, since $\forall x\in U(x_{0})$:

|u_{U}(x)-l_{U}(x)|=\left(k(x-x_{0})+f(x_{0})+\frac{\varepsilon}{3}\right)-\left(k(x-x_{0})+f(x_{0})-\frac{\varepsilon}{3}\right)=\frac{2\varepsilon}{3}<\varepsilon.

Now, for any compact set $B$, the collection $\mathcal{U}=\{U(x)\mid x\in B\}$ forms an open cover of $B$. By compactness, there exists a finite subcover of $B$, denoted $\{U_{i}=U(x_{i})\mid i=1,\dots,s\}$. We can now define piecewise linear functions on $B$ as follows:

u_{B}(x)=\min_{j\in I}u_{U_{j}}(x),\qquad l_{B}(x)=\max_{j\in I}l_{U_{j}}(x),\qquad\text{for }x\in\bigcup_{j\in I}U_{j}, (6)

where $I$ is the index set of the $U_{i}$ that cover $x$. Consequently, we have

u_{B}(x)=\min_{j\in I}u_{U_{j}}(x)\geq f(x)\geq\max_{j\in I}l_{U_{j}}(x)=l_{B}(x), (7)

and

|u_{B}(x)-l_{B}(x)|\leq u_{U_{j}}(x)-l_{U_{j}}(x)\leq\varepsilon,\quad\forall j\in I. (8)

Thus, $f$ is linearly over-approximable. ∎
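The following NumPy sketch is only illustrative (it is not the specific relaxation used in our implementation): it encloses $\sin(x)$ on $[0,1.5]$, where the function is concave, between its chord, which is a valid linear lower bound, and its tangent at the interval midpoint, which is a valid linear upper bound.

```python
import numpy as np

# Linear lower/upper bounds for sin(x) on [0, 1.5], where sin is concave.
l, u = 0.0, 1.5
xs = np.linspace(l, u, 1001)

chord_slope = (np.sin(u) - np.sin(l)) / (u - l)
lower = np.sin(l) + chord_slope * (xs - l)      # chord lies below a concave function

m = 0.5 * (l + u)
upper = np.sin(m) + np.cos(m) * (xs - m)        # tangent lies above a concave function

assert np.all(lower <= np.sin(xs) + 1e-12)
assert np.all(np.sin(xs) <= upper + 1e-12)
print("maximum gap between the bounds:", np.max(upper - lower))
```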

Lemma 3

Given any non-singular matrix $\mathsf{X}$, the output $\langle\mathsf{lXinv},\mathsf{uXinv}\rangle$ of Algorithm MatrixInv satisfies:

\mathsf{lXinv}\leq\mathrm{Inv}(\mathsf{X})\leq\mathsf{uXinv}
Proof

For any non-singular $n\times n$ matrices $\mathsf{X}$ and $\mathsf{X_{ref}}$, denote their difference by $\mathsf{\Delta X}=\mathsf{X}-\mathsf{X_{ref}}$. The $\mathsf{k}$-th order Taylor polynomial $\mathsf{Xp}$ and remainder $\mathsf{R}$ of the inverse of $\mathsf{X}$, expanded around $\mathsf{X_{ref}}$, can be written as follows:

\mathsf{Xp}=\mathsf{X_{ref}^{-1}}\sum_{\mathsf{i}=0}^{\mathsf{k}}(-\mathsf{\Delta X}\cdot\mathsf{X_{ref}^{-1}})^{\mathsf{i}} (9)
\mathsf{R}=\mathsf{X_{ref}^{-1}}\sum_{\mathsf{i}=\mathsf{k}+1}^{\infty}(-\mathsf{\Delta X}\cdot\mathsf{X_{ref}^{-1}})^{\mathsf{i}} (10)

Assuming that $\|\mathsf{\Delta X}\cdot\mathsf{X_{ref}^{-1}}\|<1$, the remainder $\mathsf{R}$ can be bounded by:

\|\mathsf{R}\|\leq\|\mathsf{X_{ref}^{-1}}\|\cdot\sum_{\mathsf{i}=\mathsf{k}+1}^{\infty}\|\mathsf{\Delta X}\cdot\mathsf{X_{ref}^{-1}}\|^{\mathsf{i}}=\|\mathsf{X_{ref}^{-1}}\|\cdot\frac{\|\mathsf{\Delta X}\cdot\mathsf{X_{ref}^{-1}}\|^{\mathsf{k}+1}}{1-\|\mathsf{\Delta X}\cdot\mathsf{X_{ref}^{-1}}\|}=:\mathsf{Eps} (11)

Since every entry of $\mathsf{R}$ is bounded in magnitude by $\|\mathsf{R}\|$, the bound in Inequality 11 yields entrywise bounds on $\mathsf{X^{-1}}$:

\mathsf{Xp}-\mathsf{Eps}\leq\mathsf{X^{-1}}\leq\mathsf{Xp}+\mathsf{Eps} (12)

Thus, we establish lower and upper bounds for $\mathsf{X^{-1}}$ as:

\mathsf{lXinv}=\mathsf{Xp}-\mathsf{Eps},\quad\mathsf{uXinv}=\mathsf{Xp}+\mathsf{Eps}. (13)

Finally, by replacing $\mathsf{X_{ref}^{-1}}$ with $\mathsf{X0}$ in the expressions for $\mathsf{Xp}$ (Equation 9), $\mathsf{Eps}$ (Equation 11), and $\mathsf{lXinv}$, $\mathsf{uXinv}$ (Equation 13), we obtain the same expressions as those given in Algorithm MatrixInv. ∎
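For illustration, the NumPy sketch below numerically checks the bound of Lemma 3 on a small $2\times 2$ example (it is a sanity check only, not the Algorithm MatrixInv implementation). The reference matrix is the midpoint of the covariance interval from the experiments, and the sign convention follows $\mathsf{\Delta X}=\mathsf{X}-\mathsf{X_{ref}}$.

```python
import numpy as np

# Reference matrix (interval midpoint) and one concrete matrix inside the interval.
X_ref = np.array([[0.3177, 0.32145], [0.32145, 0.3504]])
X = np.array([[0.3196, 0.3226], [0.3226, 0.3508]])
dX = X - X_ref

Xref_inv = np.linalg.inv(X_ref)
M = dX @ Xref_inv
rho = np.linalg.norm(M, 2)            # spectral norm of (dX @ X_ref^{-1})
assert rho < 1, "the Taylor/Neumann bound requires ||dX X_ref^{-1}|| < 1"

k = 3
# Truncated series: Xp = X_ref^{-1} * sum_{i=0}^{k} (-dX X_ref^{-1})^i  (Equation 9)
Xp = Xref_inv @ sum(np.linalg.matrix_power(-M, i) for i in range(k + 1))
# Remainder bound Eps from Equation 11.
Eps = np.linalg.norm(Xref_inv, 2) * rho ** (k + 1) / (1 - rho)

err = np.abs(np.linalg.inv(X) - Xp).max()
assert err <= Eps                     # entrywise: Xp - Eps <= X^{-1} <= Xp + Eps
print(f"max entrywise error {err:.2e} <= Eps {Eps:.2e}")
```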

Lemma 4

For any given inputs, namely effective opacities $\mathsf{a}$, colors $\mathsf{c}$, and depths $\mathsf{d}$, it holds that:

\textsc{BlendSort}(\mathsf{a},\mathsf{c},\mathsf{d})=\textsc{BlendInd}(\mathsf{a},\mathsf{c},\mathsf{d})
Proof

We first derive the expression for the pixel color $\mathsf{pc}$ computed by BlendSort. Denoting $\operatorname*{arg\,sort}(\mathsf{d})$ by a mapping $\sigma:\mathsf{I}\to\mathsf{I}$, where $\mathsf{I}$ is the index set of the Gaussians, the quantities $\mathsf{as}$ at Line 2, $\mathsf{cs}$ at Line 2, and $\mathsf{vs}$ at Line 2 can be written as follows.

\mathsf{as}_{\mathsf{i}}=\mathsf{a}_{\sigma(\mathsf{i})} (14)
\mathsf{cs}_{\mathsf{i}}=\mathsf{c}_{\sigma(\mathsf{i})} (15)
\mathsf{vs}_{\mathsf{i}}=1-\mathsf{a}_{\sigma(\mathsf{i})} (16)

According to the definition of the $\mathrm{Ind}$ operation, we have:

\mathrm{Ind}(\mathsf{d}_{\mathsf{i}}-\mathsf{d}_{\mathsf{j}})=\begin{cases}1&\text{if }\mathsf{d}_{\mathsf{i}}>\mathsf{d}_{\mathsf{j}}\\ 0&\text{if }\mathsf{d}_{\mathsf{i}}\leq\mathsf{d}_{\mathsf{j}}.\end{cases} (17)

Then, replacing $\mathsf{i},\mathsf{j}$ with $\sigma(\mathsf{i}),\sigma(\mathsf{j})$ and multiplying by $\mathsf{a}_{\sigma(\mathsf{j})}$, (17) becomes:

\mathsf{a}_{\sigma(\mathsf{j})}\cdot\mathrm{Ind}(\mathsf{d}_{\sigma(\mathsf{i})}-\mathsf{d}_{\sigma(\mathsf{j})})=\begin{cases}\mathsf{a}_{\sigma(\mathsf{j})}&\text{if }\mathsf{d}_{\sigma(\mathsf{i})}>\mathsf{d}_{\sigma(\mathsf{j})}\\ 0&\text{if }\mathsf{d}_{\sigma(\mathsf{i})}\leq\mathsf{d}_{\sigma(\mathsf{j})}\end{cases} (18)

For any fixed index $\mathsf{i}$, every factor with $\mathsf{j}\geq\mathsf{i}$ falls into the second branch of (18), since $\sigma$ sorts the depths in increasing order; multiplying these factors therefore gives:

\prod_{\mathsf{j}=\mathsf{i}}^{N}\left(1-\mathsf{a}_{\sigma(\mathsf{j})}\cdot\mathrm{Ind}(\mathsf{d}_{\sigma(\mathsf{i})}-\mathsf{d}_{\sigma(\mathsf{j})})\right)=1. (19)

Conversely, for $\mathsf{j}<\mathsf{i}$ we have $\mathsf{d}_{\sigma(\mathsf{j})}<\mathsf{d}_{\sigma(\mathsf{i})}$ (assuming distinct depths), so $\mathrm{Ind}(\mathsf{d}_{\sigma(\mathsf{i})}-\mathsf{d}_{\sigma(\mathsf{j})})=1$. By applying Equation 16, Equation 18, and Equation 19 to Line 2, $\mathsf{Ts}_{\mathsf{i}}$ can be written as follows:

\mathsf{Ts}_{\mathsf{i}}=\prod_{\mathsf{j}=1}^{\mathsf{i}-1}\mathsf{vs}_{\mathsf{j}}=\prod_{\mathsf{j}=1}^{\mathsf{i}-1}(1-\mathsf{a}_{\sigma(\mathsf{j})})=\prod_{\mathsf{j}=1}^{\mathsf{i}-1}(1-\mathsf{a}_{\sigma(\mathsf{j})}\cdot\mathrm{Ind}(\mathsf{d}_{\sigma(\mathsf{i})}-\mathsf{d}_{\sigma(\mathsf{j})}))=\prod_{\mathsf{j}=1}^{N}(1-\mathsf{a}_{\sigma(\mathsf{j})}\cdot\mathrm{Ind}(\mathsf{d}_{\sigma(\mathsf{i})}-\mathsf{d}_{\sigma(\mathsf{j})})) (20)

Then, by applying Equation 20 to Line 2, the pixel color $\mathsf{pc}$ computed by Algorithm BlendSort can be expressed as:

\mathsf{pc}=\sum_{\mathsf{i}=1}^{\mathsf{N}}\left(\prod_{\mathsf{j}=1}^{N}\left(1-\mathsf{a}_{\sigma(\mathsf{j})}\cdot\mathrm{Ind}(\mathsf{d}_{\sigma(\mathsf{i})}-\mathsf{d}_{\sigma(\mathsf{j})})\right)\mathsf{a}_{\sigma(\mathsf{i})}\mathsf{c}_{\sigma(\mathsf{i})}\right) (21)

Since summation and product are invariant to reordering of their operands, and both indices $\mathsf{i}$ and $\mathsf{j}$ range over the entire index set $\mathsf{I}$, the reordering map $\sigma$ in Equation 21 can be eliminated, resulting in:

\mathsf{pc}=\sum_{\mathsf{i}=1}^{\mathsf{N}}\left(\prod_{\mathsf{j}=1}^{N}\left(1-\mathsf{a}_{\mathsf{j}}\cdot\mathrm{Ind}(\mathsf{d}_{\mathsf{i}}-\mathsf{d}_{\mathsf{j}})\right)\mathsf{a}_{\mathsf{i}}\mathsf{c}_{\mathsf{i}}\right) (22)

Next, we derive the expression for $\mathsf{pc}$ in Algorithm BlendInd. $\mathsf{V}_{\mathsf{ij}}$ at Line 3 and $\mathsf{T}_{\mathsf{i}}$ at Line 3 can be written as:

\mathsf{V}_{\mathsf{ij}}=1-\mathsf{a}_{\mathsf{j}}\cdot\mathrm{Ind}(\mathsf{d}_{\mathsf{i}}-\mathsf{d}_{\mathsf{j}}) (23)
\mathsf{T}_{\mathsf{i}}=\prod_{\mathsf{j}=1}^{N}\mathsf{V}_{\mathsf{ij}}=\prod_{\mathsf{j}=1}^{N}\left(1-\mathsf{a}_{\mathsf{j}}\cdot\mathrm{Ind}(\mathsf{d}_{\mathsf{i}}-\mathsf{d}_{\mathsf{j}})\right) (24)

Then, by applying Equation 24 to Line 3 of Algorithm BlendInd, the expression for $\mathsf{pc}$ becomes:

\mathsf{pc}=\sum_{\mathsf{i}=1}^{\mathsf{N}}\left(\prod_{\mathsf{j}=1}^{N}\left(1-\mathsf{a}_{\mathsf{j}}\cdot\mathrm{Ind}(\mathsf{d}_{\mathsf{i}}-\mathsf{d}_{\mathsf{j}})\right)\mathsf{a}_{\mathsf{i}}\mathsf{c}_{\mathsf{i}}\right) (25)

Note that the expressions for $\mathsf{pc}$ in Equation 22 and Equation 25 are identical. This shows that Algorithms BlendSort and BlendInd realize the same input-output relationship while applying different operations, which completes the proof. ∎
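As a quick numerical sanity check of this equivalence, the NumPy sketch below (assuming distinct depths; not the AbstractSplat implementation) computes the pixel color both by explicit depth sorting and by the index-free product, and obtains the same value.

```python
import numpy as np

def blend_sort(a, c, d):
    """Front-to-back alpha blending after an explicit depth sort (BlendSort-style)."""
    order = np.argsort(d)                                     # nearest Gaussian first
    a_s, c_s = a[order], c[order]
    T = np.cumprod(np.concatenate(([1.0], 1.0 - a_s[:-1])))   # transmittance Ts_i
    return np.sum(T * a_s * c_s)

def blend_ind(a, c, d):
    """Index-free blending: each Gaussian is attenuated by all strictly closer ones."""
    ind = (d[None, :] < d[:, None]).astype(float)             # Ind(d_i - d_j)
    T = np.prod(1.0 - a[None, :] * ind, axis=1)               # T_i from Equation 24
    return np.sum(T * a * c)

rng = np.random.default_rng(0)
a = rng.uniform(0.05, 0.9, size=8)       # effective opacities
c = rng.uniform(0.0, 1.0, size=8)        # colors (single channel)
d = rng.permutation(8).astype(float)     # distinct depths
assert np.isclose(blend_sort(a, c, d), blend_ind(a, c, d))
```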