This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Statistical Inference on Grayscale Images via the Euler-Radon Transform

Kun Meng Division of Applied Mathematics, Brown University, RI, USA Corresponding Author: e-mail: kun_meng@brown.edu. Mattie Ji Department of Mathematics, Brown University, RI, USA Jinyu Wang Data Science Initiative, Brown University, RI, USA Kexin Ding Department of Mathematics, Brown University, RI, USA Henry Kirveslahti Laboratory for Topology and Neuroscience, EPFL, Lausanne, Switzerland Ani Eloyan Department of Biostatistics, Brown University School of Public Health, RI, USA Lorin Crawford Department of Biostatistics, Brown University School of Public Health, RI, USA Microsoft Research New England, Cambridge, MA, USA
Abstract

Tools from topological data analysis have been widely used to represent binary images in many scientific applications. Methods that aim to represent grayscale images (i.e., where pixel intensities instead take on continuous values) have been relatively underdeveloped. In this paper, we introduce the Euler-Radon transform, which generalizes the Euler characteristic transform to grayscale images by using o-minimal structures and Euler integration over definable functions. Coupling the Karhunen–Loève expansion with our proposed topological representation, we offer hypothesis-testing algorithms based on the χ2\chi^{2} distribution for detecting significant differences between two groups of grayscale images. We illustrate our framework via extensive numerical experiments and simulations. 111 Keywords: Euler calculus; Karhunen–Loève expansion; o-minimal structures; smooth Euler-Radon transform. Abbreviations: BIS, binary image segmentation; CT, computed tomography; CDT, cell decomposition theorem; DERT, dual Euler-Radon transform; ECT, Euler characteristic transform; GBM, glioblastoma multiforme; iid, independently and identically distributed; LECT, lifted Euler characteristic transform; MEC, marginal Euler curve; MRI, magnetic resonance imaging; Micro-CT, micro-computed tomography; PHT, persistent homology transform; PET, positron emission tomography; RCLL, right continuous with left limit; SECT, smooth Euler characteristic transform; TDA, topological data analysis; WECT, weighted Euler curve transform.

1 Introduction

The analysis of grayscale images is important in many fields. In medical imaging, data can be derived from different modalities including magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), and micro-computed tomography (Micro-CT). Analyzing the variation among pixel intensities in these images can help diagnose diseases, detect abnormalities in human tissues, and monitor the effectiveness of different treatment strategies (e.g., see Figure 1). Beyond medicine, grayscale imaging is essential in astronomy for capturing details in celestial bodies and other phenomena (Howell, 2006), in geology for studying rock and mineral compositions using electron microscopy (Reed, 2005), and in meteorology for interpreting satellite imagery to predict weather patterns (Kidder and Haar, 1995).

There is an important distinction between binary and grayscale images. Unlike binary images, where pixels are exactly one of two colors (usually black and white), pixels in grayscale images take on continuous values. A binary image can be modeled as a binary-valued function which is often expressed in the following form

𝟙K(x)={1, if the color of point (pixel) x is white,0, if the color of point (pixel) x is black,\displaystyle\mathbbm{1}_{K}(x)=\left\{\begin{aligned} 1,\ \ &\mbox{ if the color of point (pixel) $x$ is white},\\ 0,\ \ &\mbox{ if the color of point (pixel) $x$ is black},\end{aligned}\right. (1.1)

where the subscript KK denotes the region of white points in the image (which we will refer to as a “shape” throughout this paper). In contrast, a grayscale image must be modeled as a real-valued function. Specifically, the grayscale intensity of each pixel is represented as the function value at that corresponding point.

Many methods in the field of topological data analysis (TDA) (Carlsson, 2009, Vishwanath et al., 2020) have been developed for analyzing binary images (equivalently, shapes). Some of these works include the persistent homology transform (PHT) (Turner et al., 2014), the Euler characteristic transform (ECT) (Turner et al., 2014, Ghrist et al., 2018), and the smooth Euler characteristic transform (SECT) (Crawford et al., 2020) — all of which are proposed statistics used to represent shapes while preserving the complete information they contain. Crawford et al. (2020) applied the SECT on MRI-derived binary images taken from tumors of glioblastoma multiforme (GBM) patients. Here, the authors used the resulting summary statistics from the SECT in a class of Gaussian process regression models to predict survival-based outcomes. Wang et al. (2021) utilized the ECT for sub-image analysis which aims to identify geometric features that are important for distinguishing various classes of shapes. Marsh et al. (2022) recently presented the DETECT: an extension of the ECT to analyze temporal changes in shapes. The authors demonstrated their approach by studying the growth of mouse small intestine organoid experiments from segmented videos. Lastly, Meng et al. (2022) recently used the SECT framework to introduce a χ2\chi^{2} distribution-based approach to test hypotheses on random shapes, with the corresponding mathematical foundation being established therein through algebraic topology, functional analysis with Sobolev embeddings (Brezis, 2011), and probability theory using the Karhunen–Loève expansion (Alexanderian, 2015).

Previous TDA methods that use Euler characteristic-based invariants are well suited for binary images and shapes with clearly defined boundaries (e.g., the region KK of white points defined via the binary-valued function in Eq. (1.1)). Grayscale images, on the other hand, are arrays of 2-dimensional pixel or 3-dimensional voxel intensities that represent varying levels of brightness in a continuous space (e.g., see Figures 1 and 2). These images lack the clear boundaries that can be used to define and compute the Euler characteristic. Consequently, due to the inherent continuous nature of grayscale images, both the Euler characteristic and the corresponding TDA methods that leverage it are not immediately applicable.

One natural way to apply the Euler characteristic to grayscale images is through binary image segmentation (BIS) — a process that divides points in a grayscale image into foreground (the shape of interest) and background. That is, BIS can serve as a preprocessing step to convert a grayscale image into a binary format, facilitating the application of the Euler characteristic. Numerous state-of-the-art applications of Euler characteristic-based TDA methods to grayscale images rely on BIS. For example, Crawford et al. (2020) used the computer-assisted program MITKats (Chen and Rabadán, 2017) to threshold MRI scans of GBM tumors and convert them to binary formats for their analyses with the SECT (refer to Figure 3 in Crawford et al. (2020)). Unfortunately, there are several challenges associated with BIS that can impede on the effectiveness and accuracy of downstream analyses with TDA methods. One of the primary challenges is selecting an appropriate threshold to distinguish between foreground and background points. Improper choice of a BIS threshold may result in the issue of over- or under-segmentation. Moreover, pinpointing a proper threshold is often not straightforward (especially when the images have low image contrast, large noise, or complex heterogeneity) and can be computationally demanding. This selection process can be especially challenging for medical images of soft tissues (including organs, tumors, and blood vessels). Most importantly, performing BIS inevitably causes a loss of information in images of interest. By generalizing Euler characteristic-based statistics and enabling them to be directly applicable to grayscale images without the need for BIS, we will increase their utility for better powered shape analyses.

Refer to caption
Figure 1: CT scans of lung cancer tumors labeled as “benign” or “malignant.” This figure has been previously published in Maldonado et al. (2021). The details of this figure are provided therein.

Radiomics (Aerts et al., 2014) has been used to estimate features from grayscale images to predict outcomes of interest, such as patient survival time in cancer. Most commonly used radiomics approaches are based on estimating parameters that describe various image intensity features, which are then employed for prediction. For example, Just (2014) described the estimation of moments of intensity histograms, Brooks and Grigsby (2013) proposed a pixel-based distance-dependent approach using the deviation from a linear intensity gradation, and Eloyan et al. (2020) described an approach for estimation of intensity histogram and shape features taking into account the correlation structure of pixel intensities. Additionally, Aerts et al. (2014) showed the computation of shape and texture features from the image intensities. While radiomic features computed from grayscale images can be used to compare populations of images, they also inevitably lose information about the grayscale images of interest. This loss reduces the power of downstream statistical analyses.

The major contributions that we present in this paper all center around developing a generalization of the ECT for grayscale images. A summary of these include:

  1. i)

    Utilizing o-minimal structures (van den Dries, 1998) and the framework proposed in Baryshnikov and Ghrist (2010) for Euler integration over definable functions, we introduce the Euler-Radon transform (ERT). This ERT serves as a topological summary statistic that aims to unify the ECT (Turner et al., 2014, Ghrist et al., 2018), the weighted Euler curve transform (WECT) (Jiang et al., 2020), and the marginal Euler curve (MEC) (Kirveslahti and Mukherjee, 2023). Notably, when the ERT is employed on binary images, it coincides with the ECT. Moreover, akin to the framework presented in Kirveslahti and Mukherjee (2023), our ERT does not rely on the diffeomorphism assumption posited in many state-of-the-art methods (e.g., Ashburner, 2007). Lastly, unlike previous TDA-based methods that rely on BIS and standard radiomic approaches, the “Schapira inversion formula” (Schapira, 1995) guarantees that our proposed ERT summary preserves all information within the pixel intensity arrays of grayscale images.

  2. ii)

    Using the proposed ERT as a building block, we also introduce the smooth Euler-Radon transform (SERT). When applied to binary images, the SERT coincides with the SECT (Crawford et al., 2020, Meng et al., 2022). Importantly, the SERT represents the grayscale images as functional data. Numerous tools from functional data analysis (FDA) (Hsing and Eubank, 2015) and functional analysis (Brezis, 2011) are applicable with the SERT (e.g., the Karhunen–Loève expansion Alexanderian, 2015).

  3. iii)

    Using the ERT and SERT, we propose several statistical algorithms aimed at detecting significant differences between paired collections of grayscale images (e.g., analyzing CT scans of malignant and benign tumors from Figure 1). Particularly, our proposed algorithms combine the Karhunen–Loève expansion with a permutation-based approach. Using simulations, we show that our hypothesis test is uniformly powerful across various scenarios and does not suffer from type I error inflation. These algorithms are a generalization of results presented in Meng et al. (2022).

Beyond the contributions outlined above, due to the resemblance between the ECT and our proposed ERT, this paper paves the way for generalizing a series of ECT-based methods (Crawford et al., 2020, Wang et al., 2021, Marsh et al., 2022) from binary images to grayscale images.

The remainder of this paper is organized as follows. In Section 2, we review some existing representations of grayscale images. In Section 3, we first review the basics of Euler calculus that have been described in van den Dries (1998), Baryshnikov and Ghrist (2010), and Ghrist (2014). Then, using Euler calculus, we define the ERT and detail its properties. In Section 4, we comprehensively describe the relationship between our proposed ERT and existing topological representations of grayscale images (Jiang et al., 2020, Kirveslahti and Mukherjee, 2023). Section 5 offers a proof-of-concept example that illustrates the behavior of our proposed SERT. In Section 6, we propose an ERT-based alignment approach for preprocessing grayscale images prior to statistical inference without relying on correspondences between them. In Section 7, we propose several statistical algorithms designed to differentiate between two sets of grayscale images. The performance of the proposed algorithms is presented in Section 8 using simulations. Lastly, we conclude this paper in Section 9 and discuss several future research directions. The proofs of all theorems in this paper are provided in Appendix A unless otherwise stated.

2 Representations of Grayscale Images

The statistical inference on grayscale images necessitates an appropriate representation of grayscale images. For the application of statistical methods akin to the ECT-based methods in Crawford et al. (2020), Wang et al. (2021), Meng et al. (2022), and Marsh et al. (2022), it is imperative that our proposed representation of grayscale images aligns as closely as possible with the ECT. Prior to delving into our proposed approach, this section offers a review of some existing representations of grayscale images.

In an attempt to employ an ECT-like method to grayscale images of GBM tumors, Jiang et al. (2020) transformed each grayscale image into the discrete representation aσ𝟙σ(x)\sum a_{\sigma}\cdot\mathbbm{1}_{\sigma}(x) in their preprocessing step, where \sum denotes a finite sum, each σ\sigma is a simplicial complex, and each weight aσa_{\sigma} belongs to :={0,1,2,}\mathbb{N}:=\{0,1,2,\ldots\}. In other words, each grayscale image is transformed into a weighted sum of a finite set of simplexes (referred to as a “weighted complex” therein). Subsequently, Jiang et al. (2020) introduced the weighted Euler curve transform (WECT) tailored for weighted complexes. A limitation of the WECT method is the dependency of data analysis results on the discretization aσ𝟙σ(x)\sum a_{\sigma}\cdot\mathbbm{1}_{\sigma}(x), unless the original image is discrete and can be exactly depicted as a weighted complex. When dealing with a high-resolution grayscale image, the deviation between the original image and its weighted complex discretization can be substantial. Importantly, the reliance of data analysis on the discretization complicates theoretical analysis, which necessitates discretization-free representations of grayscale images.

Many state-of-the-art methods for analyzing grayscale images rely on the diffeomorphism assumption that the grayscale images being analyzed are diffeomorphic (e.g., Ashburner, 2007). To obviate the diffeomorphism assumption, Kirveslahti and Mukherjee (2023) introduced two novel representations called the lifted Euler characteristic transform (LECT) and the super LECT (SELECT) utilizing a lifting technique. In contrast to the approach proposed by Jiang et al. (2020), neither the LECT nor SELECT depends on the discretization preprocessing for grayscale images. However, the lifting procedure used in both the LECT and SELECT introduces an additional dimension. For example, the LECT represents a dd-dimensional grayscale image as a function on a (d+1)(d+1)-dimensional manifold (see Eq. (4.1) for details). This increase in dimensionality distinguishes the LECT and SELECT distinctly from the ECT, precluding straightforward theoretical and methodological generalizations of ECT-based methods to grayscale images (Ghrist et al., 2018, Crawford et al., 2020, Wang et al., 2021, Meng et al., 2022, Marsh et al., 2022). Additionally, the augmented dimensionality elevates the computational expense associated with subsequent statistical inference.

A grayscale image can also be represented using function values where g(x)g(x) is used to denote the grayscale intensity at point (pixel) xx. One straightforward approach to quantify differences between two images can then be done using the Lebesgue integral |g(1)(x)g(2)(x)|2𝑑x\int|g^{(1)}(x)-g^{(2)}(x)|^{2}\,dx for grayscale images represented by functions g(1)g^{(1)} and g(2)g^{(2)}. Representing grayscale images through topological invariants presents distinct advantages compared to using function values. These advantages have been documented by Kirveslahti and Mukherjee (2023). As a toy example, the Euler characteristic of a singular point is 1; whereas, its Lebesgue integral is 0.

3 Euler-Radon Transform of Grayscale Functions

In this section, we generalize the ECT from binary images to grayscale images via the Euler calculus (Baryshnikov and Ghrist, 2010, Ghrist, 2014). This generalization does not depend on the discretization of grayscale images nor does it introduce an additional dimension. More precisely, we introduce the Euler-Radon transform (ERT) as a means to perform statistical inference on grayscale images. This approach serves as a natural extension of both the ECT and WECT, and has a direct connection to the LECT and SELECT. Furthermore, to apply functional data analysis, we propose a smooth version of the ERT — the smooth Euler-Radon transform (SERT), which is a generalization of the SECT.

3.1 Outline

To elucidate the conceptual foundation of our ERT as an extension of the ECT, we revisit the ECT of shapes KK from the viewpoint of Euler calculus. The discussion in this subsection primarily serves a heuristic purpose; a thorough and precise exposition is given in Section 3.3.

Without loss of generality, we assume KBd(0,R):={xd:x<R}K\subseteq B_{\mathbb{R}^{d}}(0,R):=\{x\in\mathbb{R}^{d}:\|x\|<R\} for a prespecified radius R>0R>0 and KK is compact. The ECT of KK is a collection of Euler characteristics {χ(Ktν):(ν,t)𝕊d1×[0,2R]}\{\chi(K_{t}^{\nu}):\,(\nu,t)\in\mathbb{S}^{d-1}\times[0,2R]\}, where χ(Ktν) is the Euler characteristic of Ktν\chi\left(K_{t}^{\nu}\right)\text{ is the Euler characteristic of }K_{t}^{\nu} and Ktν:={xK:xνtR}K_{t}^{\nu}:=\{x\in K:\,x\cdot\nu\leq t-R\} (also Meng et al., 2022). The Euler characteristic χ(Ktν)\chi\left(K_{t}^{\nu}\right) can be represented as follows using the Euler functional ()𝑑χ\int(\cdot)d\chi (e.g., Ghrist et al., 2018)

χ(Ktν)=𝟙K(x)R(x,ν,t)𝑑χ(x),where R(x,ν,t):=𝟙{(x,ν,t)Bd(0,R)×𝕊d1×[0,T]:xνtR} and T:=2R.\displaystyle\begin{aligned} &\chi\left(K_{t}^{\nu}\right)=\int\mathbbm{1}_{K}(x)\cdot R(x,\nu,t)\,d\chi(x),\\ &\text{where }\ \ R(x,\nu,t):=\mathbbm{1}\left\{\left(x,\nu,t\right)\in B_{\mathbb{R}^{d}}(0,R)\times\mathbb{S}^{d-1}\times[0,T]\,:\,x\cdot\nu\leq t-R\right\}\text{ and }T:=2R.\end{aligned} (3.1)

In essence, this formulation can be viewed as a generalized Radon transform of the function 𝟙K\mathbbm{1}_{K} (Schapira, 1995, Baryshnikov et al., 2011, Ghrist et al., 2018). Here, the indicator function 𝟙K\mathbbm{1}_{K} represents a binary image, as referenced in Eq. (1.1). In the development of the ERT, our primary objective is to substitute the indicator function 𝟙K\mathbbm{1}_{K} in Eq. (3.1) with a real-valued function gg representing a grayscale image.

Refer to caption
(a)
Refer to caption
(b)
Figure 2: Panel (a) shows a grayscale image represented by a function gg in the form of Eq. (3.2). Here, each white pixel corresponds to value 1 and each black pixel corresponds to 0. Panel (b) presents a rescaled version of gg, where the negative coefficient λ\lambda induces a “white-to-black” transition. In this image, each white pixel corresponds to value 0 and each black pixel corresponds to value -0.5.

A conventional approach to modeling a grayscale image gg takes the form

g(x)={1, if the color at point (pixel) x is white,#,the value of the grayscale intensity at point (pixel) x scaled between 0 and 1,0, if the color at point (pixel) x is black.\displaystyle g(x)=\left\{\begin{aligned} &1,\ \ \ \mbox{ if the color at point (pixel) $x$ is white},\\ &\#,\ \ \ \mbox{the value of the grayscale intensity at point (pixel) $x$ scaled between 0 and 1,}\\ &0,\ \ \ \mbox{ if the color at point (pixel) $x$ is black}.\end{aligned}\right. (3.2)

For reference, we encourage readers to compare Eq. (3.2) with Eq. (1.1). An example of a grayscale image is presented in Figure 2(a). In many applications, it becomes important to rescale a grayscale image where g(x)λg(x)g(x)\mapsto\lambda\cdot g(x) for λ\lambda\in\mathbb{R}. However, the configuration in Eq. (3.2) is not invariant under this rescaling transform. Therefore, we will no longer view a grayscale image as a function taking values between [0,1][0,1]. Instead, throughout this paper, we will treat each grayscale image generally as a bounded function gg, which we subsequently term a grayscale function (see Section 3.3 for details). Obviously, a rescaled bounded function is still bounded, although it may take values outside [0,1][0,1]. Furthermore, for negative λ\lambda, this rescaling inverts the grayscale spectrum, reminiscent of a “white-to-black” transition (an example of this is available in Figure 2(b)). By encompassing general bounded functions — beyond those merely within the [0,1][0,1] range — our approach broadens its applicability, encapsulating scalar fields that are beyond the structure posited in Eq. (3.2) (e.g., the realizations of Gaussian random fields) (Adler et al., 2007, Bobrowski and Borman, 2012).

As previously mentioned, the key to generalizing the ECT to the ERT is replacing the indicator function 𝟙K\mathbbm{1}_{K} in Eq. (3.1) with a real-valued function gg representing a grayscale image. That is, the ERT of a dd-dimensional grayscale image gg can be heuristically expressed as follows

g(x)R(x,ν,t)𝑑χ(x),\displaystyle\int g(x)\cdot R(x,\nu,t)\,d\chi(x), (3.3)

which is a function of (ν,t)(\nu,t) on the dd-dimensional manifold 𝕊d1×[0,T]\mathbb{S}^{d-1}\times[0,T]. If the grayscale function gg in Eq. (3.3) takes only finitely many values in \mathbb{Z} (e.g., gg is a discretized version of some underlying high-resolution grayscale image), the integration in Eq. (3.3) can be easily defined via the Euler integration over constructible functions (see Section 3.6 of Ghrist (2014)) and is equal to the WECT under some tameness conditions (Jiang et al., 2020). When the grayscale function gg has an “infinitely fine resolution” (i.e., gg is generally real-valued), the rigorous definition of the integration in Eq. (3.3) necessitates the techniques developed in Baryshnikov and Ghrist (2010). A more elaborate discussion on this, using o-minimal structures, will be discussed in Section 3.2.

3.2 Euler Calculus via O-minimal Structures

The primary objective of this subsection is to revisit Euler calculus — to prepare for a rigorous version of the heuristic integration presented in Eq. (3.3). This will result in the precise definition of the ERT which we detail in Section 3.3.

O-minimal Structures and Definable Functions.

Euler calculus has been extensively examined in the literature (Baryshnikov and Ghrist, 2010, Ghrist, 2014). Lebesgue integration is established for measurable functions. In a manner similar to Lebesgue integration, Euler integration begins by determining a set of functions that act as integrands for Euler integrals. These specific functions are termed “definable functions” and are specified through o-minimal structures. The definition of o-minimal structures is available in van den Dries (1998) and rephrased as follows:

Definition 3.1.

An o-minimal structure on \mathbb{R} is a sequence 𝒪={𝒪n}n1\mathcal{O}=\{\mathcal{O}_{n}\}_{n\geq 1} satisfying the following axioms:
(i) for each nn, the collection 𝒪n\mathcal{O}_{n} is a Boolean algebra of subsets of n\mathbb{R}^{n};
(ii) A𝒪nA\in\mathcal{O}_{n} implies A×𝒪n+1A\times\mathbb{R}\in\mathcal{O}_{n+1} and ×A𝒪n+1\mathbb{R}\times A\in\mathcal{O}_{n+1};
(iii) {(x1,,xn)n:xi=xj}𝒪n\{(x_{1},\ldots,x_{n})\in\mathbb{R}^{n}\,:\,x_{i}=x_{j}\}\in\mathcal{O}_{n} for all 1i<jn1\leq i<j\leq n;
(iv) A𝒪n+1A\in\mathcal{O}_{n+1} implies π(A)𝒪n\pi(A)\in\mathcal{O}_{n}, where π:n+1n\pi:\mathbb{R}^{n+1}\rightarrow\mathbb{R}^{n} is the projection map on the first nn coordinates;
(v) {r}𝒪1\{r\}\in\mathcal{O}_{1} for all rr\in\mathbb{R}, and {(x,y)2:x<y}𝒪2\{(x,y)\in\mathbb{R}^{2}\,:\,x<y\}\in\mathcal{O}_{2};
(vi) the only sets in 𝒪1\mathcal{O}_{1} are the finite unions of open intervals (with ±\pm\infty endpoints allowed) and points.
A set KK is said to be definable with respect to 𝒪\mathcal{O} if K𝒪K\in\mathcal{O} (i.e., there exists an nn such that K𝒪nK\in\mathcal{O}_{n}).

A typical example of o-minimal structures is the collection of semialgebraic sets which is defined as: a set KnK\subseteq\mathbb{R}^{n} is said to be a semialgebraic subset of n\mathbb{R}^{n} if it is a finite union of sets of the following form

{xn:p1(x)=0,,pk(x)=0,q1(x)>0,,ql(x)>0},\displaystyle\Big{\{}x\in\mathbb{R}^{n}\,:\,p_{1}(x)=0,\,\ldots,\,p_{k}(x)=0,\,q_{1}(x)>0,\,\ldots,\,q_{l}(x)>0\Big{\}},

where kk and ll are positive integers, and p1,,pk,q1,,qlp_{1},\ldots,p_{k},q_{1},\ldots,q_{l} are real polynomial functions on n\mathbb{R}^{n} (also see Chapter 2 of van den Dries (1998)). Specifically, if we let 𝒪n=\mathcal{O}_{n}= the collection of semialgebraic subsets of n\mathbb{R}^{n}, then 𝒪={𝒪n}n1\mathcal{O}=\{\mathcal{O}_{n}\}_{n\geq 1} is an o-minimal structure. Since the unit sphere 𝕊n1={xn:x21=0}\mathbb{S}^{n-1}=\{x\in\mathbb{R}^{n}:\,\|x\|^{2}-1=0\} is defined using the polynomial x21\|x\|^{2}-1, it is definable with respect to this o-minimal structure. It is also true for the open ball Bn(0,R)={xn:x2R<0}B_{\mathbb{R}^{n}}(0,R)=\left\{x\in\mathbb{R}^{n}:\,\|x\|^{2}-R<0\right\} centered at the origin with radius R>0R>0. Throughout this paper, to include many common sets in our framework, we assume the o-minimal structures 𝒪\mathcal{O} of interest to satisfy the following assumption:

Assumption 1.

The o-minimal structure 𝒪\mathcal{O} of interest contains all semialgebraic sets.

Importantly, under Assumption 1, we are able to apply the “triangulation theorem” and the “trivialization theorem” as presented in van den Dries (1998). In particular, the “triangulation theorem” (see Chapter 8 of van den Dries (1998)) indicates that each definable set is homeomorphic to a polyhedron, which subsequently suggests that each definable set is Borel-measurable (see “Definition 1.21” of Klenke (2020)).

In addition to o-minimal structures, we need the concepts in the following definition, which are also available in Baryshnikov and Ghrist (2010).

Definition 3.2.

Suppose we have an o-minimal structure 𝒪\mathcal{O} on \mathbb{R}. Let XX be definable and YNY\subseteq\mathbb{R}^{N} for some positive integer NN.

  1. i)

    A function g:XYg:X\rightarrow Y is said to be definable if its graph Γ(g):={(x,y)X×Y:y=g(x)}\Gamma(g):=\{(x,y)\in X\times Y:y=g(x)\} is definable (i.e., Γ(g)𝒪\Gamma(g)\in\mathcal{O}).

  2. ii)

    Let Def(X;Y)\operatorname{Def}(X;Y) denote the collection of compactly supported definable functions g:XYg:X\rightarrow Y. Denote Def(X):=Def(X;)\operatorname{Def}(X):=\operatorname{Def}(X;\mathbb{R}).

  3. iii)

    Denote CF(X):=Def(X;)\operatorname{CF}(X):=\operatorname{Def}(X;\,\mathbb{Z}). Any function in CF(X)\operatorname{CF}(X) is called a constructible function.

  4. iv)

    If a definable set is also compact, we call this set a constructible set; the collection of all constructible subsets of XX is denoted by CS(X)\operatorname{CS}(X). Obviously, for any constructible subset KXK\subseteq X, K𝟙KK\mapsto\mathbbm{1}_{K} is an injective map from CS(X)\operatorname{CS}(X) to CF(X)\operatorname{CF}(X).

It is simple to verify that, under Assumption 1, the function R(x,ν,t)R(x,\nu,t) defined in Eq. (3.1) is a constructible function where RCF(2d+1)R\in\operatorname{CF}(\mathbb{R}^{2d+1}), and {(x,ν,t)2d+1:R(x,ν,t)=1}\{(x,\nu,t)\in\mathbb{R}^{2d+1}\,:\,R(x,\nu,t)=1\} is a constructible set.

The term “tameness” is frequently used in the TDA literature. While “tame” is often used synonymously with “definable,” certain works (e.g., Bobrowski and Borman, 2012) attribute “tameness” to a notion that slightly diverges from the definability outlined in Definition 3.2. To ensure clarity, we will use “definable” in lieu of “tame.” An in-depth exploration of the interplay between definability and tameness can be found in Appendix B.

Euler Characteristic.

In Euler calculus, the Euler characteristic χ()\chi(\cdot) plays a role analogous to the Lebesgue measure in the Lebesgue integral theory. For any definable set K𝒪K\in\mathcal{O}, the cell decomposition theorem (CDT) indicates that there exists a partition 𝒫\mathcal{P} of KK, where 𝒫\mathcal{P} is a finite collection of cells (see Chapter 3 of van den Dries (1998) for the definition of cells and CDT). Then, the Euler characteristic χ(K)\chi(K) of the definable set KK is defined as follows

χ(K):=C𝒫(1)dim(C),\displaystyle\chi(K):=\sum_{C\in\mathcal{P}}(-1)^{\operatorname{dim}(C)}, (3.4)

where dim(C)\operatorname{dim}(C) denotes the dimension of the cell CC (see Chapter 4 of van den Dries (1998) for the definition of dimensions). One can also show that the value χ(K)\chi(K) does not depend on the choice of partition 𝒫\mathcal{P} (see Chapter 4 of van den Dries (1998), Section 2 therein). As discussed at the beginning of Baryshnikov and Ghrist (2010), the Euler characteristic in Eq. (3.4) is equivalently defined via the Borel-Moore homology, where χ(K)=n(1)ndimHnBM(K;)\chi(K)=\sum_{n\in\mathbb{Z}}(-1)^{n}\cdot\operatorname{dim}H_{n}^{BM}(K;\mathbb{R}) with HBMH_{*}^{BM} denoting the Borel-Moore homology (Bredon, 2012). Note that χ(K)\chi(K) is a homotopy invariant if KK is compact but is only a homeomorphism invariant in general.

Euler Integration over Constructible Functions.

For any constructible function gCF(X)g\in\operatorname{CF}(X), its Euler integral is defined as follows (also see Section 3.6 of Ghrist (2014))

Xg(x)𝑑χ(x):=n=+nχ({xX:g(x)=n}).\displaystyle\int_{X}g(x)\,d\chi(x):=\sum_{n=-\infty}^{+\infty}n\cdot\chi\left(\{x\in X:\,g(x)=n\}\right). (3.5)

Particularly, X𝟙K(x)𝑑χ(x)=χ(K)\int_{X}\mathbbm{1}_{K}(x)\,d\chi(x)=\chi(K) for all KCS(X)K\in\operatorname{CS}(X). Using the Euler integration ()𝑑χ\int(\cdot)\,d\chi defined in Eq. (3.5), we may represent the ECT by the following222\dagger: We use {f(x)}xX\{f(x)\}_{x\in X} to denote the function f:X,xf(x)f:X\rightarrow\mathbb{R},\,x\mapsto f(x) throughout this paper.

ECT:CS(Bd(0,R))𝕊d1×[0,T],KECT(K)={χ(Ktν)=Bd(0,R)𝟙K(x)R(x,ν,t)𝑑χ(x)}(ν,t)𝕊d1×[0,T],\displaystyle\begin{aligned} \operatorname{ECT}:\ \ &\operatorname{CS}\left(B_{\mathbb{R}^{d}}(0,R)\right)\rightarrow\mathbb{Z}^{\mathbb{S}^{d-1}\times[0,T]},\\ &K\mapsto\operatorname{ECT}(K)=\left\{\chi(K_{t}^{\nu})=\int_{B_{\mathbb{R}^{d}}(0,R)}\mathbbm{1}_{K}(x)\cdot R(x,\nu,t)\,d\chi(x)\right\}_{(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]},\end{aligned} (3.6)

where T=2RT=2R and R(x,ν,t)R(x,\nu,t) is the indicator function defined in Eq. (3.1). Eq. (3.6) is a rigorous version of Eq. (3.1) in the sense that Eq. (3.6) specifies the collection of shapes for which the ECT is well-defined. Furthermore, χ(Ktν)\chi(K_{t}^{\nu}) varies “definably” with respect to (ν,t)(\nu,t), which is precisely presented by the following theorem.

Theorem 3.1.

Suppose KCS(Bd(0,R))K\in\operatorname{CS}(B_{\mathbb{R}^{d}}(0,R)). Then, we have the following:

  1. i)

    χ(Ktν)\chi(K_{t}^{\nu}) takes only finitely many values as (ν,t)(\nu,t) runs through 𝕊d1×[0,T]\mathbb{S}^{d-1}\times[0,T]. In addition, for each integer zz\in\mathbb{Z}, the set {(ν,t)𝕊d1×[0,T]:χ(Ktν)=z}\{(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]:\,\chi(K_{t}^{\nu})=z\} is definable; hence, the function (ν,t)χ(Ktν)(\nu,t)\mapsto\chi(K_{t}^{\nu}) is a definable function.

  2. ii)

    The function (ν,t)χ(Ktν)(\nu,t)\mapsto\chi(K_{t}^{\nu}) is Borel-measurable.

  3. iii)

    For each fixed direction ν𝕊d1\nu\in\mathbb{S}^{d-1}, the function tχ(Ktν)t\mapsto\chi(K_{t}^{\nu}) has at most finitely many discontinuities. More precisely, there are points a1<<aka_{1}<\ldots<a_{k} in (0,T)(0,T) such that on each interval (aj,aj+1)(a_{j},a_{j+1}) with ak+1=Ta_{k+1}=T, the function is constant.

The proof of Theorem 3.1 is given in Appendix A.2. The third result of Theorem 3.1 indicates that the “tameness assumption” in (an old version of) Meng et al. (2022) is redundant if the shapes of interest are definable. Recall that the SECT of KCS(Bd(0,R))K\in\operatorname{CS}(B_{\mathbb{R}^{d}}(0,R)) is defined by the following (Crawford et al., 2020, Meng et al., 2022)

SECT(K)(ν,t):=0tχ(Kτν)𝑑τtT0Tχ(Kτν)𝑑τ,for all (ν,t)𝕊d1×[0,T].\displaystyle\operatorname{SECT}(K)(\nu,t):=\int_{0}^{t}\chi(K_{\tau}^{\nu})\,d\tau-\frac{t}{T}\int_{0}^{T}\chi(K_{\tau}^{\nu})\,d\tau,\ \ \ \text{for all }(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]. (3.7)

Theorem 3.1 also guarantees that the Lebesgue integrals in Eq. (3.7) are well-defined and that the map (ν,t)SECT(K)(ν,t)(\nu,t)\mapsto\operatorname{SECT}(K)(\nu,t) is Borel-measurable.

Euler Integration over Definable Functions.

The Euler integration X()𝑑χ\int_{X}(\cdot)d\chi defined in Eq. (3.5) is exclusively tailored for integer-valued functions within CF(X)=Def(X;)\operatorname{CF}(X)=\operatorname{Def}(X;\mathbb{Z}) (e.g., the indicator function 𝟙K\mathbbm{1}_{K} that represents a binary image). Consequently, it cannot accommodate real-valued grayscale functions gg possessing infinitely fine resolutions (e.g., see Figures 1 and 2). Therefore, to provide a rigorous definition of the integrals in Eq. (3.3), we need a framework that extends beyond the scope of Eq. (3.5).

When the integrands are real-valued functions, one needs step-function approximations. We first review the definitions of floor and ceiling functions. For any real number ss, s:=\lfloor s\rfloor:= the greatest integer less than or equal to ss, and s:=\lceil s\rceil:= the least integer greater than or equal to ss. Based on the functional X()𝑑χ\int_{X}(\cdot)d\chi in Eq. (3.5), Baryshnikov and Ghrist (2010) proposed the Euler integration functionals X()dχ\int_{X}(\cdot)\,\lfloor d\chi\rfloor, X()dχ\int_{X}(\cdot)\,\lceil d\chi\rceil, and X()[dχ]\int_{X}(\cdot)\,[d\chi] for real-valued definable function gDef(X;)g\in\operatorname{Def}(X;\mathbb{R}) as follows

(floor version)Xg(x)dχ(x):=limn{1nXng(x)𝑑χ(x)},(ceiling version)Xg(x)dχ(x):=limn{1nXng(x)𝑑χ(x)},(averaged version)Xg(x)[dχ(x)]:=12(Xg(x)dχ(x)+Xg(x)dχ(x)).\displaystyle\begin{aligned} \text{(floor version)}\ \ \ \ \ &\int_{X}g(x)\,\lfloor d\chi(x)\rfloor:=\lim_{n\rightarrow\infty}\left\{\frac{1}{n}\int_{X}\lfloor n\cdot g(x)\rfloor\,d\chi(x)\right\},\\ \text{(ceiling version)}\ \ \ \ \ &\int_{X}g(x)\,\lceil d\chi(x)\rceil:=\lim_{n\rightarrow\infty}\left\{\frac{1}{n}\int_{X}\lceil n\cdot g(x)\rceil\,d\chi(x)\right\},\\ \text{(averaged version)}\ \ \ \ \ &\int_{X}g(x)\,[d\chi(x)]:=\frac{1}{2}\left(\int_{X}g(x)\,\lfloor d\chi(x)\rfloor+\int_{X}g(x)\,\lceil d\chi(x)\rceil\right).\end{aligned} (3.8)

Baryshnikov and Ghrist (2010) (“Lemma 3” therein) showed that the limits in Eq. (3.8) exist; hence, the functionals in Eq. (3.8) are well-defined. The following equation indicates that the functionals defined in Eq. (3.8) are generalizations of X()𝑑χ\int_{X}(\cdot)\,d\chi where

X𝟙K(x)dχ(x)=X𝟙K(x)dχ(x)=X𝟙K(x)[dχ(x)]=X𝟙K(x)𝑑χ(x)=χ(K),\displaystyle\begin{aligned} \int_{X}\mathbbm{1}_{K}(x)\,\lfloor d\chi(x)\rfloor=\int_{X}\mathbbm{1}_{K}(x)\,\lceil d\chi(x)\rceil&=\int_{X}\mathbbm{1}_{K}(x)\,[d\chi(x)]=\int_{X}\mathbbm{1}_{K}(x)\,d\chi(x)=\chi(K),\end{aligned} (3.9)

for all KCS(X)K\in\operatorname{CS}(X). The proof of Eq. (3.9) is in Appendix A.1. However, for general integrands, Xg(x)dχ(x)\int_{X}g(x)\,\lfloor d\chi(x)\rfloor and Xg(x)dχ(x)\int_{X}g(x)\,\lceil d\chi(x)\rceil are not equal (see “Lemma 1” of Baryshnikov and Ghrist (2010)). Neither X()dχ\int_{X}(\cdot)\,\lfloor d\chi\rfloor nor X()dχ\int_{X}(\cdot)\,\lceil d\chi\rceil is linear. What is more, neither of them is homogeneous — they are only positively homogeneous (see Baryshnikov and Ghrist (2010) for details). Fortunately, the “averaged version” X()[dχ]\int_{X}(\cdot)\,[d\chi] is homogeneous. This means that Xλg(x)[dχ(x)]=λXg(x)[dχ(x)]\int_{X}\lambda\cdot g(x)\,[d\chi(x)]=\lambda\cdot\int_{X}g(x)\,[d\chi(x)] for all λ\lambda\in\mathbb{R}, which is implied by “Lemmas 4 and 6” of Baryshnikov and Ghrist (2010).

Within the trio of functionals outlined in Eq. (3.8), our study predominantly employs the averaged version, X()[dχ]\int_{X}(\cdot)[d\chi], to ensure the homogeneity of the proposed ERT. To elucidate, consider the ERT denoted as ERT:gERT(g)\operatorname{ERT}:g\mapsto\operatorname{ERT}(g). Our objective is to maintain homogeneity such that ERT(λg)=λERT(g)\operatorname{ERT}(\lambda\cdot g)=\lambda\cdot\operatorname{ERT}(g) holds universally for any λ\lambda\in\mathbb{R}. This homogeneity property is validated using the averaged version X()[dχ]\int_{X}(\cdot)[d\chi] as detailed in Theorem 3.2 in Section 3.3. This choice not only streamlines the theoretical presentation but also yields computational efficiency in practical applications. For example, suppose we need to rescale (and maybe also white-to-black transition as presented in Figure 2) the grayscale function gg post ERT(g)\operatorname{ERT}(g) computation. In that case, we may directly rescale the computed ERT — meaning that we can compute λERT(g)\lambda\cdot\operatorname{ERT}(g) instead of computing the ERT of the rescaled image λg\lambda\cdot g. Rescaling a computed ERT is much more efficient than computing the ERT of a rescaled image.

3.3 Euler-Randon Transform

In this subsection, we introduce the precise definition of the ERT for grayscale images. Without loss of generality, we postulate that all functions representing grayscale images of interest are defined on the open ball Bd(0,R)B_{\mathbb{R}^{d}}(0,R) with a prespecified radius R<R<\infty (after all, there is no infinitely large image in practice). We model grayscale images/functions by the following definition.

Definition 3.3.

Any element in the function class 𝔇R,d\mathfrak{D}_{R,d} defined as follows is called a grayscale function or grayscale image

𝔇R,d:={gDef(Bd(0,R);):supx|g(x)|< and dist(supp(g),Bd(0,R))>0},\displaystyle\mathfrak{D}_{R,d}:=\left\{g\in\operatorname{Def}\left(B_{\mathbb{R}^{d}}(0,R)\,;\,\mathbb{R}\right)\,:\,\sup_{x}|g(x)|<\infty\text{ and }\operatorname{dist}\Big{(}\operatorname{supp}(g),\partial B_{\mathbb{R}^{d}}(0,R)\Big{)}>0\right\},

where dist(A,B):=inf{ab:aA and bB}\operatorname{dist}(A,B):=\inf\{\|a-b\|:\,a\in A\text{ and }b\in B\} denotes the distance between two sets.

The condition dist(supp(g),Bd(0,R))>0\operatorname{dist}\left(\operatorname{supp}(g),\partial B_{\mathbb{R}^{d}}(0,R)\right)>0 in Definition 3.3 means that the support of every grayscale function is strictly smaller than the domain Bd(0,R)B_{\mathbb{R}^{d}}(0,R). This condition simplifies the proofs of Theorem 3.4 and Eq. (3.16) and can be easily satisfied (e.g., we can always enlarge the radius RR and extend gg by zero to satisfy the condition).

Definition of the ERT.

Using the Euler integration X()[dχ]\int_{X}(\cdot)\,[d\chi] defined in Eq. (3.6), we define the ERT of grayscale functions as follows

ERT:𝔇R,d𝕊d1×[0,T],gERT(g)={ERT(g)(ν,t)}(ν,t)𝕊d1×[0,T],where ERT(g)(ν,t):=Bd(0,R)g(x)R(x,ν,t)[dχ(x)],\displaystyle\begin{aligned} \operatorname{ERT}:\ \ &\mathfrak{D}_{R,d}\rightarrow\mathbb{R}^{\mathbb{S}^{d-1}\times[0,T]},\\ &g\mapsto\operatorname{ERT}(g)=\left\{\operatorname{ERT}(g)(\nu,t)\right\}_{(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]},\\ &\text{where }\ \operatorname{ERT}(g)(\nu,t):=\int_{B_{\mathbb{R}^{d}}(0,R)}g(x)\cdot R(x,\nu,t)\,[d\chi(x)],\end{aligned} (3.10)

and T=2RT=2R. We may also replace the averaged version ()[dχ]\int(\cdot)\,[d\chi] in Eq. (3.10) with the floor version ()dχ\int(\cdot)\,\lfloor d\chi\rfloor or ceiling version ()dχ\int(\cdot)\,\lceil d\chi\rceil; the transforms corresponding to ()dχ\int(\cdot)\,\lfloor d\chi\rfloor and ()dχ\int(\cdot)\,\lceil d\chi\rceil are denoted as ERT(g)\lfloor\operatorname{ERT}\rfloor(g) and ERT(g)\lceil\operatorname{ERT}\rceil(g), respectively. Eq. (3.9) indicates that the ERT, as well as ERT(g)\lfloor\operatorname{ERT}\rfloor(g) and ERT(g)\lceil\operatorname{ERT}\rceil(g), is a generalization of the ECT in the following sense

ERT(𝟙K)=ERT(𝟙K)=ERT(𝟙K)=ECT(K), for all KCS(Bd(0,R)).\displaystyle\operatorname{ERT}(\mathbbm{1}_{K})=\lfloor\operatorname{ERT}\rfloor(\mathbbm{1}_{K})=\lceil\operatorname{ERT}\rceil(\mathbbm{1}_{K})=\operatorname{ECT}(K),\ \ \ \text{ for all }K\in\operatorname{CS}\left(B_{\mathbb{R}^{d}}(0,R)\right). (3.11)

Properties of the ERT.

Here, we present several properties of the ERT that will be utilized in later sections. First, the homogeneity of ()[dχ]\int(\cdot)\,[d\chi] and “Lemma 6” of Baryshnikov and Ghrist (2010) directly imply the following theorem (its proof is omitted)

Theorem 3.2.

Suppose g𝔇R,dg\in\mathfrak{D}_{R,d}. Then, we have the following:

  1. i)

    ERT\lfloor\operatorname{ERT}\rfloor and ERT\lceil\operatorname{ERT}\rceil are positively homogeneous, such that ERT(λg)=λERT(g)\lfloor\operatorname{ERT}\rfloor(\lambda\cdot g)=\lambda\cdot\lfloor\operatorname{ERT}\rfloor(g) and ERT(λg)=λERT(g)\lceil\operatorname{ERT}\rceil(\lambda\cdot g)=\lambda\cdot\lceil\operatorname{ERT}\rceil(g) for all λ>0\lambda>0.

  2. ii)

    ERT\operatorname{ERT} is homogeneous, such that ERT(λg)=λERT(g)\operatorname{ERT}(\lambda\cdot g)=\lambda\cdot\operatorname{ERT}(g) for all λ\lambda\in\mathbb{R}.

Secondly, we have the following theorem on the measurability of the function (ν,t)ERT(g)(ν,t)(\nu,t)\mapsto\operatorname{ERT}(g)(\nu,t).

Theorem 3.3.

Suppose g𝔇R,dg\in\mathfrak{D}_{R,d}. Then, the function (ν,t)ERT(g)(ν,t)(\nu,t)\mapsto\operatorname{ERT}(g)(\nu,t) is Borel-measurable, which holds for ERT\lfloor\operatorname{ERT}\rfloor and ERT\lceil\operatorname{ERT}\rceil as well.

Theorem 3.3 will be a straightforward result of a later Theorem 4.2.

Definition of the SERT.

Theorem 3.3 allows us to define the smooth Euler-Radon transform (SERT) as follows,

SERT:𝔇R,d𝕊d1×[0,T],gSERT(g)={SERT(g)(ν,t)}(ν,t)𝕊d1×[0,T],where SERT(g)(ν,t):=0tERT(g)(ν,τ)𝑑τtT0TERT(g)(ν,τ)𝑑τ.\displaystyle\begin{aligned} \operatorname{SERT}:\ \ &\mathfrak{D}_{R,d}\rightarrow\mathbb{R}^{\mathbb{S}^{d-1}\times[0,T]},\\ &g\mapsto\operatorname{SERT}(g)=\left\{\operatorname{SERT}(g)(\nu,t)\right\}_{(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]},\\ &\text{where }\ \operatorname{SERT}(g)(\nu,t):=\int_{0}^{t}\operatorname{ERT}(g)(\nu,\tau)\,d\tau-\frac{t}{T}\int_{0}^{T}\operatorname{ERT}(g)(\nu,\tau)\,d\tau.\end{aligned} (3.12)

Theorem 3.3 implies that the Lebesgue integrals in Eq. (3.12) are well-defined, and the function (ν,t)SERT(g)(ν,t)(\nu,t)\mapsto\operatorname{SERT}(g)(\nu,t) is Borel-measurable. Transforms SERT\lfloor\operatorname{SERT}\rfloor and SERT\lceil\operatorname{SERT}\rceil are defined via ERT\lfloor\operatorname{ERT}\rfloor and ERT\lceil\operatorname{ERT}\rceil, respectively, in a similar way; they also have the referred properties of SERT\operatorname{SERT}. Furthermore, Eq. (3.11) implies that SERT(𝟙K)=SERT(𝟙K)=SERT(𝟙K)=SECT(K)\operatorname{SERT}(\mathbbm{1}_{K})=\lfloor\operatorname{SERT}\rfloor(\mathbbm{1}_{K})=\lceil\operatorname{SERT}\rceil(\mathbbm{1}_{K})=\operatorname{SECT}(K) for all KCS(Bd(0,R))K\in\operatorname{CS}\left(B_{\mathbb{R}^{d}}(0,R)\right).

Comparing the ERT and SERT.

The following theorem indicates that the SERT preserves all information about the ERT under a regularity condition.

Theorem 3.4.

Suppose g𝔇R,dg\in\mathfrak{D}_{R,d}. If tERT(g)(ν,t)t\mapsto\operatorname{ERT}(g)(\nu,t) is a right continuous function for each fixed ν𝕊d1\nu\in\mathbb{S}^{d-1}, then ERT(g)\operatorname{ERT}(g) can be expressed in terms of SERT(g)\operatorname{SERT}(g); hence, ERT(g)\operatorname{ERT}(g) and SERT(g)\operatorname{SERT}(g) determine each other.

The proof of Theorem 3.4 is given in Appendix A.3. We selected right continuity over left continuity in Theorem 3.4 for four reasons. The first reason is to align with Morse theory (see Remark 2.4 in Milnor (1963), on page 20 therein, for a right-continuity result). The second reason is that tERT(g)(ν,t)t\mapsto\operatorname{ERT}(g)(\nu,t) is not left continuous in general (see Appendix E for examples with right-continuity). The third reason comes from probability theory. When gg is random, the function tERT(g)(ν,t)t\mapsto\operatorname{ERT}(g)(\nu,t) is a stochastic process for each fixed ν𝕊d1\nu\in\mathbb{S}^{d-1}; if tERT(g)(ν,t)t\mapsto\operatorname{ERT}(g)(\nu,t) is right continuous, it automatically becomes a stochastic process whose sample paths are right continuous with left limit (RCLL). Stochastic processes with RCLL sample paths are well studied in probability theory (e.g., see Section 21.4 of Klenke (2020)). The fourth reason is rooted in the following deliberation: if we view the Euler characteristic χ\chi as an analog of a probability measure \mathbb{P}, then ERT(𝟙K)(ν,t)=χ({xK;xν+Rt})\operatorname{ERT}(\mathbbm{1}_{K})(\nu,t)=\chi(\{x\in K;\,x\cdot\nu+R\leq t\}) is an analog of the cumulative distribution function FX(t):=(Xt)F_{X}(t):=\mathbb{P}(X\leq t) of a random variable XX and the function tFX(t)t\mapsto F_{X}(t) is right continuous.

Invertibility of the ERT and SERT.

In practical applications, grayscale images must be discretized into arrays of pixel intensities (or their higher-dimensional counterparts such as voxels) for storage on electronic devices. Consequently, we can represent grayscale images utilized in these contexts as members of the following piecewise-constant function class

𝔇R,dpc:={g𝔇R,d:g takes finitely many values in }.\displaystyle\mathfrak{D}_{R,d}^{pc}:=\left\{g\in\mathfrak{D}_{R,d}:\,g\text{ takes finitely many values in }\mathbb{R}\right\}. (3.13)

For any piecewise-constant grayscale image gg in 𝔇R,dpc\mathfrak{D}_{R,d}^{pc}, the following theorem indicates that ERT(g)\operatorname{ERT}(g) does not lose any information about the image gg and, as a result, justifies the implementation of the ERT in practical applications.

Theorem 3.5.

(i) The restriction of ERT\operatorname{ERT} on 𝔇R,dpc\mathfrak{D}_{R,d}^{pc} is invertible for all dimensions dd. (ii) The restriction of SERT\operatorname{SERT} on {g𝔇R,dpc: the function tERT(g)(ν,t) is right continuous for each fixed ν}\{g\in\mathfrak{D}_{R,d}^{pc}:\text{ the function $t\mapsto\operatorname{ERT}(g)(\nu,t)$ is right continuous for each fixed }\nu\} is invertible for all dimensions dd.

Theorem 3.5 extends the invertibility findings of the ECT and SECT in “Corollary 1” of Ghrist et al. (2018). A comprehensive proof for Theorem 3.5 can be found in Appendix A.4. The piecewise constant constraint in Theorem 3.5 can be slightly relaxed. Namely, the invertibility of the ERT holds for all grayscale functions g𝔇R,dg\in\mathfrak{D}_{R,d} that satisfy the “Fubini condition” (see Appendix C). The invertibility of the ERT on 𝔇R,d\mathfrak{D}_{R,d}, instead of 𝔇R,dpc\mathfrak{D}_{R,d}^{pc}, is still an open problem and is discussed in detail in Appendix C. The main obstacle in solving the open problem is that the “Fubini theorem” in Euler calculus (Ghrist, 2014, Section 3.8) does not hold over real-valued Euler integrands.

3.4 Dual Euler-Radon Transform

As a remark on the invertibility of the ERT presented in Theorem 3.5, we introduce the dual Euler-Radon transform (DERT). Let RR^{\prime} be the dual kernel of R(x,v,t)R(x,v,t) (see Eq. (3.1)) given by

R(ν,t,x)𝟙{(ν,t,x)𝕊d1×[0,T]×Bd(0,R):xνtR}.\displaystyle R^{\prime}(\nu,t,x)\coloneqq\mathbbm{1}\left\{\left(\nu,t,x\right)\in\mathbb{S}^{d-1}\times[0,T]\times B_{\mathbb{R}^{d}}(0,R)\,:\,x\cdot\nu\geq t-R\right\}. (3.14)

We define the DERT as follows

DERT:Def(𝕊d1×[0,T])Bd(0,R),hDERT(h)={DERT(h)(x):=𝕊d1×[0,T]h(ν,t)R(ν,t,x)[dχ(ν,t)]}xBd(0,R).\displaystyle\begin{aligned} \operatorname{DERT}:\ \ &\operatorname{Def}(\mathbb{S}^{d-1}\times[0,T])\rightarrow\mathbb{R}^{B_{\mathbb{R}^{d}}(0,R)},\\ &h\mapsto\operatorname{DERT}(h)=\left\{\operatorname{DERT}(h)(x):=\int_{\mathbb{S}^{d-1}\times[0,T]}h(\nu,t)\cdot R^{\prime}(\nu,t,x)\,[d\chi(\nu,t)]\right\}_{x\in B_{\mathbb{R}^{d}}(0,R)}.\end{aligned} (3.15)

Using the DERT, the following provides an inversion formula for recovering g𝔇R,dpcg\in\mathfrak{D}_{R,d}^{pc} from ERT(g)\operatorname{ERT}(g)

g(x)=1(1)d+1(DERTERT)(g)(x)limξBd(0,R)1(1)d+1(DERTERT)(g)(ξ),\displaystyle g(x^{\prime})=\frac{1}{(-1)^{d+1}}\cdot(\operatorname{DERT}\circ\operatorname{ERT})(g)(x^{\prime})-\lim_{\xi\rightarrow\partial B_{\mathbb{R}^{d}}(0,R)}\frac{1}{(-1)^{d+1}}\cdot(\operatorname{DERT}\circ\operatorname{ERT})(g)(\xi), (3.16)

where limξBd(0,R)\lim_{\xi\rightarrow\partial B_{\mathbb{R}^{d}}(0,R)} means that ξ\xi converges to a point on the sphere Bd(0,R)={xd:x=R}\partial B_{\mathbb{R}^{d}}(0,R)=\{x\in\mathbb{R}^{d}:\,\|x\|=R\}. The details and proof of Eq. (3.16) are given in Appendix C.

4 Existing Frameworks

In Section 3.3, we demonstrated that the introduced ERT and SERT serve as generalizations of the ECT and SECT, respectively. In this section, we discuss the relationship between our proposed framework and other established transforms: the WECT, LECT, SELECT, and the marginal Euler curve (MEC).

LECT and SELECT.

Kirveslahti and Mukherjee (2023) introduced the LECT and SELECT for the analysis of scalar fields (including grayscale images), motivated by the idea of super-level sets implemented in the topology of Gaussian random fields (Adler et al., 2007, Taylor and Worsley, 2008). Using the notations introduced in Section 3.3, the LECT can be represented as follows

LECT:𝔇R,d𝕊d1×[0,T]×[0,1],gLECT(g):={LECT(g)(ν,t,s)}(ν,t,s)𝕊d1×[0,T]×[0,1],where LECT(g)(ν,t,s):=χ({xBd(0,R):xνtR and g(x)=s}).\displaystyle\begin{aligned} \operatorname{LECT}:\ \ &\mathfrak{D}_{R,d}\rightarrow\mathbb{Z}^{\mathbb{S}^{d-1}\times[0,T]\times[0,1]},\\ &g\mapsto\operatorname{LECT}(g):=\left\{\operatorname{LECT}(g)(\nu,t,s)\right\}_{(\nu,t,s)\in\mathbb{S}^{d-1}\times[0,T]\times[0,1]},\\ &\text{where }\ \operatorname{LECT}(g)(\nu,t,s):=\chi\left(\left\{x\in B_{\mathbb{R}^{d}}(0,R):\,x\cdot\nu\leq t-R\text{ and }g(x)=s\right\}\right).\end{aligned} (4.1)

That is, the LECT transforms dd-dimensional grayscale images into integer-valued functions defined on a (d+1)(d+1)-dimensional manifold. Inspired by Morse theory (Milnor, 1963) and considerations of statistical robustness, the SELECT can be represented as follows

SELECT:𝔇R,d𝕊d1×[0,T]×[0,1],gSELECT(g):={SELECT(g)(ν,t,s)}(ν,t,s)𝕊d1×[0,T]×[0,1],where SELECT(g)(ν,t,s):=χ({xBd(0,R):xνtR and g(x)s}).\displaystyle\begin{aligned} \operatorname{SELECT}:\ \ &\mathfrak{D}_{R,d}\rightarrow\mathbb{Z}^{\mathbb{S}^{d-1}\times[0,T]\times[0,1]},\\ &g\mapsto\operatorname{SELECT}(g):=\left\{\operatorname{SELECT}(g)(\nu,t,s)\right\}_{(\nu,t,s)\in\mathbb{S}^{d-1}\times[0,T]\times[0,1]},\\ &\text{where }\ \operatorname{SELECT}(g)(\nu,t,s):=\chi\left(\left\{x\in B_{\mathbb{R}^{d}}(0,R):\,x\cdot\nu\leq t-R\text{ and }g(x)\geq s\right\}\right).\end{aligned} (4.2)

We have the following result on the functions (ν,t,s)LECT(g)(ν,t,s)(\nu,t,s)\mapsto\operatorname{LECT}(g)(\nu,t,s) and (ν,t,s)SELECT(g)(ν,t,s)(\nu,t,s)\mapsto\operatorname{SELECT}(g)(\nu,t,s), which is an analog of Theorem 3.1.

Theorem 4.1.

Suppose g𝔇R,dg\in\mathfrak{D}_{R,d}. Then, we have the following:

  1. i)

    LECT(g)(ν,t,s)\operatorname{LECT}(g)(\nu,t,s) and SELECT(g)(ν,t,s)\operatorname{SELECT}(g)(\nu,t,s) take only finitely many values as (ν,t,s)(\nu,t,s) runs through 𝕊d1×[0,T]×[0,1]\mathbb{S}^{d-1}\times[0,T]\times[0,1]. In addition, for each integer zz\in\mathbb{Z}, the sets {(ν,t)𝕊d1×[0,T]×[0,1]:LECT(g)(ν,t,s)=z}\{(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]\times[0,1]:\,\operatorname{LECT}(g)(\nu,t,s)=z\} and {(ν,t)𝕊d1×[0,T]×[0,1]:SELECT(g)(ν,t,s)=z}\{(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]\times[0,1]:\,\operatorname{SELECT}(g)(\nu,t,s)=z\} are definable; hence, the functions (ν,t,s)LECT(g)(ν,t,s)(\nu,t,s)\mapsto\operatorname{LECT}(g)(\nu,t,s) and (ν,t,s)SELECT(g)(ν,t,s)(\nu,t,s)\mapsto\operatorname{SELECT}(g)(\nu,t,s) are definable.

  2. ii)

    The functions (ν,t,s)LECT(g)(ν,t,s)(\nu,t,s)\mapsto\operatorname{LECT}(g)(\nu,t,s) and (ν,t,s)SELECT(g)(ν,t,s)(\nu,t,s)\mapsto\operatorname{SELECT}(g)(\nu,t,s) are Borel-measurable.

Since the proof of Theorem 4.1 is similar to that of Theorem 3.1, we omit it.

MEC.

Kirveslahti and Mukherjee (2023) also proposed the marginal Euler curve (MEC) Mνg(t)M_{\nu}^{g}(t) which is defined via the SELECT as follows

Mνg(t):=SELECT(g)(ν,t,s)𝑑s, for all (ν,t)𝕊d1×[0,T].\displaystyle M_{\nu}^{g}(t):=\int_{\mathbb{R}}\operatorname{SELECT}(g)(\nu,t,s)\,ds,\ \ \ \text{ for all }(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]. (4.3)

Theorem 4.1 guarantees that the Lebesgue integral in Eq. (4.3) is well-defined.

Relationship with the ERT.

The subsequent theorem establishes a connection between our proposed ERT and the LECT and SELECT — thereby also linking the ERT to both the MEC and WECT.

Theorem 4.2.

Suppose g𝔇R,dg\in\mathfrak{D}_{R,d}. Then, we have the following Euler integration representation of the ERT\operatorname{ERT} via the LECT\operatorname{LECT}

ERT(g)(ν,t)=sLECT(g)(ν,t,s)[dχ(s)],for all (ν,t)𝕊d1×[0,T].\displaystyle\operatorname{ERT}(g)(\nu,t)=\int_{\mathbb{R}}s\cdot\operatorname{LECT}(g)(\nu,t,s)\,[d\chi(s)],\ \ \ \text{for all }(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]. (4.4)

In addition, we have the following Lebesgue integration representations of the ERT\lfloor\operatorname{ERT}\rfloor and ERT\lceil\operatorname{ERT}\rceil via the LECT\operatorname{LECT} and SELECT\operatorname{SELECT}

ERT(g)(ν,t)=0SELECT(g)(ν,t,s)SELECT(g)(ν,t,s)+LECT(g)(ν,t,s)ds,ERT(g)(ν,t)=0SELECT(g)(ν,t,s)LECT(g)(ν,t,s)SELECT(g)(ν,t,s)ds.\displaystyle\begin{aligned} &\lfloor\operatorname{ERT}\rfloor(g)(\nu,t)=\int_{0}^{\infty}\operatorname{SELECT}(g)(\nu,t,s)-\operatorname{SELECT}(-g)(\nu,t,s)+\operatorname{LECT}(-g)(\nu,t,s)\,ds,\\ &\lceil\operatorname{ERT}\rceil(g)(\nu,t)=\int_{0}^{\infty}\operatorname{SELECT}(g)(\nu,t,s)-\operatorname{LECT}(g)(\nu,t,s)-\operatorname{SELECT}(-g)(\nu,t,s)\,ds.\end{aligned} (4.5)

In particular, we have the following Lebesgue integration representation of the ERT\operatorname{ERT}

ERT(g)(ν,t)=0{SELECT(g)(ν,t,s)SELECT(g)(ν,t,s)}𝑑s+120{LECT(g)(ν,t,s)LECT(g)(ν,t,s)}𝑑s.\displaystyle\begin{aligned} \operatorname{ERT}(g)(\nu,t)=&\int_{0}^{\infty}\left\{\operatorname{SELECT}(g)(\nu,t,s)-\operatorname{SELECT}(-g)(\nu,t,s)\right\}\,ds\\ &+\frac{1}{2}\int_{0}^{\infty}\left\{\operatorname{LECT}(-g)(\nu,t,s)-\operatorname{LECT}(g)(\nu,t,s)\right\}\,ds.\end{aligned} (4.6)

The proof of Theorem 4.2 is in Appendix A.5. Theorem 4.1 guarantees that the Euler integral and Lebesgue integrals in Eqs. (4.4)-(4.6) are well-defined. The representations in Theorem 4.2 will be used to compute the ERT in the next section. Furthermore, with the Fubini theorem, the Lebesgue integration representations in Eq. (4.5) and Eq. (4.6) imply the result in Theorem 3.3.

The representations in Theorem 4.2 connect to our proposed ERT to the MEC and WECT in the following ways:

  1. i)

    If g(x)0g(x)\geq 0 for all xx, then we have {xBd(0,R):xνtR and g(x)s}=\{x\in B_{\mathbb{R}^{d}}(0,R):\,x\cdot\nu\leq t-R\text{ and }-g(x)\geq s\}=\emptyset for all s>0s>0. Hence, ERT(g)(ν,t)=0SELECT(g)(ν,t,s)𝑑s=Mνg(t)\lfloor\operatorname{ERT}\rfloor(g)(\nu,t)=\int_{0}^{\infty}\operatorname{SELECT}(g)(\nu,t,s)\,ds=M_{\nu}^{g}(t) — meaning that the ERT\lfloor\operatorname{ERT}\rfloor is equal to the MEC specified in Eq. (4.3) for nonnegative grayscale functions.

  2. ii)

    If gg is nonnegative and LECT(g)(ν,t,s)=0\operatorname{LECT}(g)(\nu,t,s)=0 for almost every ss with respect to the Lebesgue measure, Eq. (4.6) implies that ERT(g)(ν,t)=Mνg(t)\operatorname{ERT}(g)(\nu,t)=M_{\nu}^{g}(t). From this viewpoint, we can easily show that the WECT proposed in Jiang et al. (2020) is a special case of our proposed ERT. Jiang et al. (2020) models each grayscale image as a “weighted simplicial complex” in the form g=σ𝔖aσ𝟙σg=\sum_{\sigma\in\mathfrak{S}}a_{\sigma}\cdot\mathbbm{1}_{\sigma} for a finite sum (each σ\sigma is a simplex, aσa_{\sigma}\in\mathbb{N}, and 𝔖\mathfrak{S} is a finite collection of simplexes), assuming the “consistency condition” aσ=max{aτ:τ𝔖 and σ is a face of τ}a_{\sigma}=\max\{a_{\tau}:\tau\in\mathfrak{S}\text{ and $\sigma$ is a face of $\tau$}\} for all σ𝔖\sigma\in\mathfrak{S}. The WECT of gg is defined as follows

    WECT(g)(ν,t):=d=0max{dim(σ):σ𝔖}(1)d(σ𝔖,dim(σ)=d,and σ{xd:xνtR}aσ).\displaystyle\operatorname{WECT}(g)(\nu,t):=\sum_{d=0}^{\max\{\operatorname{dim}(\sigma):\,\sigma\in\mathfrak{S}\}}(-1)^{d}\cdot\left(\sum_{\sigma\in\mathfrak{S},\,\operatorname{dim}(\sigma)=d,\,\text{and }\sigma\subseteq\{x\in\mathbb{R}^{d}:x\cdot\nu\leq t-R\}}a_{\sigma}\right).

    Kirveslahti and Mukherjee (2023) show that the WECT of gg coincides with the MEC of gg where WECT(g)(ν,t)=Mνg(t)\operatorname{WECT}(g)(\nu,t)=M_{\nu}^{g}(t). Furthermore, the definition of the LECT in Eq. (4.1) indicates that LECT(g)(ν,t,s)=0\operatorname{LECT}(g)(\nu,t,s)=0 unless s=aσs=a_{\sigma}\in\mathbb{N} for some simplex σ\sigma. Therefore, 01LECT(h)(ν,t,s)𝑑s=0\int_{0}^{1}\operatorname{LECT}(h)(\nu,t,s)\,ds=0. Hence, we have WECT(g)(ν,t)=Mνg(t)=ERT(g)(ν,t)\operatorname{WECT}(g)(\nu,t)=M_{\nu}^{g}(t)=\operatorname{ERT}(g)(\nu,t) for all (ν,t)𝕊d1×[0,T](\nu,t)\in\mathbb{S}^{d-1}\times[0,T], if g=σaσ𝟙σg=\sum_{\sigma}a_{\sigma}\cdot\mathbbm{1}_{\sigma} with aσa_{\sigma}\in\mathbb{N}.

Dissimilarities between Grayscale Images.

Lastly, we may use the ERT and SERT to measure the (dis-)similarity between two grayscale functions. Inspired by the dissimilarity quantities defined in the literature (Turner et al., 2014, Crawford et al., 2020, Meng et al., 2022, Marsh and Beers, 2023, Wang et al., 2023), we introduce the following (semi-)distances between grayscale functions g1,g2𝔇R,dg_{1},g_{2}\in\mathfrak{D}_{R,d}

distp,qERT(g1,g2):=ERT(g1)ERT(g2)LνqLtp,distp,qSERT(g1,g2):=SERT(g1)SERT(g2)LνqLtp,\displaystyle\begin{aligned} &\operatorname{dist}^{\operatorname{ERT}}_{p,q}(g_{1},g_{2}):=\left\|\,\operatorname{ERT}(g_{1})-\operatorname{ERT}(g_{2})\,\right\|_{L_{\nu}^{q}L_{t}^{p}},\\ &\operatorname{dist}^{\operatorname{SERT}}_{p,q}(g_{1},g_{2}):=\left\|\,\operatorname{SERT}(g_{1})-\operatorname{SERT}(g_{2})\,\right\|_{L_{\nu}^{q}L_{t}^{p}},\end{aligned} (4.7)

where 1p,q1\leq p,q\leq\infty, and the LνqLtpL_{\nu}^{q}L_{t}^{p}-norm fLνqLtp\|f\|_{L_{\nu}^{q}L_{t}^{p}} of a function (ν,t)f(ν,t)(\nu,t)\mapsto f(\nu,t) is defined in a two-step process. Initially, we take the LpL^{p}-norm f(ν,)Ltp\|f(\nu,\cdot)\|_{L_{t}^{p}} with respect to t[0,T]t\in[0,T]; then we next take the LqL^{q}-norm with respect to ν𝕊d1\nu\in\mathbb{S}^{d-1} (regarding the spherical measure dνd\nu on 𝕊d1\mathbb{S}^{d-1}). Theorem 3.3 implies that the LνqLtpL_{\nu}^{q}L_{t}^{p}-norm used in Eq. (4.7) is well-defined. Theorem 3.5 indicates that distp,qERT\operatorname{dist}^{\operatorname{ERT}}_{p,q} is a distance, rather than just a semi-distance, on the space 𝔇R,dpc\mathfrak{D}_{R,d}^{pc} defined in Eq. (3.13). Let K1K_{1} and K2K_{2} belong to CS(Bd(0,R))\operatorname{CS}(B_{\mathbb{R}^{d}}(0,R)). Then, we have the following:

  • distp,pSERT(𝟙K1,𝟙K2)\operatorname{dist}^{\operatorname{SERT}}_{p,p}(\mathbbm{1}_{K_{1}},\mathbbm{1}_{K_{2}}) agrees with the SECT distance stated in Crawford et al. (2020);

  • distp,pERT(𝟙K1,𝟙K2)\operatorname{dist}^{\operatorname{ERT}}_{p,p}(\mathbbm{1}_{K_{1}},\mathbbm{1}_{K_{2}}) agrees with the ECT distance stated in Curry et al. (2022);

  • dist2,ERT(𝟙K1,𝟙K2)\operatorname{dist}^{\operatorname{ERT}}_{2,\infty}(\mathbbm{1}_{K_{1}},\mathbbm{1}_{K_{2}}) agrees with the distance stated in Meng et al. (2022), which underpins the generation of the Borel algebra implemented in that work;

  • dist1,ERT(𝟙K1,𝟙K2)\operatorname{dist}^{\operatorname{ERT}}_{1,\infty}(\mathbbm{1}_{K_{1}},\mathbbm{1}_{K_{2}}) aligns with the distance stated in Marsh and Beers (2023) for the examination of the stability of the ECT;

  • dist2,SERT(𝟙K1,𝟙K2)\operatorname{dist}^{\operatorname{SERT}}_{2,\infty}(\mathbbm{1}_{K_{1}},\mathbbm{1}_{K_{2}}) agrees with the distance utilized in Wang et al. (2023) for permutation test.

Futhermore, the distances defined in Eq. (4.7) are the analogs of the following distances proposed in Kirveslahti and Mukherjee (2023)

distpSELECT(g1,g2):=(𝕊d10T|SELECT(g1)(ν,t,s)SELECT(g2)(ν,t,s)|p𝑑s𝑑t𝑑ν)1/p,distpMEC(g1,g2):=(𝕊d10T|Mνg1(t)Mνg2(t)|p𝑑t𝑑ν)1/p,\displaystyle\begin{aligned} &\operatorname{dist}^{\operatorname{SELECT}}_{p}(g_{1},g_{2}):=\left(\int_{\mathbb{S}^{d-1}}\int_{0}^{T}\int_{\mathbb{R}}\left|\,\operatorname{SELECT}(g_{1})(\nu,t,s)-\operatorname{SELECT}(g_{2})(\nu,t,s)\right|^{p}\,ds\,dt\,d\nu\right)^{1/p},\\ &\operatorname{dist}^{\operatorname{MEC}}_{p}(g_{1},g_{2}):=\left(\int_{\mathbb{S}^{d-1}}\int_{0}^{T}\left|\,M_{\nu}^{g_{1}}(t)-M_{\nu}^{g_{2}}(t)\,\right|^{p}\,dt\,d\nu\right)^{1/p},\end{aligned} (4.8)

for 1p<1\leq p<\infty. Theorem 4.1 implies that the Lebesgue integrals implemented in the LpL^{p}-norms in Eq. (4.8) are well-defined. We will compare the performance of all the referred distances from a statistical inference perspective in Section 8.

Refer to caption
Figure 3: The torus in panel (a) presents the level set {x3:g(x)=0.0834}\{x\in\mathbb{R}^{3}:\,g(x)=0.0834\}, where gg is defined in Eq. (5.1). Panel (b) presents curve tSERT(g)(ν,t)t\mapsto\operatorname{SERT}(g)(\nu,t) with ν=(0,1,0)\nu=(0,1,0)^{\intercal}. Panel (c) presents curve tSERT(g)(ν,t)t\mapsto\operatorname{SERT}(g)(\nu,t) with ν=(0,0,1)\nu=(0,0,1)^{\intercal}. Panels (d1, d2, e1, e2, f1, f2) present the surfaces (θ,t)SERT(g)(ν,t)(\theta,t)\mapsto\operatorname{SERT}(g)(\nu,t) with different directions ν\nu as in Eq. (5.2). Panels (d1, d2) present the same surface from different angles. Similar for panels (e1, e2) and (f1, f2).

5 A Proof-of-Concept Example

Our proposed SERT plays an important role in the statistical inference discussed in Section 5, given its ability to convert grayscale images into functional data. Before delving into the SERT-based statistical inference, in this section, we illustrate the SERT via a proof-of-concept example. Here, we focus on dimensionality d=3d=3 and the following scalar field

g(x):={(34(x12+x22)12)2+3x324}𝟙{1x1,x2,x31}, where x=(x1,x2,x3).\displaystyle g(x):=\left\{\left(\sqrt{\frac{3}{4}\left(x_{1}^{2}+x_{2}^{2}\right)}-\frac{1}{2}\right)^{2}+\frac{3x_{3}^{2}}{4}\right\}\cdot\mathbbm{1}_{\{-1\leq x_{1},\,x_{2},\,x_{3}\leq 1\}},\ \ \ \text{ where }x=(x_{1},x_{2},x_{3})^{\intercal}. (5.1)

The gg in Eq. (5.1) is a grayscale function (in the sense of Definition 3.3), where g𝔇R,dg\in\mathfrak{D}_{R,d} with d=3d=3 and R=2R=2. A level set {x3:g(x)=0.0834}\{x\in\mathbb{R}^{3}:\,g(x)=0.0834\} of gg is presented in Figure 3 (the level 0.0834 was specifically chosen to make the visualization of the level set look like a torus). We computed the ERT of the gg in a sequential manner. We first compute the LECT and SELECT using the MATLAB isosurface procedure. Next, we compute the ERT utilizing the Lebesgue integration representation in Eq. (4.6). Finally, the SERT is derived from the ERT via a standard numerical integration method.

The SERT of the 33-dimensional grayscale image gg defined in Eq. (5.1) is a scalar field over the 3-dimensional product manifold 𝕊2×[0,4]\mathbb{S}^{2}\times[0,4], where (ν,t)SERT(g)(ν,t)(\nu,t)\mapsto\operatorname{SERT}(g)(\nu,t) with ν=(ν1,ν2,ν3)𝕊2\nu=(\nu_{1},\nu_{2},\nu_{3})^{\intercal}\in\mathbb{S}^{2}. We visualize the following segments of the scalar field

(θ,t)SERT(g)(ν,t)withν=(cosθ,sinθ, 0)in Figure 3(d1, d2);(θ,t)SERT(g)(ν,t)withν=(cosθ, 0,sinθ)in see Figure 3(e1, e2);(θ,t)SERT(g)(ν,t)withν=(0,cosθ,sinθ)in Figure 3(f1, f2);\displaystyle\begin{aligned} &(\theta,t)\mapsto\operatorname{SERT}(g)(\nu,t)\ \text{with}\ \nu=(\cos\theta,\,\sin\theta,\,0)^{\intercal}\ \text{in Figure \ref{fig: Merged_document}(d1, d2)};\\ &(\theta,t)\mapsto\operatorname{SERT}(g)(\nu,t)\ \text{with}\ \nu=(\cos\theta,\,0,\,\sin\theta)^{\intercal}\ \text{in see Figure \ref{fig: Merged_document}(e1, e2)};\\ &(\theta,t)\mapsto\operatorname{SERT}(g)(\nu,t)\ \text{with}\ \nu=(0,\,\cos\theta,\,\sin\theta)^{\intercal}\ \text{in Figure \ref{fig: Merged_document}(f1, f2)};\end{aligned} (5.2)

where θ[0,2π]\theta\in[0,2\pi] and t[0,4]t\in[0,4]. The maps in Eq. (5.2) are scalar fields over [0,2π]×[0,4][0,2\pi]\times[0,4] and are presented in the second and third rows of Figure 3.

In Figure 3, the surfaces corresponding to (θ,t)SERT(g)(ν,t)(\theta,t)\mapsto\operatorname{SERT}(g)(\nu,t) consistently exhibit an approximate periodicity of π/2\pi/2 in the variable θ\theta (indicated by the axis label “Direction ν\nu”). This approximate periodicity is fundamentally derived from the “box-shape” indicator function 𝟙{1x1,,x2,,x31}\mathbbm{1}_{\{-1\leq x_{1},,x_{2},,x_{3}\leq 1\}} in Eq. (5.1). The surfaces depicted in Figures 3(e1, e2) and 3(f1, f2) are indistinguishable, which is a characteristic attributed to the x1x_{1}-x2x_{2} symmetry of the grayscale function gg defined in Eq. (5.1). The surfaces presented in Figure 3(d1, d2) exhibit subtle distinctions from those in 3(e1, e2, f1, f2). The curves and surfaces in Figure 3, as well as the scalar field (ν,t)SERT(g)(ν,t)(\nu,t)\mapsto\operatorname{SERT}(g)(\nu,t), suggest a potential association of the SERT with manifold learning (Yue et al., 2016, Dunson and Wu, 2021, Meng and Eloyan, 2021, Li et al., 2022).

Refer to caption
Figure 4: The left panel presents the surface of the function defined in Eq. (6.7) for θ[π/2,π/2]\theta\in[-\pi/2,\,\pi/2] and λ[3/2, 3/2]\lambda\in[-3/2,\,3/2]. Notably, we have also considered the negative scaling parameters λ<0\lambda<0, which correspond to the “white-to-black” transition. The right panel presents the same function with θ[0.45, 0.45]\theta\in[-0.45,\,0.45] and λ[0.9, 1.1]\lambda\in[0.9,\,1.1]. Both surfaces indicate that the function in Eq. (6.7) is minimized when λ=1\lambda=1 and θ=0\theta=0.

6 Alignment of Images and Invariance of the ERT

In various applications, images are typically aligned before analysis (e.g., Bankman, 2008). Wang et al. (2021) utilized an ECT-based approach to align shapes (equivalently, binary images) through orthogonal actions. The primary aim of the ECT-based strategy is to lessen the difference between a pair of shapes resulting from the orthogonal movements. An in-depth exposition of the ECT-based method, along with a proof-of-concept example, can be found in Supplementary Section 4 of Wang et al. (2021). Analogous correspondence free alignment techniques are also needed for grayscale images. Kirveslahti and Mukherjee (2023) examined the invariance of the LECT with respect to orthogonal alignments and introduced an LECT/SELECT-based alignment method tailored for grayscale images. Motivated by the studies in both Wang et al. (2021) and Kirveslahti and Mukherjee (2023), in this section, we introduce an ERT-based alignment approach. This approach will serve as a preprocessing step for the ERT-based statistical inference detailed in Section 7.

Let gg^{\diamondsuit} represent a reference image designated as a template. For any source image gg under consideration, we consider a collection of transforms, denoted by 𝒯\mathscr{T}, encompassing all transformations TT of interest. From 𝒯\mathscr{T}, we select the transformation T𝒯T^{\blacklozenge}\in\mathscr{T} such that the transformed image T(g)T^{\blacklozenge}(g) is the most similar to gg^{\diamondsuit} based on a specified dissimilarity metric. Subsequent analysis is then conducted on the aligned image T(g)T^{\blacklozenge}(g) instead of the original source image gg. A potential choice for the dissimilarity metric can be one of the distances defined in Eq.(4.7), denoted as dist\operatorname{dist}. This can be summarized as follows

T:=argminT𝒯dist(g,T(g)).\displaystyle T^{\blacklozenge}:=\operatorname*{arg\,min}_{T\in\mathscr{T}}\,\operatorname{dist}\Big{(}g^{\diamondsuit},T(g)\Big{)}. (6.1)

Beyond the orthogonal actions implemented in Wang et al. (2021) and Kirveslahti and Mukherjee (2023), we also consider scaling transforms of grayscale images, including the “white-to-black” transition. In this paper, we focus on the following collection of transforms

𝒯={T:(Tg)(x)=λg(𝑨1x), where λ and 𝑨O(d)},\displaystyle\mathscr{T}=\left\{T:\,(Tg)(x)=\lambda\cdot g(\boldsymbol{A}^{-1}x),\text{ where }\lambda\in\mathbb{R}\text{ and }\boldsymbol{A}\in\operatorname{O}(d)\right\}, (6.2)

where O(d)\operatorname{O}(d) denotes the orthogonal group in dimension dd, and it contains all the rotations and reflections in d\mathbb{R}^{d}. For ease of notation, we define the dual 𝑨\boldsymbol{A}_{*} of 𝑨\boldsymbol{A} by (𝑨g)(x):=g(𝑨1x)(\boldsymbol{A}_{*}g)(x):=g\left(\boldsymbol{A}^{-1}x\right). The goal of employing minimization across the collection in Eq. (6.2) is to mitigate disparities between two images arising from orthogonal movements and variations in pixel intensity scales.

A challenge in using the criterion presented in Eq. (6.1), combined with any of the dissimilarity metrics in Eq.(4.7), is the computational cost of obtaining ERT(T(g))\operatorname{ERT}(T(g)) for all T𝒯T\in\mathscr{T}. More specifically, the computation of ERT(λ𝑨g)\operatorname{ERT}(\lambda\cdot\boldsymbol{A}_{*}g) is required for every λ\lambda\in\mathbb{R} and 𝑨O(d)\boldsymbol{A}\in\operatorname{O}(d). Fortunately, the homogeneity of the ERT (see Theorem 3.2) implies ERT(λ𝑨g)=λERT(𝑨g)\operatorname{ERT}(\lambda\cdot\boldsymbol{A}_{*}g)=\lambda\cdot\operatorname{ERT}(\boldsymbol{A}_{*}g) for all λ\lambda\in\mathbb{R}. Therefore, we can simply calculate ERT(𝑨g)\operatorname{ERT}(\boldsymbol{A}_{*}g) and get the ERT(λ𝑨g)\operatorname{ERT}(\lambda\cdot\boldsymbol{A}_{*}g) for all λ\lambda\in\mathbb{R} by simply scaling the computed ERT(𝑨g)\operatorname{ERT}(\boldsymbol{A}_{*}g). Similarly, if we further have the “𝑨\boldsymbol{A}_{*}-homogeneity”, where “ERT(𝑨g)=𝑨ERT(g)\operatorname{ERT}(\boldsymbol{A}_{*}g)=\boldsymbol{A}_{*}\operatorname{ERT}(g),” the amount of computation required in Eq. (6.1) is further reduced. The “𝑨\boldsymbol{A}_{*}-homogeneity” is true and accurately presented by the following result.

Theorem 6.1.

For any g𝔇R,dg\in\mathfrak{D}_{R,d} and 𝐀O(d)\boldsymbol{A}\in\operatorname{O}(d), we have ECT(𝐀g)(ν,t)=ECT(g)(𝐀1ν,t)=:𝐀ECT(g)(ν,t)\operatorname{ECT}(\boldsymbol{A}_{*}g)(\nu,t)=\operatorname{ECT}(g)(\boldsymbol{A}^{-1}\nu,t)=:\boldsymbol{A}_{*}\operatorname{ECT}(g)(\nu,t) for all ν𝕊d1\nu\in\mathbb{S}^{d-1} and t[0,T]t\in[0,T].

Theorem 6.1 is a direct result of “Proposition 2.18” of Kirveslahti and Mukherjee (2023) via Eq. (4.6); hence, we omit its proof. Combining the scalar homogeneity and 𝑨\boldsymbol{A}_{*}-homogeneity, we have the following

ECT(λ𝑨g)(ν,t)=λECT(g)(𝑨1ν,t)=:λ𝑨ECT(g)(ν,t),\displaystyle\operatorname{ECT}(\lambda\cdot\boldsymbol{A}_{*}g)(\nu,t)=\lambda\cdot\operatorname{ECT}(g)(\boldsymbol{A}^{-1}\nu,t)=:\lambda\cdot\boldsymbol{A}_{*}\operatorname{ECT}(g)(\nu,t), (6.3)

for all g𝔇R,dg\in\mathfrak{D}_{R,d}, ν𝕊d1\nu\in\mathbb{S}^{d-1}, and t[0,T]t\in[0,T]. Using dist=distp,qERT\operatorname{dist}=\operatorname{dist}^{\operatorname{ERT}}_{p,q} as an example (see Eq. (4.7) for the definition of distp,qERT\operatorname{dist}^{\operatorname{ERT}}_{p,q}), an optimal transformation across the collection in Eq. (6.2) can be represented via Eq. (6.3) as follows

argminλ and 𝑨O(d)ERT(g)λ𝑨ERT(g)LνqLtp.\displaystyle\operatorname*{arg\,min}_{\lambda\in\mathbb{R}\text{ and }\boldsymbol{A}\in\operatorname{O}(d)}\,\left\|\,\operatorname{ERT}(g)-\lambda\cdot\boldsymbol{A}_{*}\operatorname{ERT}(g)\,\right\|_{L_{\nu}^{q}L_{t}^{p}}. (6.4)

In Eq. (6.4), we only need to compute the ERT of gg^{\diamondsuit} and gg instead of all the λ𝑨g\lambda\cdot\boldsymbol{A}_{*}g for all λ\lambda\in\mathbb{R} and 𝑨O(d)\boldsymbol{A}\in\operatorname{O}(d). Notably, when both gg^{\diamondsuit} and gg are indicator functions representing constructible sets, the alignment method detailed in Eq. (6.4) is equivalent to the ECT-based alignment method proposed in Wang et al. (2021).

To illustrate the performance of the alignment approach described in Eq. (6.4), we present a proof-of-concept example. Our benchmark criterion for assessing the efficacy of the alignment approach is its ability to eliminate differences between images arising from rotation and scaling. Let gg denote the grayscale function defined in Eq. (5.1). We will study the scaled and rotated version λ𝑨θ,g\lambda\cdot\boldsymbol{A}_{\theta,*}g of gg, where 𝑨θ,\boldsymbol{A}_{\theta,*} denotes the dual of the rotation matrix 𝑨θ\boldsymbol{A}_{\theta} defined as

𝑨θ:=(cosθ0sinθ010sinθ0cosθ),\displaystyle\boldsymbol{A}_{\theta}:=\begin{pmatrix}\cos\theta&0&\sin\theta\\ 0&1&0\\ -\sin\theta&0&\cos\theta\end{pmatrix}, (6.5)

which represents rotation on the (x1,x3)(x_{1},x_{3})-plane by angle θ\theta. Obviously, the differences between the source image λ𝑨θ,g\lambda\cdot\boldsymbol{A}_{\theta,*}g and the reference image gg vanish if and only if θ=0\theta=0 and λ=1\lambda=1 (i.e., no rotation or scaling). Hence, in this example, our proposed alignment approach is effective if

(0,1)=argmin(θ,λ){ERT(g)λ𝑨θ,ERT(g)Lν2Lt2}.\displaystyle(0,1)=\operatorname*{arg\,min}_{(\theta,\lambda)}\left\{\left\|\,\operatorname{ERT}(g)-\lambda\cdot\boldsymbol{A}_{\theta,*}\operatorname{ERT}(g)\,\right\|_{L_{\nu}^{2}L_{t}^{2}}\right\}. (6.6)

To validate Eq. (6.6), we analyze the surface of the following function of (θ,λ)(\theta,\lambda)

(θ,λ)ERT(g)λ𝑨θ,ERT(g)Lν2Lt2,\displaystyle(\theta,\lambda)\mapsto\left\|\,\operatorname{ERT}(g)-\lambda\cdot\boldsymbol{A}_{\theta,*}\operatorname{ERT}(g)\,\right\|_{L_{\nu}^{2}L_{t}^{2}}, (6.7)

which is presented in Figure 4. The surfaces presented in Figure 4 confirm the minimization in Eq. (6.6) — the minimum point of the surface corresponds to the coordinates (θ,λ)=(0,1)(\theta,\lambda)=(0,1), implying that our alignment approach delineated in Eq. (6.4) is effective.

To illustrate the performance of the proposed alignment approach in our proof-of-concept example, we consider the gg defined in Eq. (5.1) as the reference image, as depicted in Figure 5(a). Next, λ𝐀θ,g(x)\lambda\cdot\mathbf{A}_{\theta,*}g(x) with parameters (θ,λ)=(π/6, 1/5)(\theta,\lambda)=(\pi/6,\,1/5) is selected as the source image, as illustrated in Figure 5(b). The source image is a rotated and scaled version of the reference image. Aligning the source image with respect to the reference yields the post-aligned source image shown in Figure 5(c). The reference image in Figure 5(a) and post-aligned source image in Figure 5(c) are nearly congruent. This similarity underscores the efficacy of the proposed alignment methodology in mitigating discrepancies attributable to rotations and scaling.

Refer to caption
Figure 5: In panel (a), the level set {x3:g(x)=0.0834}\{x\in\mathbb{R}^{3}:g(x)=0.0834\} is illustrated with gg being defined as in Eq. (5.1). Panel (b) depicts the level set {x3:λ𝐀θ,g(x)=0.0834}\{x\in\mathbb{R}^{3}:\lambda\cdot\mathbf{A}_{\theta,*}g(x)=0.0834\} for parameters λ=1/5\lambda=1/5 and θ=π/6\theta=\pi/6. Here, we consider gg to be the reference image and λ𝐀θ,g(x)\lambda\cdot\mathbf{A}_{\theta,*}g(x) with (θ,λ)=(π/6, 1/5)(\theta,\lambda)=(\pi/6,\,1/5) representing the source image that is under investigation. Panels (a) and (b) show that the reference and source images are drastically different. Employing the methodology from Eq. (6.4), we align λ𝐀θ,g(x)\lambda\cdot\mathbf{A}_{\theta,*}g(x) which results in the aligned image denoted as g~\tilde{g}. Panel (c) showcases the level set {x3:g~(x)=0.0834}\{x\in\mathbb{R}^{3}:\tilde{g}(x)=0.0834\}, closely mirroring the level set displayed in panel (a).

7 Statistical Inference of Grayscale Functions

We now provide our second major contribution — approaches for statistical inference on grayscale images. The grayscale functions presented in the previous sections of this work have been viewed as deterministic. In this section, we now view grayscale functions as random where we assume that they are generated from underlying distributions satisfying some regularity conditions (see Assumptions 2 and 3 in Section 7.1). Let Ω\Omega denote the collection of grayscale functions of interest, and assume that Ω\Omega is equipped with a σ\sigma-field \mathscr{F}. Next, suppose that there are two underlying grayscale function-generating distributions (probability measures), (1)\mathbb{P}^{(1)} and (2)\mathbb{P}^{(2)}, defined on the sample space (Ω,)(\Omega,\mathscr{F}). Our data are two collections of random grayscale functions sampled from the two distributions: {gi(1)}i=1niid(1)\{g_{i}^{(1)}\}_{i=1}^{n}\overset{iid}{\sim}\mathbb{P}^{(1)} and {gi(2)}i=1niid(2)\{g_{i}^{(2)}\}_{i=1}^{n}\overset{iid}{\sim}\mathbb{P}^{(2)}. Here, we provide approaches to testing if the two collections of functions are significantly different. More precisely, we propose methods of testing the following hypotheses

H0:(1)=(2)vs.H1:(1)(2).\displaystyle H_{0}^{*}:\ \ \mathbb{P}^{(1)}=\mathbb{P}^{(2)}\ \ \ \text{vs.}\ \ \ H_{1}^{*}:\ \ \mathbb{P}^{(1)}\neq\mathbb{P}^{(2)}. (7.1)

Without loss of generality, hereafter, we assume that the grayscale images {gi(1)}i=1n\{g_{i}^{(1)}\}_{i=1}^{n} and {gi(2)}i=1n\{g_{i}^{(2)}\}_{i=1}^{n} have been aligned using the ERT-based alignment method proposed in Section 6.

7.1 χ2\chi^{2}-test via the Karhunen–Loève Expansion

In this subsection, we propose χ2\chi^{2}-based hypothesis testing procedures via the Karhunen–Loève expansion and central limit theorem (CLT). These can be as generalizations of results presented in Meng et al. (2022) for the SECT and binary images. Testing the hypotheses in Eq. (7.1) is a highly nonparametric problem, and the χ2\chi^{2}-test approaches transform it into a parametric problem.

Suppose the grayscale image gg is random and g(j)g\sim\mathbb{P}^{(j)} for either j=1j=1 or j=2j=2. For each fixed (ν,t)𝕊d1×[0,T](\nu,t)\in\mathbb{S}^{d-1}\times[0,T], we have SERT(g)(ν,t)\operatorname{SERT}(g)(\nu,t) as a real-valued random variable. Hence, for each fixed direction ν𝕊d1\nu\in\mathbb{S}^{d-1}, SECT(g)(ν):={SERT(g)(ν,t)}t[0,T]\operatorname{SECT}(g)(\nu):=\{\operatorname{SERT}(g)(\nu,t)\}_{t\in[0,T]} is a stochastic process. Note that it is straightforward that the sample paths of the stochastic process are continuous (see Eq. (3.12)). In this section, we assume the following regarding (1)\mathbb{P}^{(1)} and (2)\mathbb{P}^{(2)}.

Assumption 2.

For each j{1,2}j\in\{1,2\} and (ν,t)𝕊d1×[0,T](\nu,t)\in\mathbb{S}^{d-1}\times[0,T], we have the finite second moment 𝔼(j)|SERT(ν,t)|2:=Ω|SERT(g)(ν,t)|2(j)(dg)<\mathbb{E}^{(j)}\left|\operatorname{SERT}(\nu,t)\right|^{2}:=\int_{\Omega}|\operatorname{SERT}(g)(\nu,t)|^{2}\,\mathbb{P}^{(j)}(dg)<\infty.

Under Assumption 2, we define the mean and covariance functions of SECT(g)(ν)\operatorname{SECT}(g)(\nu) as follows

mν(j)(t):=𝔼(j){SERT(ν,t)}=ΩSERT(g)(ν,t)(j)(dg),\displaystyle m_{\nu}^{(j)}(t):=\mathbb{E}^{(j)}\left\{\operatorname{SERT}(\nu,t)\right\}=\int_{\Omega}\operatorname{SERT}(g)(\nu,t)\,\mathbb{P}^{(j)}(dg),
κν(j)(s,t):=Ω(SERT(g)(ν,s)mν(j)(s))(SERT(g)(ν,t)mν(j)(t))(j)(dg),\displaystyle\kappa_{\nu}^{(j)}(s,t):=\int_{\Omega}\Big{(}\operatorname{SERT}(g)(\nu,s)-m_{\nu}^{(j)}(s)\Big{)}\cdot\Big{(}\operatorname{SERT}(g)(\nu,t)-m_{\nu}^{(j)}(t)\Big{)}\,\mathbb{P}^{(j)}(dg),

for j{1,2}j\in\{1,2\}, ν𝕊d1\nu\in\mathbb{S}^{d-1}, and s,t[0,T]s,t\in[0,T], where 𝔼(j)\mathbb{E}^{(j)} denotes the expectation associated with the probability measure (j)\mathbb{P}^{(j)}. Furthermore, we need the following assumption on the covariance functions.

Assumption 3.

For each j{1,2}j\in\{1,2\} and fixed ν𝕊d1\nu\in\mathbb{S}^{d-1}, the function (s,t)κν(j)(s,t)(s,t)\mapsto\kappa_{\nu}^{(j)}(s,t) is continuous on the product space [0,T]×[0,T][0,T]\times[0,T].

Under Assumption 3, the stochastic process SECT(g)(ν)\operatorname{SECT}(g)(\nu) is mean-square continuous, which is a direct result of “Lemma 4.2” of Alexanderian (2015). The mean-square continuity implies that tmν(j)(t)t\mapsto m_{\nu}^{(j)}(t) is a continuous function over the compact interval [0,T][0,T].

Distinguishing the two collections of grayscale images, {gi(1)}i=1niid(1)\{g_{i}^{(1)}\}_{i=1}^{n}\overset{iid}{\sim}\mathbb{P}^{(1)} and {gi(2)}i=1niid(2)\{g_{i}^{(2)}\}_{i=1}^{n}\overset{iid}{\sim}\mathbb{P}^{(2)}, is done by rejecting the null hypothesis H0H_{0}^{*} in Eq. (7.1). To reject the null H0H_{0}^{*} in Eq. (7.1), it suffices to reject the null hypothesis H0H_{0} in the following test

H0:mν(1)(t)=mν(2)(t) for all t[0,T]vs.H1:mν(1)(t)=mν(2)(t) for some t[0,T],where ν:=argmaxν𝕊d1{supt[0,T]|mν(1)(t)mν(2)(t)|}.\displaystyle\begin{aligned} &H_{0}:\,m_{\nu^{*}}^{(1)}(t)=m_{\nu^{*}}^{(2)}(t)\text{ for all }t\in[0,T]\ \ \ vs.\ \ \ H_{1}:\,m_{\nu^{*}}^{(1)}(t^{\prime})=m_{\nu^{*}}^{(2)}(t^{\prime})\text{ for some }t^{\prime}\in[0,T],\\ &\text{where }\ \ \nu^{*}:=\operatorname*{\arg\!\max}_{\nu\in\mathbb{S}^{d-1}}\left\{\sup_{t\in[0,T]}\left|m_{\nu^{*}}^{(1)}(t)-m_{\nu^{*}}^{(2)}(t)\right|\right\}.\end{aligned} (7.2)

We need the following assumption to perform a χ2\chi^{2}-test for the hypotheses in Eq. (7.2).

Assumption 4.

κν(1)=κν(2)\kappa_{\nu^{*}}^{(1)}=\kappa_{\nu^{*}}^{(2)} where the direction ν\nu^{*} is defined in Eq. (7.2).

Assumption 4 is true under the null H0:(1)=(2)H_{0}^{*}:\mathbb{P}^{(1)}=\mathbb{P}^{(2)} in Eq. (7.1). Under Assumption 4, we denote κ:=κν(1)=κν(2)\kappa:=\kappa_{\nu^{*}}^{(1)}=\kappa_{\nu^{*}}^{(2)}. Under Assumption 3, we have κL2([0,T]×[0,T])\kappa\in L^{2}([0,T]\times[0,T]) which further implies that the integral operator f0Tf(s)κ(s,)𝑑sf\mapsto\int_{0}^{T}f(s)\cdot\kappa(s,\cdot)\,ds defined on L2(0,T)L^{2}(0,T) is a compact, positive, and self-adjoint (see “Lemma 5.1” of Alexanderian (2015)). The Hilbert-Schmidt theorem (see “Theorem VI.16” of Reed (2012)) indicates that this integral operator has countably many orthonormal eigenfunctions {ϕl}l=1\{\phi_{l}\}_{l=1}^{\infty} and nonnegative eigenvalues {λl}l=1\{\lambda_{l}\}_{l=1}^{\infty}. Without loss of generality, we assume λ1λ20\lambda_{1}\geq\lambda_{2}\geq\ldots\geq 0. Following the proof of the “Karhunen-Loève expansion” in Meng et al. (2022), one can show the following result.

Theorem 7.1.

Suppose g(1)(1)g^{(1)}\sim\mathbb{P}^{(1)} and g(2)(2)g^{(2)}\sim\mathbb{P}^{(2)} are independent. Let (1)(2)\mathbb{P}^{(1)}\otimes\mathbb{P}^{(2)} denote a product probability measure. For each fixed ll\in\mathbb{N}, the following identity holds with probability one

12λl0T{SERT(g(1))(ν,t)SERT(g(2))(ν,t)}ϕl(t)𝑑t=θl+(Zl(1)(g(1))Zl(2)(g(1))2),\displaystyle\frac{1}{\sqrt{2\lambda_{l}}}\int_{0}^{T}\left\{\,\operatorname{SERT}(g^{(1)})(\nu^{*},t)-\operatorname{SERT}(g^{(2)})(\nu^{*},t)\,\right\}\cdot\phi_{l}(t)\,dt=\theta_{l}+\left(\frac{Z_{l}^{(1)}(g^{(1)})-Z_{l}^{(2)}(g^{(1)})}{\sqrt{2}}\right), (7.3)

that is, (1)(2){Eq. (7.3) holds}=1\mathbb{P}^{(1)}\otimes\mathbb{P}^{(2)}\left\{\text{Eq.\leavevmode\nobreak\ \eqref{eq: Karhunen–Loève expansion} holds}\right\}=1, where,

θl:=12λl0T{mν(1)(t)mν(2)(t)}ϕl(t)𝑑t,Zl(j)(g)=1λl0T{SECT(g)(ν,t)mν(j)(t)}ϕl(t)𝑑t, for j{1,2}.\displaystyle\begin{aligned} &\theta_{l}:=\frac{1}{\sqrt{2\lambda_{l}}}\int_{0}^{T}\left\{\,m_{\nu^{*}}^{(1)}(t)-m_{\nu^{*}}^{(2)}(t)\,\right\}\cdot\phi_{l}(t)\,dt,\\ &Z_{l}^{(j)}(g)=\frac{1}{\sqrt{\lambda_{l}}}\int_{0}^{T}\left\{\,\operatorname{SECT}(g)(\nu^{*},t)-m_{\nu^{*}}^{(j)}(t)\,\right\}\cdot\phi_{l}(t)\,dt,\ \ \ \text{ for }j\in\{1,2\}.\end{aligned} (7.4)

Furthermore, for each j{1,2}j\in\{1,2\}, random variables {Zl(j)}l=1\{Z_{l}^{(j)}\}_{l=1}^{\infty} are defined on the probability space (Ω,,(j))(\Omega,\mathscr{F},\mathbb{P}^{(j)}), mutually uncorrelated, and have mean 0 and variance 1.

Following the discussion in Meng et al. (2022), one can show that the null hypothesis H0H_{0} in Eq. (7.2) is equivalent to θl=0\theta_{l}=0 for all l=1,2,3,l=1,2,3,\ldots. It is infeasible to check θl\theta_{l} for all positive integers ll. In addition, a small λl\lambda_{l} in the denominator (e.g., see Eq. (7.4)) induces numerical instability. Therefore, we only consider θl\theta_{l} for l=1,2,,Ll=1,2,\ldots,L, where LL is given as the following

L:=min{k:k=1kλkk′′=1λk′′>0.99}.\displaystyle L:=\min\left\{\,k\in\mathbb{N}\,:\,\frac{\sum_{k^{\prime}=1}^{k}\lambda_{k^{\prime}}}{\sum_{k^{\prime\prime}=1}^{\infty}\lambda_{k^{\prime\prime}}}>0.99\right\}. (7.5)

Here, 0.99 can be replaced with any value in (0, 1)(0,\,1); we take 0.99 as an example. The LL defined in Eq. (7.5) is motivated by principal component analysis (Jolliffe, 2002) and indicates that we maintain at least 99%99\% of the cumulative variance in the data. Hence, to test the hypotheses in Eq. (7.2), we may test the following approximate hypotheses

H0^:θ0=θ1==θL=0vs.H1^:there exists k{1,2,,L} such that θk0.\displaystyle\widehat{H_{0}}:\,\theta_{0}=\theta_{1}=\cdots=\theta_{L}=0\ \ \ vs.\ \ \ \widehat{H_{1}}:\,\text{there exists }k^{\prime}\in\{1,2,\ldots,L\}\text{ such that }\theta_{k^{\prime}}\neq 0. (7.6)

Given data {gi(1)}i=1niid(1)\{g^{(1)}_{i}\}_{i=1}^{n}\overset{iid}{\sim}\mathbb{P}^{(1)} and {gi(2)}i=1niid(2)\{g^{(2)}_{i}\}_{i=1}^{n}\overset{iid}{\sim}\mathbb{P}^{(2)}, the approximate hypotheses in Eq. (7.6) can be tested using the random variables {ξl,i:l=1,,L and i=1,,n}\{\xi_{l,i}:\,l=1,\ldots,L\text{ and }i=1,\ldots,n\} defined as follows

ξl,i\displaystyle\xi_{l,i} :=12λl0T{SERT(gi(1))(ν,t)SERT(gi(2))(ν,t)}ϕl(t)𝑑t=θl+(Zl(1)(gi(1))Zl(2)(gi(1))2).\displaystyle:=\frac{1}{\sqrt{2\lambda_{l}}}\int_{0}^{T}\left\{\,\operatorname{SERT}(g_{i}^{(1)})(\nu^{*},t)-\operatorname{SERT}(g_{i}^{(2)})(\nu^{*},t)\,\right\}\cdot\phi_{l}(t)\,dt=\theta_{l}+\left(\frac{Z_{l}^{(1)}(g_{i}^{(1)})-Z_{l}^{(2)}(g_{i}^{(1)})}{\sqrt{2}}\right).

Theorem 7.5 implies that the random variables ξl,i\xi_{l,i} satisfy the following properties:

  • For each l{1,,L}l\in\{1,\ldots,L\} and i{1,,n}i\in\{1,\ldots,n\}, the random variable ξl,i\xi_{l,i} has mean θl\theta_{l} and variance 1.

  • For each fixed i{1,,n}i\in\{1,\ldots,n\}, the random variables ξ1,i,,ξL,i\xi_{1,i},\ldots,\xi_{L,i} are mutually uncorrelated.

  • For each fixed l{1,,L}l\in\{1,\ldots,L\}, the random variables ξl,1,,ξl,n\xi_{l,1},\ldots,\xi_{l,n} are iid.

The properties above indicate:

  1. i)

    For each fixed l{1,,L}l\in\{1,\ldots,L\}, the standardized 1ni=1nξl,i\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\xi_{l,i} asymptotically follows a standard Gaussian distribution N(0,1)N(0,1) under the null hypothesis H0^\widehat{H_{0}} in Eq. (7.6).

  2. ii)

    The asymptotic normality implies that random variables 1ni=1nξ1,i,,1ni=1nξL,i\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\xi_{1,i},\ldots,\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\xi_{L,i} are asymptotically independent.

Hence, l=1L(1ni=1nξl,i)2\sum_{l=1}^{L}\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\xi_{l,i}\right)^{2} is asymptotically χL2\chi^{2}_{L} under the null hypothesis H0^\widehat{H_{0}} in Eq. (7.6), and we reject H0^\widehat{H_{0}} with asymptotic significance α(0,1)\alpha\in(0,1) if

l=1L(1ni=1nξl,i)2>χL,1α2= the 1α lower quantile of the χL2 distribution.\displaystyle\sum_{l=1}^{L}\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\xi_{l,i}\right)^{2}>\chi_{L,1-\alpha}^{2}=\text{ the $1-\alpha$ lower quantile of the $\chi^{2}_{L}$ distribution}. (7.7)

Overall, we summarize our hypothesis testing problem as follows. Our goal is to test the hypotheses in Eq. (7.1). Through the Karhunen–Loève expansion (see Theorem 7.1), it suffices to test the approximate hypotheses in Eq. (7.6), which can be achieved by the χ2\chi^{2}-test in Eq. (7.7).

Suppose we have two groups of grayscale functions, {gi(1)}i=1niid(1)\{g^{(1)}_{i}\}_{i=1}^{n}\overset{iid}{\sim}\mathbb{P}^{(1)} and {gi(2)}i=1niid(2)\{g^{(2)}_{i}\}_{i=1}^{n}\overset{iid}{\sim}\mathbb{P}^{(2)}. We calculate the discretized SERT of the grayscale images, denoted as 𝒟(j):={SERT(gi(j))(νp,tq):p=1,,Γ and q=1,,Δ}i=1n\mathcal{D}^{(j)}:=\{\operatorname{SERT}(g_{i}^{(j)})(\nu_{p},t_{q}):\,p=1,\ldots,\Gamma\text{ and }q=1,\ldots,\Delta\}_{i=1}^{n} for j{1,2}j\in\{1,2\}. Then, we apply 𝒟(1)\mathcal{D}^{(1)} and 𝒟(2)\mathcal{D}^{(2)} as our input data to test the hypotheses in Eq. (7.2), which are approximated by that in Eq. (7.6). We may apply “Algorithm 1” in Meng et al. (2022) to test the hypotheses in Eq. (7.6) by simply replacing the “SECT of two collections of shapes” therein with the “SERT of two collections of grayscale functions”. The replacing of the SECT with SERT approach is explicitly summarized in Algorithm 1.

Algorithm 1 : χ2\chi^{2}-test
1:(i) Two collections {gi(1)}i=1n\{g_{i}^{(1)}\}_{i=1}^{n} and {gi(2)}i=1n\{g_{i}^{(2)}\}_{i=1}^{n} of grayscale functions; (ii) desired asymptotic confidence level 1α1-\alpha with asymptotic significance α(0,1)\alpha\in(0,1).
2:Accept or Reject the null hypothesis H0^\widehat{H_{0}} in Eq. (7.6). (Rejecting the H0^\widehat{H_{0}} implies rejecting the null hypothesis H0H_{0}^{*} in Eq. (7.1).)
3:Compute the discretized SERT {SERT(gi(j))(νp,tq):p=1,,Γ and q=1,,Δ}i=1n\{\operatorname{SERT}(g_{i}^{(j)})(\nu_{p},t_{q}):\,p=1,\ldots,\Gamma\text{ and }q=1,\ldots,\Delta\}_{i=1}^{n} for j{1,2}j\in\{1,2\} of the input grayscale functions.
4:Replace the input “SECT” in “Algorithm 1” of Meng et al. (2022) with the SERT computed in the previous step.
5:Implement “Algorithm 1” of Meng et al. (2022) and get the output.

Although the null hypothesis in Eq. (7.1) theoretically implies Assumption 4, the finite sample size n<n<\infty may numerically violate Assumption 4 due to the inaccuracy in the estimation of covariance functions. The (numerical) violation of Assumption 4 tends to lead to type-I error inflation. To reduce the type-I error rate, we apply a permutation technique (Good, 2013). That is, we first apply Algorithm 1 to our original grayscale images gi(j)g_{i}^{(j)} and then repeatedly re-apply Algorithm 1 to the grayscale images with shuffled group labels jj. Next, we compare how the χ2\chi^{2} statistic derived from the original data (see Eq. (7.7)) differs from that computed on the shuffled data. The idea behind the permutation approach is that shuffling the group labels jj of images gi(j)g_{i}^{(j)} should not significantly change the test statistic under the null hypothesis. The combination of the permutation technique and Algorithm 1 is an analog of the permutation test proposed in Meng et al. (2022). We summarize this method in Algorithm 2. Among the algorithms we propose throughout, we particularly recommend using Algorithm 2 in practice — this is also supported by simulation study results presented in Section 8. Specifically, we will show that Algorithm 2 is uniformly powerful under the alternative hypotheses and does not suffer from type I error inflation.

Algorithm 2 : Permutation-based χ2\chi^{2}-test
1:(i) Two collections {gi(1)}i=1n\{g_{i}^{(1)}\}_{i=1}^{n} and {gi(2)}i=1n\{g_{i}^{(2)}\}_{i=1}^{n} of grayscale functions; (ii) desired asymptotic confidence level 1α1-\alpha with asymptotic significance α(0,1)\alpha\in(0,1); (iii) the number of permutations Π\Pi.
2:Accept or Reject the null hypothesis H0^\widehat{H_{0}} in Eq. (7.6). (Rejecting the H0^\widehat{H_{0}} implies rejecting the null hypothesis H0H_{0}^{*} in Eq. (7.1).)
3:Compute the discretized SERT {SERT(gi(j))(νp,tq):p=1,,γ and q=1,,Δ}i=1n\{\operatorname{SERT}(g_{i}^{(j)})(\nu_{p},t_{q}):\,p=1,\ldots,\gamma\text{ and }q=1,\ldots,\Delta\}_{i=1}^{n} for j{1,2}j\in\{1,2\} of the input grayscale functions.
4:Replace the input “SECT” in “Algorithm 2” of Meng et al. (2022) with the SERT computed in the previous step.
5:Implement “Algorithm 2” of Meng et al. (2022) and get the output.

7.2 Full Permutation Test

In Section 7.1, we proposed two parametric-based approaches — Algorithms 1 and 2 — to testing the hypotheses in Eq. (7.1). Although Algorithm 2 involves a permutation-like technique, it still heavily depends on the χ2\chi^{2}-test in Eq. (7.7). In contrast, we also propose a full permutation hypothesis test. We will also compare this approach with Algorithms 1 and 2 using simulations in Section 8. The simulation studies therein indicate that our proposed Algorithms 1 and 2 tend to be more powerful than the fully permutation-based test. Following a strategy proposed by Robinson and Turner (2017), we apply the full permutation test based on the following loss function

L({gi(1)}i=1n,{gi(2)}i=1n):=12n(n1)k,l=1n{dist(gk(1),gl(1))+dist(gk(2),gl(2))},\displaystyle L\left(\{g_{i}^{(1)}\}_{i=1}^{n},\,\{g_{i}^{(2)}\}_{i=1}^{n}\right):=\frac{1}{2n(n-1)}\sum_{k,l=1}^{n}\left\{\,\operatorname{dist}\left(\,g_{k}^{(1)},\,g_{l}^{(1)}\,\right)+\operatorname{dist}\left(\,g_{k}^{(2)},\,g_{l}^{(2)}\,\right)\,\right\}, (7.8)

where dist{distp,qERT,distp,qSERT,distpSELECT,distpMEC}\operatorname{dist}\in\{\operatorname{dist}^{\operatorname{ERT}}_{p,q},\operatorname{dist}^{\operatorname{SERT}}_{p,q},\operatorname{dist}^{\operatorname{SELECT}}_{p},\operatorname{dist}^{\operatorname{MEC}}_{p}\} (see Eq. (4.7) and Eq. (4.8)). The full permutation test based on Eq. (7.8) is summarized in Algorithm 3.

Algorithm 3 : Full Permutation Test
1:(i) Two collections {gi(1)}i=1n\{g_{i}^{(1)}\}_{i=1}^{n} and {gi(2)}i=1n\{g_{i}^{(2)}\}_{i=1}^{n} of grayscale functions; (ii) desired asymptotic confidence level 1α1-\alpha with α(0,1)\alpha\in(0,1); (iii) the number Π\Pi of permutations; (iv) distance function dist{distp,qERT,distp,qSERT,distpSELECT,distpMEC}\operatorname{dist}\in\{\operatorname{dist}^{\operatorname{ERT}}_{p,q},\operatorname{dist}^{\operatorname{SERT}}_{p,q},\operatorname{dist}^{\operatorname{SELECT}}_{p},\operatorname{dist}^{\operatorname{MEC}}_{p}\} with prespecified parameters pp and qq.
2:Accept or Reject the null hypothesis H0H_{0}^{*} in Eq. (7.1).
3:Apply Eq. (7.8) to the original input grayscale functions and compute the value of the loss
𝔖0:=L({gi(1)}i=1n,{gi(2)}i=1n).\mathfrak{S}_{0}:=L\left(\{g_{i}^{(1)}\}_{i=1}^{n},\,\{g_{i}^{(2)}\}_{i=1}^{n}\right).
4:for all k=1,,Πk=1,\cdots,\Pi,  do
5:   Randomly permute the group labels j{1,2}j\in\{1,2\} of the input grayscale functions where the permuted grayscale functions are denoted as {g~i(1)}i=1n\{\tilde{g}_{i}^{(1)}\}_{i=1}^{n} and {g~i(2)}i=1n\{\tilde{g}_{i}^{(2)}\}_{i=1}^{n}.
6:   Apply Eq. (7.8) to the permuted grayscale functions and compute the value of the loss
𝔖k:=L({g~i(1)}i=1n,{g~i(2)}i=1n).\mathfrak{S}_{k}:=L\left(\{\tilde{g}_{i}^{(1)}\}_{i=1}^{n},\,\{\tilde{g}_{i}^{(2)}\}_{i=1}^{n}\right).
7:end for
8:Compute k:=αΠ:=k^{*}:=\lfloor\alpha\cdot\Pi\rfloor:= the largest integer smaller than αΠ\alpha\cdot\Pi.
9:Reject the null hypothesis H0H_{0} if 𝔖0<𝔖k\mathfrak{S}_{0}<\mathfrak{S}_{k^{*}} and report the output.
Refer to caption
Figure 6: The first two rows and the last two rows present the same collection of seven surfaces from different angles. The seven surfaces in each collection represent the level set {x3:h(ϵ)(x)=0.0834}\{x\in\mathbb{R}^{3}:\,h^{(\epsilon)}(x)=0.0834\} for seven indices ϵ{0.7, 0.8, 0.85, 0.875, 0.9, 0.95, 1}\epsilon\in\{0.7,\,0.8,\,0.85,\,0.875,\,0.9,\,0.95,\,1\}, respectively.
Refer to caption
Figure 7: Rejection rates of Algorithms 1, 2, and 3 across different indices ϵ\epsilon. The verticle axis indicates the rejection rates across the 100 data replicates and the horizontal axis indicates 1ϵ1-\epsilon. In the figure, the label “Algorithm 3 with dist2,2ERT\operatorname{dist}_{2,2}^{\operatorname{ERT}}” refers to the implementation of Algorithm 3 where the input is dist=dist2,2ERT\operatorname{dist}=\operatorname{dist}_{2,2}^{\operatorname{ERT}}. Analogously, labels with dist2,2SERT\operatorname{dist}_{2,2}^{\operatorname{SERT}}, dist2SELECT\operatorname{dist}_{2}^{\operatorname{SELECT}}, and dist2MEC\operatorname{dist}_{2}^{\operatorname{MEC}} function in the same manner.

8 Numerical Experiments

In this section, we show the performance of our proposed Algorithms 1, 2, and 3 using simulations. Specifically, we generate grayscale functions from a family of random fields and apply our proposed algorithms to them. Motivated by the simulation designs in Meng et al. (2022) and Kirveslahti and Mukherjee (2023), we apply Algorithms 1-3 to the following family of random grayscale functions

h(ϵ)(x1,x2,x3):={(αϵx12+ϵβx22δ)2+γx32}𝟙{1x1,x2,x31}, for ϵ[0.7, 1].\displaystyle h^{(\epsilon)}(x_{1},x_{2},x_{3}):=\left\{\left(\sqrt{\frac{\alpha}{\epsilon}\cdot x_{1}^{2}+\epsilon\cdot\beta\cdot x_{2}^{2}}-\delta\right)^{2}+\gamma\cdot x_{3}^{2}\right\}\cdot\mathbbm{1}_{\{-1\leq x_{1},x_{2},x_{3}\leq 1\}},\ \ \ \text{ for }\epsilon\in[0.7,\,1].

where ϵ\epsilon is a deterministic index, α,β,γ\alpha,\beta,\gamma are iid Unif(0.5,1)\operatorname{Unif}(0.5,1) random variables, δUnif(0.4,0.6)\delta\sim\operatorname{Unif}(0.4,0.6), and all the random variables are independent. Let (ϵ)\mathbb{P}^{(\epsilon)} denote the underlying distribution corresponding to h(ϵ)h^{(\epsilon)}. All the realizations of h(ϵ)h^{(\epsilon)} belong to 𝔇R,d\mathfrak{D}_{R,d} with R=2R=2 and d=3d=3. The level sets {x3:h(ϵ)(x)=0.0834}\{x\in\mathbb{R}^{3}:\,h^{(\epsilon)}(x)=0.0834\} for different indices ϵ{0.7, 0.8, 0.85, 0.875, 0.9, 0.95, 1}\epsilon\in\{0.7,\,0.8,\,0.85,\,0.875,\,0.9,\,0.95,\,1\} are presented in Figure 6.

Table 1: Rejection rates (RRs) of Algorithms 1, 2, and 3 across different indices ε\varepsilon. In the table, the label “Algorithm 3 with dist2,2ERT\operatorname{dist}_{2,2}^{\operatorname{ERT}}” refers to the implementation of Algorithm 3 where the input is dist=dist2,2ERT\operatorname{dist}=\operatorname{dist}_{2,2}^{\operatorname{ERT}}. Analogously, labels with dist2,2SERT\operatorname{dist}_{2,2}^{\operatorname{SERT}}, dist2SELECT\operatorname{dist}_{2}^{\operatorname{SELECT}}, and dist2MEC\operatorname{dist}_{2}^{\operatorname{MEC}} function in the same manner.
Null 𝑯𝟎\boldsymbol{H_{0}} Alternative 𝑯𝟏\boldsymbol{H_{1}}
Indices (𝟏ε)\boldsymbol{(1-\varepsilon)} 0.000 0.050 0.100 0.125 0.150 0.200 0.300
RRs of Algorithm 1 0.18 0.21 0.43 0.59 0.79 0.99 1.00
RRs of Algorithm 2 0.05 0.08 0.22 0.30 0.52 0.86 1.00
RRs of Algorithm 3 with dist2,2ERT\operatorname{dist}^{\operatorname{ERT}}_{2,2} 0.05 0.02 0.16 0.27 0.40 0.86 1.00
RRs of Algorithm 3 with dist2,2SERT\operatorname{dist}^{\operatorname{SERT}}_{2,2} 0.05 0.06 0.14 0.23 0.34 0.78 1.00
RRs of Algorithm 3 with dist2SELECT\operatorname{dist}^{\operatorname{SELECT}}_{2} 0.04 0.02 0.14 0.27 0.34 0.64 1.00
RRs of Algorithm 3 with dist2MEC\operatorname{dist}^{\operatorname{MEC}}_{2} 0.05 0.02 0.16 0.27 0.40 0.86 1.00

We apply our proposed algorithms to test the following hypotheses

H0:=(1)v(ϵ)s.H1:(1).(ϵ)\displaystyle H_{0}:\ \ \mathbb{P}{{}^{(1)}}=\mathbb{P}{{}^{(\epsilon)}}\ \ vs.\ \ H_{1}:\ \ \mathbb{P}{{}^{(1)}}\neq\mathbb{P}{{}^{(\epsilon)}}. (8.1)

The null hypothesis H0H_{0} in Eq. (8.1) is true if and only if ϵ=1\epsilon=1. We generate n=30n=30 realizations of h(1)h^{(1)}. Then, for each ϵ[0.7,1]\epsilon\in[0.7,1], we generate n=30n=30 realizations of h(ϵ)h^{(\epsilon)}. We apply Algorithms 1, 2, and 3 to the two collections of generated grayscale functions to test the hypotheses in Eq. (8.1) with significance 0.050.05 (i.e., the expected type I error rate is 0.05). We repeat this procedure 100 times, go through values of ϵ[0.7,1]\epsilon\in[0.7,1], and present the rejection rates across the 100 repetitions in Table 1 and Figure 7. The numerical experiment results can be summarized as follows:

  1. i)

    Among all the algorithms, Algorithm 1 is the most powerful under the alternative hypothesis where ϵ1\epsilon\neq 1. However, it suffers from type I error inflation — meaning that the expected rejection rate when ϵ=1\epsilon=1 is 0.05 but the rejection rate of Algorithm 1 is higher. As previously mentioned, the type I error inflation of Algorithm 1 stems from the numerical violation of Assumption 4.

  2. ii)

    Algorithm 2, which is a combination of the permutation technique and Algorithm 1, does not suffer from type I error inflation. While the power of Algorithm 2 is lower than that of Algorithm 1 under the alternative hypothesis, it is still uniformly more powerful than the full permutation test in Algorithm 3 with all the four distance inputs {dist2,2ERT,dist2,2SERT,dist2SELECT,dist2MEC}\{\operatorname{dist}^{\operatorname{ERT}}_{2,2},\operatorname{dist}^{\operatorname{SERT}}_{2,2},\operatorname{dist}^{\operatorname{SELECT}}_{2},\operatorname{dist}^{\operatorname{MEC}}_{2}\}.

  3. iii)

    The four distance inputs for Algorithm 3 result in comparable hypothesis testing performance. Particularly, the dist2,2ERT\operatorname{dist}^{\operatorname{ERT}}_{2,2} and dist2MEC\operatorname{dist}^{\operatorname{MEC}}_{2} result in the same performances when we apply them to Algorithm 3, which results from the similarity between the ERT and MEC. One theoretical advantage of the ERT over the MEC is that the ERT is homogeneous (i.e., ERT(λg)=λERT(g)\operatorname{ERT}(\lambda\cdot g)=\lambda\cdot\operatorname{ERT}(g) for all λ\lambda\in\mathbb{R}).

The results described above are similar to those shown in Meng et al. (2022) for numerical experiments conducted on random binary images. Based on the experiment results concluded above, we recommend Algorithm 2 in applications for the following reason: it is uniformly powerful (compared with the fully permutation-based Algorithm 3) and does not suffer from type I error inflation.

9 Discussion

The ultimate goal of our study is to generalize a series of ECT-based methods (Crawford et al., 2020, Wang et al., 2021, Meng et al., 2022, Marsh et al., 2022) to the analysis of grayscale images. In this paper, we took an initial step towards this goal by proposing an ECT-like topological summary, the ERT. The framework proposed in Baryshnikov and Ghrist (2010) provides solid mathematical foundations for our proposed ERT. Building upon the ERT, we introduced the SERT as a generalization of the SECT (Crawford et al., 2020, Meng et al., 2022). Importantly, the SERT represents grayscale images as functional data. By applying the Karhunen–Loève expansion to the SERT, we have proposed effective statistical algorithms (see Algorithms 1-3) designed to detect significant differences between two sets of grayscale images. Particularly, Algorithm 2 was shown in simulations to be uniformly powerful while not suffering from type I error inflation.

There are many motivating questions for future research. A few of them from the biomedical perspective include:

  1. i)

    Significantly different images usually correspond to different clinical outcomes (e.g., survival rates). Crawford et al. (2020) used the SECT on binary images of GBM tumors as the predictors in statistical inference. Here, the authors showed that the SECT has the power to predict clinical outcomes better than existing tumor quantification approaches. A natural generalization of the approach in Crawford et al. (2020) is the development of an SECT-like statistic designed for grayscale images which could prove to be powerful in terms of predicting clinical outcomes. For instance, one may consider analyzing the grayscale images in Figure 1 to predict the clinical outcomes of the corresponding lung cancer patients.

  2. ii)

    Suppose grayscale images can successfully predict a clinical outcome of interest. In that case, a subsequent question from the sub-image analysis viewpoint is: can we identify the physical features in the grayscales image that are most relevant to the clinical outcome? For binary images (equivalently, shapes), Wang et al. (2021) proposed an efficient method of seeking the desired physical features of shapes via the ECT. One may consider generalizing the method in Wang et al. (2021) to deal with grayscale images.

  3. iii)

    Tumors change over time. Hence, having the ability to study dynamically changing/longitudinal grayscale images is an area of interest. Using the ECT and SECT, Marsh et al. (2022) introduced the DETECT framework to analyze the dynamic changes in shapes. One may consider generalizing the DETECT approach to analyze the dynamic changes in grayscale images.

Lastly, another future direction would be to employ statistical methods analogous to those described in Section 7 for the analysis of networks using curvature-based approaches (Wu et al., 2022).

Software Availability

Code for implementing the Euler-Radon transform (ERT), the smooth Euler-Radon transform (SERT), as well as the lifted Euler characteristic transform (LECT) and the super lifted Euler characteristic transform (SELECT) is freely available at https://github.com/JinyuWang123/ERT.

Acknowledgements

LC would like to acknowledge the support of a David & Lucile Packard Fellowship for Science and Engineering. Research reported in this publication was partially supported by the National Institute On Aging of the National Institutes of Health under Award Number R01AG075511. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Statements and Declarations

The authors declare no competing interests.

Appendix A Proofs

A.1 Proof of Eq. (3.9)

Proof.

For any gCF(X)g\in\operatorname{CF}(X) and nn\in\mathbb{Z}, we have ng=ng=ng\lceil n\cdot g\rceil=n\cdot g=\lfloor n\cdot g\rfloor. Hence,

Xg(x)dχ(x)\displaystyle\int_{X}g(x)\lceil\,d\chi(x)\rceil =limn1nXng(x)𝑑χ(x)\displaystyle=\lim_{n\to\infty}\frac{1}{n}\int_{X}\lceil n\cdot g(x)\rceil\,d\chi(x)
=limn1nXng(x)𝑑χ(x)\displaystyle=\lim_{n\to\infty}\frac{1}{n}\int_{X}n\cdot g(x)\,d\chi(x)
=limnXg(x)𝑑χ(x)\displaystyle=\lim_{n\to\infty}\int_{X}g(x)\,d\chi(x)
=Xg(x)𝑑χ(x).\displaystyle=\int_{X}g(x)\,d\chi(x).

Similarly, we have that Xg(x)dχ(x)=Xg(x)𝑑χ(x)\int_{X}g(x)\lfloor d\chi(x)\rfloor=\int_{X}g(x)d\chi(x). Therefore, we have Xg(x)[dχ(x)]=Xg(x)𝑑χ(x)\int_{X}g(x)[d\chi(x)]=\int_{X}g(x)d\chi(x). Since 𝟙KCF(X)\mathbbm{1}_{K}\in\operatorname{CF}(X) for all KCS(X)K\in\operatorname{CS}(X), we obtain Eq. (3.9). \square

A.2 Proof of Theorem 3.1

We need the following lemma in van den Dries (1998).

Lemma A.1 (rephrased “(2.10) Proposition,” Chapter 4 of van den Dries (1998)).

Let Sm+dS\subseteq\mathbb{R}^{m+d} be definable and Sa:={xd:(a,x)S}S_{a}:=\{x\in\mathbb{R}^{d}:\,(a,x)\in S\} for each ama\in\mathbb{R}^{m}. Then, χ(Sa)\chi(S_{a}) takes only finitely many values as aa runs through m\mathbb{R}^{m}. Furthermore, for each integer zz, the set {am:χ(Sa)=z}\{a\in\mathbb{R}^{m}:\,\chi(S_{a})=z\} is definable.

With Lemma A.1, we provide the proof of Theorem 3.1 as follows

Proof.

(Proof of Theorem 3.1.) We implement Lemma A.1 by defining the following

  1. i)

    m:=d+1m:=d+1;

  2. ii)

    a=(ν,t)𝕊d1×[0,T]m=d×a=(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]\subseteq\mathbb{R}^{m}=\mathbb{R}^{d}\times\mathbb{R};

  3. iii)

    S:={(ν,t,x)d××d:xK and xνtR}S:=\left\{(\nu,t,x)\in\mathbb{R}^{d}\times\mathbb{R}\times\mathbb{R}^{d}:x\in K\text{ and }x\cdot\nu\leq t-R\right\}. Since S=d××K{(ν,t,x)d××d:xνt+R0}S=\mathbb{R}^{d}\times\mathbb{R}\times K\cap\{(\nu,t,x)\in\mathbb{R}^{d}\times\mathbb{R}\times\mathbb{R}^{d}:\,x\cdot\nu-t+R\leq 0\}, the set SS is definable under Assumption 1.

Then, for each fixed a=(ν,t)d×=ma=(\nu,t)\in\mathbb{R}^{d}\times\mathbb{R}=\mathbb{R}^{m}, we have

Sa=S(ν,t)={xK:xνtR}=Ktν.\displaystyle S_{a}=S_{(\nu,t)}=\{x\in K:\,x\cdot\nu\leq t-R\}=K_{t}^{\nu}.

Lemma A.1 implies that χ(Sa)=χ(Ktν)\chi(S_{a})=\chi(K_{t}^{\nu}) takes only finitely many values as a=(ν,t)a=(\nu,t) runs through m=d×\mathbb{R}^{m}=\mathbb{R}^{d}\times\mathbb{R}. Therefore, χ(Ktν)\chi(K_{t}^{\nu}) takes only finitely many values as (ν,t)(\nu,t) runs through 𝕊d1×[0,T]\mathbb{S}^{d-1}\times[0,T].

Furthermore, Lemma A.1 indicates that {(ν,t)m:χ(Ktν)=z}\{(\nu,t)\in\mathbb{R}^{m}:\,\chi(K_{t}^{\nu})=z\} is definable for every integer zz. Because 𝕊d1×[0,T]\mathbb{S}^{d-1}\times[0,T] is definable (under Assumption 1), we have that {(ν,t)𝕊d1×[0,T]:χ(Ktν)=z}=𝕊d1×[0,T]{(ν,t)m:χ(Ktν)=z}\{(\nu,t)\in\mathbb{S}^{d-1}\times[0,T]:\,\chi(K_{t}^{\nu})=z\}=\mathbb{S}^{d-1}\times[0,T]\cap\{(\nu,t)\in\mathbb{R}^{m}:\,\chi(K_{t}^{\nu})=z\} is definable for every integer zz. The proof of the first result of Theorem 3.1 is completed.

The second result is a straightforward corollary of the first. The third result of Theorem 3.1 is implied by the first result and the “monotonicity theorem” in Chapter 3 of van den Dries (1998). \square

A.3 Proof of Theorem 3.4

Proof.

It is straightforward that ERT(g)\operatorname{ERT}(g) determines SERT(g)\operatorname{SERT}(g). It suffices to show that SERT(g)\operatorname{SERT}(g) determines ERT(g)\operatorname{ERT}(g).

For every fixed direction ν𝕊d1\nu\in\mathbb{S}^{d-1}, the definition of SERT(g)\operatorname{SERT}(g) implies

ddtSERT(g)(ν,t)=ERT(g)(ν,t)+(1T0TERT(g)(ν,τ)𝑑τ),\displaystyle\frac{d}{dt}\operatorname{SERT}(g)(\nu,t)=\operatorname{ERT}(g)(\nu,t)+\left(-\frac{1}{T}\int_{0}^{T}\operatorname{ERT}(g)(\nu,\tau)\,d\tau\right),

for all t[0,T]t\in[0,T] that are not discontinuities of tERT(g)(ν,t)t\mapsto\operatorname{ERT}(g)(\nu,t). Recall that the support of gg is strictly smaller than the domain Bd(0,R)B_{\mathbb{R}^{d}}(0,R). For any t<dist(supp(g),Bd(0,R))t^{*}<\operatorname{dist}\left(\operatorname{supp}(g),\partial B_{\mathbb{R}^{d}}(0,R)\right), we have g(x)R(x,ν,t)=0g(x)\cdot R(x,\nu,t^{*})=0 for all xBd(0,R)x\in B_{\mathbb{R}^{d}}(0,R), which indicates that ERT(g)(ν,t)=Bd(0,R)g(x)R(x,ν,t)[dχ(x)]=0\operatorname{ERT}(g)(\nu,t^{*})=\int_{B_{\mathbb{R}^{d}}(0,R)}g(x)\cdot R(x,\nu,t^{*})\,[d\chi(x)]=0. Hence,

limt0+ddtSERT(g)(ν,t)=1T0TERT(g)(ν,τ)𝑑τ,\displaystyle\lim_{t\rightarrow 0+}\frac{d}{dt}\operatorname{SERT}(g)(\nu,t)=-\frac{1}{T}\int_{0}^{T}\operatorname{ERT}(g)(\nu,\tau)\,d\tau,

which implies

ERT(g)(ν,t)=ddtSERT(g)(ν,t)limt0+ddtSERT(g)(ν,t),\displaystyle\operatorname{ERT}(g)(\nu,t)=\frac{d}{dt}\operatorname{SERT}(g)(\nu,t)-\lim_{t\rightarrow 0+}\frac{d}{dt}\operatorname{SERT}(g)(\nu,t), (A.1)

for all t[0,T]t\in[0,T] that are not discontinuities of tERT(g)(ν,t)t\rightarrow\operatorname{ERT}(g)(\nu,t). The right continuity of tERT(g)(ν,t)t\mapsto\operatorname{ERT}(g)(\nu,t) implies that Eq. (A.1) holds for all t[0,T]t\in[0,T]. That is, SERT(g)\operatorname{SERT}(g) determines ERT(g)\operatorname{ERT}(g) through Eq. (A.1). The proof is completed. \square

A.4 Proof of Theorem 3.5

Before the proof of Theorem 3.5, we suggest the audiences read Appendix C, especially Proposition C.1, as a prerequisite for the proof. A takeaway message from Appendix C is that the ERT is invertible if the “Fubini condition” (Eq. (C.2)) is satisfied. In addition to Appendix C, we need the following lemma as a prerequisite

Lemma A.2.

If f,gCF(X)f,g\in\operatorname{CF}(X), we have

X{af(x)+bg(x)}[dχ(x)]=aXf(x)𝑑χ(x)+bXg(x)𝑑χ(x)\displaystyle\int_{X}\left\{a\cdot f(x)+b\cdot g(x)\right\}\,[d\chi(x)]=a\cdot\int_{X}f(x)\,d\chi(x)+b\cdot\int_{X}g(x)\,d\chi(x)

for all a,ba,b\in\mathbb{R}.

Proof.

Both af(x)a\cdot f(x) and bg(x)b\cdot g(x) are compactly supported real-valued definable functions; the images of af(x)a\cdot f(x) and bg(x)b\cdot g(x) are both finite-point sets in \mathbb{R}. The equality above then follows from Lemma B.1 and the homogeneity of [dχ][d\chi]. \square

The following lemma implies the Fubini condition is satisfied by piecewise constant definable functions with compact support.

Lemma A.3.

Suppose XX is a definable set and YY is a bounded subset of N\mathbb{R}^{N} for some positive integer NN. If function fDef(X;)f\in\operatorname{Def}(X;\mathbb{R}) takes finitely many values, we have the following

Y(F1(y)f(x)[dχ(x)])[dχ(y)]=Xf(x)[dχ(x)]\int_{Y}\left(\int_{F^{-1}(y)}f(x)\,[d\chi(x)]\right)[d\chi(y)]=\int_{X}f(x)[d\chi(x)]

for all FDef(X;Y)F\in\operatorname{Def}(X;Y).

Proof.

Suppose ff takes values in {ai}i=1m\{a_{i}\}_{i=1}^{m} with m<m<\infty. Denote Di:={xX:f(x)=ai}D_{i}:=\{x\in X:f(x)=a_{i}\} for i=1,,mi=1,\ldots,m and 𝒟:={Di}i=1m\mathcal{D}:=\{D_{i}\}_{i=1}^{m}. By the “cell decomposition theorem” (see Chapter 3 of van den Dries (1998)), there exists a finite decomposition \mathcal{E} of XX such that, for each EE\in\mathcal{E}, the restriction F|EF|_{E} is continuous. Denote 𝒢:={DE:D𝒟 and E}=:{Gi}i=1n\mathcal{G}:=\{D\cap E:\,D\in\mathcal{D}\text{ and }E\in\mathcal{E}\}=:\{G_{i}\}_{i=1}^{n}; obviously, 𝒢\mathcal{G} is a finite definable partition of XX. Furthermore, for every component Gi𝒢G_{i}\in\mathcal{G}, FF is continuous on Gi𝒢G_{i}\in\mathcal{G} and ff is a constant (say bib_{i}) on GiG_{i}.

For each i=1,,ni=1,\ldots,n, define the restriction Fi:=F|Gi:GiYF_{i}:=F|_{G_{i}}:G_{i}\to Y. Since FiF_{i} is continuous on GiG_{i}, the “trivialization theorem” (see Chapter 9 of van den Dries (1998)) implies that there exists a finite definable partition {Yij}j\{Y_{ij}\}_{j} of YY such that FiF_{i} is definably trivial over YijY_{ij} for each jj. That is, for each jj, there exists a definable set UijNU_{ij}\subseteq\mathbb{R}^{N} for some NN and a definable map λij:Fi1(Yij)Uij\lambda_{ij}:F_{i}^{-1}(Y_{ij})\rightarrow U_{ij} such that hij:=(Fi|Fi1(Yij),λij):Fi1(Yij)Yij×Uijh_{ij}:=(F_{i}|_{F_{i}^{-1}(Y_{ij})},\,\lambda_{ij}):F_{i}^{-1}(Y_{ij})\rightarrow Y_{ij}\times U_{ij} is a homeomorphism; furthermore, Fi|Fi1(Yij)=πhijF_{i}|_{F_{i}^{-1}(Y_{ij})}=\pi\circ h_{ij}, where π:Yij×UijYij\pi:Y_{ij}\times U_{ij}\rightarrow Y_{ij} is a projection map, and Fi1(y)F_{i}^{-1}(y) is definably homeomorphic to UijU_{ij} for each yYijy\in Y_{ij}.

The additivity of Euler characteristics implies

Y(F1(y)f(x)[dχ(x)])[dχ(y)]\displaystyle\int_{Y}\left(\int_{F^{-1}(y)}f(x)\,[d\chi(x)]\right)[d\chi(y)] =Yi=1n(Fi1(y)f(x)[dχ(x)])[dχ(y)]\displaystyle=\int_{Y}\sum_{i=1}^{n}\left(\int_{F_{i}^{-1}(y)}f(x)\,[d\chi(x)]\right)[d\chi(y)]
=Yi=1nbiχ(Fi1(y))[dχ(y)]\displaystyle=\int_{Y}\sum_{i=1}^{n}b_{i}\cdot\chi\left(F_{i}^{-1}(y)\right)[d\chi(y)]

Lemma A.1 implies that yχ(Fi1(y))=χ({xX:Fi(x)=y})y\mapsto\chi(F_{i}^{-1}(y))=\chi(\{x\in X:F_{i}(x)=y\}) is a constructible function defined on YY. Therefore, Lemma A.2, together with the additivity of Euler characteristics, implies the following

Y(F1(y)f(x)[dχ(x)])[dχ(y)]\displaystyle\int_{Y}\left(\int_{F^{-1}(y)}f(x)\,[d\chi(x)]\right)[d\chi(y)] =i=1nbiYχ(Fi1(y))𝑑χ(y)\displaystyle=\sum_{i=1}^{n}b_{i}\int_{Y}\chi\left(F_{i}^{-1}(y)\right)d\chi(y)
=i=1nbijYijχ(Fi1(y))𝑑χ(y)\displaystyle=\sum_{i=1}^{n}b_{i}\sum_{j}\int_{Y_{ij}}\chi\left(F_{i}^{-1}(y)\right)d\chi(y)

Since all fibers Fi1(y)F_{i}^{-1}(y) for yYijy\in Y_{ij} are all definably homeomorphic to UijU_{ij}, we have χ(Fi1(y))=χ(Uij)\chi\left(F_{i}^{-1}(y)\right)=\chi(U_{ij}) for all yYijy\in Y_{ij}. Hence, we have

Y(F1(y)f(x)[dχ(x)])[dχ(y)]\displaystyle\int_{Y}\left(\int_{F^{-1}(y)}f(x)\,[d\chi(x)]\right)[d\chi(y)] =i=1nbijYijχ(Uij)[dχ(y)]\displaystyle=\sum_{i=1}^{n}b_{i}\sum_{j}\int_{Y_{ij}}\chi(U_{ij})\,[d\chi(y)]
=i=1nbijχ(Yij)χ(Uij)\displaystyle=\sum_{i=1}^{n}b_{i}\sum_{j}\chi(Y_{ij})\cdot\chi(U_{ij})
=i=1nbijχ(Yij×Uij)\displaystyle=\sum_{i=1}^{n}b_{i}\sum_{j}\chi(Y_{ij}\times U_{ij})
=i=1nbijχ(Fi1(Yij)).\displaystyle=\sum_{i=1}^{n}b_{i}\sum_{j}\chi\left(F_{i}^{-1}(Y_{ij})\right).

Since, for each ii, {Yij}j\{Y_{ij}\}_{j} is a partition of YY, we have

Y(F1(y)f(x)[dχ(x)])[dχ(y)]\displaystyle\int_{Y}\left(\int_{F^{-1}(y)}f(x)\,[d\chi(x)]\right)[d\chi(y)] =i=1nbiχ(Fi1(Y))\displaystyle=\sum_{i=1}^{n}b_{i}\chi\left(F_{i}^{-1}(Y)\right)
=i=1nbiχ(Gi)\displaystyle=\sum_{i=1}^{n}b_{i}\chi\left(G_{i}\right)
=i=1nbiX𝟙Gi(x)𝑑χ(x).\displaystyle=\sum_{i=1}^{n}b_{i}\int_{X}\mathbbm{1}_{G_{i}}(x)\,d\chi(x).

Applying Lemma A.2 again, we have

Y(F1(y)f(x)[dχ(x)])[dχ(y)]\displaystyle\int_{Y}\left(\int_{F^{-1}(y)}f(x)\,[d\chi(x)]\right)[d\chi(y)] =Xi=1nbi𝟙Gi(x)[dχ(x)]\displaystyle=\int_{X}\sum_{i=1}^{n}b_{i}\cdot\mathbbm{1}_{G_{i}}(x)\,[d\chi(x)]
=Xf(x)[dχ(x)]\displaystyle=\int_{X}f(x)\,[d\chi(x)]

This concludes the proof. \square

Proof of Theorem 3.5.

Our goal is to prove that Eq. (C.2) is true if g𝔇R,dpcg\in\mathfrak{D}_{R,d}^{pc}, which implies the desired invertibility via Proposition C.1.

For the ease of notations, we will denote S=Bd(0,R)S=B_{\mathbb{R}^{d}}(0,R) and T=𝕊d1×[0,T]T=\mathbb{S}^{d-1}\times[0,T]. Let xx^{\prime} be any point in SS and fixed. The kernel functions R(x,ν,t)R(x,\nu,t) and R(ν,t,x)R^{\prime}(\nu,t,x^{\prime}) are defined in Eq. (3.1) and Eq. (3.14), respectively. Let g𝔇R,dpcg\in\mathfrak{D}_{R,d}^{pc}. Then, the function (x,ν,t)g(x)R(x,ν,t)R(ν,t,x)(x,\nu,t)\mapsto g(x)\cdot R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime}) belongs to Def(S×T;)\operatorname{Def}(S\times T;\mathbb{R}) and takes finitely many values. Consider the standard projection maps p1:S×TSp_{1}:S\times T\to S and p2:S×TTp_{2}:S\times T\to T as follows

  1. i)

    Applying Lemma A.3 to p1p_{1}, we have the following

    S×Tg(x)R(x,ν,t)R(ν,t,x)[dχ(x,ν,t)]\displaystyle\int_{S\times T}g(x)\cdot R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,[d\chi(x,\nu,t)]
    =S(p11(x)g(x)R(x,ν,t)R(ν,t,x)[dχ(x,ν,t)])[dχ(x)]\displaystyle=\int_{S}\left(\int_{p_{1}^{-1}(x)}g(x)\cdot R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,[d\chi(x,\nu,t)]\right)[d\chi(x)]
    =S({x}×Tg(x)R(x,ν,t)R(ν,t,x)[dχ(x,ν,t)])[dχ(x)]\displaystyle=\int_{S}\left(\int_{\{x\}\times T}g(x)\cdot R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,[d\chi(x,\nu,t)]\right)[d\chi(x)]
    =Sg(x)(TR(x,ν,t)R(ν,t,x)[dχ(ν,t)])[dχ(x)].\displaystyle=\int_{S}g(x)\left(\int_{T}R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,[d\chi(\nu,t)]\right)[d\chi(x)].
  2. ii)

    Applying Lemma A.3 to p2p_{2}, we have the following

    S×Tg(x)R(x,ν,t)R(ν,t,x)[dχ(x,ν,t)]\displaystyle\int_{S\times T}g(x)\cdot R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,[d\chi(x,\nu,t)]
    =T(p21(ν,t)g(x)R(x,ν,t)R(ν,t,x)[dχ(x,ν,t)])[dχ(ν,t)]\displaystyle=\int_{T}\left(\int_{p_{2}^{-1}(\nu,t)}g(x)\cdot R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,[d\chi(x,\nu,t)]\right)[d\chi(\nu,t)]
    =T(S×{(ν,t)}g(x)R(x,ν,t)R(ν,t,x)[dχ(x,ν,t)])[dχ(ν,t)]\displaystyle=\int_{T}\left(\int_{S\times\{(\nu,t)\}}g(x)\cdot R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,[d\chi(x,\nu,t)]\right)[d\chi(\nu,t)]
    =T(Sg(x)R(x,ν,t)R(ν,t,x)[dχ(x)])[dχ(ν,t)]\displaystyle=\int_{T}\left(\int_{S}g(x)\cdot R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,[d\chi(x)]\right)[d\chi(\nu,t)]
    =T(Sg(x)R(x,ν,t)[dχ(x)])R(ν,t,x)[dχ(ν,t)].\displaystyle=\int_{T}\left(\int_{S}g(x)\cdot R(x,\nu,t)\,[d\chi(x)]\right)R^{\prime}(\nu,t,x^{\prime})\,[d\chi(\nu,t)].

Combining the two equations above gives the desired Eq. (C.2).

Lastly, the invertibility of the SERT follows from Theorem 3.4. The proof is completed. \square

A.5 Proof of Theorem 4.2

Proof.

Denote Sν,t:=Bd(0,R){xd:xνtR}S_{\nu,t}:=B_{\mathbb{R}^{d}}(0,R)\cap\{x\in\mathbb{R}^{d}:x\cdot\nu\leq t-R\}. A direct computation shows that

sLECT(g)(ν,t,s)[dχ(s)]\displaystyle\int_{\mathbb{R}}s\cdot\operatorname{LECT}(g)(\nu,t,s)\,[d\chi(s)] =sχ({xBd(0,R):xνtR and g(x)=s})[dχ(s)]\displaystyle=\int_{\mathbb{R}}s\cdot\chi\left(\left\{x\in B_{\mathbb{R}^{d}}(0,R):\,x\cdot\nu\leq t-R\text{ and }g(x)=s\right\}\right)\,[d\chi(s)]
=sχ({xSν,t:g(x)=s})[dχ(s)].\displaystyle=\int_{\mathbb{R}}s\cdot\chi\left(\left\{x\in S_{\nu,t}:g(x)=s\right\}\right)\,[d\chi(s)].

“Corollary 8” of Baryshnikov and Ghrist (2010) implies the following

sχ({xSν,t:g(x)=s})[dχ(s)]\displaystyle\int_{\mathbb{R}}s\cdot\chi\left(\left\{x\in S_{\nu,t}:g(x)=s\right\}\right)\,[d\chi(s)] =Sν,tg(x)[dχ(x)]\displaystyle=\int_{S_{\nu,t}}g(x)\,[d\chi(x)]
=Bd(0,R)g(x)R(x,ν,t)[dχ(x)]\displaystyle=\int_{B_{\mathbb{R}^{d}}(0,R)}g(x)\cdot R(x,\nu,t)\,[d\chi(x)]
=ERT(g)(ν,t).\displaystyle=\operatorname{ERT}(g)(\nu,t).

Then, Eq. (4.4) follows.

We can write “Proposition 2” of Baryshnikov and Ghrist (2010) using the LECT and SELECT as follows

ERT(g)(ν,t)\displaystyle\lfloor\operatorname{ERT}\rfloor(g)(\nu,t) =Bd(0,R)g(x)R(x,ν,t)dχ(x)\displaystyle=\int_{B_{\mathbb{R}^{d}}(0,R)}g(x)\cdot R(x,\nu,t)\,\lfloor d\chi(x)\rfloor
=Sν,tg(x)dχ(x)\displaystyle=\int_{S_{\nu,t}}g(x)\,\lfloor d\chi(x)\rfloor
=0SELECT(g)(ν,t,s)SELECT(g)(ν,t,s)+LECT(g)(ν,t,s)ds.\displaystyle=\int_{0}^{\infty}\operatorname{SELECT}(g)(\nu,t,s)-\operatorname{SELECT}(-g)(\nu,t,s)+\operatorname{LECT}(-g)(\nu,t,s)\,ds.

Similarly, we have

ERT(g)(ν,t)=0SELECT(g)(ν,t,s)LECT(g)(ν,t,s)SELECT(g)(ν,t,s)ds.\displaystyle\lceil\operatorname{ERT}\rceil(g)(\nu,t)=\int_{0}^{\infty}\operatorname{SELECT}(g)(\nu,t,s)-\operatorname{LECT}(g)(\nu,t,s)-\operatorname{SELECT}(-g)(\nu,t,s)\,ds.

Thus, the proof of Eq. (4.5) is completed. Taking the average of the two expressions in Eq. (4.5) gives Eq. (4.6). \square

Appendix B Definability vs. Tameness of Functions

In the literature on TDA, one may often come across the concept of tameness. The word “tame” is also often used interchangeably with “definable.” The concept of definability is presented in Definition 3.2, and the concept of tameness can be found in Bobrowski and Borman (2012). In this section, we analyze the relationship between them. To avoid confusion, we will not interchange the words “definable” and “tame” in this paper.

B.1 Tameness

We first go through the concept of tameness as follows, which is a generalized version of “Definition 2.2” in Bobrowski and Borman (2012).

Definition B.1.

Let XX be a topological space with finite χ(X)\chi(X) and f:Xf:X\to\mathbb{R} a continuous bounded function. For each α\alpha\in\mathbb{R}, we define the super-level set at α\alpha as Xα+f{xX:f(x)α}X_{\alpha^{+}}^{f}\coloneqq\{x\in X:\ f(x)\geq\alpha\} and the sub-level set α\alpha Xαf{xX:f(x)α}X_{\alpha^{-}}^{f}\coloneqq\{x\in X:\ f(x)\leq\alpha\}. The function ff is said to be tame if it satisfies the following two conditions

  • The homotopy types of Xα+fX_{\alpha^{+}}^{f} and XαfX_{\alpha^{-}}^{f} change finitely many times as α\alpha varies through \mathbb{R};

  • the homology groups of Xα+fX_{\alpha^{+}}^{f} and XαfX_{\alpha^{-}}^{f} are all finitely generated for all α\alpha\in\mathbb{R}.

Similar to Definition B.1, we define the following

Definition B.2.

Let XX be a topological space with finite χ(X)\chi(X). A (not necessarily continuous) bounded function f:Xf:X\to\mathbb{R} is said to be EC-tame if the Euler characteristics χ(Xα+f)\chi(X_{\alpha^{+}}^{f}) and χ(Xαf)\chi(X_{\alpha^{-}}^{f}) are finite for all α\alpha\in\mathbb{R} and change only finitely many times as α\alpha varies through \mathbb{R}.

Note that the definition of “tame functions” in Bobrowski and Borman (2012) are equivalent to continuous EC-tame functions on compact topological space XX in our context.

B.2 Relationship between Tameness and Definability

In general, it is not the case that a tame function is definable in the o-minimal sense, which is illustrated by the following example.

Example B.1.

Let WW denote the Warsaw circle (see Figure 8) defined as the union of the closed topologist’s sine curve and an arc JJ “joining” the two ends of the topologist’s sine curve:

W{(x,sin(2πx)|x(0,1]}{(0,y)|1y1}JW\coloneqq\{(x,\sin(\frac{2\pi}{x})\ |\ x\in(0,1]\}\cup\{(0,y)\ |\ -1\leq y\leq 1\}\cup J
J{(0.5+Rcos(t),2+Rsin(t)|βt2π+α}}J\coloneqq\{(0.5+R\cos(t),-2+R\sin(t)\ |\ \beta\leq t\leq 2\pi+\alpha\}\}
R(12)2+22,αarctan(20.5),β=παR\coloneqq\sqrt{(\frac{1}{2})^{2}+2^{2}},\alpha\coloneqq\arctan(\frac{2}{0.5}),\beta=\pi-\alpha
Refer to caption
Figure 8: The Warsaw Circle WW

WW itself is not definable. However, WW is compact as it is bounded and is the union of two closed sets (the closed topologist’s sine curve and JJ). Now consider the constant continuous function f:Wf:W\to\mathbb{R} that sends every point to 0 - the graph of this function is W×{0}3W\times\{0\}\subseteq\mathbb{R}^{3} and is not definable. Hence, ff is not definable.

On the other hand, the function is a tame function. This is because the Warsaw Circle is known to be simply connected and has all trivial homology groups beyond dimension 0.

Remark B.1.

If f:XYf:X\to Y is a tame function between two definable sets, would ff be definable? The answer is no. Consider the indicator function 𝟙W:2\mathbbm{1}_{W}:\mathbb{R}^{2}\to\mathbb{R} on the Warsaw circle WW.

Conversely, a function that is definable in the o-minimal sense does not have to be tame either. The most obvious obstruction comes from the distinction that definable functions need not be continuous nor bounded. However, when we remove the trivial distinctions between the two, we do have the following result:

Proposition B.1.

Suppose XnX\subseteq\mathbb{R}^{n} is definable. If the function f:Xf:X\to\mathbb{R} is continuous, bounded, and definable, then ff is tame.

Proof.

Let Γ(f)X×\Gamma(f)\subset X\times\mathbb{R} be the graph of ff and let π:n×n\pi:\mathbb{R}^{n}\times\mathbb{R}\to\mathbb{R}^{n} be the standard projection function, we observe that Xα+fX_{\alpha^{+}}^{f} is the set π(Γ(f)X×[α,+))\pi(\Gamma(f)\cap X\times[\alpha,+\infty)) and is thus definable.

By the triangulation theorem (see Chapter 8 of van den Dries (1998)), it follows that Xα+fX_{\alpha^{+}}^{f} is definably homeomorphic to a subcollection of open simplices in some finite Euclidean simplicial complex, which implies that the homology groups of Xα+fX_{\alpha^{+}}^{f} are all finitely generated. The case for XαfX_{\alpha^{-}}^{f} is similar.

Since f:Xf:X\to\mathbb{R} is a continuous definable function between definable sets, the “trivialization theorem” (see Chapter 9 of van den Dries (1998)) asserts that there exists a definable partition of \mathbb{R} into finitely many definable sets R1,,RnR_{1},...,R_{n} such that, for any i{1,,n}i\in\{1,...,n\}, there exists a definable set YiNY_{i}\subseteq\mathbb{R}^{N}, for some dimension NN, making the following diagram commute

f1(Ri){{f^{-1}(R_{i})}}Ri×Yi{{R_{i}\times Y_{i}}}Ri{{R_{i}}}fi\scriptstyle{f_{i}}hi=(fi,λi)\scriptstyle{h_{i}=(f_{i},\,\lambda_{i})}πi\scriptstyle{\pi_{i}}

where fif_{i} is the function ff restricted to f1(Ri)f^{-1}(R_{i}) with fi:=f|f1(Ri)f_{i}:=f|_{f^{-1}(R_{i})}, hih_{i} is a homeomorphism, λi:f1(Ri)Yi\lambda_{i}:f^{-1}(R_{i})\rightarrow Y_{i} is a continuous map, and πi:Ri×YiRi\pi_{i}:R_{i}\times Y_{i}\rightarrow R_{i} is the standard projection map.

It is straightforward to have the following disjoint union

Xα+f={xX:f(x)α}=i=1n{xf1(Ri):fi(x)α}=i=1n{xf1(Ri):πihi(x)α}=i=1n{ξhi(f1(Ri)):πi(ξ)α}=i=1n{ξRi×Yi:πi(ξ)α}.\displaystyle\begin{aligned} X_{\alpha^{+}}^{f}&=\{x\in X:\ f(x)\geq\alpha\}\\ &=\bigcup_{i=1}^{n}\left\{x\in f^{-1}(R_{i}):\,f_{i}(x)\geq\alpha\right\}\\ &=\bigcup_{i=1}^{n}\left\{x\in f^{-1}(R_{i}):\,\pi_{i}\circ h_{i}(x)\geq\alpha\right\}\\ &=\bigcup_{i=1}^{n}\left\{\xi\in h_{i}\left(f^{-1}(R_{i})\right):\,\pi_{i}(\xi)\geq\alpha\right\}\\ &=\bigcup_{i=1}^{n}\left\{\xi\in R_{i}\times Y_{i}:\,\pi_{i}(\xi)\geq\alpha\right\}.\end{aligned} (B.1)

To show that the homotopy type of Xα+fX_{\alpha^{+}}^{f} changes finitely many times, Eq. (B.1) indicates that it suffices to verify that, for each projection map πi\pi_{i}, the homotopy type of the super-level sets of πi\pi_{i} changes finitely many times. Indeed, since Ri𝒪1R_{i}\in\mathcal{O}_{1}, the set RiR_{i} is a finite union of points and open intervals. The homotopy type of {ξRi×Yi:πi(ξ)α}\left\{\xi\in R_{i}\times Y_{i}:\,\pi_{i}(\xi)\geq\alpha\right\} changes only when α\alpha crosses the isolated points and boundary points of RiR_{i}. The verification for the sub-level sets is similar. Hence, ff is a tame function. \square

B.3 A Useful Formula

We use Proposition B.1 to prove a variant of “Proposition 7.2” in Bobrowski and Borman (2012), which will be implemented in Appendix C.

Lemma B.1.

Suppose the topological space XX is definable, and functions h,f:Xh,f:X\to\mathbb{R} are definable. If the image of hh is discrete and ff is bounded, then we have the following formula

X(h+f)dχ=Xhdχ+Xfdχ\int_{X}(h+f)\lceil d\chi\rceil=\int_{X}h\lceil d\chi\rceil+\int_{X}f\lceil d\chi\rceil

The formula holds similarly for dχ\lfloor d\chi\rfloor.

Proof.

Since h(X)h(X) belongs to 𝒪1\mathcal{O}_{1} and is discrete, h(X)h(X) must be a finite point set, say {a1,,an}\{a_{1},...,a_{n}\}. We can then partition XX into A1,,AnA_{1},...,A_{n} such that

h(x)=i=1nai𝟙Ai(x).h(x)=\sum_{i=1}^{n}a_{i}\mathbbm{1}_{A_{i}}(x).

In addition, the “cell decomposition theorem” (see Chapter 3 of van den Dries (1998)) indicates that there exists a cell decomposition 𝒟\mathcal{D} of XX such that ff is continuous on each cell in 𝒟\mathcal{D}. Hence, without loss of generality, we may assume that ff is continuous on each AiA_{i}.

By additivity of Euler characteristics, we can decompose X(h+f)dχ\int_{X}(h+f)\lceil d\chi\rceil as follows

X(h+f)dχ\displaystyle\int_{X}(h+f)\lceil d\chi\rceil =Xi=1n(ai+f)𝟙Aidχ\displaystyle=\int_{X}\sum_{i=1}^{n}(a_{i}+f)\cdot\mathbbm{1}_{A_{i}}\lceil d\chi\rceil
=limk1kXi=1nk(ai+f)𝟙Ai𝑑χ\displaystyle=\lim_{k\rightarrow\infty}\frac{1}{k}\int_{X}\left\lceil\sum_{i=1}^{n}k\cdot(a_{i}+f)\cdot\mathbbm{1}_{A_{i}}\right\rceil\,d\chi
=limk1kXi=1nk(ai+f)𝟙Aidχ\displaystyle=\lim_{k\rightarrow\infty}\frac{1}{k}\int_{X}\sum_{i=1}^{n}\left\lceil k\cdot(a_{i}+f)\right\rceil\cdot\mathbbm{1}_{A_{i}}\,d\chi
=limk1ki=1nXk(ai+f)𝟙Ai𝑑χ\displaystyle=\lim_{k\rightarrow\infty}\frac{1}{k}\sum_{i=1}^{n}\int_{X}\left\lceil k\cdot(a_{i}+f)\right\rceil\cdot\mathbbm{1}_{A_{i}}\,d\chi
=i=1nlimk1kAik(ai+f)𝑑χ\displaystyle=\sum_{i=1}^{n}\lim_{k\rightarrow\infty}\frac{1}{k}\int_{A_{i}}\left\lceil k\cdot(a_{i}+f)\right\rceil\,d\chi
=i=1nAi(ai+f)dχ.\displaystyle=\sum_{i=1}^{n}\int_{A_{i}}(a_{i}+f)\lceil d\chi\rceil.

It then suffices to verify Aiai+fdχ=Aifdχ+Aiaidχ\int_{A_{i}}a_{i}+f\lceil d\chi\rceil=\int_{A_{i}}f\lceil d\chi\rceil+\int_{A_{i}}a_{i}\lceil d\chi\rceil. Since ai+fa_{i}+f is continuous on AiA_{i} for each ii, Proposition B.1 implies that (ai+f)|Ai(a_{i}+f)|_{A_{i}} is tame. Then, it follows from “Proposition 2.4” of Bobrowski and Borman (2012) that

Ai(ai+f)dχ\displaystyle\int_{A_{i}}(a_{i}+f)\,\lceil d\chi\rceil =vCV(ai+f)Δχ(ai+f,v)v\displaystyle=\sum_{v\in\operatorname*{CV}(a_{i}+f)}\Delta_{\chi}(a_{i}+f,v)v
=vCV(f)Δχ(f,v)(v+ai)\displaystyle=\sum_{v\in\operatorname*{CV}(f)}\Delta_{\chi}(f,v)(v+a_{i})
=vCV(f)Δχ(f,v)v+aivCV(f)Δχ(f,v)\displaystyle=\sum_{v\in\operatorname*{CV}(f)}\Delta_{\chi}(f,v)v+a_{i}\sum_{v\in\operatorname*{CV}(f)}\Delta_{\chi}(f,v)
=Aifdχ+aivCV(f)Δχ(f,v)\displaystyle=\int_{A_{i}}f\lceil d\chi\rceil+a_{i}\sum_{v\in\operatorname*{CV}(f)}\Delta_{\chi}(f,v)

where CV(f)\operatorname*{CV}(f) is the set of values α\alpha (referred to as critical values) at which the homotopy type of {xAi:f(x)α}\{x\in A_{i}:\,f(x)\leq\alpha\} changes; and Δχ(f,v)\Delta_{\chi}(f,v) is the change in Euler characteristic:

Δχ(f,v)=χ({xAi:f(x)v+ε})χ({xAi:f(x)vε})\displaystyle\Delta_{\chi}(f,v)=\chi(\{x\in A_{i}:\,f(x)\leq v+\varepsilon\})-\chi(\{x\in A_{i}:\,f(x)\leq v-\varepsilon\}) (B.2)

for sufficiently small ε\varepsilon.

Since ff is bounded, so there exists aba\leq b such that {xAi:f(x)b}=X\{x\in A_{i}:\,f(x)\leq b\}=X and {xAi:f(x)a}=\{x\in A_{i}:\,f(x)\leq a\}=\emptyset. The sum vCV(f)Δχ(f,v)\sum_{v\in\operatorname*{CV}(f)}\Delta_{\chi}(f,v) then collapse as a telescoping sum to χ(Ai)χ()=χ(Ai)\chi(A_{i})-\chi(\emptyset)=\chi(A_{i}) (see Eq. (B.2)), hence

Aiai+fdχ=Aifdχ+aiχ(Ai)=Aifdχ+Aiaidχ\int_{A_{i}}a_{i}+f\lceil d\chi\rceil=\int_{A_{i}}f\lceil d\chi\rceil+a_{i}\chi(A_{i})=\int_{A_{i}}f\lceil d\chi\rceil+\int_{A_{i}}a_{i}\lceil d\chi\rceil

The proof is completed. \square

Appendix C Discussions on the Invertibility of the ERT

In this section, we discuss the invertibility of the ERT, especially its dependence on the “Fubini condition.” This section provides a prerequisite for the proof of Theorem 3.5.

Let μ=1(1)d\mu=1-(-1)^{d} and λ=1\lambda=1. Schapira (1995) and the proof of “Theorem 5” in Ghrist et al. (2018) show the following

𝕊d1×[0,T]R(x,ν,t)R(ν,t,x)𝑑χ(ν,t)=(μλ)δΔ(x,x)+λ,\displaystyle\int_{\mathbb{S}^{d-1}\times[0,T]}R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,d\chi(\nu,t)=(\mu-\lambda)\delta_{\Delta}(x,x^{\prime})+\lambda, (C.1)

where R(ν,t,x)R^{\prime}(\nu,t,x^{\prime}) is the dual kernel defined in Eq. (3.14), and δΔ(x,x)=1\delta_{\Delta}(x,x^{\prime})=1 if x=xx=x^{\prime} and is 0 otherwise.

We recall the dual Euler-Radon transform (DERT) in Equation 3.15 as follows

DERT:Def(𝕊d1×[0,T])Bd(0,R),hDERT(h)={DERT(h)(x):=𝕊d1×[0,T]h(ν,t)R(ν,t,x)[dχ(ν,t)]}xBd(0,R).\displaystyle\begin{aligned} \operatorname{DERT}:\ \ &\operatorname{Def}(\mathbb{S}^{d-1}\times[0,T])\rightarrow\mathbb{R}^{B_{\mathbb{R}^{d}}(0,R)},\\ &h\mapsto\operatorname{DERT}(h)=\left\{\operatorname{DERT}(h)(x):=\int_{\mathbb{S}^{d-1}\times[0,T]}h(\nu,t)\cdot R^{\prime}(\nu,t,x)\,[d\chi(\nu,t)]\right\}_{x\in B_{\mathbb{R}^{d}}(0,R)}.\end{aligned}

The following proposition shows the relationship between the ERT and DERT, which is the core of the proof of Theorem 3.5.

Proposition C.1.

Suppose g𝔇R,dg\in\mathfrak{D}_{R,d}. If the following condition (referred to as the “Fubini condition” hereafter) holds

𝕊d1×[0,T](Bd(0,R)g(x)R(x,ν,t)[dχ(x)])R(ν,t,x)[dχ(v,t)]=Bd(0,R)g(x)(𝕊d1×[0,T]R(x,ν,t)R(ν,t,x)𝑑χ(ν,t))[dχ(x)],\displaystyle\begin{aligned} &\int_{\mathbb{S}^{d-1}\times[0,T]}\left(\int_{B_{\mathbb{R}^{d}}(0,R)}g(x)\cdot R(x,\nu,t)[d\chi(x)]\right)R^{\prime}(\nu,t,x^{\prime})[d\chi(v,t)]\\ &=\int_{B_{\mathbb{R}^{d}}(0,R)}g(x)\left(\int_{\mathbb{S}^{d-1}\times[0,T]}R(x,\nu,t)\cdot R^{\prime}(\nu,t,x^{\prime})\,d\chi(\nu,t)\right)[d\chi(x)],\end{aligned} (C.2)

we have the following formula

(DERTERT)(g)(x)=(μλ)g(x)+λ(Bd(0,R)g[dχ]),for all xBd(0,R),\displaystyle(\operatorname{DERT}\circ\operatorname{ERT})(g)(x^{\prime})=(\mu-\lambda)\cdot g(x^{\prime})+\lambda\left(\int_{B_{\mathbb{R}^{d}}(0,R)}g[d\chi]\right),\ \ \ \text{for all }x^{\prime}\in B_{\mathbb{R}^{d}}(0,R), (C.3)

where μ=1(1)d\mu=1-(-1)^{d} and λ=1\lambda=1.

Before providing the proof of Proposition C.1, we explain how Eq. (C.3) implies the invertibility of the ERT. Since gg has compact support, we have limξR𝕊d1g(ξ)=0\lim_{\xi\rightarrow R\mathbb{S}^{d-1}}g(\xi)=0, where limξR𝕊d1\lim_{\xi\rightarrow R\mathbb{S}^{d-1}} means that ξ\xi converges to a point on the sphere R𝕊d1={xd:x=R}R\mathbb{S}^{d-1}=\{x\in\mathbb{R}^{d}:\,\|x\|=R\}. Therefore, we have

limξR𝕊d11μλ(DERTERT)(g)(ξ)=limξR𝕊d1g(ξ)+λμλ(Bd(0,R)g[dχ])=λμλ(Bd(0,R)g[dχ]).\displaystyle\lim_{\xi\rightarrow R\mathbb{S}^{d-1}}\frac{1}{\mu-\lambda}\cdot(\operatorname{DERT}\circ\operatorname{ERT})(g)(\xi)=\lim_{\xi\rightarrow R\mathbb{S}^{d-1}}g(\xi)+\frac{\lambda}{\mu-\lambda}\left(\int_{B_{\mathbb{R}^{d}}(0,R)}g[d\chi]\right)=\frac{\lambda}{\mu-\lambda}\left(\int_{B_{\mathbb{R}^{d}}(0,R)}g[d\chi]\right).

The limit above implies

g(x)=1μλ(DERTERT)(g)(x)limξR𝕊d11μλ(DERTERT)(g)(ξ),\displaystyle g(x^{\prime})=\frac{1}{\mu-\lambda}\cdot(\operatorname{DERT}\circ\operatorname{ERT})(g)(x^{\prime})-\lim_{\xi\rightarrow R\mathbb{S}^{d-1}}\frac{1}{\mu-\lambda}\cdot(\operatorname{DERT}\circ\operatorname{ERT})(g)(\xi),

which is the inversion formula in Eq. (3.16) and shows the invertibility of the ERT.

We provide the proof of Proposition C.1 as follows

Proof of Proposition C.1.

For ease of notation, let X=Bd(0,R)X=B_{\mathbb{R}^{d}}(0,R) and Y=𝕊d1×[0,T]Y=\mathbb{S}^{d-1}\times[0,T]. Then, we have

(DERTERT)(g)(x)\displaystyle\left(\operatorname{DERT}\circ\operatorname{ERT}\right)(g)(x^{\prime}) =DERT(Xg(x)R(x,,)[dχ(x)])(x)\displaystyle=\operatorname{DERT}\left(\int_{X}g(x)\cdot R(x,\cdot,\cdot)\,[d\chi(x)]\right)(x^{\prime})
=Y(Xg(x)R(x,ν,t)[dχ(x)])R(ν,t,x)[dχ(ν,t)].\displaystyle=\int_{Y}\left(\int_{X}g(x)\cdot R(x,\nu,t)[d\chi(x)]\right)R^{\prime}(\nu,t,x^{\prime})\,[d\chi(\nu,t)].

The Fubini condition in Eq. (C.2) implies

(DERTERT)(g)(x)\displaystyle\left(\operatorname{DERT}\circ\operatorname{ERT}\right)(g)(x^{\prime}) =Xg(x)(YR(x,ν,t)R(ν,t,x)[dχ(ν,t)])[dχ(x)].\displaystyle=\int_{X}g(x)\left(\int_{Y}R(x,\nu,t)R^{\prime}(\nu,t,x^{\prime})[d\chi(\nu,t)]\right)\,[d\chi(x)].

Eq. (C.1) indicates the following

(DERTERT)(g)(x)\displaystyle\left(\operatorname{DERT}\circ\operatorname{ERT}\right)(g)(x^{\prime}) =Xg(x){(μλ)δΔ(x,x)+λ}[dχ(x)]\displaystyle=\int_{X}g(x)\left\{(\mu-\lambda)\delta_{\Delta}(x,x^{\prime})+\lambda\right\}\,[d\chi(x)]
=X(μλ)g(x)δΔ(x,x)+λg(x)[dχ(x)]\displaystyle=\int_{X}(\mu-\lambda)\cdot g(x)\cdot\delta_{\Delta}(x,x^{\prime})+\lambda\cdot g(x)\,[d\chi(x)]

For each fixed xx^{\prime}, the function (μλ)g(x)δΔ(x,x)(\mu-\lambda)g(x)\delta_{\Delta}(x,x^{\prime}) of xx is clearly discrete. Then, Lemma B.1 implies

(DERTERT)(g)(x)=X(μλ)g(x)δΔ(x,x)[dχ(x)]+Xλg(x)[dχ(x)].\displaystyle\left(\operatorname{DERT}\circ\operatorname{ERT}\right)(g)(x^{\prime})=\int_{X}(\mu-\lambda)g(x)\delta_{\Delta}(x,x^{\prime})[d\chi(x)]+\int_{X}\lambda g(x)[d\chi(x)].

Evaluating the two integrals above and keeping in mind that ()[dχ(x)]\int(\cdot)[d\chi(x)] is homogeneous, we have that

(DERTERT)(h)(x)=(μλ)g(x)+λ(Bd(0,R)g[dχ]),(\operatorname{DERT}\circ\operatorname{ERT})(h)(x^{\prime})=(\mu-\lambda)g(x^{\prime})+\lambda\left(\int_{B_{\mathbb{R}^{d}}(0,R)}g[d\chi]\right),

that is, the proof of Eq. (C.3) is completed. \square

The Fubini condition specified above does fail in general. Plenty of examples are given in “Corollary 6” of Baryshnikov and Ghrist (2010). In “Theorem 7” of Baryshnikov and Ghrist (2010), this condition does hold when the definable function preserves fibers, ie. if F:XYF:X\to Y is definable and hDef(X,)h\in\operatorname*{Def}(X,\mathbb{R}) is constant on the fibers of FF, then

Xh[dχ(x)]=Y(F1(y)h(x)[dχ(x)])[dχ(y)]\int_{X}h[d\chi(x)]=\int_{Y}\left(\int_{F^{-1}(y)}h(x)[d\chi(x)]\right)[d\chi(y)]

Unfortunately, this does not help much in the discussion of invertibility. The typical Fubini’s Theorem for dχd\chi that swaps the order of integration

XYf(x,y)𝑑χ(y)𝑑χ(x)=YXf(x,y)𝑑χ(x)𝑑χ(y)\int_{X}\int_{Y}f(x,y)d\chi(y)d\chi(x)=\int_{Y}\int_{X}f(x,y)d\chi(x)d\chi(y)

is a consequence of choosing FF to be the projection maps pX:X×YXp_{X}:X\times Y\to X and pY:X×YYp_{Y}:X\times Y\to Y. However, if we additionally impose the constraint that ff is constant on the fibers of pXp_{X} and pYp_{Y}, this is the same as requiring ff to be identically constant on X×YX\times Y.

Appendix D Discussion on limσ0ERT(ϕσ𝟙K)\lim_{\sigma\rightarrow 0}\operatorname{ERT}(\phi_{\sigma}*\mathbbm{1}_{K})

Let ϕσ\phi_{\sigma} be a kernel function with a bandwidth σ\sigma (e.g., see Eq. (D.1)), and ϕσ𝟙K\phi_{\sigma}*\mathbbm{1}_{K} the convolution of two functions. Although limσ0ϕσ𝟙K(x)=𝟙K(x)\lim_{\sigma\rightarrow 0}\phi_{\sigma}*\mathbbm{1}_{K}(x)=\mathbbm{1}_{K}(x) almost everywhere (e.g., see “Theorem 4.1” of Stein and Shakarchi (2011)), it is not generally true that limσ0ERT(ϕσ𝟙K)=ERT(𝟙K)=ECT(K)\lim_{\sigma\rightarrow 0}\operatorname{ERT}(\phi_{\sigma}*\mathbbm{1}_{K})=\operatorname{ERT}(\mathbbm{1}_{K})=\operatorname{ECT}(K). This phenomenon is symbolic of the general principle that a set of Lebesgue measure zero may not be of Euler characteristic zero.

Let ϕσ\phi_{\sigma} be a kernel function with a bandwidth σ\sigma. For example

ϕσ(x)={Cdσdexp(σ2σ2x2),x<σ0,xσ,where Cd=(x<1e11x2𝑑x)1.\displaystyle\begin{aligned} &\phi_{\sigma}(x)=\begin{cases}\frac{C_{d}}{\sigma^{d}}\cdot\exp\left(-\frac{\sigma^{2}}{\sigma^{2}-\|x\|^{2}}\right),&\|x\|<\sigma\\ 0,&\|x\|\geq\sigma\end{cases},\\ &\text{where }C_{d}=\left(\int_{\|x\|<1}e^{-\frac{1}{1-\|x\|^{2}}}dx\right)^{-1}.\end{aligned} (D.1)

Let gσ:=ϕσ𝟙Kg_{\sigma}:=\phi_{\sigma}*\mathbbm{1}_{K} denote the convolution of two functions,

gσ(x):=ϕσ𝟙K(x)=dϕσ(y)𝟙K(xy)𝑑y.\displaystyle g_{\sigma}(x):=\phi_{\sigma}*\mathbbm{1}_{K}(x)=\int_{\mathbb{R}^{d}}\phi_{\sigma}(y)\cdot\mathbbm{1}_{K}(x-y)dy.

Furthermore, we set σ[0,110]\sigma\in[0,\frac{1}{10}], d=2d=2, and R=2R=2 for 𝔇R,d\mathfrak{D}_{R,d}.

Although limσ0ϕσ𝟙K(x)=𝟙K(x)\lim_{\sigma\rightarrow 0}\phi_{\sigma}*\mathbbm{1}_{K}(x)=\mathbbm{1}_{K}(x) almost everywhere (e.g., see “Theorem 4.1” of Stein and Shakarchi (2011)), the following limit is generally not true

limσ0ERT(ϕσ𝟙K)=ERT(𝟙K)=ECT(K)\displaystyle\lim_{\sigma\rightarrow 0}\operatorname{ERT}(\phi_{\sigma}*\mathbbm{1}_{K})=\operatorname{ERT}(\mathbbm{1}_{K})=\operatorname{ECT}(K) (D.2)

The failure of Eq. (D.2) is emblematic of the general principle that a set of Lebesgue measure zero may not be of Euler characteristic zero. The failure of Eq. (D.2) is illustrated by the following example.

Example D.1.

Consider the shape KK defined by the following where

K={x2:infySxy110},\displaystyle K=\left\{x\in\mathbb{R}^{2}\,:\,\inf_{y\in S}\|x-y\|\leq\frac{1}{10}\right\},
where S={(910cost,910sint): 0t2π}.\displaystyle\text{where }\ \ S=\left\{\left(\frac{9}{10}\cos t,\frac{9}{10}\sin t\right)\,:\,0\leq t\leq 2\pi\right\}.
Refer to caption
Figure 9: The grayscale image of ϕσ𝟙K\phi_{\sigma}*\mathbbm{1}_{K} with small σ\sigma

Choose ν=(0,1)2\nu=(0,1)\in\mathbb{R}^{2} and any t(21100,2+1100)t\in(2-\frac{1}{100},2+\frac{1}{100}), then K{νxtR}=K{νxt2}K\cap\{\nu\cdot x\leq t-R\}=K\cap\{\nu\cdot x\leq t-2\} has the homotopy type of [0,1][0,1]. Hence, ECT(K)((0,1),t)=1\operatorname{ECT}(K)((0,1),t)=1.

On the other hand, since 0ϕσ𝟙K10\leq\phi_{\sigma}*\mathbbm{1}_{K}\leq 1, it follows from Theorem 4.2 that,

ERT(ϕσ𝟙K)((0,1),t)=01{SELECT(ϕσ𝟙K)((0,1),t,s)12LECT(ϕσ𝟙K)((0,1),t,s)}𝑑s.\displaystyle\begin{aligned} &\operatorname{ERT}(\phi_{\sigma}*\mathbbm{1}_{K})((0,1),t)\\ &=\int_{0}^{1}\left\{\operatorname{SELECT}(\phi_{\sigma}*\mathbbm{1}_{K})((0,1),t,s)-\frac{1}{2}\operatorname{LECT}(\phi_{\sigma}*\mathbbm{1}_{K})((0,1),t,s)\right\}\,ds.\end{aligned} (D.3)

Omitting endpoint behaviors, for 0<s<10<s<1 and sufficiently small σ\sigma (see Figure 9), the level set for LECT(ϕσ𝟙K)\operatorname{LECT}(\phi_{\sigma}*\mathbbm{1}_{K}) has the homotopy type of the disjoint union of two circular arcs, hence its Euler characteristic is 22. On the other hand, the super-level set for SELECT(ϕσ𝟙K)\operatorname{SELECT}(\phi_{\sigma}*\mathbbm{1}_{K}) has the homotopy type of a solid circular arc, hence its Euler characteristic is 11. Thus, we find that

ERT(ϕσ𝟙K)((0,1),t)=01112(2)ds=0, for all t(21100,2+1100).\displaystyle\operatorname{ERT}(\phi_{\sigma}*\mathbbm{1}_{K})((0,1),t)=\int_{0}^{1}1-\frac{1}{2}(2)\,ds=0,\ \ \ \text{ for all }t\in(2-\frac{1}{100},2+\frac{1}{100}).

That is, 0=limσ0ERT(ϕσ𝟙K)((0,1),t)ERT(𝟙K)((0,1),t)=ECT(K)((0,1),t)=10=\lim_{\sigma\rightarrow 0}\operatorname{ERT}(\phi_{\sigma}*\mathbbm{1}_{K})((0,1),t)\neq\operatorname{ERT}(\mathbbm{1}_{K})((0,1),t)=\operatorname{ECT}(K)((0,1),t)=1 for all t(21100,2+1100)t\in(2-\frac{1}{100},2+\frac{1}{100}).

Appendix E ERT(g)(ν,)\operatorname{ERT}(g)(\nu,-) is not left continuous

In this section, we provide two simple examples that ERT(g)(ν,):[0,T]\operatorname{ERT}(g)(\nu,-):[0,T]\to\mathbb{R} is not left continuous for some fixed direction ν𝕊d1\nu\in\mathbb{S}^{d-1}.

Example E.1.

Let R=2R=2 and consider the indicator function on K{(x1,x2)2|x12+x22=1}K\coloneqq\{(x_{1},x_{2})\in\mathbb{R}^{2}\ |\ x_{1}^{2}+x_{2}^{2}=1\}. Since the input function is integer-valued, we have that ERT(𝟙K)(ν,)=ECT(K)(ν,)\operatorname{ERT}(\mathbbm{1}_{K})(\nu,-)=\operatorname{ECT}(K)(\nu,-) for any direction ν\nu. Computing ECT(K)(ν,t)\operatorname{ECT}(K)(\nu,t) for all t[0,4]t\in[0,4], we have that ECT(K)(ν,t)=0\operatorname{ECT}(K)(\nu,t)=0 for t[0,1)[3,4]t\in[0,1)\cup[3,4] and ECT(K)(ν,t)=1\operatorname{ECT}(K)(\nu,t)=1 for t[1,3)t\in[1,3). In particular, this shows that ERT(g)(𝟙K)(ν,)\operatorname{ERT}(g)(\mathbbm{1}_{K})(\nu,-) is right continuous but not left continuous.

Example E.2.

Let R=2R=2 and ν=(1,0)\nu=(1,0). We consider the grayscale function defined as follows

g:B2(0,2),g(x1,x2)={x1+2,(x1,x2)𝔻2{(a,b)2|a2+b21}0,otherwise.g:B_{\mathbb{R}^{2}}(0,2)\to\mathbb{R},\quad g(x_{1},x_{2})=\begin{cases}x_{1}+2,\quad(x_{1},x_{2})\in\mathbb{D}^{2}\coloneqq\{(a,b)\in\mathbb{R}^{2}\ |\ a^{2}+b^{2}\leq 1\}\\ 0,\quad\text{otherwise.}\end{cases}

Clearly, g(x1,x2)=0g(x_{1},x_{2})=0 whenever x1<1x_{1}<-1. For t[0,1)t\in[0,1), we have g(x1,x2)𝟙{x1t2}=0g(x_{1},x_{2})\cdot\mathbbm{1}_{\{x_{1}\leq t-2\}}=0 for all (x1,x2)B2(0,2)(x_{1},x_{2})\in B_{\mathbb{R}^{2}}(0,2). Therefore, ERT(g)(ν,t)=0\operatorname{ERT}(g)(\nu,t)=0 for all t[0,1)t\in[0,1).

Now for t[1,3)t\in[1,3), since 0g30\leq g\leq 3, by Eq. (4.6), we can write

ERT(g)(ν,t)=03SELECT(g)(ν,t,s)12LECT(g)(ν,t,s)ds.\operatorname{ERT}(g)(\nu,t)=\int_{0}^{3}\operatorname{SELECT}(g)(\nu,t,s)-\frac{1}{2}\operatorname{LECT}(g)(\nu,t,s)\,ds.

Recall from Equation 4.1 that

LECT(g)(ν,t,s)=χ({xB2(0,R):x1tR and g(x)=s}).\displaystyle\operatorname{LECT}(g)(\nu,t,s)=\chi\left(\left\{x\in B_{\mathbb{R}^{2}}(0,R):\,x_{1}\leq t-R\text{ and }g(x)=s\right\}\right). (E.1)

For any t[1,3)t\in[1,3) and s(0,3)s\in(0,3), the level set in Eq. (E.1) is either the empty set or is compact contractible. Specifically, we have the following

LECT(g)(ν,t,s)=0, for all t[1,3) and s(0,1),\displaystyle\operatorname{LECT}(g)(\nu,t,s)=0,\ \ \text{ for all }t\in[1,3)\text{ and }s\in(0,1),
LECT(g)(ν,t,s)=1, for all t[1,3) and s[1,t],\displaystyle\operatorname{LECT}(g)(\nu,t,s)=1,\ \ \text{ for all }t\in[1,3)\text{ and }s\in[1,t],
LECT(g)(ν,t,s)=0, for all t[1,3) and s(t,3).\displaystyle\operatorname{LECT}(g)(\nu,t,s)=0,\ \ \text{ for all }t\in[1,3)\text{ and }s\in(t,3).

Similarly, from Equation 4.2, we have that

SELECT(g)(ν,t,s)=1,for all t[1,3) and s(0,t),\displaystyle\operatorname{SELECT}(g)(\nu,t,s)=1,\ \ \text{for all }t\in[1,3)\text{ and }s\in(0,t),
SELECT(g)(ν,t,s)=0,for all t[1,3) and s(t,3).\displaystyle\operatorname{SELECT}(g)(\nu,t,s)=0,\ \ \text{for all }t\in[1,3)\text{ and }s\in(t,3).

It follows that for t[1,3)t\in[1,3),

ERT(g)(ν,t)=01(10)𝑑s+1t(112(1))𝑑s=1+t12=t+12.\operatorname{ERT}(g)(\nu,t)=\int_{0}^{1}(1-0)ds+\int_{1}^{t}(1-\frac{1}{2}(1))ds=1+\frac{t-1}{2}=\frac{t+1}{2}.

Finally, for t[3,4]t\in[3,4], we have g(x1,x2)𝟙x1t2=g(x1,x2)g(x_{1},x_{2})\cdot\mathbbm{1}_{x_{1}\leq t-2}=g(x_{1},x_{2}). Hence, ERT(g)(ν,t)=ERT(g)(ν,3)=3+12=2\operatorname{ERT}(g)(\nu,t)=\operatorname{ERT}(g)(\nu,3)=\frac{3+1}{2}=2. We conclude that ERT(g)(ν,)\operatorname{ERT}(g)(\nu,-) is right continuous but not left continuous.

References

  • Adler et al. (2007) R. J. Adler, J. E. Taylor, et al. Random fields and geometry, volume 80. Springer, 2007.
  • Aerts et al. (2014) H. J. Aerts, E. R. Velazquez, R. T. Leijenaar, C. Parmar, P. Grossmann, S. Carvalho, J. Bussink, R. Monshouwer, B. Haibe-Kains, D. Rietveld, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nature communications, 5(1):4006, 2014.
  • Alexanderian (2015) A. Alexanderian. A brief note on the Karhunen-Loève expansion. arXiv preprint arXiv:1509.07526, 2015.
  • Ashburner (2007) J. Ashburner. A fast diffeomorphic image registration algorithm. Neuroimage, 38(1):95–113, 2007.
  • Bankman (2008) I. Bankman. Handbook of medical image processing and analysis. Elsevier, 2008.
  • Baryshnikov and Ghrist (2010) Y. Baryshnikov and R. Ghrist. Euler integration over definable functions. Proceedings of the National Academy of Sciences, 107(21):9525–9530, 2010.
  • Baryshnikov et al. (2011) Y. Baryshnikov, R. Ghrist, and D. Lipsky. Inversion of Euler integral transforms with applications to sensor data. Inverse problems, 27(12):124001, 2011.
  • Bobrowski and Borman (2012) O. Bobrowski and M. S. Borman. Euler integration of gaussian random fields and persistent homology. Journal of Topology and Analysis, 04(01):49–70, 2012. doi: 10.1142/s1793525312500057.
  • Bredon (2012) G. E. Bredon. Sheaf theory, volume 170. Springer Science & Business Media, 2012.
  • Brezis (2011) H. Brezis. Functional analysis, Sobolev spaces and partial differential equations, volume 2. Springer, 2011.
  • Brooks and Grigsby (2013) F. J. Brooks and P. W. Grigsby. Quantification of heterogeneity observed in medical images. BMC medical imaging, 13(1):1–12, 2013.
  • Carlsson (2009) G. Carlsson. Topology and data. Bulletin of the American Mathematical Society, 46(2):255–308, 2009.
  • Chen and Rabadán (2017) A. X. Chen and R. Rabadán. A fast semi-automatic segmentation tool for processing brain tumor images. In Towards Integrative Machine Learning and Knowledge Extraction: BIRS Workshop, Banff, AB, Canada, July 24-26, 2015, Revised Selected Papers, pages 170–181. Springer, 2017.
  • Crawford et al. (2020) L. Crawford, A. Monod, A. X. Chen, S. Mukherjee, and R. Rabadán. Predicting clinical outcomes in glioblastoma: an application of topological and functional data analysis. Journal of the American Statistical Association, 115(531):1139–1150, 2020.
  • Curry et al. (2022) J. Curry, S. Mukherjee, and K. Turner. How many directions determine a shape and other sufficiency results for two topological transforms. Transactions of the American Mathematical Society, Series B, 9(32):1006–1043, 2022.
  • Dunson and Wu (2021) D. B. Dunson and N. Wu. Inferring manifolds from noisy data using gaussian processes. arXiv preprint arXiv:2110.07478, 2021.
  • Eloyan et al. (2020) A. Eloyan, M. S. Yue, and D. Khachatryan. Tumor heterogeneity estimation for radiomics in cancer. Statistics in medicine, 39(30):4704–4723, 2020.
  • Ghrist et al. (2018) R. Ghrist, R. Levanger, and H. Mai. Persistent homology and Euler integral transforms. Journal of Applied and Computational Topology, 2(1):55–60, 2018.
  • Ghrist (2014) R. W. Ghrist. Elementary applied topology, volume 1. Createspace Seattle, 2014.
  • Good (2013) P. Good. Permutation tests: a practical guide to resampling methods for testing hypotheses. Springer Science & Business Media, 2013.
  • Howell (2006) S. B. Howell. Handbook of CCD astronomy, volume 5. Cambridge University Press, 2006.
  • Hsing and Eubank (2015) T. Hsing and R. Eubank. Theoretical foundations of functional data analysis, with an introduction to linear operators, volume 997. John Wiley & Sons, 2015.
  • Jiang et al. (2020) Q. Jiang, S. Kurtek, and T. Needham. The weighted Euler curve transform for shape and image analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 844–845, 2020.
  • Jolliffe (2002) I. T. Jolliffe. Principal component analysis for special types of data. Springer, 2002.
  • Just (2014) N. Just. Improving tumour heterogeneity mri assessment with histograms. British journal of cancer, 111(12):2205–2213, 2014.
  • Kidder and Haar (1995) S. Q. Kidder and T. H. V. Haar. Satellite meteorology: an introduction. Gulf Professional Publishing, 1995.
  • Kirveslahti and Mukherjee (2023) H. Kirveslahti and S. Mukherjee. Representing fields without correspondences: the lifted euler characteristic transform. Journal of Applied and Computational Topology, pages 1–34, 2023.
  • Klenke (2020) A. Klenke. Probability theory: a comprehensive course. Springer, 2020.
  • Li et al. (2022) D. Li, M. Mukhopadhyay, and D. B. Dunson. Efficient manifold approximation with spherelets. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(4):1129–1149, 2022.
  • Maldonado et al. (2021) F. Maldonado, C. Varghese, S. Rajagopalan, F. Duan, A. B. Balar, D. A. Lakhani, S. L. Antic, P. P. Massion, T. F. Johnson, R. A. Karwoski, et al. Validation of the broders classifier (benign versus aggressive nodule evaluation using radiomic stratification), a novel hrct-based radiomic classifier for indeterminate pulmonary nodules. European Respiratory Journal, 57(4), 2021.
  • Marsh and Beers (2023) L. Marsh and D. Beers. Stability and inference of the Euler characteristic transform. arXiv preprint arXiv:2303.13200, 2023.
  • Marsh et al. (2022) L. Marsh, F. Y. Zhou, X. Quin, X. Lu, H. M. Byrne, and H. A. Harrington. Detecting temporal shape changes with the Euler characteristic transform. arXiv preprint arXiv:2212.10883, 2022.
  • Meng and Eloyan (2021) K. Meng and A. Eloyan. Principal manifold estimation via model complexity selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 83(2):369–394, 2021.
  • Meng et al. (2022) K. Meng, J. Wang, L. Crawford, and A. Eloyan. Randomness and statistical inference of shapes via the smooth Euler characteristic transform. arXiv preprint arXiv:2204.12699, 2022.
  • Milnor (1963) J. W. Milnor. Morse theory. Number 51. Princeton university press, 1963.
  • Reed (2012) M. Reed. Methods of modern mathematical physics: Functional analysis. Elsevier, 2012.
  • Reed (2005) S. J. B. Reed. Electron microprobe analysis and scanning electron microscopy in geology. Cambridge university press, 2005.
  • Robinson and Turner (2017) A. Robinson and K. Turner. Hypothesis testing for topological data analysis. Journal of Applied and Computational Topology, 1:241–261, 2017.
  • Schapira (1995) P. Schapira. Tomography of constructible functions. In Applied Algebra, Algebraic Algorithms and Error-Correcting Codes: 11th International Symposium, AAECC-11 Paris, France, July 17–22, 1995 Proceedings 11, pages 427–435. Springer, 1995.
  • Stein and Shakarchi (2011) E. M. Stein and R. Shakarchi. Fourier analysis: an introduction, volume 1. Princeton University Press, 2011.
  • Taylor and Worsley (2008) J. E. Taylor and K. J. Worsley. Random fields of multivariate test statistics, with applications to shape analysis. 2008.
  • Turner et al. (2014) K. Turner, S. Mukherjee, and D. M. Boyer. Persistent homology transform for modeling shapes and surfaces. Information and Inference: A Journal of the IMA, 3(4):310–344, 2014.
  • van den Dries (1998) L. P. D. van den Dries. Tame topology and o-minimal structures, volume 248. Cambridge university press, 1998.
  • Vishwanath et al. (2020) S. Vishwanath, K. Fukumizu, S. Kuriki, and B. Sriperumbudur. On the limits of topological data analysis for statistical inference. arXiv e-prints, pages arXiv–2001, 2020.
  • Wang et al. (2021) B. Wang, T. Sudijono, H. Kirveslahti, T. Gao, D. M. Boyer, S. Mukherjee, and L. Crawford. A statistical pipeline for identifying physical features that differentiate classes of 3d shapes. The Annals of Applied Statistics, 15(2):638–661, 2021.
  • Wang et al. (2023) J. Wang, K. Meng, and F. Duan. Hypothesis testing for medical imaging analysis via the smooth euler characteristic transform. arXiv preprint arXiv:2308.06645, 2023.
  • Wu et al. (2022) S. Wu, H. Cheng, J. Cai, P. Ma, and W. Zhong. Subsampling in large graphs using ricci curvature. In The Eleventh International Conference on Learning Representations, 2022.
  • Yue et al. (2016) C. Yue, V. Zipunnikov, P.-L. Bazin, D. Pham, D. Reich, C. Crainiceanu, and B. Caffo. Parameterization of white matter manifold-like structures using principal surfaces. Journal of the American Statistical Association, 111(515):1050–1060, 2016.