
Bayesian hierarchical modeling of simply connected 2D shapes

Kelvin Gu, Debdeep Pati and David B. Dunson
Department of Statistical Science,
Duke University, NC 27708
email: gu.kelvin@gmail.com, debdeep.pati@stat.duke.edu, dunson@stat.duke.edu

Abstract

Models for distributions of shapes contained within images can be widely used in biomedical applications ranging from tumor tracking for targeted radiation therapy to classifying cells in a blood sample. Our focus is on hierarchical probability models for the shape and size of simply connected 2D closed curves, avoiding the need to specify landmarks through modeling the entire curve while borrowing information across curves for related objects. Prevalent approaches follow a fundamentally different strategy in providing an initial point estimate of the curve and/or locations of landmarks, which are then fed into subsequent statistical analyses. Such two-stage methods ignore uncertainty in the first stage, and do not allow borrowing of information across objects in estimating object shapes and sizes. Our fully Bayesian hierarchical model is based on multiscale deformations of a linear combination of cyclic basis functions, which facilitates automatic alignment of the different curves while accounting for uncertainty. The characterization is shown to be highly flexible in representing 2D closed curves, leading to a nonparametric Bayesian prior with large support. Efficient Markov chain Monte Carlo methods are developed for simultaneous analysis of many objects. The methods are evaluated through simulation examples and applied to yeast cell imaging data.

Keywords: Bayesian nonparametrics, cyclic basis, deformation, hierarchical modeling, image cytometry, multiscale, 2D shapes

1 Introduction

Collections of shapes are widely studied across many disciplines, such as biomedical imaging, cytology and computer vision. Perhaps the most fundamental issue when studying shape is the choice of representation. The simplest representations for shape are basic geometric objects, such as ellipses (???), polygons (????), and slightly more involved specifications such as superellipsoids (?).

Clearly, not all shapes can be adequately characterized by simple geometric objects. The landmark-based approach was developed to describe more complex shapes by reducing them to a finite set of landmark coordinates. This is appealing because the joint distribution of these landmarks is tractable to analyze, and because landmarks make registration/alignment of different shapes straightforward. There is a very rich statistical literature on parametric joint distributions for multiple landmarks (?????????), with some recent work on nonparametric distributions, both frequentist (????) and Bayesian (???).

Unfortunately, in many applications it is not possible to define landmarks if the target collection of objects varies greatly. Furthermore, even if landmarks can be chosen, there may be substantial uncertainty in estimating their locations, which is not accounted for in landmark-based statistical analyses.

In these situations, one can instead characterize shapes by describing their boundary, using a nonparametric curve (2D) or surface (3D). Curves and surfaces are widely used in biomedical imaging and commercial computer-aided design (????), because they provide a flexible model for a broad range of objects, e.g., cells, pollen grains, protein molecules, machine parts, etc. A collection of introductory work on curve and surface modeling can be found in ? and subsequent developments in ?. Popular representations include Bézier curves, splines, and principal curves (?) (a nonlinear generalization of principal components, involving smooth curves which 'pass through the middle' of a data cloud). ? and ? dealt with curve modeling based on smooth stochastic processes. Although there is a vast literature on estimating curves and surfaces, most of it focuses on estimating $\mu:\mathcal{X}\to\mathbb{R}$, where $\mathcal{X}$ is a compact subset of $\mathbb{R}^p$, without imposing any constraints on $\mu$. Estimating a closed curve or surface involves a different modeling strategy, and there has been very little work in this regime, particularly from a Bayesian point of view. To our knowledge, only ? developed a Bayesian approach for fitting a closed surface using tensor products.

Many of the above curve representations can successfully fit and describe complex shape boundaries, but they often have high or infinite dimensionality, and it is not clear how to directly analyze them. Also, they were not designed to facilitate comparison between shapes or characterize a collection of shapes. One solution is to re-express each curve using Fourier descriptors or wavelet descriptors (????). Both approaches decompose a curve into components of different scales, so that the coarsest scale components carry the global approximation information while the finer scale components contain the local detailed information. Such multiscale transforms make it easier to compare objects that share the same coarse shape, but differ on finer details, or vice versa. The finer scale components can also be discarded to yield a finite and low-dimensional representation. Other dimensionality-reducing transformations include Principal Component Analysis and Distance Weighted Discrimination.

Note that the entire process is fragmented into three separate tasks: 1) curve fitting, 2) transformation, 3) population-level analysis. This can be problematic for several reasons. First, curve-fitting is not always accurate. If uncertainty is not accounted for, mistakes made during curve-fitting will be propagated into later analyses. Second, dimension-reducing transformations may throw away some of the information captured by curve-fitting. Finally, one suspects that the curve-fitting and transformation steps should be able to benefit from higher-level observations made during subsequent population analysis. For example, if the curve-fitting procedure is struggling to fit a missing or noisy shape boundary, it should be able to draw on similar shapes in the population to achieve a more informed fit. In this paper, we propose a Bayesian hierarchical model for 2D shapes, which addresses all of the aforementioned problems by performing curve fitting, multiscale transformation, and population analysis simultaneously within a single joint model.

The key innovation in our shape model is a shape-generating random process which can produce the whole range of simply-connected 2D shapes (shapes which contain no holes), by applying a sequence of multiscale deformations to a novel type of closed curve based on the work of ?. ?, ? and ? also proposed multiscale curves (with the latter two being more similar to our work, in their usage of Bézier curves and degree-elevation). However, none of these developed a statistical model around their representation or considered a collection of shapes. In analyzing a population of shapes, a notion of average shape or mean shape is quite important. ? discussed notions of mean shape and shape variability and various methods of estimating them pertaining to landmark based analysis. We will follow a different but related strategy for defining the average shape in terms of the basis coefficients or the control points of the Bézier curves. We call it the ‘central shape’. Refer to §2.5 for details. To characterize shape variability, we also define a notion of shape quantile in §2.5.

In §2, we describe the shape-generating random process, how it specifies a multiscale probability distribution over shapes, and how this can be used to express various modeling assumptions, such as symmetry. In §3, we provide theory regarding the flexibility of our model (support of the prior). In §4 and §5, we show how the random process can be used to fit a curve to a point cloud or an image. In §6, we show how to simultaneously fit and characterize a collection of shapes, which also naturally incorporates inter-shape alignment. In §7, we describe the computational details of Bayesian inference behind each of the tasks described earlier. This results in a fast approximate algorithm which is scalable to huge collections of shapes, each with a dense point cloud. Finally, in §8 and §9, we test our model on simulated shapes and real image data, respectively. En route, we solve several important sub-problems that may be generally useful in the study of curve and surface fitting. First, we develop a model-based approach for parameterizing point cloud data. Second, we show how fully Bayesian joint modeling can be used to incorporate several pieces of auxiliary information in the process of curve-fitting, such as when each point within a point cloud also reports a surface orientation. Lastly, the concept of multiscale deformation can be generalized to 3D surfaces in a straightforward manner.

2 Priors for Multiscale Closed Curves

2.1 Overview

Our random shape generation process starts with a closed curve and performs a sequence of multiscale deformations to generate a final shape. In §2.2, we introduce the Roth curve developed by ?, which is used to represent the shape boundary. Then, in §2.3, we demonstrate how to deform a Roth curve at multiple scales to produce any simply-connected shape. Using the mechanisms developed in §2.2 and §2.3, we present the full random shape process in §2.5.

2.2 Roth curve

A Roth curve is a closed parametric curve, $C:[-\pi,\pi]\rightarrow\mathbb{R}^2$, defined by a set of $2n+1$ control points $\{c_j, j=1,\ldots,2n+1\}$, where $n$ is the degree of the curve and may be chosen to be any positive integer, depending on how many control points are desired. For convenience, we will refer to the total number of control points as $J$, where $J(n)=2n+1$; for notational simplicity, we drop the dependence on $n$ in $J(n)$. As a function of $t$, the curve can be viewed as the trajectory of a particle over time. At every time $t$, the particle's location is defined as some convex combination of all control points. The weight accorded to each control point in this convex combination varies with time according to a set of basis functions, $\{B_j^n(t), j=1,\ldots,J\}$, where $B_j^n(t)>0$ and $\sum_{j=1}^J B_j^n(t)=1$ for all $t$:

C(t) = \sum_{j=1}^{J} c_j B_j^n(t), \quad t\in[-\pi,\pi], (1)

B_j^n(t) = \frac{h_n}{2^n}\left\{1+\cos\left(t+\frac{2\pi(j-1)}{2n+1}\right)\right\}^n, \quad h_n=\frac{(2^n n!)^2}{(2n+1)!}, (2)

where $c_j=[c_{j,x}\; c_{j,y}]'$ specifies the location of the $j$th control point and $B_j^n:[-\pi,\pi]\rightarrow[0,1]$ is the $j$th basis function. For simplicity, we omit the superscript $n$ denoting a basis function's degree unless it requires special attention. This representation is a type of Bézier curve. Refer to Figure 1 for an illustration of the Roth basis functions.

Figure 1: Roth basis
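To make the basis concrete, the following is a minimal numerical sketch (ours, not the authors' implementation; assuming NumPy, with function names of our choosing) of (1)-(2), including sanity checks of the partition-of-unity and closure properties discussed below:

```python
import numpy as np
from math import factorial

def roth_basis(t, n):
    """Evaluate the J = 2n+1 Roth basis functions of degree n at points t:
    B_j^n(t) = (h_n / 2^n) * (1 + cos(t + 2*pi*(j-1)/(2n+1)))^n, equation (2)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    J = 2 * n + 1
    hn = (2 ** n * factorial(n)) ** 2 / factorial(2 * n + 1)   # h_n in (2)
    j = np.arange(J)                     # 0-based j here stands for (j-1) in the paper
    return hn / 2 ** n * (1 + np.cos(t[:, None] + 2 * np.pi * j / J)) ** n

def roth_curve(t, control_points):
    """C(t) = sum_j c_j B_j^n(t), equation (1); control_points has shape (2n+1, 2)."""
    n = (control_points.shape[0] - 1) // 2
    return roth_basis(t, n) @ control_points

# Sanity checks: partition of unity, and closure C(-pi) = C(pi).
t = np.linspace(-np.pi, np.pi, 200)
assert np.allclose(roth_basis(t, n=3).sum(axis=1), 1.0)
c = np.random.default_rng(0).normal(size=(7, 2))
assert np.allclose(roth_curve(-np.pi, c), roth_curve(np.pi, c))
```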

The Roth curve has several appealing properties:

  1. It is fully defined by a finite set of control points, despite being an infinite-dimensional curve.

  2. It is always closed, i.e. $C(-\pi)=C(\pi)$. This is necessary to represent the boundary of a shape.

  3. All basis functions are nonlinear translates of each other, and are evenly spaced over the interval $[-\pi,\pi]$. They can be cyclically permuted without altering the curve. This implies that each control point exerts the same 'influence' over the curve. The influence of the control points is illustrated in §3.3.

  4. A degree 1 Roth curve having 3 control points is always a circle or ellipse.

  5. Any closed curve can be approximated arbitrarily well by a Roth curve, for some large degree $n$. This is because the Roth basis, for a given $n$, spans the vector space of trigonometric polynomials of degree $n$, and as $n\rightarrow\infty$ the basis functions span the vector space of Fourier series. We elaborate on this in §3.

  6. Roth curves are infinitely smooth in the sense that they are infinitely differentiable ($C^{\infty}$).

2.3 Deforming a Roth curve

A Roth curve can be deformed simply by translating some of its control points. We now formally define deformation and illustrate it in Figure 2.

Figure 2: Deformation of a Roth curve
  • Definition

    Suppose we are given two Roth curves,

    C(t) = \sum_{j=1}^{J} c_j B_j(t), \quad \widetilde{C}(t) = \sum_{j=1}^{J} \widetilde{c}_j B_j(t), (3)

    where for each $j$, $\widetilde{c}_j = c_j + R_j d_j$, $d_j\in\mathbb{R}^2$ and $R_j$ is a rotation matrix. Then we say that $C(t)$ is deformed into $\widetilde{C}(t)$ by the deformation vectors $\{d_j, j=1,\ldots,J\}$.

Each $R_j$ orients the deformation vector $d_j$ relative to the original curve's surface. As a result, positive values of the $y$-component of $d_j$ always correspond to outward deformation, negative values always correspond to inward deformation, and $d_j$'s $x$-component corresponds to deformation parallel to the surface. We will call $R_j$ a deformation-orienting matrix. In precise terms,

R_j = \begin{bmatrix}\cos(\theta_j) & -\sin(\theta_j)\\ \sin(\theta_j) & \cos(\theta_j)\end{bmatrix}, (4)

where $\theta_j$ is the angle of the curve's tangent line at $q_j = \frac{-2\pi(j-1)}{2n+1}$, the point where the control point $c_j$ has the strongest influence: $q_j = \arg\max_{t\in[-\pi,\pi]} B_j(t)$. The angle $\theta_j$ can be obtained by computing the first derivative of the Roth curve, also known as its hodograph.

  • Definition

    The hodograph of a Roth curve is given by:

    H(t) = \sum_{j=1}^{J} c_j \frac{d}{dt}B_j(t), (5)

    \frac{d}{dt}B_j(t) = -\frac{2}{(2n+1)\binom{2n}{n}} \sum_{k=0}^{n-1}\binom{2n}{k}(n-k)\sin\left((n-k)t + \frac{2(n-k)(j-1)\pi}{2n+1}\right), (6)

    where $t\in[-\pi,\pi]$. If we view $C(t)$ as the trajectory of a particle, $H(t)$ intuitively gives the velocity of the particle at point $t$.

We can now use simple trigonometry to determine that

\theta_j = \arctan\left(\frac{H_y(q_j)}{H_x(q_j)}\right). (7)

Note that $R_j$ is ultimately just a function of $\{c_j\in\mathbb{R}^2, j=1,\ldots,J\}$.
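A sketch (ours) of a single deformation step, building on the roth_curve sketch above; we use central finite differences in place of the closed-form hodograph (6) to keep the code short, and arctan2, a quadrant-aware variant of the arctan in (7):

```python
def hodograph(t, control_points, h=1e-6):
    """H(t) = dC/dt, equation (5); finite differences stand in for the
    closed form (6) in this sketch."""
    return (roth_curve(np.asarray(t) + h, control_points)
            - roth_curve(np.asarray(t) - h, control_points)) / (2 * h)

def deform(control_points, d):
    """Translate control point j by R_j d_j, where R_j (equation (4)) is
    built from the tangent angle theta_j (equation (7)) at q_j = -2*pi*(j-1)/J."""
    J = control_points.shape[0]
    q = -2 * np.pi * np.arange(J) / J
    H = hodograph(q, control_points)
    theta = np.arctan2(H[:, 1], H[:, 0])      # quadrant-aware version of (7)
    out = control_points.astype(float).copy()
    for j in range(J):
        R = np.array([[np.cos(theta[j]), -np.sin(theta[j])],
                      [np.sin(theta[j]),  np.cos(theta[j])]])
        out[j] += R @ d[j]
    return out
```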

Next, we show how to alter the scale of deformation, using an important concept called degree elevation.

  • Definition

    Given any Roth curve, we can use degree elevation to re-express the same curve using a larger number of control points (a higher degree). More precisely, if we are given a curve of degree $n$, $C(t)=\sum_{j=1}^{2n+1} c_j B_j^n(t)$, we can elevate its degree by any positive integer $v$ to obtain a new degree-elevated curve $\widehat{C}(t)=\sum_{j=1}^{2(n+v)+1}\widehat{c}_j B_j^{n+v}(t)$ such that $C(t)=\widehat{C}(t)$ for all $t\in[-\pi,\pi]$. In $\widehat{C}(t)$, each new degree-elevated control point $\widehat{c}_j$ can be defined in terms of the original control points $\{c_i, i=1,\ldots,2n+1\}$:

    \widehat{c}_j := \frac{1}{2n+1}\sum_{i=1}^{2n+1}c_i + \frac{\binom{2(n+v)}{n+v}h_n}{2^{2n-1}}\sum_{k=0}^{n-1}\frac{\binom{2n}{k}}{\binom{2(n+v)}{v+k}}\sum_{i=1}^{2n+1}\cos\left((n-k)\left(\frac{-2(j-1)\pi}{2(n+v)+1}\right)+\frac{2(n-k)(i-1)\pi}{2n+1}\right)c_i.

It is crucial to note that the 'influence' of a single control point shrinks after degree elevation, because the curve is now shared by a greater total number of control points; we quantify this intuition in §3.3. This implies that after degree elevation, the translation of any single control point will cause a smaller, finer-scale deformation of the curve's shape. Thus, degree elevation can be used to adjust the scale of deformation. We exploit this strategy in the random shape process proposed in §2.5.

To that end, we first rewrite all of the concepts described above in more compact vector notation. Note that the formulas for degree elevation, deformation, the hodograph and the curve itself all simply involve linear operations on the control points.

2.4 Vector notation

First, we rewrite the control points in a 'stacked' vector of length $2J$:

c = (c_{1,x}, c_{1,y}, c_{2,x}, c_{2,y}, \ldots, c_{J,x}, c_{J,y})'. (8)

The formula for a Roth curve given in (1) can be rewritten as:

C(t) = X(t)c, (9)

X(t) = \begin{bmatrix} B_1(t) & 0 & B_2(t) & 0 & \cdots & B_J(t) & 0\\ 0 & B_1(t) & 0 & B_2(t) & \cdots & 0 & B_J(t) \end{bmatrix}. (10)

The formula for the hodograph given in (5) is rewritten as:

H(t) = \dot{X}(t)c, \quad \dot{X}(t) = \frac{d}{dt}X(t). (11)

Deformation can be written as:

\widetilde{c} = c + T(c)d, \quad d = (d_{1,x}, d_{1,y}, d_{2,x}, d_{2,y}, \ldots, d_{J,x}, d_{J,y})', \quad T(c) = \mathrm{block}(R_1, R_2, \ldots, R_J), (12)

where $\mathrm{block}(A_1,\ldots,A_q)$ is the $pq\times pq$ block-diagonal matrix with $p\times p$ diagonal blocks $A_i$, $i=1,\ldots,q$. We call $T$ the stacked deformation-orienting matrix. Note that $T$ is a function of $c$, because each $R_j$ depends on $c$. Degree elevation can be written as the linear operator $E$:

\widehat{c} = Ec, \quad E = (E_{i,j}), \quad i=1,\ldots,2(n+v)+1, \; j=1,\ldots,2n+1,

where

E_{i,j} = \frac{1}{2n+1} + \frac{\binom{2(n+v)}{n+v}h_n}{2^{2n-1}}\sum_{k=0}^{n-1}\frac{\binom{2n}{k}}{\binom{2(n+v)}{v+k}}\cos\left((n-k)\left(\frac{-2(i-1)\pi}{2(n+v)+1}\right)+\frac{2(n-k)(j-1)\pi}{2n+1}\right).

We will maintain this vector notation throughout the rest of the paper.
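A sketch (ours, building on the roth_curve sketch of §2.2) of the degree-elevation matrix $E$, with a numerical check that the elevated control points trace the same curve:

```python
from math import comb, factorial

def elevation_matrix(n, v):
    """Degree-elevation matrix E: maps the 2n+1 control points of a degree-n
    Roth curve to the 2(n+v)+1 control points of the same curve at degree n+v.
    Rows i and columns j are 0-based, standing for (i-1), (j-1) in the paper."""
    J_old, J_new = 2 * n + 1, 2 * (n + v) + 1
    hn = (2 ** n * factorial(n)) ** 2 / factorial(2 * n + 1)
    E = np.full((J_new, J_old), 1.0 / J_old)
    for i in range(J_new):
        for j in range(J_old):
            E[i, j] += comb(2 * (n + v), n + v) * hn / 2 ** (2 * n - 1) * sum(
                comb(2 * n, k) / comb(2 * (n + v), v + k)
                * np.cos((n - k) * (-2 * np.pi * i / J_new)
                         + 2 * (n - k) * np.pi * j / J_old)
                for k in range(n))
    return E

# The elevated control points trace the same curve:
n, v = 2, 3
c = np.random.default_rng(1).normal(size=(2 * n + 1, 2))
t = np.linspace(-np.pi, np.pi, 100)
assert np.allclose(roth_curve(t, c), roth_curve(t, elevation_matrix(n, v) @ c))
```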

2.5 Random Shape Process

The random shape process starts with some initial Roth curve, specified by an initial set of control points, $c^{(0)}$. From here on, we will refer to all curves by the stacked vector of their control points, $c$. Then, drawing on the deformation and degree-elevation operations defined earlier, we repeatedly apply the following recursive operation $R$ times:

\widehat{c}^{(r-1)} = E_r c^{(r-1)}, \quad d^{(r)} \sim \mbox{N}(\mu_r, \Sigma_r), \quad c^{(r)} = \widehat{c}^{(r-1)} + T_r(c^{(r-1)})d^{(r)}, (13)

resulting in a final curve $c^{(R)}$. In other words, the process simply has two steps, (i) degree-elevate the current curve and (ii) randomly deform it, repeated a total of $R$ times. Note that this random process specifies a probability distribution over $c^{(R)}$.


Figure 3: An illustration of the shape generation process. From left to right: 1) initial curve specified by three control points, 2) the same curve after degree elevation, 3) deformation, 4) degree elevation again, 5) deformation again. Dark lines indicate the curve, pale dots indicate the curve’s control points, and pale lines connect the control points in order.

We now elaborate on the details of this recursive process. The parameters of the process are:

  1. $R\in\mathbb{Z}$, the number of steps in the process.

  2. $n_r\in\mathbb{Z}$, the degree of the curve $c^{(r)}$, for each $r=1,\ldots,R$. The sequence $\{n_r\}_1^R$ must be strictly increasing. For convenience, we will denote the number of control points at step $r$ by $J_r = 2n_r + 1$.

  3. $\mu_r\in\mathbb{R}^{2J_r}$, the average set of deformations applied at step $r=1,\ldots,R$. Note that this vector contains a stack of deformations, not just one.

  4. $\Sigma_r\in\mathbb{R}^{2J_r\times 2J_r}$, the covariance of the set of deformations applied at step $r=1,\ldots,R$.

According to these parameters, $E_r$ is then the degree-elevation matrix going from degree $n_{r-1}$ to $n_r$, $\mbox{N}(\cdot,\cdot)$ is a $2J_r$-variate normal distribution, and $T_r$ is the stacked deformation-orienting matrix.

We take special care in defining the initial curve, $c^{(0)}$. We choose $c^{(0)}$ to be of degree $n_0=1$, which guarantees that it is an ellipse. For $j=1,2,3$, we define each control point as:

c_j^{(0)} = (0,0)' + R_{\theta_j} d_j^{(0)}, \quad R_{\theta_j} = \mbox{the rotation matrix with angle } \theta_j = \frac{2\pi j}{3},

and where each $d_j^{(0)}\in\mathbb{R}^2$ is a random deformation vector. In words: we start with a curve that is just a point at the origin, $C(t)\equiv(0,0)$, and apply three random deformations whose orientations are rotated by radially symmetric amounts, $0^\circ$, $120^\circ$ and $240^\circ$ (note that the final deformations are not radially symmetric, since each $d_j$ is randomly drawn). We will write this in vector notation as:

d^{(0)} \sim \mbox{N}(\mu_0, \Sigma_0), \quad c^{(0)} = {\bf 0} + T_0 d^{(0)}.

The deformations essentially ‘inflate’ the curve into some ellipse. This completes our definition of the random shape process.
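Putting the pieces together, here is a sketch (ours, building on the elevation_matrix and deform sketches above) of one draw from the random shape process:

```python
def rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def sample_shape(degrees, mus, Sigmas, rng):
    """One draw from the random shape process (equation (13)). degrees[r] = n_r
    with degrees[0] = 1; mus[r] and Sigmas[r] have dimension 2*(2*degrees[r]+1)."""
    # Step 0: 'inflate' a point at the origin into an ellipse (degree n_0 = 1),
    # orienting the three deformations at radially symmetric angles.
    d0 = rng.multivariate_normal(mus[0], Sigmas[0]).reshape(3, 2)
    c = np.stack([rotation(2 * np.pi * j / 3) @ d0[j] for j in range(3)])
    # Steps r = 1..R: degree-elevate, then randomly deform.
    for r in range(1, len(degrees)):
        c = elevation_matrix(degrees[r - 1], degrees[r] - degrees[r - 1]) @ c
        d = rng.multivariate_normal(mus[r], Sigmas[r]).reshape(-1, 2)
        c = deform(c, d)
    return c
```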

We now give some intuition about the process and each of its parameters, and define several additional concepts which make the process easier to interpret. The random shape process gives a multiscale representation of shapes, because each step in the process produces increasingly fine-scale deformations, through degree-elevation.

$R$ is then the number of scales or 'resolutions' captured by the process. Each $n_r$ specifies the number of control points at resolution $r$. We will use $\mathbb{S}_r$ to denote the class of shapes that can be exactly represented by a degree $n_r$ Roth curve; see the definition of $\mathbb{H}^{n_r}$ in §3 for a formal characterization of a special case of $\mathbb{S}_r$. If $\{n_r\}_1^R$ is monotonically increasing, then $\mathbb{S}_1\subset\mathbb{S}_2\subset\ldots\subset\mathbb{S}_R$. Thus, the deformations $d^{(r)}$ roughly describe the additional details gained going from $\mathbb{S}_{r-1}$ to $\mathbb{S}_r$.

$\mu_r$ is the mean deformation at level $r$. Based on $\{\mu_r, r=0,\ldots,R\}$, we define the 'central shape' $c^*$ of the random shape process as:

c^* := c^{*(R)}, \quad c^{*(r)} = E_r c^{*(r-1)} + T_r(c^{*(r-1)})\mu_r.

Note that $c^*$ is simply the deterministic result of the random shape process when each $d^{(r)} = \mu_r$, rather than being drawn from a distribution centered on $\mu_r$. Thus, all shapes generated by the process tend to be deformed versions of the central shape. We illustrate this in Figure 4. If the random shape process is used to describe a population of shapes, the central shape provides a good summary.

Figure 4: Realizations from the random shape process. The left panel shows realizations (in blue) when the central shape is a moon (shown in red). The right panel shows a similar case for a star.

$\Sigma_r$ determines the covariance of the deformations at level $r$. This naturally controls the variability among shapes generated by the process. If the variance is very small, all shapes will be very similar to the central shape. $\Sigma_r$ can also be chosen to induce correlation between deformation vectors at the same resolution, in the typical way that correlation is induced between dimensions of a multivariate normal distribution. This allows us to incorporate higher-level assumptions about shape, such as reflective or radial symmetry. For example, if $R=2$, $n_1=1$ and $n_2=2$, we can specify perfect correlation in $\Sigma_2$ such that $d_1^{(2)} = d_4^{(2)}$ and $d_2^{(2)} = d_3^{(2)}$. The resulting shapes are guaranteed to be symmetric about an axis of reflection.
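A sketch (ours) of this symmetry construction: a selection matrix $S$ copies a small set of free deformation blocks into the tied positions, and $\Sigma_2 = S\Sigma_{\mathrm{free}}S'$ is the resulting rank-deficient covariance.

```python
# Reflective-symmetry example: J_2 = 5 control points with ties d1 = d4 and
# d2 = d3; pairs[j] names the free deformation block copied into slot j.
pairs = [0, 1, 1, 0, 2]
S = np.zeros((2 * 5, 2 * 3))
for j, f in enumerate(pairs):
    S[2 * j:2 * j + 2, 2 * f:2 * f + 2] = np.eye(2)
Sigma_free = np.eye(2 * 3)        # any covariance over the free deformations
Sigma_2 = S @ Sigma_free @ S.T    # degenerate: draws satisfy the ties
                                  # (assuming mu_2 satisfies the same ties)
```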

In the subsequent sections §4 and §5, we show how to use our random shape process to guide curve-fitting for various types of data. When doing so, we would like each resolution $r$ to describe the shape as well as possible within its class $\mathbb{S}_r$. This can be achieved by setting $\mu_r = {\bf 0}$ for $r=1,\ldots,R$ (not including $\mu_0$).

3 Properties of the Prior

3.1 General notations

The supremum and $L_1$ norms are denoted by $\|\cdot\|_\infty$ and $\|\cdot\|_1$, respectively. We let $\|\cdot\|_{p,\nu}$ denote the norm of $L_p(\nu)$, the space of measurable functions with $\nu$-integrable $p$th absolute power. The notation $C(\mathcal{X})$ is used for the space of continuous functions $f:\mathcal{X}\rightarrow\mathbb{R}$ endowed with the uniform norm. For $\alpha>0$, we let $C^\alpha(\mathcal{X})$ denote the Hölder space of order $\alpha$, consisting of the functions $f\in C(\mathcal{X})$ that have $\lfloor\alpha\rfloor$ continuous derivatives, with the $\lfloor\alpha\rfloor$th derivative $f^{\lfloor\alpha\rfloor}$ being Lipschitz continuous of order $\alpha-\lfloor\alpha\rfloor$. We write '$\precsim$' for inequality up to a constant multiple, and $\{a_{(1)}, a_{(2)}, \ldots, a_{(n)}\}$ for the order statistics of the set $\{a_i : a_i\in\mathbb{R}, i=1,\ldots,n\}$.

3.2 Support

Let the Hölder class of periodic functions on $[-\pi,\pi]$ of order $\alpha$ be denoted by $C^\alpha([-\pi,\pi])$. Define the class of closed parametric curves $\mathcal{S}_\mathcal{C}(\alpha_1,\alpha_2)$ having different smoothness along different coordinates as

\mathcal{S}_\mathcal{C}(\alpha_1,\alpha_2) := \{S=(S^1,S^2):[-\pi,\pi]\to\mathbb{R}^2,\; S^i\in C^{\alpha_i}([-\pi,\pi]),\; i=1,2\}. (14)

Consider for simplicity a single-resolution Roth curve with control points $\{c_j, j=0,\ldots,2n\}$. Assume we have independent Gaussian priors on each of the two coordinates of $c_j$ for $j=0,\ldots,2n$, i.e., $C(t)=\sum_{j=0}^{2n}c_j B_j^n(t)$ with $c_j\sim \mbox{N}_2(0,\sigma_j^2 I_2)$, $j=0,\ldots,2n$. Denote the prior for $C$ by $\Pi_{C^n}$; it defines an independent Gaussian process for each of the components of $C$. Technically speaking, the support of a prior is defined as the smallest closed set with probability one. Intuitively, the support characterizes the variety of prior realizations, along with those which are in their limit. We construct a prior distribution to have large support so that the prior realizations are flexible enough to approximate the true underlying target object. As reviewed in ?, the support of a Gaussian process (in our case $\Pi_{C^n}$) is the closure of the corresponding reproducing kernel Hilbert space (RKHS). The following Lemma 3.1 describes the RKHS of $\Pi_{C^n}$; it is a special case of Lemma 2 in ?.

Lemma 3.1

The RKHS $\mathbb{H}^n$ of $\Pi_{C^n}$ consists of all functions $h:[-\pi,\pi]\to\mathbb{R}^2$ of the form

h(t) = \sum_{j=0}^{2n} c_j B_j^n(t), (15)

where the weights $c_j$ range over $\mathbb{R}^2$. The RKHS norm is given by

\|h\|_{\mathbb{H}^n}^2 = \sum_{j=0}^{2n}\|c_j\|^2/\sigma_j^2. (16)

The following theorem describes how well an arbitrary closed parametric curve $S_0\in\mathcal{S}_\mathcal{C}(\alpha_1,\alpha_2)$ can be approximated by the elements of $\mathbb{H}^n$ for each $n$. Refer to Appendix A for a proof.

Theorem 3.2

For any fixed $S_0\in\mathcal{S}_\mathcal{C}(\alpha_1,\alpha_2)$, there exists $h\in\mathbb{H}^n$ with $\|h\|^2_{\mathbb{H}^n}\le K_1\sum_{j=0}^{2n}1/\sigma_j^2$ such that

\|S_0 - h\|_\infty \le K_2 n^{-\alpha_{(1)}}\log n (17)

for some constants $K_1, K_2 > 0$ independent of $n$.

This shows that the Roth basis expansion is sufficiently flexible to approximate any closed curve arbitrarily well. Although we have only shown large support of the prior under independent Gaussian priors on the control points, the multiscale structure should be even more flexible and hence rich enough to characterize any closed curve. We can also expect minimax-optimal posterior contraction rates under the prior $\Pi_{C^n}$, similar to Theorem 2 in ?, for suitable choices of prior distributions on $n$.

3.3 Influence of the control points

The unique maximum of the basis function $B_j^n(t)$ defined in (2) is at $t=-2\pi(j-1)/J$; therefore the control point $c_j$ has the most significant effect on the shape of the curve in the neighborhood of the point $C(-2\pi(j-1)/J)$. Note that $B_j^n(t)$ vanishes at $t=\pi-2\pi(j-1)/J$, thus $c_j$ has no effect on the corresponding point, i.e., that point of the curve is invariant under modification of $c_j$. The control point $c_j$ affects all other points of the curve, i.e., the curve is globally controlled. These properties are illustrated in Figure 5.

Figure 5: Influence of the control point on the Roth curve

However, following Proposition 5 in ?, we emphasize that while control points have a global effect on the shape, this influence tends to be local and decreases dramatically on more distant parts of the curve, especially for higher values of $n$.

4 Inference from Point Cloud Data

We now demonstrate how our multiscale closed curve process can be used as a prior distribution for fitting a shape to a 2D point cloud. As a byproduct of fitting, we also obtain an intuitive description of the shape in terms of deformation vectors.

Assume that the data consist of points $\{p_i\in\mathbb{R}^2, i=1,\ldots,N\}$ concentrated near a 2D closed curve. Since a Roth curve can be thought of as a function expressing the trajectory of a particle over time, we view each data point, $p_i$, as a noisy observation of the particle's location at a given time $t_i$,

p_i = C(t_i) + \epsilon_i, \quad \epsilon_i \sim N_2(0, \sigma^2 I_2). (18)

Model (18) has a form similar to nonlinear factor models, where $t_i$ is the latent factor score. We start by specifying the likelihood and prior distributions conditionally on the $t_i$'s, rewriting the point cloud model in stacked vector notation. Defining

p = (p_{1,x}, p_{1,y}, \ldots, p_{N,x}, p_{N,y})', \quad \epsilon = (\epsilon_{1,x}, \epsilon_{1,y}, \ldots, \epsilon_{N,x}, \epsilon_{N,y})',

t = (t_1, \ldots, t_N)', \quad \mathbb{X}(t)' = [X(t_1)'\; X(t_2)'\;\cdots\; X(t_N)'],

we have

p = \mathbb{X}(t)c + \epsilon, \quad \epsilon \sim N_{2N}(0, \sigma^2 I_{2N}), (19)

where $X(t_i)$ is as defined in (10).

To fit a Roth curve through the data, we want to infer $P(c\mid p)$, the posterior distribution over control points $c$ given the data points $p$. To compute this, we must specify $P(p\mid c)$, the likelihood, and $P(c)$, the prior distribution over Roth curves specified by $c$; refer to (13) in §2.5 for the multiscale prior $P(c)$. From (19), we can specify the likelihood function as

P(\{p_i\}_1^N \mid \{c_j\}_1^J) = \prod_{i=1}^N N_2\Big(p_i;\, \sum_{j=1}^J c_j B_j(t_i),\, \sigma^2 I_2\Big), \quad\text{i.e., } P(p\mid c) = N_{2N}(p;\, \mathbb{X}(t)c,\, \sigma^2 I_{2N}) \text{ in vector notation.} (20)

This completes the Bayesian formulation for inferring $c$ given $p$ and $t$. In §7, we describe the exact method for performing Bayesian inference. As a byproduct of inference, we also infer the deformation vectors $d^{(r)}$ for $r=1,\ldots,R$; due to their multiscale organization, they may describe shape in a more intuitive manner than $\{c_j, j=1,\ldots,J\}$.

We propose a prior for $t_i$ conditionally on $c$, which is designed to be uniform over the curve's arc length. This prior is motivated by the frequentist literature on arc-length parameterizations (?), but instead of replacing the points $\{p_i\in\mathbb{R}^2\}$ with $\{t_i\in[-\pi,\pi]\}$ in a deterministic preliminary step prior to statistical analysis, we use a Bayesian approach to formally accommodate uncertainty in the parameterization of the points. Define the arc-length function $A:[-\pi,\pi]\mapsto\mathbb{R}^+$,

A(u) := A(u; (c_0,\ldots,c_{2n})) = \int_{-\pi}^u \|H(t)\|\,dt. (21)

Note that $A$ is monotonically increasing and satisfies $A(-\pi)=0$ and $A(\pi)=L(c_0,\ldots,c_{2n})$, where $L(c_0,\ldots,c_{2n}) = \int_{-\pi}^{\pi}\|H(t)\|\,dt$ is the length of the curve conditional on the control points $(c_0,\ldots,c_{2n})$.

Given $(c_0,\ldots,c_{2n})$, we draw $l_i\sim\mathrm{Unif}(0, L(c_0,\ldots,c_{2n}))$ and set $t_i = A^{-1}(l_i)$. Thus we obtain a prior for the $t_i$'s which is uniform along the length of the curve. We will discuss a novel griddy Gibbs algorithm for implementing the arc-length parameterization in a fully Bayesian framework in §7.
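A sketch (ours, building on the hodograph sketch of §2.3) of this prior draw: the arc-length function (21) is accumulated on a grid by the trapezoid rule and inverted by monotone interpolation.

```python
def sample_t_uniform_arclength(control_points, N, rng, grid_size=512):
    """Draw t_1..t_N uniformly over arc length: l_i ~ Unif(0, L) and
    t_i = A^{-1}(l_i), with A from (21) accumulated on a grid."""
    t_grid = np.linspace(-np.pi, np.pi, grid_size)
    speed = np.linalg.norm(hodograph(t_grid, control_points), axis=1)
    A = np.concatenate([[0.0],
                        np.cumsum(0.5 * (speed[1:] + speed[:-1]) * np.diff(t_grid))])
    l = rng.uniform(0.0, A[-1], size=N)   # A[-1] is the curve length L
    return np.interp(l, A, t_grid)        # invert the monotone function A
```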

5 Inference from Pixelated Image Data

In this section, we show how pixelated image data yield point cloud data concentrated near a 2D closed curve, together with a bonus estimate of the object's surface orientation $\omega_i$ at each point $p_i$. We incorporate this extra information into our model to obtain an even better shape fit, with essentially no sacrifice in computational efficiency.

A grayscale image can be treated as a function $Z:\mathbb{R}^2\rightarrow\mathbb{R}$. The gradient of this function, $\nabla Z:\mathbb{R}^2\rightarrow\mathbb{R}^2$, is a vector field, where $\nabla Z(x,y)$ is a vector pointing in the direction of steepest ascent. In computer vision, it is well known that the gradient norm of the image, $\|\nabla Z\|_2:\mathbb{R}^2\rightarrow\mathbb{R}$, approximates a 'line drawing' of all the high-contrast edges in the image. Our goal is to fit the edges in the image with our shape model.

In practice, an image is discretized into pixels $\{z_{a,b} \mid a=1,\ldots,X,\ b=1,\ldots,Y\}$, but a discrete version of the gradient can still be computed by taking differences between neighboring pixels, such that one gradient vector $g_{a,b}$ is computed at each pixel. The image's gradient norm is then just another image, where each pixel is $m_{a,b}=\|g_{a,b}\|_2$.

Finally, we extract a point cloud: $\{(a,b) \mid m_{a,b}>M,\ a=1,\ldots,X,\ b=1,\ldots,Y\}$, where $M$ is some user-specified threshold. Each point $(a,b)$ can still be matched to a gradient vector $g_{a,b}$; for convenience, we re-index them as $p_i$ and $g_i$. The gradient vector points in the direction of steepest change in contrast, i.e., it points across the edge of the object, approximating the object's surface normal. The surface orientation is then $\omega_i = \arctan(g_{i,y}/g_{i,x})$.
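A sketch (ours; the function name is hypothetical, and np.gradient stands in for whatever discrete differencing is used) of this extraction step:

```python
import numpy as np

def image_to_point_cloud(Z, M):
    """Extract points p_i and orientations omega_i from a grayscale image Z:
    threshold the gradient-norm image m = ||g|| at M, and compute
    omega_i = arctan(g_y/g_x) with the quadrant-aware arctan2."""
    ga, gb = np.gradient(Z.astype(float))   # finite differences along the two axes
    m = np.hypot(ga, gb)                    # gradient norm: the 'line drawing'
    a, b = np.nonzero(m > M)
    points = np.column_stack([a, b]).astype(float)
    omega = np.arctan2(gb[a, b], ga[a, b])
    return points, omega
```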

In the following, we describe a model relating a Roth curve to each $\omega_i$. This model can be used together with the model we specified earlier for the $p_i$.

5.1 Modeling surface orientation

Denote by $v_i = (H_x(t_i), H_y(t_i))\in\mathbb{R}^2$ the velocity vector of the curve $C(t)$ at the parameterization location $t_i$, $i=1,\ldots,N$. Note that $v_i$ is always tangent to the curve. Since each $\omega_i$ points roughly normal to the curve, we can rotate all of them by 90 degrees, $\theta_i = \omega_i + \pi/2$, and treat each $\theta_i$ as a noisy estimate of $v_i$'s orientation. Note that we cannot rotate the vector $g_i$ by 90 degrees and directly treat it as a noisy observation of $v_i$. In particular, $g_i$'s magnitude bears no relationship to the magnitude of $v_i$: $\|g_i\|$ is the rate of change in image brightness when crossing the edge of the shape, while $\|v_i\|$ describes the speed at which the curve passes through $p_i$.

Suppose we did have some noisy observation of $v_i$, denoted $u_i$. Then we could have specified the following linear model relating the curve $\{c_j, j=1,\ldots,J\}$ to the $u_i$'s:

u_i = v_i + \delta_i (22)
    = \sum_{j=1}^J c_j \frac{d}{dt}B_j(t_i) + \delta_i (23)

for $i=1,\ldots,N$, where $\delta_i\sim N_2(0,\tau^2 I_2)$. Instead, we only know the angle of $u_i$, namely $\theta_i$. In §7, we show that using this model we can still write the likelihood for $\theta_i$ by marginalizing out the unknown magnitude of $u_i$. The resulting likelihood still yields conditional conjugacy of the control points.

6 Fitting a population of shapes

We can easily generalize the methodology above to fit a collection of $K$ separate point clouds, and characterize the resulting population of shapes, each represented by a closed curve. Continuing the earlier vector notation, we represent the $k$th point cloud by a stacked vector $p^k$, with corresponding parameterizations $t^k$, surface orientations $\theta^k$, and control points $c^k$. Finally, we denote the deformations which produce the curve for shape $k$ by $d^{(r),k}$ for each step $r$ of the shape process.

Up to this point, the parameters specifying each closed curve are separate and independent. This is sufficient if we just wish to fit each point cloud independently. However, now we aim to characterize all the curves as a single population. To do so, we treat each curve as an observation generated from a single random shape process. We borrow information across the population of curves through sharing hyperparameters of our multiscale deformation model or by shrinking to a common value by assigning a hyperprior. These inferred hyperparameters and the uncertainty in estimating them will effectively characterize the whole population of shapes. This is a hierarchical modeling strategy that is often used to characterize a population.

The hyperparameters of our random shape process are $\mu_r$ and $\Sigma_r$ for $r=1,\ldots,R$. We can treat each $\mu_r$ as an unknown and place the following prior on it:

\mu_r \sim N_{2J_r}(\mu_{\mu_r}, \Sigma_{\mu_r}). (24)

By assuming that all shapes are generated from a single shape process with unknown $\mu_r$'s, we are basically assuming that all shapes are deformed versions of one 'central shape' (defined earlier in §2.5), where the variability in deformation at scale $r$ is $\Sigma_r$.

Now suppose that each shape is rotated to a different angle. In this case, it may not be ideal to share all the deformations, because this would assume that all shapes are at exactly the same angle. One solution is simply to make $\Sigma_1$ very large, allowing large variation in the $d^{(1),k}$. These deformations define the coarsest outline of each shape, which may be an ellipse. If these have wide variance, each coarse outline may be rotated to a different angle. Furthermore, since all subsequent deformations are defined relative to this initial coarse outline, different shapes can share the exact same deformations even if they are rotated to different angles.

Note that with this modification, all rotated shapes have essentially been aligned with each other, because all shape details expressed by the deformations for $r>1$ have been matched up.

7 Posterior computation

7.1 An approximation to the deformation-orienting matrix for the deformation vector

Observe that since $T_r(c^{(r),k})$ may not be linear in $c^{(r),k}$, due to the $\arctan$ in (7), the full conditional distribution of $c^{(r),k}$ is not conditionally conjugate. Below, we develop a novel approximation $\tilde{T}_r$ to the rotation matrix $T_r(c^{(r-1),k})$. The approximation ensures that $T_r(c^{(r-1),k})$ is linear in the level $r-1$ control points $c^{(r-1),k}$, which results in conditional conjugacy of $c^{(r-1),k}$. We resort to a Metropolis-Hastings algorithm, with an independent proposal suggested by the approximation, to correct for the approximation error. For $j=1,\ldots,J_r$, let $q_{j,r} = -2\pi(j-1)/J_r$. Recall that $\dot{X}(q_{j,r})c^{(r-1),k}$ is the hodograph evaluated at $q_{j,r}$.

Proposition 7.1. $R_j$ in (4) can be approximated by $\tilde{R}_j$, $j=1,\ldots,J_r$, where the $\tilde{R}_j$ are approximate rotation matrices given by

\tilde{R}_j = \frac{2\pi}{L_A}\begin{bmatrix}\dot{X}_x(q_{j,r})c^{(r-1),k} & -\dot{X}_y(q_{j,r})c^{(r-1),k}\\ \dot{X}_y(q_{j,r})c^{(r-1),k} & \dot{X}_x(q_{j,r})c^{(r-1),k}\end{bmatrix} (25)

and $L_A$ is an approximation to the length $A(\pi; c^{(r-1),k})$ of the curve formed by the control points $c^{(r-1),k}$.

  • Proof

    Refer to the definition of $R_j$ in (4). First we derive an approximation for $\cos(\theta_j^*)$ and $\sin(\theta_j^*)$, where $\theta_j^*$ is the tangent angle at the point $q_{j,r}$ of the curve specified by $c^{(r-1),k}$. We write $\theta_j^*$ in vector notation:

    \theta_j^* = \arctan\left(\frac{\dot{X}_y(q_{j,r})c^{(r-1),k}}{\dot{X}_x(q_{j,r})c^{(r-1),k}}\right). (26)

    Using the identities

    \cos\arctan(y/x) = \frac{x}{\sqrt{x^2+y^2}}, \quad \sin\arctan(y/x) = \frac{y}{\sqrt{x^2+y^2}}, (27)

    we obtain

    \cos(\theta_j^*) = \frac{\dot{X}_x(q_{j,r})c^{(r-1),k}}{\sqrt{(\dot{X}_x(q_{j,r})c^{(r-1),k})^2 + (\dot{X}_y(q_{j,r})c^{(r-1),k})^2}}, \quad
    \sin(\theta_j^*) = \frac{\dot{X}_y(q_{j,r})c^{(r-1),k}}{\sqrt{(\dot{X}_x(q_{j,r})c^{(r-1),k})^2 + (\dot{X}_y(q_{j,r})c^{(r-1),k})^2}}.

    The magnitude $\sqrt{(\dot{X}_x(q_{j,r})c^{(r-1),k})^2 + (\dot{X}_y(q_{j,r})c^{(r-1),k})^2}$ of the velocity vector at the point $q_{j,r}$ can be well approximated by the quantity $A(\pi; c^{(r-1),k})/(2\pi)$, in view of the uniform arc-length parameterization discussed in §4. Hence

    \cos(\theta_j^*) \approx \frac{2\pi}{A(\pi; c^{(r-1),k})}\dot{X}_x(q_{j,r})c^{(r-1),k}, \quad
    \sin(\theta_j^*) \approx \frac{2\pi}{A(\pi; c^{(r-1),k})}\dot{X}_y(q_{j,r})c^{(r-1),k}.

    Plugging in a fixed approximation $L_A$ for the length $A(\pi; c^{(r-1),k})$, we obtain the required result. $\Box$

7.2 Conditional posteriors for $m$ and $d^{(r)}$

Before deriving the conditional posteriors, we first introduce some simplifying notation. Recall from §2.5 that

c^{(r)} = E_r c^{(r-1)} + T_r\left(E_r c^{(r-1)}\right) d^{(r)}.

Using the approximation $\hat{T}_r c^{(r-1)} \approx T_r(E_r c^{(r-1)})d^{(r)}$, where $\hat{T}_r$ is linear in $c^{(r-1)}$ (it is built from the $\tilde{R}_j$ of Proposition 7.1 and depends on $d^{(r)}$), we then have $c^{(r)} \approx (E_r + \hat{T}_r)c^{(r-1)}$. With this arrangement, we can cleanly write $c^{(R)}$ in terms of the base case, $c^{(0)}$. Define

\Omega_a^b = \begin{cases}\prod_{r=a}^b \left(E_r + \hat{T}_r\right) & \text{if } a \le b,\\ 1 & \text{otherwise.}\end{cases}

Hence $c^{(R)}$ can be approximated by

c^{(R)} \approx \Omega_1^R c^{(0)}.

Note that the terms in $\Omega_{r+1}^R$ are the source of approximation error. Given this expression, we can easily write $c^{(R)}$ in terms of $d^{(0)}$ and the initial center $m$ (which replaces the origin ${\bf 0}$ of §2.5 by an unknown, shape-specific center):

c^{(R)} \approx \Omega_1^R\left(m + T_0 d^{(0)}\right).

We can also write $c^{(R)}$ in terms of $c^{(r-1)}$ and $d^{(r)}$, for any $r=1,\ldots,R$:

c^{(R)} \approx \Omega_{r+1}^R\left[E_r c^{(r-1)} + T_r\left(c^{(r-1)}\right)d^{(r)}\right].

Note that as $r$ approaches $R$, $\Omega_{r+1}^R$ involves fewer factors and the amount of approximation error decreases.

We are now ready to derive the conditional posteriors for $m^k$ and $d^{(r)}$ (as in §6, we are using a superscript $k$ to denote variables for the $k$th shape). First, we claim that all posteriors can be written in the following form, for generic $x$, $y$ and $z$:

P(x\mid -) \propto \mbox{N}\left(y;\, Qx,\, \Sigma_y\right)\,\mbox{N}\left(x;\, z,\, \Sigma_x\right), (28)

P(x\mid -) \sim \mbox{N}\left(\hat{\mu}, \hat{\Sigma}\right), (29)

\hat{\Sigma}^{-1} = \Sigma_x^{-1} + \sum_k Q'\Sigma_y^{-1}Q, \quad \hat{\mu} = \hat{\Sigma}\left(\Sigma_x^{-1}z + \sum_k Q'\Sigma_y^{-1}y\right).

Note that each conditional posterior is simply a multivariate normal. We now prove that each posterior can be rearranged to match the form of (28) - (29).

P(m^k\mid -) \propto \mbox{N}\left(p^k;\, \mathbb{X}(t^k)c^{(R),k},\, \sigma^2 I_{2N^k}\right)\,\mbox{N}(m^k;\, \mu_m, \Sigma_m)

\propto \mbox{N}\left(p^k;\, \mathbb{X}(t^k)\Omega_1^{R,k}\left(m^k + T_0 d^{(0),k}\right),\, \sigma^2 I_{2N^k}\right)\,\mbox{N}(m^k;\, \mu_m, \Sigma_m)

\propto \mbox{N}\left(p^k - \mathbb{X}(t^k)\Omega_1^{R,k}T_0 d^{(0),k};\, \mathbb{X}(t^k)\Omega_1^{R,k}m^k,\, \sigma^2 I_{2N^k}\right)\,\mbox{N}(m^k;\, \mu_m, \Sigma_m),

P(d^{(r)}\mid -) \propto \mbox{N}\left(p^k;\, \mathbb{X}(t^k)c^{(R),k},\, \sigma^2 I_{2N^k}\right)\,\mbox{N}(d^{(r)};\, \mu_r, \Sigma_r)

\propto \mbox{N}\left(p^k;\, \mathbb{X}(t^k)\Omega_{r+1}^R\left[E_r c^{(r-1)} + T_r\left(c^{(r-1)}\right)d^{(r)}\right],\, \sigma^2 I_{2N^k}\right)\,\mbox{N}(d^{(r)};\, \mu_r, \Sigma_r)

\propto \mbox{N}\left(p^k - \mathbb{X}(t^k)\Omega_{r+1}^R E_r c^{(r-1)};\, \mathbb{X}(t^k)\Omega_{r+1}^R T_r\left(c^{(r-1)}\right)d^{(r)},\, \sigma^2 I_{2N^k}\right)\,\mbox{N}(d^{(r)};\, \mu_r, \Sigma_r).
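The rearrangements above reduce every update to the generic form (28)-(29); a minimal sketch (ours, assuming NumPy), with y_list and Q_list collecting the per-shape terms in the sums over $k$:

```python
import numpy as np

def gaussian_conjugate_update(y_list, Q_list, Sigma_y_inv, z, Sigma_x_inv):
    """Generic conjugate update (28)-(29): data y_k = Q_k x + noise,
    prior x ~ N(z, Sigma_x); returns posterior mean and covariance."""
    P = Sigma_x_inv.copy()               # posterior precision Sigma_hat^{-1}
    b = Sigma_x_inv @ z
    for y, Q in zip(y_list, Q_list):     # the sums over k in (29)
        P += Q.T @ Sigma_y_inv @ Q
        b += Q.T @ Sigma_y_inv @ y
    Sigma_hat = np.linalg.inv(P)
    return Sigma_hat @ b, Sigma_hat      # (mu_hat, Sigma_hat)
```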

7.3 Conditional update for $\sigma^2 = \tau_p^{-1}$

P(\tau_p\mid -) \propto \prod_{k=1}^K\prod_{i=1}^{N_k}\mbox{N}\left(p_i^k;\, p_i^{k*},\, \tau_p^{-1}I_2\right)\,\mbox{Ga}\left(\tau_p;\, \alpha, \beta\right)

\propto \tau_p^{N_{tot}}\exp\left[-\frac{\tau_p}{2}\sum_{k=1}^K\sum_{i=1}^{N_k}(p_i^k - p_i^{k*})'(p_i^k - p_i^{k*})\right]\tau_p^{\alpha-1}\exp\left(-\beta\tau_p\right)

\propto \tau_p^{\alpha+N_{tot}-1}\exp\left[-\left(\beta + \frac{1}{2}\sum_{k=1}^K\sum_{i=1}^{N_k}\|p_i^k - p_i^{k*}\|_2^2\right)\tau_p\right],

where $p_i^{k*} = X(t_i^k)c^{(R),k}$ and $N_{tot} = \sum_{k=1}^K N_k$. The conditional posterior distribution is then

\tau_p\mid - \sim \mbox{Ga}\left(\hat{\alpha}, \hat{\beta}\right),

where

\hat{\alpha} = \alpha + N_{tot}, \qquad \hat{\beta} = \beta + \frac{1}{2}\sum_{k=1}^K\sum_{i=1}^{N_k}\|p_i^k - p_i^{k*}\|_2^2.
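This update is a one-line conjugate draw; a sketch (ours), matching the Gamma posterior above:

```python
def update_tau_p(p_list, p_star_list, alpha, beta, rng):
    """Conjugate Gamma draw for tau_p = 1/sigma^2: p_list[k] holds the N_k points
    of shape k, p_star_list[k] the fitted curve points X(t_i^k) c^(R),k."""
    N_tot = sum(len(p) for p in p_list)
    ss = sum(np.sum((p - ps) ** 2) for p, ps in zip(p_list, p_star_list))
    return rng.gamma(shape=alpha + N_tot, scale=1.0 / (beta + 0.5 * ss))
```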

7.4 Likelihood contribution from surface-normals

Define

\dot{X}_x(t_i) = \left[\frac{dB_0^{n_R}(t_i)}{dt},\, 0,\, \frac{dB_1^{n_R}(t_i)}{dt},\, 0,\, \cdots,\, \frac{dB_{2n_R}^{n_R}(t_i)}{dt},\, 0\right], (30)

\dot{X}_y(t_i) = \left[0,\, \frac{dB_0^{n_R}(t_i)}{dt},\, 0,\, \frac{dB_1^{n_R}(t_i)}{dt},\, \cdots,\, 0,\, \frac{dB_{2n_R}^{n_R}(t_i)}{dt}\right]. (31)
Proposition 7.2. The likelihood contribution of the tangent directions $\theta_i^k$, $i=1,\ldots,N^k$, ensures conjugate updates of the control points under a multivariate normal prior.

  • Proof

    Recall the noisy tangent direction vectors $u_i^k$ and $v_i^k$ in (22). Use the simple reparameterization

    u_i^k = (m_i^k,\, m_i^k\tan\theta_i^k)',

    where only the $\theta_i^k$'s are observed and the $m_i^k$'s are not. Observe that

    v_i^k = (H_x(t_i^k), H_y(t_i^k)) = (\dot{X}_x(t_i^k)c^{(R),k},\, \dot{X}_y(t_i^k)c^{(R),k}). (32)

    Assuming a non-informative prior for the $m_i^k$'s on $\mathbb{R}$, the marginal likelihood of the tangent direction $\theta_i^k$ given $\tau^2$ and the parameterization $t_i^k$ is given by

    l(\theta_i^k) = \frac{1}{2\pi\tau^2}\int_{-\infty}^{\infty}\exp\left[-\frac{1}{2\tau^2}\left\{(m_i^k - \dot{X}_x(t_i^k)c^{(R),k})^2 + (m_i^k\tan(\theta_i^k) - \dot{X}_y(t_i^k)c^{(R),k})^2\right\}\right]dm_i^k.

    It turns out the above expression has a closed form given by

    l(\theta_i^k) = \frac{1}{2\pi\tau^2}\frac{\sqrt{2\pi\tau^2}}{\sqrt{1+\tan^2(\theta_i^k)}}\exp\left[-\frac{1}{2\tau^2}\left\{(\dot{X}_x(t_i^k)c^{(R),k})^2 + (\dot{X}_y(t_i^k)c^{(R),k})^2 - \frac{(\dot{X}_x(t_i^k)c^{(R),k} + \dot{X}_y(t_i^k)c^{(R),k}\tan(\theta_i^k))^2}{1+\tan^2(\theta_i^k)}\right\}\right].

    The likelihood for $\{\theta_i^k, i=1,\ldots,N^k\}$ is given by

    L(\theta_1^k,\ldots,\theta_{N^k}^k) \propto \frac{1}{\tau^{N^k}}\exp\left[-\frac{1}{2\tau^2}\sum_{i=1}^{N^k}\frac{(\dot{X}_x(t_i^k)c^{(R),k})^2\tan^2(\theta_i^k) + (\dot{X}_y(t_i^k)c^{(R),k})^2 - 2\dot{X}_x(t_i^k)c^{(R),k}\dot{X}_y(t_i^k)c^{(R),k}\tan(\theta_i^k)}{1+\tan^2(\theta_i^k)}\right]

    = \frac{1}{\tau^{N^k}}\exp\left[-\frac{1}{2\tau^2}(c^{(R),k})'\left\{\sum_{i=1}^{N^k} T_i^k\Sigma_i^k (T_i^k)'\right\}c^{(R),k}\right],

    where

    \Sigma_i^k = \frac{1}{1+\tan^2(\theta_i^k)}\begin{pmatrix}\tan^2(\theta_i^k) & -\tan(\theta_i^k)\\ -\tan(\theta_i^k) & 1\end{pmatrix}

    and $T_i^k = [(\dot{X}_x(t_i^k))'\;\; (\dot{X}_y(t_i^k))']$ is a $2(2n_R+1)\times 2$ matrix. Clearly, an inverse-Gamma prior for $\tau^2$ and a multivariate normal prior for the control points are conjugate choices. $\Box$

7.5 Griddy Gibbs updates for the parameterizations $t_i^k$

We discretize the possible values of $t_i^k\in[-\pi,\pi]$ to obtain a discrete approximation of its conditional posterior: for $t$ in a grid $\mathcal{G}\subset[-\pi,\pi]$,

P(t_i^k = t \mid -) = \frac{\mbox{N}(p_i^k;\, X(t)c^{(R),k},\, \sigma^2 I_2)}{\sum_{\tau\in\mathcal{G}}\mbox{N}(p_i^k;\, X(\tau)c^{(R),k},\, \sigma^2 I_2)}.

We can make this approximation arbitrarily accurate by taking a finer grid $\mathcal{G}$.
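A sketch (ours, building on the roth_curve sketch of §2.2) of this update for a single point:

```python
def griddy_gibbs_t(p_i, control_points, sigma2, rng, grid_size=256):
    """Griddy Gibbs update for one parameterization t_i: evaluate the Gaussian
    likelihood over a grid on [-pi, pi), normalize, and sample."""
    t_grid = np.linspace(-np.pi, np.pi, grid_size, endpoint=False)
    resid = roth_curve(t_grid, control_points) - p_i      # (grid_size, 2) residuals
    log_w = -0.5 * np.sum(resid ** 2, axis=1) / sigma2
    w = np.exp(log_w - log_w.max())                       # stabilize before normalizing
    return rng.choice(t_grid, p=w / w.sum())
```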

8 Simulation study

9 Case study

A Proofs of main results

Proof of Theorem 3.2: From (?), and observing that the basis functions $\{B_j^n, j=0,\ldots,2n\}$ span the vector space of trigonometric polynomials of degree at most $n$, it follows that given any $S_0^i\in C^{\alpha_i}([-\pi,\pi])$, there exists $h^i(u)=\sum_{j=0}^{2n}c_j^i B_j^n(u)$, $h^i:[-\pi,\pi]\to\mathbb{R}$ with $|c_j^i|\le M_i$, such that $\|h^i - S_0^i\|_\infty \le K_i n^{-\alpha_i}\log n$ for some constants $M_i, K_i>0$, $i=1,2$. Setting $h(u)=\sum_{j=0}^{2n}(c_j^1,c_j^2)'B_j^n(u)$, we have

\|h - S_0\|_\infty \le M n^{-\alpha_{(1)}}\log n

with $\|h\|^2_{\mathbb{H}^n}\le K\sum_{j=0}^{2n}1/\sigma_j^2$, where $M=M_{(2)}$ and $K=K_{(2)}$. $\Box$

Proof of Lemma 3.1:

REFERENCES

  • Amenta, N., Bern, M., and Kamvysselis, M. (1998), "A new Voronoi-based surface reconstruction algorithm," in Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, ACM, pp. 415–421.
  • Aziz, N., Bata, R., and Bhat, S. (2002), "Bezier surface/surface intersection," IEEE Computer Graphics and Applications, 10(1), 50–58.
  • Barnhill, R. (1985), "Surfaces in computer aided geometric design: A survey with new results," Computer Aided Geometric Design, 2(1-3), 1–17.
  • Bhattacharya, A. (2008), "Statistical analysis on manifolds: A nonparametric approach for inference on shape spaces," Sankhyā: The Indian Journal of Statistics, 70(Part 3), 0–43.
  • Bhattacharya, A., and Dunson, D. (2010), "Nonparametric Bayesian density estimation on manifolds with applications to planar shapes," Biometrika, 97(4), 851.
  • Bhattacharya, A., and Dunson, D. (2011a), "Nonparametric Bayes classification and hypothesis testing on manifolds," Journal of Multivariate Analysis, to appear.
  • Bhattacharya, A., and Dunson, D. (2011b), "Strong consistency of nonparametric Bayes density estimation on compact metric spaces," Annals of the Institute of Statistical Mathematics, to appear.
  • Bhattacharya, R., and Bhattacharya, A. (2009), "Statistics on manifolds with applications to shape spaces," p. 41.
  • Bookstein, F. (1986), "Size and shape spaces for landmark data in two dimensions," Statistical Science, 1(2), 181–222.
  • Bookstein, F. (1996a), "Landmark methods for forms without landmarks: localizing group differences in outline shape," in Proceedings of the Workshop on Mathematical Methods in Biomedical Image Analysis, IEEE, pp. 279–289.
  • Bookstein, F. (1996b), "Shape and the information in medical images: A decade of the morphometric synthesis," in Proceedings of the Workshop on Mathematical Methods in Biomedical Image Analysis, IEEE, pp. 2–12.
  • Bookstein, F. (1996c), "Standard formula for the uniform shape component in landmark data," NATO ASI Series A: Life Sciences, 284, 153–168.
  • Cinquin, P., Chalmond, B., and Berard, D. (1982), "Hip prosthesis design," Lecture Notes in Medical Informatics, 16, 195–200.
  • Désidéri, J., Abou El Majd, B., and Janka, A. (2007), "Nested and self-adaptive Bézier parameterizations for shape optimization," Journal of Computational Physics, 224(1), 117–131.
  • Désidéri, J., and Janka, A. (2004), "Multilevel shape parameterization for aerodynamic optimization: application to drag and noise reduction of transonic/supersonic business jet," in European Congress on Computational Methods in Applied Sciences and Engineering (ECCOMAS 2004), Jyväskylä, pp. 24–28.
  • Dryden, I., and Gattone, S. (2001), "Surface shape analysis from MR images," Proceedings in Functional and Spatial Data Analysis, pp. 139–142.
  • Dryden, I., and Mardia, K. (1993), "Multivariate shape analysis," Sankhyā: The Indian Journal of Statistics, Series A, pp. 460–480.
  • Dryden, I., and Mardia, K. (1998), Statistical Shape Analysis, Vol. 4, John Wiley & Sons, New York.
  • Gong, L., Pathak, S., Haynor, D., Cho, P., and Kim, Y. (2004), "Parametric shape modeling using deformable superellipses for prostate segmentation," IEEE Transactions on Medical Imaging, 23(3), 340–349.
  • Hagen, H., and Santarelli, P. (1992), "Variational design of smooth B-spline surfaces," in Topics in Surface Modeling, Society for Industrial and Applied Mathematics, pp. 85–92.
  • Hastie, T., and Stuetzle, W. (1989), "Principal curves," Journal of the American Statistical Association, pp. 502–516.
  • Kent, J., Mardia, K., Morris, R., and Aykroyd, R. (2001), "Functional models of growth for landmark data," Proceedings in Functional and Spatial Data Analysis, pp. 109–115.
  • Kume, A., Dryden, I., and Le, H. (2007), "Shape-space smoothing splines for planar landmark data," Biometrika, 94(3), 513–528.
  • Kurtek, S., Srivastava, A., Klassen, E., and Ding, Z. (2011), "Statistical modeling of curves using shapes and related features."
  • Lang, J., and Röschel, O. (1992), "Developable (1, n)-Bézier surfaces," Computer Aided Geometric Design, 9(4), 291–298.
  • Madi, M. (2004), "Closed-form expressions for the approximation of arclength parameterization for Bezier curves," International Journal of Applied Mathematics and Computer Science, 14(1), 33–42.
  • Malladi, R., Sethian, J., and Vemuri, B. (1994), "Evolutionary fronts for topology-independent shape modeling and recovery," Computer Vision ECCV'94, pp. 1–13.
  • Malladi, R., Sethian, J., and Vemuri, B. (1995), "Shape modeling with front propagation: A level set approach," IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(2), 158–175.
  • Mardia, K., and Dryden, I. (1989), "The statistical analysis of shape data," Biometrika, 76(2), 271–281.
  • Mokhtarian, F., and Mackworth, A. (1992), "A theory of multiscale, curvature-based shape representation for planar curves," IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(8), 789–805.
  • Mortenson, M. (1985), Geometric Modeling, John Wiley, New York.
  • Muller, H. (2005), "Surface reconstruction: an introduction," in Scientific Visualization Conference, 1997, IEEE, p. 239.
  • Pati, D., and Dunson, D. (2011), "Bayesian modeling of closed surfaces through tensor products," submitted to Biometrika.
  • Persoon, E., and Fu, K. (1977), "Shape discrimination using Fourier descriptors," IEEE Transactions on Systems, Man and Cybernetics, 7(3), 170–179.
  • Rossi, D., and Willsky, A. (2003), "Reconstruction from projections based on detection and estimation of objects, Parts I and II: Performance analysis and robustness analysis," IEEE Transactions on Acoustics, Speech and Signal Processing, 32(4), 886–906.
  • Róth, Á., Juhász, I., Schicho, J., and Hoffmann, M. (2009), "A cyclic basis for closed curve and surface modeling," Computer Aided Geometric Design, 26(5), 528–546.
  • Sato, Y., Wheeler, M., and Ikeuchi, K. (1997), "Object shape and reflectance modeling from observation," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, ACM Press/Addison-Wesley, pp. 379–387.
  • Sederberg, T., Gao, P., Wang, G., and Mu, H. (1993), "2-D shape blending: an intrinsic solution to the vertex path problem," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, ACM, pp. 15–18.
  • Stepanets, A. (1974), "The approximation of certain classes of differentiable periodic functions of two variables by Fourier sums," Ukrainian Mathematical Journal, 25(5), 498–506.
  • Su, B., and Liu, D. (1989), Computational Geometry: Curve and Surface Modeling, Academic Press Professional, San Diego, CA.
  • Su, J., Dryden, I., Klassen, E., Le, H., and Srivastava, A. (2011), "Fitting optimal curves to time-indexed, noisy observations of stochastic processes on nonlinear manifolds," Journal of Image and Vision Computing.
  • van der Vaart, A., and van Zanten, J. (2008), "Reproducing kernel Hilbert spaces of Gaussian priors," IMS Collections, 3, 200–222.
  • Whitney, H. (1937), "On regular closed curves in the plane," Compositio Mathematica, 4, 276–284.
  • Zahn, C., and Roskies, R. (1972), "Fourier descriptors for plane closed curves," IEEE Transactions on Computers, 100(3), 269–281.
  • Zheng, Y., John, M., Liao, R., Boese, J., Kirschstein, U., Georgescu, B., Zhou, S., Kempfert, J., Walther, T., Brockmann, G., et al. (2010), "Automatic aorta segmentation and valve landmark detection in C-arm CT: application to aortic valve implantation," Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2010, pp. 476–483.