
Geometry and Singularities of the Prony mapping

Dmitry Batenkov Department of Mathematics
Weizmann Institute of Science
Rehovot 76100
Israel
dima.batenkov@weizmann.ac.il http://www.wisdom.weizmann.ac.il/~dmitryb
and Yosef Yomdin yosef.yomdin@weizmann.ac.il http://www.wisdom.weizmann.ac.il/~yomdin
Abstract.

Prony mapping provides the global solution of the Prony system of equations

\sum_{i=1}^{n}A_{i}x_{i}^{k}=m_{k},\qquad k=0,1,\dots,2n-1.

This system appears in numerous theoretical and applied problems arising in Signal Reconstruction. The simplest example is the problem of reconstruction of a linear combination of $\delta$-functions of the form $g(x)=\sum_{i=1}^{n}a_{i}\delta(x-x_{i})$, with the unknown parameters $a_{i},\ x_{i},\ i=1,\dots,n$, from the “moment measurements” $m_{k}=\int x^{k}g(x)\,dx$.

Global solution of the Prony system, i.e. inversion of the Prony mapping, encounters several types of singularities. One of the most important ones is a collision of some of the points $x_{i}$. The investigation of this type of singularities was started in [21], where the role of finite differences was demonstrated.

In the present paper we study this and other types of singularities of the Prony mapping, and describe its global geometry. We show, in particular, close connections of the Prony mapping with the “Vieta mapping” expressing the coefficients of a polynomial through its roots, and with hyperbolic polynomials and “Vandermonde mapping” studied by V. Arnold.

Key words and phrases:
Singularities, Signal acquisition, Non-linear models, Moments inversion.
2000 Mathematics Subject Classification:
94A12, 62J02, 14P10, 42C99
This research is supported by the Adams Fellowship Program of the Israel Academy of Sciences and Humanities, ISF grant 264/09 and the Minerva Foundation.

1. Introduction

The Prony system appears when we try to solve a very simple “algebraic signal reconstruction” problem of the following form: assume that the signal $F(x)$ is known to be a linear combination of shifted $\delta$-functions:

F\left(x\right)=\sum_{j=1}^{d}a_{j}\delta\left(x-x_{j}\right). (1.1)

We shall use as measurements the polynomial moments:

m_{k}=m_{k}\left(F\right)=\int x^{k}F\left(x\right)\operatorname{d}x. (1.2)

After substituting $F$ into the integral defining $m_{k}$ we get

m_{k}(F)=\int x^{k}\sum_{j=1}^{d}a_{j}\delta(x-x_{j})\operatorname{d}x=\sum_{j=1}^{d}a_{j}x_{j}^{k}.

Considering $a_{j}$ and $x_{j}$ as unknowns, we obtain the equations

m_{k}\left(F\right)=\sum_{j=1}^{d}a_{j}x_{j}^{k},\quad k=0,1,\dots. (1.3)

This infinite set of equations (or a part of it, for $k=0,1,\dots,2d-1$) is called the Prony system. It can be traced back at least to R. de Prony (1795, [19]) and is used in a wide variety of theoretical and applied fields. See [2] for an extensive bibliography on the Prony method.
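For concreteness, the following is a minimal numerical sketch of the classical solution scheme for (1.3), assuming exact data, pairwise distinct nodes, and a nonsingular Hankel matrix; NumPy is used and the function name is ours, not the authors'.

```python
import numpy as np

def solve_prony(moments, d):
    """Recover nodes x_j and amplitudes a_j from m_0, ..., m_{2d-1},
    assuming d pairwise distinct nodes and exact data."""
    m = np.asarray(moments, dtype=complex)
    # Hankel system for Q(z) = z^d + c_{d-1} z^{d-1} + ... + c_0
    H = np.array([m[i:i + d] for i in range(d)])        # d x d Hankel matrix
    c = np.linalg.solve(H, -m[d:2 * d])                 # c = (c_0, ..., c_{d-1})
    nodes = np.roots(np.concatenate(([1.0], c[::-1])))  # roots of Q give the nodes
    # linear (Vandermonde) step for the amplitudes
    V = np.vander(nodes, N=d, increasing=True).T        # V[k, j] = x_j^k
    amps = np.linalg.solve(V, m[:d])
    return nodes, amps

# Example: F = 2*delta(x-1) + 3*delta(x+0.5), d = 2
x_true, a_true = np.array([1.0, -0.5]), np.array([2.0, 3.0])
m = [np.sum(a_true * x_true**k) for k in range(4)]
print(solve_prony(m, 2))
```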

In writing the Prony system (1.3) we have assumed that all the nodes $x_{1},\dots,x_{d}$ are pairwise different. However, as the right-hand side $\mu=(m_{0},\dots,m_{2d-1})$ of (1.3) is provided by actual measurements of the signal $F$, we cannot guarantee a priori that this condition is satisfied for the solution. Moreover, we shall see below that multiple nodes may naturally appear in the solution process. In order to incorporate possible collisions of the nodes, we consider “confluent Prony systems”.

Assume that the signal $F(x)$ is a linear combination of shifted $\delta$-functions and their derivatives:

F\left(x\right)=\sum_{j=1}^{s}\sum_{\ell=0}^{d_{j}-1}a_{j,\ell}\delta^{\left(\ell\right)}\left(x-x_{j}\right). (1.4)
Definition 1.1.

For $F\left(x\right)$ as above, the vector $D\left(F\right)\stackrel{\text{def}}{=}(d_{1},\dots,d_{s})$ is the multiplicity vector of $F$, $s=s\left(F\right)$ is the size of its support, $T\left(F\right)\stackrel{\text{def}}{=}\left(x_{1},\dots,x_{s}\right)$, and $\operatorname{rank}\left(F\right)\stackrel{\text{def}}{=}\sum_{j=1}^{s}d_{j}$ is its rank. To avoid ambiguity in these definitions, it is always understood that $a_{j,d_{j}-1}\neq 0$ for all $j=1,\dots,s$ (i.e. $d_{j}$ is the maximal index for which $a_{j,d_{j}-1}\neq 0$).

For the moments $m_{k}=m_{k}(F)=\int x^{k}F(x)\operatorname{d}x$ we now get

m_{k}=\sum_{j=1}^{s}\sum_{\ell=0}^{d_{j}-1}a_{j,\ell}\frac{k!}{(k-\ell)!}x_{j}^{k-\ell}.

Considering $x_{j}$ and $a_{j,\ell}$ as unknowns, we obtain the system of equations

\sum_{j=1}^{s}\sum_{\ell=0}^{d_{j}-1}\frac{k!}{\left(k-\ell\right)!}a_{j,\ell}x_{j}^{k-\ell}=m_{k},\quad k=0,1,\dots,2d-1, (1.5)

which is called a confluent Prony system of order $d$ with the multiplicity vector $D=\left(d_{1},\dots,d_{s}\right)$. The original Prony system (1.3) is a special case of the confluent one, with $D$ being the vector $(1,\dots,1)$ of length $d$.
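The following short sketch is a direct transcription of the right-hand side of (1.5) (forward model only), assuming the paper's sign-free moment convention; the function name and the example data are ours.

```python
import numpy as np
from math import factorial

def confluent_moments(nodes, coeffs, num_moments):
    """Right-hand side of (1.5): m_k = sum_j sum_l a_{j,l} * k!/(k-l)! * x_j^(k-l),
    where terms with l > k vanish."""
    m = np.zeros(num_moments, dtype=complex)
    for k in range(num_moments):
        for x_j, a_j in zip(nodes, coeffs):      # a_j = (a_{j,0}, ..., a_{j,d_j-1})
            for l, a in enumerate(a_j):
                if l <= k:
                    falling = factorial(k) // factorial(k - l)   # k!/(k-l)!
                    m[k] += a * falling * x_j ** (k - l)
    return m

# multiplicity vector D = (2, 1): a double node at x = 1 and a simple node at x = -2
print(confluent_moments([1.0, -2.0], [(1.0, 0.5), (2.0,)], num_moments=6))
```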

The system (1.5) arises also in the problem of reconstructing a planar polygon $P$ (or even an arbitrary semi-analytic quadrature domain) from its moments

m_{k}(\chi_{P})=\iint_{\mathbb{R}^{2}}z^{k}\chi_{P}\operatorname{d}x\operatorname{d}y,\qquad z=x+\imath y,

where $\chi_{P}$ is the characteristic function of the domain $P\subset\mathbb{R}^{2}$. This problem is important in many areas of science and engineering [11]. The above yields the confluent Prony system

m_{k}=\sum_{j=1}^{s}\sum_{i=0}^{d_{j}-1}c_{i,j}k(k-1)\cdots(k-i+1)z_{j}^{k-i},\qquad c_{i,j}\in\mathbb{C},\;z_{j}\in\mathbb{C}\setminus\left\{0\right\}.
Definition 1.2.

For a given multiplicity vector $D=\left(d_{1},\dots,d_{s}\right)$, its order is $\sum_{j=1}^{s}d_{j}$.

As we shall see below, if we start with the measurements $\mu(F)=\mu=(m_{0},\dots,m_{2d-1})$, then a natural setting of the problem of solving the Prony system is the following:

Problem 1.3 (Prony problem of order $d$).

Given the measurements

\mu=(m_{0},\dots,m_{2d-1})\in\mathbb{C}^{2d}

in the right-hand side of (1.5), find the multiplicity vector $D=(d_{1},\dots,d_{s})$ of order $r=\sum_{j=1}^{s}d_{j}\leq d$, and find the unknowns $x_{j}$ and $a_{j,\ell}$ which solve the corresponding confluent Prony system (1.5) with the multiplicity vector $D$ (hence, with a solution of rank $r$).

It is extremely important in practice to have a stable method of inversion. Many research efforts are devoted to this task (see e.g. [3, 7, 10, 17, 18, 20] and references therein). A basic question here is the following.

Problem 1.4 (Noisy Prony problem).

Given the noisy measurements

\tilde{\mu}=(\tilde{m}_{0},\dots,\tilde{m}_{2d-1})\in\mathbb{C}^{2d}

and an estimate of the error $\left|\tilde{m}_{k}-m_{k}\right|\leq\varepsilon_{k}$, solve Problem 1.3 so as to minimize the reconstruction error.

In this paper we study the global setting of the Prony problem, stressing its algebraic structure. In Section 2 the space where the solution is to be found (the Prony space) is described. It turns out to be a vector bundle over the space of the nodes $x_{1},\dots,x_{d}$. We also define three mappings, the “Prony”, “Taylor”, and “Stieltjes” mappings, which capture the essential features of the Prony problem and of its solution process.

In Section 3 we investigate solvability conditions for the Prony problem. The answer leads naturally to a stratification of the space of the right-hand sides, according to the rank of the associated Hankel-type matrix and its minors. The behavior of the solutions near various strata turns out to be highly nontrivial, and we present some initial results in the description of the corresponding singularities.

In Section 4, we study the multiplicity-restricted Prony problem, fixing the collision pattern of the solution, and derive simple bounds for the stability of the solution via factorization of the Jacobian determinant of the corresponding Prony map.

In Section 5 we consider the rank-restricted Prony problem, effectively reducing the dimension to $2r$ instead of $2d$, where $r$ is precisely the rank of the associated Hankel-type matrix. In this formulation, the Prony problem is solvable in a small neighborhood of the exact measurement vector.

In Section 6 we study one of the most important singularities of the Prony problem: the collision of some of the points $x_{i}$. The investigation of this type of singularities was started in [21], where the role of finite differences was demonstrated. In the present paper we introduce global bases of finite differences, study their properties, and prove that using such bases we can resolve in a robust way at least the linear part of the Prony problem at and near colliding configurations of the nodes.

In Section 7 we discuss close connections of the Prony problem with hyperbolic polynomials and the “Vandermonde mapping” studied by V. I. Arnold in [1] and by V. P. Kostov in [13, 14, 15], and with the “Vieta mapping” expressing the coefficients of a polynomial through its roots. We believe that the questions arising in the theoretical study of the Prony problem and in its practical applications justify further investigation of these connections, as well as further applications of Singularity Theory.

Finally, in Appendix A we describe a solution method for the Prony system based on Padé approximation.

2. Prony, Stieltjes and Taylor Mappings

In this section we define the “Prony”, “Taylor”, and “Stieltjes” mappings, which capture some essential features of the Prony problem and of its solution process. The main idea behind the spaces and mappings introduced in this section is the following: associate to the signal $F(x)=\sum_{i=1}^{d}a_{i}\delta(x-x_{i})$ the rational function $R(z)=\sum_{i=1}^{d}\frac{a_{i}}{z-x_{i}}$. (In fact, $R$ is the Stieltjes integral transform of $F$.) The functions $R$ obtained in this way can be written as $R(z)=\frac{P(z)}{Q(z)}$ with $\deg P\leq\deg Q-1$, and they satisfy $R(\infty)=0$. Write $R$ as $R(z)=\sum_{i=1}^{d}\frac{a_{i}/z}{1-x_{i}/z}$. Developing the summands into geometric progressions we conclude that $R(z)=\sum_{k=0}^{\infty}m_{k}(\frac{1}{z})^{k+1}$, with $m_{k}=\sum_{i=1}^{d}a_{i}x_{i}^{k}$, so the moment measurements $m_{k}$ in the right-hand side of the Prony system (1.3) are exactly the Taylor coefficients of $R(z)$ at infinity. We shall see below that this correspondence reduces the solution of the Prony system to an appropriate Padé approximation problem.
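The identity between the moments and the Taylor coefficients at infinity can be checked numerically; the following sketch (NumPy; the data are arbitrary choices of ours) compares the Stieltjes transform with its truncated expansion at a point $|z|$ larger than all $|x_{i}|$.

```python
import numpy as np

a = np.array([2.0, -1.0, 0.5])          # amplitudes a_i
x = np.array([0.3, -0.7, 1.2])          # nodes x_i
z = 10.0                                # |z| > max |x_i|, so the expansion converges

R = np.sum(a / (z - x))                 # Stieltjes transform of F = sum a_i delta(. - x_i)
K = 30
m = np.array([np.sum(a * x**k) for k in range(K)])      # moments m_k
R_series = np.sum(m / z ** np.arange(1, K + 1))         # sum_k m_k z^{-(k+1)}

print(R, R_series)                      # the two values agree up to the truncation error
```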

Definition 2.1.

For each $w=\left(x_{1},\dots,x_{d}\right)\in\mathbb{C}^{d}$, let $s=s\left(w\right)$ be the number of distinct coordinates $\tau_{j}$, $j=1,\dots,s$, and denote $T\left(w\right)=\left(\tau_{1},\dots,\tau_{s}\right)$. The multiplicity vector is $D=D\left(w\right)=\left(d_{1},\dots,d_{s}\right)$, where $d_{j}$ is the number of times the value $\tau_{j}$ appears in $\left\{x_{1},\dots,x_{d}\right\}$. The order of the values in $T\left(w\right)$ is defined by their order of appearance in $w$.

Example 2.2.

For $w=\left(3,1,2,1,0,3,2\right)$ we have $s\left(w\right)=4$, $T\left(w\right)=\left(3,1,2,0\right)$ and $D\left(w\right)=\left(2,2,2,1\right)$.

Remark 2.3.

Note the slight abuse of notation between Definition 1.1 and Definition 2.1. Note also that the order of $D\left(w\right)$ equals $d$ for all $w\in\mathbb{C}^{d}$.

Definition 2.4.

For each $w\in\mathbb{C}^{d}$, let $s=s\left(w\right)$, $T\left(w\right)=\left(\tau_{1},\dots,\tau_{s}\right)$ and $D\left(w\right)=\left(d_{1},\dots,d_{s}\right)$ be as in Definition 2.1.

(1) $V_{w}$ is the vector space of dimension $d$ containing the linear combinations

g=\sum_{j=1}^{s}\sum_{\ell=0}^{d_{j}-1}\gamma_{j,\ell}\delta^{\left(\ell\right)}\left(x-\tau_{j}\right) (2.1)

of $\delta$-functions and their derivatives at the points of $T\left(w\right)$. The “standard basis” of $V_{w}$ is given by the distributions

\delta_{j,\ell}=\delta^{\left(\ell\right)}\left(x-\tau_{j}\right),\qquad j=1,\dots,s\left(w\right);\;\ell=0,\dots,d_{j}-1. (2.2)

(2) $W_{w}$ is the vector space of dimension $d$ of all the rational functions with poles $T\left(w\right)$ and multiplicities $D\left(w\right)$, vanishing at $\infty$:

R\left(z\right)=\frac{P\left(z\right)}{Q\left(z\right)},\qquad Q\left(z\right)=\prod_{j=1}^{s}\left(z-\tau_{j}\right)^{d_{j}},\;\deg P\left(z\right)<\deg Q\leqslant d.

The “standard basis” of $W_{w}$ is given by the elementary fractions

R_{j,\ell}=\frac{1}{\left(z-\tau_{j}\right)^{\ell}},\qquad j=1,\dots,s;\;\ell=1,\dots,d_{j}.

Now we are ready to formally define the Prony space ${\mathcal{P}}_{d}$ and the Stieltjes space ${\mathcal{S}}_{d}$.

Definition 2.5.

The Prony space ${\mathcal{P}}_{d}$ is the vector bundle over $\mathbb{C}^{d}$ consisting of all the pairs

\left(w,g\right):\quad w\in\mathbb{C}^{d},\;g\in V_{w}.

The topology on ${\mathcal{P}}_{d}$ is induced by the natural embedding ${\mathcal{P}}_{d}\subset\mathbb{C}^{d}\times{\mathcal{D}}$, where ${\mathcal{D}}$ is the space of distributions on $\mathbb{C}$ with its standard topology.

Definition 2.6.

The Stieltjes space ${\mathcal{S}}_{d}$ is the vector bundle over $\mathbb{C}^{d}$ consisting of all the pairs

\left(w,\gamma\right):\qquad w\in\mathbb{C}^{d},\;\gamma\in W_{w}.

The topology on ${\mathcal{S}}_{d}$ is induced by the natural embedding ${\mathcal{S}}_{d}\subset\mathbb{C}^{d}\times{\mathcal{R}}$, where ${\mathcal{R}}$ is the space of complex rational functions with its standard topology.

Definition 2.7.

The Stieltjes mapping ${\mathcal{SM}}:{\mathcal{P}}_{d}\rightarrow{\mathcal{S}}_{d}$ is defined by the Stieltjes integral transform: for $(w,g)\in{\mathcal{P}}_{d}$,

{\mathcal{SM}}\left(\left(w,g\right)\right)=\left(w,\gamma\right),\qquad\gamma\left(z\right)=\int_{-\infty}^{\infty}\frac{g\left(x\right)\operatorname{d}x}{z-x}.

Sometimes we abuse notation and write for short ${\mathcal{SM}}\left(g\right)=\gamma$, with the understanding that ${\mathcal{SM}}$ is also a map ${\mathcal{SM}}:V_{w}\to W_{w}$ for each $w\in\mathbb{C}^{d}$.

The following fact is an immediate consequence of the above definitions.

Proposition 2.8.

${\mathcal{SM}}$ is a linear isomorphism of the bundles ${\mathcal{P}}_{d}$ and ${\mathcal{S}}_{d}$ (for each $w\in\mathbb{C}^{d}$, ${\mathcal{SM}}$ is a linear isomorphism of the vector spaces $V_{w}$ and $W_{w}$). In the standard bases of $V_{w}$ and $W_{w}$, the map ${\mathcal{SM}}$ is diagonal, satisfying

{\mathcal{SM}}\left(\delta_{j,\ell}\right)=\left(-1\right)^{\ell}\ell!\,R_{j,\ell}\left(z\right).

Furthermore, for any $\left(w,g\right)\in{\mathcal{P}}_{d}$,

{\mathcal{SM}}\left(g\right)=\underbrace{\frac{P\left(z\right)}{Q\left(z\right)}}_{\text{irreducible}},\qquad\deg P<\deg Q=\operatorname{rank}\left(g\right)\leqslant d. (2.3)
Definition 2.9.

The Taylor space ${\mathcal{T}}_{d}$ is the space of complex Taylor polynomials at infinity of degree $2d-1$, of the form $\sum_{k=0}^{2d-1}m_{k}(\frac{1}{z})^{k+1}$. We shall identify ${\mathcal{T}}_{d}$ with the complex space $\mathbb{C}^{2d}$ with the coordinates $m_{0},\dots,m_{2d-1}$.

Definition 2.10.

The Taylor mapping ${\mathcal{TM}}:{\mathcal{S}}_{d}\to{\mathcal{T}}_{d}$ is defined by the truncated Taylor development at infinity:

{\mathcal{TM}}\left(\left(w,\gamma\right)\right)=\sum_{k=0}^{2d-1}\alpha_{k}\left(\frac{1}{z}\right)^{k+1},\qquad\text{ where }\gamma\left(z\right)=\sum_{k=0}^{\infty}\alpha_{k}\left(\frac{1}{z}\right)^{k+1}.

We identify ${\mathcal{TM}}\left(\left(w,\gamma\right)\right)$ as above with $\left(\alpha_{0},\dots,\alpha_{2d-1}\right)\in\mathbb{C}^{2d}$. Sometimes we write for short ${\mathcal{TM}}\left(\gamma\right)=\left(\alpha_{0},\dots,\alpha_{2d-1}\right)$.

Finally, we define the Prony mapping ${\mathcal{PM}}$, which encodes the Prony problem.

Definition 2.11.

The Prony mapping ${\mathcal{PM}}:{\mathcal{P}}_{d}\to\mathbb{C}^{2d}$ is defined, for $\left(w,g\right)\in{\mathcal{P}}_{d}$, as follows:

{\mathcal{PM}}\left(\left(w,g\right)\right)=\left(m_{0},\dots,m_{2d-1}\right)\in\mathbb{C}^{2d},\qquad m_{k}=m_{k}\left(g\right)=\int x^{k}g\left(x\right)\operatorname{d}x.

By the above definitions, we have

{\mathcal{PM}}={\mathcal{TM}}\circ{\mathcal{SM}}. (2.4)

Solving the Prony problem for a given right-hand side $(m_{0},\dots,m_{2d-1})$ is therefore equivalent to inverting the Prony mapping ${\mathcal{PM}}$. As we shall elaborate in the subsequent section, the identity (2.4) allows us to split this problem into two parts: inversion of ${\mathcal{TM}}$, which is, essentially, the Padé approximation problem, and inversion of ${\mathcal{SM}}$, which is, essentially, the decomposition of a given rational function into a sum of elementary fractions.
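The two-step inversion suggested by (2.4) can be sketched numerically as follows (NumPy; function names are ours). The sketch covers only the generic case of a nonsingular Hankel matrix and simple poles: the first step recovers the Padé approximant $P/Q$ from the moments, and the second step decomposes it into elementary fractions.

```python
import numpy as np

def pade_from_moments(m, d):
    """TM^{-1} (generic case): the rational function P/Q, deg P < deg Q = d, whose first
    2d Taylor coefficients at infinity are m_0, ..., m_{2d-1}."""
    m = np.asarray(m, dtype=complex)
    H = np.array([m[i:i + d] for i in range(d)])      # d x d Hankel matrix
    c = np.linalg.solve(H, -m[d:2 * d])               # Q(z) = z^d + c_{d-1} z^{d-1} + ... + c_0
    Q = np.concatenate((c, [1.0]))                    # Q[i] = coefficient of z^i
    P = np.array([np.sum(Q[j + 1:] * m[:d - j]) for j in range(d)])   # P[j] = coeff of z^j
    return P, Q

def partial_fractions(P, Q):
    """SM^{-1}, assuming simple poles: nodes = roots of Q, amplitudes = residues P(x)/Q'(x)."""
    nodes = np.roots(Q[::-1])
    amps = np.polyval(P[::-1], nodes) / np.polyval(np.polyder(Q[::-1]), nodes)
    return nodes, amps

# moments of F = 1*delta(x - 0.5) - 2*delta(x - 2):
m = [1 * 0.5**k - 2 * 2.0**k for k in range(4)]
print(partial_fractions(*pade_from_moments(m, d=2)))  # recovers nodes {0.5, 2}, amplitudes {1, -2}
```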

3. Solvability of the Prony problem

3.1. General condition for solvability

In this section we provide a necessary and sufficient condition for the Prony problem to have a solution (which is unique, as it turns out, by Proposition 3.2). As mentioned at the end of the previous section, our method is based on inverting (2.4) and thus relies on the solution of the corresponding (diagonal) Padé approximation problem [4].

Problem 3.1 (Diagonal Padé approximation problem).

Given $\mu=\left(m_{0},\dots,m_{2d-1}\right)\in\mathbb{C}^{2d}$, find a rational function $R_{d}(z)=\frac{P\left(z\right)}{Q\left(z\right)}\in{\mathcal{S}}_{d}$ with $\deg P<\deg Q\leqslant d$, such that the first $2d$ Taylor coefficients at infinity of $R_{d}(z)$ are $\left\{m_{k}\right\}_{k=0}^{2d-1}$.

Proposition 3.2.

A solution to Problem 3.1, if it exists, is unique.

Proof.

Writing $R\left(z\right)=\frac{P\left(z\right)}{Q\left(z\right)}$, $R_{1}\left(z\right)=\frac{P_{1}\left(z\right)}{Q_{1}\left(z\right)}$, with $\deg P<\deg Q\leqslant d$ and $\deg P_{1}<\deg Q_{1}\leqslant d$, we get

R-R_{1}=\frac{PQ_{1}-P_{1}Q}{QQ_{1}},

and this function, if nonzero, can have a zero of order at most $2d-1$ at infinity. ∎

Let us summarize the above discussion with the following statement.

Proposition 3.3.

The tuple

\left\{s,\;D=(d_{1},\dots,d_{s}),\;r=\sum_{j=1}^{s}d_{j}\leq d,\;X=\left\{x_{j}\right\}_{j=1}^{s},\;A=\left\{a_{j,\ell}\right\}_{j=1,\dots,s;\;\ell=0,\dots,d_{j}-1}\right\}

is a (unique, up to a permutation of the nodes $\left\{x_{j}\right\}$) solution to Problem 1.3 with right-hand side

\mu=(m_{0},\dots,m_{2d-1})\in\mathbb{C}^{2d}

if and only if the rational function

R_{D,X,A}\left(z\right)=\sum_{j=1}^{s}\sum_{\ell=1}^{d_{j}}\left(-1\right)^{\ell-1}\left(\ell-1\right)!\frac{a_{j,\ell-1}}{\left(z-x_{j}\right)^{\ell}}=\sum_{k=0}^{2d-1}\frac{m_{k}}{z^{k+1}}+O\left(z^{-2d-1}\right)

is the (unique) solution to Problem 3.1 with input $\mu$. In that case,

R_{D,X,A}\left(z\right)=\int_{-\infty}^{\infty}\frac{g\left(x\right)\operatorname{d}x}{z-x}\qquad\text{ where }\;g\left(x\right)=\sum_{j=1}^{s}\sum_{\ell=0}^{d_{j}-1}a_{j,\ell}\delta^{\left(\ell\right)}\left(x-x_{j}\right),

i.e. $R_{D,X,A}\left(z\right)$ is the Stieltjes transform of $g\left(x\right)$.

Proof.

This follows from the definitions of Section 2, (2.4), Proposition 3.2 and the fact that the problem of representing a given rational function as a sum of elementary fractions of the specified form (i.e. inverting ${\mathcal{SM}}$) is always uniquely solvable, up to a permutation of the poles. ∎

The next result provides necessary and sufficient conditions for the solvability of Problem 3.1. It summarizes some well-known facts in the theory of Padé approximation, related to “normal indices” (see, for instance, [4]). However, these facts are not usually formulated in the literature on Padé approximation in the form we need in relation to the Prony problem. Consequently, we give a detailed proof of this result in Appendix A. This proof contains, in particular, some facts which are important for understanding the solvability issues of the Prony problem.

Definition 3.4.

Given a vector $\mu=\left(m_{0},\dots,m_{2d-1}\right)$, let $\tilde{M}_{d}$ denote the $d\times\left(d+1\right)$ Hankel matrix

\tilde{M}_{d}=\begin{bmatrix}m_{0}&m_{1}&m_{2}&\dots&m_{d}\\ m_{1}&m_{2}&m_{3}&\dots&m_{d+1}\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ m_{d-1}&m_{d}&m_{d+1}&\dots&m_{2d-1}\end{bmatrix}. (3.1)

For each $e\leqslant d$, denote by $\tilde{M}_{e}$ the $e\times\left(e+1\right)$ submatrix of $\tilde{M}_{d}$ formed by its first $e$ rows and $e+1$ columns, and let $M_{e}$ denote the corresponding square matrix.

Theorem 3.5.

Let $\mu=(m_{0},\dots,m_{2d-1})$ be given, and let $r\leqslant d$ be the rank of the Hankel matrix $\tilde{M}_{d}$ in (3.1). Then Problem 3.1 is solvable for the input $\mu$ if and only if the upper-left minor $\left|M_{r}\right|$ of $\tilde{M}_{d}$ is non-zero.

As an immediate consequence of Theorem 3.5 and Proposition 3.3, we obtain the following result.

Theorem 3.6.

Let $\mu=(m_{0},\dots,m_{2d-1})$ be given, and let $r\leqslant d$ be the rank of the Hankel matrix $\tilde{M}_{d}$ in (3.1). Then Problem 1.3 with input $\mu$ is solvable if and only if the upper-left minor $\left|M_{r}\right|$ of $\tilde{M}_{d}$ is non-zero. The solution, if it exists, is unique up to a permutation of the nodes $\left\{x_{j}\right\}$. The multiplicity vector $D=\left(d_{1},\dots,d_{s}\right)$, of order $\sum_{j=1}^{s}d_{j}=r$, of the resulting confluent Prony system of rank $r$ is the multiplicity vector of the poles of the rational function $R_{D,X,A}\left(z\right)$ solving the corresponding Padé problem.

As a corollary we get a complete description of the right-hand side data $\mu\in\mathbb{C}^{2d}$ for which the Prony problem is solvable (respectively, unsolvable). Define, for $r=1,\dots,d$, the sets $\Sigma_{r}\subset\mathbb{C}^{2d}$ (respectively, $\Sigma^{\prime}_{r}\subset\mathbb{C}^{2d}$) consisting of those $\mu\in\mathbb{C}^{2d}$ for which $\operatorname{rank}\tilde{M}_{d}=r$ and $|M_{r}|\neq 0$ (respectively, $|M_{r}|=0$). The set $\Sigma_{r}$ is a difference $\Sigma_{r}=\Sigma_{r}^{1}\setminus\Sigma_{r}^{2}$ of two algebraic sets: $\Sigma_{r}^{1}$ is defined by the vanishing of all the $s\times s$ minors of $\tilde{M}_{d}$, $r<s\leq d$, while $\Sigma_{r}^{2}$ is defined by the vanishing of $|M_{r}|$. In turn, $\Sigma^{\prime}_{r}=\Sigma_{r}^{\prime 1}\setminus\Sigma_{r}^{\prime 2}$, with $\Sigma_{r}^{\prime 1}=\Sigma_{r}^{1}\cap\Sigma_{r}^{2}$ and $\Sigma_{r}^{\prime 2}$ defined by the vanishing of all the $r\times r$ minors of $\tilde{M}_{d}$. The union $\Sigma_{r}\cup\Sigma^{\prime}_{r}$ consists of all $\mu$ for which $\operatorname{rank}\tilde{M}_{d}=r$, and it equals $\Sigma_{r}^{1}\setminus\Sigma_{r}^{\prime 2}$.

Corollary 3.7.

The set $\Sigma$ (respectively, $\Sigma^{\prime}$) of $\mu\in\mathbb{C}^{2d}$ for which the Prony problem is solvable (respectively, unsolvable) is the union $\Sigma=\cup_{r=1}^{d}\Sigma_{r}$ (respectively, $\Sigma^{\prime}=\cup_{r=1}^{d}\Sigma^{\prime}_{r}$). In particular, $\Sigma^{\prime}\subset\{\mu\in\mathbb{C}^{2d}:\det M_{d}=0\}$.

So for a generic right-hand side $\mu$ we have $|M_{d}|\neq 0$, and the Prony problem is solvable. On the algebraic hypersurface of those $\mu$ for which $|M_{d}|=0$, the Prony problem is solvable if $\left|M_{d-1}\right|\neq 0$, etc.
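The solvability test of Theorem 3.6 is straightforward to run numerically; the sketch below (NumPy; the function name and the numerical tolerance are ours, and the rank-zero branch is a trivial edge case not covered by the strata $\Sigma_r$) forms $\tilde{M}_{d}$, computes its rank $r$, and checks the upper-left minor $|M_{r}|$.

```python
import numpy as np

def prony_solvable(mu, tol=1e-12):
    """Solvability test of Theorem 3.6 for mu = (m_0, ..., m_{2d-1})."""
    mu = np.asarray(mu, dtype=complex)
    d = len(mu) // 2
    Mtilde = np.array([mu[i:i + d + 1] for i in range(d)])   # d x (d+1) Hankel matrix (3.1)
    r = np.linalg.matrix_rank(Mtilde, tol=tol)
    if r == 0:
        return True, 0                 # all moments vanish: trivial case, outside the strata
    Mr = Mtilde[:r, :r]                # upper-left r x r minor
    return bool(abs(np.linalg.det(Mr)) > tol), int(r)

print(prony_solvable([0, 0, 1, 0, 0, 0, 0, 0, 0, 0]))   # mu_2, d = 5: solvable, rank 3
print(prony_solvable([0, 0, 0, 0, 0, 1, 0, 0, 0, 0]))   # mu_5, d = 5: unsolvable, rank 5
```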

Let us now consider some examples.

Example 3.8.

Let us fix $d=1,2,\dots$. Consider right-hand sides of the Prony problem $\mu=(m_{0},\dots,m_{2d-1})\in\mathbb{C}^{2d}$ of the form $\mu=\mu_{\ell}=\left(\delta_{k\ell}\right)_{k}=(0,\dots,0,1,0,\dots,0)$, with the $1$ in position $\ell+1$, i.e. with all the $m_{k}=0$ besides $m_{\ell}=1$, $\ell=0,\dots,2d-1$, and let $\tilde{M}_{d}^{\ell}$ be the corresponding matrix.

Proposition 3.9.

The rank of $\tilde{M}_{d}^{\ell}$ is equal to $\ell+1$ for $\ell\leq d-1$, and it is equal to $2d-\ell$ for $\ell\geq d$. The corresponding Prony problem is solvable for $\ell\leq d-1$, and it is unsolvable for $\ell\geq d$.

Proof.

For $d=5$ and $\ell=2,4,5,9$, the corresponding matrices $\tilde{M}_{5}^{\ell}$ are as follows:

\tilde{M}_{5}^{2}=\begin{bmatrix}0&0&1&0&0&0\\ 0&1&0&0&0&0\\ 1&0&0&0&0&0\\ 0&0&0&0&0&0\\ 0&0&0&0&0&0\end{bmatrix},\qquad\tilde{M}_{5}^{4}=\begin{bmatrix}0&0&0&0&1&0\\ 0&0&0&1&0&0\\ 0&0&1&0&0&0\\ 0&1&0&0&0&0\\ 1&0&0&0&0&0\end{bmatrix}\qquad\text{(solvable)},

\tilde{M}_{5}^{5}=\begin{bmatrix}0&0&0&0&0&1\\ 0&0&0&0&1&0\\ 0&0&0&1&0&0\\ 0&0&1&0&0&0\\ 0&1&0&0&0&0\end{bmatrix},\qquad\tilde{M}_{5}^{9}=\begin{bmatrix}0&0&0&0&0&0\\ 0&0&0&0&0&0\\ 0&0&0&0&0&0\\ 0&0&0&0&0&0\\ 0&0&0&0&0&1\end{bmatrix}\qquad\text{(unsolvable)}.

In general, the matrices $\tilde{M}_{d}^{\ell}$ have the same pattern as in the special cases above, so their rank is $\ell+1$ for $\ell\leqslant d-1$, and $2d-\ell$ for $\ell\geqslant d$, as stated above. Application of Theorem 3.6 completes the proof. ∎

In fact, $\mu_{\ell}$ is the moment sequence of

F\left(x\right)=\frac{1}{\ell!}\delta^{\left(\ell\right)}\left(x\right),

and this signal belongs to ${\mathcal{P}}_{d}$ if and only if $\ell\leqslant d-1$. In the notation of Corollary 3.7 we have

\mu_{\ell}\in\Sigma_{\ell+1},\quad\ell\leqslant d-1,\qquad\qquad\mu_{\ell}\in\Sigma^{\prime}_{2d-\ell},\quad\ell\geqslant d.

It is easy to provide various modifications of the above example. In particular, for $\mu=\tilde{\mu}_{\ell}=\left(0,\dots,0,1,1,\dots,1\right)$, the result of Proposition 3.9 remains true verbatim.

Example 3.10.

Another example is provided by $\mu_{\ell_{1},\ell_{2}}$, with all the $m_{k}=0$ besides $m_{\ell_{1}}=1$, $m_{\ell_{2}}=1$, $0\leq\ell_{1}<d\leq\ell_{2}\leq 2d-1$. For $\ell_{1}<\ell_{2}-d+1$ the rank of the corresponding matrix $\tilde{M}_{d}$ is $r=2d+\ell_{1}-\ell_{2}+1$ while $|M_{r}|=0$, so the Prony problem for such $\mu_{\ell_{1},\ell_{2}}$ is unsolvable. For $d=5$ and $\ell_{1}=2$, $\ell_{2}=8$ the matrix is as follows:

\tilde{M}_{5}^{\left(2,8\right)}=\begin{bmatrix}0&0&1&0&0&0\\ 0&1&0&0&0&0\\ 1&0&0&0&0&0\\ 0&0&0&0&0&1\\ 0&0&0&0&1&0\end{bmatrix}.

3.2. Near-singular inversion

The behavior of the inversion of the Prony mapping near the unsolvability stratum $\Sigma^{\prime}$, and near the strata where the rank of $\tilde{M}_{d}$ drops, turns out to be rather complicated. In particular, in the first case at least one of the nodes tends to infinity. In the second case, depending on the way the right-hand side $\mu$ approaches the lower-rank strata, the nodes may remain bounded, or some of them may tend to infinity. In this section we provide one initial result in this direction, as well as some examples. A comprehensive description of the inversion of the Prony mapping near $\Sigma^{\prime}$ and near the lower-rank strata is important both in the theoretical study and in applications of Prony-like systems, and we plan to provide further results in this direction separately.

Theorem 3.11.

As the right-hand side $\mu\in\mathbb{C}^{2d}\setminus\Sigma^{\prime}$ approaches a finite point $\mu_{0}\in\Sigma^{\prime}$, at least one of the nodes $x_{1},\dots,x_{d}$ in the solution tends to infinity.

Proof.

By assumption, the components $m_{0},\dots,m_{2d-1}$ of the right-hand side $\mu=(m_{0},\dots,m_{2d-1})\in\mathbb{C}^{2d}$ remain bounded as $\mu\rightarrow\mu_{0}$. By Theorem 6.14, the finite-difference coordinates of the solution ${\mathcal{PM}}^{-1}(\mu)$ remain bounded as well. Now, if all the nodes are also bounded, by compactness we conclude that ${\mathcal{PM}}^{-1}(\mu)\rightarrow\omega\in{\mathcal{P}}_{d}$. By continuity in the distribution space (Lemma 6.6) we have ${\mathcal{PM}}(\omega)=\mu_{0}$. Hence the Prony problem with the right-hand side $\mu_{0}$ has a solution $\omega\in{\mathcal{P}}_{d}$, in contradiction with the assumption that $\mu_{0}\in\Sigma^{\prime}$. ∎

Example 3.12.

Let us consider an example: $d=2$ and $\mu_{0}=(0,0,1,0)$. Here the rank of $\tilde{M}_{2}$ is $2$, and $|M_{2}|=0$, so by Theorem 3.6 we have $\mu_{0}\in\Sigma^{\prime}_{2}\subset\Sigma^{\prime}$. Consider now a perturbation $\mu(\epsilon)=(0,\epsilon,1,0)$ of $\mu_{0}$. For $\epsilon\neq 0$ we have $\mu(\epsilon)\in\Sigma_{2}\subset\Sigma$, and the Prony system is solvable for $\mu(\epsilon)$. Let us write an explicit solution: the coefficients $c_{0},c_{1}$ of the polynomial $Q(z)=c_{0}+c_{1}z+z^{2}$ are found from the system (A.$\star\star$):

\begin{bmatrix}0&\epsilon\\ \epsilon&1\end{bmatrix}\begin{bmatrix}c_{0}\\ c_{1}\end{bmatrix}=\begin{bmatrix}-1\\ 0\end{bmatrix},

whose solution is $c_{1}=-\frac{1}{\epsilon},\ c_{0}=\frac{1}{\epsilon^{2}}$. Hence the denominator $Q(z)$ of $R(z)$ is $Q(z)=\frac{1}{\epsilon^{2}}-\frac{1}{\epsilon}z+z^{2}$, and its roots are $x_{1}=\frac{1+\imath\sqrt{3}}{2\epsilon},\ x_{2}=\frac{1-\imath\sqrt{3}}{2\epsilon}$. The coefficients $b_{0},b_{1}$ of the numerator $P(z)=b_{0}+b_{1}z$ are found from (A.$\star$):

\begin{bmatrix}0&0\\ 0&\epsilon\end{bmatrix}\begin{bmatrix}-\frac{1}{\epsilon}\\ 1\end{bmatrix}=\begin{bmatrix}b_{1}\\ b_{0}\end{bmatrix},

i.e. $b_{1}=0,\ b_{0}=\epsilon$. Thus the solution of the associated Padé problem is

R(z)=\frac{P(z)}{Q(z)}=\frac{\epsilon}{(z-x_{1})(z-x_{2})}=\frac{\epsilon^{2}}{\imath\sqrt{3}}\frac{1}{(z-x_{1})}-\frac{\epsilon^{2}}{\imath\sqrt{3}}\frac{1}{(z-x_{2})}.

Finally, the (unique up to a permutation) solution of the Prony problem for $\mu(\epsilon)$ is

a_{1}=\frac{\epsilon^{2}}{\imath\sqrt{3}},\ a_{2}=-\frac{\epsilon^{2}}{\imath\sqrt{3}},\qquad x_{1}=\frac{1+\imath\sqrt{3}}{2\epsilon},\ x_{2}=\frac{1-\imath\sqrt{3}}{2\epsilon}.

As $\epsilon$ tends to zero, the nodes $x_{1},x_{2}$ tend to infinity while the coefficients $a_{1},a_{2}$ tend to zero.
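The explicit solution above can be verified directly: plugging the stated nodes and amplitudes back into the moment sums reproduces $\mu(\epsilon)=(0,\epsilon,1,0)$ up to rounding. A quick check in NumPy (the numerical value of $\epsilon$ is an arbitrary choice of ours):

```python
import numpy as np

eps = 0.01
x1, x2 = (1 + 1j * np.sqrt(3)) / (2 * eps), (1 - 1j * np.sqrt(3)) / (2 * eps)
a1, a2 = eps**2 / (1j * np.sqrt(3)), -eps**2 / (1j * np.sqrt(3))

moments = [a1 * x1**k + a2 * x2**k for k in range(4)]
print(np.round(moments, 10))     # approximately (0, eps, 1, 0)
```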

As was shown above, for a given $\mu\in\Sigma$ (say, with pairwise different nodes) the rank of the matrix $\tilde{M}_{d}$ is equal to the number of nodes in the solution for which the corresponding $\delta$-function enters with a non-zero coefficient. So $\mu$ approaches a certain $\mu_{0}$ belonging to a stratum of lower rank of $\tilde{M}_{d}$ if and only if some of the coefficients $a_{j}$ in the solution tend to zero. We do not analyze all the possible scenarios of such a degeneration, noting only that if $\mu_{0}\in\Sigma^{\prime}$, i.e., the Prony problem is unsolvable for $\mu_{0}$, then Theorem 3.11 remains true, with essentially the same proof. So at least one of the nodes, say $x_{j}$, escapes to infinity. Moreover, one can show that $a_{j}x_{j}^{2d-1}$ cannot tend to zero; otherwise the remaining linear combination of $\delta$-functions would provide a solution for $\mu_{0}$.

If $\mu_{0}\in\Sigma$, i.e., the Prony problem is solvable for $\mu_{0}$, all the nodes may remain bounded, or some $x_{j}$ may escape to infinity, but in such a way that $a_{j}x_{j}^{2d-1}$ tends to zero.

4. Multiplicity-restricted Prony problem

Consider Problem 1.4 at some point $\mu_{0}\in\Sigma$. By definition, $\mu_{0}\in\Sigma_{r_{0}}$ for some $r_{0}\leq d$. Let $\mu_{0}={\mathcal{PM}}\left(\left(w_{0},g_{0}\right)\right)$ for some $\left(w_{0},g_{0}\right)\in{\mathcal{P}}_{d}$. Assume for a moment that the multiplicity vector $D_{0}=D\left(g_{0}\right)=\left(d_{1},\dots,d_{s_{0}}\right)$, $\sum_{j=1}^{s_{0}}d_{j}=r_{0}$, has a non-trivial collision pattern, i.e. $d_{j}>1$ for at least one $j=1,\dots,s_{0}$. It means, in turn, that the function $R_{D_{0},X,A}\left(z\right)$ has a pole of multiplicity $d_{j}$. Evidently, there exists an arbitrarily small perturbation $\tilde{\mu}$ of $\mu_{0}$ for which this multiple pole becomes a cluster of single poles, thereby changing the multiplicity vector to some $D^{\prime}\neq D_{0}$. While we address this problem in Section 6 via the bases of divided differences, in this section we consider a “multiplicity-restricted” Prony problem.

Definition 4.1.

Let $\mathbf{x}=\left(x_{1},\dots,x_{s}\right)\in\mathbb{C}^{s}$ and $D=\left(d_{1},\dots,d_{s}\right)$ with $d=\sum_{j=1}^{s}d_{j}$ be given. The $d\times d$ confluent Vandermonde matrix is

V=V\left(\mathbf{x},D\right)=V\left(x_{1},d_{1},\dots,x_{s},d_{s}\right)=\begin{bmatrix}\mathbf{v_{1,0}}&\mathbf{v_{2,0}}&\dots&\mathbf{v_{s,0}}\\ \mathbf{v_{1,1}}&\mathbf{v_{2,1}}&\dots&\mathbf{v_{s,1}}\\ \vdots&\vdots&&\vdots\\ \mathbf{v_{1,d-1}}&\mathbf{v_{2,d-1}}&\dots&\mathbf{v_{s,d-1}}\end{bmatrix} (4.1)

where the symbol $\mathbf{v_{j,k}}$ denotes the $1\times d_{j}$ row vector

\mathbf{v_{j,k}}\stackrel{\text{def}}{=}\left[x_{j}^{k},\;kx_{j}^{k-1},\;\dots,\;\frac{k!}{\left(k-d_{j}+1\right)!}x_{j}^{k-d_{j}+1}\right].
Proposition 4.2.

The matrix $V$ defines the linear part of the confluent Prony system (1.5) in the standard basis of $V_{w}$, namely,

V\left(x_{1},d_{1},\dots,x_{s},d_{s}\right)\begin{bmatrix}a_{1,0}\\ \vdots\\ a_{1,d_{1}-1}\\ \vdots\\ a_{s,d_{s}-1}\end{bmatrix}=\begin{bmatrix}m_{0}\\ m_{1}\\ \vdots\\ m_{d-1}\end{bmatrix}. (4.2)
Definition 4.3.

Let ${\mathcal{PM}}\left(w_{0},g_{0}\right)=\mu_{0}\in\Sigma_{r_{0}}$ with $D\left(g_{0}\right)=D_{0}$ and $s\left(g_{0}\right)=s_{0}$. Let ${\mathcal{P}}_{D_{0}}$ denote the following subbundle of ${\mathcal{P}}_{d}$ of dimension $s_{0}+r_{0}$:

{\mathcal{P}}_{D_{0}}=\left\{\left(w,g\right)\in{\mathcal{P}}_{d}:\quad D\left(g\right)=D_{0}\right\}.

The multiplicity-restricted Prony mapping ${\mathcal{PM}}_{D_{0}}^{*}:{\mathcal{P}}_{D_{0}}\to\mathbb{C}^{s_{0}+r_{0}}$ is the composition

{\mathcal{PM}}_{D_{0}}^{*}=\pi\circ{\mathcal{PM}}\restriction_{{\mathcal{P}}_{D_{0}}},

where $\pi:\mathbb{C}^{2d}\to\mathbb{C}^{s_{0}+r_{0}}$ is the projection onto the first $s_{0}+r_{0}$ coordinates.

Inverting this ${\mathcal{PM}}_{D_{0}}^{*}$ represents the solution of the confluent Prony system (1.5) with fixed structure $D_{0}$ from the first $s_{0}+r_{0}$ measurements, $k=0,1,\dots,s_{0}+r_{0}-1$.

Theorem 4.4 ([7]).

Let $\mu_{0}^{*}={\mathcal{PM}}_{D_{0}}^{*}\left(\left(w_{0},g_{0}\right)\right)\in\mathbb{C}^{s_{0}+r_{0}}$ with the unperturbed solution $g_{0}=\sum_{j=1}^{s_{0}}\sum_{\ell=0}^{d_{j}-1}a_{j,\ell}\delta^{\left(\ell\right)}\left(x-\tau_{j}\right)$. In a small neighborhood of $\left(w_{0},g_{0}\right)\in{\mathcal{P}}_{D_{0}}$, the map ${\mathcal{PM}}_{D_{0}}^{*}$ is invertible. Consequently, for small enough $\varepsilon$, the multiplicity-restricted Prony problem with input data $\tilde{\mu}^{*}\in\mathbb{C}^{r_{0}+s_{0}}$ satisfying $\|\tilde{\mu}^{*}-\mu_{0}^{*}\|\leq\varepsilon$ has a unique solution. The error in this solution satisfies

\left|\Delta a_{j,\ell}\right|\leq\frac{2}{\ell!}\left(\frac{2}{\delta}\right)^{s_{0}+r_{0}}\left(\frac{1}{2}+\frac{s_{0}+r_{0}}{\delta}\right)^{d_{j}-\ell}\left(1+\frac{\left|a_{j,\ell-1}\right|}{\left|a_{j,d_{j}-1}\right|}\right)\varepsilon,

\left|\Delta\tau_{j}\right|\leq\frac{2}{d_{j}!}\left(\frac{2}{\delta}\right)^{s_{0}+r_{0}}\frac{1}{\left|a_{j,d_{j}-1}\right|}\varepsilon,

where $\delta\stackrel{\text{def}}{=}\min_{i\neq j}\left|\tau_{i}-\tau_{j}\right|$ (for consistency we take $a_{j,-1}=0$ in the above formula).

Proof outline.

The Jacobian of ${\mathcal{PM}}_{D_{0}}^{*}$ can easily be computed, and it turns out to be equal to the product

{\mathcal{J}}_{{\mathcal{PM}}_{D_{0}}^{*}}=V\left(\tau_{1},d_{1}+1,\dots,\tau_{s_{0}},d_{s_{0}}+1\right)\operatorname{diag}\left\{E_{j}\right\},

where $V$ is the confluent Vandermonde matrix (4.1) on the nodes $\left(\tau_{1},\dots,\tau_{s_{0}}\right)$ with multiplicity vector

\tilde{D}_{0}=\left(d_{1}+1,\dots,d_{s_{0}}+1\right),

while $E_{j}$ is the $\left(d_{j}+1\right)\times\left(d_{j}+1\right)$ block

E_{j}=\begin{bmatrix}1&0&\cdots&0&0\\ 0&1&\cdots&0&a_{j,0}\\ \vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&\cdots&1&a_{j,d_{j}-2}\\ 0&0&\cdots&0&a_{j,d_{j}-1}\end{bmatrix}.

Since $\mu_{0}\in\Sigma_{r_{0}}$, the highest-order coefficients $a_{j,d_{j}-1}$ are nonzero. Furthermore, since all the $\tau_{j}$ are distinct, the matrix $V$ is nonsingular. Local invertibility follows. To estimate the norm of the inverse, we use bounds from [6]. ∎

Remark 4.5.

Note that as two nodes collide ($\delta\to 0$), the inversion of the multiplicity-restricted Prony mapping ${\mathcal{PM}}_{D_{0}}^{*}$ becomes ill-conditioned, proportionally to $\delta^{-\left(s_{0}+r_{0}\right)}$.
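A rough empirical illustration of this degeneration, under assumptions of ours (two simple real nodes, $s_{0}=r_{0}=2$, and the condition number of the confluent Vandermonde factor of the Jacobian taken as a proxy): the conditioning blows up rapidly as $\delta\to 0$, consistent with, though not a proof of, the $\delta^{-(s_{0}+r_{0})}$ rate in Remark 4.5.

```python
import numpy as np

def V22(t1, t2):
    """Confluent Vandermonde V(t1, 2, t2, 2) of (4.1), i.e. d_j + 1 = 2 for each node."""
    k = np.arange(4)
    return np.column_stack([t1**k, k * t1**np.maximum(k - 1, 0) * (k >= 1),
                            t2**k, k * t2**np.maximum(k - 1, 0) * (k >= 1)])

for delta in [1e-1, 1e-2, 1e-3]:
    print(delta, np.linalg.cond(V22(0.5, 0.5 + delta)))   # condition number grows as delta -> 0
```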

Let us stress that we are not aware of any general method of inverting ${\mathcal{PM}}_{D_{0}}^{*}$, i.e. of solving the multiplicity-restricted confluent Prony problem with the smallest possible number of measurements. As we demonstrate in [5], such a method exists in the very special case of a single point, i.e. $s=1$.

5. Rank-restricted Prony problem

Recall that the Prony problem consists in inverting the Prony mapping ${\mathcal{PM}}:{\mathcal{P}}_{d}\rightarrow{\mathcal{T}}_{d}$. So, given $\mu=(m_{0},\dots,m_{2d-1})\in{\mathcal{T}}_{d}$ we are looking for $(w,g)\in{\mathcal{P}}_{d}$ such that $m_{k}(g)=\int x^{k}g(x)dx=m_{k}$ for $k=0,1,\dots,2d-1$. If $\mu\in\Sigma_{r}$ with $r<d$, then in fact any neighborhood of $\mu$ contains points from the non-solvability set $\Sigma^{\prime}$. Indeed, consider the following example.

Example 5.1.

Slightly modifying the construction of Example 3.10, consider $\mu_{\ell_{1},\ell_{2},\epsilon}\in\mathbb{C}^{2d}$ with all the $m_{k}=0$ besides $m_{\ell_{1}}=1$ and $m_{\ell_{2}}=\epsilon$, such that $\ell_{2}>\ell_{1}+d-1$. For example, if $d=5$ and $\ell_{1}=2$, $\ell_{2}=8$, the corresponding matrix is

\tilde{M}_{5}^{\left(2,8,\epsilon\right)}=\begin{bmatrix}0&0&1&0&0&0\\ 0&1&0&0&0&0\\ 1&0&0&0&0&0\\ 0&0&0&0&0&\epsilon\\ 0&0&0&0&\epsilon&0\end{bmatrix}.

For $\epsilon=0$ the Prony problem is solvable, while for any small perturbation $\epsilon\neq 0$ it becomes unsolvable. However, if we restrict the whole problem to $d=3$, it remains solvable under any small perturbation of the input.

We therefore propose to consider the rank-restricted Prony problem, analogous to the construction of Section 4, but instead of fixing the multiplicity vector $D\left(g\right)$ we now fix the rank $r$ (recall Definition 1.1).

Definition 5.2.

Denote by ${\mathcal{P}}_{r}$ the following vector bundle:

{\mathcal{P}}_{r}=\left\{\left(w,g\right):\quad w\in\mathbb{C}^{r},\;g\in V_{w}\right\},

where $V_{w}$ is defined exactly as in Definition 2.4, replacing $d$ with $r$.

Likewise, we define the Stieltjes bundle of order rr as follows.

Definition 5.3.

Denote by ${\mathcal{S}}_{r}$ the following vector bundle:

{\mathcal{S}}_{r}=\left\{\left(w,\gamma\right):\qquad w\in\mathbb{C}^{r},\;\gamma\in W_{w}\right\},

where $W_{w}$ is defined exactly as in Definition 2.4, replacing $d$ with $r$.

The Stieltjes mapping acts naturally as a map ${\mathcal{SM}}:{\mathcal{P}}_{r}\to{\mathcal{S}}_{r}$, with exactly the same definition as Definition 2.7.

The restricted Taylor mapping ${\mathcal{TM}}_{r}:{\mathcal{S}}_{r}\to\mathbb{C}^{2r}$ is, as before, given by the truncated development at infinity to the first $2r$ Taylor coefficients.

Definition 5.4.

Let $\pi:\mathbb{C}^{2d}\to\mathbb{C}^{2r}$ denote the projection operator onto the first $2r$ coordinates. Denote $\Sigma_{r}^{*}\stackrel{\text{def}}{=}\pi\left(\Sigma_{r}\right)$. The rank-restricted Prony mapping ${\mathcal{PM}}_{r}^{*}:{\mathcal{P}}_{r}\to\Sigma_{r}^{*}$ is given by

{\mathcal{PM}}_{r}^{*}\left(\left(w,g\right)\right)=\left(m_{0},\dots,m_{2r-1}\right),\qquad m_{k}=m_{k}\left(g\right)=\int x^{k}g\left(x\right)\operatorname{d}x.
Remark 5.5.

${\mathcal{P}}_{r}$ can be embedded in ${\mathcal{P}}_{d}$, for example by the map $\Xi_{r}:{\mathcal{P}}_{r}\to{\mathcal{P}}_{d}$,

\Xi_{r}:\;\left(w,g\right)\in{\mathcal{P}}_{r}\longmapsto\left(w^{\prime},g^{\prime}\right)\in{\mathcal{P}}_{d}:\qquad w^{\prime}=\left(x_{1},\dots,x_{r},\underbrace{0,\dots,0}_{\times\left(d-r\right)}\right),\;g^{\prime}=g.

With this definition, ${\mathcal{PM}}_{r}^{*}$ can also be represented as the composition

{\mathcal{PM}}_{r}^{*}=\pi\circ{\mathcal{PM}}\circ\Xi_{r}.
Proposition 5.6.

The rank-restricted Prony mapping satisfies

{\mathcal{PM}}_{r}^{*}={\mathcal{TM}}_{r}\circ{\mathcal{SM}}.

Inverting ${\mathcal{PM}}_{r}^{*}$ represents the solution of the rank-restricted Prony problem. Unlike in the multiplicity-restricted setting of Section 4, here we allow two or more nodes to collide (thereby changing the multiplicity vector $D\left(g\right)$ of the solution).

The basic fact which makes this formulation useful is the following result.

Theorem 5.7.

Let $\mu_{0}^{*}\in\Sigma_{r}^{*}$. Then in a small neighborhood of $\mu_{0}^{*}$ in $\mathbb{C}^{2r}$, the Taylor mapping ${\mathcal{TM}}_{r}$ is continuously invertible.

Proof.

This is a direct consequence of the solution method for the Padé approximation problem described in Appendix A. Indeed, if the rank of $\tilde{M}_{r}$ is full, then it remains full in a small neighborhood of $\mu_{0}^{*}$, taken in the entire space $\mathbb{C}^{2r}$. Therefore, the system (A.$\star\star$) remains continuously invertible, producing the coefficients of the denominator $Q\left(z\right)$. Consequently, the right-hand side of (A.$\star$) depends continuously on the moment vector $\mu^{*}=\left(m_{0},\dots,m_{2r-1}\right)\in\mathbb{C}^{2r}$. Again, since the rank always remains full, the polynomials $P\left(z\right)$ and $Q\left(z\right)$ cannot have common roots, and thereby the solution $R=\frac{P}{Q}={\mathcal{TM}}_{r}^{-1}\left(\mu^{*}\right)$ depends continuously on $\mu^{*}$ (in the topology of the space of rational functions). ∎

In the next section, we consider the remaining problem: how to invert ${\mathcal{SM}}$ in this setting.

6. Collision singularities and bases of finite differences

6.1. Introduction

Collision singularities occur in Prony systems when some of the nodes $x_{i}$ in the signal $F(x)=\sum_{i=1}^{d}a_{i}\delta(x-x_{i})$ approach one another. This happens for $\mu$ near the discriminant stratum $\Delta\subset\mathbb{C}^{2d}$ consisting of those $(m_{0},\dots,m_{2d-1})$ for which some of the coordinates $\left\{x_{j}\right\}$ in the solution collide, i.e. the function $R_{D,X,A}\left(z\right)$ has multiple poles (or a nontrivial multiplicity vector $D$). As we shall see below, typically, as $\mu$ approaches $\mu_{0}\in\Delta$, i.e. some of the nodes $x_{i}$ collide, the corresponding coefficients $a_{i}$ tend to infinity. Notice that all the moments $m_{k}=m_{k}(F)$ remain bounded. This behavior creates serious difficulties in solving “near-colliding” Prony systems, both in theoretical and in practical settings. Especially demanding problems arise in the presence of noise. The problem of improving the resolution in the reconstruction of colliding nodes from noisy measurements appears in a wide range of applications. It is usually called a “super-resolution problem”, and many recent publications are devoted to its investigation in various mathematical and applied settings. See [8] and references therein for a very partial sample.

Here we continue our study of collision singularities in Prony systems, started in [21]. Our approach uses bases of finite differences in the Prony space ${\mathcal{P}}_{r}$ in order to “resolve” the linear part of collision singularities. In these bases the coefficients no longer blow up, even as some of the nodes collide.

Example 6.1.

Let $r=2$, and consider the signal $F=a_{1}\delta\left(x-x_{1}\right)+a_{2}\delta\left(x-x_{2}\right)$ with

x_{1}=t,\quad x_{2}=t+\epsilon,\qquad a_{1}=-\epsilon^{-1},\quad a_{2}=\epsilon^{-1}.

The corresponding Prony system is

\left(a_{1}x_{1}^{k}+a_{2}x_{2}^{k}=\right)\;m_{k}=kt^{k-1}+\underbrace{\sum_{j=2}^{k}{k\choose j}t^{k-j}\epsilon^{j-1}}_{\stackrel{\text{def}}{=}\rho_{k}\left(t,\epsilon\right)},\qquad k=0,1,2,3.

As $\epsilon\to 0$, the Prony system above becomes ill-conditioned and the coefficients $\left\{a_{j}\right\}$ blow up, while the measurements remain bounded. Note that

\tilde{M}_{2}=\begin{bmatrix}0&1&2t+\rho_{2}\left(t,\epsilon\right)\\ 1&2t+\rho_{2}\left(t,\epsilon\right)&3t^{2}+\rho_{3}\left(t,\epsilon\right)\end{bmatrix},

therefore $\operatorname{rank}\tilde{M}_{2}=2$ and $\left|M_{2}\right|=-1\neq 0$, i.e. the Prony problem with input $\left(m_{0},\dots,m_{3}\right)$ remains solvable for all $\epsilon$. However, the standard basis $\left\{\delta\left(x-x_{1}\right),\;\delta\left(x-x_{2}\right)\right\}$ degenerates, and in the limit it is no longer a basis. If we represent the solution

F_{\epsilon}\left(x\right)=-\frac{1}{\epsilon}\delta\left(x-t\right)+\frac{1}{\epsilon}\delta\left(x-t-\epsilon\right)

in the basis

\Delta_{1}\left(x_{1},x_{2}\right)=\delta\left(x-x_{1}\right),\qquad\Delta_{2}\left(x_{1},x_{2}\right)=\frac{1}{x_{1}-x_{2}}\delta\left(x-x_{1}\right)+\frac{1}{x_{2}-x_{1}}\delta\left(x-x_{2}\right),

then we have

F_{\epsilon}\left(x\right)=1\cdot\Delta_{2}\left(t,t+\epsilon\right),

i.e. the coefficients in this new basis are just $\left\{b_{1}=0,\;b_{2}=1\right\}$. As $\epsilon\to 0$, in fact we have

\Delta_{2}\left(t,t+\epsilon\right)\to\delta^{\prime}\left(x-t\right),

where the convergence is in the topology of the bundle ${\mathcal{P}}_{r}$.
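The contrast between the two bases can be observed numerically; the sketch below (NumPy; the values of $t$ and $\epsilon$ are arbitrary choices of ours) shows the standard-basis coefficients $\pm 1/\epsilon$ blowing up while the moments, which here coincide with $\Delta_{2}(x_{1},x_{2})$ applied to the monomials $x^{k}$, stay bounded and approach the moments of $\delta^{\prime}(x-t)$ in the paper's sign-free convention.

```python
import numpy as np

t, K = 0.7, 4
print("limit :", [k * t**(k - 1) for k in range(K)])   # moments of delta'(x - t), paper's convention
for eps in [1e-1, 1e-3, 1e-6]:
    x1, x2 = t, t + eps
    a1, a2 = -1 / eps, 1 / eps          # standard-basis coefficients: blow up as eps -> 0
    m = [a1 * x1**k + a2 * x2**k for k in range(K)]
    # in the divided-difference basis {Delta_1, Delta_2} the coefficient vector is (0, 1)
    # for every eps, while the moments stay bounded and converge to the limit above:
    print(f"eps={eps:g}:", np.round(m, 6))
```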

Our goal in this section is to generalize the construction of Example 6.1 and [21] to handle the general case of colliding configurations.

6.2. Divided finite differences

For a modern treatment of divided differences, see e.g. [9, 12, 16]. We follow [9] and adopt what has by now become the standard definition.

Definition 6.2.

Let an arbitrary sequence of points $w=\left(x_{1},x_{2},\dots\right)$ be given (repetitions are allowed). The $(n-1)$-st divided difference $\bar{\Delta}^{n-1}\left(w\right):\Pi\to\mathbb{C}$ is the linear functional on the space $\Pi$ of polynomials associating to each $p\in\Pi$ its (uniquely defined) $n$-th coefficient in the Newton form

p\left(x\right)=\sum_{j=1}^{\infty}\left\{\bar{\Delta}^{j-1}\left(x_{1},\dots,x_{j}\right)p\right\}\cdot q_{j-1,w}\left(x\right),\qquad q_{i,w}\left(x\right)\stackrel{\text{def}}{=}\prod_{j=1}^{i}\left(x-x_{j}\right). (6.1)

It turns out that this definition can be extended to all sufficiently smooth functions for which the interpolation problem is well-defined.

Definition 6.3 ([9]).

For any smooth enough function $f$, defined at least on $x_{1},\dots,x_{n}$, the divided finite difference $\bar{\Delta}^{n-1}\left(x_{1},\dots,x_{n}\right)f$ is the $n$-th coefficient in the Newton form (6.1) of the Hermite interpolation polynomial $P_{n}$, which agrees with $f$ and its derivatives of appropriate order on $x_{1},\dots,x_{n}$:

f^{\left(\ell\right)}\left(x_{j}\right)=P_{n}^{\left(\ell\right)}\left(x_{j}\right):\qquad 1\leqslant j\leqslant n,\;0\leqslant\ell<d_{j}\stackrel{\text{def}}{=}\#\left\{i:\;x_{i}=x_{j}\right\}. (6.2)

Therefore, each divided difference can be naturally associated with an element of the Prony space (see Item 5 in Proposition 6.4 and Definition 6.5 below for an accurate statement).

Let us now summarize the relevant properties of the functional $\bar{\Delta}$ which we shall use later on.

Proposition 6.4.

For $w=\left(x_{1},\dots,x_{n}\right)\in\mathbb{C}^{n}$, let $s\left(w\right)$, $T\left(w\right)$ and $D\left(w\right)$ be defined according to Definition 2.1. Let $q_{n,w}\left(z\right)=\prod_{j=1}^{s}\left(z-\tau_{j}\right)^{d_{j}}$ be defined as in (6.1).

(1) The functional $\bar{\Delta}^{n-1}\left(x_{1},\dots,x_{n}\right)$ is a symmetric function of its arguments, i.e. it depends only on the set $\left\{x_{1},\dots,x_{n}\right\}$ but not on its ordering.

(2) $\bar{\Delta}^{n-1}\left(x_{1},\dots,x_{n}\right)$ is a continuous function of the vector $\left(x_{1},\dots,x_{n}\right)$. In particular, for any test function $f$,

\lim_{\left(x_{1},\dots,x_{n}\right)\to\left(t_{1},\dots,t_{n}\right)}\bar{\Delta}^{n-1}\left(x_{1},\dots,x_{n}\right)f=\bar{\Delta}^{n-1}\left(t_{1},\dots,t_{n}\right)f.

(3) $\bar{\Delta}$ may be computed by the recursive rule (see the numerical sketch following this proposition)

\bar{\Delta}^{n-1}\left(x_{1},\dots,x_{n}\right)f=\begin{cases}\frac{\bar{\Delta}^{n-2}\left(x_{2},\dots,x_{n}\right)f-\bar{\Delta}^{n-2}\left(x_{1},\dots,x_{n-1}\right)f}{x_{n}-x_{1}},&x_{1}\neq x_{n},\\ \left\{\frac{\operatorname{d}}{\operatorname{d}\xi}\bar{\Delta}^{n-2}\left(\xi,x_{2},\dots,x_{n-1}\right)f\right\}\Big|_{\xi=x_{n}},&x_{1}=x_{n},\end{cases} (6.3)

where $\bar{\Delta}^{0}\left(x_{1}\right)f=f\left(x_{1}\right)$.

(4) Let $f_{z}\left(x\right)=\left(z-x\right)^{-1}$. Then for all $z\notin\left\{x_{1},\dots,x_{n}\right\}$,

\bar{\Delta}^{n-1}\left(x_{1},\dots,x_{n}\right)f_{z}=\frac{1}{q_{n,w}\left(z\right)}. (6.4)

(5) By (6.2), $\bar{\Delta}^{n-1}\left(x_{1},\dots,x_{n}\right)$ is a linear combination of the functionals

\delta^{\left(\ell\right)}\left(x-\tau_{j}\right),\qquad 1\leqslant j\leqslant s,\;0\leqslant\ell<d_{j}.

In fact, using (6.4) we obtain Chakalov’s expansion (see [9])

\bar{\Delta}^{n-1}\left(x_{1},\dots,x_{n}\right)=\sum_{j=1}^{s}\sum_{\ell=0}^{d_{j}-1}a_{j,\ell}\delta^{\left(\ell\right)}\left(x-\tau_{j}\right), (6.5)

where the coefficients $\left\{a_{j,\ell}\right\}$ are defined by the partial fraction decomposition (they may be readily obtained by the Cauchy residue formula $a_{j,\ell}=\frac{1}{\left(d_{j}-1-\ell\right)!}\lim_{z\to\tau_{j}}\left(\frac{\operatorname{d}}{\operatorname{d}z}\right)^{d_{j}-1-\ell}\left\{\frac{\left(z-\tau_{j}\right)^{\ell+1}}{q_{n,w}\left(z\right)}\right\}$):

\frac{1}{q_{n,w}\left(z\right)}=\sum_{j=1}^{s}\sum_{\ell=0}^{d_{j}-1}\frac{\ell!\,a_{j,\ell}}{\left(z-\tau_{j}\right)^{\ell+1}}. (6.6)

(6) By (6.5) and (6.6),

\bar{\Delta}^{n-1}\left(\underbrace{t,\dots,t}_{\times n}\right)=\frac{1}{\left(n-1\right)!}\delta^{\left(n-1\right)}\left(x-t\right). (6.7)

(7) Popoviciu’s refinement lemma [9, Proposition 23]: for every index subsequence

1\leqslant\sigma\left(1\right)<\sigma\left(2\right)<\dots<\sigma\left(k\right)\leqslant n,

there exist coefficients $\alpha\left(j\right)$ such that

\bar{\Delta}^{k-1}\left(x_{\sigma\left(1\right)},\dots,x_{\sigma\left(k\right)}\right)=\sum_{j=\sigma\left(1\right)-1}^{\sigma\left(k\right)-k}\alpha\left(j\right)\,\bar{\Delta}^{k-1}\left(x_{j+1},x_{j+2},\dots,x_{j+k}\right). (6.8)

Based on the above, we may now identify $\bar{\Delta}$ with elements of the bundle ${\mathcal{P}}_{r}$.

Definition 6.5.

Let $w=\left(x_{1},\dots,x_{r}\right)\in\mathbb{C}^{r}$, and let $X=\left\{n_{1},n_{2},\dots,n_{\alpha}\right\}\subseteq\left\{1,2,\dots,r\right\}$ of size $\left|X\right|=\alpha$ be given. Let the elements of $X$ be enumerated in increasing order, i.e.

1\leqslant n_{1}<n_{2}<\dots<n_{\alpha}\leqslant r.

Denote by $w_{X}$ the vector

w_{X}\stackrel{\text{def}}{=}\left(x_{n_{1}},x_{n_{2}},\dots,x_{n_{\alpha}}\right)\in\mathbb{C}^{\alpha}.

Then we denote

\Delta_{X}\left(w\right)\stackrel{\text{def}}{=}\bar{\Delta}^{\alpha-1}\left(w_{X}\right).

We immediately obtain the following result.

Lemma 6.6.

For all wrw\in\mathbb{C}^{r} and X{1,2,,r}X\subseteq\left\{1,2,\dots,r\right\}, we have ΔX(w)Vw\Delta_{X}\left(w\right)\in V_{w}. Moreover, letting α=|X|\alpha=\left|X\right| we have

SM(ΔX(w))= Δα1(wX)1zx=1qα,wX(z).{\mathcal{}SM}\left(\Delta_{X}\left(w\right)\right)=\mathord{\kern 4.29993pt\vrule width=0.6pt,height=5.6pt,depth=-0.28pt\kern-4.29993pt\Delta}^{\alpha-1}\left(w_{X}\right)\frac{1}{z-x}=\frac{1}{q_{\alpha,w_{X}}\left(z\right)}. (6.9)

Finally, (w,ΔX(w))\left(w,\Delta_{X}\left(w\right)\right) is a continuous section of Pr{\mathcal{}P}_{r}.

6.3. Constructing a basis

The following result is well-known, see e.g. [9, Proposition 35].

Theorem 6.7.

Denote Nj={1,2,,j}N_{j}=\left\{1,2,\dots,j\right\} for j=1,2,,rj=1,2,\dots,r. Then for every wrw\in\mathbb{C}^{r}, the collection

{ΔNj(w)}j=1r\left\{\Delta_{N_{j}}\left(w\right)\right\}_{j=1}^{r}

is a basis for VwV_{w}.

There are various proofs of this statement. Below we show how to construct sets which do not necessarily remain a basis for all w\in\mathbb{C}^{r}, but only for w in a small neighborhood of a given w_{0}\in\mathbb{C}^{r}. Theorem 6.7 will then follow as a special case of this construction.

Informally, if two coordinates x_{i} and x_{j} can collide, then it is necessary to allow them to be glued by some element of the basis, i.e. we will need \Delta_{X}\left(w\right) with i,j\in X (in Theorem 6.7 all coordinates might eventually be glued into a single point because w is unrestricted). In order to make this statement formal, let us introduce the notion of a configuration, which is essentially a partition of the set of indices.

Definition 6.8.

A configuration C{\mathcal{}C} is a partition of the set Nr={1,2,,r}N_{r}=\left\{1,2,\dots,r\right\} into s=s(C)s=s\left({\mathcal{}C}\right) disjoint nonempty subsets

i=1sXi=Nr,|Xi|=di>0.\sqcup_{i=1}^{s}X_{i}=N_{r},\qquad\left|X_{i}\right|=d_{i}>0.

The multiplicity vector of C{\mathcal{}C} is

T(C)=(d1,,ds).T\left({\mathcal{}C}\right)=\left(d_{1},\dots,d_{s}\right).

Every configuration defines a continuous family of divided differences as follows.

Definition 6.9.

Let a configuration {\mathcal{}C}=\left\{X_{j}\right\}_{j=1}^{s\left({\mathcal{}C}\right)} be given. Enumerate each X_{j} in increasing order of its elements

X_{j}=\left\{n_{1}^{j}<n_{2}^{j}<\dots<n_{d_{j}}^{j}\right\}

and denote for every m=1,2,,djm=1,2,\dots,d_{j}

Xj,m=def{nkj:k=1,2,,m}.X_{j,m}\stackrel{{\scriptstyle\text{def}}}{{=}}\left\{n_{k}^{j}:\;k=1,2,\dots,m\right\}.

For every wrw\in\mathbb{C}^{r}, the collection BC(w)Vw{\mathcal{}B_{C}}\left(w\right)\subset V_{w} is defined as follows:

BC(w)=def{ΔXj,m(w)}j=1,,s(C)m=1,,dj.{\mathcal{}B_{C}}\left(w\right)\stackrel{{\scriptstyle\text{def}}}{{=}}\left\{\Delta_{X_{j,m}}\left(w\right)\right\}_{j=1,\dots,s\left({\mathcal{}C}\right)}^{m=1,\dots,d_{j}}.

Now we formally define when a partition is “good” with respect to a point wrw\in\mathbb{C}^{r}.

Definition 6.10.

The point w=(x1,,xr)rw=\left(x_{1},\dots,x_{r}\right)\in\mathbb{C}^{r} is subordinated to the configuration C={Xj}j=1s(C){\mathcal{}C}=\left\{X_{j}\right\}_{j=1}^{s\left({\mathcal{}C}\right)} if whenever xk=xx_{k}=x_{\ell} for a pair of indices kk\neq\ell, then necessarily k,Xjk,\ell\in X_{j} for some XjX_{j}.

Now we are ready to formulate the main result of this section.

Theorem 6.11.

For a given w0rw_{0}\in\mathbb{C}^{r} and a configuration C{\mathcal{}C}, the collection BC(w0){\mathcal{}B_{C}}\left(w_{0}\right) is a basis for Vw0V_{w_{0}} if and only if w0w_{0} is subordinated to C{\mathcal{}C}. In this case, BC(w){\mathcal{}B_{C}}\left(w\right) is a continuous family of bases for VwV_{w} in a sufficiently small neighborhood of w0w_{0}.

Let us first make a technical computation.

Lemma 6.12.

For a configuration C{\mathcal{}C} and a point wrw\in\mathbb{C}^{r}, consider for every fixed j=1,,s(C)j=1,\dots,s\left({\mathcal{}C}\right) the set

Sj=def{ΔXj,m(w)}m=1dj.S_{j}\stackrel{{\scriptstyle\text{def}}}{{=}}\left\{\Delta_{X_{j,m}}\left(w\right)\right\}_{m=1}^{d_{j}}. (6.10)
  1. (1)

    Define for any pair of indices 1kdj1\leqslant k\leqslant\ell\leqslant d_{j} the index set

    Xj,k:=def{nkj<nk+1j<<nj}Xj=Xj,1:dj=Xj,dj.X_{j,k:\ell}\stackrel{{\scriptstyle\text{def}}}{{=}}\left\{n_{k}^{j}<n_{k+1}^{j}<\dots<n_{\ell}^{j}\right\}\subseteq X_{j}=X_{j,1:d_{j}}=X_{j,d_{j}}.

    Then

    ΔXj,k:(w)spanSj.\Delta_{X_{j,k:\ell}}\left(w\right)\in\operatorname{span}S_{j}.
  2. (2)

    For an arbitrary subset YXjY\subseteq X_{j} (and not necessarily containing segments of consecutive indices), we also have

    ΔY(w)spanSj.\Delta_{Y}\left(w\right)\in\operatorname{span}S_{j}.
Proof.

For clarity, we denote yi=xnijy_{i}=x_{n_{i}^{j}} and [k:]=ΔXj,k:(w)\left[k:\ell\right]=\Delta_{X_{j,k:\ell}}\left(w\right). By (6.3) we have in all cases (including repeated nodes)

(yyk)[k:]=[k+1:][k:1].\left(y_{\ell}-y_{k}\right)\left[k:\ell\right]=\left[k+1:\ell\right]-\left[k:\ell-1\right]. (6.11)

The proof of the first statement is by backward induction on the number of nodes \ell-k+1. The base case is \ell-k+1=d_{j}, where obviously \left[1:d_{j}\right]\in S_{j}. In addition, by the definition of S_{j} we have \left[1:m\right]\in S_{j} for all m=1,\dots,d_{j}. Now assume that all \left[k:\ell\right] with \ell-k+1=n already belong to \operatorname{span}S_{j}; in order to obtain all \left[k:\ell\right] with \ell-k+1=n-1, we apply (6.11) several times as follows.

[2:n]\displaystyle\left[2:n\right] =\displaystyle= (yny1)[1:n]+[1:n1]\displaystyle\left(y_{n}-y_{1}\right)\left[1:n\right]+\left[1:n-1\right]
[3:n+1]\displaystyle\left[3:n+1\right] =\displaystyle= (yn+1y2)[2:n+1]+[2:n]¯\displaystyle\left(y_{n+1}-y_{2}\right)\underleftrightarrow{\left[2:n+1\right]}+\underline{\left[2:n\right]}
\displaystyle\dots
[djn+2:dj]\displaystyle\left[d_{j}-n+2:d_{j}\right] =\displaystyle= (ydjydjn+1)[djn+1:dj]+[djn+1:dj1]¯\displaystyle\left(y_{d_{j}}-y_{d_{j}-n+1}\right)\underleftrightarrow{\left[d_{j}-n+1:d_{j}\right]}+\underline{\left[d_{j}-n+1:d_{j}-1\right]}

Here the symbol ¯\underline{\cdots} under a term means that the term is taken directly from the previous line, while \underleftrightarrow{\cdots} indicates that the induction hypothesis is used. In the end, the left-hand side terms are shown to belong to spanSj\operatorname{span}S_{j}.

In order to prove the second statement, we employ the first statement, (6.8) and Proposition 6.4, Item 1. ∎

Proof of Theorem 6.11.

In one direction, assume that w0=(x1,,xr)w_{0}=\left(x_{1},\dots,x_{r}\right) is subordinated to C{\mathcal{}C}. It is sufficient to show that every element of the standard basis (2.2) belongs to span{BC(w0)}\operatorname{span}\left\{{\mathcal{}B_{C}}\left(w_{0}\right)\right\}.

Let τjT(w0)\tau_{j}\in T\left(w_{0}\right), let djd_{j} be the corresponding multiplicity, and let YjNrY_{j}\subseteq N_{r} denote the index set of size djd_{j}

Yj=def{i:xi=τj}.Y_{j}\stackrel{{\scriptstyle\text{def}}}{{=}}\left\{i:\quad x_{i}=\tau_{j}\right\}.

By the definition of subordination, there exists an element in the partition of C{\mathcal{}C}, say XkX_{k}, for which YjXkY_{j}\subseteq X_{k}. By Lemma 6.12 we conclude that for all subsets ZYjZ\subseteq Y_{j},

ΔZ(w0)span{ΔXk,m(w0)}m=1|Xk|span{BC(w0)}.\Delta_{Z}\left(w_{0}\right)\in\operatorname{span}\left\{\Delta_{X_{k,m}}\left(w_{0}\right)\right\}_{m=1}^{\left|X_{k}\right|}\subseteq\operatorname{span}\left\{{\mathcal{}B_{C}}\left(w_{0}\right)\right\}.

By (6.7), ΔZ(w0)\Delta_{Z}\left(w_{0}\right) is nothing else but

ΔZ(w0)= Δ|Z|1(τj,,τj×|Z|)=1(|Z|1)!δ(|Z|1)(xτj).\Delta_{Z}\left(w_{0}\right)=\mathord{\kern 4.29993pt\vrule width=0.6pt,height=5.6pt,depth=-0.28pt\kern-4.29993pt\Delta}^{\left|Z\right|-1}\left(\underbrace{\tau_{j},\dots,\tau_{j}}_{\times\left|Z\right|}\right)=\frac{1}{\left(\left|Z\right|-1\right)!}\delta^{\left(\left|Z\right|-1\right)}\left(x-\tau_{j}\right).

This completes the proof of sufficiency. In the other direction, assume by contradiction that x_{k}=x_{\ell}=\tau for some pair of indices k\neq\ell, but nevertheless there exist two distinct elements of the partition {\mathcal{}C}, say X_{\alpha} and X_{\beta}, such that k\in X_{\alpha} and \ell\in X_{\beta}. Let the sets \left\{S_{j}\right\}_{j=1}^{s\left({\mathcal{}C}\right)} be defined by (6.10). Again, by Lemma 6.12 and (6.7) we conclude that

δ(xτ)spanSαspanSβ.\delta\left(x-\tau\right)\in\operatorname{span}S_{\alpha}\cap\operatorname{span}S_{\beta}.

But notice that {\mathcal{}B_{C}}\left(w_{0}\right)=\bigcup_{j=1}^{s\left({\mathcal{}C}\right)}S_{j} and \sum_{j=1}^{s}\left|S_{j}\right|=r, therefore by counting dimensions we conclude that

\dim\operatorname{span}\left\{{\mathcal{}B_{C}}\left(w_{0}\right)\right\}<r,

in contradiction to the assumption that BC(w0){\mathcal{}B_{C}}\left(w_{0}\right) is a basis.

Finally, one can evidently choose a sufficiently small neighborhood UrU\subset\mathbb{C}^{r} of w0w_{0} such that for all wUw\in U, no new collisions are introduced, i.e. ww is still subordinated to C{\mathcal{}C}. The continuity argument (Lemma 6.6) finishes the proof.∎

Remark 6.13.

Another possible method of proof is to consider the algebra of elementary fractions in the Stieltjes space Sr{\mathcal{}S}_{r}, and use the correspondence (6.9).

As we mentioned, Theorem 6.7 follows as a corollary of Theorem 6.11 for the configuration C{\mathcal{}C} consisting of a single partition set NrN_{r}.
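As a small sanity check of Theorems 6.7 and 6.11, the following SymPy sketch (ours, with hypothetical helper names; the divided-difference routine repeats the one sketched after Proposition 6.4, so that the fragment stays self-contained) takes r = 3 and w_0 = (1,1,2). A functional in V_{w_0} is determined by its values on the monomials 1, x, …, x^{r-1}, so B_C(w_0) is a basis exactly when the r×r matrix of these values is nonsingular. The configuration {{1,2},{3}}, to which w_0 is subordinated, and the single-block configuration of Theorem 6.7 give nonzero determinants, while {{1,3},{2}} does not, since the colliding coordinates x_1 = x_2 are split between different blocks.

```python
# A SymPy check (ours) of the subordination criterion of Theorem 6.11 for w_0 = (1, 1, 2).
import sympy as sp

x = sp.symbols('x')

def divdiff(f, nodes):
    """Divided difference at the given nodes via the recursion (6.3)."""
    nodes = [sp.sympify(t) for t in nodes]
    if len(nodes) == 1:
        return f.subs(x, nodes[0])
    if nodes[0] != nodes[-1]:
        return sp.cancel((divdiff(f, nodes[1:]) - divdiff(f, nodes[:-1]))
                         / (nodes[-1] - nodes[0]))
    xi = sp.Dummy('xi')
    inner = divdiff(f, [xi] + nodes[1:-1])
    return sp.limit(sp.diff(inner, xi), xi, nodes[-1])

def moment_matrix(w, configuration):
    """Rows: the functionals Delta_{X_{j,m}}(w) of B_C(w) applied to 1, x, ..., x^{r-1}."""
    r = len(w)
    rows = []
    for X in configuration:                      # each block X_j of the partition C
        X = sorted(X)
        for m in range(1, len(X) + 1):           # the nested subsets X_{j,m}
            nodes = [w[i - 1] for i in X[:m]]    # indices are 1-based, as in the text
            rows.append([divdiff(x**k, nodes) for k in range(r)])
    return sp.Matrix(rows)

w0 = (1, 1, 2)
print(moment_matrix(w0, [[1, 2], [3]]).det())    # nonzero: w_0 subordinated, B_C is a basis
print(moment_matrix(w0, [[1, 2, 3]]).det())      # nonzero: the single-block case of Theorem 6.7
print(moment_matrix(w0, [[1, 3], [2]]).det())    # 0: not subordinated, B_C degenerates
```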

6.4. Resolution of collision singularities

Let μ0Σr2r\mu_{0}^{*}\in\Sigma_{r}^{*}\subset\mathbb{C}^{2r} be given, and let (w0,g0)Pr\left(w_{0},g_{0}\right)\in{\mathcal{}P}_{r} be a solution to the (rank-restricted) Prony problem. The point w0w_{0} is uniquely defined up to a permutation of the coordinates, so we just fix a particular permutation. Let T(w0)=(τ1,,τs)T\left(w_{0}\right)=\left(\tau_{1},\dots,\tau_{s}\right).

Our goal is to solve the rank-restricted Prony problem for every input μ2r\mu^{*}\in\mathbb{C}^{2r} in a small neighborhood of μ0\mu_{0}^{*}. According to Theorem 5.7, this amounts to a continuous representation of the solution Rμ(z)=Pμ(z)Qμ(z)=TMr1(μ)R_{\mu^{*}}\left(z\right)=\frac{P_{\mu^{*}}\left(z\right)}{Q_{\mu^{*}}\left(z\right)}={\mathcal{}TM}_{r}^{-1}\left(\mu^{*}\right) to the corresponding diagonal Padé approximation problem as an element of the bundle Pr{\mathcal{}P}_{r}.

Define δ=minij|τiτj|\delta=\min_{i\neq j}\left|\tau_{i}-\tau_{j}\right| to be the “separation distance” between the clusters. Since the roots of QμQ_{\mu^{*}} depend continuously on μ\mu^{*} and the degree of QμQ_{\mu^{*}} does not drop, we can choose some μ1\mu_{1}^{*} sufficiently close to μ0\mu_{0}^{*}, for which

  1. (1)

    all the roots of Qμ1(z)Q_{\mu_{1}^{*}}\left(z\right) are distinct, and

  2. (2)

    these roots can be grouped into ss clusters, such that each of the elements of the jj-th cluster is at most δ/3\delta/3 away from τj\tau_{j}.

Enumerate the roots of Qμ1Q_{\mu_{1}^{*}} within each cluster in an arbitrary manner. This choice enables us to define locally (in a neighborhood of μ1\mu_{1}^{*}) rr algebraic functions x1(μ),,xr(μ)x_{1}\left(\mu^{*}\right),\dots,x_{r}\left(\mu^{*}\right), satisfying

Q_{\mu^{*}}\left(z\right)=\prod_{j=1}^{r}\left(z-x_{j}\left(\mu^{*}\right)\right).

Then we extend these functions by analytic continuation according to the above formula into the entire neighborhood of μ0\mu_{0}^{*}. Consequently,

w(μ)=def(x1(μ),,xr(μ))w\left(\mu^{*}\right)\stackrel{{\scriptstyle\text{def}}}{{=}}\left(x_{1}\left(\mu^{*}\right),\dots,x_{r}\left(\mu^{*}\right)\right)

is a continuous (multivalued) algebraic function in a neighborhood of μ0\mu_{0}^{*}, satisfying

w(μ0)=w0.w\left(\mu_{0}^{*}\right)=w_{0}.

After this “pre-processing” step, we can solve the rank-restricted Prony problem in this neighborhood of μ0\mu_{0}^{*}, as follows.

Let μ0Σr2r\mu_{0}^{*}\in\Sigma_{r}^{*}\subset\mathbb{C}^{2r} be given, and let (w0,g0)Pr\left(w_{0},g_{0}\right)\in{\mathcal{}P}_{r} be a solution to the (rank-restricted) Prony problem. Let w0w_{0} be subordinated to some configuration C{\mathcal{}C}.

The input to the problem is a measurement vector μ=(m0,,m2r1)2r\mu^{*}=\left(m_{0},\dots,m_{2r-1}\right)\in\mathbb{C}^{2r}, which is in a small neighborhood of μ0\mu_{0}^{*}.

  1. (1)

    Construct the function w=w(μ)w=w\left(\mu^{*}\right) as described above.

  2. (2)

    Build the basis BC(w)={ΔXj,(w)}j=1,,s(C)=1,,dj{\mathcal{}B_{C}}\left(w\right)=\left\{\Delta_{X_{j,\ell}}\left(w\right)\right\}_{j=1,\dots,s\left({\mathcal{}C}\right)}^{\ell=1,\dots,d_{j}} for VwV_{w}.

  3. (3)

    Find the coefficients {βj,}j=1,,s(C)=1,,dj\left\{\beta_{j,\ell}\right\}_{j=1,\dots,s\left({\mathcal{}C}\right)}^{\ell=1,\dots,d_{j}} such that

    SM(j,βj,ΔXj,(w))=R(z),{\mathcal{}SM}\left(\sum_{j,\ell}\beta_{j,\ell}\Delta_{X_{j,\ell}}\left(w\right)\right)=R\left(z\right),

    by solving the linear system

    j,βj,(w)ΔXj,(w)=g(w)(xk)=mk(=xkg(w)(x)dx),k=0,1,,2r1.\underbrace{\sum_{j,\ell}\beta_{j,\ell}\left(w\right)\Delta_{X_{j,\ell}}\left(w\right)}_{=g\left(w\right)}\left(x^{k}\right)=m_{k}\left(=\int x^{k}g\left(w\right)\left(x\right)\operatorname{d}x\right),\qquad k=0,1,\dots,2r-1. (6.12)
Algorithm 1: Solving the rank-restricted Prony problem with collisions.
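To give a feeling for the steps of Algorithm 1, here is a simplified numerical sketch (our own illustration, not the paper's implementation; the function names are hypothetical). It treats only the generic configuration in which every X_j is a singleton, so that w is subordinated to C precisely when all nodes are distinct; in that case B_C(w) consists of the point evaluations δ(x − x_j), and (6.12) becomes the classical Prony system. The node vector is obtained here directly from the Hankel construction of Appendix A rather than by the continuation argument described above, and genuine collisions would require the confluent elements Δ_{X_{j,ℓ}}(w).

```python
# A simplified numerical sketch (ours) of Algorithm 1 for the generic configuration in
# which all blocks X_j are singletons, i.e. all recovered nodes are simple.
import numpy as np

def prony_simple_nodes(moments, r):
    """moments: m_0,...,m_{2r-1}; returns (nodes, coefficients), assuming distinct nodes."""
    m = np.asarray(moments, dtype=complex)
    # Step 1: recover w = (x_1,...,x_r) as the roots of Q(z) = z^r + c_{r-1} z^{r-1} + ... + c_0,
    # whose coefficients solve the Hankel system of Appendix A (here M_r is invertible).
    H = np.array([[m[i + j] for j in range(r)] for i in range(r)])
    c = np.linalg.solve(H, -m[r:2 * r])
    nodes = np.roots(np.concatenate(([1.0], c[::-1])))
    # Steps 2-3: B_C(w) = {delta(x - x_j)}, so the system (6.12) reads sum_j beta_j x_j^k = m_k.
    V = np.vander(nodes, N=2 * r, increasing=True).T          # V[k, j] = x_j^k
    beta, *_ = np.linalg.lstsq(V, m, rcond=None)
    return nodes, beta

# Example with r = 2: F = 2*delta(x - 0.5) - delta(x + 0.3)
x_true, a_true = np.array([0.5, -0.3]), np.array([2.0, -1.0])
moments = [np.sum(a_true * x_true**k) for k in range(4)]
print(prony_simple_nodes(moments, r=2))
```

Near a collision the matrix V becomes ill-conditioned, which is exactly the phenomenon that the confluent basis B_C(w) and the continuous representation of Theorem 6.14 are designed to handle.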
Theorem 6.14.

The coordinates {βj,}\left\{\beta_{j,\ell}\right\} of the solution to the rank-restricted Prony problem, given by Algorithm 1, are (multivalued) algebraic functions, continuous in a neighborhood of the point μ0\mu_{0}^{*} .

Proof.

Since the divided differences \Delta_{j,\ell}\left(w\right)=\Delta_{X_{j,\ell}}\left(w\right) are continuous in w, clearly for each k=0,1,\dots,2r-1 the functions

νj,,k(w)=Δj,(w)(xk)= Δ1(wXj,)(xk)\nu_{j,\ell,k}\left(w\right)=\Delta_{j,\ell}\left(w\right)\left(x^{k}\right)=\mathord{\kern 4.29993pt\vrule width=0.6pt,height=5.6pt,depth=-0.28pt\kern-4.29993pt\Delta}^{\ell-1}\left(w_{X_{j,\ell}}\right)\left(x^{k}\right)

are continuous in w (in fact, \nu_{j,\ell,k}\left(w\right) are symmetric polynomials in some of the coordinates of w), and hence continuous, as multivalued functions, in a neighborhood of \mu_{0}^{*}. Since {\mathcal{}B_{C}}\left(w\left(\mu^{*}\right)\right) remains a basis in a (possibly smaller) neighborhood of \mu_{0}^{*}, the system (6.12), taking the form

j,νj,,k(w)βj,(w)=mk,k=0,1,,2r1,\sum_{j,\ell}\nu_{j,\ell,k}\left(w\right)\beta_{j,\ell}\left(w\right)=m_{k},\qquad k=0,1,\dots,2r-1,

remains non-degenerate in this neighborhood. We conclude that the coefficients {βj,(w(μ))}\left\{\beta_{j,\ell}\left(w\left(\mu^{*}\right)\right)\right\} are multivalued algebraic functions, continuous in a neighborhood of μ0\mu_{0}^{*}. ∎

7. Real Prony space and hyperbolic polynomials

In this section we shall restrict ourselves to the real case. Notice that in many applications only real Prony systems are used. On the other hand, considering the Prony problem over the real numbers significantly simplifies some constructions. In particular, we can easily avoid the topological problems related to the choice of the ordering of the points x_{1},\dots,x_{d}\in\mathbb{C}. So in the definition of the real Prony space R{\mathcal{}P}_{d} we assume that the coordinates x_{1},\ldots,x_{d} are taken with their natural ordering x_{1}\leq x_{2}\leq\dots\leq x_{d}. Accordingly, the real Prony space R{\mathcal{}P}_{d} is defined as the bundle (w,g),\ w\in\prod_{d}\subset{\mathbb{R}}^{d},\ g\in RV_{w}. Here \prod_{d} is the prism in {\mathbb{R}}^{d} defined by the inequalities x_{1}\leq x_{2}\leq\dots\leq x_{d}, and RV_{w} is the space of linear combinations with real coefficients of \delta-functions and their derivatives supported on \{x_{1},\ldots,x_{d}\}, as in Definition 2.4. The Prony, Stieltjes and Taylor maps are the restrictions to the real case of the complex maps defined above.

In this paper we just point out a remarkable connection of the real Prony space and mapping with hyperbolic polynomials, and Vieta and Vandermonde mappings studied in Singularity Theory (see [1, 13, 14, 15] and references therein).

Hyperbolic polynomials (in one variable) are real polynomials Q(z)=z^{d}+\sum_{j=1}^{d}\lambda_{j}z^{d-j}, with all d of their roots real. We denote by \Gamma_{d} the space of the coefficients \Lambda=(\lambda_{1},\ldots,\lambda_{d})\in{\mathbb{R}}^{d} of all the hyperbolic polynomials, and by \hat{\Gamma}_{d} the set of \Lambda\in\Gamma_{d} with \lambda_{1}=0,\ |\lambda_{2}|\leq 1. Recalling (2.3), it is evident that all hyperbolic polynomials appear as the denominators of the irreducible fractions in the image of R{\mathcal{}P}_{d} by {\mathcal{}SM}. This shows, in particular, that the geometry of the boundary \partial\Gamma of the hyperbolicity domain \Gamma is important in the study of the real Prony map {\mathcal{}PM}: it is mapped by {\mathcal{}PM} to the boundary of the solvability domain of the real Prony problem. This geometry has been studied in a number of publications since the mid-1980s. In [13] V. P. Kostov has shown that \hat{\Gamma} possesses the Whitney property: there is a constant C such that any two points \lambda_{1},\lambda_{2}\in\hat{\Gamma} can be connected by a curve inside \hat{\Gamma} of length at most C\|\lambda_{2}-\lambda_{1}\|. The “Vieta mapping”, which associates to the nodes x_{1}\leq x_{2}\leq\dots\leq x_{d} the coefficients of the polynomial Q(z) having these nodes as its roots, is also studied in [13]. In our notation, the Vieta mapping is the composition of the Stieltjes mapping {\mathcal{}SM} with the projection onto the coefficients of the denominator.

In [1] V. I. Arnold introduced and studied the notion of a maximal hyperbolic polynomial, relevant in the description of \hat{\Gamma}. Furthermore, the Vandermonde mapping {\mathcal{}V}:{\mathbb{R}}^{d}\rightarrow{\mathbb{R}}^{d} was defined there by

{y1=a1x1++adxd,yd=a1x1d++adxdd,\displaystyle\begin{cases}y_{1}=a_{1}x_{1}+\ldots+a_{d}x_{d},\\ \dots\\ y_{d}=a_{1}x_{1}^{d}+\ldots+a_{d}x_{d}^{d},\end{cases}

with a_{1},\ldots,a_{d} fixed. In our notation, {\mathcal{}V} is the restriction of the Prony mapping to the pairs (w,g)\in R{\mathcal{}P}_{d} with the coefficients of g in the standard basis of RV_{w} fixed. It was shown in [1] that for a_{1},\ldots,a_{d}>0 the mapping {\mathcal{}V} is one-to-one from \prod_{d} onto its image. In other words, the first d moments uniquely define the nodes x_{1}\leq x_{2}\leq\dots\leq x_{d}. For a_{1},\ldots,a_{d} with varying signs this is no longer true in general. This result is applied in [1] to the study of colliding configurations.

Next, the “Vandermonde varieties”, studied in [1], are defined by the equations

{a1x1++adxd=α1,a1x1++adxd=α.\displaystyle\begin{cases}a_{1}x_{1}+\ldots+a_{d}x_{d}&=\alpha_{1},\\ &\dots\\ a_{1}x_{1}^{\ell}+\ldots+a_{d}x_{d}^{\ell}&=\alpha_{\ell}.\end{cases} d.\displaystyle\ell\leqslant d.

It is shown that for a_{1},\ldots,a_{d}>0 the intersections of such varieties with \prod_{d} are either contractible or empty. Finally, the critical points of the next Vandermonde equation on the Vandermonde variety are studied in detail, and on this basis a new proof of Kostov’s theorem is given.

We believe that the results of [1, 13] and their continuation in [14, 15] and other publications are important for the study of the Prony problem over the reals, and we plan to present some results in this direction separately.

Appendix A Proof of Theorem 3.5

Recall that we are interested in finding conditions under which the Taylor mapping {\mathcal{}TM}:\;{\mathcal{}S}_{d}\to{\mathcal{}T}_{d} is invertible. In other words, given

S(z)=k=02d1mk(1z)k+1,S\left(z\right)=\sum_{k=0}^{2d-1}m_{k}\left(\frac{1}{z}\right)^{k+1},

we are looking for a rational function R(z)SdR\left(z\right)\in{\mathcal{}S}_{d} such that

S(z)R(z)=d1z2d+1+d2z2d+2+.S\left(z\right)-R\left(z\right)=\frac{d_{1}}{z^{2d+1}}+\frac{d_{2}}{z^{2d+2}}+\dots. (A.1)

Write R(z)=P(z)Q(z)R\left(z\right)=\frac{P\left(z\right)}{Q\left(z\right)} with Q(z)=j=0dcjzjQ\left(z\right)=\sum_{j=0}^{d}c_{j}z^{j} and P(z)=i=0d1biziP\left(z\right)=\sum_{i=0}^{d-1}b_{i}z^{i}. Multiplying (A.1) by Q(z)Q\left(z\right), we obtain

Q(z)S(z)P(z)=e1zd+1+e2zd+2+.Q\left(z\right)S\left(z\right)-P\left(z\right)=\frac{e_{1}}{z^{d+1}}+\frac{e_{2}}{z^{d+2}}+\dots. (A.2)
Proposition A.1.

The identity (A.2), considered as an equation on PP and QQ with degP<degQd\deg P<\deg Q\leq d, always has a solution.

Proof.

Substituting the expressions for S,PS,\ P and QQ into (A.2) we get

(c0+c1z++cdzd)(m0z+m1z2+)b0bd1zd1=e1zd+1+.\left(c_{0}+c_{1}z+\dots+c_{d}z^{d}\right)\left(\frac{m_{0}}{z}+\frac{m_{1}}{z^{2}}+\dots\right)-b_{0}-\dots-b_{{d-1}}z^{{d-1}}=\frac{e_{1}}{z^{{d+1}}}+\dots. (A.3)

The highest degree of zz in the left hand side of (A.3) is d1d-1. So equating to zero the coefficients of zsz^{s} in (A.3) for s=d1,,ds=d-1,\dots,-d we get the following systems of equations:

[000m000m0m1\adots\adotsm0m1md1][c1c2cd]=[bd1bd2b0].\begin{bmatrix}0&0&0&m_{0}\\ 0&0&m_{0}&m_{1}\\ \adots&\adots\\ m_{0}&m_{1}&\dots&m_{d-1}\end{bmatrix}\begin{bmatrix}c_{1}\\ c_{2}\\ \vdots\\ c_{d}\end{bmatrix}=\begin{bmatrix}b_{d-1}\\ b_{d-2}\\ \vdots\\ b_{0}\end{bmatrix}. (A.\star)

From this point on, the equations become homogeneous:

[m0m1mdm1m2md+1\adots\adotsmd1mdm2d1][c0c1cd]=[000].\begin{bmatrix}m_{0}&m_{1}&\dots&m_{d}\\ m_{1}&m_{2}&\dots&m_{d+1}\\ \adots&\adots\\ m_{d-1}&m_{d}&\dots&m_{2d-1}\end{bmatrix}\begin{bmatrix}c_{0}\\ c_{1}\\ \vdots\\ c_{d}\end{bmatrix}=\begin{bmatrix}0\\ 0\\ \vdots\\ 0\end{bmatrix}. (A.\star\star)

The homogeneous system (A.\star\star) has the Hankel-type d×(d+1)d\times\left(d+1\right) matrix M~d=(mi+j)\tilde{M}_{d}=\left(m_{i+j}\right) with 0id10\leqslant i\leqslant d-1 and 0jd0\leqslant j\leqslant d. This system has dd equations and d+1d+1 unknowns c0,,cdc_{0},\dots,c_{d}. Consequently, it always has a nonzero solution c0,,cdc_{0},\dots,c_{d}. Now substituting these coefficients c0,,cdc_{0},\dots,c_{d} of QQ into the equations (A.\star) we find the coefficients b0,,bd1b_{0},\dots,b_{d-1} of the polynomial PP, satisfying (A.\star). Notice that if cj=0c_{j}=0 for j+1j\geqslant\ell+1 then it follows from the structure of the equations (A.\star) that bj=0b_{j}=0 for jj\geq\ell. Hence these P,QP,Q provide a solution of (A.2), satisfying degP<degQd,\deg P<\deg Q\leq d, and hence belonging to Sd.{\mathcal{}S}_{d}.
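The construction in the proof of Proposition A.1 is easy to carry out in exact arithmetic. The following SymPy sketch (ours; the helper name is hypothetical, and exact rational moments are assumed) builds the d×(d+1) Hankel matrix, takes a null vector for the coefficients c_0,…,c_d of Q as in (A.⋆⋆), and reads off the coefficients b_0,…,b_{d-1} of P from (A.⋆), written out as b_s = Σ_{j=s+1}^{d} c_j m_{j-s-1}.

```python
# A SymPy sketch (ours) of the proof of Proposition A.1: build Q from a null vector of the
# Hankel system (A.**) and P from the triangular relations (A.*), so that Q*S - P = O(z^(-d-1)).
import sympy as sp

z = sp.symbols('z')

def pade_from_moments(m):
    """m: exact moments m_0,...,m_{2d-1}; returns (P, Q) with deg P < deg Q <= d solving (A.2)."""
    d = len(m) // 2
    m = [sp.sympify(v) for v in m]
    Mtilde = sp.Matrix(d, d + 1, lambda i, j: m[i + j])   # the d x (d+1) Hankel matrix
    c = Mtilde.nullspace()[0]                             # coefficients c_0,...,c_d of Q
    Q = sp.expand(sum(c[j] * z**j for j in range(d + 1)))
    b = [sum(c[j] * m[j - s - 1] for j in range(s + 1, d + 1)) for s in range(d)]   # (A.*)
    P = sp.expand(sum(b[s] * z**s for s in range(d)))
    return P, Q

# Example (d = 2): the moments of R(z) = 1/(z-1)^2 are m_k = k
moments = [0, 1, 2, 3]
P, Q = pade_from_moments(moments)
S = sum(moments[k] / z**(k + 1) for k in range(len(moments)))
print(P, Q)                   # 1 and z**2 - 2*z + 1
print(sp.cancel(Q * S - P))   # a rational function of order z**(-d-1) = z**(-3), as in (A.2)
```

When the rank of the Hankel matrix drops below d, the same computation goes through, but deg Q < d and one only obtains the weaker estimate of Proposition A.2; this is precisely the situation analysed in the proof of Theorem 3.5 below.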

However, in general (A.2) does not imply (A.1). This implication is guaranteed only when \deg Q=d. The following proposition describes the possible “loss of accuracy” as we return from (A.2) to (A.1) when \deg Q<d:

Proposition A.2.

Let (A.2) be satisfied with the highest nonzero coefficient of QQ being c,dc_{\ell},\ \ell\leq d. Then

S(z)P(z)Q(z)=d1zd++1+d2zd++2+.S(z)-\frac{{P(z)}}{{Q(z)}}=\frac{d_{1}}{z^{{d+\ell+1}}}+\frac{d_{2}}{z^{{d+\ell+2}}}+\dots. (A.4)
Proof.

We notice that if the leading nonzero coefficient of QQ is cc_{\ell} then we have

1Q=1z(1c+c1z+)=1z(f0+f11z+).\frac{1}{Q}=\frac{1}{{z^{\ell}}}(\frac{1}{{c_{\ell}+\frac{c_{\ell-1}}{z}+\dots}})=\frac{1}{{z^{\ell}}}(f_{0}+f_{1}\frac{1}{z}+\dots).

So multiplying (A.2) by 1Q\frac{1}{Q} we get (A.4). ∎

Proof of Theorem 3.5.

Assume that the rank of M~d\tilde{M}_{d} is rd,r\leq d, and that |Mr|0.|M_{r}|\neq 0. Let us find a polynomial Q(z)Q(z) of degree rr of the form Q(z)=zr+j=0r1cjzj,Q(z)=z^{r}+\sum_{j=0}^{r-1}c_{j}z^{j}, whose coefficients satisfy system (A.\star\star). Put 𝐜r=(c0,,cr1,1)T\mathbf{c}_{r}=(c_{0},\dots,c_{r-1},1)^{T} and consider a linear system M~r𝐜r=0\tilde{M}_{r}\mathbf{c}_{r}=0. Since by assumptions |Mr|0,|M_{r}|\neq 0, this system has a unique solution. Extend this solution by zeroes, i.e. put 𝐜d=(c0,,cr1,1,0,,0)T.\mathbf{c}_{d}=(c_{0},\dots,c_{r-1},1,0,\dots,0)^{T}. We want 𝐜d\mathbf{c}_{d} to satisfy (A.\star\star), which is M~d𝐜d=0\tilde{M}_{d}\mathbf{c}_{d}=0. This fact is immediate for the first rr rows of M~d\tilde{M}_{d}. But since the rank of M~d\tilde{M}_{d} is rr by the assumption, its other rows are linear combinations of the first rr ones. Hence 𝐜d\mathbf{c}_{d} satisfies (A.\star\star).

Now the equations (A.\star) produce a polynomial P(z)P(z) of degree at most r1r-1. So we get a rational function R(z)=P(z)Q(z)SrSdR(z)=\frac{{P(z)}}{{Q(z)}}\in{\mathcal{}S}_{r}\subseteq{\mathcal{}S}_{d} which solves the Padé problem (A.2), with degQ(z)=r\deg Q(z)=r. Write R(z)=k=0αk(1z)k+1R(z)=\sum_{k=0}^{\infty}\alpha_{k}(\frac{1}{z})^{k+1}. By Proposition A.2 we have mk=αkm_{k}=\alpha_{k} till k=d+r1k=d+r-1.

Now, the Taylor coefficients \alpha_{k} of R(z) satisfy the linear recurrence relation

\alpha_{k}=-\sum_{s=1}^{r}c_{r-s}\alpha_{k-s},\qquad k=r,r+1,\dots. (A.5)

Considering the rows of the system M~d𝐜d=0\tilde{M}_{d}\mathbf{c}_{d}=0 we see that mkm_{k} satisfy the same recurrence relation (A.5) till k=d+r1k=d+r-1 (we already know that mk=αkm_{k}=\alpha_{k} till k=d+r1k=d+r-1). We shall show that in fact mkm_{k} satisfy (A.5) till k=2d1.k=2d-1.

Consider the d\times r matrix \bar{M}_{d} formed by the first r columns of M_{d}, and denote its row vectors by \mathbf{v}_{i}=(m_{i,0},\dots,m_{i,r-1}),\ i=0,\dots,d-1. The vectors \mathbf{v}_{i} satisfy

\mathbf{v}_{i}=-\sum_{s=1}^{r}c_{r-s}\mathbf{v}_{i-s},\quad i=r,\dots,d-1, (A.6)

since their coordinates satisfy (A.5) till k=d+r1k=d+r-1. Now 𝐯0,,𝐯r1\mathbf{v}_{0},\dots,\mathbf{v}_{r-1} are linearly independent, and hence each 𝐯i,i=r,,d1,\mathbf{v}_{i},\ i=r,\dots,d-1, can be expressed as

𝐯i=s=0r1γi,s𝐯s.\mathbf{v}_{i}=\sum_{{s=0}}^{{r-1}}\gamma_{{i,s}}\mathbf{v}_{s}. (A.7)

Denote by \mathbf{\tilde{v}}_{i}=(m_{i,0},\dots,m_{i,d}),\ i=0,\dots,d-1, the row vectors of \tilde{M}_{d}. Since by assumption the rank of \tilde{M}_{d} is r, the vectors \mathbf{\tilde{v}}_{i} can be expressed through the first r of them exactly in the same form as the \mathbf{v}_{i}:

𝐯~i=s=0r1γi,s𝐯~s,i=r,,d1.\mathbf{\tilde{v}}_{i}=\sum_{{s=0}}^{{r-1}}\gamma_{{i,s}}\mathbf{\tilde{v}}_{s},\quad i=r,\dots,d-1. (A.8)

Now the property of a system of vectors to satisfy the linear recurrence relation (A.6) depends only on the coefficients γi,s\gamma_{i,s} in their representation (A.7) or (A.8). Hence from (A.6) we conclude that the full rows 𝐯~i\mathbf{\tilde{v}}_{i} of M~d\tilde{M}_{d} satisfy the same recurrence relation. Coordinate-wise this implies that mkm_{k} satisfy (A.5) till k=2d1,k=2d-1, and hence mk=αkm_{k}=\alpha_{k} till k=2d1.k=2d-1. So R(z)R(z) solves the original Problem 3.1.

In the opposite direction, assume that R(z) solves Problem 3.1, and that the representation R(z)=\frac{P(z)}{Q(z)}\in{\mathcal{}S}_{r}\subset{\mathcal{}S}_{d} is irreducible, i.e. \deg Q=r. Write Q(z)=z^{r}+\sum_{j=0}^{r-1}c_{j}z^{j}. Then m_{k}, being the Taylor coefficients of R(z) till k=2d-1, satisfy the linear recurrence relation (A.5): m_{k}=-\sum_{s=1}^{r}c_{r-s}m_{k-s},\ k=r,r+1,\dots,2d-1. Applying this relation coordinate-wise to the rows of \tilde{M}_{d} we conclude that all the rows can be linearly expressed through the first r ones. So the rank of \tilde{M}_{d} is at most r.

It remains to show that the left upper minor |Mr||M_{r}| is non-zero, and hence the rank of M~d\tilde{M}_{d} is exactly rr.

By Proposition 3.3, if the decomposition of R(z)R\left(z\right) in the standard basis is

R(z)=j=1s=1djaj,1(1)1(1)!(zxj),R\left(z\right)=\sum_{j=1}^{s}\sum_{\ell=1}^{d_{j}}a_{j,\ell-1}\frac{\left(-1\right)^{\ell-1}\left(\ell-1\right)!}{\left(z-x_{j}\right)^{\ell}},

where j=1sdj=r\sum_{j=1}^{s}d_{j}=r and {xj}\left\{x_{j}\right\} are pairwise distinct, then the Taylor coefficients of R(z)R\left(z\right) are given by (1.5). Clearly, we must have aj,dj10a_{j,d_{j}-1}\neq 0 for all j=1,,sj=1,\dots,s, otherwise degQ<r\deg Q<r, a contradiction. Now consider the following well-known representation of MrM_{r} as a product of three matrices (see e.g. [7]):

Mr=V(x1,d1,,xs,ds)×diag{Aj}j=1s×V(x1,d1,,xs,ds)T,M_{r}=V\left(x_{1},d_{1},\dots,x_{s},d_{s}\right)\times\operatorname{diag}\left\{A_{j}\right\}_{j=1}^{s}\times V\left(x_{1},d_{1},\dots,x_{s},d_{s}\right)^{T}, (A.9)

where V()V\left(\dots\right) is the confluent Vandermonde matrix (4.1) and each AjA_{j} is the following dj×djd_{j}\times d_{j} block:

Aj=def[aj,0aj,1aj,dj1aj,1(dj1dj2)aj,dj100(dj12)aj,dj100aj,dj100].A_{j}\stackrel{{\scriptstyle\text{def}}}{{=}}\begin{bmatrix}a_{j,0}&a_{j,1}&\cdots&\cdots&a_{j,d_{j}-1}\\ a_{j,1}&&&{d_{j}-1\choose d_{j}-2}a_{j,d_{j}-1}&0\\ \cdots&&&\cdots&0\\ &{d_{j}-1\choose 2}a_{j,d_{j}-1}&0&\cdots&0\\ a_{j,d_{j}-1}&0&\cdots&\cdots&0\end{bmatrix}.

The formula (A.9) can be checked by direct computation. Since {xj}\left\{x_{j}\right\} are pairwise distinct and aj,dj10a_{j,d_{j}-1}\neq 0 for all j=1,,sj=1,\dots,s, we immediately conclude that |Mr|0\left|M_{r}\right|\neq 0.

This finishes the proof of Theorem 3.5. ∎
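As a quick numerical illustration of Theorem 3.5 (a sketch of ours, not part of the proof), one can start from a rational function R(z) = P(z)/Q(z) ∈ S_r ⊂ S_d with a multiple pole, generate its first 2d Taylor coefficients at infinity, and confirm that the resulting Hankel matrix M̃_d has rank exactly r with |M_r| ≠ 0.

```python
# A SymPy check (ours) of the rank condition of Theorem 3.5 for a rational function with
# a double pole: r = 3, d = 4, R(z) = (2*z**2 - 1)/((z - 1)**2 * (z + 2)).
import sympy as sp

z, w = sp.symbols('z w')
d, r = 4, 3
R = (2*z**2 - 1) / ((z - 1)**2 * (z + 2))

# Taylor coefficients at infinity: R(z) = sum_{k >= 0} m_k z^(-k-1), read off from R(1/w)
expansion = sp.series(R.subs(z, 1/w), w, 0, 2*d + 1).removeO()
m = [expansion.coeff(w, k + 1) for k in range(2*d)]

Mtilde = sp.Matrix(d, d + 1, lambda i, j: m[i + j])     # the Hankel matrix M~_d
Mr = sp.Matrix(r, r, lambda i, j: m[i + j])             # its upper-left r x r minor M_r
print(Mtilde.rank())   # 3, i.e. exactly r
print(Mr.det())        # nonzero, as asserted by Theorem 3.5
```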

References

  • [1] V.I. Arnol’d. Hyperbolic polynomials and Vandermonde mappings. Functional Analysis and Its Applications, 20(2):125–127, 1986.
  • [2] J.R. Auton. Investigation of Procedures for Automatic Resonance Extraction from Noisy Transient Electromagnetics Data. Volume III. Translation of Prony’s Original Paper and Bibliography of Prony’s Method. Technical report, Effects Technology Inc., Santa Barbara, CA, 1981.
  • [3] R. Badeau, B. David, and G. Richard. Performance of ESPRIT for estimating mixtures of complex exponentials modulated by polynomials. Signal Processing, IEEE Transactions on, 56(2):492–504, 2008.
  • [4] G.A. Baker and P. Graves-Morris. Padé approximants. Part 1: Basic theory. Encyclopedia of Mathematics and its Applications, 1981.
  • [5] D. Batenkov. Complete Algebraic Reconstruction of Piecewise-Smooth Functions from Fourier Data. arXiv preprint arXiv:1211.0680.
  • [6] D. Batenkov. On the norm of inverses of confluent Vandermonde matrices. arXiv preprint arXiv:1212.0172.
  • [7] D. Batenkov and Y. Yomdin. On the accuracy of solving confluent Prony systems. To appear in SIAM J.Appl.Math. Arxiv preprint arXiv:1106.1137.
  • [8] E. Candes and C. Fernandez-Granda. Towards a mathematical theory of super-resolution. To appear in Communications on Pure and Applied Mathematics, 2012.
  • [9] C. de Boor. Divided differences. Surv. Approx. Theory, 1:46–69, 2005.
  • [10] D.L. Donoho, M. Elad, and V.N. Temlyakov. Stable recovery of sparse overcomplete representations in the presence of noise. Information Theory, IEEE Transactions on, 52(1):6–18, 2006.
  • [11] B. Gustafsson, C. He, P. Milanfar, and M. Putinar. Reconstructing planar domains from their moments. Inverse Problems, 16(4):1053–1070, 2000.
  • [12] W. Kahan and R.J. Fateman. Symbolic computation of divided differences. ACM SIGSAM Bulletin, 33(2):7–28, 1999.
  • [13] V.P. Kostov. On the geometric properties of Vandermonde’s mapping and on the problem of moments. Proceedings of the Royal Society of Edinburgh: Section A Mathematics, 112(3-4):203–211, 1989.
  • [14] V.P. Kostov. Root arrangements of hyperbolic polynomial-like functions. Revista matemática complutense, 19(1):197–225, 2006.
  • [15] V.P. Kostov. On root arrangements for hyperbolic polynomial-like functions and their derivatives. Bulletin des sciences mathematiques, 131(5):477–492, 2007.
  • [16] L.M. Milne-Thomson. The calculus of finite differences. Macmillan, 1933.
  • [17] T. Peter, D. Potts, and M. Tasche. Nonlinear approximation by sums of exponentials and translates. SIAM Journal on Scientific Computing, 33(4):1920, 2011.
  • [18] D. Potts and M. Tasche. Parameter estimation for exponential sums by approximate Prony method. Signal Processing, 90(5):1631–1642, 2010.
  • [19] R. Prony. Essai experimental et analytique. J. Ec. Polytech.(Paris), 2:24–76, 1795.
  • [20] P. Stoica and A. Nehorai. MUSIC, maximum likelihood, and Cramer-Rao bound. IEEE Transactions on Acoustics, Speech and Signal Processing, 37(5):720–741, 1989.
  • [21] Y. Yomdin. Singularities in Algebraic Data Acquisition. In M. Manoel, M.C.R. Fuster, and C.T.C. Wall, editors, Real and Complex Singularities. Cambridge University Press, 2010.