
Stationary measures of continuous time Markov chains with applications to stochastic reaction networks

Mads Christian Hansen 1, Carsten Wiuf 1 and Chuang Xu 2
wiuf@math.ku.dk (Corresponding author)
1 Department of Mathematical Sciences, University of Copenhagen, 2100 Copenhagen, Denmark.
2 Department of Mathematics, University of Hawai'i at Mānoa, Honolulu, 96822, HI, US.
(Date: 25th September 2025)
Abstract.

We study continuous-time Markov chains on the non-negative integers under mild regularity conditions (in particular, the set of jump vectors is finite and both forward and backward jumps are possible). Based on the so-called flux balance equation, we derive an iterative formula for calculating stationary measures. Specifically, a stationary measure $\pi(x)$ evaluated at $x\in\mathbb{N}_0$ is represented as a linear combination of a few generating terms, similarly to the characterization of a stationary measure of a birth-death process, where there is only one generating term, $\pi(0)$. The coefficients of the linear combination are recursively determined in terms of the transition rates of the Markov chain. For the class of Markov chains we consider, there is always at least one stationary measure (up to a scaling constant). We give various results pertaining to uniqueness and non-uniqueness of stationary measures, and show that the dimension of the linear space of signed invariant measures is at most the number of generating terms. A minimization problem is constructed in order to compute stationary measures numerically. Moreover, a heuristic linear approximation scheme is suggested for the same purpose by first approximating the generating terms. The correctness of the linear approximation scheme is justified in some special cases. Furthermore, a decomposition of the state space into different types of states (open and closed irreducible classes, and trapping, escaping and neutral states) is presented. Applications to stochastic reaction networks are illustrated throughout.

Key words and phrases:
Recurrence, explosivity, stationary distribution, signed invariant measure

1. Introduction

Stochastic models of interacting particle systems are often formulated in terms of continuous-time Markov chains (CTMC) on a discrete state space. These models find application in population genetics, epidemiology and biochemistry, where the particles are known as species. A natural and accessible framework for representing interactions between species is a stochastic reaction network, where the underlying graph captures the possible jumps between states and the interactions between species. In the case where the reaction network consists of a single species, it is referred to as a one-species reaction network. Such networks frequently arise in various applications, ranging from SIS models in epidemiology [26] to bursty chemical processes (for example, in gene regulation) [27] and the Schlögl model [11]. One often focuses on examining the long-term dynamic behavior of the system, which can be captured by its corresponding limiting or stationary distribution, provided it exists. Therefore, characterizing the structure of such distributions is of great interest.

Stochastic models of reaction networks, in general, are highly non-linear, posing challenges for analytical approaches. Indeed, the characterization of stationary distributions remains largely incomplete [28], except for specific cases such as mono-molecular (linear) reaction networks [16], complex balanced reaction networks [2], and when the associated stochastic process is a birth-death process [4]. To obtain statistical information, it is common to resort to stochastic simulation algorithms [12, 9], running the Markov chain numerous times. However, this approach can be computationally intensive, rendering the analysis of complex reaction networks infeasible [21]. Furthermore, it fails to provide analytical insights into the system.

We investigate one-species reaction networks on the non-negative integers $\mathbb{N}_0$, and present an analytic characterization of stationary measures for general continuous-time Markov chains, subject to mild and natural regularity conditions (in particular, the set of jump vectors is finite and both forward and backward jumps are possible), see Proposition 4.3. Furthermore, we provide a decomposition of the state space into different types of states: neutral, trapping, and escaping states, and positive and quasi-irreducible components (Proposition 3.1). Based on this characterization, we provide an iterative formula to calculate $\pi(x)$, $x\in\mathbb{N}_0$, for any stationary measure $\pi$, not limited to stationary distributions, in terms of a few generating terms (Theorem 4.4); similarly to the characterization of the stationary distribution/measure of a birth-death process with one generating term $\pi(0)$. The iterative formula is derived from the general flux balance equation [17].

Moreover, we show that the linear subspace of signed invariant measures has dimension at most the number of generating terms and that each signed invariant measure is given by the iterative formula and a vector of generating terms (Theorem 6.3). We use [15] to argue that there always exists a stationary measure, and give conditions for uniqueness and non-uniqueness (Corollary 5.5, Corollary 5.6, Theorem 6.6, Lemma 6.7). Furthermore, we demonstrate by example that there might be two or more linearly independent stationary measures (Example 9.5). As a birth-death process has a single generating term, there cannot be a signed invariant measure taking values of both signs.

Finally, we demonstrate how the iterative formula can be used to approximate a stationary measure. Two methods are discussed: convex optimization (Theorem 7.1) and a heuristic linear approximation scheme (Lemma 7.2). We establish conditions under which the linear approximation scheme is correct, and provide simulation-based illustrations to support the findings. Furthermore, we observe that even when the aforementioned conditions are not met, the linear approximation scheme still produces satisfactory results. In particular, it allows us to recover stationary measures in certain cases. This approach offers an alternative to truncation approaches [14, 20] and forward-in-time simulation techniques such as Gillespie’s algorithm [12].

2. Preliminaries

2.1. Notation

Let $\mathbb{R}_{\geq 0}$, $\mathbb{R}_{>0}$, $\mathbb{R}$ be the non-negative real numbers, the positive real numbers, and the real numbers, respectively, $\mathbb{Z}$ the integers, $\mathbb{N}$ the natural numbers and $\mathbb{N}_0=\mathbb{N}\cup\{0\}$ the non-negative integers. For $m,n\in\mathbb{N}_0$, let $\mathbb{R}^{m\times n}$ denote the set of $m\times n$ matrices over $\mathbb{R}$. The sign of $x\in\mathbb{R}$ is defined as $\mathrm{sgn}(x)=1$ if $x>0$, $\mathrm{sgn}(x)=0$ if $x=0$, and $\mathrm{sgn}(x)=-1$ if $x<0$. We use $\lceil\cdot\rceil$ to denote the ceiling function, $\lfloor\cdot\rfloor$ to denote the floor function, and $\|\cdot\|_p$ to denote the $p$-norm. Furthermore, let $\mathbbm{1}_B\colon D\to\{0,1\}$ denote the indicator function of a set $B\subseteq D$, where $D$ is the domain.

2.2. Markov Chains

We define a class of CTMCs on $\mathbb{N}_0$ in terms of a finite set of jump vectors and non-negative transition functions. The setting can be extended to CTMCs on $\mathbb{N}^d_0$ for $d>1$ and to infinite sets of jump vectors. Let $\Omega\subseteq\mathbb{Z}\setminus\{0\}$ be a finite set and $\mathcal{F}=\{\lambda_\omega\colon\omega\in\Omega\}$ a set of non-negative transition functions on $\mathbb{N}_0$,

\lambda_{\omega}\colon\mathbb{N}_{0}\to\mathbb{R}_{\geq 0},\quad\omega\in\Omega.

The notation is standard in reaction network literature [3], where we find our primary source of applications. For convenience, we let

(2.1) \lambda_{\omega}(k)=0\quad\text{for}\quad k<0,\quad\text{and}\quad\lambda_{\omega}\equiv 0\quad\text{for}\quad\omega\not\in\Omega.

The transition functions define a $Q$-matrix $Q=(q_{x,y})_{x,y\in\mathbb{N}_0}$ with $q_{x,y}=\lambda_{y-x}(x)$, $x,y\in\mathbb{N}_0$, and subsequently, a class of CTMCs $(Y_t)_{t\geq 0}$ on $\mathbb{N}_0$ by assigning an initial state $Y_0\in\mathbb{N}_0$. Since $\Omega$ is finite, there are at most finitely many jumps from any $x\in\mathbb{N}_0$. For convenience, we identify the class of CTMCs with $(\Omega,\mathcal{F})$.
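For intuition, the construction of the $Q$-matrix from a jump set and transition functions can be sketched on a truncated state space. This is a minimal illustration, not from the paper; the jump set $\Omega=\{+1,-2\}$ and the rates below are made up for the example.

```python
# Minimal sketch (illustrative rates, not from the paper): build the
# Q-matrix q_{x,y} = lambda_{y-x}(x) on a truncated state space
# {0, ..., n_max}, here with jump set Omega = {+1, -2}.
def q_matrix(omegas, lams, n_max):
    Q = [[0.0] * (n_max + 1) for _ in range(n_max + 1)]
    for x in range(n_max + 1):
        for w, lam in zip(omegas, lams):
            y = x + w
            if 0 <= y <= n_max:
                Q[x][y] = lam(x)          # q_{x,y} = lambda_{y-x}(x)
        Q[x][x] = -sum(Q[x])              # diagonal: minus the total exit rate
    return Q

Q = q_matrix([1, -2], [lambda x: 2.0, lambda x: 0.5 * x * (x - 1)], 5)
```

Truncation only affects states near the boundary $n_{\max}$; each row still sums to zero by construction.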

A state $y\in\mathbb{N}_0$ is reachable from $x\in\mathbb{N}_0$ if there exists a sequence of states $x^{(0)},\ldots,x^{(m)}$ such that $x=x^{(0)}$, $y=x^{(m)}$ and $\lambda_{\omega^{(i)}}(x^{(i)})>0$ with $\omega^{(i)}=x^{(i+1)}-x^{(i)}\in\Omega$, $i=0,\ldots,m-1$. It is accessible if it is reachable from some other state. The state is absorbing if no other states can be reached from it. A set $A\subseteq\mathbb{N}_0$ is a communicating class of $(\Omega,\mathcal{F})$ if any two states in $A$ can be reached from one another, and the set cannot be extended while preserving this property. A state $x\in\mathbb{N}_0$ is a neutral state of $(\Omega,\mathcal{F})$ if it is an absorbing state not accessible from any other state, a trapping state of $(\Omega,\mathcal{F})$ if it is an absorbing state accessible from some other state, and an escaping state of $(\Omega,\mathcal{F})$ if it forms its own communicating class and some other state is accessible from it. A set $A\subseteq\mathbb{N}_0$ is a positive irreducible component (PIC) of $(\Omega,\mathcal{F})$ if it is a non-singleton closed communicating class, and a quasi-irreducible component (QIC) of $(\Omega,\mathcal{F})$ if it is a non-singleton open communicating class.

Let ${\sf N}$, ${\sf T}$, ${\sf E}$, ${\sf P}$, and ${\sf Q}$ be the (possibly empty) sets of all neutral states, trapping states, escaping states, PICs and QICs of $(\Omega,\mathcal{F})$, respectively. Each state has a unique type, hence ${\sf N}$, ${\sf T}$, ${\sf E}$, ${\sf P}$, and ${\sf Q}$ form a decomposition of the state space into disjoint sets.

A non-zero measure $\pi$ on a closed irreducible component $A\subseteq\mathbb{N}_0$ of $(\Omega,\mathcal{F})$ is a stationary measure of $(\Omega,\mathcal{F})$ if $\pi$ is invariant for the $Q$-matrix, that is, if $\pi$ is a non-negative equilibrium of the master equation [13]:

(2.2) 0=\sum_{\omega\in\Omega}\lambda_{\omega}(x-\omega)\pi(x-\omega)-\sum_{\omega\in\Omega}\lambda_{\omega}(x)\pi(x),\quad x\in A,

and a stationary distribution if it is a stationary measure and $\sum_{x\in A}\pi(x)=1$. Furthermore, we say $\pi$ is unique if it is unique up to a scaling constant. We might leave out 'on $A$' and simply say $\pi$ is a stationary measure of $(\Omega,\mathcal{F})$ when the context is clear.
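As a sanity check on (2.2), consider the birth-death chain $0\ce{<=>[][]}\text{S}$ with birth rate $\kappa_1$ and death rate $\kappa_2 x$, whose stationary distribution is Poisson$(\kappa_1/\kappa_2)$. A short sketch (the rate values are illustrative):

```python
import math

# Sketch (illustrative rate values): verify the master equation (2.2) for
# the birth-death chain 0 <-> S with birth rate k1 and death rate k2*x,
# whose stationary distribution is Poisson(k1/k2).
k1, k2 = 3.0, 2.0
rho = k1 / k2
pi = lambda x: math.exp(-rho) * rho**x / math.factorial(x)
birth = lambda x: k1                      # lambda_{+1}
death = lambda x: k2 * x                  # lambda_{-1}

for x in range(1, 20):
    inflow = birth(x - 1) * pi(x - 1) + death(x + 1) * pi(x + 1)
    outflow = (birth(x) + death(x)) * pi(x)
    assert abs(inflow - outflow) < 1e-12
```

Here (2.2) reduces to the familiar two-term birth-death balance, with the single generating term $\pi(0)$ fixed by normalization.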

2.3. Stochastic reaction networks (SRNs)

For clarity, we only introduce one-species SRNs and not SRNs in generality [3], as one-species SRNs are our primary application area. A one-species SRN is a finite labelled digraph $(\mathcal{C},\mathcal{R})$ with an associated CTMC on $\mathbb{N}_0$. The elements of $\mathcal{R}$ are reactions, denoted by $y\ce{->[\eta]}y'$, where $y,y'\in\mathcal{C}$ are complexes and the label $\eta$ is a non-negative intensity function on $\mathbb{N}_0$. Each complex is an integer multiple of the species $\text{S}$, that is, $n\text{S}$ for some $n$. In examples, we simply draw the reactions as in the following example with $\mathcal{C}=\{0,2\text{S},4\text{S},6\text{S}\}$ and four reactions:

(2.3) 4\text{S}\ce{->[\eta_{1}]}2\text{S}\ce{<=>[\eta_{2}][\eta_{3}]}0\ce{->[\eta_{4}]}6\text{S}.

For convenience, we represent the complexes as elements of $\mathbb{N}_0$ via the natural embedding, $n\text{S}\mapsto n$, and number the reactions as in the example above. The associated stochastic process $(X_t)_{t\geq 0}$ can be given as

X_{t}=X_{0}+\sum_{k=1}^{\#\mathcal{R}}(y'_{k}-y_{k})Y_{k}\left(\int_{0}^{t}\eta_{k}(X_{s})\,ds\right),

where $Y_k$ are independent unit-rate Poisson processes and $\eta_k\colon\mathbb{N}_0\to[0,\infty)$ are intensity functions [3, 10, 25]. By varying the initial species count $X_0$, a whole family of Markov chains is associated with the SRN.

A common choice of intensity functions is that of stochastic mass-action kinetics [3],

\eta_{k}(x)=\kappa_{k}\frac{x!}{(x-y_{k})!},\qquad x\in\mathbb{N}_{0},

where $\kappa_k>0$ is the reaction rate constant of the $k$th reaction, and the combinatorial factor is the number of ways $y_k$ molecules can be chosen out of $x$ molecules (with order). This intensity function satisfies the stoichiometric admissibility condition:

\eta_{k}(x)>0\quad\Leftrightarrow\quad x\geq y_{k}

($\geq$ is taken component-wise). Thus, a reaction can only fire when the copy-numbers of the species in the current state are no fewer than those of its source complex.
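The factorial-ratio form above translates directly into code; a minimal helper (the name `intensity` is ours, not the paper's):

```python
import math

# Sketch of the stochastic mass-action intensity eta_k(x) = kappa_k * x!/(x-y_k)!,
# with the stoichiometric admissibility convention eta_k(x) = 0 for x < y_k.
def intensity(kappa, y, x):
    if x < y:
        return 0.0
    return kappa * math.factorial(x) / math.factorial(x - y)
```

For example, a bimolecular source complex $2\text{S}$ at $x=4$ gives $\kappa\cdot 4\cdot 3$ ordered choices of two molecules.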

We will assume mass-action kinetics in many examples below and label the reactions with their corresponding reaction rate constants, rather than the intensity functions. To bring SRNs into the setting of the previous section, we define

\Omega=\{y'_{k}-y_{k}\,|\,y_{k}\to y'_{k}\in\mathcal{R}\},
\lambda_{\omega}(x)=\sum_{k=1}^{\#\mathcal{R}}\mathbbm{1}_{\{\omega\}}(y'_{k}-y_{k})\,\eta_{k}(x),\quad x\in\mathbb{N}_{0},\quad\omega\in\Omega.
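The aggregation of reactions with the same jump size into one transition function can be sketched as follows; the reaction list format `(y, y', eta)` and the helper name are ours, and the example reactions are those of Example 3.2(A) below with unit rate constants:

```python
# Sketch: aggregate a reaction list [(y, y', eta)] into the jump set Omega
# and the transition functions lambda_omega, following the display above.
def to_omega_f(reactions):
    omega = sorted({yp - y for (y, yp, _) in reactions})
    lam = {w: (lambda x, w=w: sum(eta(x) for (y, yp, eta) in reactions
                                  if yp - y == w))
           for w in omega}
    return omega, lam

rxns = [(1, 0, lambda x: x),                      # S -> 0
        (2, 3, lambda x: x * (x - 1)),            # 2S -> 3S
        (3, 2, lambda x: x * (x - 1) * (x - 2))]  # 3S -> 2S
omega, lam = to_omega_f(rxns)
```

Note that the two reactions with jump size $-1$ contribute to the same $\lambda_{-1}$.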

3. A classification result

In this section, we classify the state space $\mathbb{N}_0$ into different types of states. In particular, we are interested in characterising the PICs in connection with studying stationary measures. Our primary goal is to understand the class of one-species SRNs, which we study by introducing a larger class of Markov chains embracing SRNs.

We assume throughout that a class of CTMCs associated with $(\Omega,\mathcal{F})$ is given. Define

\Omega_{+}:=\{\omega\in\Omega\colon\mathrm{sgn}(\omega)=1\},\quad\Omega_{-}:=\{\omega\in\Omega\colon\mathrm{sgn}(\omega)=-1\}

as the sets of positive and negative jumps, respectively. To avoid trivialities and for regularity, we make the following assumptions.

Assumptions.

  • (A1)

    $\Omega$ is finite, $\Omega_-\neq\varnothing$ and $\Omega_+\neq\varnothing$.

  • (A2)

    For $\omega\in\Omega$, there exists ${\sf i}_\omega\in\mathbb{N}_0$ such that $\lambda_\omega(x)>0$ if and only if $x\geq{\sf i}_\omega$.

Then, ${\sf i}_\omega$ is the smallest state for which a jump of size $\omega$ can occur (${\sf i}$ is for input). If either of $\Omega_-$ and $\Omega_+$ is empty, then the chain is either a pure death or a pure birth process. Assumptions (A1) and (A2) are fulfilled for stochastic mass-action kinetics.

For the classification, we need some further quantities. Let ${\sf o}_\omega:={\sf i}_\omega+\omega$ be the smallest possible 'output' state after applying a jump of size $\omega$, and let

{\sf i}:=\min_{\omega\in\Omega}{\sf i}_{\omega},\quad{\sf i}_{+}:=\min_{\omega\in\Omega_{+}}{\sf i}_{\omega},\quad{\sf o}:=\min_{\omega\in\Omega}{\sf o}_{\omega},\quad{\sf o}_{-}:=\min_{\omega\in\Omega_{-}}{\sf o}_{\omega}.

Any state $x<{\sf i}$ is a trapping or neutral state (no jumps can occur from one of these states), and ${\sf i}_+$ is the smallest state for which a forward jump can occur. Similarly, ${\sf o}$ is the smallest state that can be reached from any other state and ${\sf o}_-$, the smallest state that can be reached by a backward jump.

In example (2.3), all jumps are multiples of $2$, that is, the Markov chain is on $2\mathbb{N}_0$ or $2\mathbb{N}_0+1$, depending on the initial state. Generally, the number of infinitely large irreducible classes is $\omega_*=\gcd(\Omega)$, the greatest positive common divisor of the $\omega\in\Omega$ ($\omega_*=2$ in example (2.3)). The following classification result is a consequence of [29, Corollary 3.15].
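The quantities above can be computed mechanically for example (2.3), assuming stochastic mass-action kinetics so that each ${\sf i}_\omega$ is read off the source complexes (reactions $4\text{S}\to 2\text{S}$, $2\text{S}\to 0$, $0\to 2\text{S}$, $0\to 6\text{S}$ give $\Omega=\{-2,2,6\}$):

```python
from math import gcd
from functools import reduce

# Sketch: i_w, o_w = i_w + w, the minima i, i_plus, o, o_minus, and
# omega_* = gcd(Omega) for example (2.3) under mass-action kinetics.
# i_{-2} = 2 since lambda_{-2} = eta_1 + eta_2 is positive iff x >= 2.
i_w = {-2: 2, 2: 0, 6: 0}                    # i_w per jump size w
o_w = {w: iw + w for w, iw in i_w.items()}   # o_w = i_w + w
i = min(i_w.values())
i_plus = min(i_w[w] for w in i_w if w > 0)
o = min(o_w.values())
o_minus = min(o_w[w] for w in o_w if w < 0)
omega_star = reduce(gcd, (abs(w) for w in i_w))
```

Consistently with the text, $\omega_*=2$, so the two infinite irreducible classes are $2\mathbb{N}_0$ and $2\mathbb{N}_0+1$.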

Proposition 3.1.

Assume (A1)-(A2). Then

{\sf N}=\{0,\ldots,\min\{{\sf i},{\sf o}\}-1\},\quad{\sf T}=\{{\sf o},\ldots,{\sf i}-1\},\quad{\sf E}=\{{\sf i},\ldots,\max\{{\sf i}_{+},{\sf o}_{-}\}-1\}.

Furthermore, the following hold:

  (1) If $\#{\sf T}=0$, then ${\sf Q}=\varnothing$, and ${\sf P}_s=\omega_*\mathbb{N}_0+s$, $s={\sf o}_-,\ldots,{\sf o}_-+\omega_*-1$, are the PICs,

  (2) If $\#{\sf T}\geq\omega_*$, then ${\sf P}=\varnothing$, and ${\sf Q}_s=\omega_*\mathbb{N}_0+s$, $s={\sf i}_+,\ldots,{\sf i}_++\omega_*-1$, are the QICs,

  (3) If $0<\#{\sf T}<\omega_*$, then ${\sf P}_s=\omega_*\mathbb{N}_0+s$, $s={\sf i}_+,\ldots,{\sf o}_-+\omega_*-1$, are the PICs, and ${\sf Q}_s=\omega_*\mathbb{N}_0+s$, $s={\sf o}_-+\omega_*,\ldots,{\sf i}_++\omega_*-1$, are the QICs.

In either case, there are $\omega_*$ PICs and QICs in total. When PICs exist, these are indexed by $s=\max\{{\sf i}_+,{\sf o}_-\},\ldots,{\sf o}_-+\omega_*-1$, and ${\sf P}\neq\emptyset$ if and only if ${\sf i}_+<{\sf o}_-+\omega_*$.

Proof.

We apply [29, Corollary 3.15]. To translate the notation of the current paper into that of [29, Corollary 3.15], we put $c=0$, $L_c=\mathbb{N}_0$, $c_*=\max\{{\sf i}_+,{\sf o}_-\}$, $\omega^*=\omega_*$, and $\omega^{**}=\omega_*$. Then, the expressions for ${\sf N}$, ${\sf T}$, and ${\sf E}$ follow naturally. As in [29], define the following sets,

\Sigma^{+}_{0}=\left\{1+v-\left\lfloor\frac{v}{\omega_{*}}\right\rfloor\omega_{*}\colon v\in{\sf T}\right\},
\Sigma^{-}_{0}=\left\{1+v-\left\lfloor\frac{v}{\omega_{*}}\right\rfloor\omega_{*}\colon v\in\{{\sf i},\ldots,{\sf o}+\omega_{*}-1\}\setminus{\sf T}\right\}.

Since for $v\in\mathbb{N}_0$,

1\leq 1+v-\left\lfloor\frac{v}{\omega_{*}}\right\rfloor\omega_{*}\equiv 1+v\,\,\mathrm{mod}\,\,\omega_{*}<1+\omega_{*},

we have $\Sigma_0^+\cap\Sigma_0^-=\emptyset$, $\Sigma_0^+\cup\Sigma_0^-=\{1,\ldots,\omega_*\}$, and $\#\Sigma_0^+=\min\{\#{\sf T},\omega_*\}$. If ${\sf o}<{\sf i}$, these conclusions follow easily; if ${\sf o}\geq{\sf i}$, then $\Sigma_0^+=\emptyset$, and the conclusions follow.

According to [29, Corollary 3.15], it follows that

(3.1) \omega_{*}\left(\mathbb{N}_{0}+\left\lceil\frac{c_{*}-(v-1)}{\omega_{*}}\right\rceil\right)+v-1=\begin{cases}{\sf P}^{(v)}_{0},&v\in\Sigma^{-}_{0},\\ {\sf Q}^{(v)}_{0},&v\in\Sigma^{+}_{0},\end{cases}

are the disjoint PICs and the disjoint QICs of $(\Omega,\mathcal{F})$, respectively, in the terminology of [29]. Consequently, as $\#({\sf N}\cup{\sf T}\cup{\sf E})=c_*$,

\bigcup_{v\in\Sigma^{-}_{0}\cup\Sigma^{+}_{0}}\left({\sf P}^{(v)}_{0}\cup{\sf Q}^{(v)}_{0}\right)=\mathbb{N}_{0}\setminus({\sf N}\cup{\sf T}\cup{\sf E})=\mathbb{N}_{0}+c_{*}.

Since, for vv\in\mathbb{Z},

c_{*}\leq\omega_{*}\left\lceil\frac{c_{*}-(v-1)}{\omega_{*}}\right\rceil+v-1<\omega_{*}+c_{*},

we may restate (3.1) as

{\sf P}^{(v)}_{0}=(\omega_{*}\mathbb{Z}+v-1)\cap(\mathbb{N}_{0}+c_{*}),\quad\text{for}\quad v\in\Sigma^{-}_{0},
{\sf Q}^{(v)}_{0}=(\omega_{*}\mathbb{Z}+v-1)\cap(\mathbb{N}_{0}+c_{*}),\quad\text{for}\quad v\in\Sigma^{+}_{0}.

We show that the expressions given for ${\sf P}^{(v)}_0,{\sf Q}^{(v)}_0$ correspond to those given for ${\sf P}_s,{\sf Q}_s$ in the three cases, by suitable renaming of the indices. First, note that ${\sf T}=\varnothing$ if and only if ${\sf o}\geq{\sf i}$. From ${\sf o}\leq{\sf o}_-<{\sf i}_-:=\min_{\omega\in\Omega_-}{\sf i}_\omega$ and ${\sf o}\geq{\sf i}$, we have ${\sf i}={\sf i}_+\leq{\sf o}_-$, which yields $c_*={\sf o}_-$. Consequently, $\Sigma_0^+=\emptyset$ and ${\sf Q}=\varnothing$. This proves the expression for ${\sf P}_s$ in (1).

Otherwise, if ${\sf o}<{\sf i}$, then ${\sf o}<{\sf i}\leq{\sf i}_+<{\sf o}_+:=\min_{\omega\in\Omega_+}{\sf o}_\omega$, which implies that ${\sf o}={\sf o}_-<{\sf i}_+$. Hence, $c_*={\sf i}_+$. If $\#{\sf T}\geq\omega_*$, then ${\sf P}=\varnothing$, which proves the expression for ${\sf Q}_s$ in (2). It remains to prove the last case. If $0<\#{\sf T}<\omega_*$, then

{\sf P}^{(v+1)}_{0}=(\omega_{*}\mathbb{Z}+v)\cap(\mathbb{N}_{0}+c_{*}),\quad\text{for}\quad v={\sf i},\ldots,{\sf o}+\omega_{*}-1.

If ${\sf i}={\sf i}_+$, then using the above equation and ${\sf o}={\sf o}_-$, the expression for ${\sf P}_s$ in (3) follows directly, and the remaining irreducible classes must be QICs. Finally, we show that ${\sf i}<{\sf i}_+$ is impossible, which concludes the proof. Assume to the contrary that ${\sf i}<{\sf i}_+$. Then, ${\sf i}={\sf i}_-$, ${\sf T}=\{{\sf o}_-,\ldots,{\sf i}_--1\}$, and ${\sf i}_-\in{\sf E}$. This implies one can jump from the state ${\sf i}_-$ (the smallest state from which a backward jump can be made) leftwards to a state $x\leq{\sf i}_--\omega_*<{\sf o}_-$. The latter inequality comes from $0<\#{\sf T}={\sf i}-{\sf o}<\omega_*$ and ${\sf o}={\sf o}_-$. However, this implies $x\in{\sf N}$, which is impossible.

The total number of PICs and QICs follows from $\Sigma_0^+\cup\Sigma_0^-=\{1,\ldots,\omega_*\}$. The indexation follows from $c_*=\max\{{\sf i}_+,{\sf o}_-\}$ in the two cases (1) and (3). Also, the inequality ${\sf i}_+<{\sf o}_-+\omega_*$ follows straightforwardly in these two cases. It remains to check that it is not fulfilled in case (2). In that case, $\#{\sf T}={\sf i}-{\sf o}\geq\omega_*$ by assumption, hence ${\sf i}_+\geq{\sf i}\geq{\sf o}+\omega_*={\sf o}_-+\omega_*$, and the conclusion follows. ∎

In either of the three cases of the proposition, the index $s$ is the smallest element of the corresponding class (PIC or QIC). The role of Assumption (A2) in Proposition 3.1, together with Assumption (A1), is to ensure that the non-singleton irreducible classes are infinitely large. If the assumption fails, there could be non-singleton finite irreducible classes, even when $\Omega_+\neq\emptyset$.

A stationary distribution exists trivially on each element of ${\sf N}\cup{\sf T}$. If the chain starts in ${\sf E}$, it will eventually be trapped in a closed class, either in ${\sf T}$ or ${\sf P}$, unless absorption is not certain, in which case it might remain in ${\sf Q}$. If absorption is certain, it will eventually leave ${\sf Q}$. We give two examples of CTMCs on $\mathbb{N}_0$ showing how the chain might be trapped.

Example 3.2.

(A) The reaction network $\text{S}\ce{->[]}0,\ 2\text{S}\ce{<=>[]}3\text{S}$ has $\Omega=\{1,-1\}$, $\omega_*=1$, ${\sf i}_1=2$, ${\sf i}_{-1}=1$ (note that there are two reactions with jump size $-1$). Hence ${\sf i}=1$, ${\sf i}_+=2$ and ${\sf o}={\sf o}_-=0$. It follows from Proposition 3.1 that ${\sf N}=\varnothing$, ${\sf T}=\{0\}$, ${\sf E}=\{1\}$, ${\sf P}=\varnothing$, and ${\sf Q}=\mathbb{N}_0+2$. There is only one infinite class, which is a QIC. From the escaping state, one can only jump backward to the trapping state. The escaping state is reached from ${\sf Q}$.

(B) The reaction network $\text{S}\ce{->[]}2\text{S},\ 2\text{S}\ce{<=>[]}3\text{S}$ has $\Omega=\{1,-1\}$, $\omega_*=1$, ${\sf i}_1=1$, ${\sf i}_{-1}=3$ (note that there are two reactions with jump size $1$). Hence, ${\sf i}=1$, ${\sf i}_+=1$ and ${\sf o}={\sf o}_-=2$. It follows from Proposition 3.1 that ${\sf N}=\{0\}$, ${\sf T}=\varnothing$, ${\sf E}=\{1\}$, ${\sf P}=\mathbb{N}_0+2$ and ${\sf Q}=\varnothing$. There is only one infinite class, which is a PIC. From the escaping state, one can only jump forward to this PIC.
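The explicit formulas of Proposition 3.1 for ${\sf N}$, ${\sf T}$, ${\sf E}$ are simple integer ranges; a sketch, checked against both networks of Example 3.2 (the helper name `classify` is ours):

```python
# Sketch: N, T, E from Proposition 3.1 as integer ranges,
# N = {0,...,min(i,o)-1}, T = {o,...,i-1}, E = {i,...,max(i_plus,o_minus)-1},
# checked against Example 3.2 (A): i=1, i_plus=2, o=o_minus=0,
# and (B): i=1, i_plus=1, o=o_minus=2.
def classify(i, i_plus, o, o_minus):
    N = list(range(0, min(i, o)))
    T = list(range(o, i))
    E = list(range(i, max(i_plus, o_minus)))
    return N, T, E

NA, TA, EA = classify(1, 2, 0, 0)   # Example 3.2(A)
NB, TB, EB = classify(1, 1, 2, 2)   # Example 3.2(B)
```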

4. Characterization of stationary measures

We present an identity for stationary measures based on the flux balance equation [17], see also [31]. It provides a means to express any term of a stationary measure as a linear combination with real coefficients of the generating terms. The coefficients are determined recursively, provided (A1) and (A2) are fulfilled.

To smooth the presentation we assume without loss of generality that

  • (A3)

    $\omega_*=1$, $s=0$ and ${\sf P}_0=\mathbb{N}_0$.

Hence, we assume $\mathbb{N}_0$ is a PIC and remove the index $s$ from the notation for convenience. For general $(\Omega,\mathcal{F})$ with $\omega_*\geq 1$ and $s\in\{\max\{{\sf i}_+,{\sf o}_-\},\ldots,{\sf o}_-+\omega_*-1\}$, we translate the Markov chain into one fulfilling (A3) for each $s$ by defining $(\Omega_*,\mathcal{F}_s)$ with $\Omega_*=\Omega\omega_*^{-1}$ (element-wise multiplication) and $\mathcal{F}_s=\{\lambda^s_\omega\colon\omega\in\Omega_*\}$, where $\lambda^s_\omega(x)=\lambda_{\omega\omega_*}(\omega_*x+s)$, $x\in\mathbb{N}_0$. Hence, there is no loss in assuming (A3). We assume (A1)-(A3) throughout Section 4 unless otherwise stated.

Let $\pi$ be any stationary measure of $(\Omega,\mathcal{F})$ on $\mathbb{N}_0$. Define $\omega_+,\omega_-$ to be the largest positive and negative jump sizes, respectively,

(4.1) \omega_{+}=\max\,\Omega_{+},\quad\omega_{-}=\min\,\Omega_{-}.

We will show that $\pi(x)$ can be expressed as a linear combination with real coefficients of the generating terms $\pi(L),\ldots,\pi(U)$, where

(4.2) L={\sf i}_{\omega_{-}}+\omega_{-}={\sf o}_{\omega_{-}},\qquad U={\sf i}_{\omega_{-}}-1,

are the lower and upper numbers, respectively. Hence, $\pi(x)$ is expressed as a sum of $U-L+1=-\omega_-$ terms. No backward jump of size $\omega_-$ can occur from any state $x\leq U$, and $L$ is the smallest output state of a jump of size $\omega_-$.

Example 4.1.

(A) Consider the reaction network $0\ce{<=>[][]}2\text{S}$, $5\text{S}\ce{->[]}\text{S}$. In this case, $\omega_*=2$, ${\sf i}_+={\sf o}_-=0$, and (A3) does not apply. In fact, $s=0,1$ with PICs $2\mathbb{N}_0$ and $2\mathbb{N}_0+1$, respectively. After translation, we find $L_0=1$, $U_0=2$, and $L_1=0$, $U_1=1$, where the index refers to $s=0,1$. Thus, the lower and upper numbers are not the same for each class. (B) The reaction network $0\ce{<=>[][]}\text{S}$, $n\text{S}\ce{<=>[]}(n+2)\text{S}$ has $\omega_*=1$, ${\sf i}_+={\sf o}_-=0$. Hence, $s=0$ with PIC $\mathbb{N}_0$, and (A3) applies. We find $L=n$, $U=n+1$. Thus, the lower and upper numbers might be arbitrarily large depending on the SRN.
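For case (B), formula (4.2) can be evaluated mechanically, assuming mass-action kinetics so each ${\sf i}_\omega$ is read off the source complexes: the jumps are $+1$ ($0\to\text{S}$), $-1$ ($\text{S}\to 0$), $+2$ ($n\text{S}\to(n+2)\text{S}$), $-2$ ($(n+2)\text{S}\to n\text{S}$), so $\omega_-=-2$ and ${\sf i}_{\omega_-}=n+2$. The helper name `window` is ours.

```python
# Sketch of (4.2): L = i_{w-} + w- and U = i_{w-} - 1, where w- = min(Omega_-).
def window(i_w):
    w_minus = min(w for w in i_w if w < 0)   # largest backward jump size
    return i_w[w_minus] + w_minus, i_w[w_minus] - 1

n = 7
L, U = window({1: 0, -1: 1, 2: n, -2: n + 2})   # Example 4.1(B)
```

With $n=2$ this gives $(L,U)=(2,3)$, the values used in Example 4.2 below.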

Before presenting the main results, we study an example.

Example 4.2.

Recall Example 4.1(B) with mass-action kinetics, $n=2$, and reactions

0\ce{<=>[\kappa_{1}][\kappa_{2}]}\text{S},\quad 2\text{S}\ce{<=>[\kappa_{3}][\kappa_{4}]}4\text{S}.

According to [30, Theorem 7], this SRN is positive recurrent on $\mathbb{N}_0$ for all $\kappa_1,\ldots,\kappa_4>0$ and hence, it has a unique stationary distribution. We have $L=2$, $U=3$.

By rewriting the master equation (2.2), we obtain

\pi(0)=\frac{\kappa_{2}}{\kappa_{1}}\pi(1),\quad\pi(1)=\frac{2\kappa_{2}}{\kappa_{1}}\pi(2),
\pi(4)=\sum_{i=1}^{3}\frac{\eta_{i}(2)}{\eta_{4}(4)}\pi(2)-\frac{\eta_{1}(1)}{\eta_{4}(4)}\pi(1)-\frac{\eta_{2}(3)}{\eta_{4}(4)}\pi(3),
\pi(\ell)=\sum_{i=1}^{4}\frac{\eta_{i}(\ell-2)}{\eta_{4}(\ell)}\pi(\ell-2)-\frac{\eta_{1}(\ell-3)}{\eta_{4}(\ell)}\pi(\ell-3)-\frac{\eta_{2}(\ell-1)}{\eta_{4}(\ell)}\pi(\ell-1)-\frac{\eta_{3}(\ell-4)}{\eta_{4}(\ell)}\pi(\ell-4),

the latter for $\ell>U+1=4$. There is no equation that expresses $\pi(3)$ in terms of the lower states $\ell<3$, as the state $3$ can only be reached from $2$, $4$ and $5$. Consequently, $\pi(0)$ and $\pi(1)$ can be found from $\pi(2)$ and $\pi(3)$, but not vice versa, and $\pi(\ell)$, $\ell\geq U+1=4$, is determined recursively from $\pi(2)$ and $\pi(3)$ using the last equations above, say, $\pi(\ell)=\gamma_2(\ell)\pi(2)+\gamma_3(\ell)\pi(3)$, where the coefficients $\gamma_2(\ell),\gamma_3(\ell)$ depend on the intensity functions. For $\ell=0,1$, these follow from the first equations (see also Theorem 4.4 below),

\gamma_{2}(0)=\frac{2\kappa_{2}^{2}}{\kappa_{1}^{2}},\quad\gamma_{3}(0)=0,\quad\gamma_{2}(1)=\frac{2\kappa_{2}}{\kappa_{1}},\quad\gamma_{3}(1)=0,

while for $\ell=2,3$, we obviously have $\gamma_j(\ell)=\delta_{j,\ell}$, where $\delta_{j,\ell}$ is the Kronecker symbol. For $\ell>3$, the coefficients are given recursively, see Theorem 4.4 for the general expression. A recursion for $\pi(\ell)$ cannot be given in terms of $\pi(0)$ and $\pi(1)$ alone.
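The example can be checked numerically. The sketch below is not verbatim from the paper: instead of the displayed rewriting of the master equation, it uses the equivalent flux balance identity across each cut $\{0,\ldots,x-1\}\mid\{x,x+1,\ldots\}$ to build a signed invariant measure from arbitrary generating terms $\pi(2),\pi(3)$, and then verifies the master equation (2.2).

```python
# Sketch: signed invariant measure for Example 4.2 via flux balance,
# for 0 <-> S (kappa_1, kappa_2) and 2S <-> 4S (kappa_3, kappa_4),
# mass-action kinetics, unit rate constants.
k1, k2, k3, k4 = 1.0, 1.0, 1.0, 1.0
e1 = lambda x: k1 if x >= 0 else 0.0                              # 0 -> S
e2 = lambda x: k2 * x if x >= 1 else 0.0                          # S -> 0
e3 = lambda x: k3 * x * (x - 1) if x >= 2 else 0.0                # 2S -> 4S
e4 = lambda x: k4 * x * (x-1) * (x-2) * (x-3) if x >= 4 else 0.0  # 4S -> 2S

N = 25
pi = [0.0] * (N + 2)
pi[2], pi[3] = 1.0, 1.0                   # generating terms (arbitrary here)
pi[1] = 2 * k2 / k1 * pi[2]               # pi(1) = (2 kappa_2/kappa_1) pi(2)
pi[0] = k2 / k1 * pi[1]                   # pi(0) = (kappa_2/kappa_1) pi(1)
for x in range(3, N + 1):
    # flux balance across the cut at x: up-flux from x-1, x-2 equals
    # down-flux from x, x+1; solve for pi(x+1) (eta_4(x+1) > 0 for x >= 3).
    up = (e1(x - 1) + e3(x - 1)) * pi[x - 1] + e3(x - 2) * pi[x - 2]
    pi[x + 1] = (up - (e2(x) + e4(x)) * pi[x]) / e4(x + 1)

p = lambda i: pi[i] if 0 <= i < len(pi) else 0.0
for x in range(N - 1):                    # verify the master equation (2.2)
    inflow = (e1(x - 1) * p(x - 1) + e2(x + 1) * p(x + 1)
              + e3(x - 2) * p(x - 2) + e4(x + 2) * p(x + 2))
    outflow = (e1(x) + e2(x) + e3(x) + e4(x)) * p(x)
    assert abs(inflow - outflow) < 1e-9
```

With unit rate constants, $\pi(4)=\bigl((\kappa_1+2\kappa_3)\pi(2)-3\kappa_2\pi(3)\bigr)/\eta_4(4)=0$ for this choice of generating terms.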

First, we focus on the real coefficients of the linear combination. From Example 4.2 we learn that the coefficients take different forms depending on the state, which is reflected in the definition of $\gamma_j(\ell)$ below. For convenience, rows and columns of a matrix are indexed from $0$.

By the definition of $\omega_+$ and $\omega_-$, any state is reachable from at most $\omega_+-\omega_-$ other states in one jump. For this reason, let

(4.3) m_{*}=\omega_{+}-\omega_{-}-1\geq 1

(as $\omega_+\geq 1$, $-\omega_-\geq 1$), and define the functions

\gamma_{j}\colon\mathbb{Z}\to\mathbb{R},\quad L\leq j\leq U,

by

(4.4) \gamma_{j}(\ell)=\begin{cases}0&\text{for}\quad\ell<0,\\ G_{\ell,j-L}&\text{for}\quad\ell=0,\ldots,L-1,\quad j<U,\\ 0&\text{for}\quad\ell=0,\ldots,L-1,\quad j=U,\\ \delta_{\ell,j}&\text{for}\quad\ell=L,\ldots,U,\\ \sum_{k=1}^{m_{*}}\gamma_{j}(\ell-k)f_{k}(\ell)&\text{for}\quad\ell>U,\end{cases}

where the functions $f_k\colon[U+1,\infty)\cap\mathbb{N}_0\to\mathbb{R}$ are defined by

(4.5) f_{k}(\ell)=-\frac{c_{k}(\ell-k)}{c_{0}(\ell)}\qquad\text{for}\qquad\ell>U,\quad k=0,\ldots,m_{*},
(4.6) c_{k}(\ell)=\mathrm{sgn}(\omega_{-}+k+1/2)\sum_{\omega\in B_{k}}\lambda_{\omega}(\ell),\quad\ell\in\mathbb{Z},\quad k=0,\ldots,m_{*},
(4.7) B_{k}=\{\omega\in\Omega\ |\ k'(\omega-k')>0,\,\,\,\text{with}\,\,\,k'=\omega_{-}+k+1/2\}

(note that f0()=1f_{0}(\ell)=-1 is not used in the definition of γj()\gamma_{j}(\ell), but appears in the proof of Proposition 4.3 below), and G=(H1)1H2G=-(H_{1})^{-1}H_{2} is an L×(UL)L\times(U-L) matrix defined from the L×UL\times U matrix H=(H1H2)H=(H_{1}\,\,H_{2}) with entries

(4.8) Hm,n=δm,nλ(mn)(n)ωΩλω(m),m=0,,L1,n=0,,U1,\displaystyle H_{m,n}=\delta_{m,n}-\frac{\lambda_{(m-n)}(n)}{\sum_{\omega\in\Omega}\lambda_{\omega}(m)},\quad m=0,\ldots,L-1,\quad n=0,\ldots,U-1,

where H1H_{1} is L×LL\times L dimensional and H2H_{2} is L×(UL)L\times(U-L) dimensional. Note that HH is the empty matrix if L=0L=0 or U=LU=L. In (4.8), we adopt the convention stated in (2.1). In particular, Hm,m=1H_{m,m}=1 for 0mL10\leq m\leq L-1. By definition, (ILG)(I_{L}\,\,-G) is the row reduced echelon form of HH by Gaussian elimination, where ILI_{L} is the L×LL\times L identity matrix.

The functions ck:c_{k}\colon\mathbb{Z}\to\mathbb{R} and the sets BkB_{k} come out of the flux balance equation, which is an equivalent formulation of the master equation [17]. We use it in (4.10) in the proof of Proposition 4.3. For each x0x\in\mathbb{N}_{0}, it provides an identity between two positive sums, one over at most ω+\omega_{+} states with values <x\ell<x, and one over at most ω+1-\omega_{-}+1 states with values x\ell\geq x. The function fk()f_{k}(\ell) is well defined for >U\ell>U if (A1), (A2) are fulfilled. In that case, c0()<0c_{0}(\ell)<0, see Lemma A.1. The matrix HH is well defined with invertible H1H_{1} under the assumptions (A1), (A2), provided L>0L>0 and U>LU>L, see Lemma A.2.
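As a concrete illustration of how the quantities in (4.4)–(4.7) unwind, the following sketch (our own, not taken from the paper) treats a downwardly skip-free chain with jump set Ω = {−1, 2}, for which L = U = 0 and m⁎ = 2; from (4.6)–(4.7) one computes c₀(ℓ) = −λ₋₁(ℓ) and c₁(ℓ) = c₂(ℓ) = λ₂(ℓ). The rate constants are arbitrary illustrative choices.

```python
# Sketch (our own illustration, not from the paper): the recursion (4.4) for a
# downwardly skip-free chain with jump set Omega = {-1, 2} and rates
# lambda_{-1}(x) = kappa2*x + kappa3*x*(x-1), lambda_2(x) = kappa1.
# Here L = U = 0 and m_* = 2; from (4.6)-(4.7) one computes
# c_0(l) = -lambda_{-1}(l) and c_1(l) = c_2(l) = kappa1, hence
# f_1(l) = f_2(l) = kappa1/lambda_{-1}(l) in (4.5).
# The kappa values below are arbitrary illustrative choices.

kappa1, kappa2, kappa3 = 1.0, 2.0, 0.5

def lam(omega, x):
    """Transition rate lambda_omega(x), extended by zero outside N_0 (cf. (2.1))."""
    if x < 0:
        return 0.0
    if omega == -1:
        return kappa2 * x + kappa3 * x * (x - 1)
    if omega == 2:
        return kappa1
    return 0.0

N = 50
gamma0 = [1.0]                      # gamma_0(0) = 1: the single generating term
for l in range(1, N + 1):
    prev2 = gamma0[l - 2] if l >= 2 else 0.0
    gamma0.append(kappa1 * (gamma0[l - 1] + prev2) / lam(-1, l))

# Independent check that gamma_0 solves the master equation
# sum_omega lambda_omega(x - omega) pi(x - omega) = pi(x) sum_omega lambda_omega(x).
pi = lambda x: gamma0[x] if 0 <= x <= N else 0.0
for x in range(N - 1):
    inflow = lam(-1, x + 1) * pi(x + 1) + lam(2, x - 2) * pi(x - 2)
    outflow = pi(x) * (lam(-1, x) + lam(2, x))
    assert abs(inflow - outflow) <= 1e-9 * max(1.0, outflow)
```

Since there is a single generating term here, normalizing γ₀ (when summable) yields the stationary distribution; this is the same chain as in Example 4.8 below.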

Proposition 4.3 and Theorem 4.4 below do not require uniqueness of the stationary measure.

Proposition 4.3.

A non-zero measure π\pi on 0\mathbb{N}_{0} is a stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0} if and only if

(4.9) k=0mπ(k)ck(k)=0,for>UL.\sum_{k=0}^{m_{*}}\pi(\ell-k)c_{k}(\ell-k)=0,\quad\text{for}\quad\ell>U-L.

Here, π()=0\pi(\ell)=0 for <0\ell<0 for convenience, ckc_{k} is defined as in (4.6), and mm_{*} is defined as in (4.3).

Proof.

We recall an identity related to the flux balance equation [17] for stationary measures for a CTMC on the non-negative integers [31, Theorem 3.3], which is a consequence of the master equation (2.2): A non-negative measure (probability measure) π\pi on 0\mathbb{N}_{0} is a stationary measure (stationary distribution) of (Ω,)(\Omega,\mathcal{F}) if and only if

(4.10) j=ω+10π(xj)ωAjλω(xj)+j=1ω+π(xj)ωAjλω(xj)=0,x,-\sum_{j=\omega_{-}+1}^{0}{\pi(x-j)}\sum_{\omega\in A_{j}}\lambda_{\omega}(x-j)+\sum_{j=1}^{\omega_{+}}\pi(x-j)\sum_{\omega\in A_{j}}\lambda_{\omega}(x-j)=0,\quad x\in\mathbb{Z},

where it is used that #Ω<\#\Omega<\infty, and the domain of π\pi as well as λω\lambda_{\omega} is extended from 0\mathbb{N}_{0} to \mathbb{Z} (that is, π(x)=0\pi(x)=0, λω(x)=0\lambda_{\omega}(x)=0 for x0x\in\mathbb{Z}\setminus\mathbb{N}_{0}). Furthermore, the sets AjA_{j} are defined by

Aj={{ωΩ:j>ω}ifj{ω,,0},{ωΩ:jω}ifj{1,,ω++1}.A_{j}=\begin{cases}\{\omega\in\Omega\colon j>\omega\}\quad\text{if}\ j\in\{\omega_{-},\ldots,0\},\\ \{\omega\in\Omega\colon j\leq\omega\}\quad\text{if}\ j\in\{1,\ldots,\omega_{+}+1\}.\end{cases}

If x0x\leq 0 then all terms in both double sums in (4.10) are zero. In fact, (4.10) for xx is equivalent to the master equation (2.2) for x1x-1.

Assume π\pi is a stationary measure of (Ω,)(\Omega,\mathcal{F}). Let x=+(ω+1)x=\ell+(\omega_{-}+1), \ell\in\mathbb{Z}, and let j=k+(ω+1)j=k+(\omega_{-}+1) with j{ω+1,,ω+}j\in\{\omega_{-}+1,\ldots,\omega_{+}\}. Then, xj=kx-j=\ell-k, and it follows that (4.10) is equivalent to

(4.11) k=0msgn(ω+k+1/2)π(k)ωAk+ω+1λω(k)=0,\sum_{k=0}^{m_{*}}\textrm{sgn}(\omega_{-}+k+1/2)\pi(\ell-k)\sum_{\omega\in A_{k+\omega_{-}+1}}\lambda_{\omega}(\ell-k)=0,

where sgn(ω+k+1/2)=1\textrm{sgn}(\omega_{-}+k+1/2)=1 if ω+k0\omega_{-}+k\geq 0, and 1-1 otherwise. Since every ωΩ\omega\in\Omega is an integer, it holds that

(k+ω+1)>ω(k+ω+1/2)>ω,(k+\omega_{-}+1)>\omega\quad\Leftrightarrow\quad(k+\omega_{-}+1/2)>\omega,
(k+ω+1)ω(k+ω+1/2)<ω.{(k+\omega_{-}+1)\leq\omega}\quad\Leftrightarrow\quad(k+\omega_{-}+1/2)<\omega.

Hence, it also follows from the definition (4.7) of BkB_{k} that

Ak+ω+1=Bk,k=0,,m.A_{k+\omega_{-}+1}=B_{k},\quad k=0,\ldots,m_{*}.

Thus, (4.11) is equivalent to

(4.12) k=0mπ(k)ck(k)=0,0,\sum_{k=0}^{m_{*}}\pi(\ell-k)c_{k}(\ell-k)=0,\qquad\ell\in\mathbb{N}_{0},

using the definition of ck()c_{k}(\ell). Since =x(ω+1)=x+UL\ell=x-(\omega_{-}+1)=x+U-L and (4.10) is trivial for x0x\leq 0, then (4.12) is also trivial for =0,,UL\ell=0,\ldots,U-L (all terms are zero). This proves the bi-implication and the proof is completed. ∎

For 0UL0\leq\ell\leq U-L, all m+1m_{*}+1 terms in (4.9) are zero, hence the identity is a triviality. We express π\pi in terms of the generating terms, π(L),,π(U)\pi(L),\dots,\pi(U).

Theorem 4.4.

A non-zero measure π\pi on 0\mathbb{N}_{0} is a stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0} if and only if

(4.13) π()=j=LUπ(j)γj(),for0,\pi(\ell)=\sum_{j=L}^{U}\pi(j)\gamma_{j}(\ell),\quad\text{for}\quad\ell\in\mathbb{N}_{0},

where γj\gamma_{j} is defined as in (4.4).

Proof.

Assume π\pi is a stationary measure. The proof is by induction. We first prove the induction step. Assume (4.13) holds for 1U\ell-1\geq U and all \ell^{\prime} such that 10\ell-1\geq\ell^{\prime}\geq 0. Then, from Proposition 4.3, (4.4), (4.5), (4.6), and the induction hypothesis, we have

π()=k=1mπ(k)ck(k)c0()=k=1mj=LUπ(j)γj(k)ck(k)c0()=j=LUπ(j)γj().\pi(\ell)=-\sum_{k=1}^{m_{*}}\pi(\ell-k)\frac{c_{k}(\ell-k)}{c_{0}(\ell)}=-\sum_{k=1}^{m_{*}}\sum_{j=L}^{U}\pi(j)\gamma_{j}(\ell-k)\frac{c_{k}(\ell-k)}{c_{0}(\ell)}=\sum_{j=L}^{U}\pi(j)\gamma_{j}(\ell).

We next turn to the induction basis. For =L,,U\ell=L,\ldots,U, (4.13) holds trivially as γj()=δ,j\gamma_{j}(\ell)=\delta_{\ell,j} in this case. It remains to prove (4.13) for =0,,L1\ell=0,\ldots,L-1. Since π\pi is a stationary measure it fulfils the master equation (2.2) for (Ω,)(\Omega,\mathcal{F}). By rewriting this, we obtain

π()=ωΩλω(ω)π(ω)ωΩλω(),0.\pi(\ell)=\frac{\sum_{\omega\in\Omega}\lambda_{\omega}(\ell-\omega)\pi(\ell-{\omega})}{\sum_{\omega\in\Omega}\lambda_{\omega}(\ell)},\qquad\ell\in\mathbb{N}_{0}.

The denominator is never zero because for 0\ell\geq 0, it holds that λω()>0\lambda_{\omega}(\ell)>0 for at least one ωΩ\omega\in\Omega (otherwise \ell would be a trapping state).

In particular, for =0,,L1\ell=0,\ldots,L-1, using (4.2), it holds that ω<𝗂ω\ell-\omega_{-}<{\sf i}_{\omega_{-}}. Hence, λω(ω)=0\lambda_{\omega_{-}}(\ell-\omega_{-})=0, and

π()=ωΩ{ω}λω(ω)π(ω)ωΩλω(),=0,,L1.\pi(\ell)=\frac{\sum_{\omega\in\Omega\setminus\{\omega_{-}\}}\lambda_{\omega}(\ell-\omega)\pi(\ell-\omega)}{\sum_{\omega\in\Omega}\lambda_{\omega}(\ell)},\qquad\ell=0,\ldots,L-1.

Now, defining n=ωn=\ell-\omega, we have n<L1ω=Un<L-1-\omega_{-}=U for ωΩ{ω}\omega\in\Omega\setminus\{\omega_{-}\}, using UL=(ω+1)U-L=-(\omega_{-}+1), see (4.2). Hence

π()=n=0U1λn(n)π(n)ωΩλω(),=0,,L1,\pi(\ell)=\frac{\sum_{n=0}^{U-1}\lambda_{\ell-n}(n)\pi(n)}{\sum_{\omega\in\Omega}\lambda_{\omega}(\ell)},\qquad\ell=0,\ldots,L-1,

with the convention in (2.1). Evoking the definition of HH in (4.8), this equation may be written as

H(π(0)π(U1))=0.H\begin{pmatrix}\pi(0)\\ \vdots\\ \pi(U-1)\end{pmatrix}=0.

Recall that G=(H1)1H2G=-(H_{1})^{-1}H_{2} with H=(H1H2)H=(H_{1}\,\,H_{2}). Noting that γU()=0\gamma_{U}(\ell)=0, =0,,L1\ell=0,\ldots,L-1, yields that (4.13) is fulfilled with γj()=G,jL\gamma_{j}(\ell)=G_{\ell,j-L}, =0,,L1\ell=0,\ldots,L-1 and j=L,,U1j=L,\ldots,U-1, and the proof of the first part is concluded.

For the reverse part, we note that for 0x1L10\leq x-1\leq L-1, the argument above is ‘if and only if’: π(x1)=j=LUπ(j)γj(x1)\pi(x-1)=\sum_{j=L}^{U}\pi(j)\gamma_{j}(x-1) if and only if the master equation (2.2) is fulfilled for x1x-1. As noted in the proof of Proposition 4.3, this is equivalent to (4.10) being fulfilled for xx, which in turn is equivalent to (4.12) being fulfilled for =x+UL\ell=x+U-L (the equation is replicated here),

(4.14) k=0mπ(k)ck(k)=0.\sum_{k=0}^{m_{*}}\pi(\ell-k)c_{k}(\ell-k)=0.

As 0x1L10\leq x-1\leq L-1, then (4.14) holds for =UL+1,,U\ell=U-L+1,\ldots,U.

For >U\ell>U we have, using (4.4) and (4.5),

k=0mπ(k)ck(k)\displaystyle\sum_{k=0}^{m_{*}}\pi(\ell-k)c_{k}(\ell-k) =k=0mj=LUπ(j)γj(k)ck(k)=j=LUπ(j)k=0mγj(k)ck(k)\displaystyle=\sum_{k=0}^{m_{*}}\sum_{j=L}^{U}\pi(j)\gamma_{j}(\ell-k)c_{k}(\ell-k)=\sum_{j=L}^{U}\pi(j)\sum_{k=0}^{m_{*}}\gamma_{j}(\ell-k)c_{k}(\ell-k)
=j=LUπ(j)γj()c0()+j=LUπ(j)k=1mγj(k)ck(k)\displaystyle=\sum_{j=L}^{U}\pi(j)\gamma_{j}(\ell)c_{0}(\ell)+\sum_{j=L}^{U}\pi(j)\sum_{k=1}^{m_{*}}\gamma_{j}(\ell-k)c_{k}(\ell-k)
=0,\displaystyle=0,

hence (4.14) holds for all >UL\ell>U-L. According to Proposition 4.3, π\pi is then a stationary measure of (Ω,)(\Omega,\mathcal{F}). ∎

For completeness, we apply Proposition 4.3 to Example 4.2. The SRN has ω=2\omega_{-}=-2 and ω+=2\omega_{+}=2, such that m=ω+ω1=3m_{*}=\omega_{+}-\omega_{-}-1=3. Equation (4.14) for 1=UL<U=31=U-L<\ell\leq U=3 becomes

κ2π(1)+κ1π(0)=0,2κ2π(2)+κ1π(1)=0,-\kappa_{2}\pi(1)+\kappa_{1}\pi(0)=0,\quad-2\kappa_{2}\pi(2)+\kappa_{1}\pi(1)=0,

in agreement with the equations found in Example 4.2.

4.1. Matrix representation

A simple representation can be obtained in terms of determinants of a certain matrix. For an infinite matrix BB, let B()B(\ell), 1\ell\geq 1, denote the matrix composed of BB’s first \ell rows and columns.

Theorem 4.5.

The function γj:\gamma_{j}\colon\mathbb{Z}\to\mathbb{R}, LjUL\leq j\leq U, has the following determinant representation for >0\ell>0,

γj(U+)=\displaystyle\gamma_{j}(U+\ell)= detBj()\displaystyle\det B_{j}(\ell)

where the matrix BjB_{j} is the infinite band matrix,

Bj=(gj(1)1000gj(2)f1(U+2)100gj(m)fm1(U+m)fm2(U+m)fm3(U+m)10fm(U+m+1)fm1(U+m+1)fm2(U+m+1)f1(U+m+1)00fm(U+m+2)fm1(U+m+2)f2(U+m+2)000fm(U+m+3)f3(U+m+3)),\displaystyle B_{j}=\begin{pmatrix}g_{j}(1)&-1&0&0&\ldots&0&\ldots\\ g_{j}(2)&f_{1}(U+2)&-1&0&\ldots&0&\ldots\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots&\\ g_{j}(m_{*})&f_{m_{*}-1}(U+m_{*})&f_{m_{*}-2}(U+m_{*})&f_{m_{*}-3}(U+m_{*})&\ldots&-1&\ldots\\ 0&f_{m_{*}}(U+m_{*}+1)&f_{m_{*}-1}(U+m_{*}+1)&f_{m_{*}-2}(U+m_{*}+1)&\ldots&f_{1}(U+m_{*}+1)&\ldots\\ 0&0&f_{m_{*}}(U+m_{*}+2)&f_{m_{*}-1}(U+m_{*}+2)&\ldots&f_{2}(U+m_{*}+2)&\ldots\\ 0&0&0&f_{m_{*}}(U+m_{*}+3)&\ldots&f_{3}(U+m_{*}+3)&\ldots\\ \vdots&\vdots&\vdots&\vdots&&\vdots&\ddots\end{pmatrix},

and

gj(k)=i=0mkfmi(U+k)γj(U+i+km),k=1,,m.\displaystyle g_{j}(k)=\sum_{i=0}^{m_{*}-k}f_{m_{*}-i}(U+k)\gamma_{j}(U+i+k-m_{*}),\qquad k=1,\ldots,m_{*}.
Proof.

For the first equality, we apply [19, Theorem 2] on the solution of an mm_{*}-th order linear difference equation with variable coefficients. First, from (4.4), we have for >U\ell>U that

γj()=k=1mγj(k)fk(),that is,k=0mγj(k)fk()=0,\displaystyle\gamma_{j}(\ell)=\sum_{k=1}^{m_{*}}\gamma_{j}(\ell-k)f_{k}(\ell),\qquad\text{that is,}\qquad\sum_{k=0}^{m_{*}}\gamma_{j}(\ell-k)f_{k}(\ell)=0,

as f0()=1f_{0}(\ell)=-1. Thus, for >Um\ell>U-m_{*}, we have

(4.15) k=0mγj(+k)fmk(+m)=0,\displaystyle\sum_{k=0}^{m_{*}}\gamma_{j}(\ell+k)f_{m_{*}-k}(\ell+m_{*})=0,

subject to the initial conditions γj()\gamma_{j}(\ell) for =U+1m,,U\ell=U+1-m_{*},\ldots,U, given in (4.4).

Secondly, we translate the notation used here into the notation of [19, Theorem 2]: n=mn=m_{*}, i=ki=k, k=(Um)k=\ell-(U-m_{*}). Furthermore, we take g(k)=0g(k)=0, q(a,b)=fm(ba)(U+a)q(a,b)=f_{m_{*}-(b-a)}(U+a) and yi+k=γj(i+k+Um)y_{i+k}=\gamma_{j}(i+k+U-m_{*}). Then, the condition i=0mq(k,i+k)yi+k=g(k)\sum_{i=0}^{m}q(k,i+k)y_{i+k}=g(k) of [19, Theorem 2] translates into (4.15). The matrix BjB_{j} is as in [19, Theorem 2], except the first column is multiplied by 1-1. By [19, Theorem 2], we find that

γj(U+)=(1)1i=1f0(U+i)(detBj())=detBj(),for.\displaystyle\gamma_{j}(U+\ell)=\frac{(-1)^{\ell-1}}{\prod_{i=1}^{\ell}f_{0}(U+i)}(-\det B_{j}(\ell))=\det B_{j}(\ell),\quad\text{for}\quad\ell\in\mathbb{N}.

By Lemma A.1, the second equality follows from f0()=1f_{0}(\ell)=-1, >U\ell>U, see (4.5). ∎

A combinatorial representation can be achieved by calculating the determinant. In the present case, the matrix Bj()B_{j}(\ell) is a Hessenberg matrix and an explicit combinatorial expression exists [22, Proposition 2.4]. Combining Theorem 4.4 and Theorem 4.5 yields the following.

Corollary 4.6.

Let π\pi be a stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}. Then,

π()=j=LUπ(j)detBj(U),>U.\displaystyle\pi(\ell)=\sum_{j=L}^{U}\pi(j)\det B_{j}(\ell-U),\qquad\ell>U.
Example 4.7 (Upwardly skip-free process).

Consider the SRN with mass-action kinetics,

0\ce>[κ1]S,S\ce>[κ2]2S,2S\ce>[κ3]0.0\ce{->[\kappa_{1}]}\text{S},\quad\text{S}\ce{->[\kappa_{2}]}2\text{S},\quad 2\text{S}\ce{->[\kappa_{3}]}0.

Here, Ω={2,1}\Omega=\{-2,1\} and λ2(x)=κ3x(x1)\lambda_{-2}(x)=\kappa_{3}x(x-1), λ1(x)=κ1+κ2x\lambda_{1}(x)=\kappa_{1}+\kappa_{2}x for x0x\in\mathbb{N}_{0}. Hence, ω=1\omega_{*}=1, 𝗂+=0{\sf i}_{+}=0, 𝗈=0{\sf o}_{-}=0. Consequently, s=0s=0, 𝖯0=0{\sf P}_{0}=\mathbb{N}_{0} and (A3) is fulfilled, in addition to (A1)-(A2). According to [30, Theorem 7], the SRN is positive recurrent on 0\mathbb{N}_{0} for all κ1,κ2,κ3>0\kappa_{1},\kappa_{2},\kappa_{3}>0, and hence has a unique stationary distribution.

We have L=0L=0 and U=1U=1. By Corollary 4.6, the stationary distribution fulfils

π(x)=π(0)detB0(x1)+π(1)detB1(x1),x2,\displaystyle\pi(x)=\pi(0)\det B_{0}(x-1)+\pi(1)\det B_{1}(x-1),\qquad x\geq 2,

where B0=(A0B)B_{0}=(A_{0}\,\,B), B1=(A1B)B_{1}=(A_{1}\,\,B), and

A_{0}=\begin{pmatrix}\frac{\kappa_{1}}{2\kappa_{3}}\\ 0\\ 0\\ 0\\ 0\\ \vdots\end{pmatrix},\quad A_{1}=\begin{pmatrix}0\\ \frac{\kappa_{1}+\kappa_{2}}{2\cdot 3\kappa_{3}}\\ 0\\ 0\\ 0\\ \vdots\end{pmatrix},\quad B=\begin{pmatrix}-1&0&0&0&\ldots\\ -\frac{1}{3}&-1&0&0&\ldots\\ \frac{\kappa_{1}+2\kappa_{2}}{3\cdot 4\kappa_{3}}&-\frac{2}{4}&-1&0&\ldots\\ 0&\frac{\kappa_{1}+3\kappa_{2}}{4\cdot 5\kappa_{3}}&-\frac{3}{5}&-1&\ldots\\ 0&0&\frac{\kappa_{1}+4\kappa_{2}}{5\cdot 6\kappa_{3}}&-\frac{4}{6}&\ldots\\ \vdots&\vdots&\vdots&\vdots&\ddots\end{pmatrix}.
Example 4.8 (Downwardly skip-free process).

Consider the SRN with mass-action kinetics,

0\ce>[κ1]2S,S\ce>[κ2]0,2S\ce>[κ3]S.0\ce{->[\kappa_{1}]}2\text{S},\quad\text{S}\ce{->[\kappa_{2}]}0,\quad 2\text{S}\ce{->[\kappa_{3}]}\text{S}.

Here, Ω={1,2}\Omega=\{-1,2\} and λ1(x)=κ2x+κ3x(x1)\lambda_{-1}(x)=\kappa_{2}x+\kappa_{3}x(x-1), λ2(x)=κ1\lambda_{2}(x)=\kappa_{1} for x0x\in\mathbb{N}_{0}. Hence, ω=1\omega_{*}=1, 𝗂+=0{\sf i}_{+}=0, 𝗈=0{\sf o}_{-}=0. Consequently, s=0s=0, 𝖯0=0{\sf P}_{0}=\mathbb{N}_{0} and (A3) is fulfilled, in addition to (A1)-(A2). According to [30, Theorem 7], the SRN is positive recurrent on 0\mathbb{N}_{0} for all κ1,κ2,κ3>0\kappa_{1},\kappa_{2},\kappa_{3}>0, and hence has a unique stationary distribution.

We have L=U=0L=U=0. By Corollary 4.6, the stationary distribution fulfils

π(x)=π(0)detB0(x),x1,\displaystyle\pi(x)=\pi(0)\det B_{0}(x),\qquad x\geq 1,

where

B0=(κ1κ2100κ12(κ2+κ3)κ12(κ2+κ3)100κ13(κ2+2κ3)κ13(κ2+2κ3)100κ14(κ2+3κ3)κ14(κ2+3κ3)).B_{0}=\begin{pmatrix}\frac{\kappa_{1}}{\kappa_{2}}&-1&0&0&\ldots\\ \frac{\kappa_{1}}{2(\kappa_{2}+\kappa_{3})}&\frac{\kappa_{1}}{2(\kappa_{2}+\kappa_{3})}&-1&0&\ldots\\ 0&\frac{\kappa_{1}}{3(\kappa_{2}+2\kappa_{3})}&\frac{\kappa_{1}}{3(\kappa_{2}+2\kappa_{3})}&-1&\ldots\\ 0&0&\frac{\kappa_{1}}{4(\kappa_{2}+3\kappa_{3})}&\frac{\kappa_{1}}{4(\kappa_{2}+3\kappa_{3})}&\ldots\\ \vdots&\vdots&\vdots&\vdots&\ddots\end{pmatrix}.
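The determinant representation can be checked numerically for this example. The sketch below (our own check; the κ values are arbitrary illustrative choices) builds the leading principal submatrices B₀(n) of Theorem 4.5 and compares det B₀(n) with γ₀(n) obtained from the recursion (4.4); for this chain f₁(ℓ) = f₂(ℓ) = κ₁/λ₋₁(ℓ).

```python
import numpy as np

# Sketch (our own check; the kappa values are arbitrary): Theorem 4.5 and
# Corollary 4.6 for Example 4.8, where Omega = {-1, 2}, L = U = 0 and
# f_1(l) = f_2(l) = kappa1/lambda_{-1}(l).
kappa1, kappa2, kappa3 = 1.0, 1.0, 1.0
lam_m1 = lambda x: kappa2 * x + kappa3 * x * (x - 1)   # lambda_{-1}(x)
f = lambda l: kappa1 / lam_m1(l)

# gamma_0 from the recursion (4.4), with gamma_0(0) = 1.
M = 8
gamma0 = [1.0]
for l in range(1, M + 1):
    gamma0.append(f(l) * (gamma0[l - 1] + (gamma0[l - 2] if l >= 2 else 0.0)))

def B0(n):
    """First n rows and columns of the band matrix B_0 of Theorem 4.5."""
    A = np.zeros((n, n))
    A[0, 0] = f(1)                    # g_0(1) = f_1(1) * gamma_0(0)
    if n > 1:
        A[1, 0] = f(2)                # g_0(2) = f_2(2) * gamma_0(0)
    for i in range(n):
        if i + 1 < n:
            A[i, i + 1] = -1.0        # superdiagonal of -1's
        if i >= 1:
            A[i, i] = f(i + 1)        # f_1(U + i + 1)
        if i >= 2:
            A[i, i - 1] = f(i + 1)    # f_2(U + i + 1)
    return A

dets = [np.linalg.det(B0(n)) for n in range(1, M + 1)]
assert np.allclose(dets, gamma0[1:])  # gamma_0(l) = det B_0(l) for l >= 1
```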

5. Skip-free Markov chains

5.1. Downwardly skip-free Markov chains

An explicit iterative formula can be derived from Theorem 4.4 for downwardly skip-free Markov chains.

Corollary 5.1.

Assume ω=1\omega_{-}=-1, and let π(0)>0\pi(0)>0. Then, π\pi is a stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0} if and only if

π()=π(0)γ0(),for0,\pi(\ell)=\pi(0)\gamma_{0}(\ell),\quad\text{for}\quad\ell\in\mathbb{N}_{0},

where

γ0(0)=1,γ0()=k=1ω+γ0(k)fk(),>0,\gamma_{0}(0)=1,\qquad\gamma_{0}(\ell)=\sum_{k=1}^{\omega_{+}}\gamma_{0}(\ell-k)f_{k}(\ell),\quad\ell>0,

and fkf_{k} is as defined in (4.5). Consequently, there exists a unique stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}. Furthermore, if π\pi is a stationary distribution, then π(0)=(=0γ0())1\pi(0)=\left(\sum_{\ell=0}^{\infty}\gamma_{0}(\ell)\right)^{-1}.

Proof.

Since ω=1\omega_{-}=-1, then L=UL=U, see (4.2). Moreover, 𝗂ω=1{\sf i}_{\omega_{-}}=1 as otherwise the state zero could not be reached. Hence, L=U=𝗂ω1=0L=U={\sf i}_{\omega_{-}}-1=0 from (4.2). Consequently, π()=π(0)γ0()\pi(\ell)=\pi(0)\gamma_{0}(\ell), 0\ell\in\mathbb{N}_{0}, from Theorem 4.4. Furthermore, m=ω+ω1=ω+m_{*}=\omega_{+}-\omega_{-}-1=\omega_{+} such that the expression for γ0()\gamma_{0}(\ell), 0\ell\in\mathbb{N}_{0}, follows from (4.4). Positivity of γ0()\gamma_{0}(\ell), 0\ell\in\mathbb{N}_{0}, follows from Theorem 6.3. If π\pi is a stationary distribution, then 1==0π()=π(0)=0γ0()1=\sum_{\ell=0}^{\infty}\pi(\ell)=\pi(0)\sum_{\ell=0}^{\infty}\gamma_{0}(\ell), and the conclusion follows. ∎

Corollary 5.1 leads to the classical birth-death process characterization.

Corollary 5.2.

Assume Ω={1,1}\Omega=\{-1,1\}. A measure π\pi on 0\mathbb{N}_{0} is a stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0} if and only if

π()=π(0)j=1λ1(j1)λ1(j),for>0,\pi(\ell)=\pi(0)\prod_{j=1}^{\ell}\frac{\lambda_{1}(j-1)}{\lambda_{-1}(j)},\quad\text{for}\quad\ell>0,

where π(0)=(1+=1j=1λ1(j1)λ1(j))1\pi(0)=\left(1+\sum_{\ell=1}^{\infty}\prod_{j=1}^{\ell}\frac{\lambda_{1}(j-1)}{\lambda_{-1}(j)}\right)^{-1} in the case of a stationary distribution.

Proof.

In particular, the process is downwardly skip-free. Furthermore, ω+=1\omega_{+}=1, such that for >0\ell>0, we have from Corollary 5.1, (4.5) and (4.6),

γ0()=k=1ω+γ0(k)fk()=γ0(1)f1()=j=1f1(j)=(1)j=1c1(j1)j=1c0(j).\gamma_{0}(\ell)=\sum_{k=1}^{\omega_{+}}\gamma_{0}(\ell-k)f_{k}(\ell)=\gamma_{0}(\ell-1)f_{1}(\ell)=\prod_{j=1}^{\ell}f_{1}(j)=(-1)^{\ell}\frac{\prod_{j=1}^{\ell}c_{1}(j-1)}{\prod_{j=1}^{\ell}c_{0}(j)}.

By definition of BkB_{k} and ck()c_{k}(\ell), k=0,1k=0,1 (m=1m_{*}=1), we have B0={1}B_{0}=\{-1\}, B1={1}B_{1}=\{1\},

c0()=λ1(),c1()=λ1().c_{0}(\ell)=-\lambda_{-1}(\ell),\quad c_{1}(\ell)=\lambda_{1}(\ell).

By insertion, this yields

γ0()=j=1λ1(j1)λ1(j),>0,\gamma_{0}(\ell)=\prod_{j=1}^{\ell}\frac{\lambda_{1}(j-1)}{\lambda_{-1}(j)},\qquad\ell>0,

and the statement follows from Corollary 5.1. ∎
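Corollary 5.2 can be illustrated on the simplest mass-action birth-death chain, 0 → S with rate κ₁ and S → 0 with rate κ₂x, whose stationary distribution is the Poisson distribution with mean κ₁/κ₂. The sketch below (ours; the constants are arbitrary choices) evaluates the product formula of the corollary and compares with the Poisson probabilities.

```python
import math

# Sketch (ours; constants arbitrary): Corollary 5.2 for the birth-death chain
# with lambda_1(x) = kappa1 and lambda_{-1}(x) = kappa2*x, i.e. the reactions
# 0 -> S and S -> 0 under mass-action kinetics. The stationary distribution
# is Poisson with mean rho = kappa1/kappa2.
kappa1, kappa2 = 3.0, 2.0
rho = kappa1 / kappa2

N = 60                      # truncation point for the normalizing sum
terms = [1.0]
for l in range(1, N + 1):   # prod_{j=1}^l lambda_1(j-1)/lambda_{-1}(j) = rho^l/l!
    terms.append(terms[-1] * kappa1 / (kappa2 * l))

pi0 = 1.0 / sum(terms)      # (1 + sum_l prod ...)^{-1}, truncated at N
pi = [pi0 * t for t in terms]

for l in range(N + 1):
    assert abs(pi[l] - math.exp(-rho) * rho**l / math.factorial(l)) < 1e-10
```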

5.2. Upwardly skip-free Markov chains

In general, we are not able to give conditions for when an upwardly skip-free Markov chain has a unique stationary measure (up to a scalar) and when it has more than one or even infinitely many linearly independent ones. However, in a particular case, if the stationary measure is not unique, we establish all stationary measures as an explicit one-parameter family of measures (Corollary 5.5). If the transition rate functions are polynomial (for large xx), we show the stationary measure is always unique (Corollary 5.6). Thus, non-polynomial transition rates are needed for non-uniqueness; one such example is given in Example 9.5.

Proposition 5.3.

Assume ω=2\omega_{-}=-2 and ω+=1\omega_{+}=1, hence U=L+1U=L+1. A non-zero measure π\pi on 0\mathbb{N}_{0} is a stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0} if and only if

(5.1) h(x)(ϕ(x+1)+1)=ϕ(x)ϕ(x+1),xU+2,h(x)(\phi(x+1)+1)=\phi(x)\phi(x+1),\quad x\geq U+2,

where

ϕ(x)\displaystyle\phi(x) =π(x1)π(x)λ1(x1)+λ2(x1)λ2(x),\displaystyle=\frac{\pi(x-1)}{\pi(x)}\frac{\lambda_{-1}(x-1)+\lambda_{-2}(x-1)}{\lambda_{-2}(x)},
(5.2) h(x)\displaystyle h(x) =(λ1(x1)+λ2(x1))(λ1(x)+λ2(x))λ1(x1)λ2(x),\displaystyle=\frac{(\lambda_{-1}(x-1)+\lambda_{-2}(x-1))(\lambda_{-1}(x)+\lambda_{-2}(x))}{\lambda_{1}(x-1)\lambda_{-2}(x)},

for xU+2x\geq U+2, and

(5.3) π(x)=π(x+1)(λ1(x+1)+λ2(x+1))+π(x+2)λ2(x+2)λ1(x),0xU,\pi(x)=\frac{\pi(x+1)(\lambda_{-1}(x+1)+\lambda_{-2}(x+1))+\pi(x+2)\lambda_{-2}(x+2)}{\lambda_{1}(x)},\quad 0\leq x\leq U,

with λ10\lambda_{-1}\equiv 0 by convention if there are no jumps of size 1-1. If π\pi is a stationary measure, it furthermore satisfies

(5.4) π(x)=π(U+1)j=U+1x1λ1(j)+λ2(j)λ2(j+1)ϕ(j+1),xU+2.\pi(x)=\pi(U+1)\prod_{j=U+1}^{x-1}\frac{\lambda_{-1}(j)+\lambda_{-2}(j)}{\lambda_{-2}(j+1)\phi(j+1)},\quad x\geq U+2.
Proof.

Recall the master equation for a stationary measure π\pi, in the form of (4.11):

(5.5) π(x)(λ1(x)+λ2(x))+π(x+1)λ2(x+1)=π(x1)λ1(x1),x1.\pi(x)(\lambda_{-1}(x)+\lambda_{-2}(x))+\pi(x+1)\lambda_{-2}(x+1)=\pi(x-1)\lambda_{1}(x-1),\quad x\geq 1.

Define ϕ\phi and hh as in the statement for xU+2x\geq U+2 (if x=U+1x=U+1, then ϕ(x),h(x)\phi(x),h(x) might be zero; if x<U+1x<U+1, then division by zero occurs as U+1=𝗂ωU+1={\sf i}_{\omega_{-}}). Dividing both sides of (5.5) by π(x+1)λ2(x+1)\pi(x+1)\lambda_{-2}(x+1) yields

π(x)π(x+1)λ1(x)+λ2(x)λ2(x+1)+1=\displaystyle\frac{\pi(x)}{\pi(x+1)}\frac{\lambda_{-1}(x)+\lambda_{-2}(x)}{\lambda_{-2}(x+1)}+1= π(x1)π(x)λ1(x1)+λ2(x1)λ2(x)\displaystyle\frac{\pi(x-1)}{\pi(x)}\frac{\lambda_{-1}(x-1)+\lambda_{-2}(x-1)}{\lambda_{-2}(x)}
π(x)π(x+1)λ1(x)+λ2(x)λ2(x+1)\displaystyle\cdot\frac{\pi(x)}{\pi(x+1)}\frac{\lambda_{-1}(x)+\lambda_{-2}(x)}{\lambda_{-2}(x+1)}
λ1(x1)λ2(x)(λ1(x1)+λ2(x1))(λ1(x)+λ2(x)),\displaystyle\cdot\frac{\lambda_{1}(x-1)\lambda_{-2}(x)}{(\lambda_{-1}(x-1)+\lambda_{-2}(x-1))(\lambda_{-1}(x)+\lambda_{-2}(x))},

and the identity (5.1) follows. Rewriting the master equation yields (5.3). The calculations can be done in reverse order, yielding the bi-implication. Equation (5.4) follows by induction. ∎

Lemma 5.4.

Let h:[N,)0(0,)h\colon[N,\infty)\cap\mathbb{N}_{0}\to(0,\infty) and ϕ:[N,)0\phi\colon[N,\infty)\cap\mathbb{N}_{0}\to\mathbb{R} be functions with N0N\in\mathbb{N}_{0}, and such that

(5.6) h(x)(ϕ(x+1)+1)=ϕ(x)ϕ(x+1),xN.h(x)(\phi(x+1)+1)=\phi(x)\phi(x+1),\quad x\geq N.

Let ϕ\phi^{*} be a positive number. Then, ϕ\phi is a positive function with ϕ(N)=ϕ\phi(N)=\phi^{*} if and only if

Ψ2(N)ϕΨ1(N),\Psi_{2}(N)\leq\phi^{*}\leq\Psi_{1}(N),

where Ψ1(x)=limkψ(x,2k1)\Psi_{1}(x)=\lim_{k\to\infty}\psi(x,2k-1), Ψ2(x)=limkψ(x,2k)\Psi_{2}(x)=\lim_{k\to\infty}\psi(x,2k), and ψ(x,k)\psi(x,k) is determined recursively by

(5.7) ψ(x,k)=h(x)(1+1ψ(x+1,k1)),xN,k1,\psi(x,k)=h(x)\left(1+\frac{1}{\psi(x+1,k-1)}\right),\quad x\geq N,\quad k\geq 1,

with ψ(x,0)=h(x)\psi(x,0)=h(x).

Proof.

Let ψ(x,k)\psi(x,k) be as in the statement and ϕ\phi a positive function fulfilling (5.6). Note that ψ(x,k)\psi(x,k) is the kk-th convergent of a generalized continued fraction, hence ψ(x,2k)\psi(x,2k) is increasing in k0k\geq 0 and ψ(x,2k+1)\psi(x,2k+1) is decreasing in k0k\geq 0. Indeed, this follows from Theorem 4 (monotonicity of even and odd convergents) of [18]. See also the proof of Corollary 5.5.

By induction, we will show that

ψ(x,2k)<ϕ(x)<ψ(x,2k+1),xN,k0,\psi(x,2k)<\phi(x)<\psi(x,2k+1),\quad x\geq N,\quad k\geq 0,

from which it follows that

(5.8) Ψ2(x)=limkψ(x,2k)ϕ(x)Ψ1(x)=limkψ(x,2k1),xN.\Psi_{2}(x)=\lim_{k\to\infty}\psi(x,2k)\leq\phi(x)\leq\Psi_{1}(x)=\lim_{k\to\infty}\psi(x,2k-1),\quad x\geq N.

For the base case, it follows from (5.6) that ψ(x,0)=h(x)<ϕ(x)\psi(x,0)=h(x)<\phi(x) for xNx\geq N, and thus

ψ(x,1)=h(x)(1+1ψ(x+1,0))>h(x)(1+1ϕ(x+1))=ϕ(x).\psi(x,1)=h(x)\left(1+\frac{1}{\psi(x+1,0)}\right)>h(x)\left(1+\frac{1}{\phi(x+1)}\right)=\phi(x).

For the induction step, assume ψ(x,2k)<ϕ(x)<ψ(x,2k+1)\psi(x,2k)<\phi(x)<\psi(x,2k+1) for xNx\geq N and some k0k\geq 0. Then, using (5.7),

ψ(x,2k+2)\displaystyle\psi(x,2k+2) =h(x)(1+1ψ(x+1,2k+1))<h(x)(1+1ϕ(x+1))=ϕ(x),\displaystyle=h(x)\left(1+\frac{1}{\psi(x+1,2k+1)}\right)<h(x)\left(1+\frac{1}{\phi(x+1)}\right)=\phi(x),
ψ(x,2k+3)\displaystyle\psi(x,2k+3) =h(x)(1+1ψ(x+1,2k+2))>h(x)(1+1ϕ(x+1))=ϕ(x).\displaystyle=h(x)\left(1+\frac{1}{\psi(x+1,2k+2)}\right)>h(x)\left(1+\frac{1}{\phi(x+1)}\right)=\phi(x).

If ϕ(N)=ϕ\phi(N)=\phi^{*}, then the first implication follows from (5.8). For the reverse implication, assume Ψ2(N)ϕ(N)Ψ1(N)\Psi_{2}(N)\leq\phi(N)\leq\Psi_{1}(N). Note from (5.6) that ϕ(x+1)\phi(x+1) is positive only if h(x)<ϕ(x)h(x)<\phi(x). We will show that Ψ2(N)ϕ(N)Ψ1(N)\Psi_{2}(N)\leq\phi(N)\leq\Psi_{1}(N) implies Ψ2(x)ϕ(x)Ψ1(x)\Psi_{2}(x)\leq\phi(x)\leq\Psi_{1}(x) for all xNx\geq N, hence also h(x)=ψ(x,0)<Ψ2(x)ϕ(x)h(x)=\psi(x,0)<\Psi_{2}(x)\leq\phi(x) for all xNx\geq N, and we are done. Letting kk\to\infty in (5.7) yields

Ψ1(x)=h(x)(1+1Ψ2(x+1)),Ψ2(x)=h(x)(1+1Ψ1(x+1)),xN.\Psi_{1}(x)=h(x)\bigg{(}1+\frac{1}{\Psi_{2}(x+1)}\bigg{)},\quad\Psi_{2}(x)=h(x)\bigg{(}1+\frac{1}{\Psi_{1}(x+1)}\bigg{)},\quad x\geq N.

Hence, it follows from the induction hypothesis that

Ψ2(x+1)=1Ψ1(x)h(x)11ϕ(x)h(x)1=ϕ(x+1)1Ψ2(x)h(x)1=Ψ1(x+1),\Psi_{2}(x+1)=\frac{1}{\frac{\Psi_{1}(x)}{h(x)}-1}\leq\frac{1}{\frac{\phi(x)}{h(x)}-1}=\phi(x+1)\leq\frac{1}{\frac{\Psi_{2}(x)}{h(x)}-1}=\Psi_{1}(x+1),

using (5.6), and the claim follows. ∎
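The bracketing by even and odd convergents used in the proof can be seen concretely in the artificial special case h ≡ 1 (our choice, purely for illustration), where ψ(x,k) does not depend on x, the recursion (5.7) reduces to ψ(k) = 1 + 1/ψ(k−1), and Ψ₁ = Ψ₂ is the positive root of ϕ + 1 = ϕ², i.e. the golden ratio.

```python
import math

# Sketch (ours): the convergents of Lemma 5.4 in the artificial special case
# h(x) = 1 for all x. Then psi(x, k) does not depend on x, (5.7) reduces to
# psi(k) = 1 + 1/psi(k-1) with psi(0) = 1, and Psi_1 = Psi_2 is the positive
# root of phi + 1 = phi^2, i.e. the golden ratio.
K = 24
psi = [1.0]
for k in range(1, K + 1):
    psi.append(1.0 + 1.0 / psi[k - 1])

even, odd = psi[0::2], psi[1::2]
assert all(a < b for a, b in zip(even, even[1:]))  # even convergents increase
assert all(a > b for a, b in zip(odd, odd[1:]))    # odd convergents decrease
assert all(e < o for e, o in zip(even, odd))       # even sit below odd

golden = (1.0 + math.sqrt(5.0)) / 2.0
assert abs(psi[-1] - golden) < 1e-8                # both limits equal phi
```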

Below, we give a general condition for uniqueness and show that uniqueness does not always hold by example. In Section 6, we give concrete cases where uniqueness applies.

Corollary 5.5.

Assume ω=2\omega_{-}=-2 and ω+=1\omega_{+}=1. Choose π>0\pi^{*}>0 and Ψ2(U+2)ϕΨ1(U+2),\Psi_{2}(U+2)\leq\phi^{*}\leq\Psi_{1}(U+2), where Ψ1,Ψ2\Psi_{1},\Psi_{2} are as in Lemma 5.4 and hh as in (5.2). Let ϕ\phi be a solution to (5.6) in Lemma 5.4 with ϕ(U+2)=ϕ\phi(U+2)=\phi^{*}. Then, (5.4) and (5.3) define a stationary measure π\pi of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0} by setting π(U+1)=π\pi(U+1)=\pi^{*}.

The measure is unique if and only if Ψ2(U+2)=Ψ1(U+2)\Psi_{2}(U+2)=\Psi_{1}(U+2), which holds if and only if the following sum is divergent for x=U+2x=U+2,

H(x)\displaystyle H(x) =n=0(h(x+2n+1)i=0nh(x+2i)h(x+2i+1)+i=0nh(x+2i+1)h(x+2i))\displaystyle=\sum_{n=0}^{\infty}\left(h(x+2n+1)\prod_{i=0}^{n}\frac{h(x+2i)}{h(x+2i+1)}+\prod_{i=0}^{n}\frac{h(x+2i+1)}{h(x+2i)}\right)
=n=0(λ1(x+2n)+λ2(x+2n))(λ1(x1)+λ2(x1))λ1(x+2n)λ2(x+2n+1)1Qn(x)\displaystyle=\sum_{n=0}^{\infty}\frac{(\lambda_{-1}(x+2n)+\lambda_{-2}(x+2n))(\lambda_{-1}(x-1)+\lambda_{-2}(x-1))}{\lambda_{1}(x+2n)\lambda_{-2}(x+2n+1)}\frac{1}{Q_{n}(x)}
+n=0λ1(x+2n+1)+λ2(x+2n+1)λ1(x1)+λ2(x1)Qn(x),\displaystyle\quad+\sum_{n=0}^{\infty}\frac{\lambda_{-1}(x+2n+1)+\lambda_{-2}(x+2n+1)}{\lambda_{-1}(x-1)+\lambda_{-2}(x-1)}Q_{n}(x),

where

Qn(x)=i=0nλ1(x+2i1)λ2(x+2i)λ1(x+2i)λ2(x+2i+1).Q_{n}(x)=\prod_{i=0}^{n}\frac{\lambda_{1}(x+2i-1)\lambda_{-2}(x+2i)}{\lambda_{1}(x+2i)\lambda_{-2}(x+2i+1)}.
Proof.

The first part of the proof is an application of Proposition 5.3 and Lemma 5.4 with N=U+2N=U+2. The last bi-implication follows by noting that ψ(x,k)\psi(x,k) in Lemma 5.4 is the kk-th convergent of a generalized continued fraction,

b0+c1b1+c2b2+c3b3+=h(x)(1+1h(x+1)(1+1h(x+2)(1+1h(x+3)(1+b_{0}+\cfrac{c_{1}}{b_{1}+\cfrac{c_{2}}{b_{2}+\cfrac{c_{3}}{b_{3}+\cdots\vphantom{\cfrac{1}{1}}}}}=h(x)(1+\cfrac{1}{h(x+1)(1+\cfrac{1}{h(x+2)(1+\cfrac{1}{h(x+3)(1+\cdots\vphantom{\cfrac{1}{1}}}}}

that is, bn=cn+1=h(x+n)b_{n}=c_{n+1}=h(x+n), n0n\geq 0. By transformation, this generalized continued fraction is equivalent to a (standard) continued fraction,

a0+1a1+1a2+1a3+a_{0}+\cfrac{1}{a_{1}+\cfrac{1}{a_{2}+\cfrac{1}{a_{3}+\cdots\vphantom{\cfrac{1}{1}}}}}

with

a2n=h(x+2n+1)i=0nh(x+2i)h(x+2i+1),a2n+1=i=0nh(x+2i+1)h(x+2i),n0.a_{2n}=h(x+2n+1)\prod_{i=0}^{n}\frac{h(x+2i)}{h(x+2i+1)},\quad a_{2n+1}=\prod_{i=0}^{n}\frac{h(x+2i+1)}{h(x+2i)},\quad n\geq 0.

By [18, Theorem 10], the continued fraction converges if and only if n=0an=\sum_{n=0}^{\infty}a_{n}=\infty, which proves the bi-implication, noting that by the first part of the proof it is sufficient to check x=U+2x=U+2, and using the concrete form of h(x)h(x) in (5.2). ∎

We give a concrete example of a non-unique Markov chain in Example 9.5 in Section 9.
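To make the construction behind Proposition 5.3 and Corollary 5.5 concrete, the following sketch (our own illustration; the rate constants are arbitrary choices) approximates ϕ for the positive recurrent chain of Example 4.7 by running the recursion ϕ(x) = h(x)(1 + 1/ϕ(x+1)) backwards from a far tail point, then assembles π via (5.4) and (5.3) and checks the flux balance equation (5.5).

```python
# Sketch (our own illustration; rate constants arbitrary): the construction of
# Proposition 5.3 / Corollary 5.5 for the chain of Example 4.7, where
# Omega = {-2, 1}, lambda_1(x) = kappa1 + kappa2*x, lambda_{-2}(x) =
# kappa3*x*(x-1), U = 1 and lambda_{-1} = 0 in (5.2)-(5.4).
kappa1, kappa2, kappa3 = 1.0, 0.5, 1.0
lam1 = lambda x: kappa1 + kappa2 * x if x >= 0 else 0.0
lam2 = lambda x: kappa3 * x * (x - 1) if x >= 0 else 0.0

U, M, buf = 1, 40, 60     # states checked: 0..M; buf extra tail states
h = lambda x: lam2(x - 1) * lam2(x) / (lam1(x - 1) * lam2(x))   # (5.2)

# phi via the backward recursion phi(x) = h(x)(1 + 1/phi(x+1)); the arbitrary
# tail initialization phi(M+buf) = h(M+buf) is forgotten going backwards.
phi = {M + buf: h(M + buf)}
for x in range(M + buf - 1, U + 1, -1):         # x = M+buf-1, ..., U+2
    phi[x] = h(x) * (1.0 + 1.0 / phi[x + 1])

# pi from (5.4) with pi(U+1) = 1, then pi(1) and pi(0) from (5.3).
pi = {U + 1: 1.0}
for x in range(U + 2, M + 2):
    pi[x] = pi[x - 1] * lam2(x - 1) / (lam2(x) * phi[x])
pi[1] = (pi[2] * lam2(2) + pi[3] * lam2(3)) / lam1(1)
pi[0] = (pi[1] * lam2(1) + pi[2] * lam2(2)) / lam1(0)

# Check the flux balance equation (5.5) on the constructed window of states.
for x in range(1, M):
    lhs = pi[x] * lam2(x) + pi[x + 1] * lam2(x + 1)
    rhs = pi[x - 1] * lam1(x - 1)
    assert abs(lhs - rhs) <= 1e-8 * max(1.0, rhs)
```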

Corollary 5.6.

Assume ω+=1\omega_{+}=1, ω=2\omega_{-}=-2, and that λω(x)\lambda_{\omega}(x), ωΩ\omega\in\Omega, is a polynomial for large xx. Then, there is a unique stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}.

Proof.

We prove that the first series in H(x)H(x) in Corollary 5.5 diverges for xU+2x\geq U+2, and hence H(x)=H(x)=\infty. Let m1=degλ1(x)m_{1}=\deg\lambda_{1}(x), m2=degλ2(x)m_{2}=\deg\lambda_{-2}(x). We first provide asymptotics of Qn(x)Q_{n}(x) for large nn. By the Euler-Maclaurin formula with xU+2x\geq U+2,

Qn(x)\displaystyle Q_{n}(x) =exp(i=0nlog(λ1(x+2i1)λ1(x+2i))+log(λ2(x+2i)λ2(x+2i+1)))\displaystyle=\exp\left(\sum_{i=0}^{n}\log\bigl{(}\tfrac{\lambda_{1}(x+2i-1)}{\lambda_{1}(x+2i)}\bigr{)}+\log\bigl{(}\tfrac{\lambda_{-2}(x+2i)}{\lambda_{-2}(x+2i+1)}\bigr{)}\right)
=exp(i=0nlog(1m1x+2i+O((x+2i)2))+log(1m2x+2i+O((x+2i)2)))\displaystyle=\exp\left(\sum_{i=0}^{n}\log\bigl{(}1-\tfrac{m_{1}}{x+2i}+O((x+2i)^{-2})\bigr{)}+\log\bigl{(}1-\tfrac{m_{2}}{x+2i}+O((x+2i)^{-2})\bigr{)}\right)
=exp(i=0nm1x+2i+O((x+2i)2)m2x+2i+O((x+2i)2))\displaystyle=\exp\left(\sum_{i=0}^{n}-\tfrac{m_{1}}{x+2i}+O((x+2i)^{-2})-\tfrac{m_{2}}{x+2i}+O((x+2i)^{-2})\right)
=exp(m1log(x+2n)m2log(x+2n)+O(1))\displaystyle=\exp\Bigl{(}-m_{1}\log(x+2n)-m_{2}\log(x+2n)+O(1)\Bigr{)}
=C(x,2n)(x+2n)(m1+m2),\displaystyle=C(x,2n)(x+2n)^{-(m_{1}+m_{2})},

where 0<C1(x)<C(x,2n)<C2(x)<0<C_{1}(x)<C(x,2n)<C_{2}(x)<\infty for all nn, and C1(x),C2(x)C_{1}(x),C_{2}(x) are positive constants depending on xx. This implies that the terms in the first series of H(x)H(x) are bounded uniformly in nn from below:

lim infnλ2(x+2n)λ2(x1)λ1(x+2n)λ2(x+2n+1)1Qn(x)>0,\displaystyle\liminf_{n\to\infty}\frac{\lambda_{-2}(x+2n)\lambda_{-2}(x-1)}{\lambda_{1}(x+2n)\lambda_{-2}(x+2n+1)}\frac{1}{Q_{n}(x)}>0,

and hence the first series in H(x)H(x) diverges. ∎

6. Invariant vectors and existence of stationary measures

In the previous section, we focused on non-zero measures on 0\mathbb{N}_{0} and conditions to ensure stationarity. However, the requirement of a non-zero measure is not essential. In fact, the conditions of Proposition 4.3 and Theorem 4.4 characterize equally any real-valued sequence that solves the master equation (2.2). Henceforth, we refer to the vector (vL,,vU)UL+1(v_{L},\ldots,v_{U})\in\mathbb{R}^{U-L+1} as a generator and to the corresponding solution sequence as an invariant vector. This implies that the linear subspace of invariant vectors in ()\ell(\mathbb{R}) is (UL+1)(U-L+1)-dimensional. Such a vector may or may not be a signed invariant measure, depending on whether the positive or the negative part of the vector has finite one-norm.

We assume (A1)-(A3) throughout Section 6. If the CTMC is recurrent (positive or null), then it is well known that there exists a unique stationary measure. For transient CTMCs, including explosive chains, there also exists a non-zero stationary measure in the setting considered.

Proposition 6.1.

There exists a non-zero stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}.

Proof.

It follows from [15, Corollary] and [25, Theorem 3.5.1], noting that the set of states with non-zero transition rates to any given state x0x\in\mathbb{N}_{0} is finite, in fact of size at most #Ω\#\Omega. ∎

Lemma 6.2.

Let π\pi be a non-zero invariant measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0} such that π(x)0\pi(x)\geq 0 for all x0x\in\mathbb{N}_{0}. Then, π(x)>0\pi(x)>0 for all x0x\in\mathbb{N}_{0}. In fact, π\pi is a non-zero stationary measure of (Ω,)(\Omega,\mathcal{F}).

Proof.

Assume π(x)=0\pi(x)=0. By rewriting the master equation (2.2), we obtain

π(x)\displaystyle\pi(x) =1ωΩλω(x)ωΩλω(xω)π(xω)=0.\displaystyle=\frac{1}{\sum_{\omega\in\Omega}\lambda_{\omega}(x)}\sum_{\omega\in\Omega}\lambda_{\omega}(x-\omega)\pi(x-\omega)=0.

Let xx be reachable from y0y\in\mathbb{N}_{0} in kk steps. If k=1k=1, then it follows from above that y=xωy=x-\omega for some ωΩ\omega\in\Omega and π(y)=0\pi(y)=0. If k>1k>1, then π(y)=0\pi(y)=0 by induction on the number of steps. Since 0\mathbb{N}_{0} is irreducible by assumption, any state can be reached from any other state in a finite number of steps, and π(x)=0\pi(x)=0 for all x0x\in\mathbb{N}_{0}. However, this contradicts that the measure is non-zero. Hence, π(x)>0\pi(x)>0 for all x0x\in\mathbb{N}_{0}. ∎

Theorem 6.3.

The sequences γL,,γU\gamma_{L},\ldots,\gamma_{U} form a maximal set of linearly independent invariant vectors in ()\ell(\mathbb{R}). Moreover, γj\gamma_{j} has positive and negative terms for all j=L,,Uj=L,\ldots,U if and only if ω<1\omega_{-}<-1, that is, if and only if LUL\not=U. If L=UL=U, then γL\gamma_{L} has all terms positive. In any case, γ(n)=(γL(n),,γU(n))0\gamma(n)=(\gamma_{L}(n),\ldots,\gamma_{U}(n))\not=0, for n0n\in\mathbb{N}_{0}.

Proof.

The former part follows immediately from Theorem 4.4, since (γj(L),,γj(U))(\gamma_{j}(L),\ldots,\gamma_{j}(U)) is the (jL+1)(j-L+1)-th unit vector of UL+1\mathbb{R}^{U-L+1}, for j=L,,Uj=L,\ldots,U, by definition. The latter part follows from Lemma 6.2 and the fact that γj(j)=1\gamma_{j}(j)=1 and γj()=0\gamma_{j}(\ell)=0 for {L,,U}{j}\ell\in\{L,\ldots,U\}\setminus\{j\}. If L=UL=U, then the linear space is one-dimensional and positivity of all γL()\gamma_{L}(\ell), 0\ell\in\mathbb{N}_{0}, follows from Lemma 6.2 and the existence of a stationary measure, see Proposition 6.1. The equivalence between ω<1\omega_{-}<-1 and LUL\not=U follows from (4.2). If γ(n)=0\gamma(n)=0, then π(n)=0\pi(n)=0 for any invariant vector π\pi, contradicting the existence of a stationary measure. ∎

Theorem 6.4.

Let GUL+1G\subseteq\mathbb{R}^{U-L+1} be the set of generators of stationary measures of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}. Then, G>0UL+1G\subseteq\mathbb{R}_{>0}^{U-L+1} is a positive convex cone. Let G0=G{(0,,0)}G_{0}=G\cup\{(0,\ldots,0)\} be the set including the zero vector, which generates the null measure. Then, G0G_{0} is a closed set. The set {vG0:v1=1}\{v\in G_{0}\colon\|v\|_{1}=1\} is closed and convex.

Proof.

Positivity follows from Lemma 6.2. It is straightforward to see that GG is a positive convex cone. Assume v(m)G0v^{(m)}\in G_{0}, m0m\in\mathbb{N}_{0}, and v(m)v=(vL,,vU)G0v^{(m)}\to v=(v_{L},\ldots,v_{U})\not\in G_{0} as mm\to\infty. Since vG0v\not\in G_{0}, the invariant vector generated by vv takes a negative value somewhere (otherwise, by Lemma 6.2, it would either be the null measure or generate a stationary measure); that is, there exists 0\ell\in\mathbb{N}_{0} such that j=LUvjγj()<0\sum_{j=L}^{U}v_{j}\gamma_{j}(\ell)<0. Then, there also exists m0m\in\mathbb{N}_{0} such that j=LUvj(m)γj()<0\sum_{j=L}^{U}v^{(m)}_{j}\gamma_{j}(\ell)<0 with v(m)=(vL(m),,vU(m))v^{(m)}=(v^{(m)}_{L},\ldots,v^{(m)}_{U}), contradicting that v(m)G0v^{(m)}\in G_{0}. Hence, G0G_{0} is closed. The last statement is immediate from the previous. ∎

Part of Theorem 6.4 may be found in [8, Theorem 1.4.6]. In general, we do not have uniqueness of a stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}, except in the case of recurrent CTMCs. For downwardly skip-free processes, we have L=U=0L=U=0, hence the space of signed invariant measures is one-dimensional and uniqueness follows; that is, there does not exist a proper signed invariant measure taking values of both signs.

We end the section with a few results on uniqueness of stationary measures. For this we need the following lemma.

Lemma 6.5.

Let v=(vL,,vU)0UL+1v=(v_{L},\ldots,v_{U})\in\mathbb{R}_{\geq 0}^{U-L+1} be a non-zero generator and assume ω+=1\omega_{+}=1. Let

ν()=j=LUvjγj(),0.\nu(\ell)=\sum_{j=L}^{U}v_{j}\gamma_{j}(\ell),\quad\ell\in\mathbb{N}_{0}.

If, either

(6.1) ν()0,=n(UL),,n,\nu(\ell)\geq 0,\quad\ell=n-(U-L),\ldots,n,

for some nULn\geq U-L, or

(6.2) ν()=0,=n(UL1),,n,\nu(\ell)=0,\quad\ell=n-(U-L-1),\ldots,n,

for some nUL1n\geq U-L-1, then

(6.3) ν()0,=0,,n.\nu(\ell)\geq 0,\quad\ell=0,\ldots,n.
Proof.

Since ω+=1\omega_{+}=1, then m=ω+ω1=ω=UL+1m_{*}=\omega_{+}-\omega_{-}-1=-\omega_{-}=U-L+1. We use ω-\omega_{-} rather than UL+1U-L+1 in the proof for convenience. From Lemma A.1, we have ck()0c_{k}(\ell)\leq 0 for 0\ell\in\mathbb{N}_{0} and 0k<ω0\leq k<-\omega_{-}, and cω()>0c_{-\omega_{-}}(\ell)>0 for 0\ell\in\mathbb{N}_{0}. The latter follows from Ω+={ω+}\Omega_{+}=\{\omega_{+}\} and 𝗂ω+=0{\sf i}_{\omega_{+}}=0 in this case. Furthermore, since ν\nu defines an invariant vector, then from (4.9),

(6.4) 0\displaystyle 0 =k=0ω1ν(nk)ck(nk)+ν(n+ω)cω(n+ω).\displaystyle=\sum_{k=0}^{-\omega_{-}-1}\nu(n-k)c_{k}(n-k)+\nu(n+\omega_{-})c_{-\omega_{-}}(n+\omega_{-}).

Assume (6.1) holds. If n=ULn=U-L, then there is nothing to show. Hence, assume nUL+1n\geq U-L+1. By assumption, the sum in (6.4) is non-positive, hence the last term is non-negative, and cω(n+ω)>0c_{-\omega_{-}}(n+\omega_{-})>0 as n+ω=n(UL+1)0n+\omega_{-}=n-(U-L+1)\geq 0. Hence, ν(n+ω)0\nu(n+\omega_{-})\geq 0. Continue recursively with n:=n1n:=n-1 until n+ω=0n+\omega_{-}=0. Note that ν()=v0\nu(\ell)=v_{\ell}\geq 0 for =L,,U\ell=L,\ldots,U, in agreement with the conclusion.

Assume (6.2) holds. If n=UL1n=U-L-1, then there is nothing to prove. For nULn\geq U-L, we aim to show (6.1) holds from which the conclusion follows. By simplification of (6.4),

ν(n(UL))cUL(n(UL))\displaystyle-\nu(n-(U-L))c_{U-L}(n-(U-L)) =ν(n+ω)cω(n+ω)\displaystyle=\nu(n+\omega_{-})c_{-\omega_{-}}(n+\omega_{-})

If cUL(n(UL))<0c_{U-L}(n-(U-L))<0, then ν(n(UL))\nu(n-(U-L)) and ν(n+ω)\nu(n+\omega_{-}) either have the same sign or are both zero. Consequently, a) ν()0\nu(\ell)\geq 0 for all =n+ω,,n\ell=n+\omega_{-},\ldots,n, or b) ν()0\nu(\ell)\leq 0 for all =n+ω,,n\ell=n+\omega_{-},\ldots,n. Similarly, if cUL(n(UL))=0c_{U-L}(n-(U-L))=0, then ν(n+ω)=0\nu(n+\omega_{-})=0, and a) or b) holds in this case too. If a) holds, then (6.1) holds, and hence (6.3) follows from the first part. If b) holds, then (6.3) holds with reversed inequality, by an argument similar to the non-negative case. However, ν()=v0\nu(\ell)=v_{\ell}\geq 0, =L,,U\ell=L,\ldots,U, and at least one of these is strictly positive, by assumption. Hence, the negative sign cannot apply and the claim holds. ∎

For nUn\geq U, define the (UL+1)×(UL+1)(U-L+1)\times(U-L+1) matrix A(n)A(n) by

A(n)=(γ(n(UL1))Tγ(n(UL1))1γ(n1)Tγ(n1)1γ(n)Tγ(n)1𝟏T),A(n)=\begin{pmatrix}\frac{\gamma(n-(U-L-1))^{T}}{\|\gamma(n-(U-L-1))\|_{1}}\vskip 3.0pt plus 1.0pt minus 1.0pt\\ \vdots\vskip 3.0pt plus 1.0pt minus 1.0pt\\ \frac{\gamma(n-1)^{T}}{\|\gamma(n-1)\|_{1}}\vskip 3.0pt plus 1.0pt minus 1.0pt\\ \frac{\gamma(n)^{T}}{\|\gamma(n)\|_{1}}\vskip 3.0pt plus 1.0pt minus 1.0pt\\ {\bf 1}^{T}\end{pmatrix},

where γ()=(γL(),,γU())\gamma(\ell)=(\gamma_{L}(\ell),\ldots,\gamma_{U}(\ell)), 𝟏=(1,,1){\bf 1}=(1,\ldots,1) and T denotes transpose. This matrix is well-defined by Theorem 6.3. The rows of A(n)A(n), except the last one, have 11-norm one, and all entries are between 1-1 and 11.
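For later numerical use (cf. Section 7), the matrix A(n)A(n) can be assembled directly from the values γj()\gamma_{j}(\ell). The sketch below is a minimal illustration; the γ\gamma-rows used are toy numbers, not values computed from a concrete chain.

```python
def build_A(gamma_rows):
    """Assemble A(n): each row gamma(l) = (gamma_L(l), ..., gamma_U(l))
    is divided by its 1-norm (non-zero by Theorem 6.3), and a final
    row of ones is appended."""
    m = len(gamma_rows[0])
    a = [[x / sum(abs(y) for y in row) for x in row] for row in gamma_rows]
    a.append([1.0] * m)
    return a

# Toy rows standing in for gamma(n - (U - L - 1)), ..., gamma(n):
A = build_A([[2.0, -2.0]])
```

As noted above, all entries of the normalized rows lie between -1 and 1.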

Theorem 6.6.

Assume there exists a strictly increasing subsequence (nk)k0(n_{k})_{k\in\mathbb{N}_{0}}, such that A(nk)AA(n_{k})\to A as kk\to\infty, with det(A)0\det(A)\not=0.

1) There is at most one stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}, say π\pi, with the property

(6.5) limkπ(nk)γ(nk)1=0.\lim_{k\to\infty}\frac{\pi(n_{k})}{\lVert\gamma(n_{k})\rVert_{1}}=0.

2) If ω+=1\omega_{+}=1 and π\pi is the unique stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}, then (6.5) holds.

Proof.

1) Let

σ(n)=(π(n(UL1))γ(n(UL1))1,,π(n)γ(n)1),\sigma(n)=\left(\frac{\pi(n-(U-L-1))}{\|\gamma(n-(U-L-1))\|_{1}},\ldots,\frac{\pi(n)}{\|\gamma(n)\|_{1}}\right),

and let π=(π(L),,π(U))\pi_{*}=(\pi(L),\ldots,\pi(U)) be the generator of the stationary measure π\pi, normalized such that π1=1\lVert\pi_{*}\rVert_{1}=1. We have

A(nk)π=(σ(nk)1)Aπ=(𝟎1),A(n_{k})\pi_{*}=\begin{pmatrix}\sigma(n_{k})\\ 1\end{pmatrix}\to A\pi_{*}=\begin{pmatrix}{\bf 0}\\ 1\end{pmatrix},

as kk\to\infty, where 𝟎{\bf 0} is the zero vector of length ULU-L. Since AA is invertible, then

π=A1(𝟎1).\pi_{*}=A^{-1}\begin{pmatrix}{\bf 0}\\ 1\end{pmatrix}.

Consequently, as this holds for any stationary measure with the property (6.5), then π\pi is unique, up to a scalar.

2) According to Proposition 7.3 below, A(n)A(n) is invertible and there is a unique non-negative (component-wise) solution to

A(n)v(n)=(𝟎1),A(n)v^{(n)}=\begin{pmatrix}{\bf 0}\\ 1\end{pmatrix},

for all nUn\geq U. It follows that

Av^{(n_{k})}=(A-A(n_{k}))v^{(n_{k})}+A(n_{k})v^{(n_{k})}=(A-A(n_{k}))v^{(n_{k})}+\begin{pmatrix}{\bf 0}\\ 1\end{pmatrix}\to\begin{pmatrix}{\bf 0}\\ 1\end{pmatrix}

as kk\to\infty, since v(nk)1=1\|v^{(n_{k})}\|_{1}=1. Define

v=A1(𝟎1),v=A^{-1}\begin{pmatrix}{\bf 0}\\ 1\end{pmatrix},

then vv is non-negative and v1=1\|v\|_{1}=1, since v(nk)v^{(n_{k})} is non-negative, v(nk)1=1\|v^{(n_{k})}\|_{1}=1 and v(nk)vv^{(n_{k})}\to v as kk\to\infty. We aim to show that vv is the generator of the unique stationary measure π\pi.

Define an invariant vector νk\nu^{k}, for each k0k\in\mathbb{N}_{0}, by

νk()=j=LUvj(nk)γj(),0.\nu^{k}(\ell)=\sum_{j=L}^{U}v_{j}^{(n_{k})}\gamma_{j}(\ell),\quad\ell\in\mathbb{N}_{0}.

We have νk()=0\nu^{k}(\ell)=0 for all =nk(UL1),,nk\ell=n_{k}-(U-L-1),\ldots,n_{k}. Hence, by Lemma 6.5, νk()0\nu^{k}(\ell)\geq 0 for =0,,nk\ell=0,\ldots,n_{k}. Fix 0\ell\in\mathbb{N}_{0}. Then for all large kk such that nkn_{k}\geq\ell, we have νk()0\nu^{k}(\ell)\geq 0 and

νk()ν()=j=LUvjγj(),fork.\nu^{k}(\ell)\to\nu(\ell)=\sum_{j=L}^{U}v_{j}\gamma_{j}(\ell),\quad\text{for}\quad k\to\infty.

Hence, ν()0\nu(\ell)\geq 0 for all 0\ell\in\mathbb{N}_{0}. Consequently, using Lemma 6.2, ν\nu is a stationary measure and by uniqueness, it holds that ν=π\nu=\pi, up to a scaling constant. ∎

To state the next result, we introduce some notation. Let

Ij+={n0:γj(n)>0},Ij={n0:γj(n)<0},j=L,,U.I_{j}^{+}=\{n\in\mathbb{N}_{0}\colon\gamma_{j}(n)>0\},\quad I_{j}^{-}=\{n\in\mathbb{N}_{0}\colon\gamma_{j}(n)<0\},\quad j=L,\ldots,U.

If L=UL=U, then IL+=0I^{+}_{L}=\mathbb{N}_{0} and IL=I^{-}_{L}=\emptyset. If LUL\not=U, then by the definition of γj\gamma_{j}, Ij+I^{+}_{j}\not=\emptyset and IjI^{-}_{j}\not=\emptyset (Theorem 6.3). In general, it follows from Theorem 4.4, Proposition 6.1, and Lemma 6.2, that IjijIi+I_{j}^{-}\subseteq\cup_{i\not=j}I_{i}^{+}, i=LUIi+=0\cup_{i=L}^{U}I_{i}^{+}=\mathbb{N}_{0} and i=LUIi=\cap_{i=L}^{U}I_{i}^{-}=\emptyset. In particular, for UL=1U-L=1, that is, ω=2\omega_{-}=-2, then IUIL+I_{U}^{-}\subseteq I_{L}^{+} and ILIU+I_{L}^{-}\subseteq I_{U}^{+}. Below, in the proof of Lemma 6.7, we show these four sets are infinite. Hence, we may define

(6.6) r1=lim supnIL(γL(n)γU(n)),r2=lim infnIU(γL(n)γU(n)).r_{1}=\limsup_{n\in I_{L}^{-}}\left(-\frac{\gamma_{L}(n)}{\gamma_{U}(n)}\right),\quad r_{2}=\liminf_{n\in I_{U}^{-}}\left(-\frac{\gamma_{L}(n)}{\gamma_{U}(n)}\right).
Lemma 6.7.

Assume ω=2\omega_{-}=-2, that is, U=L+1U=L+1. It holds that 0<r1r2<0<r_{1}\leq r_{2}<\infty. A non-zero measure π\pi is a stationary measure if and only if

(6.7) π(U)π(L)[r1,r2].\frac{\pi(U)}{\pi(L)}\in[r_{1},r_{2}].

Furthermore, a stationary measure π\pi is unique if and only if r1=r2r_{1}=r_{2}, if and only if

(6.8) limnILIUπ(n)γ(n)1=0.\lim_{n\in I_{L}^{-}\cup I_{U}^{-}}\frac{\pi(n)}{\lVert\gamma(n)\rVert_{1}}=0.

If this is the case, then the limit superior and limit inferior in (6.6) may be replaced by limits.

Proof.

Let π\pi be a stationary measure, which exists by Proposition 6.1. Then π(n)=π(L)γL(n)+π(U)γU(n)>0\pi(n)=\pi(L)\gamma_{L}(n)+\pi(U)\gamma_{U}(n)>0, which implies that

π(U)π(L)>γL(n)γU(n),nIL,andπ(U)π(L)<γL(n)γU(n),nIU.\frac{\pi(U)}{\pi(L)}>-\frac{\gamma_{L}(n)}{\gamma_{U}(n)},\quad\forall n\in I_{L}^{-},\quad\text{and}\quad\frac{\pi(U)}{\pi(L)}<-\frac{\gamma_{L}(n)}{\gamma_{U}(n)},\quad\forall n\in I_{U}^{-}.

By taking the supremum and infimum, respectively, this further implies

r~1=sup{γL(n)γU(n):nIL}inf{γL(n)γU(n):nIU}=r~2,\widetilde{r}_{1}=\sup\left\{-\frac{\gamma_{L}(n)}{\gamma_{U}(n)}\colon n\in I_{L}^{-}\right\}\leq\inf\left\{-\frac{\gamma_{L}(n)}{\gamma_{U}(n)}\colon n\in I_{U}^{-}\right\}=\widetilde{r}_{2},

and that π(U)/π(L)\pi(U)/\pi(L) is in the interval [r~1,r~2][\widetilde{r}_{1},\widetilde{r}_{2}]. Note that r~2<\widetilde{r}_{2}<\infty and r~1>0\widetilde{r}_{1}>0, since IUI_{U}^{-} and ILI_{L}^{-} are both non-empty, using Theorem 6.3. For the reverse conclusion, for any invariant vector π\pi such that (6.7) holds, we have using Theorem 4.4,

π(n)=π(L)γL(n)+π(U)γU(n)0,n0,\pi(n)=\pi(L)\gamma_{L}(n)+\pi(U)\gamma_{U}(n)\geq 0,\quad\forall n\in\mathbb{N}_{0},

which implies that π\pi is a stationary measure, see Lemma 6.2.

Assume that either r~1\widetilde{r}_{1} or r~2\widetilde{r}_{2} is attained at some n0n\in\mathbb{N}_{0}. Then, there exists a stationary measure π\pi such that

π(U)π(L)=γL(n)γU(n),\frac{\pi(U)}{\pi(L)}=-\frac{\gamma_{L}(n)}{\gamma_{U}(n)},

that is, π(n)=π(L)γL(n)+π(U)γU(n)=0\pi(n)=\pi(L)\gamma_{L}(n)+\pi(U)\gamma_{U}(n)=0, contradicting the positivity of π\pi, see Lemma 6.2. Hence, ILI_{L}^{-} and IUI_{U}^{-} both contain infinitely many elements, and since neither r~1\widetilde{r}_{1} nor r~2\widetilde{r}_{2} is attained, the supremum and infimum can be replaced by lim sup\limsup and lim inf\liminf, respectively, to obtain r1r_{1} and r2r_{2} in (6.6). Hence, IL+I_{L}^{+} and IU+I_{U}^{+} are also infinite sets, since IUIL+I_{U}^{-}\subseteq I_{L}^{+} and ILIU+I_{L}^{-}\subseteq I_{U}^{+}.

For the second part, the bi-implication with r1=r2r_{1}=r_{2} is straightforward. Assume π\pi is a stationary measure such that π(L),π(U)>0\pi(L),\pi(U)>0. Note that

g1=lim infnILγU(n)γ(n)1>0,g2=lim infnIUγL(n)γ(n)1>0,g_{1}=\liminf_{n\in I^{-}_{L}}\frac{\gamma_{U}(n)}{\lVert\gamma(n)\rVert_{1}}>0,\quad g_{2}=\liminf_{n\in I^{-}_{U}}\frac{\gamma_{L}(n)}{\lVert\gamma(n)\rVert_{1}}>0,

as otherwise π(n)=π(L)γL(n)+π(U)γU(n)<0\pi(n)=\pi(L)\gamma_{L}(n)+\pi(U)\gamma_{U}(n)<0 for some large nn. For nILn\in I^{-}_{L},

(6.9) π(n)γ(n)1\displaystyle\frac{\pi(n)}{\lVert\gamma(n)\rVert_{1}} =π(L)γL(n)γ(n)1+π(U)γU(n)γ(n)1=(γL(n)γU(n)+π(U)π(L))π(L)γU(n)γ(n)1\displaystyle=\pi(L)\frac{\gamma_{L}(n)}{\lVert\gamma(n)\rVert_{1}}+\pi(U)\frac{\gamma_{U}(n)}{\lVert\gamma(n)\rVert_{1}}=\left(\frac{\gamma_{L}(n)}{\gamma_{U}(n)}+\frac{\pi(U)}{\pi(L)}\right)\frac{\pi(L)\gamma_{U}(n)}{\lVert\gamma(n)\rVert_{1}}
π(L)(g1ϵ)(γL(n)γU(n)+π(U)π(L))0,\displaystyle\geq\pi(L)(g_{1}-\epsilon)\left(\frac{\gamma_{L}(n)}{\gamma_{U}(n)}+\frac{\pi(U)}{\pi(L)}\right)\geq 0,

for some small ϵ>0\epsilon>0 and large nn. Similarly, for nIUn\in I^{-}_{U},

(6.10) π(n)γ(n)1\displaystyle\frac{\pi(n)}{\lVert\gamma(n)\rVert_{1}} =π(L)γL(n)γ(n)1+π(U)γU(n)γ(n)1=(π(L)π(U)+γU(n)γL(n))π(U)γL(n)γ(n)1\displaystyle=\pi(L)\frac{\gamma_{L}(n)}{\lVert\gamma(n)\rVert_{1}}+\pi(U)\frac{\gamma_{U}(n)}{\lVert\gamma(n)\rVert_{1}}=\left(\frac{\pi(L)}{\pi(U)}+\frac{\gamma_{U}(n)}{\gamma_{L}(n)}\right)\frac{\pi(U)\gamma_{L}(n)}{\lVert\gamma(n)\rVert_{1}}
π(U)(g2ϵ)(π(L)π(U)+γU(n)γL(n))0,\displaystyle\geq\pi(U)(g_{2}-\epsilon)\left(\frac{\pi(L)}{\pi(U)}+\frac{\gamma_{U}(n)}{\gamma_{L}(n)}\right)\geq 0,

for ϵ>0\epsilon>0 and large nn. Taking lim sup\limsup over nILn\in I^{-}_{L} in (6.9) and lim sup\limsup over nIUn\in I^{-}_{U} in (6.10), using (6.8), yields

r1=lim supnILγL(n)γU(n)=π(U)π(L),1r2=lim supnIUγU(n)γL(n)=π(L)π(U),r_{1}=-\limsup_{n\in I^{-}_{L}}\frac{\gamma_{L}(n)}{\gamma_{U}(n)}=\frac{\pi(U)}{\pi(L)},\quad\frac{1}{r_{2}}=-\limsup_{n\in I^{-}_{U}}\frac{\gamma_{U}(n)}{\gamma_{L}(n)}=\frac{\pi(L)}{\pi(U)},

or with lim sup\limsup replaced by lim inf\liminf. Hence, (6.6) holds with lim sup\limsup and lim inf\liminf replaced by lim\lim. It follows that r1=r2r_{1}=r_{2} and that π\pi is unique. For the reverse statement, change the first inequality sign \geq in (6.9) and (6.10) to \leq, and (g1ϵ)(g_{1}-\epsilon) and (g2ϵ)(g_{2}-\epsilon) to one, and the conclusion follows. ∎

7. Convex constrained optimization

When the Markov chain is sufficiently complex, an analytical expression for a stationary measure may not exist; indeed, availability of such an expression appears to be the exception rather than the rule. If an analytical expression is not available, one may instead find estimates of the relative sizes of π(L),,π(U)\pi(L),\ldots,\pi(U), which in turn determine π()\pi(\ell), 0\ell\geq 0, up to a scaling constant, by Theorem 4.4. If π\pi is a stationary distribution, this constant may then be found numerically by calculating the first nn probabilities π()\pi(\ell), =0,,n\ell=0,\ldots,n, for some large nn, and renormalizing to obtain a proper distribution. Here, we examine how one may proceed in practice.
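To make the procedure concrete, the sketch below reconstructs and renormalizes a measure from a generator vv and the sequences γj\gamma_{j}; the generator and the single γ\gamma-sequence used are hypothetical toy data, not derived from a concrete network.

```python
def measure_from_generator(v, gamma, n):
    """Return pi(0), ..., pi(n), where pi(l) = sum_j v[j] * gamma[j](l),
    renormalized so that the first n + 1 values sum to one."""
    pi = [sum(vj * g(l) for vj, g in zip(v, gamma)) for l in range(n + 1)]
    total = sum(pi)
    return [p / total for p in pi]

# Toy case with one generating term (L = U) and gamma_L(l) = 2**(-l):
pi = measure_from_generator([1.0], [lambda l: 0.5 ** l], 10)
```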

Theorem 7.1.

Assume (A1)-(A3) and for n0n\geq 0, let

(7.1) Kn={v0UL+1:j=LUvjγj()0,=0,,n,v1=1}.K_{n}=\left\{v\in\mathbb{R}^{U-L+1}_{\geq 0}\colon\sum_{j=L}^{U}v_{j}\gamma_{j}(\ell)\geq 0,\ \ell=0,\ldots,n,\ \lVert v\rVert_{1}=1\right\}.

Then, KnK_{n}\not=\emptyset, KnKn+1K_{n}\supseteq K_{n+1}, n0n\geq 0, and n=1Kn>0UL+1\cap_{n=1}^{\infty}K_{n}\subseteq\mathbb{R}^{U-L+1}_{>0} is non-empty and consists of all generators of non-zero stationary measures of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}, appropriately normalized. In fact, n=1Kn={vG0:v1=1}\cap_{n=1}^{\infty}K_{n}=\{v\in G_{0}\colon\lVert v\rVert_{1}=1\} with G0G_{0} as in Theorem 6.4.

Furthermore, there is a unique minimizer v(n)v^{(n)} to the following constrained quadratic optimization problem,

(7.2) minvKnv22.\min_{v\in K_{n}}\lVert v\rVert_{2}^{2}.

Moreover, the limit v=limnv(n)n=1Knv^{*}=\lim_{n\to\infty}v^{(n)}\in\cap_{n=1}^{\infty}K_{n} exists and equals the generator of a stationary measure π\pi of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}, that is, v=(π(L),,π(U))v^{*}=(\pi(L),\ldots,\pi(U)).

Proof.

The sets KnK_{n} are non-empty by Proposition 6.1 (the normalized generator of any stationary measure belongs to every KnK_{n}) and obviously non-increasing: KnKn+1K_{n}\supseteq K_{n+1}. Hence, since all KnK_{n} have a common element, the intersection n=1Kn\cap_{n=1}^{\infty}K_{n} is also non-empty. Lemma 6.2 rules out that the intersection contains boundary elements of 0UL+1\mathbb{R}^{U-L+1}_{\geq 0}.

Furthermore, the sets KnK_{n} are non-empty, closed and convex. Since 22\|\cdot\|_{2}^{2} is strictly convex, there exists a unique minimizer v(n)Knv^{(n)}\in K_{n} of the constrained optimization problem (7.2) [6, p. 137, or Ex. 8.1, p. 447].

Since the sets KnK_{n} are non-increasing, then v(n)v^{(n)} is non-decreasing in 2\ell_{2}-norm: v(n)22v(n+1)22\lVert v^{(n)}\rVert_{2}^{2}\leq\lVert v^{(n+1)}\rVert_{2}^{2} for all n1n\geq 1. Furthermore, since the sets KnK_{n} are closed subsets of the simplex and hence compact, any sequence v(n)v^{(n)}, n1n\geq 1, of minimizers has a converging subsequence v(nk)v^{(n_{k})}, k1k\geq 1, say v=limkv(nk)n=1Knv^{*}=\lim_{k\to\infty}v^{(n_{k})}\in\cap_{n=1}^{\infty}K_{n} (the intersection is closed). To show uniqueness, suppose there is another converging subsequence v(mk)v^{(m_{k})}, k1k\geq 1, such that v~=limkv(mk)n=1Kn\widetilde{v}^{*}=\lim_{k\to\infty}v^{(m_{k})}\in\cap_{n=1}^{\infty}K_{n} and vv~v^{*}\not=\widetilde{v}^{*}. Then, it follows that v22=v~22\lVert v^{*}\rVert_{2}^{2}=\lVert\widetilde{v}^{*}\rVert_{2}^{2}, since the norm is non-decreasing along the full sequence. By convexity of KnK_{n}, the intersection n=1Kn\cap_{n=1}^{\infty}K_{n} is convex and vα=αv+(1α)v~n=1Knv_{\alpha}^{*}=\alpha v^{*}+(1-\alpha)\widetilde{v}^{*}\in\cap_{n=1}^{\infty}K_{n} for α(0,1)\alpha\in(0,1). By strict convexity of the norm and vv~v^{*}\not=\widetilde{v}^{*}, then

(7.3) vα22<αv22+(1α)v~22=v22,α(0,1).\|v_{\alpha}^{*}\|_{2}^{2}<\alpha\|v^{*}\|_{2}^{2}+(1-\alpha)\|\widetilde{v}^{*}\|_{2}^{2}=\|v^{*}\|_{2}^{2},\quad\alpha\in(0,1).

Let vα(k)=αv(nk)+(1α)v(mk)v_{\alpha}^{(k)}=\alpha v^{(n_{k})}+(1-\alpha)v^{(m_{k})}. By convexity and monotonicity of KnK_{n}, we have

vα(k)Kmin{nk,mk}fork1,v_{\alpha}^{(k)}\in K_{\min\{n_{k},m_{k}\}}\quad\text{for}\quad k\geq 1,
vα(k)vαn=1Knfork.v_{\alpha}^{(k)}\to v_{\alpha}^{*}\in\cap_{n=1}^{\infty}K_{n}\quad\text{for}\quad k\to\infty.

Since vα(k)Kmin{nk,mk}v_{\alpha}^{(k)}\in K_{\min\{n_{k},m_{k}\}} and the minimal norm over KnK_{n} is non-decreasing in nn, we have vα(k)22min{v(nk)22,v(mk)22}\|v_{\alpha}^{(k)}\|_{2}^{2}\geq\min\{\|v^{(n_{k})}\|_{2}^{2},\|v^{(m_{k})}\|_{2}^{2}\}. Hence, vα22v22\|v_{\alpha}^{*}\|_{2}^{2}\geq\|v^{*}\|_{2}^{2}, contradicting (7.3). Consequently, v=v~v^{*}=\widetilde{v}^{*}.

Since vn=1Knv^{*}\in\cap_{n=1}^{\infty}K_{n}, then v=(π(L),,π(U))v^{*}=(\pi(L),\ldots,\pi(U)) for some non-zero stationary measure π\pi of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}. ∎

If the process is downwardly skip-free, then L=UL=U and π(L)\pi(L) may be set to one. Consequently, π()\pi(\ell), 0\ell\geq 0, can be found recursively from (4.13). Hence, it only makes sense to apply the optimization scheme for L<UL<U.
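As an illustration of the downwardly skip-free case, consider a birth–death process with birth rates b(n)b(n) and death rates d(n)d(n) (this toy example is ours; the recursion (4.13) itself is not reproduced here). Setting π(0)=1\pi(0)=1, flux balance between neighbouring states determines the rest:

```python
def stationary_birth_death(b, d, n):
    """Return pi(0), ..., pi(n) with pi(0) = 1, using the flux balance
    pi(x + 1) * d(x + 1) = pi(x) * b(x) across each edge (x, x + 1)."""
    pi = [1.0]
    for x in range(n):
        pi.append(pi[-1] * b(x) / d(x + 1))
    return pi

# Toy mass-action rates for 0 -> S, S -> 0: b(n) = kappa1, d(n) = kappa2 * n.
kappa1, kappa2 = 2.0, 1.0
pi = stationary_birth_death(lambda n: kappa1, lambda n: kappa2 * n, 20)
```

For these rates, π\pi is proportional to a Poisson measure with parameter κ1/κ2\kappa_{1}/\kappa_{2}.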

The quadratic objective function is chosen out of convenience to identify a unique element of the set KnK_{n}; any strictly convex function could be used for this purpose. If there exists a unique stationary measure, then one might simply choose an arbitrary element of each KnK_{n}, nn\in\mathbb{N}, as any such sequence of elements converges to the unique element of n=1Kn\cap_{n=1}^{\infty}K_{n}. If there is more than one stationary measure, different measures might in principle be found by varying the convex function. In case two different stationary measures are found, any linear combination with positive coefficients is also a stationary measure.
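In the smallest non-trivial case, UL=1U-L=1, problem (7.2) can be solved in closed form, which may serve as a sanity check for a general-purpose solver. Writing v=(1t,t)v=(1-t,t) with t[0,1]t\in[0,1], each constraint is linear in tt, so KnK_{n} corresponds to an interval of tt-values, and v22\lVert v\rVert_{2}^{2} is minimized at the feasible point closest to t=1/2t=1/2. A sketch with hypothetical γ\gamma-sequences (not computed from a concrete chain):

```python
def minimize_2d(gammaL, gammaU, n):
    """Solve min ||v||_2^2 over K_n for U - L = 1.  With v = (1 - t, t),
    each constraint (1 - t) * gammaL(l) + t * gammaU(l) >= 0 restricts t
    to a half-line; the feasible set is an interval [lo, hi], and the
    minimizer is the feasible t closest to 1/2."""
    lo, hi = 0.0, 1.0
    for l in range(n + 1):
        a, c = gammaL(l), gammaU(l) - gammaL(l)  # constraint: a + t*c >= 0
        if c > 0:
            lo = max(lo, -a / c)
        elif c < 0:
            hi = min(hi, -a / c)
        elif a < 0:
            return None  # constraint violated for every t
    if lo > hi:
        return None  # K_n is empty
    t = min(max(0.5, lo), hi)
    return (1.0 - t, t)

# Toy sequences: gammaL(l) = 1 - l, gammaU(l) = l / 2.
v = minimize_2d(lambda l: 1.0 - l, lambda l: l / 2.0, 20)
```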

In practice, the convex constrained optimization approach outlined in Theorem 7.1 often fails already for moderately large nn, see the example in Section 9. This is primarily because the coefficients γj()\gamma_{j}(\ell) become exponentially large with alternating signs, and because numerical evaluation of probabilities close to zero might return small negative values, thereby violating the non-negativity constraints of the convex constrained optimization problem. The numerical difficulties in verifying the non-negativity constraints are most severe for large nn, in particular if π(n)\pi(n) vanishes for large nn. To address these problems, we investigate an alternative approach to the optimization problem.

Lemma 7.2.

Assume (A1)-(A3). Define the sets MnM_{n}, nUn\geq U, by

Mn\displaystyle M_{n} ={v0UL+1:j=LUvjγj()=0for=n(UL)+1,,n,v1=1}.\displaystyle=\left\{v\in\mathbb{R}^{U-L+1}_{\geq 0}\colon\sum_{j=L}^{U}v_{j}\gamma_{j}(\ell)=0\,\,\text{for}\,\,\ell=n-(U-L)+1,\ldots,n,\ \lVert v\rVert_{1}=1\right\}.

If MnM_{n}\not=\emptyset, then there is a unique minimizer w(n)w^{(n)} to the following constrained quadratic optimization problem,

minvMnv22.\min_{v\in M_{n}}\lVert v\rVert_{2}^{2}.

Moreover, if ω+=1\omega_{+}=1, then MnM_{n} is a singleton set, and MnKnM_{n}\subseteq K_{n} for nUn\geq U, where KnK_{n} is as in (7.1). Furthermore, if there exists a unique stationary measure π\pi of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}, then w=limnw(n)n=1Knw^{*}=\lim_{n\to\infty}w^{(n)}\in\cap_{n=1}^{\infty}K_{n} exists. In particular, ww^{*} equals the generator of π\pi, that is, w=(π(L),,π(U))w^{*}=(\pi(L),\ldots,\pi(U)), appropriately normalized.

Proof.

Existence of the minimizer follows as in the proof of Theorem 7.1. If ω+=1\omega_{+}=1, then it follows from Proposition 7.3 below that MnM_{n} is a singleton set for nUn\geq U. It follows from Lemma 6.5 that MnKnM_{n}\subseteq K_{n}. Since w(n)MnKnw^{(n)}\in M_{n}\subseteq K_{n}, and n=1Kn\cap_{n=1}^{\infty}K_{n} contains only the generator of the unique stationary measure π\pi, then w=limnw(n)w^{*}=\lim_{n\to\infty}w^{(n)} exists and equals the generator of π\pi. ∎

We refer to the optimization problem outlined in Lemma 7.2 as the linear approximation scheme. For ω+=1\omega_{+}=1, a solution v(n)v^{(n)} to the linear approximation scheme automatically fulfils ν()=j=LUvj(n)γj()0\nu(\ell)=\sum_{j=L}^{U}v^{(n)}_{j}\gamma_{j}(\ell)\geq 0 for all =0,,n\ell=0,\ldots,n. In general, these inequalities need to be validated separately.
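Since the equalities defining MnM_{n} together with the normalization form a square linear system (cf. Proposition 7.3 below), the linear approximation scheme amounts to a single linear solve. The sketch below uses plain Gauss-Jordan elimination; the γ\gamma-values are toy numbers, not values computed from a concrete chain.

```python
def solve_linear_scheme(gamma_rows):
    """Solve A v = (0, ..., 0, 1)^T, where the rows of A are the given
    gamma-rows followed by a row of ones, by Gauss-Jordan elimination
    with partial pivoting."""
    m = len(gamma_rows[0])
    a = [list(r) + [0.0] for r in gamma_rows] + [[1.0] * m + [1.0]]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(m):
            if r != col and a[col][col] != 0.0:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][m] / a[i][i] for i in range(m)]

# Toy case with U - L = 1: the single equality -2 v_L + 3 v_U = 0,
# together with v_L + v_U = 1, gives v = (3/5, 2/5).
v = solve_linear_scheme([[-2.0, 3.0]])
```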

Proposition 7.3.

If ω+=1\omega_{+}=1, then MnM_{n}, nUn\geq U, is a singleton set.

Proof.

For nUn\geq U, let G(n)G(n) be the (UL+1)×(UL+1)(U-L+1)\times(U-L+1) matrix,

G(n)=(γL(n(UL1))γL+1(n(UL1))γU(n(UL1))γL(n1)γL+1(n1)γU(n1)γL(n)γL+1(n)γU(n)111),G(n)=\begin{pmatrix}\gamma_{L}(n-(U-L-1))&\gamma_{L+1}(n-(U-L-1))&\cdots&\gamma_{U}(n-(U-L-1))\\ \vdots&\vdots&&\vdots\\ \gamma_{L}(n-1)&\gamma_{L+1}(n-1)&\cdots&\gamma_{U}(n-1)\\ \gamma_{L}(n)&\gamma_{L+1}(n)&\cdots&\gamma_{U}(n)\\ 1&1&\cdots&1\end{pmatrix},

and let ci(n)c_{i}(n) be the cofactor of G(n)G(n) corresponding to the (UL+1)(U-L+1)-th row and ii-th column, for i=1,,UL+1i=1,\ldots,U-L+1. Then, det(G(n))=i=1UL+1ci(n)\det(G(n))=\sum_{i=1}^{U-L+1}c_{i}(n), and there exists a unique solution v(n)v^{(n)} to G(n)v=eUL+1G(n)v=e_{U-L+1}, where eUL+1e_{U-L+1} is the (UL+1)(U-L+1)-th unit vector in UL+1\mathbb{R}^{U-L+1}, if and only if det(G(n))0\det(G(n))\neq 0. If this is the case, then by Cramer’s rule,

vi(n)=ci(n)j=1UL+1cj(n),fori=1,,UL+1,v_{i}^{(n)}=\frac{c_{i}(n)}{\sum_{j=1}^{U-L+1}c_{j}(n)},\quad\text{for}\quad i=1,\ldots,U-L+1,

and hence v(n)0UL+1v^{(n)}\in\mathbb{R}^{U-L+1}_{\geq 0} if and only if all cofactors have the same sign or are zero. Hence, we aim to show at least one cofactor is non-zero and that all non-zero cofactors have the same sign. In the following, for convenience, we say the elements of a sequence a1,,ama_{1},\ldots,a_{m} have the same sign if all non-zero elements of the sequence have the same sign.

For nUn\geq U, define the (UL+2)×(UL+1)(U-L+2)\times(U-L+1) matrix

Γ(n)=(γL(n(UL))γL+1(n(UL))γU(n(UL))γL(n1)γL+1(n1)γU(n1)γL(n)γL+1(n)γU(n)111),\Gamma(n)=\begin{pmatrix}\gamma_{L}(n-(U-L))&\gamma_{L+1}(n-(U-L))&\cdots&\gamma_{U}(n-(U-L))\\ \vdots&\vdots&&\vdots\\ \gamma_{L}(n-1)&\gamma_{L+1}(n-1)&\cdots&\gamma_{U}(n-1)\\ \gamma_{L}(n)&\gamma_{L+1}(n)&\cdots&\gamma_{U}(n)\\ 1&1&\cdots&1\end{pmatrix},

and the (UL+1)×(UL+1)(U-L+1)\times(U-L+1) matrices Γ(n)\Gamma^{\ell}(n), =0,,UL\ell=0,\ldots,U-L, by removing row m=UL+1m=U-L+1-\ell of Γ(n)\Gamma(n) (that is, the (+1)(\ell+1)-th row counting from the bottom, disregarding the row of ones). For notational convenience, noting that the columns of Γ(n)\Gamma(n) take a similar form, we write these matrices as

Γ(n)=(γj(n(UL))γj(n(+1))γj(n(1))γj(n)1),=0,,UL.\Gamma^{\ell}(n)=\begin{pmatrix}\gamma_{j}(n-(U-L))\\ \vdots\\ \gamma_{j}(n-(\ell+1))\\ \gamma_{j}(n-(\ell-1))\\ \vdots\\ \gamma_{j}(n)\\ 1\end{pmatrix},\quad\ell=0,\ldots,U-L.

Note that

(7.4) G(n)=ΓUL(n),andΓ0(n)=ΓUL(n1).G(n)=\Gamma^{U-L}(n),\quad\text{and}\quad\Gamma^{0}(n)=\Gamma^{U-L}(n-1).

Let Γi(n)\Gamma^{\ell}_{i}(n) be the (UL)×(UL)(U-L)\times(U-L) matrix obtained by removing the bottom row and the ii-th column from Γ(n)\Gamma^{\ell}(n). Hence, the cofactor Ci(n)C^{\ell}_{i}(n) of Γ(n)\Gamma^{\ell}(n) corresponding to the (UL+1)(U-L+1)-th row and ii-th column is

(7.5) Ci(n)=(1)UL+1+idet(Γi(n)),andCiUL(n)=ci(n),C^{\ell}_{i}(n)=(-1)^{U-L+1+i}\det(\Gamma^{\ell}_{i}(n)),\quad\text{and}\quad C^{U-L}_{i}(n)=c_{i}(n),

i=1,,UL+1i=1,\ldots,U-L+1. By induction, we will show that the signs of Ci(n)C^{\ell}_{i}(n), i=1,,UL+1i=1,\ldots,U-L+1, are the same, potentially with some cofactors being zero, but at least one being non-zero.

Induction basis. For n=Un=U, we have

Γ(U)=(10000100001000011111),\Gamma(U)=\begin{pmatrix}1&0&\cdots&0&0\\ 0&1&\cdots&0&0\\ \vdots&\vdots&&\vdots&\vdots\\ 0&0&\cdots&1&0\\ 0&0&\cdots&0&1\\ 1&1&\cdots&1&1\end{pmatrix},

and it follows by tedious calculation that

CUL+1(U)=(1),Ci(U)=0forUL+1i,C^{\ell}_{U-L+1-\ell}(U)=(-1)^{\ell},\quad C^{\ell}_{i}(U)=0\quad\text{for}\quad\ell\not=U-L+1-i,

i=1,,UL+1i=1,\ldots,U-L+1 and =0,,UL\ell=0,\ldots,U-L. It follows that the CiC^{\ell}_{i}s have the same sign for \ell fixed and all i=1,,UL+1i=1,\ldots,U-L+1 (all cofactors are zero, except for i=UL+1i=U-L+1-\ell).

We postulate that the non-zero elements fulfil

sign(Ci(n))=(1)(nU)(UL)+,fornU,\text{sign}(C^{\ell}_{i}(n))=(-1)^{(n-U)(U-L)+\ell},\quad\text{for}\quad n\geq U,

and =0,,UL\ell=0,\ldots,U-L, i=1,,UL+1i=1,\ldots,U-L+1. The hypothesis holds for n=Un=U.

Induction step. Assume the statement is correct for some nUn\geq U. Using m=ω+ω1=UL+1m_{*}=\omega_{+}-\omega_{-}-1=U-L+1 and (4.4), we obtain for =1,,UL\ell=1,\ldots,U-L (excluding =0\ell=0),

Γ(n+1)=(γj(n+1(UL))γj(n+1(+1))γj(n+1(1))γj(n)γj(n+1)1)=(γj(n+1(UL))γj(n+1(+1))γj(n+1(1))γj(n)k=1UL+1γj(n+1k)fk(n+1)1).\Gamma^{\ell}(n+1)=\begin{pmatrix}\gamma_{j}(n+1-(U-L))\\ \vdots\\ \gamma_{j}(n+1-(\ell+1))\\ \gamma_{j}(n+1-(\ell-1))\\ \vdots\\ \gamma_{j}(n)\\ \gamma_{j}(n+1)\\ 1\end{pmatrix}=\begin{pmatrix}\gamma_{j}(n+1-(U-L))\\ \vdots\\ \gamma_{j}(n+1-(\ell+1))\\ \gamma_{j}(n+1-(\ell-1))\\ \vdots\\ \gamma_{j}(n)\\ \sum_{k=1}^{U-L+1}\gamma_{j}(n+1-k)f_{k}(n+1)\\ 1\end{pmatrix}.

Hence, using the linearity of the determinant, we obtain for 0<UL0<\ell\leq U-L (excluding =0\ell=0),

det(Γi(n+1))\displaystyle\det(\Gamma_{i}^{\ell}(n+1)) =|γj(n+1(UL))γj(n+1(+1))γj(n+1(1))γj(n)γj(n+1)f(n+1)|+|γj(n+1(UL))γj(n+1(+1))γj(n+1(1))γj(n)γj(n+1(UL+1))fUL+1(n+1)|\displaystyle=\begin{vmatrix}[c]\gamma_{j}(n+1-(U-L))\\ \vdots\\ \gamma_{j}(n+1-(\ell+1))\\ \gamma_{j}(n+1-(\ell-1))\\ \vdots\\ \gamma_{j}(n)\\ \gamma_{j}(n+1-\ell)f_{\ell}(n+1)\end{vmatrix}+\begin{vmatrix}[c]\gamma_{j}(n+1-(U-L))\\ \vdots\\ \gamma_{j}(n+1-(\ell+1))\\ \gamma_{j}(n+1-(\ell-1))\\ \vdots\\ \gamma_{j}(n)\\ \gamma_{j}(n+1-(U-L+1))f_{U-L+1}(n+1)\end{vmatrix}
=f(n+1)|γj(n+1(UL))γj(n+1(+1))γj(n+1(1))γj(n)γj(n+1)|+fUL+1(n+1)|γj(n+1(UL))γj(n+1(+1))γj(n+1(1))γj(n)γj(n+1(UL+1))|\displaystyle=f_{\ell}(n+1)\begin{vmatrix}[c]\gamma_{j}(n+1-(U-L))\\ \vdots\\ \gamma_{j}(n+1-(\ell+1))\\ \gamma_{j}(n+1-(\ell-1))\\ \vdots\\ \gamma_{j}(n)\\ \gamma_{j}(n+1-\ell)\end{vmatrix}+f_{U-L+1}(n+1)\begin{vmatrix}[c]\gamma_{j}(n+1-(U-L))\\ \vdots\\ \gamma_{j}(n+1-(\ell+1))\\ \gamma_{j}(n+1-(\ell-1))\\ \vdots\\ \gamma_{j}(n)\\ \gamma_{j}(n+1-(U-L+1))\end{vmatrix}
=f(n+1)(1)1|γj(n(UL1))γj(n)|\displaystyle=f_{\ell}(n+1)(-1)^{\ell-1}\begin{vmatrix}[c]\gamma_{j}(n-(U-L-1))\\ \vdots\\ \gamma_{j}(n)\end{vmatrix}
+fUL+1(n+1)(1)UL1|γj(n(UL))γj(n)γj(n(2))γj(n)|\displaystyle\qquad\qquad+f_{U-L+1}(n+1)(-1)^{U-L-1}\begin{vmatrix}[c]\gamma_{j}(n-(U-L))\\ \vdots\\ \gamma_{j}(n-\ell)\\ \gamma_{j}(n-(\ell-2))\\ \vdots\\ \gamma_{j}(n)\end{vmatrix}
=f(n+1)(1)1det(ΓiUL(n))+fUL+1(n+1)(1)UL1det(Γi1(n))\displaystyle=f_{\ell}(n+1)(-1)^{\ell-1}\det(\Gamma_{i}^{U-L}(n))+f_{U-L+1}(n+1)(-1)^{U-L-1}\det(\Gamma_{i}^{\ell-1}(n))

where the remaining terms in the linear expansion of the determinant vanish. In the computation of the determinant above, we abuse notation and write γj(n+1k)\gamma_{j}(n+1-k) for the row vector with the ii-th coordinate deleted. For =0\ell=0, using (7.4),

det(Γi0(n+1))=det(ΓiUL(n)).\det(\Gamma^{0}_{i}(n+1))=\det(\Gamma^{U-L}_{i}(n)).

The above conclusions result in the following for the sign of the cofactors, using (7.5):

(7.6) Ci(n+1)\displaystyle C_{i}^{\ell}(n+1) =f(n+1)(1)1CiUL(n)\displaystyle=f_{\ell}(n+1)(-1)^{\ell-1}C_{i}^{U-L}(n)
+fUL+1(n+1)(1)UL1Ci1(n),0<UL,\displaystyle\quad\quad+f_{U-L+1}(n+1)(-1)^{U-L-1}C_{i}^{\ell-1}(n),\quad 0<\ell\leq U-L,
(7.7) Ci0(n+1)\displaystyle C^{0}_{i}(n+1) =CiUL(n).\displaystyle=C^{U-L}_{i}(n).

We recall some properties of f(n)f_{\ell}(n). According to Lemma A.1(vi), fUL+1(n)>0f_{U-L+1}(n)>0 for nU+1n\geq U+1, using 𝗂+=0{\sf i}_{+}=0 (otherwise zero is a trapping state) and ω=UL+1-\omega_{-}=U-L+1. For 0<ω0\leq\ell<-\omega_{-}, we have sgn(ω++1/2)=1\text{sgn}(\omega_{-}+\ell+1/2)=-1, and hence f(n)0f_{\ell}(n)\leq 0 for nU+1n\geq U+1, according to Lemma A.1(vii)-(viii). Consequently, the signs of the two terms in (7.6) are:

(1)(1)1(1)(UL)(nU)+(UL)=(1)(UL)(n+1U)+,(-1)(-1)^{\ell-1}(-1)^{(U-L)(n-U)+(U-L)}=(-1)^{(U-L)(n+1-U)+\ell},
(+1)(1)UL+1(1)(UL)(nU)+1=(1)(UL)(n+1U)+,(+1)(-1)^{U-L+1}(-1)^{(U-L)(n-U)+\ell-1}=(-1)^{(U-L)(n+1-U)+\ell},

hence the sign of Ci(n+1)C_{i}^{\ell}(n+1) corroborates the induction hypothesis. The sign of the term in (7.7) is

(1)UL(1)(UL)(nU)=(1)(UL)(n+1U)+0,(-1)^{U-L}(-1)^{(U-L)(n-U)}=(-1)^{(U-L)(n+1-U)+0},

again in agreement with the induction hypothesis.

It remains to show that at least one cofactor is non-zero, that is, CiUL(n)0C_{i}^{U-L}(n)\neq 0 for at least one 1iUL+11\leq i\leq U-L+1 and nUn\geq U. Let an,=i=1UL+1|Ci(n)|a_{n,\ell}=\sum_{i=1}^{U-L+1}|C_{i}^{\ell}(n)|. From (7.4) and (7.6), we have

(7.8) an+1,\displaystyle a_{n+1,\ell} =|f(n+1)|an,UL+|fUL+1(n+1)|an,1,1UL,\displaystyle=|f_{\ell}(n+1)|a_{n,U-L}+|f_{U-L+1}(n+1)|a_{n,\ell-1},\quad 1\leq\ell\leq U-L,
an+1,0\displaystyle a_{n+1,0} =an,UL,\displaystyle=a_{n,U-L},

for nU.n\geq U. We show by induction that an,0a_{n,\ell}\not=0 for nUn\geq U and 0UL0\leq\ell\leq U-L. Hence, the desired conclusion follows. For n=Un=U, we have aU,=1a_{U,\ell}=1 for all =0,,UL\ell=0,\ldots,U-L. Assume an,0a_{n,\ell}\not=0 for =0,,UL\ell=0,\ldots,U-L and some nUn\geq U. Since fUL+1(n+1)>0f_{U-L+1}(n+1)>0 for n+1U+1n+1\geq U+1, then it follows from (7.8) that an+1,0a_{n+1,\ell}\not=0 for =0,,UL\ell=0,\ldots,U-L. The proof is completed. ∎

8. The generating function

In this and the next section, we consider one-species SRNs mainly with mass-action kinetics, without assuming (𝐀𝟑{\bf A3}). That is, we allow for multiple PICs indexed by s=max{𝗂+,𝗈},,𝗈+ω1s=\max\{{\sf i}_{+},{\sf o}_{-}\},\ldots,{\sf o}_{-}+\omega_{*}-1 and general ω1\omega_{*}\geq 1. For a stationary measure π\pi of (Ω,)(\Omega,\mathcal{F}) on a PIC ω0+s\omega_{*}\mathbb{N}_{0}+s (for some ss), we define

(8.1) πs()=π(s+ω),for0.\pi_{s}(\ell)=\pi(s+\ell\omega_{*}),\quad\text{for}\quad\ell\in\mathbb{N}_{0}.

Then, πs\pi_{s} is a stationary measure of (Ω,s)(\Omega_{*},\mathcal{F}_{s}) on 0\mathbb{N}_{0} for the translated Markov chain as in Section 4. Similarly, we add the superscript ss to γj(),ck(),fk()\gamma_{j}(\ell),c_{k}(\ell),f_{k}(\ell), whenever convenient. These quantities are already expressed in terms of the translated Markov chain.

For one-species SRNs with mass-action kinetics, one might represent the stationary distribution (when it exists) through the probability generating function (PGF). Recall that for a random variable XX on 0\mathbb{N}_{0} with law μ\mu, the PGF is:

g(z)=𝔼(zX)=x0μ(x)zx,|z|1,g(z)=\mathbb{E}(z^{X})=\sum_{x\in\mathbb{N}_{0}}\mu(x)z^{x},\quad|z|\leq 1,

such that

μ(x)=g(x)(0)x!,x0,\mu(x)=\frac{g^{(x)}(0)}{x!},\quad x\in\mathbb{N}_{0},

where g(x)g^{(x)} is the xx-th derivative of gg.
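As an aside (not part of the paper's development), the relation μ(x)=g^{(x)}(0)/x! can be exploited numerically: sampling the PGF on the unit circle and applying an inverse discrete Fourier transform recovers the pmf, with the error at x given by the aliased tail mass Σ_{m≥1} μ(x+mN). A short sketch for the Poisson PGF g(z)=e^{λ(z-1)}; the function name is our own.

```python
import cmath, math

def pmf_from_pgf(g, N=128):
    """Recover mu(0), ..., mu(N-1) from PGF samples on the unit circle
    via an inverse DFT; the error at x is the tail mass sum_{m>=1} mu(x+mN)."""
    samples = [g(cmath.exp(2j * math.pi * k / N)) for k in range(N)]
    return [
        (sum(s * cmath.exp(-2j * math.pi * k * x / N)
             for k, s in enumerate(samples)) / N).real
        for x in range(N)
    ]

lam = 2.0
g = lambda z: cmath.exp(lam * (z - 1))   # Poisson(lam) PGF
mu = pmf_from_pgf(g, N=64)

# agrees with the Poisson pmf lam^x e^{-lam} / x!
for x in range(10):
    assert abs(mu[x] - lam**x * math.exp(-lam) / math.factorial(x)) < 1e-10
```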

Proposition 8.1.

Let a one-species SRN fulfilling (A1) with mass-action kinetics be given, hence (A2) also holds. Assume there exists a stationary distribution π\pi on ω0+s\omega_{*}\mathbb{N}_{0}+s, for some s{max{𝗂,𝗈},,𝗈+ω1}s\in\{\max\{{\sf i},{\sf o}\},\ldots,{\sf o}+\omega_{*}-1\}. Let πs()=π(ω+s)\pi_{s}(\ell)=\pi(\omega_{*}\ell+s), 0\ell\in\mathbb{N}_{0}, be the corresponding distribution on 0\mathbb{N}_{0} and gsg_{s} the PGF of πs\pi_{s}. Then, gsg_{s} solves the following differential equation:

ykykκk(zykzyk)gs(yk)(z)=0,|z|1.\sum_{y_{k}\to y_{k}^{\prime}\in\mathcal{R}}\kappa_{k}(z^{y_{k}^{\prime}}-z^{y_{k}})g_{s}^{(y_{k})}(z)=0,\quad|z|\leq 1.
Proof.

Recall that πs\pi_{s} is an equilibrium of the master equation:

ykykλk(s+xωξk)πs(xξkω1)ykykλk(s+xω)πs(x)=0.\displaystyle\sum_{y_{k}\to y_{k}^{\prime}\in\mathcal{R}}\lambda_{k}(s+x\omega_{*}-\xi_{k})\pi_{s}(x-\xi_{k}\omega_{*}^{-1})-\sum_{y_{k}\to y_{k}^{\prime}\in\mathcal{R}}\lambda_{k}(s+x\omega_{*})\pi_{s}(x)=0.

Multiplying through by zs+xωz^{s+x\omega_{*}} and summing over x0x\in\mathbb{N}_{0}, we obtain

ykykx=0λk(s+xωξk)πs(xξkω1)zs+xωykykx=0λk(s+xω)πs(x)zs+xω=0.\displaystyle\sum_{y_{k}\to y_{k}^{\prime}\in\mathcal{R}}\sum_{x=0}^{\infty}\lambda_{k}(s+x\omega_{*}-\xi_{k})\pi_{s}(x-\xi_{k}\omega_{*}^{-1})z^{s+x\omega_{*}}-\sum_{y_{k}\to y_{k}^{\prime}\in\mathcal{R}}\sum_{x=0}^{\infty}\lambda_{k}(s+x\omega_{*})\pi_{s}(x)z^{s+x\omega_{*}}=0.

Using the form of mass-action kinetics, and rearranging terms,

ykykκkzykx=0(s+xω)yk¯πs(x)zs+xωykykykκkzykx=0(s+xω)yk¯πs(x)zs+xωyk=0,\displaystyle\sum_{y_{k}\to y_{k}^{\prime}\in\mathcal{R}}\kappa_{k}z^{y_{k}^{\prime}}\sum_{x=0}^{\infty}(s+x\omega_{*})^{\underline{y_{k}}}\pi_{s}(x)z^{s+x\omega_{*}-y_{k}}-\sum_{y_{k}\to y_{k}^{\prime}\in\mathcal{R}}\kappa_{k}z^{y_{k}}\sum_{x=0}^{\infty}(s+x\omega_{*})^{\underline{y_{k}}}\pi_{s}(x)z^{s+x\omega_{*}-y_{k}}=0,

where xy¯=x(x1)(xy+1)x^{\underline{y}}=x(x-1)\ldots(x-y+1). Here, we have used that πs(n)=0\pi_{s}(n)=0 for n<0n<0, and if ξk<0\xi_{k}<0, then sξkω𝗈1ξk<ykξk=yks-\xi_{k}-\omega_{*}\leq{\sf o}-1-\xi_{k}<y_{k}^{\prime}-\xi_{k}=y_{k}, which implies that (s+nω)yk¯=0(s+n\omega_{*})^{\underline{y_{k}}}=0 for 0n<ξkω10\leq n<-\xi_{k}\omega_{*}^{-1}. As gs(z)=x=0πs(x)zs+xωg_{s}(z)=\sum_{x=0}^{\infty}\pi_{s}(x)z^{s+x\omega_{*}}, we may write this as

ykykκkzykgs(yk)(z)ykykκkzykgs(yk)(z)=0,\displaystyle\sum_{y_{k}\to y_{k}^{\prime}\in\mathcal{R}}\kappa_{k}z^{y_{k}^{\prime}}g_{s}^{(y_{k})}(z)-\sum_{y_{k}\to y_{k}^{\prime}\in\mathcal{R}}\kappa_{k}z^{y_{k}}g_{s}^{(y_{k})}(z)=0,

which, upon collecting terms, yields the desired equation. ∎

Since

πs(j)=gs(s+jω)(0)(s+jω)!,for allj0,\pi_{s}(j)=\frac{g_{s}^{(s+j\omega_{*})}(0)}{(s+j\omega_{*})!},\quad\text{for all}\quad j\in\mathbb{N}_{0},

we arrive at the following alternative expression for the stationary distribution.

Corollary 8.2.

Assume as in Proposition 8.1. Then

πs()=j=LsUsgs(s+jω)(0)(s+jω)!γjs(),0,\pi_{s}(\ell)=\sum_{j=L_{s}}^{U_{s}}\frac{g_{s}^{(s+j\omega_{*})}(0)}{(s+j\omega_{*})!}\gamma^{s}_{j}(\ell),\quad\ell\in\mathbb{N}_{0},

where γjs\gamma^{s}_{j} is defined as in (4.4), and πs\pi_{s} as in (8.1).

Example 8.3.

Consider the SRN with mass-action kinetics,

0\ce>[κ1]S\ce>[κ2]2S\ce>[κ3]0,0\ce{->[\kappa_{1}]}\text{S}\ce{->[\kappa_{2}]}2\text{S}\ce{->[\kappa_{3}]}0,

with κ1,κ2,κ3>0\kappa_{1},\kappa_{2},\kappa_{3}>0. This is an upwardly skip-free process: ω=1\omega_{*}=1, ω+=1\omega_{+}=1, ω=2\omega_{-}=-2, s=0s=0, L0=0L_{0}=0, U0=1U_{0}=1.

Applying Proposition 8.1, the governing differential equation of the PGF is,

κ3(1z2)g0′′(z)+κ2(z2z)g0(z)+κ1(z1)g0(z)=0,|z|1,\kappa_{3}(1-z^{2})g^{\prime\prime}_{0}(z)+\kappa_{2}(z^{2}-z)g_{0}^{\prime}(z)+\kappa_{1}(z-1)g_{0}(z)=0,\quad|z|\leq 1,

that is,

(8.2) (1+z)g0′′(z)κ2κ3zg0(z)κ1κ3g0(z)=0.(1+z)g^{\prime\prime}_{0}(z)-\frac{\kappa_{2}}{\kappa_{3}}zg_{0}^{\prime}(z)-\frac{\kappa_{1}}{\kappa_{3}}g_{0}(z)=0.

Let z~=κ2κ3(z+1)\tilde{z}=\frac{\kappa_{2}}{\kappa_{3}}(z+1) and h(z~)=g0(z)h(\tilde{z})=g_{0}(z), then (8.2) yields the Kummer differential equation [1]:

z~h′′(z~)+(κ2κ3z~)h(z~)κ1κ2h(z~)=0.\tilde{z}h^{\prime\prime}(\tilde{z})+\left(\frac{\kappa_{2}}{\kappa_{3}}-\tilde{z}\right)h^{\prime}(\tilde{z})-\frac{\kappa_{1}}{\kappa_{2}}h(\tilde{z})=0.

Taking g0(1)=1g_{0}(1)=1,

g0(z)=F11(κ1κ2;κ2κ3;(z+1)κ2κ3)F11(κ1κ2;κ2κ3;2κ2κ3),g_{0}(z)=\frac{{}_{1}F_{1}\!\left(\frac{\kappa_{1}}{\kappa_{2}};\frac{\kappa_{2}}{\kappa_{3}};\frac{(z+1)\kappa_{2}}{\kappa_{3}}\right)}{{}_{1}F_{1}\!\left(\frac{\kappa_{1}}{\kappa_{2}};\frac{\kappa_{2}}{\kappa_{3}};\frac{2\kappa_{2}}{\kappa_{3}}\right)},

where

F11(a;b;z)=n=0an¯znbn¯n!,{}_{1}F_{1}(a;b;z)=\sum_{n=0}^{\infty}\frac{a^{\overline{n}}z^{n}}{b^{\overline{n}}n!},

is the confluent hypergeometric function (also called Kummer’s function) of the first kind [1], and an¯=Γ(a+n)Γ(a)a^{\overline{n}}=\frac{\Gamma(a+n)}{\Gamma(a)} is the ascending factorial. Using the relation between Kummer’s function and its derivatives

\frac{d^{n}{}_{1}F_{1}(a;b;z)}{dz^{n}}=\frac{a^{\overline{n}}}{b^{\overline{n}}}\,{}_{1}F_{1}(a+n;b+n;z),\quad n\in\mathbb{N},

it follows that

π(x)=g0(x)(0)x!=1x!(κ2κ3)xΓ(κ2κ3)Γ(κ1κ2)Γ(x+κ1κ2)Γ(x+κ2κ3)F11(x+κ1κ2;x+κ2κ3;κ2κ3)F11(κ1κ2;κ2κ3;2κ2κ3).\pi(x)=\frac{g_{0}^{(x)}(0)}{x!}=\frac{1}{x!}\left(\frac{\kappa_{2}}{\kappa_{3}}\right)^{\!x}\frac{\Gamma(\frac{\kappa_{2}}{\kappa_{3}})}{\Gamma(\frac{\kappa_{1}}{\kappa_{2}})}\frac{\Gamma(x+\frac{\kappa_{1}}{\kappa_{2}})}{\Gamma(x+\frac{\kappa_{2}}{\kappa_{3}})}\frac{{}_{1}F_{1}(x+\frac{\kappa_{1}}{\kappa_{2}};x+\frac{\kappa_{2}}{\kappa_{3}};\frac{\kappa_{2}}{\kappa_{3}})}{{}_{1}F_{1}(\frac{\kappa_{1}}{\kappa_{2}};\frac{\kappa_{2}}{\kappa_{3}};\frac{2\kappa_{2}}{\kappa_{3}})}.

In particular, κ1κ3=κ22\kappa_{1}\kappa_{3}=\kappa_{2}^{2} if and only if π\pi is a Poisson distribution,

π(x)=1x!(κ1κ2)xeκ1κ2.\pi(x)=\frac{1}{x!}\left(\frac{\kappa_{1}}{\kappa_{2}}\right)^{\!x}e^{-\frac{\kappa_{1}}{\kappa_{2}}}.

When this is the case, the reaction network is complex balanced and the form of the stationary distribution is well known [2].
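As a numerical sanity check (assuming only the series definition of {}_{1}F_{1} above), one can evaluate the closed form for π(x) and compare it with the Poisson distribution in the complex balanced case κ1κ3=κ2². A Python sketch; the helper names are our own.

```python
import math

def hyp1f1(a, b, z, terms=200):
    """Kummer's function 1F1(a;b;z) via its series; adequate for moderate z."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * z / ((b + n) * (n + 1))
    return total

def pi_closed_form(x, k1, k2, k3):
    # the closed form for pi(x) derived above, with a = k1/k2, b = k2/k3
    a, b = k1 / k2, k2 / k3
    return (b**x / math.factorial(x)
            * math.gamma(b) / math.gamma(a)
            * math.gamma(x + a) / math.gamma(x + b)
            * hyp1f1(x + a, x + b, b) / hyp1f1(a, b, 2 * b))

k1, k2, k3 = 4.0, 2.0, 1.0        # k1 * k3 == k2**2: complex balanced case
lam = k1 / k2
for x in range(15):
    poisson = lam**x * math.exp(-lam) / math.factorial(x)
    assert abs(pi_closed_form(x, k1, k2, k3) - poisson) < 1e-9
```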

On the other hand,

g0(0)=F11(κ1κ2;κ2κ3;κ2κ3)F11(κ1κ2;κ2κ3;2κ2κ3),g0(0)=κ1κ2F11(1+κ1κ2;1+κ2κ3;κ2κ3)F11(κ1κ2;κ2κ3;2κ2κ3),g_{0}(0)=\frac{{}_{1}F_{1}\!\left(\frac{\kappa_{1}}{\kappa_{2}};\frac{\kappa_{2}}{\kappa_{3}};\frac{\kappa_{2}}{\kappa_{3}}\right)}{{}_{1}F_{1}\!\left(\frac{\kappa_{1}}{\kappa_{2}};\frac{\kappa_{2}}{\kappa_{3}};\frac{2\kappa_{2}}{\kappa_{3}}\right)},\quad g^{\prime}_{0}(0)=\frac{\kappa_{1}}{\kappa_{2}}\frac{{}_{1}F_{1}\!\left(1+\frac{\kappa_{1}}{\kappa_{2}};1+\frac{\kappa_{2}}{\kappa_{3}};\frac{\kappa_{2}}{\kappa_{3}}\right)}{{}_{1}F_{1}\!\left(\frac{\kappa_{1}}{\kappa_{2}};\frac{\kappa_{2}}{\kappa_{3}};\frac{2\kappa_{2}}{\kappa_{3}}\right)},

and by Corollary 8.2,

π(x)=g0(0)γ00(x)+g0(0)γ10(x),x0.\pi(x)=g_{0}(0)\gamma^{0}_{0}(x)+g_{0}^{\prime}(0)\gamma^{0}_{1}(x),\quad x\in\mathbb{N}_{0}.

9. Examples

To end we present some examples using the linear approximation scheme and the convex constrained optimization approach. We use the criteria in [30, Theorem 7] to check whether a CTMC with polynomial transition rates is positive recurrent, null recurrent, transient and non-explosive, or explosive. These properties hold for either all PICs or none, provided the transition rate functions are polynomials for large state values, as in mass-action kinetics [30, Theorem 7].

We have made two implementations of the numerical methods: one in R and one in Python (mass-action kinetics only), with code available upon request. In all examples, the code runs in a few seconds on a standard laptop, and the two implementations agree when run on the same example. We have not aimed to optimize for speed. In figures, ‘State xx’ refers to the state of the original Markov chain, while ‘Index nn’ refers to the index of the translated Markov chain, for example, the nn in γjs(n)\gamma_{j}^{s}(n). The index ss refers to the irreducibility class in Proposition 7.3.

Example 9.1.

Consider the SRN with mass-action kinetics,

S\ce>[κ1]2S\ce<=>[κ2][κ4]3S\ce>[κ3]S.\text{S}\ce{->[\kappa_{1}]}2\text{S}\ce{<=>[\kappa_{2}][\kappa_{4}]}3\text{S}\ce{->[\kappa_{3}]}\text{S}.

We have ω+=1\omega_{+}=1 and ω=2\omega_{-}=-2 with s=1s=1 (zero is a neutral state), and L1=0,U1=1L_{1}=0,U_{1}=1. Furthermore, a unique stationary distribution exists since the SRN is positive recurrent for all positive rate constants [30, Theorem 7], and π1(x)=π(1+x)\pi_{1}(x)=\pi(1+x). As the reaction network is weakly reversible (the reaction graph consists of a finite union of disjoint strongly connected components), it is complex balanced for specific choices of rate constants, yielding a Poisson stationary distribution [2]. This is the case if and only if κ1(κ3+κ4)2=κ22κ3\kappa_{1}(\kappa_{3}+\kappa_{4})^{2}=\kappa_{2}^{2}\kappa_{3}.

Here, we focus on the stability of the numerical approximations using the linear approximation scheme and convex constrained optimization for a single set of parameters, κ1=40,κ2=22,κ3=κ4=1\kappa_{1}=40,\kappa_{2}=22,\kappa_{3}=\kappa_{4}=1, see Figure 1. Convex constrained optimization fails for n>18n>18 in (7.2) due to exponentially increasing γj1()\gamma_{j}^{1}(\ell) values with alternating signs. In contrast, the linear approximation scheme is quite robust and returns accurate estimates for the generating terms π1(0),π1(1)\pi_{1}(0),\pi_{1}(1) (=π(1),π(2))=\pi(1),\pi(2)), even for n=70n=70. However, in this situation, inaccurate and negative probability values are returned for large state values, see Figure 1. The estimated values of π(32)\pi(32) and π(33)\pi(33) are zero to the precision of the computer, and the first negative estimate is π(34)=8.41013\pi(34)=-8.4\cdot 10^{-13}. From then on, the estimated probabilities increase in absolute value. The estimated generating terms for the convex constrained optimization problem with n=18n=18 and the linear approximation scheme with n=25n=25 deviate only in the seventh decimal place. In the latter case, the estimates remain unchanged to seven decimal places for 25n7025\leq n\leq 70, even though negative probabilities are found for large nn.

It is remarkable that for n=70n=70 with γj1(n)\gamma_{j}^{1}(n) of the order e501022e^{50}\approx 10^{22}, we still numerically find that MnM_{n} is a singleton set, as postulated in Proposition 7.3, even though the solution gives rise to instabilities in calculating the probabilities. The numerical computations also confirm that the limit in (6.8) in Lemma 6.7 is zero, as γj1(n)\gamma_{j}^{1}(n) increases without bound.
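As an independent, heuristic cross-check (this is neither the linear approximation scheme nor the convex constrained optimization approach of the paper), one can truncate the chain of this example to the states 1,…,N, discard jumps leaving the truncation, and solve the finite balance equations πQ=0 with Σπ=1 directly. The truncation error is not controlled here; the sketch below is illustrative only.

```python
# Heuristic truncation check for this example (not the paper's methods):
# states 1..N, jumps out of the truncation dropped, then solve pi Q = 0,
# sum(pi) = 1 by Gaussian elimination.
k1, k2, k3, k4 = 40.0, 22.0, 1.0, 1.0
reactions = [            # (net jump xi, mass-action propensity lambda(x))
    (+1, lambda x: k1 * x),                        # S -> 2S
    (+1, lambda x: k2 * x * (x - 1)),              # 2S -> 3S
    (-1, lambda x: k4 * x * (x - 1) * (x - 2)),    # 3S -> 2S
    (-2, lambda x: k3 * x * (x - 1) * (x - 2)),    # 3S -> S
]
N = 60                   # truncation level (states 1..N; zero is neutral)
states = list(range(1, N + 1))
idx = {x: i for i, x in enumerate(states)}

# Build A = Q^T (balance equations); replace the last row by sum(pi) = 1.
A = [[0.0] * N for _ in range(N)]
for x in states:
    for xi, prop in reactions:
        if x + xi in idx:            # drop jumps out of the truncation
            rate = prop(x)
            A[idx[x + xi]][idx[x]] += rate
            A[idx[x]][idx[x]] -= rate
b = [0.0] * N
A[N - 1] = [1.0] * N
b[N - 1] = 1.0

# Gaussian elimination with partial pivoting, then back substitution.
for col in range(N):
    p = max(range(col, N), key=lambda r: abs(A[r][col]))
    A[col], A[p] = A[p], A[col]
    b[col], b[p] = b[p], b[col]
    for r in range(col + 1, N):
        f = A[r][col] / A[col][col]
        for c in range(col, N):
            A[r][c] -= f * A[col][c]
        b[r] -= f * b[col]
pi = [0.0] * N
for r in range(N - 1, -1, -1):
    pi[r] = (b[r] - sum(A[r][c] * pi[c] for c in range(r + 1, N))) / A[r][r]
```

The resulting vector pi approximates the stationary distribution on states 1..N; it should be nonnegative and sum to one, with essentially all mass on small states since the degradation rates grow cubically.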

Refer to caption
Figure 1. Top left: The stationary distribution calculated using the linear approximation scheme with n=25n=25 (red dots) and convex constrained optimization with n=18n=18 (black circles). The latter results in wrong probability estimates for the states x=18x=18 and x=19x=19. Top right: The stationary distribution calculated using the linear approximation scheme with n=70n=70 (red dots). The orange subfigure is a blow-up of the first 50 states for n=70n=70, normalized to their sum, indicating that the correct form of the distribution is retrieved even in the presence of instability, and showing the onset of the instabilities. Bottom left: Approximate values of π1(L1)\pi_{1}(L_{1}) and π1(U1)\pi_{1}(U_{1}) as a function of nn found using the linear approximation scheme. Convergence is very fast. Bottom right: The values of γL11(n)\gamma^{1}_{L_{1}}(n) and γU11(n)\gamma^{1}_{U_{1}}(n) for n=0,,70n=0,\ldots,70. The coefficients are plotted as Υ(x)=log(x+1)\Upsilon(x)=\log(x+1) for x0x\geq 0 and Υ(x)=log(x+1)\Upsilon(x)=-\log(-x+1) for x0x\leq 0. Dots are connected by lines for convenience.

The following example demonstrates that both the linear approximation scheme and the convex optimization approach can be efficient in practice for ergodic CTMCs.

Example 9.2.

For the SRN with mass-action kinetics,

(9.1) S\ce>[κ1]3S\ce>[κ2]2S\ce>[κ3]4S\ce>[κ4]S\textstyle{\text{S}\ce{->[\kappa_{1}]}3\text{S}\ce{->[\kappa_{2}]}2\text{S}\ce{->[\kappa_{3}]}4\text{S}\ce{->[\kappa_{4}]}\text{S}}

we obtain ω+=2\omega_{+}=2, ω=3\omega_{-}=-3, and s=1s=1, L1=0L_{1}=0, U1=2U_{1}=2, such that there is one PIC with state space \mathbb{N}. Although Proposition 7.3 does not apply (as ω+1\omega_{+}\not=1), we numerically find that MnM_{n} is a singleton set. Using the linear approximation scheme or the convex optimization approach, we obtain rather quick convergence, see Figure 2. In this case, the coefficients γj1()\gamma^{1}_{j}(\ell), j=0,1,2j=0,1,2, decrease fast towards zero, taking both signs. Both algorithms run efficiently even for large nn as the coefficients vanish. The bottom right plot in Figure 2 shows γ(n)/γ(n)1\gamma(n)/\|\gamma(n)\|_{1}. There appears to be a periodicity of U1L1+1=3U_{1}-L_{1}+1=3, demonstrating numerically that the matrices A(3n)A(3n), A(3n+1)A(3n+1) and A(3n+2)A(3n+2), n0n\in\mathbb{N}_{0}, each converge as nn\to\infty, see Theorem 6.6. The generators recovered from the three sequences A(3n)A(3n), A(3n+1)A(3n+1) and A(3n+2)A(3n+2) agree with each other to high accuracy, and with the generator found using the linear approximation scheme.

Refer to caption
Figure 2. Top left: Stationary distribution for (9.1) with κ1=50\kappa_{1}=50, κ3=5\kappa_{3}=5 and κ2=κ4=1\kappa_{2}=\kappa_{4}=1. The linear approximation scheme is shown in red, while the convex constrained optimization scheme is overlaid in black. Dots are connected by lines for convenience. Top right and Bottom left: Convergence of the generating terms is fast as γjs()\gamma_{j}^{s}(\ell) decreases rapidly to zero as \ell becomes large. Bottom right: Shown is γ(3n)/γ(3n)1\gamma(3n)/\|\gamma(3n)\|_{1} (red), γ(3n+1)/γ(3n+1)1\gamma(3n+1)/\|\gamma(3n+1)\|_{1} (green), and γ(3n+2)/γ(3n+2)1\gamma(3n+2)/\|\gamma(3n+2)\|_{1} (blue). The numerical computations clearly indicate periodicity. Note that although convergence has not been achieved for the simulated values of nn, the generators recovered from the three series A(3n),A(3n+1),A(3n+2)A(3n),A(3n+1),A(3n+2) agree with those found from the linear approximation scheme, see Theorem 6.6. Dots are replaced by lines for visual reasons.

In theory, it seems to make sense to use the linear approximation scheme only for stationary measures for which π(n)/γ(n)\pi(n)/\|\gamma(n)\| is vanishing. In practice, however, the linear approximation scheme still captures the features of a stationary measure with non-vanishing π(n)/γ(n)1\pi(n)/\|\gamma(n)\|_{1} decently, as long as the states are not too large.

Example 9.3.

Computing an unknown distribution. We give an example of a mass-action SRN,

0\ce>[10]S\ce>[12]2S\ce>[1]6S,2S\ce>[2]0,0\ce{->[10]}\text{S}\ce{->[12]}2\text{S}\ce{->[1]}6\text{S},\quad 2\text{S}\ce{->[2]}0,

which is null recurrent by the criterion in [30, Theorem 7], and hence there exists a unique stationary measure due to [25, Theorem 3.5.1] and [7, Theorem 1]. In this case, ω=2\omega_{-}=-2 and ω+=4\omega_{+}=4. Furthermore, s=0s=0 and L0=0L_{0}=0, U0=1U_{0}=1. We apply the linear approximation scheme to the SRN with n=150n=150 and find that MnM_{n} is a singleton set, despite ω+1\omega_{+}\not=1, see Proposition 7.3. For large states, the point masses are considerably far from zero, see Figure 3. Moreover, instabilities occur. The inaccuracies in the values are due to small inaccuracies in the estimation of the generating terms and the large coefficients γj()\gamma_{j}(\ell), which increase exponentially.

Refer to caption
Figure 3. Left: The stationary measure computed with the linear approximation scheme and n=150n=150. For large states, significant errors occur in the estimation. Right: The generating terms.

We know from Corollary 5.6 that there exists a unique stationary measure for CTMCs with polynomial transition rates if ω=2\omega_{-}=-2 and ω+=1\omega_{+}=1; it remains to determine whether such a stationary measure is finite. With the aid of our numerical scheme, we might be able to infer this information in some scenarios.

Example 9.4.

Computing an unknown measure. Consider the following SRN,

10S\ce>[5000]12S\ce>[10]13S\ce>[1]16S,13S\ce>[1]10S.10\text{S}\ce{->[5000]}12\text{S}\ce{->[10]}13\text{S}\ce{->[1]}16\text{S},\quad 13\text{S}\ce{->[1]}10\text{S}.

It is explosive by the criterion in [30, Theorem 7]. We have s=10s=10, ω=3\omega_{-}=-3, ω+=3\omega_{+}=3, L10=0L_{10}=0 and U10=2U_{10}=2. The linear approximation scheme retrieves what appears to be a stationary distribution, see Figure 4. The numerical computations confirm that the limit in (6.5) in Theorem 6.6 is zero, pointing to the stationary distribution being unique.

Refer to caption
Figure 4. The stationary measure of an explosive SRN, the generating terms and the coefficients γjs()\gamma_{j}^{s}(\ell).

We end with an example for which there exists more than one stationary measure.

Example 9.5.

Consider the SRN with reactions 0\ce>[λ1]S0\ce{->[\lambda_{1}]}\text{S}, 2S\ce>[λ2]02\text{S}\ce{->[\lambda_{-2}]}0, and

λ1(0)\displaystyle\lambda_{1}(0) =λ1(1)=1,λ1(x)=x22,x2,\displaystyle=\lambda_{1}(1)=1,\quad\,\,\,\,\lambda_{1}(x)=\lfloor\tfrac{x}{2}\rfloor^{2},\quad x\geq 2,
λ2(0)\displaystyle\lambda_{-2}(0) =λ2(1)=0,λ2(x)=x+122,x2.\displaystyle=\lambda_{-2}(1)=0,\quad\lambda_{-2}(x)=\lfloor\tfrac{x+1}{2}\rfloor^{-2},\quad x\geq 2.

Then, (A1)-(A3) are fulfilled with ω=2\omega_{-}=-2, ω+=1\omega_{+}=1, s=0s=0, L0=0L_{0}=0, U0=1U_{0}=1 (so π0=π\pi_{0}=\pi). From Corollary 5.5 with x=U0+2=3x=U_{0}+2=3,

Qn(3)=i=0nλ1(2+2i)λ2(3+2i)λ1(3+2i)λ2(4+2i)=1,Q_{n}(3)=\prod_{i=0}^{n}\frac{\lambda_{1}(2+2i)\lambda_{-2}(3+2i)}{\lambda_{1}(3+2i)\lambda_{-2}(4+2i)}=1,
H(3)=n=01(n+1)2+1(n+2)2=1+n=12n2=π231<.H(3)=\sum_{n=0}^{\infty}\frac{1}{(n+1)^{2}}+\frac{1}{(n+2)^{2}}=-1+\sum_{n=1}^{\infty}\frac{2}{n^{2}}=\frac{\pi^{2}}{3}-1<\infty.

Consequently, there is not a unique stationary measure of (Ω,)(\Omega,\mathcal{F}) on 0\mathbb{N}_{0}. Numerical computations suggest that [Ψ2(3),Ψ1(3)][1.5351,2.6791],[\Psi_{2}(3),\Psi_{1}(3)]\approx[1.5351,2.6791], using (5.7) with k=700k=700. See Corollary 5.5 for a definition of Ψ1,Ψ2\Psi_{1},\Psi_{2}.
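The value of H(3) can be confirmed numerically from its partial sums, which approach π²/3 − 1 at the rate of the tail of a 1/n² series:

```python
import math

# partial sums of H(3) = sum_{n>=0} [ 1/(n+1)^2 + 1/(n+2)^2 ]
H = sum(1 / (n + 1) ** 2 + 1 / (n + 2) ** 2 for n in range(200000))

# the tail of the series is of order 2/N, here about 1e-5
assert abs(H - (math.pi ** 2 / 3 - 1)) < 1e-4
```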

In the following, we will explore the stationary measures using the theory of Section 5.2 and compare it to results obtained from the linear approximation scheme. For any stationary measure π\pi, by definition of ϕ(x)\phi(x) and ϕ(x)<h(x)\phi(x)<h(x), we have for x3x\geq 3,

π(x)π(x1)=\displaystyle\frac{\pi(x)}{\pi(x-1)}= λ2(x1)λ2(x)1ϕ(x)>λ2(x1)λ2(x)1h(x)=(x12x+12)2,\displaystyle\frac{\lambda_{-2}(x-1)}{\lambda_{-2}(x)}\frac{1}{\phi(x)}>\frac{\lambda_{-2}(x-1)}{\lambda_{-2}(x)}\frac{1}{h(x)}=\left(\Big{\lfloor}\frac{x-1}{2}\Big{\rfloor}\Big{\lfloor}\frac{x+1}{2}\Big{\rfloor}\right)^{\!2},

hence

π(x)>C(x!)416x,x0,\pi(x)>C\frac{(x!)^{4}}{16^{x}},\quad x\geq 0,

for some constant C>0C>0, and π\pi is not a distribution. This is illustrated in Figure 5 (top left), which shows the logarithm of π(x)\pi(x) with ϕ=2.67\phi^{*}=2.67, using Corollary 5.5. The red line is a curve fitted to log(π(x))\log(\pi(x)) for the even states: log(π(2x))3.504+2.009xlog(x)3.429x\log(\pi(2x))\approx 3.504+2.009x\log(x)-3.429x, x2x\geq 2. The differences between the computed and fitted values are smaller than 0.1 in absolute value for all even states. Hence, the measure appears to grow super-exponentially fast. The function log(π(x))\log(\pi(x)) for the odd states grows at a rate comparable to that for the even states, but not with the same regularity.

Additionally, we computed the difference in log(π(x))\log(\pi(x)) for different values of ϕ\phi^{*}, showing again a distinction between odd and even states, see Figure 5 (top right).

We used the linear approximation scheme to estimate the generating terms (which we know are not uniquely given in this case). Also here, an alternating pattern emerges: even indices produce the generator (π(L0),π(U0))(0.3764,0.6235)(\pi(L_{0}),\pi(U_{0}))\approx(0.3764,0.6235) (n=100n=100), while odd indices produce the generator (0.4222,0.5777)\approx(0.4222,0.5777) (n=101n=101). For each nn, a unique solution is found. Computing the corresponding ϕ=ϕ(U0+2)\phi^{*}=\phi(U_{0}+2) yields another approximation to [Ψ2(3),Ψ1(3)][\Psi_{2}(3),\Psi_{1}(3)], namely [1.5240,2.7161][1.5240,2.7161], which is slightly wider than the previous approximation, [1.5351,2.6791][1.5351,2.6791]. By the nature of the latter estimate, the first coordinate increases in nn while the second decreases in nn; hence the smaller interval is closer to the true interval [Ψ2(3),Ψ1(3)][\Psi_{2}(3),\Psi_{1}(3)] than the wider one. Figure 5 (bottom left) also shows the estimated generating terms for different values of nn, providing a band for π(L0)\pi(L_{0}) and π(U0)\pi(U_{0}) for which stationary measures exist, in agreement with Lemma 6.7.

Finally, we computed the ratio of π(n)\pi(n) to γ(n)1\|\gamma(n)\|_{1} for different values of ϕ\phi^{*}, and observe that there appears to be one behavior for odd states and another for even states. While one cannot infer the large-state behavior of the ratios in Figure 5 (bottom right) from the figure, non-uniqueness of the measures excludes that they both converge to zero, see Lemma 6.7.

Refer to caption
Figure 5. Top left: The logarithm of the stationary measure with ϕ=2.67\phi^{*}=2.67. This value is in the upper end of the estimated interval [Ψ2(3),Ψ1(3)][\Psi_{2}(3),\Psi_{1}(3)]. The red line is the curve fitted to the log-measure of the even states, using regression analysis. The stationary measure retrieved using the linear approximation scheme in Section 7 is visually identical to the plotted measure (not shown). Top right: The difference in log(π(x))\log(\pi(x)) between the measure with ϕ=2.00\phi^{*}=2.00 (in the middle part of the interval) and that with ϕ=2.67\phi^{*}=2.67 (blue: even states, light blue: odd states), and between the measure with ϕ=1.54\phi^{*}=1.54 (in the lower end of the interval) and that with ϕ=2.67\phi^{*}=2.67 (red: even states, orange: odd states). An alternating pattern emerges. Bottom left: The generating terms estimated for different values of nn, with red (even states)/orange (odd states) showing π(L0)\pi(L_{0}) and blue (even states)/light blue (odd states) showing π(U0)\pi(U_{0}). Bottom right: The normed measure π(n)/γ(n)1\pi(n)/\|\gamma(n)\|_{1} for different values of ϕ\phi^{*}: 2.672.67 (red: even states, orange: odd), 2.002.00 (blue: even, light blue: odd) and 1.541.54 (olive: even, green: odd). Dots are replaced by lines for visual reasons.
Example 9.6.

Consider the SRN with non-mass-action kinetics:

0\ce<=>[λ1][λ1]S,2S\ce>[λ2]00\ce{<=>[\lambda_{1}][\lambda_{-1}]}\text{S},\quad 2\text{S}\ce{->[\lambda_{-2}]}0

with transition rates given by:

λ1(x)\displaystyle\lambda_{1}(x) =2x,x0,\displaystyle=2^{x},\quad x\geq 0,
λ1(0)\displaystyle\lambda_{-1}(0) =0,λ1(1)=2,λ1(x)=24x12x,x2,\displaystyle=0,\quad\lambda_{-1}(1)=2,\quad\lambda_{-1}(x)=2\cdot 4^{x-1}-2^{x},\quad x\geq 2,
λ2(0)\displaystyle\lambda_{-2}(0) =λ2(1)=0,λ2(x)=24x1,x2.\displaystyle=\lambda_{-2}(1)=0,\quad\lambda_{-2}(x)=2\cdot 4^{x-1},\quad x\geq 2.

For this SRN, ω=2\omega_{-}=-2, ω+=1\omega_{+}=1, s=0s=0, L0=0L_{0}=0 and U0=1U_{0}=1. Hence, the conditions of Proposition 7.3 are satisfied. The underlying CTMC is from [24], where it is shown that ν(x)=(1/2)x+1\nu(x)=(-1/2)^{x+1} for x0x\in\mathbb{N}_{0} is a signed invariant measure. This measure has generator (ν(0),ν(1))=(1/2,1/4)(\nu(0),\nu(1))=(-1/2,1/4). The space of signed invariant measures is U0L0+1=2U_{0}-L_{0}+1=2 dimensional, and a second linearly independent signed invariant measure has generator (1,0)(1,0). On the other hand, by the Foster-Lyapunov criterion [23], the process is positive recurrent, and hence admits a stationary distribution. Numerical computations show that the stationary distribution is concentrated on the first few states x=0,,3x=0,\ldots,3. The uniqueness of the stationary distribution is also confirmed by Corollary 5.5 in that H(U+2)=H(3)=H(U+2)=H(3)=\infty by a simple calculation.
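The claim from [24] that ν(x)=(−1/2)^{x+1} is a signed invariant measure can be verified directly against the balance equation Σ_ω λ_ω(x−ω)ν(x−ω) = ν(x) Σ_ω λ_ω(x) for x∈ℕ0, with ν≡0 on negative states; exact rational arithmetic avoids cancellation issues. A short sketch:

```python
from fractions import Fraction

def lam(omega, x):
    """Transition rates of this example (jumps omega in {+1, -1, -2})."""
    if x < 0:
        return Fraction(0)
    if omega == 1:
        return Fraction(2) ** x
    if omega == -1:
        if x == 0:
            return Fraction(0)
        return Fraction(2) if x == 1 else 2 * Fraction(4) ** (x - 1) - Fraction(2) ** x
    if omega == -2:
        return Fraction(0) if x <= 1 else 2 * Fraction(4) ** (x - 1)
    return Fraction(0)

def nu(x):
    # the signed invariant measure nu(x) = (-1/2)^(x+1), zero on negative states
    return Fraction(-1, 2) ** (x + 1) if x >= 0 else Fraction(0)

jumps = (1, -1, -2)
for x in range(50):
    inflow = sum(lam(w, x - w) * nu(x - w) for w in jumps)
    outflow = nu(x) * sum(lam(w, x) for w in jumps)
    assert inflow == outflow   # exact balance at every state
```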

Appendix A Appendix

Lemma A.1.

Assume (A1)-(A3). The following hold:

  • i)

    B0={ω}B_{0}=\{\omega_{-}\}, ωBk\omega_{-}\in B_{k} for 0k<ω0\leq k<-\omega_{-}, and ω+Bk\omega_{+}\in B_{k} for ωkm\omega_{-}\leq k\leq m_{*}.

  • ii)

    c0()<0c_{0}(\ell)<0 for >U\ell>U and c0()=0c_{0}(\ell)=0 for U\ell\leq U.

  • iii)

    ck()>0c_{k}(\ell)>0 for ωkm-\omega_{-}\leq k\leq m_{*} and 𝗂ω+\ell\geq{\sf i}_{\omega_{+}}. Generally, ck()0c_{k}(\ell)\geq 0, 0\ell\in\mathbb{N}_{0}.

  • iv)

    ck()<0c_{k}(\ell)<0 for 0k<ω0\leq k<-\omega_{-} and 𝗂ω=U+1\ell\geq{\sf i}_{\omega_{-}}=U+1. Generally, ck()0c_{k}(\ell)\leq 0, 0\ell\in\mathbb{N}_{0}.

  • v)

    If ck()>0c_{k}(\ell)>0, then ck(+1)>0c_{k}(\ell+1)>0, and similarly, if ck()<0c_{k}(\ell)<0, then ck(+1)<0c_{k}(\ell+1)<0 for k{0,,m}k\in\{0,\ldots,m_{*}\} and 0\ell\in\mathbb{N}_{0}.

  • vi)

    fk()>0f_{k}(\ell)>0 for ωkm-\omega_{-}\leq k\leq m_{*} and k𝗂ω+\ell-k\geq{\sf i}_{\omega_{+}}.

  • vii)

    fk()<0f_{k}(\ell)<0 for 0k<ω0\leq k<-\omega_{-} and k𝗂ω=U+1\ell-k\geq{\sf i}_{\omega_{-}}=U+1.

  • viii)

    If fk()>0f_{k}(\ell)>0, then fk(+1)>0f_{k}(\ell+1)>0, and similarly, if fk()<0f_{k}(\ell)<0, then fk(+1)<0f_{k}(\ell+1)<0 for k{0,,m}k\in\{0,\ldots,m_{*}\} and >U\ell>U.

Proof.

i) Since ω1\omega_{-}\leq-1, we have from (4.7),

B0={ωΩ|(ω+1/2)(ωω1/2)>0}={ωΩ|ωω}={ω}.B_{0}=\{\omega\in\Omega\,|\,(\omega_{-}+1/2)(\omega-\omega_{-}-1/2)>0\}=\{\omega\in\Omega\,|\,\omega\leq\omega_{-}\}=\{\omega_{-}\}.

For ωkm-\omega_{-}\leq k\leq m_{*}, we have k=ω+k+1/2>0k^{\prime}=\omega_{-}+k+1/2>0 and ωkω(ω+m+1/2)=ωω++1/2>0\omega-k^{\prime}\geq\omega-(\omega_{-}+m_{*}+1/2)=\omega-\omega_{+}+1/2>0 if ω=ω+\omega=\omega_{+}. Hence, ω+Bk\omega_{+}\in B_{k}. For 0k<ω0\leq k<-\omega_{-}, we have k=ω+k+1/2<0k^{\prime}=\omega_{-}+k+1/2<0 and ωkω(ω+1/2)<0\omega-k^{\prime}\leq\omega-(\omega_{-}+1/2)<0 if ω=ω\omega=\omega_{-}. Hence, ωBk\omega_{-}\in B_{k}. ii) Since sgn(ω+1/2)=1\text{sgn}(\omega_{-}+1/2)=-1, then for >U=𝗂ω1\ell>U={\sf i}_{\omega_{-}}-1, we have c0()=λω()<0c_{0}(\ell)=-\lambda_{\omega_{-}}(\ell)<0, see (4.6). Likewise, for U\ell\leq U. iii) and iv) The sign follows similarly to the proof of ii). Using i), the conditions on \ell ensure λω+()>0\lambda_{\omega_{+}}(\ell)>0 and λω()>0\lambda_{\omega_{-}}(\ell)>0, respectively, yielding the conclusions. v) It follows from (A2). vi)-viii) Similar to iii)-iv) using (4.5) and (4.7). ∎

For a matrix D=(dij)n×nD=(d_{ij})\in\mathbb{R}^{n\times n}, the jj-th column, j=1,,nj=1,\ldots,n, is weakly diagonally dominant (WDD) if |djj|ij|dij||d_{jj}|\geq\sum_{i\neq j}|d_{ij}| and strictly diagonally dominant (SDD) if |djj|>ij|dij||d_{jj}|>\sum_{i\neq j}|d_{ij}|. In particular, a matrix is WDD if all columns are WDD. A WDD matrix is weakly chain diagonally dominant (WCDD) if for each WDD column, say jj, which is not SDD, there exists an SDD column kk such that there is a directed path from vertex kk to vertex jj in the associated digraph. Every WCDD matrix is non-singular [5].

Lemma A.2.

Assume (A1)-(A3). Then, the row echelon form GG of HH, as defined in (4.8), exists.

Proof.

Recall that

Hm,n=δm,nλ(mn)(n)ωΩλω(m),m=0,,L1,n=0,,U1.\displaystyle H_{m,n}=\delta_{m,n}-\frac{\lambda_{(m-n)}(n)}{\sum_{\omega\in\Omega}\lambda_{\omega}(m)},\quad m=0,\ldots,L-1,\quad n=0,\ldots,U-1.

Note the following: if ω<m\omega<-m, then m<𝗂ωm<{\sf i}_{\omega} and hence λω(m)=0\lambda_{\omega}(m)=0. Hence, we may replace ωΩλω(m)\sum_{\omega\in\Omega}\lambda_{\omega}(m) by k=mλk(m)\sum_{k=-m}^{\infty}\lambda_{k}(m) in Hm,nH_{m,n}.

Define am,n=λ(mn)(n)a_{m,n}=\lambda_{(m-n)}(n) for n,mn,m\in\mathbb{Z}. Then the row echelon form of HH exists if the upper-left L×LL\times L block of HH is invertible, that is, if

(1a0,1k=0ak,0a0,L1k=0ak,0a1,0k=0ak,11a1,L1k=0ak,1aL1,0k=0ak,L1aL1,1k=0ak,L11)\begin{pmatrix}\vskip 10.00002pt1&-\frac{a_{0,1}}{\sum_{k=0}^{\infty}a_{k,0}}&\ldots&-\frac{a_{0,L-1}}{\sum_{k=0}^{\infty}a_{k,0}}\\ \vskip 10.00002pt-\frac{a_{1,0}}{\sum_{k=0}^{\infty}a_{k,1}}&1&\ldots&-\frac{a_{1,L-1}}{\sum_{k=0}^{\infty}a_{k,1}}\\ \vskip 10.00002pt\vdots&\vdots&\ddots&\vdots\\ \vskip 10.00002pt-\frac{a_{L-1,0}}{\sum_{k=0}^{\infty}a_{k,L-1}}&-\frac{a_{L-1,1}}{\sum_{k=0}^{\infty}a_{k,L-1}}&\ldots&1\end{pmatrix}

is invertible. Multiplying the above matrix by the invertible diagonal matrix

diag(k=0ak,0,k=0ak,1,,k=0ak,L1){\rm diag}\left(\sum_{k=0}^{\infty}a_{k,0},\sum_{k=0}^{\infty}a_{k,1},\ldots,\sum_{k=0}^{\infty}a_{k,L-1}\right)

on the left side and its inverse on the right side gives a column diagonally dominant matrix

A=(1a0,1k=0ak,1a0,L1k=0ak,L1a1,0k=0ak,01a1,L1k=0ak,L1aL1,0k=0ak,0aL1,1k=0ak,11).A=\begin{pmatrix}\vskip 10.00002pt1&-\frac{a_{0,1}}{\sum_{k=0}^{\infty}a_{k,1}}&\ldots&-\frac{a_{0,L-1}}{\sum_{k=0}^{\infty}a_{k,L-1}}\\ \vskip 10.00002pt-\frac{a_{1,0}}{\sum_{k=0}^{\infty}a_{k,0}}&1&\ldots&-\frac{a_{1,L-1}}{\sum_{k=0}^{\infty}a_{k,L-1}}\\ \vskip 10.00002pt\vdots&\vdots&\ddots&\vdots\\ \vskip 10.00002pt-\frac{a_{L-1,0}}{\sum_{k=0}^{\infty}a_{k,0}}&-\frac{a_{L-1,1}}{\sum_{k=0}^{\infty}a_{k,1}}&\ldots&1\end{pmatrix}.

Note that k=1ak,0=k=1λk(0)>0\sum_{k=1}^{\infty}a_{k,0}=\sum_{k=1}^{\infty}\lambda_{k}(0)>0. Furthermore, by (𝐀𝟐\rm\mathbf{A2}) the following property holds: am,n>0a_{m,n}>0 implies am+1,n+1>0a_{m+1,n+1}>0 for any m,nm,n\in\mathbb{Z}, or by contraposition, am+1,n+1=0a_{m+1,n+1}=0 implies am,n=0a_{m,n}=0 for any m,nm,n\in\mathbb{Z}. Hence,

k=1ak+L1,L1=k=Lak,L1>0,\sum_{k=1}^{\infty}a_{k+L-1,L-1}=\sum_{k=L}^{\infty}a_{k,L-1}>0,

and therefore,

k=0L1ak,L1<k=0ak,L1.\sum_{k=0}^{L-1}a_{k,L-1}<\sum_{k=0}^{\infty}a_{k,L-1}.

Consequently, the LL-th column sum of AA is positive, implying that AA is a WDD matrix whose LL-th column is SDD.

By Lemma A.3, AA is invertible, and hence the row echelon form GG exists. ∎

Lemma A.3.

Let nn\in\mathbb{N} and assume DD is an n×nn\times n matrix, such that

D=(1d0,1k=0dk,1d0,n1k=0dk,n1d1,0k=0dk,01d1,n1k=0dk,n1dn1,0k=0dk,0dn1,1k=0dk,11),D=\begin{pmatrix}\vskip 10.00002pt1&-\frac{d_{0,1}}{\sum_{k=0}^{\infty}d_{k,1}}&\ldots&-\frac{d_{0,n-1}}{\sum_{k=0}^{\infty}d_{k,n-1}}\\ \vskip 10.00002pt-\frac{d_{1,0}}{\sum_{k=0}^{\infty}d_{k,0}}&1&\ldots&-\frac{d_{1,n-1}}{\sum_{k=0}^{\infty}d_{k,n-1}}\\ \vskip 10.00002pt\vdots&\vdots&\ddots&\vdots\\ \vskip 10.00002pt-\frac{d_{n-1,0}}{\sum_{k=0}^{\infty}d_{k,0}}&-\frac{d_{n-1,1}}{\sum_{k=0}^{\infty}d_{k,1}}&\ldots&1\end{pmatrix},

where di,j0d_{i,j}\geq 0 for i0i\in\mathbb{N}_{0}, j=0,,n1j=0,\ldots,n-1, i=1di,0>0\sum_{i=1}^{\infty}d_{i,0}>0, and di+1,j+1=0d_{i+1,j+1}=0 implies di,j=0d_{i,j}=0 for any i0i\in\mathbb{N}_{0}, j=0,,n2j=0,\ldots,n-2. Then DD is WCDD, and thus non-singular.

Proof.

Number the rows and columns from $0$ to $n-1$. If $n=1$, the statement is trivial; hence assume $n>1$.

Fact 1: If column $j>0$ sums to zero, then column $j-1$ sums to zero. Indeed, that column $j$ sums to zero is equivalent to $\sum_{i=n}^{\infty}d_{i,j}=0$, which, by the property of the $d_{i,j}$, implies that

\[\sum_{i=n}^{\infty}d_{i-1,j-1}=0.\tag{A.1}\]

Hence also $\sum_{i=n}^{\infty}d_{i,j-1}=0$, and column $j-1$ sums to zero. Consequently, if column $j$ is WDD but not SDD, then all columns $0,\ldots,j$ are WDD but not SDD.

Fact 2: If column $j>0$ sums to zero, then by (A.1), $d_{n-1,j-1}=0$. Inductively, using in addition Fact 1, $d_{i,k}=0$ for $i-k\geq n-j$, corresponding to a lower-left triangular corner of size $j$. ∎
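The hypotheses and conclusion of Lemma A.3 can be probed on a concrete instance. In the Python sketch below, the weights $d_{1,0}=d_{2,1}=d_{3,2}=1$ (all other $d_{i,j}=0$) are hypothetical, chosen only to satisfy the lemma's conditions with $n=3$; the resulting $D$ has two zero-sum columns and one SDD column, illustrating Fact 2, and is non-singular as the lemma asserts.

```python
from fractions import Fraction

n = 3
# Hypothetical weights satisfying the lemma: d_{i,j} >= 0,
# sum_{i>=1} d_{i,0} > 0, and d_{i+1,j+1} = 0 implies d_{i,j} = 0
# (the support {(1,0),(2,1),(3,2)} is closed along diagonals).
d = {(1, 0): 1, (2, 1): 1, (3, 2): 1}

def dv(i, j):
    return Fraction(d.get((i, j), 0))

# Full sums over all i (finite support, so a finite range suffices).
col_total = [sum(dv(i, j) for i in range(n + 2)) for j in range(n)]

# Build D: ones on the diagonal, -d_{i,j}/col_total(j) off it.
D = [[Fraction(1) if i == j else -dv(i, j) / col_total[j] for j in range(n)]
     for i in range(n)]

# Columns 0 and 1 sum to zero (WDD but not SDD); column 2 is SDD.
col_sums = [sum(D[i][j] for i in range(n)) for j in range(n)]
assert col_sums[0] == 0 and col_sums[1] == 0 and col_sums[2] > 0

# Fact 2 for the zero-sum column j = 1: d_{n-1, j-1} = d_{2,0} = 0.
assert dv(n - 1, 0) == 0

# D is lower triangular with unit diagonal here, hence non-singular,
# matching the conclusion of the lemma.
det = (D[0][0] * (D[1][1] * D[2][2] - D[1][2] * D[2][1])
       - D[0][1] * (D[1][0] * D[2][2] - D[1][2] * D[2][0])
       + D[0][2] * (D[1][0] * D[2][1] - D[1][1] * D[2][0]))
assert det != 0
```

In this instance the non-SDD columns $0$ and $1$ are linked to the SDD column $2$ through the nonzero subdiagonal entries, which is how weak chained diagonal dominance forces non-singularity.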

Acknowledgements

The second author was supported by the Novo Nordisk Foundation (Denmark), grant NNF19OC0058354. The third author was supported by the TUM University Foundation, the Alexander von Humboldt Foundation, the Simons Foundation, and start-up funding from the University of Hawai’i at Mānoa.

The work was initiated when all authors were working at the University of Copenhagen. We are grateful to the AE and two anonymous reviewers whose comments and suggestions improved the presentation of the manuscript.

The authors declare no competing interests.
