
Critical scaling in hidden state inference for linear Langevin dynamics

B Bravi¹ and P Sollich¹
¹ Department of Mathematics, King's College London, Strand, London, WC2R 2LS, UK
Current affiliation of BB: Institute of Theoretical Physics, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
E-mail: barbara.bravi@epfl.ch and peter.sollich@kcl.ac.uk
Abstract

We consider the problem of inferring the dynamics of unknown (i.e. hidden) nodes from a set of observed trajectories and study analytically the average prediction error and the typical relaxation time of correlations between errors. We focus on a stochastic linear dynamics of continuous degrees of freedom interacting via random Gaussian couplings in the infinite network size limit. The expected error on the hidden time courses can be found as the equal-time hidden-to-hidden covariance of the probability distribution conditioned on observations. In the stationary regime, we analyze the phase diagram in the space of relevant parameters, namely the ratio between the numbers of observed and hidden nodes, the degree of symmetry of the interactions and the amplitudes of the hidden-to-hidden and hidden-to-observed couplings relative to the decay constant of the internal hidden dynamics. In particular, we identify critical regions in parameter space where the relaxation time and the inference error diverge, and determine the corresponding scaling behaviour.

Keywords: Plefka Expansion, Inference, Mean Field, Critical Scaling, Biochemical Networks, Dynamical Functional

1 Introduction

The reconstruction of the time evolution of a system starting from macroscopic measurements of its dynamics is a challenge of primary interest in statistical physics; see e.g. [1, 2, 3, 4]. The problem can be cast as follows: given the set of interaction parameters and a temporal sequence of observed variables, the aim is to infer the states of the variables that are unobserved or, in the terminology of machine learning, “hidden”.

Recently, in [5], we have proposed a method to solve this problem for continuous degrees of freedom based on a dynamical mean-field theory, the Extended Plefka Expansion [6]. We specialized to the case of a linear stochastic dynamics for the purpose of a direct comparison with the exact computation implemented via the Kalman filter [7], a well-known inference technique in linear (Gaussian) state space models [8]. In these models, the posterior distribution over the hidden dynamics is Gaussian: while the posterior mean provides the best estimate of the hidden dynamics, the posterior equal-time variance measures the uncertainty of this prediction and thus the inference error. With mean-field couplings (weak and long-ranged) drawn at random from a Gaussian distribution, we investigated numerically the performance of the extended Plefka expansion, finding a clear improvement for large system size. In this paper, we calculate analytically the posterior statistics in the thermodynamic limit of an infinitely large hidden system at stationarity, with a focus on the inference error and the relaxation time, defined as the typical timescale over which correlations between inference errors decay. The thermodynamic limit expressions we find are expected to be exact as we show by comparison to other methods, appealing to Random Matrix Theory and dynamical functionals [9]. As the resulting scenario is analytically tractable, one can study the dependence of relaxation times and the average prediction error on key system parameters and thus shed light on the accuracy of the inference process. This, beyond the derivation of the general thermodynamic limit expressions, is the second major aim of this paper and can be seen as analogous to what has been done for learning from static data in a linear perceptron [10, 11], where solvable models allowed the authors to study how the prediction error scales with the number of training examples and spatial dimensions. The emphasis here is on understanding the prediction accuracy for hidden states from a theoretical point of view, including its dependence on macroscopic parameters that could be measurable, at least indirectly.

The paper is organized as follows. In section 2.1 we introduce the basic set up, a linear stochastic dynamics with hidden nodes, and we recall the main results of the Extended Plefka Expansion applied to this case [5], as the starting point for the analysis of this paper. In section 2.2 we consider the thermodynamic limit of infinite network size, shifting from local correlations to their macroscopic average across the network. The exactness of the extended Plefka predictions is shown in section 2.3 by comparison with expressions obtained elsewhere [9] by Kalman filter and Random Matrix Theory (RMT) methods. From section 3 onwards, we present a systematic analysis of these mean-field results in terms of the system properties and the number of observations. In section 3.1 we determine the relevant dimensionless parameters governing prediction accuracy, namely the ratio between the numbers of observed and hidden nodes, the degree of symmetry of the hidden interactions, and the amplitudes of the hidden-to-hidden and hidden-to-observed interactions relative to the decay constant of the internal hidden dynamics. We identify critical points in this parameter space by studying the behaviour of power spectra using a scaling approach (sections 3.2, 3.3 and A). In the main part of the analysis we then consider the temporal correlations in the posterior dynamics of the hidden nodes, revealing interesting long-time behaviour in the critical regions (sections 3.4-3.9 and B). More specifically, we first present the power laws governing relaxation times in sections 3.4 and 3.5, then discuss the corresponding correlation functions in sections 3.6 and 3.7, and finally extract predictions on the inference error in sections 3.8 and 3.9.

2 Extended Plefka Expansion with hidden nodes

2.1 Set up

Let us consider a generic network where only the dynamics of a subnetwork of nodes is observed while the others are hidden and form what we call the “bulk”. The indices $i,j=1,\ldots,N^{\rm b}$ and the superscript b are used for the hidden or bulk variables; similarly the indices $a,b=1,\ldots,N^{\rm s}$ and the superscript s for the observed or subnetwork nodes of the network. Assuming linear couplings $\{J_{ij}\}$, $\{K_{ia}\}$ between hidden and observed variables $x_{i}$, $x_{a}$, their dynamical evolution is described by

\partial_{t}x_{i}(t)=-\lambda x_{i}(t)+\sum_{j}J_{ij}x_{j}(t)+\sum_{a}K_{ia}x_{a}(t)+\xi_{i}(t) (2.1a)
\partial_{t}x_{a}(t)=-\lambda x_{a}(t)+\sum_{b}J_{ab}x_{b}(t)+\sum_{j}K_{aj}x_{j}(t)+\xi_{a}(t) (2.1b)

$J_{ij}$ (respectively $J_{ab}$) gives the hidden-to-hidden (respectively observed-to-observed) interactions, while the coupling between observed and hidden variables is contained in $K_{ia}$ and $K_{aj}$. $\lambda$ is a self-interaction term acting as a decay constant and providing the basic timescale of the dynamics. The dynamical noises $\xi_{i}$, $\xi_{a}$ are Gaussian white noises with zero mean and diagonal covariances $\Sigma_{i}$, $\Sigma_{a}$

\langle\xi_{i}(t)\xi_{j}(t^{\prime})\rangle=\Sigma_{i}\delta_{ij}\delta(t-t^{\prime})\qquad\langle\xi_{a}(t)\xi_{b}(t^{\prime})\rangle=\Sigma_{a}\delta_{ab}\delta(t-t^{\prime}) (2.2)
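For finite systems, eqs. (2.1) with the noise statistics (2.2) can be integrated directly by an Euler-Maruyama scheme. The sketch below is purely illustrative: the coupling matrices, noise variances and time step are placeholders to be supplied by the user.

```python
import numpy as np

rng = np.random.default_rng(0)

def euler_maruyama_step(xb, xs, J_bb, J_ss, K_bs, K_sb, lam, Sigma_b, Sigma_s, dt):
    """One Euler-Maruyama update of eqs. (2.1a)-(2.1b), with white noise of
    covariance (2.2); Sigma_b and Sigma_s are scalar noise variances."""
    drift_b = -lam * xb + J_bb @ xb + K_bs @ xs   # hidden (bulk) drift, eq. (2.1a)
    drift_s = -lam * xs + J_ss @ xs + K_sb @ xb   # observed (subnetwork) drift, eq. (2.1b)
    xb_new = xb + dt * drift_b + np.sqrt(Sigma_b * dt) * rng.standard_normal(xb.size)
    xs_new = xs + dt * drift_s + np.sqrt(Sigma_s * dt) * rng.standard_normal(xs.size)
    return xb_new, xs_new
```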

The application of the extended Plefka expansion to this problem with hidden and observed nodes [5] allows us to obtain a closed system of integral equations describing the second moments of the Gaussian posterior distribution over the hidden dynamics, i.e. conditioned on the observed trajectories. These second moments are denoted by $C_{i}(t,t^{\prime})$, the hidden posterior variance, $R_{i}(t,t^{\prime})$, the hidden posterior response, and $B_{i}(t,t^{\prime})$ and $B_{a}(t,t^{\prime})$, the posterior correlations of auxiliary variables introduced to represent the hidden and observed dynamics respectively à la Martin–Siggia–Rose–Janssen–De Dominicis [12, 13, 14]. They are diagonal in the site index but functions of two times by construction of the Extended Plefka Expansion [6], which gives an effectively non-interacting approximation of the dynamics where couplings among trajectories are replaced by a memory term (related to $R_{i}(t,t^{\prime})$) and a time-correlated noise (whose covariance is related to $B_{i}(t,t^{\prime})$).

Our main interest is in the posterior statistics: the equal-time variance of inference errors gives a mean-square prediction error, while the time correlations between the inference errors define a posterior relaxation time. To simplify the equations derived in [5], we consider long times, where a stationary regime is reached: all two-time functions become time translation invariant (TTI) and the Laplace-transformed equations yield a system of four coupled equations for $\tilde{C}_{i}(z)$, $\tilde{R}_{i}(z)$, $\tilde{B}_{i}(z)$, $\tilde{B}_{a}(z)$. Let us briefly recall it here, as it will be the starting point for the analysis:

-\tilde{C}_{i}(z)\bigg(\sum_{a}K_{ai}^{2}\tilde{B}_{a}(z)+\sum_{j}J_{ji}^{2}\tilde{B}_{j}(z)\bigg)+\tilde{R}_{i}(z)\bigg(z+\lambda-\sum_{j}J_{ij}J_{ji}\tilde{R}_{j}(z)\bigg)=1 (2.3a)
\tilde{C}_{i}(z)\bigg(-z+\lambda-\sum_{j}J_{ij}J_{ji}\tilde{R}_{j}(-z)\bigg)-\tilde{R}_{i}(z)\bigg(\sum_{j}J_{ij}^{2}\tilde{C}_{j}(z)+\Sigma_{i}\bigg)=0 (2.3b)
\tilde{B}_{i}(z)\bigg[\bigg(-z+\lambda-\sum_{j}J_{ij}J_{ji}\tilde{R}_{j}(-z)\bigg)\bigg(\sum_{a}K_{ai}^{2}\tilde{B}_{a}(z)+\sum_{j}J_{ji}^{2}\tilde{B}_{j}(z)\bigg)^{-1}\bigg(z+\lambda-\sum_{j}J_{ij}J_{ji}\tilde{R}_{j}(z)\bigg)-\bigg(\sum_{j}J_{ij}^{2}\tilde{C}_{j}(z)+\Sigma_{i}\bigg)\bigg]=1 (2.3c)
\tilde{B}_{a}(z)\bigg[\Sigma_{a}+\sum_{j}K_{aj}^{2}\tilde{C}_{j}(z)\bigg]=-1 (2.3d)

2.2 Thermodynamic Limit

We expect the extended Plefka approach to give exact values for posterior means and variances in the case of mean-field type couplings (i.e. weak and long-ranged), in the thermodynamic limit of an infinitely large system. More precisely we define the thermodynamic limit as the one of an infinitely large bulk and subnetwork, $N^{\rm b},N^{\rm s}\rightarrow\infty$ at constant ratio $\alpha=N^{\rm s}/N^{\rm b}$. For the mean-field couplings we assume, as in [6], that $\{J_{ij}\}$ is a real matrix belonging to the Girko ensemble [15], i.e. its elements are independently and randomly distributed Gaussian variables with zero mean and variance satisfying

\langle J_{ij}J_{ij}\rangle=\frac{j^{2}}{N^{\rm b}} (2.4)
\langle J_{ji}J_{ij}\rangle=\frac{\eta\,j^{2}}{N^{\rm b}} (2.5)

The parameter $\eta\in[-1,1]$ controls the degree to which the matrix $\{J_{ij}\}$ is symmetric. In particular, the dynamics is non-equilibrium – it does not satisfy detailed balance – whenever $\eta<1$. Similarly, $\{K_{ai}\}$ is taken as a random matrix with uncorrelated zero mean Gaussian entries of variance

\langle K_{ai}\rangle=0\qquad\langle K_{ai}^{2}\rangle=\frac{k^{2}}{N^{\rm b}} (2.6)
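For finite $N^{\rm b}$, a coupling ensemble with the statistics (2.4)-(2.6) can be generated by mixing an i.i.d. Gaussian matrix with its transpose; the sketch below is illustrative only (the diagonal entries of $J$ then acquire variance $(1+\eta)j^{2}/N^{\rm b}$, an $O(1/N^{\rm b})$ detail that is irrelevant in the thermodynamic limit).

```python
import numpy as np

def draw_couplings(Nb, Ns, j, k, eta, seed=0):
    """Draw J with <J_ij^2> = j^2/Nb, <J_ij J_ji> = eta*j^2/Nb (eqs. 2.4-2.5)
    and K with <K_ai> = 0, <K_ai^2> = k^2/Nb (eq. 2.6)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((Nb, Nb))
    a = 0.5 * (np.sqrt(1.0 + eta) + np.sqrt(1.0 - eta))
    b = 0.5 * (np.sqrt(1.0 + eta) - np.sqrt(1.0 - eta))
    J = (j / np.sqrt(Nb)) * (a * X + b * X.T)   # a^2 + b^2 = 1 and 2ab = eta
    K = (k / np.sqrt(Nb)) * rng.standard_normal((Ns, Nb))
    return J, K
```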

We have introduced amplitude parameters $j$ for $\{J_{ij}\}$ (hidden-to-hidden) and $k$ for $\{K_{ai}\}$ (hidden-to-observed) here. The scaling of both types of interaction parameters with $1/\sqrt{N^{\rm b}}$ ensures that, when the size of the hidden part of the system increases, the typical contribution it makes to the time evolution of each hidden and observed variable stays of the same order. The high connectivity, where all nodes interact with all other ones, implies that in the thermodynamic limit all nodes become equivalent. Local response and correlation functions therefore become identical to their averages over nodes, defined as

\tilde{R}^{\rm bb|s}(z)=\frac{1}{N^{\rm b}}\sum_{j}\tilde{R}_{j}(z) (2.7)
\tilde{C}^{\rm bb|s}(z)=\frac{1}{N^{\rm b}}\sum_{j}\tilde{C}_{j}(z) (2.8)
\tilde{B}^{\rm bb|s}(z)=\frac{1}{N^{\rm b}}\sum_{j}\tilde{B}_{j}(z) (2.9)

As a consequence, all site indices can be dropped and the correlation and response functions can be replaced by their mean values, $\tilde{C}_{i}(z)\equiv\tilde{C}^{\rm bb|s}(z)$, $\tilde{R}_{i}(z)\equiv\tilde{R}^{\rm bb|s}(z)$, $\tilde{B}_{i}(z)\equiv\tilde{B}^{\rm bb|s}(z)$. The superscripts bb$|$s here emphasize that we are considering moments conditioned on subnetwork values, though for brevity we drop them in the rest of the calculation. Similarly, for the correlations of auxiliary variables related to observations we can set $\tilde{B}_{a}(z)\equiv\tilde{B}^{\rm ss}(z)$. Of primary interest is then $\tilde{C}(z)$, the Laplace transformed posterior (co-)variance function of prediction errors, which as a site average can be thought of as a macroscopic measure of prediction performance. This should become self-averaging in the thermodynamic limit. To see this, consider e.g. the sum $\sum_{j}J_{ij}J_{ji}\tilde{R}_{j}(z)$. Replacing $\tilde{R}_{j}$ by $\tilde{R}$, the prefactor $\sum_{j}J_{ij}J_{ji}$ is a sum of $N^{\rm b}$ terms and so converges to its average $N^{\rm b}\eta j^{2}/N^{\rm b}=\eta j^{2}$ in the thermodynamic limit. So, as in [6], we can replace

\sum_{j}J_{ij}J_{ji}\tilde{R}_{j}(z)\sim\eta\,j^{2}\tilde{R}(z)

Making this and similar substitutions in the system (2.3), and choosing scalar noise covariances $\Sigma_{i}=\sigma_{\rm b}^{2}$ and $\Sigma_{a}=\sigma_{\rm s}^{2}$ as in [9], one finds

-\tilde{C}(z)\bigg(-\frac{\alpha k^{2}}{\sigma_{\rm s}^{2}+k^{2}\tilde{C}(z)}+j^{2}\tilde{B}(z)\bigg)+\tilde{R}(z)\bigg(z+\lambda-\eta j^{2}\tilde{R}(z)\bigg)=1 (2.10a)
\tilde{C}(z)\bigg(-z+\lambda-\eta j^{2}\tilde{R}(-z)\bigg)-\tilde{R}(z)\bigg(j^{2}\tilde{C}(z)+\sigma_{\rm b}^{2}\bigg)=0 (2.10b)
\tilde{B}(z)\bigg[\bigg(-z+\lambda-\eta j^{2}\tilde{R}(-z)\bigg)\bigg(-\frac{\alpha k^{2}}{\sigma_{\rm s}^{2}+k^{2}\tilde{C}(z)}+j^{2}\tilde{B}(z)\bigg)^{-1}\bigg(z+\lambda-\eta j^{2}\tilde{R}(z)\bigg)-\bigg(j^{2}\tilde{C}(z)+\sigma_{\rm b}^{2}\bigg)\bigg]=1 (2.10c)

Here $\tilde{B}^{\rm ss}(z)=-1/(\sigma_{\rm s}^{2}+k^{2}\tilde{C}(z))$ has already been substituted into equations (2.10a) and (2.10c).

We comment briefly on the relation of the above results to the Fluctuation Dissipation Theorem (FDT), which in terms of Laplace transforms reads

z\tilde{C}(z)=-\frac{\sigma_{\rm b}^{2}}{2}\big[\tilde{R}(z)-\tilde{R}(-z)\big] (2.11)

This can be compared with the expression for $\tilde{R}(z)$ that follows from (2.10b) (taken for $z$ and $-z$)

\tilde{R}(z)=\tilde{C}(z)\bigg[\frac{\lambda}{\sigma_{\rm b}^{2}+j^{2}(1+\eta)\tilde{C}(z)}-\frac{z}{j^{2}(1-\eta)\tilde{C}(z)+\sigma_{\rm b}^{2}}\bigg] (2.12)

The r.h.s. of the FDT is then

-\frac{\sigma_{\rm b}^{2}}{2}\big[\tilde{R}(z)-\tilde{R}(-z)\big]=z\tilde{C}(z)\,\frac{\sigma_{\rm b}^{2}}{j^{2}(1-\eta)\tilde{C}(z)+\sigma_{\rm b}^{2}} (2.13)

Comparing with (2.11), the FDT is satisfied for symmetric couplings ($\eta=1$) as expected, while there are progressively stronger deviations from FDT as $\eta$ decreases towards $-1$.
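The algebra behind (2.13) is short but easy to get wrong by a sign; a minimal symbolic check (assuming sympy is available, and using that $\tilde{C}(z)=\tilde{C}(-z)$ for a stationary correlation function, so that $\tilde{C}$ can be treated as a single symbol):

```python
import sympy as sp

z, lam, j, eta, sb, C = sp.symbols('z lambda j eta sigma_b C')

# Eq. (2.12); C stands for C~(z) = C~(-z), the stationary posterior correlation.
def R(zz):
    return C * (lam / (sb**2 + j**2 * (1 + eta) * C) - zz / (j**2 * (1 - eta) * C + sb**2))

fdt_rhs = -sb**2 / 2 * (R(z) - R(-z))                    # r.h.s. of the FDT, eq. (2.11)
target = z * C * sb**2 / (j**2 * (1 - eta) * C + sb**2)  # eq. (2.13)
print(sp.simplify(fdt_rhs - target))                     # prints 0
```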

2.3 Comparison with known results

As a consistency check and to support our claim of exactness in the thermodynamic limit, we briefly compare our results with expressions for $\tilde{C}(z)$ and $\tilde{R}(z)$ that can be worked out by alternative means.

In general, from (2.10b) we can get an expression for $\tilde{R}(z)$ in terms of $\tilde{C}(z)$: this is (2.12). Substituting into (2.10a), $\tilde{B}(z)$ can also be worked out as a function of $\tilde{C}(z)$. Using these expressions for $\tilde{R}(z)$ and $\tilde{B}(z)$ in equation (2.10c), one finds a closed algebraic equation for the posterior variance $\tilde{C}(z)$

z^{2}=\left[-\frac{\sigma_{\rm b}^{2}}{\tilde{C}}+\frac{\alpha\frac{k^{2}\sigma_{\rm b}^{2}}{\sigma_{\rm s}^{2}}}{1+\frac{k^{2}}{\sigma_{\rm s}^{2}}\tilde{C}}+\frac{j^{2}}{1+\frac{j^{2}}{\sigma_{\rm b}^{2}}\tilde{C}}+\frac{\lambda^{2}}{\big(1+(1+\eta)\frac{j^{2}}{\sigma_{\rm b}^{2}}\tilde{C}\big)^{2}}\right]\bigg(1+(1-\eta)\frac{j^{2}}{\sigma_{\rm b}^{2}}\tilde{C}\bigg)^{2} (2.14)

This is the same expression as obtained by calculations in [9] using an explicit average over the quenched disorder variables $J_{ij}$ and $K_{ia}$. Particular cases are also further validated by random matrix theory, as follows.

2.3.1 $\alpha=0$.

This case corresponds to the absence of observations. One has then $\tilde{B}_{i}(z)$, $\tilde{B}_{a}(z)\equiv 0$ as these quantities simply play the role of Lagrange multipliers enforcing the conditioning on observations. To see this formally from the $\alpha\to 0$ limit of (2.10) one sets $\tilde{B}_{i}(z)=\alpha\tilde{D}_{i}(z)$ and $\tilde{B}_{a}(z)=\alpha\tilde{D}_{a}(z)$, where the $\tilde{D}$ stay nonzero for $N^{\rm s}\to 0$. One verifies that under this assumption the system has as solution the responses and correlations known from [6] (where a thorough analysis of the thermodynamic limit of an analogous linear dynamics without observations was provided)

\tilde{R}(z)=\frac{1}{2\eta}(z+\lambda)-\frac{1}{2\eta}\sqrt{(z+\lambda)^{2}-4j^{2}\eta} (2.15)
\tilde{C}(z)=\frac{4\,\sigma_{\rm b}^{2}}{\big[(\lambda+z)+\sqrt{(\lambda+z)^{2}-4j^{2}\eta}\big]\big[(\lambda-z)+\sqrt{(\lambda-z)^{2}-4j^{2}\eta}\big]-4j^{2}} (2.16)

2.3.2 $j=0$.

In this situation, the hidden variables have no direct interactions among themselves. By solving (2.14), one obtains

\tilde{C}(z)=\frac{\sigma_{\rm s}^{2}}{2k^{2}}\,\frac{\sigma^{2}}{\lambda^{2}-z^{2}}\bigg\{1-\alpha-\frac{\lambda^{2}-z^{2}}{\sigma^{2}}+\sqrt{\bigg[1-\alpha-\frac{\lambda^{2}-z^{2}}{\sigma^{2}}\bigg]^{2}+4\,\frac{\lambda^{2}-z^{2}}{\sigma^{2}}}\bigg\} (2.17)

where $\sigma=\sigma_{\rm b}k/\sigma_{\rm s}$. This coincides with the result in [9], which can be derived separately using methods from random matrix theory.
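As a quick numerical spot check, with purely illustrative parameter values, one can verify that (2.17) satisfies the $j=0$ reduction of (2.14), namely $z^{2}=-\sigma_{\rm b}^{2}/\tilde{C}+\alpha(k^{2}\sigma_{\rm b}^{2}/\sigma_{\rm s}^{2})/(1+k^{2}\tilde{C}/\sigma_{\rm s}^{2})+\lambda^{2}$:

```python
import numpy as np

# Illustrative parameter values; with j = 0 the hidden-to-hidden terms of (2.14) drop out.
alpha, lam, k, sigma_s, sigma_b, z = 0.5, 1.0, 2.0, 1.0, 1.0, 0.3
sigma = sigma_b * k / sigma_s
q = (lam**2 - z**2) / sigma**2
# C~(z) from eq. (2.17)
C = sigma_s**2 / (2 * k**2) / q * (1 - alpha - q + np.sqrt((1 - alpha - q)**2 + 4 * q))
rhs = (-sigma_b**2 / C
       + alpha * k**2 * sigma_b**2 / sigma_s**2 / (1 + k**2 / sigma_s**2 * C)
       + lam**2)
print(np.isclose(z**2, rhs))  # True
```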

2.3.3 $\eta=1$.

Here we have symmetric hidden-to-hidden couplings and (2.14) can be simplified to

z^{2}=-\frac{\sigma_{\rm b}^{2}}{\tilde{C}(z)}+\frac{\alpha\frac{k^{2}\sigma_{\rm b}^{2}}{\sigma_{\rm s}^{2}}}{1+\frac{k^{2}}{\sigma_{\rm s}^{2}}\tilde{C}(z)}+\frac{j^{2}}{1+\frac{j^{2}}{\sigma_{\rm b}^{2}}\tilde{C}(z)}+\frac{\lambda^{2}}{\big(1+2\frac{j^{2}}{\sigma_{\rm b}^{2}}\tilde{C}(z)\big)^{2}} (2.18)

This fifth order equation for the single site posterior covariance predicted by the extended Plefka expansion with hidden nodes is again confirmed by random matrix theory results [9].

3 Critical regions

We can now proceed to the main contribution of this work, namely the study of the properties of our conditioned dynamical system. The focus will be first on the power spectrum of the posterior covariance, given by $\tilde{C}(\text{i}\omega)$, the Laplace transform $\tilde{C}(z)$ evaluated at $z=\text{i}\omega$, as this is the quantity that is immediately computable from the equations (2.10). We will then translate the insights from the frequency domain analysis into the time domain, to access key observables including the timescales of posterior correlations.

We use the formulae provided by the Plefka approach for our analysis but stress that the behaviour we find, including the presence of critical regions in parameter space, is not an artefact of the Plefka expansion. The Plefka expansion is known to have a finite radius of convergence [16] in terms of coupling strength, and so it might be thought that one should check whether our parameter ranges lie in the region of validity of the expansion. However, as highlighted in section 2.3, the Plefka results (2.10) fully agree with other methods that do not rely on perturbative expansions, namely random matrix theory and disorder-averaged dynamical functionals [9], which implies that there can be no issues with convergence of the Plefka expansion.

3.1 Dimensionless system for the power spectrum

We would like first to understand how $\tilde{C}(\text{i}\omega)$ depends on the parameters $\lambda$, $j$, $k$, $\sigma_{\rm s}$, $\sigma_{\rm b}$, $\eta$ and $\alpha$. The last two of these are already dimensionless. By extracting the appropriate dimensional scales, we can reduce the other five parameters to only two dimensionless combinations.

From (2.1) one sees that $j$, $k$, $\lambda$ have dimensions $t^{-1}$, while the dimension of $\sigma_{\rm s/b}^{2}$ is $x^{2}t^{-1}$. We can build from these the dimensionless parameters $\gamma=j/\lambda$ and $p=\lambda/\sigma$. Here $\sigma=\sigma_{\rm b}k/\sigma_{\rm s}$, which has dimension $t^{-1}$ and contains the observation “intensity” $k$ as well as the ratio between the dynamical noises $\sigma_{\rm b}/\sigma_{\rm s}$. The latter is a third dimensionless parameter but as it only enters one prefactor we will not need to keep it separately.

Extracting appropriate dimensional amplitudes for all four two-point functions, we write them as

\tilde{C}(\text{i}\omega)=\frac{\sigma_{\rm s}^{2}}{k^{2}}\,\mathcal{C}_{\alpha,\gamma,\eta,p}(\Omega) (3.1a)
\tilde{R}(\text{i}\omega)=\frac{1}{j}\,\mathcal{R}_{\alpha,\gamma,\eta,p}(\Omega) (3.1b)
\tilde{B}(\text{i}\omega)=\frac{1}{\sigma_{\rm b}^{2}}\,\mathcal{B}_{\alpha,\gamma,\eta,p}(\Omega) (3.1c)
\tilde{B}^{\rm ss}(\text{i}\omega)=\frac{1}{\sigma_{\rm s}^{2}}\,\mathcal{B}^{\rm ss}_{\alpha,\gamma,\eta,p}(\Omega) (3.1d)

Here $\Omega=\omega/\sigma$ is a dimensionless frequency; similarly $\mathcal{C}_{\alpha,\gamma,\eta,p}(\Omega)$, $\mathcal{R}_{\alpha,\gamma,\eta,p}(\Omega)$, $\mathcal{B}_{\alpha,\gamma,\eta,p}(\Omega)$ and $\mathcal{B}^{\rm ss}_{\alpha,\gamma,\eta,p}(\Omega)$ are dimensionless and depend on the dimensionless parameters $\alpha$, $\gamma$, $\eta$ and $p$: for the sake of brevity, we do not write the subscripts indicating this dependence in the following. Let us briefly comment on (3.1a). One sees that $\tilde{C}(z)$ is directly proportional to $\sigma_{\rm s}^{2}$ and inversely proportional to $k^{2}$: the weaker the hidden-to-observed coupling and the stronger the dynamical noise acting on the observed variables, the less information one can extract from the subnetwork trajectories and the more uncertain the predictions for the behaviour of the bulk.

To summarize, we switch from eight original parameters $\{\alpha,\eta,\lambda,j,k,\sigma_{\rm s},\sigma_{\rm b},\omega\}$ to a set of five dimensionless parameters $\{\alpha,\eta,\gamma,p,\Omega\}$. For the dimensionless second moments (3.1), the system (2.10) becomes

-\mathcal{C}\bigg(-\frac{\alpha}{1+\mathcal{C}}+(\gamma p)^{2}\mathcal{B}\bigg)+(\gamma p)^{-1}\mathcal{R}\bigg(\text{i}\Omega+p-\eta\,\gamma\,p\,\mathcal{R}\bigg)=1 (3.2a)
(\gamma p)\,\mathcal{C}\bigg(-\text{i}\Omega+p-\eta\,\gamma\,p\,\mathcal{R}_{-}\bigg)-\mathcal{R}\bigg((\gamma p)^{2}\,\mathcal{C}+1\bigg)=0 (3.2b)
\mathcal{B}\bigg[\bigg(-\text{i}\Omega+p-\eta\,\gamma\,p\,\mathcal{R}_{-}\bigg)\bigg(-\frac{\alpha}{1+\mathcal{C}}+(\gamma p)^{2}\mathcal{B}\bigg)^{-1}\bigg(\text{i}\Omega+p-\eta\,\gamma\,p\,\mathcal{R}\bigg)-(\gamma p)^{2}\mathcal{C}-1\bigg]=1 (3.2c)

where we have dropped the frequency argument and introduced the shorthand $\mathcal{R}_{-}=\mathcal{R}(-\Omega)$. The solution for $\mathcal{B}^{\rm ss}$, already taken into account by substitution in (3.2), is given by $-1/(1+\mathcal{C})$.
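In practice, eliminating $\mathcal{R}$ and $\mathcal{B}$ from (3.2), as in section 2.3, gives the dimensionless analogue of (2.14) with $z^{2}\to-\Omega^{2}$, which can be solved numerically for $\mathcal{C}(\Omega)$ at each frequency by a one-dimensional root search. A minimal sketch (assuming scipy is available; the bracketing below simply returns a bracketed positive root and does not attempt to distinguish between possible multiple solution branches):

```python
import numpy as np
from scipy.optimize import brentq

def spectral_equation(c, Omega, alpha, gamma, p, eta):
    """Dimensionless form of eq. (2.14) at z = i*Omega, written as f(C) = 0."""
    gp2 = (gamma * p) ** 2
    bracket = (-1.0 / c
               + alpha / (1.0 + c)
               + gp2 / (1.0 + gp2 * c)
               + p**2 / (1.0 + (1.0 + eta) * gp2 * c) ** 2)
    return bracket * (1.0 + (1.0 - eta) * gp2 * c) ** 2 + Omega**2

def power_spectrum(Omega, alpha, gamma, p, eta, c_max=1e10):
    """Positive root C(Omega) of the closed algebraic equation, by bracketing."""
    lo, hi = 1e-12, 1.0
    while spectral_equation(hi, Omega, alpha, gamma, p, eta) < 0 and hi < c_max:
        hi *= 10.0
    return brentq(spectral_equation, lo, hi, args=(Omega, alpha, gamma, p, eta))

# example: power spectrum on a logarithmic frequency grid
Omegas = np.logspace(-4, 2, 200)
C = [power_spectrum(W, alpha=0.5, gamma=0.5, p=1.0, eta=0.5) for W in Omegas]
```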

3.2 Critical scaling

Figure 1: Parameter space spanned by $\alpha$, $\gamma$, $p$. The blue stripe and the red area mark the values for which the posterior covariance $\mathcal{C}(0)$ becomes singular, i.e. respectively $\alpha=0$, $\gamma>\gamma_{c}$ ($\forall p$) and $p=0$, $0<\alpha<1$ ($\forall\gamma$).

We next analyze in the parameter space $\alpha,\gamma,p$ (at fixed $\eta$) the singularities of $\mathcal{C}(0)$, the (dimensionless) zero frequency posterior covariance. $\mathcal{C}(0)$ is a convenient quantity to look at in the first instance as it helps one detect where qualitative changes of behaviour occur. We will see that these then show up also in the relaxation time and the inference error itself, i.e. the equal time posterior variance $C(t-t)=C(0)$. Their behaviour will be addressed respectively in sections 3.4, 3.5 and 3.8, 3.9, while in sections 3.6 and 3.7 we clarify further the interpretation of our results for the relaxation times by looking at the shape of the correlation functions.

Independently of $\eta$, we find two critical regions that are shown graphically in figure 1:

  1. $\forall p$, $\alpha=0$ and $\gamma>\gamma_{c}$

  2. $\forall\gamma$, $p=0$ and $0<\alpha<1$

The first case gives back the dynamics without observations ($\alpha=0$), for which $\gamma<\gamma_{c}=1/(1+\eta)$ is the condition of stability beyond which trajectories typically diverge in time (see [6]). Interestingly, as soon as $\alpha>0$, the constraints from observations make the solution stable irrespective of whether $\gamma$ is smaller or bigger than the critical value. For $\gamma>\gamma_{c}$ the observed trajectories would then be divergent, and so would the predicted hidden trajectories, while the error (posterior variance) of the predictions would remain bounded. It is difficult to conceive of situations where divergent mean trajectories would make sense, however, so we only consider the range $\gamma\leq\gamma_{c}$ in our analysis. In what follows, we will plot all the curves that refer to this region of the parameter space in blue.

The second limit, $p\to 0$, corresponds, for fixed ratio between the noises $\sigma_{\rm s}/\sigma_{\rm b}$, to $k\gg\lambda$: we call this scenario an “underconstrained” hidden system. In general for large $k$ the posterior variance decreases as $1/k^{2}$, as used in the scaling (3.1a). But for $\alpha<1$ there are directions in the space of hidden trajectories that are not constrained at all by subnetwork observations, and their variance will scale as $1/\lambda^{2}$ instead. These directions give a large contribution to the dimensionless $\mathcal{C}(\Omega)$, which diverges in the limit $k\gg\lambda$. In general one has a similar effect when $k/\sigma_{\rm s}\gg\lambda/\sigma_{\rm b}$, where the noise in the dynamics acts to effectively reduce the relevant interaction or decay constant. This behaviour is broadly analogous to what happens in learning of linear functions from static data [10, 11]: there the prediction error will also diverge when $\alpha$, which is then defined as the ratio of the number of examples to the number of spatial dimensions, is less than unity and no regularization is applied. For the curves belonging to this second region we choose the colour red.

Close to the two critical regions in the space of $\alpha$, $\gamma$ and $p$, $\mathcal{C}(0)$ is expected to exhibit a power-law dependence on the parameters specifying the distance away from the critical point. One can study this behaviour by using standard scaling techniques developed for the study of critical phase transitions [17]. The aim of this analysis, which we describe in A and summarize in the next section, is to find the master curves and associated exponents that describe the approach to the critical point(s). From the master curves for the power spectrum we can then derive (B) the corresponding scaling behaviour of the relaxation times and the inference error, which is discussed in sections 3.4-3.9 below.

3.3 Power Spectrum Master Curves

Figure 2: $\mathcal{C}(\Omega)$ for different $\alpha$, at small fixed $\eta$. We have chosen $\gamma$ and $p$ close to their critical values $1/(1+\eta)$ and 0, respectively, in order to see both critical regions as $\alpha$ is increased. The master curves for $\alpha\to 0$ and $\alpha\to 1$, resulting from the critical rescalings in A, are plotted (blue dashed line with crosses and red dashed line with circles respectively): $\Omega^{*}\sim 10^{-3}$ for the $\alpha\to 0$ master curve, while $\Omega^{*}\sim 10^{-1}$ for the one for $\alpha\to 1$. For $\Omega^{*}\ll\Omega\ll 1$ one has a Lorentzian tail $\mathcal{C}\sim 1/\Omega^{2}$ for small $\alpha$, while a different power-law feature, namely $\mathcal{C}\sim 1/\Omega$, emerges for $\alpha\to 1$. At $\Omega\sim O(1)$ a crossover to Lorentzian behaviour is seen for any $\alpha$.

The master curves of the power spectrum describe its shape in the critical regions and for small frequencies $\Omega$ of the order of a parameter-dependent characteristic frequency $\Omega^{*}$ (see A). These master curves are useful because they highlight the two main trends (with $\eta$ and with $\alpha$) that will manifest themselves in the relaxation time and in the inference error. In the first critical region, where there are few observations, the conditional dynamics is dominated by hidden-to-hidden interactions. Their degree of symmetry $\eta$ therefore plays a key role and in fact determines a non-trivial crossover in the critical behaviour (see figure 9). As developed in A.1, this crossover from equilibrium to generic non-equilibrium dynamics can be studied by including $\epsilon=1-\eta$ as a small parameter in the scaling analysis. By contrast, in the second critical region the scaling functions do not depend on $\eta$ (see figure 8, right). The intuition here is that the critical behaviour is dominated by whether a direction in the hidden trajectory space is constrained by observations or not, i.e. by hidden-to-observed rather than hidden-to-hidden interactions.

Regarding the trend with $\alpha$, the main feature is the decrease in $\mathcal{C}(0)$ with increasing $\alpha$, meaning that with many observations the predictions for the hidden dynamics will become arbitrarily precise, in line with intuition. This can be seen in figure 2, where one observes also that at $\Omega\sim O(1)$ all spectra collapse into a Lorentzian tail. This indicates an exponential decay of the correlations between prediction errors in the temporal domain. As the amplitude of the tail is largely $\alpha$-independent, the typical time of this exponential decay decreases with $\alpha$: with many observations, errors in the prediction of the hidden states become progressively less correlated with each other.

This already shows that from the power spectrum $\mathcal{C}(\Omega)$ one can extract useful information on relaxation times, and the same is true for the inference error, as we analyze more systematically in the next sections. Importantly, we find nontrivial power-law dependences on $\alpha$ that arise from the dynamical nature of our problem and have no analogue in static learning [10, 11].

3.4 Relaxation time for $\gamma\to\gamma_{c}$ and $\alpha\to 0$

We look at the relaxation time, which is a measure of time correlations in the errors of inferred hidden values. We study in particular how it depends on the number of observations and the interaction parameters.

The relaxation time can be defined in a mean-squared sense as

\tau^{2}=\frac{\int_{-\infty}^{+\infty}t^{2}C(t)\,dt}{2\,\tilde{C}(0)}=-\frac{1}{2\,\mathcal{C}(0)}\,\frac{d^{2}\mathcal{C}(\omega)}{d\omega^{2}}\bigg|_{\omega=0} (3.3)

Given this relation, results for power spectra can be directly used to obtain the master curves for relaxation times, see B.1 and B.2. To construct these master curves, let us introduce the dimensionless version of the relaxation timescale,

\mathcal{T}=\sigma\tau (3.4)
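From a numerically computed spectrum, $\mathcal{T}$ can be estimated directly from the curvature at zero frequency in (3.3); a minimal sketch (the argument `spectrum` is any callable returning $\mathcal{C}(\Omega)$, for instance the hypothetical `power_spectrum` helper sketched in section 3.1):

```python
import numpy as np

def relaxation_time(spectrum, dOmega=1e-3):
    """Dimensionless relaxation time from eq. (3.3): the spectrum is even in
    Omega, so the central second difference needs only C(0) and C(dOmega)."""
    c0 = spectrum(0.0)
    curvature = 2.0 * (spectrum(dOmega) - c0) / dOmega**2
    return np.sqrt(-curvature / (2.0 * c0))

# e.g. relaxation_time(lambda W: power_spectrum(W, alpha=0.5, gamma=0.5, p=1.0, eta=0.5))
```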

We look separately at the two critical regions, starting in this subsection with the first one, i.e. in the vicinity of the “no observations” critical point. We summarize here the main findings, leaving details for B.1. The distance from the singularity is controlled by $\alpha$ itself and by $\delta\gamma=\gamma_{c}-\gamma$. As mentioned in the overview in section 3.3, the behaviour in this region depends crucially on the symmetry parameter $\eta$ and so we break down the results accordingly.

3.4.1 $\eta=1$.

For $\delta\gamma^{2}<\alpha\ll 1$ we find

\mathcal{T}\sim\alpha^{-\frac{2}{3}} (3.5)

This power law dependence is visible in figure 3 (left, see curve for $\eta=1$).

3.4.2 $-1<\eta<1$.

The main result here is

\mathcal{T}\sim\alpha^{-\frac{1}{4}}\qquad\delta\gamma^{2}<\alpha\ll 1 (3.6)

This dependence can be verified in figure 3 (left, see curve for $\eta=0.1$).

3.4.3 Crossover at $\eta\approx 1$.

For symmetries $\eta$ close to 1, a crossover between the $\alpha^{-\frac{2}{3}}$ and $\alpha^{-\frac{1}{4}}$ behaviours occurs at a value of $\alpha^{*}$ dependent on $\epsilon=1-\eta$. This dependence can be analyzed quantitatively, see B.1.3; for an illustration we refer to the $\eta=0.85$ curve in figure 3 (left).

3.5 Relaxation time for $\alpha\to 1$ and $p\to 0$

Figure 3: (Left) Relaxation time in the vicinity of the critical region $\alpha\to 0$ and $\gamma\to\gamma_{c}$, as a function of $\alpha$ and for different $\eta$. Solid lines are the numerics, dashed ones with circles the analytic master curves. A plateau for small $\alpha$ emerges for $\eta$ close to 1. For $\delta\gamma^{2}<\alpha\ll 1$ one can see $\tau\sim\alpha^{-\frac{2}{3}}$ for $\eta=1$ and $\tau\sim\alpha^{-\frac{1}{4}}$ for $\eta=0.1$, consistent with Eqs. (3.5) and (3.6). The case $\eta=0.85$ interpolates between these power-law tails with a crossover at $\alpha^{*}\sim 0.0005$. (Right) Relaxation time in the vicinity of the critical region $\alpha\to 1$ and $p\to 0$, as a function of $\alpha$. As the variation with $\gamma$ is weak in this regime, we fix $\gamma=0.1$. Similarly the results are largely independent of $\eta$; we choose $\eta=0.1$. Solid lines are the numerics, dashed red ones with circles the analytic master curves. One can see that $\tau$ stays roughly constant for $\alpha<1$ while it drops as $\approx\sigma/(\alpha-1)$ for larger $\alpha$, in line with Eq. (3.7).
Figure 4: (Left) Relaxation times $\tau$ as a function of $\alpha$ for small $p$: we focus on negative symmetries and we fix $\gamma=1$. For $\eta=-1$, $\gamma\ll\gamma_{c}$ so that $\delta\gamma$ is not small: $\tau$ is well described by the master curve for $p\to 0$ (see red line with circles). For $\eta=-0.0001$, $\delta\gamma$ is small so that for small $\alpha$ one recovers the $\alpha^{-1/4}$ behaviour of figure 3 (left) (see dark blue line with crosses). At $\alpha\to 1$ all curves collapse onto the $\eta$-independent value $1/\big(p\sqrt{1+\gamma^{2}}\,\big)\approx 700$. For $\eta=-1$, $\tau$ increases with $\alpha$ until it reaches a maximum around $\alpha=1$ and then drops abruptly. (Inset, left) Plots of $C(t-t^{\prime})$ itself, the lag-dependent correlation function, emphasize this trend with $\alpha$. All curves have been normalized by $C(0)$ to start at one: the amplitude $C(0)$ decreases with increasing $\alpha$ as discussed in section 3.9. (Right) Relaxation time for small $p$ and $\delta\gamma$ as a function of $\alpha$, showing an interpolation between the behaviours in figure 3 (left) and (right). The red master curve with circles is expected to give a good fit for $\alpha\approx 1$ only, consistent with our results.

We next discuss the behaviour of the relaxation time in the second critical region, $\alpha\to 1$ (i.e. $N^{\rm s}=N^{\rm b}$) and $p\to 0$ (i.e. $k\gg\lambda$ at fixed $\sigma_{\rm s}/\sigma_{\rm b}$). The distance from the critical point is therefore represented by $p$ itself and by $\delta\alpha=\alpha-1$. As is shown in B.2, when both are small the results are independent of the degree of symmetry $\eta$ of the interactions among the hidden variables. For the dimensionless relaxation time we find, for positive $\delta\alpha$ (see B.2),

\mathcal{T}\sim\frac{1}{\delta\alpha}\qquad p<\delta\alpha\ll 1 (3.7)

and this is consistent with numerical data for $\alpha>1$, see figure 3 (right). The figure also shows that as $\alpha$ decreases below unity the relaxation time reaches a plateau, whose height can be worked out as $1/\big(p\sqrt{1+\gamma^{2}}\,\big)$. The plateau and the decay in eq. (3.7) are obtained as the two limiting behaviours of an $\eta$-independent master curve that applies when $\delta\alpha$ is of order $p$.

To understand the behaviour across the entire range $0<\alpha<1$ one can work out a separate master curve for fixed $\alpha$ and $p\to 0$ (see Eq. (B.12) in B), which is plotted in figure 4 (left) for $\eta=-1$ along with numerical results. This curve does have a nontrivial dependence both on $\eta$ and on $\alpha$; for $\alpha\to 0$ it approaches a constant, whose value can be checked to be consistent with the $\alpha=0$ result (Eq. (A.1)), as it must.

A comparison of figures 3 (right) and 4 (left) shows that the range around $\alpha=1$ where relaxation times are independent of $\eta$ depends not just on $p$ but also on $\gamma$. For small $\gamma$, as in figure 3 (right), we see good agreement with the $\eta$-independent master curves down to at least $\alpha\approx 0.5$. For $\gamma=1$, the value we take to plot curves for $\eta<0$, figure 4 (left) shows that already at $\alpha\sim 0.9$ there is visible variation across different values of $\eta$.

Figure 4 (left) contains a further insight: at fixed $\gamma$ and moderate $\alpha$, varying $\eta$ can shift the system away from the second critical region ($p\to 0$) towards the first one ($\delta\gamma\to 0$). Indeed, if $\gamma=1$, $\delta\gamma=\gamma_{c}-\gamma$ becomes small when $\eta$ is close to zero because $\gamma_{c}=1/(1+\eta)$. Accordingly, for the $\eta=-0.0001$ curve in the figure, the relaxation time for small $\alpha$ is captured better by the master curve for the first critical region (dark blue line with crosses) than by the one for the second region. At fixed $\eta$ and for appropriate combinations of small $\delta\gamma$ and $p$ one can even see both critical regions as $\alpha$ is increased. Figure 4 (right) shows a case in point.

As a general trend in the relaxation times one sees that they decrease significantly with increasing $\alpha$ (see e.g. figure 4, right): as the values of the hidden variables become constrained increasingly strongly by those of the observed ones, the remaining uncertainty in the prediction becomes local in time. There are surprising counter-examples, however. Figure 4 (left) shows that for strongly anti-symmetric interactions, e.g. $\eta=-1$, the relaxation time $\tau$ increases with $\alpha$ up until $\alpha=1$. This unusual trend is linked to a change of shape in the decay of the time-dependent correlation function, which is strongly non-exponential for $\alpha<1$ but becomes close to exponential beyond (see the inset of figure 4, left).

3.6 Correlation functions for $\gamma\to\gamma_{c}$ and $\alpha\to 0$

Figure 5: (Left) Correlation function $C(t-t^{\prime})$ in the critical region $\gamma\to\gamma_{c}$ and $\alpha\to 0$, for different $\eta$. Solid lines are the numerics, dashed ones with circles the limit curves for $\alpha=0$. Curves are normalized at the origin to focus on the shape of the correlation function decay. The curve for $\eta=0.85$ interpolates between the one for $\eta=1$, which has a power-law regime $\sim 1/\sqrt{t-t^{\prime}}$ (over a range that is still small in the figure but eventually diverges as $\delta\gamma\to 0$), and the pure exponential behaviour one finds for $\eta=0$. The crossover occurs at $t-t^{\prime}\sim 1/\epsilon^{2}\sim 45$. (Inset, right) Correlation functions for $\eta=-1$ for small $\alpha$, which are essentially identical to the limit curve for $\alpha=0$ (dashed line), and for $\alpha=0.5$; note the oscillatory behaviour. At $\alpha=0$ it is convenient to set $p=1/\gamma$ (see A.1) so we use this parameter setting also for nonzero $\alpha$ here. Small oscillations at long times can occur also in correlation functions at higher $\eta$, e.g. for $\eta=0$ in the left figure. (Right) Correlation function in the region $p\to 0$ and $\delta\alpha\to 0$. Numerical results are essentially independent of $\eta$ so close to $\alpha=1$; we take $\eta=-0.1$. The master curve (dashed red line with circles) is expected to capture the behaviour for $t-t^{\prime}$ beyond $1/\Omega^{*}\sim 700$ but the agreement is good also for smaller time lags. For $t-t^{\prime}\to 0$ the master curve would diverge and we have therefore normalized it by the $C(0)$ of the numerical correlation function.

Having studied overall relaxation timescales, we next look more comprehensively at the time correlation functions $C(t-t^{\prime})$, which can be obtained numerically by inverse Fourier transform of the power spectrum.
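Concretely, since $\mathcal{C}(\Omega)$ is even in $\Omega$, the (dimensionless) correlation function can be obtained by a cosine transform of the computed spectrum; a minimal sketch (the frequency cutoff and grid size are placeholders that must be adapted to how fast the spectrum decays):

```python
import numpy as np

def correlation_function(spectrum, time_lags, Omega_max=50.0, n_Omega=20001):
    """C(T) = (1/pi) * int_0^infty C(Omega) cos(Omega*T) dOmega, evaluated by
    the trapezoidal rule on a uniform grid; T is the dimensionless time lag."""
    Om, dOm = np.linspace(0.0, Omega_max, n_Omega, retstep=True)
    spec = np.array([spectrum(W) for W in Om])
    weights = np.full(n_Omega, dOm)
    weights[0] = weights[-1] = 0.5 * dOm   # trapezoidal end-point weights
    return np.array([np.sum(weights * spec * np.cos(Om * T)) for T in time_lags]) / np.pi
```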

For the first critical region we refer to figure 5 (left). Here the master curves are the result at $\alpha=0$ from [6]: for $\eta=0$ and $\eta=1$ these are given respectively by a pure exponential and by an exponentially weighted integral of a modified Bessel function $I_{1}$. In the latter case the correlation function has a $1/\sqrt{t-t^{\prime}}$ regime indicated in the figure, before crossing over to $(t-t^{\prime})^{-3/2}$ times an exponential cutoff (which kicks in around $t-t^{\prime}=250$ in the case shown in the figure). For intermediate symmetries (e.g. $\eta=0.85$ in figure 5, left) one has a crossover between this behaviour and the pure exponential for $\eta=0$. This crossover occurs at a time $\sim 1/\epsilon^{2}$ (here $\epsilon=1-\eta$ as before), which is consistent with the frequency space crossover at frequency $\sim\epsilon^{2}$, see e.g. figure 9 (right).

For $\eta=-1$, in the $\alpha=0$ case $C(t-t^{\prime})$ is a Bessel function of the first kind $J_{1}$ [6]. This has oscillations, and we find that this behaviour persists as $\alpha$ is increased. This is shown in the inset of figure 5 (right), where the oscillations exhibit a longer decay time for $\alpha=0.5$, consistently with the findings on relaxation times of figure 4 (left). Note that small oscillations are also present in the curve for $\alpha=0.5$ of figure 4 (left, inset) on longer times than shown in the graph.

3.7 Correlation functions for $\alpha\to 1$ and $p\to 0$

We show an example of $C(t-t^{\prime})$ in this second critical region in figure 5 (right). The master curve for the power spectrum in this region (see A.2) tends, for $\delta\alpha\to 0$, to $1/\sqrt{1+(\Omega/p\sqrt{\gamma^{2}+1})^{2}}$, independently of $\eta$. Its inverse Fourier transform gives, in the time domain, a correlation proportional to the modified Bessel function $K_{0}\big(p\sqrt{\gamma^{2}+1}\,|T|\big)$, with $T$ the dimensionless time lag; this is plotted in the figure alongside the numerical results (dashed red line with circles).

3.8 Equal time posterior variance for $\gamma\to\gamma_{c}$ and $\alpha\to 0$

Figure 6: (Left) Inference error in the vicinity of the critical region $\alpha\to 0$ and $\gamma\to\gamma_{c}$, as a function of $\alpha$ and for different $\eta$. Solid lines are the numerics, dashed ones with circles the analytic master curves. The dominant contribution to the inference error is given by frequencies $\Omega\sim\Omega^{*}$ for $\eta=0.1$ while it is given by $\Omega\sim O(1)$ for $\eta=1$. This leads to a power-law dependence of the master curve in the former case, while in the latter regime the master “curve” is the $\alpha$-independent posterior variance of the case without observations, i.e. a constant, which $C(0)$ approaches for $\alpha\to 0$ as it should; for larger $\alpha$ it exhibits a smooth dependence on $\alpha$ unconnected to any critical behaviour. (Right) Inference error for small $p$ and $\delta\gamma$ as a function of $\alpha$. This connects the behaviours in the left plot and in figure 7. The red master curve with circles is expected to give a good fit only around $\alpha\approx 1$, as observed.

We turn finally to the behaviour of the inference error for the prediction of hidden unit trajectories. This is given by the equal time posterior correlator

C(t-t)=C(0)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\tilde{C}(\omega)\,d\omega=\frac{1}{2\pi}\,\frac{\sigma_{\rm s}^{2}}{k^{2}}\,\sigma\int_{-\infty}^{\infty}\mathcal{C}(\Omega)\,d\Omega=\frac{\sigma_{\rm s}\sigma_{\rm b}}{k}\,\mathcal{C}_{0} (3.8)

where $\mathcal{C}_{0}$ is a dimensionless equal time posterior variance. We see that the size of the error is generically proportional to the noise acting on the dynamics of hidden and observed variables, and inversely proportional to the hidden-to-observed interaction strength. We summarize in the following subsections our results for $C(0)$, with a focus on the $\alpha$-dependence, and leave details to B.3 and B.4. We begin with the first critical region.
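Numerically, $\mathcal{C}_{0}$ is simply the integral of the dimensionless power spectrum and can be evaluated from the spectra computed earlier; a minimal sketch (the frequency cutoff is a placeholder and must be large enough to capture the Lorentzian tail $\mathcal{C}\sim 1/\Omega^{2}$):

```python
import numpy as np

def equal_time_variance(spectrum, Omega_max=200.0, n_Omega=200001):
    """Dimensionless C_0 = (1/pi) * int_0^infty C(Omega) dOmega, cf. eq. (3.8)."""
    Om, dOm = np.linspace(0.0, Omega_max, n_Omega, retstep=True)
    spec = np.array([spectrum(W) for W in Om])
    return (np.sum(spec) - 0.5 * (spec[0] + spec[-1])) * dOm / np.pi   # trapezoidal rule
```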

3.8.1 $\eta=1$.

For $\eta=1$ we find that the equal time correlator becomes essentially independent of $\alpha$ for small $\alpha$, as one can see in figure 6 (left, curves for $\eta=1$). More generally the result is that the dependence on $\alpha$ across the range $0<\alpha<1$ is smooth, and to leading order unaffected by the vicinity of the critical region.

3.8.2 $-1<\eta<1$.

Here one obtains (see B.3)

C(0)\sim\frac{1}{p}\,(1-\eta)^{\frac{7}{4}}(1+\eta)^{\frac{1}{4}}\,\alpha^{-\frac{1}{4}} (3.9)

We thus predict $C(0)\sim\alpha^{-1/4}$, and this is consistent with the numerics, see e.g. figure 6 (left, curves for $\eta=0.1$). It is notable that Eq. (3.9) is independent of $\delta\gamma$; this behaviour holds for $\delta\gamma^{2}\ll\alpha\ll 1$, while for smaller $\alpha$ the value of $\delta\gamma$ would become relevant. The power law behaviour $C(0)\sim\alpha^{-1/4}$ is also consistent with the scaling $\tilde{C}(0)\sim\tau\,C(0)$ that one would expect on general grounds: the zero frequency power spectrum $\tilde{C}(0)$ is the integral of the correlation function, hence should scale as its amplitude $C(0)$ times the decay time $\tau$. That this relation holds here follows from our previous results, $\tau\sim\alpha^{-1/4}$ and $\tilde{C}(0)\sim\alpha^{-1/2}$ (see A.1).

3.9 Equal time posterior variance for $\alpha\to 1$ and $p\to 0$

Figure 7: Inference error in the vicinity of the critical region $\alpha\to 1$ and $p\to 0$, as a function of $\alpha$, at fixed $\gamma=0.1$. The dashed line with circles shows the prediction from the logarithmic divergence and first subleading term (see equation (B.16)), which is qualitatively remarkably accurate even away from $\alpha=1$. The prediction behaves as $\sim 1-\alpha$ for $\alpha<1$ as the linear-scale inset shows. For the numerical results we used $\eta=0.1$; other $\eta$ give virtually indistinguishable curves.

In the second critical region, the dominant terms of the integral (3.8) can be shown to scale near $\alpha=1$ as $1-\alpha$ for $\alpha<1$ and as $\mbox{const.}-\ln(\alpha-1)$ for $\alpha>1$ (see B.4). These terms are plotted (red line) in figure 7, where the linear-scale inset clearly shows the linear dependence on $1-\alpha$. A separate $p\to 0$ master curve for fixed $\alpha$ below 1 can also be derived (by integrating the solution of (A.36)) and matches smoothly to the $1-\alpha$ behaviour around $\alpha=1$.

As a common trend across the two critical regions we have the intuitively reasonable result that the inference error decreases when the number of observed variables gets bigger. Figure 6 (right) summarizes this and shows that, as in the case of the relaxation times, the behaviour with increasing $\alpha$ can cross over from the first critical region to the second provided that both $\delta\gamma$ and $p$ are small. It is worth re-emphasizing the non-trivial power law dependences of the inference error on $\alpha$ that occur in our dynamical setting and are quite distinct from the simpler behaviour in static learning scenarios [10, 11].

4 Conclusion

In this paper we considered the problem of inferring hidden states over time in a network of continuous degrees of freedom given a set of observed trajectories. We started from the results for a linear dynamics with Langevin noise derived in [5] and in particular we looked at the hidden posterior variance as a measure of the inference error. To study the average performance case, we considered the stationary regime (where time translation invariance makes it convenient to work in Fourier space), mean-field couplings (all-to-all, weak and long-ranged) and the limit of an infinitely large bulk size. Under these conditions, the Extended Plefka expansion used in [5] becomes exact; also errors become site-independent self-consistently and equivalent to the average error, which measures the quality of the prediction from the macroscopic point of view.

Our main goal was to study the properties of this average inference error, and the associated correlation functions and timescales of the posterior dynamics, as a function of the relevant dimensionless parameters $\alpha,\gamma,p,\eta$. Here $\alpha$ is the ratio between the numbers of observed and hidden nodes, $\gamma$ is related to the bulk internal stability and $p$ gives the relative weight of self-interactions and hidden-to-observed couplings. These structural parameters are assumed to be known, either by direct measurement or by theoretical estimation, and our results for the posterior statistics then quantify their interplay in determining the prediction error.

As the parameter space is relatively large, we organized the analysis around the critical regions where the (suitably non-dimensionalized) prediction error diverges. We first studied the power spectrum of the posterior correlations, deploying critical scaling approaches to identify the relevant variables and find scaling functions. These results could then be straightforwardly translated into the corresponding ones for relaxation times and inference errors, providing master curves for appropriately scaled numerical data, with non-trivial power law dependences on e.g. α\alpha resulting from the dynamical nature of our inference scenario.

The first critical region we analyzed corresponds to $\alpha\ll 1$, where there are many fewer observed nodes than hidden ones. Here we found that the presence of interaction symmetry ($\eta=1$) leads to quite different scaling behaviour than for the generic case $-1<\eta<1$, indicating the importance of even small deviations from detailed balance for the dynamics. This is in qualitative agreement with earlier studies on systems without observations, e.g. [18, 19].

The second critical region is $0<\alpha<1$ and $p\to 0$, where some parts of the hidden dynamics are strongly constrained but, because $\alpha<1$, there are other parts that remain unconstrained as there are still not enough observed nodes. The resulting singularities are driven essentially by the hidden-to-observed couplings and are therefore independent of the hidden-to-hidden interaction symmetry. A qualitative analogy can be drawn here to studies on “underconstrained” learning in neural networks, e.g. [10, 11], where the inference error can diverge at the point where the number of patterns to be learned equals the number of degrees of freedom. This happens when no weight decay is imposed on the dynamics, which in our scenario corresponds to small $\lambda$ and hence small $p$. The analogy is only partial as our dynamical setting has a much richer range of behaviours overall.

Another interesting comparison can be made. The Extended Plefka Expansion has been applied also to spin systems for the inference of hidden states [1]. The analytically tractable scenario of an infinitely large network of spins with random asymmetric couplings was studied using a replica approach [2] and there the error in predicting the states of hidden nodes does not exhibit a singularity structure like the one we find. It would be interesting to consider scenarios between these two extremes as in e.g. [20], to understand the relative importance of the type of dynamics (linear vs nonlinear) and the type of degrees of freedom (continuous vs discrete) in this context.

We stress that here the only form of noise we have included is dynamical noise acting on the time evolution of the observed nodes, rather than measurement noise affecting the accuracy of the observed trajectory. We limited ourselves to this case to focus our analysis on the interplay of other parameters such as the interaction strength and the number of observations compared to the number of hidden nodes. In future work it would be desirable to include measurement noise in the observation process as has been done in e.g. dynamical learning in neural networks [21]. A further extension would allow for observations that are both noisy and sparse (in time) [22].

In terms of other future work and potential applications, our results could be of interest in experimental design when only the spatio-temporal evolution of a few nodes can be controlled. Given our systematic analysis of the dependence of the average inference error on key parameters of the system, one could study how this might guide the experimental set-up in such a way as to maximize the inference accuracy. For example, the parameter $\alpha$ measuring the relative number of nodes to monitor over time could enter the specification of a hypothetical experimental protocol. We recall that, for both critical regions, we found that the inference error decreases with this parameter and identified the relevant power laws. If we suppose that an estimate of the other parameters ($\gamma$, $p$, $\eta$) is available either from previous measurements or some a priori knowledge, then our explicit expressions for $C(0)$, the average inference error when many hidden trajectories are reconstructed, might serve to fix a minimal $\alpha$ needed for achieving a $C(0)$ below a set precision threshold.

A major, complementary problem when extracting information from data is the estimation of parameters, as well as identifiability [23] and ultimately model selection. Statistical physics-inspired techniques have already been successfully applied e.g. to signalling and regulatory networks [24, 25] for learning the couplings from steady state data. To see how our results could be relevant in this regard, note that tackling inverse problems relies on an interplay between state inference and parameter estimation. In this paper, we have analyzed the inference problem for the time courses of hidden nodes, assuming that the model parameters are randomly distributed with known average properties, such as the coupling strength and the degree of symmetry (i.e. the deviation from equilibrium). As a next step one could consider inferring the parameters by Expectation-Maximization [26], where the Expectation part relies precisely on computing the posterior statistics. Our simple expressions for the average posterior variance in terms of the average coupling strength and degree of symmetry would then simplify this procedure in some regimes and help investigate it in an analytically controlled, thus more insightful, way.

Acknowledgement

This work was supported by the Marie Curie Training Network NETADIS (FP7, grant 290038). The authors acknowledge the stimulating research environment provided by the EPSRC Centre for Doctoral Training in Cross-Disciplinary Approaches to Non-Equilibrium Systems (CANES, EP/L015854/1). BB acknowledges also the Simons Foundation Grant No. 454953.

Appendix A Scaling analysis: power spectra

A.1 Master curves for $\gamma\to\gamma_{c}$ and $\alpha\to 0$

Figure 8: (Left) Numerically calculated power spectra for small $\alpha$, showing the approach to the limit curve for $\alpha\to 0$ with $\gamma$ close to $\gamma_{c}$ (first critical region). This master curve and the ones for $\alpha=0.00001$, $\alpha=0.0001$ and $\alpha=0.001$ are given by blue dashed lines with crosses. In this way one can see the amplitude variation for relatively large values of $\bar{\alpha}$; in the plot $\bar{\alpha}\sim 10^{2}$ (for $\alpha=0.00001$), $10^{3}$ (for $\alpha=0.0001$) and $10^{4}$ (for $\alpha=0.001$). Even for smaller $\bar{\alpha}$, the variation in shape with this parameter is small: $\bar{\alpha}$ mainly affects the height of the plateau close to $\Omega=0$ and the position of the crossover to the large frequency Lorentzian tail. (Right) Power spectra for small $\delta\alpha=\alpha-1$, illustrating the approach to the limit curve for $\delta\alpha\to 0$ with $p$ close to 0 (second critical region). This master curve and the ones for $\delta\alpha=0.001$, $\delta\alpha=0.01$ and $\delta\alpha=0.1$ are shown as red dashed lines with circles. From this plot one can examine the variation with $\delta\bar{\alpha}$, which in the plot has values $\delta\bar{\alpha}\sim 1$ (for $\delta\alpha=0.001$), $\delta\bar{\alpha}\sim 10$ (for $\delta\alpha=0.01$) and $\delta\bar{\alpha}\sim 100$ (for $\delta\alpha=0.1$). For the numerics we used $\eta=0.5$, though again the curves are essentially $\eta$-independent.

We begin with the behaviour of the posterior covariance power spectrum for $\gamma\to\gamma_{c}$ and $\alpha\to 0$. We already know the limit curve for the power spectrum: it is given by the spectrum at the critical point. At this point we have an $\alpha=0$ system, i.e. one without observations [6]. As the interactions and noise level relating to observed nodes are then irrelevant, we can set $k=j$, $\sigma_{\rm b}=\sigma_{\rm s}$ (i.e. $p=1/\gamma$, $\sigma=j$ and $\Omega=\omega/j$). From the expression in [6], the dimensionless power spectrum then reads

\mathcal{C}_{\gamma,\eta}(\Omega)=\frac{4}{\bigg[\frac{1}{\gamma}+\text{i}\Omega+\sqrt{\big(\frac{1}{\gamma}+\text{i}\Omega\big)^{2}-4\eta}\bigg]\bigg[\frac{1}{\gamma}-\text{i}\Omega+\sqrt{\big(\frac{1}{\gamma}-\text{i}\Omega\big)^{2}-4\eta}\bigg]-\frac{4}{\sigma^{2}}} \qquad (A.1)

$\mathcal{C}(0)$ becomes singular when $\gamma=\gamma_{c}=1/(1+\eta)$, as (A.1) then has a pole at $\Omega=0$. The approach to (A.1) at decreasing $\alpha$ is plotted in figure 8 (left).

A.1.1 $\eta=1$.

To explain the rescaling procedure for understanding the approach to the singularity at $\gamma=\gamma_{c}$ when $\alpha$ is nonzero, we first focus on the case $\eta=1$. In section 2.3.3 we derived an algebraic equation for $\tilde{C}(z)$ which, for the power spectrum $\mathcal{C}(\Omega)$, becomes

-\Omega^{2}=-\frac{1}{\mathcal{C}}+\frac{\alpha}{1+\mathcal{C}}+\frac{(\gamma p)^{2}}{1+(\gamma p)^{2}\mathcal{C}}+\frac{p^{2}}{\big[1+2(\gamma p)^{2}\mathcal{C}\big]^{2}} \qquad (A.2)

Approaching the singularity along either of the two directions $\alpha\to 0$ or $\delta\gamma\to 0$ we get two distinct power law divergences of $\mathcal{C}(0)$; a third direction of approach is from nonzero $\Omega$ at $\alpha=\delta\gamma=0$. Approaching along the $\delta\gamma$-direction ($\delta\gamma\to 0$ at $\alpha=\Omega=0$) we find from (A.2)

\mathcal{C}(0)\sim\frac{1}{p^{2}\sqrt{\delta\gamma}} \qquad (A.3)

For $\alpha\to 0$ at $\Omega=0$ and $\gamma=\gamma_{c}$, with $\gamma_{c}=1/2$ as $\eta=1$, the result is

\mathcal{C}(0)\sim\frac{\alpha^{-\frac{1}{3}}}{p^{2}} \qquad (A.4)

Finally for $\Omega\to 0$ at $\alpha=\delta\gamma=0$ the low frequency tail of $\mathcal{C}(\Omega)$ is known from [6] as

\mathcal{C}(\Omega)\sim\frac{1}{\sqrt{\Omega}} \qquad (A.5)

These three power laws can be combined into the scaling

\mathcal{C}(\Omega) = \frac{1}{p^{2}\sqrt{\delta\gamma}}\,\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha}) \qquad (A.6)
\bar{\Omega} = \frac{\Omega}{2\,p\,\delta\gamma} \qquad (A.7)
\bar{\alpha} = \frac{\alpha}{16\,\delta\gamma^{\frac{3}{2}}} \qquad (A.8)

where the exponents of $\delta\gamma$ in $\bar{\Omega}$ and $\bar{\alpha}$ are fixed by standard arguments (see [27] for details). Inserting this ansatz into (A.2) and taking the limit $\delta\gamma\to 0$ gives

-1+\bar{\mathcal{C}}^{2}+\bar{\alpha}\,\bar{\mathcal{C}}^{3}+\frac{\bar{\Omega}^{2}}{4}\,\bar{\mathcal{C}}^{4}=0 \qquad (A.9)

This equation implicitly determines the master curve, i.e. the scaling function $\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha})$. It describes the power spectrum in the region where $\delta\gamma$, $\alpha$ and $\Omega$ are all small but $\bar{\Omega}$ and $\bar{\alpha}$ are finite, which requires in particular that the dimensionless frequency $\Omega$ must be of the order of $\delta\gamma$. Numerical data in figure 9 (right) show good agreement with this master curve in the relevant regime.
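As a consistency check, (A.9) reproduces the three power laws above. For $\bar{\alpha}=\bar{\Omega}=0$ one has $\bar{\mathcal{C}}=1$ and hence $\mathcal{C}(0)=1/(p^{2}\sqrt{\delta\gamma})$, i.e. (A.3). For $\bar{\Omega}=0$ and $\bar{\alpha}\to\infty$ the balance $\bar{\alpha}\bar{\mathcal{C}}^{3}\simeq 1$ gives $\bar{\mathcal{C}}\simeq\bar{\alpha}^{-1/3}$ and thus $\mathcal{C}(0)\simeq 16^{1/3}\alpha^{-1/3}/p^{2}$, with the $\delta\gamma$-dependence dropping out as it must at $\gamma=\gamma_{c}$, consistent with (A.4). Finally, for $\bar{\alpha}=0$ and $\bar{\Omega}\to\infty$ one has $\bar{\mathcal{C}}\simeq\sqrt{2}/\sqrt{\bar{\Omega}}$, hence $\mathcal{C}(\Omega)\simeq 2/(p^{3/2}\sqrt{\Omega})$, again $\delta\gamma$-independent and consistent with (A.5).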

A.1.2 $-1<\eta<1$.

Let us now consider $\eta<1$, for which it is convenient to work with the entire system (3.2). For $\delta\gamma\to 0$ at $\alpha=0$ one finds amplitudes $\mathcal{C}(0)\sim(1-\eta)/(2\,\delta\gamma\,p^{2})$ and $\mathcal{B}(0)\sim-4\,\delta\gamma^{2}(1+\eta)^{4}/p^{4}$. At $\delta\gamma=\Omega=0$, we get $\mathcal{C}(0)\sim\alpha^{-1/2}$. For the third direction we can read off from [6] that at $\alpha=\delta\gamma=0$, the low-frequency tail of the power spectrum is given by $1/\Omega^{2}$ for $\eta<1$. Comparing the first and third expression suggests a crossover frequency determined by $1/\Omega^{2}=(1-\eta)/(2\,\delta\gamma\,p^{2})$, giving $\Omega\sim p\sqrt{2\,\delta\gamma}$. Using this we define scaling functions for $\mathcal{C}$ and $\mathcal{B}$ as

\mathcal{C}(\Omega) = \frac{1-\eta}{2\,\delta\gamma\,p^{2}}\,\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha}) \qquad (A.10)
\mathcal{B}(\Omega) = -\frac{4\,\delta\gamma^{2}(1+\eta)^{4}}{1-\eta^{2}}\,\bar{\mathcal{B}}(\bar{\Omega},\bar{\alpha}) \qquad (A.11)
\bar{\Omega} = \frac{\Omega}{\Omega^{*}}=\frac{\Omega}{p\,(1-\eta)}\sqrt{\frac{1+\eta}{2\,\delta\gamma}} \qquad (A.12)
\bar{\alpha} = \frac{\alpha\,(1-\eta)}{4\,\delta\gamma^{2}(1+\eta)^{3}} \qquad (A.13)

The somewhat complicated looking prefactors are chosen here to give scaling functions that will be independent of $p$ and $\eta$. The response $\mathcal{R}$, which also features in the original equations (3.2), does not need to be rescaled as it turns out to be equal to unity to leading order.

We insert the above rescalings into the system (3.2) and again look at the limit $\delta\gamma\to 0$. Some care is needed as there are competing orders of $\delta\gamma$ in the equations, so that one has to expand the response $\mathcal{R}$ not just to $\mathcal{O}(1)$ but to $\mathcal{O}(\delta\gamma)$. One then finds simply $\bar{\mathcal{B}}(\bar{\Omega},\bar{\alpha})=\bar{\alpha}$ and this makes sense: at $\alpha=0$ we must retrieve the results of [6], where the normalization of the MSRJD path integral leads all moments of auxiliary variables to vanish. The master curve for the posterior covariance spectrum can also be obtained explicitly, as

\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha})=\frac{1}{2\bar{\alpha}}\bigg[-(1+\bar{\Omega}^{2})+\sqrt{4\bar{\alpha}+(1+\bar{\Omega}^{2})^{2}}\bigg] \qquad (A.14)

It has the limits $\bar{\mathcal{C}}|_{\bar{\alpha}\to 0}\sim 1/(1+\bar{\Omega}^{2})$, $\bar{\mathcal{C}}|_{\bar{\Omega}\to\infty}\sim 1/\bar{\Omega}^{2}$ and $\bar{\mathcal{C}}|_{\bar{\alpha}\to\infty}\sim 1/\sqrt{\bar{\alpha}}$. The latter tells us how the prediction error decreases in the regime where $\alpha$ is still small but larger than $\delta\gamma^{2}$. The fit provided by the master curve (A.14) for different small values of $\alpha$ is shown in figure 8 (left).
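These limits can be read off at a glance by rationalizing (A.14), $\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha})=2/\big[(1+\bar{\Omega}^{2})+\sqrt{(1+\bar{\Omega}^{2})^{2}+4\bar{\alpha}}\big]$: for $\bar{\alpha}\to 0$ (or, at fixed $\bar{\alpha}$, for $\bar{\Omega}\to\infty$) the square root tends to $1+\bar{\Omega}^{2}$, giving the first two limits, while for $\bar{\alpha}\to\infty$ it tends to $2\sqrt{\bar{\alpha}}$, giving the third.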

Figure 9: (Left) Master curves for different values of $\bar{\epsilon}$, the parameter that effectively measures the distance from $\eta=1$ when $\delta\gamma$ is close to zero. One can see the $1/\sqrt{\bar{\Omega}}$ tail for $\bar{\epsilon}\to 0$ and the $1/\bar{\Omega}^{2}$ tail for $\bar{\epsilon}\to\infty$ (light blue and dark blue curves), while intermediate values of $\bar{\epsilon}$ show a crossover between these two tails. (Right) Numerically determined power spectra for different values of the symmetry parameter $\eta$, at $\alpha\to 0$ and $\delta\gamma\to 0$. For $\eta=0.1$, $\Omega^{*}\sim 3\cdot 10^{-3}$ and for $\Omega^{*}\ll\Omega\ll 1$ one sees the Lorentzian tail. For $\eta=1$, $\Omega^{*}\sim 5\cdot 10^{-6}$ and in the range $\Omega^{*}\ll\Omega\ll 1$ one has the $\sim 1/\sqrt{\Omega}$ tail. For $\eta=0.9$ (i.e. $\epsilon=1-\eta=0.1$, $\bar{\epsilon}\approx 45$), the results interpolate between these two regimes as expected from the left figure; the crossover occurs at $\Omega\sim 0.01$. In fact, the limit $\bar{\mathcal{C}}|_{\bar{\Omega}\to\infty}\sim 1/\bar{\Omega}^{2}$, once one reinserts the dependence on $1-\eta=\epsilon$, gives $\mathcal{C}(\Omega)\sim\epsilon^{3}/\Omega^{2}$ for $\Omega^{*}\ll\Omega\ll 1$, while (A.9) has a tail $\mathcal{C}(\Omega)\sim 2/\sqrt{\Omega}$: these two tails meet around $\Omega\sim\epsilon^{2}$ ($\epsilon^{2}=0.01$ in this case), in agreement with the results for the limit $\epsilon\ll 1$ in the case without observations [6]. The dashed lines with circles are master curves for $\Omega$ in the vicinity of $\Omega^{*}$ for the different $\eta$. Dotted lines with crosses trace the master curves at $\Omega\sim O(1)$, which are independent of $\alpha$ and equal to the curves at $\alpha=0$. In the relevant range of large frequencies they behave essentially as Lorentzians.

A.1.3 Crossover at $\eta\approx 1$.

Above we found different power law behaviours and scaling functions for $\eta=1$ and $\eta<1$, so a crossover must occur at $\eta\approx 1$. To see this, the two cases $\eta=1$ and $\eta<1$ can be analyzed as limiting cases of a more general scaling ansatz that accounts explicitly for the effect of $\epsilon=1-\eta$, i.e. the distance from the symmetric value $\eta=1$. This becomes an additional critical parameter that enters the scaling functions in the combination

\bar{\epsilon}=\epsilon/\sqrt{\delta\gamma} \qquad (A.15)

We define these scaling functions via

\mathcal{C}(\Omega) = \frac{2+\bar{\epsilon}}{2\,\sqrt{\delta\gamma}\,p^{2}}\,\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha},\bar{\epsilon}) \qquad (A.16)
\mathcal{B}(\Omega) = -\frac{32\,\delta\gamma^{\frac{3}{2}}}{2+\bar{\epsilon}}\,\bar{\mathcal{B}}(\bar{\Omega},\bar{\alpha},\bar{\epsilon}) \qquad (A.17)
\bar{\Omega} = \frac{\Omega}{p\,(2+\bar{\epsilon})\,\delta\gamma} \qquad (A.18)
\bar{\alpha} = \frac{\alpha\,(2+\bar{\epsilon})}{32\,\delta\gamma^{\frac{3}{2}}} \qquad (A.19)

We then find again $\bar{\mathcal{B}}=\bar{\alpha}$, while the master curve $\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha},\bar{\epsilon})$ solves the polynomial equation

(8+\bar{\mathcal{C}}\,\bar{\epsilon}(2+\bar{\epsilon}))^{2}\big(-4+\bar{\mathcal{C}}(2+\bar{\epsilon})(-\bar{\epsilon}+\bar{\mathcal{C}}(1+\bar{\alpha}\,\bar{\mathcal{C}})(2+\bar{\epsilon}))\big)+\bar{\mathcal{C}}^{4}(2+\bar{\epsilon})^{6}\bar{\Omega}^{2}=0 \qquad (A.20)

The solution of this equation for a range of different $\bar{\epsilon}$ is plotted in figure 9 (left).
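To illustrate how such master curves can be generated, a minimal numerical sketch is given below (not the authors' code; the function name is ours). It treats (A.20) as a polynomial in $\bar{\mathcal{C}}$ and keeps its positive real root, which was unique for the parameter values we tried:

```python
# Illustrative sketch: solve the master-curve equation (A.20) for C-bar at
# given (Omega_bar, alpha_bar, eps_bar) by finding the roots of the
# corresponding polynomial in C-bar.
import numpy as np
from numpy.polynomial import Polynomial as P

def master_curve_A20(Omega_bar, alpha_bar, eps_bar):
    e = eps_bar
    # first factor: 8 + C*e*(2+e)
    P1 = P([8.0, e * (2.0 + e)])
    # second factor: -4 + C*(2+e)*(-e + (2+e)*C + alpha_bar*(2+e)*C^2)
    inner = P([-e, (2.0 + e), alpha_bar * (2.0 + e)])
    P2 = P([-4.0]) + (2.0 + e) * P([0.0, 1.0]) * inner
    # last term: C^4 * (2+e)^6 * Omega_bar^2
    P3 = P([0.0, 0.0, 0.0, 0.0, (2.0 + e) ** 6 * Omega_bar ** 2])
    poly = (P1 ** 2 * P2 + P3).trim()   # drop trailing zero coefficients
    roots = poly.roots()
    real = roots[np.abs(roots.imag) < 1e-8].real
    # keep the positive real root (unique for the parameter values we tried)
    return float(np.min(real[real > 0]))

# example: small eps_bar at Omega_bar = alpha_bar = 0 gives C-bar close to 1,
# the value required by the eta = 1 limit (A.9)
print(master_curve_A20(0.0, 0.0, 0.01))
```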

The two previous cases $\eta=1$ and $\eta<1$ (with $\delta\gamma\to 0$) are recovered as the limits for $\bar{\epsilon}\to 0$ and $\bar{\epsilon}\to\infty$ respectively. In the first limit, $(2+\bar{\epsilon})\sqrt{\delta\gamma}\sim 2\sqrt{\delta\gamma}$ and the rescaling relations (A.6), (A.7), (A.8) for $\eta=1$ are retrieved as they should be; accordingly, the equation (A.20) for the master curve becomes exactly (A.9). On the other hand, when $\bar{\epsilon}\to\infty$, $(2+\bar{\epsilon})\sqrt{\delta\gamma}\sim\bar{\epsilon}\sqrt{\delta\gamma}=\epsilon$ and we recover the rescalings (A.10)-(A.13) adopted for $\eta<1$; in this case (A.20) reduces to

-1+\bar{\mathcal{C}}+\bar{\alpha}\,\bar{\mathcal{C}}^{2}+\bar{\mathcal{C}}\,\bar{\Omega}^{2}=0 \qquad (A.21)

whose positive solution is given by (A.14).

A.2 Master curves for $\alpha\to 1$ and $p\to 0$

In this section we look at the scaling around the second critical region, $\alpha\to 1$ and $p\to 0$. At $\alpha=1$, we find the following power law scaling with $p$ of the amplitudes

\mathcal{C}(0)\sim\frac{1}{p\sqrt{\gamma^{2}+1}} \qquad (A.22)
\mathcal{R}(0)\sim\frac{\gamma p}{\sqrt{\gamma^{2}+1}} \qquad (A.23)
\mathcal{B}(0)\sim-1 \qquad (A.24)

Approaching from the other direction, $p=0$, gives $\mathcal{C}(0)\sim 1/\delta\alpha$. At $\alpha=1$ and $p=0$, finally, one finds for small nonzero frequency $\Omega$ that $\mathcal{C}(\Omega)\sim 1/\Omega$. Equating the three divergences above identifies a crossover frequency $\Omega^{*}=p\sqrt{\gamma^{2}+1}$ and similarly a characteristic value for $\delta\alpha$. We thus define rescaled quantities again

\mathcal{C}(\Omega) = \frac{1}{p\,\sqrt{\gamma^{2}+1}}\,\bar{\mathcal{C}}(\bar{\Omega},\delta\bar{\alpha}) \qquad (A.25)
\mathcal{R}(\Omega) = \frac{\gamma\,p}{\sqrt{\gamma^{2}+1}}\,\bar{\mathcal{R}}(\bar{\Omega},\delta\bar{\alpha}) \qquad (A.26)
\mathcal{B}(\Omega) = -\bar{\mathcal{B}}(\bar{\Omega},\delta\bar{\alpha}) \qquad (A.27)
\bar{\Omega} = \frac{\Omega}{\Omega^{*}}=\frac{\Omega}{p\,\sqrt{\gamma^{2}+1}} \qquad (A.28)
\delta\bar{\alpha} = \frac{\delta\alpha}{p\,\sqrt{\gamma^{2}+1}} \qquad (A.29)

Inserting into the system (3.2), taking $p\to 0$ and keeping only the leading terms, one finds as the master curve for $\mathcal{C}(\Omega)$

\bar{\mathcal{C}}(\bar{\Omega},\delta\bar{\alpha})=\frac{-\delta\bar{\alpha}+\sqrt{4+\delta\bar{\alpha}^{2}+4\bar{\Omega}^{2}}}{2\,(1+\bar{\Omega}^{2})} \qquad (A.30)

with limits $\bar{\mathcal{C}}|_{\delta\bar{\alpha}\to 0}\sim 1/\sqrt{1+\bar{\Omega}^{2}}$, $\bar{\mathcal{C}}|_{\bar{\Omega}\to\infty}\sim 1/\bar{\Omega}$ and $\bar{\mathcal{C}}|_{\delta\bar{\alpha}\to\infty}\sim 1/\delta\bar{\alpha}$. From the latter one sees again the decrease of the inference error for increasing number of observations (while remaining in the regime studied here, where $\delta\alpha$ is small and $p<\delta\alpha\ll 1$). The master curve predictions for generic $\delta\bar{\alpha}$ are compared to direct numerical evaluation of $\mathcal{C}$ in figure 8 (right).

So far we have focussed on the $\alpha\to 1$ end of the second critical region. As this region covers the entire range $0<\alpha<1$, however, one can also study the critical behaviour as $p\to 0$ for fixed $\alpha<1$. The crossover into this regime can be seen by taking $\delta\bar{\alpha}\to-\infty$, where $\bar{\mathcal{C}}\to|\delta\bar{\alpha}|/(1+\bar{\Omega}^{2})$ from (A.30). Including the prefactor from (A.25) and using (A.28) gives

\mathcal{C}(\Omega)=\frac{1-\alpha}{p^{2}(\gamma^{2}+1)+\Omega^{2}} \qquad (A.31)

This indicates that in general, for finite $1-\alpha$, $\mathcal{C}$ will be of order $1/p^{2}$, which suggests the following scaling for small $p$ and $0<\alpha<1$

\mathcal{C}(\Omega) = \frac{1}{(p\,\gamma)^{2}}\,\bar{\mathcal{C}}(\bar{\Omega},\delta\alpha,\eta) \qquad (A.32)
\mathcal{R}(\Omega) = \gamma\,\bar{\mathcal{R}}(\bar{\Omega},\delta\alpha,\eta) \qquad (A.33)
\mathcal{B}(\Omega) = -\bar{\mathcal{B}}(\bar{\Omega},\delta\alpha,\eta) \qquad (A.34)
\bar{\Omega} = \frac{\Omega}{p} \qquad (A.35)

with $\delta\alpha=1-\alpha$. By substitution into the system (3.2) and taking the limit $p\to 0$ one can then obtain explicit solutions for $\bar{\mathcal{C}}$, $\bar{\mathcal{R}}$ and $\bar{\mathcal{B}}$. In particular we find $\bar{\mathcal{B}}=\alpha$, similarly to the previous cases, while $\bar{\mathcal{C}}$ satisfies

\frac{\alpha-1}{\bar{\mathcal{C}}}+\frac{1}{1+\bar{\mathcal{C}}}+\frac{1}{\gamma^{2}}\bigg[\frac{1}{(1+\bar{\mathcal{C}}(1+\eta))^{2}}+\frac{\bar{\Omega}^{2}}{(1+\bar{\mathcal{C}}(1-\eta))^{2}}\bigg]=0 \qquad (A.36)

In the limit where $1-\alpha\ll 1$ one retrieves the result (A.31), as required for consistency between the two scaling limits.

The result above is of interest for negative symmetry parameters $\eta$. As one approaches the extreme case $\eta\to-1$, i.e. close to antisymmetry, the critical $\lambda$ at $\alpha=0$ tends to zero. As a consequence, for finite $k$, the approach to the stability limit of the hidden dynamics corresponds to $p\to 0$ and is therefore captured by the master curve (A.36).
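A minimal numerical sketch of how (A.36) can be solved in practice is given below (again not from the paper; the function name and parameter values are ours, chosen for illustration). Since the left-hand side of (A.36) tends to $-\infty$ as $\bar{\mathcal{C}}\to 0^{+}$ (for $\alpha<1$) and approaches zero from above for large $\bar{\mathcal{C}}$, the root can be bracketed and found by standard one-dimensional root finding:

```python
# Illustrative sketch: solve the implicit equation (A.36) for the rescaled
# spectrum C-bar at fixed alpha < 1, gamma, eta and Omega_bar by bracketing.
from scipy.optimize import brentq

def cbar_A36(Omega_bar, alpha, gamma, eta):
    def f(C):
        return ((alpha - 1.0) / C
                + 1.0 / (1.0 + C)
                + (1.0 / gamma**2) * (1.0 / (1.0 + C * (1.0 + eta))**2
                                      + Omega_bar**2 / (1.0 + C * (1.0 - eta))**2))
    # f -> -inf as C -> 0+ (since alpha < 1) and f -> 0+ for large C,
    # so the root is bracketed between a very small and a very large value
    return brentq(f, 1e-12, 1e12)

# example with a negative symmetry parameter (illustrative values only)
print(cbar_A36(0.0, 0.5, 1.0, -0.5))
```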

Appendix B Scaling analysis: timescales and amplitudes

B.1 Relaxation time for $\gamma\to\gamma_{c}$ and $\alpha\to 0$

The relaxation time is defined in a mean-squared sense by (3.3) and can be rewritten in terms of the rescaled power spectrum as

\tau^{2}=-\frac{1}{2\,\bar{\mathcal{C}}(0,\bar{\alpha})}\frac{d^{2}\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha})}{d\omega^{2}}\bigg\rvert_{\omega=0} \qquad (B.1)

We consider the dimensionless version of this typical timescale (3.4), $\mathcal{T}=\sigma\tau$, and analyze separately the vicinity of the two critical points. As $\tau$ is determined directly from the power spectrum, its scaling behaviour follows from that of $\mathcal{C}(\Omega)$. Explicitly, in terms of the dimensionless frequency $\Omega=\omega/\sigma$, which for critical scaling is rescaled further to $\bar{\Omega}=\Omega/\Omega^{*}$, one has

\mathcal{T}=\mathcal{T}^{*}\bar{\tau}(\bar{\alpha}),\qquad \bar{\tau}^{2}(\bar{\alpha})=-\frac{1}{2\,\bar{\mathcal{C}}(0,\bar{\alpha})}\frac{d^{2}\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha})}{d\bar{\Omega}^{2}}\bigg\rvert_{\bar{\Omega}=0} \qquad (B.2)

with $\mathcal{T}^{*}=1/\Omega^{*}$.

B.1.1 $\eta=1$.

The relaxation time is rescaled using (B.2) as

\mathcal{T}=\frac{1}{p\,\delta\gamma}\,\bar{\tau}(\bar{\alpha}) \qquad (B.3)

where $\bar{\tau}(\bar{\alpha})$ is the solution of a system of two equations

\big(2+3\,\bar{\alpha}\,\bar{\mathcal{C}}\big)\,\bar{\mathcal{C}}\,\bar{\tau}^{2}-\frac{1}{4}\bar{\mathcal{C}}^{3}=0, \qquad -1+\bar{\mathcal{C}}^{2}+\bar{\alpha}\,\bar{\mathcal{C}}^{3}=0 \qquad (B.4)

which can be obtained by deriving from (A.9) one equation for $\bar{\mathcal{C}}(0,\bar{\alpha})$ and one for $d^{2}\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha})/d\bar{\Omega}^{2}\rvert_{\bar{\Omega}=0}$, and using relations (B.2) and (B.3). For simplicity we have denoted $\bar{\mathcal{C}}(0,\bar{\alpha})$ and $\bar{\tau}(\bar{\alpha})$ as $\bar{\mathcal{C}}$ and $\bar{\tau}$. From these two equations we note $\bar{\mathcal{C}}(0,\bar{\alpha})|_{\bar{\alpha}\to\infty}\sim\bar{\alpha}^{-\frac{1}{3}}$ and $\bar{\tau}|_{\bar{\alpha}\to\infty}\sim\bar{\alpha}^{-\frac{2}{3}}$, which implies for $\delta\gamma^{2}<\alpha\ll 1$ the power law (3.5).
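Explicitly, for $\bar{\alpha}\to\infty$ the second equation of (B.4) gives $\bar{\mathcal{C}}\simeq\bar{\alpha}^{-1/3}$, and inserting this into the first equation yields $\bar{\tau}^{2}=\bar{\mathcal{C}}^{2}/[4(2+3\bar{\alpha}\bar{\mathcal{C}})]\simeq\bar{\alpha}^{-4/3}/12$, i.e. $\bar{\tau}\sim\bar{\alpha}^{-2/3}$.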

B.1.2 $-1<\eta<1$.

By applying (B.2) with (A.14), one can rescale in this regime according to

\mathcal{T}=\frac{1}{(1-\eta)\,p}\sqrt{\frac{1+\eta}{2\,\delta\gamma}}\,\bar{\tau}(\bar{\alpha}) \qquad (B.5)

and $\bar{\tau}(\bar{\alpha})$ is given by

\bar{\tau}(\bar{\alpha})=\frac{1}{(4\bar{\alpha}+1)^{\frac{1}{4}}} \qquad (B.6)

with the limit $\bar{\tau}|_{\bar{\alpha}\to\infty}\sim\bar{\alpha}^{-\frac{1}{4}}$ corresponding to (3.6).

B.1.3 Crossover at $\eta\approx 1$.

The relaxation time scalings above can be seen as limiting cases of a more general scaling linked to the parameter $\bar{\epsilon}$. From the general master curve (A.20) we can derive the following system of two equations

[8+\bar{\mathcal{C}}\,\bar{\epsilon}(2+\bar{\epsilon})]^{2}\,[-4+\bar{\mathcal{C}}(2+\bar{\epsilon})(-\bar{\epsilon}+\bar{\mathcal{C}}(1+\bar{\alpha}\,\bar{\mathcal{C}})(2+\bar{\epsilon}))]=0
[-16\,\bar{\epsilon}+\bar{\mathcal{C}}(2+\bar{\epsilon})(16-3\bar{\epsilon}^{2}+4\,\bar{\mathcal{C}}\,\bar{\epsilon}(2+\bar{\epsilon})+\bar{\alpha}\,\bar{\mathcal{C}}(24+5\,\bar{\mathcal{C}}\,\bar{\epsilon}(2+\bar{\epsilon})))]\,\bar{\tau}^{2}=\frac{\bar{\mathcal{C}}^{3}(2+\bar{\epsilon})\,\bar{\epsilon}^{4}}{8+\bar{\mathcal{C}}\,\bar{\epsilon}(2+\bar{\epsilon})} \qquad (B.7)

where $\bar{\mathcal{C}}$ is shorthand for $\bar{\mathcal{C}}(0,\bar{\alpha},\bar{\epsilon})$ and $\bar{\tau}$ for $\bar{\tau}(\bar{\alpha},\bar{\epsilon})$. Consistent with the role of $\bar{\epsilon}$ discussed above, one can check that the limit for $\bar{\epsilon}\to 0$ is precisely the system (B.4), while for $\bar{\epsilon}\to\infty$ (B.7) becomes

-1+\bar{\mathcal{C}}+\bar{\alpha}\,\bar{\mathcal{C}}^{2}=0, \qquad -\bar{\mathcal{C}}+\bar{\tau}^{2}\big(-3+\bar{\mathcal{C}}(4+5\,\bar{\alpha}\,\bar{\mathcal{C}})\big)=0 \qquad (B.8)

Here the first equation agrees with (A.14) evaluated at $\bar{\Omega}=0$, as it should. Solving for $\bar{\tau}$, one finds (B.6).
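In more detail: the first equation of (B.8) gives $\bar{\mathcal{C}}=[\sqrt{1+4\bar{\alpha}}-1]/(2\bar{\alpha})=2/(1+\sqrt{1+4\bar{\alpha}})$, and using $\bar{\alpha}\bar{\mathcal{C}}^{2}=1-\bar{\mathcal{C}}$ in the second equation gives $\bar{\tau}^{2}=\bar{\mathcal{C}}/(2-\bar{\mathcal{C}})=1/\sqrt{1+4\bar{\alpha}}$, which is (B.6).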

For generic $\bar{\epsilon}$, the rescaled relaxation time $\bar{\tau}(\bar{\alpha})$ must then exhibit a crossover in its large-$\bar{\alpha}$ power law behaviour, i.e. from $\bar{\alpha}^{-\frac{1}{4}}$ to $\bar{\alpha}^{-\frac{2}{3}}$. This crossover takes place around $\bar{\alpha}^{*}=\bar{\epsilon}^{2}/4$; see figure 10.

B.2 Relaxation time for $\alpha\to 1$ and $p\to 0$

We next consider the behaviour of the relaxation time in the second critical region. From the rescaled expression (A.30) for $\mathcal{C}$ one has

\mathcal{T}=\frac{1}{p\,\sqrt{1+\gamma^{2}}}\,\bar{\tau}(\delta\bar{\alpha}) \qquad (B.9)

and

\bar{\tau}(\delta\bar{\alpha})=\sqrt{\frac{1}{2}\bigg(1-\frac{\delta\bar{\alpha}}{\sqrt{4+\delta\bar{\alpha}^{2}}}\bigg)} \qquad (B.10)

with the limit $\bar{\tau}|_{\delta\bar{\alpha}\to\infty}\sim 1/\delta\bar{\alpha}$ giving (3.7).

Figure 10: The rescaled relaxation time $\bar{\tau}(\bar{\alpha})$, normalized at the origin, for $\alpha\to 0$ and $\gamma\to\gamma_{c}$, as a function of $\bar{\alpha}$ for different values of $\bar{\epsilon}$. The two limits, $\bar{\epsilon}\to\infty$ and $\bar{\epsilon}\to 0$, are highlighted in light blue and dark blue respectively. For intermediate values of $\bar{\epsilon}$ one can see an interpolation between the corresponding power laws, i.e. $\bar{\alpha}^{-\frac{1}{4}}$ and $\bar{\alpha}^{-\frac{2}{3}}$, and this crossover occurs at $\bar{\alpha}^{*}=\bar{\epsilon}^{2}/4$ (for instance, at $\bar{\epsilon}=25$, $\bar{\alpha}^{*}\sim 150$). In the main text, the curve for $\eta=0.85$ shown in figure 3 (left) has a crossover at $\alpha^{*}\sim 0.0005$, which would correspond to the crossover for $\bar{\epsilon}\sim 20$ here.

In the region of fixed $0<\alpha<1$ and $p\to 0$ one again has a separate master curve, where one defines a rescaled time as

\mathcal{T}=\frac{1}{p}\,\bar{\tau}(\alpha,\eta) \qquad (B.11)

using (A.35). Given the master curve (A.36), $\bar{\tau}$ can be found as the solution of the system

\frac{1}{\gamma^{2}(1+\bar{\mathcal{C}}(1-\eta))^{2}}+\frac{(\alpha-1)\,\bar{\tau}^{2}}{\bar{\mathcal{C}}}+\frac{\bar{\mathcal{C}}\,\bar{\tau}^{2}}{(1+\bar{\mathcal{C}})^{2}}+\frac{2\,\bar{\mathcal{C}}(1+\eta)\,\bar{\tau}^{2}}{\gamma^{2}(1+\bar{\mathcal{C}}(1+\eta))^{3}}=0
\frac{\alpha-1}{\bar{\mathcal{C}}}+\frac{1}{1+\bar{\mathcal{C}}}+\frac{1}{\gamma^{2}(1+\bar{\mathcal{C}}(1+\eta))^{2}}=0 \qquad (B.12)

As expected, for $\alpha\to 1^{-}$ one recovers the $\delta\bar{\alpha}\to-\infty$ limit of (B.9), i.e. $\mathcal{T}\sim 1/(p\sqrt{1+\gamma^{2}})$, for any $\eta$, as can also be seen in figure 4 (left) of the main text.

B.3 Equal time posterior variance for $\gamma\to\gamma_{c}$ and $\alpha\to 0$

Finally we look at the equal time posterior correlator given by (3.8) in the main text. Its critical scaling properties depend on whether the integral over $\Omega$ that defines $\mathcal{C}_{0}$ is dominated by small frequencies $\Omega\sim\Omega^{*}$, where $\Omega^{*}$ is the relevant frequency scale in the appropriate critical region, or by $\Omega\sim 1$. The first case occurs when the critical master curve for $\mathcal{C}(\Omega)$ has an integrable tail towards large (scaled) frequencies; otherwise the second applies. This illustrates how the analysis in the frequency domain is important for determining the power-law scaling of $C(0)$.

B.3.1 $\eta=1$.

Here the dominant contribution to the integral (3.8) comes from $\Omega\sim O(1)$, and in this region the master curve for the spectrum is given by equation (A.1) (the case without observations). This explains the effective independence from $\alpha$ pointed out in the main text.

B.3.2 $-1<\eta<1$.

Here the dominant contribution to the prediction error comes from $\Omega\sim\Omega^{*}$, thus one has to evaluate the integral of the master curve (A.14). To do so we note from (A.10) that $\mathcal{C}_{0}$ can be written in scaled form as

\mathcal{C}_{0}=\frac{(1-\eta)\,\Omega^{*}}{2\,\delta\gamma\,p^{2}}\,\bar{\mathcal{C}}_{0}(\bar{\alpha}) \qquad (B.13)

with $\bar{\mathcal{C}}_{0}(\bar{\alpha})=\int_{-\infty}^{\infty}\bar{\mathcal{C}}(\bar{\Omega},\bar{\alpha})\,d\bar{\Omega}$. This function encodes the entire $\alpha$-dependence in the critical region. It has a finite limit for $\bar{\alpha}\to 0$, while for large $\bar{\alpha}$ it decays as $\bar{\mathcal{C}}_{0}\sim\bar{\alpha}^{-1/4}$, as one can show by noting that the relevant frequencies in (A.14), where the crossover to the $1/\bar{\Omega}^{2}$ tail occurs, are then $\bar{\Omega}\sim\bar{\alpha}^{1/4}$. Using this asymptotic behaviour in (B.13) and substituting also the expressions for $\Omega^{*}$ and $\bar{\alpha}$ from (A.12) and (A.13), respectively, one obtains

C(0)\sim\frac{(1-\eta)^{2}}{p\sqrt{2\,\delta\gamma\,(1+\eta)}}\,\bar{\alpha}^{-\frac{1}{4}}=\frac{1}{p}\,(1-\eta)^{\frac{7}{4}}(1+\eta)^{\frac{1}{4}}\,\alpha^{-\frac{1}{4}} \qquad (B.14)

i.e. equation (3.9) in the main text. Thus one can derive $C(0)\sim\alpha^{-1/4}$, where $\bar{\alpha}\gg 1$ corresponds to $\delta\gamma^{2}\ll\alpha\ll 1$ from (A.13).
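This asymptotic decay is easy to confirm numerically; the short sketch below (ours, for illustration only) integrates the master curve (A.14), using the rationalized form quoted after (A.14) to avoid cancellation at large frequencies:

```python
# Illustrative numerical check: the integral
# C0_bar(alpha_bar) = int dOmega_bar C_bar(Omega_bar, alpha_bar)
# of the master curve (A.14) decays as alpha_bar^(-1/4) at large alpha_bar.
import numpy as np
from scipy.integrate import quad

def cbar_A14(W, a):
    x = 1.0 + W ** 2
    # rationalized form of (A.14), numerically stable for large W
    return 2.0 / (x + np.sqrt(x ** 2 + 4.0 * a))

for a in [1e2, 1e4, 1e6]:
    c0, _ = quad(lambda W: cbar_A14(W, a), -np.inf, np.inf)
    print(a, c0, c0 * a ** 0.25)  # last column settles to a constant
```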

B.4 Equal time posterior variance for $\alpha\to 1$ and $p\to 0$

In the second critical region, focussing on $\alpha\to 1$, we have an interesting marginal case where the equal-time variance (3.8) receives contributions from all frequencies, ranging from the critical frequency scale $\Omega\sim\Omega^{*}$ to $\Omega\sim 1$. This is because the power spectrum (A.30) for critical frequencies has a $1/\bar{\Omega}$ tail for large $\bar{\Omega}=\Omega/\Omega^{*}$, which gives a logarithmically divergent integral (3.8). This divergence is cut off only by the crossover to a Lorentzian tail when $\Omega=\mathcal{O}(1)$. Including the prefactor from (A.25), one thus estimates

\mathcal{C}_{0}\approx\frac{\Omega^{*}}{p\sqrt{\gamma^{2}+1}}\,2\int_{0}^{1/\Omega^{*}}\bar{\mathcal{C}}(\bar{\Omega},\delta\bar{\alpha})\,d\bar{\Omega} \qquad (B.15)

The fraction in front of the integral equals unity, as $\Omega^{*}=p\sqrt{\gamma^{2}+1}$ from (A.28), so from the $1/\bar{\Omega}$ tail one finds $\mathcal{C}_{0}\approx 2\ln(1/\Omega^{*})$ to leading order. All of the interesting dependence on $\alpha$ is in the next, subleading term, which is relevant in practice as it only competes with a logarithmic divergence. Writing $\ln(1/\Omega^{*})$ as $\int_{0}^{1/\Omega^{*}}d\bar{\Omega}/(1+\bar{\Omega})$, this subleading term can be split off in the form

\mathcal{C}_{0}\approx 2\ln(1/\Omega^{*})+2\int_{0}^{\infty}\big[\bar{\mathcal{C}}(\bar{\Omega},\delta\bar{\alpha})-(1+\bar{\Omega})^{-1}\big]\,d\bar{\Omega} \qquad (B.16)

The remaining integral is convergent at the upper limit, so we have taken the upper limit $1/\Omega^{*}$ to infinity, as is appropriate to get the leading contribution for $p\to 0$. The integral is then a function of $\delta\bar{\alpha}$ only, which one finds varies as $|\delta\bar{\alpha}|$ for $\delta\bar{\alpha}\to-\infty$ and as $\mbox{const.}-\ln(\delta\bar{\alpha})$ for $\delta\bar{\alpha}\to\infty$.
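These two asymptotic behaviours can also be checked with a few lines of numerics (an illustrative sketch, not part of the paper):

```python
# Illustrative check: the subleading integral in (B.16),
# I(d) = int_0^inf [ C_bar(W, d) - 1/(1+W) ] dW  with C_bar from (A.30),
# grows ~ |d| for d -> -infinity and behaves as const - ln(d) for d -> +infinity.
import numpy as np
from scipy.integrate import quad

def cbar_A30(W, d):
    return (-d + np.sqrt(4.0 + d ** 2 + 4.0 * W ** 2)) / (2.0 * (1.0 + W ** 2))

def I(d):
    val, _ = quad(lambda W: cbar_A30(W, d) - 1.0 / (1.0 + W), 0.0, np.inf)
    return val

for d in [-100.0, -10.0, 0.0, 10.0, 100.0]:
    print(d, I(d))
```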

Finally, for the small-$\alpha$ end of this critical region, one should take the integral of the solution of (A.36). Here the only dependence on $\alpha$ arises from (A.36) itself, and gives a smooth variation of $\bar{\mathcal{C}}$ with this parameter. Thus, independently of whether the integral determining the posterior variance $C(0)$ is dominated by $O(1)$ frequencies ($\eta=1$) or by frequencies in the vicinity of $\Omega^{*}$ ($\eta<1$), the result will also be a smooth function of $\alpha$.

References

[1] Bachschmid-Romano L, Battistin C, Opper M and Roudi Y 2016 J. Phys. A: Math. Theor. 49 434003
[2] Bachschmid-Romano L and Opper M 2014 J. Stat. Mech. P06013
[3] Battistin C, Hertz J, Tyrcha J and Roudi Y 2015 J. Stat. Mech. P05021
[4] Dunn B and Roudi Y 2013 Phys. Rev. E 87 022127
[5] Bravi B and Sollich P 2016 arXiv preprint 1603.05538
[6] Bravi B, Sollich P and Opper M 2016 J. Phys. A: Math. Theor. 49 194003
[7] Kalman R E 1960 J. Basic Eng. 82 35–45
[8] Bishop C M 2006 Pattern Recognition and Machine Learning (Springer)
[9] Bravi B, Opper M and Sollich P 2017 Phys. Rev. E 95 012122
[10] Hertz J A, Krogh A and Thorbergsson G I 1989 J. Phys. A: Math. Gen. 22 2133–2150
[11] Opper M 1989 Europhys. Lett. 8 389–392
[12] Martin P, Siggia E and Rose H 1973 Phys. Rev. A 8 423–436
[13] Janssen H K 1976 Z. Phys. B: Cond. Mat. 23 377–380
[14] De Dominicis C 1978 Phys. Rev. B 18 4913–4919
[15] Girko V L 1986 Theory Probab. Appl. 30 677–690
[16] Plefka T 1982 J. Phys. A: Math. Gen. 15 1971–1978
[17] Cardy J 1996 Scaling and Renormalization in Statistical Physics (Cambridge University Press)
[18] Crisanti A and Sompolinsky H 1987 Phys. Rev. A 36 4922–4939
[19] Cugliandolo L F, Kurchan J, Le Doussal P and Peliti L 1997 Phys. Rev. Lett. 78 350–353
[20] Barra A, Genovese G, Sollich P and Tantari D 2017 arXiv preprint 1702.05882
[21] Krogh A 1992 J. Phys. A: Math. Gen. 25 1119–1133
[22] Vrettas M, Opper M and Cornford D 2015 Phys. Rev. E 91 012148
[23] Komorowski M, Costa M J, Rand D A and Stumpf M P H 2011 PNAS 108 8645–8650
[24] Molinelli E J et al 2013 PLOS Computational Biology 9
[25] Braunstein A, Pagnani A, Weigt M and Zecchina R 2008 J. Stat. Mech. P12001
[26] McLachlan G J and Krishnan T 1997 The EM Algorithm and its Extensions (Wiley)
[27] Bravi B 2016 Path integral approaches to subnetwork dynamics and inference Ph.D. thesis King's College London