This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

\pdfcolInitStack

tcb@breakable

Global thermodynamic manifold for conservative control of stochastic systems

Jordan R. Sawchuk jordan_sawchuk@sfu.ca    David A. Sivak dsivak@sfu.ca Department of Physics, Simon Fraser University, Burnaby, British Columbia, Canada V5A1S6
(August 9, 2025; August 9, 2025)
Abstract

Optimal control of stochastic systems plays a central role in nonequilibrium physics, with applications in the study of biological molecular motors and the design of single-molecule experiments. While exact analytic solutions to optimization problems are rare, under slow driving conditions, the problem can be reformulated geometrically solely in terms of equilibrium properties. In this framework, minimum-work protocols are geodesics on a thermodynamic manifold whose metric is a generalized friction tensor. Here, we introduce a new foundation for this friction-tensor formalism for conservatively driven systems. Under complete control of the potential energy, a global thermodynamic manifold (on which points are identified with instantaneous energy landscapes) has as its metric a full-control friction tensor. Arbitrary partial-control friction tensors arise naturally as inherited metrics on submanifolds of this global manifold. Leveraging a simple mathematical relationship between system dynamics and the geometry of the global manifold, we derive new expressions for the friction tensor that offer powerful tools for interpretation and computation of friction tensors and minimum-work protocols. Our results elucidate a connection between relaxation and dissipation in slowly driven systems and suggest optimization heuristics. We demonstrate the utility of these developments in three illustrative examples.

I Introduction

A central question in nonequilibrium thermodynamics concerns the “price of haste” [1]. For quasistatic (infinitely slow) processes, thermodynamic systems remain at equilibrium, rendering total dissipation independent of driving protocol. However, for processes occurring in finite time, established results from equilibrium thermodynamics no longer apply: instantaneous values of macroscopic observables fundamentally depend on the driving history. For mesoscopic systems where the relevant energy scales are comparable to thermal fluctuations, these quantities are stochastic, exhibiting protocol-dependent distributions [2].

These factors pose challenges to uncovering universal principles governing systems of theoretical and practical interest like biological molecular motors operating in the nonequilibrium environment of living cells—often with remarkable speed and efficiency [3]. Optimal control theory is one framework for navigating the complexity of nonequilibrium statistical physics: given a cost function and boundary conditions, optimization selects a set of privileged paths from the uncountable infinity of thermodynamically distinct paths in control-parameter space [4].

Minimum-work protocols (MWPs) minimize the ensemble average of the excess work, the extra energy required to drive a process in finite time. Although MWPs have been analytically derived for some simple systems [5, 6, 7], the problem is typically intractable. Reformulating the optimal-control problem as a system of Hamiltonian equations allows for the application of numerical methods for ordinary differential equations [6]; however, computation rapidly becomes prohibitively expensive for large systems, high-dimensional control vectors, or systems driven near a critical point. Ongoing efforts to address these challenges include methods employing machine learning [8, 9].

An alternative approach is to apply suitable approximations that simplify the optimal control problem. Approximations of the excess work based on purely equilibrium information have been derived in the limits of both fast [10] and slow [11, 12] driving. The slow-driving approximation gives rise to a formalism within which MWPs may be viewed as geodesics on a thermodynamic manifold. The metric on this manifold is a generalized friction tensor that depends only on the instantaneous value of the control parameters.

The friction-tensor formalism has been applied in various contexts, including the study of driven barrier crossings as a model of single-molecule force-extension experiments [13], driving the biological rotary motor F1 ATPase [14, 15], folding of DNA hairpins [16], control of protein copy-number distributions [17], bounds on the efficiency of membrane separation of binary gases [18], and nonequilibrium phase transitions [19]. The formalism has also been extended beyond the classical regime to quantum systems [20]. A re-weighted friction tensor has been introduced for the efficient sampling of free-energy surfaces [21] with applications to the binding affinity of drugs [22]. Recently it was shown for continuous overdamped dynamics that the thermodynamic length defined in the friction-tensor formalism is equivalent to the L2L^{2}-Wasserstein distance from optimal-transport theory under the restriction to equilibrium distributions [23].

In this paper, we establish a foundational geometric structure for the friction-tensor formalism. We define a non-Riemannian global thermodynamic manifold \mathcal{M} whose metric tensor (24) is the generalized friction tensor obtained by assuming arbitrary control over the microstate energies of a given system. We demonstrate how, in the case of conservative control, arbitrary partial-control friction tensors arise naturally as inherited metrics (30) on submanifolds of \mathcal{M}. We then derive two decompositions (42, 54) of the full-control friction tensor that facilitate numerical and analytic calculations, provide insight into the dynamical origins of dissipation for slowly driven systems, and suggest optimization heuristics. We provide concrete demonstrations of the framework in three illustrative examples.

II Conventions and assumptions

Our principal results apply to both discrete and continuous state spaces Ω\Omega under a set of assumptions about the nature of the dynamics and control. It will thus be advantageous to establish a unified notation that does not require any assumptions on the cardinality of the state space.

Summation (for discrete state spaces) and integration (for continuous state spaces) are unified by introducing a measure μ\mu on Ω\Omega: if g(x)g(x) is some measurable function on Ω\Omega,

Ωdμ(x)g(x)={xνΩg(xν),Ω discreteΩdxg(x),Ω continuous.\int_{\Omega}\mathrm{d}\mu(x)\,g(x)=\begin{dcases}\sum_{x_{\nu}\in\Omega}g(x_{\nu})\ ,\ \ &\Omega\text{ discrete}\\ \int_{\Omega}\mathrm{d}x\,g(x)\ ,&\Omega\text{ continuous}\end{dcases}\ . (1)

That is, μ\mu is the counting measure for discrete Ω\Omega and the Lebesgue measure for continuous Ω\Omega. Then for any state functions gg and hh,

(g,h)Ωdμ(x)g(x)h(x)\left(g,h\right)\equiv\int_{\Omega}\mathrm{d}\mu(x)\,g(x)h(x)\ (2)

is the 2\ell^{2} inner product (dot product) for discrete Ω\Omega or the L2L^{2} inner product for continuous Ω\Omega. In particular,

gq(g,q)\left\langle g\right\rangle_{q}\equiv\left(g,q\right) (3)

is the average value of gg with respect to a distribution qq over Ω\Omega.

Throughout, we extend in the straightforward way the definition of the inner product (2) to arguments other than state functions (e.g., (𝒖,𝒗)=𝒖𝖳𝒗\left(\bm{u},\bm{v}\right)=\bm{u}^{\mathsf{T}}\bm{v} for vectors 𝒖,𝒗m\bm{u},\bm{v}\in\mathbb{R}^{m}).

We consider ergodic continuous-time Markov processes XtX_{t} taking values xx in Ω\Omega. The time evolution of the probability distribution p(x,t)p(x,t) over Ω\Omega is governed by a generator 𝒢\mathcal{G}, i.e., p/t=𝒢p\partial p/\partial t=\mathcal{G}p. External control is accomplished through a vector of control parameters 𝒖\bm{u} such that 𝒢t=𝒢𝒖(t)\mathcal{G}_{t}=\mathcal{G}_{\bm{u}(t)}. The most general equation of motion for p(x,t)p(x,t) is then

pt(x,t)\displaystyle\frac{\partial p}{\partial t}(x,t) =[𝒢tp(t)](x)\displaystyle=\big{[}\mathcal{G}_{t}p(t)\big{]}(x) (4a)
Ωdμ(y)𝒢t(x,y)p(y,t),\displaystyle\equiv\int_{\Omega}\mathrm{d}\mu(y)\,\mathcal{G}_{t}(x,y)p(y,t)\ , (4b)

where in (4b) we define the integral kernel 𝒢t(x,y)\mathcal{G}_{t}(x,y) of 𝒢t\mathcal{G}_{t}, i.e., the rate of probability flowing to xx from yy. For discrete Ω\Omega, 𝒢t\mathcal{G}_{t} is a rate matrix 𝕎t\mathbb{W}_{t}, and this is the master equation. For continuous Ω\Omega, 𝒢t\mathcal{G}_{t} is the second-order differential Fokker-Planck operator t\mathcal{L}_{t}. The notation [𝒢tp(t)](x)[\mathcal{G}_{t}p(t)](x) here reflects that 𝒢t\mathcal{G}_{t} acts on the entire function p(t)p(t) (just as 𝕎t\mathbb{W}_{t} acts on the entire vector 𝒑(t)\bm{p}(t)).

We assume that the generator 𝒢𝒖\mathcal{G}_{\bm{u}} satisfies detailed balance for all 𝒖\bm{u}:

𝒢𝒖(y,x)π𝒖(x)=𝒢𝒖(x,y)π𝒖(y),\mathcal{G}_{\bm{u}}(y,x)\pi_{\bm{u}}(x)=\mathcal{G}_{\bm{u}}(x,y)\pi_{\bm{u}}(y)\ , (5)

where π𝒖\pi_{\bm{u}} is a distribution satisfying 𝒢𝒖π𝒖=0\mathcal{G}_{\bm{u}}\pi_{\bm{u}}=0. Under the assumptions of ergodicity and detailed balance, π𝒖eβV(𝒖)\pi_{\bm{u}}\propto\text{e}^{-\beta V(\bm{u})} is the unique equilibrium distribution for the process at fixed 𝒖\bm{u}, for potential energy VV and inverse temperature β(kBT)1\beta\equiv(k_{\rm B}T)^{-1}. Driving may thus be expressed by a time-dependent potential energy V(t)=V(𝒖(t))V(t)=V(\bm{u}(t)), i.e., control is conservative [24].

We often write gp(t)\left\langle g\right\rangle_{p(t)} as g(t)\left\langle g(t)\right\rangle for the average of a state function with respect to a time-dependent distribution p(t)p(t). When the parameters are temporally varied, π(t)π𝒖(t)\pi(t)\equiv\pi_{\bm{u}(t)} is the equilibrium distribution corresponding to the parameter values at time tt. The mean deviation of a state function gg from its instantaneous equilibrium value is δg(t)(g,δp(t))\langle\delta g(t)\rangle\equiv(g,\delta p(t)) for δp(t)p(t)π(t)\delta p(t)\equiv p(t)-\pi(t). Equilibrium (fixed-parameter) time-correlation functions are

Σg(t)\displaystyle\Sigma_{g}(t) δg(t)δg(0)eq\displaystyle\equiv\left\langle\delta g(t)\delta g(0)\right\rangle_{\text{eq}} (6a)
=Ωdμ(x)Ωdμ(y)P(x,t|y,0)π(y)δg(x)δg(y),\displaystyle=\int_{\Omega}\mathrm{d}\mu(x)\int_{\Omega}\mathrm{d}\mu(y)\,P(x,t|y,0)\pi(y)\,\delta g(x)\delta g(y)\ , (6b)

where P(x,t|y,0)=[et𝒢](x,y)P(x,t|y,0)=\left[\text{e}^{t\mathcal{G}}\right]\!(x,y) is the probability that the system is in state xx at time tt given that it was initialized in state yy at time 0.

With additional and not-too-restrictive assumptions on V(x,t)V(x,t) for continuous systems 111The technical condition (the Bakry-Emery criterion) is that VV be λ\lambda-convex [26]. A quadratic potential in d\mathbb{R}^{d} is an example of a λ\lambda-convex potential., these conditions mean that 𝒢\mathcal{G} has a discrete spectrum of real, non-positive eigenvalues [26].

The right eigenfunctions ψα\psi_{\alpha} and left eigenfunctions ϕα\phi_{\alpha} satisfy

𝒢ψα\displaystyle\mathcal{G}\psi_{\alpha} =λαψα\displaystyle=\lambda_{\alpha}\psi_{\alpha} (7a)
𝒢ϕα\displaystyle\mathcal{G}^{\dagger}\phi_{\alpha} =λαϕα.\displaystyle=\lambda_{\alpha}\phi_{\alpha}\ . (7b)

Here the adjoint 𝒢\mathcal{G}^{\dagger} is defined by (h,𝒢g)=(𝒢h,g)\left(h,\mathcal{G}g\right)=\left(\mathcal{G}^{\dagger}h,g\right). For discrete Ω\Omega, the adjoint operator is the transpose 𝕎𝖳\mathbb{W}^{\mathsf{T}} of the rate matrix, and for continuous Ω\Omega it is the backwards Kolmogorov operator \mathcal{L}^{\dagger}. In particular, with λ0=0\lambda_{0}=0, ψ0=π\psi_{0}=\pi and ϕ0=1\phi_{0}=1. The left and right eigenfunctions form a complete dual basis, i.e., they can be normalized such that (ψα,ϕβ)=δαβ(\psi_{\alpha},\phi_{\beta})=\delta_{\alpha\beta} for Kronecker delta δαβ\delta_{\alpha\beta}, and

αψα(x)ϕα(y)=δ(x,y),\sum_{\alpha}\psi_{\alpha}(x)\phi_{\alpha}(y)=\delta(x,y)\ , (8)

with δ(x,y)\delta(x,y) the Dirac delta function when xx and yy are continuous states and the Kronecker delta function when they are discrete states. Consequently, the fixed-𝒖\bm{u} generator admits a spectral decomposition

𝒢t(x,y)=α1λα(t)ψα(x,t)ϕα(y,t),\mathcal{G}_{t}(x,y)=\sum_{\alpha\geq 1}\lambda_{\alpha}(t)\psi_{\alpha}(x,t)\phi_{\alpha}(y,t)\ , (9)

such that Eq. (4b) can be written as

pt(x,t)=α1λα(t)(ϕα(t),p(t))ψα(x,t),\frac{\partial p}{\partial t}(x,t)=\sum_{\alpha\geq 1}\lambda_{\alpha}(t)\,\big{(}\phi_{\alpha}(t),p(t)\big{)}\,\psi_{\alpha}(x,t)\ , (10)

with λα\lambda_{\alpha}, ϕα\phi_{\alpha}, and ψα\psi_{\alpha} in general depending on time through the control parameters 𝒖\bm{u}.

We do not use Einstein summation in this paper due to potential ambiguities caused by expressions such as (9). However, for objects with clear geometric interpretations, we do distinguish vectors and covectors by upper and lower indices, respectively. For example, control parameters are written as uiu^{i}, and their conjugate forces are written with a lower index as fif_{i}. Latin indices are reserved for parametric control-parameter sets, while Greek indices are used for state-vector indices for discrete systems (e.g., probabilities pνp(xν)p_{\nu}\equiv p(x_{\nu})) and for eigenfunction indices across all systems.

III Thermodynamics, control, and geometry

Given a set of control parameters 𝒖\bm{u} and some cost functional 𝒞\mathcal{C}, an optimal control protocol is a curve 𝒖:[0,τ]m\bm{u}^{*}:\left[0,\tau\right]\to\mathbb{R}^{m} that minimizes the cost functional subject to some set of constraints. The choice of cost function depends on the context: e.g., one may wish to minimize entropy production or the loss of available work, maximize power output or efficiency, or optimize some combination of these [27, 1]. In this paper, we study protocols that minimize the average work.

Over a particular realization of a conservative control protocol, the energy V(Xt,t)V(X_{t},t) of the system at time tt depends on time through both the deterministic evolution V(t)V(t) of the potential energy and the stochastic dynamics XtX_{t}. On the ensemble level, these contributions can be separated in the time-derivative of the mean system energy V(t)p(t)=(V(t),p(t))\left\langle V(t)\right\rangle_{p(t)}=\left(V(t),p(t)\right) into the mean power and mean heat flow [28]:

ddt(V(t),p(t))=(dV(t)dt,p(t))mean power 𝒫(t)+(V(t),dp(t)dt)mean heat flow Q˙(t).\frac{\mathrm{d}}{\mathrm{d}t}\left(V(t),p(t)\right)=\underbrace{\left(\frac{\mathrm{d}V(t)}{\mathrm{d}t},p(t)\right)}_{\text{mean power }\left\langle\mathcal{P}(t)\right\rangle}+\underbrace{\left(V(t),\frac{\mathrm{d}p(t)}{\mathrm{d}t}\right)}_{\text{mean heat flow }\langle\dot{Q}(t)\rangle}\ . (11)

For a protocol V(t)V(t) over a time interval [0,τ][0,\tau], the average work (taken over many instantiations of the control protocol) is obtained by integrating 𝒫(t)\left\langle\mathcal{P}(t)\right\rangle over the protocol:

𝒲=0τdt(dV(t)dt,p(t)).\left\langle\mathcal{W}\right\rangle=\int_{0}^{\tau}\mathrm{d}t\left(\frac{\mathrm{d}V(t)}{\mathrm{d}t},\,p(t)\right)\ . (12)

It is convenient to instead minimize the mean excess work (or irreversible work), defined as the mean work minus the quasistatic work 𝒲(QS)=limτ𝒲\left\langle\mathcal{W}\right\rangle^{(\text{QS})}=\lim_{\tau\to\infty}\left\langle\mathcal{W}\right\rangle. Rescaling time in (12) with s=t/τs=t/\tau shows that all dependence on τ\tau comes from the distribution p(t)p(t), so the limit τ\tau\to\infty is taken by setting p(t)=π(t)p(t)=\pi(t). The mean excess work is then

𝒲ex=0τdt(dVdt,δp(t)).\left\langle\mathcal{W}_{\text{ex}}\right\rangle=\int_{0}^{\tau}\mathrm{d}t\,\left(\frac{\mathrm{d}V}{\mathrm{d}t},\delta p(t)\right)\ . (13)

This can equivalently be written as

𝒲ex=0τdt(d𝒖dt,δ𝒇(t))\langle\mathcal{W}_{\text{ex}}\rangle=-\int_{0}^{\tau}\mathrm{d}t\left(\frac{\mathrm{d}\bm{u}}{\mathrm{d}t},\left\langle\delta\bm{f}(t)\right\rangle\right)\ (14)

in terms of control parameters 𝒖\bm{u} and mean conjugate-force deviations

δ𝒇(t)(𝒖Vp(t)𝒖Vπ(t)).\left\langle\delta\bm{f}(t)\right\rangle\equiv-\left(\left\langle\nabla_{\bm{u}}V\right\rangle_{p(t)}-\left\langle\nabla_{\bm{u}}V\right\rangle_{\pi(t)}\right)\ . (15)
Refer to caption
Figure 1: Convergence of the exact excess work β𝒲ex\beta\langle\mathcal{W}_{\text{ex}}\rangle (solid red curve) to the linear-response approximation β𝒲ex(LR)\beta\langle\mathcal{W}_{\text{ex}}\rangle^{(\text{LR})} (dashed teal line) as the protocol duration τ\tau becomes large, for the (h,JNN,JNNN,K,L)(h,J_{\text{NN}},J_{\text{NNN}},K,L) protocol described in Sec. VI.3. As τ0\tau\to 0, the excess work asymptotically approaches the relative entropy D(π0πτ)D(\pi_{0}\|\pi_{\tau}) between the initial and final equilibrium distributions [10] (horizontal dashed black line).

We assume that the controlled system is prepared at time t=t=-\infty with some initial value of the control parameters 𝒖0\bm{u}_{0} so that p(0)=π𝒖0p(0)=\pi_{\bm{u}_{0}}. If the system is driven such that it remains close to equilibrium throughout the protocol, dynamic linear-response theory [29] approximates the average deviation of the conjugate forces 𝒇\bm{f} to the parameters 𝒖\bm{u} as

δ𝒇(t)βtdtdΣ𝒇dt[𝒖(t)𝒖(t)],\left\langle\delta\bm{f}(t)\right\rangle\approx\beta\int_{-\infty}^{t}\mathrm{d}t^{\prime}\,\frac{\mathrm{d}\Sigma_{\bm{f}}}{\mathrm{d}t}\left[\bm{u}(t^{\prime})-\bm{u}(t)\right]\ , (16)

for equilibrium time-correlation matrix Σ𝒇(t)=δ𝒇(t)δ𝒇(0)𝖳eq\Sigma_{\bm{f}}(t)=\left\langle\delta\bm{f}(t)\delta\bm{f}(0)^{\mathsf{T}}\right\rangle_{\text{eq}}. The derivative dΣ𝒇/dt\mathrm{d}\Sigma_{\bm{f}}/\mathrm{d}t characterizes the system response at time tt to a perturbation of the control parameters at time 0.

If driving is slow compared to the system’s relaxation timescales, the conjugate-force deviation reduces to

δ𝒇(t)ζ~(𝒖)d𝒖dt.\left\langle\delta\bm{f}(t)\right\rangle\approx-\tilde{\zeta}(\bm{u})\frac{\mathrm{d}\bm{u}}{\mathrm{d}t}\ . (17)

for generalized friction tensor [11]

ζ~(𝒖)β0dtΣ𝒇(t;𝒖).\tilde{\zeta}(\bm{u})\equiv\beta\int_{0}^{\infty}\mathrm{d}t^{\prime}\,\Sigma_{\bm{f}}(t^{\prime};\bm{u})\ . (18)

The linear-response (LR) contribution to the excess work is then

𝒲ex(LR)=0τdtd𝒖𝖳dtζ~(𝒖(t))d𝒖dt.\left\langle\mathcal{W}_{\text{ex}}\right\rangle^{\text{(LR)}}=\int_{0}^{\tau}\mathrm{d}t\frac{\mathrm{d}\bm{u}^{\mathsf{T}}}{\mathrm{d}t}\tilde{\zeta}(\bm{u}(t))\frac{\mathrm{d}\bm{u}}{\mathrm{d}t}\ . (19)

The component ζ~ij(𝒖)\tilde{\zeta}_{ij}(\bm{u}) of the generalized friction tensor approximates how the energy cost of simultaneously changing the control parameters uiu^{i} and uju^{j} scales with the velocity of the control parameters. While the exact excess work (13,14) depends on the solution p(t)p(t) of the time-inhomogeneous equation (4), the linear-response approximation (19) only requires information about the equilibrium dynamics for fixed parameters. The excess work 𝒲ex\langle\mathcal{W}_{\text{ex}}\rangle approaches the linear-response approximation 𝒲ex(LR)\langle\mathcal{W}_{\text{ex}}\rangle^{(\text{LR})} as the protocol duration τ\tau increases (Fig.  1).

The generalized friction tensor is positive-definite and symmetric, endowing the control-parameter space with a Riemannian metric structure. Within this geometric framework, control protocols are curves on a thermodynamic manifold ~\tilde{\mathcal{M}} on which distances quantify the excess work for near-equilibrium driving. The LR excess work (19) of a protocol 𝒖(t)\bm{u}(t) is identified with the geometric energy of a curve 𝒖(t)\bm{u}(t) on \mathcal{M}, and therefore LR minimum-work protocols are geodesics (paths of shortest length) on the thermodynamic manifold. Such protocols thus satisfy the geodesic equation 𝒖˙𝒖˙=0\nabla_{\dot{\bm{u}}}\dot{\bm{u}}=0 for covariant derivative 𝒖˙\nabla_{\dot{\bm{u}}} along the curve. In components,

d2uidt2+j,kΓjkidujdtdukdt=0,i=1,,m,\frac{\mathrm{d}^{2}u^{i}}{\mathrm{d}t^{2}}+\sum_{j,k}\Gamma^{i}_{\phantom{i}jk}\frac{\mathrm{d}u^{j}}{\mathrm{d}t}\frac{\mathrm{d}u^{k}}{\mathrm{d}t}=0,\ \ \ i=1,\dots,m\ , (20)

for Christoffel symbols of the second kind

Γjki12ζ~i(ζ~juk+ζ~kujζ~jku)\Gamma^{i}_{\phantom{i}jk}\equiv\frac{1}{2}\sum_{\ell}\tilde{\zeta}^{i\ell}\left(\frac{\partial\tilde{\zeta}_{\ell j}}{\partial u^{k}}+\frac{\partial\tilde{\zeta}_{k\ell}}{\partial u^{j}}-\frac{\partial\tilde{\zeta}_{jk}}{\partial u^{\ell}}\right) (21)

for ζ~i[ζ~1]i\tilde{\zeta}^{i\ell}\equiv\left[\tilde{\zeta}^{-1}\right]_{i\ell} [30].

In a small handful of systems [11, 31, 32], the friction tensor is analytically obtainable directly from the definition (18). In the special case of one-dimensional continuous overdamped dynamics, the friction tensor simplifies [20], extending the range of analytically tractable models. Physically motivated approximations may also provide analytic insights when the full friction tensor is not known [19]. More often, numerical methods are required, typically involving estimation of the time-correlation functions Σ𝒇\Sigma_{\bm{f}} using Markov-chain Monte Carlo simulations [33, 34, 35].

IV Hierarchy of Thermodynamic Manifolds

In this section, we apply the mathematics of submanifolds 222While this section is intended to be self-contained, the essential geometric ideas will be unfamiliar to many readers. For a thorough overview of differential geometry including submanifolds, degenerate metrics, and extrinsic geometry, with notation that will be more familiar to readers versed in general relativity, see [66]. For a modern treatment of Riemannian submanifolds, see [67]. to connect the geometries of distinct control-parameter sets. By constructing a global thermodynamic manifold \mathcal{M} spanned by the state energies, we show that the partial-control manifolds for arbitrary sets {u0,u1,,um1}\left\{u^{0},u^{1},\dots,u^{m-1}\right\} of control parameters arise naturally as submanifolds (Fig. 2). In conjunction with the decompositions we will derive in Sec. V, this geometric picture offers methods for rapidly computing friction tensors.

IV.1 Global manifold: Full control

In the context of conservative driving, ‘full control’ refers to the ability to manipulate the entire potential-energy landscape. For a system with NN discrete states, the state energies can be assembled into a vector 𝑽=(V0,V1,,VN1)\bm{V}=(V^{0},V^{1},\dots,V^{N-1}) with Vν=V(xν)V^{\nu}=V(x_{\nu}). For a continuous system, one may imagine a continuum of control parameters V(x)V(x) parameterized by xx. The forces conjugate to the state energies are indicator functions

ω(x,t)𝛿V(Xt)𝛿V(x)=δ(x,Xt),\omega(x,t)\equiv-\functionalderivative{V(X_{t})}{V(x)}=-\delta(x,X_{t})\,, (22)

where the functional derivative accounts for continuous xx. The average deviation of the conjugate forces ω\omega from their equilibrium values is then precisely the (negative) deviation of the probability distribution from equilibrium:

δω(x,t)δω(x)p(t)=δp(x,t).\langle\delta\omega(x,t)\rangle\equiv\left\langle\delta\omega(x)\right\rangle_{p(t)}=-\delta p(x,t)\ . (23)

Analogous to (18), we define the full-control friction tensor ζ(V)\zeta(V)—which depends on the entire potential-energy landscape VV—by its integral kernel

ζ(x,y;V)β0dtδω(x,t)δω(y,0)eq,\zeta(x,y;V)\equiv\beta\int_{0}^{\infty}\mathrm{d}t^{\prime}\left\langle\delta\omega(x,t^{\prime})\,\delta\omega(y,0)\right\rangle_{\text{eq}}\ , (24)

such that

𝒫ex(t)\displaystyle\left\langle\mathcal{P}_{\text{ex}}(t)\right\rangle =Ωdμ(x)dV(x)dtδp(x,t)\displaystyle=\int_{\Omega}\mathrm{d}\mu(x)\,\frac{\mathrm{d}V(x)}{\mathrm{d}t}\,\delta p(x,t) (25a)
Ω×Ωdμ(x)dμ(y)dV(x)dtζ(x,y;V)dV(y)dt\displaystyle\approx\int_{\Omega\times\Omega}\mathrm{d}\mu(x)\,\mathrm{d}\mu(y)\,\frac{\mathrm{d}V(x)}{\mathrm{d}t}\,\zeta(x,y;V)\,\frac{\mathrm{d}V(y)}{\mathrm{d}t} (25b)
𝒫ex(t)(LR).\displaystyle\equiv\left\langle\mathcal{P}_{\text{ex}}(t)\right\rangle^{(\text{LR})}\ . (25c)

Interpreted geometrically, the full-control friction tensor is a metric on a (perhaps infinite-dimensional) manifold that we call the global thermodynamic manifold \mathcal{M}, with each possible landscape identified with a point on \mathcal{M}. The squared length of a tangent vector on the global manifold corresponds to the linear-response excess power:

𝒫ex(t)(LR)=(dVdt,ζdVdt)dVdtζ2,\left\langle\mathcal{P}_{\text{ex}}(t)\right\rangle^{(\text{LR})}=\left(\frac{\mathrm{d}V}{\mathrm{d}t},\zeta\frac{\mathrm{d}V}{\mathrm{d}t}\right)\equiv\left|\left|\frac{\mathrm{d}V}{\mathrm{d}t}\right|\right|_{\zeta}^{2}\ , (26)

where ||||ζ||\cdot||_{\zeta} denotes a norm with respect to the metric ζ\zeta.

We will see later that the metric ζ\zeta has a non-empty null space spanned by 11 (respectively the constant function 11 for continuous Ω\Omega or the constant vector 𝟏\bm{1} for discrete Ω\Omega). We should expect this intuitively from the linear-response relation implied by Eq. (25):

δpζdVdt.\delta p\approx\zeta\frac{\mathrm{d}V}{\mathrm{d}t}\ . (27)

Uniformly raising or lowering the energy landscape does not affect the system’s dynamics, so this action does not alter the system distribution: δp=0\delta p=0 for dV/dt1\mathrm{d}V/\mathrm{d}t\propto 1. An equivalent argument may be made from conservation of probability 333Taking the inner product of both sides of (27) with the constant function 11 and applying (1,δp)=0(1,\delta p)=0 leads to the same conclusion. Since ζ\zeta is symmetric, in this context the arbitrariness of the potential-energy zero-point and conservation of probability are equivalent. This highlights a duality between energy and probability, manifest in the geometry: the linear-response relation (27) states that δp\delta p is the vector dual to dV/dt\mathrm{d}V/\mathrm{d}t. This is also reflected in the conjugate forces to the state energies (22)..

Since detζ=0\text{det}\,\zeta=0, the metric is said to be degenerate and the full-control manifold \mathcal{M} is thus a non-Riemannian manifold. This degeneracy does not meaningfully impede the usual geometric analysis, however, as the null vector (in the coordinates of VV) is the same everywhere on the manifold. The degeneracy may be managed by projecting points VV\in\mathcal{M} onto some hypersurface on which the metric is non-degenerate, e.g., by fixing the potential zero-point, setting V(y)=0V(y)=0 for some yΩy\in\Omega. We address the construction of Riemannian full-control submanifolds for discrete-state systems in Sec. IV.3.

Refer to caption
Figure 2: Partial-control manifolds (~,ζ~)(\tilde{\mathcal{M}},\tilde{\zeta}) (blue) coordinatized by a set of control parameters 𝒖\bm{u} are submanifolds of the global thermodynamic manifold (,ζ)(\mathcal{M},\zeta), with ζ~\tilde{\zeta} inherited from ζ\zeta by Eq. (30). The restriction V=V(𝒖)V=V(\bm{u}) of conservative control defines the immersion \mathcal{F} (28). The pushforward \mathcal{F}_{*}—that maps vectors on ~\tilde{\mathcal{M}} to vectors on \mathcal{M} (red arrows)—is related to the conjugate forces 𝒇\bm{f} (29).

IV.2 Submanifolds: Partial control

In practical settings, the number of control parameters is significantly smaller than the number of system states. For instance, early studies of fluctuation theorems involved the manipulation of single RNA molecules with the molecular extension as the only controllable parameter [38, 39]. The success of these demonstrations and other nonequilibrium measurements of equilibrium free-energy landscapes relies on executing experimental protocols with low excess work [4]. Partial (or parametric) control is thus an area of clear practical importance.

For conservative control with a set of mm control parameters {u0,,ui,,um1}\left\{u^{0},\dots,u^{i},\dots,u^{m-1}\right\}, protocols are constrained by V(t)=V(𝒖(t))V(t)=V(\bm{u}(t)) to a lower-dimensional control-parameter space. Geometrically, this space is a submanifold ~\tilde{\mathcal{M}} of the global manifold \mathcal{M}.

Since the control parameters represent a constrained control over the potential energy VV, there is a map (unique up to specification of the energy zero point) between points 𝒖\bm{u} on ~\tilde{\mathcal{M}} and points V(𝒖)V(\bm{u}) on \mathcal{M}. In differential geometry, such a map is an immersion:

:~:𝒖V(𝒖)\displaystyle\begin{split}\mathcal{F}&:\tilde{\mathcal{M}}\to\mathcal{M}\\ &:\bm{u}\mapsto V(\bm{u})\end{split} (28)

The partial-control manifold is called a submanifold of the global manifold with immersion \mathcal{F}.

Associated with an immersion is its pushforward (or differential) :T𝒖~TV(𝒖)\mathcal{F}_{*}:\text{T}_{\bm{u}}\tilde{\mathcal{M}}\to\text{T}_{V(\bm{u})}\mathcal{M}, mapping vectors on ~\tilde{\mathcal{M}} (i.e., elements of the tangent space T𝒖~\text{T}_{\bm{u}}\tilde{\mathcal{M}}) to vectors on \mathcal{M}. In this setting, the matrix elements of the pushforward are simply obtained by the chain rule, and are identified with the forces fi(x)f_{i}(x) conjugate to respective parameters uiu^{i} when the system is in state xx:

V˙(x)=[(𝒖˙)](x)=i=0m1fi(x)u˙i.\dot{V}(x)=\left[\mathcal{F}_{*}(\dot{\bm{u}})\right](x)=-\sum_{i=0}^{m-1}f_{i}(x)\,\dot{u}^{i}\ . (29)

The metric ζ~(𝒖)\tilde{\zeta}(\bm{u}) on ~\tilde{\mathcal{M}}, inherited from ζ(V)\zeta(V), is then

ζ~ij(𝒖)=Ω×Ωdμ(x)dμ(y)ζ(x,y;V(𝒖))fi(x)fj(y).\tilde{\zeta}_{ij}(\bm{u})=\int_{\Omega\times\Omega}\mathrm{d}\mu(x)\,\mathrm{d}\mu(y)\,\zeta(x,y;V(\bm{u}))\,f_{i}(x)f_{j}(y)\ . (30)

(Substituting (24) into this form and applying (22) shows it to indeed be equivalent to the metric defined in Eq. (18).) This is constructed to ensure that geometric invariants (e.g., the lengths of tangent vectors) are equal when computed on \mathcal{M} and on ~\tilde{\mathcal{M}}. In particular, since the squared length of a vector on a thermodynamic manifold corresponds to the linear-response excess power, the preservation of geometric invariants under coordinate transformations means that one computes the same excess power from ζ~\tilde{\zeta} and 𝒖(t)\bm{u}(t) as from ζ\zeta and V(t)V(t). Figure 2 summarizes this picture.

IV.3 Riemannian full-control submanifolds

For a discrete-state system with |Ω|=n|\Omega|=n, full control is possible with a set of n1n-1 parameters due to the invariance of the dynamics under a uniform shift of the state energies (captured in the degeneracy of ζ\zeta). We define a control-parameter set 𝒖n1\bm{u}\in\mathbb{R}^{n-1} as complete if the potential energy is fully determined (up to a global additive constant) by 𝒖\bm{u}.

Define the Jacobian 𝕁=𝖳\mathbb{J}=\mathcal{F}_{*}^{\mathsf{T}} of the transformation from the control parameters 𝒖\bm{u} to the state energies 𝑽\bm{V}, i.e., [𝕁]iν=fi(xν)=Vν/ui\left[\mathbb{J}\right]^{\phantom{i}\nu}_{i}=-f_{i}(x_{\nu})=\partial V^{\nu}/\partial u^{i}. Then for probability vector 𝒑(t)\bm{p}(t),

𝒇p(t)=𝕁𝒑(t).\left\langle\bm{f}\right\rangle_{p(t)}=-\mathbb{J}\bm{p}(t)\ . (31)

In App. A, we show that the completeness condition is equivalent to the invertibility of (31), i.e., the distribution 𝒑(t)\bm{p}(t) is completely determined by the average conjugate forces 𝒇𝒑(t)\left\langle\bm{f}\right\rangle_{\bm{p}(t)}. The inverse of (31) is

𝒑(t)=𝕁𝖱𝒇(t)+1n𝟏\bm{p}(t)=-\mathbb{J}^{\mathsf{R}}\left\langle\bm{f}(t)\right\rangle+\frac{1}{n}\bm{1} (32)

for right inverse 𝕁𝖱=𝕁𝖳(𝕁𝕁𝖳)1\mathbb{J}^{\mathsf{R}}=\mathbb{J}^{\mathsf{T}}\left(\mathbb{J}\mathbb{J}^{\mathsf{T}}\right)^{-1}. The time evolution of the average deviation δ𝒇(t)\left\langle\delta\bm{f}(t)\right\rangle of the conjugate force to a complete set of control parameters is thus self-contained in the sense that

dδ𝒇(t)dt=𝕎𝒇δ𝒇(t)\frac{\mathrm{d}\left\langle\delta\bm{f}(t)\right\rangle}{\mathrm{d}t}=\mathbb{W}_{\bm{f}}\left\langle\delta\bm{f}(t)\right\rangle (33)

where

𝕎𝒇𝕁𝕎𝕁𝖱\mathbb{W}_{\bm{f}}\equiv\mathbb{J}\mathbb{W}\mathbb{J}^{\mathsf{R}} (34)

is the generator of the dynamics of 𝒇(t)\left\langle\bm{f}(t)\right\rangle.

Refer to caption
Figure 3: Schematic of the geometric condition on the completeness of a set of control parameters 𝒖\bm{u}. A control parameter set 𝒖\bm{u} is complete, i.e., able to fully determine state energies up to a constant offset when the space Δf\Delta_{f} of average conjugate forces 𝒇\left\langle\bm{f}\right\rangle forms an (n1)(n-1)-simplex in n1\mathbb{R}^{n-1} (red triangle, right), since this allows a bijective transformation between Δ𝒇\Delta_{\bm{f}} and the probability simplex Δ𝒑\Delta_{\bm{p}} (green triangle, left).

There is an equivalent geometric condition on the completeness of a control-parameter set, illustrated for a three-state system in Fig. 3. Let Δ𝒇\Delta_{\bm{f}} be the convex hull of the nn points {𝒇(xi)}\left\{\bm{f}(x_{i})\right\}, i.e., the set of all conjugate-force averages 𝒇\left\langle\bm{f}\right\rangle. For a complete set of control parameters, Δf\Delta_{f} forms an (n1)(n-1)-dimensional simplex in n1\mathbb{R}^{n-1}. Since the space of probability vectors 𝒑\bm{p} for an nn-state system defines an (n1)(n-1)-dimensional simplex (the probability simplex Δ𝒑={𝒑n: 1𝖳𝒑=1}\Delta_{\bm{p}}=\left\{\bm{p}\in\mathbb{R}^{n}\ :\ \bm{1}^{\mathsf{T}}\bm{p}=1\right\}), there exists a bijective transformation between these spaces. Put simply, the space of conjugate-force averages 𝒇\left\langle\bm{f}\right\rangle for a complete set of control parameters is the same “size” as the space of probability vectors.

Concretely, consider an NN-spin system, for which there are n=2Nn=2^{N} states. We show in Sec. VI.3 that the set of all kk-spin interactions for k=1,,Nk=1,\dots,N from the cluster expansion is a complete control set that spans a (2N1)(2^{N}-1)-dimensional Riemannian full-control submanifold.

V Decompositions of the friction tensor

This hierarchy of thermodynamic manifolds means that, given full-control metric ζ\zeta, we may compute a partial-control metric ζ~\tilde{\zeta} for any control-parameter set {u0,,um1}\left\{u^{0},\dots,u^{m-1}\right\}. Though conceptually appealing, this leaves open the question of how we can actually calculate and physically interpret the friction tensor ζ\zeta and its relationship to inherited metrics ζ~\tilde{\zeta}. In this section, we show that the properties of the full-control friction tensor (24) permit mathematical manipulations which significantly simplify the calculation of friction tensors, offer a dynamical interpretation of the linear-response excess power, and suggest design principles for reducing dissipation in driven systems.

V.1 The Drazin inverse

We begin with a brief review of the Drazin inverse, a generalized inverse for linear operators which plays a central role in the derivations that follow. The Drazin inverse of the rate matrix has also appeared in previous works on optimal control [40, 41, 42, 43, 23] and has applications in first-passage problems [44, 45].

Intuitively, the need for an appropriately defined inverse of the generator arises due to the time integral in the definition (44) of the full-control friction tensor. In the same way as the generator 𝒢t\mathcal{G}_{t} plays the role of a time derivative on the space of probability distributions, its Drazin inverse 𝒢t𝒟\mathcal{G}^{\mathcal{D}}_{t} plays the role of a time integral (over [0,)[0,\infty)) on the same space.

Let 𝒜\mathcal{A} be a linear operator on some Hilbert space. The Drazin inverse of 𝒜\mathcal{A} is the unique operator 𝒜𝒟\mathcal{A}^{\mathcal{D}} satisfying

𝒜𝒜𝒟𝒜k\displaystyle\mathcal{A}\mathcal{A}^{\mathcal{D}}\mathcal{A}^{k} =𝒜k\displaystyle=\mathcal{A}^{k} (35a)
𝒜𝒟𝒜𝒜𝒟\displaystyle\mathcal{A}^{\mathcal{D}}\mathcal{A}\mathcal{A}^{\mathcal{D}} =𝒜𝒟\displaystyle=\mathcal{A}^{\mathcal{D}} (35b)
𝒜𝒟𝒜\displaystyle\mathcal{A}^{\mathcal{D}}\mathcal{A} =𝒜𝒜𝒟,\displaystyle=\mathcal{A}\mathcal{A}^{\mathcal{D}}\ , (35c)

for some kk\in\mathbb{Z} called the index of 𝒜\mathcal{A}. For operators on Hilbert spaces of finite dimension (i.e., matrices), kk is the smallest non-negative integer such that rank𝒜k=rank𝒜\text{rank}\,\mathcal{A}^{k}=\text{rank}\,\mathcal{A}^{\ell} for all >k\ell>k [46]. We consider only the case of k=1k=1 (such as for irreducible rate matrices [47]). In this case, the Drazin inverse of 𝒜\mathcal{A} may be constructed from

𝒜𝒟=P0+(𝒜P0)1,\mathcal{A}^{\mathcal{D}}=P_{0}+(\mathcal{A}-P_{0})^{-1}\ , (36)

for projection matrix P0P_{0} onto the null space 𝒩(𝒜)\mathcal{N}(\mathcal{A}) of 𝒜\mathcal{A}.

From properties (35a-c), one can show that

𝒜𝒜𝒟=IP0,\mathcal{A}\mathcal{A}^{\mathcal{D}}=I-P_{0}\ , (37)

for identity operator II. That is, 𝒜𝒜𝒟\mathcal{A}\mathcal{A}^{\mathcal{D}} projects vectors in the Hilbert space onto the complement of 𝒩(𝒜)\mathcal{N}(\mathcal{A}).

Concretely, suppose that 𝕎\mathbb{W} is an irreducible rate matrix with equilibrium distribution 𝝅\bm{\pi}. Then the product 𝕎𝕎𝒟\mathbb{W}\mathbb{W}^{\mathcal{D}} (and 𝕎𝒟𝕎\mathbb{W}^{\mathcal{D}}\mathbb{W} by the commutativity property (35c)) maps a probability distribution to its deviation from equilibrium, i.e., 𝕎𝕎𝒟𝒑=(IΠ)𝒑=δ𝒑\mathbb{W}\mathbb{W}^{\mathcal{D}}\bm{p}=(I-\Pi)\bm{p}=\delta\bm{p} with Π=𝝅𝟏𝖳=[𝝅𝝅]\Pi=\bm{\pi}\bm{1}^{\mathsf{T}}=\left[\bm{\pi}\,\cdots\,\bm{\pi}\right]. As a corollary,

𝕎𝕎𝒟δ𝒑=δ𝒑,\mathbb{W}\mathbb{W}^{\mathcal{D}}\delta\bm{p}=\delta\bm{p}\ , (38)

so that 𝕎𝒟\mathbb{W}^{\mathcal{D}} is a proper inverse of 𝕎\mathbb{W} when the domain is restricted to vectors in the hyperplane {𝒗: 1T𝒗=0}\left\{\bm{v}\ :\ \bm{1}^{T}\bm{v}=0\right\} – including deviations δ𝒑\delta\bm{p} from equilibrium and derivatives d𝒑/dt\mathrm{d}\bm{p}/\mathrm{d}t of probability distributions.

From properties (35b,c), it follows that 444Property (35b) can be re-written as (𝒜𝒟)2𝒜=𝒜𝒟(\mathcal{A}^{\mathcal{D}})^{2}\mathcal{A}=\mathcal{A}^{\mathcal{D}} by the commutative property (35c). Thus 𝒜x=0𝒜𝒟x=0\mathcal{A}x=0\implies\mathcal{A}^{\mathcal{D}}x=0. An identical argument holds for right-multiplication.

𝕎𝒟𝝅=𝟏𝖳𝕎𝒟=0.\mathbb{W}^{\mathcal{D}}\bm{\pi}=\bm{1}^{\mathsf{T}}\mathbb{W}^{\mathcal{D}}=0\ . (39)

The Drazin inverse may thus be interpreted as “inverting the invertible part” of a matrix [40], leaving the null space of 𝕎𝒟\mathbb{W}^{\mathcal{D}} coincident with the null space of 𝕎\mathbb{W}.

V.2 Operator decomposition

Consider some continuous- or discrete-state Markov process with generator 𝒢\mathcal{G} satisfying the conditions established in Sec. II. We would like to manipulate the expression (24) for the full-control friction tensor into a more manageable form. We take advantage of the fact that the integrand Σω(t)=δω(t)δω(0)eq\Sigma_{\omega}(t)=\left\langle\delta\omega(t)\delta\omega(0)\right\rangle_{\text{eq}} is calculated for a fixed energy landscape so that dπ/dt=0\mathrm{d}\pi/\mathrm{d}t=0 and thus dΣω/dt=𝒢Σω\mathrm{d}\Sigma_{\omega}/\mathrm{d}t=\mathcal{G}\Sigma_{\omega}. Then from (38), the full-control friction tensor is

ζ\displaystyle\zeta =β0dtΣω(t)\displaystyle=\beta\int_{0}^{\infty}\mathrm{d}t^{\prime}\,\Sigma_{\omega}(t^{\prime}) (40a)
=β0dt𝒢𝒟dΣω(t)dt\displaystyle=\beta\int_{0}^{\infty}\mathrm{d}t^{\prime}\,\mathcal{G}^{\mathcal{D}}\frac{\mathrm{d}\Sigma_{\omega}(t^{\prime})}{\mathrm{d}t} (40b)
=β𝒢𝒟Σω0\displaystyle=-\beta\mathcal{G}^{\mathcal{D}}\Sigma_{\omega}^{0} (40c)

for Σω0Σω(0)\Sigma_{\omega}^{0}\equiv\Sigma_{\omega}(0). The final line follows from the fact that limtΣω(t)=0\lim_{t\to\infty}\Sigma_{\omega}(t)=0 for ergodic dynamics.

Equation (40c) is a main result: the full-control friction tensor is decomposable into a product of the Drazin inverse of the generator and the equal-time equilibrium covariance matrix of the indicator functions ω\omega. Roughly, this separates the friction tensor into a static part Σω0\Sigma_{\omega}^{0} and a dynamic part β𝒢𝒟\beta\mathcal{G}^{\mathcal{D}}.

This expression simplifies further. Appendix B shows that the equal-time equilibrium covariance matrix of the indicator functions ω\omega satisfies

Σω0(x,y)=π(x)δ(x,y)π(x)π(y).\Sigma_{\omega}^{0}(x,y)=\pi(x)\delta(x,y)-\pi(x)\pi(y)\ . (41)

Since π\pi is in the null-space of 𝒢𝒟\mathcal{G}^{\mathcal{D}}, we obtain the decomposition

ζ(x,y)=β𝒢𝒟(x,y)π(y).\zeta(x,y)=-\beta\,\mathcal{G}^{\mathcal{D}}(x,y)\,\pi(y)\ . (42)

For discrete systems, this reduces the infinite-time integral (24) to a matrix equation

ζ=β𝕎𝒟diag{𝝅}.\zeta=-\beta\mathbb{W}^{\mathcal{D}}\text{diag}\left\{\bm{\pi}\right\}\ . (43)

Together with the formula (36) for the Drazin inverse of a matrix, this expression allows for machine-precision numerical calculation of the full-control friction tensor ζ(V)\zeta(V) – and thus any partial-control tensor ζ~(𝒖)\tilde{\zeta}(\bm{u}) via the analysis of Sec. IV.2.

In App. C, we show that for a complete control set 𝒖\bm{u}, the friction tensor is

ζ~(𝒖)=β𝕎𝒇1Σ𝒇0\tilde{\zeta}(\bm{u})=-\beta\mathbb{W}^{-1}_{\bm{f}}\Sigma_{\bm{f}}^{0} (44)

for generator 𝕎𝒇\mathbb{W}_{\bm{f}} of the dynamics of 𝒇(t)\left\langle\bm{f}(t)\right\rangle as defined in Eq. (34) and equal-time correlation matrix Σ𝒇0=δ𝒇δ𝒇𝖳𝝅(𝒖)\Sigma_{\bm{f}}^{0}=\left\langle\delta\bm{f}\delta\bm{f}^{\mathsf{T}}\right\rangle_{\bm{\pi}(\bm{u})}.

Lastly, since a protocol V(t)V(t) can be completely characterized up to an (irrelevant) additive constant by the evolution of the equilibrium distribution π(t)\pi(t), we use the operator decomposition to derive the relation (App. E)

ζdVdt=𝒢𝒟dπdt.\zeta\frac{\mathrm{d}V}{\mathrm{d}t}=\mathcal{G}^{\mathcal{D}}\frac{\mathrm{d}\pi}{\mathrm{d}t}\ . (45)

The linear-response relation (27) is then equivalent to the approximation

δp𝒢𝒟dπdt.\delta p\approx\mathcal{G}^{\mathcal{D}}\frac{\mathrm{d}\pi}{\mathrm{d}t}\ . (46)

This expression was derived previously using an operator expansion [41].

V.3 Connection to integral relaxation time

As noted in [11], the friction tensor ζ~\tilde{\zeta} (for any set of control parameters) may be written as

ζ~=kBT𝒯,\tilde{\zeta}=k_{\rm B}T\ \mathcal{T}\circ\mathcal{I}\ , (47)

the Hadamard product of the integral relaxation time matrix

𝒯ij0dtδfi(t)δfj(0)eqδfiδfj𝝅\mathcal{T}_{ij}\equiv\int_{0}^{\infty}\mathrm{d}{t^{\prime}}\frac{\left\langle\delta f_{i}(t^{\prime})\delta f_{j}(0)\right\rangle_{\text{eq}}}{\left\langle\delta f_{i}\delta f_{j}\right\rangle_{\bm{\pi}}} (48)

and the equilibrium Fisher-information matrix

β2Σ𝒇0.\mathcal{I}\equiv\beta^{2}\Sigma_{\bm{f}}^{0}\ . (49)

When the conjugate forces 𝒇\bm{f} are the indicator functions 𝝎\bm{\omega}, equating (47) with the operator decomposition (43) gives a simple correspondence between the Drazin inverse of the rate matrix and the integral relaxation time:

𝕎μν𝒟={πμ𝒯μν,μν(1πμ)𝒯μμ,μ=ν.\mathbb{W}_{\mu\nu}^{\mathcal{D}}=\begin{dcases}\pi_{\mu}\mathcal{T}_{\mu\nu}\ ,\ \ &\mu\neq\nu\\ -(1-\pi_{\mu})\mathcal{T}_{\mu\mu}\ ,&\mu=\nu\ .\end{dcases} (50)

Seen another way, defining the transition probability matrix with elements pμν(t)=Pr[μ,t|ν,0]p_{\mu\nu}(t)=\text{Pr}\left[\mu,t\ |\ \nu,0\right], the Drazin inverse of the rate matrix is [45]

𝕎μν𝒟\displaystyle-\mathbb{W}_{\mu\nu}^{\mathcal{D}} =0𝑑t[pμν(t)πμ].\displaystyle=\int_{0}^{\infty}dt\,\left[p_{\mu\nu}(t)-\pi_{\mu}\right]\ . (51a)

Thus, much like 𝒯\mathcal{T}, the Drazin inverse of the rate matrix describes the timescale of system relaxation. The magnitude of 𝕎μν𝒟\mathbb{W}^{\mathcal{D}}_{\mu\nu} characterizes how long it takes for pμν(t)p_{\mu\nu}(t), the probability pμ(t)p_{\mu}(t) conditioned on initialization in state ν\nu, to reach the equilibrium probability πμ\pi_{\mu}.

V.4 Spectral decomposition

When a generator 𝒢\mathcal{G} admits the spectral decomposition (9), the integral kernel of the Drazin inverse is

𝒢𝒟(x,y)=α1ταψα(x)ϕα(y)\mathcal{G}^{\mathcal{D}}(x,y)=-\sum_{\alpha\geq 1}\tau_{\alpha}\psi_{\alpha}(x)\phi_{\alpha}(y) (52)

for relaxation times

τα1λα,α1.\tau_{\alpha}\equiv-\frac{1}{\lambda_{\alpha}}\ ,\ \ \alpha\geq 1\ . (53)

See App. D for a concise proof. Substituting this expression into the decomposition (42) and using the fact that ψα(x)=ϕα(x)π(x)\psi_{\alpha}(x)=\phi_{\alpha}(x)\pi(x) for systems obeying detailed balance [49], we obtain a spectral decomposition of the full-control friction tensor:

ζ(x,y)=βα1ταψα(x)ψα(y).\zeta(x,y)=\beta\sum_{\alpha\geq 1}\tau_{\alpha}\psi_{\alpha}(x)\psi_{\alpha}(y)\ . (54)

The degeneracy of ζ\zeta as inferred from Eq. (27) is manifest in this form: from the biorthogonality of the eigenfunctions, we have ϕ0ζ=ζϕ0=0\phi_{0}\zeta=\zeta\phi_{0}=0, with ϕ0=1\phi_{0}=1.

This spectral decomposition can be used to express the linear-response excess power (25) as a sum of independent contributions associated with the fixed-parameter relaxation modes of the system:

𝒫ex(t)(LR)=βα1τα(t)(dV(t)dt,ψα(t))2,\langle\mathcal{P}_{\text{ex}}(t)\rangle^{(\text{LR)}}=\beta\sum_{\alpha\geq 1}\tau_{\alpha}(t)\left(\frac{\mathrm{d}V(t)}{\mathrm{d}t},\,\psi_{\alpha}(t)\right)^{2}\ , (55)

Under the change of variables st/τs\equiv t/\tau for total protocol time τ\tau, the linear-response excess work is

𝒲ex(LR)(τ)=kBTτα101dsτα(dβVds,ψα)2.\left\langle\mathcal{W}_{\text{ex}}\right\rangle^{(\text{LR})}(\tau)=\frac{k_{\rm B}T}{\tau}\sum_{\alpha\geq 1}\int_{0}^{1}\mathrm{d}s\,\tau_{\alpha}\left(\frac{\mathrm{d}\beta V}{\mathrm{d}s},\,\psi_{\alpha}\right)^{2}\ . (56)

Equations (54)-(56) constitute another main result. The spectral decomposition of the excess work provides insight into the dynamical origin of dissipation: roughly, each term captures the degree to which the protocol traverses a given mode, scaled by that mode’s relaxation time. As a design principle, dissipation may be reduced by driving a system such that dV/dt\mathrm{d}V/\mathrm{d}t is maximally orthogonal to modes with long relaxation times.

A similar expression holds for partial-control submanifolds. From the transformation law for metrics,

ζ~ij(𝒖)=α1ταψ~α|iψ~α|j\tilde{\zeta}_{ij}(\bm{u})=\sum_{\alpha\geq 1}\tau_{\alpha}\,\tilde{\psi}_{\alpha|i}\,\tilde{\psi}_{\alpha|j} (57)

for conjugate relaxation modes

ψ~α|i=Ωdμ(x)ψα(x)fi(x).\tilde{\psi}_{\alpha|i}=-\int_{\Omega}\mathrm{d}\mu(x)\,\psi_{\alpha}(x)f_{i}(x)\ . (58)

The zeroth conjugate relaxation mode 𝝍~0\tilde{\bm{\psi}}_{0} is the equilibrium average 𝒇π-\left\langle\bm{f}\right\rangle_{\pi} of the conjugate-force vector, and for α1\alpha\geq 1, 𝝍~α\tilde{\bm{\psi}}_{\alpha} captures the fixed-parameter relaxation of δ𝒇(t)-\left\langle\delta\bm{f}(t)\right\rangle corresponding to system relaxation along mode α\alpha. The linear-response excess power in terms of a limited set of control parameters is then

𝒫ex(t)(LR)=βα1τα(t)(d𝒖(t)dt,𝝍~α(t))2.\left\langle\mathcal{P}_{\text{ex}}(t)\right\rangle^{(\text{LR})}=\beta\sum_{\alpha\geq 1}\tau_{\alpha}(t)\left(\frac{\mathrm{d}\bm{u}(t)}{\mathrm{d}t},\ \tilde{\bm{\psi}}_{\alpha}(t)\right)^{2}\ . (59)

Equation (57) is particularly powerful for continuous systems for which the spectrum of \mathcal{L} is analytically known. While the full-control spectral decomposition (54) is a manifestly infinite sum for continuous state spaces, the partial-control spectral decomposition (57) may be a finite sum if the conjugate forces fi(x)f_{i}(x) are orthogonal to all but a finite set of modes ψα(x)\psi_{\alpha}(x); Section (VI.2) demonstrates a concrete example of this.

VI Illustrative examples

Refer to caption
Figure 4: Model systems studied in Sec. VI: (a) two-state system with transition rates w01w_{01} and w10w_{10} (left), one instantiation of which is a quantum dot interacting with a metallic reservoir [5] (right); (b) overdamped diffusion XtX_{t} of a particle in a harmonic trap V(x,t)V(x,t) parameterized by tunable stiffness κ(t)\kappa(t) and center x0(t)x_{0}(t); (c) four-spin system with single spin-flip dynamics, giving rise to a state graph that is isomorphic to a hypercube (left). Control-parameter sets are (possibly constrained) collections of kk-spin couplings (k=1,2,3,4k=1,2,3,4) drawn from 15 parameters of the cluster expansion (72), e.g., the 5-parameter set (h,JNN,JNNN,K,L)(h,J_{\text{NN}},J_{\text{NNN}},K,L) where h=h0=h1=h2=h3h=h_{0}=h_{1}=h_{2}=h_{3}, JNN=J01=J02=J13=J23J_{\text{NN}}=J_{01}=J_{02}=J_{13}=J_{23}, JNNN=J03=J12J_{\text{NNN}}=J_{03}=J_{12}, and K=K012=K013=K023=K123K=K_{012}=K_{013}=K_{023}=K_{123} (right).

VI.1 Two-state system

We begin with a very simple model system: two states with arbitrary transition rates w10w_{10} and w01w_{01} (Fig. 4a, left). From Eq. (36), the Drazin inverse of the rate matrix is

𝕎𝒟=1(w01+w10)2[w10w01w10w01].\mathbb{W}^{\mathcal{D}}=\frac{1}{(w_{01}+w_{10})^{2}}\begin{bmatrix}-w_{10}&w_{01}\\ w_{10}&-w_{01}\end{bmatrix}\ . (60)

From the decomposition (43) and instantaneous equilibrium distribution (π0(t),π1(t))𝖳=(w01w01+w10,w10w01+w10)𝖳\left(\pi_{0}(t),\pi_{1}(t)\right)^{\mathsf{T}}=\left(\frac{w_{01}}{w_{01}+w_{10}},\frac{w_{10}}{w_{01}+w_{10}}\right)^{\mathsf{T}} the full-control friction tensor is

ζ=βπ0(t)π1(t)τrel(t)[1111],\zeta=\beta\,\pi_{0}(t)\,\pi_{1}(t)\,\tau_{\text{rel}}(t)\begin{bmatrix}1&-1\\ -1&1\end{bmatrix}\ , (61)

for relaxation time τrel(t)=(w01(t)+w10(t))1\tau_{\text{rel}}(t)=(w_{01}(t)+w_{10}(t))^{-1}. Note that the two-state system has only one non-zero eigenvalue (and thus only one relaxation time τrelτ1\tau_{\text{rel}}\equiv\tau_{1}). It may be readily verified that the spectral decomposition ζ(𝑽)=βτrel𝝍1𝝍1𝖳\zeta(\bm{V})=\beta\tau_{\text{rel}}\bm{\psi}_{1}\bm{\psi}_{1}^{\mathsf{T}} yields the same result.

The control parameters 𝑽\bm{V} may be freely defined subject to

β(V1V0)=lnπ1π0.-\beta(V^{1}-V^{0})=\ln\frac{\pi_{1}}{\pi_{0}}\ . (62)

Thus the energy difference ΔV1V0\Delta\equiv V^{1}-V^{0} is a natural parameter to consider. The linear-response excess power is

𝒫ex(t)(LR)=βπ0(t)π1(t)τrel(t)(dΔdt)2\left\langle\mathcal{P}_{\text{ex}}(t)\right\rangle^{(\text{LR})}=\beta\pi_{0}(t)\pi_{1}(t)\tau_{\text{rel}}(t)\left(\frac{\mathrm{d}\Delta}{\mathrm{d}t}\right)^{2} (63)

and any LR minimum-work protocol Δ(t)\Delta(t) satisfies

d2Δdt2+12(dΔdt)2(π0π1τrel)Δ=0.\frac{\mathrm{d}^{2}\Delta}{\mathrm{d}t^{2}}+\frac{1}{2}\left(\frac{\mathrm{d}\Delta}{\mathrm{d}t}\right)^{2}\frac{\partial\left(\pi_{0}\pi_{1}\tau_{\text{rel}}\right)}{\partial\Delta}=0\ . (64)

A physical instantiation of a two-state system is a single-level quantum dot interacting with a metallic reservoir (Fig. 4a, right). In the wide-band approximation (i.e., the metallic reservoir’s density of states is assumed constant [50]), the energy ε\varepsilon of the quantum dot and the chemical potential μ\mu of the reservoir set the respective tunneling rates w10=2R0[1+eβ(με)]1w_{10}=2R_{0}[1+\text{e}^{\beta(\mu-\varepsilon)}]^{-1} into the reservoir and w01=2R0[1+eβ(εμ)]1w_{01}=2R_{0}[1+\text{e}^{\beta(\varepsilon-\mu)}]^{-1} out of the reservoir, where R0R_{0} is the tunneling rate when μ=ε\mu=\varepsilon [5]. Then the friction tensor (for one control parameter, the friction coefficient) is

ζ~(Δ)=β4R0sech2(12βΔ),\tilde{\zeta}(\Delta)=\frac{\beta}{4R_{0}}\sech^{2}\left(\tfrac{1}{2}\beta\Delta\right)\ , (65)

for Δ=εμ\Delta=\varepsilon-\mu.

VI.2 Harmonic trap

Consider the overdamped diffusion XtX_{t} of a particle in a one-dimensional harmonic trap V(x,t)=12κ(t)(xx0(t))2V(x,t)=\frac{1}{2}\kappa(t)(x-x_{0}(t))^{2} with stiffness κ(t)\kappa(t) and trap center x0(t)x_{0}(t) (Fig. 4b). The Fokker-Planck operator governing the evolution of the particle’s position distribution p(x,t)p(x,t) is

tp(x,t)=1βγx{eβV(x,t)x[eβV(x,t)p(x,t)]},\mathcal{L}_{t}\,p(x,t)=\frac{1}{\beta\gamma}\frac{\partial}{\partial x}\left\{\text{e}^{-\beta V(x,t)}\frac{\partial}{\partial x}\left[\text{e}^{\beta V(x,t)}\ p(x,t)\ \right]\right\}\ , (66)

for (Cartesian) friction coefficient γ\gamma and inverse temperature β\beta. Dropping the explicit time-dependence, the right eigenfunctions of \mathcal{L} are [49] (after normalizing to (ψα,ϕα)=1(\psi_{\alpha},\phi_{\alpha})=1 with ϕα=ψα/π\phi_{\alpha}=\psi_{\alpha}/\pi)

ψα(x)=(1)α(βκ)1α2πα!αxαeβV(x),\psi_{\alpha}(x)=(-1)^{\alpha}\sqrt{\frac{(\beta\kappa)^{1-\alpha}}{2\pi\alpha!}}\frac{\partial^{\alpha}}{\partial x^{\alpha}}\text{e}^{-\beta V(x)}\ , (67)

with relaxation times

τα=γακ.\tau_{\alpha}=\frac{\gamma}{\alpha\kappa}\ . (68)

This is sufficient to write down an infinite sum for the full-control friction tensor ζ\zeta using (54). A more practical expression may be obtained for the metric ζ~\tilde{\zeta} on the (κ,x0)(\kappa,x_{0})-thermodynamic submanifold: the conjugate forces fx0=V/x0f_{x_{0}}=-\partial V/\partial x_{0} and fκ=V/κf_{\kappa}=-\partial V/\partial\kappa are determined by the first three left eigenfunctions:

fx0(x)\displaystyle f_{x_{0}}(x) =κβϕ1(x),\displaystyle=-\sqrt{\frac{\kappa}{\beta}}\,\phi_{1}(x)\ , (69a)
fκ(x)\displaystyle f_{\kappa}(x) =12βκ[ϕ0(x)+2ϕ2(x)].\displaystyle=-\frac{1}{2\beta\kappa}\left[\phi_{0}(x)+\sqrt{2}\,\phi_{2}(x)\right]\ . (69b)

Then by the orthonormality of the eigenfunctions, (ϕα,ψβ)=δαβ(\phi_{\alpha},\psi_{\beta})=\delta_{\alpha\beta}, the conjugate relaxation modes ψ~α|i=(fi,ψα)\tilde{\psi}_{\alpha|i}=\left(-f_{i},\psi_{\alpha}\right) (58) are

𝝍~1\displaystyle\tilde{\bm{\psi}}_{1} =κβ(10)\displaystyle=\sqrt{\frac{\kappa}{\beta}}\begin{pmatrix}1\\ 0\end{pmatrix} (70a)
𝝍~2\displaystyle\tilde{\bm{\psi}}_{2} =12κβ(01),\displaystyle=\frac{1}{\sqrt{2}\kappa\beta}\begin{pmatrix}0\\ 1\end{pmatrix}\ , (70b)

with 𝝍~α=𝟎\tilde{\bm{\psi}}_{\alpha}=\bm{0} for all α>2\alpha>2. Thus, only the first two modes contribute to the linear-response dissipation for the harmonic trap. From (54) the friction tensor is

ζ~\displaystyle\tilde{\zeta} =β(τ1𝝍~1𝝍~1𝖳+τ2𝝍~2𝝍~2𝖳)\displaystyle=\beta\left(\tau_{1}\tilde{\bm{\psi}}_{1}\tilde{\bm{\psi}}_{1}^{\mathsf{T}}+\tau_{2}\tilde{\bm{\psi}}_{2}\tilde{\bm{\psi}}_{2}^{\mathsf{T}}\right) (71a)
=(γ00γ4βκ3),\displaystyle=\begin{pmatrix}\gamma&0\\ 0&\frac{\gamma}{4\beta\kappa^{3}}\end{pmatrix}\ , (71b)

in agreement with [11].

VI.3 Classical spin systems

VI.3.1 Complete control set for NN spins

Control of classical spin systems in the linear-response regime has been studied fairly extensively [51, 52, 53]. For a system of NN spins σi\sigma_{i} that take values in {1,1}\left\{-1,1\right\}, there are 2N2^{N} states 𝝈=(σ0,σ1,,σN1)\bm{\sigma}=\left(\sigma_{0},\sigma_{1},\dots,\sigma_{N-1}\right), giving rise to a 2N2^{N}-dimensional global manifold. We begin by constructing a Riemannian full-control manifold by introducing a (2N1)(2^{N}-1)-parameter complete-control set for arbitrary spin systems. Define the energy function

V(𝝈)=k=12N1Xki𝒞(k)σi,V(\bm{\sigma})=-\sum_{k=1}^{2^{N}-1}X^{k}\prod_{i\in\mathcal{C}(k)}\sigma_{i}\ , (72)

where {Xk}\left\{X^{k}\right\} is the set of all nn-spin interactions with n=1,,Nn=1,\dots,N, and 𝒞(k)\mathcal{C}(k) is the set of spin indices corresponding to the coupling XkX^{k}. See, e.g., (VI.3.2) for four spins. This energy function is often called the cluster expansion [54], and it arises in the study of spin glasses and machine learning as the generalized Ising model [55], generalized lattice-gas model [56], or generalized restricted Boltzmann machine [57].

The probability p(𝝈)p(\bm{\sigma}) of any spin configuration 𝝈\bm{\sigma} can be expanded in the spin moments [58]:

p(𝝈)=12N(1+jσjσj+jkσjσkσjσk+).p(\bm{\sigma})=\frac{1}{2^{N}}\left(1+\sum_{j}\left\langle\sigma_{j}\right\rangle\sigma_{j}+\sum_{j\neq k}\left\langle\sigma_{j}\sigma_{k}\right\rangle\sigma_{j}\sigma_{k}+\cdots\right)\ . (73)

Following Sec. IV.3, we compile the forces fk=V/Xkf_{k}=-\partial V/\partial X^{k} into a Jacobian matrix [𝕁]kν=fk(𝝈ν)\left[\mathbb{J}\right]_{k}^{\phantom{k}\nu}=-f_{k}(\bm{\sigma}_{\nu}) for some ordering of the states ν=0,1,,2N1\nu=0,1,\dots,2^{N}-1. Then (73) can be put into the form of Eq. (32):

𝒑=12N𝕁𝖳𝒇+12N𝟏.\bm{p}=-\frac{1}{2^{N}}\mathbb{J}^{\mathsf{T}}\left\langle\bm{f}\right\rangle+\frac{1}{2^{N}}\bm{1}\ . (74)

Thus, there is an invertible transformation between the full set of spin moments 𝒇-\left\langle\bm{f}\right\rangle and the probability distribution 𝒑\bm{p}. We may identify 𝕁𝖱=2N𝕁𝖳\mathbb{J}^{\mathsf{R}}=2^{-N}\mathbb{J}^{\mathsf{T}} from the above expression or derive it directly (App. F). This invertibility means that the dynamics of the full set of central moments δfk\left\langle\delta f_{k}\right\rangle is fully specified by their current values:

dδ𝒇(t)dt=𝕎𝒇δ𝒇(t)\frac{\mathrm{d}\left\langle\delta\bm{f}(t)\right\rangle}{\mathrm{d}t}=\mathbb{W}_{\bm{f}}\left\langle\delta\bm{f}(t)\right\rangle (75)

with 𝕎𝒇=2N𝕁𝕎𝕁𝖳\mathbb{W}_{\bm{f}}=2^{-N}\mathbb{J}\,\mathbb{W}\,\mathbb{J}^{\mathsf{T}}. This holds for arbitrary transition rates satisfying detailed balance.

VI.3.2 Four-spin model

We consider a four-spin system evolving under the single-spin flip transition rates

wμν={11+exp{β[VμVν]},𝝈μ,𝝈ν differ in one spin0,otherwisew_{\mu\nu}=\begin{dcases}\frac{1}{1+\exp\left\{\beta\left[V^{\mu}-V^{\nu}\right]\right\}}\ ,&\text{$\bm{\sigma}_{\mu},\bm{\sigma}_{\nu}$ differ in one spin}\\ 0\ ,&\text{otherwise}\end{dcases} (76)

for μν\mu\neq\nu. These are well-studied dynamics for the Ising system, sometimes called heat-bath dynamics or Glauber dynamics [59, 60, 61, 53].

Refer to caption
Figure 5: (a) Linear-response (LR) excess work for LR optimal protocols for the four-spin system computed for 200 different control sets with between 11 and 1515 control parameters. The protocols featured in Fig. 6 and Fig. 7 are labeled. (b) 2D histogram showing the number of independently varying parameters against the number of available control parameters (e.g., 18 of the calculated 5-parameter optimal protocols had only 4 independently varying parameters).

For a system of four spins, the generalized Ising model (72) has 15 parameters: 4 fields hih_{i}, 6 two-spin couplings JijJ_{ij}, 4 three-spin interactions KijkK_{ijk}, and a single four-spin interaction LL (Fig. 8, App. G):

V(𝝈)=i\displaystyle V(\bm{\sigma})=-\sum_{i} hiσii<jJijσiσj\displaystyle h_{i}\sigma_{i}-\sum_{i<j}J_{ij}\sigma_{i}\sigma_{j}
i<j<kKijkσiσjσkLiσi\displaystyle-\sum_{i<j<k}K_{ijk}\sigma_{i}\sigma_{j}\sigma_{k}-L\prod_{i}\sigma_{i} (77)

From these 15 parameters, the imposition of constraints on their time evolution may be used to build partial control sets. Figure 4c shows one such partial control set, with a bulk field h=h0=h1=h2=h3h=h_{0}=h_{1}=h_{2}=h_{3}, two-spin coupling blocks JNN=J01=J02=J13=J23J_{\text{NN}}=J_{01}=J_{02}=J_{13}=J_{23} (for nearest neighbors) and JNNN=J03=J12J_{\text{NNN}}=J_{03}=J_{12} (for next-nearest neighbors), bulk three-spin coupling K=K012=K013=K023=K123K=K_{012}=K_{013}=K_{023}=K_{123}, and four-spin coupling LL.

We set our boundary conditions in terms of these 5 bulk couplings to h(0)=1,JNN(0)=1h(0)=-1,J_{\text{NN}}(0)=1 and h(τ)=+1,JNN(τ)=1h(\tau)=+1,J_{\text{NN}}(\tau)=1, with all other couplings set to zero at t=0t=0 and t=τt=\tau. Though (VI.3.2) provides no notion of “geometry” for the spin system, under these boundary conditions it is constructive to imagine a ferromagnetic Ising model on a 2×22\times 2 grid as in Fig. 4c.

Refer to caption
Figure 6: (a) Linear-response (LR) optimal control protocols for a classical four-spin system in scaled time s=t/τs=t/\tau, for three different control-parameter sets. (b) Spectral contributions to the LR excess power (54). The dissipation along mode α\alpha is the product of the relaxation time τα(s)\tau_{\alpha}(s) and the squared inner product (dβV/ds,ψα)2\left(\mathrm{d}\beta V/\mathrm{d}s,\psi_{\alpha}\right)^{2} of the control-parameter velocity with the instantaneous mode. Modes are ordered such that τ1(0)τ2(0)\tau_{1}(0)\geq\tau_{2}(0)\geq\cdots. In the middle panel, the squared inner products of the degenerate modes 77 and 88 with the velocity are summed since their respective contributions to the dissipation cannot meaningfully be distinguished. Dotted black lines: modes that are everywhere orthogonal (or nearly so) to the velocity dβV/ds\mathrm{d}\beta V/\mathrm{d}s and which thus do not meaningfully contribute to dissipation.

We computed LR minimum-work protocols for 200 different control sets consistent with these endpoints. The number of control parameters per set varied from 11 for the minimal set (with only a global field hh) to 1515 for the maximal set (with independent control over all parameters in the cluster expansion). Appendix G details the construction of control sets and computation of minimum-work protocols.

The results are summarized in Fig. 5. We observe that very few protocols need to use all available degrees of freedom to minimize the linear-response excess work. For instance, the optimized 15-parameter full-control protocol is (to numerical precision) identical to the 5-parameter protocol (Fig. 5a blue diamond, and Fig. 6 right). Evidently (at least within the linear-response regime) it is not possible to reduce dissipation by assuming a finer level of control over the spin system. In all, only 42 distinct protocols were identified among the 200 computed. Figure 5b shows the number of independently varying parameters compared to the number of available parameters; at most 7 parameters were utilized.

Figure 6 shows LR-optimized protocols for 3 different sets of control parameters: the one-parameter set (h,¬J,¬K,¬L)(h,\neg J,\neg K,\neg L), the three-parameter set (h=,¬J,K,¬L)(h_{=},\neg J,K,\neg L) and the five-parameter set (h,(JNN,JNNN),K,L)(h,(J_{\text{NN}},J_{\text{NNN}}),K,L). Here the four elements of the sets refer to constraints on control of the 11-, 22-, 33-, and 44-spin parameters, respectively (see Table 1 of App. G). For instance, h=h_{=} represents two wide magnetic fields, one interacting with the upper row of spins (σ0,σ1\sigma_{0},\sigma_{1}) and the other with the bottom row (σ2,σ3\sigma_{2},\sigma_{3}), realized by constraints h0=h1h_{0}=h_{1} and h2=h3h_{2}=h_{3}. The symbol ¬\neg indicates no control of the given type of coupling, so ¬J\neg J indicates that JNN=1J_{\text{NN}}=1 and JNNN=0J_{\text{NNN}}=0 are fixed.

The one-parameter protocol (Fig. 6, left) controls only a global field hh. This is the minimal control set for the specified protocol endpoints. Since the region of control-parameter space traversed by hh is fixed, the only optimizable property is the velocity of h(t)h(t), which for the optimal protocol slows down where relaxation times are relatively large. The dominant contribution by far to the dissipation comes from the first mode.

The other two protocols (Fig. 6a, middle and right) are geodesics on three- and five-dimensional thermodynamic manifolds. Both protocols change the magnetic fields nonmonotonically, a feature previously observed in [34]. How this (perhaps surprising) behavior reduces the total excess work of the protocols is revealed by examining the separate spectral contributions to the LR excess power (55).

The first mode 𝝍1(s)\bm{\psi}_{1}(s) has a fairly persistent identity in the explored regions of parameter space: roughly, it characterizes relaxation between the two ferromagnetic states (see also Fig. 7). Given that the central task is to flip the spins from spin-down to spin-up on average, it is not surprising that the first mode was the dominant contribution to the LR excess work for almost all of the 200 parameter sets examined.

Projecting the state energies onto the relaxation modes (Fig. 6, bottom row) reveals that driving is nearly orthogonal to the first mode along the nonmonotonic portions of the protocols, and thus concentrates nearly all dissipation among modes other than the first during these time windows. For the 5-parameter protocol, this feature significantly reduces the relaxation time τ1\tau_{1}. This uncovers the control strategy: by driving orthogonally to 𝝍1\bm{\psi}_{1}, we are able to efficiently drive the system to a region of control-parameter space in which τ1\tau_{1} is relatively small.

For the 3-parameter protocol, not only is τ1\tau_{1} not significantly reduced, it is roughly doubled near the ends of the protocol. In breaking the symmetry of the four spins, the driving is dominantly directed along the relatively fast-relaxing second mode 𝝍2\bm{\psi}_{2}, which can be interpreted as relaxation between the ferromagnetic states and the state 𝝈=(+1,+1,1,1)\bm{\sigma}=(+1,+1,-1,-1). We also observe that (somewhat counterintuitively) most of the \sim15% of dissipation attributable to the first mode occurs near s=1/2s=1/2 (when the spectral gap is relatively small), while most of the \sim80% of dissipation attributable to the second mode occurs when the spectral gap is largest. While τ1\tau_{1} is often regarded as the characteristic dynamical times for a system with a large spectral gap [62, 63], our results show that in driven systems faster modes can become salient, and may even dominate the dissipation – especially for efficient protocols which may drive orthogonal to slow modes to save energy.

Lastly, Fig. 7 shows the linear-response minimum-work protocol for the two-parameter control set (h,¬J,¬K,L)(h,\neg J,\neg K,L). The control strategy is similar to the 5-parameter MWP: in the relevant region of the (h,L)(h,L)-manifold, the projection 𝝍~1\tilde{\bm{\psi}}_{1} of the first mode (black arrows) is nearly parallel to the hh-direction. So, given that one must drive along the first mode in order to flip the spins, reduced dissipation is achieved by driving the system nearly perpendicularly to this mode to reach a region of control-parameter space where τ1\tau_{1} is smaller.

Refer to caption
Figure 7: Projections 𝝍~1\tilde{\bm{\psi}}_{1} and 𝝍~2\tilde{\bm{\psi}}_{2} of the respective first and second modes (black and red arrows) onto the thermodynamic manifold for the (h,L)(h,L) control set plotted along the linear-response MWP (blue curve). For visual clarity, the vectors are plotted with unit length, although their magnitudes vary across the space. Heatmap: relaxation time τ1\tau_{1} of the first mode 𝝍1\bm{\psi}_{1}.

VII Summary and Outlook

We have presented a new foundation for the friction-tensor formalism for conservative control of stochastic systems. Thermodynamic manifolds ~\tilde{\mathcal{M}} for partial control sets 𝒖\bm{u} are derived as submanifolds of a global thermodynamic manifold \mathcal{M} whose metric structure is provided by the full-control friction tensor. We derived two decompositions of the full-control friction tensor which facilitate both interpretation and calculation of arbitrary friction tensors and minimum-work protocols.

In particular, we show that the linear-response contribution to the excess power can be decomposed into a sum of terms corresponding to the system’s relaxation modes (55). The magnitude of the α\alphath term depends on the degree to which the protocol drives the system along that mode, scaled by its relaxation time τα\tau_{\alpha}. This decomposition suggests a design principle for protocols that reduce excess dissipation for finite-time processes: one should drive a system such that dV/dt\differential{V}/\differential{t} is maximally orthogonal to modes with long relaxation times. When this is possible, the common identification of τ1\tau_{1} as the system’s characteristic dynamical timescale may be unsuitable even in the presence of a large spectral gap.

For low-dimensional discrete systems with small state spaces or continuous systems with known spectra, these decompositions simplify the analytic calculation of the friction tensor. For larger systems, the operator decomposition (43) enables numerical calculation of the friction tensor without recourse to simulation of the stochastic dynamics. Indeed, for the 4-spin system with |Ω|=16|\Omega|=16, we were able to rapidly compute 200 different MWPs—many on high-dimensional thermodynamic manifolds, with up to 15 control parameters. This opens up the possibility of systematically searching control spaces for new insights and principles.

For very large discrete systems or continuous systems discretized on a fine grid, the matrix inversion in (43) will be prohibitively expensive. Truncation of the spectral decomposition may enable controlled-accuracy approximation of the friction tensor, particularly in the case of a large spectral gap (i.e., τ1τqτq+1\tau_{1}\geq\cdots\geq\tau_{q}\gg\tau_{q+1}\geq\cdots for some qq). However, given that even long relaxation timescales may be irrelevant when driving is orthogonal to the corresponding mode, the accuracy of this approximation fundamentally depends on the nature of driving.

Acknowledgements.
We thank Adrianne Zhong (University of California Berkeley, Department of Physics) and Antonio Patrón Castro and W. Callum Wareham (Simon Fraser University, Department of Physics) for insightful conversations and feedback on the manuscript. This work was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Undergraduate Student Research Award (J.R.S.), NSERC CGS Master’s and Doctoral scholarships (J.R.S.), an NSERC Discovery Grant RGPIN-2020-04950 (D.A.S.), and a Tier-II Canada Research Chair CRC-2020-00098 (D.A.S.), and was enabled in part by support provided by the Digital Research Alliance of Canada (alliancecan.ca).

Appendix A Full-control condition

A set of control parameters 𝒖\bm{u} is complete when it can manipulate the energy landscape 𝑽(𝒖)\bm{V}(\bm{u}) up to a global additive constant. For |Ω|=n|\Omega|=n, this requires n1n-1 parameters. Formally, the completeness condition may be stated as follows: for all 𝑽m\bm{V}^{\prime}\in\mathbb{R}^{m}, there exists some 𝒖n1\bm{u}\in\mathbb{R}^{n-1} and cc\in\mathbb{R} such that

𝑽=𝑽(𝒖)+c𝟏,𝑽n.\bm{V}^{\prime}=\bm{V}(\bm{u})+c\bm{1}\ ,\ \ \forall\,\bm{V}^{\prime}\in\mathbb{R}^{n}\ . (78)

Assume that the first derivatives of 𝑽(𝒖)\bm{V}(\bm{u}) in the parameters uiu^{i} are continuous. We show that (78) is equivalent to the condition that 𝒇𝒑(t)=𝕁𝒑(t)\left\langle\bm{f}\right\rangle_{\bm{p}(t)}=-\mathbb{J}\bm{p}(t) has inverse

𝒑(t)=𝕁𝖱𝒇(t)+1n𝟏\bm{p}(t)=-\mathbb{J}^{\mathsf{R}}\left\langle\bm{f}(t)\right\rangle+\frac{1}{n}\bm{1}\ (79)

for Jacobian 𝕁iν=Vν/ui\mathbb{J}_{i}^{\phantom{i}\nu}=\partial V^{\nu}/\partial u^{i} and right inverse 𝕁𝖱𝕁𝖳(𝕁𝕁𝖳)1\mathbb{J}^{\mathsf{R}}\equiv\mathbb{J}^{\mathsf{T}}(\mathbb{J}\mathbb{J}^{\mathsf{T}})^{-1}.

First we show that (78)\implies(79). The condition (78) implies that 𝕁\mathbb{J} has full row rank, so the right inverse of the Jacobian for a complete control set exists, and satisfies

𝕁𝕁𝖱\displaystyle\mathbb{J}\mathbb{J}^{\mathsf{R}} =In1\displaystyle=I_{n-1} (80a)
𝕁𝖱𝕁\displaystyle\mathbb{J}^{\mathsf{R}}\mathbb{J} =In1n𝟙n,\displaystyle=I_{n}-\frac{1}{n}\mathbbm{1}_{n}\ , (80b)

where InI_{n} is the n×nn\times n identity matrix and 𝟙n\mathbbm{1}_{n} is the n×nn\times n matrix of ones. Equation (80b) follows from the facts that 𝕁𝖱𝕁\mathbb{J}^{\mathsf{R}}\mathbb{J} is the transpose of the orthogonal projector onto the span of 𝕁\mathbb{J} [64] and the null-space 𝒩(𝕁)\mathcal{N}(\mathbb{J}) is spanned by the vector 𝟏\bm{1}.

Operating on Eq. (31) with 𝕁𝖱\mathbb{J}^{\mathsf{R}} and applying the identity (80b) yields

𝕁𝖱𝒇(t)=𝒑(t)+1n𝟏,\mathbb{J}^{\mathsf{R}}\left\langle\bm{f}(t)\right\rangle=-\bm{p}(t)+\frac{1}{n}\bm{1}\ , (81)

so Eq. (79) holds for any complete control set 𝒖\bm{u}.

To prove the converse (i.e., (79)\implies(78)), the existence of a right inverse 𝕁𝖱\mathbb{J}^{\mathsf{R}} immediately implies that 𝒩(𝕁)\mathcal{N}(\mathbb{J}) is one-dimensional. Left-multiplying (81) by 𝕁\mathbb{J} and applying (80a) gives

𝒇(t)\displaystyle\left\langle\bm{f}(t)\right\rangle =𝕁𝒑(t)+1n𝕁𝟏\displaystyle=-\mathbb{J}\bm{p}(t)+\frac{1}{n}\mathbb{J}\bm{1} (82a)
=𝒇(t)+1n𝕁𝟏,\displaystyle=\left\langle\bm{f}(t)\right\rangle+\frac{1}{n}\mathbb{J}\bm{1}\ , (82b)

so 𝒩(𝕁)=span{𝟏}\mathcal{N}(\mathbb{J})=\text{span}\left\{\bm{1}\right\}.

Appendix B Equilibrium covariance matrix of the indicator functions

We derive here a simple expression for the equal-time equilibrium covariance matrix Σω0\Sigma_{\omega}^{0} of the forces ω(x,t)\omega(x,t) conjugate to the potential energy V(x)V(x):

Σω0(x,y)\displaystyle\Sigma^{0}_{\omega}(x,y) δω(x)δω(y)\displaystyle\equiv\left\langle\delta\omega(x)\delta\omega(y)\right\rangle (83a)
=ω(x)ω(y)π(x)π(y)\displaystyle=\left\langle\omega(x)\omega(y)\right\rangle-\pi(x)\pi(y) (83b)
=Ωdμ(z)π(z)δ(x,z)δ(y,z)π(x)π(y)\displaystyle=\int_{\Omega}\mathrm{d}\mu(z)\,\pi(z)\delta(x,z)\delta(y,z)-\pi(x)\pi(y) (83c)
=π(x)δ(x,y)π(x)π(y).\displaystyle=\pi(x)\delta(x,y)-\pi(x)\pi(y)\ . (83d)

For discrete-state systems, it is useful to write this in matrix form:

Σω0\displaystyle\phantom{\left\langle\delta\omega\,\delta\right\rangle}\Sigma^{0}_{\omega} =diag{𝝅}𝝅𝝅𝖳.\displaystyle=\text{diag}\left\{\bm{\pi}\right\}-\bm{\pi}\bm{\pi}^{\mathsf{T}}\ . (84)

Appendix C Riemannian full-control friction tensor

We show that the friction tensor for a complete control set is given by Eq. (44). First, note that from the definition (34) of the generator for the dynamics of 𝒇(t)\left\langle\bm{f}(t)\right\rangle and the properties (80a) and (80b) for the right inverse 𝕁R\mathbb{J}^{R}, the inverse 𝕎𝒇1\mathbb{W}_{\bm{f}}^{-1} exists and equals

𝕎𝒇1=𝕁𝕎𝒟𝕁𝖱.\mathbb{W}_{\bm{f}}^{-1}=\mathbb{J}\mathbb{W}^{\mathcal{D}}\mathbb{J}^{\mathsf{R}}\ . (85)

This is straightforwardly verified using the definitions and properties of 𝕎𝒇,𝕎𝒟\mathbb{W}_{\bm{f}},\mathbb{W}^{\mathcal{D}}, and 𝕁𝖱\mathbb{J}^{\mathsf{R}}. Next,

[𝕁Σω0𝕁𝖳]ij\displaystyle\left[\mathbb{J}\Sigma^{0}_{\omega}\mathbb{J}^{\mathsf{T}}\right]_{ij} =μ,νfi(xμ)fj(xν)(πμδμνπμπν)\displaystyle=\sum_{\mu,\nu}f_{i}(x_{\mu})f_{j}(x_{\nu})\left(\pi_{\mu}\delta_{\mu\nu}-\pi_{\mu}\pi_{\nu}\right) (86a)
=fifjπfiπfjπ\displaystyle=\left\langle f_{i}f_{j}\right\rangle_{\pi}-\left\langle f_{i}\right\rangle_{\pi}\left\langle f_{j}\right\rangle_{\pi} (86b)
=[Σ𝒇0]ij,\displaystyle=\left[\Sigma^{0}_{\bm{f}}\right]_{ij}\ , (86c)

where in the first line we used Eq. (84). Then

β𝕎𝒇1Σ𝒇0\displaystyle-\beta{\mathbb{W}}_{\bm{f}}^{-1}\Sigma^{0}_{\bm{f}} =β𝕁𝕎𝒟Σω0𝕁𝖳\displaystyle=-\beta\mathbb{J}\mathbb{W}^{\mathcal{D}}\Sigma^{0}_{\omega}\mathbb{J}^{\mathsf{T}} (87a)
=𝕁ζ𝕁𝖳\displaystyle=\mathbb{J}\zeta\mathbb{J}^{\mathsf{T}} (87b)
=ζ~.\displaystyle=\tilde{\zeta}\ . (87c)

Appendix D Spectral decomposition of the Drazin inverse

Let 𝒢\mathcal{G} be the generator of a discrete- or continuous-state Markov process with spectral decomposition

𝒢(x,y)=α1λαψα(x)ϕα(y).\mathcal{G}(x,y)=\sum_{\alpha\geq 1}\lambda_{\alpha}\psi_{\alpha}(x)\phi_{\alpha}(y)\ . (88)

We will show that the integral kernel for the Drazin inverse of 𝒢\mathcal{G} is

𝒢𝒟(x,y)=α11λαψα(x)ϕα(y).\mathcal{G}^{\mathcal{D}}(x,y)=-\sum_{\alpha\geq 1}\frac{1}{\lambda_{\alpha}}\psi_{\alpha}(x)\phi_{\alpha}(y)\ . (89)

First, the composition of 𝒢\mathcal{G} with 𝒢𝒟\mathcal{G}^{\mathcal{D}} is

(\displaystyle( 𝒢𝒢𝒟)(x,y)\displaystyle\mathcal{G}\circ\mathcal{G}^{\mathcal{D}})(x,y)
=Ωdμ(z)𝒢(x,z)𝒢𝒟(z,y)\displaystyle=\int_{\Omega}\mathrm{d}\mu(z)\,\mathcal{G}(x,z)\,\mathcal{G}^{\mathcal{D}}(z,y) (90a)
=α,β1λαλβψα(x)ϕβ(y)Ωdμ(z)ϕα(z)ψβ(z)\displaystyle=\sum_{\alpha,\beta\geq 1}\frac{\lambda_{\alpha}}{\lambda_{\beta}}\psi_{\alpha}(x)\phi_{\beta}(y)\int_{\Omega}\mathrm{d}\mu(z)\,\phi_{\alpha}(z)\psi_{\beta}(z) (90b)
=α1ψα(x)ϕα(y)\displaystyle=\sum_{\alpha\geq 1}\psi_{\alpha}(x)\phi_{\alpha}(y) (90c)

where the last line follows from the biorthogonality of the eigenfunctions. The composition 𝒢𝒟𝒢\mathcal{G}^{\mathcal{D}}\circ\mathcal{G} yields the same result, so the commutativity property (35c) is satisfied.

Since the full set of eigenfunctions is also complete, i.e.,

α0ψα(x)ϕα(y)=δ(x,y),\sum_{\alpha\geq 0}\psi_{\alpha}(x)\phi_{\alpha}(y)=\delta(x,y)\ , (91)

this further simplifies to

(𝒢𝒢𝒟)(x,y)=δ(x,y)ψ0(x).(\mathcal{G}\circ\,\mathcal{G}^{\mathcal{D}})(x,y)=\delta(x,y)-\psi_{0}(x)\ . (92)

This is equivalent to Eq. (37). That is, 𝒢𝒢𝒟\mathcal{G}\circ\mathcal{G}^{\mathcal{D}} projects vectors in the Hilbert space onto the complement of 𝒩(𝒢)\mathcal{N}(\mathcal{G}).

To see that properties (35a,b) hold with index k=1k=1, note that 1𝒢𝒟=𝒢𝒟ψ0=01\mathcal{G}^{\mathcal{D}}=\mathcal{G}^{\mathcal{D}}\psi_{0}=0 by construction. Then

(𝒢𝒟𝒢\displaystyle(\mathcal{G}^{\mathcal{D}}\circ\mathcal{G} 𝒢𝒟)(x,y)\displaystyle\circ\mathcal{G}^{\mathcal{D}})(x,y)
=dμ(z)[δ(x,z)ψ0(x)]𝒢𝒟(z,y)\displaystyle=\int\mathrm{d}\mu(z)\,[\delta(x,z)-\psi_{0}(x)]\,\mathcal{G}^{\mathcal{D}}(z,y) (93a)
=𝒢𝒟(x,y),\displaystyle=\mathcal{G}^{\mathcal{D}}(x,y)\ , (93b)

and similarly,

(𝒢𝒢𝒟\displaystyle(\mathcal{G}\circ\mathcal{G}^{\mathcal{D}} 𝒢)(x,y)\displaystyle\circ\mathcal{G})(x,y)
=dμ(z)𝒢(x,z)[δ(z,y)ψ0(z)]\displaystyle=\int\mathrm{d}\mu(z)\,\mathcal{G}(x,z)\,[\delta(z,y)-\psi_{0}(z)] (94a)
=𝒢(x,y).\displaystyle=\mathcal{G}(x,y)\ . (94b)

Since the Drazin inverse is the unique operator satisfying properties (35a-c), 𝒢𝒟\mathcal{G}^{\mathcal{D}} is the Drazin inverse of 𝒢\mathcal{G}.

Appendix E Proof of Equation (45)

To show that (45) holds, we first note that the temporal derivative of the instantaneous equilibrium distribution π(x,t)=Z1exp{βV(x,t)}\pi(x,t)=Z^{-1}\exp\left\{-\beta V(x,t)\right\} can be expressed in terms of dV/dt\mathrm{d}V/\mathrm{d}t by

dπ(x,t)dt=Ωdμ(y)dV(y,t)dtδπ(x,t)δV(y,t),\frac{\mathrm{d}\pi(x,t)}{\mathrm{d}t}=\int_{\Omega}\mathrm{d}\mu(y)\,\frac{\mathrm{d}V(y,t)}{\mathrm{d}t}\frac{\delta\pi(x,t)}{\delta V(y,t)}\ , (95)

where δπ(x,t)/δV(y,t)\delta\pi(x,t)/\delta V(y,t) is the functional derivative of π\pi with respect to VV. From the definition of the Boltzmann distribution,

δeβV(x,t)δV(y,t)\displaystyle\frac{\delta\text{e}^{-\beta V(x,t)}}{\delta V(y,t)} =βδ(x,y)eβV(x,t)\displaystyle=-\beta\,\delta(x,y)\,\text{e}^{-\beta V(x,t)} (96a)
δZ[V(t)]δV(y,t)\displaystyle\frac{\delta Z[V(t)]}{\delta V(y,t)} =βΩdμ(x)δ(xy)eβV(x,t)\displaystyle=-\beta\int_{\Omega}\mathrm{d}\mu(x)\,\delta(x-y)\,\text{e}^{-\beta V(x,t)} (96b)
=βeβV(y,t).\displaystyle=-\beta\text{e}^{-\beta V(y,t)}\ . (96c)

Then by the chain and product rules,

δπ(x,t)δV(y,t)\displaystyle\frac{\delta\pi(x,t)}{\delta V(y,t)} =1Z2δZ[V(t)]δV(y,t)eβV(x,t)+1ZδeβV(x,t)δV(y,t)\displaystyle=-\frac{1}{Z^{2}}\frac{\delta Z[V(t)]}{\delta V(y,t)}\,\text{e}^{-\beta V(x,t)}+\frac{1}{Z}\frac{\delta\text{e}^{-\beta V(x,t)}}{\delta V(y,t)} (97a)
=βeβV(y,t)ZeβV(x,t)Zβδ(x,y)eβV(x,t)Z\displaystyle=\beta\frac{\text{e}^{-\beta V(y,t)}}{Z}\frac{\text{e}^{-\beta V(x,t)}}{Z}-\beta\delta(x,y)\frac{\text{e}^{-\beta V(x,t)}}{Z} (97b)
=βπ(x,t)[π(y,t)δ(x,y)]\displaystyle=\beta\,\pi(x,t)\left[\pi(y,t)-\delta(x,y)\right] (97c)

The temporal derivative of the equilibrium distribution is therefore

dπ(x,t)dt=βπ(x,t)[dVdtπ(t)dV(x,t)dt].\frac{\mathrm{d}\pi(x,t)}{\mathrm{d}t}=\beta\,\pi(x,t)\left[\left\langle\frac{\mathrm{d}V}{\mathrm{d}t}\right\rangle_{\pi(t)}-\frac{\mathrm{d}V(x,t)}{\mathrm{d}t}\right]\ . (98)

Since π(t)\pi(t) is in the null-space of 𝒢t𝒟\mathcal{G}^{\mathcal{D}}_{t}, operating on both sides of (98) with 𝒢t𝒟\mathcal{G}^{\mathcal{D}}_{t} yields

(𝒢t𝒟\displaystyle\bigg{(}\mathcal{G}^{\mathcal{D}}_{t} dπdt)(x)\displaystyle\frac{\mathrm{d}\pi}{\mathrm{d}t}\bigg{)}(x)
=Ωdμ(y)𝒢t𝒟(x,y)dπ(y)dt\displaystyle=\int_{\Omega}\mathrm{d}\mu(y)\,\mathcal{G}^{\mathcal{D}}_{t}(x,y)\frac{\mathrm{d}\pi(y)}{\mathrm{d}t} (99a)
=α1ταψα(x)Ωdμ(y)ϕα(y)π(y)dV(y)dt\displaystyle=\sum_{\alpha\geq 1}\tau_{\alpha}\psi_{\alpha}(x)\int_{\Omega}\mathrm{d}\mu(y)\,\phi_{\alpha}(y)\pi(y)\frac{\mathrm{d}V(y)}{\mathrm{d}t} (99b)
=α1ταψα(x)Ωdμ(y)ψα(y)dV(y)dt\displaystyle=\sum_{\alpha\geq 1}\tau_{\alpha}\psi_{\alpha}(x)\int_{\Omega}\mathrm{d}\mu(y)\,\psi_{\alpha}(y)\frac{\mathrm{d}V(y)}{\mathrm{d}t} (99c)
=(ζdVdt)(x),\displaystyle=\left(\zeta\frac{\mathrm{d}V}{\mathrm{d}t}\right)(x)\ , (99d)

where in (99b) we applied the spectral decomposition (89) of the Drazin inverse of the generator.

Appendix F Right inverse of the Ising full-control Jacobian

We show explicitly that the Jacobian matrix 𝕁\mathbb{J} for the the full set of control parameters from the cluster expansion (72) has right inverse 𝕁𝖱=2N𝕁𝖳\mathbb{J}^{\mathsf{R}}=2^{-N}\mathbb{J}^{\mathsf{T}}, as inferred from (74). The components of 𝕁\mathbb{J} are

[𝕁]kν=k𝒞(k)σj(xν),\left[\mathbb{J}\right]_{k}^{\phantom{i}\nu}=-\prod_{k\,\in\,\mathcal{C}(k)}\sigma_{j}(x_{\nu})\ , (100)

where 𝒞(k)\mathcal{C}(k) is the set of spin indices associated with interaction XkX^{k}, and σj(xα)\sigma_{j}(x_{\alpha}) is the value of spin kk in state xαx_{\alpha}. Then

[𝕁𝕁𝖳]k\displaystyle\left[\mathbb{J}\mathbb{J}^{\mathsf{T}}\right]_{k\ell} =νi𝒞(k)j𝒞()σi(xν)σj(xν).\displaystyle=\sum_{\nu}\prod_{i\in\mathcal{C}(k)}\prod_{j\in\mathcal{C}(\ell)}\sigma_{i}(x_{\nu})\,\sigma_{j}(x_{\nu})\ . (101a)

For k=k=\ell, the spin sets are identical so the summand evaluates to one and thus [𝕁𝕁𝖳]kk=2N\left[\mathbb{J}\mathbb{J}^{\mathsf{T}}\right]_{kk}=2^{N}.

For kk\neq\ell, we need only consider spin indices in either 𝒞(k)\mathcal{C}(k) or 𝒞()\mathcal{C}(\ell), since any i𝒞(k)𝒞()i\in\mathcal{C}(k)\cap\mathcal{C}(\ell) contributes a factor of one to the product. Denote this set by 𝒞(k)𝒞()\mathcal{C}(k)\oplus\mathcal{C}(\ell). For each state xαx_{\alpha}, there is a state xα¯x_{\overline{\alpha}} obtained by inverting all spins. If 𝒞(k)𝒞()\mathcal{C}(k)\oplus\mathcal{C}(\ell) has an odd number of spins, then the α\alphath and α¯\overline{\alpha}th terms in the sum cancel, so these entries will vanish. If 𝒞(k)𝒞()\mathcal{C}(k)\oplus\mathcal{C}(\ell) has an even number of spins, a similar cancellation occurs by inverting only the first spin in each (ordered) product. Therefore, [𝕁𝕁𝖳]k=2Nδk\left[\mathbb{J}\mathbb{J}^{\mathsf{T}}\right]_{k\ell}=2^{N}\delta_{k\ell}, so

𝕁𝖱=12N𝕁𝖳.\mathbb{J}^{\mathsf{R}}=\frac{1}{2^{N}}\mathbb{J}^{\mathsf{T}}\ . (102)

Appendix G Control sets and numerical methods for the four-spin system

Refer to caption
Figure 8: Fourth-order Venn diagram showing the 15 control parameters in a complete control set for the four-spin model. Each spin is associated with a set of 8 different parameters, and the 4-spin interaction LL is at the intersection of all four sets.

We computed minimum-work protocols for a collection of 200 control-parameter sets with between 11 and 1515 parameters for the four-spin system. The endpoints for all protocols are h(0)=1h(0)=-1 and h(τ)=+1h(\tau)=+1 for global field h(h0=h1=h2=h3)h\ (h_{0}=h_{1}=h_{2}=h_{3}), JNN(0)=JNN(τ)=1J_{\text{NN}}(0)=J_{\text{NN}}(\tau)=1 for “nearest-neighbour” coupling JNN(J01=J02=J13=J23)J_{\text{NN}}\ (J_{01}=J_{02}=J_{13}=J_{23}), and all other interactions set to zero. Any control set capable of driving the system between these endpoints must have (at minimum) control over all of the fields hih_{i}.

The parameter sets studied are the elements of the product set

[h,h=,h×,hi]\displaystyle\left[h,h_{=},h_{\times},h_{i}\right]
×[¬J,JNN,JNNN,(JNN,JNNN),Jij]\displaystyle\ \ \ \ \ \times\left[\neg J,J_{\text{NN}},J_{\text{NNN}},(J_{\text{NN}},J_{\text{NNN}}),J_{ij}\right]
×[¬K,K,K=,K×,Kijk]\displaystyle\ \ \ \ \ \times\left[\neg K,K,K_{=},K_{\times},K_{ijk}\right]
×[¬L,L].\displaystyle\ \ \ \ \ \times\left[\neg L,L\right]\ . (103)

The elements in the sets above are control subsets defined by constraints on the 11-, 22-, 33-, and 44-spin interactions. For example, referring to Table 1, the subset h=h_{=} contains two parameters, JNNNJ_{\text{NNN}} and LL each contain one parameter, and ¬K\neg K is an empty set, so the control-parameter set (h=,JNNN,¬K,L(h_{=},J_{\text{NNN}},\neg K,L) contains the four parameters

u0=h0=h1u1=h2=h3u2=J03=J12u3=L.\begin{matrix}u^{0}&=&h_{0}=h_{1}\\ u^{1}&=&h_{2}=h_{3}\\ u^{2}&=&J_{03}=J_{12}\\ u^{3}&=&L\ .\end{matrix} (104)
\SetRow hh (h0=h1=h2=h3)(h_{0}=h_{1}=h_{2}=h_{3})
h=h_{=} (h0=h1),(h2=h3)(h_{0}=h_{1}),(h_{2}=h_{3})
h×h_{\times} (h0=h3),(h1=h2)(h_{0}=h_{3}),(h_{1}=h_{2})
hih_{i} All fields independently controlled
¬J\neg J JNNJ_{\text{NN}}=1, JNNNJ_{\text{NNN}}=0 (uncontrolled)
JNNJ_{\text{NN}} (J01=J02=J13=J23)(J_{01}=J_{02}=J_{13}=J_{23})
JNNNJ_{\text{NNN}} (J03=J12)(J_{03}=J_{12})
JijJ_{ij} All two-spin couplings independently controlled
¬K\neg K K=0K=0 (uncontrolled)
KK (K012=K013=K023=K123)(K_{012}=K_{013}=K_{023}=K_{123})
K=K_{=} (K012=K013),(K023=K123)(K_{012}=K_{013}),(K_{023}=K_{123})
K×K_{\times} (K012=K023),(K013=K123)(K_{012}=K_{023}),(K_{013}=K_{123})
KijkK_{ijk} All three-spin interactions independently controlled
¬L\neg L L=0L=0 (uncontrolled)
LL Four-spin interaction controlled
Table 1: Defining constraints of the control subsets comprising the 200 parameter sets in (G).

To calculate the MWPs, we used a relaxation method [52, 34]. An auxiliary variable rr is introduced, defining a system of partial differential equations

2uit2+j,kΓjki(𝒖)ujtukt=uir\displaystyle\frac{\partial^{2}u^{i}}{\partial t^{2}}+\sum_{j,k}\Gamma^{i}_{\phantom{i}jk}(\bm{u})\frac{\partial u^{j}}{\partial t}\frac{\partial u^{k}}{\partial t}=-\frac{\partial u^{i}}{\partial r} (105a)
ui(0,r)=u0i,ui(τ,r)=uτir\displaystyle u^{i}(0,r)=u^{i}_{0},\ u^{i}(\tau,r)=u^{i}_{\tau}\ \ \ \forall\,r (105b)
ui(t,0)=gi(t)\displaystyle u^{i}(t,0)=g^{i}(t) (105c)

for protocol endpoints u0iu^{i}_{0} and uτiu^{i}_{\tau} and initial curve gi(t)g^{i}(t). If this system is sufficiently well-behaved, the curve ui(t,r)u^{i}(t,r) approaches a solution of the geodesic equation (20) as rr\to\infty for any initialization gi(t)g^{i}(t). When rr and tt are discretized, Δr\Delta r plays the role of a learning rate, scaling the magnitude of the updates ui(t,r)ui(t,r+Δr)u^{i}(t,r)\to u^{i}(t,r+\Delta r). We took ΔrΔt2\Delta r\sim\Delta t^{2} to ensure stability. Rapid convergence was achieved by initializing with a large timestep Δt=0.02\Delta t=0.02 and progressively refining the grid to a maximum resolution Δt=0.00125\Delta t=0.00125.

We obtained an analytic (if inelegant) expression for the derivatives of the friction tensor using the operator decomposition (43) with the rate law wμν=πμ/(πμ+πν)w_{\mu\nu}=\pi_{\mu}/(\pi_{\mu}+\pi_{\nu}) for Boltzmann distribution 𝝅\bm{\pi} and the form (36) of the Drazin inverse. The Christoffel symbols (21) were then rapidly computed on-the-fly from local information at each point on the discretized curve.

The spectra exhibit eigenvalue-crossings, so ordinary rank-ordering (i.e., λ0(t)λ1(t)\lambda_{0}(t)\geq\lambda_{1}(t)\geq\cdots for all tt) results in non-smooth evolution λα(t)\lambda_{\alpha}(t) of the eigenvalues and discontinuities in the evolution 𝝍α(t)\bm{\psi}_{\alpha}(t) of eigenvectors. In order to preserve continuity of physically meaningful eigenvectors, we computed the rank-ordered spectrum at t=0t=0 and used Rayleigh-quotient iteration [65] over a fine interpolation of the protocols (Δt=8.33×105\Delta t=8.33\times 10^{-5}, or 12,000 points) to obtain the spectrum at time (n+1)Δt(n+1)\Delta t from the spectrum at time nΔtn\Delta t.


References