
Adaptive Real-Time Grid Operation via Online Feedback Optimization with Sensitivity Estimation

Miguel Picallo1, Lukas Ortmann1, Saverio Bolognani, Florian Dörfler Automatic Control Laboratory, ETH Zurich, 8092 Zurich, Switzerland
{miguelp,ortmannl,bsaverio,dorfler}@ethz.ch
Abstract

In this paper we propose an approach based on an Online Feedback Optimization (OFO) controller with grid input-output sensitivity estimation for real-time grid operation, e.g., at subsecond time scales. The OFO controller uses grid measurements as feedback to update the value of the controllable elements in the grid, and track the solution of a time-varying AC Optimal Power Flow (AC-OPF). Instead of relying on a full grid model, e.g., grid admittance matrix, OFO only requires the steady-state sensitivity relating a change in the controllable inputs, e.g., power injections set-points, to a change in the measured outputs, e.g., voltage magnitudes. Since an inaccurate sensitivity may lead to a model-mismatch and jeopardize the performance, we propose a recursive least-squares estimation that enables OFO to learn the sensitivity from measurements during real-time operation, turning OFO into a model-free approach. We analytically certify the convergence of the proposed OFO with sensitivity estimation, and validate its performance on a simulation using the IEEE 123-bus test feeder, and comparing it against a state-of-the-art OFO with constant sensitivity.

Index Terms:
Online Feedback Optimization, Real-time AC Optimal Power Flow, Recursive Estimation, Voltage Regulation

1 These two authors contributed equally.
Funding by the Swiss Federal Office of Energy through the projects “ReMaP” (SI/501810-01) and “UNICORN” (SI/501708), the Swiss National Science Foundation through the NCCR Automation, and by the ETH Foundation is gratefully acknowledged.

I Introduction

The increasing amount of controllable, yet sometimes unpredictable, power resources in electrical grids, e.g., renewable generation, electric vehicles, and flexible loads, leads to new challenges and opportunities in the operation of power systems. On the one hand, these new controllable elements make it possible to minimize the grid operational cost and promote a transition to a more sustainable power system. On the other hand, given the volatility and unpredictability of these resources, fast control decisions are required to avoid constraint violations, e.g., overvoltages. This is especially relevant in distribution grids, where many of these resources are deployed. However, measurement scarcity and poor grid models challenge grid operation at such low voltage levels.

One way to leverage the controllability of these resources and to optimize the grid operation is by solving an AC Optimal Power Flow (AC-OPF) [1], an optimization problem to determine the set-points of controllable resources that minimize the operational cost and enforce grid safety requirements, e.g., voltage limits, line thermal limits, etc. Unfortunately, standard AC-OPF requires a) full grid observability, e.g., measurements of all active and reactive power injections and consumptions, and b) an accurate nonlinear grid model, e.g., its admittance matrix [1]. Yet, learning the model may require an extensive deployment of measurements across the network [2, 3], usually not available or affordable on the distribution system level. Furthermore, the volatility of renewable energy sources and household loads requires high sampling and control-loop rates to satisfy the grid constraints. Yet, solving a computationally expensive AC-OPF may pose a limit on these rates.

Online Feedback Optimization (OFO) [4, 5, 6] is a novel computationally efficient approach that allows to track the solutions of an AC-OPF problem under time-varying conditions using subsecond control-loop rates. OFO is based on a controller that uses grid measurements as feedback to iteratively steer the controllable input set-points towards the AC-OPF solutions, and has already been successfully tested in both simulations and experimental settings [7]. Furthermore, OFO neither requires full grid observability [8], nor an accurate nonlinear grid model. It only needs measurements of the outputs that need to be controlled, and the input-output sensitivity that matches a change in the input to a change in the output. This sensitivity is essentially a derivative of the power flow equations at the operating point [9], and thus depends on the grid state and exogenous disturbances, e.g., loads. Hence, constructing an accurate sensitivity requires the grid model and full measurements of the grid to evaluate it. To avoid these requirements, some OFO approaches use a constant approximate linear model, and thus a constant approximate sensitivity [6, 8, 7]. Even though OFO is robust against small approximation errors in this sensitivity [7], an inaccurate sensitivity introduces a model-mismatch that may lower the approach performance [10]. Therefore, some model-free approaches try to operate the system optimally without requiring a model or sensitivity. First, reinforcement learning allows to disregard the model, and instead take decisions based solely on measurements [11]. However, reinforcement learning has limited theoretical guarantees, and may not be able to enforce the grid safety constraints during its learning phase. Second, data-driven control [12, 13, 14] based on Willems Fundamental lemma [15] allows to compute the sensitivity after gathering sufficient data. 
Yet, these approaches estimate a constant linear model, and thus may fail to adapt to different operating points. Finally, zeroth-order gradient-free methods such as [16] make it possible to operate the system while continuously estimating and updating the sensitivity. However, [16] requires a sufficient time-scale separation between the sensitivity estimation procedure and the feedback optimization, which may lower the convergence rate of the entire approach if the measurement sample rate is restricted due to communication limits.

Therefore, in this paper, with a similar spirit as in the extremum seeking approach [16], we propose a model-free OFO approach that sequentially estimates a time-varying sensitivity while operating the grid, bypassing the need to know the whole grid model accurately, and to have full grid observability. Our contributions are as follows: First, we design a sensitivity learning approach via recursive least squares [17, 18]. We use as measurements the change in the outputs caused by a change of the controllable inputs. Second, we combine this sensitivity estimation with a persistently exciting OFO that gathers enough information about the sensitivity while driving the control inputs towards the AC-OPF solutions. Third, we certify the convergence of both the estimated sensitivity and the control input towards the true sensitivity and the time-varying solution of the AC-OPF, respectively. Fourth and finally, we simulate the proposed OFO controller with sensitivity estimation on the 3-phase, unbalanced IEEE 123-bus test feeder [19] using real consumption data, and show its superior performance over a state-of-the-art OFO with a constant sensitivity approximation.

The paper is structured as follows: Section II presents some preliminaries on grid models, AC-OPF and OFO. Section III explains our proposed OFO with sensitivity estimation approach, and provides theoretical convergence guarantees. Section IV shows the simulation on a test feeder. Finally, Section V concludes and discusses further work.

II Preliminaries: Grid Model, AC-OPF and OFO

II-A Grid Model

For each bus $i$ of an $n$-bus power system we define the voltage magnitude as $v_{i}\in\mathbb{R}$, and the active and reactive power as $p_{i}\in\mathbb{R}$ and $q_{i}\in\mathbb{R}$, respectively. We obtain the vectors $v$, $p$, and $q$ of dimension $n$ by stacking the individual bus quantities, e.g., $v=[v_{1},\dots,v_{n}]^{T}$. We define the control input vector $u\in\mathbb{R}^{n_{u}}$ consisting of all the controllable resources (e.g., active and reactive generation and flexible loads in $p$ and $q$, slack bus voltage magnitude $v_{1}$ through tap changers); the output vector $y$ (e.g., voltage magnitude elements in $v$) with all the quantities that we measure and want to control through the inputs; and the disturbance vector $d$ with all uncontrollable power injections (e.g., conventional consumption loads in $p$ and $q$). The grid admittance matrix and the power flow equations define an input-output map $\mathdutchcal{h}(\cdot)$ that characterizes the output $y$ as a nonlinear function of $u$ and $d$:

$$y=\mathdutchcal{h}(u,d). \tag{1}$$

The input-output map $\mathdutchcal{h}(\cdot)$ is typically not available in closed form, since in general it is not possible to derive an analytical expression of $v$ (in $y$) as a function of $p$ and $q$ (in $u$ and $d$) using the power flow equations [1]. Yet, the local existence of a continuously differentiable map $\mathdutchcal{h}(\cdot)$ can be guaranteed by the implicit function theorem [20].

II-B AC Optimal Power Flow for Grid Operation

The operation of a power grid consists of deciding the input $u_{t}$ at each time instant $t$. The AC-OPF formulates this decision process as an optimization problem:

$$\begin{split}u^{*}_{t},y_{t}^{*}=&\arg\min_{u\in\mathcal{U}_{t},\,y}\; f(u)+g(y)\\ &\text{ s.t. }y=\mathdutchcal{h}(u,d_{t}),\end{split} \tag{2}$$

where $f(u)$ is the operational cost on the input $u$; $g(y)$ is a penalty function to enforce some grid specification on the output $y$, e.g., voltage limits; $\mathcal{U}_{t}$ is the time-varying set of admissible inputs that defines the operational constraints on $u_{t}$, e.g., power limits $\mathcal{U}_{t}=\{u\,|\,\underline{u}_{t}\leq u\leq\overline{u}_{t}\}$; and $d_{t}$ is the disturbance value at time $t$, e.g., uncontrollable loads or non-dispatchable generation. The nonlinear input-output model (1) in (2) relates the outputs to the chosen input.

Optimal real-time decision making consists of first taking measurements of $d_{t}$, then solving the AC-OPF problem (2), and finally applying the solution $u^{*}_{t}$ to the system. This procedure is repeated at the next time step $t+1$.

II-C Linear Power Flow Approximation

Solving AC-OPF problems (2) to determine the set-points of power resources is a compelling and valuable tool for grid operators, but it comes with some drawbacks: First, the full nonlinear model of the grid $\mathdutchcal{h}(u,d)$ is needed. Second, solving the AC-OPF (2) can be computationally expensive, which may jeopardize its use for real-time grid operation. This can be circumvented by linearizing the map $\mathdutchcal{h}(\cdot)$ in (1) at an operating point [21, 22, 1], e.g., the zero-injection point $(u_{\text{op}},d_{\text{op}})=(0,0)$, to obtain the approximation

$$y=H_{0}u+D_{0}d+y_{0}, \tag{3}$$

where $y_{0}$ is an offset representing the output value when $u=d=0$, e.g., $1$ p.u. for all voltage magnitudes. The matrices $H_{0}=\nabla_{u}\mathdutchcal{h}(u,d)|_{(u_{\text{op}},d_{\text{op}})}$ and $D_{0}=\nabla_{d}\mathdutchcal{h}(u,d)|_{(u_{\text{op}},d_{\text{op}})}$ are evaluated at the operating point, and represent the sensitivities of the output with respect to changes in the input $u$ and disturbance $d$, respectively. This linear approximation (3) can substitute the nonlinear map $\mathdutchcal{h}(\cdot)$ in the AC-OPF (2) to get

$$\min_{u\in\mathcal{U}_{t}}f(u)+g(H_{0}u+D_{0}d_{t}+y_{0}). \tag{4}$$

II-D Online Feedback Optimization (OFO)

Solving the AC-OPF with linear power flow approximation (4) is computationally efficient and could be employed in real-time operation. However, this approach does not take advantage of output measurements $y_{t}$, since it only feeds $d_{t}$ through the inaccurate linear model (3). Hence, such a feedforward approach introduces a model mismatch that can cause a performance degradation, and even lead to constraint violations, e.g., under- and overvoltages.

Instead, OFO is a novel approach [5, 6, 4] that uses $y_{t}$ as feedback to achieve a safer grid operation and track the solution of the AC-OPF (2) under time-varying conditions. For that, OFO turns a standard optimization algorithm, in our case projected gradient descent [23], into a feedback controller that takes the grid output measurements $y_{t}$, instead of computing the output $y_{t}$ via the grid model (1) or the linearized one (3). Projected gradient descent consists of a gradient step and a projection: First, we compute the gradient of the cost function in (4):

$$\nabla_{u}\big(f(u)+g(y)\big)\overset{(3)}{=}\nabla_{u}f(u)+H_{0}^{T}\nabla_{y}g(y).$$

To minimize the operational cost, the current input $u_{t}$ is pushed along the direction of the negative gradient with a step size $\alpha$, and then it is projected onto the feasible space $\mathcal{U}_{t}$ to enforce the operational constraints on the input, i.e.,

$$u_{t+1}=\Pi_{\mathcal{U}_{t}}\big[u_{t}-\alpha\big(\nabla_{u}f(u_{t})+H_{0}^{T}\nabla_{y}g(y_{t})\big)\big], \tag{5}$$

where $\Pi_{\mathcal{U}}[u]=\arg\min_{z\in\mathcal{U}}\lVert u-z\rVert_{2}^{2}$ is the projection of $u$ onto $\mathcal{U}$, which is typically easy to evaluate for power grid operation [6], especially if $\mathcal{U}_{t}=\{u\,|\,\underline{u}_{t}\leq u\leq\overline{u}_{t}\}$ is a box constraint.
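As an illustration, the update (5) with a box constraint can be sketched in a few lines of Python with NumPy. This is a minimal sketch, not the authors' implementation: the function names `project_box` and `ofo_step`, the toy matrix `H0`, the cost gradients, and all numerical values are assumptions chosen only to make the example runnable.

```python
import numpy as np

def project_box(u, lower, upper):
    """Euclidean projection onto the box {u | lower <= u <= upper}."""
    return np.clip(u, lower, upper)

def ofo_step(u, y, H0, grad_f, grad_g, alpha, lower, upper):
    """One projected-gradient OFO update following (5); y is a measurement."""
    gradient = grad_f(u) + H0.T @ grad_g(y)
    return project_box(u - alpha * gradient, lower, upper)

# Toy data (illustrative assumptions, not taken from the paper).
H0 = np.array([[0.5, 0.1], [0.2, 0.4]])
u_ref = np.array([1.0, 1.0])

def grad_f(u):
    return u - u_ref                         # f(u) = 0.5 * ||u - u_ref||^2

def grad_g(y):
    return 10.0 * np.maximum(y - 1.06, 0.0)  # one-sided quadratic penalty

u = np.array([0.0, 0.0])
y = np.array([1.00, 1.00])                   # stand-in for a grid measurement
u_next = ofo_step(u, y, H0, grad_f, grad_g, alpha=0.5, lower=0.0, upper=0.8)
u_clipped = ofo_step(u, y, H0, grad_f, grad_g, alpha=1.0, lower=0.0, upper=0.8)
```

For a box constraint the projection reduces to an element-wise clip, which is why each OFO iteration is cheap compared with solving (2).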

III Online Feedback Optimization with Sensitivity Estimation

OFO controllers are robust, i.e., they preserve stability, against using a constant power flow sensitivity approximation $H_{0}$ instead of the actual one $\nabla_{u}\mathdutchcal{h}(u,d)$ [7, 10]. Unfortunately, even if the overall system is stable, a model mismatch between $H_{0}$ and $\nabla_{u}\mathdutchcal{h}(u,d)$ may lead to a difference between the solution $u_{t}^{*}$ of the AC-OPF problem and the values $u_{t}$ produced by the OFO controller (5) [10]. Therefore, we propose an approach to sequentially update the sensitivity $H_{0}$ into a good approximation of the true sensitivity $\nabla_{u}\mathdutchcal{h}(u,d)$, and thus avoid a potential performance degradation. For that, we consider the sensitivity as a time-varying parameter $H_{t}=\nabla_{u}\mathdutchcal{h}(u_{t},d_{t})$, and propose a recursive least-squares approach to generate sensitivity estimates $\hat{H}_{t}$ using the measured variations of $u$ and $y$ over time, $\Delta u$ and $\Delta y$, respectively. Then, in every time step we feed this estimated sensitivity $\hat{H}_{t}$ to the OFO as in Figure 1.

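[Block diagram: the system $\mathdutchcal{h}(u,d)$, with input $u$, output $y$, and disturbance $d$, in closed loop with the block "OFO with sensitivity estimation". This block contains the sensitivity estimation (9), which is fed by $\Delta u$ and $\Delta y$ and produces $\hat{H}$, and the Online Feedback Optimization (10).]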
Figure 1: Model-free grid operation via Online Feedback Optimization (OFO) with sensitivity estimation.

III-A Sensitivity Estimation

Due to the nonlinearity of $\mathdutchcal{h}(u,d)$, the true sensitivity $\nabla_{u}\mathdutchcal{h}(u,d)$ depends on the values of $u$ and $d$. The temporal variation of the disturbance $d_{t}$ and of the input $u_{t}$, the latter, e.g., due to applying the OFO controller (5), produces a time-varying sensitivity $H_{t}=\nabla_{u}\mathdutchcal{h}(u_{t},d_{t})$. Instead of learning the dependency on $u$ and $d$, we model the time-varying sensitivity $H_{t}$ with the following random process:

$$h_{t}=h_{t-1}+\omega_{p,t-1}, \tag{6}$$

where $h=\text{vec}(H)$ is the column-wise vector representation of the sensitivity matrix $H$, $\Delta u_{t-1}=u_{t}-u_{t-1}$ denotes a change of the input $u$, and $\omega_{p,t}\sim\mathcal{N}(0,\Sigma_{p,t})$ is a Gaussian process noise with covariance $\Sigma_{p,t}=\Sigma_{p_{1}}+\Sigma_{p_{2}}\lVert\Delta u_{t}\rVert_{2}^{2}$ that represents how the sensitivity changes over time. We let the part $\Sigma_{p_{2}}$ of the process noise scale with $\lVert\Delta u_{t}\rVert_{2}^{2}$, since a large $\Delta u_{t}$ can trigger a larger change in the true sensitivity $\nabla_{u}\mathdutchcal{h}(u,d)$, which depends on $u$, and keep the part $\Sigma_{p_{1}}$ independent of $\Delta u_{t}$ to account for an uncontrolled random change $\Delta d_{t}=d_{t+1}-d_{t}$ that can affect the sensitivity as well.

Next, to derive a measurement equation for the sensitivity $H_{t}$, consider the first-order Taylor approximation of $y_{t}$

$$\begin{split}\overbrace{\mathdutchcal{h}(u_{t},d_{t})}^{y_{t}}\approx&\overbrace{\mathdutchcal{h}(u_{t-1},d_{t-1})}^{y_{t-1}}+\overbrace{\nabla_{u}\mathdutchcal{h}(u_{t-1},d_{t-1})}^{H_{t-1}}\Delta u_{t-1}\\ &+\nabla_{d}\mathdutchcal{h}(u_{t-1},d_{t-1})\Delta d_{t-1}.\end{split} \tag{7}$$

At each time $t$, we measure $y_{t}$, and compute the variation $\Delta y_{t-1}=y_{t}-y_{t-1}$. Based on the Taylor approximation (7), we treat this variation $\Delta y_{t-1}$ as a noisy linear measurement of $H_{t-1}$ through a measurement model that depends on $\Delta u_{t-1}$:

$$\Delta y_{t-1}=\underbrace{H_{t-1}\Delta u_{t-1}}_{=U_{\Delta,t-1}h_{t-1}}+\omega_{m,t-1}, \tag{8}$$

where $U_{\Delta,t}=\Delta u_{t}^{T}\otimes\mathbbm{1}$, with the Kronecker product $\otimes$, and $\omega_{m,t}\sim\mathcal{N}(0,\Sigma_{m,t})$ is a Gaussian measurement noise with covariance $\Sigma_{m,t}=\Sigma_{m_{1}}+\Sigma_{m_{2}}\lVert\Delta u_{t}\rVert_{2}^{2}+\Sigma_{m_{3}}\lVert\Delta u_{t}\rVert_{2}^{4}$. Again, the part $\Sigma_{m_{1}}$ independent of $\Delta u_{t}$ in the measurement noise represents the effect of an uncontrolled random disturbance change $\Delta d_{t}$, while the other parts $\Sigma_{m_{2}}$ and $\Sigma_{m_{3}}$ encapsulate the second-order error of the Taylor approximation (7).
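The rewriting of $H\Delta u$ as $U_{\Delta}h$ in (8) relies on the identity $\text{vec}(H\Delta u)=(\Delta u^{T}\otimes\mathbbm{1})\,\text{vec}(H)$ for the column-wise vectorization. The following NumPy sketch checks it numerically; the dimensions and the random data are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_y, n_u = 3, 2                       # arbitrary illustrative dimensions
H = rng.standard_normal((n_y, n_u))   # some sensitivity matrix
du = rng.standard_normal(n_u)         # some input increment

h = H.flatten(order="F")              # column-wise vec(H)
U_delta = np.kron(du.reshape(1, -1), np.eye(n_y))  # shape (n_y, n_y * n_u)

lhs = H @ du                          # H * Delta u
rhs = U_delta @ h                     # U_Delta * vec(H)
identity_holds = np.allclose(lhs, rhs)
```

Note the `order="F"` argument: NumPy flattens row-wise by default, which would not match the column-wise vectorization used in the text.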

To update the sensitivity estimate $\hat{h}_{t}$, we combine the information given by the previous sensitivity estimate $\hat{h}_{t-1}=\text{vec}(\hat{H}_{t-1})$, and the measurements $\Delta y_{t-1}$ (8). We compute the new sensitivity estimate $\hat{h}_{t}$ through a Bayesian update represented in the following least-squares problem [17, 18]:

$$\hat{h}_{t}=\arg\min_{\hat{h}}\lVert\hat{h}-\hat{h}_{t-1}\rVert_{\Sigma_{t-1}^{-1}}^{2}+\lVert\Delta y_{t-1}-U_{\Delta,t-1}\hat{h}\rVert_{\Sigma_{m,t-1}^{-1}}^{2},$$

where $\Sigma_{t}$ is the covariance matrix representing the uncertainty of the sensitivity estimate $\hat{h}_{t}$, and $\lVert x\rVert_{A}^{2}=x^{T}Ax$ is the norm of $x$ with respect to a positive definite matrix $A$. The resulting recursive estimation can be expressed as a Kalman filter [24]:

$$\begin{split}\hat{h}_{t}=&\hat{h}_{t-1}+K_{t-1}(\Delta y_{t-1}-U_{\Delta,t-1}\hat{h}_{t-1})\\ \Sigma_{t}=&\big(\mathbbm{1}-K_{t-1}U_{\Delta,t-1}\big)\Sigma_{t-1}+\Sigma_{p,t-1},\end{split} \tag{9}$$

where $\mathbbm{1}$ is the identity matrix, and $K_{t}=\Sigma_{t}U_{\Delta,t}^{T}(\Sigma_{m,t}+U_{\Delta,t}\Sigma_{t}U_{\Delta,t}^{T})^{-1}$ is the Kalman gain, which is well defined for an invertible $\Sigma_{m,t}$, see Assumption 1 below.
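A minimal NumPy sketch of the recursion (9) is given below. It is an illustration under simplifying assumptions rather than the authors' code: the function name `sensitivity_update` is hypothetical, the true sensitivity is held constant, the increments are noise-free, and the covariances `Sigma_m` and `Sigma_p` are kept constant instead of depending on $\Delta u_{t}$.

```python
import numpy as np

def sensitivity_update(h_hat, Sigma, du, dy, Sigma_m, Sigma_p):
    """One recursive least-squares (Kalman) update of vec(H), following (9)."""
    n_y = dy.size
    U = np.kron(du.reshape(1, -1), np.eye(n_y))  # U_Delta = du^T kron identity
    S = Sigma_m + U @ Sigma @ U.T                # innovation covariance
    K = np.linalg.solve(S, U @ Sigma).T          # gain Sigma U^T S^{-1} (Sigma, S symmetric)
    h_new = h_hat + K @ (dy - U @ h_hat)
    Sigma_new = (np.eye(h_hat.size) - K @ U) @ Sigma + Sigma_p
    return h_new, Sigma_new

# Toy check: constant "true" sensitivity, noise-free increments (assumptions).
rng = np.random.default_rng(0)
H_true = np.array([[0.5, 0.1], [0.2, 0.4]])
h_hat = np.zeros(4)                  # uninformed initial estimate
Sigma = np.eye(4)                    # initial uncertainty
Sigma_m = 1e-4 * np.eye(2)           # held constant here for simplicity
Sigma_p = 1e-6 * np.eye(4)
for _ in range(50):
    du = rng.standard_normal(2)      # exciting input increments
    dy = H_true @ du                 # what the grid would return
    h_hat, Sigma = sensitivity_update(h_hat, Sigma, du, dy, Sigma_m, Sigma_p)

H_hat = h_hat.reshape((2, 2), order="F")
estimation_error = np.linalg.norm(H_hat - H_true)
```

The gain is computed with `np.linalg.solve` instead of an explicit inverse, which is the usual numerically safer choice; it relies on $\Sigma_{t}$ and the innovation covariance being symmetric.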

Remark 1

Note that for a diagonal measurement noise covariance $\Sigma_{m,t}=\sigma_{m,t}\mathbbm{1}$, in the limit $\sigma_{m,t}\to\infty$, the gain is $K_{t}=0$, thus the sensitivity is not updated, and we keep the initial sensitivity, i.e., $\hat{h}_{t}=\hat{h}_{t-1}=\cdots=\hat{h}_{0}$. Similarly, a large $\Sigma_{m,t}$ diminishes $K_{t}$, and helps to tune how fast we learn or depart from the initial sensitivity. On the other hand, the process noise covariance $\Sigma_{p,t}$ represents our trust in our current model (a larger $\Sigma_{p,t}$ means less trust), and it also helps to tune the learning rate.

III-B Persistently Exciting OFO

To learn the time-varying sensitivity $H_{t}$, we need to capture enough information via the measurement equation (8), i.e., we need to use different $\Delta u$ to explore different reactions $\Delta y$ and infer different elements of $H_{t}$ from them. This can be formalized via the persistency of excitation condition [25]: $\Delta u_{t}$ is persistently exciting if there exists a time span $T>0$, such that for all $t>0$, the matrix formed by columns $\Delta u_{t+i}$ for $i\in\{0,\dots,T\}$ has full rank, i.e., $\text{rank}([\Delta u_{t},\dots,\Delta u_{t+T}])=n_{u}$. To achieve persistency of excitation, we perturb the OFO step (5) with $\omega_{u,t}\in\mathbb{R}^{n_{u}}$, a bounded zero-mean white noise with independent and identically distributed elements with standard deviation $\sigma_{u}$, e.g., a truncated Gaussian distribution. As a result, we obtain the following persistently exciting OFO with estimated sensitivity $\hat{H}_{t}$:

$$u_{t+1}=\Pi_{\mathcal{U}_{t}}\big[u_{t}-\alpha\big(\nabla_{u}f(u_{t})+\hat{H}_{t}^{T}\nabla_{y}g(y_{t})\big)+\omega_{u,t}\big] \tag{10}$$
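The rank condition above can be checked on a recorded sequence of input increments. The sketch below does so for a finite record, whereas the definition requires the condition for all $t$; the function name `is_persistently_exciting` and the toy data are assumptions made for illustration.

```python
import numpy as np

def is_persistently_exciting(delta_us, T):
    """Check that every window of T + 1 consecutive increments spans R^{n_u}.

    delta_us has shape (N, n_u); row k holds the increment Delta u_k.
    A record with fewer than T + 1 rows has no window and returns True.
    """
    N, n_u = delta_us.shape
    for t in range(N - T):
        window = delta_us[t:t + T + 1].T      # columns Delta u_t, ..., Delta u_{t+T}
        if np.linalg.matrix_rank(window) < n_u:
            return False
    return True

rng = np.random.default_rng(1)
constant = np.tile([1.0, 2.0], (20, 1))                     # always the same direction
noisy = constant + 1e-2 * rng.standard_normal((20, 2))      # small added excitation

pe_constant = is_persistently_exciting(constant, T=3)
pe_noisy = is_persistently_exciting(noisy, T=3)
```

In this toy record, increments that always point in the same direction fail the test, while adding a small random perturbation restores full rank, which is the role played by $\omega_{u,t}$ in (10).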

The resulting interconnected OFO, sensitivity learning and power grid is represented in the block diagram in Figure 1. At each time tt, a complete loop of the online optimization with sensitivity estimation can be represented as:

Algorithm 1 Online Feedback Optimization (OFO) with sensitivity estimation (blue block in Figure 1)
1:  Input: $y_{t}$ (measured from the grid)
2:  Recover from previous step: $y_{t-1},u_{t-1},u_{t}$
3:  Sensitivity update using (9):
$K_{t-1}=\Sigma_{t-1}U_{\Delta,t-1}^{T}(\Sigma_{m,t-1}+U_{\Delta,t-1}\Sigma_{t-1}U_{\Delta,t-1}^{T})^{-1}$
$\hat{h}_{t}=\hat{h}_{t-1}+K_{t-1}(\Delta y_{t-1}-U_{\Delta,t-1}\hat{h}_{t-1})$
$\Sigma_{t}=\big(\mathbbm{1}-K_{t-1}U_{\Delta,t-1}\big)\Sigma_{t-1}+\Sigma_{p,t-1}$
4:  Sample the excitation noise $\omega_{u,t}\sim\mathcal{N}(0,\sigma_{u}^{2}\mathbbm{1})$ (truncated to keep it bounded, see Section III-B)
5:  Input optimization using (10):
$u_{t+1}=\Pi_{\mathcal{U}_{t}}\big[u_{t}-\alpha\big(\nabla_{u}f(u_{t})+\hat{H}_{t}^{T}\nabla_{y}g(y_{t})\big)+\omega_{u,t}\big]$
6:  Output: $u_{t+1}$
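The loop of Algorithm 1 can be sketched on a toy problem as follows. Everything specific here is an assumption made for illustration: the "grid" is a fixed linear map `grid(u)` with a hypothetical sensitivity `H_true`, only the upper voltage limit is penalized, the noise covariances are constant, and the excitation noise is a Gaussian clipped at three standard deviations as a simple bounded stand-in for a truncated Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_y = 2, 2
H_true = np.array([[0.30, 0.10], [0.10, 0.25]])   # hypothetical true sensitivity
u_ref, v_max, rho = np.ones(n_u), 1.06, 100.0
lower, upper = np.zeros(n_u), np.ones(n_u)
alpha, sigma_u = 0.03, 1e-3

def grid(u):
    """Stand-in for the physical grid: returns the measured output y."""
    return 1.0 + H_true @ u

def grad_f(u):
    return u - u_ref                              # f(u) = 0.5 * ||u - u_ref||^2

def grad_g(y):
    return rho * np.maximum(y - v_max, 0.0)       # upper voltage limit only

# Estimator state, started from a deliberately wrong sensitivity.
h_hat = (0.5 * H_true).flatten(order="F")
Sigma = np.eye(n_u * n_y)
Sigma_p = 1e-8 * np.eye(n_u * n_y)
Sigma_m = 1e-8 * np.eye(n_y)

u_prev = np.zeros(n_u)
y_prev = grid(u_prev)
u = u_prev.copy()
for t in range(500):
    y = grid(u)                                   # step 1: measure the output
    du, dy = u - u_prev, y - y_prev               # step 2: increments
    U = np.kron(du.reshape(1, -1), np.eye(n_y))   # step 3: update (9)
    S = Sigma_m + U @ Sigma @ U.T
    K = np.linalg.solve(S, U @ Sigma).T
    h_hat = h_hat + K @ (dy - U @ h_hat)
    Sigma = (np.eye(h_hat.size) - K @ U) @ Sigma + Sigma_p
    H_hat = h_hat.reshape((n_y, n_u), order="F")
    w = np.clip(sigma_u * rng.standard_normal(n_u),
                -3 * sigma_u, 3 * sigma_u)        # step 4: bounded excitation
    u_next = np.clip(u - alpha * (grad_f(u) + H_hat.T @ grad_g(y)) + w,
                     lower, upper)                # step 5: update (10)
    u_prev, y_prev, u = u, y, u_next              # step 6: apply the new input

# Reference optimum of this toy problem (interior, both outputs above v_max).
u_star = np.linalg.solve(np.eye(n_u) + rho * H_true.T @ H_true,
                         u_ref + rho * (v_max - 1.0) * (H_true.T @ np.ones(n_y)))
sensitivity_error = np.linalg.norm(H_hat - H_true)
tracking_error = np.linalg.norm(u - u_star)
```

In this toy setting the step size `alpha` is chosen below $2\eta/L^{2}$ for the operator $\mathbbm{1}+\rho H^{T}H$, so both the sensitivity estimate and the input are expected to settle close to their reference values; a real grid would add nonlinearity and time-varying disturbances that this sketch ignores.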
Remark 2

The sensitivity learning approach (9) is independent of the method used to update the input $u$, since it only requires the increment $\Delta u$ and the measured $\Delta y$. Hence, it is not only compatible with the projected-gradient-based OFO in (10), but can also be combined with a linearized AC-OPF such as (4), or with other OFO approaches, e.g., primal-dual methods [6, 7] or quadratic programming [26, 27], which may have other desirable properties, like strict constraint satisfaction or a faster convergence.

III-C Convergence Analysis

In this section we analyze the convergence of the estimated sensitivity $\hat{H}_{t}$ produced by the sensitivity learning (9), and the input $u_{t}$ produced by the OFO (10), towards the true sensitivity $H_{t}$ and the solution $u^{*}_{t}$ of the AC-OPF (2), respectively. We certify this convergence assuming that the true sensitivity $H_{t}$ behaves according to the simplified dynamic process (6) and satisfies the linear measurement equation (8); and that the projected gradient descent used in (10) is a strongly monotone and Lipschitz continuous operator:

Definition 1 (Monotone and Lipschitz operator)

An operator $F:\mathbb{R}^{n}\to\mathbb{R}^{n}$ is $\eta_{F}$-strongly monotone if $(x_{1}-x_{2})^{T}(F(x_{1})-F(x_{2}))\geq\eta_{F}\lVert x_{1}-x_{2}\rVert_{2}^{2}$ for all $x_{1},x_{2}$, and $L_{F}$-Lipschitz continuous if $\lVert F(x_{1})-F(x_{2})\rVert_{2}\leq L_{F}\lVert x_{1}-x_{2}\rVert_{2}$ for all $x_{1},x_{2}$.

Assumption 1

The functions $f(\cdot)$ and $g(\cdot)$ in (2) are continuously differentiable. The sensitivity satisfies (6) and (8) with independent $\omega_{p,t}$ and $\omega_{m,t}$. Furthermore, for all $t>0$, $\Sigma_{p,t},\Sigma_{m,t}$ have a positive lower and upper bound, i.e., there exist $\gamma,\beta>0$ such that $\gamma\mathbbm{1}\preceq\Sigma_{p,t}\preceq\beta\mathbbm{1}$, $\gamma\mathbbm{1}\preceq\Sigma_{m,t}\preceq\beta\mathbbm{1}$; there exists $L_{h}>0$ such that $\lVert\nabla_{y}g(\mathdutchcal{h}(u^{*}_{t},d_{t}))\rVert_{2}\leq L_{h}$; and the operator $F_{t}(\cdot)=\nabla_{u}f(\cdot)+H_{t}^{T}\nabla_{y}g(\mathdutchcal{h}(\cdot,d_{t}))$ in (10) is $\eta$-strongly monotone and $L$-Lipschitz continuous.

The continuous differentiability of $f(\cdot)$ and $g(\cdot)$ is common for typical cost functions in power systems, e.g., linear or quadratic $f(\cdot)$, and quadratic penalty functions like $g(\cdot)=\max(0,\cdot)^{2}$. For strongly convex and Lipschitz smooth cost functions $f(\cdot)$, the strong monotonicity and Lipschitz continuity of the gradient operator $F_{t}(\cdot)$ holds in certain regions around nominal operating points [10]. In particular, it would hold if using a usual linear approximation for the input-output map (1) [8]. Since $u$ and $d$ are restricted by the grid physical limits, e.g., power ratings, the upper bounds of $\lVert\Delta u_{t}\rVert_{2}$ and $\lVert\nabla_{y}g(\mathdutchcal{h}(u^{*}_{t},d_{t}))\rVert_{2}$ are justified, since $g(\cdot)$ is differentiable in a compact set. The persistency of excitation ensures that $\lVert\Delta u_{t}\rVert_{2}>0$ with high probability. Then, $\Sigma_{p,t}=\Sigma_{p_{1}}+\Sigma_{p_{2}}\lVert\Delta u_{t}\rVert_{2}^{2}\succ 0$ and $\Sigma_{m,t}=\Sigma_{m_{1}}+\Sigma_{m_{2}}\lVert\Delta u_{t}\rVert_{2}^{2}+\Sigma_{m_{3}}\lVert\Delta u_{t}\rVert_{2}^{4}\succ 0$ if at least one $\Sigma_{p_{i}}\succ 0$ and one $\Sigma_{m_{j}}\succ 0$ for some $i,j$. Finally, even though the true sensitivity is state dependent, i.e., $H_{t}=\nabla_{u}\mathdutchcal{h}(u_{t},d_{t})$, the process and measurement noises in (6) and (8) make it possible to overapproximate the actual behavior of the sensitivity via these simplifications. In conclusion, Assumption 1 is reasonable. Then, with a persistently exciting $\Delta u$ as in (10), we have the following convergence result:

Proposition 1

Under Assumption 1, and the persistently excited OFO updates (10), the sensitivity estimates (9) satisfy:

$$\begin{split}\text{Unbiased mean: }&\lVert\mathbb{E}[h_{t}-\hat{h}_{t}]\rVert_{2}^{2}\leq C_{h,1}e^{-C_{h,2}t}\overset{t\to\infty}{\to}0\\ \text{Bounded covariance: }&\mathbb{E}[\lVert h_{t}-\hat{h}_{t}\rVert_{2}^{2}]=\text{tr}(\Sigma_{t})\\ &\leq C_{h,3}+C_{h,4}e^{-C_{h,5}t}\to C_{h,3},\end{split} \tag{11}$$

where $\mathbb{E}[\cdot]$ denotes the expectation, $C_{h,i}>0$ are positive constants, and $\overset{t\to\infty}{\to}$ denotes the limit as $t$ goes to infinity. Furthermore, if the step size in (10) satisfies $\alpha<\tfrac{2\eta}{L^{2}}$, so that $\epsilon=\sqrt{1-2\eta\alpha+L^{2}\alpha^{2}}<1$, then we have

$$\begin{split}&\mathbb{E}[\lVert u_{t}-u_{t}^{*}\rVert_{2}]\\ \leq&\tfrac{1}{1-\epsilon}\big(\sigma_{u}+\sup_{k<t}\mathbb{E}[\lVert\Delta u^{*}_{k}\rVert_{2}]+\sqrt{C_{h,3}}\alpha L_{h}\big)\\ &+\epsilon^{t}\mathbb{E}[\lVert u_{0}-u^{*}_{0}\rVert_{2}]+\alpha L_{h}t\sqrt{C_{h,4}}\max(\epsilon,e^{\frac{-C_{h,5}}{2}})^{t-1}\\ \overset{t\to\infty}{\to}&\tfrac{1}{1-\epsilon}\big(\sigma_{u}+\sup_{k}\mathbb{E}[\lVert\Delta u^{*}_{k}\rVert_{2}]+\sqrt{C_{h,3}}\alpha L_{h}\big).\end{split} \tag{12}$$
Proof:

See Appendix. ∎

Proposition 1 establishes first that the estimated sensitivity $\hat{h}_{t}$ converges in expectation to the true sensitivity $h_{t}$ with a bounded covariance. Additionally, the control input $u_{t}$ converges to the AC-OPF solution $u_{t}^{*}$ from (2) up to a quantifiable tracking error determined by the bound $C_{h,3}$ of the sensitivity estimation covariance, the standard deviation $\sigma_{u}$ of the persistency of excitation noise $\omega_{u}$, and the temporal variation of the AC-OPF solution $\mathbb{E}[\lVert\Delta u^{*}_{t}\rVert_{2}]$, where $\Delta u^{*}_{t}$ can also be bounded by the temporal variation of $d_{t}$ and $\mathcal{U}_{t}$ in the AC-OPF (2) [28].
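To get a feeling for the step-size condition and the asymptotic bound in (12), the following Python sketch evaluates the contraction factor $\epsilon$ and the limit of the bound. The helper names and all constants ($\eta$, $L$, $C_{h,3}$, $L_{h}$, the drift of $u^{*}_{t}$) are hypothetical values chosen for illustration, not values from the paper.

```python
import math

def contraction_factor(alpha, eta, L):
    """epsilon = sqrt(1 - 2*eta*alpha + L^2*alpha^2), as defined before (12)."""
    return math.sqrt(1.0 - 2.0 * eta * alpha + (L * alpha) ** 2)

def asymptotic_bound(alpha, eta, L, sigma_u, drift, C_h3, L_h):
    """Limit of the right-hand side of (12) as t grows."""
    eps = contraction_factor(alpha, eta, L)
    return (sigma_u + drift + math.sqrt(C_h3) * alpha * L_h) / (1.0 - eps)

eta, L = 1.0, 4.0                  # hypothetical monotonicity and Lipschitz constants
alpha_max = 2.0 * eta / L ** 2     # step sizes below this value give epsilon < 1
alpha_best = eta / L ** 2          # minimizes epsilon (not the whole bound)
eps_at_max = contraction_factor(alpha_max, eta, L)
eps_best = contraction_factor(alpha_best, eta, L)
bound = asymptotic_bound(alpha_best, eta, L,
                         sigma_u=1e-4, drift=1e-3, C_h3=1e-4, L_h=1.0)
```

Since $\epsilon^{2}$ is a quadratic in $\alpha$, it is minimized at $\alpha=\eta/L^{2}$, where $\epsilon=\sqrt{1-\eta^{2}/L^{2}}$. The bound also shows the trade-off in choosing $\sigma_{u}$: more excitation helps the estimation but enters the tracking error directly.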

IV Test Case

In this section we validate the proposed OFO with sensitivity estimation. We simulate a benchmark distribution grid under time-varying conditions during a 1-hour simulation with 1-second resolution, i.e., with one control update per second. In particular, we show its superior performance against an OFO approach with a constant sensitivity. First we explain the simulation setup, and then we discuss the results obtained.

IV-A Simulation Setup

Figure 2: IEEE 123-bus test feeder [19]. Distributed generation: yellow diamond = solar, grey parallelogram = wind. Lines with perturbed electrical parameters: blue square-dotted.
  • Distribution grid: We use the 3-phase, unbalanced IEEE 123-bus test feeder [19] in Figure 2.

  • Disturbance dd: We consider uncontrollable active and reactive loads in our disturbance vector dd. To generate these load profiles we use 11-second resolution data of the ECO data set [29], then aggregate households and rescale them to the base loads of the 123-bus feeder. This gives us values of dtd_{t} for every second during simulation time of 1h.

  • Controllable input set-points uu: We add two solar PV systems and two wind turbines to the grid as in [8], see Figure 2. They can inject active power, and inject and absorb reactive power on all three phases, which gives us 24 control inputs. We consider a slack bus 150 in Figure 2, with a controllable voltage magnitude through, e.g., a tap changer, which makes in total nu=25n_{u}=25. The solar and wind generation profiles are generated based on a 11-minute solar irradiation profile [30] and a 22-minute wind speed profile [31]. Generation is assumed constant between samples. We use these profiles to set the time-varying upper limit of the feasible set u¯t\overline{u}_{t}, set the lower limit of active generation to u¯t=0\underline{u}_{t}=0, and define 𝒰t={u|u¯tuu¯t}\mathcal{U}_{t}=\{u\,|\,\underline{u}_{t}\leq u\leq\overline{u}_{t}\}.

  • Output yy: We consider as output yy the voltage magnitudes of all phases at all buses except the slack bus, given that it is a control input.

  • AC-OPF cost function in (2): We use a quadratic cost that penalizes deviating from a reference: f(u)=12uuref22f(u)=\frac{1}{2}\lVert u-u_{\text{ref}}\rVert_{2}^{2}. The reference urefu_{\text{ref}} for the voltage magnitude at the slack bus is 11 p.u. The reference for the controllable generation is the maximum installed power to promote using as much renewable energy as possible. The reference for reactive power is 0. Note that the cost function is continuously differentiable, and has a strongly monotone and Lipschitz continuous gradient as required in Assumption 1. We consider the voltage limits [0.94p.u.,1.06p.u.][0.94\,\text{p.u.},1.06\,\text{p.u.}] for all nodes as in [5, 8], and use the penalty function g(y)=ρ2max([𝟙𝟙]y+[1.060.94],0)2g(y)=\frac{\rho}{2}\max\big{(}\left[\begin{smallmatrix}\mathbbm{1}\\ -\mathbbm{1}\end{smallmatrix}\right]y+\left[\begin{smallmatrix}-1.06\\ 0.94\end{smallmatrix}\right],0\big{)}^{2}, with a sufficiently large penalization parameter ρ=100\rho=100 to discourage violations. Again, this function is continuously differentiable, and has a monotone and Lipschitz continuous gradient.

  • Sensitivity process and measurement noises in (6) and (8): Under fast sampling rates, $\Delta d_t$ may be negligible, especially when compared to $\Delta u_t$. Hence, for the simulation we set $\Sigma_{p_1}, \Sigma_{m_1}, \Sigma_{m_2}$ to $0$, and keep $\Sigma_{p_2}, \Sigma_{m_3} \succ 0$. This ensures that $\Sigma_{p,t}, \Sigma_{m,t} \succ 0$ for all $t$, as required by Assumption 1.

  • Persistency of excitation: We use a symmetric truncated Gaussian distribution with $\sigma_u = 0.0001$ p.u. to generate a small persistency-of-excitation noise $\omega_{u,t}$ that facilitates the sensitivity learning without introducing a large deviation in the input convergence, see (12).

  • Initializing sensitivity and linear model (3): We use the zero-injection operating point $u_{\text{op}} = 0, d_{\text{op}} = 0$ to initialize the sensitivity estimation, i.e., $\hat{H}_0 = H_0 = \nabla_u \mathdutchcal{h}(u,d)|_{(0,0)}$, see (3). In the first simulation (1: true admittance) we use the true admittance to compute $H_0$; in the second (2: perturbed admittance) we use a perturbed admittance matrix, where we have introduced an error of up to $20\%$ in the admittance of the lines indicated in Figure 2.
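The settings above can be condensed into a minimal sketch of one projected OFO step, $u_{t+1} = \Pi_{\mathcal{U}_t}[u_t - \alpha(\nabla_u f(u_t) + \hat{H}_t^T \nabla_y g(y_t)) + \omega_{u,t}]$, as it appears in the proof of Proposition 1. The function names, the box projection via clipping, and the decision to pass the excitation noise as an argument are assumptions made for illustration; only the voltage limits and $\rho = 100$ are taken from the text.

```python
import numpy as np

V_MIN, V_MAX, RHO = 0.94, 1.06, 100.0  # voltage limits (p.u.) and penalty weight from the text


def grad_f(u, u_ref):
    """Gradient of f(u) = 0.5 * ||u - u_ref||^2."""
    return u - u_ref


def grad_g(y, rho=RHO):
    """Gradient of the quadratic penalty on voltages outside [V_MIN, V_MAX]."""
    over = np.maximum(y - V_MAX, 0.0)    # violation of the upper limit
    under = np.maximum(V_MIN - y, 0.0)   # violation of the lower limit
    return rho * (over - under)


def ofo_step(u, y, H_hat, u_ref, u_lo, u_hi, alpha, noise=None):
    """One projected gradient step using the measured output y as feedback.

    H_hat has shape (n_y, n_u); noise is an optional excitation vector.
    """
    candidate = u - alpha * (grad_f(u, u_ref) + H_hat.T @ grad_g(y))
    if noise is not None:
        candidate = candidate + noise
    # The feasible set is a box, so the projection is an elementwise clip.
    return np.clip(candidate, u_lo, u_hi)
```

Because $\mathcal{U}_t$ is a box, the projection reduces to clipping each set-point to its time-varying limits, which keeps the per-step cost linear in the number of inputs.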

IV-B Results

We analyze the simulation performance of OFO with sensitivity learning (9) and (10), and compare it against an OFO with constant sensitivity (5). We validate both results in Proposition 1: First, the estimated sensitivity $\hat{H}_t$ converges to the real time-varying sensitivity $H_t$. Second, the input $u_t$ converges to the AC-OPF solution $u^*_t$ of (2).

IV-B1 True admittance

First we perform a simulation where we use the true admittance to derive the initial sensitivity $H_0$ in the linear power flow approximation (3). Figure 3 shows the norm of the AC-OPF solution $u^*_t$ of (2), which we calculate with the correct non-linear model $\mathdutchcal{h}(\cdot)$ and the disturbances $d_t$. This optimal input is time-varying due to the changing solar radiation and wind speed in the limits $\overline{u}_t$, and the temporal variation of the loads in $d_t$. Figure 3 also shows how the OFO control input $u_t$ converges towards the optimal input $u^*_t$ using different sensitivities: The inputs $u_H$ produced by the OFO controller (5) with the exact sensitivity $H_t = \nabla_u \mathdutchcal{h}(u_t, d_t)$ succeed in tracking the AC-OPF solution $u^*$, with relatively small differences caused by the time-varying disturbances $d_t$ and/or available energy $\overline{u}_t$. However, when using the constant sensitivity $H_0$ in (5), there is a large difference between the generated control input $u_{H_0}$ and the optimal one $u^*$. This gap is closed when using the OFO with sensitivity estimation (10), i.e., $u_{\hat{H}}$ is able to converge to the AC-OPF solution $u^*$ of (2) with a small tracking error, as predicted by Proposition 1.

Figure 4 shows the relative error $\frac{\lVert \Delta y - H\Delta u\rVert_2}{\lVert \Delta y\rVert_2}$ of the measurement equation (8). This helps to understand why OFO with sensitivity learning (9) performs better than with a constant sensitivity $H_0$: The linearization error with the estimated sensitivity $\hat{H}_t$ becomes lower than the one with $H_0$. This means that the learned sensitivity becomes a more accurate linear approximation than (3), which explains the lower optimization error observed in Figure 3. Even though the error $\frac{\lVert \Delta y - H\Delta u\rVert_2}{\lVert \Delta y\rVert_2}$ does not converge to $0$ when using $\hat{H}$, the sensitivity estimation approach (9) learns enough to drive the control set-points to the optimum, see Figure 3, which is our ultimate objective.
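The relative error discussed above can be computed from logged increments along the following lines. This is only a sketch: the function names, the guard against a zero denominator, and the moving-average window expressed in samples are assumptions, since the plotting code is not part of the text.

```python
import numpy as np


def relative_linearization_error(dy, du, H, eps=1e-12):
    """Return ||dy - H du||_2 / ||dy||_2 for one pair of increments."""
    # eps is an assumed guard for steps in which the output does not change.
    return np.linalg.norm(dy - H @ du) / max(np.linalg.norm(dy), eps)


def moving_average(values, window):
    """Moving average over `window` consecutive samples."""
    values = np.asarray(values, dtype=float)
    return np.convolve(values, np.ones(window) / window, mode="valid")
```

A sensitivity that reproduces the measured increment exactly gives an error of 0, while a sensitivity that predicts no change at all gives an error of 1.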

Finally, Figure 5 shows that the inputs $u_{\hat{H}}$ produced by the OFO with sensitivity estimation (10) result in far fewer voltage violations than $u_{H_0}$ from the OFO with constant sensitivity (5). In fact, the number of voltage violations of $u_{\hat{H}}$ gets close to that of the OFO with true sensitivity, $u_H$. Hence, the OFO with sensitivity estimation not only reduces the distance to the AC-OPF solution, see Figure 3, but also achieves better voltage regulation.
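For reference, the violation count of one voltage snapshot can be obtained as sketched below. The function name is hypothetical, and treating a voltage exactly on a limit as feasible is an assumption; the limits are those stated in the simulation settings.

```python
import numpy as np


def count_voltage_violations(y, v_min=0.94, v_max=1.06):
    """Number of entries of y (voltage magnitudes in p.u.) outside [v_min, v_max]."""
    y = np.asarray(y, dtype=float)
    return int(np.sum((y < v_min) | (y > v_max)))
```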

IV-B2 Perturbed admittance

In Figure 6 we show a simulation for which we perturb the admittance of the lines indicated in Figure 2 with an error of up to $20\%$. We observe that the OFO with sensitivity learning (10), $u_{\hat{H}}$, is still able to track the AC-OPF solution $u^*$ of (2) in time-varying conditions. The OFO with a fixed sensitivity (5) and the same step size as $u_{\hat{H}}$, $u_{H_0}$, diverges, since it tries to regulate the voltage with a wrong sensitivity that is too far from the actual one. Convergence is recovered with a lower step size in $u_{H_0,\text{slow}}$, but it still performs poorly at tracking the AC-OPF solution. This experiment allows us to conclude that the OFO with sensitivity estimation (10) is a model-free approach that does not require an accurate model, but learns it online.
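To illustrate how a sensitivity can be learned online from input and output increments, the sketch below implements a generic Kalman-filter form of recursive least squares for a random-walk parameter $h = \text{vec}(H)$ with linear measurements $\Delta y = U_\Delta h + \text{noise}$. It is not a reproduction of (9): the regressor $U_\Delta = \Delta u^T \otimes \mathbbm{1}$, the function names, the toy sensitivity, and all numerical values are assumptions chosen for illustration.

```python
import numpy as np


def rls_sensitivity_update(h_hat, P, du, dy, Sigma_p, Sigma_m):
    """One Kalman-style recursive least-squares step for h = vec(H)."""
    n_y = dy.size
    # Assumed regressor: (du^T kron I) @ vec(H) equals H @ du for column-major vec.
    U = np.kron(du.reshape(1, -1), np.eye(n_y))
    P_pred = P + Sigma_p                    # random-walk prediction of the covariance
    S = U @ P_pred @ U.T + Sigma_m          # innovation covariance
    K = np.linalg.solve(S, U @ P_pred).T    # gain P_pred U^T S^{-1} (S and P_pred symmetric)
    h_new = h_hat + K @ (dy - U @ h_hat)
    P_new = (np.eye(h_hat.size) - K @ U) @ P_pred
    return h_new, P_new


def estimate_on_toy_plant(steps=200, seed=0):
    """Learn a hypothetical constant 3x2 sensitivity from small random input steps."""
    rng = np.random.default_rng(seed)
    H_true = np.array([[1.0, 0.5], [0.2, 2.0], [-0.3, 0.8]])
    n_y, n_u = H_true.shape
    h_hat = np.zeros(n_y * n_u)             # deliberately wrong initial guess
    P = np.eye(n_y * n_u)
    Sigma_p = 1e-8 * np.eye(n_y * n_u)
    Sigma_m = 1e-6 * np.eye(n_y)
    for _ in range(steps):
        du = 0.01 * rng.standard_normal(n_u)    # exciting input increment
        dy = H_true @ du                        # noise-free toy measurement
        h_hat, P = rls_sensitivity_update(h_hat, P, du, dy, Sigma_p, Sigma_m)
    return H_true, h_hat.reshape((n_y, n_u), order="F")
```

In this toy setting the input increments are random in every step, so they excite all directions and the estimate approaches the true matrix even from a zero initial guess.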

Figure 3: Euclidean norm of the AC-OPF solution $u_t^*$, and the optimization error between $u_t^*$ and the set-points $u_t$ produced by the OFO (5), using either the true sensitivity $H$ (green with dots), the estimated sensitivity $\hat{H}$ (blue with diamonds), or the constant sensitivity at a zero-injection operating point $H_0$ (yellow with squares), with respective set-points $u_H, u_{\hat{H}}, u_{H_0}$.
Figure 4: Moving average over 5 minutes of the relative error $\frac{\lVert \Delta y - H\Delta u\rVert_2}{\lVert \Delta y\rVert_2}$ when using the learned sensitivity $\hat{H}$ (blue with diamonds) or the one fixed at a zero-injection operating point $H_0$ (yellow with squares).
Figure 5: Moving average over 5 minutes of the number of voltage violations across all nodes.
Figure 6: Same as Figure 3. For the constant sensitivity $H_0$, we plot $u_{H_0}$ (yellow) when using the same step size as $u_H, u_{\hat{H}}$, and $u_{H_0,\text{slow}}$ (red) with a smaller step size. Both $\hat{H}$ and $H_0$ are initialized with a perturbed admittance matrix $Y$.

V Conclusion and Outlook

Standard Online Feedback Optimization (OFO) typically uses an approximate input-output sensitivity, which may lower its performance. Alternatively, one can compute the actual sensitivity, but that requires an accurate grid model and full grid observability, which are usually not available. In this work we have proposed a recursive estimation approach that provides OFO with a tool to learn the model sensitivity without extensive measurements, and thus improves its performance and turns OFO into a model-free approach. We have provided convergence guarantees when approximating the time-varying sensitivity behavior by a random process with linear measurements. We have established that even under time-varying conditions the estimated sensitivity and the control input converge to a neighborhood of the true sensitivity and the solution of the AC-OPF, respectively. Finally, we have validated with simulations using the IEEE 123-bus test feeder that our proposed OFO controller with sensitivity estimation performs successfully even though the actual sensitivity is state-dependent, i.e., it is able to track a time-varying optimal input while satisfying the grid specifications. In short, the proposed OFO controller with sensitivity estimation can be used as a model-free plug-and-play controller for real-time power grid operation that enables safe and optimal control.

An interesting future addition would be to investigate a more suitable way to design the persistency of excitation, possibly linked to the optimization problem, so that it explores specific directions of interest. Additionally, it would be interesting to observe how the proposed sensitivity estimation approach performs under a sudden change of topology caused by, e.g., a line fault or a network split, and under communication problems, e.g., delays, missing packets, or recurrent outliers due to, for example, sensor miscalibration.

References

  • [1] D. K. Molzahn and I. A. Hiskens, “A Survey of Relaxations and Approximations of the Power Flow Equations,” Foundations and Trends in Electric Energy Systems, vol. 4, no. 1-2, pp. 1–221, February 2019.
  • [2] S. Bolognani, N. Bof, D. Michelotti, R. Muraro, and L. Schenato, “Identification of power distribution network topology via voltage correlation analysis,” in 52nd Conf. on Decision and Control, 2013, pp. 1659–1664.
  • [3] K. Moffat, M. Bariya, and A. Von Meier, “Unsupervised impedance and topology estimation of distribution networks—limitations and tools,” IEEE Trans. Smart Grid, vol. 11, no. 1, pp. 846–856, 2020.
  • [4] D. K. Molzahn, F. Dörfler, H. Sandberg, S. H. Low, S. Chakrabarti, R. Baldick, and J. Lavaei, “A survey of distributed optimization and control algorithms for electric power systems,” IEEE Trans. Smart Grid, vol. 8, no. 6, pp. 2941–2962, Nov. 2017.
  • [5] A. Hauswirth, A. Zanardi, S. Bolognani, F. Dörfler, and G. Hug, “Online optimization in closed loop on the power flow manifold,” in 2017 IEEE Manchester PowerTech.   IEEE, 2017, pp. 1–6.
  • [6] E. Dall’Anese and A. Simonetto, “Optimal power flow pursuit,” IEEE Trans. Smart Grid, vol. 9, no. 2, pp. 942–952, Mar. 2018.
  • [7] L. Ortmann, A. Hauswirth, I. Caduff, F. Dörfler, and S. Bolognani, “Experimental validation of feedback optimization in power distribution grids,” Electric Power Systems Research, vol. 189, p. 106782, 2020.
  • [8] M. Picallo, S. Bolognani, and F. Dörfler, “Closing the loop: Dynamic state estimation and feedback optimization of power grids,” Electric Power Systems Research, vol. 189, p. 106753, 2020.
  • [9] S. Bolognani and F. Dörfler, “Fast power system analysis via implicit linearization of the power flow manifold,” in 53rd Annual Allerton Conf. on Communication, Control, and Computing.   IEEE, 2015, pp. 402–409.
  • [10] M. Colombino, J. W. Simpson-Porco, and A. Bernstein, “Towards robustness guarantees for feedback-based optimization,” in 2019 IEEE 58th Conf. on Decision and Control.   IEEE, 2019, pp. 6207–6214.
  • [11] X. Chen, G. Qu, Y. Tang, S. Low, and N. Li, “Reinforcement learning for decision-making and control in power systems: Tutorial, review, and vision,” arXiv preprint arXiv:2102.01168, 2021.
  • [12] J. Coulson, J. Lygeros, and F. Dörfler, “Data-enabled predictive control: In the shallows of the DeePC,” in 2019 18th European Control Conference (ECC).   IEEE, 2019, pp. 307–312.
  • [13] C. Mugnier, K. Christakou, J. Jaton, M. De Vivo, M. Carpita, and M. Paolone, “Model-less/measurement-based computation of voltage sensitivities in unbalanced electrical distribution networks,” in 2016 Power Systems Computation Conference (PSCC).   IEEE, 2016, pp. 1–7.
  • [14] G. Bianchin, M. Vaquero, J. Cortes, and E. Dall’Anese, “Data-driven synthesis of optimization-based controllers for regulation of unknown linear systems,” Dec. 2021.
  • [15] J. C. Willems, P. Rapisarda, I. Markovsky, and B. L. De Moor, “A note on persistency of excitation,” Systems & Control Letters, vol. 54, no. 4, pp. 325–329, 2005.
  • [16] X. Chen, J. I. Poveda, and N. Li, “Model-free optimal voltage control via continuous-time zeroth-order methods,” Dec. 2021.
  • [17] L. Ljung, System identification: theory for the user.   PTR Prentice Hall, Upper Saddle River, NJ, 1999.
  • [18] R. Isermann and M. Münchhof, Identification of dynamic systems: an introduction with applications.   Springer Science & Business Media, 2010.
  • [19] W. Kersting, “Radial distribution test feeders,” IEEE Trans. Power Syst., vol. 6, no. 3, pp. 975–985, 1991.
  • [20] S. G. Krantz and H. R. Parks, The implicit function theorem: history, theory, and applications.   Springer Science & Business Media, 2012.
  • [21] S. H. Low, “Convex relaxation of optimal power flow—part i: Formulations and equivalence,” IEEE Transactions on Control of Network Systems, vol. 1, no. 1, pp. 15–27, 2014.
  • [22] S. Bolognani and S. Zampieri, “On the existence and linear approximation of the power flow solution in power distribution networks,” IEEE Transactions on Power Systems, vol. 31, no. 1, pp. 163–172, 2015.
  • [23] D. P. Bertsekas, “Nonlinear programming,” Journal of the Operational Research Society, vol. 48, no. 3, pp. 334–334, 1997.
  • [24] A. H. Jazwinski, Stochastic processes and filtering theory, ser. Mathematics in Science and Engineering, vol. 64.   Academic Press, 1970.
  • [25] E. Bai and S. Sastry, “Persistency of excitation, sufficient richness and parameter convergence in discrete time adaptive control,” Systems & Control Letters, vol. 6, no. 3, pp. 153–163, 1985.
  • [26] V. Häberle, A. Hauswirth, L. Ortmann, S. Bolognani, and F. Dörfler, “Non-convex feedback optimization with input and output constraints,” IEEE Control Systems Letters, vol. 5, no. 1, pp. 343–348, 2020.
  • [27] M. Picallo, D. Liao-McPherson, S. Bolognani, and F. Dörfler, “Cross-layer design for real-time grid operation: Estimation, optimization and power flow,” arXiv preprint arXiv:2109.13842, 2021.
  • [28] I. Subotic, A. Hauswirth, and F. Dörfler, “Quantitative sensitivity bounds for nonlinear programming and time-varying optimization,” IEEE Transactions on Automatic Control, 2021.
  • [29] C. Beckel, W. Kleiminger, R. Cicchetti, T. Staake, and S. Santini, “The ECO data set and the performance of non-intrusive load monitoring algorithms,” in Proc. 1st ACM Conf. on Embedded Systems for Energy-Efficient Buildings, 11 2014.
  • [30] HelioClim-3, “HelioClim-3 Database of Solar Irradiance,” http://www.soda-pro.com/web-services/radiation/helioclim-3-archives-for-free, accessed: 2017-12-01.
  • [31] MERRA-2, “The Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) Web service,” http://www.soda-pro.com/web-services/meteo-data/merra, accessed: 2017-12-01.
  • [32] Tzyh-Jong Tarn and Y. Rasis, “Observers for nonlinear stochastic systems,” IEEE Trans. Autom. Control, vol. 21, no. 4, pp. 441–448, Aug. 1976.

Appendix: Proof of Proposition 1

Consider the information matrix $W_I = \sum_{k=t}^{t+T} U_{\Delta,k}^T \Sigma_{m,k}^{-1} U_{\Delta,k} = \sum_{k=t}^{t+T} (\Delta u_k \Delta u_k^T) \otimes \Sigma_{m,k}^{-1}$. Since $\gamma\mathbbm{1} \preceq \Sigma_{m,t} \preceq \beta\mathbbm{1}$ for all $t$, we have $\frac{1}{\beta}\mathbbm{1} \preceq \Sigma_{m,t}^{-1} \preceq \frac{1}{\gamma}\mathbbm{1}$, and $(\sum_{k=t}^{t+T} \Delta u_k \Delta u_k^T) \otimes \frac{1}{\beta}\mathbbm{1} \preceq W_I \preceq (\sum_{k=t}^{t+T} \Delta u_k \Delta u_k^T) \otimes \frac{1}{\gamma}\mathbbm{1}$. Since $\Delta u$ is persistently exciting, there exist a sufficiently large $T$ and $\gamma_2, \beta_2 > 0$ so that $\gamma_2\mathbbm{1} \preceq \sum_{k=t}^{t+T} \Delta u_k \Delta u_k^T \preceq \beta_2\mathbbm{1}$, and thus $\frac{\gamma_2}{\beta}\mathbbm{1} \preceq W_I \preceq \frac{\beta_2}{\gamma}\mathbbm{1}$. Hence, the matrix pair $(\mathbbm{1}, U_{\Delta,t})$ from the dynamic system (6) and (8) is uniformly completely observable and, additionally, uniformly completely controllable given $\Sigma_{p,t} \succ 0$ [24, Ch. 7]. As a result, the sensitivity converges exponentially in expectation, and is exponentially bounded in mean square [24, 32], i.e., there exist positive constants $C_{h,i} > 0$ satisfying (11).
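The Kronecker identity and the eigenvalue bounds used above can be checked numerically on random data. In this sketch the regressor is assumed to be $U_{\Delta,k} = \Delta u_k^T \otimes \mathbbm{1}$, which is consistent with the identity for $W_I$, and a constant measurement covariance is used for simplicity; the dimensions, the random seed, and the variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_y, T = 3, 2, 10

# Random symmetric positive definite measurement covariance (kept constant over k).
A = rng.standard_normal((n_y, n_y))
Sigma_m = A @ A.T + np.eye(n_y)
Sigma_m_inv = np.linalg.inv(Sigma_m)


def regressor(du, n_y):
    """Assumed regressor U = du^T kron I, so that U @ vec(H) = H @ du."""
    return np.kron(du.reshape(1, -1), np.eye(n_y))


dus = rng.standard_normal((T + 1, n_u))   # input increments for k = t, ..., t + T

# Left-hand side: sum of U^T Sigma_m^{-1} U over the window.
W_sum = sum(regressor(du, n_y).T @ Sigma_m_inv @ regressor(du, n_y) for du in dus)
# Right-hand side: (sum of du du^T) kron Sigma_m^{-1}.
W_kron = np.kron(dus.T @ dus, Sigma_m_inv)

eig_S = np.linalg.eigvalsh(dus.T @ dus)    # gamma_2 = eig_S[0], beta_2 = eig_S[-1]
eig_Sigma = np.linalg.eigvalsh(Sigma_m)    # gamma = eig_Sigma[0], beta = eig_Sigma[-1]
eig_W = np.linalg.eigvalsh(W_sum)

lower_bound = eig_S[0] / eig_Sigma[-1]     # gamma_2 / beta
upper_bound = eig_S[-1] / eig_Sigma[0]     # beta_2 / gamma
```

Since the eigenvalues of a Kronecker product are the pairwise products of the eigenvalues of its factors, both bounds are attained when $\gamma$, $\beta$, $\gamma_2$, $\beta_2$ are chosen as the extreme eigenvalues.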

Then, under Assumption 1 we have

\begin{equation*}
\begin{split}
& \lVert u_{t+1} - u_{t+1}^* \rVert_2 \leq \lVert u_{t+1} - u_t^* \rVert_2 + \lVert \Delta u_t^* \rVert_2 \\
\overset{(10)}{\leq} & \lVert \Pi_{\mathcal{U}_t}\big[ u_t - \alpha\big( \nabla_u f(u_t) + \hat{H}_t^T \nabla_y g(y_t) \big) + \omega_{u,t} \big] \\
& - \Pi_{\mathcal{U}_t}\big[ u_t^* - \alpha F_t(u_t^*) \big] \rVert_2 + \lVert \Delta u_t^* \rVert_2 \\
\leq & \lVert \big( u_t - \alpha\big( \nabla_u f(u_t) + \hat{H}_t^T \nabla_y g(y_t) \big) + \omega_{u,t} \big) \pm \alpha H_t^T \nabla_y g(y_t) \\
& - \big( u_t^* - \alpha F_t(u_t^*) \big) \rVert_2 + \lVert \Delta u_t^* \rVert_2 \\
\leq & \lVert \big( u_t - \alpha F_t(u_t) \big) - \big( u_t^* - \alpha F_t(u_t^*) \big) \rVert_2 + \lVert \omega_{u,t} \rVert_2 \\
& + \alpha L_h \lVert h_t - \hat{h}_t \rVert_2 + \lVert \Delta u_t^* \rVert_2 \\
\leq & \epsilon \lVert u_t - u_t^* \rVert_2 + \lVert \omega_{u,t} \rVert_2 + \alpha L_h \lVert h_t - \hat{h}_t \rVert_2 + \lVert \Delta u_t^* \rVert_2,
\end{split}
\end{equation*}

where in the second inequality we use that $u^*_t$ satisfies $u^*_t = \Pi_{\mathcal{U}_t}\big[ u^*_t - \alpha F_t(u_t^*) \big]$, i.e., due to optimality $u^*_t$ is a fixed point of the operator (10) with $\omega_{u,t} = 0$ and the true sensitivity $H_t$ instead of the estimated one $\hat{H}_t$. In the third inequality we use that the projection $\Pi_{\mathcal{U}_t}$ is non-expansive, and in the fourth the triangle inequality, which separates the excitation noise and the sensitivity estimation error. In the last inequality, where $\epsilon^2 = 1 - 2\eta\alpha + L^2\alpha^2$, we use that the operator $F_t(\cdot)$ is $\eta$-strongly monotone and $L$-Lipschitz continuous. Hence, in expectation we have

\begin{equation*}
\begin{split}
& \mathbb{E}[\lVert u_{t+1} - u_{t+1}^* \rVert_2] \\
\leq & \epsilon \mathbb{E}[\lVert u_t - u_t^* \rVert_2] + \sigma_u + \mathbb{E}[\lVert \Delta u_t^* \rVert_2] + \alpha L_h \mathbb{E}[\lVert h_t - \hat{h}_t \rVert_2] \\
\leq & \epsilon^{t+1} \mathbb{E}[\lVert u_0 - u_0^* \rVert_2] + \tfrac{1}{1-\epsilon}\big( \sigma_u + \sup_{k \leq t} \mathbb{E}[\lVert \Delta u_k^* \rVert_2] \big) \\
& + \alpha L_h \sum_{k=0}^{t} \epsilon^{t-k} \mathbb{E}[\lVert h_k - \hat{h}_k \rVert_2] \\
\overset{(11)}{\leq} & \epsilon^{t+1} \mathbb{E}[\lVert u_0 - u_0^* \rVert_2] \\
& + \tfrac{1}{1-\epsilon}\big( \sigma_u + \sup_{k \leq t} \mathbb{E}[\lVert \Delta u_k^* \rVert_2] + \sqrt{C_{h,3}} \alpha L_h \big) \\
& + \alpha L_h (t+1) \sqrt{C_{h,4}} \max(\epsilon, e^{\frac{-C_{h,5}}{2}})^t \\
\overset{t \to \infty}{\to} & \tfrac{1}{1-\epsilon}\big( \sigma_u + \sup_{k} \mathbb{E}[\lVert \Delta u_k^* \rVert_2] + \sqrt{C_{h,3}} \alpha L_h \big),
\end{split}
\end{equation*}

where in the second inequality we apply the first one recursively. In the second and third inequalities we bound the geometric series $\sum_{k=0}^{t} \epsilon^{t-k} \leq \frac{1}{1-\epsilon}$, and use that $\sqrt{\cdot}$ is subadditive.
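For completeness, the contraction factor $\epsilon$ used in the first chain of inequalities can be recovered with the standard argument for strongly monotone and Lipschitz continuous operators. The following is a sketch for arbitrary $u, v$, using only that $F_t(\cdot)$ is $\eta$-strongly monotone and $L$-Lipschitz continuous:

```latex
\begin{equation*}
\begin{split}
& \lVert (u - \alpha F_t(u)) - (v - \alpha F_t(v)) \rVert_2^2 \\
= & \lVert u - v \rVert_2^2 - 2\alpha \big( F_t(u) - F_t(v) \big)^T (u - v) + \alpha^2 \lVert F_t(u) - F_t(v) \rVert_2^2 \\
\leq & (1 - 2\eta\alpha + L^2\alpha^2) \lVert u - v \rVert_2^2 = \epsilon^2 \lVert u - v \rVert_2^2 .
\end{split}
\end{equation*}
```

The expression for $\epsilon^2$ implies $\epsilon < 1$ whenever $0 < \alpha < 2\eta / L^2$, which is the condition under which the geometric series above is summable. This step size range follows from the formula itself and is not quoted from the text.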