
Closed-form control with spike coding networks

Filip S. Slijkhuis, Sander W. Keemink, and Pablo Lanillos
These authors contributed equally
The authors are with the Donders Institute for Brain, Cognition and Behavior, Department of AI, Radboud University, Nijmegen, the Netherlands. e-mail: filipslijkhuis[at]outlook.com, {sander.keemink, p.lanillos}[at]donders.ru.nl. P.L. is partially supported by the Spikeference project, Human Brain Project Specific Grant Agreement 3 (ID: 945539).
Abstract

Efficient and robust control using spiking neural networks (SNNs) is still an open problem. Whilst behaviour of biological agents is produced through sparse and irregular spiking patterns, which provide both robust and efficient control, the activity patterns in most artificial spiking neural networks used for control are dense and regular — resulting in potentially less efficient codes. Additionally, for most existing control solutions network training or optimization is necessary, even for fully identified systems, complicating their implementation in on-chip low-power solutions. The neuroscience theory of Spike Coding Networks (SCNs) offers a fully analytical solution for implementing dynamical systems in recurrent spiking neural networks — while maintaining irregular, sparse, and robust spiking activity — but it’s not clear how to directly apply it to control problems. Here, we extend SCN theory by incorporating closed-form optimal estimation and control. The resulting networks work as a spiking equivalent of a linear–quadratic–Gaussian controller. We demonstrate robust spiking control of simulated spring-mass-damper and cart-pole systems, in the face of several perturbations, including input- and system-noise, system disturbances, and neural silencing. As our approach does not need learning or optimization, it offers opportunities for deploying fast and efficient task-specific on-chip spiking controllers with biologically realistic activity.

Index Terms:
Spiking Neural Networks, State Estimation, Optimal Control, Spike Coding Networks, Dynamical Systems.

I Introduction

Brain and behaviour are inseparable. The activity of complex networks of neurons is strongly linked to the capacity of biological agents to move and interact in the world. These networks control the body through sparse spiking activity [1, 2], providing high energy-efficiency and robustness against perturbations (e.g. noise or neural silencing) [3, 4, 5]. Whilst SNNs have seen a qualitative improvement in recent years, driven by developments in neuromorphic hardware and learning algorithms [6, 7], there are still open challenges in applying them to control problems.

First, the majority of SNN-based solutions rely on training or otherwise optimizing model parameters [8, 9, 6], even for fully identified systems. Analytical solutions are desirable for control, as they are interpretable, apply directly to identified systems, are quick and efficient to implement and deploy, and are amenable to theoretical explorations of stability and function. In control theory, such fully analytical solutions are often possible, but they are difficult to implement directly in SNNs due to their highly nonlinear and discontinuous nature.

Second, biological spiking codes are generally highly irregular and sparse [10, 11] — and require little energy. In contrast, SNN-based control solutions often produce and require highly dense and regular activity (e.g., [12, 13]), and are hence inefficient [14]. How to reconcile precise and efficient control with a more biological, irregular spiking code is an open problem.

While spiking irregularity is usually considered a consequence of noise, according to the neuroscience theory of spike coding networks (SCNs) it could also be a signature of a highly precise spiking code. SCN theory follows a similar principle to the theory of predictive coding. Neurons only fire when the network’s prediction error exceeds a threshold value, efficiently constraining this error. The resulting neural spiking activity is ‘coordinated’ across the neuron population, producing sparse and irregular patterns [15, 16], and is highly robust against several biological perturbations [17, 16]. SCNs have the great advantage that they are defined analytically — through a closed-form solution for the recurrent connectivity. They thus have the potential to solve both above challenges. However, while the SCN framework permits us to analytically implement [15, 18] or learn [19, 20, 21] any dynamical system, it is an open problem how they can be used to both estimate and control the state of an external system.

In this paper we combine optimal control and SCN theory to address both of the above challenges — providing a closed-form solution for optimal control using spiking neural networks, given a well identified system, while producing realistic and robust spiking patterns (Fig. 1). Our proposed method for estimation and control both expands on a promising theory for understanding biological spiking activity and provides a major step towards developing low-power, high-efficiency, robust, and task-specific neuromorphic controllers.

Figure 1: Controlling dynamical systems with spiking neural networks. A recurrent spiking neural network (left) emits spikes based on its inputs and connectivity, which are translated into a control signal to control some target system (right). $\mathbf{F}_{c}$ and $\mathbf{F}_{m}$ represent input weights, $\mathbf{\Omega}$ represents recurrent weights, and $\mathbf{D}$ represents read-out weights. By first defining the optimal control solution for the system, and then directly translating the resulting control parameters into these connection weights, we can generate a robust, population-wide coordinated spiking network that accurately controls identified systems.

I-A Contribution

We mathematically formalize the link between optimal control of dynamical systems and the SCN framework. Particularly, we analytically derive the spiking equivalent of the linear–quadratic–Gaussian (LQG) control problem – the combination of a linear quadratic estimator (Kalman filter) and a linear–quadratic regulator (LQR) controller (Fig. 2B and C). We show that our proposed networks (i) accurately estimate and control well-known dynamical systems, achieving similar performance to their non-spiking counterparts, and, importantly, (ii) preserve the irregular and sparse spiking patterns and robustness to neural silencing.

Developing efficient spiking neural controllers with fully derivable and interpretable connectivity is especially relevant for industrial applications [6], robotics [9] and machine intelligence in general [12].

I-B Related SCN literature

Here, we non-exhaustively review the main lines of research that followed a similar approach to ours. For recent reviews on SNNs related to robotics and control see [8, 22]. Our work is based on [15], which originally showed how any linear dynamical system can be analytically implemented in an SCN. While there have been several follow-up papers extending the framework for general computations [19, 23, 18], control has been less well studied. In [21], spike coding networks were extended by adding learning rules for the connectivity weights. This allowed them to perform forward prediction of the dynamical system state. As an additional result, they showed how to control a pendulum by using the network to simulate multiple future trajectories of the pendulum under different control policies. However, the control algorithm is not encoded within the network and does not provide robust estimation. In [24, 25], the authors proposed an analytical SCN-inspired framework for control. They mainly focused on producing the correct control signal, and derived network connectivities accordingly. The framework requires fully observable systems, and they did not provide a general, clear and simple mathematical model that is in line with both the control and neuroscience communities. Furthermore, they did not investigate the beneficial properties of the spiking patterns generated by the coordination of the neurons. Here, we provide a simpler analytical derivation more in line with existing SCN theory, derive networks for both estimation and control of partially observable systems, and investigate the robustness properties of the control and the networks in more detail. Overall, our approach takes inspiration from the broader SCN literature to provide a unified mathematical framework for analytically computing optimal control in SNNs.

II Methods

II-A The spiking control problem

We have a system with state $\mathbf{x}(t)$ that we would like to control with a spiking signal $\mathbf{s}(t)$ emitted from a recurrent SNN (Fig. 1). The SNN is provided with some incomplete measurements of the system state, $\mathbf{y}(t)$, and a target state $\mathbf{z}(t)$, and has to generate a control signal $\mathbf{u}(t)$.

The spiking patterns are generated according to some underlying voltage dynamics. There are many such models of varying complexities, but for most practical applications networks of leaky integrate-and-fire (LIF) neurons suffice. A network of $N$ such neurons is then defined by

$$\dot{\mathbf{v}}(t) = -\lambda\mathbf{v}(t) + \mathbf{F}\mathbf{c}(t) + \mathbf{\Omega}_{s}\mathbf{r}(t) + \mathbf{\Omega}_{f}\mathbf{s}(t) + \boldsymbol{\eta}_{v}, \tag{1}$$

where $\mathbf{v}(t)\in\mathbb{R}^{N}$ is the vector of neural voltages, $\mathbf{F}\in\mathbb{R}^{N\times K}$ are the input weights, $\mathbf{c}(t)\in\mathbb{R}^{K}$ is a $K$-dimensional input, $\mathbf{s}(t)\in\mathbb{R}^{N}$ are the emitted spikes, $\mathbf{r}(t)\in\mathbb{R}^{N}$ are the filtered spike-trains (such that $\dot{\mathbf{r}} = -\lambda\mathbf{r} + \mathbf{s}$; see Fig. 1), and $\boldsymbol{\eta}_{v}\in\mathbb{R}^{N}$ corresponds to independent voltage noise. Finally, there are fast synapses, which affect the post-synaptic voltage instantaneously following a spike (through fast recurrent connections $\mathbf{\Omega}_{f}\in\mathbb{R}^{N\times N}$), and slow synapses, which cause an initially slow change in voltage (through slow recurrent connections $\mathbf{\Omega}_{s}\in\mathbb{R}^{N\times N}$).

Whenever a given neuron’s voltage $v_{i}$ reaches a threshold $T_{i}$, that neuron emits a spike at that time (resulting in spike-trains $s_{i}(t) = \sum_{j}\delta(t - t_{j})$). The voltage is then reset to some resting value (here implemented through the diagonal elements of $\mathbf{\Omega}_{f}$).
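To make these dynamics concrete, the following is a minimal simulation sketch of Eq. (1) together with the threshold rule above, assuming forward Euler integration and at most one spike per time step; the function and variable names are illustrative and not taken from the paper's code.

```python
import numpy as np

def lif_step(v, r, c, F, Omega_s, Omega_f, T, lam, dt, sigma_v=0.0):
    """One forward-Euler step of the LIF network in Eq. (1).

    v, r : (N,) voltages and filtered spike trains
    c    : (K,) input signal at this time step
    F    : (N, K) input weights; Omega_s, Omega_f : (N, N) slow/fast recurrent weights
    T    : (N,) firing thresholds; lam : leak constant; dt : integration step
    """
    N = v.size
    s = np.zeros(N)
    # slow (continuous) part of the voltage dynamics
    dv = -lam * v + F @ c + Omega_s @ r + sigma_v * np.random.randn(N)
    v = v + dt * dv
    # greedy spike rule: the neuron furthest above threshold fires (at most one per step)
    i = np.argmax(v - T)
    if v[i] > T[i]:
        s[i] = 1.0
        v = v + Omega_f[:, i]   # fast synapses act instantaneously, including the self-reset
    # filtered spike trains: r_dot = -lam * r + s (spikes treated as unit impulses)
    r = r + dt * (-lam * r) + s
    return v, r, s
```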

How should the recurrent and input weights of the network be set, such that the output signal is as optimal as possible? One solution is to train the network using a cost function that enforces, for instance, sparsity [14]. Here we take another route by first defining the solution according to classical control theory, and then translating this solution into the spiking network parameters by using the neuroscience theory of Spike Coding Networks (SCN) [15, 26].

Figure 2: Schematics for identification, estimation and control of dynamical systems for classical control (left column) and its spiking neural network equivalent (right column). The internal recurrent connections in the neural representations are illustrative—see Sec III for more details. (A) Identification of dynamical systems [17, 18]. This work: (B) Optimal estimation of dynamical systems through Kalman filtering. (C) Optimal estimation and control of dynamical systems through Kalman filtering combined with an LQR controller.

II-B Spike-coding network theory

For the sake of completeness, in this section we give a brief overview of the original SCN framework (as proposed in [15]) for implementing linear dynamical systems in an SNN — for an extensive mathematical derivation of the SCN we refer the reader to [17, 18].

II-B1 Tracking a fully observable system

If a $K$-dimensional state $\mathbf{x}(t)\in\mathbb{R}^{K}$ is fully observable without noise, SCN theory defines how a recurrent SNN can optimally track this signal [15]. The derivation starts with two assumptions: 1) the input signal estimate $\hat{\mathbf{x}}$ can be linearly decoded from the spiking activity as $\hat{\mathbf{x}} = \mathbf{D}\mathbf{r}$, where $\mathbf{D}\in\mathbb{R}^{K\times N}$ are known decoding weights (in this paper randomly drawn from a normal distribution and normalized); and 2) spikes should only be emitted when this improves a coding error defined by the L2-norm, yielding a greedy spiking rule, i.e. neuron $i$ should spike iff $\|\mathbf{x} - \mathbf{D}\mathbf{r}\|^{2}_{2} > \|\mathbf{x} - \mathbf{D}\mathbf{r} - \mathbf{D}_{i}\|^{2}_{2}$. Here $\mathbf{D}_{i}$ is the $i$'th column of $\mathbf{D}$, and reflects the change in error due to neuron $i$ spiking.

From these assumptions a recurrent network of leaky integrate-and-fire neurons is directly derived of the form

$$\dot{\mathbf{v}} = -\lambda\mathbf{v} + \mathbf{D}^{\top}(\dot{\mathbf{x}} + \lambda\mathbf{x}) + \mathbf{\Omega}_{f}\mathbf{s}. \tag{2}$$

In this network, the input $\mathbf{x}$ is encoded through forward weights $\mathbf{D}^{\top}$, with the derivative term $\dot{\mathbf{x}}$ ensuring that quick changes in $\mathbf{x}$ are adequately tracked. Effectively, the network takes in both the current state and its dynamics as inputs. The recurrent connections are given by fast connections $\mathbf{\Omega}_{f} = -\mathbf{D}^{\top}\mathbf{D}$. These connections make sure that the spiking in the network is coordinated across the neurons, such that there are no superfluous spikes. A neural post-spike ‘self-reset’ is implicitly included in $\mathbf{\Omega}_{f}$’s diagonal. Neurons emit a spike when their threshold is hit, which follows from the same derivation as $T_{i} = \mathbf{D}_{i}^{\top}\mathbf{D}_{i}/2$. We here assume instantaneous communication between neurons, such that only one neuron spikes at a given time (since they can instantly inhibit each other), but note that this assumption is not strictly required [16]. The resulting spiking activity will optimally encode the input signal such that $\hat{\mathbf{x}}(t) = \mathbf{D}\mathbf{r}(t) \approx \mathbf{x}(t)$.
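As a sketch of how all parameters of this tracking network follow from the decoder alone (with $\mathbf{D}$ drawn randomly and column-normalized, as in our experiments; function names are illustrative):

```python
import numpy as np

def make_tracking_scn(K, N, col_norm=0.1, seed=0):
    """Decoder, fast recurrent weights and thresholds for the tracking SCN of Eq. (2)."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((K, N))
    D *= col_norm / np.linalg.norm(D, axis=0, keepdims=True)  # normalize each column D_i
    Omega_f = -D.T @ D                 # fast connections (diagonal acts as post-spike reset)
    T = np.sum(D * D, axis=0) / 2.0    # thresholds T_i = D_i^T D_i / 2
    return D, Omega_f, T
```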

II-B2 Implementing a dynamical system

Instead of tracking an external signal ($\hat{\mathbf{x}} \approx \mathbf{x}$), one can also use SCN theory to define an SNN which implements a linear dynamical system of the form $\dot{\mathbf{x}} = \mathbf{A}\mathbf{x}$ [15] (Fig. 2A). Briefly, we can replace $\dot{\mathbf{x}}$ in Eq. (2) by the now-known dynamics $\mathbf{A}\mathbf{x}$. The resulting voltage dynamics are given by

$$\dot{\mathbf{v}} = -\lambda\mathbf{v} + \mathbf{D}^{\top}(\mathbf{A}\mathbf{x} + \lambda\mathbf{x}) - \mathbf{D}^{\top}\mathbf{D}\mathbf{s} + \eta_{V}. \tag{3}$$

By next replacing the external state $\mathbf{x}$ by the network’s internal estimate $\hat{\mathbf{x}} = \mathbf{D}\mathbf{r}$, this can be further simplified to

$$\dot{\mathbf{v}} = -\lambda\mathbf{v} + \mathbf{\Omega}_{s}\mathbf{r} + \mathbf{\Omega}_{f}\mathbf{s} + \eta_{V}, \tag{4}$$

where the derivation has resulted in an additional set of slow connections implementing the desired dynamics, given by $\mathbf{\Omega}_{s} = \mathbf{D}^{\top}(\mathbf{A} + \lambda\mathbf{I})\mathbf{D}$. In essence, the slow connectivity reads out the internal estimate of $\mathbf{x}$ through $\mathbf{D}$. It then applies the dynamics of the linear dynamical system through $\mathbf{A} + \lambda\mathbf{I}$, and encodes the result back into the network using $\mathbf{D}^{\top}$. The network now keeps track of its own estimate on a fast time scale through the fast connections, and drifts this estimate according to the dynamics $\mathbf{A}$ through the slow recurrent connections.
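A one-line sketch of this slow connectivity, which embeds the linear dynamics $\dot{\mathbf{x}} = \mathbf{A}\mathbf{x}$ into the network of Eq. (4), reusing the decoder $\mathbf{D}$ from the sketch above:

```python
import numpy as np

def slow_connectivity(D, A, lam):
    """Omega_s = D^T (A + lam*I) D: slow recurrent weights implementing x_dot = A x."""
    K = D.shape[0]
    return D.T @ (A + lam * np.eye(K)) @ D
```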

III Closed-form optimal estimation and control with SNNs

In the previous section, we summarized how SCNs can implement dynamical systems — but how can we link such a network to an external system and control it? Here we provide a new mathematical framework which extends SCN theory to allow the implementation of classical control in SNNs consisting of leaky integrate-and-fire neurons. The inner workings of the networks, and the interpretation of the different connectivities, remain unchanged from the previous section. The resulting recurrent SNN is able to simultaneously act as a state-estimator (Kalman filter) and a controller (LQR) for an external dynamical system. Precisely, we provide a linear–quadratic–Gaussian spiking controller with the added benefit of an efficient spiking code.

III-A The classical control problem

Consider a $K$-dimensional linear dynamical system in state-space representation (Fig. 2A.i):

$$\dot{\mathbf{x}}(t) = \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t) + \boldsymbol{\eta}_{d} \tag{5}$$
$$\mathbf{y}(t) = \mathbf{C}\mathbf{x}(t) + \boldsymbol{\eta}_{n}, \tag{6}$$

where $\mathbf{x}(t)\in\mathbb{R}^{K}$ is the state vector, $\mathbf{A}\in\mathbb{R}^{K\times K}$ is the system matrix conveying the dynamics of the system, $\mathbf{u}(t)\in\mathbb{R}^{P}$ is the input vector through which we can control the system, and $\mathbf{B}\in\mathbb{R}^{K\times P}$ is the input matrix. $\boldsymbol{\eta}_{d}\in\mathbb{R}^{K}$ reflects internal disturbances, given by a zero-mean Gaussian process with covariance $\mathbf{\Sigma}_{d}$. $\mathbf{y}(t)\in\mathbb{R}^{Q}$ are observations of $\mathbf{x}(t)$, where $\mathbf{C}\in\mathbb{R}^{Q\times K}$ is the measurement (output) matrix. $\boldsymbol{\eta}_{n}$ is sensor noise, given by a zero-mean Gaussian process with covariance $\mathbf{\Sigma}_{n}$.

We will generally assume as known the measurement vector $\mathbf{y}$, the system matrix $\mathbf{A}$, the input matrix $\mathbf{B}$, the input vector $\mathbf{u}$, the measurement matrix $\mathbf{C}$, and the covariances $\mathbf{\Sigma}_{d}$ and $\mathbf{\Sigma}_{n}$.

Our goal is to control $\mathbf{x}$ to some reference state $\mathbf{z}\in\mathbb{R}^{K}$. To do this we must be able to estimate the state $\mathbf{x}$ from the observations $\mathbf{y}$ (estimation problem), and find the best control signal $\mathbf{u}$ to do so (optimal control problem). In control theory this is largely solved, and we will now consider how to combine the resulting solutions with SCN theory to generate the correct $\mathbf{u}$ as an output of an SNN.

III-B Optimal state estimation with SCNs

Full-state estimates $\hat{\mathbf{x}}$ given noisy and incomplete measurements of a linear system can be provided by a Kalman filter [27], which optimally balances an internal (predicted) state estimate against noisy and/or partially observable external measurements (Fig. 2B.i). The Kalman filter in continuous time is given by a dynamical system of the form

$$\dot{\hat{\mathbf{x}}} = \mathbf{A}\hat{\mathbf{x}} + \mathbf{B}\mathbf{u} + \mathbf{K}_{f}(\mathbf{y} - \hat{\mathbf{y}}). \tag{7}$$

The Kalman filter gain matrix, $\mathbf{K}_{f}\in\mathbb{R}^{K\times Q}$, is applied to the error between the observations of the dynamical system and the Kalman filter’s own internal estimate, $\hat{\mathbf{y}} = \mathbf{C}\hat{\mathbf{x}}$. For fully identified systems $\mathbf{K}_{f}$ can be found by solving an algebraic Riccati equation [27]. Assuming that $\mathbf{K}_{f}$ is known, we can directly implement the optimal filter under the SCN framework (by following Section II-B2), resulting in the following voltage update rule:

$$\dot{\mathbf{v}} = -\lambda\mathbf{v} + \underbrace{\mathbf{\Omega}_{s}\mathbf{r} + \mathbf{\Omega}_{f}\mathbf{s} + \mathbf{F}_{i}\mathbf{u}}_{\text{System estimate}} + \underbrace{\mathbf{\Omega}_{k}\mathbf{r} + \mathbf{F}_{k}\mathbf{y}}_{\text{Kalman update}} + \eta_{V}, \tag{8}$$

where, on top of the previously introduced slow and fast connectivity, we now also have input (control) connections $\mathbf{F}_{i} = \mathbf{D}^{\top}\mathbf{B}$ mapping the control signal $\mathbf{u}$ to the neurons, and recurrent and feed-forward "Kalman filter" connections ($\mathbf{\Omega}_{k} = -\mathbf{D}^{\top}\mathbf{K}_{f}\mathbf{C}\mathbf{D}$ and $\mathbf{F}_{k} = \mathbf{D}^{\top}\mathbf{K}_{f}$). The recurrent connections essentially read out the internal state of the system, apply both the measurement matrix and the Kalman filter gain matrix to predict its evolution, and map this back into the network. The feed-forward connections take the partially observable state, $\mathbf{y}$, and apply the Kalman filter gain matrix. A more detailed derivation can be found in the appendix (section VIII). Now, the entire SCN, including all of its connections, represents a Kalman filter (Fig. 2B.ii). The network internally simulates a linear dynamical system of the form $\dot{\hat{\mathbf{x}}} = \mathbf{A}\hat{\mathbf{x}} + \mathbf{B}\mathbf{u}$, but also corrects its own estimate according to the external input $\mathbf{y}$.
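The sketch below shows one way the estimator connectivity of Eq. (8) could be assembled in practice; the steady-state Kalman gain is computed here with scipy.linalg.solve_continuous_are, which is one standard solver for the filter Riccati equation (the paper does not prescribe a specific solver, so this choice and the function names are assumptions).

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def kalman_scn_weights(D, A, B, C, Sigma_d, Sigma_n):
    """Connection matrices for the spiking Kalman filter of Eq. (8)."""
    # steady-state Kalman gain K_f from the (dual) algebraic Riccati equation
    P = solve_continuous_are(A.T, C.T, Sigma_d, Sigma_n)
    K_f = P @ C.T @ np.linalg.inv(Sigma_n)
    F_i = D.T @ B                   # input connections mapping u into the network
    Omega_k = -D.T @ K_f @ C @ D    # recurrent Kalman-update connections
    F_k = D.T @ K_f                 # feed-forward measurement connections
    return F_i, Omega_k, F_k, K_f
```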

III-C Optimal control with SCNs

Now that we have optimal state estimation, we can extend the framework to optimal control by deriving the spiking version of a linear-quadratic regulator (LQR) controller. Given a known system of the form in Eq. (5), this controller produces the optimal signal $\mathbf{u}$ to minimize the squared error between the state $\mathbf{x}$ and a target $\mathbf{z}$. This gives rise to a linear control law of the form

$$\mathbf{u} = -\mathbf{K}_{c}(\mathbf{x} - \mathbf{z}), \tag{9}$$

where $\mathbf{K}_{c}\in\mathbb{R}^{P\times K}$ is the LQR gain matrix, which can be found by solving an algebraic Riccati equation, given assumptions on the cost of state deviations and actuation [27].
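For completeness, a sketch of obtaining $\mathbf{K}_{c}$ from the control algebraic Riccati equation, again using scipy.linalg.solve_continuous_are as one possible solver; the cost matrices $\mathbf{Q}$ and $\mathbf{R}$ are those described in Sec. IV-C.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_gain(A, B, Q, R):
    """LQR gain K_c solving the control algebraic Riccati equation."""
    P = solve_continuous_are(A, B, Q, R)
    return np.linalg.inv(R) @ B.T @ P    # K_c = R^{-1} B^T P
```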

In order to control a partially observable and/or noisy dynamical system, we can combine an LQR controller with a Kalman filter (see Fig. 2C.i). The Kalman filter should then be aware of the specific form used for the control, so we substitute the LQR control law for $\mathbf{u}$ (Eq. (9) into Eq. (7)), resulting in a dynamical system of the form

$$\dot{\hat{\mathbf{x}}} = \mathbf{A}\hat{\mathbf{x}} - \mathbf{B}\mathbf{K}_{c}(\hat{\mathbf{x}} - \mathbf{z}) + \mathbf{K}_{f}(\mathbf{y} - \hat{\mathbf{y}}). \tag{10}$$

We will refer throughout to the combination of a Kalman filter and LQR control as the ideal controller. We can now use the same method as in the previous section to implement this in a spiking network.

To be able to read out the full control signal, we additionally encode the reference $\mathbf{z}$ into the network with a new set of fast connections (similar to Eq. (2)), which allows us to read out $\mathbf{u}$ from the network’s internal estimates $\hat{\mathbf{x}}$ and $\hat{\mathbf{z}}$. For this we define two sets of decoding weights: $\mathbf{D}_{\mathbf{x}}$ for $\hat{\mathbf{x}}$, and $\mathbf{D}_{\mathbf{z}}$ for $\hat{\mathbf{z}}$.

Extending the SCN defined in Eq. (8) with LQR control and an internal encoding of $\mathbf{z}$, we get the final voltage update rule

$$\begin{aligned}\dot{\mathbf{v}} = -\lambda\mathbf{v} &+ \underbrace{\mathbf{\Omega}_{s}\mathbf{r} + \mathbf{\Omega}^{\mathbf{x}}_{f}\mathbf{s}}_{\text{System estimate}} + \underbrace{\mathbf{\Omega}_{c}\mathbf{r} + \mathbf{\Omega}_{z}\mathbf{r}}_{\text{Control estimate}} + \underbrace{\mathbf{\Omega}_{k}\mathbf{r} + \mathbf{F}_{k}\mathbf{y}}_{\text{Kalman update}}\\ &+ \underbrace{\mathbf{D}_{\mathbf{z}}^{\top}(\dot{\mathbf{z}} + \lambda\mathbf{z}) + \mathbf{\Omega}^{\mathbf{z}}_{f}\mathbf{s}}_{\mathbf{z}\text{ representation}} + \eta_{V},\end{aligned} \tag{11}$$

where we have indicated the parts of the connectivity that track the external system, the effect of the control on the system, the Kalman filter updates, and the representation of the target signal. For the internal estimate $\hat{\mathbf{x}}$, we now have two sets of recurrent "control" and "target" connections, which represent the LQR controller within the Kalman filter SCN (see Fig. 2C.ii). The recurrent control connectivity, $\mathbf{\Omega}_{c} = -\mathbf{D}_{\mathbf{x}}^{\top}\mathbf{B}\mathbf{K}_{c}\mathbf{D}_{\mathbf{x}}$, decodes $\hat{\mathbf{x}}$ from the internal state of the SCN using $\mathbf{D}_{\mathbf{x}}$, and applies the LQR gain matrix $\mathbf{K}_{c}$. The result is transformed using $\mathbf{B}$, and encoded back into the network. The recurrent target connectivity, $\mathbf{\Omega}_{z} = \mathbf{D}_{\mathbf{x}}^{\top}\mathbf{B}\mathbf{K}_{c}\mathbf{D}_{\mathbf{z}}$, does something similar, but first reads out $\hat{\mathbf{z}}$ using $\mathbf{D}_{\mathbf{z}}$. The reference signal, $\mathbf{z}$, is encoded into the network using the same method as defined in Eq. (2). A more detailed derivation of all these connections can be found in the appendix (section VIII). Note that while there now appear to be a large number of separate recurrent connections, all the slow connectivities ($\mathbf{\Omega}_{s}$, $\mathbf{\Omega}_{c}$, $\mathbf{\Omega}_{z}$, $\mathbf{\Omega}_{k}$) can be grouped together into a single connection matrix, as can both fast connectivities ($\mathbf{\Omega}^{\mathbf{x}}_{f}$, $\mathbf{\Omega}^{\mathbf{z}}_{f}$).

We now have an SCN which implements the entirety of $\dot{\hat{\mathbf{x}}} = \mathbf{A}\hat{\mathbf{x}} - \mathbf{B}\mathbf{K}_{c}(\hat{\mathbf{x}} - \hat{\mathbf{z}}) + \mathbf{K}_{f}(\mathbf{y} - \hat{\mathbf{y}})$, and also internally keeps an estimate of the desired state or reference $\mathbf{z}$. We can obtain the output of the internal LQR controller in the SCN by defining the following new set of decoding weights: $\mathbf{D}_{\mathbf{u}} = -\mathbf{K}_{c}(\mathbf{D}_{\mathbf{x}} - \mathbf{D}_{\mathbf{z}})$. The control signal can then be read out directly from the neural activities as $\mathbf{u} = \mathbf{D}_{\mathbf{u}}\mathbf{r}$. This $\mathbf{u}$ can be applied to an external dynamical system to control it, all whilst the SCN keeps an internal estimate of the external dynamical system. Hence, we have finally arrived at a fully derived recurrent SNN which allows us to track, replicate and control an external dynamical system.
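Putting the pieces together, the following is a sketch of how all connection matrices of Eq. (11) and the control read-out could be assembled from $\mathbf{D}_{\mathbf{x}}$, $\mathbf{D}_{\mathbf{z}}$ and the two gain matrices; the names are illustrative and may differ from the paper's code.

```python
import numpy as np

def lqg_scn_weights(D_x, D_z, A, B, C, K_f, K_c, lam):
    """All connection matrices of the spiking LQG controller (Eq. (11))."""
    K = D_x.shape[0]
    Omega_s   = D_x.T @ (A + lam * np.eye(K)) @ D_x   # system-estimate dynamics
    Omega_f_x = -D_x.T @ D_x                          # fast connections for the x-estimate
    Omega_f_z = -D_z.T @ D_z                          # fast connections for the z-representation
    Omega_c   = -D_x.T @ B @ K_c @ D_x                # recurrent control connections
    Omega_z   =  D_x.T @ B @ K_c @ D_z                # recurrent target connections
    Omega_k   = -D_x.T @ K_f @ C @ D_x                # recurrent Kalman-update connections
    F_k       =  D_x.T @ K_f                          # feed-forward measurement weights
    F_z       =  D_z.T                                # encodes (z_dot + lam*z) into the network
    D_u       = -K_c @ (D_x - D_z)                    # control read-out: u = D_u @ r
    # all slow (and all fast) connectivities can be summed into single recurrent matrices
    Omega_slow = Omega_s + Omega_c + Omega_z + Omega_k
    Omega_fast = Omega_f_x + Omega_f_z
    return Omega_slow, Omega_fast, F_k, F_z, D_u
```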

IV Results

We evaluated the performance of the proposed mathematical framework (Sec. III), and analyzed the properties of the spiking patterns and their robustness against perturbations (e.g., noise, neural silencing and external force). The code used to generate the results is available in the following GitHub repository: https://github.com/FSSlijkhuis/SCN_estimation_and_control. We applied the networks derived above to two standard dynamical systems: the linear spring-mass-damper system and the nonlinear cartpole system.

IV-A Spring-Mass-Damper (SMD) system

We compared our proposed SCN estimation and control with their non-spiking counterparts on the spring-mass-damper system (Fig. 3A). The baselines are a Kalman filter for estimation and an LQG controller for control, which we refer to as the "idealized" estimator and controller. Furthermore, we evaluated robustness against input noise, neural silencing and external perturbations. The SMD system has two dynamical variables: position $x$ ($= x_{1}$) and velocity $v$ ($= x_{2} = \dot{x}_{1}$). Our instance of the SMD system has an internal disturbance $\boldsymbol{\eta}_{d}$, and it outputs a partially observable state which measures only the position $x$, with sensor noise $\boldsymbol{\eta}_{n}$.
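For reference, a sketch of the state-space matrices we assume for the SMD system (mass $m$, spring constant $k$, damping $c$), with a force input on the mass and position-only measurements; the exact scaling of the input matrix is our assumption, as the paper does not state $\mathbf{B}$ explicitly.

```python
import numpy as np

def smd_state_space(m, k, c):
    """Spring-mass-damper in state-space form: x = [position, velocity]."""
    A = np.array([[0.0,    1.0],
                  [-k / m, -c / m]])   # x1_dot = x2, x2_dot = -(k/m)x1 - (c/m)x2 + u/m
    B = np.array([[0.0],
                  [1.0 / m]])          # control force enters the velocity equation
    C = np.array([[1.0, 0.0]])         # only the position x1 is measured
    return A, B, C
```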

Figure 3: Estimation and control of a linear system using SCNs. (A) Schematic of a Spring-Mass-Damper (SMD) system. (B) SCN estimation (green) of the SMD system (red) simulated over a time period of 50 seconds. The top and middle plots show the estimated and real position and the estimated and real velocity of the spring in the SMD system, respectively. The observation (blue) is shown in the top plot only, since the SCN estimator only has access to the noisy $x$. The bottom plot shows the spike trains across all 20 neurons used in the SCN estimator. (C) Control of the SMD system using the SCN controller (green) vs an idealized controller (red), with reference signal (blue, dashed), while evaluating robustness to neural silencing. The reference position of the spring is increased in a stair-wise manner, which explains the sudden changes in the spring velocity. Increased spiking activity in the neurons of the SCN is observed during these shifts. The robustness of the SCN controller to neural silencing is also shown. At every red, dashed vertical line, 15 neurons are "killed" (indicated by the gray bars in the spike plot). The SCN controller and estimator start with 50 neurons, and are able to correctly control and estimate the SMD system up until the last portion of neurons is killed, after which the system breaks down due to control errors. Before the system fails, the SCN shows some signs of neurons 'taking over' the spiking of removed neurons. In the third and fourth plots, the error between the target signal and both controllers is shown for both the position and the velocity of the spring.

IV-A1 Estimation

We first evaluated whether an SCN Kalman filter can accurately track the SMD system based on incomplete and noisy measurements. For the estimation, we set $u = 0$, so there is no external force acting on our dynamical system. In Fig. 3B, we compare the estimate output by the spike coding network (green) to the numerical simulation of the dynamical system (red). Even though the estimation starts in a different initial state and there is both substantial measurement noise and internal disturbance in the real dynamical system, the network estimate quickly converges. Overall, the SCN estimates both dynamical variables of the real dynamical system with very high accuracy.

IV-A2 Control

To control a dynamical system, the SCN computes the optimal $\mathbf{u}$, such that the state of the simulated dynamical system converges towards a reference signal $\mathbf{z}$. In Fig. 3C, we show an SMD system controlled through an idealized controller (red) and one controlled using an SCN controller (green). Once again, the estimation from the SCN is very similar to the real dynamical system. On top of that, we show the reference signal (blue, dashed), which is a stair-wise increase of the SMD position, $x$. We can see that both the estimation from the SCN and the real dynamical system follow the reference signal, which is an indication of the correct behavior of the internal controller of the SCN. When we investigate the control errors in detail (blue curves) we see that the two controllers match closely.

Figure 4: Spring-Mass-Damper system control range of SCN controller. (A) Average position error between SMD system controlled through SCN controller and target signal (top), where a short force pulse is applied halfway during the simulation, plotted against multiple values for the sensor noise covariance. The same analysis is shown for the idealized controller (bottom). Both the sensor noise covariance and the strength of the force pulse are varied, and for each combination of values, a simulation of 5 seconds is run. The result is shown as the error in the position of the SMD system averaged over the entire simulation, where dark blue indicates a small error, and yellow a large error. (B) The effect of force pulses and sensor noise on the SMD system when controlled by an SCN controller. The influence on the SMD system is shown for three parameter values, which are highlighted in A. The top plot, (i), shows the effects of a noise covariance of 1e-5 and a force pulse strength of 100N. Plotted are the (noisy) observations going into the SCN controller (blue), the SMD system controlled by the SCN controller (green), and the reference signal (red, dashed). Plotted separately is the pulse into the SMD system (orange). The middle plot, (ii), shows the same but with a sensor noise covariance of 0.001 and a pulse strength of 500N. The bottom plot, (iii), uses a sensor noise covariance of 0.1 and a pulse strength of 900N.

IV-A3 Controller robustness

In Fig. 3C, we further show the robustness of our framework, as the network is able to keep the system controlled when facing severe neural silencing (as in classic SCNs [17, 16]). Starting with 50 neurons, we progressively disable 15 neurons at certain timesteps (red, vertical dashed lines), by preventing these neurons from spiking from that point on. The timestep at which the first 15 neurons are disabled (at 10 seconds) is exactly at the point of a change in the reference signal. In this case there is no visible effect on the performance. The neural silencing at the second timestep (at 26.6 seconds) again demonstrates little effect on the performance of the SCN controller, although a very subtle increase in the velocity error is observed. After the third timestep at which 15 neurons are disabled (at 43.3 seconds), only 5 neurons are available to the SCN controller. Only at this point does the SCN controller start to struggle to accurately control the SMD system.

In the raster plots in Fig. 3, the spike patterns are moderately sparse and irregular. For clarity of the spike trains we used smaller networks here, but the spiking patterns can be made arbitrarily sparse by adding more neurons (as the spiking is then coordinated across more neurons). In Fig. 3C, we can observe that when neurons are silenced, other neurons compensate for the lost spikes, to make sure that the estimation stays as accurate as possible (and the spikes become progressively less sparse). This is a clear indication of the coordinated spiking of the neurons in the network.
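Neural silencing can be simulated with a simple mask on the spiking rule; a minimal sketch, assuming a simulation loop like the one sketched in Sec. II-A, where killed neurons are simply prevented from ever crossing threshold again:

```python
import numpy as np

def silence_neurons(T, killed):
    """Return thresholds with the 'killed' neurons disabled (they can never spike again)."""
    T_silenced = T.copy()
    T_silenced[np.asarray(list(killed))] = np.inf   # infinite threshold: neuron never fires
    return T_silenced

# example: disable the first 15 of 50 neurons at a given time step
# T = silence_neurons(T, killed=range(15))
```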

Fig. 4A shows the effects of different parameter-values for the sensor noise covariance and the effects of a sudden pulse of force into the SMD system on the controller error of the SCN controller (top) and the idealized LQR controller (bottom). Here, the controller error corresponds to the average error in the position of the mass between the SMD system and the reference signal across an entire simulation, when controlled by the respective controllers. This error increases for larger values of the sensor noise covariance, as well as with larger pulses of force into the SMD system. There is no noticeable difference between the control errors of the SCN controller and the idealized controller.

In Fig. 4B, the position of the mass in the SMD system controlled by the SCN controller is shown, using three sets of parameters indicated in Fig. 4A. Here, we clearly see the effect of the pulse on the SMD system, where a larger pulse corresponds to a larger displacement of the mass. The SCN controller quickly tries to correct for this displacement, bringing the mass back to its reference position. The observation, y, is also shown. Larger values for the sensor noise covariance are clearly visible in the observation. Not shown here is the SMD system controlled by the idealized controller, because there is no visible difference between the two SMD systems.

Figure 5: SCN control of the nonlinear cartpole system. (A) Schematic of the cartpole system. (B) Control of the cartpole system (green) using the SCN controller and estimator linearized around the up-position of the pole, compared to a second cartpole system (red) using an idealized linear controller and estimator, also linearized around the up-position of the pole, with reference signal (blue, dashed), simulated across a time period of 50 seconds. The figure only shows the position of the cart ($x_{1}$) and the angle of the pole ($x_{3}$). The reference position of the cartpole is increased in a stair-wise manner, but the pole stays upright. In the third and fourth plots, the error between the target and both controllers is shown, for both the position of the cart and the angle of the pole. The lower plot shows the spiking activity within the SCN, consisting of 100 neurons. During a change in the reference position, an increase in spiking activity is observed.

IV-B Cartpole system

We evaluated our proposed SCN controller (i.e., linear–quadratic–Gaussian spiking control) on the nonlinear cartpole system (see Fig. 5A). The cartpole has four dynamical variables: the position of the cart, $x$ ($= x_{1}$), the velocity of the cart, $v$ ($= x_{2} = \dot{x}_{1}$), the angle of the pole, $\theta$ ($= x_{3}$), and the angular velocity of the pole, $\omega$ ($= x_{4} = \dot{x}_{3}$). Just like the SMD, our instance of the cartpole has an internal disturbance $\boldsymbol{\eta}_{d}$, and it outputs a partially observable state consisting of only the position of the cart, $x$, with sensor noise $\boldsymbol{\eta}_{n}$. Note that the cartpole system is a nonlinear dynamical system, but both our ideal and spiking controllers assume linear dynamics. To produce a meaningful control signal, we use linearized dynamics for the internal estimate; in our case, we linearized around the up-position of the pole ($\theta = \pi$).
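As a sketch, a linearization of the standard cart-pendulum model of [27] about the upright fixed point, with parameters $m$, $M$, $L$, $g$, $d$ as in Sec. IV-C; the paper does not print its linearized matrices, so this exact form is our assumption.

```python
import numpy as np

def cartpole_linearized_up(m, M, L, g, d):
    """Cartpole linearized about the upright pole position, following the model in [27].

    State: [cart position, cart velocity, pole angle, pole angular velocity].
    """
    b = 1  # sign convention for the pendulum-up fixed point
    A = np.array([[0, 1,                  0,                              0],
                  [0, -d / M,             b * m * g / M,                  0],
                  [0, 0,                  0,                              1],
                  [0, -b * d / (M * L),   -b * (m + M) * g / (M * L),     0]])
    B = np.array([[0], [1 / M], [0], [b / (M * L)]])
    C = np.array([[1, 0, 0, 0]])   # only the cart position is measured
    return A, B, C
```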

Fig. 5B compares a cartpole system controlled through an idealized controller (red) to a cartpole controlled through an SCN controller (green). On top of that, we show the reference signal (blue, dashed), which is a stair-wise increase of the cart position, $x$. The task of the controllers is then to move the cart whilst keeping the pole upright. In the figure, we show the position of the cart ($x$) and the angle of the pole ($\theta$). We see that both controllers follow the reference signal almost perfectly, and most importantly, keep the pole upright.

The spike plots show that, especially with a large number of neurons, the spiking is irregular and sparse. Increased spiking activity is observed when the reference signal demands a change of the state of the dynamical system.

IV-C Parameter settings

For each network, the decoding weights ($\mathbf{D}$) are sampled from a normal distribution, with each column normalized to have norm 0.1, except for norm 0.01 in Fig. 5B and norm 1 in Fig. 6. 20 and 50 neurons were used in Fig. 3B and C respectively, 50 neurons were used in Fig. 4 and Fig. 6, and 100 neurons in Fig. 5B. The SMD parameters were $m = 3$, $k = 5$ and $c = 0.5$ for Fig. 3B and Fig. 4, while $m = 20$, $k = 6$ and $c = 2$ were used for Fig. 3C and Fig. 6. Fig. 3B and Fig. 6 used $\mathbf{\Sigma}_{d} = \mathbf{\Sigma}_{n} = 0.001$. Fig. 3C used $\mathbf{\Sigma}_{d} = \mathbf{\Sigma}_{n} = 0.1$. Fig. 4 used $\mathbf{\Sigma}_{d} = 0.001$. The cartpole system parameters were $m = 1$, $M = 5$, $L = 2$, $g = -10$, and $d = 1$, with $\mathbf{\Sigma}_{d} = \mathbf{\Sigma}_{n} = 10^{-7}$. All networks used $\eta_{V} = 10^{-5}$ and $\lambda = 0.1$, except for Fig. 6, which used $\eta_{V} = 10^{-6}$ and varying $\lambda$. Forward Euler integration was used throughout, with a timestep of $10^{-3}$ s, except for $10^{-4}$ s in Fig. 4, Fig. 5B and Fig. 6. For control, $R = 10^{-2}$. For control of the SMD system, the matrix $\mathbf{Q}\in\mathbb{R}^{K\times K}$ prioritizes $x_{1}$ with weight 10 and $x_{2}$ with weight 1. For cartpole control, $x_{3}$ is weighted with 10 and the rest with 1.
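For concreteness, the LQR cost matrices described above could be constructed as follows; this is a sketch and the exact array layout is ours.

```python
import numpy as np

# LQR cost settings described in Sec. IV-C
Q_smd      = np.diag([10.0, 1.0])             # prioritize position x1 over velocity x2
R_smd      = np.array([[1e-2]])               # actuation cost (single control input)
Q_cartpole = np.diag([1.0, 1.0, 10.0, 1.0])   # prioritize the pole angle x3
R_cartpole = np.array([[1e-2]])
```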

V Advantages and limitations of SCN control

SCN control comes with several advantages and limitations. This section provides an overview.

V-1 Hardware implementations

Because SCN control is implemented through standard LIF neurons, it has strong potential for neuromorphic deployment, such as on Loihi [28] or SpiNNaker [29] chips — allowing for potentially lower-energy control implementations than classical von Neumann architectures. However, a general framework for how to transfer the theory to hardware is still lacking, and faces two main challenges. First, the theory assumes zero transmission delays between neurons, which effectively constrains the network to one spike per time-frame. If this constraint is weakened, the networks can go into an epileptic state [16, 30]. On hardware implementations this assumption might not always be reasonable. As a solution, several extensions have been made to SCN theory that make it work with such delays [31, 30, 16, 32]. Second, for control we need two types of connections: fast and slow. The fast connections can be implemented by a simple change in the voltage following a spike, but the slow connections require longer-lasting effects (through the filtered spike trains $\mathbf{r}$). Not every hardware implementation might allow for such slow synapses, or allow the simultaneous use of both synapse types. It also requires every neuron to have access to the filtered spike trains of every other neuron, which might increase the connectivity requirements — which is again highly dependent on the exact hardware.

V-2 Explainability

A great benefit of SCN control is its explainability. There is a clear link between the different network parameters and the computation (as outlined in our methods section), as well as a rich literature studying their more in-depth properties. In particular, there is an elegant and in-depth geometric interpretation of the core functionality [16]. Thus, with new applications for the theory and resulting problems, there is a clear theoretical framework within which one can think through possible solutions. One can also easily adjust the model parameters for different activity and coding regimes, as demonstrated in Fig. 6. Another spiking neural network method that has a similar level of theoretical explainability is the neural engineering framework [12], which also allows one to implement dynamical systems, including for control[33]. However, as outlined next, SCNs come with a number of additional advantages — in particular for robustness and biological fidelity.

Figure 6: Demonstrating adjustment of the network sparsity through the voltage leakage ($\lambda$), shown on the Spring-Mass-Damper system, simulated over a time period of 10 seconds. For all values of $\lambda$, the normalization factor of the decoding weights has been increased to 1 in order to reduce overall spiking at the cost of controller precision. (A) The effects of $\lambda = 0$ on the spiking behavior of the SCN controller. A total of 163 spikes are fired during this simulation, which averages to 16 spikes per second. Inset: a filtered spike train $r_{n}$ for a single neuron $n$ after a single spike (dot). (B) The effects of $\lambda = 1$ on the spiking behavior of the SCN controller. A total of 358 spikes are fired during this simulation, which averages to 35 spikes per second. (C) The effects of $\lambda = 10$ on the spiking behavior of the SCN controller. A total of 2381 spikes are fired during this simulation, which averages to 238 spikes per second.

V-3 Biological fidelity

A scientifically motivated advantage of SCN control over other methods of control is its biological plausibility. Despite being directly derivable through the underlying theory, the resulting networks portray several features consistent with biology, such as sparse and irregular firing [10, 11], robustness to a range of perturbations[17, 16], and an underlying strict balance of excitation and inhibition[15, 26]. The theory is also extendable to more complex synapses and neural models either through learning[32] or design[18], enabling more complex computations.

V-4 Distributed and robust control

A core advantage of SCN control is that we inherit the property of previous SCN implementations of distributed and robust computations[15, 17]. Given a known linear control algorithm, the required computations can be effectively distributed across the neurons. Consequently, the resulting networks are highly robust to neural silencing (Fig. 3). The robustness to neural silencing might be highly useful for situations in which neuromorphic chips can get damaged during operation, such as in high radiation environments.

V-5 Temporal sparsity and energy use

The distributed and coordinated nature of SCNs ensures that spikes are only fired if they contribute to the required dynamics. As a result, the output spike patterns can be highly temporally sparse and efficient. Compared to other spiking neural network methods, SCNs may therefore offer lower energy use. However, a fair comparison would require direct hardware implementations of the different methods and direct energy measurements. While this is out of scope for this paper, we here highlight a few important caveats. First, while the total spike count across the population is highly optimized, this does require highly dense recurrent connectivity, and thus significant spike-routing and potential energy use. While the density of the connectivity can be drastically reduced [18], it would still need to be taken into account. Second, the use of both fast and slow connectivity types might complicate neural interactions, further increasing energy use. Finally, the sparsity might not be as optimized in the presence of synaptic delays. Even with these caveats, the theoretical understanding of the method should allow further optimization once comparisons can be made. As a demonstration, we show that one can drastically reduce the total number of spikes by reducing the voltage leakage constant — without losing performance or needing retraining (Fig. 6), at the cost of less biological realism and potentially higher memory requirements.

VI Conclusion

For fully identified systems there exist highly efficient closed-form control solutions. Ideally we would like to be able to implement these directly in spiking neural networks, and in a way that mimics the efficient and robust nature of the brain. In this work, we provided an extension of SCN theory to allow optimal estimation and control — providing spiking equivalents for Kalman filters and LQR control. We showed that these networks maintain sparse and irregular spiking patterns when controlling a dynamical system, and are robust to both severe neural silencing and system perturbations. Our networks are analytically derived and do not need learning nor optimization. Hence, our presented approach opens up the prospect for deploying fast, efficient and low-power SNN on-chip controllers with the advantage of having hardware intrinsic redundancy (i.e., the controller still works when some of the neurons stop working) and ensuring that neurons only fire when there is a prediction error.

As outlined in section V, there are several open challenges for future work, in particular for hardware implementations. Of particular note is that SCN theory assumes instantaneous synaptic transmission (zero delays), which is not realistic in either the brain or all neuromorphic hardware. There are several avenues to implement delays in SCNs [31, 30, 16], which can be well combined with the work presented in this paper. While here we focused on analytical implementations, learning rules do exist to implement SCN connectivities [19, 20, 34]. It is an open question whether these can also be applied to optimal control. Of particular difficulty is then the online estimation of the Kalman and LQR gain matrices ($\mathbf{K}_{c}$ and $\mathbf{K}_{f}$). This was recently shown to be possible in non-spiking networks [35], which gives a possible avenue for combining online gain optimization with a closed-form spiking implementation.

VII Acknowledgements

We thank Marcel van Gerven, Bodo Rückauer, Justus Hübotter, Christian Machens, William Podlaski, and Michele Nardin for helpful discussions on control and (spiking) networks.

VIII Appendix: detailed derivation of network connectivity

Here we show how one can use SCN theory to take the dynamical systems corresponding to the continuous Kalman filter and to LQR control, and directly implement them in an SNN (following [15]). For both cases we start with the auto-encoding network

$$\dot{\mathbf{v}} = -\lambda\mathbf{v} + \mathbf{D}^{\top}(\dot{\mathbf{x}} + \lambda\mathbf{x}) + \mathbf{\Omega}_{f}\mathbf{s}. \tag{12}$$

This network takes as an external input $\mathbf{x}$ and its derivative $\dot{\mathbf{x}}$, and can accurately track this signal with the read-out $\mathbf{D}\mathbf{r}$. To implement a given dynamical system in this network we need to do two things: (1) replace the derivative of the input by the desired dynamics ($\dot{\mathbf{x}} = f(\mathbf{x}, \mathbf{u})$); and (2) instead of feeding the state $\mathbf{x}$ as an external input, estimate the current state from the network read-out $\hat{\mathbf{x}} = \mathbf{D}\mathbf{r}$ and feed that back into the input (by simply replacing $\mathbf{x}$ by $\hat{\mathbf{x}}$). This results in network dynamics given by:

$$\dot{\mathbf{v}} = -\lambda\mathbf{v} + \mathbf{D}^{\top}(f(\hat{\mathbf{x}}, \mathbf{u}) + \lambda\hat{\mathbf{x}}) + \mathbf{\Omega}_{f}\mathbf{s}. \tag{13}$$

If $f(\mathbf{x}, \mathbf{u})$ is purely linear, this can be done through a set of slow connections, as in the main text. We do not consider nonlinear cases in this paper, but note that an analytical solution does exist [18].

VIII-A Estimation

For the continuous definition of the Kalman filter (Eq. 7), the above results in voltage dynamics given by

$$\begin{aligned}\dot{\mathbf{v}} = -\lambda\mathbf{v} &+ \mathbf{D}^{\top}(\mathbf{A}\hat{\mathbf{x}} + \lambda\hat{\mathbf{x}}) - \mathbf{D}^{\top}\mathbf{D}\mathbf{s} + \mathbf{D}^{\top}\mathbf{B}\mathbf{u}\\ &+ \mathbf{D}^{\top}\mathbf{K}_{f}(\mathbf{y} - \hat{\mathbf{y}}) + \eta_{V}.\end{aligned} \tag{14}$$

We can then replace $\hat{\mathbf{x}} = \mathbf{D}\mathbf{r}$, and group the different terms together to get the final voltage dynamics implementing a Kalman filter in a recurrent SNN, as given by Eq. 8.

VIII-B Control + estimation

For the continuous definition of a Kalman filter including LQR control we have equation 10. We can do the same as above and directly implement this in a recurrent SNN, but we additionally need to represent the target state $\mathbf{z}$ in the network (as explained in the main text). Together, this results in the voltage update rule

$$\begin{aligned}\dot{\mathbf{v}} = -\lambda\mathbf{v} &+ \mathbf{\Omega}_{s}\mathbf{r} - \mathbf{D}_{\mathbf{x}}^{\top}\mathbf{D}_{\mathbf{x}}\mathbf{s} - \mathbf{D}_{\mathbf{x}}^{\top}\mathbf{B}\mathbf{K}_{c}(\hat{\mathbf{x}} - \hat{\mathbf{z}})\\ &+ \mathbf{D}_{\mathbf{x}}^{\top}\mathbf{K}_{f}(\mathbf{y} - \hat{\mathbf{y}}) + \mathbf{D}_{\mathbf{z}}^{\top}(\dot{\mathbf{z}} + \lambda\mathbf{z})\\ &- \mathbf{D}_{\mathbf{z}}^{\top}\mathbf{D}_{\mathbf{z}}\mathbf{s} + \eta_{V}.\end{aligned} \tag{15}$$

We can then replace $\hat{\mathbf{x}} = \mathbf{D}_{\mathbf{x}}\mathbf{r}$ and $\hat{\mathbf{z}} = \mathbf{D}_{\mathbf{z}}\mathbf{r}$, and group the different terms together to get the final voltage dynamics implementing a Kalman filter plus LQR controller in a recurrent SNN, as given by Eq. 11.

References

  • [1] W. Gerstner, W. M. Kistler, R. Naud, and L. Paninski, Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition.   Cambridge University Press, Jul. 2014.
  • [2] L. F. Abbott, B. DePasquale, and R.-M. Memmesheimer, “Building functional networks of spiking model neurons,” Nature Neuroscience, vol. 19, no. 3, pp. 350–355, Mar. 2016, number: 3 Publisher: Nature Publishing Group. [Online]. Available: https://www.nature.com/articles/nn.4241
  • [3] S. Denève, A. Alemi, and R. Bourdoukan, “The Brain as an Efficient and Robust Adaptive Learner,” Neuron, vol. 94, no. 5, pp. 969–977, Jun. 2017. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0896627317304178
  • [4] M. C. Leary and J. L. Saver, “Annual Incidence of First Silent Stroke in the United States: A Preliminary Estimate,” Cerebrovascular Diseases, vol. 16, no. 3, pp. 280–285, 2003, publisher: Karger Publishers. [Online]. Available: https://www.karger.com/Article/FullText/71128
  • [5] J. H. Morrison and P. R. Hof, “Life and Death of Neurons in the Aging Brain,” Science, vol. 278, no. 5337, pp. 412–419, Oct. 1997, publisher: American Association for the Advancement of Science. [Online]. Available: http://www.science.org/doi/10.1126/science.278.5337.412
  • [6] M. Davies, A. Wild, G. Orchard, Y. Sandamirskaya, G. A. F. Guerra, P. Joshi, P. Plank, and S. R. Risbud, “Advancing Neuromorphic Computing With Loihi: A Survey of Results and Outlook,” Proceedings of the IEEE, vol. 109, no. 5, pp. 911–934, May 2021, conference Name: Proceedings of the IEEE.
  • [7] E. O. Neftci, H. Mostafa, and F. Zenke, “Surrogate Gradient Learning in Spiking Neural Networks: Bringing the Power of Gradient-Based Optimization to Spiking Neural Networks,” IEEE Signal Processing Magazine, vol. 36, no. 6, pp. 51–63, Nov. 2019, conference Name: IEEE Signal Processing Magazine.
  • [8] Z. Bing, C. Meschede, F. Röhrbein, K. Huang, and A. C. Knoll, “A Survey of Robotics Control Based on Learning-Inspired Spiking Neural Networks,” Frontiers in Neurorobotics, vol. 12, 2018. [Online]. Available: https://www.frontiersin.org/articles/10.3389/fnbot.2018.00035
  • [9] M. Traub, R. Legenstein, and S. Otte, “Many-Joint Robot Arm Control with Recurrent Spiking Neural Networks,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sep. 2021, pp. 4918–4925, iSSN: 2153-0866.
  • [10] D. J. Tolhurst, J. A. Movshon, and A. F. Dean, “The statistical reliability of signals in single neurons in cat and monkey visual cortex,” Vision Research, vol. 23, no. 8, pp. 775–785, Jan. 1983. [Online]. Available: https://www.sciencedirect.com/science/article/pii/0042698983902006
  • [11] M. N. Shadlen and W. T. Newsome, “The Variable Discharge of Cortical Neurons: Implications for Connectivity, Computation, and Information Coding,” Journal of Neuroscience, vol. 18, no. 10, pp. 3870–3896, May 1998, publisher: Society for Neuroscience Section: ARTICLE. [Online]. Available: https://www.jneurosci.org/content/18/10/3870
  • [12] C. Eliasmith, T. C. Stewart, X. Choo, T. Bekolay, T. DeWolf, Y. Tang, and D. Rasmussen, “A Large-Scale Model of the Functioning Brain,” Science, vol. 338, no. 6111, pp. 1202–1205, Nov. 2012, publisher: American Association for the Advancement of Science. [Online]. Available: http://www.science.org/doi/full/10.1126/science.1225266
  • [13] D. Salaj, A. Subramoney, C. Kraisnikovic, G. Bellec, R. Legenstein, and W. Maass, “Spike frequency adaptation supports network computations on temporally dispersed information,” eLife, vol. 10, p. e65459, Jul. 2021, publisher: eLife Sciences Publications, Ltd. [Online]. Available: https://doi.org/10.7554/eLife.65459
  • [14] J. F. Hübotter, P. Lanillos, and J. M. Tomczak, “Training Deep Spiking Auto-encoders without Bursting or Dying Neurons through Regularization,” arXiv:2109.11045 [cs], Sep. 2021, arXiv: 2109.11045. [Online]. Available: http://arxiv.org/abs/2109.11045
  • [15] M. Boerlin, C. K. Machens, and S. Denève, “Predictive Coding of Dynamical Variables in Balanced Spiking Networks,” PLOS Computational Biology, vol. 9, no. 11, pp. 1–16, Nov. 2013, publisher: Public Library of Science. [Online]. Available: https://doi.org/10.1371/journal.pcbi.1003258
  • [16] N. Calaim, F. A. Dehmelt, P. J. Gonçalves, and C. K. Machens, “The geometry of robustness in spiking neural networks,” eLife, vol. 11, p. e73276, May 2022, publisher: eLife Sciences Publications, Ltd. [Online]. Available: https://doi.org/10.7554/eLife.73276
  • [17] D. G. T. Barrett, S. Denève, and C. K. Machens, “Optimal compensation for neuron loss,” eLife, vol. 5, 2016. [Online]. Available: https://www.proquest.com/scholarly-journals/optimal-compensation-neuron-loss/docview/2162464133/se-2
  • [18] M. Nardin, J. W. Phillips, W. F. Podlaski, and S. W. Keemink, “Nonlinear computations in spiking neural networks through multiplicative synapses,” Peer Community Journal, vol. 1, p. e68, Dec. 2021. [Online]. Available: https://peercommunityjournal.org/articles/10.24072/pcjournal.69/
  • [19] A. Alemi, C. Machens, S. Deneve, and J.-J. Slotine, “Learning Nonlinear Dynamics in Efficient, Balanced Spiking Networks Using Local Plasticity Rules,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, Apr. 2018. [Online]. Available: https://ojs.aaai.org/index.php/AAAI/article/view/11320
  • [20] W. Brendel, R. Bourdoukan, P. Vertechi, C. K. Machens, and S. Denève, “Learning to represent signals spike by spike,” PLOS Computational Biology, vol. 16, no. 3, p. e1007692, 2020, publisher: Public Library of Science. [Online]. Available: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007692
  • [21] D. Thalmeier, M. Uhlmann, H. J. Kappen, and R.-M. Memmesheimer, “Learning Universal Computations with Spikes,” PLOS Computational Biology, vol. 12, no. 6, pp. 1–29, Jun. 2016, publisher: Public Library of Science. [Online]. Available: https://doi.org/10.1371/journal.pcbi.1004895
  • [22] Y. Sandamirskaya, M. Kaboli, J. Conradt, and T. Celikel, “Neuromorphic computing hardware and neural architectures for robotics,” Science Robotics, vol. 7, no. 67, p. eabl8419, Jun. 2022, publisher: American Association for the Advancement of Science. [Online]. Available: http://www.science.org/doi/abs/10.1126/scirobotics.abl8419
  • [23] A. Mancoo, S. Keemink, and C. K. Machens, “Understanding spiking networks through convex optimization,” in Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, Eds., vol. 33.   Curran Associates, Inc., 2020, pp. 8824–8835. [Online]. Available: https://proceedings.neurips.cc/paper/2020/file/64714a86909d401f8feb83e8c2d94b23-Paper.pdf
  • [24] F. Huang, J. Riehl, and S. Ching, “Optimizing the dynamics of spiking networks for decoding and control,” in 2017 American Control Conference (ACC), May 2017, pp. 2792–2798, iSSN: 2378-5861.
  • [25] F. Huang and S. Ching, “Dynamical Spiking Networks for Distributed Control of Nonlinear Systems,” in 2018 Annual American Control Conference (ACC), Jun. 2018, pp. 1190–1195, iSSN: 2378-5861.
  • [26] S. Denève and C. K. Machens, “Efficient codes and balanced networks,” Nature neuroscience, vol. 19, no. 3, pp. 375–382, Mar. 2016. [Online]. Available: https://doi.org/10.1038/nn.4243
  • [27] S. L. Brunton and J. N. Kutz, Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control, 1st ed.   USA: Cambridge University Press, 2019.
  • [28] M. Davies, N. Srinivasa, T.-H. Lin, G. Chinya, Y. Cao, S. H. Choday, G. Dimou, P. Joshi, N. Imam, S. Jain, Y. Liao, C.-K. Lin, A. Lines, R. Liu, D. Mathaikutty, S. McCoy, A. Paul, J. Tse, G. Venkataramanan, Y.-H. Weng, A. Wild, Y. Yang, and H. Wang, “Loihi: A Neuromorphic Manycore Processor with On-Chip Learning,” IEEE Micro, vol. 38, no. 1, pp. 82–99, Jan. 2018, conference Name: IEEE Micro.
  • [29] O. Rhodes, L. Peres, A. G. D. Rowley, A. Gait, L. A. Plana, C. Brenninkmeijer, and S. B. Furber, “Real-time cortical simulation on neuromorphic hardware,” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 378, no. 2164, p. 20190160, Dec. 2019, publisher: Royal Society. [Online]. Available: https://royalsocietypublishing.org/doi/10.1098/rsta.2019.0160
  • [30] C. E. R. Buxó and J. W. Pillow, “Poisson balanced spiking networks,” PLOS Computational Biology, vol. 16, no. 11, p. e1008261, Nov. 2020, publisher: Public Library of Science. [Online]. Available: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008261
  • [31] M. Chalk, B. Gutkin, and S. Denève, “Neural oscillations as a signature of efficient coding in the presence of synaptic delays,” eLife, vol. 5, p. e13824, Jul. 2016, publisher: eLife Sciences Publications, Ltd. [Online]. Available: https://doi.org/10.7554/eLife.13824
  • [32] F. A. Mikulasch, L. Rudelt, and V. Priesemann, “Local dendritic balance enables learning of efficient representations in networks of spiking neurons,” Proceedings of the National Academy of Sciences, vol. 118, no. 50, p. e2021925118, Dec. 2021, publisher: Proceedings of the National Academy of Sciences. [Online]. Available: https://www.pnas.org/doi/full/10.1073/pnas.2021925118
  • [33] T. DeWolf, T. C. Stewart, J.-J. Slotine, and C. Eliasmith, “A spiking neural model of adaptive arm control,” Proceedings of the Royal Society B: Biological Sciences, vol. 283, no. 1843, p. 20162134, Nov. 2016, publisher: Royal Society. [Online]. Available: https://royalsocietypublishing.org/doi/full/10.1098/rspb.2016.2134
  • [34] J. Büchel, D. Zendrikov, S. Solinas, G. Indiveri, and D. R. Muir, “Supervised training of spiking neural networks for robust deployment on mixed-signal neuromorphic processors,” Scientific Reports, vol. 11, no. 1, p. 23376, Dec. 2021, number: 1 Publisher: Nature Publishing Group. [Online]. Available: https://www.nature.com/articles/s41598-021-02779-x
  • [35] J. Friedrich, S. Golkar, S. Farashahi, A. Genkin, A. Sengupta, and D. Chklovskii, “Neural optimal feedback control with local learning rules,” in Advances in Neural Information Processing Systems, vol. 34.   Curran Associates, Inc., 2021, pp. 16 358–16 370. [Online]. Available: https://proceedings.neurips.cc/paper/2021/hash/88591b4d3219675bdeb33584b755f680-Abstract.html