Seminar talk in Oxford on Rapid Compensatory Mechanisms

I am delighted to get the chance to present my work on learning in spiking neural networks next week (Tuesday, 17 October 2017, 1pm to 2pm) in Oxford at the “EP Cognitive and Behavioural Neuroscience Seminar”.

Title: Making Cell Assemblies: What can we learn about plasticity from spiking neural network models?

Abstract: Long-term synaptic changes are thought to underlie learning and memory. Hebbian plasticity and homeostatic plasticity work in concert to combine neurons into functional cell assemblies. This is the story you know. In this talk, I will tell a different tale. In the first part, starting from the iconic notion of the Hebbian cell assembly, I will show the difficulties that synaptic plasticity has to overcome to form and maintain memories stored as cell assemblies in a network model of spiking neurons. Teetering on the brink of disaster, a diversity of synaptic plasticity mechanisms must work in symphony to avoid exploding network activity and catastrophic memory loss, in order to fulfill our preconception of how memories are formed and maintained in biological neural networks. I will introduce the notion of Rapid Compensatory Processes, explain why they have to work on shorter timescales than currently known forms of homeostatic plasticity, and motivate why it is useful to derive synaptic learning rules from a cost function approach. Cost functions will also serve as the motivation for the second part of my talk, in which I will focus on the issue of spatial credit assignment. Plastic synapses encounter this issue when they are part of a network in which information is processed sequentially over several layers. I will introduce several recent conceptual advances in the field that have led to algorithms which can train spiking neural network models capable of solving complex tasks. Finally, I will show that such algorithms can be mapped to voltage-dependent three-factor Hebbian plasticity rules and discuss their biological plausibility.

Posted in talks

ICML Talk on “Continual Learning Through Synaptic Intelligence”

I am looking forward to presenting our work on synaptic consolidation at ICML in Sydney this year. The talk will be held on Tue, Aug 8, 11:06–11:24 am @ Darling Harbour Theatre. Ben and I will also present a poster (#46) on the same topic on Tuesday.

See also the paper, the code and an older blog post on the topic.


Posted in news, talks

Supervised learning in multi-layer spiking neural networks

We just put a conference paper version of “SuperSpike”, our work on supervised learning in multi-layer spiking neural networks, on the arXiv. As always, I am keen to get your feedback.

Posted in publications

The temporal paradox of Hebbian learning and homeostatic plasticity

I am happy that our article on “The temporal paradox of Hebbian learning and homeostatic plasticity” was just published in Current Opinion in Neurobiology (full text). The article concisely presents the main arguments for the existence of rapid compensatory processes (RCPs) in addition to slow forms of homeostatic plasticity, and then reviews some of the top candidates to fill this role. Unlike other articles we have written before, this one has a control-theoretic spin.

Here is the journal version and a preprint in case the former does not work for you. Many thanks to everyone who contributed to this article, either as anonymous reviewers or by giving input on the bioRxiv preprint.

I hope you will find this article thought-provoking and helpful.

Posted in publications

Role of complex synapses in continual learning

Excited that our preprint “Improved multitask learning through synaptic intelligence” just went live on the arXiv. This article, by Ben Poole, Surya, and myself, illustrates the benefits of complex synaptic dynamics for continual learning in neural networks. Here is a short summary of why I am particularly excited about this work, with a focus on its neuroscience side.

Artist’s impression of a “complex synapse” by K. Yadava (2017).

“How much should I care?” This is the question a synapse has to ask itself when it comes to updating its efficacy to form a new memory. If the synapse goes “all in” and devotes itself fully to forming the new memory, it might decide to substantially change its weight. Old memories, which may have been encoded using the same synaptic weight, might be damaged or overwritten entirely by this process. However, if a synapse clings too tightly to its own past and does not change its weight by much, it becomes difficult to form new memories. This dichotomy has been termed the plasticity-stability dilemma. To overcome this dilemma, several complex synapse and plasticity models have been proposed over the years (Fusi et al., 2005; Clopath et al., 2008; Barrett et al., 2009; Lahiri and Ganguli, 2013; Ziegler et al., 2015; Benna and Fusi, 2016). Most of these models implement intricate hidden synaptic dynamics on multiple timescales, which in some cases are augmented by some form of neuromodulatory control. However, these models are typically studied in abstract analytical frameworks in which their direct impact on memory performance and on learning real-world tasks is hard to measure. In our article, we take a step towards assessing the functional benefits of complex synaptic models in network models which learn to perform real-world tasks.

In the manuscript, we investigate the problem of multi-task learning in which a neural network has to solve a standard classification task but, instead of having access to all the training data at once, is trained piecemeal on one sub-task at a time. For example, suppose you want to learn the (MNIST) digits. Instead of giving you all the labels, your supervisor only shows you zeros and ones at first. Then, the next day, you get to see twos and threes, and so forth. When you train a standard MLP on these tasks one by one, by the time you get to 8 and 9 you will have forgotten about 0 and 1. In machine learning this problem is called catastrophic forgetting (McCloskey and Cohen, 1989; Srivastava et al., 2013; Goodfellow et al., 2013), but it is closely related to the plasticity-stability dilemma introduced above. In our article, we propose a straightforward mechanism by which each synapse “remembers” how important it is for achieving good performance on a given set of tasks. The more important a synapse “thinks” it is for storing past memories, the more reluctant it becomes to update its efficacy to store new memories. It turns out that when all synapses in a network do this, it becomes relatively simple for the network to learn new things while remembering the old ones.
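To make this "reluctance" concrete: one simple way to implement it, sketched here in generic C++ with illustrative names (not the paper's notation), is to add a per-parameter quadratic penalty to the loss of the new task, weighted by each synapse's accumulated importance omega_k, so important weights resist being pulled away from their old values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of a consolidation penalty as described above (names and the
// strength c are illustrative): when training on a new task, the loss gains
//     c * sum_k omega_k * (w_k - w_old_k)^2
// so weights deemed important for past tasks (large omega_k) are reluctant
// to move away from their previously learned values w_old_k.
double consolidation_penalty(const std::vector<double>& w,
                             const std::vector<double>& w_old,
                             const std::vector<double>& omega,
                             double c) {
    double penalty = 0.0;
    for (std::size_t k = 0; k < w.size(); ++k) {
        double d = w[k] - w_old[k];
        penalty += omega[k] * d * d;
    }
    return c * penalty;
}
```

An unimportant synapse (omega_k near zero) pays almost no price for changing, while an important one effectively freezes unless the new task really needs it.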

The mechanism has a simple yet beautiful connection to computing the path integral over the gradient field. The details do not matter, but to give you the gist of it: what is really cool and somewhat surprising is that for gradient-based learning schemes, a synapse can estimate its own contribution to improvements of the global objective from purely local measurements. All a synapse needs access to is its own gradient component and its own weight update, plus some memory of its recent past. To implement these dynamics, the synaptic state space needs to be 3–4 dimensional, depending on how you count. From a biological point of view, this dimensionality does not seem unreasonable given the chemical and molecular complexity of real synapses (Redondo and Morris, 2011). I find this pretty neat, but of course there are plenty of open questions left for future studies. For instance, in the manuscript we use backprop to train our networks. That makes the gradient itself a highly nonlocal quantity, and for multi-layer networks it is still largely unclear how this gradient gets to the synapse. But one step at a time. Figuring out how credit assignment is achieved in the deep layers of the brain remains an open problem, but many people are working on it, so we can hope for progress soon.
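A toy sketch of this local estimate (generic C++ with made-up variable names; not the paper's reference implementation): on a task whose loss depends only on the first of two parameters, each parameter accumulates -g_k * dw_k along its own training trajectory and thereby flags its own contribution to the loss decrease.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy illustration of the local importance estimate described above.
// Task loss L(w) = (w[0] - 1)^2 depends only on w[0]; w[1] is unused.
// During SGD each parameter accumulates omega_k += -g_k * dw_k, i.e. its
// path-integral contribution to the decrease of the loss, using only its
// own gradient component and its own weight update.
std::vector<double> importance_after_training() {
    std::vector<double> w = {0.0, 0.0};
    std::vector<double> omega = {0.0, 0.0};
    const double lr = 0.1;
    for (int step = 0; step < 100; ++step) {
        std::vector<double> g = {2.0 * (w[0] - 1.0), 0.0};  // gradient of L
        for (std::size_t k = 0; k < w.size(); ++k) {
            double dw = -lr * g[k];   // SGD update
            omega[k] += -g[k] * dw;   // local importance accumulator
            w[k] += dw;
        }
    }
    return omega;  // the parameter that did the work ends up with large omega
}
```

Only local quantities appear in the accumulator update, which is exactly the point: the parameter that drove the improvement ends up with a large omega, the idle one with zero.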

Anyway, I am looking forward to feedback and suggestions 🙂


  • Barrett, A.B., Billings, G.O., Morris, R.G.M., and van Rossum, M.C.W. (2009). State Based Model of Long-Term Potentiation and Synaptic Tagging and Capture. PLoS Comput Biol 5, e1000259.
  • Benna, M.K., and Fusi, S. (2016). Computational principles of synaptic memory consolidation. Nat Neurosci advance online publication.
  • Clopath, C., Ziegler, L., Vasilaki, E., Büsing, L., and Gerstner, W. (2008). Tag-Trigger-Consolidation: A Model of Early and Late Long-Term-Potentiation and Depression. PLoS Comput Biol 4, e1000248.
  • Fusi, S., Drew, P.J., and Abbott, L.F. (2005). Cascade models of synaptically stored memories. Neuron 45, 599–611.
  • Goodfellow, I.J., Mirza, M., Xiao, D., Courville, A., and Bengio, Y. (2013). An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks. arXiv:1312.6211 [Cs, Stat].
  • Lahiri, S., and Ganguli, S. (2013). A memory frontier for complex synapses. In Advances in Neural Information Processing Systems, (Tahoe, USA: Curran Associates, Inc.), pp. 1034–1042.
  • McCloskey, M., and Cohen, N.J. (1989). Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem. In Psychology of Learning and Motivation, G.H. Bower, ed. (Academic Press), pp. 109–165.
  • Redondo, R.L., and Morris, R.G.M. (2011). Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30.
  • Srivastava, R.K., Masci, J., Kazerounian, S., Gomez, F., and Schmidhuber, J. (2013). Compete to Compute. In Proceedings of the 26th International Conference on Neural Information Processing Systems, (USA: Curran Associates Inc.), pp. 2310–2318.
  • Ziegler, L., Zenke, F., Kastner, D.B., and Gerstner, W. (2015). Synaptic Consolidation: From Synapses to Behavioral Modeling. J Neurosci 35, 1319–1334.
Posted in publications

Upcoming talk at COSYNE workshop “Learning in multi-layer spiking neural networks”

I am very much looking forward to presenting some recent work with Surya on learning in spiking neural networks at the COSYNE workshop “Deep learning and the brain” (6:20–6:50 pm on Monday, 27 February 2017, in “Wasatch”).

Schematic drawing of a multi-layer spiking neural network

In my talk I will revisit the problem of training multi-layer spiking neural networks using an objective function approach. Due to the non-differentiable nature of spiking neurons and their non-trivial history-dependence induced by the spike reset, it is generally not possible to apply gradient-based learning methods like the ones used to train deep neural networks in machine learning.
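One way around this non-differentiability, and the route taken by surrogate-gradient approaches of this kind, is to replace the ill-defined derivative of the spiking nonlinearity with a smooth surrogate evaluated at the membrane potential. A minimal sketch (the fast-sigmoid shape is one common choice; the threshold theta and steepness beta are illustrative parameters):

```cpp
#include <cassert>
#include <cmath>

// Surrogate for the derivative of the spiking threshold nonlinearity:
// use the derivative of the fast sigmoid f(x) = x / (1 + |x|), which is
// 1 / (1 + |x|)^2, evaluated at x = beta * (u - theta), where u is the
// membrane potential and theta the firing threshold.
double surrogate_spike_grad(double u, double theta, double beta) {
    double x = beta * std::fabs(u - theta);
    return 1.0 / ((1.0 + x) * (1.0 + x));
}
```

The surrogate is maximal at threshold and falls off smoothly on either side, so gradient information flows even through neurons that did not spike.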

During my presentation, I will address, one by one, the core problems typically encountered when trying to train spiking neural networks, and introduce SuperSpike, a new approach to training deterministic spiking neural networks to solve complex and non-linearly separable temporal tasks.

Illustration of the SuperSpike algorithm solving a 4-way classification task. In this example, each output neuron needs to learn to spike in response to one out of 100 noisy input spike patterns. All neurons in the network are implemented as standard leaky integrate-and-fire neurons. Left: Schematic setup of the network. Middle: Initially, all output and hidden neurons are quiescent. At a later time, the output neurons have learned to respond to the correct stimulus class (indicated by the shaded color region), while the hidden neurons show sparse and temporally irregular activity.

Importantly, SuperSpike has a direct interpretation as a Hebbian three-factor learning rule. Moreover, I am going to share some of my ideas on how similar algorithms could be implemented in neurobiology. For instance, when combined with feedback alignment (Lillicrap et al. 2016), the weight transport problem can be alleviated. With all that said, it would be great if you could join me for my talk. I am looking forward to fruitful discussions during the workshop and to your feedback.
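Schematically, such a three-factor rule multiplies a top-down error signal by a postsynaptic voltage-dependent term and a presynaptic eligibility trace. The following is an illustrative sketch, not the exact SuperSpike update; all names and the surrogate shape are assumptions on my part:

```cpp
#include <cassert>
#include <cmath>

// Schematic three-factor weight update: a top-down error signal (the third
// factor) gates a Hebbian product of a postsynaptic voltage-dependent term
// and a filtered presynaptic spike trace. Under feedback alignment, the
// error signal would arrive through fixed random feedback weights rather
// than the transpose of the forward weights.
double three_factor_update(double error_signal,   // third factor
                           double u_post,         // postsynaptic voltage
                           double pre_trace,      // presynaptic trace
                           double lr, double theta, double beta) {
    double x = beta * std::fabs(u_post - theta);
    double voltage_term = 1.0 / ((1.0 + x) * (1.0 + x));  // surrogate derivative
    return lr * error_signal * voltage_term * pre_trace;
}
```

Note that every factor is locally available at the synapse except the error signal, which is precisely the quantity the feedback pathway has to deliver.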


Posted in news, talks

Special Issue: “Integrating Hebbian and Homeostatic plasticity”

I recommend taking a look at the special issue on “Integrating Hebbian and Homeostatic plasticity”, which was just published in Phil Trans of the Royal Society B. The issue is based on a fruitful discussion meeting in London in April 2016 and combines multiple contributions from both theory and experiment. It offers an excellent overview of the state of the art in research on Hebbian and homeostatic plasticity.

The issue also includes a paper by Wulfram and myself with our take on the role of negative feedback processes on different timescales. In this paper we suggest a division of labor between rapid compensatory processes (RCPs), which act on short timescales comparable to those of plasticity induction, and homeostatic mechanisms, which act on much longer timescales of hours or days.

Posted in news, publications

Towards a post-journal world

I just enjoyed reading Romain Brette’s post about how to move towards a better scientific publication system. Maybe you will find it interesting too.

My new year resolution: to help move science to the post-journal world


Posted in web

Auryn v0.8.0 stable released

The stable Auryn version 0.8 is available now. The new version comes with extensive refactoring under the hood and now supports complex synapse models and improved vectorization for neuron models. The new version is available on GitHub.

Posted in news

What’s new in Auryn v0.8.0-alpha?

Last week I put up a release branch for Auryn v0.8, which is currently in alpha stage. The code can be found here.

The main perks: a further increase in performance, and class-based state vectors for neuronal and synaptic states, which make the code easier to write and read.

Increased performance

The main changes from Auryn v0.7 to Auryn v0.8.0-alpha happened under the hood. Auryn’s core vector class for state updates and its core class for MPI communication between nodes were both completely rewritten. This increased Auryn’s performance by a further 10% or so, as confirmed by a series of benchmarks tracking how execution speed evolved over the course of development.

Ease of writing code

Refactoring Auryn’s state vector class, which is at the heart of neuronal and synaptic updates, not only increased performance but also made the code more readable and easier to write. Before, vector operations were based on a functional framework inherited from older versions which still used the GSL. To implement an exponential decay of an AMPA conductance stored in a state vector g_ampa, for instance, you had to write

auryn_vector_float_scale( mul, g_ampa);

where mul is a float and g_ampa is the vector containing all AMPA conductances of the NeuronGroup. Now, state vectors are classes with their own member functions. The above expression reduces to:

g_ampa->scale( mul );
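For context, mul here is typically the per-timestep decay factor of the conductance; for a time constant tau and simulation timestep dt it can be computed as follows (generic C++, not part of Auryn's API):

```cpp
#include <cassert>
#include <cmath>

// Per-timestep decay factor for an exponentially decaying conductance,
// dg/dt = -g / tau, integrated exactly over one timestep dt:
// g(t + dt) = g(t) * exp(-dt / tau).
double decay_factor(double dt, double tau) {
    return std::exp(-dt / tau);
}
```

Precomputing this constant once means the per-step update is a single multiplication per state-vector element.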

Similarly, to compute the current caused by an inhibitory conductance, you previously had to write:


which first computes the distance from the inhibitory reversal potential (e_rev), stores it in the state vector t_inh and then multiplies it with the conductances in g_gaba. In Auryn v0.8 the same is achieved by


Don’t worry, though. All the legacy functions will also still work.
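In plain C++ terms, the inhibitory-current computation described above amounts to the following element-wise operation (a generic sketch, not Auryn's actual API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise sketch of the computation described in the text: take the
// distance of each membrane potential from the inhibitory reversal
// potential e_rev and multiply it by the corresponding GABA conductance.
std::vector<double> inhibitory_current(const std::vector<double>& mem,
                                       const std::vector<double>& g_gaba,
                                       double e_rev) {
    std::vector<double> t_inh(mem.size());
    for (std::size_t i = 0; i < mem.size(); ++i)
        t_inh[i] = (mem[i] - e_rev) * g_gaba[i];
    return t_inh;
}
```

The state-vector classes essentially package loops like this behind short method calls, which is what makes the new syntax so much more compact.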

New devices, models and perks

In addition, Auryn 0.8 comes with a bunch of nice new tools. For instance, there is now a BinaryStateMonitor. Both BinaryStateMonitor and StateMonitor can now compress their output if desired. Moreover, I laid the groundwork for supporting AVX instructions in the future. There are also new neuron models, such as the Izhikevich model, and plenty more.

Go take a look! I hope you like it.


Posted in news