Excited that our preprint “Improved multitask learning through synaptic intelligence” just went live on the arXiv (https://arxiv.org/abs/1703.04200). This article, by Ben Poole, Surya, and myself, illustrates the benefits of complex synaptic dynamics for continual learning in neural networks. Here is a short summary of why I am particularly excited about this work, with a focus on its neuroscience side.

Artist’s impression of a “complex synapse” by K. Yadava (2017).

“How much should I care?” This is the question a synapse has to ask itself when it comes to updating its efficacy to form a new memory. If the synapse goes “all in” and devotes itself fully to forming the new memory, it might decide to substantially change its weight. Old memories, which may have been encoded using the same synaptic weight, might be damaged or overwritten entirely by this process. However, if a synapse clings too tightly to its own past and does not change its weight much, it becomes difficult to form new memories. This dichotomy has been termed the plasticity-stability dilemma. To overcome this dilemma, several complex synapse and plasticity models have been proposed over the years (Fusi et al., 2005; Clopath et al., 2008; Barrett et al., 2009; Lahiri and Ganguli, 2013; Ziegler et al., 2015; Benna and Fusi, 2016). Most of these models implement intricate hidden synaptic dynamics on multiple timescales, in some cases augmented by some form of neuromodulatory control. However, these models are typically studied in abstract analytical frameworks in which their direct impact on memory performance and on learning real-world tasks is hard to measure. In our article, we take a step towards quantifying the functional benefits of complex synapse models in networks that learn to perform real-world tasks.

In the manuscript, we investigated the problem of multi-task learning, in which a neural network has to solve a standard classification task. But instead of having access to all the training data at once, the network is trained little by little, on one sub-task at a time. For example, suppose you want to learn the (MNIST) digits. Instead of giving you all the labels, your supervisor only shows you zeros and ones at first. Then, the next day, you get to see twos and threes, and so forth. When you train a standard MLP on these tasks one by one, by the time you get to the eights and nines, you will have forgotten about the zeros and ones. In machine learning this problem is called catastrophic forgetting (McCloskey and Cohen, 1989; Srivastava et al., 2013; Goodfellow et al., 2013), but it is closely related to the plasticity-stability dilemma introduced above. In our article, we propose a straightforward mechanism by which each synapse “remembers” how important it is for achieving good performance on a given set of tasks. The more important a synapse “thinks” it is for storing past memories, the more reluctant it becomes to update its efficacy to store new memories. It turns out that when all synapses in a network do that, it becomes relatively simple for the network to learn new things while also remembering the old stuff.

The mechanism has a simple yet beautiful connection to computing the path integral over the gradient field. The details do not matter, but to give you the gist of it: what’s really cool and somewhat surprising is that for gradient-based learning schemes, a synapse can estimate its own contribution to improvements of the global objective based on purely local measurements. All a synapse needs access to is its own gradient component and its own weight update. Additionally, the synapse needs some memory of its recent past. To implement these dynamics, the synaptic state space needs to be 3–4 dimensional, depending on how you count. From a biological point of view, this dimensionality does not seem unreasonable given the chemical and molecular complexity of real synapses (Redondo and Morris, 2011). I find this pretty neat, but of course there are plenty of open questions left for future studies. For instance, in the manuscript we use backprop to train our networks. That makes the gradient itself a highly nonlocal quantity, and for multilayer networks it is still largely unclear how this gradient gets to the synapse. But one step at a time. Figuring out how credit assignment is achieved in the deep layers of the brain remains an open problem, but a lot of people are working on it, so we can hope for progress on that topic soon.
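For readers who prefer code to words, here is a toy sketch of the idea in C++. The variable names are mine and the paper's exact formulation differs in details, but the gist is this: while training on a task, each synapse accumulates the running product of its gradient and its own weight update (a discretized path integral of the gradient along the learning trajectory), and at a task boundary it normalizes this sum by its squared total weight change to obtain an importance estimate.

```cpp
#include <cassert>
#include <cmath>

// Toy model of a synapse that tracks its own importance.
// While training on a task, it accumulates the product of its
// (negative) gradient and its own weight update: a discretized path
// integral of the gradient along the learning trajectory. At a task
// boundary, this sum is normalized by the squared total weight change.
struct Synapse {
    double w = 0.0;           // current weight
    double w_old = 0.0;       // weight at the last task boundary
    double omega = 0.0;       // running path-integral contribution
    double importance = 0.0;  // consolidated importance estimate

    // one gradient step with learning rate eta (purely local quantities)
    void update(double grad, double eta) {
        double dw = -eta * grad;  // weight update
        omega += -grad * dw;      // local estimate of loss improvement
        w += dw;
    }

    // at a task boundary, turn omega into an importance estimate
    void consolidate(double damping = 1e-3) {
        double delta = w - w_old;
        importance += omega / (delta * delta + damping);
        omega = 0.0;
        w_old = w;
    }
};
```

When training on a subsequent task, a quadratic penalty weighted by this importance would then pull each weight back toward its consolidated value, making important synapses reluctant to change.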

Anyway, I am looking forward to feedback and suggestions 🙂


  • Barrett, A.B., Billings, G.O., Morris, R.G.M., and van Rossum, M.C.W. (2009). State Based Model of Long-Term Potentiation and Synaptic Tagging and Capture. PLoS Comput Biol 5, e1000259.
  • Benna, M.K., and Fusi, S. (2016). Computational principles of synaptic memory consolidation. Nat Neurosci advance online publication.
  • Clopath, C., Ziegler, L., Vasilaki, E., Büsing, L., and Gerstner, W. (2008). Tag-Trigger-Consolidation: A Model of Early and Late Long-Term-Potentiation and Depression. PLoS Comput Biol 4, e1000248.
  • Fusi, S., Drew, P.J., and Abbott, L.F. (2005). Cascade models of synaptically stored memories. Neuron 45, 599–611.
  • Goodfellow, I.J., Mirza, M., Xiao, D., Courville, A., and Bengio, Y. (2013). An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks. arXiv:1312.6211 [Cs, Stat].
  • Lahiri, S., and Ganguli, S. (2013). A memory frontier for complex synapses. In Advances in Neural Information Processing Systems, (Tahoe, USA: Curran Associates, Inc.), pp. 1034–1042.
  • McCloskey, M., and Cohen, N.J. (1989). Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem. In Psychology of Learning and Motivation, G.H. Bower, ed. (Academic Press), pp. 109–165.
  • Redondo, R.L., and Morris, R.G.M. (2011). Making memories last: the synaptic tagging and capture hypothesis. Nat Rev Neurosci 12, 17–30.
  • Srivastava, R.K., Masci, J., Kazerounian, S., Gomez, F., and Schmidhuber, J. (2013). Compete to Compute. In Proceedings of the 26th International Conference on Neural Information Processing Systems, (USA: Curran Associates Inc.), pp. 2310–2318.
  • Ziegler, L., Zenke, F., Kastner, D.B., and Gerstner, W. (2015). Synaptic Consolidation: From Synapses to Behavioral Modeling. J Neurosci 35, 1319–1334.

I am very much looking forward to presenting some recent work with Surya on learning in spiking neural networks at the CoSyNe workshop “Deep learning and the brain” (6:20–6:50 pm on Monday, 27 February 2017, in “Wasatch”).

Update: Preprint available.

Schematic drawing of a multi-layer spiking neural network

In my talk I will revisit the problem of training multi-layer spiking neural networks using an objective-function approach. Due to the non-differentiable nature of spiking neurons and the non-trivial history dependence induced by the spike reset, it is generally not possible to directly apply gradient-based learning methods like the ones used to train deep neural networks in machine learning.

During my presentation, I will address one by one the core problems typically encountered when trying to train spiking neural networks, and introduce SuperSpike, a new approach to training deterministic spiking neural networks to solve complex, non-linearly separable temporal tasks.
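To make the threshold problem concrete, here is a minimal sketch of the surrogate-derivative idea in C++. This illustrates the general principle only, not necessarily the exact functional form used in SuperSpike: the hard threshold stays in the forward pass, but a smooth stand-in replaces its derivative when computing gradients, so a learning signal flows even when the derivative of the true spike function would be zero.

```cpp
#include <cassert>
#include <cmath>

// The spike nonlinearity S(u) = Theta(u - theta) has zero derivative
// almost everywhere, so its exact gradient carries no learning signal.

// hard threshold used when running the network (forward pass)
double spike(double u, double theta) { return u >= theta ? 1.0 : 0.0; }

// smooth surrogate used in place of dS/du in the backward pass,
// a "fast sigmoid"-like shape that peaks at the firing threshold
double surrogate_grad(double u, double theta, double beta) {
    double x = beta * std::fabs(u - theta);
    return beta / ((1.0 + x) * (1.0 + x));
}
```

The steepness parameter beta controls how tightly the learning signal concentrates around the threshold.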

Illustration of the SuperSpike algorithm solving a 4-way classification task. In this example each output neuron needs to learn to spike in response to one out of 100 noisy input spike patterns. All neurons in the network are implemented as standard leaky integrate-and-fire neurons. Left: Schematic setup of the network. Middle: Initially, all output and hidden neurons are quiescent. At a later time the output neurons have learned to respond to the correct stimulus class (indicated by the shaded color region), while the hidden neurons show sparse and temporally irregular activity.
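For reference, the leaky integrate-and-fire dynamics mentioned in the caption can be written in a few lines. This is a generic textbook forward-Euler version, not the actual implementation used in the simulations; note how the hard reset makes the trajectory history-dependent, which is one of the obstacles for gradient-based training mentioned above.

```cpp
#include <cassert>

// A standard leaky integrate-and-fire neuron in its simplest
// forward-Euler form: the membrane potential v leaks toward rest
// (here 0), integrates its input, and is reset when it crosses
// the firing threshold.
struct LIFNeuron {
    double v = 0.0;        // membrane potential
    double tau = 20e-3;    // membrane time constant (20 ms)
    double v_th = 1.0;     // firing threshold
    double v_reset = 0.0;  // reset potential

    // advance one timestep of length dt; returns true on a spike
    bool step(double dt, double input) {
        v += (dt / tau) * (-v) + input;
        if (v >= v_th) { v = v_reset; return true; }
        return false;
    }
};
```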

Importantly, SuperSpike has a direct interpretation as a Hebbian three-factor learning rule. Moreover, I am going to share some of my ideas on how similar algorithms could be implemented in neurobiology. For instance, when combined with feedback alignment (Lillicrap et al., 2016), the weight transport problem can be alleviated (see the figure below for a simple example). With all that said, it would be great if you could join me for my talk. I am looking forward to fruitful discussions during the workshop and to your feedback.
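To illustrate what feedback alignment does in the simplest possible setting, here is a scalar toy example in C++ (entirely my own construction, not taken from the paper): in a two-layer linear network y = w2 * w1 * x, exact backprop would propagate the output error to the first layer through w2. Feedback alignment instead uses a fixed random feedback weight b, so the backward pathway never needs to "transport" w2.

```cpp
#include <cassert>
#include <cmath>

// Minimal scalar demonstration of feedback alignment:
// the first-layer update uses a fixed random feedback weight b
// in place of the forward weight w2.
struct TinyNet {
    double w1 = 0.1, w2 = 0.5;
    double b = 0.3;  // fixed feedback weight, chosen once at random

    double forward(double x) const { return w2 * (w1 * x); }

    void train_step(double x, double target, double eta) {
        double h = w1 * x;                // hidden activity
        double e = forward(x) - target;   // output error
        w2 -= eta * e * h;                // exact gradient for w2
        w1 -= eta * b * e * x;            // feedback via b, not w2
    }
};
```

In this scalar case learning works whenever b and w2 agree in sign; the surprising result of Lillicrap et al. is that in higher dimensions the forward weights tend to align with the random feedback weights during training, so the approximate updates become useful.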



I recommend taking a look at the special issue on ‘Integrating Hebbian and Homeostatic plasticity’ which was just published in Phil Trans of the Royal Society B. You can find the table of contents at http://rstb.royalsocietypublishing.org/content/372/1715. The issue is based on a fruitful discussion meeting in London in April 2016 and combines multiple contributions from both theory and experiment. It offers an excellent overview of the state of the art in research on Hebbian and homeostatic plasticity.

The issue also includes a paper by Wulfram and myself with our take on the role of negative feedback processes on different timescales. In this paper we suggest a division of labor between rapid compensatory processes (RCPs), which act on short timescales comparable to those of plasticity induction, and homeostatic mechanisms, which act on much slower timescales of hours or days.


I just enjoyed reading Romain Brette’s post about how to move towards a better scientific publication system. Maybe you will find it interesting too.

My new year resolution: to help move science to the post-journal world



The stable Auryn version 0.8 is available now. The new version comes with extensive refactoring under the hood and now supports complex synapse models and improved vectorization for neuron models. The new version is available on GitHub.


Last week I put up a release branch for Auryn v0.8, which is currently in alpha stage. The code can be found here: https://github.com/fzenke/auryn/releases

The main perks: a further increase in performance, and class-based state vectors for neuronal and synaptic states that make code easier to write and read.

Increased performance

The main changes from Auryn v0.7 to Auryn v0.8.0-alpha happened under the hood. Auryn’s core vector class for state updates and its core class for MPI communication between nodes were both completely rewritten. This increases Auryn’s performance even further, by about 10%, as shown by a series of benchmarks of execution speed across development versions.

Ease of writing code

Refactoring Auryn’s state vector class, which is at the heart of neuronal and synaptic updates, not only increased performance but also made the code more readable and easier to write. Previously, vector operations were based on a functional framework inherited from older versions which still used the GSL. To implement an exponential decay of an AMPA conductance stored in a state vector g_ampa, for instance, you had to write

auryn_vector_float_scale( mul, g_ampa);

where mul is a float and g_ampa is the vector containing all AMPA conductances of the NeuronGroup. Now, state vectors are classes with their own member functions, and the above expression reduces to:

g_ampa->scale( mul );
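As a side note for readers wondering where a factor like mul comes from: multiplying a conductance by a constant every timestep implements exponential decay. This is my reading of the example (the constant itself is of course model-specific); for a timestep dt and decay time constant tau, the exact per-step factor is exp(-dt/tau).

```cpp
#include <cassert>
#include <cmath>

// Per-timestep multiplier that implements exact exponential decay
// with time constant tau when applied once per timestep of length dt:
// g(t + dt) = g(t) * exp(-dt / tau)
double decay_factor(double dt, double tau) {
    return std::exp(-dt / tau);
}
```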

Or similarly, to compute the current caused by an inhibitory conductance, up to now you had to write:


which first computes the distance from the inhibitory reversal potential (e_rev), stores it in the state vector t_inh, and then multiplies it with the conductances in g_gaba. In Auryn v0.8 the same is achieved by


Don’t worry, though: all the legacy functions will still work.

New devices, models and perks

In addition, Auryn 0.8 comes with a bunch of nice new tools. For instance, there is now a BinaryStateMonitor. Both BinaryStateMonitor and StateMonitor can now compress their output if desired. Moreover, I laid the groundwork for supporting AVX instructions in the future. There are new neuron models available, such as the Izhikevich model, and plenty more …

Go take a look! I hope you like it.



I will have a poster at the Discussion Meeting “Integrating Hebbian and homeostatic plasticity” in London next week, April 19–20, 2016.

I am happy about the opportunity to present a poster summarizing the key insights I gained during my PhD at an exciting-looking discussion meeting, “Integrating Hebbian and homeostatic plasticity”, organized by Kevin Fox and Michael Stryker at the Royal Society in London.



… just got a lot easier with the new AurynVector class.

Because Auryn originally used GSL vectors (the GSL being a plain C library), it was still using non-object-oriented syntax for vector data types internally. That made writing code for new neuron models particularly ugly and hard to read. People who were struggling with this will be happy to hear that it has now become a little easier.

In the current development version of Auryn’s code I refactored the central vector data type (auryn_vector_float) into a class template type, AurynVectorFloat, which now brings its own constructor and methods to manipulate it. I was also happy to see that performance was not notably affected by the change. In time I might even be able to drop the explicit use of SIMD instructions, since for the new code the current GNU C++ compiler automatically detects where their use is advantageous. The old legacy code will remain in Auryn’s code base for a while for backward compatibility. For details see
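To give a flavor of what such a class-based state vector looks like, here is a minimal sketch (illustrative only, not the actual AurynVector code): the data and the operations on it live in one object, so model code can call methods on the vector instead of passing it to free functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal class-template state vector: construction, element access,
// and the elementwise operations typical of neuronal state updates.
template <typename T>
class StateVector {
    std::vector<T> data_;
public:
    explicit StateVector(std::size_t n, T init = T()) : data_(n, init) {}
    void scale(T a) { for (T &x : data_) x *= a; }  // x_i *= a
    void add(T a)   { for (T &x : data_) x += a; }  // x_i += a
    T get(std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }
};
```

With simple loops like these, an optimizing compiler can auto-vectorize the elementwise operations, which is what makes dropping hand-written SIMD plausible.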




The new stable Auryn version is now online

After a couple of months of testing, Auryn 0.7.0 is now available for download. The new version finally uses cmake throughout and can thus be built without hassle on Windows PCs and Macs as well. Enjoy!



This concludes our Special Issue in Frontiers Computational Neuroscience

Cristina, Matt, and I are happy to have successfully concluded the Frontiers Research Topic that we organized over the past year. I would like to express my thanks to all the authors and reviewers who made this endeavor happen. We have summarized the main outcomes of this work in our editorial article, which is now online.