Move user docs on surrogate gradient to eprop_iaf and include elsewhere
akorgor committed Sep 11, 2024
1 parent 60a80ee commit 757b064
Showing 5 changed files with 77 additions and 84 deletions.
56 changes: 54 additions & 2 deletions models/eprop_iaf.h
@@ -103,8 +103,39 @@
voltage :math:`\psi_j^{t-1}` (the product of which forms the eligibility
trace :math:`e_{ji}^{t-1}`), and the learning signal :math:`L_j^t` emitted
by the readout neurons.
See the documentation on the :doc:`eprop_archiving_node<../models/eprop_archiving_node/>` for details on the surrogate
gradient functions.
.. start_surrogate-gradient-functions
Surrogate gradients help overcome the challenge of the spiking function's
non-differentiability, facilitating the use of gradient-based learning
techniques such as e-prop. The non-existent derivative of the spiking
variable with respect to the membrane voltage,
:math:`\frac{\partial z^t_j}{ \partial v^t_j}`, can be effectively
replaced with a variety of surrogate gradient functions, as detailed in
various studies (see, e.g., [3]_). NEST currently provides four
different surrogate gradient functions:
1. A piecewise linear function used, among others, in [1]_:
.. math::
\psi_j^t = \frac{ \gamma }{ v_\text{th} } \text{max}
\left( 0, 1-\beta \left| \frac{ v_j^t - v_\text{th} }{ v_\text{th} }\right| \right) \,. \\
2. An exponential function used in [4]_:
.. math::
\psi_j^t = \gamma \exp \left( -\beta \left| v_j^t - v_\text{th} \right| \right) \,. \\
3. The derivative of a fast sigmoid function used in [5]_:
.. math::
\psi_j^t = \frac{ \gamma }{ \left( 1 + \beta \left| v_j^t - v_\text{th} \right| \right)^2 } \,. \\
4. An arctan function used in [6]_:
.. math::
\psi_j^t = \frac{\gamma}{\pi} \frac{1}{ 1 + \left( \beta \pi \left( v_j^t - v_\text{th} \right) \right)^2 } \,. \\
.. end_surrogate-gradient-functions
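A minimal Python sketch of these four surrogates, for experimentation outside of NEST (illustrative only, not part of this commit; the function names, default parameter values, and the NumPy dependency are assumptions):

.. code-block:: python

   import numpy as np

   def psi_piecewise_linear(v, v_th, beta=1.0, gamma=0.3):
       # 1. piecewise linear surrogate, cf. [1]
       return gamma / v_th * np.maximum(0.0, 1.0 - beta * np.abs((v - v_th) / v_th))

   def psi_exponential(v, v_th, beta=1.0, gamma=0.3):
       # 2. exponential surrogate, cf. [4]
       return gamma * np.exp(-beta * np.abs(v - v_th))

   def psi_fast_sigmoid_deriv(v, v_th, beta=1.0, gamma=0.3):
       # 3. derivative of a fast sigmoid, cf. [5]
       return gamma / (1.0 + beta * np.abs(v - v_th)) ** 2

   def psi_arctan(v, v_th, beta=1.0, gamma=0.3):
       # 4. arctan surrogate, cf. [6]
       return gamma / np.pi / (1.0 + (beta * np.pi * (v - v_th)) ** 2)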
In the interval between two presynaptic spikes, the gradient is calculated
at each time step until the cutoff time point. This computation occurs over
@@ -272,6 +303,27 @@
References
van Albada SJ, Plesser HE, Bolten M, Diesmann M. Event-based
implementation of eligibility propagation (in preparation)
.. start_surrogate-gradient-references
.. [3] Neftci EO, Mostafa H, Zenke F (2019). Surrogate Gradient Learning in
Spiking Neural Networks. IEEE Signal Processing Magazine, 36(6), 51-63.
https://doi.org/10.1109/MSP.2019.2931595
.. [4] Shrestha SB, Orchard G (2018). SLAYER: Spike Layer Error Reassignment in
Time. Advances in Neural Information Processing Systems, 31:1412-1421.
https://proceedings.neurips.cc/paper_files/paper/2018/hash/82f2b308c3b01637c607ce05f52a2fed-Abstract.html
.. [5] Zenke F, Ganguli S (2018). SuperSpike: Supervised Learning in Multilayer
Spiking Neural Networks. Neural Computation, 30:1514–1541.
https://doi.org/10.1162/neco_a_01086
.. [6] Fang W, Yu Z, Chen Y, Huang T, Masquelier T, Tian Y (2021). Deep residual
learning in spiking neural networks. Advances in Neural Information
Processing Systems, 34:21056–21069.
https://proceedings.neurips.cc/paper/2021/hash/afe434653a898da20044041262b3ac74-Abstract.html
.. end_surrogate-gradient-references
Sends
+++++
9 changes: 7 additions & 2 deletions models/eprop_iaf_adapt.h
@@ -110,8 +110,9 @@
voltage :math:`\psi_j^{t-1}` (the product of which forms the eligibility
trace :math:`e_{ji}^{t-1}`), and the learning signal :math:`L_j^t` emitted
by the readout neurons.
See the documentation on the :doc:`eprop_archiving_node<../models/eprop_archiving_node/>` for details on the surrogate
gradient functions.
.. include:: ../models/eprop_iaf.rst
:start-after: .. start_surrogate-gradient-functions
:end-before: .. end_surrogate-gradient-functions
In the interval between two presynaptic spikes, the gradient is calculated
at each time step until the cutoff time point. This computation occurs over
@@ -287,6 +288,10 @@
References
van Albada SJ, Plesser HE, Bolten M, Diesmann M. Event-based
implementation of eligibility propagation (in preparation)
.. include:: ../models/eprop_iaf.rst
:start-after: .. start_surrogate-gradient-references
:end-before: .. end_surrogate-gradient-references
Sends
+++++
11 changes: 8 additions & 3 deletions models/eprop_iaf_adapt_bsshslm_2020.h
@@ -113,15 +113,16 @@
voltage :math:`\psi_j^t` (the product of which forms the eligibility
trace :math:`e_{ji}^t`), and the learning signal :math:`L_j^t` emitted
by the readout neurons.
See the documentation on the :doc:`eprop_archiving_node<../models/eprop_archiving_node/>` for details on the surrogate
gradient functions.
.. math::
\frac{ \text{d} E }{ \text{d} W_{ji} } &= \sum_t L_j^t \bar{e}_{ji}^t \,, \\
e_{ji}^t &= \psi_j^t \left( \bar{z}_i^{t-1} - \beta \epsilon_{ji,\text{a}}^{t-1} \right) \,, \\
\epsilon^{t-1}_{ji,\text{a}} &= \psi_j^{t-1} \bar{z}_i^{t-2} + \left( \rho - \psi_j^{t-1} \beta \right)
\epsilon^{t-2}_{ji,\text{a}} \,. \\
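A minimal sketch of one step of this recursion, assuming scalar state variables (all names are illustrative, not part of this commit):

.. code-block:: python

   def eligibility_step(psi_t, z_bar_prev, eps_a_prev, beta, rho):
       # e_ji^t = psi_j^t * (z_bar_i^{t-1} - beta * eps_{ji,a}^{t-1})
       e_t = psi_t * (z_bar_prev - beta * eps_a_prev)
       # eps_{ji,a}^t = psi_j^t * z_bar_i^{t-1} + (rho - psi_j^t * beta) * eps_{ji,a}^{t-1}
       eps_a_t = psi_t * z_bar_prev + (rho - psi_t * beta) * eps_a_prev
       return e_t, eps_a_t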
.. include:: ../models/eprop_iaf.rst
:start-after: .. start_surrogate-gradient-functions
:end-before: .. end_surrogate-gradient-functions
The eligibility trace and the presynaptic spike trains are low-pass filtered
with the following exponential kernels:
@@ -257,6 +258,10 @@
References
van Albada SJ, Plesser HE, Bolten M, Diesmann M. Event-based
implementation of eligibility propagation (in preparation)
.. include:: ../models/eprop_iaf.rst
:start-after: .. start_surrogate-gradient-references
:end-before: .. end_surrogate-gradient-references
Sends
+++++
11 changes: 8 additions & 3 deletions models/eprop_iaf_bsshslm_2020.h
@@ -106,13 +106,14 @@
voltage :math:`\psi_j^t` (the product of which forms the eligibility
trace :math:`e_{ji}^t`), and the learning signal :math:`L_j^t` emitted
by the readout neurons.
See the documentation on the :doc:`eprop_archiving_node<../models/eprop_archiving_node/>` for details on the surrogate
gradient functions.
.. math::
\frac{ \text{d} E }{ \text{d} W_{ji} } &= \sum_t L_j^t \bar{e}_{ji}^t \,, \\
e_{ji}^t &= \psi^t_j \bar{z}_i^{t-1} \,, \\
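A minimal sketch of how this gradient could be accumulated over time steps, assuming an exponential low-pass filter with constant ``kappa`` (the exact kernels are given below; all names and the default value are assumptions, not part of this commit):

.. code-block:: python

   def weight_gradient(learning_signals, surrogates, z_bar_prev, kappa=0.97):
       # dE/dW_ji = sum_t L_j^t * e_bar_ji^t, with e_ji^t = psi_j^t * z_bar_i^{t-1}
       grad, e_bar = 0.0, 0.0
       for L_t, psi_t, z_prev in zip(learning_signals, surrogates, z_bar_prev):
           e_t = psi_t * z_prev         # eligibility trace at step t
           e_bar = kappa * e_bar + e_t  # low-pass-filtered eligibility trace
           grad += L_t * e_bar
       return grad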
.. include:: ../models/eprop_iaf.rst
:start-after: .. start_surrogate-gradient-functions
:end-before: .. end_surrogate-gradient-functions
The eligibility trace and the presynaptic spike trains are low-pass filtered
with the following exponential kernels:
@@ -242,6 +243,10 @@
References
van Albada SJ, Plesser HE, Bolten M, Diesmann M. Event-based
implementation of eligibility propagation (in preparation)
.. include:: ../models/eprop_iaf.rst
:start-after: .. start_surrogate-gradient-references
:end-before: .. end_surrogate-gradient-references
Sends
+++++
74 changes: 0 additions & 74 deletions nestkernel/eprop_archiving_node.h
@@ -34,80 +34,6 @@

namespace nest
{
/* BeginUserDocs: e-prop plasticity
Short description
+++++++++++++++++
Archiving node managing the history of e-prop variables.
Description
+++++++++++
The archiving node comprises a set of functions for writing the values of
the e-prop variables to history and retrieving them, as well as
functions to compute, for example, the firing rate regularization and the
surrogate gradient.
Surrogate gradient functions
++++++++++++++++++++++++++++
Surrogate gradients help overcome the challenge of the spiking function's
non-differentiability, facilitating the use of gradient-based learning
techniques such as e-prop. The non-existent derivative of the spiking
variable with respect to the membrane voltage,
:math:`\frac{\partial z^t_j}{ \partial v^t_j}`, can be effectively
replaced with a variety of surrogate gradient functions, as detailed in
various studies (see, e.g., [1]_). NEST currently provides four
different surrogate gradient functions:
1. A piecewise linear function used, among others, in [2]_:
.. math::
\psi_j^t = \frac{ \gamma }{ v_\text{th} } \text{max}
\left( 0, 1-\beta \left| \frac{ v_j^t - v_\text{th} }{ v_\text{th} }\right| \right) \,. \\
2. An exponential function used in [3]_:
.. math::
\psi_j^t = \gamma \exp \left( -\beta \left| v_j^t - v_\text{th} \right| \right) \,. \\
3. The derivative of a fast sigmoid function used in [4]_:
.. math::
\psi_j^t = \frac{ \gamma }{ \left( 1 + \beta \left| v_j^t - v_\text{th} \right| \right)^2 } \,. \\
4. An arctan function used in [5]_:
.. math::
\psi_j^t = \frac{\gamma}{\pi} \frac{1}{ 1 + \left( \beta \pi \left( v_j^t - v_\text{th} \right) \right)^2 } \,. \\
References
++++++++++
.. [1] Neftci EO, Mostafa H, Zenke F (2019). Surrogate Gradient Learning in
Spiking Neural Networks. IEEE Signal Processing Magazine, 36(6), 51-63.
https://doi.org/10.1109/MSP.2019.2931595
.. [2] Bellec G, Scherr F, Subramoney A, Hajek E, Salaj D, Legenstein R,
Maass W (2020). A solution to the learning dilemma for recurrent
networks of spiking neurons. Nature Communications, 11:3625.
https://doi.org/10.1038/s41467-020-17236-y
.. [3] Shrestha SB, Orchard G (2018). SLAYER: Spike Layer Error Reassignment in
Time. Advances in Neural Information Processing Systems, 31:1412-1421.
https://proceedings.neurips.cc/paper_files/paper/2018/hash/82f2b308c3b01637c607ce05f52a2fed-Abstract.html
.. [4] Zenke F, Ganguli S (2018). SuperSpike: Supervised Learning in Multilayer
Spiking Neural Networks. Neural Computation, 30:1514–1541.
https://doi.org/10.1162/neco_a_01086
.. [5] Fang W, Yu Z, Chen Y, Huang T, Masquelier T, Tian Y (2021). Deep residual
learning in spiking neural networks. Advances in Neural Information
Processing Systems, 34:21056–21069.
https://proceedings.neurips.cc/paper/2021/hash/afe434653a898da20044041262b3ac74-Abstract.html
EndUserDocs */

/**
* Base class implementing an intermediate archiving node model for node models supporting e-prop plasticity
* according to Bellec et al. (2020) and supporting additional biological features described in Korcsak-Gorzo,