IJCNN, July 27, 2004 [email protected] 1
Extending SpikeProp
Benjamin Schrauwen, Jan Van Campenhout
Ghent University, Belgium
Overview
● Introduction
● SpikeProp
● Improvements
● Results
● Conclusions
Introduction
● Spiking neural networks receive increasing attention:
  ● Biologically more plausible
  ● Computationally stronger (W. Maass)
  ● Compact and fast implementations possible in hardware (analogue and digital)
  ● Have a temporal nature
● Main problem: supervised learning algorithms
SpikeProp
● Introduced by S. Bohte et al. in 2000
● An error-backpropagation learning algorithm
● Only for SNNs using “time-to-first-spike” coding
[Figure: a stimulus of amplitude a is encoded as a single spike at time t ~ 1/a]
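The “t ~ 1/a” sketch above can be made concrete with a minimal encoder. The linear mapping and the `t_max` window below are illustrative assumptions, not details taken from the slides:

```python
import numpy as np

def time_to_first_spike(stimulus, t_max=10.0):
    """Encode analogue values as spike times: stronger input fires earlier.

    Hypothetical linear scheme, t = t_max * (1 - x) for x in [0, 1],
    mirroring the slide's 't ~ 1/a' intuition (spike time roughly
    inverse to activation).
    """
    x = np.clip(np.asarray(stimulus, dtype=float), 0.0, 1.0)
    return t_max * (1.0 - x)

times = time_to_first_spike([0.0, 0.5, 1.0])
# the strongest input fires first, the weakest last
```

Any monotonically decreasing map would serve the same purpose; the linear one just keeps the sketch simple.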
Architecture of SpikeProp
● Originally introduced by Natschläger and Ruf
● Every connection consists of several synaptic terminals
● All 16 synaptic terminals have enumerated delays (1-16 ms) and different weights; originally they all share the same filter
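A single connection in this architecture can be sketched as a bundle of delayed terminals. The alpha-shaped kernel and the time constant are illustrative assumptions; the enumerated 1-16 ms delays follow the slide:

```python
import numpy as np

# One SpikeProp "connection" from neuron i to neuron j is a bundle of
# K synaptic terminals, each with its own fixed delay (enumerated
# 1..16 ms in the original setup) and its own learnable weight.
K = 16
delays = np.arange(1, K + 1, dtype=float)   # enumerated delays, ms
rng = np.random.default_rng(0)
weights = rng.uniform(0.0, 1.0, size=K)     # one weight per terminal

def epsilon(s, tau=7.0):
    """Spike response kernel: alpha function, zero for s <= 0.
    The shape and tau = 7 ms are illustrative choices."""
    return np.where(s > 0, (s / tau) * np.exp(1.0 - s / tau), 0.0)

def psp(t, t_pre):
    """Post-synaptic potential at time t caused by one presynaptic
    spike at t_pre, summed over all K delayed terminals."""
    return float(np.sum(weights * epsilon(t - t_pre - delays)))
```

Because each terminal has a different delay, one presynaptic spike produces a whole family of shifted kernels, which is what the learning rule then weights.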
SRM neuron
● Modified Spike Response Model (Gerstner)
● The neuron reset is of no interest, because only one spike is needed
[Figure: membrane potential over time t, crossing the threshold once]
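The single-spike SRM can be sketched as follows; the kernel shape, time constant, and simulation step are illustrative assumptions:

```python
import numpy as np

def epsilon(s, tau=7.0):
    # alpha-shaped spike response kernel; tau is an illustrative choice
    return np.where(s > 0, (s / tau) * np.exp(1.0 - s / tau), 0.0)

def first_spike_time(pre_times, weights, theta=1.0, dt=0.01, t_end=50.0):
    """Membrane potential u(t) = sum_i w_i * eps(t - t_i); the neuron
    fires at the first threshold crossing. No reset term is modelled:
    with time-to-first-spike coding each neuron spikes at most once."""
    pre_times = np.asarray(pre_times, dtype=float)
    weights = np.asarray(weights, dtype=float)
    for t in np.arange(0.0, t_end, dt):
        if np.sum(weights * epsilon(t - pre_times)) >= theta:
            return t
    return None  # the neuron stays silent
```

The `None` case is exactly the “neuron stops firing” failure mode that the threshold-lowering mechanism later in the talk addresses.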
Idea behind SpikeProp
Minimize the sum-of-squares error (SSE) between the actual and the desired output spike times
Change each weight along the negative direction of the gradient
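These two steps can be sketched directly; the learning rate value is illustrative, and `grad` stands in for the derivative SpikeProp obtains by back-propagating through the spike times:

```python
import numpy as np

def sse(t_actual, t_desired):
    """Sum-of-squares error between actual and desired output spike times."""
    t_actual = np.asarray(t_actual, dtype=float)
    t_desired = np.asarray(t_desired, dtype=float)
    return 0.5 * np.sum((t_actual - t_desired) ** 2)

def gradient_step(w, grad, lr=0.01):
    """Move a weight along the negative gradient direction; grad = dE/dw."""
    return w - lr * grad
```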
Math of SpikeProp
Only the output layer is given here
Linearise around the threshold-crossing time
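The equations on this slide were lost in transcription. As a hedged reconstruction following Bohte et al.'s published SpikeProp derivation, the output-layer update obtained from this linearisation reads:

```latex
% Reconstruction (following Bohte et al. 2000), not taken verbatim from the slide:
% t_j^a actual and t_j^d desired spike time of output neuron j,
% y_i^k(t) the delayed, filtered contribution of terminal k from neuron i.
\Delta w_{ij}^{k} = -\eta \, y_i^{k}(t_j^{a}) \, \delta_j,
\qquad
\delta_j = \frac{t_j^{d} - t_j^{a}}
                {\sum_{i}\sum_{l} w_{ij}^{l}\,
                 \dfrac{\partial y_i^{l}(t_j^{a})}{\partial t}}
```

The denominator is where the linearisation around the threshold-crossing time enters: it approximates how the spike time shifts when the membrane potential changes.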
Problems with SpikeProp
● Overdetermined architecture
● Tendency to get stuck when a neuron stops firing
● Problems with weight initialisation
Solving some of the problems
● Instead of enumerating parameters, learn them:
  ● Delays
  ● Synaptic time constants
  ● Thresholds
● A much more limited architecture can then be used
● Add a specific mechanism to keep neurons firing: decrease the threshold
Learn more parameters
● Quite similar to the weight update rule
● Gradient of the error with respect to the parameter
● Parameter-specific learning rate
Math of the improvements - delays
The delta is the same as in the weight rule; thus the delta formula differs between the output layer and the inner layers.
What if training gets stuck?
● If one of the neurons in the network stops firing, the training rule stops working
● Solution: actively lower the threshold of a neuron whenever it stops firing (multiply by 0.9)
● Same as scaling all its weights up
● Improves convergence
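The rescue mechanism is a one-liner; the 0.9 factor comes from the slide, the function name is hypothetical:

```python
def rescue_silent_neuron(threshold, fired, factor=0.9):
    """If a neuron produced no spike on the current pass, lower its
    threshold (multiply by 0.9, as on the slide) so the training rule
    can act on it again. This is equivalent to scaling all of its
    incoming weights up by 1/0.9."""
    return threshold if fired else factor * threshold
```

Applied after every pass, repeated silence drives the threshold down geometrically until the neuron fires again.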
What about weight initialisation
● Weight initialisation is a difficult problem
● The original publication gives a vague description of the process
● S. M. Moore contacted S. Bohte personally to clarify the subject for his master's thesis
● Weight initialisation is done by a complex procedure
● Moore concluded that “weights should be initialized in such a way that every neuron initially fires, and that its membrane potential doesn't surpass the threshold too much”
What about weight initialisation
● In this publication we chose a very simple initialisation procedure:
  ● Initialise all weights randomly
  ● Afterwards, set one weight such that the sum of all weights equals 1.5
● Convergence rates could be increased by using a more complex initialisation procedure
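The simple procedure can be sketched as follows; the target sum of 1.5 comes from the slide, while the uniform range and seed are illustrative assumptions:

```python
import numpy as np

def init_weights(n, total=1.5, seed=0):
    """Draw all weights randomly, then adjust one weight so the
    sum equals `total` (1.5 on the slide)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-0.5, 0.5, size=n)
    w[-1] += total - w.sum()   # fix the last weight so the sum is exact
    return w
```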
Problem with large delays
• During testing of the algorithm a problem arose when the trained delays got very large: delay learning stopped
• If a terminal's delayed input arrives only after the neuron's own output spike, that input can no longer influence the spike time: problem
• Solved by constraining the delays
[Figure: output spike of the neuron preceding the delayed input of the neuron]
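Constraining the delays amounts to a clip after each update; the bounds below are illustrative (the original enumeration used 1-16 ms):

```python
import numpy as np

def constrain_delays(delays, d_min=0.0, d_max=16.0):
    """Clip learned delays to a valid range after every update, so a
    terminal's delayed input cannot drift past the point where the
    neuron's own output spike has already occurred."""
    return np.clip(delays, d_min, d_max)
```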
Results
● Tested on binary XOR (MSE = 1 ms)
● Bohte:
  ● 3-5-1 architecture
  ● 16 synaptic terminals
  ● 20*16 = 320 weights
  ● 250 training cycles
● Improvements:
  ● 3-5-1 architecture
  ● 2 synaptic terminals
  ● 20*2 = 40 weights
  ● 130 training cycles
  ● 90% convergence
● Improvements, smaller network:
  ● 3-3-1 architecture
  ● 2 synaptic terminals
  ● 12*2 = 24 weights
  ● 320 training cycles
  ● 60% convergence
Results
● Optimal learning rates (found by experiment)
● Some rates seem very high, but that is because the values we work with are times expressed in ms
● The idea that the learning rate must be approx. 0.1 is only correct when the inputs and weights are normalised!
Conclusions
● Because the parameters can be learned, no enumeration is necessary; thus the architectures are much smaller
● For XOR:
  ● 8 times fewer weights needed
  ● Learning converges faster (50% of the original)
  ● No complex initialisation functions
  ● Positive and negative weights can be mixed
  ● But convergence deteriorates when the weights are reduced further
Conclusions
● The technique has only been tested on a small problem; it should be tested on real-world applications
● But we are currently preparing a journal paper on a new backprop rule that:
  ● supports a multitude of coding hypotheses (population coding, convolution coding, ...)
  ● converges better
  ● has simpler weight initialisation
  ● ...