Transcript of: UNIVERSITÄT HEIDELBERG - EPFL · Boltzmann distribution over z ∈ {0,1} ...
Fast inference with
spiking networks
UNIVERSITÄT HEIDELBERG
Electronic Vision(s)
Kirchhoff Institute for Physics
University of Heidelberg
Mihai A. Petrovici
Computational Neuroscience Group
Department of Physiology
University of Bern
Probabilistic
computation
with spikes
Bishop (2006)
get an estimate of the distribution of the sought function (expected value and uncertainty)
Probabilistic (Bayesian) computing: motivation
Rabbit/duck ambiguity
Necker cube
Knill-Kersten illusion
Probabilistic (Bayesian) computing: experimental evidence
Berkes et al. (2011)
Starkweather et al. (2017)
Probabilistic (Bayesian) computing: experimental evidence
Deep generative models
Salakhutdinov & Hinton (2009)
Probabilistic (Bayesian) computing in machine learning
Neuromorphic hardware
Schemmel et al. (2010)
A system that performs probabilistic inference has to
represent probability distributions $p(z_1, z_2, \dots)$
calculate posterior (conditional) distributions $p(z_1, z_2, \dots \mid z_k, z_{k+1}, \dots)$
evaluate marginal distributions $p(z_1, z_2) = \sum_{z_3, z_4, \dots} p(z_1, z_2, z_3, \dots)$
Requirements for probabilistic inference
$p(\mathbf{z}) = \frac{1}{Z} \exp\left( \frac{1}{2} \mathbf{z}^\top W \mathbf{z} + \mathbf{b}^\top \mathbf{z} \right)$
analytic / parametric
full state space
sampling
Representation of probability distributions
temporal aspects:
increasingly correct representation
anytime computing
computational complexity aspects:
computation of conditionals is simple
marginalization is free
Sampling vs. parametric representation
Büsing et al. (2011), Petrovici & Bill et al. (2016)
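The contrast between the two representations can be sketched in a few lines: enumerate the full state space of a tiny Boltzmann distribution (the parametric/analytic route), then draw Gibbs samples from the same distribution and read off a marginal by simply counting, for free. The weights and biases below are invented for illustration.

```python
import itertools
import math
import random

# Hypothetical parameters for a 3-unit Boltzmann machine (illustration only).
W = [[0.0, 0.5, -0.3],
     [0.5, 0.0, 0.8],
     [-0.3, 0.8, 0.0]]
b = [-0.2, 0.1, 0.0]

def energy_exp(z):
    """exp(1/2 z^T W z + b^T z) for a binary state z."""
    quad = sum(0.5 * W[i][j] * z[i] * z[j] for i in range(3) for j in range(3))
    lin = sum(b[i] * z[i] for i in range(3))
    return math.exp(quad + lin)

# Analytic route: enumerate the full state space (exponential in size).
states = list(itertools.product([0, 1], repeat=3))
Z = sum(energy_exp(z) for z in states)
exact_marginal_z0 = sum(energy_exp(z) for z in states if z[0] == 1) / Z

# Sampling route: Gibbs sampling; marginals are "free" (just count samples).
def gibbs(n_sweeps, seed=42):
    rng = random.Random(seed)
    z = [0, 0, 0]
    count_z0 = 0
    for _ in range(n_sweeps):
        for k in range(3):
            u = sum(W[k][j] * z[j] for j in range(3)) + b[k]
            z[k] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-u)) else 0
        count_z0 += z[0]
    return count_z0 / n_sweeps

sampled = gibbs(50000)
print(exact_marginal_z0, sampled)  # the estimate converges to the exact value
```

Running the sampler longer gives an increasingly correct estimate, which is exactly the "anytime computing" property listed above.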
Spike-based encoding of an ensemble state
$z_k(t) = 1 \Leftrightarrow$ neuron $k$ has spiked in $[t - \tau, t)$
the spike pattern encodes the network state $\mathbf{z}(t)$
Boltzmann distribution over $z_k \in \{0, 1\}$:
$p(\mathbf{z}) = \frac{1}{Z} \exp\left( \frac{1}{2} \mathbf{z}^\top W \mathbf{z} + \mathbf{b}^\top \mathbf{z} \right)$
mediated by synaptic weights:
$u_k = \sum_{j=1}^{K} W_{kj} z_j + b_k$
Neural computability condition
$u_k = \log \frac{p(z_k = 1 \mid \mathbf{z}_{\setminus k})}{p(z_k = 0 \mid \mathbf{z}_{\setminus k})}$
(which is equivalent to a logistic activation function $p(z_k = 1 \mid \mathbf{z}_{\setminus k}) = \frac{1}{1 + \exp(-u_k)}$).
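The equivalence between the log-odds condition and the logistic activation can be verified exactly on a two-unit example (weights and biases below are arbitrary illustrative values): the conditional computed from the joint Boltzmann distribution coincides with the logistic function of the abstract membrane potential.

```python
import math

# Check the neural computability condition on a 2-unit Boltzmann machine:
# p(z_0 = 1 | z_1) computed from the joint equals sigma(u_0),
# where u_0 = W_01 * z_1 + b_0. Parameters are illustrative.
W = [[0.0, 1.2],
     [1.2, 0.0]]
b = [0.3, -0.5]

def p_unnorm(z):
    return math.exp(0.5 * sum(W[i][j] * z[i] * z[j]
                              for i in range(2) for j in range(2))
                    + sum(b[i] * z[i] for i in range(2)))

diffs = []
for z1 in (0, 1):
    # conditional from the joint distribution
    cond = p_unnorm((1, z1)) / (p_unnorm((0, z1)) + p_unnorm((1, z1)))
    # logistic function of the membrane potential
    u0 = W[0][1] * z1 + b[0]
    logistic = 1.0 / (1.0 + math.exp(-u0))
    diffs.append(abs(cond - logistic))

print(max(diffs))  # zero up to floating-point error
```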
Emulation of Boltzmann machines
BΓΌsing et al. (2011), Petrovici & Bill et al. (2016)
$z_k(t) = 1 \Leftrightarrow$ neuron $k$ has spiked in $[t - \tau, t)$
the spike pattern encodes the network state $\mathbf{z}(t)$
abstract neural sampling model, Büsing et al. (2011)
$p_\text{spike} \approx \mathrm{erf}\!\left( \alpha (u_\text{eff} - u_0) \right) \approx \sigma\!\left( \alpha (u_\text{eff} - u_0) \right)$
[Figure: free membrane potential distribution around $u_\text{thresh}$; predicted erf-shaped activation with a logistic fit used for the parameter transformation]
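How close the erf-shaped prediction and the fitted logistic really are can be checked numerically. The sketch below (with hypothetical fit parameters, not values from the talk) matches the slopes of the two sigmoids at the inflection point and measures their maximum deviation.

```python
import math

# Sketch: an error-function activation is well approximated by a logistic
# function. alpha and u0 are hypothetical parameters for illustration.
alpha, u0 = 1.0, 0.0

def p_spike_erf(u):
    return 0.5 * (1.0 + math.erf(alpha * (u - u0)))

# Match slopes at the inflection point:
# d/du of the erf form is alpha/sqrt(pi), of the logistic is beta/4,
# so beta = 4 * alpha / sqrt(pi).
beta = 4.0 * alpha / math.sqrt(math.pi)

def p_spike_logistic(u):
    return 1.0 / (1.0 + math.exp(-beta * (u - u0)))

max_dev = max(abs(p_spike_erf(u) - p_spike_logistic(u))
              for u in [i * 0.01 - 5.0 for i in range(1001)])
print(max_dev)  # small: the two sigmoids nearly coincide
```

This is why a logistic fit suffices for the parameter transformation: the residual of the fit is on the order of a percent over the whole activation range.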
Idea: Stochasticity by Poisson background
unfortunately, neurons are a bit more complicatedβ¦
noise source: Poisson spike trains
high background firing rates
relatively low synaptic weights
membrane as Ornstein-Uhlenbeck process:
$du(t) = \theta \left( \mu - u(t) \right) dt + \sigma \, dW(t)$
Ricciardi & Sacerdote (1979)
for COBA LIF:
$\theta = \frac{1}{\tau_\text{syn}}$
$\mu = \frac{I_\text{ext} + g_\text{l} E_\text{l} + \sum_i \nu_i w_i E_i^\text{rev} \tau_\text{syn}}{g_\text{tot}}$
$\sigma^2 = \frac{\sum_i \nu_i w_i^2 \left( E_i^\text{rev} - \mu \right)^2 \tau_\text{syn}}{2 \, g_\text{tot}^2}$
The diffusion approximation
Brunel & Sergi (1998)
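A minimal check of the OU picture: integrating the stochastic differential equation above with Euler-Maruyama and comparing the empirical stationary statistics against the analytical mean $\mu$ and variance $\sigma^2 / (2\theta)$. The parameter values are illustrative, not the COBA-LIF values from the talk.

```python
import math
import random

# Euler-Maruyama integration of the OU membrane equation
#   du = theta*(mu - u)*dt + sigma*dW
# with hypothetical parameters (not the COBA-LIF values from the talk).
theta, mu, sigma = 1.0, -2.0, 0.5
dt, n_steps = 0.01, 200000

rng = random.Random(1)
u = mu
samples = []
for step in range(n_steps):
    u += theta * (mu - u) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    if step > 1000:          # discard the initial transient
        samples.append(u)

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
# stationary statistics: mean -> mu, variance -> sigma^2 / (2*theta)
print(mean, var, sigma**2 / (2 * theta))
```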
$\mathrm{FPT}(\vartheta, 0) = \inf \{ t \geq 0 : X_t = \vartheta \mid X_0 = 0 \}$
$\langle T \rangle = \tau \sqrt{\pi} \int_{(\rho_\text{eff} - \mu)/\sigma}^{(\vartheta - \mu)/\sigma} \left[ 1 + \mathrm{erf}(x) \right] \exp(x^2) \, dx$
Thomas (1975)
assumption: $\tau_\text{syn} \ll \tau_\text{m}$
this allows an expansion in $\tau_\text{syn} / \tau_\text{m}$:
$\nu_k = \frac{1}{\tau_\text{ref} + \mathrm{FPT}(V_\text{th}, V_0)}$
First-passage-time calculations
however, remember that $\tau_\text{ref} \approx \tau_\text{syn}$: the membrane does not forget!
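The mean first-passage time of the OU process can be cross-checked by direct simulation against a Siegert-type quadrature (the erf-integral form above). All parameter values below are illustrative; the Monte-Carlo estimate carries discretization and sampling error, so agreement is only up to a few percent.

```python
import math
import random

# Mean first-passage time (FPT) of the OU membrane
#   du = theta*(mu - u) dt + sigma dW,  start at u0, absorbing threshold B.
# Parameters are illustrative, not taken from the talk.
theta, mu, sigma = 1.0, 0.0, 1.0
u0, B = 0.0, 1.0
s2 = sigma**2 / (2.0 * theta)            # stationary variance

# Siegert-type quadrature (midpoint rule):
# <T> = (sqrt(pi)/theta) * int e^{x^2} (1 + erf x) dx
# over x from (u0 - mu)/sqrt(2*s2) to (B - mu)/sqrt(2*s2).
a, b_lim = (u0 - mu) / math.sqrt(2 * s2), (B - mu) / math.sqrt(2 * s2)
n = 2000
h = (b_lim - a) / n
integral = sum(math.exp(x * x) * (1.0 + math.erf(x)) * h
               for x in [a + (i + 0.5) * h for i in range(n)])
t_theory = math.sqrt(math.pi) / theta * integral

# Direct Euler-Maruyama simulation of individual passages.
rng = random.Random(7)
dt, sq = 0.002, math.sqrt(0.002)

def fpt_trial():
    u, t = u0, 0.0
    while u < B:
        u += theta * (mu - u) * dt + sigma * sq * rng.gauss(0.0, 1.0)
        t += dt
    return t

t_mc = sum(fpt_trial() for _ in range(1000)) / 1000
print(t_theory, t_mc)   # agree up to discretization / sampling error
```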
Moreno-Bote & Parga (2004)
assumption: $\tau_\text{syn} \gg \tau_\text{m}$
then, the synaptic input appears quasistatic to the membrane:
$\nu(u) = \left[ \tau_\text{m} \ln \frac{\rho - u}{\vartheta - u} \right]^{-1}$
$\nu = \int_{\vartheta}^{\infty} \nu(u) \, p(u) \, du$
however, remember that $\tau_\text{ref} \approx \tau_\text{syn}$: the adiabatic approximation does not hold!
The adiabatic approximation
$P_n = \left( 1 - \sum_{k=1}^{n-1} P_k \right) \int_{V_\text{thr}}^{\infty} dV_{n-1} \, p(V_{n-1} \mid V_{n-1} > V_\text{thr}) \int_{-\infty}^{V_\text{thr}} dV_n \, p(V_n \mid V_{n-1})$
$P_n = P_{n-1} \int_{V_\text{thr}}^{\infty} dV_{n-1} \, p(V_{n-1} \mid V_{n-1} > V_\text{thr}) \int_{-\infty}^{V_\text{thr}} dV_n \, p(V_n \mid V_{n-1}) \, F_\text{FPT}(V_\text{thr}, V_n)$
$p(z_k = 1) = \frac{t_\text{refractory}}{t_\text{total}}$
$t_\text{total} = \sum_n P_n \left( n \, \tau_\text{ref} + T_n \right)$
$T_n = \int_{\vartheta}^{\infty} du_n \, \tau_\text{eff} \ln \frac{\bar{u} - \vartheta}{\bar{u} - u_n} \, p(u_n \mid u_n > \vartheta, u_{n-1})$
the HCS (high-conductance state)!
Petrovici & Bill et al. (2016)
The membrane autocorrelation propagation
Petrovici & Bill et al. (2016)
(Fully visible) LIF-based Boltzmann machines
Probst & Petrovici et al. (2015)
Beyond Boltzmann: Spiking Bayesian networks
Training of RBMs/DBMs on MNIST:
- maximum likelihood learning:
  $\Delta w_{ij} \propto \langle z_i z_j \rangle_\text{data} - \langle z_i z_j \rangle_\text{model}$
  $\Delta b_i \propto \langle z_i \rangle_\text{data} - \langle z_i \rangle_\text{model}$
- coupled adaptive tempering
96.9 % correct classification with less than 2000 neurons
Deep spiking discriminative architectures
Leng & Petrovici et al. (2016)
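The maximum-likelihood rule above can be seen in action on a small, fully visible Boltzmann machine, where the model expectations are exactly computable by enumeration. The toy data set is invented for illustration; the point is that the data-minus-model gradient drives the model moments onto the data moments.

```python
import itertools
import math

# Maximum-likelihood learning for a small fully visible Boltzmann machine:
# Delta w_ij ~ <z_i z_j>_data - <z_i z_j>_model (and likewise for biases).
# The toy data set is made up for illustration.
data = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 0), (1, 1, 1), (0, 1, 0)]
n = 3
states = list(itertools.product([0, 1], repeat=n))

W = [[0.0] * n for _ in range(n)]
b = [0.0] * n

def model_moments():
    ps = []
    for z in states:
        e = 0.5 * sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n)) \
            + sum(b[i] * z[i] for i in range(n))
        ps.append(math.exp(e))
    Z = sum(ps)
    first = [sum(p * z[i] for p, z in zip(ps, states)) / Z for i in range(n)]
    second = [[sum(p * z[i] * z[j] for p, z in zip(ps, states)) / Z
               for j in range(n)] for i in range(n)]
    return first, second

data_first = [sum(d[i] for d in data) / len(data) for i in range(n)]
data_second = [[sum(d[i] * d[j] for d in data) / len(data) for j in range(n)]
               for i in range(n)]

eta = 0.2
for _ in range(5000):
    first, second = model_moments()
    for i in range(n):
        b[i] += eta * (data_first[i] - first[i])
        for j in range(n):
            if i != j:
                W[i][j] += eta * (data_second[i][j] - second[i][j])

first, second = model_moments()
print(first, data_first)  # model moments converge onto the data moments
```

For hidden-layer models (RBMs/DBMs), the model term is no longer enumerable and is approximated by sampling, which is where techniques such as coupled adaptive tempering come in.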
Deep pong
Roth, Zenk (2017)
Deep spiking generative architectures
Leng & Petrovici et al. (2016)
Short-term plasticity enables superior mixing
Leng & Petrovici et al. (2016)
STP model: Tsodyks & Markram (1997)
[Figure: mixing comparison, Gibbs vs. LIF]
β¦ so where does the noise come from?
1st approximation: independent Poisson sources
unrealistic in both biological & artificial systems
better: common pool of presynaptic partners
correlated inputs
deviation from target distribution
Embedded stochastic inference machines
more realistic: sea of noise
Jordan et al. (2015)
Noiseless stochastic computation
ongoing work with Dominik Dold and Ilja Bytschok
Physical emulation
of spiking networks
Diesmann (2012)
simulation speed 1520:1
compared to biological real-time
                        nature          simulation
synaptic plasticity     seconds         hours
learning                days            years
development             years           millennia
evolution               > millennia     > millions of years
Simulation: size & time
1,000,000 times more energy-efficient
10,000 times less energy-efficient
BrainScaleS
Simulation & emulation: energy scaling
Hodgkin-Huxley model (electrophysiology):
$C \dot{u} = -g_L (u - E_L) - g_\text{syn} (u - E_\text{syn}) - g_\text{K} n^4 (u - E_\text{K}) - g_\text{Na} m^3 h \, (u - E_\text{Na})$
Adaptive exponential I&F model:
$C \dot{u} = -g_L (u - E_L) - g_\text{syn} (u - E_\text{syn}) + g_L \Delta_T \exp\!\left( \frac{V - V_T}{\Delta_T} \right) - w$
$\tau_w \dot{w} = a (V - E_L) - w$
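The AdEx equations above are simple enough to integrate directly. The sketch below uses forward Euler with the commonly cited Brette-Gerstner parameter set (illustrative values, not those configured on the hardware): without input the neuron is silent, with a strong current step it fires adapting spikes.

```python
import math

# Forward-Euler sketch of the AdEx equations from the slide.
# Parameter values follow the commonly used Brette & Gerstner set
# and are purely illustrative (pF, nS, mV, ms, pA).
C, g_L, E_L = 281.0, 30.0, -70.6
V_T, Delta_T = -50.4, 2.0
a, tau_w = 4.0, 144.0
V_spike, V_reset, b_w = 0.0, -70.6, 80.5

def run(I_ext, t_max=500.0, dt=0.01):
    V, w, spikes = E_L, 0.0, 0
    for _ in range(int(t_max / dt)):
        dV = (-g_L * (V - E_L)
              + g_L * Delta_T * math.exp((V - V_T) / Delta_T)
              - w + I_ext) / C
        dw = (a * (V - E_L) - w) / tau_w
        V += dV * dt
        w += dw * dt
        if V >= V_spike:        # spike-and-reset with spike-triggered adaptation
            V = V_reset
            w += b_w
            spikes += 1
    return spikes

print(run(0.0), run(1000.0))   # no input: silent; strong input: spiking
```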
Analog neuromorphic hardware
HICANN chip
Schemmel et al. (2010)
mixed-signal VLSI:
membrane analog
spikes digital
inherent speedup:
10³ – 10⁵
Waferscale integration BrainScaleS system
Schemmel et al. (2010)
4 million AdEx neurons, 1 billion conductance-based synapses, under construction
The Hybrid Modeling Facility in Heidelberg
Hardware is not softwareβ¦
Petrovici & Stöckel et al. (2015), Petrovici et al. (2017)
LIF sampling on accelerated hardware
bio time: 100 s
software simulation: 1 s
hardware emulation: 10 ms
Petrovici & Schröder et al. (2017)
Robustness from structure
Outlook / work in progress
spiking networks modeling magnetic systems
$p(\mathbf{z}) = \frac{1}{Z} \exp\left( \frac{1}{2} \mathbf{z}^\top W \mathbf{z} + \mathbf{b}^\top \mathbf{z} \right)$
$p(\boldsymbol{\sigma}) = \frac{1}{Z} \exp\left( \frac{1}{2} \boldsymbol{\sigma}^\top J \boldsymbol{\sigma} + \mathbf{h}^\top \boldsymbol{\sigma} \right)$
ongoing work with Andreas Baumbach
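The correspondence between the two distributions rests on the substitution $\sigma = 2z - 1$, which maps the Ising couplings onto Boltzmann parameters via $W = 4J$ and $b_i = 2h_i - 2\sum_j J_{ij}$. The sketch below verifies this by brute-force enumeration; the couplings $J$ and fields $h$ are arbitrary illustrative values.

```python
import itertools
import math

# Spin <-> binary mapping: sigma = 2z - 1 turns the Ising distribution
# p(sigma) ~ exp(1/2 sigma^T J sigma + h^T sigma) into a Boltzmann
# distribution over z in {0,1} with W = 4J, b_i = 2h_i - 2*sum_j J_ij.
# J and h are arbitrary illustrative couplings (symmetric, zero diagonal).
n = 3
J = [[0.0, 0.4, -0.2],
     [0.4, 0.0, 0.3],
     [-0.2, 0.3, 0.0]]
h = [0.1, -0.3, 0.2]

W = [[4 * J[i][j] for j in range(n)] for i in range(n)]
b = [2 * h[i] - 2 * sum(J[i]) for i in range(n)]

states = list(itertools.product([0, 1], repeat=n))

def dist(energy):
    ps = [math.exp(energy(z)) for z in states]
    Z = sum(ps)
    return [p / Z for p in ps]

def e_ising(z):
    s = [2 * zi - 1 for zi in z]
    return 0.5 * sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(n)) \
           + sum(h[i] * s[i] for i in range(n))

def e_boltz(z):
    return 0.5 * sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n)) \
           + sum(b[i] * z[i] for i in range(n))

p_ising, p_boltz = dist(e_ising), dist(e_boltz)
print(max(abs(x - y) for x, y in zip(p_ising, p_boltz)))  # ~0: identical
```

The constant offset generated by the substitution is absorbed into the partition function, which is why the normalized distributions coincide exactly.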
Ensemble dynamics
Carleo & Troyer (2017)
Quantum many-body problems
$\Delta w \propto -\frac{\partial}{\partial w} \frac{\langle \Psi_w \mid H \mid \Psi_w \rangle}{\langle \Psi_w \mid \Psi_w \rangle}$
Learning rules
• maximum likelihood learning:
  $\Delta w_{ij} \propto \langle z_i z_j \rangle_\text{data} - \langle z_i z_j \rangle_\text{model}$
  $\Delta b_i \propto \langle z_i \rangle_\text{data} - \langle z_i \rangle_\text{model}$
• backprop:
  $\Delta w_{ij} \propto -\frac{\partial E}{\partial w_{ij}} = -\frac{\partial E}{\partial o_j} \frac{\partial o_j}{\partial \text{net}_j} \frac{\partial \text{net}_j}{\partial w_{ij}}$
vs. target
ongoing work with Joao Sacramento and Walter Senn
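The backprop chain-rule factorization can be checked numerically on a single logistic unit: with $\text{net} = \sum_i w_i x_i$, $o = \sigma(\text{net})$ and $E = \frac{1}{2}(o - t)^2$, the product of the three partial derivatives must match a finite-difference gradient. Inputs, weights and target below are arbitrary illustrative numbers.

```python
import math

# Numerical check of the backprop chain rule on a single logistic unit.
# Inputs, weights and target are arbitrary illustrative numbers.
x = [0.5, -1.2, 0.8]
w = [0.3, -0.7, 0.2]
t = 1.0

def sigma(a):
    return 1.0 / (1.0 + math.exp(-a))

def loss(weights):
    net = sum(wi * xi for wi, xi in zip(weights, x))
    return 0.5 * (sigma(net) - t) ** 2

net = sum(wi * xi for wi, xi in zip(w, x))
o = sigma(net)

# dE/dw_i = (dE/do) * (do/dnet) * (dnet/dw_i)
grad = [(o - t) * o * (1.0 - o) * xi for xi in x]

# Compare against central finite differences.
eps = 1e-6
pairs = []
for i in range(3):
    wp = list(w); wp[i] += eps
    wm = list(w); wm[i] -= eps
    pairs.append((grad[i], (loss(wp) - loss(wm)) / (2 * eps)))

print(pairs)  # analytic and numerical gradients agree
```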
References
• Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
• Starkweather, C. K., Babayan, B. M., Uchida, N., & Gershman, S. J. (2017). Dopamine reward prediction errors reflect hidden-state inference across time. Nature Neuroscience, 20(4), 581-589.
• Berkes, P., Orbán, G., Lengyel, M., & Fiser, J. (2011). Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science, 331(6013), 83-87.
• Salakhutdinov, R., & Hinton, G. (2009). Deep Boltzmann machines. In Artificial Intelligence and Statistics (pp. 448-455).
• Schemmel, J., Brüderle, D., Grübl, A., Hock, M., Meier, K., & Millner, S. (2010). A wafer-scale neuromorphic hardware system for large-scale neural modeling. In Proceedings of the 2010 IEEE International Symposium on Circuits and Systems (ISCAS) (pp. 1947-1950). IEEE.
• Buesing, L., Bill, J., Nessler, B., & Maass, W. (2011). Neural dynamics as sampling: a model for stochastic computation in recurrent networks of spiking neurons. PLoS Computational Biology, 7(11), e1002211.
• Petrovici, M. A., Bytschok, I., Bill, J., Schemmel, J., & Meier, K. (2015). The high-conductance state enables neural sampling in networks of LIF neurons. BMC Neuroscience, 16(1), O2.
• Petrovici, M. A., Bill, J., Bytschok, I., Schemmel, J., & Meier, K. (2016). Stochastic inference with spiking neurons in the high-conductance state. Physical Review E, 94(4), 042312.
• Ricciardi, L. M., & Sacerdote, L. (1979). The Ornstein-Uhlenbeck process as a model for neuronal activity. Biological Cybernetics, 35(1), 1-9.
• Brunel, N., & Sergi, S. (1998). Firing frequency of leaky integrate-and-fire neurons with synaptic current dynamics. Journal of Theoretical Biology, 195(1), 87-95.
• Moreno-Bote, R., & Parga, N. (2006). Auto- and crosscorrelograms for the spike response of leaky integrate-and-fire neurons with slow synapses. Physical Review Letters, 96(2), 028101.
• Probst, D., Petrovici, M. A., Bytschok, I., Bill, J., Pecevski, D., Schemmel, J., & Meier, K. (2015). Probabilistic inference in discrete spaces can be implemented into networks of LIF neurons. Frontiers in Computational Neuroscience, 9.
• Leng, L., Petrovici, M. A., Martel, R., Bytschok, I., Breitwieser, O., Bill, J., ... & Meier, K. (2016). Spiking neural networks as superior generative and discriminative models. Cosyne Abstracts, Salt Lake City, USA.
• Jordan, J., Tetzlaff, T., Petrovici, M., Breitwieser, O., Bytschok, I., Bill, J., ... & Diesmann, M. (2015). Deterministic neural networks as sources of uncorrelated noise for probabilistic computations. BMC Neuroscience, 16(Suppl 1), P62.
• Diesmann, M. (2013). The road to brain-scale simulations on K. BioSupercomputing Newsletter, 8(8).
• Petrovici, M. A.*, Stöckel, D.*, Bytschok, I., Bill, J., Pfeil, T., Schemmel, J., & Meier, K. (2015). Fast sampling with neuromorphic hardware. Advances in Neural Information Processing Systems (NIPS).
• Petrovici, M. A., Schroeder, A., Breitwieser, O., Grübl, A., Schemmel, J., & Meier, K. (2017). Robustness from structure: Inference with hierarchical spiking networks on analog neuromorphic hardware. arXiv preprint arXiv:1703.04145. (To appear in Proceedings of the ISCAS 2017.)
• Petrovici, M. A., Schmitt, S., Klähn, J., Stöckel, D., Schroeder, A., Bellec, G., ... & Güttler, M. (2017). Pattern representation and recognition with accelerated analog neuromorphic systems. arXiv preprint arXiv:1703.06043. (To appear in Proceedings of the IJCNN 2017.)
• Carleo, G., & Troyer, M. (2017). Solving the quantum many-body problem with artificial neural networks. Science, 355(6325), 602-606.