Synaptic processing unit final year project - anthony hsiao
Imperial College London
Department of Electrical and Electronic Engineering
Final Year Project Report 2007
Project Title: The Synaptic Processing Unit
Student: Anthony Hsiao
Course: 4T
Project Supervisor: Dr. George Constantinides
Second Marker: Professor Alessandro Astolfi
Abstract
A small but growing community of engineers and scientists around the world is
breaking new ground in the field of Neuromorphic Engineering, succeeding in
designing ever more complex brain-inspired artificial neural systems and
implementing them in low power analogue VLSI silicon chips.
A recently proposed synapse model, the binary cascade synapse, has memory
properties that are superior to those of other comparable models, and it is suitable
for implementation in digital hardware. Recent efforts have succeeded in designing
FPGA implementations of these binary cascade synapses, but failed to fit a
usefully large number of them onto one single chip.
This project develops the FPGA implementation of binary cascade synapses
further by radically changing the digital architecture, essentially
designing a microprocessor that processes cascade synapses. This processor is called
the Synaptic Processing Unit (SPU), and the prototype implementation can currently
host up to 8192 cascade synapses.
This report describes the development of the SPU, which necessitated the
development of a novel learning rule alongside it, called Spike Timing and Activity
Dependent Plasticity (STADP), and presents a characterisation of this learning rule.
Both the SPU and the learning rule are implemented on an FPGA and evaluated
in-circuit.
Then, as an ultimate test, the SPU was used together with an aVLSI neuron
chip to form a neural system with binary cascade synapses, and was given a real
classification task, in which it was taught to classify two greyscale images. The
system does successfully classify the two images, which is a very
encouraging result.
To the best of the author's knowledge, the SPU presented here is the first
hardware implementation of its kind in the world with such a large number of
synapses.
The Synaptic Processing Unit Anthony Hsiao
Acknowledgements
Thank you to all those people who have helped me get this far, both
academically and otherwise, and to those that accompanied me along the way.
In particular, I would like to thank Dylan Muir at the Institute of
Neuroinformatics for supervising my project, and being there whenever I
needed help, especially during the crazy hours before the FPGA decided to
take a holiday in the US.
I would also like to thank Dr. George Constantinides at Imperial College
London for supervising my project and Prof. Alessandro Astolfi for second
marking it.
More words of thanks go to Prof. Alessandro Astolfi for coordinating my
exchange to ETH Zurich, and for being patient when necessary and laid-back
whenever possible.
Thank you Stefano Fusi, one of the most impressive characters I met at the
Institute, for giving me initial feedback and coming up with the basis for what
later became STADP.
Special thanks to Sungdo Choi and Daniel Fasnacht for all the help and
support with the hardware and infrastructure; my computer was not struck by
a particle from space, it turned out.
Special thanks to Johanna von Lindeiner for good nights on the bench, and
the many inspiring exchanges. I actually mean it!
A very special thank you goes out to Pantha Roy, who is just amazing. Thanks
for the good times, and for attempting to save me from becoming a social
recluse during the final few weeks of this project.
An equally special thank you goes out to Siddharta Jha, another amazing
character. Thank you for all those discussions and creative breaks, which
really enriched my time at the institute.
A massive thank you to a fellow brother in work, Christopher Maltby, for
enduring all those long days and longer nights of work with me. As you know,
without your company, I would not have been able to get any work done, let
alone finish.
I would like to thank my parents, Wendy and Tien-Wen, for their unconditional
support and for opening so many doors for me. Without your efforts and
sacrifices, I would not be where I am today, and would probably not get
wherever I will get in five, ten years!
Finally, I would like to thank Dylan Muir again, because I am actually very
grateful for all the help! Without your razor-sharp brain lobes and your
patience and support, I would not have been able to achieve half of what I
managed to do!
Table of contents
1 Introduction
1.1 What is neuromorphic engineering?
1.2 The topic of this project
1.3 Aims
1.4 Further report structure
2 Background
2.1 Of brains, neurons and synapses
2.2 Synaptic plasticity at the heart of learning in neural systems
2.3 The cascade synapse model
2.4 Previous work
2.5 Overview of the hardware environment
2.5.1 Silicon neurons
2.5.2 Silicon synapses
2.5.3 Communication using AER
2.5.4 The FPGA board
2.5.5 Software
2.5.6 Finally…
3 STADP – a novel Hebbian learning rule
3.1 STADP – yet another learning rule?
3.1.1 From spike time to spike rate
3.2 Characteristics of STADP
4 Design
4.1 Summary of features of the Synaptic Processing Unit
4.2 System level design
4.2.1 The SPU in a neural system
4.2.2 Input and output ports
4.3 Virtualising the cascade synapse
4.4 SPU internal addressing
4.5 Modular design of the SPU
4.6 Module specifications
4.6.1 Forwarding
4.6.2 Learning rule (STADP)
4.6.3 Cascade process
4.6.4 Cascade memory
4.6.5 Global signals
5 Implementation
5.1 Pseudo-random number generators
5.2 Description of generics
5.3 Module level design
5.3.1 Spike forwarding
5.3.2 Learning rule (STADP)
5.3.3 Cascade synapse
5.3.4 Cascade memory
5.3.5 Signal selector
5.4 System integration
5.5 Integration into the FPGA board
5.5.1 On clocks
6 Verification
7 Evaluation & experimentation
7.1 In-hardware characterisation of STADP
7.2 Modifications for the experimental setup
7.3 Circuit calibration
7.4 In-circuit verification
7.4.1 Forwarding
7.4.2 Potentiation
7.4.3 Depression
7.5 A real classification task
7.5.1 From image to pre-synaptic stimuli
7.5.2 Teaching methods
7.5.3 Results – normal teaching
7.5.4 Results – bottom-up teaching
7.5.5 Remarks on the classification experiments
8 Discussion
8.1 The hardware
8.2 STADP
8.3 The classification task
8.4 Calibration of the neural system
9 Conclusion
9.1 Refinements
10 References
10.1.1 Web references
10.1.2 Datasheets and reference books
11 Appendix I – Supplementary files
12 Appendix II – Verification checklists
12.1 Module level verification
12.2 System level verification
13 Appendix III – A journey through the SPU
13.1 Pre-synaptic spike
13.2 Post-synaptic spike
14 Appendix IV – Design hierarchy of source files
List of figures
Figure 1: Image output of a silicon retina
Figure 2: Neurons of the world.
Figure 3: Action potentials (spikes) are commonly described by three properties:
Figure 4: Action potentials of the world.
Figure 5: CGI of a synapse with pre- and post-synaptic neurons.
Figure 6: Micrograph of a synapse taken at the University of St. Louis.
Figure 7: Different forms of synaptic plasticity
Figure 8: Schematic of a cascade model of synaptic plasticity.
Figure 9: Initial signal-to-noise ratio as a function of memory lifetime, from [1].
Figure 10: Circuit diagram of an ultra low power integrate & fire neuron.
Figure 11: Circuit diagram of the so-called diff-pair integrator (DPI) synapse.
Figure 12: Prototype FPGA board developed by Daniel Fasnacht.
Figure 13: Experimental hardware setup
Figure 14: STADP
Figure 15: The STADP mechanism.
Figure 16: Simulated behaviour of STADP.
Figure 17: System level interaction of SPU and aVLSI neuron chip
Figure 18: Bit representation of cascade synapses
Figure 19: SPU internal addressing format
Figure 20: Conceptual architecture of the SPU
Figure 21: A hybrid cellular automata linear array
Figure 22: Conventions on the arrows used in block diagrams
Figure 23: Spike forwarding module block diagram
Figure 24: STADP learning rule block diagram
Figure 25: Initialisation of DELTA_T look-up table.
Figure 26: Flow diagram of the cascade synapse's state update rule
Figure 27: Cascade module block diagram
Figure 28: Cascade memory block diagram
Figure 29: Input source selector block diagram
Figure 30: Pipelined SPU block diagram
Figure 31: Pipelined dataflow through the SPU
Figure 32: Block diagram of the integration of the SPU within the FPGA board
Figure 33: Comparison of DELTA_T_LUT content for 5 kHz and 90 MHz.
Figure 34: Simulated hardware behaviour of STADP at 5 kHz simulation clock frequency.
Figure 35: Frequency response of the neural system.
Figure 36: Oscilloscope screenshot of post-synaptic membrane potential:
Figure 37: Example of a coherent 30 Hz Poisson spike train to all 256 synapses.
Figure 38: Oscilloscope screenshot of post-synaptic membrane potential:
Figure 39: In-circuit verification of potentiation.
Figure 40: In-circuit verification of depression.
Figure 41: Oscilloscope screenshot of decreasing post-synaptic firing rate:
Figure 42: Using pictures as pre-synaptic stimuli.
Figure 43: Spike trains derived from 16x16 pixel greyscale images of Anthony and Dylan.
Figure 44: Conceptual procedure of a real classification task.
Figure 45: Classification task: teach Dylan, show Dylan first, at 22 Hz.
Figure 46: Classification task: teach Dylan, show Anthony first, at 22 Hz.
Figure 47: Classification task: teach Dylan, show Dylan first, at 25 Hz.
Figure 48: Classification task: teach Dylan, show Anthony first, at 25 Hz.
Figure 49: Classification task: teach Anthony, show Anthony first, at 22 Hz.
Figure 50: Classification task: teach Anthony, show Dylan first, at 22 Hz.
Figure 51: Classification task: teach Anthony, show Anthony first, at 25 Hz.
Figure 52: Classification task: teach Anthony, show Dylan first, at 25 Hz.
Figure 53: Classification task: bottom-up teaching Dylan, at 50 Hz.
Figure 54: Classification task: bottom-up teaching Dylan, at 70 Hz.
Figure 55: Classification task: bottom-up teaching Dylan, for 2 s at 50 Hz.
Figure 56: Classification task: bottom-up teaching Anthony, at 50 Hz.
Figure 57: Classification task: bottom-up teaching Anthony, at 70 Hz.
Figure 58: Classification task: bottom-up teaching Anthony, for 2 s at 50 Hz.
Figure 59: Expected effects on a synapse
Figure 60: Pre-synaptic spike arrives at SPU.
Figure 61: Valid pre-synaptic spike gets forwarded, after two clock delays
Figure 62: Valid pre-synaptic spike generates a plasticity event.
Figure 63: Cascade synapse changes in operation
Figure 64: Plasticity events
Figure 65: Valid post-synaptic spike arrives at SPU
Figure 66: Post-synaptic spike does not get forwarded
Figure 67: Post-synaptic spike sets post-synaptic expiry time.
1 Introduction
‘The brain – that’s my second most favourite organ!’ – Woody Allen
Solving the mystery of how the human brain works and computes will be one of
the most significant discoveries in the history of science. A profound understanding
of our most important organ (bar Woody Allen…) will have significant implications
for healthcare, psychology and ethics, as well as for computing, robotics and artificial
intelligence. Visionaries such as Ray Kurzweil go as far as predicting that, before the
middle of the 21st century, humans and machines will be able to merge in a way
never seen before, as brain interfaces enable users to bridge the gap between the real
and virtual worlds to a level where the distinction between ‘real’ and ‘not real’ might
lose its importance. Artificial systems would reach computational powers that
match those of the human brain, only to surpass them a few years later.
Most people find it difficult to imagine such scenarios, especially since even the most
powerful computers to date, which can perform billions of operations per second,
cannot reproduce some of the computational magic that human brains perform on a
day-to-day basis, such as pattern recognition or visual processing. ‘Intelligent’ and
‘interactive’ systems are neither intelligent nor interactive, and the most advanced robots
in the world are no match for a young child when it comes to performing motor tasks
or recognition. The thought of ever meeting a machine with intelligence, humour or an
opinion goes far beyond what most people think their computers will ever be able to
do.
Such future scenarios have been the topic of several books and films, and are
portrayed as horror scenarios more often than not, ignoring many of the potential
opportunities that such a future could bear. Without attempting to make any
qualifying judgments, it should be noted that change happens, whether it is welcome
or not.
This change could well be initiated by a small but growing community of engineers
and scientists, driven by impressive advances in neuroscience, who are making
significant progress in copying neuronal organisation and function into artificial
systems. The secret to the human brain’s superior abilities appears to reside in how
the brain organises its slow-acting electrical and chemical components: neurons,
the basic computational units of the brain, and synapses, the interfaces
between neurons, whose rich dynamics allow neurons to form interconnected
neural circuits. Researchers sometimes speak of ‘morphing’ these structures of
neural connections into silicon circuits, creating neuromorphic microchips. If
successful, this work could lead to implantable silicon retinas for the blind or sound
processors for the deaf that last for 30 years on a single nine-volt battery, or to
low-cost, highly effective visual, audio or olfactory recognition chips for robots and other
smart machines. The long-term goal is to engineer ever more complex artificial
systems with ever richer behaviour and, ultimately, to construct an artificial
brain.
1.1 What is neuromorphic engineering?
The term neuromorphic was coined by Carver Mead in the late 1980s to describe
Very Large Scale Integration (VLSI) systems containing analogue electronic circuits
that mimic neuro-biological architectures present in the nervous system.
Neuromorphic Engineering is a new interdisciplinary field that takes inspiration from
biology, physics, mathematics and engineering to design analogue, digital or
mixed-mode analogue/digital VLSI artificial neural systems. These include vision systems,
head-eye systems, auditory processors and autonomous robots, whose physical
architecture and design principles are based on those of biological nervous systems.
Although the field of neuromorphic engineering is still relatively new, impressive and
encouraging results have already been achieved. Systems ranging from ‘simple’ chips with
silicon neurons or synapses [13] to more complex systems such as a silicon retina or
cochlea [13] have been demonstrated.
Figure 1: Image output of a silicon retina
Showing the head of a person at the Brains in Silicon Lab at Stanford University.
1.2 The topic of this project
This project focuses on one aspect of neuromorphic systems that is at the heart of
some of the dynamics of neural networks, namely synapses. Fusi et al. have
demonstrated that ordinary bounded synapse models can have devastating
effects on memory in scenarios with ongoing modifications, and proposed a new
synapse model, the binary Cascade Synapse [1], which outperforms ordinary (binary)
synapse models in several respects [9].
The nature of the Cascade Synapse makes it convenient to implement in digital
hardware rather than analogue VLSI, and it would be useful to augment existing
neuromorphic neuron chips with Cascade Synapse functionality. Such a neural
system could then act as one single entity in a larger multi-chip environment.
Previous efforts have successfully designed individual cascade synapses and
implemented a small number – eight, to be precise – of them on an FPGA; however,
in order to perform useful computation in a reasonably sized neural system, a massive
up-scaling of the number of synapses on one chip is necessary. In order to augment a
typical aVLSI neuron chip with cascade synapse functionality, any number upwards
of 4000 synapses would be desirable, or rather, necessary.
One way of doing this is to fundamentally change the way cascade synapses are
implemented on the FPGA, an approach referred to as virtualisation: rather than having a
number of fixed hardware cascade synapses, which uses logic resources inefficiently, an
abstraction of each synapse could be stored in memory, and only retrieved, processed
and stored back on demand. Since memory, unlike logic, is generally cheap and abundant
in digital circuits, this Synaptic Processing Unit (SPU) can potentially allow for
a very large scale implementation of cascade synapses on one single FPGA.
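The virtualisation idea can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the VHDL design of the SPU: the state encoding (`pack`, `unpack`), the cascade depth `NUM_LEVELS` and the transition probabilities in `update` are all hypothetical and chosen only to keep the example short. Only the capacity of 8192 synapses is taken from the report.

```python
import random

NUM_SYNAPSES = 8192   # capacity of the prototype, as quoted in the abstract
NUM_LEVELS = 4        # assumed cascade depth (hypothetical)
LEVEL_BITS = 2        # bits needed to encode NUM_LEVELS cascade levels


def pack(strong, level):
    """Encode one synapse as a small integer: efficacy bit above the level bits."""
    return (int(strong) << LEVEL_BITS) | level


def unpack(word):
    """Decode a stored word back into (strong, level)."""
    return bool(word >> LEVEL_BITS), word & ((1 << LEVEL_BITS) - 1)


def update(strong, level, potentiate, rng):
    """Single shared update logic: a simplified, illustrative cascade-style rule."""
    if potentiate != strong:
        # Event opposes the current efficacy: switch with probability 0.5**level,
        # so deeper states are harder to overturn.
        if rng.random() < 0.5 ** level:
            return potentiate, 0
    elif level < NUM_LEVELS - 1:
        # Event agrees with the current efficacy: possibly move one level deeper.
        if rng.random() < 0.5 ** (level + 1):
            return strong, level + 1
    return strong, level


def process_event(memory, address, potentiate, rng):
    """Fetch one synapse from memory, process it, and store it back."""
    strong, level = unpack(memory[address])
    strong, level = update(strong, level, potentiate, rng)
    memory[address] = pack(strong, level)


# All synapses start weak at level 0; only the addressed one is ever touched.
memory = [pack(False, 0)] * NUM_SYNAPSES
rng = random.Random(0)
process_event(memory, 42, potentiate=True, rng=rng)
```

The point of the sketch is the division of labour: the update logic exists once, while the number of synapses is set only by the size of `memory`. In this hypothetical encoding each synapse occupies three bits, so scaling up costs memory rather than logic.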
1.3 Aims
1. To develop a Synaptic Processing Unit based on an FPGA that implements a
large number of cascade synapses
2. To integrate the SPU with an aVLSI neuron chip to form a working neural
system
3. To demonstrate the capabilities of the neural system by performing a real
classification task
1.4 Further report structure
This report is written for the scientifically and technically minded reader, with
background knowledge of the concepts of electronic engineering, and is further
structured as follows:
2. Background
This chapter attempts to brief the reader on all the necessary interdisciplinary
background knowledge required for this project. In particular, it outlines some of
the relevant biology and neuroscience, explains the binary cascade model used in
more detail and describes the hardware and infrastructure environment the SPU
will be working in.
3. STADP – a novel Hebbian learning rule
This chapter will argue the case for developing a new learning rule called STADP,
and describe how it works. It will also present an initial characterisation of the
learning rule derived from simulation.
4. Design
This chapter starts by providing a summary of the features of the SPU, to allow
the reader to get a first impression. Then, it outlines the high level design and
argues for the system architecture used. It finishes by giving a set of specifications
for a modular implementation of the design.
5. Implementation
This chapter starts by going off on a tangent, diving into the realm of random
number generators. Then, it describes how the specifications given in the previous
chapter were implemented in each module, and how the SPU integrates within
the FPGA and its environment.
6. Verification
This is a very short chapter, which only outlines the efforts undertaken to
verify the design and implementation. It does not reproduce the
verification efforts themselves.
7. Evaluation & experimentation
This is one of the key chapters and describes all the in-circuit verification and
experimentation that has been carried out. Furthermore, it explains the real
classification task given to the neural system, and presents the results.
8. Discussion
This chapter discusses the evaluation and experimentation results, tries to
make general statements about the operation of the SPU, and draws conclusions about
the success of the classification task itself.
9. Conclusion
This chapter wraps up the report, and includes the conclusions derived from the
work presented here. It objectively assesses advantages and disadvantages of the
SPU, and suggests further improvements or changes to the system that might be
worthwhile.
10. References
This chapter lists the sources that have been referred to while writing the report
as well as sources that have been used throughout the design and implementation
of the SPU.
11. Appendices
There are four appendices: Appendix I with a list of supplementary Matlab files
used throughout the project, Appendix II with a copy of the checklist used for
verification, Appendix III with screenshots of waveforms showing the journey of a
pre- and a post-synaptic spike through the SPU, and finally Appendix IV, listing
the design hierarchy of the VHDL source files used.
2 Background
‘If the human brain were so simple that we could understand it, we would be so
simple that we couldn't’ – Emerson M. Pugh
2.1 Of brains, neurons and synapses
When IBM’s Deep Blue supercomputer beat then world chess champion Garry
Kasparov during their rematch in 1997, it did so by means of sheer brute force and
computational power. The machine evaluated some 200 million potential board
moves a second, whereas Kasparov considered at most three each second
(see the web references, Section 10.1.1). But despite Deep Blue’s victory (in fact, Kasparov
won the first match against Deep Blue the year before, and IBM refused to agree to a
third ‘deciding’ match [21]),
computers are no real competition for the human brain in areas such as vision,
hearing, pattern recognition, and learning, not to mention their inability to display
creativity, humour or emotions. And when it comes to operational efficiency, there is
no contest at all. A typical room-size supercomputer weighs roughly 1,000 times
more, occupies 10,000 times more space and consumes a millionfold more power
than does the neural tissue that makes up the brain [22].
Clearly, computers and brains are fundamentally different, both in terms of
architecture and performance. Table 1 summarises key differences between
brains and (conventional) computers.
| | Processing elements | Element size | Energy use | Speed | Style of computation | Fault tolerant |
|---|---|---|---|---|---|---|
| Brain | ~10^11 neurons, ~10^14 synapses | 10^-6 m | 30W | 100Hz | Parallel, distributed, memory at computation | Yes |
| PC | 10^9 transistors | 10^-6 m | 30W (CPU) | 10^9 Hz + | Serial, centralized, memory distant to computation | No |
Table 1: A comparison between computers and brains
At the most basic cellular level, brains consist of a vast number of brain cells, an
estimated 100 billion of them, called neurons. These are also believed to constitute
the basic building blocks of computation within the central nervous system, and are
in many ways analogous to logic gates in digital electronics. The brain's network of
neurons forms a massively parallel information processing system.
While there are a large number of different types of neurons, each with different
functions and morphologies, most neurons are typically composed of a soma, or cell
body, a dendritic tree and an axon, as shown in Figure 2.
Figure 2: Neurons of the world.
There are many different types of neurons, each with different morphologies and functions, which are found in different parts of brains. Image courtesy of G. Indiveri
One of the most important properties of a neuron is its membrane potential, the
potential difference across the cell membrane, which is used to communicate
between neurons. A complicated molecular mechanism that stems from the cell’s
highly complex membrane can give rise to so-called action potentials or spikes, which
are a sharp increase followed by an equally sharp drop in the membrane potential
within a few ms. A neuron receives inputs, i.e. spikes, from other neurons, typically
many thousands, on its dendritic tree, and integrates them (approximately) on its
membrane potential. Once the membrane potential exceeds a certain threshold, the
neuron generates a spike which travels from the body down the axon, commonly
described as the output of a neuron, to the next neuron(s) (or other receptors). This
spiking event is also called depolarization, and is followed by a refractory period,
during which the neuron is unable to fire. The membrane potential of a spiking
neuron is shown in Figure 3, conceptually, while Figure 4 shows some measurements
of real action potentials of the world. Typically, neurons fire at rates between 0Hz
and about 100Hz, and both the precise timing of individual spikes and the firing rates
of neurons are believed to play an important role in neural communication and
computation.
Figure 3: Action potentials (spikes) are commonly described by three properties:
Pulse width, firing rate or inter-spike-interval and refractory period. Courtesy of Giacomo Indiveri.
Figure 4: Action potentials of the world.
Courtesy of Giacomo Indiveri, modified by Anthony Hsiao
The axon endings of neurons almost touch the dendrites or cell body of the next
neuron. The gap between two neurons is a specialized structure called a synapse and is
the point of transmission of spikes from the pre-synaptic neuron to the post-synaptic
neuron, as shown in Figure 5 and Figure 6. This transmission is effected by
neurotransmitters, chemicals which are released from the pre-synaptic neuron upon
depolarization, which bind to receptors in the post-synaptic neuron, thereby
advancing the depolarisation of it. Most synapses are excitatory, i.e. they increase the
depolarisation of the post-synaptic neuron, although there are so-called inhibitory
synapses (with inhibitory neurotransmitters), which render a post-synaptic neuron less
excitable. The human brain is estimated to have a vast 10^14 synapses.
The extent to which a spike from one neuron is transmitted on to the next, the
synaptic efficacy or weight, depends on many factors, such as the amount of
neurotransmitter available or the number and arrangement of receptors, and is not
constant, but changes over time. This property is called synaptic plasticity, and it is
this variable synaptic strength, that is believed to give rise to both memory and
learning capabilities, which makes it particularly interesting to study synapses!
Figure 5: CGI of a Synapse with pre- and post-synaptic neurons.
Excerpt of the 2005 Winner of the Science and Engineering Visualisation Challenge. By G. Johnson. Medical Media, Boulder, CO
Figure 6: Micrograph of a Synapse taken at the University of St. Luis.
In the center of the image is the Synaptic Cleft, which separates the pre- (top) and post-synaptic neuron (bottom). The pre-synaptic neuron has clearly visible vesicles which contain neurotransmitters. Upon pre-synaptic depolarisation, these neurotransmitters are released and diffuse across the synaptic
cleft, to be received by receptors on the post-synaptic neuron, advancing its depolarisation.
Scientists have developed various models of the underlying molecular mechanisms of
synaptic plasticity, describing it to good levels of accuracy; however it is important to
appreciate that there are details of synaptic plasticity which are still the subject of
ongoing research.
2.2 Synaptic plasticity at the heart of learning in neural systems
There are several underlying mechanisms that cooperate to achieve synaptic plasticity,
including changes in the quantity of neurotransmitter released into a synapse and
changes in how effectively cells respond to those neurotransmitters [7]. As memories
are believed to be represented by vastly interconnected networks of synapses in the
brain, synaptic plasticity is one of the important neuro-chemical foundations of
learning and memory. The strengthening of a synapse, Long-Term Potentiation (LTP),
and the weakening of a synapse, Long-Term Depression (LTD), are widely considered
to be the major mechanisms by which learning happens and memories are stored in the
brain.
Many models of learning assume some kind of activity based plasticity, whereby an
increase in synaptic efficacy arises from the pre-synaptic cell's repeated and persistent
stimulation of the post-synaptic cell. These kinds of learning rules are commonly
referred to as Hebbian learning rules, popularly summarised as ‘What fires together,
wires together’.
Another particularly prominent experimentally observed form of long term plasticity
is called Spike-Timing Dependent Plasticity (STDP), and depends on the relative
timing of pre- and post-synaptic action potentials. If a pre-synaptic spike is succeeded
quickly by a post-synaptic spike, then there appears to exist some kind of causality
since the pre-synaptic neuron has contributed to the depolarization of the post-
synaptic neuron, and they should be connected more strongly, by potentiating the
synapse. Conversely, if a pre-synaptic spike is directly preceded by a post-synaptic
spike, their connection should be weakened, and the synapse gets depressed.
Different forms of observed plasticity that can be described by STDP are shown in
Figure 7.
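The pair-based timing rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name stdp_signal and the 20 ms pairing window are assumptions made here for illustration, not values taken from this report or from the cited literature.

```python
def stdp_signal(t_pre, t_post, window=0.020):
    """Return the plasticity signal for one pre/post spike pair.

    t_pre, t_post: spike times in seconds.
    window: hypothetical maximum separation (s) for the pair to count.
    """
    dt = t_post - t_pre
    if 0 < dt <= window:
        return "potentiate"   # pre shortly before post: causal pairing
    if -window <= dt < 0:
        return "depress"      # post shortly before pre: anti-causal pairing
    return None               # simultaneous spikes or spikes too far apart
```

Real STDP curves, such as those in Figure 7, also scale the amount of modification with the size of the time difference; the sketch only keeps the sign.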
Figure 7: Different forms of synaptic plasticity
The amount (qualitatively) and type of synaptic modification evoked by repeated pairing of pre- and post-synaptic action potentials in different preparations. The horizontal axis is the difference t_pre - t_post of these spike times. Results are shown for slice recordings of different neurons. Without going into unnecessary detail, the important point to note is that different forms of plasticity exist. Figure from Abbott & Nelson 2000.
Several other models of synaptic plasticity exist, ranging over several levels of
complexity and biological plausibility. Each has its advantages and disadvantages,
proposing different mechanisms of synaptic plasticity, trying to explain different
types of experimentally observed plasticity. Other global regulatory processes of
learning, such as synaptic scaling or synaptic redistribution are thought to be
necessary alongside activity based learning rules [5].
While learning rules and models of synaptic plasticity attempt to describe the
mechanism by which synaptic plasticity is generated, different models of synapses
themselves exist, which can vary greatly in the way they respond to ‘plasticity signals’.
2.32.32.32.3 The cascade synapse modelThe cascade synapse modelThe cascade synapse modelThe cascade synapse model
Storing memories of ongoing, everyday experiences requires a high degree of
synaptic plasticity, while retaining these memories demands protection against
changes induced by further activity and experiences. Models in which memories are
stored through switch-like transitions in synaptic efficacy are good at storing but bad
at retaining memories if these transitions are likely, and they are poor at storage but
good at retention if they are unlikely [1]. In order to address this dilemma, Fusi et al.
developed the model of binary cascade synapses, which combines high levels of
memory storage with long retention times and significantly outperforms conventional
models [9].
They consider the case of binary synapses, i.e. a synapse with only two efficacies, (for
example potentiated and depressed, weak or strong), which is not implausible, since
biological synapses have been reported to display binary states of efficacy as well [2].
The structure of a binary cascade model is shown in Figure 8, specifying two
independent dimensions for each synapse. Just like ordinary models of binary
synapses, a binary cascade synapse can be in one of two states of efficacy, weak or
strong, but while ordinary models only allow one fixed value of plasticity, cascade
synapses possess a cascade of n states with varying degrees of plasticity,
implementing metaplasticity (i.e. the plasticity of plasticity). Ongoing plasticity then
corresponds to transitions of a synapse between states characterized by different
degrees of plasticity, rather than (only) different synaptic strengths.
Figure 8: Schematic of a Cascade Model of Synaptic Plasticity.
Courtesy of Stefano Fusi. There are two levels of synaptic strength, weak (yellow) and strong (blue), denoted by + and -. Associated with these strengths is a cascade of n states (n = 5 in this case). Transitions between state i of the cascade of any strength and state 1 of the opposite strength take place with probability q_i, corresponding to conventional synaptic plasticity. Transitions with probabilities p_i^± link the states within the respective cascade (downward arrows), corresponding to metaplasticity.
Binary cascade synapses can respond to any learning rule with binary plasticity
signals, i.e. signals that are either ‘potentiate’ or ‘depress’, and respond to them
stochastically; plasticity signals are only responded to with a given probability which
is determined by the state along the cascade the synapse is in. So it is the varying
probability of responding to plasticity signals that implements the different degrees of
plasticity described above.
In the highest state (state 1 of the cascade in Figure 8), the probability of responding
to a plasticity event is 1, and it decreases for states further down the cascade, where
the synapse becomes less plastic. In the model analysed by Fusi, the plasticity actually
halves for every state down the cascade, i.e. there is a 50% chance of responding to a
plasticity signal in the second state, 25% in the third, and so forth.
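As a small worked example, the halving of the response probability can be written as 0.5 to the power of (state - 1). The helper name response_probability is made up for this sketch; the numbering of states from 1 (most plastic) follows Figure 8.

```python
def response_probability(state):
    """Probability that a synapse in cascade state `state` (1 = most plastic)
    responds to a plasticity signal, halving with every step down."""
    if state < 1:
        raise ValueError("cascade states are numbered from 1")
    return 0.5 ** (state - 1)

# Print the probabilities for a cascade of n = 5 states, as in Figure 8.
for i in range(1, 6):
    print(i, response_probability(i))
```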
A cascade synapse can respond to plasticity events in two ways, depending on
whether it already has the ‘right’ efficacy, referred to as switching and chaining. If it
switches, then it is changing efficacy, i.e. from weak to strong, or vice versa. If a
synapse switches, it will always make a transition to state 1, i.e. the most plastic state,
of the opposite cascade, regardless of what state it was in before. In Figure 8, these
transitions are represented by the arrows between the two cascades, with plasticity
probabilities given by q_i. If the synapse chains, i.e. it already has the right efficacy,
then it is moving down one state in the cascade, thereby reducing (halving) its
plasticity probability, becoming less plastic. In Figure 8, this is represented by the
downward arrows connecting consecutive states within each cascade, with plasticity
probabilities given by p_i^±.
Thus, cascade synapses can respond to ongoing modifications by reducing their
plasticity, thereby ‘reassuring’ their state of efficacy. Another way of looking at it is
that synaptic efficacies and their degree of plasticity are dependent on the history of
the synapses and the plasticity signals they received.
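The switching and chaining behaviour described above can be summarised as a small state machine. The following Python sketch is a behavioural illustration only and is not the VHDL design of this project. It assumes that switching and chaining both use the halving probability 0.5 ** (state - 1), and that a synapse in the deepest state simply stays there when it would otherwise chain. The class and method names are hypothetical.

```python
import random

class CascadeSynapse:
    """Binary cascade synapse following the verbal description above.

    Assumption: switching and chaining both use the halving probability
    0.5 ** (state - 1); the deepest state cannot chain any further.
    """

    def __init__(self, n_states=5, strong=False, state=1, rng=None):
        self.n_states = n_states
        self.strong = strong          # efficacy: False = weak, True = strong
        self.state = state            # 1 = most plastic, n_states = least plastic
        self.rng = rng or random.Random()

    def receive(self, potentiate):
        """Apply one binary plasticity signal; return 'switch', 'chain' or None."""
        if self.rng.random() >= 0.5 ** (self.state - 1):
            return None               # signal ignored
        if potentiate != self.strong:
            self.strong = potentiate  # switch efficacy ...
            self.state = 1            # ... and restart at the most plastic state
            return "switch"
        if self.state < self.n_states:
            self.state += 1           # chain: same efficacy, less plastic
            return "chain"
        return None                   # already in the deepest state
```

A weak synapse that repeatedly receives 'potentiate' signals first switches to strong and then gradually moves down the strong cascade, so that a later 'depress' signal is less and less likely to undo the change.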
Fusi et al. compare the performance of cascade synapses with that of ordinary binary
synapses in terms of the strength of an initial memory trace, the initial signal-to-noise
ratio, as well as the average memory lifetime, the point at which this signal-to-noise
ratio becomes equal to 1, for both synapse models (it is worthwhile to reiterate
that it was this trade-off, ability to store memories easily vs. retaining them for a long
time, that originally led them to develop the cascade synapse model in the first place).
They find that cascade models arrive at a better compromise, storing new memories
more easily and faithfully, yet retaining them for a longer period of time, as shown in
Figure 9. Without going into unnecessary detail (the interested reader is advised to
consult [1] for more information), they find that the better performance of cascade
synapses stems from the fact that they experience power-law forgetting, unlike ordinary
binary synapses, which experience exponentially fast decay of their memories.
Figure 9: Initial signal-to-noise ratio as a function of memory lifetime, from [1].
The initial signal-to-noise ratio of a memory trace stored using 10^5 synapses plotted against the memory lifetime (in units of 1 over the rate of candidate plasticity events). The blue (lower) curve is for a binary model with synaptic modification occurring with probability q that varies along the curve. The red (upper) line applies to the cascade model described by Fusi et al. The two curves have been normalised so that the binary model with q = 1 gives the same result as the n = 1 cascade model to which it is identical. Clearly, the cascade model performs better than the ‘normal’ binary model both in terms of initial signal-to-noise ratio and memory lifetime.
In summary, binary cascade synapses outperform their ‘ordinary counterpart’ in terms
of memory storage and retention, which derives from the more complex structure
allowing the synapse to respond to ongoing modifications along two dimensions –
efficacy and metaplasticity. It is desirable to implement these nice properties into real
hardware, and previous attempts have already laid good groundwork for that.
2.4 Previous work
This project mainly builds on two previous projects. The first one, titled ‘A
stochastic synapse for reconfigurable hardware’, a short project during the Telluride
workshop for Neuromorphic Engineering by Dylan Muir [15], laid the groundwork
for both the following and this project. In particular, it succeeded in creating a first
VHDL implementation of the cascade synapse and verified its operation in
simulations. One of the biggest contributions of this project is the design of one
particular type of pseudo-random number generator, the Hybrid Cellular Automata
array pseudo-random number generator, which also found extensive use in this
current project. However, no actual hardware was synthesised from the digital design.
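To give an idea of how a hybrid cellular automaton can produce pseudo-random bits, the following Python sketch models a one-dimensional array in which each cell follows either rule 90 or rule 150. This is a common construction for such generators and is assumed here; the rule assignment, array width, seed and output tap used in [15] and in this project are not stated at this point, so all of them are hypothetical in the sketch.

```python
def hca_step(cells, use_rule_150):
    """Advance a one-dimensional hybrid cellular automaton by one clock tick.

    cells: list of 0/1 cell values.
    use_rule_150: list of booleans, True where a cell follows rule 150
                  (left XOR self XOR right), False for rule 90 (left XOR right).
    Cells beyond either end are treated as 0 (null boundary).
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left = cells[i - 1] if i > 0 else 0
        right = cells[i + 1] if i < n - 1 else 0
        bit = left ^ right
        if use_rule_150[i]:
            bit ^= cells[i]
        nxt.append(bit)
    return nxt

def hca_bits(seed, use_rule_150, count, tap=0):
    """Yield `count` pseudo-random bits by reading one cell on every tick."""
    cells = list(seed)
    for _ in range(count):
        cells = hca_step(cells, use_rule_150)
        yield cells[tap]
```

Both rules only need XOR gates between neighbouring cells, which is one reason why this kind of generator maps well onto FPGA logic. Note that the all-zero state is a fixed point, so the seed must contain at least one 1.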
The second project, ‘A VHDL implementation of the Cascade Synapse Model’, a
diploma project by Tobias Kringe [16], succeeded in designing and implementing a
small array of cascade synapses onto an FPGA. The operation of the digital cascade
synapses was verified both in simulation and in hardware, and encouraging results
were achieved in confirming the complex behaviour of the cascade synapse (which is
why this current project will not focus on reproducing and re-verifying the properties
of hardware implemented cascade synapses). However, the VHDL implementation
was rather large, and only a small number of synapses could be implemented onto
the FPGA. It was Tobias Kringe who proposed to virtualise the cascade synapses
(which is one of the aims of this current project) in order to realise a useful number of
synapses onto one FPGA. Due to the radically different architecture of the virtualised
synapses to the static hardware synapses, next to none of his VHDL implementation
was reused.
To the best of the knowledge of the author, there has been no other working
hardware implementation of a large number of cascade synapses (in fact, of any
number of synapses) to date.
2.5 Overview of the hardware environment
Neuromorphic aVLSI hardware commonly comprises low power analogue CMOS
circuits operating in the subthreshold regime, that mimic (morph) the properties of
real neural systems and elements. In particular, a neuromorphic aVLSI neuron chip
was used, which comprised an array of leaky Integrate & Fire (I&F) silicon neurons
with Diff-Pair Integrator (DPI) synapses. Communication to the outside world was
done using the asynchronous Address Event Representation (AER) protocol. The
FPGA is sitting on an FPGA board developed at the Institute of Neuroinformatics in
Zurich.
2.5.1 Silicon neurons
There are different types of silicon neurons, such as conductance based models which
aim to map molecular conductance mechanisms underlying neuron behaviour in
detail into analogue electronic circuits, or more qualitative models such as the I&F
neuron model, which merely implements the observed characteristics of neuron
behaviour into silicon, such as integration, firing or the refractory period.
The aVLSI chip used in this project contained 128 I&F neurons similar to the circuit
depicted in Figure 10. Qualitatively, this I&F circuit works by integrating input
current from on-chip synapses on its membrane, and elicits a (voltage) spike if the
membrane voltage crosses a firing threshold.
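The qualitative behaviour of an I&F neuron (integrate, fire, refractory period) can be illustrated with a simple software model. The following Python sketch is a generic leaky integrate-and-fire simulation and not a model of the actual circuit in Figure 10. All parameter values (capacitance, leak, threshold, refractory period) are invented for illustration.

```python
def simulate_lif(input_current, dt=1e-4, c_mem=1e-9, g_leak=5e-8,
                 v_thresh=1.0, v_reset=0.0, t_refractory=2e-3):
    """Integrate an input current trace (A) and return the spike times (s).

    All parameter values are illustrative and not taken from the chip.
    """
    v = v_reset
    refractory_left = 0.0
    spike_times = []
    for step, i_in in enumerate(input_current):
        if refractory_left > 0:
            refractory_left -= dt          # neuron cannot integrate or fire
            continue
        v += dt * (i_in - g_leak * v) / c_mem  # leaky integration on C_mem
        if v >= v_thresh:
            spike_times.append(step * dt)  # threshold crossed: emit a spike
            v = v_reset
            refractory_left = t_refractory
    return spike_times
```

With these made-up parameters, a constant input of 100 nA makes the model fire regularly, while 40 nA is not enough to reach the threshold because the leak balances the input first.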
Figure 10: Circuit diagram of an ultra low power Integrate & Fire Neuron.
Labelled functional circuit elements mimic the behaviour of real neurons. Transistors operate in the sub-threshold regime to exploit their desirable exponential characteristics. A capacitor C_mem integrates incoming post-synaptic current into a membrane voltage V_mem. If the membrane potential crosses the spiking threshold, it will ‘spike’ just like a real neuron. Courtesy of Giacomo Indiveri.
2.5.2 Silicon synapses
Each I&F neuron has 32 silicon synapses with different properties and behaviour
connected to it, but only one type of synapse was used in this project, namely the
static DPI synapse. The circuit of such a synapse is depicted in Figure 11.
Qualitatively, the DPI synapse works by receiving a (voltage) spike from a pre-synaptic
neuron (or from the outside world) and, in response, injecting a given amount of
current onto the membrane of the post-synaptic neuron it is connected to.
The amount of current produced by every incoming spike is dependent on the static
synaptic weight and the time constant of the synapse, which can be adjusted to
achieve the desired static synaptic weight.
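A very simplified way of picturing the role of the weight and the time constant is to treat the synapse as a first-order low-pass filter of the incoming spikes. The Python sketch below makes exactly that assumption: every spike adds a current step equal to the weight, which then decays exponentially with the time constant tau. It is not derived from the circuit in Figure 11, and the default values are hypothetical.

```python
import math

def dpi_current(spike_times, t, weight=1e-9, tau=0.010):
    """Post-synaptic current (A) at time t for a list of input spike times (s).

    Assumption: every spike adds `weight` amperes, which then decay
    exponentially with time constant `tau`.
    """
    return sum(weight * math.exp(-(t - ts) / tau)
               for ts in spike_times if ts <= t)
```

In this picture, a larger weight gives a larger current per spike, and a larger tau makes each spike contribute charge for longer.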
Figure 11: Circuit diagram of the so-called Diff-Pair Integrator (DPI) synapse.
For every pre-synaptic spike it receives, it dumps a post-synaptic current onto the membrane of the post-synaptic neuron connected to it. The amount of current, and other dynamics, can be set by
parameters such as the synaptic weight, the time constant tau or the threshold voltage.
2.5.3 Communication using AER
The Address Event Representation (AER) protocol is used to allow for
communication in multi-chip environments. It is a serial asynchronous four-phase
handshaking protocol (using request-acknowledge signals) which encodes events (i.e.
spikes) of individual neurons by assigning each neuron a unique address (up to
16 bits). Every time a neuron fires, it generates an address event, which is then
transmitted over the AER bus to receiving hardware. Unlike conventional electronic
systems with arrays of information sources, such as digital cameras, neuromorphic
systems using the AER protocol do not scan through every one of their elements to
transmit one frame after another; rather, information is transmitted on demand.
An address event is transmitted only if a neuron spikes. One of the most important
points about the AER protocol is its asynchrony, whereby the precise timing of the
address event implicitly encodes the time of the spike itself, so there is no need to
communicate timestamps for individual spikes.
Conveniently, since electronic circuits implementing neuromorphic hardware are very
fast, while neural activity is rather slow (<100Hz), a large number of neurons can
share the same AER bus without problem. Typically, an AER bus would have a
bandwidth of about 1Mevent/second.
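The idea of address events and the bus-sharing argument can be made concrete with a short Python sketch. The function names are invented for this example, and the handshaking itself is not modelled. The sketch merges per-neuron spike times into one time-ordered stream of addresses and works out how many neurons firing at the maximum rate fit onto a bus with the bandwidth quoted above.

```python
def to_address_events(spike_trains):
    """Merge per-neuron spike times into one time-ordered stream of addresses.

    spike_trains: dict mapping a neuron address (0..65535) to its spike times (s).
    Returns a list of (time, address) tuples; on a real AER bus only the
    address is sent and the time is implicit in when the event occurs.
    """
    events = []
    for address, times in spike_trains.items():
        if not 0 <= address < 2 ** 16:
            raise ValueError("address does not fit into 16 bits")
        events.extend((t, address) for t in times)
    return sorted(events)

def max_neurons_on_bus(bus_events_per_s=1e6, max_rate_hz=100.0):
    """Neurons that can share the bus if all fire at the maximum rate."""
    return int(bus_events_per_s // max_rate_hz)
```

With 1 Mevent/second and neurons firing at no more than 100Hz, max_neurons_on_bus() gives 10,000 neurons, which is far more than the 128 neurons on the chip used here.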
2.5.4 The FPGA board
The FPGA used in this project is a Xilinx Spartan 3 (xc3s400pq208) that sits on a
prototype FPGA board developed by Daniel Fasnacht during his diploma project at
the Institute of Neuroinformatics in Zurich, depicted in Figure 12. Features used in
this project are the USB interface and the two AER ports (one input, one output). It
has an external clock of 106.125MHz, and is programmed using JTAG.
Apart from developing the board itself, Daniel Fasnacht further developed a Linux
driver to allow communication with the USB board. A program developed by
Giacomo Indiveri is used to send data to the FPGA board. In particular, pre-synaptic
spikes are sent through the USB bus to the SPU by specifying a synapse address and
an inter-spike interval to the previous spike, data which is easily generated using the
Spiking neuron toolbox¹ in Matlab. The aVLSI neuron chip is configured using Matlab².
¹ Developed by Dylan Muir at the Institute of Neuroinformatics.
² To set up the environment variable for the aVLSI chip in Matlab: chipinit.m. To load the required
calibration settings to the chip: bias_050607.m
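The stimulus format described above (a synapse address together with the inter-spike interval to the previous spike) can be illustrated with a short conversion routine. The Python sketch below is not the program used in this project. It assumes that the interval is measured to the previous spike in the merged stream rather than per synapse, and that the first spike is timed from t = 0; the function name is hypothetical.

```python
def to_isi_stimulus(events):
    """Convert (time, synapse_address) events into (address, isi) pairs.

    Assumption: the interval is measured to the previous spike in the merged
    stream (not per synapse), and the first spike is timed from t = 0.
    """
    stimulus = []
    previous = 0.0
    for t, address in sorted(events):
        stimulus.append((address, t - previous))
        previous = t
    return stimulus
```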
It should be noted that this is a prototype board, and with experimental or prototype
hardware, extra care should be taken, since not all functions necessarily
work as expected. However, seeing experimental hardware work and become
‘alive’ is one of the most gratifying moments of hardware development.
In the experimental setup used for the classification task (as described in Section 7.5,
‘A real classification task’) the FPGA board interfaces with an aVLSI ‘IFSLTWA’ neuron
chip, using the AER connections to send address events to, and receive feedback from,
the neurons. Figure 13 illustrates this experimental setup.
Figure 12: Prototype FPGA board developed by Daniel Fasnacht.
1. Xilinx Spartan 3 (xc3s400pq208) 2. USB port 3. AER-out port 4. AER-in port
Figure 13: Experimental hardware setup.
1. FPGA SPU 2. Forward AER connection 3. aVLSI chip with array of I&F neurons 4. Oscilloscope measuring the post-synaptic membrane potential 5. post-synaptic feedback AER connection (with logic
analyzer) 6. pre-synaptic stimuli input USB connection.
2.5.5 Software
Throughout this project, three software packages were used: Xilinx ISE 9.1i
Webpack to code the VHDL design, Modelsim PE Student Edition to simulate the VHDL
code, and Matlab for various tasks, including plotting, initialization file generation,
analysis and spike train generation.
A project diary was kept on GoogleDocuments.
3 STADP – a novel Hebbian learning rule
‘The illiterate of the 21st century will not be those who cannot read and write, but
those who cannot learn, unlearn, and relearn’ – Alvin Toffler
In the previous section, the general concept of synaptic plasticity was introduced.
While different learning rules have been proposed, for the task at hand, keeping in
mind that the Synaptic Processing Unit is to be tested on a real classification task, it is
necessary to implement a learning rule that is both suitable for the learning task in a
general environment, as well as easily implemented into digital hardware. There are
several learning rules out there that would be interesting to implement, most
prominently STDP, amongst others [18], [3], [20], but none really meets the needs
of this project.
From [19] and [20], it was concluded that ordinary STDP would not be sufficient as a
general learning rule. Instead, the system would either have to be taught with
specifically crafted and highly correlated temporal patterns (not a general
environment), or a more elaborate version of STDP would have to be constructed,
which is impractical for the implementation, both in terms of hardware real estate
(memory in particular, but also logic) and circuit complexity. Prototype designs for
STDP were rejected on the basis of it requiring excessive memory and
overcomplicating the digital circuit.
Instead, a novel but very simple, easily implemented learning rule was developed
together with [20], called Spike-Timing and Activity Dependent Plasticity (STADP),
which produces simple binary plasticity events, depress and potentiate, as required by
the binary cascade synapse model.
3.1 STADP – Yet another learning rule?
At the heart of STADP is the same Hebbian learning paradigm, that ‘what fires
together, wires together’. Unlike STDP, which derives the causality for ‘firing
together’ from the difference in spike times, STADP uses a mixture of firing time and
firing rate based measures to determine whether the pre- and post-synaptic neurons
‘fire together’.
As the name suggests, STADP produces plasticity signals depending on spike timing
as well as activity. In particular, it is dependent on the state of activity of the post-
synaptic neuron, and the timing of pre-synaptic spikes.
STADP states that the post-synaptic neuron can be in one of two states at any point in
time: active and inactive. This state is determined by a threshold function of the
post-synaptic firing frequency: if it is above a mean firing rate fm, the neuron is said to
be active, otherwise it is inactive. For example, a setup of aVLSI I&F neurons could have
a mean firing rate fm = 50Hz, which is biologically plausible, and be said to be active
for firing rates above 50Hz, and inactive for firing rates below 50Hz.
Then, two neurons are said to ‘fire together’ if a pre-synaptic spike arrives while the
post-synaptic neuron is active, and the synapse should be potentiated (LTP). The
reverse is also true, i.e. when a pre-synaptic spike arrives at the synapse while the
post-synaptic neuron is inactive, then the synapse should be depressed (LTD).
However, this scheme would result in one plasticity signal for every pre-synaptic
spike, so in order to condition the number of plasticity signals produced, STADP is
stochastic, and only produces potentiation or depression signals with a certain
probability, called the probability of plasticity, p(plasticity). Figure 14 below
summarises how STADP produces plasticity events.
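The decision made for every incoming pre-synaptic spike can be sketched as follows. This Python function only illustrates the rule as described in the text and is not the hardware implementation; the function name and the default value of p(plasticity) are hypothetical.

```python
import random

def stadp_signal(post_active, p_plasticity=0.1, rng=random):
    """Plasticity signal generated by one incoming pre-synaptic spike.

    post_active: True if the post-synaptic neuron is currently 'active'.
    p_plasticity: hypothetical value for p(plasticity).
    """
    if rng.random() >= p_plasticity:
        return None                      # most pre-synaptic spikes cause no event
    return "potentiate" if post_active else "depress"
```

The returned 'potentiate' and 'depress' signals are exactly the kind of binary plasticity signals that a binary cascade synapse expects as input.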
Figure 14: STADP
Plasticity events are elicited with a probability p(plasticity), and depend on the spike time of the pre-synaptic, and the activity of the post-synaptic neuron.
3.1.1 From spike time to spike rate
The two state abstraction of the post-synaptic neuron’s activity essentially requires an
integration of its spike-times to produce spike rates. However, integration of spikes
arriving at irregular intervals into spike rates can be a non-trivial task in real time
processing in digital hardware (it would be very easy in analogue electronics
actually!). In STADP, this is elegantly performed using a stochastic process, inspired
by quantum physics [20]. The main idea behind this is that the post-synaptic neuron
is in an unknown state of activity until it gets ‘measured’, in this case by an incoming
pre-synaptic spike.
Every time the post-synaptic neuron spikes, its state of activity is set to active,
independent of its current state. A neuron in the active state can then make a transition
to the inactive state with a probability p(deactivate) (this can also be regarded as a
two-state hidden Markov process), as depicted in Figure 15.
Without specifying what the p(deactivate) is at any point of time, it can be
appreciated how a post-synaptic neuron firing at mean firing rate fm should have a
probability of being in active state, p(active) of 0.5, a more active neuron should have
a higher p(active) and a less active neuron should have a lower p(active).
Figure 15: The STADP mechanism.
A post-synaptic neuron can be in one of two states: active and inactive. The STADP mechanism determines the state of the post-synaptic neuron by integrating the post-synaptic firing times. A post-synaptic spike sets the neuron to active state, which then stochastically resets to the inactive state after an amount of time equal to the mean postsynaptic inter-spike interval. Clearly, the probability that the post-synaptic neuron is in active state at any given time increases as its firing rate increases, and is 0.5 if it is firing at the mean firing rate.
In order to implement this in real hardware (it would be rather challenging to actually
instantiate some kind of quantum process), the STADP mechanism proposed here uses
an abstraction of the stochastic deactivation of the post-synaptic neuron. This
abstraction is based on the assumption that the neuron fires as a Poisson process with
mean firing rate fm, which has an exponentially distributed inter-spike interval (the
time interval between two consecutive spikes) with mean 1/fm. Then, upon every
incoming post-synaptic spike (which sets the neuron’s state to active), an
exponentially distributed ‘expiry time’ is drawn, after which the neuron is said to
reset to the inactive state.
This way, the desired properties can be achieved: if the post-synaptic neuron is firing
at the mean firing rate fm, it will have an equal chance of being in active or inactive
state, on average, at any point in time. Similarly, if it is firing at a higher rate, it has a
higher chance of being active since it is being set to active faster than it is expiring to
inactive, while if it is firing at a lower rate, it has a lower chance of being active at
any point in time.
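As a rough check of these properties, the following Python sketch simulates the expiry-time mechanism. It is only an illustration under stated assumptions that are not spelled out in the text: post-synaptic spikes are taken to form a Poisson process of rate f_post, every spike overwrites any earlier expiry time, and expiry times are exponential with mean 1/f_mean. Under these assumptions the probability of being active works out to f_post / (f_post + f_mean), which is 0.5 at the mean firing rate. The function names are illustrative and do not come from the original simulation.

```python
import random


def p_active_closed_form(f_post, f_mean):
    """P(active) for Poisson post-synaptic spikes and exponential expiry (mean 1/f_mean)."""
    return f_post / (f_post + f_mean)


def p_active_simulated(f_post, f_mean, duration=2000.0, seed=0):
    """Estimate the fraction of time the neuron spends in the active state."""
    rng = random.Random(seed)
    t = 0.0
    active_time = 0.0
    while t < duration:
        isi = rng.expovariate(f_post)      # time until the next post-synaptic spike
        expiry = rng.expovariate(f_mean)   # expiry time drawn at the current spike
        # The neuron is active until it expires or until the next spike re-arms it.
        active_time += min(isi, expiry)
        t += isi
    return active_time / t
```

Calling `p_active_simulated(50.0, 50.0)` should give a value close to 0.5, with higher values for higher post-synaptic rates. Note that, under these assumptions, the curve is not symmetric about the mean firing rate.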
One question remains. Whether a plasticity event is a depression or a potentiation
event is dependent on the post-synaptic neuron’s activity as explained above – but
then, how does STADP behave for different pre-synaptic frequencies? As the name
suggests, the plasticity is dependent on spike timing, since the state of activity of the
post-synaptic neuron is only ever evaluated on an incoming pre-synaptic spike, but in
fact, its rate plays a role too.
In general, the higher the pre-synaptic frequency, the more plasticity events will be
produced. However, since potentiation and depression are only elicited with
probability p(plasticity), the dependence on the pre-synaptic rate is slightly more
complex. While high pre-synaptic frequencies are likely to lead to a high rate of
plasticity, low, but non-zero, pre-synaptic frequencies are likely not to result in any
plasticity event at all, as only few of the already rare pre-synaptic spikes would ever
lead to a plasticity event.
In summary, the pre-synaptic firing rate can be said to determine the rate (probability)
of plasticity events, while the post-synaptic frequency is best described as setting the
type of the plasticity events. Synapses with high pre-synaptic firing rates are more
likely to be receiving plasticity signals, while synapses with low pre-synaptic firing
rates are likely to remain static, as they receive none or only few plasticity events.
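This summary can be written as expected event rates. The sketch below is a simplification rather than the simulation used in this report: it assumes that every pre-synaptic spike independently triggers a plasticity event with probability p(plasticity), and that the post-synaptic state is independent of the pre-synaptic spike timing, so that p(active) can simply be passed in as a number. The function name is hypothetical.

```python
def plasticity_rates(f_pre, p_plasticity, p_active):
    """Expected (LTP, LTD, net) plasticity event rates per second for one synapse."""
    event_rate = f_pre * p_plasticity      # plasticity events per second
    ltp = event_rate * p_active            # events arriving while the neuron is active
    ltd = event_rate * (1.0 - p_active)    # events arriving while it is inactive
    return ltp, ltd, ltp - ltd
```

The pre-synaptic rate only scales the number of events, while p(active), and hence the post-synaptic rate, decides how they are split between potentiation and depression.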
3.2 Characteristics of STADP
The previous section explained how, conceptually, STADP works, and how the actual
STADP mechanism, which draws an exponentially distributed expiry time for the
post-synaptic neuron to reset to the inactive state, works. The following paragraphs
describe some of its characteristics as well as the expected plasticity signals that
STADP would produce.
When characterising the behaviour or the results of STADP, the two important points
to be noted are firstly whether the expiry time mechanism works at all, and secondly
what plasticity profile it produces over a range of pre- and post-synaptic frequencies.
By observing p(active), the correct operation of the mechanism can be verified; by
observing the plasticity rates, i.e. how many potentiation or depression events are
elicited per second, insights into the plasticity profile can be gained.
The following plots were obtained from a simple Matlab simulation (see footnote 3) done by
Dylan Muir, and show the rate of potentiation (LTP rate), rate of depression (LTD rate), the
net effect of plasticity (LTP rate – LTD rate) as well as p(active), over pre- and
post-synaptic frequency ranges of 0-100Hz.
Figure 16: Simulated behaviour of STADP.
Left column: rate of potentiation and depression events per second, over a range of pre- and post-synaptic frequencies [1:100Hz] (ignore the axis labels). Right column: Net effect of STADP and probability of the postsynaptic neuron being in active state per unit time.
These simulation results suggest that STADP indeed works as a Hebbian learning rule,
and has the desired characteristics. The p(active) is approximately 0.5 at a post-synaptic
frequency of 50Hz; it increases for higher frequencies, and decreases for
lower frequencies. Furthermore, the plasticity rate increases with pre-synaptic
frequency for both potentiation and depression, which also have a qualitatively
correct behaviour, best summarized by the net effect of LTP and LTD: with increasing
pre-synaptic frequencies, there are more plasticity events, with potentiation
dominating for high post-synaptic frequencies, and depression dominating for low
post-synaptic frequencies.
Footnote 3: p(active) curve: make_prob_active_vs_freq_plot.m; other plots: make_freq_sim_plot.m
One important characteristic to note, however, is that potentiation and depression are
not symmetric within the regime of operation, and that the net effect of plasticity has
a bias towards depression, or equivalently, reluctance towards potentiation. This is
due to the p(active) curve, which is not linear or symmetric about the (50Hz, 0.5)
point. As will be described later in the experimental section, this will have an
observable effect.
Possible remedies for this could include measures such as pre-biasing or distorting the
p(active) curve so that it saturates at 100Hz, or setting a minimum expiry time of
10ms (1/100Hz) in order to ensure that p(active) is 1 at 100Hz. The remedy used
would have to be matched to the particular implementation of STADP.
While more detailed and formal analysis of STADP would be desirable, this would go
beyond the scope of this report. These initial simulation results are satisficing ( =
satisfying enough), and confidence in the learning rule further derives from [20].
4 Design
‘I am enough of an artist to draw freely upon my imagination. Imagination is more
important than knowledge. Knowledge is limited. Imagination encircles the world’ –
Albert Einstein
4.1 Summary of features of the Synaptic Processing Unit
The Synaptic Processing Unit designed here has the following features:
• Speed of operation: Clocked at 90MHz internally
• System architecture:
o Fully pipelined design – the SPU can theoretically process a new
address event every clock cycle, although this never happens in
practice
o Modular design – allows for easy plug-in of a new learning rule
• On-chip learning rule: STADP with 11.1ns time resolution
• I/O ports: 1x USB input, 1x AER input, 1x AER output
• Cascade representation: 6bit, reconfigurable, allowing for synapses with up to
32 cascades
• Cascade memory address width: 13bit, reconfigurable, allowing for up to
8192 binary cascade synapses
• Addressing: Configurable number of neurons (up to 256)
• One teacher synapse per neuron
4.2 System level design
Although this project builds upon previous work as mentioned earlier, most parts of
the Synaptic Processing Unit were designed from scratch, since the pipelined and
virtualized cascade synapse requires a very different architecture.
4.2.1 The SPU in a neural system
From a high level point of view, the SPU is supposed to integrate with one aVLSI
neuron chip, forming one coherent neural system containing an array of neurons with
cascade synapse functionality. This system could, for example, be used as one layer
of a larger network of spiking neurons, as depicted in Figure 17.
Figure 17: System level interaction of SPU and aVLSI neuron chip.
Together, these form one freely reconfigurable integrated array of N Integrate and Fire neurons with binary cascade synapses.
4.2.2 Input and output ports
In order to act as one coherent system, the SPU has to be able to communicate both
with the neuron chip, as well as with the outside world. Here, this is done using the
USB port of the FPGA board as pre-synaptic input, and the two AER ports to connect
the SPU to the neuron chip.
Clearly, a forward connection, whereby pre-synaptic spikes are routed towards the
right post-synaptic neuron is necessary. However, in order to be able to perform
learning using STADP, and indeed most other learning rules, an additional feedback
connection from the neuron chip back to the SPU is necessary, in order to obtain
information about the post-synaptic neurons, which in this case means to estimate
their state of activity.
4.3 Virtualising the cascade synapse
The binary cascade model is quite a nice model to be implemented in digital
hardware. It has essentially only two important properties, namely its binary efficacy
and its current state, which at the same time encodes the plasticity, which in turn is
represented by a plasticity probability, which halves for every higher cascade. This
has ‘digital’ written all over it.
In order to virtualise the cascade synapses, some conceptual ‘cascade mechanism’ by
which to process them has to be devised. The basic idea is to trade hardware real
estate on the FPGA for memory, and to process synapses on demand. This has two
immediate design deliverables:
• In order to virtualise the cascade synapses, an abstraction or memory
representation of them has to be defined,
• A mechanism by which they are processed, i.e. how individual synapses
respond to plasticity signals, has to be developed
Conveniently, the cascade synapse can be represented very intuitively by a bit vector.
One bit encodes the synaptic efficacy, while a number of other bits encode the state
of the synapse, i.e. the synaptic plasticity (the plasticity probability), depending on
the number of cascades. Then, halving the plasticity probability is just a matter of a
bit shifting operation. As depicted in Figure 18, an N-bit representation is used, where the
MSB represents the efficacy and the remaining N-1 bits, i.e. the word [N-2...0], represent
the plasticity probability as an unsigned binary number.
Figure 18: Bit representation of cascade synapses
Using this representation, the plasticity probability ranges from 0 to $2^{N-1}-1$ rather than
from 0 to 1, but this is not a problem, since it can be regarded as the numerator of a
rational number with denominator $2^{N-1}-1$. Such a representation can easily be stored
in and retrieved from memory, and provides the functionality required to implement
the virtualisation.
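The bit layout can be illustrated with a short Python sketch. It assumes the layout described above (efficacy in the MSB, plasticity numerator in the lower N-1 bits); the function names and the range checks are illustrative additions and are not taken from the VHDL implementation.

```python
def pack_cascade(efficacy, plasticity, width=6):
    """Pack efficacy (MSB) and plasticity numerator (lower width-1 bits) into one word."""
    max_plasticity = (1 << (width - 1)) - 1
    if efficacy not in (0, 1):
        raise ValueError("efficacy must be 0 or 1")
    if not 0 <= plasticity <= max_plasticity:
        raise ValueError("plasticity out of range")
    return (efficacy << (width - 1)) | plasticity


def unpack_cascade(word, width=6):
    """Return (efficacy, plasticity numerator) from a packed cascade word."""
    efficacy = (word >> (width - 1)) & 1
    plasticity = word & ((1 << (width - 1)) - 1)
    return efficacy, plasticity
```

With N = 6, `pack_cascade(1, 31)` yields the word `0b111111`, and halving the plasticity numerator of an unpacked word is a single right shift (`plasticity >> 1`).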
Here, N = 6 was fixed as a reasonable maximum cascade representation width,
allowing for synapses with up to 32 cascades. This is more than sufficient, and in fact,
too large a number of cascades can actually decrease the memory performance of the
synapses [1].
The processing of the cascade synapse can be expected to be relatively simple, since
there is only a small number of things the synapse ‘can do’: switch or chain, with a
probability given by its state. The exact mechanism implemented is described in
detail in the Implementation section, but from a high level description point of view, it has to:
• Obtain the right cascade from memory
• Perform the necessary operations on its state representation (i.e. switch, chain
or do nothing)
• Produce a new cascade state representation, and pass it back to the cascade
memory
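The three steps above could look roughly as follows in Python. This is a hypothetical sketch and not the mechanism implemented in the SPU: it assumes that an event takes effect when the pRN is below the stored plasticity numerator, that a 'switch' flips the efficacy and resets the plasticity to its maximum, and that a 'chain' halves the plasticity by a right shift. Behaviour at the deepest cascade is not modelled.

```python
def process_cascade(word, potentiate, prn, width=6):
    """Return the new cascade word after one plasticity signal (hypothetical rules)."""
    mask = (1 << (width - 1)) - 1
    efficacy = (word >> (width - 1)) & 1
    plasticity = word & mask
    if prn >= plasticity:          # assumed comparison: do nothing this time
        return word
    target = 1 if potentiate else 0
    if efficacy != target:         # switch: flip efficacy, become fully plastic again
        efficacy, plasticity = target, mask
    else:                          # chain: same direction again, halve the plasticity
        plasticity >>= 1
    return (efficacy << (width - 1)) | plasticity
```

In a memory-based design, `word` would be read from the cascade memory at the synapse address, and the returned value written back to the same address.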
4.4 SPU internal addressing
Since incoming and outgoing events are following the AER protocol, whereby
neurons are identified by addresses, the SPU internal representation is also using
addresses as identifiers of synapses.
Figure 19: SPU internal addressing format
At the heart of the addressing scheme are the synapses, which can be identified
uniquely by an N-bit synapse address, as shown in Figure 19. For historical reasons
(see footnote 4), this synapse address is set to 13 bits, allowing it to uniquely identify
up to 8192 synapses. The top few bits of the synapse address represent the neuron address,
which uniquely identifies the post-synaptic neuron to which the cascade synapse is
connecting. The aspect ratio of the neural system, i.e. how many neurons there are
and how many synapses each has, can be changed freely within the SPU by changing
this neuron address width, and does not have to correspond to the actual number of
neurons (or synapses) on the aVLSI chip.
Footnote 4: The SPU was originally designed to interact with an aVLSI chip with 256 neurons and 8192 synapses, the largest of its kind at that time.
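The address split can be illustrated with a few lines of Python. The sketch assumes the default widths of 13 bits for the synapse address and 8 bits for the neuron address; the function name is illustrative.

```python
def split_synapse_address(address, synapse_width=13, neuron_width=8):
    """Split a synapse address into (neuron address, synapse index within that neuron)."""
    index_width = synapse_width - neuron_width
    neuron = address >> index_width                 # top bits select the neuron
    index = address & ((1 << index_width) - 1)      # bottom bits select the synapse
    return neuron, index
```

With the default widths, each of the 256 neurons owns 32 synapse addresses; reducing `neuron_width` to 4 gives 16 neurons with 512 synapses each.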
4.5 Modular design of the SPU
Apart from implementing cascade synapse behaviour in a virtualised fashion, the SPU
has to perform two other important tasks: spike forwarding and learning.
Overall, the core of the SPU, i.e. ignoring data I/O and FPGA board particulars, will
have the following four modules:
• Forwarding module
• Learning module
• Cascade module
• Cascade memory
The conceptual architecture that stems from these four modules is depicted in Figure
20.
Figure 20: Conceptual Architecture of the SPU
The principle of operation of the SPU is as follows:
1. The signal selector (not one of the core functions of the SPU) performs
arbitration between pre- and post-synaptic inputs, and forwards this address
into the SPU, to the forwarding module, the cascade memory and the learning
module.
2. The cascade memory retrieves the cascade synapse representation
corresponding to the synapse address, and, at the same time, writes new
cascade states to (another location in) memory.
3. The learning rule (stochastically) produces plasticity signals as required by
STADP and the pre- and post-synaptic spikes the SPU receives.
4. The forwarding module forwards pre-synaptic addresses on to the output of
the SPU if, and only if, the efficacy of the synapse is high.
5. The cascade module (stochastically) processes the cascade representation
according to the plasticity signals it receives from the learning module and
passes on a new cascade state to be written by the cascade memory.
This architecture can be fully pipelined, so that the SPU can process one ‘instruction’,
i.e. one address event, per clock cycle. This is particularly important in order to ensure
that the SPU is operating fast enough, since in a multi-chip environment, it should not
be the processing bottleneck, but rather, it should be able to process whatever is
being thrown its way by the pre-synaptic input (USB). Since the AER bus can
typically transmit about 1Mevent/second, the SPU should be able to process a
multiple of that, which a fully pipelined architecture allows.
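The headroom implied by these figures can be worked out directly. The short calculation below uses the 90MHz internal clock from the feature summary and the 1Mevent/second AER figure quoted above, and assumes the ideal case of one address event per clock cycle.

```python
CLOCK_HZ = 90e6              # internal SPU clock from the feature summary
AER_EVENTS_PER_S = 1e6       # typical AER bus rate quoted in the text

clock_period_ns = 1e9 / CLOCK_HZ               # duration of one clock cycle
peak_events_per_s = CLOCK_HZ                   # ideal case: one event per clock cycle
headroom = peak_events_per_s / AER_EVENTS_PER_S

print(round(clock_period_ns, 1), headroom)
```

The clock period of about 11.1ns is the same figure as the STADP time resolution, and the ideal peak rate is 90 times the typical AER bus rate.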
In order to ensure that only the ‘right’ signals are being processed and that no wrong
data is written to memory, the SPU uses an extra level of control signals that indicate
the validity of the data shown in Figure 20.
4.6 Module specifications
The high level relationship between the individual modules described above
translates into precise input/output and functional specifications, described below.
4.6.1 Forwarding
Function:
• To forward valid pre-synaptic spikes to the post-synaptic neuron address over
the AER output of the SPU, if the ‘target’ synapse has high efficacy or a
teacher signal was sent.
Input signals:
• neuron_address: address of the synapse the current pre-synaptic spike is
addressed to. Up to 13bits
• target_synapse_efficacy: MSB of the cascade representation of the
addressed synapse. 1bit.
• address_pre_post: control signal issued by the signal selector which
indicates whether current data comes from the pre-synaptic (‘0’) or the post-
synaptic (‘1’) feedback input. 1bit.
• address_valid: control signal that indicates whether current data is valid. 1bit.
Outputs:
• target_neuron_address: address of the post-synaptic neuron that is to be
sent out through the AER output. up to 8bits.
• target_address_valid: control signal that indicates whether the target
neuron address is valid. 1bit.
4.6.2 Learning Rule (STADP)
Function:
• To implement STADP
• To correctly produce plasticity events (dep./pot.)
Inputs:
• synapse_address: address of the incoming pre- or post-synaptic spike. Up to
13bits.
• address_pre_post: control signal issued by the signal selector which
indicates whether current data comes from the pre-synaptic (‘0’) or the post-
synaptic (‘1’) feedback input. 1bit.
• address_valid: control signal that indicates whether current data is valid.
1bit.
Outputs:
• cascade_synapse_address: address of the cascade synapse that the plasticity
signals are valid for. Up to 13bits.
• plasticity_dep_pot: plasticity signal, indicating whether the cascade
synapse should be depressed (‘0’) or potentiated (‘1’). 1bit.
• plasticity_valid: control signal that indicates whether the plasticity signal
and the cascade synapse address are valid. 1bit.
4.6.3 Cascade Process
Function:
• To process cascade states according to plasticity signals from the learning
module
Inputs:
• cascade_synapse_state: cascade state representation of the cascade synapse
that is to be processed. Up to 6bits.
• cascade_synapse_address: address of the current cascade synapse that the
plasticity signals are valid for. Up to 13bits.
• plasticity_dep_pot: plasticity signal, indicating whether the cascade
synapse should be depressed (‘0’) or potentiated (‘1’). 1bit.
• plasticity_valid: control signal that indicates whether the plasticity signal
and the cascade synapse address are valid. 1bit.
Outputs:
• cascade_address_out: address of the cascade synapse that the new state
belongs to. Up to 13bits.
• new_state: new processed cascade state representation ready to be written
back to memory. Up to 6bits.
• new_state_valid: control signal that indicates whether the new state and the
cascade out address are valid. 1bit.
4.6.4 Cascade memory
Function:
• To retrieve cascade representations of synapses addressed at its read port
• To store valid and new cascade representations of synapses addressed at its
write port
Input signals:
• synapse_address: address of the cascade the current pre-synaptic spike is
addressed to. Up to 13bits.
• new_state_address: address of the new state that has undergone plasticity.
Up to 13bits.
• new_state: new state of cascade synapse after processing. Up to 6bits.
• new_state_valid: control signal that indicates whether the new state for the
new state address is valid. 1bit.
Outputs:
• current_state: cascade state representation of the synapse addressed at the read
port. Up to 6bits.
4.6.5 Global signals
In addition to the inputs specified above, all modules share clock, clock enable and
asynchronous reset inputs to reset all internal registers and FIFOs. Note that the
content of memory is not reset to the initial state by this reset signal, but only the
output registers of the memory are cleared. All signals internal to the SPU are active
high.
5 Implementation
‘It's not good enough that we do our best; sometimes we have to do what's
required’ – Winston Churchill
5.1 Pseudo-random number generators
The performance of stochastic learning processes, indeed of any stochastic process, is
heavily dependent on the ‘quality’ of the underlying randomness. Since the SPU has
random processes in two of its major functional components, the cascade synapse
module and the learning rule, implementing a good pseudo-random number
generator (pRNG) is even more important.
A good pRNG generates highly uncorrelated sequences of pRNs with a very long
maximum-length, before the sequence repeats. A good review on ‘classical’ pRNGs
can be found in [8]; however, the pRNG used here is more unconventional. Instead of
performing mathematical manipulation, including multiplication by prime numbers
and modulo division to generate pRNs, which is what most classical pRNGs do and is
rather resource intensive for a digital logic implementation, a so-called hybrid cellular
automata (HCA) array pRNG is employed, which, by contrast, is a very efficient
choice for FPGA implementation.
Cellular automata consist of grids of ‘cells’, where each cell can be in one of a finite
number of states. Time is discrete, and each cell has a local update rule to determine
the state of it in the next unit of time. One of the most popular cellular automata is
Conway’s 2D ‘Game of Life’.
Here, we consider a one dimensional binary HCA, i.e. an array of bits, where each
cell (bit) has one of two local update rules, namely Rule 90 or Rule 150, as shown in
Figure 21, classified by Wolfram [16]. Rule 90 takes the XOR of both of its
neighbours to determine the next state of a cell, while Rule 150 adds the XOR of the
current value of the cell as well. Cells beyond the boundaries of the array are
considered to be '1' at all times, which ensures that the automaton does not freeze in
case of all cells being '0'. These choices and the right configuration for the rules used
ensure that the pRNG produces maximum length sequences of uniform pRNs. In [8],
there is a detailed description of which rules to use for what bit position to generate
maximum length sequences for HCA arrays of a given size.
Figure 21: A Hybrid Cellular Automata linear array
The HCA pRNG makes use of two different nearest neighbour update rules, namely Rule 90 and Rule 150. It is very suitable for implementation on an FPGA, and further produces maximal-length sequences of highly uncorrelated patterns. Figure courtesy of Dylan Muir.
If used as described above, HCA pRNGs would introduce high correlation for
adjacent cells, which can be avoided by only using a subset of non-neighbouring bits
from a larger array to generate random numbers. One possible choice for creating a
32bit random number is to use a 128bit HCA, tapping off every fourth bit to form the
pRN, for example.
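The update rules and the tapping scheme can be sketched in Python as follows. This is a behavioural illustration only, not the VHDL used in the SPU. The per-cell rule assignment passed in as `rule150` is a placeholder: the assignments that give maximum-length sequences are tabulated in [8] and are not reproduced here.

```python
def hca_step(state, rule150):
    """Advance a 1-D hybrid cellular automaton (list of 0/1 cells) by one time step."""
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 1        # cells beyond the boundary read as '1'
        right = state[i + 1] if i < n - 1 else 1
        bit = left ^ right                         # Rule 90: XOR of both neighbours
        if rule150[i]:
            bit ^= state[i]                        # Rule 150: also XOR the cell itself
        nxt.append(bit)
    return nxt


def tap_bits(state, stride=4):
    """Form a number from every stride-th cell, skipping neighbouring (correlated) cells."""
    value = 0
    for bit in state[::stride]:
        value = (value << 1) | bit
    return value
```

For a 128-cell array, `tap_bits(state, 4)` uses 32 cells and therefore yields a 32-bit number. Because the boundary cells read as '1', an all-zero array does not stay at zero after `hca_step`.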
By using this method to generate pRNs as required by the different modules, the
stochastic processes in the SPU can be trusted to be as random as is possible, to the
best of the knowledge of the author.
5.2 Description of generics
Before explaining the architecture of the individual SPU internal modules, it is helpful
to understand the parameterisation of the VHDL code that was carried out in order to
keep the SPU reconfigurable. The following is a brief description of the generics used
within the implementation that allow a customisation of the SPU.
• SYNAPSE_ADDRESS_WIDTH : natural := 13: The synapse address width is
the width of most of the addresses within the SPU, and sets the maximum
number of synapses that can be addressed. By default, it is set to 13bits,
allowing for up to 8192 cascade synapses to be addressed. The fixed depth of
the cascade memory (the memory itself is not parameterisable) also limits the
maximum number of synapses to be implemented to 8192, although fewer
synapses may be used (manual reconfiguration of the memory would be
required to increase the depth of the cascade memory; this is not difficult).
• NEURON_ADDRESS_WIDTH : natural := 8: The neuron address width is the
width of the neuron address, and tells the SPU how many of the synapse
addresses’ MSBs are attributed to identifying the neuron. By default, it is set
to 8bits allowing for up to 256 neurons to be addressed, and a smaller number
of neurons can be specified without problems.
• CASCADE_WIDTH : natural := 5: The cascade width is the number of bits
that the cascade representation uses. It can be up to 6 bits wide, as limited by
the width of the cascade memory, but fewer bits, such as the default value of
5 bits may be specified. The cascade width includes both the efficacy bit and
the plasticity probability width. At the same time, the cascade width specifies
the width of the pRN generated in the cascade synapse module, which is
always one bit less than the cascade width (since the plasticity probability in
the cascade representation, which will be compared to the pRN, is one bit
smaller than the cascade width).
• PRE_THRESHOLD : natural := 230: The pre threshold sets the p(plasticity)
with which STADP elicits plasticity events; the higher the threshold, the
smaller is the p(plasticity). It may range from 0 to 255, where p(plasticity)
would be 1 and 0 respectively.
Using these four parameters, the SPU can be configured, at compile time, to have the
desired characteristics.
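As a worked example of the PRE_THRESHOLD setting, the sketch below converts a threshold into the resulting p(plasticity). It assumes a uniformly distributed 8-bit pRN and a strict 'greater than' comparison against the threshold; the function name is illustrative.

```python
def p_plasticity(pre_threshold, prn_bits=8):
    """Probability that a uniform pRN in [0, 2**prn_bits - 1] is strictly above the threshold."""
    levels = 1 << prn_bits
    return (levels - 1 - pre_threshold) / levels
```

Under these assumptions the default threshold of 230 gives p(plasticity) = 25/256, i.e. roughly one plasticity event for every ten pre-synaptic spikes. A threshold of 255 gives exactly 0, while a threshold of 0 gives 255/256, which is close to but not exactly 1.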
5.3 Module level design
The following sections will individually describe the implementations of the SPU’s
modules on a functional level. In order to save paper and time, no VHDL code is
reproduced here. The interested reader is advised to consult the supplementary CD
for the VHDL code.
In all of the diagrams shown in the following sections, the convention shown in
Figure 22 for arrows is used. In particular, dotted arrows are used to represent the
flow of control signals, dashed arrows for addresses and solid lined arrows are used
to represent the flow of data.
Figure 22: Conventions on the arrows used in block diagrams
Furthermore, light blue vertical bars are used to indicate register levels or clocked
processes.
5.3.1 Spike forwarding
The forwarding module is the simplest out of all the four major functional modules.
As specified in the previous chapter, it ‘only’ has to forward valid pre-synaptic spikes
if the synapse it was addressed to has high efficacy, or if it is being sent to the
teacher synapse. The basic structure of the forwarding module is shown in Figure 23.
The outputs are generated in a very simple way. The target neuron address is simply
forwarded directly from the incoming neuron address, while the target address valid
signal is a simple chain of logic operations. Note that the target address valid signal is
dependent on the negation of the address_pre_post signal, since a pre-synaptic
input spike is represented by a ‘0’.
Figure 23: Spike forwarding module block diagram
The teacher synapse is defined to be the 0th synapse of every neuron, i.e. if the
bottom bits of the synapse address (how many depends on the neuron address width) are all
zero, then the spike is sent to the teacher synapse, and should be forwarded regardless of
the synaptic efficacy.
Due to its simplicity, the forwarding module only requires one clock cycle to perform
the processing.
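The forwarding decision can be summarised in a behavioural Python sketch. It is an illustration of the logic described in this section rather than a transcription of the VHDL: it models only the combinational decision, not the register stage, and assumes the default address widths. The function and variable names are hypothetical.

```python
def forward(synapse_address, efficacy, address_pre_post, address_valid,
            synapse_width=13, neuron_width=8):
    """Return (target_neuron_address, target_address_valid) for one address event."""
    index_width = synapse_width - neuron_width
    neuron = synapse_address >> index_width
    # The teacher synapse is the 0th synapse of a neuron: all bottom bits are zero.
    is_teacher = (synapse_address & ((1 << index_width) - 1)) == 0
    is_pre = address_pre_post == 0          # '0' marks a pre-synaptic input spike
    valid = bool(address_valid and is_pre and (efficacy == 1 or is_teacher))
    return neuron, valid
```

A valid pre-synaptic spike is forwarded when the addressed synapse has high efficacy or is the teacher synapse; post-synaptic feedback events are never forwarded.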
5.3.2 Learning rule (STADP)
The learning rule module is much more complex, as shown in Figure 24. It contains
some logic, several registers, a look-up table implemented by a 256x36bit single port
ROM, a 256x36bit single port memory block RAM, a 36bit timer with 11.1ns
resolution and an 8bit pRNG. In order to understand it, it is best to work from the
outputs backwards, and to consider separately what happens on a pre- and on a
post-synaptic synapse address (spike).
There are three output signals: the cascade synapse address, the plasticity signal and
the plasticity valid signal, which need to be considered first.
The cascade synapse address is simply a forwarded version of the input synapse
address.
The plasticity signal, i.e. whether a synapse should be depressed or potentiated,
depends on the activity of the postsynaptic neuron. As mentioned earlier, this is
implemented by drawing pseudo-random exponentially distributed expiry times for
the post-synaptic neuron, at which it becomes inactive. Comparing this expiry
time to the current time is all that is needed to elicit the right plasticity signal. So, if the
current time, i.e. the output of the timer, is greater than the post-synaptic neuron’s
expiry time, which is given by the output of the expiry time memory, then the neuron
has already expired and a depression signal is produced (plasticity_dep_pot is set to ‘0’).
If the current time is less than or equal to the expiry time, then the neuron has not yet
expired but is still active, and a potentiation signal is produced (plasticity_dep_pot
is set to ‘1’).
The plasticity valid signal is only asserted if the incoming spike is valid and pre-synaptic.
Furthermore, since plasticity signals are only elicited with a probability p(plasticity),
the plasticity valid signal is additionally only asserted if an 8bit pRN is above the plasticity
threshold pre_threshold.
That is really all there is to the generation of plasticity signals, i.e. that is all that
happens on arrival of a pre-synaptic spike. The rest of the STADP learning rule
module is concerned with handling post-synaptic spikes and setting pseudo-random
exponentially distributed expiry times.
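The handling of a pre-synaptic spike can be summarised in a few lines of Python. This is a hedged behavioural sketch rather than the actual VHDL: the function and argument names are illustrative, and ‘above the threshold’ is assumed here to mean strictly greater; only plasticity_dep_pot, pre_threshold and the encoding of pre-synaptic spikes as ‘0’ come from the text.

```python
def plasticity_signal(valid, address_pre_post, current_time, expiry_time,
                      prn, pre_threshold):
    """Return (plasticity_dep_pot, plasticity_valid) for one incoming event."""
    # Potentiate (1) while the post-synaptic neuron is still active, i.e. the
    # current time has not passed its expiry time; depress (0) otherwise.
    plasticity_dep_pot = 1 if current_time <= expiry_time else 0
    # Only valid pre-synaptic spikes (encoded as 0) can elicit plasticity, and
    # only when the 8-bit pseudo-random number exceeds the threshold
    # (strict comparison is an assumption).
    plasticity_valid = bool(valid) and address_pre_post == 0 and prn > pre_threshold
    return plasticity_dep_pot, plasticity_valid
```

With this convention, a higher pre_threshold makes plasticity events rarer, since fewer of the 256 possible pRN values pass the comparison.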
Figure 24: STADP learning rule block diagram
Integral to determining the state of activity of the post-synaptic neuron are the
delta_t_LUT ROM and the activity expiry times RAM. The former contains pre-
loaded exponentially distributed time intervals, after which the post-synaptic neuron
expires, while the latter contains the absolute times, at which the post-synaptic
neuron expires. The pRNG permanently generates pseudo-random numbers between
0 and 255, which are also the address input to the ROM, thus pseudo-randomly
reading the content of the ROM. This has the effect of drawing an exponentially
distributed new expiry time, after the output of the ROM is added to the current time
output of the timer. Thus, on every clock cycle, there is one exponentially distributed
expiry time available at the input to the RAM, which will be written to memory upon
arrival of a valid post-synaptic spike, into the location specified by the post-synaptic
neuron address (the top few bits of the synapse address). All times are represented in
units of clock cycles.
The reasons for choosing an 8bit pRN, a 256 entries deep delta_t_LUT and a 256
entries deep activity expiry time memory are again historic, and have to do with
the fact that the SPU was initially designed to interact with a neuron chip with 256
I&F neurons.
Figure 25: Initialisation of delta_t look-up table.
This LUT contains exponentially distributed delta(t)s with mean 20ms. The distribution is sampled at 256 points and the data is stored in random positions within the ROM
The content of the delta_t_LUT table is initialised with a coefficient file generated
using a Matlab script⁵, and is such that a neuron firing at the mean firing rate fm
would on average draw expiry intervals of 1/fm, as required. The content of the
coefficient file, and thus the content the LUT is initialised with, is in units of clock
cycles. Figure 25 shows an example of an initialisation of the look-up table.
The processing of plasticity signals takes two clock cycles in total due to this
module’s two-stage pipelined architecture.
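To make the role of the look-up table more concrete, the following Python sketch builds a 256 entry table of exponentially distributed intervals in units of clock cycles and uses it to draw an absolute expiry time. It is not the Matlab script used in the project: the function names, the fixed shuffle seed and the choice of sampling the inverse cumulative distribution at the midpoints of 256 equal-probability bins are assumptions; the 20ms mean, the 90MHz clock and the random placement of entries are taken from the text.

```python
import math
import random


def make_delta_t_lut(mean_s=0.020, clock_hz=90e6, depth=256, seed=0):
    """Sketch of a delta_t look-up table: exponential intervals in clock cycles."""
    mean_cycles = mean_s * clock_hz
    # Sample the inverse CDF of the exponential distribution at the midpoints
    # of `depth` equal-probability bins, so that a uniformly random address
    # draws an (approximately) exponentially distributed interval.
    lut = [round(-mean_cycles * math.log(1.0 - (i + 0.5) / depth))
           for i in range(depth)]
    random.Random(seed).shuffle(lut)   # store the entries in random positions
    return lut


def draw_expiry_time(lut, current_time, prn):
    """New absolute expiry time: current timer value plus the LUT entry at prn."""
    return current_time + lut[prn % len(lut)]
```

Because every address is equally likely to be read, the average of the table entries is the mean interval a neuron draws; with the assumed sampling scheme it comes out within about one percent of 1.8 million clock cycles (20ms at 90MHz).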
5.3.3 Cascade synapse
Before examining the architecture of the cascade synapse module, it is helpful to
have another look at the process by which the cascade synapse should respond to
plasticity signals, i.e. how the cascade should be processed. A conceptual flow
diagram is shown in Figure 26.
Figure 26: Flow diagram of the cascade synapse's state update rule
1. Since the cascade synapses are stochastic, some of the incoming plasticity
events do not actually require any processing to be done on the synapse at all,
i.e. the synapse does not undergo any plasticity. The probability of undergoing
5 To generate a new coefficient file that the delta_t_LUT is initialized with, use coe.m
plasticity is given by the synapse’s current plasticity probability, represented
by an unsigned binary number from the cascade representation. So in order to
determine whether a synapse should be modified at all, this plasticity
probability is compared to a uniform pRN. If it is greater than or equal to the pRN, the
synapse should undergo plasticity; otherwise nothing is done. This, i.e. deciding
whether anything should be done to the synapse, is the first important step in
the processing of the cascade.
2. If the synapse does respond to the plasticity signals it receives (i.e. its
plasticity probability is greater than or equal to the pRN), then it has two choices: either
chain, or switch. This is dependent on the current efficacy and the ‘direction’
of the plasticity signal, i.e. whether it is a potentiation or a depression
command. If the current efficacy and the direction of plasticity agree, i.e. if a
depressed synapse receives a depress signal, or if a potentiated synapse
receives a potentiate signal, then the synapse should chain, and it should
switch otherwise.
3. The chaining process simply requires the cascade to reduce its plasticity, by
shifting it by one bit towards the LSB, thereby halving the plasticity
probability. The efficacy remains unchanged.
4. The switching process is similarly simple, since all that needs to be done to
the cascade representation is to invert the efficacy and to reset the plasticity
probability to the highest value, i.e. to ‘1..1’.
As outlined here, the actual processing of the cascade synapses is not very complex,
and can be done with simple logic operations. Again, the cascade synapse is nearly
ideal for implementation in digital hardware.
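The four steps above translate almost directly into code. The following Python function is a behavioural sketch of the state update, not the VHDL implementation: the function and argument names are illustrative, and the default cascade_width of 6 (one efficacy bit plus a 5bit plasticity probability) is an assumption.

```python
def update_cascade(efficacy, plasticity, dep_pot, prn, cascade_width=6):
    """One state update of a binary cascade synapse (behavioural sketch).

    efficacy   -- current efficacy bit (0 = depressed, 1 = potentiated)
    plasticity -- plasticity probability as an unsigned (cascade_width - 1)-bit number
    dep_pot    -- plasticity signal: 0 = depress, 1 = potentiate
    prn        -- uniform pseudo-random number of the same width as `plasticity`
    """
    all_ones = (1 << (cascade_width - 1)) - 1
    # Step 1: stochastic gate; the synapse only responds if its plasticity
    # probability is greater than or equal to the pseudo-random number.
    if plasticity < prn:
        return efficacy, plasticity
    if efficacy == dep_pot:
        # Steps 2 and 3: the signal agrees with the current efficacy, so chain:
        # shift towards the LSB, halving the plasticity probability.
        return efficacy, plasticity >> 1
    # Steps 2 and 4: the signal disagrees, so switch: invert the efficacy and
    # reset the plasticity probability to '1..1'.
    return 1 - efficacy, all_ones
```

For instance, a potentiated synapse in its most plastic state (‘11111’) that receives a potentiate signal chains to ‘01111’, while a subsequent depress signal that passes the stochastic gate switches it to the depressed state with plasticity ‘11111’ again.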
The architecture of the cascade synapse module is shown in Figure 27. It contains
two pipeline register levels, one pRNG of width cascade_width – 1 and the state
update logic which implements the processing steps described above.
The cascade address out and the new state valid signals which feed into the cascade
memory are not actually modified at all by the cascade synapse process. They are
passed straight through the module, crossing two pipeline register stages.
The cascade synapse module acts on the (current) cascade synapse state and
the plasticity signal, in the fashion described above. During the first stage, a
comparator determines whether any changes to the cascade state need to be made at
all, and during the second stage, the appropriate modifications to the current cascade
synapse state are made and output as the new state.
Figure 27: Cascade module block diagram
The processing of the cascade representations takes two clock cycles in total due to
its two-stage processing and pipelined architecture.
5.3.4 Cascade memory
The cascade memory module is more than just a simple block of memory. It contains
an 8192x6bit dual port RAM block, a multiplexer and a comparator, to perform
memory read-write collision avoidance.
Conceptually, the cascade memory needs to read a current cascade state from
memory, and at the same time, write a ‘new’ cascade state back into memory, hence
the dual port functionality of the memory. In particular one port is used as dedicated
write port, the other one as dedicated read port. However, as is commonly the case
with dual port memory, there exists the danger that both ports attempt to read or
write to the same memory location at the same time, which would lead to unknown
or unstable outputs.
Therefore, in order to avoid memory access collisions, the cascade memory contains a
comparator which checks whether a collision is about to happen (i.e. whether the
read and write addresses are the same). In case of a collision, priority is given to the
write port, and the read port is disabled. The output of the memory’s write port
(which is actually valid) is then selected as output, as the memory operates in
WRITE_FIRST mode [32]. That way, data is still written to memory and the same
data is also produced at the output. If there is no collision, the output is by default
selected to be the output of the read port.
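The collision avoidance scheme can be illustrated with a simple software model. The Python class below is a sketch under stated assumptions, not a model of the actual block RAM primitive: the class and method names are hypothetical, and the two clock cycle latency of the real module is deliberately not modelled, only the selection between read port and write port output.

```python
class CascadeMemoryModel:
    """Behavioural sketch of a dual-port RAM with write-first collision bypass."""

    def __init__(self, depth=8192, init=0):
        self.mem = [init] * depth

    def cycle(self, read_addr, write_addr=None, write_data=None, write_enable=False):
        """Perform one read and (optionally) one write; return the read result."""
        collision = write_enable and read_addr == write_addr
        if write_enable:
            self.mem[write_addr] = write_data   # the write always goes through
        if collision:
            # The read port is disabled; the data being written is selected
            # as the module output instead (WRITE_FIRST behaviour).
            return write_data
        return self.mem[read_addr]              # default: output of the read port
```

In this model, reading and writing address 3 in the same cycle returns the newly written state, which is also what a read of address 3 returns on the following cycle.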
Figure 28: Cascade memory block diagram
The content of the memory is initialised using a coefficient file generated by another
short Matlab script⁶. This initialises the cascade memory to contain pseudo-random
states uniformly distributed across all of its cascades.
Due to the collision avoidance mechanism, the cascade memory module also requires
2 clock cycles to read data and produce it at its output correctly. Nevertheless, this
6 To generate a coefficient file to initialize the cascade memory, the script state_init.m was used.
memory is fully pipelined and can process new read or write commands at every
clock cycle.
5.3.5 Signal selector
The four modules described above are the core modules internal to the SPU. However,
there is one other important module, the signal selector, which sits at the interface
between the SPU and the FPGA board’s specific hardware (such as USB or AER
components). The purpose of the signal selector is primarily to interface the SPU with
spikes coming from the pre- and post-synaptic inputs, annotating them as pre- or
post-synaptic. In the case that both inputs have valid data available, the signal
selector selects the signals in an alternating fashion. Figure 29 shows the selector,
which interfaces with pre-synaptic USB (fx2) and feedback AER FIFOs.
Figure 29: Input source selector block diagram
The selector requires one clock cycle to produce the data, which it feeds directly into
the SPU.
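The alternating selection between the two inputs amounts to a two-way round-robin arbiter, which can be sketched in Python as follows. The function name, the state variable last_was_post and the choice of which input is served first are assumptions made for illustration; only the annotation of events as pre-synaptic (‘0’) or post-synaptic (‘1’) and the alternation itself come from the text.

```python
def select_source(pre_valid, post_valid, last_was_post):
    """Pick which input to serve this cycle.

    Returns (address_pre_post, valid, last_was_post), where address_pre_post
    is 0 for a pre-synaptic and 1 for a post-synaptic event.
    """
    if pre_valid and post_valid:
        choice = 0 if last_was_post else 1   # alternate when both are waiting
    elif pre_valid:
        choice = 0
    elif post_valid:
        choice = 1
    else:
        return 0, False, last_was_post       # nothing to forward this cycle
    return choice, True, choice == 1
```

If both FIFOs hold data continuously, successive calls return post, pre, post, pre and so on, so neither input can starve the other.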
5.4 System integration
The individual modules that are at the core of the SPU have been described above;
this section explains more specifically how they integrate to make up the SPU.
Since all modules are fully pipelined and can process events at every clock cycle,
extra care has to be taken to ensure that the right data is at the right place at the right
time.
Conveniently, most of the modules take 2 clock cycles to process data, so little
synchronisation has to be done. The forwarding process, however, which receives
input both from the source selector and the cascade memory, has to be conditioned.
Specifically, the address and valid signals to the forwarding process have to be
delayed such that they arrive at the same time as the target synapse efficacy, namely
two clock cycles later. Figure 30 depicts a more detailed block diagram of the SPU,
including a two clock cycle delay to synchronise the forwarding module (process).
The numbers next to the modules indicate the clock cycle in which the data arrives at
that module.
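The two clock cycle delay used to synchronise the forwarding module behaves like a short shift register. A minimal Python sketch, with a hypothetical class name and an assumed idle value of None for empty pipeline slots, could look as follows.

```python
from collections import deque


class DelayLine:
    """Fixed-latency pipeline delay: the output is the input from n cycles ago."""

    def __init__(self, n_cycles, idle=None):
        # Pre-fill with idle values so the first n outputs are well defined.
        self.regs = deque([idle] * n_cycles, maxlen=n_cycles)

    def clock(self, value):
        """Advance one clock cycle: shift `value` in, return the oldest entry."""
        oldest = self.regs[0]
        self.regs.append(value)   # maxlen drops the oldest entry automatically
        return oldest
```

Clocking the values a, b, c, d through DelayLine(2) yields None, None, a, b, i.e. each address emerges exactly two cycles after it entered, in step with the efficacy read from the cascade memory.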
Figure 30: Pipelined SPU block diagram
The SPU interacts with the outside world through the ports provided by the FPGA
board, namely the USB and AER ports. Each of these ports is connected to a FIFO
(either at its input or its output) acting as a buffer. If the AER output FIFO is nearly
full (this is unlikely to happen in practice), it sets a global busy signal high, which in
turn forces the clock enable signals of the SPU’s internal modules low, thereby practically
freezing any processing that is happening within the SPU, until the FIFO has freed up
some space again by sending data out the AER output.
The pipeline of the SPU is depicted more explicitly in Figure 31, which shows the
order and dependencies with which data flows through the SPU. Data from the signal
selector arrives at the first delay buffer, the cascade memory and the learning rule at
the same time. Two clock cycles later, the delayed data and a cascade state arrive at
the forwarding module, while the cascade state and the plasticity signals arrive at the
cascade synapse. Valid pre-synaptic data is forwarded to the AER output FIFO on the
next clock edge, and one clock cycle later, a new cascade state is ready to be written
back into memory.
Figure 31: Pipelined dataflow through the SPU
5.5 Integration into the FPGA board
The FPGA board offers a full set of I/O interfaces that the SPU makes use of, and the
following provides a more detailed description of the precise integration of the SPU
into the FPGA board. Figure 32 illustrates the interfaces the SPU is making use of,
and their associated entities.
[Figure 31 diagram labels: clock cycles 0 to 6; modules Signal Selector, Delay Buffer 1, Delay Buffer 2, Learning Rule, Cascade Memory, Cascade Synapse, Spike Forwarding, AER out; signals Data, Add, State, Plast, NStat]
Figure 32: Block diagram of the integration of the SPU within the FPGA board
Note: all FIFOs need ‘First Word Fall Through’ property
Pre-synaptic data enters the FPGA board through the USB port, and is handled by the
fx2if (USB interface) and then buffered into the fx2 FIFO. The pre-synaptic stimulus
data sent to the USB port consists of address and inter-spike interval pairs. In order to
handle this, the FPGA board features a sequencer which holds back the address (i.e.
part of the data) for a duration given by the inter-spike interval, before passing it on
to the input selector (blue arrow in Figure 32).
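The effect of the sequencer on a stimulus can be sketched in Python as a conversion from (address, inter-spike interval) pairs to absolute release times. The function name is hypothetical, and it is assumed here that each interval is measured from the release of the previous event, in arbitrary but consistent time units.

```python
def release_times(events):
    """Turn (address, inter_spike_interval) pairs into (release_time, address)."""
    t = 0
    released = []
    for address, isi in events:
        t += isi                       # hold the address back for `isi` time units
        released.append((t, address))
    return released
```

For example, the stimulus [(5, 10), (7, 3), (5, 0)] would release address 5 at time 10, address 7 at time 13 and address 5 again at time 13.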
From there on, data (addresses) enters the SPU and leaves it again after a few clock
cycles, going through the synapse selector (if it is used), an output FIFO as buffer,
the AER out interface module and the AER out port into one of the
static DPI synapses on an aVLSI neuron chip (green arrow in Figure 32). The synapse
selector module works much like the signal selector, but the other way round: it
forwards spikes to one of the two static DPI synapses on an aVLSI neuron chip, in an
alternating fashion, so as to avoid overloading one single synapse on a neuron with
too many spikes. Whether or not it is used depends on the application; it is
appropriate to use it if a lot of synapses are connecting to one post-synaptic neuron,
as was the case in the classification task described later.
When a post-synaptic spike elicits an address event, it is communicated back over the
feedback AER in bus to the port, through the AER in interface module to a FIFO
acting as a buffer, via the signal selector back into the SPU (where it would be
processed by the STADP module).
5.5.1 On clocks
As mentioned in the feature summary, the SPU is clocked at 90MHz internally. This
clock was derived from one of the FPGA’s internal Digital Clock Managers (DCM),
which conditioned the external 106.125MHz clock to produce the desired 90MHz.
Everything within the FPGA board is running at 90MHz, with one exception: the USB
port is operated at 45MHz, by halving the 90MHz clock signal. The USB FIFO
(fx2fifo) is thus driven with two different clocks, written at 45MHz and read at
90MHz.
6 Verification
‘Genius is 1% inspiration and 99% perspiration’ – Thomas Edison
Verification is one of the most daunting but crucial tasks in digital hardware design,
and failure to do it properly can come at great cost in terms of money, time
and reputation. Rather than reproducing the entire verification work carried out,
which includes testbenches at module and system level, the interested reader is
pointed to the appendices.
Verification plans for module and system level verification, which followed an ad-hoc
testing paradigm, can be found in
Appendix II – Verification checklists. Appendix III – A journey through the SPU aims
to demonstrate simulation efforts made to verify the correct operation of the SPU. In
particular, it shows a set of example waveforms from testbenches, which follow a
pre- and a post-synaptic spike on a journey through the SPU.
7 Evaluation & experimentation
‘What we see depends mainly on what we look for’ – Sir John Lubbock
Previous work, including work by Tobias Kringe, focused on, and verified, the
behaviour and performance of digital hardware implementations of the cascade
synapse, and reproducing this is not an aim of this project. Instead, the focus here lies
on the use of the cascade synapses for learning within a general learning environment.
The following sections describe the evaluation of the SPU which was carried out in
three steps:
• Firstly, the STADP learning rule was characterised again, this time in-hardware,
to further confirm its correct operation, especially since
software simulations (Matlab) and hardware implementations can be worlds
apart.
• Then, a quick in-circuit verification of the SPU, including the verification of
forwarding and learning, was carried out, to assure that the SPU was
operational.
• Finally, the SPU was tested in a general learning environment, and, coupled
with an aVLSI neuron chip, was used for a real classification task.
7.1 In-hardware characterisation of STADP
The Matlab simulation of STADP presented earlier verified, qualitatively, that this
learning rule has the expected properties. However, it is worthwhile to go one step
further and perform yet another characterisation of STADP, but this time in hardware.
This in-hardware characterisation was carried out using a behavioural model
simulation of the learning module in ModelSim (an in-circuit verification is
conceivable, but inconvenient since there would be no access to internal signals of
the FPGA, and it would take a lot of manual labour to actually carry out the large
amount of measurements required).
In order to reproduce the simulation results of Figure 16, a slightly more elaborate
VHDL testbench⁷ and additional Matlab functions were required. The testbench
simulates the STADP module connected to the sequencer and timestamp modules so
that stimuli can be sent to it in the ‘normal’ fashion. It uses Matlab-generated
stimulus inputs⁸ read from several binary files, and logs STADP plasticity outputs to
several binary output files, which are then analysed in Matlab⁹ to obtain the results
required to reproduce the plots for characterisation.
Since STADP is stochastic, any meaningful in-hardware characterisation necessitates
the collection and analysis of a large amount of output data in response to pre- and
post-synaptic stimuli, which in turn have to cover a large range of pre- and post-synaptic
frequency pairs (10:5:100Hz for both pre- and post-synaptic frequencies, i.e. nearly
400 data points). However, this would amount to several minutes’ worth of input
spike trains (pre- and post-synaptic), which would take months to simulate on a
normal desktop PC in ModelSim.
This is mainly because ModelSim is an event-driven simulator which does not simulate in real
time, but is most comfortable simulating in simulation time units in regimes of micro
to pico seconds. Then, the simulation of one clock cycle in real time (e.g. with a
90MHz clock, 11.1ns) can take several iterations in simulation time. Simulating one
second worth of a 90MHz clock would thus take at least 90 million iterations in
simulation time, and simulating several minutes worth of input stimuli would become
a task of prohibitively high computational cost (for a standard desktop PC). An
important point to note is that, for most of this time, the STADP module would not even
produce any outputs, since stimuli, i.e. spikes, are being held back by the sequencer.
7 Elaborate VHDL testbench using file I/O to read stimuli from binary file, and write outputs to binary
files: Class_tb.vhd
8 To generate the binary stimuli files: generatePostCharacterisationStimuliFile.m,
generatePlasticityCharacterisationStimuliFile.m
9 To analyse the log of the binary output files: characterisePActivePost.m,
characterisePlasticity.m
In order to get around this problem, the in-hardware simulation of the STADP
module was carried out at an imaginary ‘internal clock frequency’ of 5kHz instead of
90MHz. This means that the delta_t_LUT within the STADP module, which
previously (and in the actual hardware running on the SPU) contained exponentially
distributed expiry intervals in units of 11.1ns (1 clock cycle at 90MHz), now contains
the same exponentially distributed expiry intervals, but in units of 0.2ms (1 clock
cycle at 5kHz). Similarly, the inter-spike intervals of the input stimuli are in units of
0.2ms now, while they were in units of 11.1ns previously.
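The saving obtained by lowering the imaginary clock frequency can be made explicit with a short calculation. The Python snippet below is only a worked example; the helper name is hypothetical, and the 20ms interval is the mean of the delta_t look-up table quoted earlier.

```python
def to_clock_cycles(interval_s, clock_hz):
    """Express a time interval in whole clock cycles of the given clock."""
    return round(interval_s * clock_hz)


MEAN_INTERVAL_S = 0.020   # 20 ms mean interval quoted for the delta_t LUT

cycles_90mhz = to_clock_cycles(MEAN_INTERVAL_S, 90e6)   # 1 cycle = 11.1 ns
cycles_5khz = to_clock_cycles(MEAN_INTERVAL_S, 5e3)     # 1 cycle = 0.2 ms
print(cycles_90mhz, cycles_5khz, cycles_90mhz // cycles_5khz)
# prints: 1800000 100 18000
```

The same 20ms interval thus shrinks from 1.8 million clock cycles to 100 clock cycles, a factor of 18000 fewer cycles for the simulator to step through.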
Figure 33: Comparison of delta_t_LUT content for 5kHz and 90MHz.
Using an imaginary clock frequency of 5kHz, the content of the delta_t_LUT is much coarser. This can be thought of as sampling the curve of the exponential distribution at a lower frequency.
The bottom line of this is that the inter-spike intervals are smaller, in terms of clock
cycles, which means that the sequencer waits fewer clock cycles before releasing a
stimulus. Overall, this reduces simulation time to a manageable load. The drawback
of this approach is the coarser exponential distribution of expiry times that is
loaded into the delta_t_LUT, which can be thought of as being sampled at a lower
sampling rate, as depicted in Figure 33. However, this is a necessary evil that still
enables a meaningful in-hardware characterisation of the operation of the STADP
module.
Lengthy simulation yields the results shown in Figure 34. At first glance, they appear
to resemble the simulation results of Figure 16. And indeed, the qualitative behaviour
is satisfactory. The p(active) is approximately 0.5 at a post-synaptic frequency of 50Hz,
increases for higher frequencies, and decreases for lower frequencies. The large
amount of ‘noise’ observed is attributed mainly to the coarse sampling of the
distribution curve, as mentioned above, and to the stochasticity of the underlying Poisson
spike train stimuli. Also, the plasticity rate increases with pre-synaptic frequency for
both potentiation and depression, which is again the correct behaviour,
qualitatively. The net effect of LTP and LTD is also within the expected range, and
also shows a bias towards depression, which is possibly more pronounced than in
Figure 16. This effect can also, in part, be attributed to the coarse sampling of the
distribution, which results in this p(active) curve being slightly lower than expected,
and therefore ‘even less symmetric’ than in the previous simulation, making
potentiation less likely and accentuating the reluctance towards potentiation. Another
cause for this is that at this ‘slow’ clock speed, individual delays within the hardware
(e.g. it takes an input spike 2 clock cycles before it gets processed by STADP) have a much
greater effect on the absolute timing of the spike as perceived by the learning rule.
Initial simulations using an even lower imaginary clock frequency of 1kHz produced an
even noisier p(active) curve (results not shown here), supporting the
extrapolation that the actual hardware running at 90MHz, sampling the exponential
distribution curve at a sufficiently high frequency, can be expected to behave far
more closely to the simulation results presented in Figure 16.
Figure 34: Simulated hardware behaviour of STADP at 5kHz simulation clock frequency.
Left column: rate of potentiation and depression events per second, over a range of pre- and post-synaptic frequencies [1:100Hz] (ignore the axis labels). Right column: Net effect of STADP and probability of the postsynaptic neuron being in active state per unit time, together with expected
result, over a range of frequencies [1:100Hz]
In summary, it can be concluded that the hardware implementation of STADP does
indeed have the right characteristics, and that there is reason to believe that although
the in-hardware simulation result at 5kHz is less ideal, the actual hardware running at
90MHz behaves more like the expected simulation model.
7.2 Modifications for the experimental Setup
This section outlines some important modifications made to the SPU that were specific
to the experimental hardware used and the experiments conducted. These
modifications to the SPU are not generally applicable.
The experimental hardware used, namely the FPGA board, had one unexpected
shortcoming: the AER in port, i.e. the post-synaptic feedback connection, suffered
from timing inaccuracies on the data bus. While the control signals, request and
acknowledge, worked as desired, indicating address events correctly, the actual data,
i.e. the address representation, was being read by the AER module before the data
bus could settle, thereby essentially producing random AER data inputs. This is a
hardware bug that remains to be solved, but doing so would go beyond the scope of this
project. Instead, the address of the post-synaptic neuron was hardwired into the SPU.
This was only possible since the experiments conducted only made use of a single
post-synaptic silicon neuron.
Section 5.5 Integration into the FPGA board briefly mentioned the so-called synapse
selector. This module was implemented in order to be able to conduct the
experiments (described later), which made use of one single post-synaptic
silicon neuron with a large number of pre-synaptic inputs, 256 to be precise.
The DPI synapses are designed to operate in biologically plausible regimes, receiving
pre-synaptic inputs of up to, say, 100-200Hz, although typical firing rates are around
50Hz. The classification experiment (which will be described in a later section)
however required those 256 pre-synaptic inputs, operating at expected firing rates of
up to about 50Hz each, to stimulate the same post-synaptic neuron. This ~13kHz
input would have overwhelmed the input bandwidth of the single DPI synapse –
through which all pre-synaptic activity is routed to the post-synaptic neuron – which
was observed to saturate at a pre-synaptic firing rate of about 12kHz. Luckily, the
aVLSI neuron chip featured two identical static DPI synapses per I&F neuron, so that
the pre-synaptic load could be shared between both.
The task of the synapse selector was to make sure that spikes were sent to both DPI
synapses in an alternating fashion, by toggling one address bit in the output address.
Measurements of this are given in the Circuit calibration section.
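The load calculation and the toggling scheme can be illustrated with a short Python sketch. The names below are hypothetical, and the position of the toggled address bit is an assumption (bit 1, counting from 0); only the figures of 256 inputs, about 50Hz per input and two static DPI synapses per neuron are taken from the text.

```python
def total_input_rate(n_inputs, rate_hz):
    """Aggregate spike rate arriving at one post-synaptic neuron."""
    return n_inputs * rate_hz


class SynapseSelector:
    """Alternate outgoing spikes between two synapses by toggling one address bit."""

    def __init__(self, toggle_bit=1):      # bit position is an assumption
        self.mask = 1 << toggle_bit
        self.state = 0

    def route(self, address):
        """Return the output address with the toggle bit set on every other spike."""
        out = (address & ~self.mask) | (self.mask if self.state else 0)
        self.state ^= 1                    # next spike goes to the other synapse
        return out
```

With 256 inputs at 50Hz each, total_input_rate gives 12800 spikes per second; alternating between the two synapses halves this to about 6.4kHz per synapse, comfortably below the observed saturation at about 12kHz.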
Finally, the SPU’s output is a neuron address identifying the post-synaptic neuron
from within all the neurons on the neuron chip. However, the neuron chip allows for
addressing of individual synapses on the chip (which makes sense, and in fact, the
SPU’s addressing scheme is also based on uniquely identifying synapses). Each
neuron has several synapses, including some with rich dynamics and local learning
rules, which were not used, while the two excitatory static DPI synapses were used.
Therefore, the missing synapse identifier also had to be hardwired into the SPU. In
particular, this required hardwiring the bottom 5bits of the AER out address to
always send spikes through the static DPI synapses (in fact, the 2nd bit was not static,
but toggled by the synapse selector).
7.3 Circuit calibration
(In the following paragraphs, whenever a stimulus is presented or sent to the SPU, this
refers to sending a text file with address and inter-spike interval timestamps, generated
by the spiking neuron toolbox for Matlab, using the Linux script aexstim developed by
Giacomo Indiveri.)
Before the SPU can be operated together with an aVLSI neuron chip to form a neural
system, they need to be calibrated to obtain the desired behaviour.
For the experiments that are to be carried out, the system should be calibrated for one
single post-synaptic neuron receiving 256 different pre-synaptic inputs. One way of
looking at this system is to consider the post-synaptic neuron to be performing a
mapping of the total pre-synaptic input frequency, i.e. the firing rate at the input to
the DPI synapse which is equal to the sum of all individual pre-synaptic firing rates, to
a post-synaptic output frequency. Since the total input frequency is high, the synaptic
weight of the DPI synapse has to be reduced to a level where this mapping is linear,
and does not drive the post-synaptic neuron at too high frequencies, or into saturation.
In particular, a mapping of approximately [0, 12.8kHz] total pre-synaptic frequency to
[0, 100Hz] post-synaptic frequency is required.
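The calibration target described above can be written down as a short sketch. This is only an idealised illustration of the desired linear mapping, not the measured behaviour of the aVLSI chip; the function and constant names are hypothetical.

```python
# Idealised calibration target: a linear map from total pre-synaptic rate
# at the DPI synapse to post-synaptic rate. Names are illustrative only.

N_INPUTS = 256               # pre-synaptic inputs converging on one neuron
TOTAL_IN_MAX_HZ = 12_800.0   # upper end of the calibrated input range
POST_MAX_HZ = 100.0          # upper end of the desired output range


def target_post_rate(total_pre_rate_hz: float) -> float:
    """Return the ideal post-synaptic rate for a given total input rate.

    Inputs outside [0, 12.8 kHz] are clipped, standing in (crudely) for
    the saturation that the real neuron shows outside its linear region.
    """
    gain = POST_MAX_HZ / TOTAL_IN_MAX_HZ  # 1/128 Hz of output per Hz of input
    clipped = min(max(total_pre_rate_hz, 0.0), TOTAL_IN_MAX_HZ)
    return gain * clipped


# 256 inputs at a mean of 25 Hz each give 6.4 kHz in total,
# which the ideal mapping places at the 50 Hz equilibrium point.
total_rate = N_INPUTS * 25.0
print(total_rate, target_post_rate(total_rate))
```

Under this idealisation, a total input of 6.4kHz corresponds to the 50Hz mean post-synaptic frequency used as the activity threshold in the experiments.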
In order to achieve this calibration, two DPI synapse parameters, the weight w and
the time constant tau, were adjusted. Then, using pre-synaptic inputs of known
constant frequency to drive the post-synaptic neuron, an input-output relationship
could be established, and w and tau could be tweaked experimentally. Figure 35
shows the final frequency response of the neural system, with parameter values w ~=
0.43V and tau ~= 2.79V. It has a nice linear region for a wide range of pre-synaptic
stimulus frequencies before it starts to saturate, thanks to the synapse selector
mechanism. Figure 36 shows an oscilloscope screenshot where the system is running
at the equilibrium point, at which the post-synaptic neuron fires at mean frequency
50Hz.
Figure 35: Frequency response of the neural system.
Linear post-synaptic frequency response to a wide range of pre-synaptic stimuli frequencies. DPI synaptic weight and time constant parameters are w ~= 0.43V and tau ~= 2.79V.
Figure 36: Oscilloscope screenshot of post-synaptic membrane potential:
A regular teacher signal driving the post-synaptic neuron at ~50Hz. The top signal is the membrane potential of the post-synaptic neuron, the bottom signal is the teacher signal driving it, running at ~6.4kHz.
Using this system calibration (see footnote 10), the desired experiments can be carried out.
7.4 In-circuit verification (see footnote 11)
Before diving into a more elaborate classification experiment using the SPU, it is
worthwhile to perform a final set of in-circuit verification tasks. Firstly, some general
tests were done to verify basic operation, before more elaborate in-circuit verification
tasks were carried out, namely verification of Forwarding, Potentiation and
Depression.
In summary, it was concluded that the basic operation of the SPU behaves as
expected. The other in-circuit verification experiments are described below.
In the following paragraphs, the term ‘teacher signal’ is used to refer to a teacher
stimulus that drives the post-synaptic neuron at the specified frequency, e.g. a 25Hz
teacher signal is a signal that drives the post-synaptic neuron at a frequency of 25Hz.
For most experiments, the on-chip current injection function on the aVLSI neuron
chip was used as teacher signal, rather than an actual pre-synaptic input to the
teacher synapse, because it is more convenient, and does not require the generation
of a multitude of different teacher stimulus files.
7.4.1 Forwarding
The correct operation of the forwarding mechanism was already partly verified during
the calibration, as regular teacher signals were used to obtain the frequency response.
However, there is more to forwarding, and the following functionalities were verified:
• Does the SPU forward teacher signals correctly? (already verified during
calibration)
o Output frequency should correspond to the specified input frequency
o Verified by forwarding spikes of regular ISI, and observing the output
rate.
10 Stored in Matlab script bias_050607.m
11 Videos of in circuit verification (depression and potentiation) available on YouTube. Search terms:
SPU, stadp, cascade, synapse, plasticity
• Does it stop spikes to depressed synapses?
o Initialise all synapses to a depressed state
o Send spikes to depressed synapses
o There should be no output spikes
• Does it forward spikes to potentiated synapses?
o Initialise all synapses to a potentiated state
o Send spikes to potentiated synapses
o There should be output spikes at the input spike rate
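The forwarding checks listed above can be mirrored in a small software model. The sketch below is an assumption-laden stand-in for the hardware test: it treats each synapse as a binary efficacy (0 = depressed, 1 = potentiated) indexed by its address, and the function name is hypothetical.

```python
# Software stand-in for the forwarding tests. The real SPU works on AER
# spike events; here a spike is just the integer address of its synapse.

def forward_spikes(spike_addresses, efficacy):
    """Return only the spikes whose target synapse is potentiated."""
    return [addr for addr in spike_addresses if efficacy[addr] == 1]


N_SYNAPSES = 256
spikes = [3, 17, 17, 200, 255]  # arbitrary example input events

# Test 1: all synapses depressed, so nothing should come out.
all_depressed = [0] * N_SYNAPSES
assert forward_spikes(spikes, all_depressed) == []

# Test 2: all synapses potentiated, so every input spike is forwarded
# and the output rate equals the input rate.
all_potentiated = [1] * N_SYNAPSES
assert forward_spikes(spikes, all_potentiated) == spikes
```

On the hardware, the same two conditions were checked by observing the output spike rate rather than by comparing event lists.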
The ability to correctly forward spikes was verified using the tests described above. An
example Poisson input spike train used to verify forwarding is shown in Figure 37,
and an oscilloscope screenshot showing an example post-synaptic response to
such a spike train is shown in Figure 38.
Figure 37: Example of a coherent 30Hz Poisson spike train to all 256 synapses.
Black dots represent a spike at the time of its occurrence. All spike trains have the same average spike rate.
Figure 38: Oscilloscope screenshot of post-synaptic membrane potential:
A Poisson stimulus driving the post-synaptic neuron at ~30Hz, clearly showing the contributions to the membrane potential by individual incoming pre-synaptic spikes.
7.4.2 Potentiation
In order to verify the ability of the synapses to become potentiated, a range of
different stimuli lasting 1s were applied to the SPU repeatedly, while a teacher signal
was applied at the same time. The teaching time, i.e. the number of trials of 1s
showings, required to drive the post-synaptic frequency above the mean firing rate of
50Hz was recorded as output, over four sets of trials. Every time a new set of
measurements was taken, the SPU was first power cycled to re-initialise the cascade
states to all depressed.
Figure 39 shows a plot of the teaching times against several input stimuli, for two
different teacher signal strengths. The errorbar plot (and all other following errorbar
plots in this report) shows mean and standard deviation of the dataset measured.
Clearly, the synapses are able to potentiate so long as the teacher signal is strong
enough. As the average pre-synaptic frequency increases, so does the effectiveness of
teaching, since the time required to drive the post-synaptic neuron into active state,
i.e. above 50Hz, decreases. However, it should be noted that the strength of the
teacher signal is more critical to the success of potentiation than the pre-synaptic
firing rate itself.
Figure 39: In-circuit verification of potentiation.
The system is indeed plastic, and synapses potentiate stochastically depending on the strength of the teacher signal and the pre-synaptic firing rate. Note that it was not possible to drive the post-synaptic neuron above 50Hz with a teacher signal of 11Hz.
7.4.3 Depression
In order to verify the synapses’ ability to become depressed, a range of different
stimuli lasting 1s were applied to the SPU repeatedly, without applying a teacher
signal at the same time. The depression time, i.e. the number of trials of 1s showings,
required before the post-synaptic frequency decreases to zero was recorded as output,
over three sets of trials. Every time a new set of measurements was taken, the SPU
was first power cycled to re-initialise the cascade states to all potentiated.
Figure 40 shows a plot of the depression times against several input stimuli. The
errorbar plot shows mean and standard deviation of the dataset measured. Indeed,
the synapses are able to become depressed, and the higher the stimulus frequency,
the slower the depression process, i.e. the longer it takes to fully depress the
synapses.
Figure 40: In-circuit verification of depression.
The system is indeed plastic, and synapses depress stochastically depending on the pre-synaptic firing rate.
Figure 41 shows an oscilloscope screenshot capturing the depression of more and
more synapses indirectly. This can be seen by the decreasing firing rate of the post-
synaptic neuron, which eventually dies off and stops firing completely.
The depression of synapses, and in fact the potentiation of synapses as well, has a
kind of positive feedback effect built in that stems from STADP. As more and more
synapses get depressed (potentiated), the post-synaptic neuron is driven by ever
fewer (ever more) pre-synaptic inputs, thereby further reducing (increasing) post-
synaptic activity, leading to more depression (potentiation) events.
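The feedback loop described above can be illustrated with a deliberately crude toy model. It is not the STADP rule implemented in the SPU: it assumes identical rates on all inputs, the idealised linear frequency response, and a single made-up flip probability per trial. All names and parameter values are hypothetical.

```python
import random

GAIN = 100.0 / 12_800.0   # idealised linear response (Hz out per Hz in)
THRESHOLD_HZ = 50.0       # post-synaptic activity threshold
P_FLIP = 0.2              # ASSUMED per-trial probability of a plasticity event


def simulate(efficacy, rate_hz, teacher_hz, trials, rng):
    """Toy feedback loop: returns (potentiated count, post rate) per trial.

    If the post-synaptic rate is above threshold, a random subset of the
    synapses is potentiated; otherwise a random subset is depressed.
    """
    efficacy = list(efficacy)
    history = []
    for _ in range(trials):
        n_potentiated = sum(efficacy)
        # Only potentiated synapses forward their spikes to the neuron.
        post_hz = GAIN * rate_hz * n_potentiated + teacher_hz
        history.append((n_potentiated, post_hz))
        target = 1 if post_hz > THRESHOLD_HZ else 0
        for i in range(len(efficacy)):
            if rng.random() < P_FLIP:
                efficacy[i] = target
    return history


# All 256 synapses start potentiated; 20 Hz per input gives only 40 Hz
# post-synaptic rate, so depression events reduce the drive further.
run = simulate([1] * 256, rate_hz=20.0, teacher_hz=0.0,
               trials=30, rng=random.Random(0))
for n_pot, post in run[:5]:
    print(n_pot, round(post, 1))
```

In this toy setting the number of potentiated synapses and the post-synaptic rate can only fall once the neuron is below threshold, which is the qualitative run-away behaviour seen in the depression experiment.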
Figure 41: Oscilloscope screenshot of decreasing post-synaptic firing rate:
The post-synaptic frequency (upper waveform) decreases as more and more of the initially all-potentiated synapses get depressed. This is due to the pre-synaptic stimulus (lower waveform), which is too low to drive the post-synaptic neuron into active state (>50Hz), thereby causing mostly depression events.
7.5 A real classification task
Having performed in-circuit verification of the SPU’s principal functions, it is ready to
be used in a real classification task.
The task at hand is an image classification task, whereby a neuron with its cascade
synapses is supposed to learn to classify two images (it is understood that it is the
synapses that actually perform the learning, but it is the neural system as a whole,
which is learning to classify). In particular, by teaching one image, and not teaching
the other one, it is supposed to learn to respond to the taught image by being ‘active’,
and to give out no or only a weak response to the other image, after it has been
taught for a certain amount of time.
7.5.1 From image to pre-synaptic stimuli
Here, the image is a 16x16 pixel greyscale bitmap image, where each pixel represents
a pre-synaptic neuron firing at a rate that is given by its pixel colour. Thus, the output
neuron used for the classification task receives 256 pre-synaptic inputs, and
produces one post-synaptic output.
In order to stay within biologically plausible regimes of operation, the greyscale
pixels, which have values that vary between 0 (black) and 255 (white), encode mean
firing rates between 0 and 100Hz. Two pictures, one of Dylan Muir and another one
of Anthony Hsiao, were used, scaled down to a size of 16x16 and converted to
greyscale. Then the two greyscale images were modified to have the same ‘average
colour’ and therefore encode the same average pre-synaptic frequency, as well as
having the same ‘total colour’, or intensity (the sum of all pixel values), to ensure that
they encode the same total pre-synaptic frequency at the input of the DPI synapse. In
particular, the average colour of the two images is ‘grey’, a value of about 123 (out
of 255), which translates into an average encoded mean firing rate just less than
50Hz per pre-synaptic input, or a total mean firing rate just less than 12.8kHz at the
input of the DPI synapse. When presented to the (uniformly initialised) SPU, with
approximately half of its synapses occupying depressed states, half occupying
potentiated states, the DPI synapse sees a total initial mean firing rate of just below
6.4kHz, which is just below the activity threshold of the post-synaptic neuron (see
frequency response, Figure 35). The picture-to-stimulus conversion process is
depicted in Figure 42.
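The arithmetic behind these figures can be checked with a few lines. The sketch assumes a simple linear pixel-to-rate mapping, which is what the text describes; the function name is hypothetical.

```python
# Linear greyscale-to-rate mapping: 0 (black) -> 0 Hz, 255 (white) -> 100 Hz.

def pixel_to_rate_hz(pixel_value: int, max_rate_hz: float = 100.0) -> float:
    """Map an 8-bit greyscale value to a mean firing rate in Hz."""
    return max_rate_hz * pixel_value / 255.0


N_PIXELS = 16 * 16                      # one pre-synaptic input per pixel

mean_rate = pixel_to_rate_hz(123)       # about 48.2 Hz, just under 50 Hz
total_rate = N_PIXELS * mean_rate       # about 12.35 kHz, just under 12.8 kHz
initial_rate = total_rate / 2           # about 6.17 kHz with half the
                                        # synapses potentiated

print(round(mean_rate, 1), round(total_rate), round(initial_rate))
```

With an average pixel value of about 123, the initial total rate of roughly 6.17kHz indeed lies just below the 6.4kHz that corresponds to the 50Hz activity threshold.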
Figure 42: Using pictures as pre-synaptic stimuli.
From left to right (applies to both rows): Original picture, converted 16x16 pixel greyscale image normalized to have the same total intensity, mapped firing rates ([0:100]Hz) as encoded by greyscale pixel value. Pictures of Anthony Hsiao and Dylan Muir.
The mean firing rates obtained are then converted into 256 Poisson spike trains, as
shown in Figure 43. These are the stimuli with which the classification task is carried
out. For a human, the original images look very different, but even when converted
into spike train stimuli, the pictures show distinct patterns that the synapses will
hopefully be able to learn, enabling the neuron to classify these two images.
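For illustration, a homogeneous Poisson spike train with a given mean rate can be generated by drawing exponentially distributed inter-spike intervals. The sketch below is a generic stand-in, not the spiking neuron toolbox actually used for the experiments; the function name and the example rates are assumptions.

```python
import random


def poisson_spike_train(rate_hz, duration_s, rng):
    """Return sorted spike times (in seconds) of a homogeneous Poisson process.

    Inter-spike intervals are exponentially distributed with mean 1/rate.
    A rate of zero (a black pixel) produces no spikes.
    """
    spike_times = []
    if rate_hz <= 0:
        return spike_times
    t = rng.expovariate(rate_hz)
    while t < duration_s:
        spike_times.append(t)
        t += rng.expovariate(rate_hz)
    return spike_times


# One spike train per pixel; the three rates here are arbitrary examples.
rng = random.Random(42)
example_rates_hz = [0.0, 48.2, 100.0]
trains = [poisson_spike_train(r, duration_s=1.0, rng=rng)
          for r in example_rates_hz]
print([len(train) for train in trains])
```

Each train would then be written out as address and inter-spike-interval pairs before being sent to the SPU.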
Figure 43: Spike trains derived from 16x16 pixel greyscale images of Anthony and Dylan.
Firing rates vary between 0Hz and 100Hz.
7.5.2 Teaching methods
During teaching, the two images (read: the two stimuli derived from the images) are
presented to the system in an alternating fashion, for a given period of time, e.g. 1s
each, for several trials. The image that is to be learned is presented together with a
strong teacher signal, as depicted in the first row of Figure 44, while the other picture
is presented without teacher signal (not shown). During this process, the learning rule
provides for changes in synaptic efficacy, creating a synaptic efficacy mask (column 4
in Figure 44). While in the learning phase, the synapses undergo plasticity, and
ideally learn to assume, and keep, efficacies that allow for a classification of the two
images, i.e. driving the post-synaptic neuron active for the taught image (second row
in Figure 44), and keeping the post-synaptic neuron inactive for the other image
(third row in Figure 44).
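The effect of the synaptic efficacy mask on a stimulus can be sketched as an element-wise gating of the pre-synaptic rates. This is a rate-level simplification (the SPU gates individual spikes, not rates), and the function names are hypothetical.

```python
def masked_rates(rates_hz, mask):
    """Apply a binary efficacy mask (1 = potentiated, 0 = depressed)."""
    if len(rates_hz) != len(mask):
        raise ValueError("rates and mask must have the same length")
    return [rate if m == 1 else 0.0 for rate, m in zip(rates_hz, mask)]


def total_input_rate(rates_hz, mask):
    """Total mean rate arriving at the DPI synapse after masking."""
    return sum(masked_rates(rates_hz, mask))


# Tiny 4-input example: only inputs 0 and 2 are potentiated.
rates = [10.0, 80.0, 30.0, 100.0]
mask = [1, 0, 1, 0]
print(masked_rates(rates, mask), total_input_rate(rates, mask))
```

A mask that suits classification passes a high total rate for the taught image and a low total rate for the other image.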
Two different teaching methods are used and compared, ‘normal’ teaching and
‘bottom-up teaching’. Both teaching methods are used in the same way as explained
above; however, normal teaching starts off with a uniform initialisation of the states
of the cascade synapses, whereas bottom-up teaching starts off with all synapses
initialised to depressed states.
In order to decide whether or not the neuron is able to classify the two images (for
both teaching methods used), a right-sided Student’s t-test for statistical significance
was applied on the difference of the two post-synaptic responses (response to the
taught image [Hz] – response to the other image [Hz]), testing the following
hypotheses:
• Null hypothesis (H0): the difference in the post-synaptic responses to the
taught and the other image comes from a distribution with mean zero (i.e.
there is no difference in the post-synaptic frequencies in response to the
taught and the other image)
• Alternative hypothesis (H1): there is indeed a positive difference in the post-
synaptic responses (i.e. the post-synaptic frequency in response to the taught
image is greater than the post-synaptic frequency in response to the other
image)
The t-tests are evaluated at the 5% level, returning a p-value p, which is the
probability, under the null hypothesis, of observing a difference at least as large
as the one measured, and a one-sided 95% confidence interval CI for the true mean
difference.
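A right-sided one-sample t-test of this kind can be sketched using only the Python standard library. The report does not state which tool was used, so this is an illustrative re-implementation under the usual assumptions (independent, approximately normal differences); the t distribution is integrated numerically, and all function names are hypothetical.

```python
import math
import statistics


def t_pdf(t, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_sf(t, df, steps=2000):
    """Upper tail P(T > t), using Simpson's rule on [0, |t|] and symmetry."""
    a = abs(t)
    h = a / steps
    area = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return 0.5 - area if t >= 0 else 0.5 + area


def t_upper_quantile(alpha, df):
    """Find t with P(T > t) = alpha by bisection (assumes 0 < alpha < 0.5)."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def right_sided_t_test(differences, alpha=0.05):
    """Test H0: mean difference = 0 against H1: mean difference > 0.

    Needs at least two values. Returns (p, (ci_low, inf)); both are NaN
    if the differences have zero variance, where t is undefined.
    """
    n = len(differences)
    mean = statistics.mean(differences)
    sem = statistics.stdev(differences) / math.sqrt(n)
    if sem == 0:
        return math.nan, (math.nan, math.inf)
    df = n - 1
    p = t_sf(mean / sem, df)
    ci_low = mean - t_upper_quantile(alpha, df) * sem
    return p, (ci_low, math.inf)


# Made-up example differences (taught response minus other response, in Hz).
p_value, ci = right_sided_t_test([1.0, 2.0, 3.0, 4.0, 5.0])
print(round(p_value, 4), round(ci[0], 3))
```

The neuron is then counted as ‘able to classify’ when p falls below the chosen level, which is equivalent to the lower bound of the confidence interval lying above zero.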
Figure 44: Conceptual procedure of a real classification task.
The classification task involves two phases: Teaching and Classification. The top row depicts one part of the teaching phase, during which both the image that is to be learned and the other image are presented to the system in an alternating fashion (the presentation of the other image is not shown here), in order to ‘teach’ the correct synaptic weights. During the classification phase, depicted in the second and third row, presenting the taught picture should result in a high post-synaptic firing rate, while presenting the other image should result in a low (if any) post-synaptic firing rate.
The different steps involved in the classification task are represented by images in the different columns: 1. Original images 2. Converted 16x16 pixel greyscale images 3. Pre-synaptic firing rates ([0:100]Hz, [maroon : dark blue]) as mapped from the greyscale pixel value 4. Synaptic mask comprising the binary synapse efficacies 5. Resulting ‘masked’ stimulus at the input of the aVLSI neuron’s DPI synapse.
7.5.3 Results – Normal teaching
Three different experimental parameters were changed during the normal teaching
classification experiments, namely the image taught (Dylan or Anthony), the order in
which the images were presented to the neuron during the learning phase (show the
taught image first or show the other image first) and the strength of the teacher
signal (22Hz or 25Hz).
For each of those eight classification trials, each image was presented for a total of
10s (i.e. showing image A for 1s, then showing image B for 1s, and repeat this
another 9 times), before the actual classification. So after teaching (or not teaching)
the images in an alternating fashion for a total of 10s each, the images were once
again presented to the neuron one after the other, but this time without any teacher
signal. After every presentation (regardless of which image and which trial), the post-
synaptic frequency was measured with an oscilloscope and recorded. This was
repeated N times to obtain a set of results.
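The presentation order within one such trial can be summarised as a schedule of (image, teacher on/off) pairs. The sketch below only encodes the ordering described above; the function name and the tuple representation are assumptions made for illustration.

```python
def teaching_schedule(taught, other, taught_first=True, repetitions=10):
    """Build the list of 1s presentations for one classification trial.

    Each entry is (image, teacher_on). During teaching, the teacher signal
    accompanies only the taught image. The final two entries are the
    classification showings, both without teacher signal.
    """
    order = (taught, other) if taught_first else (other, taught)
    schedule = []
    for _ in range(repetitions):
        for image in order:
            schedule.append((image, image == taught))
    for image in order:
        schedule.append((image, False))
    return schedule


# Example: teach Dylan, but show Anthony first.
for step in teaching_schedule("Dylan", "Anthony", taught_first=False)[:4]:
    print(step)
```

With ten repetitions, each image is shown for a total of 10s during teaching, followed by one test showing of each image.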
The following pages show commented plots of the results for the different
classification trials, using normal teaching.
Figure 45: Classification task: Teach Dylan, show Dylan first, at 22Hz.
The two pictures are presented to the neural system in an alternating fashion, for 1s each. Last data point without teacher signal. On final showing, the neuron is (just) unable to classify the two pictures (p = 0.0556), even though the mean post-synaptic firing rate in response to the taught signal is higher.
Figure 46: Classification task: Teach Dylan, show Anthony first, at 22Hz.
The two pictures are presented to the neural system in an alternating fashion, for 1s each. Last data point without teacher signal. On final showing, the neuron is unable to classify the two pictures (p = 0.2500), even though the mean post-synaptic firing rate in response to the taught signal is higher.
Figure 47: Classification task: Teach Dylan, show Dylan first, at 25Hz.
The two pictures are presented to the neural system in an alternating fashion, for 1s each. Last data point without teacher signal. On final showing, the neuron is unable to classify the two pictures (p = 0.1187), even though the mean post-synaptic firing rate in response to the taught signal is higher.
Figure 48: Classification task: Teach Dylan, show Anthony first, at 25Hz.
The two pictures are presented to the neural system in an alternating fashion, for 1s each. Last data point without teacher signal. On final showing, the neuron is able to classify the two pictures. The difference in post-synaptic frequencies is statistically significant at 5% level with (p = 0.0413, CI = [2.029, inf]).
Figure 49: Classification task: Teach Anthony, show Anthony first, at 22Hz.
The two pictures are presented to the neural system in an alternating fashion, for 1s each. Last data point without teacher signal. On final showing, the neuron is unable to classify the two pictures (p = 0.1872), even though the mean post-synaptic firing rate in response to the taught signal is higher.
Figure 50: Classification task: Teach Anthony, show Dylan first, at 22Hz.
The two pictures are presented to the neural system in an alternating fashion, for 1s each. Last data point without teacher signal. On final showing, the neuron is unable to classify the two pictures.
Figure 51: Classification task: Teach Anthony, show Anthony first, at 25Hz.
The two pictures are presented to the neural system in an alternating fashion, for 1s each. Last data point without teacher signal. On final showing, the neuron is unable to classify the two pictures (p = 0.0964), even though the mean post-synaptic firing rate in response to the taught signal is higher.
Figure 52: Classification task: Teach Anthony, show Dylan first, at 25Hz.
The two pictures are presented to the neural system in an alternating fashion, for 1s each. Last data point without teacher signal. On final showing, the neuron is unable to classify the two pictures (p = 0.3264), even though the mean post-synaptic firing rate in response to the taught signal is higher.
Only one out of the eight classification trials was actually able to classify the two
images successfully (using the given definition of ‘able to classify’), although in all
trials bar one, the mean post-synaptic frequency in response to the taught image was
higher than the response to the other image. The results for normal teaching are
summarised in Table 2.
Teacher   Taught image   Shown first   Able to classify?   t-test result
22Hz      Dylan          Dylan         No                  p = 0.0556
22Hz      Dylan          Anthony       No                  p = 0.2500
25Hz      Dylan          Dylan         No                  p = 0.1187
25Hz      Dylan          Anthony       Yes                 p = 0.0413
22Hz      Anthony        Anthony       No                  p = 0.1872
22Hz      Anthony        Dylan         No                  NaN
25Hz      Anthony        Anthony       No                  p = 0.0964
25Hz      Anthony        Dylan         No                  p = 0.3264
Table 2: Summary of normal teaching results
7.5.4 Results - Bottom-up teaching
The experimental procedures for the bottom-up teaching experiments were the same
as for normal teaching (i.e. 10 repetitions of alternating presentation of the images,
repeated N times). However, only two experimental parameters were changed during
the bottom-up teaching classification experiments, namely the image taught (Dylan or
Anthony) and the strength of the teacher signal (50Hz or 70Hz) – since all synapses
are initially depressed in the bottom-up teaching, it would be meaningless to present
the other image without teacher signal first. In addition, one further set of teaching
trials at 50Hz teacher strength, but presenting each image for 2s rather than 1s was
conducted.
The following pages show commented plots of the results for the six different
classification trials using the bottom-up teaching method.
Figure 53: Classification task: Bottom-up teaching Dylan, at 50Hz.
Here, the two pictures are presented to the neural system in an alternating fashion, for 1s each. On final showing, the post-synaptic neuron is able to classify the two pictures. The difference in post-synaptic frequencies is statistically significant at the 5% level, and indeed, it is significant at the 2.5% level as well (p = 0.0125, CI = [9.350, inf]).
Figure 54: Classification task: Bottom-up teaching Dylan, at 70Hz.
Here, the two pictures are presented to the neural system in an alternating fashion, for 1s each. On final showing, the post-synaptic neuron is able to classify the two pictures. The difference in post-synaptic frequencies is statistically significant at the 5% level, and indeed, it is significant at the 2.5% level as well (p = 0.0151, CI = [8.016, inf]).
Figure 55: Classification task: Bottom-up teaching Dylan, for 2s at 50Hz.
Here, the two pictures are presented to the neural system in an alternating fashion, for 2s each. On final showing, the post-synaptic neuron is able to classify the two pictures. The difference in post-synaptic frequencies is statistically significant at the 5% level, and indeed, it is significant at the 0.5% level as well (p = 0.0049, CI = [2.360, inf]).
Figure 56: Classification task: Bottom-up teaching Anthony, at 50Hz.
Here, the two pictures are presented to the neural system in an alternating fashion, for 1s each. On final showing, the post-synaptic neuron is able to classify the two pictures. The difference in post-synaptic frequencies is statistically significant at the 5% level, and indeed, it is significant at the 2.5% level as well (p = 0.0123, CI = [6.746, inf]).
Figure 57: Classification task: Bottom-up teaching Anthony, at 70Hz.
Here, the two pictures are presented to the neural system in an alternating fashion, for 1s each. On final showing, the post-synaptic neuron is unable to classify the two pictures (p = 0.0601), even though the mean post-synaptic firing rate in response to the taught signal is higher.
Figure 58: Classification task: Bottom-up teaching Anthony, for 2s at 50Hz.
Here, the two pictures are presented to the neural system in an alternating fashion, for 2s each. On final showing, the post-synaptic neuron is able to classify the two pictures. The difference in post-synaptic frequencies is statistically significant at the 5% level, and indeed, it is significant at the 1% level as well (p = 0.0072, CI = [9.914, inf]).
All bar one out of the six bottom-up teaching classification trials were able to
successfully classify the two images, and in all trials, the mean post-synaptic
frequency in response to the taught image was higher than the response to the other
image. Furthermore, it appears that teaching for 2s each produces better results.
The results for bottom-up teaching are summarised in Table 3.
Taught image   Teacher    Able to classify?   t-test results
Dylan          50Hz, 1s   Yes                 p = 0.0125, CI = [9.351, inf]
Dylan          70Hz, 1s   Yes                 p = 0.0151, CI = [8.016, inf]
Dylan          50Hz, 2s   Yes                 p = 0.0049, CI = [2.360, inf]
Anthony        50Hz, 1s   Yes                 p = 0.0123, CI = [6.746, inf]
Anthony        70Hz, 1s   No                  p = 0.0601, CI n/a
Anthony        50Hz, 2s   Yes                 p = 0.0072, CI = [9.914, inf]
Table 3: Summary of bottom-up teaching results
7.5.5 Remarks on the classification experiments
The different classification trials presented above displayed a high degree of variation,
and the final post-synaptic frequencies differ considerably from trial to trial without
displaying a clear pattern or ‘preferred’ post-synaptic frequency of recognition and
rejection. This may be due in part to the stochastic processes that go on inside the
SPU, and in part to the fact that within each classification trial, only a limited
amount of data was available (see footnote 12).
In order to be able to make a more general statement about the ability of the neuron
to classify the two images, the Student's t-test is applied on the two grouped datasets
of the two different teaching methods, testing whether the neuron can (in general)
classify the two images using normal teaching or the bottom-up teaching method,
irrespective of which image was being taught or which one was presented first, etc.
The results are shown in Table 4 below.
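The hypothesis test described above can be sketched as follows. This is a minimal illustration only, assuming that the test is a one-sided Student's t-test on the per-trial differences (response to taught image minus response to other image) and that the reported intervals of the form [x, inf] are one-sided lower confidence bounds. The function names and the example data are hypothetical and are not taken from performTTest.m.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using the symmetry of the density."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3
    return 0.5 + half if x >= 0 else 0.5 - half


def t_quantile(p, df):
    """Quantile of the t distribution, found by bisection on the CDF."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def one_sided_t_test(diffs, alpha=0.05):
    """Test H0: mean difference <= 0 against H1: mean difference > 0.

    Returns the t statistic, the one-sided p-value and the lower bound of
    the one-sided (1 - alpha) confidence interval [lower, inf].
    """
    n = len(diffs)
    df = n - 1
    m = mean(diffs)
    se = stdev(diffs) / math.sqrt(n)
    t_stat = m / se
    p_value = 1.0 - t_cdf(t_stat, df)
    lower = m - t_quantile(1.0 - alpha, df) * se
    return t_stat, p_value, lower


if __name__ == "__main__":
    # Hypothetical frequency differences in Hz, for illustration only.
    example_diffs = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(one_sided_t_test(example_diffs))
```

Under these assumptions, the neuron would be judged able to classify when the p-value falls below the chosen significance level, which is equivalent to the lower confidence bound being above zero.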
12 Unfortunately, due to time constraints during the project, it was not possible to do more extensive
experiments and data collection. This was mainly due to the fact that the prototype hardware was in
development at the same time as this project and that it had to be shared between two different projects
using it. Furthermore, the hardware is being used at a workshop in the USA as this report is being
written, imposing a strict deadline for using it for experiments. In addition, a considerable amount of
time was spent debugging the feedback AER bus.
Teaching method      Able to classify?   t-test result
Bottom-up teaching   yes                 p = 0.0000, CI = [10.353, inf]
Normal teaching      yes                 p = 0.0043, CI = [4.141, inf]

Table 4: Classification results for grouped data. Significant at 0.5% level.
From the results for grouped data, it becomes much more obvious that, indeed, the
classification works, and that the post-synaptic frequency in response to the taught
image is (or should be, within the limits that the underlying randomness permits) in fact
always higher than the response to the other image. This is an encouraging result,
which can be built upon through more thorough investigation.
Other general comments and experimental observations:
• While working with the system, jumpy or oscillatory behaviour in response to
plasticity was observed – for example, stimulating it with a homogeneous
pre-synaptic input would sometimes lead to a sudden, jumpy increase or decrease of
the post-synaptic frequency
• Depression happened more quickly and readily than potentiation
• Synapses were very plastic, if not ‘too plastic’, i.e. the post-synaptic frequency
responded in timescales of seconds rather than 10s of seconds or minutes
8 Discussion
‘If I keep an open mind, will my brain fall out?’ – Anonymous
The development of the SPU has come a long way in this project: from developing a
custom-made learning rule (although it would be nice if STADP could find its way
into other applications as well), through the development of the SPU itself, essentially
from scratch, to implementing it on a real FPGA board and performing a real
classification task with it. Here, the hardware, learning rule and classification task will
be discussed separately, before finishing off with remarks on calibrating the neural
system.
8.1 The hardware
When making practical use of the SPU, it would always have to be integrated with an
aVLSI neuron chip (or a PC emulating one). This would always necessitate a certain
amount of manual effort to calibrate the neuron chip and set its biases, and to ensure
that the addressing format used within the SPU and the neuron chip are compatible.
Having said that, the SPU is parameterized and it should be easy to adapt it to the
neural environment in place. Since it is written in VHDL, it could even be ported to a
different architecture, should this be necessary. In that case, extra care would have to
be taken to ensure that platform specific features such as memory or other constraints
are met.
The architecture of the SPU is modular and transparent. The STADP learning rule
could easily be replaced by another one if required, as long as the replacement is
fully pipelined, with few modifications to the SPU itself. The virtualisation of
the cascade synapses is straightforward and allows for the implementation of a very
large number of synapses. This greatly adds to the usability of the SPU in a real
neural system with high connectivity, and expanding the number of synapses
implemented on the chip beyond the current 8192 requires little effort.
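The virtualisation idea can be illustrated in software. The following is a minimal sketch, not the VHDL design itself: it assumes that each synapse is represented by a single integer state word held in a memory indexed by the synapse address, and that one shared piece of processing logic is applied to whichever state is loaded. The class name, the method name and the update callback are hypothetical.

```python
class VirtualSynapseArray:
    """Software analogue of virtualised synapses: one state word per address,
    loaded, processed and written back only when an event addresses it."""

    def __init__(self, num_synapses=8192, initial_state=0):
        # The memory stands in for the on-chip cascade state memory.
        self.memory = [initial_state] * num_synapses

    def process_event(self, address, update):
        state = self.memory[address]       # load the state on demand
        new_state = update(state)          # shared processing logic (assumed)
        self.memory[address] = new_state   # write the result back
        return new_state


if __name__ == "__main__":
    synapses = VirtualSynapseArray()
    # Hypothetical update rule: simply count the events seen by a synapse.
    synapses.process_event(5, lambda state: state + 1)
    print(synapses.memory[5])
```

In this picture, the number of synapses is set by the size of the memory rather than by the amount of processing logic, which is why expanding it requires little effort.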
Finally, since it is fully pipelined, the number of synapses implemented is virtually
only limited by the throughput of the AER bus and the amount of available memory
on the chip or board, which is more likely to be a bottleneck than the inability of the
SPU to process all those synapses. Indeed, in the hardware environment the SPU was
tested and used in, it was the aVLSI neuron chip (the DPI synapse, to be precise)
rather than the SPU which proved to be a bottleneck. Having said this, at no point
was the AER bus driven to the limits of its capacity, so a real challenge for synaptic
processing was not encountered.
8.2 STADP
STADP was developed with the aim of providing the SPU with a simple yet capable
general learning rule that is easily implemented in digital hardware. Its functioning
in principle was verified both in simulation and in hardware (and proven in circuit),
but on closer inspection, it has one inherent non-ideality. As shown during all these
verification efforts, and as observed during experimentation, STADP has a slight bias
towards depression, which occurs at a higher rate than potentiation, under otherwise
equal circumstances.
Further investigation revealed that this bias stems from the asymmetry in the p(active)
curve of the post-synaptic neuron, i.e. the probability of it being in the active state for
any given frequency. In particular, this is due to the fact that within the regime the SPU
was operated in during experimentation (post-synaptic frequencies of 0-100Hz), the
p(active) never actually reaches a value of 1. Some measures to counter this effect
were suggested, including setting a minimum value for the expiry time interval, and
further investigation into the usefulness of such measures would be desirable.
However, another way of looking at the bias towards depression is to regard it as
some form of global inhibitory process, which could, when used in the right way,
have highly desirable effects on the bottom line functionality of the learning rule.
Some work such as [3] actually makes use of such mechanisms, and it could be
worthwhile to address this ‘non-ideality’ of STADP by making good use of it rather
than trying to get around it.
Despite this bias, STADP has proven to be a capable learning rule, and is
fully functional inside the SPU.
8.3 The classification task
To put the design of the SPU to an ultimate test, a real classification task was chosen
to investigate its learning capabilities and the classification abilities of a neuron
augmented with cascade synapses from the SPU.
Constructing pre-synaptic stimuli from two greyscale images of Dylan and Anthony,
and teaching the neuron one of them at a time using two different teaching
paradigms, it was concluded that the neuron (read: the neural system consisting of a
neuron with cascade synapses) can indeed learn to classify them correctly, in general,
both using normal teaching as well as using bottom-up teaching. A hypothesis test
using the Student's t-test at the 5% level of significance was used to decide whether
the neuron was actually able to correctly classify two images, by testing for the
distribution of the differences in the post-synaptic responses to the taught and other
image.
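The construction of pre-synaptic stimuli from a greyscale image can be sketched as follows. This is an illustration under stated assumptions rather than the content of createAllFiles.m: it assumes 8-bit pixel values, a linear mapping of pixel intensity onto a firing rate between 0 and 100Hz, and homogeneous Poisson spike trains generated from exponentially distributed inter-spike intervals. All function names and the maximum rate are hypothetical.

```python
import random


def pixel_to_rate(pixel, max_rate_hz=100.0):
    """Linearly map an 8-bit greyscale value (0..255) to a firing rate in Hz.
    The linear mapping and the 100Hz maximum are assumptions."""
    return max_rate_hz * pixel / 255.0


def poisson_spike_train(rate_hz, duration_s, rng):
    """Spike times of a homogeneous Poisson process, built from
    exponentially distributed inter-spike intervals."""
    spikes = []
    if rate_hz <= 0:
        return spikes
    t = rng.expovariate(rate_hz)
    while t < duration_s:
        spikes.append(t)
        t += rng.expovariate(rate_hz)
    return spikes


def image_to_spike_trains(image, duration_s, seed=0):
    """One spike train per pixel; image is a flat list of greyscale values
    (256 entries for a 16x16 image, one per cascade synapse)."""
    rng = random.Random(seed)
    return [poisson_spike_train(pixel_to_rate(p), duration_s, rng) for p in image]


if __name__ == "__main__":
    # Hypothetical image: left half black, right half white.
    image = [0] * 128 + [255] * 128
    trains = image_to_spike_trains(image, duration_s=1.0)
    print(len(trains), len(trains[0]), len(trains[-1]))
```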
Some of the results of this hypothesis test might appear counterintuitive, however,
since at times the mean post-synaptic frequency in response to a taught image
seemed to be much higher than that in response to the other image, while the
hypothesis test would conclude that the neuron is unable to classify the two images.
The claims from both sides are valid, but it has to be pointed out that unfortunately
only a small set of data was available to base the analysis on. Since the SPU has
several underlying stochastic processes at its heart, those small data sets could easily
have been corrupted by chance events. However, as the data set increases, more trust
should be given to the hypothesis test (and indeed, as the amount of data increases,
the number of counterintuitive results should decrease), which is what was done by
grouping the available data together. It is from this grouped data set that the
encouraging result stems: the neuron is indeed able to learn to classify the two
images, and the post-synaptic response to the taught image is always higher than the
response to the other image.
The limited amount of data also prevented any conclusions from being drawn about the way
parameters of the teaching methods, such as the teacher frequency or which image to
present first, had an impact on the ability to classify images correctly. This is an
unfortunate experimental shortcoming that was partly due to external circumstances.
The reported post-synaptic frequencies upon classification varied considerably: there
was not one ‘preferred’ post-synaptic frequency for ‘recognition’ and ‘rejection’, but
instead a wide range of post-synaptic frequencies was observed (even though the
‘recognition’ signal, i.e. the response to the taught image, was practically always at a
higher frequency than the ‘rejection’ signal). This raises the question of which
underlying mechanisms are responsible for the variations. A closer look at what
might be happening during the learning process could give an answer.
From Figure 44 it becomes clear that the formation or modification of the synaptic
mask by STADP is at the heart of the learning process. STADP has the property that it
tends to potentiate quickly those synapses which receive a high pre-synaptic
frequency from the taught image and a low pre-synaptic frequency from the other
image, and depress quickly those synapses which receive a low pre-synaptic
frequency from the taught image and a high pre-synaptic frequency from the other
image. Synapses with low firing rates for both images would tend to oscillate
between efficacies but since they have low firing rates, this would not have a large
impact on the neuron, and would not happen frequently either. Conversely, synapses
with high pre-synaptic firing rates for both images’ inputs would tend to oscillate
between efficacies rapidly, which can potentially have significant ‘noise-like’ effects
on the post-synaptic neuron, since strong signals could randomly be forwarded or
blocked by the oscillating part of the synaptic mask. These expected effects for a
synapse during learning are summarised in Figure 59.
Figure 59: Expected effects on a synapse during the learning phase of the classification task
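The four cases described in the text can be written down as a small lookup. This is only a restatement of the qualitative argument: the single threshold separating ‘low’ from ‘high’ pre-synaptic frequencies (here 50Hz) and the function name are assumptions made for illustration.

```python
def expected_effect(rate_taught_hz, rate_other_hz, high_threshold_hz=50.0):
    """Qualitative effect of STADP on one synapse during learning, given the
    pre-synaptic rates it receives from the taught and the other image.
    The hard threshold between 'low' and 'high' is an assumption."""
    taught_high = rate_taught_hz >= high_threshold_hz
    other_high = rate_other_hz >= high_threshold_hz
    if taught_high and not other_high:
        return "potentiates quickly"
    if other_high and not taught_high:
        return "depresses quickly"
    if taught_high and other_high:
        return "oscillates rapidly (noise-like, undecided)"
    return "oscillates slowly (little impact)"


if __name__ == "__main__":
    for taught, other in [(90, 10), (10, 90), (90, 90), (10, 10)]:
        print(taught, other, expected_effect(taught, other))
```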
This implies that synapses which receive ‘high’ pre-synaptic frequency stimuli from
both the taught and the other image cannot learn whether to be potentiated or
depressed, which puts a limit on the ability of the neuron to classify the two images.
Furthermore, if the post-synaptic neuron is firing just below the mean firing rate
(50Hz), then a sudden and random switch of a number of such undecided synapses
to a potentiated state can easily push the post-synaptic frequency above the mean
rate, and initiate a positive feedback loop which leads to more potentiation events for
all (other) synapses. Similarly, if the post-synaptic neuron is firing just above the
mean firing rate, a sudden and random switch of a number of such undecided
synapses to a depressed state can easily pull the post-synaptic frequency below the
mean firing rate, and initiate a positive feedback loop which leads to more
depression events for all (other) synapses.
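The positive feedback loop can be made concrete with a deliberately simple toy model. It is not a model of the SPU: it assumes that the post-synaptic frequency is proportional to the fraction of potentiated synapses, and that this fraction drifts deterministically towards 1 when the frequency is above the mean rate and towards 0 when it is below. The function name and all parameter values are hypothetical.

```python
def simulate_feedback(initial_fraction, steps=200, rate=0.05,
                      max_freq_hz=100.0, mean_rate_hz=50.0):
    """Deterministic toy model of the feedback loop around the mean rate.

    x is the fraction of potentiated synapses; the post-synaptic frequency
    is assumed to be max_freq_hz * x. Above the mean rate, a fraction `rate`
    of the remaining depressed synapses potentiates per step; below it, the
    same fraction of the potentiated synapses depresses.
    """
    x = initial_fraction
    trajectory = [x]
    for _ in range(steps):
        freq = max_freq_hz * x
        if freq > mean_rate_hz:
            x += rate * (1.0 - x)
        elif freq < mean_rate_hz:
            x -= rate * x
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # Starting just above or just below the mean rate gives opposite outcomes.
    print(simulate_feedback(0.52)[-1], simulate_feedback(0.48)[-1])
```

In this toy model, a small perturbation across the mean rate is enough to change the final state completely, which mirrors the jumpy behaviour described in the text.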
This can explain both the jumpy nature of plasticity and of the post-synaptic
frequency observed experimentally, and the variations in the reported post-synaptic
frequencies upon classification. Furthermore, there are implications for STADP, too,
namely that for it to be able to learn to classify, pre-synaptic stimuli have to be
sufficiently dissimilar. This analysis also helps to further the understanding
of the way different teaching methods work.
From the results presented, it should be clear that bottom-up learning generally
outperforms normal learning in terms of the post-synaptic neuron’s ability to classify
the two images. This is mainly due to the ‘cleaner’ or more ideal starting condition it
has compared to the normal teaching method: a post-synaptic neuron whose
synapses are being taught using bottom-up teaching does not receive as much
unwanted ‘noise’ from the other image, since a larger proportion of the synapses
that are not readily used by the taught pre-synaptic stimulus is likely to remain
depressed for a longer period of time, thereby blocking out the unwanted ‘noise’.
However, in an online learning system such as the SPU, where ongoing plasticity
immediately and continuously affects the synapses, it is likely that after a long
period of time and exposure to stimuli, the cascade synapses will be in states which
more closely resemble the random initialisation of the cascade memory used in
normal teaching than the mostly-depressed situation that bottom-up teaching
requires. Thus, while bottom-up teaching might perform better, it is not a
sustainable teaching method: it can only be used on initialised memory, unlike the
normal teaching method (which can be used at any point in time).
Finally, an interesting question is whether the results presented could have been
achieved with STDP (as pointed out in the beginning of this report, STDP was one of
the candidate learning rules for the SPU). STDP learns by considering the absolute
time difference between pre- and post-synaptic spikes. In this classification task,
STDP would probably have caused the efficacies of many of the synapses to oscillate,
and hence the neuron would have been unable to classify the two images. This is
because of the nature of the input stimuli. Since every pixel encodes a mean firing rate
which is then converted into Poisson spike trains, any post-synaptic spike would, on
average, have
approximately as many pre-synaptic spikes preceding it as it would have following it.
This would result in an undecided synapse with all the undesirable effects mentioned
before. The only way by which STDP could have taught a synapse in a meaningful
way is if many (most) of the post-synaptic spikes consistently precede or follow many
(most) of the pre-synaptic spikes, which is very unlikely.
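The argument about spike order can be checked numerically with a short sketch. It assumes, as a simplification, that the pre- and post-synaptic spike trains are independent homogeneous Poisson processes, and it counts the pre-synaptic spikes falling in a fixed window before and after each post-synaptic spike. The function names, the rates and the 20ms window are hypothetical choices.

```python
import bisect
import random


def poisson_train(rate_hz, duration_s, rng):
    """Sorted spike times of a homogeneous Poisson process."""
    spikes, t = [], rng.expovariate(rate_hz)
    while t < duration_s:
        spikes.append(t)
        t += rng.expovariate(rate_hz)
    return spikes


def count_pre_around_post(pre, post, window_s):
    """Count pre-synaptic spikes within window_s before and after each
    post-synaptic spike. pre must be sorted in ascending order."""
    before = after = 0
    for t in post:
        lo = bisect.bisect_left(pre, t - window_s)
        mid = bisect.bisect_left(pre, t)
        hi = bisect.bisect_right(pre, t + window_s)
        before += mid - lo
        after += hi - mid
    return before, after


def fraction_preceding(rate_pre_hz, rate_post_hz, duration_s, window_s, seed=0):
    """Fraction of nearby pre-synaptic spikes that precede a post-synaptic spike."""
    rng = random.Random(seed)
    pre = poisson_train(rate_pre_hz, duration_s, rng)
    post = poisson_train(rate_post_hz, duration_s, rng)
    before, after = count_pre_around_post(pre, post, window_s)
    total = before + after
    return before / total if total else float("nan")


if __name__ == "__main__":
    print(fraction_preceding(50.0, 50.0, duration_s=100.0, window_s=0.02))
```

Under the independence assumption the fraction stays close to one half, so a rule based purely on spike order would receive roughly as many potentiating as depressing pairings.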
If, on the other hand, a slightly modified version of STDP using at least one,
preferably two, integrators were used, as suggested by [19], it is entirely possible that
similar or even better results could be achieved. However, this modified STDP would
then be closer to STADP again.
Despite all the non-idealities of STADP, it has proven its worth both in terms of
reduced complexity and ease of implementation as well as in the actual classification
task.
8.4 Calibration of the neural system
The calibration between SPU and the aVLSI neuron chip was done for an arbitrarily
chosen operating regime of ~0-100Hz, and a desired 50Hz mean firing rate output.
Similarly, the choice of teaching methods, using 1s presentation per image with an
arbitrarily chosen teacher signal for example (the teacher signals were not entirely
arbitrary, since they were chosen to be ‘strong enough’, but there is no reason why
they were chosen to have any particular strength), or of using a plasticity threshold of
230, was based on intuition and reason rather than any in-depth knowledge of the
characteristics of the neural system, since this was the first system of its kind.
However, there is nothing to suggest that the neural system could or should not be
calibrated for a different operating regime. Indeed, experimental observations suggest
that the system could benefit from a more formal characterisation of the responses to
the change of several parameters (including the DPI parameters w and tau, the
plasticity threshold, the mean firing rate fm or the total intensity of the greyscale
images and their mapping into pre-synaptic frequencies, amongst others). This
in-depth understanding of the entire neural system (SPU + aVLSI neuron chip) would
greatly improve the performance of future classification tasks, simply by allowing the
experimenter to more closely match the system to the requirements of the
classification task.
This knowledge could be gained through lengthy and thorough experimentation, or a
more analytical and formal analysis of a model of the system. Yet another way to do
this would be to try to mimic a well studied biological system and to attempt to
replicate its behaviour. This would provide a reasonable basis for further
analysis.
9 Conclusion
‘If you cannot - in the long run - tell everyone what you have been doing, your doing
has been worthless’ – Erwin Schrödinger
This project set out to develop a Synaptic Processing Unit (SPU) that implements a
large number of cascade synapses. By using a virtualization strategy whereby a
cascade representation is stored in memory and loaded and processed on demand,
the SPU designed here can implement a total of 8192 binary cascade synapses.
Because of the modular and transparent architecture, the SPU can easily be expanded
or modified, in case more synapses are needed, or a different learning rule, for
example, is desired. It is fully pipelined, and able to handle a high throughput of
address events; in fact, it is expected to be able to process cascade synapses faster
than the currently used communication protocol can supply it with.
A dedicated stochastic Hebbian learning rule called Spike Timing and Activity
Dependent Plasticity (STADP) was developed, characterised and implemented in
order to equip the SPU with an on-chip learning functionality. It is dependent on the
pre-synaptic spike times and the post-synaptic spike rate. This learning rule has
several advantages, including its simplicity and ease of implementation, as well as its
ability to learn general, sufficiently dissimilar patterns, but also has drawbacks, such
as a bias towards long term depression (or reluctance towards long term potentiation).
The SPU was then integrated with an aVLSI neuron chip to form a working
integrated neural system, and put to test by performing a real classification task. This
task involved classifying two 16x16 pixel images, which were converted into pre-
synaptic spike trains and presented to a neuron with 256 cascade synapses. Two
different teaching methods were employed, normal and bottom-up teaching, and in
both cases, the neuron was able to classify the two images. In particular, it was tested
whether the difference in post-synaptic frequency of the response to the taught and
the other image was non-zero and of statistical significance at the 5% level, which it
was. This was used to conclude that the post-synaptic responses are indeed different
for the taught and the other image, which implies a successful classification of the
images. Furthermore, it has to be pointed out that the post-synaptic frequency in
response to the taught image was always higher than the frequency in response to the
other image.
These are two very encouraging results, and there is a lot of scope for further work
on the SPU.
9.1 Refinements
To the best of the author's knowledge, the SPU is the first hardware
implementation of a large number of cascade synapses of its kind in the world. For
that reason, it is important to develop solid working knowledge of experimental
procedures and calibration of integrated neural systems using the SPU and an aVLSI
neuron chip. In particular, devising a methodology by which to determine the best
operating regime for the SPU, as well as characterising the dependence of the
system’s behaviour on some of its most important parameters, including the plasticity
threshold, the balance between pre-synaptic firing rate and synaptic weight (of the
aVLSI synapse on the neuron chip) and the teaching method used for a classification
task, would be necessary to allow for efficient and application-matched usage of the
SPU within other neural systems in the future.
Also, a more thorough analysis into the behaviour of STADP and the root cause for
its bias would be an important contribution. Here, it is proposed that rather than
regarding this as a weakness of the learning rule, it could be investigated whether this
bias towards depression could be looked at as an emergent behaviour instead, which
implements some form of global inhibition.
Another modification to STADP could include the introduction of a third post-
synaptic neuron state. Rather than being either active or inactive at any point in time,
it would make sense to add a state where the activity is ‘neither’, ‘both’ or ‘normal’.
In that state, no plasticity signals would be elicited. This would be interesting, since
currently there is no regime of operation where the synapses do not categorically
undergo plasticity. The SPU is stochastic, yet with a two state STADP learning rule, it
does not allow for any statistical variation in the activity of the post-synaptic neuron,
but instead draws ‘a sharp line’ between its states of activity. In an online learning
scenario such as a classification task, this three state learning rule would have
desirable properties, whereby training progress is less likely to be immediately
overwritten by plasticity events occurring due to statistical variation, thereby
producing better classification results.
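The proposed three-state rule can be sketched as follows. The sketch makes several simplifying assumptions that go beyond the text: that the activity state is decided directly from the post-synaptic frequency, that the neutral state is a dead band centred on the mean firing rate, and that an active post-synaptic neuron elicits potentiation while an inactive one elicits depression. The function name and the width of the dead band are hypothetical.

```python
def plasticity_signal(post_freq_hz, mean_rate_hz=50.0, dead_band_hz=10.0):
    """Three-state sketch: 'potentiate' when the post-synaptic neuron is
    clearly active, 'depress' when it is clearly inactive, and None (no
    plasticity signal) inside a neutral band around the mean rate.
    The thresholds and the frequency-based decision are assumptions."""
    if post_freq_hz > mean_rate_hz + dead_band_hz:
        return "potentiate"
    if post_freq_hz < mean_rate_hz - dead_band_hz:
        return "depress"
    return None


if __name__ == "__main__":
    for freq in (20.0, 45.0, 55.0, 80.0):
        print(freq, plasticity_signal(freq))
```

Setting the dead band to zero recovers a two-state rule with a sharp line at the mean rate, so the width of the band would be one further parameter to characterise.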
Another modification to the way the SPU interacts with the aVLSI neuron chip would
complement the modification proposed above. Currently, all the pre-synaptic spikes
are routed to the post-synaptic neuron through excitatory synapses only. For richer
learning dynamics, it would be interesting to make use of the inhibitory synapses on
the aVLSI neuron chip as well, which would also be expected to improve the learning
and hence the classification capabilities of the SPU.
Finally, the FPGA board used was actually not fully functional: the feedback
AER port violated timing constraints of the data bus, and it would be essential to fix
this. Besides this, the hardware environment was good enough to last several
revisions of the SPU.
10 References
[1] Fusi, Drew, Abbott. Cascade Models of Synaptically Stored Memories.
Neuron, 45, 599–611, 2005.
[2] C. Peterson, R. Malenka, R. Nicoll, J. Hopfield. All-or-none potentiation at
CA3-CA1 synapses. PNAS, 8, 4732–4737, 1998.
[3] S. Fusi, W. Senn. Learning Only When Necessary: Better Memories of
Correlated Patterns in Networks with Bounded Synapses. Neural Computation,
17(10), 2106–2138, 2005.
[4] J.-L. Gaiarsa, O. Caillard, Y. Ben-Ari. Long-term plasticity at GABAergic and
glycinergic synapses: mechanisms and functional significance. Trends in
Neurosciences, 25(11), 564–570, 2002.
[5] L. Abbott, S. Nelson. Synaptic plasticity: Taming the beast. Nature
Neuroscience, 3, 1178–1183, 2000.
[6] G. Indiveri, E. Chicca, R. Douglas. A VLSI array of low power spiking neurons
and bistable synapses with spike timing dependent plasticity. IEEE Trans. on
Neural Networks, 17(1), 211–221, 2006.
[7] Gaiarsa et al. Long-term plasticity at GABAergic and glycinergic synapses:
mechanisms and functional significance. Trends in Neurosciences, 25(11),
564–570, 2002.
[8] S. Park, K. Miller. Random number generators: good ones are hard to find.
Computing practices, 31(10), 1192–1201, 1988.
[9] S. Zhang, D. M. Miller, J. C. Muzio. Minimal Cost One-Dimensional Linear
Hybrid Cellular Automata of Degree Through 500. Journal of
Electronic Testing: Theory and Applications, 6, 255–258, 1995.
[10] D. Rubin, S. Fusi. Storing Sparse random patterns with cascade
synapses. Preprint submitted to Elsevier Science, September 2006.
[11] S. Fusi, L. Abbott. Limits on the memory storage capacity of bounded
synapses. Nature Neuroscience, 10(4), 485–493, 2007.
[12] J. Lisman, N. Spruston. Postsynaptic depolarisation requirements for
LTP and LTD: a critique of spike timing dependent plasticity. Nature
Neuroscience, 8(7), 839–841, 2005.
[13] C. Bartolozzi, G. Indiveri. Synaptic Dynamics in analog VLSI. 2006.
[14] V. Chan, S.-C. Liu, A. van Schaik. A matched silicon cochlea pair with
address event representation interface. IEEE Transactions on Circuits and
Systems I, Regular Papers.
[15] D. Muir. Stochastic synapse for reconfigurable hardware. Telluride
Workshop, 2005.
[16] T. Kringe. A VHDL implementation of the Cascade Synapse Model.
Diploma thesis, 2006.
[17] S. Wolfram. Statistical mechanics of cellular automata. Reviews of
Modern Physics, 55, 601–644, 1983.
[18] R. Gutig, H. Sompolinsky. The Tempotron: a neuron that learns spike
timing-based decisions. Nature Neuroscience, 9, 420–428, 2006.
[19] R. Legenstein, W. Maass. What can a neuron learn with Spike-Timing
Dependent Plasticity? Neural Computation, 17, 2337–2382, 2005.
[20] S. Mitra, S. Fusi, G. Indiveri. A VLSI spike-driven dynamic synapse which
learns only when necessary. Proceedings of IEEE International Symposium on
Circuits and Systems ISCAS06, 2777–2780, 2006.
[21] S. Fusi, personal communication.
[22] G. Kasparov. How life imitates chess. William Heinemann, 2007.
[23] K. Boahen. Neuromorphic Microchips. Scientific American, May 2006.
[24] G. Indiveri, T. Delbruck, S.-C. Liu. Lecture notes: Computation in
Neuromorphic aVLSI Systems. 2006.
10.1.1 Web references
[25] IBM deepblue website, www.research.ibm.com/deepblue
10.1.2 Datasheets and reference books
[26] Xilinx Spartan and Memory
http://direct.xilinx.com/bvdocs/appnotes/xapp173.pdf
[27] configuration and read back
http://direct.xilinx.com/bvdocs/appnotes/xapp176.pdf
[28] Spartan 3 Configuration Guide
http://direct.xilinx.com/bvdocs/userguides/ug332.pdf
[29] Spartan 3 Family Data Sheet
http://direct.xilinx.com/bvdocs/publications/ds099.pdf
[30] Quad Port RAM design
http://direct.xilinx.com/bvdocs/appnotes/xapp228.pdf
[31] FIFO Design http://direct.xilinx.com/bvdocs/appnotes/xapp258.pdf
[32] Spartan 3 Advanced configuration Note
http://direct.xilinx.com/bvdocs/appnotes/xapp452.pdf
[33] Using Block RAM in Spartan 3
http://direct.xilinx.com/bvdocs/appnotes/xapp463.pdf
[34] Using LUTs as distributed RAM
http://direct.xilinx.com/bvdocs/appnotes/xapp464.pdf
[35] Peter J. Ashenden. The Designer's Guide to VHDL
11 Appendix I – Supplementary files
High level Matlab scripts used:
• General
o coe.m – generate delta_t_lut coefficient file
o state_init.m – generate cascade memory initialisation coefficient file
o state_init_dep_pot.m – generate all dep or pot cascade initialisation
• Classification
o chipinit.m – set up environment variables for aVLSI chip
o bias_050607.m – load neuron chip calibration
o scan(127, 127) – observe membrane potential of neuron 127 at pin
o createAllFiles.m – create stimuli files for classification
o generateRegTeacher.m – generate regular teacher signal file
o generateCoherent16x16.m – generate 256 homogeneous Poisson
spike trains
o IOResponse.mat – workspace for frequency response plot (with data)
o results.mat – workspace for results (with data)
o performTTest.m – functions that perform t-test on data inside results.mat
• Characterisation
o characterisationWorkspace.mat – workspace for STADP
characterisation (with data and parameters)
o characterisePActivePost.m – read in output files from
class_tb_vhd.fdo testbench to characterise p(active)
o characterisePlasticity.m – read in output files from
class_tb_vhd.fdo testbench to characterise LTP, LTD, NetRate
o generatePlasticityCharacterisationStimuliFile.m – generate
stimuli file for class_tb_vhd.fdo testbench for LTP, LTD, NetRate
characterisation
o generatePostCharacterisationStimuliFile.m – generate stimuli
file for class_tb_vhd.fdo testbench for p(active) characterisation
o make_freq_sim_plot.m – simulation for STADP characterisation
o make_prob_active_vs_freq_plot.m – simulation for STADP
characterisation of p(active)
12 Appendix II – Verification checklists
12.1 Module Level Verification
• Cascade State Memory
o High Level Specification:
▪ Read from memory correctly, as given by address
▪ Write to memory correctly, as given by address and data. Written data
should also appear on the output to be read, instantly (WRITE_THROUGH
mode)
▪ The crucial point is that the addresses are correctly decoded, i.e. that the
decoder and the multiplexers within the memory work correctly
o Corner cases
▪ If writing to and reading from memory works in general, then the memory
architecture should be correct.
▪ should be verified over all ranges of memory
▪ for the same address, need to check precisely whether the right output is
selected or not
o To be verified:
▪ EN:
• while EN = '0', nothing should happen at the outputs (whatever
was there, stays there)
▪ rst: only affects the output latches, and not the content of the RAM itself
• if rst = '1' the output registers should be zero.
• reset under memory collision:
o MUX should still select the correct output
▪ WE:
• unless WE = '1', nothing should be written to memory.
• if WE = '0' then we should just read from memory
▪ does it write to the correct address?
▪ does it read from the correct address?
▪ What happens when read and write are to the same address? (only applicable to
dual port memory - cascade state memory)
• the memory should have a 'security mechanism' which ensures that
even if we try to read and write the same address at the same time, both
the write and the read complete correctly
• if the addresses collide, then choose the write-through data iff we
write to memory, and the read data otherwise
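The collision rule in the checklist above can be illustrated with a small behavioural model. This is only a sketch in Python, not the VHDL used in the project: the class name `WriteThroughMemory`, the port names, the default depth and the choice that rst takes priority over EN for the output register are all assumptions made for illustration.

```python
class WriteThroughMemory:
    """Behavioural sketch of a dual-port RAM with a registered, write-through output."""

    def __init__(self, depth=16):
        self.ram = [0] * depth   # RAM content, never touched by rst
        self.dout = 0            # output register, cleared by rst

    def clock(self, en, rst, we, waddr, din, raddr):
        """Advance one clock edge and return the registered read output."""
        if en and we:
            self.ram[waddr] = din            # write only when EN and WE are high
        if rst:
            self.dout = 0                    # rst clears the output register only (assumed priority)
        elif en:
            if we and waddr == raddr:
                self.dout = din              # collision: MUX selects the write-through data
            else:
                self.dout = self.ram[raddr]  # normal read
        # en low and rst low: the output holds its previous value
        return self.dout
```

A testbench can then drive the same stimuli into this model and into the simulated memory, and compare the outputs cycle by cycle, paying particular attention to the cycles where `waddr == raddr`.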
• forwarding module
o High level specification:
▪ forward a valid input address to the output, iff the target synapse has high
efficacy or the spike is sent to a teacher
o Corner cases:
▪ as this module is very simple, there are no critical corner cases - it would be
good to test it over the entire range (or a representative subrange) of the
neuron addresses
o To be verified:
▪ EN:
▪ rst:
• reset should clear all outputs to zero
▪ does the 'valid' output work?
• output (target_address_valid) should follow the input (address_valid) AND NOT address_pre_post (if EN and not rst) at
the next clock edge
• teacher synapse (00000) should be forwarded regardless of efficacy
▪ is the (correct) address being forwarded?
• output address should follow the input address (if EN and not rst) at
the next clock edge
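The forwarding checks above can be summarised as a one-cycle behavioural function. The sketch below is a hedged illustration only: the 5-bit synapse index field, the placement of the teacher index in the low bits, the output being the neuron field, and the rst-over-EN priority are assumptions and are not confirmed by the checklist.

```python
TEACHER_SYNAPSE = 0b00000   # teacher synapse index named in the checklist
SYNAPSE_BITS = 5            # assumed width of the synapse index (low bits of the address)


def forward(en, rst, address_valid, address_pre_post, synapse_address, efficacy, prev):
    """Return (target_address_valid, target_address) as seen after the next clock edge.

    prev is the previous output pair, which is held while EN is low.
    """
    if rst:
        return (False, 0)                      # reset clears all outputs to zero
    if not en:
        return prev                            # EN low: outputs are preserved
    synapse_index = synapse_address & ((1 << SYNAPSE_BITS) - 1)
    is_teacher = synapse_index == TEACHER_SYNAPSE
    # Forward only valid pre-synaptic spikes whose synapse is high or is the teacher
    valid = address_valid and not address_pre_post and (efficacy or is_teacher)
    return (bool(valid), synapse_address >> SYNAPSE_BITS)
```

Sweeping `synapse_address` over a representative subrange with both efficacy values covers the "entire range" item of the checklist.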
• pRNG
o High level specification:
▪ generate maximal-length pRN sequences according to its seed
▪ has three pRN outputs, which are shifted versions of each other
o Corner cases:
▪ no real corner cases, as it is just 'generating away'
o To be verified:
▪ EN:
▪ rst:
• should go back to its seed value on reset, and restart the sequence
▪ does it work?
• should produce pRN sequences
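The reset and sequence-length checks for the pRNG can be tried out on a software reference. Note that the sketch below swaps in a standard 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) purely as a stand-in; it is not the generator used in the design. The class name, the seed and the rotation amounts used for the three outputs are assumptions.

```python
class LfsrPrng:
    """16-bit maximal-length LFSR used as a stand-in pseudo-random number generator."""

    ROTATIONS = (0, 5, 11)   # assumed shifts between the three outputs

    def __init__(self, seed=0xACE1):
        seed &= 0xFFFF
        if seed == 0:
            raise ValueError("an all-zero seed locks up an LFSR")
        self.seed = seed
        self.state = seed

    def reset(self):
        """Go back to the seed value so that the sequence restarts."""
        self.state = self.seed

    def _rotated(self, k):
        return ((self.state >> k) | (self.state << (16 - k))) & 0xFFFF

    def outputs(self):
        """Three outputs that are rotated versions of the same state."""
        return tuple(self._rotated(k) for k in self.ROTATIONS)

    def step(self):
        """Advance one clock cycle and return the three outputs."""
        s = self.state
        bit = (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1
        self.state = (s >> 1) | (bit << 15)
        return self.outputs()
```

Counting the steps until the state returns to the seed gives the sequence length, which is the kind of check the "does it work?" item asks for.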
• Learning Rule (STADP)
o High level specification:
▪ to determine whether a synapse should be potentiated or depressed,
depending on its presynaptic firing time and its postsynaptic activity
▪ to correctly produce plasticity events (dep/pot)
o Corner cases:
▪ no critical corner cases
▪ behaviour should be verified over the full range of addresses
o To be verified:
▪ EN, rst
▪ plasticity_valid -> should be valid iff a valid presynaptic address is present, and
iff pRN_i > threshold (pre_above_threshold) (with 1 clk delay)
• does the pre_above_threshold work?
▪ cascade_synapse_address -> should follow the input with 1 clk cycle delay
▪ is delta_t pseudo-random?
▪ does the timer work?
▪ is the new_expiry_time correct (i.e. is the addition correct? - mind that the
result is only valid after the next clock edge!)
▪ is the WE signal for the memory correct? -> address valid AND address post
▪ is the post_expiry_time correct (do the read from and the write to the memory
work?)
▪ does the comparator work? -> plasticity_dep_pot? (potentiation if post expiry time >
current time, depression otherwise)
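The expiry-time mechanism checked above can be sketched as a small reference model. This is an illustration under stated assumptions, not the project's implementation: the class name `Stadp`, the per-neuron indexing of the expiry memory, the memory size and the use of Python's `random` module for the LUT address are hypothetical, and timer wrap-around is ignored.

```python
import random


class Stadp:
    """Behavioural sketch of the learning rule: expiry times decide pot versus dep."""

    def __init__(self, lut, threshold, n_neurons=256, rng=None):
        self.lut = list(lut)             # delta_t look-up table
        self.threshold = threshold       # pre-synaptic pRN threshold
        self.expiry = [0] * n_neurons    # post_expiry_time memory (assumed one entry per neuron)
        self.rng = rng or random.Random(0)
        self.now = 0                     # free-running timer

    def tick(self):
        self.now += 1

    def post_spike(self, neuron):
        """Draw delta_t from a random LUT address and store now + delta_t."""
        delta_t = self.lut[self.rng.randrange(len(self.lut))]
        self.expiry[neuron] = self.now + delta_t
        return self.expiry[neuron]

    def pre_spike(self, neuron, prn):
        """Return (plasticity_valid, is_potentiation) for a valid pre-synaptic spike."""
        valid = prn > self.threshold                # pre_above_threshold
        is_pot = self.expiry[neuron] > self.now     # comparator: pot if not yet expired
        return valid, is_pot
```

With a single-entry LUT the expected pot/dep sequence becomes deterministic, which makes it easy to compare against a waveform.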
• Cascade synapse
o High level specification:
▪ perform plasticity operations on an incoming cascade synapse's state
representation
▪ switch or chain, depending on plasticity and current efficacy
o Corner cases:
▪ need to check whether it works for all cases, i.e. if 'do_something' is correct
for pRN >, =, < plasticity probability for both types of states (depressed or
potentiated)
o To be verified:
▪ EN, rst
▪ is the 'new_state_valid' signal correct?
• it should follow the input valid signal on the second clock cycle
(given by the pipeline)
▪ is the 'cascade_address_out' signal correct?
• it should follow the input address on the second next clock edge
▪ is the 'do_something' signal correct?
• should be high iff the plasticity probability is greater than or equal to the
pRN (on the next clock cycle)
▪ do the new_efficacy and new_state signals behave correctly with respect to
the do_something signal, i.e. is the new_state i signal correct?
• Chaining and switching behaviour should be correct
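The switch/chain decision and the three corner cases (pRN greater than, equal to, or less than the plasticity probability) can be sketched as follows. The number of cascade levels, the 8-bit pRN width and the halving of the probability with each level are assumptions chosen for illustration; the names are hypothetical.

```python
from dataclasses import dataclass

N_LEVELS = 4   # assumed cascade depth
PRN_BITS = 8   # assumed pRN width


@dataclass(frozen=True)
class SynapseState:
    efficacy: int   # 1 = potentiated, 0 = depressed
    level: int      # 1 (most plastic) .. N_LEVELS (least plastic)


def plasticity_probability(level):
    """Probability threshold on the pRN scale, halving with each level (assumed)."""
    return (1 << PRN_BITS) >> level


def update(state, is_pot, prn):
    """Apply one plasticity event and return (new_state, do_something)."""
    do_something = plasticity_probability(state.level) >= prn
    if not do_something:
        return state, False
    if bool(state.efficacy) == bool(is_pot):
        # Event agrees with the current efficacy: chain one level deeper
        return SynapseState(state.efficacy, min(state.level + 1, N_LEVELS)), True
    # Event opposes the current efficacy: switch to the top of the other cascade
    return SynapseState(int(bool(is_pot)), 1), True
```

Calling `update` with `prn` just below, equal to and just above `plasticity_probability(level)` for both a depressed and a potentiated state exercises exactly the corner cases listed above.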
12.2 System Level Verification
• SPU
o High level specification:
▪ perform spike routing
▪ implement cascade synapse learning through STADP
o Corner cases:
▪ since the system should work with all the individual units combined,
all/most corner cases from the individual modules also apply here.
▪ in order to break down the verification, perform verification in steps:
• Forwarding Only
o observe outputs and related internal signals
• Learning Only
o observe internal signals only
o follow the cascading process of a synapse through
several times:
▪ by repeatedly applying pre and postsynaptic
spikes to the same synapse
• synapse should chain down the
potentiated cascade
▪ by only applying presynaptic spikes to whichever
synapse(s)
• synapse should chain down the
depressed cascade
o do this with several representative synapses
o To be verified:
▪ Forwarding only:
• EN, rst,
o on NOT EN, all signals should be preserved
o on rst, all plasticity should be lost
• does the target_address_valid signal work as expected?
o Should follow the address_valid AND NOT
address_pre_post AND synapse efficacy, with delays
▪ is the address_valid_fwd signal delayed by 2 clk
cycles?
▪ do we get the correct target_synapse_efficacy?
• is the target neuron address correct?
o should follow the synapse_address 8MSB (neuron address)
with 2 clk cycles delay
▪ Learning Only
• EN, rst
• are the internal valid signals being forwarded correctly?
o plasticity_valid should follow address_valid with 2 clock
delays (if there is a plasticity event, which should be
the case about 50% of the time)
o new_state_valid should follow plasticity_valid with 2 clock
delays
• are the addresses being forwarded correctly, internally?
o cascade_synapse_address should follow synapse_address
with 2 clk delays
o new_state_address should follow
cascade_synapse_address with 2 clk delays
• the way it is stimulated, it should first produce pot signals, and
towards the second half dep signals
o Follow through:
▪ postsynaptic spike
▪ produce plasticity (can have any value, but should
not be valid)
▪ cascade should not change the state (could have a
different new_state, but it should not be valid, so
it won't be written)
▪ presynaptic spike
▪ produce valid plasticity (first pot, later dep)
▪ cascade should change state
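Many of the system level items have the form "signal X should follow signal Y with N clk delays". A check of that form could be automated on sampled traces, for example as sketched below. The function name and the representation of a trace as a per-clock list of values are assumptions; how the traces are exported from the simulator is left open.

```python
def follows_with_delay(source, target, delay):
    """Return True if target[n] == source[n - delay] for every sample n >= delay.

    Both traces hold one sample per clock cycle; the first `delay` samples of
    target are not checked, since they depend on the state before the trace starts.
    """
    if len(source) != len(target):
        raise ValueError("traces must have the same length")
    return all(target[n] == source[n - delay] for n in range(delay, len(source)))
```

For instance, a hypothetical `address_valid` trace of `[0, 1, 0, 0, 1, 0, 0]` and a `new_state_valid` trace of `[0, 0, 0, 1, 0, 0, 1]` pass the check with `delay=2` and fail it with `delay=1`. Conditional items, such as plasticity_valid only being raised when a plasticity event occurs, would need the source trace to be masked accordingly first.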
• SPU with I/O and FIFO (with USB interface removed, only USB FIFO accessible)
o High level specification:
▪ correctly interface FIFOs and AER with SPU (it is difficult to test the USB,
since we don't know the communication standard)
▪ Two stage approach:
• First just observe the I/O signals, and verify that AER i/o and the
FIFOs work
• Secondly, once we are certain that this works, we shall follow
several spikes on their journey through the SPU
o Corner cases:
▪ check whether the out-post AER works
▪ check both situations where fifos are not empty, so that the iSelector has to
toggle
▪ probably won't be able to fill up the output fifo, so it is difficult to check the EN
signal
o To be verified:
▪ First stage
• clk_90, clk_45
o right frequencies? (90, 45)
• rst
o should reset the fifos and SPU
• USB fifo input
• Observe data at the USB fifo in (can't simulate the USB in as I don't
know the communication standard) - pre_fifo_in, pre_fifo_we
o should take two valid data in cycles in order to construct
the 16bit AER data
• Observe the data at the out of the fifo - usb_pre_fifo_empty (low),
usb_pre_fifo_dout
o should change to the input data
o on usb_pre_fifo_read, the fifo should become empty again
• Trace signals through the system:
o Apply signals to USB FIFO, 8bit wise
▪ should take two WE cycles before the fifo is not
empty
o Read signal into selector
▪ Signal should appear at the output of the fifo when
it is not empty, and usb_pre_fifo_read is high
o Forward onto SPU
▪ Initially, no arbitration necessary, since the post
AER should not contain any data
▪ One clk cycle later this data should appear at the
output of the Selector
▪ data_valid, pre_post should be correct (high, low)
for one clk cycle
o Output of the SPU
▪ four clk cycles later, the output should (iff the
synapse had high efficacy) be equal to the input
(neuron) address (top 8MSB)
▪ address valid should go high
o AER out
▪ should raise a request, wait for acknowledge and
produce valid data
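The first step of the trace above, where two 8-bit writes are needed before the FIFO holds one 16-bit AER word, can be modelled as follows. This is a sketch only: the class name `PreFifoModel` is hypothetical, and the byte order (low byte written first) is an assumption that the checklist does not state.

```python
from collections import deque


class PreFifoModel:
    """Sketch of the USB-side FIFO: two byte writes make one 16-bit AER word."""

    def __init__(self):
        self.pending = None    # first byte of a word, waiting for its partner
        self.words = deque()   # complete 16-bit words

    def write_byte(self, byte):
        """Model one WE cycle carrying 8 bits of data."""
        byte &= 0xFF
        if self.pending is None:
            self.pending = byte                             # assumed: low byte arrives first
        else:
            self.words.append((byte << 8) | self.pending)   # second write completes the word
            self.pending = None

    @property
    def empty(self):
        return not self.words

    def read(self):
        """Model a read strobe: pop the oldest complete word."""
        return self.words.popleft()
```

After one write the model still reports empty; after the second it holds one word, and reading that word empties it again, matching the behaviour the checklist expects from usb_pre_fifo_empty.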
13 Appendix III – A journey through the SPU
13.1 Pre-synaptic spike
Figure 60: Pre-synaptic spike arrives at SPU.
As pre-synaptic data becomes available (empty -> low), it is loaded into the SPU.
Figure 61: Valid pre-synaptic spike gets forwarded, after two clock delays
Figure 62: Valid pre-synaptic spike generates a plasticity event.
It is a depression event, since the current time is greater than the post-expiry-time.
Figure 63: Cascade synapse changes in operation
Cascade synapse responds to valid depression signal by chaining. This takes 2 clock cycles.
Figure 64: Plasticity events
13.2 Post-synaptic spike
Figure 65: Valid post-synaptic spike arrives at SPU
Figure 66: Post-synaptic spike does not get forwarded
Figure 67: Post-synaptic spike sets post-synaptic expiry time.
Post-synaptic spike draws random delta_t by reading the LUT from a random address, and adds it to the current time to get the post-expiry-time, which is stored into memory.
14 Appendix IV – Design hierarchy of source files
The SPU project files all sit within the spu_i_o_wrapper, and have a hierarchy as
shown below. Source files marked with * were developed by Daniel Fasnacht, and those
marked with ^ were developed by Dylan Muir.
Spu_i_o_wrapper
• Fx2DCM*
• DCM_FxPhase*
• fx2if*
• fxoutfifo (coregen)
• timestamp*
• sequencer*
• paerInput*
• pinfifo (coregen)
• input_source_selector
• SPU
o Forwarding_process
o Cascade_state_memory
▪ Cascade_memory_coregen (coregen)
o Stadp
▪ pRNG_stadp
• ca_flag_150_90^
▪ lut_delta_t (coregen)
▪ activity_expiry_times_stadp (coregen)
o cascade_process
▪ pRNG
• ca_flag_150_90^
• poutfifo (coregen)
• paerOutput*