[email protected] 24 June 2013
Featured Invited Talk; The 24th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2013), 5-7 June 2013, Washington, DC, USA 1
Reiner Hartenstein, TU Kaiserslautern, Germany http://hartenstein.de
The Tunnel Vision Syndrome: Massively Delaying Progress
Reiner Hartenstein
VIPSI-2012 MONTENEGRO
Hotel Splendid in Becici Dec 31, 2012 to Jan 1, 2013
IEEE fellow
FPL fellow
SDPS fellow
Kaiserslautern University of Technology
The George Washington University, 5-7 June 2013, Washington, DC
© 2012, [email protected] http://hartenstein.de
TU Kaiserslautern
Living in Baden-Baden
Bill Clinton‘s 6th visit http://hartenstein.de/Clinton/
Outline (1)
•Preface
•History of Computing
•The Systolic Array
•The Kress Array
•The Xputer Paradigm
•The Twin Paradigm Approach
•More Tunnel Vision
•We must Reinvent Computing
•Conclusions
http://www.uni-kl.de
Power consumption by ICT infrastructures: ×30 until 2030 if trends continue
G. Fettweis, E. Zimmermann: ICT Energy Consumption - Trends and Challenges; WPMC'08, Lapland, Finland, 8 –11 Sep 2008
© New York Times at Dallas 2006
Power costs more than servers
Google‘s Electricity Bill
Cost of a data center determined by the monthly power bill
„The possibility of computer equipment power consumption spiraling out of control could have serious consequences for the overall affordability of computing.” [L. A. Barroso, Google]
Patent for water-based data centers
Google going to sell electricity
http://hartenstein.de/ComputerStromverbrauch.pdf
PATMOS 2013 - 23rd International Workshop on Power And Timing Modeling, Optimization and Simulation
co-located w. VARI 2013 - 4th European Workshop on CMOS Variability
http://xputer.de/PATMOS/
the leading conference on power optimization
The End of Evolution
The end of evolution: we need a revolution!
Power efficiency progress too slow
Disruptive research is urgently required and fundamental issues have to be revisited
We need more parallelism and power efficiency by orders of magnitude
We must overcome the von Neumann syndrome and also the widespread Tunnel Vision dementia
re-implementing
„mini“computers: VAX-11/750
quasi-standard around 1980
UC Berkeley (visiting professor)
E.I.S. project (my M-&-C-action)
my Xputer lab at Kaiserslautern
my CS department at Kaiserslautern
(personal experience:)
NATO ASI, Urbino 1981
my von Neumann syndrome experience also here
Too many terminals
NATO ASI on VLSI 1981, SOGESTA, Urbino, Italy,
orders of magnitude speed-up by using museum equipment
Paolo Antognetti
Donald O. Pederson
Data Input Speed-up by Orders of Magnitude! I used this Museum Equipment
Outline (2)
•Preface
•History of Computing
•The Systolic Array
•The Kress Array
•The Xputer Paradigm
•The Twin Paradigm Approach
•More Tunnel Vision
•We must Reinvent Computing
•Conclusions
http://www.uni-kl.de
The History of Computing (1)
The 1st electrical computer, ready prototyped for mass production?
Guess: which year, which company?
The History of Computing (2)
Prototype 1884: Herman Hollerith
the first reconfigurable computer
first Xilinx FPGA 100 years later
1890 US census use
The LUT (lookup table)
datastream-based!
non-volatile!!
size: <3 refrigerators!
Punched Card Data Memory
Not yet invented in 1884:
• magnetic tape (1898),
• the vacuum tube (1904),
• magnetic drum (1932),
• the transistor (1934),
• ferrite core memory (1949),
• hard disc (1956).
… state of the art …
1969: MPGA
1971: PLD
1973: PLA
1982: EPLD
1984: FPGA
6 decades later: paradigm shift
30 tons, 178 kW
almost 1000 square feet of floor space
about 3 hours MTBF
from data streams to instruction streams
1946
Speed-up Factors by Software to FPGA migration
[Chart: speed-up factor on a logarithmic axis from 10^0 to 10^6, grouped by application area]
DSP and wireless: FFT 100; Reed-Solomon Decoding 2400; Viterbi Decoding 400; MAC 1000
Bioinformatics: molecular dynamics simulation 88; BLAST 52; protein identification 40; Smith-Waterman pattern matching 288
Astrophysics: GRAPE 20
Image processing, Pattern matching, Multimedia: SPIHT wavelet-based image compression 457; real-time face detection 6000; video-rate stereo vision 900; pattern recognition 730; CT imaging 3000
crypto 1000; DES breaking 28500; DNA seq. 8723
further values without a recoverable label: 3439; 1116* (*) DES br. equipment size)
Speed-up factors are not new: >15000 by Software to DPLA migration (PISA project)
1984: DPLA replacing 256 FPGAs (E.I.S. project)
What Parallelism?
Look at the Reconfigurable Computing Paradox: why such massive speed-up?
datastream parallelism: no von Neumann bottleneck
instead of:
instruction stream parallelism: many von Neumann bottlenecks
[Reiner’s watering can model]
Instruction Stream Parallelism
better illustration
instruction stream parallelism: many von Neumann bottlenecks
[Reiner Hartenstein’s watering can model]
Lambert M. Surhone, Mariam T. Tennoe, Susan F. Henssonow (ed.): Von Neumann Syndrome; ßetascript publishing 2011
The Beauty and the Joy of Computing …
Thou shalt not propose things really new!
You won't get research grants and submitted papers will be rejected
Reasonable people adapt themselves to the world as it is. Breakthroughs of innovations can be achieved only by unreasonable people. (stolen from George Bernard Shaw)
For criticizing the world as it is, you can even be fired.
Max Planck: Replacing false doctrines with new insights takes 50 years, waiting not only for the old professors but also for their scholars to die off.
50 Years of Software Crisis
F. L. Bauer coined this term in 1961
Outline (3)
•Preface
•History of Computing
•The Systolic Array
•The Kress Array
•The Xputer Paradigm
•The Twin Paradigm Approach
•More Tunnel Vision
•We must Reinvent Computing
•Conclusions
http://www.uni-kl.de
Early ASAP Conferences: tend back to datastreams!
The ASAP series is an important forum to discuss cures for healing from the von Neumann syndrome
The 3rd International Conference on Systolic Arrays, 1989 in Killarney, Ireland, in the historic Great Southern Hotel
our contributions:
CHDL-Based CAD System for the Synthesis of Systolic Architectures;
Mapping Systolic Arrays onto the Map-Oriented Machine (MoM)
systolic general purpose processor
Our ASAP‘94 thru ASAP‘97 Papers
ASAP'94, San Francisco, CA, USA: A Dynamically Reconfigurable Wavefront Array Architecture for Evaluation of Expressions;
ASAP'95, Strasbourg, France: A Parallelizing Compilation Method for the Map-oriented Machine (MoM);
ASAP'96, Chicago, Ill., USA: A Synthesis System for Bus-based Wavefront Array Architectures;
ASAP'97, Zurich, CH: A Novel Sequencer Hardware for Application Specific Computing;
http://xputer.de/ASAP/
Systolic Arrays (1)
IEEE 7th ISCA, La Baule, France, May 6-8, 1980
M. J. Foster and H. T. Kung: The Design of Special-Purpose VLSI Chips ...
from the airport to La Baule
Mario Barbacci: „VAX? That‘s why it is so slow“
Systolic Arrays (2)
IEEE 7th ISCA, La Baule, France, May 6-8, 1980
M. J. Foster and H. T. Kung: The Design of Special-Purpose VLSI Chips ...
Systolic Arrays (3)
M. J. Foster and H. T. Kung: “The Design of Special-Purpose VLSI Chips ...“
why not general purpose?
Karl Steinbuch: It is not sufficient to invent something. You need to recognize that you have invented something.
http://www.fpl.uni-kl.de/papers/publications/karl-steinbuch.html
Outline (4)
•Preface
•History of Computing
•The Systolic Array
•The Kress Array
•The Xputer Paradigm
•The Twin Paradigm Approach
•More Tunnel Vision
•We must Reinvent Computing
•Conclusions
http://www.uni-kl.de
What Synthesis Method? (2)
of course algebraic! (linear projection): supports only applications with strictly regular data dependencies
1995: Rainer Kress replaced it by simulated annealing*: supports also any irregular & wild form pipe networks
*) KressArray [ASP-DAC-1995]
http://kressarray.de/
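The difference between the two synthesis methods can be illustrated with a small sketch. It is not the KressArray mapper. It only shows the general idea of placing the operators of an irregular dataflow graph onto a grid of cells by simulated annealing. The cost function (total Manhattan wire length), the linear cooling schedule, the example graph and the name anneal_placement are all assumptions made for this illustration.

```python
import math
import random


def anneal_placement(nodes, edges, rows, cols, steps=4000, t_start=2.0, seed=0):
    """Place graph nodes on a rows x cols grid, minimizing total wire length.

    Hypothetical sketch: assumes len(nodes) <= rows * cols.
    Returns (placement, initial_cost, best_cost).
    """
    rng = random.Random(seed)
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    # One occupant per cell; None marks an empty cell.
    occupants = list(nodes) + [None] * (len(cells) - len(nodes))
    rng.shuffle(occupants)

    def cost(occ):
        pos = {n: cells[i] for i, n in enumerate(occ) if n is not None}
        return sum(abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1])
                   for a, b in edges)

    current = initial = cost(occupants)
    best, best_cost = list(occupants), current
    for step in range(steps):
        temperature = t_start * (1.0 - step / steps) + 1e-6
        # Candidate move: swap the contents of two randomly chosen cells.
        i, j = rng.randrange(len(cells)), rng.randrange(len(cells))
        occupants[i], occupants[j] = occupants[j], occupants[i]
        candidate = cost(occupants)
        accept = candidate <= current or \
            rng.random() < math.exp((current - candidate) / temperature)
        if accept:
            current = candidate
            if current < best_cost:
                best, best_cost = list(occupants), current
        else:
            # Rejected: undo the swap.
            occupants[i], occupants[j] = occupants[j], occupants[i]
    placement = {n: cells[i] for i, n in enumerate(best) if n is not None}
    return placement, initial, best_cost


# An irregular graph: no regular data dependencies to exploit algebraically.
NODES = ["a", "b", "c", "d", "e", "f"]
EDGES = [("a", "c"), ("b", "c"), ("c", "d"), ("c", "e"),
         ("d", "f"), ("e", "f"), ("a", "f")]
placement, initial_cost, final_cost = anneal_placement(NODES, EDGES, rows=3, cols=3)
print(initial_cost, "->", final_cost)
```

Because the search only needs a cost function and a swap move, it does not care whether the graph is regular, which is the property the slide attributes to the simulated annealing approach.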
Systolic Array Generalization
Rainer Kress: A Fast Reconfigurable ALU for Xputers, Dissertation 1996, Kaiserslautern University of Technology
Ulrich Nageldinger: Coarse-grained Reconfigurable Architectures Design Space Exploration; Dissertation, 2001, Kaiserslautern University of Technology
http://xputers.informatik.uni-kl.de/papers/publications/NageldingerDiss.html
DATE 2001, Munich, Germany ?
Ulrich‘s tie
Outline (5)
•Preface
•History of Computing
•The Systolic Array
•The Kress Array
•The Xputer Paradigm
•The Twin Paradigm Approach
•More Tunnel Vision
•We must Reinvent Computing
•Conclusions
http://www.uni-kl.de
The Data Streams ?
[Figure: array of data items (x) with horizontal and vertical data streams]
*) and receives
“It’s not our job”
Another Tunnel Vision Symptom
[Figure: array of data items (x) with horizontal and vertical data streams]
*) or receives
without a sequencer: missed the chance to invent a new machine paradigm
Who generates the data streams?
http://xputer.de/ http://data-streams.org/
*) publ. 1989
PISA Speed-up Factors by Software to DPLA migration
[Chart: the speed-up chart repeated, logarithmic axis from 10^0 to 10^5]
>15000: PISA project
Design Rule Check on MoM Xputer, programmed by MoPL (Lynn Conway‘s λ-grid-based CMOS rules)
1984: DPLA replacing 256 FPGAs (E.I.S. project)
Complexity: linear with area
Processing 4-by-4 Reference Patterns
Design Rule Check accelerator*: DPLA: fabricated by the E.I.S. Multi University Project
*) PISA DRC accelerator [ICCAD 1984]
1984: 1 DPLA replaces 256 FPGAs
DPLA was more area-efficient than FPGA - by several orders of magnitude
Reconfigurable Address Generator
no instruction streams needed even to generate most complex address sequences
auto-sequencing memory: generalized the DMA
GAU
[Figure panels a) to g); scan patterns shown:]
video scan
-90º rotated video scan
sheared video scan
non-rectangular video scan
zigzag video scan
spiral scan
feed-back-driven scans
atomic scan
linear scan
-45º rotated (mirx (v scan))
perfect shuffle
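A minimal sketch of what "address sequences without instruction streams" means in software terms: each scan pattern is a generator of (x, y) addresses, and a fixed operator simply consumes whatever data the address stream delivers. This does not model the GAU hardware. The function names are hypothetical, and column_scan is only a transposed video scan standing in for the rotated variants on the slide.

```python
def video_scan(width, height):
    """Row by row, left to right: the raster (video) scan."""
    for y in range(height):
        for x in range(width):
            yield (x, y)


def column_scan(width, height):
    """Column by column, top to bottom: a transposed video scan."""
    for x in range(width):
        for y in range(height):
            yield (x, y)


def spiral_scan(width, height):
    """Clockwise spiral from the top-left corner inwards."""
    left, right, top, bottom = 0, width - 1, 0, height - 1
    while left <= right and top <= bottom:
        for x in range(left, right + 1):
            yield (x, top)
        for y in range(top + 1, bottom + 1):
            yield (right, y)
        if top < bottom:
            for x in range(right - 1, left - 1, -1):
                yield (x, bottom)
        if left < right:
            for y in range(bottom - 1, top, -1):
                yield (left, y)
        left, right, top, bottom = left + 1, right - 1, top + 1, bottom - 1


def stream(memory, addresses):
    """Turn an address sequence into a data stream read from a 2-D memory."""
    for x, y in addresses:
        yield memory[y][x]


memory = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
print(list(stream(memory, video_scan(3, 3))))
print(list(stream(memory, spiral_scan(3, 3))))
```

Changing the scan pattern changes only the address generator; the consumer of the data stream stays the same.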
Outline (6)
•Preface
•History of Computing
•The Systolic Array
•The Kress Array
•The Xputer Paradigm
•The Twin Paradigm Approach
•More Tunnel Vision
•We must Reinvent Computing
•Conclusions
http://www.uni-kl.de
Dual paradigm mind set: an old hat - but ignored
time to space mapping: procedural to structural:
1967: W. A. Clark: Macromodular Computer Systems; 1967 SJCC, AFIPS Conf. Proc.
C. G. Bell et al: The Description and Use of Register-Transfer Modules (RTM's); IEEE Trans-C21/5, May 1972
[Figure labels: FF, token bit, evoke; 1971]
loop to pipe mapping
why did it take 25 years to find out?
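The loop to pipe mapping can be sketched as follows. The loop body (multiply by 3, add 1, square) is an arbitrary example chosen for this illustration, and Python generators only model the structure of a pipe: they still run sequentially, whereas hardware stages would work concurrently. The names procedural, stage and pipe are hypothetical.

```python
def procedural(xs):
    """Time domain: one loop executes the three operations one after another."""
    out = []
    for x in xs:
        t = x * 3
        t = t + 1
        t = t * t
        out.append(t)
    return out


def stage(op, upstream):
    """One pipe stage: a fixed operator applied to whatever streams in."""
    for item in upstream:
        yield op(item)


def pipe(xs):
    """Space domain: each operation becomes its own stage, chained into a pipe."""
    s1 = stage(lambda v: v * 3, iter(xs))
    s2 = stage(lambda v: v + 1, s1)
    s3 = stage(lambda v: v * v, s2)
    return list(s3)


print(procedural([1, 2, 3]))
print(pipe([1, 2, 3]))
```

Both versions compute the same results; what changes is that the sequence of operations in time has become a chain of operators in space, with the data streaming through.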
Duality of procedural Languages
Software Languages | Flowware Languages
read next instruction | read next data item
goto (instruction address) | goto (data address)
jump to (instruction address) | jump to (data address)
instruction loop | data loop
instruction loop nesting | data loop nesting
instruction loop escape | data loop escape
instruction stream branching | data stream branching
no: internally parallel loops | yes: internally parallel loops
state register(s): program counter | data counter(s)
But there is an Asymmetry: more simple: no ALU tasks
MoPL
A Clean Terminology, please
Generalization of the term „Program“
program source | compilation result
Software | instruction streams
Flowware | data streams
Configware | datapath structures configured
Data-Flow Execution Models for Extreme Scale Computing (DFM 2013)
http://www.cs.ucy.ac.cy/dfmworkshop/
http://www.pactconf.org/ in conjunction with PACT 2013
This new series started 10 Oct 2011 at Galveston Island, Texas, USA
Edinburgh, UK, 19-23 Sep 2013
The 1st Dual Paradigm Co-Compiler
Juergen Becker’s CoDe-X 1996
X-C is C language extended by MoPL
all 3 types of sources: Software, Configware, Flowware
supporting different platforms
[Figure: CoDe-X compilation flow; block labels:]
X-C
Analyzer / Profiler
Partitioner
Loop Transformations
Extended Leslie Lamport Hyperplane Theorem
Resource Parameters
GNU C compiler
Host Software
Computer machine paradigm
X-C compiler
DPSS (Rainer Kress)
KressArray Configware
KressArray Flowware
Xputer machine paradigm
Outline (7)
•Preface
•History of Computing
•The Systolic Array
•The Kress Array
•The Xputer Paradigm
•The Twin Paradigm Approach
•More Tunnel Vision
•We must Reinvent Computing
•Conclusions
http://www.uni-kl.de
More Tunnel Vision (1)
> 20 years ignored by TR scene
Term Rewriting (TR) use in EDA: only in verification, which is bottom-up.
TR-expert Prof. Mauricio Ayala-Rincón, Universidade de Brasilia, in 2001: „still the only top-down example“
automatic floor plan design by TR: an integer multiplier example, published 1979
[Figure labels: specification; some re-write rules; solution illustration]
http://www.fpl.uni-kl.de/karl/karl_history_fbi.html#3.1 algebraic structured design
More Tunnel Vision (2)
20 years ignored by TR scene
floor plan of one slice
microchip layout integer multiplier example, published 1979
http://www.fpl.uni-kl.de/karl/karl_history_fbi.html#3.1 algebraic structured design
Outline (8)
•Preface
•History of Computing
•The Systolic Array
•The Kress Array
•The Xputer Paradigm
•The Twin Paradigm Approach
•More Tunnel Vision
•We must Reinvent Computing
•Conclusions
http://www.uni-kl.de
A huge design space
extending Flynn‘s taxonomy by going heterogeneous:
Mike Flynn‘s taxonomy: Single vs. Multiple Instruction vs. Data
Diana‘s Taxonomy (Diana Göhringer‘s Ph.D. thesis): reconfigurable or not
Reiner‘s Taxonomy: datastream-based (anti-machine): noI versus SI or MI
Programmability crisis solution impossible without mastering the entire design space
The tunnel vision of the pre-manycore age
Heterogeneous Taxonomy
The extension into this huge design space is mainly ignored by curricula and even by most R&D scenes
The E.I.S. Project: http://xputer.de/EIS/ [1980]
The by far most effective project in the history of modern computer science
Education Revolution: the M-&-C VLSI Design Revolution
Education Revolution: the M-&-C Design Revolution
The Mead-&-Conway strategy: Clearing out & intuitive models
Removal of the education dilemma
reduced width of specialization: „tall thin man“, coherence, from Application to Silicon Foundry
traditional division of specialization: fragmentation, width of specialization, in-house technology, submit / reject between levels:
Logic level
Switching level
Circuit level
Register Transfer (RT) level
Application level
Layout level
The E.I.S. Project: http://xputer.de/EIS/
More Connected Thinking
The new strategy: heterogeneous platforms
„tall thin man“: coherence
even more: „tall thin woman“: coherence
traditional division of specialization:
Logic level
Switching level
Circuit level
Register Transfer (RT) level
application level
Layout level
inter-processor NoC
pipe-networks
arbiter
Multiplexing
distribution
Interconnect at all levels
off-chip communication
global and local memory
distributed memory
The Tunnel Vision of Language Designers
… in teaching to students …
Absurdly incomprehensible abstractions are the problem in „standard“ languages
[E. A. Lee: Are new languages necessary for multicore? 2007]
[E. A. Lee. The problem with threads. Computer, 2006.]
Concurrency models should operate at component architecture level rather than programming languages. [E. A. Lee]
The High Cost of Movement of Data and Instructions
Less Abstraction Layers ?
Removing Abstraction Layers hides critical sources of efficiency limits.
Hides issues to detect overhead and bottlenecks
We must change how programmers think, also by …
… fusion of abstraction layers: challenging educators
… opening the borders between paradigm domains
CS Education
You cannot* teach Hardware to a Programmer
To a Hardware Guy you always can teach Programming
*) efficiently
[Figure labels: procedural; structural; have; have not]
Outline (9)
•Preface
•History of Computing
•The Systolic Array
•The Kress Array
•The Xputer Paradigm
•The Twin Paradigm Approach
•More Tunnel Vision
•We must Reinvent Computing
•Conclusions
http://www.uni-kl.de
Conclusions
We must disruptively reinvent CS education
Disruptive research is urgently required and fundamental issues have to be revisited
We need more parallelism and power efficiency by orders of magnitude
We must overcome the von Neumann syndrome and the widespread Tunnel Vision dementia
A reductionist attitude of most R&D areas massively delays the solution of urgent problems
We need „une Levée en Masse“
ASAP 2013, Washington, DC, 5-7 June 2013
Reiner Hartenstein: The Tunnel Vision Syndrome: Massively Delaying Progress
backup for discussion
Program Engineering (2)
The Generalization of Software Engineering
PE: Program Engineering
SE: Software Engineering (CPU)
FE: Flowware Engineering (asM: auto-sequencing Memory)
CE: Configware Engineering (structures: pipe network model etc.)
*) do not confuse with „dataflow“!
The Reconfigurable Computing Paradox
• The spirit from the Mainframe Age is collapsing under the von Neumann syndrome
• There is something fundamentally wrong in using the von Neumann paradigm
• Up to 4 orders of magnitude speedup + tremendously slashing the electricity bill by migration to FPGA
• Bad FPGA technology: reconfigurability overhead, wiring overhead, routing congestion, slow clock speed
• The reason for this paradox?
Flowware language example (MoPL): programming the datastream
JPEG zigzag scan pattern v2 (an animation; 8-by-8 grid with x = 1..8 and y = 1..8; data counter positions marked 1 to 4)
MoPL: no instruction streams!
*> Declarations
EastScan is step by [1,0]
end EastScan;
SouthScan is step by [0,1]
end SouthScan;
NorthEastScan is loop 6 times until [*,1]
  step by [1,-1] endloop
end NorthEastScan;
SouthWestScan is loop 7 times until [1,*]
  step by [-1,1] endloop
end SouthWestScan;
HalfZigZag is EastScan
  loop 3 times SouthWestScan
    SouthScan NorthEastScan
    EastScan endloop
end HalfZigZag;
Main program:
goto PixMap[1,1]
HalfZigZag;
SouthWestScan
uturn (HalfZigZag)
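To check how the MoPL scans compose, the program can be transliterated into Python. This is a sketch under stated assumptions, not MoPL semantics taken from a language reference: "loop n times until" is read as "step at most n times, stopping once the condition holds", and uturn is read as "replay the steps of the given scan in reverse order". The helper names (run, uturn, reference_zigzag) are hypothetical. The result is compared against the JPEG zigzag order computed independently by walking the anti-diagonals.

```python
def run(path, step, times=1, until=None):
    """Append up to `times` steps to path; stop early once until(pos) holds."""
    for _ in range(times):
        x, y = path[-1]
        path.append((x + step[0], y + step[1]))
        if until is not None and until(path[-1]):
            break


def east_scan(path):
    run(path, (1, 0))


def south_scan(path):
    run(path, (0, 1))


def north_east_scan(path):
    run(path, (1, -1), times=6, until=lambda p: p[1] == 1)   # until [*,1]


def south_west_scan(path):
    run(path, (-1, 1), times=7, until=lambda p: p[0] == 1)   # until [1,*]


def half_zigzag(path):
    east_scan(path)
    for _ in range(3):
        south_west_scan(path)
        south_scan(path)
        north_east_scan(path)
        east_scan(path)


def uturn(path, scan, origin=(1, 1)):
    """Assumed meaning of uturn: replay the steps of `scan` in reverse order."""
    recorded = [origin]
    scan(recorded)
    steps = [(bx - ax, by - ay)
             for (ax, ay), (bx, by) in zip(recorded, recorded[1:])]
    for step in reversed(steps):
        run(path, step)


def jpeg_zigzag():
    path = [(1, 1)]              # goto PixMap[1,1]
    half_zigzag(path)
    south_west_scan(path)
    uturn(path, half_zigzag)
    return path


def reference_zigzag(n=8):
    """Independent check: walk the anti-diagonals x + y = s, alternating direction."""
    order = []
    for s in range(2, 2 * n + 1):
        cells = [(x, s - x) for x in range(max(1, s - n), min(n, s - 1) + 1)]
        order.extend(cells if s % 2 == 0 else reversed(cells))
    return order


print(jpeg_zigzag() == reference_zigzag())
```

Under these assumptions the four-line main program visits every cell of the 8-by-8 block exactly once, using only address steps and no per-pixel instructions.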
The End of Evolution
The end of evolutionary extension of current models
Disruptive research is required in programmability
An entirely new software stack is needed
Fundamental programming issues have to be revisited
We need a much deeper integration between applications and data inside all kinds of memory
Rethink all disciplines from circuit design and test, up to architecture, system design, storage behavior, compilers, run time systems, operating systems, and programming.
What does zettaFLOP mean?
Name | FLOPS
yottaFLOPS | 10^24
zettaFLOPS | 10^21
exaFLOPS | 10^18
petaFLOPS | 10^15
teraFLOPS | 10^12
gigaFLOPS | 10^9
megaFLOPS | 10^6
kiloFLOPS | 10^3
BTW: July 2005: an early trailblazer
ASCI Red: 850 kW
IBM Roadrunner: 2,483 kW
(~1980-…) The VLSI Revolution
Following the exemplar: our ideal:
The most influential research project in modern computer history.
The brain child of: Carver Mead, Lynn Conway
text book [1980] (Bestseller!)
Solving the VLSI design crisis: missing reply to Moore‘s Law
missing designer population, design tools (SW), HW for a design as a whole
Project has provided an educational framework to create a population of designers (and researchers) needed
Incubator of EDA* industry, workstations…
comparable to our scenario
VLSI Design Education Spreading Rapidly
1980 - 1983, world-wide
Carver Mead
Lynn Conway
incubator of the EDA industry etc.
The most effective project in the history of modern computer science
Flowware Languages
MoPL: fully supporting the anti machine paradigm – the counterpart of the von Neumann paradigm
general purpose:
Streams-C: defines 1-D streams; generates VHDL
DSP-C: allows to describe key features of DSPs
specialized:
Brook: for modern graphics hardware
StreamIt