Field-Programmable Technology: Today’s and Tomorrow’s
description
Transcript of Field-Programmable Technology: Today’s and Tomorrow’s
![Page 1: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/1.jpg)
1
Field-Programmable Technology: Today’s and Tomorrow’s
Wayne LukImperial College London
TWEPP-07Prague
4 September 2007
Imperial College London
April 2005
![Page 2: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/2.jpg)
2
Outline: technology = devices + design
1. overview: motivation and vision2. field-programmable devices: today - Xilinx Virtex-4, Virtex-5; Stretch S5 3. field-programmable design : today
- enhance optimality and re-use 4. field-programmable devices: tomorrow - hybrid FPGA, die stacking 5. field-programmable design : tomorrow - guided synthesis, representation,
upgradability 6. summary Thanks to colleagues, students and collaborators from Imperial College,
University of British Columbia, Chinese University of Hong Kong, University of Massachusetts Amherst, UK Engineering and Physical Sciences Research Council, Stretch, Xilinx
![Page 3: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/3.jpg)
3
1. Motivation: good - Moore’s Law
• rising– capacity– speed
• falling– power/MHz– price
Source: Xilinx
![Page 4: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/4.jpg)
4
1
Lo
gic
Tra
nsi
sto
rs p
er C
hip
`(K
)
P
rod
uct
ivit
yT
ran
s./S
taff
- M
on
th
10
100
1,000
10,000
100,000
1,000,000
10,000,000
10
100
1,000
10,000
100,000
1,000,000
10,000,000
100,000,000 Logic (Moore’s Law) Transistors/ChipTransistor/Staff Month
58%/Yr. compoundComplexity growth rate
21%/Yr. compoundProductivity growth rate
1981
1983
1985
1987
1989
1991
1993
1995
1997
1999
2003
2001
2005
2007
2009
xxx
x xx
x
challenge: reduce the design productivity gap
Not so good: design productivity gap
Source: SEMATECH
![Page 5: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/5.jpg)
5
Our vision: unified synthesis + analysis
C /M atlab /P rem iere
C om pile r
Machinecode
Run-tim einterface
Configurationinform ation
Fixed processor Custom processor
C ustom com pu ting sys tem
/Progolapplication description
manual/automatic partition, compile, analysis, validate
system-specific architecture
system-specific programming interface
Java/
CPUs + DSPs FPGAs + sensors
![Page 6: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/6.jpg)
6
500MHz FlexibleSoft Logic Architecture
500MHz Programmable DSP Execution Units
0.6-11.1GbpsSerial Transceivers
450MHz PowerPC™ Processorswith
Auxiliary Processor Unit
1Gbps DifferentialI/O
500MHz multi-portDistributed RAM
500MHz DCM DigitalClock Management
2a. Devices today: Virtex-4 FPGA
• fine-grain fabric + special function units• challenge: use resources effectively
Source: Xilinx
![Page 7: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/7.jpg)
7
2b. Virtex-5 FPGA
Source: Xilinx
• capacity– rises
• logic speed– rises
• I/O speed– rises
• all good news?
![Page 8: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/8.jpg)
8
Growing gap: amount of gates vs I/O
Source: Xilinx
• I/O off-chip– serial
• I/O on-chip– scalable– flexible– easy to use
• interconnect– heterogeneous– customisable
![Page 9: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/9.jpg)
9
2c. Software configurable engine: S5
Wide Register File (WRF)• 32 Wide Registers (WR)• 128-bit WideLoad/Store Unit• 128-bit Load/Store• Auto Increment/Decrement• Immediate, Indirect, Circular• Variable-byte Load/Store• Variable-bit Load/Store
ISEF• Instruction Specialization Fabric• Compute Intensive• Arbitrary Bit-width Operations• 3 Inputs and 2 Outputs• Pipelined, Bypassed, Interlocked• Random Logic Support• Internal State Registers
ALUFPU
32-BIT RF
CO
NT
RO
L
128-BIT WRF32-BIT RF
ALUFPU
S5 ENGINE
ISEFInstruction-Set
Extension Fabric
DATA RAM32KB
SRAM256KB
D-CACHE32KB
I-CACHE32KB
MMU
RISC Processor• Tensilica – Xtensa V• 32 KB I & D Cache• On-Chip Memory, MMU• 24 Channels of DMA, FPU
Source: Stretch
![Page 10: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/10.jpg)
10
3. Design today: overview• structural or register-transfer level (RTL)
– e.g. VHDL, Verilog– low-level, little automation, small designs
• behavioural, system-level descriptions– e.g. SystemC (public-domain: systemc.org)– MARTES: +UML for real-time embedded systems
• general-purpose software languages– e.g. C, Java; with hardware support: Handel-C– high-level, large automation, large designs
• special-purpose descriptions– e.g. System Generator (signal processing)– high-level, domain-specific optimisations
![Page 11: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/11.jpg)
11
3a. Enhance optimality and re-use• design optimality: quality
– select algorithm and devices: meet requirements– mapping: regular to systolic, rest to processor– I/O: dictates on-chip parallelism, buffering schemes– control speed/area/power: pipelining, layout plan– partitioning: coarse vs fine grain logic and memory
• design re-use: productivity– separate aspects specific to application/technology– library of customisable components with trade-offs– compose and customise to meet requirements– uniform interface to memory and I/O: hide details– pre-verified parts: ease system verification
![Page 12: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/12.jpg)
12
Example: systolic summation tree
• n-input adder– tree of (n-1)
2-input adders
• each adder– has k stages– each stage
has s-bit adder
• figure shows– k = 3– s = 3
• high k, low s– less cycle
time– more area– less power
![Page 13: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/13.jpg)
13
Finance application: value-at-risk
• sampling from multivariate Gaussian distribution
• DSP units: matrix multiplication
• systolic tree: accumulate result
• RC2000 board– Virtex-4 xc4vsx55– 400MHz
64DSPs
48DSPs
GaussianRNG
Delta GammaAccumulator
Control RC2000Interface
IO Clock domain
FIFO
FIFO
+ + + +
+ +
+
33 times faster than 2.2GHz quad Opteron(including all IO overheads, PC-FPGA communications, and using AMD
optimised BLAS for software)
![Page 14: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/14.jpg)
14
CO
NT
RO
L
3b. Stretch program development
• profile code–identify hotspots
• special instruction– implement ‘C’ functions in
single instructions– bit-width optim.
• software compiler– generate instruction– schedule instruction
• multiple data (WR)– perform operations in parallel
• efficient data movement – intrinsic load store operations– 20+ DMA channels
APPLICATIONC/C++
COMPILEDMACHINE
CODE
Compiler
InstructionDefinition
NEW INSTRUCTIONS
INSTRUCTION GENERATION
TAILOR ISEF TO APPLICATION
AUTOMATIC
Source: Stretch
![Page 15: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/15.jpg)
15
500MHz FlexibleSoft Logic Architecture
500MHz Programmable DSP Execution Units
0.6-11.1GbpsSerial Transceivers
450MHz PowerPC™ Processorswith
Auxiliary Processor Unit
1Gbps DifferentialI/O
500MHz multi-portDistributed RAM
500MHz DCM DigitalClock Management
4. Devices tomorrow: more diversed
Source: Xilinx
Replaced by other functional units, e.g. floating-point units
![Page 16: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/16.jpg)
16
4a. Hybrid FPGA: architecture
• most digital circuits– datapath: regular, word-based
logic– control: irregular, bit-based logic
• hybrid FPGA– customised coarse-grained block:
domain-specific requirements– fine-grained blocks:
existing FPGA architecture– good match to computing
applications for given domain
![Page 17: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/17.jpg)
17
Coarse-grained fabric library
U1:fpmul
control status
Q D
U2:fpadd
control status
U{D-1}:wb
bit 0
bit 1
bit 2
bit N-1
control status
Output Mux
Input Buses (M)
Feedback Registers (F)
FeedbackMux
Output Buses
(R)
control
Control Signal Input Status Flag Output
Floating Point
Multiplier
Floating Point Adder/Subtractor
status
bit 0
bit 1
bit 2
bit N-1
control status
U0:wb
D=9, M=4, R=3, F=3, 2 add, 2 mul: best density over benchmarks
![Page 18: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/18.jpg)
18
Evaluation
• 6 benchmark circuits– digital signal processing kernels: e.g. bfly (for
FFT)– linear algebra: e.g. matrix multiplication– complete application: e.g. bgm (financial model)
• circuits: partitioned to control + datapath– control: vendor tools to fine-grained units– datapath: manually map to coarse-grained units
• comparison – directly synthesized to Xilinx Virtex-II devices
![Page 19: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/19.jpg)
19
Results
Floating Point hybrid FPGA XC2V3000-6
Area(slices
)Delay(ns)
Area(slices)
Delay(ns)
Area(times
)Delay
(times)
bfly 565 9.02 13733 24.57 24.3 2.72
dscg 661 10.11 9614 22.78 14.5 2.25
fir4 371 9.06 11290 23.68 30.4 2.61
mm3 642 8.90 8889 23.4 13.8 2.63
ode 545 9.74 8238 21.93 15.1 2.25
bgm 1810 10.00 30207 24.34 16.7 2.43
Geometric Mean 18.3 2.48
![Page 20: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/20.jpg)
20
4b. On-chip memory bandwidth
• storage hierarchy: registers, LUT RAM, block RAM
• processor cache: address lack of I/O bandwidth
Source: Xilinx
![Page 21: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/21.jpg)
21
High density die stacking
Source: Xilinx
![Page 22: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/22.jpg)
22
Programmable circuit board
Source: Xilinx
• die stacking: 3D interchip connections• customisable system-in-package: productivity
gap?
![Page 23: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/23.jpg)
23
5a. Design tomorrow: guided synthesis
• guided transformation of design descriptions– automate tedious and error-prone steps– applicable to various levels of abstraction
• focus: two timing models– strict timing model: cycle-accurate - cycle-accurate - efficiencyefficiency– flexible timing model: behavioural - behavioural - productivityproductivity
• combine cycle-accurate and behavioural models– rapid development with high quality– design maintainability and portability
• based on high-level language– library developer: provide optimised building blocks– application developer: customise building blocks
![Page 24: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/24.jpg)
24
Timing models: strict vs flexible
{ delta = b*b - ((a*c) << 2); if (delta > 0) num_sol = 2; else if (delta == 0) num_sol = 1; else num_sol = 0;}
{ delta = b*b - ((a*c) << 2); if (delta > 0) num_sol = 2; else if (delta == 0) num_sol = 1; else num_sol = 0;}
b a c
* *<<
-delta
> 0
num_sol
== 02
1
0
2
MU
X
MU
X
executed executed at cycleat cycle 11
executed executed at cycle 2at cycle 2
cc
cctrue
c
*
a
b b
* << 2
-
>
=
= =
==
true false
false
num_sol
num_sol num_sol
0
0
0
2
1
strict timing(Handel-C)
flexible timing
behavioural model- partial-ordering- abstract operations
cycle accurate model- total-ordering- resource-bound
scheduling unscheduling
![Page 25: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/25.jpg)
25
{ delta = b*b - ((a*c) << 2); if (delta > 0) num_sol = 2; else if (delta == 0) num_sol = 1; else num_sol = 0;}
{ delta = b*b - ((a*c) << 2); if (delta > 0) num_sol = 2; else if (delta == 0) num_sol = 1; else num_sol = 0;}
cc
cctrue
c
*
a
b b
* << 2
-
>
=
= =
==
true false
false
num_sol
num_sol num_sol
0
0
0
2
1
unschedulingunscheduling(flexible (flexible timing)timing)
schedulinschedulingg
constraintsconstraints
+
par { // ================== [stage 1] pipe_mult[0].in(b,b); pipe_mult[1].in(a,c); // ==================[stage 8] tmp0 = pipe_mult[0].q; tmp1 = pipe_mult[1].q << 2; // ==================[stage 9] tmp2 = tmp0 - tmp1; // ==================[stage 10] if (tmp2 > 0) num_sol = 2; else if (tmp2 == 0) num_sol = 1; else num_sol = 0; delta = tmp2; }
synthesissynthesis(strict (strict timing)timing)
b a c
pmult[0]
delta
> 0
num_sol
==2
10
pmult[1]
<< 2
tmp1
-tmp2
tmp0
stage 1-7
stage 8
stage 9
stage 10
cyclecycle 11
MU
X
MU
X
Rapid design: automated scheduling
• support combination of manual and automatic scheduling
![Page 26: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/26.jpg)
26
par { { // ================= [stage 1] pipe_mult[0].in(b,b); pipe_mult[0].in(a,c); } ....}
Maintainability: retarget design
par { // ================== [stage 1] pipe_mult[0].in(b,b); pipe_mult[1].in(a,c); // ==================[stage 8] tmp0 = pipe_mult[0].q; tmp1 = pipe_mult[1].q << 2; // ==================[stage 9] tmp2 = tmp0 - tmp1; // ==================[stage 10] if (tmp2 > 0) num_sol = 2; else if (tmp2 == 0) num_sol = 1; else num_sol = 0; delta = tmp2; }
b a c
pmult [0]
delta
> 0
num_sol
==2
10
pmult [1]
<< 2
tmp1
-tmp2
tmp0
stage 1-7
stage 8
stage 9
stage 10
cyclecycle 11
MU
X
MU
X
cc
cctrue
c
*
a
b b
* << 2
-
>
=
= =
==
true false
false
num_sol
num_sol num_sol
0
0
0
2
1
schedulinschedulingg
constraintsconstraints
+
unschedulingunscheduling(flexible timing)(flexible timing)
b a c
pmult[0]
> 0
num_sol
==2 1
0
tmp1
tmp0
-tmp2
pmult[0]
cycle 2cycle 2
cycle 1cycle 1
<< 2
stage 1-4
stage 5
stage 6
cycle 1cycle 1
cycle 2cycle 2
cycle 1cycle 1
cycle 2cycle 2
MU
X
MU
X
synthesissynthesis(strict (strict timing)timing)
![Page 27: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/27.jpg)
27
Implementation
![Page 28: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/28.jpg)
28
Automatically generated results
• ffd: free-form deformation; dct: discrete cosine transform
with respect to smallest
with respect to software
![Page 29: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/29.jpg)
29
5b. Data representation optimisation
In1 In2 In3 In4 In5
+ *
-
+
*+
+
Out1 Out2
X
Y
• In1..In5– known width
• Out1..Out2– width determines
accuracy– defined by user
• find representation– minimise width of
nodes, e.g. X, Y
• trade-off in speed, area, power, error
![Page 30: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/30.jpg)
30
Floating-point design Fixed-point design
Output Design Descriptions
Xilinx System Generator VHDL
A Stream Compiler (ASC) Code Annotated DFG HandelC
Designdatabase
Range analysis( Interval analysis )
Precision analysis( Automatic Differentiation )
Bit-width determination Bit-width determination
Design Selection
BitSize bit-width analysis system - Frontend
BitSize bit-width analysis system - Backend
User Specified designconstraints
Input Design Descriptions Xilinx System Generator C/C++ ASC Code HandelC
Design Selection
Fixed-point
- range: integer
- precision: fraction
Floating-point
- range: exponent
- precision: mantissa
![Page 31: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/31.jpg)
31
0 5 10 15 20 250
500
1000
1500
2000
2500
3000
Error Percentage
Are
a -
Vir
tex2
Slic
es DFT
4-Tap FIR filter
FIR filter and DFT: area vs error
1% more error: 65% less area
![Page 32: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/32.jpg)
32
0 5 10 15 20 2520
30
40
50
60
70
80
90
Error Percentage
Sp
ee
d -
MH
z
DFT4-Tap FIR filter
FIR filter and DFT: speed vs error
2% more error: 20% higher speed
![Page 33: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/33.jpg)
33
5c. Upgradable design
initial release
Number of new users
upgradabledesign
non-upgradabledesign
10 20 30
upgrade 1
upgrade 2
Time (months)
Source: Xilinx
• upgradability: minimise time-to-market maximise time-in-market• add new functions, fix bugs• very rapid upgrade?
![Page 34: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/34.jpg)
34
Dynamic upgrade: turbo coder
• error correction code: add redundancy• need: fast, low-power, adapt to noise level• recursive systematic convolutional (RSC)
encoder, decoder, interleaver
RSC
RSC
Interleaver
Decoder
Decoder
InterleaverDe-Interleaver
Interleaver
DecideIn Out
Turbo encoder
Turbo decoder
noisycommun.channel
Source: Liang, Tessier, Goeckel
![Page 35: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/35.jpg)
35
Self-tuning: run-time reconfiguration
• adapt: less channel noise, so lower power
• larger Nmax: better correction, more area/power
• sample channel noise every 250K bits
• find Signal to Noise Ratio (SNR), select Nmax
• if Nmax current Nmax, configure new bitstream
• configuration overhead: about 30ms, 54.8mW
controller
LUTSNR =?
Nmax
current Nmax
ConfigureFPGA
Source: Liang, Tessier, Goeckel
![Page 36: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/36.jpg)
36
Performance of dynamic upgrade
code(65,57
)(31,27)
(15,13)
staticspeed (Kbps) 173.4 301.2 487.9
power (mW) 447.7 205.8 134.3
dynamic
required reconfigs
8369/10000
6306/10000
6925/10000
speed (Kbps) 359.1 429.4 598.6
power (mW) 216.2 131.7 111.6
power saving
52% 36% 18%• up to 0.5x power, 2x speed over static decoder• 100 times faster than processor decoderSource: Liang, Tessier, Goeckel
![Page 37: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/37.jpg)
37
• domain-specific design automation– languages + tools: for particle physics systems?
• multi-core, sensor network co-design– multiple hardware/software: FPGA + CPU + sensors
• extending processor and compiler capabilities– static and dynamic optimizations, self-tuning
• power-aware, radiation-aware design– transforms e.g. pipelining, damage monitoring
• rapid and informative design validation– simulation + FPGA prototype + formal verification
Other directions
![Page 38: Field-Programmable Technology: Today’s and Tomorrow’s](https://reader036.fdocuments.us/reader036/viewer/2022062423/56814e90550346895dbc378f/html5/thumbnails/38.jpg)
38
• good: Moore’s Law, bad: productivity gap• vision: unified design synthesis and analysis• devices and design today
– growing gap: amount of I/O and amount of logic – enhance optimality and re-use: I/O driven
• devices tomorrow– hybrid FPGA: multi-granularity fabric– 3D FPGA: customisable system-in-package
• design tomorrow– guided synthesis: optimised and portable design – data representation optimisation– upgradable and self-tuned design
6. Summary