Figures src1
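[Figure: bar chart of normalized IPC per benchmark. Series: Minimum TLP, Optimal TLP. Benchmarks: IIX, PVC, SSC, SAD, BFS, NW, PFF, MUM, SPMV, KM, LUD, WP, AES, SCP, PVR, JPEG, LIB, NQU, FFT, BP, BLK, SD2, SD1, RAY, LKC, NN, PFN, HOT, MM, STO, CP, avg. Y-axis: Normalized IPC, ticks 0.4 to 1.6; one data label on the plot reads about 3.16.]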
[Figure: architecture diagram. Labels: cores (C) each paired with an L1, an On Chip Network, and four L2/DRAM blocks.]
[Figure: application hierarchy diagram. Labels: Application, Kernel, CTA, Warp, Thread.]
[Figure: bar chart of active time ratio per benchmark (IIX through CP, plus avg). Y-axis: Active Time Ratio (RACT), 0% to 100%.]
[Figure: four panels titled AES, CP, JPEG, and MM. Series in each panel: IPC, lat_rt, util. X-axis ticks: 1 to 4 (AES), 1 to 8 (CP, JPEG, MM). Y-axis ticks: 0 to 1.4 (AES, CP, JPEG), 0.7 to 1.05 (MM).]
[Figure: normalized IPC versus number of CTAs. Series: AES, MM, JPEG, CP. X-axis: Number of CTAs, 1 to 8. Y-axis: Normalized IPC, 0 to 1.4.]
[Figure: normalized latency versus number of CTAs. Series: AES, MM, JPEG, CP. X-axis: Number of CTAs, 1 to 8. Y-axis: Normalized latency, 0 to 1.4.]
[Figure: normalized core utilization versus number of CTAs. Series: AES, MM, JPEG, CP. X-axis: Number of CTAs, 1 to 8. Y-axis: Normalized core utilization, 0 to 1.4.]
[Figure: two CTA-to-core assignment diagrams, each with an Idle region. First: Core 1 runs CTA 1 to CTA 4, Core 2 runs CTA 5 to CTA 8. Second: Core 1 runs CTA 1, CTA 2, CTA 6, CTA 7; Core 2 runs CTA 3, CTA 4, CTA 5, CTA 8.]
[Figure: decision diagram. Labels: C_idle (levels L, H), C_mem (levels L, M, H), C_stall. Legend: Increment n, Decrement n.]
[Figure: relative performance versus number of cores turned off. Series: Type1, Type2. X-axis: Number of cores turned off, 0 to 16. Y-axis: Relative performance, 0.5 to 1.]
[Figure: second plot of relative performance versus number of cores turned off. Series: Type1, Type2. X-axis: Number of cores turned off, 0 to 16. Y-axis: Relative performance, 0.5 to 1.]
[Figure: bar chart of normalized IPC per benchmark (IIX through CP, plus avg). Series: DYNCTA, Optimal TLP. Y-axis: Normalized IPC, ticks 0.8 to 1.6. Data labels shown on the plot: 2.3, 2.9, 3.0, 1.9, 3.2, 2.9, 3.0, 2.1.]
[Figure: bar chart of normalized IPC per benchmark (IIX through CP, plus avg). Series: TL, DYNCTA, Optimal TLP. Y-axis: Normalized IPC, ticks 0.8 to 1.6. Data labels shown on the plot: 2.3, 2.9, 3.0, 1.9, 3.2, 2.9, 3.0, 2.1, 2.9.]
[Figure: bar chart per benchmark (IIX through CP, plus avg). Series: Round Trip Fetch Latency, Core Utilization. Y-axis: Normalized Value, ticks 0 to 1.6.]
[Figure: dual-axis plot. Series: Average n, Nopt, RACT. Left axis: Number of CTAs, 0 to 4.5. Right axis: Active time ratio (RACT), 0 to 1.]
[Figure: dual-axis plot. Series: Average n, Nopt, RACT. Left axis: Number of CTAs, 0 to 9. Right axis: Active time ratio (RACT), 0 to 1.]
[Figure: dual-axis plot. Series: Average n, Nopt, RACT. Left axis: Number of CTAs, 0 to 7. Right axis: Active time ratio (RACT), 0 to 1.]
[Figure: dual-axis plot. Series: Average n, Nopt, RACT. Left axis: Number of CTAs, 0 to 9. Right axis: Active time ratio (RACT), 0 to 1.]
[Figure: bar chart of average number of CTAs per benchmark (IIX through CP, plus avg). Series: N, DYNCTA, Optimal. Y-axis: Average number of CTAs, 0 to 8.]
[Figure: bar chart of normalized power per benchmark (IIX through CP, plus avg). Series: DYNCTA, DYNCORE with power gating, DYNCORE without power gating. Y-axis: Normalized Power, ticks 0.4 to 1.4.]
[Figure: bar chart of normalized energy efficiency per benchmark (IIX through CP, plus avg). Series: DYNCTA, DYNCORE with power gating, DYNCORE without power gating. Y-axis: Normalized Energy Efficiency, ticks 0.6 to 3. Data labels shown on the plot: 2.9, 3.1, 3.6, 3.2, 3.0, 2.6.]
[Figure: IPC, power, and energy versus number of cores turned off. X-axis: Number of Cores Turned Off (4, 8, 12, 16). Y-axis: Normalized Value, 0 to 2.]
[Figure: IPC, power, and energy versus system size. X-axis: Number of Nodes in the System (64, 121). Y-axis: Normalized Value, 0 to 2.]