ACCORD: Associativity for DRAM Caches by Coordinating Way-Install and Way-Prediction
ISCA 2018 · 2018-06-20
Authors: Vinson Young (GT), Chiachen Chou (GT), Aamer Jaleel (NVIDIA), Moinuddin K. Qureshi (GT)

3D-DRAM MITIGATES BANDWIDTH WALL
Modern systems pack many cores → Bandwidth Wall
3D-Stacked DRAM: Hybrid Memory Cube (HMC) from Micron, High Bandwidth Memory (HBM) from Samsung
✔ 4-8x bandwidth (of traditional memory)
✘ Limited capacity
3D-DRAM + High-Capacity Memory = Hybrid Memory

USE 3D-DRAM AS A CACHE
[Diagram: memory hierarchy from fast to slow. CPUs with private L1$ and L2$, a shared L3$, the DRAM-Cache (3D-DRAM), then System Memory (NVM / DRAM) as the OS-visible space.]
Using 3D-DRAM as a DRAM cache can improve memory bandwidth (and avoids OS/software changes).
Example: MCDRAM from Intel

ARCHITECTING LARGE DRAM CACHES
Organize at line granularity (64B) for capacity/BW utilization
Gigascale cache needs a large tag-store (tens of MBs)
3D-DRAM: 4GB data, 128MB tags
Tags? Too large for SRAM

ARCHITECTING LARGE DRAM CACHES
Organize at line granularity (64B) for high cache utilization
Gigascale cache needs a large tag-store (tens of MBs)
3D-DRAM: 4GB data, 128MB tags
Practical designs must store tags in DRAM
How to architect the tag-store for low-latency tag access?
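
The tag-store figure on this slide can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide (4GB of data, 64B lines, 128MB of tags); the variable names are illustrative and not from the talk.

```python
# Back-of-the-envelope tag-store sizing, using the numbers quoted on the slide.
CACHE_BYTES = 4 * 2**30          # 4GB of cached data
LINE_BYTES = 64                  # 64B line granularity
TAG_STORE_BYTES = 128 * 2**20    # 128MB of tags

# Number of lines the cache holds, and the tag budget each line gets.
num_lines = CACHE_BYTES // LINE_BYTES
tag_bytes_per_line = TAG_STORE_BYTES / num_lines

print(f"{num_lines} lines, {tag_bytes_per_line} tag bytes per line")
```

With these inputs the cache holds 64M lines, so a 128MB tag-store corresponds to about 2 bytes of tag and metadata per line, which is why it is far too large for on-chip SRAM.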

EFFICIENT TAG ORGANIZATION (KNL CACHE)
Practical designs use a 64B line size, store Tag-With-Data, and are direct-mapped, to optimize for hit latency.
Tag-With-Data [Alloy Cache, Intel Knights Landing]: Tag Data | Tag Data | Tag Data | Tag Data
Single Tag+Data lookup (1x hit latency), but direct-mapped
The Intel Knights Landing product (MCDRAM) uses this DRAM-cache organization.
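
A minimal sketch of the Tag-With-Data idea follows, assuming a toy cache where each set holds one (tag, data) pair and a single access returns both. The class and method names are hypothetical; this is not the Alloy Cache or Knights Landing implementation.

```python
from dataclasses import dataclass

LINE_BYTES = 64  # line size from the slide


@dataclass
class Entry:
    tag: int
    data: bytes


class DirectMappedTagWithData:
    """Toy direct-mapped cache: tag and data sit together in each set."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.sets = [None] * num_sets
        self.dram_accesses = 0

    def _split(self, addr):
        # Line address -> (set index, tag).
        line = addr // LINE_BYTES
        return line % self.num_sets, line // self.num_sets

    def lookup(self, addr):
        index, tag = self._split(addr)
        self.dram_accesses += 1  # one access streams out tag and data together
        entry = self.sets[index]
        if entry is not None and entry.tag == tag:
            return entry.data
        return None

    def install(self, addr, data):
        index, tag = self._split(addr)
        self.sets[index] = Entry(tag, data)  # direct-mapped: only one place to go
```

Every lookup costs exactly one access, hit or miss, but two addresses that share an index always evict each other, which is the limitation the rest of the talk addresses.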

POTENTIAL OF ASSOCIATIVITY
[Figure: (a) Hit Rate (%) for 1-way, 2-way, 4-way, and 8-way caches; (b) Speedup (Parallel) and (c) Speedup (Idealized) for 2-way, 4-way, and 8-way caches. Annotation: "Reduce 25% of misses".]
How can we make DRAM caches associative?
Assumes a 16-core system with a 4GB DRAM cache in front of PCM memory.

ASSOCIATIVITY OPTION 1: SERIAL TAG LOOKUP
[Diagram: Way0 holds A and Way1 holds B. The address is looked up in one way first; the other way is read only if that lookup misses.]
Serial tag lookup enables associativity, but it adds serialization delay.

ASSOCIATIVITY OPTION 2: PARALLEL TAG LOOKUP
[Diagram: Way0 holds A and Way1 holds B. Both ways are read together for the address.]
Parallel lookup avoids serialization latency, but it introduces a 2x bandwidth cost.
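
The trade-off between the two options can be illustrated with a toy cost model, assuming each way probe costs one DRAM access and that latency is counted in sequential steps. The function names are made up for this sketch.

```python
def serial_lookup(ways, tag):
    """Probe ways one at a time; returns (hit, dram_accesses, sequential_steps)."""
    for probes, stored in enumerate(ways, start=1):
        if stored == tag:
            return True, probes, probes
    # A miss is only known after every way has been checked.
    return False, len(ways), len(ways)


def parallel_lookup(ways, tag):
    """Probe all ways at once: one step of latency, one access per way."""
    return tag in ways, len(ways), 1


ways = ["A", "B"]  # Way0 holds A, Way1 holds B
print(serial_lookup(ways, "B"))    # second way reached only after the first misses
print(parallel_lookup(ways, "A"))  # both ways read, even though Way0 already hits
```

In this model a 2-way serial lookup takes up to two sequential steps, while the parallel lookup always spends two accesses of bandwidth, matching the 2x cost stated on the slide.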

ASSOCIATIVITY FOR DRAM CACHE (PARALLEL)
[Figure: (a) Hit Rate (%) for 1-way, 2-way, 4-way, and 8-way caches; (b) Speedup (Parallel) and (c) Speedup (Idealized) for 2-way, 4-way, and 8-way caches. Annotations: "Reduce 25% of misses", "-46%".]
Naively increasing associativity actually degrades performance due to the increased BW cost.

ASSOCIATIVITY FOR DRAM CACHE (IDEAL)
[Figure: (a) Hit Rate (%) for 1-way, 2-way, 4-way, and 8-way caches; (b) Speedup (Parallel) and (c) Speedup (Idealized) for 2-way, 4-way, and 8-way caches. Annotations: "Reduce 25% of misses", "-46%", "21%", "With latency / BW of direct-mapped".]
Associativity must still maintain the latency/BW of direct-mapped caches. How?

OPTION 3: WAY-PREDICTED TAG LOOKUP
[Diagram: Way-Predicted Tag Lookup. Way0 holds A and Way1 holds B. A way prediction selects which way to read for the address; the other way is read only if that lookup misses.]
Way-predicted tag lookup can obtain an improved hit rate with the BW / latency of a direct-mapped cache.

WAY-PREDICTION ACCURACY & COST

| | MRU Pred (1 bit/set) | Partial-Tag (4 bit/line) |
|---|---|---|
| Way-Pred Accuracy (2-way) | 85.7% | 97.3% |
| Way-Pred Accuracy (4-way) | 74.3% | 91.6% |
| Way-Pred Accuracy (8-way) | 63.2% | 81.2% |
| SRAM Storage | 4MB | 32MB |

Prior methods for way-prediction have low accuracy and/or high storage overhead.
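
The storage row and the cost of a misprediction can be checked with a short calculation. The sketch assumes the 4GB, 64B-line cache from the earlier slides, assumes the 4MB MRU figure corresponds to a 2-way organization, and models a 2-way hit as one access plus one extra access on a misprediction. All helper names are illustrative.

```python
CACHE_BYTES = 4 * 2**30   # 4GB DRAM cache (from the earlier slides)
LINE_BYTES = 64
MB = 2**20

num_lines = CACHE_BYTES // LINE_BYTES


def mru_storage_bytes(ways):
    """1 bit per set; a cache with `ways` ways has num_lines / ways sets."""
    return (num_lines // ways) // 8


def partial_tag_storage_bytes(bits_per_line=4):
    """A few partial-tag bits kept in SRAM for every line."""
    return num_lines * bits_per_line // 8


def accesses_per_hit_2way(accuracy):
    """Assumed model: a wrong prediction costs one extra access to the other way."""
    return 1.0 + (1.0 - accuracy)


print(mru_storage_bytes(2) // MB, "MB for MRU (2-way)")
print(partial_tag_storage_bytes() // MB, "MB for 4-bit partial tags")
print(round(accesses_per_hit_2way(0.857), 3), "accesses per hit at 85.7% accuracy")
print(round(accesses_per_hit_2way(0.973), 3), "accesses per hit at 97.3% accuracy")
```

Under these assumptions the two storage figures in the table are reproduced, and the accuracy gap translates into roughly 1.14 versus 1.03 accesses per hit for the 2-way case.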

TOWARDS ASSOCIATIVITY W/ WAY-PREDICTION
[Diagram: Way-Predicted Tag Lookup. Way0 holds A and Way1 holds B. A way prediction selects which way to read for the address; the other way is read only if that lookup misses.]
Goal: low-storage-overhead, high-accuracy way-prediction, to enable an associative DRAM cache

ACCORD OVERVIEW
• Background
• ACCORD
– Probabilistic Way-Steering (PWS)
– Ganged Way-Steering (GWS)
– Skewed Way-Steering (SWS)
• Summary

INSIGHT: WAY-PREDICTABILITY AT LOW STORAGE?
Base Install Policy (Rand): lines with even and odd tags land in either Way0 or Way1. Hard to predict (~50%).
Tag-based Install Policy: lines with even tags always go to one way and lines with odd tags to the other. Predict 100%! But the cache is then effectively direct-mapped.
Insight: Modifying the install policy can make way-prediction much simpler!

PROPOSAL: ACCORD
ACCORD: AssoCiativity by CoORDinating way-install and prediction.
[Diagram: the Way Install Policy and the Way Predictor are coordinated over a 2-way cache (Way0, Way1) holding lines A2, A3, B3, B5, B7.]
ACCORD achieves a way-predictable cache at low cost.

PROBABILISTIC WAY-STEERING
Install using PWS: pages A and B, bias = 90% / 10% (90% of installs go to the preferred way, 10% to the other way)
[Diagram: lines A0-A7 of page A and B0-B7 of page B spread across Way0 and Way1, with most lines in the way marked "Preferred".]
Static prediction: ~90%
Will use both ways, improving hit rate
PWS enables way-predictability by trading off the speed of learning to use both ways (hit rate).
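
The biased install can be sketched as follows. The sketch assumes a 2-way set where the low tag bit selects the preferred way (the slides do not say how the preferred way is chosen) and uses the fraction of installs that land in the preferred way as a stand-in for static prediction accuracy. Function names and the sample count are illustrative.

```python
import random


def preferred_way(tag):
    # Assumption for this sketch: the low tag bit names the preferred way.
    return tag & 1


def pws_install_way(tag, install_prob, rng):
    """Install in the preferred way with probability install_prob, else the other way."""
    if rng.random() < install_prob:
        return preferred_way(tag)
    return 1 - preferred_way(tag)


def static_prediction_accuracy(install_prob, num_installs=20000, seed=1):
    """Fraction of installs a 'always predict the preferred way' predictor gets right."""
    rng = random.Random(seed)
    correct = sum(
        pws_install_way(tag, install_prob, rng) == preferred_way(tag)
        for tag in range(num_installs)
    )
    return correct / num_installs


for prob in (0.5, 0.85, 0.9, 1.0):
    print(prob, round(static_prediction_accuracy(prob), 3))
```

A probability of 0.5 behaves like an unbiased 2-way cache with coin-flip predictability, while 1.0 is perfectly predictable but never uses the second way, so intermediate values trade hit rate against prediction accuracy.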

SENSITIVITY TO PWS PROBABILITY
[Figure: Way-Pred Accuracy (%) (axis 0% to 100%) and Miss Reduction (%) (axis 0% to 14%) versus Preferred-way Install Probability (50%, 60%, 70%, 80%, 85%, 90%, 100%). The x-axis runs from a 2-way design to direct-mapped. Series shown: Way-Pred Accuracy.]
Preferred-way Install Probability = x% bias to install in the preferred way

SENSITIVITY TO PWS PROBABILITY
[Figure: Way-Pred Accuracy (%) (axis 0% to 100%) and Miss Reduction (%) (axis 0% to 14%) versus Preferred-way Install Probability (50%, 60%, 70%, 80%, 85%, 90%, 100%). The x-axis runs from a 2-way design to direct-mapped. Series shown: Miss Reduction (%), Way-Pred Accuracy.]

SENSITIVITY TO PWS PROBABILITY
[Figure: Speedup (%), Miss Reduction (%) (axis 0% to 14%), and Way-Pred Accuracy (%) (axis 0% to 100%) versus Preferred-way Install Probability. Speedup labels: 2.6% at 50%, 3.7% at 60%, 4.7% at 70%, 5.5% at 80%, 5.6% at 85%, 5.3% at 90%, 0.0% at 100%.]
A preferred-way install probability of 85% provides the best trade-off of hit rate for WP accuracy, for a 5.6% speedup.

GANGED WAY-STEERING
Probabilistic Way-Steering: per-line randomized decision. Pred ~50%
Ganged Way-Steering: per-page randomized decision. Pred >90%
[Diagram: lines A0-A7 and B0-B7 placed in Way0 and Way1 under each policy, with the preferred way marked. Per-line decisions scatter a page's lines across both ways; per-page decisions keep a page's lines together in one way.]
Ganged Way-Steering makes the install decision at a large granularity, to improve predictability for workloads with high spatial locality.

GANGED WAY-STEERING IMPLEMENTATION
[Diagram: on an install, the RegionID (e.g., 0x001) indexes the Recent Install Table (RIT), whose stored way (0 or 1) guides the install. On an access, the RegionID (e.g., 0x101) indexes the Recent Lookup Table (RLT), whose stored way is the predicted way. The 2-way cache (Way0, Way1) holds lines A2, A3, B3, B5, B7.]
GWS: per-region last-way install + last-way prediction. A 64-entry RIT and a 64-entry RLT need only 320 bytes.
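
The two tables can be sketched as small region-indexed maps. The sketch below is an illustration under stated assumptions: a region is taken to be a 4KB page, entries are evicted in LRU order, and a random choice stands in for the PWS decision when a region is not in the RIT. None of these details are given on the slide, and all class and function names are hypothetical. The one derived fact is that 320 bytes across 2 × 64 entries is 20 bits per entry.

```python
import random
from collections import OrderedDict

REGION_BYTES = 4096                         # assumption: one region = one 4KB page
BITS_PER_ENTRY = 320 * 8 // (64 + 64)       # 320B over 128 entries = 20 bits each


class RecentWayTable:
    """RegionID -> last way used; oldest entry evicted when full (assumed policy)."""

    def __init__(self, entries=64):
        self.entries = entries
        self.table = OrderedDict()

    def get(self, region_id):
        way = self.table.get(region_id)
        if way is not None:
            self.table.move_to_end(region_id)
        return way

    def put(self, region_id, way):
        self.table[region_id] = way
        self.table.move_to_end(region_id)
        if len(self.table) > self.entries:
            self.table.popitem(last=False)


class GangedWaySteering:
    def __init__(self, seed=0):
        self.rit = RecentWayTable()   # Recent Install Table
        self.rlt = RecentWayTable()   # Recent Lookup Table
        self.rng = random.Random(seed)

    def choose_install_way(self, addr):
        region = addr // REGION_BYTES
        way = self.rit.get(region)
        if way is None:
            way = self.rng.randrange(2)   # stand-in for the PWS choice
        self.rit.put(region, way)         # later installs to this region follow it
        return way

    def predict_way(self, addr, default_way=0):
        way = self.rlt.get(addr // REGION_BYTES)
        return default_way if way is None else way

    def record_hit(self, addr, way):
        self.rlt.put(addr // REGION_BYTES, way)
```

Because consecutive installs to a region reuse the way recorded in the RIT, lines of the same page end up together, and the last hit way recorded in the RLT becomes a good prediction for the next access to that page.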

PWS+GWS WAY-PREDICTION ACCURACY
[Figure: Way-Pred Acc (%) (axis 70% to 100%) for PWS and PWS+GWS, shown for Libquantum and for the average of 21 workloads.]
PWS has ~85% base accuracy.
GWS enables spatial workloads to have near-100% accuracy.
The combination of PWS+GWS achieves 90% accuracy, at the cost of 320B of storage.

PWS+GWS (ACCORD 2-WAY) RESULTS
[Figure: Speedup (axis 0% to 12%) for PWS, PWS+GWS, and Perfect way-prediction.]
PWS+GWS achieves a 7.3% speedup, out of the 10% speedup of a perfectly predicted 2-way cache.
System assumes a 4GB DRAM cache and PCM-based main memory.

DIFFICULTY IN SCALING TO N-WAYS
• Scaling ACCORD to N-ways
– ACCORD 4-way has 3% speedup
– ACCORD 8-way has 6% slowdown…
• Miss confirmation: an N-way cache needs N accesses to confirm that a line is not resident
[Diagram: a 4-way set (Way0 to Way3) holds A, B, C, D; an access to address E must check all four ways before it is known to be a miss.]
We need solutions to reduce miss confirmation.

SOLUTION: SKEWED WAY-STEERING
4-way with 2-skew: each line has one preferred + one alternate way
[Diagram: accesses to A, B, and C in a 4-way set (Way0 to Way3); an access to E needs only 2 lookups to determine a miss.]
Restricting placement reduces miss confirmation → hit-rate benefits without any storage overhead
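
One way to picture the 2-skew restriction is a function that maps each tag to exactly two candidate ways. The mapping below (preferred way from the low tag bits, alternate way from the next bits) is an assumption made for this sketch; the slide only states that each line has one preferred and one alternate way.

```python
NUM_WAYS = 4


def candidate_ways(tag, num_ways=NUM_WAYS):
    """Return (preferred, alternate) ways for a tag; the two always differ."""
    preferred = tag % num_ways
    # Assumed hash: higher tag bits pick an offset in 1..num_ways-1.
    offset = 1 + (tag // num_ways) % (num_ways - 1)
    alternate = (preferred + offset) % num_ways
    return preferred, alternate


def skewed_lookup(set_ways, tag):
    """Probe only the two candidate ways; returns (hit, probes)."""
    probes = 0
    for way in candidate_ways(tag, len(set_ways)):
        probes += 1
        if set_ways[way] == tag:
            return True, probes
    return False, probes  # miss confirmed after 2 probes instead of num_ways


empty_set = [None] * NUM_WAYS
print(skewed_lookup(empty_set, 5))  # a miss costs 2 probes in a 4-way set
```

Different tags get different way pairs, so the set as a whole still uses all four ways, while any single line can be ruled out after checking two of them.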

SPEEDUP FROM ACCORD (WITH SWS)
[Figure: Speedup (axis 0% to 12%) for 2-Way, 4-Way, and 8-Way.]
SWS 8-way achieves 11% speedup

SUMMARY OF ACCORD
• ACCORD: associative DRAM caches by coordinating way-install and way-prediction.
• Probabilistic Way-Steering
– Biased install enables accurate static way-prediction
• Ganged Way-Steering
– Region-based install enables accurate region-based way-prediction
• Skewed Way-Steering
– Skew enables flexibility in line placement, while maintaining miss cost
• ACCORD enables associativity at negligible storage cost (320B), to achieve 11% speedup.

ACCORD BACKUP SLIDES

REPLACEMENT POLICY?
• LRU
– State in SRAM
  • 1 bit per line needs 8MB, the size of a last-level cache
– State in DRAM
  • 9% slowdown due to state-update cost (hit to alternate way)
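
The 8MB figure for SRAM-resident replacement state follows from the cache geometry used throughout the talk (4GB cache, 64B lines), as this short worked calculation shows:

```latex
\[
\frac{4\,\mathrm{GB}}{64\,\mathrm{B}} = \frac{2^{32}}{2^{6}} = 2^{26}\ \text{lines},
\qquad
2^{26}\ \text{lines} \times 1\ \tfrac{\text{bit}}{\text{line}}
  = 2^{26}\ \text{bits} = 2^{23}\ \mathrm{B} = 8\,\mathrm{MB}.
\]
```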

COMPARISON TO OTHER WAY PREDICTORS
[Figure: Speedup (axis -6% to 12%) of ACCORD and other way predictors.]
ACCORD outperforms other predictors while needing negligible storage overhead (320 B)

COLUMN-ASSOCIATIVE CACHE
• Column-associative / Hash-Rehash cache
– Install lines in preferred way (way-0)
– On eviction, move line to alternate way (way-1)
– On hit to alternate way, move to preferred way
• Effectiveness
– In general, way-prediction accuracy similar to MRU
– But requires significant bandwidth to swap lines on a hit to the alternate way. The CA-cache thus causes a 4% slowdown.
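
The three rules listed above can be sketched for a single 2-way set. This is a simplified illustration, not the original column-associative design: the class name, the swap counter, and the choice to drop whatever was in way-1 on an install are assumptions of the sketch.

```python
class ColumnAssociativeSet:
    """Toy 2-way set: way 0 is preferred, way 1 is the alternate."""

    def __init__(self):
        self.ways = [None, None]
        self.swaps = 0  # each swap costs extra DRAM bandwidth

    def access(self, tag):
        """Return True on a hit; on a miss, install the line and return False."""
        if self.ways[0] == tag:
            return True
        if self.ways[1] == tag:
            # Hit in the alternate way: move the line to the preferred way.
            self.ways[0], self.ways[1] = self.ways[1], self.ways[0]
            self.swaps += 1
            return True
        # Miss: the line evicted from way 0 moves to way 1, new line goes to way 0.
        self.ways[1] = self.ways[0]
        self.ways[0] = tag
        return False


s = ColumnAssociativeSet()
for tag in ["A", "B", "A", "B", "A"]:
    s.access(tag)
print(s.swaps)  # alternating accesses keep swapping the two lines
```

An access pattern that alternates between two lines hits every time after warm-up but swaps on every hit, which illustrates the bandwidth cost the slide attributes to the CA-cache.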