Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.
![Page 1: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/1.jpg)
Adaptive Insertion Policies for High Performance Caching
Qureshi, et al.
EECE527 - Paper Summary
Jose Pinilla
![Page 2: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/2.jpg)
Cache Replacement Policies
● Victim Selection Policy
○ LRU
● Insertion Policy
○ MRU
○ LRU
![Page 3: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/3.jpg)
LRU (Baseline)
LRU replacement (commonly used):
![Page 4: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/4.jpg)
Belady’s OPT
Optimal page replacement algorithm (Changes Victim Selection Policy):
LRU replacement (commonly used):
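The difference between the two victim selection policies can be sketched in a few lines. This is a minimal illustration, not the slide's own figure: the reference string, the capacity of 3 lines, and the function names `count_misses`, `lru_victim` and `opt_victim` are all assumptions made for the example.

```python
def count_misses(refs, capacity, choose_victim):
    """Count misses for a fully associative cache with a pluggable victim policy."""
    cache = []                       # resident lines, least recently used first
    misses = 0
    for i, line in enumerate(refs):
        if line in cache:
            cache.remove(line)       # hit: refresh recency
            cache.append(line)
            continue
        misses += 1
        if len(cache) == capacity:
            cache.remove(choose_victim(cache, refs, i))
        cache.append(line)
    return misses


def lru_victim(cache, refs, i):
    """LRU: evict the least recently used resident line."""
    return cache[0]


def opt_victim(cache, refs, i):
    """OPT: evict the line whose next use is farthest away (or never comes)."""
    future = refs[i + 1:]

    def next_use(line):
        return future.index(line) if line in future else len(future)

    return max(cache, key=next_use)


# Illustrative reference string and capacity (assumptions, not taken from the paper).
refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print("LRU misses:", count_misses(refs, 3, lru_victim))   # 12
print("OPT misses:", count_misses(refs, 3, opt_victim))   # 9
```

OPT needs knowledge of future references, so it serves only as a bound to compare practical policies against.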
![Page 5: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/5.jpg)
LIP (LRU Insertion Policy)
LRU replacement (commonly used):
[Figure: cache contents after each reference of the example stream]
![Page 6: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/6.jpg)
Belady’s OPT
LIP (LRU Insertion Policy)
LRU replacement (commonly used):
[Figure: cache contents after each reference of the example stream]
![Page 7: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/7.jpg)
Cyclic Reference Model
for j = 1 to N
    instructions read (a1...aT)
for j = 1 to N
    instructions read (b1...bT)
Let there be an access pattern in which $(a_1 \cdots a_T)^N$ is followed by $(b_1 \cdots b_T)^N$
Cache Size K (K < T)
N >> T, N >> K/ϵ
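A small simulation makes the first phase of this model concrete. The sketch below assumes a single fully associative set, an initially empty cache, and hypothetical sizes T = 16, K = 8, N = 1000; the function name `hit_rate` and the flag `insert_at_mru` are illustrative. Both policies evict the LRU line and differ only in where a new line enters the recency stack.

```python
def hit_rate(trace, capacity, insert_at_mru):
    """Simulate one fully associative set with LRU victim selection.

    insert_at_mru=True  -> conventional LRU (new lines enter at the MRU position)
    insert_at_mru=False -> LIP (new lines enter at the LRU position)
    """
    stack = []                      # index 0 = MRU, last index = LRU
    hits = 0
    for line in trace:
        if line in stack:
            hits += 1
            stack.remove(line)
            stack.insert(0, line)   # a hit always promotes the line to MRU
            continue
        if len(stack) == capacity:
            stack.pop()             # the victim is always the LRU line
        if insert_at_mru:
            stack.insert(0, line)
        else:
            stack.append(line)
    return hits / len(trace)


T, K, N = 16, 8, 1000               # hypothetical sizes with K < T and N >> T
trace = [f"a{i}" for i in range(1, T + 1)] * N

print("LRU:", hit_rate(trace, K, insert_at_mru=True))    # 0.0, the cache thrashes
print("LIP:", hit_rate(trace, K, insert_at_mru=False))   # close to (K - 1) / T
```

With LRU insertion every line is evicted just before it is reused. With LIP, K - 1 lines stay resident and hit on every later pass, while the remaining references churn through the LRU slot.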
![Page 8: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/8.jpg)
Access Pattern: LRU Step 1
[Figure: reference stream a1, a2, a3, ..., aT; labels K and TN]
![Page 9: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/9.jpg)
Access Pattern: LRU Step 2
[Figure: reference stream a1, a2, a3, ..., aT; labels K and TN]
![Page 10: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/10.jpg)
Access Pattern: LRU Step X
[Figure: reference stream a1, a2, a3, ..., aT; labels K and TN]
![Page 11: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/11.jpg)
Access Pattern: LRU Step X>T*N
[Figure: reference streams a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels K and TN]
![Page 12: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/12.jpg)
Access Pattern: LIP Step 1
[Figure: reference streams a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, N, TN and K]
![Page 13: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/13.jpg)
Access Pattern: LIP Step 2
[Figure: reference streams a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, N, TN and K-1]
![Page 14: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/14.jpg)
Access Pattern: LIP Step X>T*N
[Figure: reference streams a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, N, TN and K-1]
![Page 15: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/15.jpg)
Bimodal Insertion
Control the percentage of incoming lines placed as MRU
ϵ = Bimodal throttle parameter
ϵ=1 => LRU
ϵ=0 => LIP
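The effect of the throttle parameter can be sketched by running both phases of the cyclic pattern, (a1...aT) N times and then (b1...bT) N times, through one fully associative set. This is an illustration under assumptions: the sizes T = 16, K = 8, N = 1000, the value ϵ = 1/32, the seeded software random number generator, and the name `bip_hit_rates` are all choices made for the example rather than details taken from the slides.

```python
import random


def bip_hit_rates(eps, T=16, K=8, N=1000, seed=0):
    """Per-phase hit rates for (a1..aT)^N followed by (b1..bT)^N under BIP."""
    rng = random.Random(seed)       # seeded so the sketch is repeatable
    stack = []                      # index 0 = MRU, last index = LRU
    rates = []
    for name in ("a", "b"):
        hits = 0
        for _ in range(N):
            for i in range(1, T + 1):
                line = f"{name}{i}"
                if line in stack:
                    hits += 1
                    stack.remove(line)
                    stack.insert(0, line)       # hits promote to MRU
                    continue
                if len(stack) == K:
                    stack.pop()                 # evict the LRU line
                if rng.random() < eps:
                    stack.insert(0, line)       # with probability eps: MRU insert
                else:
                    stack.append(line)          # otherwise: LRU insert
        rates.append(hits / (N * T))
    return rates


for eps in (1.0, 0.0, 1 / 32):
    print(eps, bip_hit_rates(eps))
```

With ϵ = 1 the set behaves like LRU and thrashes in both phases. With ϵ = 0 it behaves like LIP, which keeps part of the first working set but never lets the second one in. A small nonzero ϵ occasionally promotes a new line to MRU, so the set can follow the change of working set.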
![Page 16: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/16.jpg)
Access Pattern: BIP
[Figure: reference streams a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, N, TN and K-1]
![Page 17: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/17.jpg)
Access Pattern: BIP
[Figure: reference streams a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, N and TN]
![Page 18: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/18.jpg)
Access Pattern: BIP
[Figure: reference streams a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, N and TN]
![Page 19: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/19.jpg)
Access Pattern: BIP
[Figure: reference streams a1, a2, a3, ..., aT and b1, b2, b3, ..., bT; labels T, N and TN]
![Page 20: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/20.jpg)
Hit Rate
Cache Size K (K < T)
ϵ = Bimodal throttle parameter
ϵ=1 => LRU
ϵ=0 => LIP
N >> T, N >> K/ϵ
![Page 21: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/21.jpg)
Benchmarks
mcf, art, health
250M instructions obtained with SimPoint
![Page 22: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/22.jpg)
Results 1
So they proved that it works…
![Page 23: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/23.jpg)
Results 1
So they proved that it works…
...but don’t overdo it (ϵ)...
![Page 24: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/24.jpg)
Results 1
So they proved that it works…
...but don’t overdo it (ϵ)...
...actually, let’s choose LRU at run time sometimes.
![Page 25: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/25.jpg)
DIP: Select Mechanism
DIP - Global / DSS
DIP - Set Dueling
ATD: Auxiliary Tag Directory
MTD: Main Tag Directory
![Page 26: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/26.jpg)
DIP: Select Mechanism
DIP - Global / DSS
DIP - Set Dueling
Dedicated-Set Selection Policy: Static or Dynamic (+2 bits)
![Page 27: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/27.jpg)
DIP: Select Mechanism
DIP - Global / DSS
DIP - Set Dueling
Dedicated-Set Size
Selection Policy
![Page 28: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/28.jpg)
Run-time adaptation: PSEL values
PSEL>=512 then LIP
PSEL<512 then LRU
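The selection rule above can be sketched as a saturating counter. This is a hedged illustration: the 10-bit width (inferred from the 512 midpoint), the starting value, and the class and method names are assumptions. The update direction follows the set-dueling idea that a miss in a set dedicated to one policy counts against that policy, and the alternative policy is labeled LIP as on the slide.

```python
PSEL_BITS = 10                          # assumption: 10 bits, so the midpoint is 512
PSEL_MAX = (1 << PSEL_BITS) - 1         # 1023


class PolicySelector:
    """Saturating counter that picks the insertion policy for follower sets."""

    def __init__(self):
        self.psel = PSEL_MAX // 2       # start just below the midpoint (assumption)

    def miss_in_lru_dedicated_set(self):
        # A miss in a set that always uses LRU counts against LRU.
        self.psel = min(self.psel + 1, PSEL_MAX)

    def miss_in_lip_dedicated_set(self):
        # A miss in a set that always uses the alternative policy counts against it.
        self.psel = max(self.psel - 1, 0)

    def follower_policy(self):
        return "LIP" if self.psel >= 512 else "LRU"


selector = PolicySelector()
print(selector.follower_policy())       # LRU
selector.miss_in_lru_dedicated_set()
print(selector.follower_policy())       # LIP
```

Only the dedicated sets update the counter; every other set reads it on a miss to decide where to insert the incoming line.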
![Page 29: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/29.jpg)
Hardware advantages
● LIP, BIP and DIP similar to current LRU approximations
● DIP does not require extra bits in the tag-store entry
● No major logic overhead means the cache access time is unaffected
![Page 30: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/30.jpg)
Related Work
R: Random, N: Random from the less recent half, F: Frequently
● Bypass
● Early Eviction
● Dynamic Exclusion
![Page 31: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/31.jpg)
Remarks
Retain some fraction of the working set
Dynamically adapt to workloads and patterns
Low overhead (Set dueling)
![Page 32: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/32.jpg)
Questions?
![Page 33: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/33.jpg)
Questions?
What would be the behaviour if DIP used ATDs dedicated to LRU and LIP?
● Compare Amean
Dynamic ϵ
● Can ϵ be extracted from PSEL?
![Page 34: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/34.jpg)
References
“Cache Replacement with Dynamic Exclusion”. Scott McFarling
“Set-Dueling-Controlled Adaptive Insertion for High-Performance Caching”. Qureshi et al.
“Using SimPoint for Accurate and Efficient Simulation”. Perelman et al.
“Adaptive Caching for High-Performance Memory Systems”. PhD Dissertation. Qureshi
![Page 35: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/35.jpg)
McFarling: Conflict Between Loops
for i = 1 to 10
    for j = 1 to 10
        instruction a
    for j = 1 to 10
        instruction b
*$(a^{10} b^{10})^{10}$ = 0%
$(a_m a_h^9 b_m b_h^9)^{10}$ = 10%
* ignoring loop
Source: “Cache Replacement with Dynamic Exclusion”. Scott McFarling
![Page 36: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/36.jpg)
McFarling: Conflict Between Loops Levels
for i = 1 to 10
    for j = 1 to 10
        instruction a
    instruction b
Direct-mapped: $(a_m a_h^9 b_m)^{10}$ = 18%
Optimal: $a_m a_h^9 b_m (a_h^{10} b_m)^9$ = 10%
Source: “Cache Replacement with Dynamic Exclusion”. Scott McFarling
![Page 37: Summary - Adaptive Insertion Policies for High Performance Caching. Qureshi, et al.](https://reader034.fdocuments.us/reader034/viewer/2022042607/55ac05e61a28ab9b518b46aa/html5/thumbnails/37.jpg)
McFarling: Conflict within Loops
for i = 1 to 10
    instruction a
    instruction b
Direct-mapped: $(a_m b_m)^{10}$ = 100%
Optimal: $a_m b_m (a_h b_m)^9$ = 55%
Source: “Cache Replacement with Dynamic Exclusion”. Scott McFarling
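The miss rates quoted on the three McFarling slides can be reproduced with a one-line direct-mapped cache in which instructions a and b conflict. In the patterns, the subscripts m and h mark a miss and a hit. The sketch below is an illustration only: `miss_rate` and its `keep` parameter are hypothetical names, and the pinned-line policy is a simple stand-in that happens to match the slides' optimal figures, not McFarling's dynamic exclusion mechanism itself.

```python
def miss_rate(pattern, keep=None):
    """Miss rate for two instructions that map to the same direct-mapped line.

    keep=None -> plain direct-mapped: every miss replaces the resident line
    keep="a"  -> once "a" is resident it is never replaced ("b" bypasses the cache)
    """
    resident = None
    misses = 0
    for instr in pattern:
        if instr == resident:
            continue                    # hit
        misses += 1
        if keep is None or resident != keep:
            resident = instr            # replace on a miss unless the line is pinned
    return misses / len(pattern)


between_loops = (["a"] * 10 + ["b"] * 10) * 10   # two inner loops
loop_levels = (["a"] * 10 + ["b"]) * 10          # a in the inner loop, b in the outer
within_loop = ["a", "b"] * 10                    # a and b in the same loop

print(miss_rate(between_loops))                  # 10% direct-mapped
print(miss_rate(loop_levels), miss_rate(loop_levels, keep="a"))    # about 18% vs 10%
print(miss_rate(within_loop), miss_rate(within_loop, keep="a"))    # 100% vs 55%
```

Keeping one of the two conflicting instructions resident costs a miss on every execution of the other, but it removes the ping-pong replacement that makes the plain direct-mapped cache miss on both.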