Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th,...
![Page 1: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/1.jpg)
Paging for Multi-Core Shared Caches
Alejandro López-Ortiz, Alejandro Salinger
ITCS, January 8th, 2012
![Page 2: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/2.jpg)
![Page 3: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/3.jpg)
Multi-Core challenges
• Access to data is a key factor
• Cache efficiency is a determining factor
  – Algorithms
  – Schedulers
  – Paging strategies
• Extensively studied for the sequential case
• Almost no previous theory for the multi-core case
![Page 4: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/4.jpg)
Sequential Paging
[Figure: slow memory and a cache of size K]
Page requests: … p6 p3 p2 p4 p4 p2 p10 p11 p5 p4 …
Is p_i in the cache?
– Yes: do nothing (hit)
– No: fetch p_i from slow memory and evict one page from the cache (fault)
Goal: minimize the number of faults
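
The hit/fault rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: it assumes LRU as the eviction policy, an initially empty cache, a cache size of 3 chosen only for the example, and it writes page `p6` as the integer `6`. The function name `lru_faults` is hypothetical.

```python
from collections import OrderedDict

def lru_faults(requests, k):
    """Count the faults of LRU on one request sequence with a cache of k pages."""
    cache = OrderedDict()  # cached pages, ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)        # hit: only refresh the page's recency
        else:
            faults += 1                    # fault: the page must be fetched
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults

# The request sequence shown on the slide, with an assumed cache size of 3.
requests = [6, 3, 2, 4, 4, 2, 10, 11, 5, 4]
print(lru_faults(requests, 3))  # prints 8
```

Only the eviction line changes when a different online policy is used; the fault-counting loop stays the same.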
![Page 5: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/5.jpg)
Sequential Paging
Common eviction policies:
– Least-Recently-Used (LRU)
– First-In-First-Out (FIFO)
– Flush-When-Full (FWF)
– Furthest-In-The-Future (FITF) (offline)
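• An online algorithm A is c-competitive if for all R: $A(R) \le c \cdot \mathrm{OPT}(R) + \beta$, where OPT is an optimal offline algorithm and β is a constant independent of R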
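
As a point of comparison for the online policies, the offline Furthest-In-The-Future rule for the sequential case can be sketched as follows. This is an illustrative sketch only: the function name `fitf_faults` and the cache size are assumptions, the cache starts empty, and the next-use lookup is a plain linear scan rather than an efficient index.

```python
def fitf_faults(requests, k):
    """Count the faults of Furthest-In-The-Future on a single request sequence."""
    cache = set()
    faults = 0
    for i, page in enumerate(requests):
        if page in cache:
            continue  # hit
        faults += 1
        if len(cache) >= k:
            def next_use(q):
                # Position of the next request to q after position i (infinity if none).
                for j in range(i + 1, len(requests)):
                    if requests[j] == q:
                        return j
                return float("inf")
            # Evict the cached page whose next request lies furthest in the future.
            cache.remove(max(cache, key=next_use))
        cache.add(page)
    return faults

requests = [6, 3, 2, 4, 4, 2, 10, 11, 5, 4]
print(fitf_faults(requests, 3))  # prints 7
```

Dividing an online policy's fault count by this number on a given sequence gives the empirical ratio that the competitive-ratio definition bounds in the worst case.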
![Page 6: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/6.jpg)
Multi-Core Paging
[Figure: Core 1, Core 2, Core 3 and Core 4 share an L2/L3 cache, which is backed by RAM]
![Page 7: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/7.jpg)
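Multi-Core Paging
• p sequences
• shared cache of size K
• total length n (n ≫ K, p)
• hit = 1 unit of time
• fault = τ units

| t | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R1 | p2 | p8 | p1 | p4 | p3 | p4 | p10 | p5 | … | | | |
| R2 | p9 | p1 | p8 | p2 | p1 | p1 | p4 | p7 | … | | | |
| R3 | p3 | p18 | p17 | p8 | p2 | p3 | p2 | p9 | … | | | |

Fault at t=2 on p1: the remaining requests of R2 are delayed.

| t | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R1 | p2 | p8 | p1 | p4 | p3 | p4 | p10 | p5 | … | | | |
| R2 | p9 | p1 | _ | _ | _ | p8 | p2 | p1 | p1 | p4 | p7 | … |
| R3 | p3 | p18 | p17 | p8 | p2 | p3 | p2 | p9 | … | | | |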
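
The timing model can be made concrete with a small simulation. The sketch below is a simplified reading of the slide, not the authors' code. It assumes a shared cache managed by LRU that starts empty, that a hit lets the core issue its next request one step later, that a fault inserts `tau` idle steps (shown as `_`, so three idle steps correspond to `tau = 3`), and that requests arriving in the same step are served in core order. The function name `shared_lru` and the toy input are hypothetical.

```python
from collections import OrderedDict

def shared_lru(sequences, k, tau):
    """Simulate shared-LRU paging; return (faults per sequence, timelines)."""
    p = len(sequences)
    pos = [0] * p      # index of the next request of each sequence
    ready = [1] * p    # earliest time step at which each core may issue a request
    faults = [0] * p
    timelines = [[] for _ in range(p)]
    cache = OrderedDict()  # shared cache, least recently used page first
    t = 1
    while any(pos[i] < len(sequences[i]) for i in range(p)):
        for i in range(p):
            if pos[i] >= len(sequences[i]):
                continue                   # this sequence is finished
            if ready[i] > t:
                timelines[i].append("_")   # core is still waiting for a fetch
                continue
            page = sequences[i][pos[i]]
            timelines[i].append(page)
            pos[i] += 1
            if page in cache:
                cache.move_to_end(page)    # hit
                ready[i] = t + 1
            else:
                faults[i] += 1             # fault: evict if full, then delay the core
                if len(cache) >= k:
                    cache.popitem(last=False)
                cache[page] = True
                ready[i] = t + 1 + tau
        t += 1
    return faults, timelines

faults, timelines = shared_lru([["a", "b", "a"], ["c", "a", "c"]], k=2, tau=2)
print(faults)  # prints [2, 3]
for row in timelines:
    print(" ".join(row))
```

Because a fault shifts the rest of one sequence in time, the interleaving of requests seen by the cache depends on earlier eviction decisions, which is what separates this setting from sequential paging.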
![Page 8: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/8.jpg)
Related Models
• Multiple applications or threads
• Multi-Core model [Hassidim, ICS'10]
  – Makespan
  – LRU is not competitive
  – Scheduling
• Our model:
  – No scheduling of requests
  – Separates scheduling and paging
  – Minimize faults
![Page 9: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/9.jpg)
Natural Strategies
• Share the cache
  – Eviction policy
• Partition the cache among cores
  – Partition function (static, dynamic)
  – Eviction policy
• Examples:
  – Shared-LRU
  – Optimal Static Partition with LRU
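
For small instances, an optimal static partition with LRU can be found by brute force. The sketch below rests on one observation: when each core has a private part of the cache managed by LRU, the faults of a sequence depend only on its own request order and its part size, so delays caused by other cores do not matter. The helper names, the requirement that every core gets at least one cell, and the toy input are assumptions made for illustration.

```python
from collections import OrderedDict
from itertools import product

def lru_faults(requests, k):
    """Faults of LRU on one sequence with a private cache of k pages."""
    cache, faults = OrderedDict(), 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)
        else:
            faults += 1
            if len(cache) >= k:
                cache.popitem(last=False)
            cache[page] = True
    return faults

def best_static_partition(sequences, k):
    """Try every split of k cells (at least one per core); return (faults, sizes)."""
    p = len(sequences)
    best = None
    for sizes in product(range(1, k + 1), repeat=p):
        if sum(sizes) != k:
            continue  # not a partition of the whole cache
        total = sum(lru_faults(seq, s) for seq, s in zip(sequences, sizes))
        if best is None or total < best[0]:
            best = (total, sizes)
    return best

print(best_static_partition([[1, 2, 3, 1, 2, 3], [7, 7, 7, 7]], 4))  # prints (4, (3, 1))
```

The enumeration is exponential in the number of cores and is meant only to make the definition concrete.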
![Page 10: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/10.jpg)
Partition vs. Shared
$$\frac{\text{Opt Static Partition}}{\text{Shared LRU}} = \Omega(n)$$

$$\frac{\text{Shared LRU}}{\text{Opt Static Partition}} \le K$$

For any online dynamic partition that changes o(n) times:

$$\frac{\text{Partition}}{\text{Shared LRU}} = \omega(1)$$

Partitions that don't change enough are not competitive
![Page 11: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/11.jpg)
Shared strategies
Theorem: Competitive ratio of (Shared) LRU = […] when the offline algorithm has a cache of size h ≈ K/2
The same applies to FIFO, CLOCK, FWF
![Page 12: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/12.jpg)
Proof idea
Faults LRU ≥ n/2
Faults Offline ≤ Initial + αK per coloured phase = […]
Competitive Ratio LRU = […]
Obs: Furthest-In-The-Future is not optimal
![Page 13: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/13.jpg)
The Offline Problem
![Page 14: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/14.jpg)
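PARTIAL-INDIVIDUAL-FAULTS (PIF):
Given a set of request sequences R = {R1, …, Rp}, a time t and bounds b1, …, bp, can R be served such that at time t the number of faults on each Ri is at most bi?

| t | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R1 | p1 | _ | _ | p2 | p8 | p1 | p4 | _ | _ | p10 | p5 | p1 | p4 | p2 | p9 | p9 | p5 | p2 | p3 | p7 |
| R2 | p2 | p9 | p1 | _ | _ | p4 | p8 | _ | _ | p1 | p4 | p7 | p2 | _ | _ | p3 | p4 | _ | _ | p1 |
| R3 | p3 | p4 | _ | _ | p8 | p2 | p3 | p2 | p9 | p5 | p1 | p4 | p2 | p9 | p9 | p1 | _ | _ | p4 | p2 |
| R4 | p2 | _ | _ | p3 | p8 | p1 | p1 | p3 | p9 | _ | _ | p10 | p5 | p1 | p8 | _ | _ | p1 | p4 | p2 |

E.g. at t = 18, is $\begin{pmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \end{pmatrix} \le \begin{pmatrix} 2 \\ 3 \\ 3 \\ 4 \end{pmatrix}$?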
![Page 15: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/15.jpg)
PARTIAL-INDIVIDUAL-FAULTS (PIF):
Theorem: PIF is NP-complete
• Optimization version (MAX-PIF): given an instance of PIF, maximize the number of sequences whose faults stay within their given bounds
Theorem: MAX-PIF is APX-hard
• Unless P=NP, there is no PTAS for MAX-PIF
![Page 16: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/16.jpg)
PIF vs. Min Faults
• Partial-Individual-Faults remains NP-hard even when […]
• If […], minimizing faults can be solved by FITF
• Achieving a fair fault distribution is harder than minimizing the total number of faults
![Page 17: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/17.jpg)
The Offline Problem
• Offline algorithm can align sequences properly by means of faults
• Algorithm could “force faults” for this sake

• Regular execution
Cache: p1 p2 p3 p5 p8 p9
R1: p1 p5 p4 p5 p1 p4 p6 p9 …
R2: p2 p3 p3 p2 p8 p8 p3 p10 p7 …
After the fault on p4 (cache: p1 p2 p3 p5 p8 p4):
R1: p1 p5 p4 _ _ _ p5 p1 p4 p6 p9 …
R2: p2 p3 p3 p2 p8 p8 p3 p10 p7 …

• Forcing a fault on p1
Cache: p1 p2 p3 p5 p8 p9
R1: p1 _ _ _ p5 p4 p5 p1 p4 p6 p9 …
R2: p2 p3 p3 p2 p8 p8 p3 p10 p7 …
After the fault on p4 (cache: p1 p4 p3 p5 p8 p9):
R1: p1 _ _ _ p5 p4 _ _ _ p5 p1 p4 p6 p9
R2: p2 p3 p3 p2 p8 p8 p3 p10 p7 …
![Page 18: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/18.jpg)
The Offline Problem
• However, this has no advantage over an honest offline algorithm
Theorem: Let A be an offline algorithm that forces faults. There exists an honest offline algorithm A' such that for all disjoint R, $A(R) = A'(R)$
![Page 19: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/19.jpg)
The Offline Problem
• For minimizing faults:
Theorem: There exists an optimal offline algorithm that upon each fault evicts a page whose next request time is maximal in R_j, for some j = 1..p
• Yields an […] time algorithm
• Can be improved to […] using dynamic programming (recall n ≫ p)
• This algorithm extends to Partial-Individual-Faults
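
The theorem limits the eviction choices at each fault to at most p candidates: for each sequence, its cached page requested furthest in the future. A tiny exhaustive search over those candidates is sketched below. It is an illustration under stated assumptions, not the algorithm from the talk: sequences are disjoint, the cache starts empty, a fault adds `tau` idle steps, and simultaneous requests are served in core order. The name `min_faults` is hypothetical and the running time is exponential, so it only suits toy inputs.

```python
def min_faults(sequences, k, tau):
    """Minimum total faults found by branching over one candidate per sequence."""
    p = len(sequences)
    # Disjointness assumption: every page belongs to exactly one sequence.
    owner = {page: i for i, seq in enumerate(sequences) for page in seq}

    def next_use(j, start, page):
        # Index of the next request to `page` in sequence j at or after `start`.
        seq = sequences[j]
        for idx in range(start, len(seq)):
            if seq[idx] == page:
                return idx
        return float("inf")

    def search(pos, ready, cache):
        pending = [i for i in range(p) if pos[i] < len(sequences[i])]
        if not pending:
            return 0
        # Serve the earliest pending request; ties are broken by core index.
        i = min(pending, key=lambda c: (ready[c], c))
        t = ready[i]
        page = sequences[i][pos[i]]
        new_pos = pos[:i] + (pos[i] + 1,) + pos[i + 1:]
        if page in cache:  # hit: next request one step later
            return search(new_pos, ready[:i] + (t + 1,) + ready[i + 1:], cache)
        new_ready = ready[:i] + (t + 1 + tau,) + ready[i + 1:]
        if len(cache) < k:  # fault with free space: no eviction needed
            return 1 + search(new_pos, new_ready, cache | {page})
        best = float("inf")
        for j in range(p):  # one eviction candidate per sequence
            mine = [q for q in cache if owner[q] == j]
            if not mine:
                continue
            victim = max(mine, key=lambda q: next_use(j, new_pos[j], q))
            best = min(best, 1 + search(new_pos, new_ready, (cache - {victim}) | {page}))
        return best

    return search((0,) * p, (1,) * p, frozenset())

print(min_faults([["a", "b", "a"], ["x", "y", "x"]], k=2, tau=1))
```

With a single sequence the search has one candidate per fault and reduces to Furthest-In-The-Future.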
![Page 20: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/20.jpg)
Conclusions
• Multi-core paging is significantly different from sequential paging
• Traditional paging strategies are not competitive
• Serving a set of requests while limiting faults in each sequence is hard
• Multi-core paging is in P when number of cores is constant
![Page 21: Paging for Multi-Core Shared Caches Alejandro López-Ortiz, Alejandro Salinger ITCS, January 8 th, 2012.](https://reader038.fdocuments.us/reader038/viewer/2022103023/56649da85503460f94a948f4/html5/thumbnails/21.jpg)
Open Problems
• What are good online strategies?
• What are good measures of performance?
  – Fairness?
• What is the complexity of minimizing the number of faults?
• Can we obtain more efficient offline algorithms (exact or approximate)?
Thank you