International Symposium on Computer Architecture ( ISCA – 2010 )
JILP
The Journal of Instruction-Level Parallelism
1st JILP Workshop on Computer Architecture Competitions
(JWAC-1): Cache Replacement Championship
International Symposium on Computer Architecture ( ISCA – 2010 )
JILP Submission Requirements
• Cache replacement algorithm
• Code that fits into provided framework
• Maximum of 3 versions of code were allowed
• 4-page paper
JILP Statistics
• Submissions
  – 26 total papers
  – 35 distinct code submissions
• Distribution
  – Asia: 12
  – North America: 11
  – Europe: 3
JILP Metrics
• Performance Ranking
• Overall Paper Quality
• Adherence to Competition Rules
• Qualitative Assessment of Logic Complexity
• Intuition provided
JILP Process
• Reviews
  – 26 papers
  – 3 reviews per paper -> 78 reviews
  – 6 reviewers -> ~13 reviews per reviewer
  – 8 reviewers -> ~10 reviews per reviewer
• Phone program committee
  – Shared Google docs to manage process
10 Papers Accepted
JILP Types of Policies
• Cache Replacement Strategies:
  • Insertion Policies
  • Reuse Distance Prediction
  • Dead Block Prediction
  • Memory Region Based Prediction
  • Counter-based Prediction
  • Frequency-based Prediction
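To make the first category concrete, here is a minimal sketch of an insertion policy on a single cache set. It is an illustration only, not any submission's algorithm: the `CacheSet` class, the `insert_at_mru` flag, and `count_hits` are hypothetical names, and the model ignores everything except the recency order within one set.

```python
class CacheSet:
    """One set of a set-associative cache, ordered from LRU (index 0) to MRU (end)."""

    def __init__(self, ways, insert_at_mru=True):
        self.ways = ways
        self.insert_at_mru = insert_at_mru  # hypothetical knob: where new lines enter
        self.lines = []                     # tags, least recently used first

    def access(self, tag):
        """Return True on a hit, False on a miss (filling the line on a miss)."""
        if tag in self.lines:
            self.lines.remove(tag)
            self.lines.append(tag)          # promote to MRU on a hit
            return True
        if len(self.lines) == self.ways:
            self.lines.pop(0)               # evict the line in the LRU position
        if self.insert_at_mru:
            self.lines.append(tag)          # classic LRU: insert at MRU
        else:
            self.lines.insert(0, tag)       # alternative: insert at LRU
        return False


def count_hits(trace, ways, insert_at_mru):
    """Replay a list of tags through one set and count the hits."""
    cache_set = CacheSet(ways, insert_at_mru)
    return sum(cache_set.access(tag) for tag in trace)


# A cyclic working set one line larger than the set's associativity.
trace = [0, 1, 2, 3, 4] * 3
lru_hits = count_hits(trace, ways=4, insert_at_mru=True)
lip_hits = count_hits(trace, ways=4, insert_at_mru=False)
```

On this cyclic trace, MRU insertion thrashes and never hits, while LRU insertion keeps part of the working set resident. The only difference between the two runs is where a newly filled line is placed.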
JILP Thanks
• Organizing Committee
  – Aamer Jaleel, Intel (Chair)
  – Alaa Alameldeen, Intel
  – Moin Qureshi, IBM
• Sponsorship/Web
  – Eric Rotenberg
• Program Committee
  – Doug Burger, Microsoft
  – Mainak Chaudhuri, IITK
  – Aamer Jaleel, Intel
  – Gabriel Loh, Georgia Tech
  – Moinuddin Qureshi, IBM
  – Yan Solihin, NC State
JILP
RESULTS
JILP Experimental Framework
• Common framework
  • Allows for comparison of competing algorithms
• Trace driven performance model
  • 4-way OoO core
  • 3-level Cache Hierarchy
  • 32KB L1, 256KB L2
• Competition Focus: Replacement Policies for LLC (L3)
  • Private Cache: 1MB LLC (single core)
  • Shared Cache: 4MB LLC (4-core CMP)
JILP Workloads
• Workload Classes
  • SPEC CPU2006, Reference Inputs (29)
  • PC Games and Multimedia (22)
  • Enterprise Server (14)
• Tracing Methodology:
  • SPEC workload traces captured with Pin (using SimPoints)
  • Non-SPEC workloads captured on a HW tracing system
• Simulation Methodology:
  • Warm up: 100M instructions
  • Detailed Simulation: 100M instructions
  • Shorter traces were divided 50/50
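The warm-up and detailed-simulation split can be sketched as a small helper. This is a reading of the slide, not the framework's code: `simulation_windows` is a hypothetical name, "shorter" is assumed to mean shorter than 200M instructions, and giving any odd leftover instruction to the detailed window is an arbitrary choice.

```python
WARMUP_INSTRUCTIONS = 100_000_000    # from the slide: 100M warm-up
DETAILED_INSTRUCTIONS = 100_000_000  # from the slide: 100M detailed simulation


def simulation_windows(trace_length):
    """Return (warmup, detailed) instruction counts for a trace of the given length."""
    if trace_length >= WARMUP_INSTRUCTIONS + DETAILED_INSTRUCTIONS:
        return WARMUP_INSTRUCTIONS, DETAILED_INSTRUCTIONS
    # Assumption: a shorter trace is split 50/50 between the two phases.
    warmup = trace_length // 2
    return warmup, trace_length - warmup


full = simulation_windows(250_000_000)   # long trace: fixed 100M / 100M windows
short = simulation_windows(120_000_000)  # short trace: split in half
```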
JILP Experiments
• Single Threaded Workloads
  • All 65 traces
• Heterogeneous Multi-Programmed Workloads
  • 7 workloads selected from each of the three workload classes
  • 4-core combinations for each class created (7 choose 4 = 35)
  • 35 random selections created from all 21 workloads
  • Total # of Workloads For Shared Caches: 140
• Metrics:
  • ST Workloads: Throughput
  • Multi-Core Workloads: Weighted Speedup
All workloads kept secret from ALL contestants
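The shared-cache workload count and the multi-core metric can be checked with a short sketch. Several things here are assumptions: the function names and placeholder workload names are hypothetical, the random mixes are drawn without any attempt to avoid repeats, and weighted speedup uses its usual definition (the sum over cores of IPC when sharing the cache divided by IPC when running alone), since the slides do not spell out the baseline.

```python
import random
from itertools import combinations


def build_mixes(classes, n_random=35, seed=0):
    """Build 4-core mixes from a dict mapping class name -> list of 7 workloads."""
    mixes = []
    for names in classes.values():
        mixes.extend(combinations(names, 4))   # 7 choose 4 = 35 mixes per class
    pool = [w for names in classes.values() for w in names]  # all 21 workloads
    rng = random.Random(seed)
    # Assumption: each random mix is 4 distinct workloads drawn from the pool.
    mixes.extend(tuple(rng.sample(pool, 4)) for _ in range(n_random))
    return mixes


def weighted_speedup(ipc_shared, ipc_alone):
    """Sum of per-core IPC ratios (shared-cache run over stand-alone run)."""
    return sum(shared / alone for shared, alone in zip(ipc_shared, ipc_alone))


# Placeholder names only; the real workloads were kept secret.
classes = {
    "spec": [f"spec{i}" for i in range(7)],
    "games": [f"game{i}" for i in range(7)],
    "server": [f"server{i}" for i in range(7)],
}
mixes = build_mixes(classes)
total = len(mixes)  # 3 classes x 35 combinations + 35 random mixes
```

Under this definition a 4-core mix scores 4.0 when sharing the cache costs no core any performance, so lower values indicate contention in the shared LLC.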
JILP Private Cache Championship Results
JILP Private Cache Championship Awards
• 3rd Place:
  • D. Jimenez. Dead Block Replacement and Bypass with a Sampling Predictor
• 2nd Place:
  • P. Michaud. The 3P and 4P cache replacement policies
• Champion:
  • H. Gao and C. Wilkerson. A Dueling Segmented LRU Replacement Algorithm with Adaptive Bypassing
JILP Shared Cache Championship
JILP Shared Cache Championship Awards
• 3rd Place:
  • P. Michaud. The 3P and 4P cache replacement policies
• 2nd Place:
  • Y. Ishii, M. Inaba, and K. Hiraki. Map-based Adaptive Insertion Policy
• Champion:
  • H. Gao and C. Wilkerson. A Dueling Segmented LRU Replacement Algorithm with Adaptive Bypassing