Scalable Directory Protocols for 1000s of Cores

Dominic DiTomaso, EE 6633


Page 1: Scalable Directory Protocols for 1000s of Cores

Scalable Directory Protocols for 1000s of Cores
Dominic DiTomaso
EE 6633

Page 2: Scalable Directory Protocols for 1000s of Cores

Outline

• Introduction
• Background
• ATAC
• SPATL
• Cuckoo Directory
• SCD
• Conclusions

Page 3: Scalable Directory Protocols for 1000s of Cores

Directory Protocols
• Snoopy (broadcast) -> Directory (multicast)
• Large Directory Overhead
  • Overhead = P*M
  • P - # of processors
  • M - # of memory blocks
• 64 nodes: 12.7% overhead
• 256 nodes: 50% overhead
• 1024 nodes: 200% overhead
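
The overhead figures above can be reproduced with a short calculation. The sketch below assumes a full bit-vector directory with one presence bit per node plus one state bit per entry, and 64-byte memory blocks. Neither assumption is stated on the slide, but together they yield the quoted 12.7%, 50%, and 200% figures. The function name and parameters are illustrative.

```python
def directory_overhead(num_nodes, block_bytes=64, state_bits=1):
    """Directory bits per memory block, as a fraction of the block's data bits.

    Assumptions (not from the slide): full bit-vector directory,
    64-byte blocks, and one extra state bit per directory entry.
    """
    entry_bits = num_nodes + state_bits   # one presence bit per node + state
    data_bits = block_bytes * 8           # size of the block being tracked
    return entry_bits / data_bits


for nodes in (64, 256, 1024):
    print(f"{nodes} nodes: {directory_overhead(nodes):.1%} overhead")
# prints:
# 64 nodes: 12.7% overhead
# 256 nodes: 50.2% overhead
# 1024 nodes: 200.2% overhead
```

Because the entry width grows linearly with the node count while the block size stays fixed, the directory becomes larger than the memory it describes somewhere between 256 and 1024 nodes.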

[Figure: diagram with dimensions labeled P and M]

Page 4: Scalable Directory Protocols for 1000s of Cores

Directory Protocols
• Requirements
  • Small area, energy, and latency overheads
  • Accurate sharer information
  • Limited directory-induced invalidations
• Duplicate Tags
  • Area-efficient
  • High associativity -> high power
• Sparse Directory
  • Power-efficient
  • Large capacity -> large area
• Coarse-grain vectors, Hierarchical, etc.
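
As an illustration of the coarse-grain vector idea, the sketch below lets each directory bit stand for a group of cores instead of a single core. This shrinks the entry but makes sharer information inexact: an invalidation must go to every core in a marked group. The function names and the group size are assumptions chosen for the example.

```python
def coarse_vector(sharers, cores_per_bit):
    """Encode a set of sharer core IDs as a coarse-grain bit vector."""
    vector = 0
    for core in sharers:
        vector |= 1 << (core // cores_per_bit)  # mark the core's group
    return vector


def invalidation_targets(vector, cores_per_bit, num_cores):
    """Every core whose group bit is set must receive an invalidation."""
    return [c for c in range(num_cores)
            if (vector >> (c // cores_per_bit)) & 1]


# 16 cores, 4 cores per bit: a 4-bit entry instead of a 16-bit one.
vec = coarse_vector({1, 9}, cores_per_bit=4)
print(bin(vec))                                                   # 0b101
print(invalidation_targets(vec, cores_per_bit=4, num_cores=16))
# [0, 1, 2, 3, 8, 9, 10, 11]
```

Only cores 1 and 9 hold the block, yet eight cores are invalidated. This is the accuracy that the requirements list above trades against area.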

Page 5: Scalable Directory Protocols for 1000s of Cores

ATAC
• Optical Broadcast Network
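
A cheap broadcast network changes what the directory has to store. The sketch below is a simplified model in the spirit of the limited-pointer scheme described in the ATAC paper [1]: the entry tracks a few sharers exactly, and once they overflow it keeps only a sharer count and falls back to broadcast, waiting for that many acknowledgements. The class name, method names, and pointer count are assumptions for illustration, not the paper's implementation.

```python
class LimitedPointerEntry:
    """Directory entry with a few exact pointers and a broadcast fallback."""

    def __init__(self, max_pointers=4):
        self.max_pointers = max_pointers
        self.pointers = set()     # exact sharer IDs while they fit
        self.num_sharers = 0      # always maintained
        self.overflowed = False   # True once sharers exceed max_pointers

    def add_sharer(self, core):
        # Assumes the requesting core does not already hold the block.
        self.num_sharers += 1
        if self.overflowed:
            return
        if len(self.pointers) < self.max_pointers:
            self.pointers.add(core)
        else:
            self.overflowed = True
            self.pointers.clear()  # identities are no longer tracked

    def invalidate(self):
        """Return how to invalidate and what to wait for."""
        if self.overflowed:
            # Broadcast to all cores; expect num_sharers acknowledgements.
            return ("broadcast", self.num_sharers)
        return ("multicast", sorted(self.pointers))


entry = LimitedPointerEntry(max_pointers=2)
entry.add_sharer(0)
entry.add_sharer(1)
print(entry.invalidate())   # ('multicast', [0, 1])
entry.add_sharer(2)
print(entry.invalidate())   # ('broadcast', 3)
```

The entry size no longer depends on the core count, at the cost of broadcast traffic for widely shared blocks.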

Page 6: Scalable Directory Protocols for 1000s of Cores

SPATL
• Tagless
• Bloom Filters
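
To make the Bloom-filter idea concrete, the sketch below keeps one small Bloom filter per core in place of explicit tags and asks each filter whether the core may hold a block. A Bloom filter can report false positives but never false negatives, so the result is a superset of the true sharers. This is a generic illustration, not SPATL's actual organization; the filter size, hash construction, and names are assumptions.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over block addresses."""

    def __init__(self, num_bits=64, num_hashes=2):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, block_addr):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{block_addr}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, block_addr):
        for pos in self._positions(block_addr):
            self.bits |= 1 << pos

    def may_contain(self, block_addr):
        return all((self.bits >> pos) & 1
                   for pos in self._positions(block_addr))


def possible_sharers(filters, block_addr):
    """Cores whose filter matches; a superset of the real sharers."""
    return [core for core, f in enumerate(filters)
            if f.may_contain(block_addr)]


filters = [BloomFilter() for _ in range(4)]   # one filter per core
filters[1].add(0x1000)
filters[3].add(0x1000)
print(possible_sharers(filters, 0x1000))      # [1, 3]
```

In this example the other two filters are empty, so they cannot match. With fuller filters, extra cores may appear in the list and receive unnecessary invalidations.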

Page 7: Scalable Directory Protocols for 1000s of Cores

Cuckoo Directory
• N-ary Cuckoo Hash Table
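
The following is a minimal sketch of N-ary cuckoo hashing applied to directory entries. Each tag has one candidate slot per way; if all candidates are full, an occupant is displaced and reinserted elsewhere. If the displacement budget runs out, the entry left in hand is returned as a victim, which in a directory would correspond to a forced invalidation. The class, its parameters, and the hash function are assumptions for illustration, not the design from the Cuckoo directory paper [4].

```python
import random


class CuckooTable:
    def __init__(self, num_ways=4, sets_per_way=8, max_displacements=16, seed=0):
        self.ways = [[None] * sets_per_way for _ in range(num_ways)]
        self.sets_per_way = sets_per_way
        self.max_displacements = max_displacements
        self.rng = random.Random(seed)

    def _index(self, way, tag):
        # A different hash per way (illustrative choice for integer tags).
        return hash((way, tag)) % self.sets_per_way

    def lookup(self, tag):
        for way, slots in enumerate(self.ways):
            entry = slots[self._index(way, tag)]
            if entry is not None and entry[0] == tag:
                return entry[1]
        return None

    def insert(self, tag, sharers):
        """Insert a new tag; return an evicted (tag, sharers) or None."""
        entry = (tag, sharers)
        for _ in range(self.max_displacements + 1):
            # Try every candidate slot of the entry currently in hand.
            for way, slots in enumerate(self.ways):
                idx = self._index(way, entry[0])
                if slots[idx] is None:
                    slots[idx] = entry
                    return None
            # All candidates full: displace one occupant and retry with it.
            way = self.rng.randrange(len(self.ways))
            idx = self._index(way, entry[0])
            self.ways[way][idx], entry = entry, self.ways[way][idx]
        return entry  # displacement budget exhausted: this entry is the victim


table = CuckooTable()
victims = [v for v in (table.insert(t, {t % 4}) for t in range(20)) if v]
print(table.lookup(7), len(victims))
```

Displacement lets the table reach high occupancy with few conflict evictions, which is what keeps directory-induced invalidations low without a highly associative lookup.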

Page 8: Scalable Directory Protocols for 1000s of Cores

SCD
• Variable directory tags
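
One way to picture variable directory tags is an encoder that spends a single tag on blocks with few sharers and several tags on widely shared blocks. The sketch below uses a limited-pointer format for small sharer sets and a root tag plus leaf bit-vectors for large ones, following the general idea in the SCD paper [2]. The pointer limit, leaf width, tuple layout, and function name are assumptions, and the real design differs in detail.

```python
def encode_sharers(sharers, max_pointers=3, leaf_bits=32):
    """Return the list of directory tags needed for one block.

    Assumed formats (illustrative only):
      ("pointers", (core, ...))      one tag, up to max_pointers sharers
      ("root", mask_of_used_leaves)  first tag of a multi-tag encoding
      ("leaf", leaf_id, bit_vector)  one tag per group of leaf_bits cores
    """
    sharers = sorted(set(sharers))
    if len(sharers) <= max_pointers:
        return [("pointers", tuple(sharers))]

    leaves = {}
    for core in sharers:
        leaf_id = core // leaf_bits
        leaves[leaf_id] = leaves.get(leaf_id, 0) | (1 << (core % leaf_bits))

    root_mask = 0
    for leaf_id in leaves:
        root_mask |= 1 << leaf_id
    return [("root", root_mask)] + [
        ("leaf", leaf_id, bits) for leaf_id, bits in sorted(leaves.items())
    ]


print(encode_sharers([5, 700]))
# [('pointers', (5, 700))]
print(encode_sharers([0, 1, 2, 40]))
# [('root', 3), ('leaf', 0, 7), ('leaf', 1, 256)]
```

Most blocks have few sharers and so cost one tag, while the exact sharer set is still available for the rare widely shared block.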

Page 9: Scalable Directory Protocols for 1000s of Cores

SCD (cont.)

Page 10: Scalable Directory Protocols for 1000s of Cores

SCD (cont.)

Page 11: Scalable Directory Protocols for 1000s of Cores

Conclusions
• Large directory overhead at 1000s of cores
• Solutions
  • Optics – ATAC
  • Tagless – SPATL
  • Hash Tables – Cuckoo
  • Variable Tags – SCD

Page 12: Scalable Directory Protocols for 1000s of Cores

References
• [1] G. Kurian, J. E. Miller, J. Psota, J. Eastep, J. Liu, J. Michel, L. C. Kimerling, and A. Agarwal, "ATAC: A 1000-core cache-coherent processor with on-chip optical network," in Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT '10), 2010.
• [2] D. Sanchez and C. Kozyrakis, "SCD: A scalable coherence directory with flexible sharer set encoding," in Proceedings of the 2012 IEEE 18th International Symposium on High-Performance Computer Architecture (HPCA '12), 2012.
• [3] H. Zhao, A. Shriraman, S. Dwarkadas, and V. Srinivasan, "SPATL: Honey, I Shrunk the Coherence Directory," in Proceedings of the 20th International Conference on Parallel Architectures and Compilation Techniques (PACT '11), 2011.
• [4] M. Ferdman, P. Lotfi-Kamran, K. Balet, and B. Falsafi, "Cuckoo directory: A scalable directory for many-core systems," in 2011 IEEE 17th International Symposium on High Performance Computer Architecture (HPCA), pp. 169-180, Feb. 2011.