Niagara: a 32-Way Multithreaded SPARC Processor
P. Kongetira, K. Aingaran, K. Olukotun
Sun Microsystems
Presented by Bogdan Romanescu
Goal
• Commercial server applications:
  – High thread-level parallelism (TLP)
    • Large numbers of parallel client requests
  – Low instruction-level parallelism (ILP)
    • High cache miss rates
    • Many unpredictable branches
    • Frequent load-load dependencies
• Power, cooling, and space are major concerns for data centers
Sun’s Solution
• UltraSPARC T1 processor
• “the highest-throughput and most eco-responsible processor ever created”
• Multicore
• Fine-grain multithreading within core
• Simple pipelines
• Small L1 cache
• Shared L2
• Metric: Performance/Watt
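To see why fine-grain multithreading suits workloads with high miss rates, here is a minimal back-of-the-envelope model. It is a sketch, not something taken from the slides: it assumes each thread alternates between a fixed run of useful cycles and a fixed stall, and that the core interleaves threads perfectly. The function name and the cycle counts are hypothetical.

```python
def pipeline_utilization(threads: int, busy_cycles: float, stall_cycles: float) -> float:
    """Fraction of cycles in which a single-issue pipeline does useful work.

    Assumed model: every thread repeats `busy_cycles` of work followed by
    `stall_cycles` waiting on a long-latency event such as a cache miss.
    Other threads fill the stall slots until the pipeline saturates at 1.0.
    """
    if threads < 1 or busy_cycles <= 0 or stall_cycles < 0:
        raise ValueError("need threads >= 1, busy_cycles > 0, stall_cycles >= 0")
    single_thread = busy_cycles / (busy_cycles + stall_cycles)
    return min(1.0, threads * single_thread)


# Hypothetical workload: 25 busy cycles between misses, 75-cycle miss penalty.
for n in (1, 2, 4):
    print(n, pipeline_utilization(n, busy_cycles=25, stall_cycles=75))
# prints: 1 0.25 / 2 0.5 / 4 1.0
```

Under these assumed numbers a single thread keeps the pipeline busy a quarter of the time, while four threads are enough to hide the stalls entirely. That is the intuition behind trading single-thread speed for throughput.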
SPARC pipe
• UltraSPARC II style
• Single issue, 6 stages: F (fetch), S (thread select), D (decode), E (execute), M (memory), W (write back)
• Shared units:
  – L1 $
  – TLB
  – Execution (X) units
  – Pipe registers
• Hazards:
  – Data
  – Structural
Integer Register file
• One register file / thread
• SPARC window: in, out, local registers
• Highly integrated cell structure to support 4 threads:
  – 8 windows of 32 locations / thread
  – 3 read ports + 2 write ports
  – Read/write: single cycle latency
• 1 Active Window Cell (copy of the architectural set window)
Thread scheduling
• Thread selection based on:
  – Previous long latency instruction in pipe
  – Instruction type
  – LRU status
• Select & Fetch coupled
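The selection criteria above can be sketched as a toy select stage. This is a simplified illustration, not the actual Niagara logic: it models only two of the three criteria (threads stalled on a long-latency instruction are excluded, and the least recently selected ready thread wins) and ignores instruction type. The class and method names are hypothetical.

```python
from collections import deque
from typing import Iterable, Optional


class ThreadSelect:
    """Toy select stage: the least recently selected ready thread wins."""

    def __init__(self, num_threads: int = 4) -> None:
        # Front of the deque is the least recently selected thread.
        self.lru = deque(range(num_threads))

    def pick(self, ready: Iterable[int]) -> Optional[int]:
        """Return the thread id to issue from, or None if all threads are stalled."""
        ready_set = set(ready)
        tid = next((t for t in self.lru if t in ready_set), None)
        if tid is not None:
            self.lru.remove(tid)
            self.lru.append(tid)  # now the most recently selected thread
        return tid


sel = ThreadSelect(4)
print(sel.pick({0, 1, 2, 3}))  # 0
print(sel.pick({0, 2, 3}))     # 2 (thread 1 is stalled, e.g. on a cache miss)
print(sel.pick({0, 2, 3}))     # 3
print(sel.pick({0, 1, 2, 3}))  # 1 (thread 1 is ready again and is the LRU thread)
print(sel.pick(set()))         # None (every thread stalled: pipeline bubble)
```

When all threads stay ready, this degenerates to round-robin, so each thread issues every fourth cycle.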
Memory
• 16 KB 4-way set assoc. I$ / core
• 8 KB 4-way set assoc. D$ / core
• 3 MB 12-way set assoc. L2 $ shared
  – 4 x 768 KB independent banks
  – 2 cycle throughput, 8 cycle latency
  – Direct link to DRAM & Jbus
  – Manages cache coherence for the 8 cores
  – CAM based directory
• Write through (L1 $)
  – allocate LD
  – no-allocate ST
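The write-through, allocate-on-load, no-allocate-on-store policy can be illustrated with a toy model of the 8 KB 4-way D$. Two details are assumptions that the slides do not state: the 16-byte line size and the LRU replacement. The class name and counters are hypothetical, and data values are not modelled, only which lines are present.

```python
from collections import OrderedDict


class WriteThroughL1:
    """Toy L1 D-cache: allocate on load miss, never allocate on store miss."""

    def __init__(self, size_bytes=8 * 1024, ways=4, line_bytes=16):
        # line_bytes=16 and LRU replacement are assumptions for illustration.
        self.ways = ways
        self.line_bytes = line_bytes
        self.num_sets = size_bytes // (ways * line_bytes)
        # One OrderedDict of tags per set; order tracks recency (oldest first).
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.l2_writes = 0

    def _locate(self, addr):
        line = addr // self.line_bytes
        return self.sets[line % self.num_sets], line // self.num_sets

    def load(self, addr):
        """Return True on a hit; on a miss, fill the line (allocate)."""
        lines, tag = self._locate(addr)
        if tag in lines:
            lines.move_to_end(tag)
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict LRU; lines are never dirty
        lines[tag] = True
        return False

    def store(self, addr):
        """Return True on a hit; every store is forwarded to the L2."""
        lines, tag = self._locate(addr)
        self.l2_writes += 1  # write-through: the L2 always sees the store
        hit = tag in lines
        if hit:
            lines.move_to_end(tag)
        return hit  # on a miss nothing is allocated


cache = WriteThroughL1()
print(cache.num_sets)       # 128 sets = 8 KB / (4 ways * 16 B)
print(cache.store(0x1000))  # False: store miss, line not allocated
print(cache.load(0x1000))   # False: still a miss, the load allocates the line
print(cache.store(0x1008))  # True: same 16-byte line is now present
print(cache.l2_writes)      # 2: both stores went through to the L2
```

Because every store reaches the L2, an L1 line is never the only up-to-date copy. That lets evictions be silent and leaves coherence to the shared L2.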
Performance
Test \ Architecture                                           Sun Fire T2000   IBM p5-550 (2 dual-core Power5 chips)   Dell PowerEdge
SPECjbb2005 (Java server software), business operations/sec   63,378           61,789                                  24,208 (SC1425 with dual single-core Xeon)
SPECweb2005 (Web server performance)                          14,001           7,881                                   4,850 (2850 with two dual-core Xeon processors)
NotesBench (Lotus Notes performance)                          16,061           14,740                                  n/a
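To put the scores above in relative terms, the short sketch below divides each Sun Fire T2000 result by the competing system's result. The numbers are copied from the slide. The dictionary layout and function name are hypothetical, and the ratios say nothing about price or power, which the slides treat separately.

```python
# Scores copied from the Performance slide; None marks a result the slide does not give.
SCORES = {
    "SPECjbb2005": {"Sun Fire T2000": 63378, "IBM p5-550": 61789, "Dell PowerEdge": 24208},
    "SPECweb2005": {"Sun Fire T2000": 14001, "IBM p5-550": 7881, "Dell PowerEdge": 4850},
    "NotesBench": {"Sun Fire T2000": 16061, "IBM p5-550": 14740, "Dell PowerEdge": None},
}


def t2000_ratios(scores, baseline="Sun Fire T2000"):
    """Return the T2000 score divided by each competitor's score, per benchmark."""
    ratios = {}
    for bench, row in scores.items():
        ratios[bench] = {
            system: round(row[baseline] / value, 2)
            for system, value in row.items()
            if system != baseline and value is not None
        }
    return ratios


for bench, row in t2000_ratios(SCORES).items():
    print(bench, row)
```

On these figures the T2000 is roughly level with the p5-550 on SPECjbb2005 and NotesBench (about 1.03x and 1.09x), and ahead by about 1.78x on SPECweb2005. Note that the two Dell entries are different PowerEdge models, so those ratios are not directly comparable with each other.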
“Home run”?
• Relatively slow single-thread performance
• Poor floating-point performance
• Lack of software support (Sun Fire T2000 does not support Linux or Windows)
• Price
• Concurrency counterattack
  – no place as a general-purpose computer running databases
  – small low-end market segment?
• Niagara II & the “Rock”
  – multiprocessor & enhanced single-thread support
References
• [1] P. Kongetira, et al., “Niagara: A 32-Way Multithreaded SPARC Processor,” IEEE Micro, vol. 25, pp. 21-29, Mar. 2005.
• [2] A. S. Leon, et al., “A Power-Efficient High-Throughput 32-Thread SPARC Processor,” ISSCC 2006, Session 5: Processors.
• [3] S. Chaudhry, S. Yip, P. Caprioli and M. Tremblay, “High Performance Throughput Computing,” IEEE Micro, vol. 25, issue 3, 2005.
• [4] http://opensparc.sunsource.net/nonav/opensparct1.html
• [5] http://www.sun.com/processors/UltraSPARC-T1/features.xml
• [6] http://www.sun.com/servers/coolthreads/t1000/benchmarks.jsp
• [7] http://news.com.com/Sun+begins+Sparc+phase+of+server+overhaul/2163-1010_3-5983365.html
• [8] http://h71028.www7.hp.com/ERC/cache/280124-0-0-0-121.html