OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.
![Page 1: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/1.jpg)
OGO 2.1
SGI Origin 2000
Robert van Liere
CWI, Amsterdam
TU/e, Eindhoven
11 September 2001
![Page 2: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/2.jpg)
unite.sara.nl
• SGI Origin 2000
• Located at SARA in Amsterdam
• Hardware configuration:
  – 128 MIPS R10000 CPUs @ 250 MHz
  – 64 Gbyte main memory
  – 1 Tbyte disk storage
  – 11 Ethernet @ 100 Mbit/s
  – 1 Ethernet @ 1 Gbit/s
![Page 3: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/3.jpg)
Contents
• Architecture
  – Overview
  – Module interconnect
  – Memory hierarchies
• Programming
  – Parallel models
  – Data placement
• Pros and cons
![Page 4: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/4.jpg)
Overview - Features
• 64 bit RISC microprocessors
• Large main memory
• “Scalable” in CPU, memory and I/O
• Shared memory programming model
![Page 5: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/5.jpg)
Overview - Applications
• Worldwide: approx. 30,000 systems
  – ~ 50 with >128 CPUs
  – ~ 100 with 64-128 CPUs
  – ~ 500 with 32-64 CPUs
• Compute serving: many CPUs and much memory
• Database serving: many disks
• Web serving: much I/O
![Page 6: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/6.jpg)
System architecture – 1 CPU
• CPU + cache
• One system bus
• Memory
• I/O (network + disk)
• Cached data
![Page 7: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/7.jpg)
System architecture – N CPU
• Symmetric multi-processing (SMP)
• Multi-CPU + caches
• One shared bus
• Memory
• I/O
![Page 8: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/8.jpg)
N CPU – cache coherency
• Problem:
  – Inconsistent cached data
• Solution:
  – Snooping
  – Broadcasting
• Not scalable
![Page 9: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/9.jpg)
Architecture – Origin 2000
• Node board
• 2 CPUs + cache
• Memory
• Directory
• HUB
• I/O
![Page 10: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/10.jpg)
Origin 2000 Interconnect
• Node boards
• Routers
  – Six ports
![Page 11: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/11.jpg)
Interconnect Topology
![Page 12: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/12.jpg)
Sample Topologies
![Page 13: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/13.jpg)
128 Topology
![Page 14: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/14.jpg)
Virtual Memory
• One CPU, multiple programs
• Page
• Paging disk
• Page replacement
![Page 15: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/15.jpg)
O2000 Virtual Memory
• Multiple CPUs, multiple programs
• Non-Uniform Memory Access
• Efficient programs:
  – Minimize data movement
  – Data “close” to CPU
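The "data close to CPU" point can be sketched in C with POSIX threads: each thread initialises the same slice of the array that it later works on. This is only a sketch under assumptions that are not stated on the slide: that the operating system places a page in the memory of the node whose CPU touches it first, and that threads stay on the CPU they started on. The names `NTHREADS`, `slice_t`, `worker` and `first_touch_sum` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS 4          /* illustrative; not tied to any machine size */

typedef struct {
    double *data;           /* shared array */
    size_t  lo, hi;         /* this thread's slice: [lo, hi) */
    double  partial;        /* sum over this slice */
} slice_t;

/* Each thread first writes, then reads, only its own slice. Under a
 * first-touch placement policy (an assumption here) those pages end up
 * in memory local to the CPU running this thread. */
static void *worker(void *arg)
{
    slice_t *s = arg;
    for (size_t i = s->lo; i < s->hi; i++)
        s->data[i] = 1.0;               /* first touch */
    s->partial = 0.0;
    for (size_t i = s->lo; i < s->hi; i++)
        s->partial += s->data[i];       /* later work on the same slice */
    return NULL;
}

/* Returns the sum of n ones, computed slice by slice; -1.0 if the
 * allocation fails. */
double first_touch_sum(size_t n)
{
    pthread_t tid[NTHREADS];
    slice_t   sl[NTHREADS];
    double   *data = malloc(n * sizeof *data);  /* allocated, not yet touched */
    double    total = 0.0;

    if (data == NULL)
        return -1.0;

    for (int t = 0; t < NTHREADS; t++) {
        sl[t].data = data;
        sl[t].lo = n * t / NTHREADS;
        sl[t].hi = n * (t + 1) / NTHREADS;
        int rc = pthread_create(&tid[t], NULL, worker, &sl[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);     /* wait before reading the partial sum */
        total += sl[t].partial;
    }
    free(data);
    return total;
}
```

Initialising the whole array from a single thread before starting the workers would give the same result, but under the same assumption it would put every page on one node and make most later accesses remote.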
![Page 16: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/16.jpg)
Latencies and Bandwidth
![Page 17: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/17.jpg)
Application performance
• Scientific computing
  – LU, Ocean, Barnes, Radiosity
• Linear speedup
  – More CPUs -> proportionally higher performance
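For reference, "linear speedup" can be stated with the usual definitions of speedup and parallel efficiency. The symbols are introduced here and do not appear on the slide: $T_1$ is the run time on one CPU and $T_p$ the run time on $p$ CPUs.

```latex
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}
```

Speedup is linear when $S(p) \approx p$, that is, when $E(p) \approx 1$. As a made-up example, $T_1 = 80$ s and $T_{16} = 5$ s give $S(16) = 16$ and $E(16) = 1$.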
![Page 18: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/18.jpg)
Programming support
• IRIX operating system
• Parallel programming
  – C source level with compiler pragmas
  – POSIX threads
  – UNIX processes
• Data placement
  – dplace, dlook, dprof
• Profiling
  – timex, ssrun
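The "C source level with compiler pragmas" model can be sketched as follows. The slide does not say which pragma set is meant; OpenMP is used here as an assumed stand-in, and the function name `dot` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of x and y. The pragma asks an OpenMP-capable compiler to
 * split the loop over threads and combine the per-thread sums. A compiler
 * without OpenMP support ignores the pragma and runs the loop
 * sequentially, so the source stays valid C either way. */
double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    assert(x != NULL && y != NULL);
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++)
        sum += x[i] * y[i];
    return sum;
}
```

The appeal of this model is that the sequential code is left almost untouched: the parallelism is an annotation on the loop, not a restructuring of the program.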
![Page 19: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/19.jpg)
Parallel Programs
• Functional Decomposition
  – Decompose the problem into different tasks
• Domain Decomposition
  – Partition the problem’s data structure
• Consider
  – Mapping tasks/parts onto CPUs
  – Coordinating the work and communication of the CPUs
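A minimal sketch of domain decomposition for a one-dimensional array: the index range is cut into contiguous blocks, one per CPU. The names `range_t` and `block_range` are hypothetical, and block partitioning is only one possible mapping; the slides do not prescribe one.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open index range [lo, hi) owned by one part. */
typedef struct { size_t lo, hi; } range_t;

/* Block partition of n elements over nparts parts: the first n % nparts
 * parts get one extra element, so part sizes differ by at most one. */
range_t block_range(size_t n, size_t nparts, size_t part)
{
    range_t r;
    size_t base, extra;

    assert(nparts > 0 && part < nparts);
    base  = n / nparts;
    extra = n % nparts;
    r.lo = part * base + (part < extra ? part : extra);
    r.hi = r.lo + base + (part < extra ? 1 : 0);
    return r;
}
```

For example, 10 elements over 4 parts yield the ranges [0, 3), [3, 6), [6, 8) and [8, 10). Keeping the block sizes within one element of each other balances the work when every element costs about the same.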
![Page 20: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/20.jpg)
Task Decomposition
• Decompose problem
• Determine dependencies
![Page 21: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/21.jpg)
Task Decomposition
• Map tasks on threads
• Compare:
  – Sequential case
  – Parallel case
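The two task decomposition slides can be illustrated with a small sketch in C and POSIX threads. The three tasks are hypothetical placeholders: A and B are independent of each other, and C depends on the results of both.

```c
#include <assert.h>
#include <pthread.h>

/* Three hypothetical tasks: A and B are independent, C needs both results. */
static int task_a(void) { return 2; }
static int task_b(void) { return 3; }
static int task_c(int a, int b) { return a * b; }

/* Sequential case: A, then B, then C. */
int run_sequential(void)
{
    int a = task_a();
    int b = task_b();
    return task_c(a, b);
}

/* Thread entry point: run B and store its result through the pointer. */
static void *run_b(void *out)
{
    *(int *)out = task_b();
    return NULL;
}

/* Parallel case: B runs on its own thread while A runs on the caller's
 * thread; the join enforces the dependency of C on B. */
int run_parallel(void)
{
    pthread_t tb;
    int a, b = 0;
    int rc = pthread_create(&tb, NULL, run_b, &b);
    assert(rc == 0);
    (void)rc;
    a = task_a();
    pthread_join(tb, NULL);
    return task_c(a, b);
}
```

Both versions compute the same result. In the parallel case the elapsed time is at best that of the longer of A and B plus C, and only tasks without a dependency between them can overlap.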
![Page 22: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/22.jpg)
Efficient programs
• Use many CPUs
  – Measure speedups
• Avoid:
  – Excessive data dependencies
  – Excessive cache misses
  – Excessive inter-node communication
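As one illustration of "excessive cache misses", the sketch below sums the same matrix in two loop orders. C stores two-dimensional arrays row by row, so the first version walks memory contiguously while the second jumps a full row between accesses. The matrix size and function names are hypothetical, and the actual cost difference depends on the cache sizes of the machine.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 256
#define COLS 256

/* Row by row: consecutive iterations touch adjacent addresses, so every
 * cache line that is fetched gets fully used. */
double sum_row_major(double m[ROWS][COLS])
{
    double sum = 0.0;
    assert(m != NULL);
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            sum += m[i][j];
    return sum;
}

/* Column by column: consecutive iterations are COLS doubles apart, so
 * each access may land on a different cache line. */
double sum_col_major(double m[ROWS][COLS])
{
    double sum = 0.0;
    assert(m != NULL);
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            sum += m[i][j];
    return sum;
}
```

The two functions visit the same elements and differ only in the order of memory accesses, which is what makes the comparison useful when timing them.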
![Page 23: OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001.](https://reader033.fdocuments.us/reader033/viewer/2022061617/56649d225503460f949f879b/html5/thumbnails/23.jpg)
Pros vs Cons
Pros:
• Multi-processor (128 CPUs)
• Large memory (64 Gbyte)
• Shared memory programming
Cons:
• Slow integer CPU
• Performance penalty:
  – Data dependencies
  – Off-board memory