THE EARTH SIMULATOR SYSTEM
By: Shinichi HABATA, Mitsuo YOKOKAWA, Shigemune KITAWAKI
Presented by: Anisha Thonour
Extracted from the government website
A high-end supercomputer (the Earth Simulator) is like an alien with a very big head (brain) but small arms and legs.
To make the most of its CPU power, thousands of arms and legs are necessary.
Definitions
Supercomputer: A supercomputer is a computer that leads the world in terms of processing capacity, particularly speed of calculation, at the time of its introduction. “Cost is no object with advanced technologies” – Dr. Pfeiffer
Parallel Processing: Processing in which multiple processors work on a single application simultaneously.
Cross-sectional View of the Earth Simulator Building
Topics to be introduced:
Introduction
System Overview
Processor Node
Interconnection Network
Performance
Conclusion
Introduction
Global change prediction using computer simulation
1000 times faster
Developed 1997 – February 2002
87.5% of peak performance (35.86 TFLOPS) – LINPACK
64.9% of peak performance (26.58 TFLOPS) – global atmospheric circulation model with the spectral method
System Overview
Parallel vector supercomputer: 640 processor nodes and an interconnection network
Each processor node holds 8 arithmetic processors and main memory
Peak performance (total system) = 40 TFLOPS
Achieved performance = 35.86 TFLOPS (LINPACK)
Interconnection network: 640 x 640 non-blocking crossbar switch
Inter-node bandwidth = 12.3 GB/s
System Overview ctd.
System Overview ctd.
1 cluster consists of 16 processor nodes, a cluster control station, an I/O control station, and system disks
640 nodes are divided into 40 clusters
2 types of clusters: S cluster (1) and L cluster (39)
S cluster: 2 nodes are used, one for interactive use and another for small-size batch jobs
User disks: store user files
Mass storage system: cartridge tape library system
System Overview ctd.
The super cluster control station manages all 40 clusters and provides a single-system-image operational environment
High-performance and high-efficiency architectural features:
Vector processor
Shared memory
High-bandwidth, non-blocking crossbar interconnection network
Parallelizing for high sustained performance:
Vector processing on a processor
Parallel processing with shared memory within a node
Parallel processing among distributed nodes via the interconnection network
Processor Node
Each PN consists of 8 APs, a main memory system, a remote-access control unit, and an I/O processor.
Each arithmetic processor can deliver up to 8 GFLOPS, and there are 8 APs per node.
It uses a high-efficiency heat sink based on heat pipes.
High-speed main memory devices reduce the memory access latency.
Paradigms provided within a processor node:
Vector processing on a processor
Parallel processing with shared memory
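The two intra-node paradigms can be mimicked in miniature: a NumPy whole-array operation stands in for a vector instruction sweeping through vector registers, and a thread pool over slices of one shared array stands in for the node's 8 APs. Everything here is an illustrative analogy, not the Earth Simulator's actual programming interface.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

a = np.arange(1_000_000, dtype=np.int64)   # one array in (shared) main memory

# Vector processing on a processor: a single whole-array operation.
vec_sum = (a * 2).sum()

# Parallel processing with shared memory: 8 workers (like a node's 8 APs)
# each handle a disjoint slice of the same array.
def partial_sum(chunk):
    return (chunk * 2).sum()

with ThreadPoolExecutor(max_workers=8) as ex:
    par_sum = sum(ex.map(partial_sum, np.array_split(a, 8)))

print(vec_sum == par_sum)  # True: same result from both paradigms
```

Integer arithmetic keeps the two results bit-identical; with floats the chunked sum could differ slightly because floating-point addition is not associative.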
Processor Node Configuration
Interconnection Network
640 x 640 non-blocking crossbar switch
Byte-slicing technique
A control unit and 128 data switch units
320 PN cabinets and 65 IN cabinets
Each PN cabinet contains 2 processor nodes; the 65 IN cabinets contain the interconnection network.
Interconnection Network Wiring
Inter-node communication mechanism
1. Node A requests the control unit to reserve a data path from node A to node B; the control unit reserves the data path, then replies to node A.
2. Node A begins data transfer to node B.
3. Node B receives all the data, then sends the data transfer completion code to node A.
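The three-step path-reservation protocol can be sketched as a toy simulation. The ControlUnit class, its methods, and the mailbox are invented for illustration only; they are not the real hardware interface.

```python
import threading

class ControlUnit:
    """Toy model of the crossbar control unit: it reserves a dedicated
    path to a destination node before any data moves."""
    def __init__(self):
        self._lock = threading.Lock()
        self._busy = set()   # destination ports currently reserved

    def reserve(self, src, dst):
        # Step 1: node `src` asks for a path to `dst`; the reply is
        # True once the path is reserved, False if `dst` is busy.
        with self._lock:
            if dst in self._busy:
                return False
            self._busy.add(dst)
            return True

    def release(self, dst):
        # Step 3 side effect: completion frees the path for other nodes.
        with self._lock:
            self._busy.discard(dst)

def send(cu, src, dst, data, mailbox):
    while not cu.reserve(src, dst):   # step 1: reserve the data path
        pass                          # spin until the path is free
    mailbox[dst] = data               # step 2: transfer the data
    cu.release(dst)                   # step 3: completion code frees the path

cu = ControlUnit()
mailbox = {}
send(cu, "A", "B", b"payload", mailbox)
print(mailbox["B"])  # b'payload'
```

With several concurrent senders targeting the same destination, the spin loop in `send` models nodes waiting for the control unit to grant the (non-blocking, but exclusive per output) crossbar path.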
Inter-node interface with ECC codes
Inter-node interface with ECC codes
To resolve the error occurrence rate problem, ECC codes are added to the transfer data.
A receiver node detects intermittent inter-node communication failures by checking the ECC codes, and the erroneous byte can almost always be corrected by the RCU within the receiver node.
ECC is thus used to recover from inter-node communication failures caused by a data switch unit malfunction.
Correction continues until the switch unit is repaired.
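The talk does not specify which ECC the inter-node interface uses, so the sketch below uses a classic Hamming(7,4) code purely to illustrate the idea: the receiver computes a syndrome from the check bits and repairs a single flipped bit, playing the role the RCU plays in the receiver node.

```python
def hamming74_encode(d):
    """Encode 4 data bits into a 7-bit Hamming(7,4) codeword
    (positions 1..7 hold p1, p2, d1, p3, d2, d3, d4)."""
    p1 = d[0] ^ d[1] ^ d[3]   # parity over codeword positions 1,3,5,7
    p2 = d[0] ^ d[2] ^ d[3]   # parity over positions 2,3,6,7
    p3 = d[1] ^ d[2] ^ d[3]   # parity over positions 4,5,6,7
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def hamming74_decode(code):
    """Correct any single flipped bit, then return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the bad bit; 0 = clean
    if syndrome:
        c[syndrome - 1] ^= 1          # receiver-side repair (the RCU's role)
    return [c[2], c[4], c[5], c[6]]

word = [1, 0, 1, 1]
sent = hamming74_encode(word)
garbled = sent[:]
garbled[4] ^= 1                       # intermittent single-bit failure in transit
recovered = hamming74_decode(garbled)
print(recovered == word)  # True
```

This mirrors the slide's point: as long as a failing data switch unit corrupts only a correctable amount per transfer, communication keeps working until the unit is repaired.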
Barrier synchronization mechanism using GBC
Barrier synchronization mechanism using GBC
GBC: global barrier counter. GBF: global barrier flag.
Barrier synchronization mechanism:
1. The master node sets the number of nodes used for the parallel program into the GBC within the IN's control unit.
2. The control unit resets all GBFs of the nodes used for the program.
3. Each node, on completing its task, decrements the GBC within the control unit and repeatedly checks its GBF until the GBF is asserted.
4. When GBC = 0, the control unit asserts all GBFs of the nodes used for the program.
5. All the nodes begin to process the next tasks.
The barrier synchronization time is constantly less than 3.5 µsec.
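Those GBC/GBF steps can be sketched as a toy shared-memory simulation. One threading.Event stands in for all the per-node GBFs, and the class and method names are invented for illustration, not taken from the hardware.

```python
import threading

class ControlUnit:
    """Toy model of the IN control unit's GBC/GBF barrier."""
    def __init__(self, num_nodes):
        self.gbc = num_nodes          # step 1: master sets GBC to the node count
        self.gbf = threading.Event()  # step 2: GBFs start reset (cleared)
        self._lock = threading.Lock()

    def task_done(self):
        # Step 3: a finishing node decrements GBC in the control unit.
        with self._lock:
            self.gbc -= 1
            if self.gbc == 0:
                self.gbf.set()        # step 4: GBC hit 0, assert all GBFs

def node(cu, results, i):
    cu.task_done()                    # this node's task is complete
    cu.gbf.wait()                     # step 3 cont.: check GBF until asserted
    results[i] = True                 # step 5: proceed to the next task

cu = ControlUnit(8)
results = [False] * 8
threads = [threading.Thread(target=node, args=(cu, results, i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(all(results))  # True: no node passed the barrier before all 8 finished
```

Because the real counter and flags live in the interconnection network's control unit rather than in software, the hardware barrier completes in near-constant time, which is the point of the sub-3.5 µsec figure above.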
Bird's-eye View of the Earth Simulator System
Performance
Using the GBC feature, the MPI_Barrier synchronization time is constantly less than 3.5 µsec.
The software barrier synchronization time, by contrast, increases in proportion to the number of nodes.
Performance
The interconnection network is a single-stage network, so this performance is achieved for every two-node communication.
Performance
The ratio of peak performance is more than 85%.
Performance is proportional to the number of nodes.
Conclusion
High-performance and high-efficiency architectural features:
Vector processor
Shared memory
High-bandwidth, non-blocking crossbar interconnection network
Parallelizing for high sustained performance:
Vector processing on a processor
Parallel processing with shared memory within a node
Parallel processing among distributed nodes via the interconnection network
87.5% of peak performance (35.86 TFLOPS) – LINPACK
64.9% of peak performance (26.58 TFLOPS) – global atmospheric circulation model with the spectral method
Applications: Solid Earth Simulation Group
We are developing new algorithms for geophysical simulations as well as new grid systems in spherical geometry.
Solid Earth Simulation Group
To understand the mechanism of the variability with time scales from a few days to decades and to study the predictability in the atmosphere.
To study the effects of meso-scale phenomena on the ocean general circulation and the material transport.
To understand the mechanism of the variability and to study the predictability in the coupled atmosphere–ocean system.
References
http://www.thocp.net/hardware/nec_ess.htm
http://www.es.jamstec.go.jp/esc/eng/Hardware/in.html
Thank You