High Performance Computing – G Burton – ICG – Oct12 – v1.1

Page 1:

High Performance Computing

G Burton – ICG – Oct12 – v1.1


Page 2:

Agenda

• Commodity Clusters

• Compute Pool

• Interconnects and Networks

• Shared Storage

• Login Layer and Middleware

Page 3:

HPC alternatives

Page 4:

HPC Facts

• IBM Sequoia – number 1 in the Top 500, with 1.572 million cores.

• 20 petaFLOPS (floating-point operations per second); Sciama is 10 teraFLOPS.

• China has number 5 plus 62 other systems in the Top 500, now ahead of Germany, the UK, Japan and France.

• 75% of the Top 500 use Intel processors.
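As a rough sanity check on those figures, dividing total throughput by core count gives the per-core rate. A back-of-envelope sketch using only the slide's numbers:

```shell
# Back-of-envelope: Sequoia's per-core throughput from the figures above
# (20 petaFLOPS spread across 1.572 million cores).
flops_per_core_gf=$(awk 'BEGIN { printf "%.1f", 20e15 / 1.572e6 / 1e9 }')
echo "~${flops_per_core_gf} GFLOPS per core"
```

That works out to roughly 12.7 GFLOPS per core.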

Page 5:

Demystifying the Techno Babble

Page 6:

Demystifying the Techno Babble (2)

Page 7:

Commodity Clusters


Page 8:

Commodity Clusters

• Made from commodity (off-the-shelf) components (read: PCs).

• Consequently (relatively) cheap.

• Usually Linux-based.

• High-availability storage (no single point of failure).

• Generic compute pool (cloned servers that can easily be replaced).

Page 9:

Cluster Concept

Page 10:


Compute Pool – Just a bunch of PCs

Page 11:

In the “good ol’ days” things were simple …


Page 12:

In the “good ol’ days” things were simple …


Page 13:

… these days much more is packed into the same space … but it is basically the same!

These are the building blocks of HPC, similar to Sciama.

Page 14:

Total ICG Compute Pool > 1000 Cores


Page 15:

Coke or Pepsi – Chalk & Cheese

• The only two remaining commodity CPU makers are Intel and AMD.

• The latest AMD “Bulldozer” architecture competes with Intel’s “Sandy Bridge” architecture.

• Both architectures are multi-core (Intel: 48 cores).

• Both architectures use the same memory / video cards / hard drives etc.

• CPU speed constraints come down to on-chip transmission delays and heat dissipation (22 nm).

Page 16:

Intel-AMD – Bangs per Buck

Page 17:

Graphical Processing Units (GPUs – actually GPGPUs, General Purpose GPUs)

• CPUs are still in charge.

• Special programming languages: CUDA and OpenCL.

• Three players: Intel, AMD, Nvidia.

• CPU – multiple cores; GPU – hundreds of cores.

Page 18:

Interconnects and Networks


Page 19:

Interconnects and Networks

Moving away from the processor towards the Internet, things get slower and slower due to increased latency and reduced bandwidth.
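A simple model captures both effects: transfer time = latency + size / bandwidth. A sketch with assumed figures (50 µs latency and 1.25 GB/s, i.e. a 10 GbE-class link; neither is a measured Sciama value):

```shell
# Transfer time for 1 MiB: t = latency + size / bandwidth.
# The 50 us latency and 1.25 GB/s bandwidth are illustrative assumptions.
t_usec=$(awk 'BEGIN { t = 50e-6 + 1048576 / 1.25e9; printf "%.1f", t * 1e6 }')
echo "1 MiB transfer: ~${t_usec} us"
```

For small messages the fixed latency term dominates; for large transfers the bandwidth term does, which is why both numbers matter.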

Page 20:

Processor Interconnects

For a processor running at 3.2 GHz, the QPI bus runs at 25 GBytes/second (the Kindle version of “War & Peace” is 2 GBytes).
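At that rate, the 2-GByte example above would move in a fraction of a second:

```shell
# Time to move 2 GBytes over a 25 GByte/s QPI link: size / bandwidth.
qpi_secs=$(awk 'BEGIN { printf "%.2f", 2 / 25 }')
echo "2 GB / 25 GB/s = ${qpi_secs} s"
```

About 0.08 seconds, ignoring protocol overheads.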

Page 21:

Peripheral Component Interconnect Express (PCIe)

The PCI bus is the interconnect to the outside world.

Page 22:

External Networks

Interconnects are parallel – Bytes/second. Networks are serial – bits/second (shown here in B/s for comparison – e.g. DAS is 6 Gb/s).
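The conversion between the two units is a factor of eight: divide a serial link's bits/second by 8 to compare it with interconnects quoted in bytes/second.

```shell
# The 6 Gb/s DAS figure above expressed in GB/s: divide by 8.
# Real serial links lose a little more to line coding (e.g. 8b/10b),
# which is ignored in this sketch.
das_gbs=$(awk 'BEGIN { printf "%.2f", 6 / 8 }')
echo "6 Gb/s = ${das_gbs} GB/s"
```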

Page 23:

Sciama Network

Page 24:

Sciama Traffic

Page 25:

Connecting to Sciama

Page 26:


Shared Storage

Page 27:

Raw Disks are Dumb

Remember: PATA, IDE (Advanced Technology Attachment)

Page 28:

Intelligence is in the File System

Page 29:

HPCs require many disks

Page 30:

Use High Capacity Arrays

Page 31:

HPCs require large chunks of storage

Page 32:

Many RAID options: 1–6 / 10 / 50
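The usable capacity differs between those levels: RAID 5 gives up one disk's worth of space to parity, RAID 6 two, and RAID 10 half the disks to mirroring. A sketch for a hypothetical 8 × 2 TB array (the disk count and size are examples only):

```shell
# Usable capacity of an example 8 x 2 TB array under different RAID levels.
disks=8; tb_per_disk=2
raid5_tb=$(( (disks - 1) * tb_per_disk ))  # one disk's worth of parity
raid6_tb=$(( (disks - 2) * tb_per_disk ))  # two disks' worth of parity
raid10_tb=$(( disks / 2 * tb_per_disk ))   # mirrored pairs: half the raw space
echo "RAID5: ${raid5_tb} TB, RAID6: ${raid6_tb} TB, RAID10: ${raid10_tb} TB"
```

RAID 6 and RAID 10 trade more capacity for the ability to survive additional disk failures.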

Page 33:

Directly Attached Storage (DAS)

Page 34:

Directly Attached Storage (DAS)

Of limited use, as it cannot be shared.

Page 35:

Network Attached Storage (NAS)

Page 36:

NAS or Network Appliance

Page 37:

Network bandwidth is often the bottleneck

Page 38:

NAS - Lustre File System

Lustre is an example of a distributed file system. There are many more.

Sometimes called a “Cluster” file system

Page 39:

NAS – Lustre

Often used with an Infiniband fabric.

Page 40:

Storage Area Networks

Page 41:

Storage Area Network using iSCSI

Page 42:

Fibre Channel High Availability SAN

Page 43:

Sciama NAS Storage

Page 44:

Sciama Storage Hardware

Storage is expensive: ~250 GBP / TByte.

No backup.

Page 45:

Highly Available Hardware

Page 46:

Two paths to most components

Page 47:

Additional Sciama Storage: /mnt/astro(1-5)

10 GbE

Page 48:

Login Layer and Middleware


Page 49:

Why Login Servers?

• Login servers provide the gateway to the cluster.

• Users can remotely log in to the servers using “ssh” or a Remote Desktop client.

• A desktop client gives a full working desktop in the environment (can run full-screen).
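For the ssh route, a host entry in the user's client-side configuration saves retyping the details on every login. The alias, hostname and username below are placeholders for illustration, not Sciama's actual details:

```
# ~/.ssh/config (on the user's own machine); all values are examples
Host sciama
    HostName login.example.ac.uk
    User your_username
```

With an entry like this, `ssh sciama` opens a session on a login server.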

Page 50:

Some users are at remote locations ..


Page 51:

Use of Remote Login Client


Page 52:

The executable and jobscript are set up in the login layer.

Page 53:

Jobs are submitted to the queues:

> qsub -q Queue1 run_script.sh
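The `run_script.sh` being submitted is an ordinary shell script with scheduler directives in comments. A minimal Torque/PBS sketch (the queue name, job name and resource limits here are illustrative, not Sciama's real settings):

```shell
#!/bin/bash
# Minimal Torque/PBS job script sketch. The #PBS lines are directives
# read at submission time; all values below are examples only.
#PBS -q Queue1             # queue to submit to
#PBS -N demo_job           # job name
#PBS -l nodes=1:ppn=4      # one node, four cores
#PBS -l walltime=01:00:00  # wall-clock limit

cd "${PBS_O_WORKDIR:-.}"   # Torque sets this to the submission directory
msg="Job running on $(hostname)"
echo "$msg"
```

Submit it with `qsub run_script.sh`; `qstat` then shows the job queued or running.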

Page 54:

The scheduler prioritises jobs and deems a job ready to run.

The job is passed to the resource manager.

Page 55:

Resource manager (Torque) checks for available resources.

Page 56:

The job either runs in the compute pool or is returned to the queue.