Cluster Computing Andreas Engelbredt Dalsgaard May 25,...

27
Introduction to Fyrkat An introduction to Fyrkat Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011 Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Transcript of Cluster Computing Andreas Engelbredt Dalsgaard May 25,...

Page 1: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

An introduction to FyrkatCluster Computing

Andreas Engelbredt Dalsgaard

May 25, 2011

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 2: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

How to get an account

https://fyrkat.grid.aau.dk/useraccount

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 3: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

How to get help

https://fyrkat.grid.aau.dk/wiki

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 4: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

What is a Cluster Anyway

It is NOT something that does any of the following:

Use magicAutomatically make your program run fasterProvide a single virtual OS image of all resourcesAlways makes your software faster

Then what is it?

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 5: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

Terminology

Cluster

A set of closely connected computersUsually homogeneousConnectivity: GB Ethernet, InfiniBand/Myranet/etc.They usually run some form of *nixOften has InfiniBand/Myranet/etc.

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 6: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

Terminology

Grid ComputingVirtual Supercomputer

Composed of several clusters

Distributed within organisation or globallyHigh-energy Physics E.g. CERN

Cloud Computing

Software as a serviceOn demand resourcesCloud storageCommodity hardware componentsUsually virtual access to resources

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 7: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

So what is a cluster used for and by whom

Purpose

To solve problems faster than on a single machineTo solve problems that cannot be solved on a single machinePerformance is everything

Users want the last 10%Think about how the cache is usedThink about memory organisationConsider communication overhead

Users

Scientific researchersEngineersAcademic institutionsGovernment agencies E.g. military

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 8: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

What does it look like

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 9: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

Cluster Overview

Frontend nodeStorage Admin node

LAN

Compute nodes

...

User

Internet

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 10: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

Cluster Overview(2)

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 11: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

How are clusters used

A LRMS: Local Resource Management System

Also called batch systems

Tasks are split into jobsJobs are executed by the batch systemOrder of execution depend on schedulerA job gets a set of nodes/CPUs (exclusively)

How are most LRMS used

Make a job descriptionSubmit it to the LRMSThe LRMS will decide when to run the job

Torque/PBS, SLURM, LoadLeveler, Condor

Some are open source, some are proprietary

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 12: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

What Runs Fyrkat

OpenSSH is used intensively

Ubuntu 10.04 LTS

Admin node

A LRMS serverOpenLdap server(user directory service)Application file server (/pack)

Compilers, Gnu and Intel c,c++ and fortranLibraries, E.g. openmpi

Dhcp, tftp, email, mailinglist, monitoring, monitoring andmonitoring

Frontend node

LRMS clientMount application filesystemFirewall

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 13: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

What Runs Fyrkat(2)

Compute nodes

LRMS clientMount application filesystem/scratch partitionEverything else is read only

Storage

User dataAutomounted on all nodesBattery backed RAM, SSD, disksNo guaranties, use backup

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 14: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

A Little About Hardware

For common jobs

84 IBM blade computers in 6 bladecenters.(killing1-84) 672cores, 1,3 Tb RAM

With two Xeon E5345 quad core 2.33 GHz CPUs, 16 GBRAM, a 53 GB scratch partition, Gbit ethernet and InfiniBandinterconnect.

24 HP computers (lion1-24) 288 cores, 3,5 Tb RAM

With two Xeon X5650 six core 2.66 GHz CPUs, 145 Gb RAM,a 53GB scratch partition, Gbit ethernet and InfiniBandinterconnect.

Multi-core CPUs is the norm

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 15: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

A Little About Hardware(2)

Test jobs

5 SUN computers (sister1-5)

With a Xeon X3220 quad core 2.40GHz CPU, 8GB RAM, a 66GB scratch partition, Gbit ethernet interconnect

Interactive jobs

4 SUN computers (tiger1-4)

With two Xeon X5570 quad core 2.93GHz CPU withhyperthreading(HT) enabled, 68GB RAM, Gbit ethernetinterconnect.

GPU jobs

10 Colfax computers (cub1-10)

With two Xeon X5570 quad core 2.93GHz CPU with HTenabled, 48GB RAM, Gbit ethernet and InfiniBandinterconnect, 3xTesla C2070 or Tesla C1050(soon GTX 580)

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 16: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

Trends

TrendsGPU used for calculationsMassive parallelism10 gbit ethernet

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 17: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

What people use Fyrkat for

ScoutAnalysis of integrated circuitsContinues systemsE.g. Noise Analysis in integrated circuitsMobile communication

FyrkatContinues systemsSimulate an antennas behaviour

Antenna near field simulation using Finite-differencetime-domain method

Acoustic simulationsGrid computing

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 18: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

How people use Fyrkat

Parameter sweeps

Serial code run in parallel

Distributed jobs

Programs are made explicitly parallelThis is hard workOpenMP, Pthreads, MPIMPI is the hard part

Usually it is more of a rewrite than a port

Many MPI solutions for different interconnect types

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 19: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

Common Considerations

Task generationStatic task generation

Matrix-multiplicationParameter sweepsState-space exploration E.g. 15-puzzle

Dynamic task generation

Simulation using the Finite-difference time-domain methodRay tracingState-space exploration E.g. 15-puzzle

Task size

Uniform versus non-uniform

Size of Data Associated with Tasks

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 20: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

Fyrkat

ssh fyrkat.grid.aau.dk

Pretty easy to get access to - Talk to your supervisor and use:https://fyrkat.grid.aau.dk/useraccount

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 21: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

Scout

ssh scout.es.aau.dk

Hard to get access to - Talk to Torben Larsen or Josva Kleist

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 22: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

How to get help

https://fyrkat.grid.aau.dk/wiki

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 23: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

How to use job manager on Fyrkat

Software is installed in /pack

Start by adapting your PATH environment variable(see wiki)

Should add: /pack/slurm/bin/

See job queue: squeue

See status of workernodes: sinfo

See details about job: scontrol show job JobId=job ID

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 24: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

How to submit job on Fyrkat

% - my_job_script

#!/bin/sh

#SBATCH --time=5:00

#SBATCH -n 1

#SBATCH --job-name=hello_world

#SBATCH --error=slurm-%j.err

#SBATCH --mail-type=FAIL

#SBATCH [email protected]

echo Hello World

sbatch my job script

See job queue: squeue

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 25: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

How to submit MPI job on Fyrkat

export PATH=/pack/slurm/bin:/pack/openmpi-gnu-ib/bin:$PATH mpicc

mpi array.c -o mpi array

% - mpi_job_script

#!/bin/sh

#SBATCH --time=5:00

#SBATCH -n 12

#SBATCH --partition=lion

export LD_LIBRARY_PATH=/pack/openmpi-gnu-ib/lib:$LD_LIBRARY_PATH

mpirun mpi_array

sbatch mpi job script

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 26: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

Demonstration

Andreas Engelbredt Dalsgaard An introduction to Fyrkat

Page 27: Cluster Computing Andreas Engelbredt Dalsgaard May 25, 2011people.cs.aau.dk/~bt/PhDSuperComputing2011/cluster.pdf · Grid Computing Virtual Supercomputer Composed of several clusters

Introduction to Fyrkat

How to get an account

https://fyrkat.grid.aau.dk/useraccount

Andreas Engelbredt Dalsgaard An introduction to Fyrkat