CSC630/CSC730: Parallel & Distributed Computing
PDC-2: Introduction to PDC
Dr. Joe Zhang
Contents
• Basic concept of parallel computing
• Need for parallel computing
• Classification of parallel computer systems
  – Hardware architecture classification
  – Memory-based classification
• Performance issues
• High-performance computers
• HPC resources
Computers vs. Human Beings

“Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. The marriage of the two is a force beyond calculation.”
– Leo Cherne (1977)
http://en.wikipedia.org/wiki/Leo_Cherne
High-Performance Computing

High-performance computing (HPC) is the use of supercomputers and parallel processing techniques for solving complex computational problems. HPC technology focuses on developing parallel processing algorithms and systems by incorporating both administrative and parallel computational techniques.
Basic Concept of Parallel Computing
• Parallel computing
  – Split the problem into many parts, each performed by a separate processor in parallel (see the sketch after this list).
  – The goal is to achieve high performance.
  – It requires an understanding of
    • parallel architectures
    • parallel algorithms
    • parallel languages
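A minimal sketch of this splitting in C with POSIX threads (illustrative only; no code appears in the slides, and the array size, thread count, and data are arbitrary assumptions):

/*
 * Split one problem -- summing an array -- into parts, each handled
 * by a separate thread in parallel. Compile with: gcc -pthread sum.c
 */
#include <stdio.h>
#include <pthread.h>

#define N 1000000
#define NTHREADS 4

static double a[N];
static double partial[NTHREADS];   /* one partial sum per thread */

/* Each thread sums its own contiguous slice of the array. */
static void *sum_part(void *arg) {
    long t = (long)arg;
    long lo = t * (N / NTHREADS);
    long hi = (t == NTHREADS - 1) ? N : lo + N / NTHREADS;
    double s = 0.0;
    for (long i = lo; i < hi; i++)
        s += a[i];
    partial[t] = s;
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];

    for (long i = 0; i < N; i++)
        a[i] = 1.0;

    /* Split the problem: one thread per part. */
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, sum_part, (void *)t);

    /* Combine the partial results. */
    double total = 0.0;
    for (long t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += partial[t];
    }
    printf("total = %f\n", total);
    return 0;
}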
Need for Parallel Computing
• Need for parallelism
  – Solutions for problems with deadlines
  – Grand-challenge problems
    • Sequential solutions may take months or years.
  – Numerical modeling and simulation of scientific and engineering problems
    • Modeling of DNA structures
    • Weather forecasting, e.g., Hurricane Katrina
  – Many computational problems in computer science
    • Parallel (distributed) information retrieval, e.g., Google
    • Distributed databases
    • Parallel image processing
    • Large-scale data analysis
Example: Huge Black Hole Collision Simulation
[Figure: black hole collision simulation by the EU Astrophysics Network; 10 EU institutions, 3 years]
Example: Universe Simulation
Run on the constellation of supercomputer processors at NASA Ames.
Major Applications of the Next-Generation Supercomputer
Targeted as grand challenges.
Basic Concept for Simulations in Life Sciences
[Figure: multiscale simulation in the life sciences, from micro (genome, genes, proteins, bio-MD) through meso (cells, tissue structure, multi-physics chemical processes) to macro (organs, vascular system and blood circulation, organism); applications include gene therapy, DDS, HIFU, micro-machines, and catheters. Sources: http://ridge.icu.ac.jp, http://info.med.yale.edu/, RIKEN]
Need for More Computational Power
• System designers must concern themselves with:
  – The design and implementation of an interconnection network for the processors and memory modules.
  – The design and implementation of system software for the hardware.
• System users must concern themselves with:
  – The algorithms and data structures for solving their problem.
  – Partitioning the algorithms and data structures into subproblems.
  – Identifying the communications needed among the subproblems.
  – Mapping the subproblems onto processors and memory modules.
HPC Classification
• Parallel computing: single systems with many processors working on the same problem
  – Supercomputer (shared-memory system)
• Distributed computing: many systems loosely coupled by a scheduler to work on related problems
  – Cluster (distributed-memory system)
• Grid computing: many systems tightly coupled by software and networks to work together on single or related problems
  – Globally distributed heterogeneous systems (virtual environment)
Parallel Computing
• Split the problem into many parts, each performed by a separate processor in parallel.
• It requires an understanding of
  – parallel architectures
  – parallel algorithms
  – parallel languages
• Achieving high performance requires
  – fast CPUs
  – low communication cost
  – task scheduling and load balancing
• Performance metrics (formalized below)
  – parallel run time, speedup, efficiency, scalability
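A quick formalization of speedup and efficiency (standard definitions, not spelled out on the slide): with T_1 the run time on one processor and T_p the run time on p processors,

\[
  S(p) = \frac{T_1}{T_p}, \qquad
  E(p) = \frac{S(p)}{p} = \frac{T_1}{p\,T_p}.
\]

Ideal (linear) speedup means S(p) = p, i.e., E(p) = 1; a program is scalable if efficiency stays high as p and the problem size grow.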
Types of Parallel Computer Systems
• Memory-based classification:
  – Shared-memory multiprocessor system
    • Multiple processors connected to a shared memory with a single address space.
    • The processors are connected to the memory through an interconnection network.
    • Supercomputers such as the Cray and SGI Origin, …
  – Distributed-memory system, or message-passing multicomputer
    • e.g., a Linux cluster
Types of Parallel Computer Systems
• Hardware architecture classification:
  – Single computer with multiple internal processors (supercomputer)
  – Multiple interconnected computers (cluster system)
  – Multiple Internet-connected computers (distributed systems)
  – Multiple Internet-connected, heterogeneous, globally distributed systems in a “virtual” organization (grid computing system)
Distributed Memory Systems
• Distributed memory system
  – The system consists of multiple independent computers connected through an interconnection network.
  – Each computer consists of a processor and local memory that is not accessible by the other processors, since each computer has its own address space.
  – The interconnection network is used to pass messages among the processors.
  – Messages include commands and data that other processors may require for their computations.
[Figure: computers, each pairing a processor with its own memory, linked by an interconnection network]
Distributed Memory Systems
• Advantages
  – Scales to large systems
  – Easy to replace
  – Easy to maintain
  – Much cheaper
• Examples
  – Self-contained computers that can operate independently, e.g., a Linux PC cluster, or a distributed system connected through the Internet.
Parallel Programming
• Parallelism
  – No matter what computer system we put together, we need to split the problem into many parts, each performed by a separate processor in parallel.
• Objective of parallel computing
  – To significantly increase performance.
• Writing programs for this form of computation is known as parallel programming.
Shared Memory Programming
• Programming models are easier since message passing is not necessary. Techniques include (see the OpenMP sketch below):
  – auto-parallelization via compiler options
  – loop-level parallelism via compiler directives
  – OpenMP
  – pthreads
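A minimal OpenMP sketch in C (illustrative only, not course code; the array size and contents are arbitrary assumptions). A single compiler directive splits the loop iterations among threads that all share the same array; no message passing is involved:

/*
 * Loop-level parallelism with an OpenMP directive.
 * Compile with: gcc -fopenmp vecsum.c
 */
#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static double a[N];   /* shared by all threads */
    double sum = 0.0;

    /* The directive divides the iterations among threads; the
       reduction clause combines the per-thread partial sums safely. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;
        sum += a[i];
    }

    printf("sum = %f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}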
Distributed Memory Programming
• Message Passing Interface (MPI)
  – A specification of message-passing libraries for developers and users.
  – MPI (1992)
  – MPI-2
  – MPICH / MPICH2 (2006)
• MPI subroutines and functions can be called from Fortran and from C/C++ (see the sketch below).
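A minimal MPI sketch in C (illustrative only, not course code; the message value and ranks are arbitrary assumptions). Rank 0 passes a message through the interconnection network to rank 1; the processes share no address space:

/*
 * Point-to-point message passing with MPI.
 * Compile with: mpicc hello.c    Run with: mpirun -np 2 ./a.out
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* number of processes */

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("rank 0 of %d sent %d\n", size, value);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}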
Performance Issues
• Ideal speedup
  – With n computers, the computational job could be completed in 1/n of the time used by a single computer.
• Actual speedup
  – In practice this expectation is not met, because the parts must interact, requiring extra data transfers and synchronization of computations (quantified below).
  – Still a substantial improvement, depending upon
    • the particular problem
    • the way the computational job is parallelized
• Memory
  – In addition, multiple computers enable problems that require larger amounts of main memory to be tackled.
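The classical way to quantify the gap between ideal and actual speedup is Amdahl's law (standard, though not named on the slide): if a fraction f of the job is inherently serial, the speedup on n computers is

\[
  S(n) = \frac{1}{f + (1 - f)/n} \le \frac{1}{f}.
\]

For example, with f = 0.05 the speedup can never exceed 20, no matter how many computers are used.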
Computing and Communication
[Figure: timeline of computing and communication, 1960–2010. Computing: mainframes, minicomputers, Crays, workstations, PCs, MPPs, WS clusters, PC clusters, PDAs, grids, cloud. Communication: Sputnik, XEROX PARC worm, ARPANET, Ethernet, TCP/IP, IETF, the Internet era, the WWW era (Mosaic, HTML, W3C, XML), P2P, web services, eBusiness, eScience, computing utility, SocialNet]
A Growth Factor of a Billion in Performance in a Career
[Figure: peak performance, 1950–2010, from 1 KFlop/s to 1 PFlop/s across the scalar, superscalar, vector, and parallel eras; machines include the EDSAC 1, UNIVAC 1, IBM 7090, CDC 6600, IBM 360/195, CDC 7600, Cray 1, Cray X-MP, Cray 2, TMC CM-2, TMC CM-5, Cray T3D, ASCI Red, ASCI White Pacific, and IBM BG/L. 2× transistors/chip every 1.5 years.]

1941: 1 Flop/s (floating-point operations per second)
1945: 100
1949: 1,000 (1 KiloFlop/s, 10^3)
1951: 10,000
1961: 100,000
1964: 1,000,000 (1 MegaFlop/s, 10^6)
1968: 10,000,000
1975: 100,000,000
1987: 1,000,000,000 (1 GigaFlop/s, 10^9)
1992: 10,000,000,000
1993: 100,000,000,000
1997: 1,000,000,000,000 (1 TeraFlop/s, 10^12)
2000: 10,000,000,000,000
2005: 131,000,000,000,000 (131 TFlop/s)
HPC Performance Development
http://www.top500.org/statistics/perfdevel/
http://www.top500.org/
Top 500 Supercomputers (2012)
http://www.top500.org/
Top 500 Supercomputers (11/2014)
http://www.top500.org/
Tianhe-2 Supercomputer
• Retained its position as the world’s No. 1 system with a performance of 33.86 petaflop/s (quadrillions of calculations per second) on the Linpack benchmark.
• System
  – Tianhe-2 (MilkyWay-2): TH-IVB-FEP cluster, Intel Xeon E5-2692 12C 2.200 GHz, TH Express-2 interconnect, Intel Xeon Phi 31S1P coprocessors
  – Cores: 3,120,000
Grid Computing
[Figure: grid architecture — applications submit work through grid resource brokers, which query a grid information service and dispatch jobs to distributed resources R1…RN]
Grid computing: “A form of distributed and parallel computing, whereby a ‘super and virtual computer’ is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks.” – wikipedia.org
Grid Infrastructure
Many Grid Projects & Initiatives
• Australia
  – Nimrod-G
  – GridSim
  – Virtual Lab
  – Gridbus
  – DISCWorld
  – …new ones coming up
• Europe
  – UNICORE
  – MOL
  – UK eScience
  – Poland MC Broker
  – EU Data Grid
  – EuroGrid
  – MetaMPI
  – Dutch DAS
  – XW, JaWS
• Japan
  – Ninf
  – DataFarm
• Korea
  – N*Grid
• USA
  – Globus
  – Legion
  – OGSA
  – Javelin
  – AppLeS
  – NASA IPG
  – Condor-G
  – Jxta
  – NetSolve
  – AccessGrid
  – TeraGrid (NSF)
• Cycle stealing & .com initiatives
  – Distributed.net
  – SETI@Home, …
  – Entropia, UD, Parabon, …
• Public forums
  – Global Grid Forum
  – P2P Working Group
  – IEEE TFCC
  – Grid & CCGrid conferences
http://www.gridcomputing.com
Many Testbeds
• GUSTO
• Legion Testbed
• NASA IPG
SURAgrid
• SURAgrid is a consortium of organizations collaborating and combining resources to help bring grid technology to the level of seamless, shared infrastructure.
• Capabilities to be cultivated include locally contributed resources, project-specific tools and environments, highly specialized or HPC access, and gateways to national and international cyberinfrastructure.
http://www.suragrid.org/sura_grid.html
Cloud Computing
Cloud computing is the delivery of computing as a service rather than a product, whereby shared resources, software, and information are provided to computers and other devices as a utility (like the electricity grid) over a network (typically the Internet). – wikipedia.org
In March 2007, Dell applied to trademark the term “cloud computing” (U.S. Trademark 77,139,082) in the United States.
Relation with Other Paradigms
HPC in Mississippi
HPC at MCSR
• Available high-performance computers (http://www.mcsr.olemiss.edu/):
  – Sweetgum: a 128-CPU SGI Origin 2800 supercomputer at MCSR
  – Redwood
  – Mimosa
School of Computing Clusters
• Albacore: the primary HPC cluster in the School of Computing
  – 22 nodes and 224 cores
  – 216 GB of RAM
  – 8 TB of storage
Thanks to Dr. Glover George for technical support!
Research Opportunities
• DoD
  – The Department of Defense (DoD) High Performance Computing Modernization Program (HPCMP) just completed its fiscal year 2014 investment in supercomputing capability supporting the DoD science, engineering, test, and acquisition communities.
  – http://www.hpc.mil/index.php/2013-08-29-16-06-21/newsroom
• NSF
Summary
• Parallel computing is a viable way to achieve high performance. How?
• Need for parallel computing
• Two types of parallel computer systems
  – Shared-memory multiprocessor systems
  – Distributed-memory systems
• Performance depends on the particular problem, system, algorithm, and implementation.
• History of HPC
CSC630/CSC730: Parallel & Distributed Computing
PDC-2: Introduction
Dr. Joe Zhang
Questions?