CSC630/CSC730: Parallel & Distributed Computing
PDC-2: Introduction to PDC
Dr. Joe Zhang
Contents
• Basic concept of parallel computing
• Need for parallel computing
• Classification of parallel computer systems
  – Hardware architecture classification
  – Memory-based classification
• Performance issues
• High-performance computers
• HPC resources
Computers vs. Human Beings

“Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. The marriage of the two is a force beyond calculation.”
– Leo Cherne (1977)
http://en.wikipedia.org/wiki/Leo_Cherne
High-Performance Computing

High-performance computing (HPC) is the use of supercomputers and parallel processing techniques for solving complex computational problems. HPC technology focuses on developing parallel processing algorithms and systems by incorporating both administrative and parallel computational techniques.
Basic Concept of Parallel Computing
• Parallel computing
  – Split the problem into many parts, each performed by a separate processor in parallel (see the sketch after this list).
  – The goal is to achieve high performance.
  – It requires an understanding of
    • parallel architectures
    • parallel algorithms
    • parallel languages
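A minimal sketch of this splitting in C with POSIX threads (illustrative only; no code appears in the slides, and the array size, thread count, and data are arbitrary assumptions):

/*
 * Split one problem -- summing an array -- into parts, each handled
 * by a separate thread in parallel. Compile with: gcc -pthread sum.c
 */
#include <stdio.h>
#include <pthread.h>

#define N 1000000
#define NTHREADS 4

static double a[N];
static double partial[NTHREADS];   /* one partial sum per thread */

/* Each thread sums its own contiguous slice of the array. */
static void *sum_part(void *arg) {
    long t = (long)arg;
    long lo = t * (N / NTHREADS);
    long hi = (t == NTHREADS - 1) ? N : lo + N / NTHREADS;
    double s = 0.0;
    for (long i = lo; i < hi; i++)
        s += a[i];
    partial[t] = s;
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];

    for (long i = 0; i < N; i++)
        a[i] = 1.0;

    /* Split the problem: one thread per part. */
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, sum_part, (void *)t);

    /* Combine the partial results. */
    double total = 0.0;
    for (long t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += partial[t];
    }
    printf("total = %f\n", total);
    return 0;
}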
Need for Parallel Computing
• Need for parallelism
  – Solutions for problems with deadlines
  – Grand-challenge problems
    • Sequential solutions may take months or years.
  – Numerical modeling and simulation of scientific and engineering problems
    • Modeling of DNA structures
    • Weather forecasting, e.g., Hurricane Katrina
  – Many computational problems in computer science
    • Parallel (distributed) information retrieval, e.g., Google
    • Distributed databases
    • Parallel image processing
    • Large-scale data analysis
Example: Huge Black Hole Collision Simulation
[Figure: black hole collision simulation by the EU Astrophysics Network; 10 EU institutions, 3 years]
Example: Universe Simulation
Run on the constellation of supercomputer processors at NASA Ames.
Major Applications of the Next-Generation Supercomputer
Targeted as grand challenges.
Basic Concept for Simulations in Life Sciences
[Figure: multiscale simulation in the life sciences, from micro (genome, genes, proteins, bio-MD) through meso (cells, tissue structure, multi-physics chemical processes) to macro (organs, vascular system and blood circulation, organism); applications include gene therapy, DDS, HIFU, micro-machines, and catheters. Sources: http://ridge.icu.ac.jp, http://info.med.yale.edu/, RIKEN]
Need for More Computational Power
• System designers must concern themselves with:
  – The design and implementation of an interconnection network for the processors and memory modules.
  – The design and implementation of system software for the hardware.
• System users must concern themselves with:
  – The algorithms and data structures for solving their problem.
  – Partitioning the algorithms and data structures into subproblems.
  – Identifying the communications needed among the subproblems.
  – Mapping the subproblems onto processors and memory modules.
HPC Classification
• Parallel computing: single systems with many processors working on the same problem
  – Supercomputer (shared-memory system)
• Distributed computing: many systems loosely coupled by a scheduler to work on related problems
  – Cluster (distributed-memory system)
• Grid computing: many systems tightly coupled by software and networks to work together on single or related problems
  – Globally distributed heterogeneous systems (virtual environment)
Parallel Computing
• Split the problem into many parts, each performed by a separate processor in parallel.
• It requires an understanding of
  – parallel architectures
  – parallel algorithms
  – parallel languages
• Achieving high performance requires
  – fast CPUs
  – low communication cost
  – task scheduling and load balancing
• Performance metrics (formalized below)
  – parallel run time, speedup, efficiency, scalability
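A quick formalization of speedup and efficiency (standard definitions, not spelled out on the slide): with T_1 the run time on one processor and T_p the run time on p processors,

\[
  S(p) = \frac{T_1}{T_p}, \qquad
  E(p) = \frac{S(p)}{p} = \frac{T_1}{p\,T_p}.
\]

Ideal (linear) speedup means S(p) = p, i.e., E(p) = 1; a program is scalable if efficiency stays high as p and the problem size grow.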
Types of Parallel Computer Systems
• Memory-based classification:
  – Shared-memory multiprocessor system
    • Multiple processors connected to a shared memory with a single address space.
    • The processors are connected to the memory through an interconnection network.
    • Supercomputers such as the Cray and SGI Origin, …
  – Distributed-memory system, or message-passing multicomputer
    • e.g., a Linux cluster
Types of Parallel Computer Systems
• Hardware architecture classification:
  – Single computer with multiple internal processors (supercomputer)
  – Multiple interconnected computers (cluster system)
  – Multiple Internet-connected computers (distributed systems)
  – Multiple Internet-connected, heterogeneous, globally distributed systems in a “virtual” organization (grid computing system)
Distributed Memory Systems
• Distributed memory system
  – The system consists of multiple independent computers connected through an interconnection network.
  – Each computer consists of a processor and local memory that is not accessible by the other processors, since each computer has its own address space.
  – The interconnection network is used to pass messages among the processors.
  – Messages include commands and data that other processors may require for their computations.
[Figure: computers, each pairing a processor with its own memory, linked by an interconnection network]
Distributed Memory Systems
• Advantages
  – Scales to large systems
  – Easy to replace
  – Easy to maintain
  – Much cheaper
• Examples
  – Self-contained computers that can operate independently, e.g., a Linux PC cluster, or a distributed system connected through the Internet.
Parallel Programming
• Parallelism
  – No matter what computer system we put together, we need to split the problem into many parts, each performed by a separate processor in parallel.
• Objective of parallel computing
  – To significantly increase performance.
• Writing programs for this form of computation is known as parallel programming.
Shared Memory Programming
• Programming models are easier since message passing is not necessary. Techniques include (see the OpenMP sketch below):
  – auto-parallelization via compiler options
  – loop-level parallelism via compiler directives
  – OpenMP
  – pthreads
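A minimal OpenMP sketch in C (illustrative only, not course code; the array size and contents are arbitrary assumptions). A single compiler directive splits the loop iterations among threads that all share the same array; no message passing is involved:

/*
 * Loop-level parallelism with an OpenMP directive.
 * Compile with: gcc -fopenmp vecsum.c
 */
#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static double a[N];   /* shared by all threads */
    double sum = 0.0;

    /* The directive divides the iterations among threads; the
       reduction clause combines the per-thread partial sums safely. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;
        sum += a[i];
    }

    printf("sum = %f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}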
Distributed Memory Programming
• Message Passing Interface (MPI)
  – A specification of message-passing libraries for developers and users.
  – MPI (1992)
  – MPI-2
  – MPICH / MPICH2 (2006)
• MPI subroutines and functions can be called from Fortran and from C/C++ (see the sketch below).
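A minimal MPI sketch in C (illustrative only, not course code; the message value and ranks are arbitrary assumptions). Rank 0 passes a message through the interconnection network to rank 1; the processes share no address space:

/*
 * Point-to-point message passing with MPI.
 * Compile with: mpicc hello.c    Run with: mpirun -np 2 ./a.out
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* number of processes */

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("rank 0 of %d sent %d\n", size, value);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}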
Performance Issues
• Ideal speedup
  – With n computers, the computational job could be completed in 1/n of the time used by a single computer.
• Actual speedup
  – In practice this expectation is not met, because the parts must interact, requiring extra data transfers and synchronization of computations (quantified below).
  – Still a substantial improvement, depending upon
    • the particular problem
    • the way the computational job is parallelized
• Memory
  – In addition, multiple computers enable problems that require larger amounts of main memory to be tackled.
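The classical way to quantify the gap between ideal and actual speedup is Amdahl's law (standard, though not named on the slide): if a fraction f of the job is inherently serial, the speedup on n computers is

\[
  S(n) = \frac{1}{f + (1 - f)/n} \le \frac{1}{f}.
\]

For example, with f = 0.05 the speedup can never exceed 20, no matter how many computers are used.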
Computing and Communication
[Figure: timeline of computing and communication, 1960–2010. Computing: mainframes, minicomputers, Crays, workstations, PCs, MPPs, WS clusters, PC clusters, PDAs, grids, cloud. Communication: Sputnik, XEROX PARC worm, ARPANET, Ethernet, TCP/IP, IETF, the Internet era, the WWW era (Mosaic, HTML, W3C, XML), P2P, web services, eBusiness, eScience, computing utility, SocialNet]
A Growth Factor of a Billion in Performance in a Career
[Figure: peak performance, 1950–2010, from 1 KFlop/s to 1 PFlop/s across the scalar, superscalar, vector, and parallel eras; machines include the EDSAC 1, UNIVAC 1, IBM 7090, CDC 6600, IBM 360/195, CDC 7600, Cray 1, Cray X-MP, Cray 2, TMC CM-2, TMC CM-5, Cray T3D, ASCI Red, ASCI White Pacific, and IBM BG/L. 2× transistors/chip every 1.5 years.]

1941: 1 Flop/s (floating-point operations per second)
1945: 100
1949: 1,000 (1 KiloFlop/s, 10^3)
1951: 10,000
1961: 100,000
1964: 1,000,000 (1 MegaFlop/s, 10^6)
1968: 10,000,000
1975: 100,000,000
1987: 1,000,000,000 (1 GigaFlop/s, 10^9)
1992: 10,000,000,000
1993: 100,000,000,000
1997: 1,000,000,000,000 (1 TeraFlop/s, 10^12)
2000: 10,000,000,000,000
2005: 131,000,000,000,000 (131 TFlop/s)
HPC Performance Development
http://www.top500.org/statistics/perfdevel/
http://www.top500.org/
Top 500 Supercomputers (2012)
http://www.top500.org/
Top 500 Supercomputers (11/2014)
http://www.top500.org/
Tianhe-2 Supercomputer
• Retained its position as the world’s No. 1 system with a performance of 33.86 petaflop/s (quadrillions of calculations per second) on the Linpack benchmark.
• System
  – Tianhe-2 (MilkyWay-2): TH-IVB-FEP cluster, Intel Xeon E5-2692 12C 2.200 GHz, TH Express-2 interconnect, Intel Xeon Phi 31S1P coprocessors
  – Cores: 3,120,000
Grid Computing
[Figure: grid architecture — applications submit work through grid resource brokers, which query a grid information service and dispatch jobs to distributed resources R1…RN]
Grid computing: “A form of distributed and parallel computing, whereby a ‘super and virtual computer’ is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks.” – wikipedia.org
Grid Infrastructure
Many Grid Projects & Initiatives
• Australia
  – Nimrod-G
  – GridSim
  – Virtual Lab
  – Gridbus
  – DISCWorld
  – …new ones coming up
• Europe
  – UNICORE
  – MOL
  – UK eScience
  – Poland MC Broker
  – EU Data Grid
  – EuroGrid
  – MetaMPI
  – Dutch DAS
  – XW, JaWS
• Japan
  – Ninf
  – DataFarm
• Korea
  – N*Grid
• USA
  – Globus
  – Legion
  – OGSA
  – Javelin
  – AppLeS
  – NASA IPG
  – Condor-G
  – Jxta
  – NetSolve
  – AccessGrid
  – TeraGrid (NSF)
• Cycle stealing & .com initiatives
  – Distributed.net
  – SETI@Home, …
  – Entropia, UD, Parabon, …
• Public forums
  – Global Grid Forum
  – P2P Working Group
  – IEEE TFCC
  – Grid & CCGrid conferences
http://www.gridcomputing.com
Many Testbeds
• GUSTO
• Legion Testbed
• NASA IPG
SURAgrid
• SURAgrid is a consortium of organizations collaborating and combining resources to help bring grid technology to the level of seamless, shared infrastructure.
• Capabilities to be cultivated include locally contributed resources, project-specific tools and environments, highly specialized or HPC access, and gateways to national and international cyberinfrastructure.
http://www.suragrid.org/sura_grid.html
Cloud Computing
Cloud computing is the delivery of computing as a service rather than a product, whereby shared resources, software, and information are provided to computers and other devices as a utility (like the electricity grid) over a network (typically the Internet). – wikipedia.org
In March 2007, Dell applied to trademark the term “cloud computing” (U.S. Trademark 77,139,082) in the United States.
Relation with Other Paradigms
HPC in Mississippi
HPC at MCSR
• Available high-performance computers (http://www.mcsr.olemiss.edu/):
  – Sweetgum: a 128-CPU SGI Origin 2800 supercomputer at MCSR
  – Redwood
  – Mimosa
School of Computing Clusters
• Albacore: the primary HPC cluster in the School of Computing
  – 22 nodes and 224 cores
  – 216 GB of RAM
  – 8 TB of storage
Thanks to Dr. Glover George for technical support!
Research Opportunities
• DoD
  – The Department of Defense (DoD) High Performance Computing Modernization Program (HPCMP) just completed its fiscal year 2014 investment in supercomputing capability supporting the DoD science, engineering, test, and acquisition communities.
  – http://www.hpc.mil/index.php/2013-08-29-16-06-21/newsroom
• NSF
Summary
• Parallel computing is a viable way to achieve high performance. How?
• Need for parallel computing
• Two types of parallel computer systems
  – Shared-memory multiprocessor systems
  – Distributed-memory systems
• Performance depends on the particular problem, system, algorithm, and implementation.
• History of HPC
CSC630/CSC730: Parallel & Distributed Computing
PDC-2: Introduction
Dr. Joe Zhang
Questions?