june 6, 2002 D.H.J. Epema/PDS/TUD 1
Processor Co-Allocation in Multicluster Systems
DAS-2 Workshop, Amsterdam, June 6, 2002
Anca Bucur and Dick Epema
Parallel and Distributed Systems Group
Delft University of Technology
Introduction (1)
• In multicluster systems (like the DAS, in GRIDs), jobs may use co-allocation (i.e., span multiple clusters):
– to use available capacity
– to process geographically spread data
• Single-application performance issues:
– application restructuring
– wide-area runtime systems (e.g., optimize collective communication operations)
• Multiple-application performance issues:
– design/analyze scheduling policies
– minimize response time, maximize maximal utilization
Introduction (2): Example
• In April 2001, the Cactus Computational Toolkit was used for four-hour astrophysics simulations involving Einstein’s General Relativity equations
• Equipment:
– At NCSA: 480 CPUs of three SGI Origin2000 systems
– At SDSC: 1020 CPUs of Blue Horizon
– OC-12 622-Mbit/s network
Introduction (3): Problems
[Figure: processors over time on clusters 1, 2, and 3 for jobs 1, 2, and 3, with idle processors shown as a pattern. Annotations: "fits with [...] if flexible" and "fits with [...] if unordered".]
System Model
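• Multicluster system consisting of $C$ clusters of processors of equal speed; cluster $i$ has $N_i$ processors, $i = 1, 2, \ldots, C$
• Communication speed ratio $t_w / t_l$: the ratio of the wide-area message transfer time $t_w$ to the local message transfer time $t_l$
[Figure: clusters of sizes up to $N_2, \ldots, N_C$, with local transfer time $t_l$ inside a cluster and wide-area transfer time $t_w$ between clusters.]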
Job Components
• A job consists of job components that each go to a single cluster, one task per processor
• Distributions of job-component sizes:
– Uniform: U[a,b]
– Truncated and adapted geometric (favors small sizes and powers of 2): D(q) on [1,b]
[Figure: the components of a job mapped onto the clusters of the system.]
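Job Request Types (1)
• Ordered and unordered requests specify their job-component sizes: $(r_1, r_2, \ldots, r_C)$
[Figure: for an ordered request, components $r_1, r_2, \ldots, r_C$ are each mapped to a specific cluster; for an unordered request, the mapping of components to clusters is left open (marked "?").]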
Job Request Types (2)
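• Flexible and total requests only specify the total number of processors needed: $\sum_i r_i$
[Figure: a flexible request for $\sum_i r_i$ processors is spread over the clusters in a way that is left open (marked "?"); a total request for $\sum_i r_i$ processors goes to a single cluster of size $\sum_i N_i$.]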
Fitting a Job (1)
• It is clear when an ordered or a total request fits
• For an unordered request:
– order components according to decreasing sizes
– use First-Fit (FF) or Worst-Fit (WF)
[Figure: the components of a job placed with WF on the clusters of the system; processors are shown as in use or idle.]
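The placement of an unordered request can be sketched as follows. This is a minimal illustration, not the authors' simulator: the function name `fit_unordered` is hypothetical, and the sketch assumes that each component must go to a different cluster, which the slide does not state explicitly.

```python
from typing import List, Optional, Tuple


def fit_unordered(idle: List[int], components: List[int],
                  policy: str = "WF") -> Optional[List[Tuple[int, int]]]:
    """Try to place the components of an unordered request.

    idle[c] is the number of idle processors in cluster c.
    Returns a list of (component_size, cluster_index) pairs in decreasing
    order of size, or None if the request does not fit.
    ASSUMPTION: every component goes to a distinct cluster.
    """
    free = list(idle)          # work on a copy, the caller's state is untouched
    used = set()               # clusters that already hold a component
    placement = []
    for size in sorted(components, reverse=True):   # decreasing sizes
        candidates = [c for c in range(len(free))
                      if c not in used and free[c] >= size]
        if not candidates:
            return None
        if policy == "FF":     # First-Fit: first cluster with enough idle processors
            chosen = candidates[0]
        elif policy == "WF":   # Worst-Fit: cluster with the most idle processors
            chosen = max(candidates, key=lambda c: free[c])
        else:
            raise ValueError("policy must be 'FF' or 'WF'")
        free[chosen] -= size
        used.add(chosen)
        placement.append((size, chosen))
    return placement


print(fit_unordered([3, 8, 5], [2, 5], "FF"))
print(fit_unordered([3, 8, 5], [2, 5], "WF"))
```

With idle counts (3, 8, 5), both policies put the component of size 5 on the cluster with 8 idle processors; FF then puts the component of size 2 on the first cluster, while WF picks the cluster with 5 idle processors.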
Fitting a Job (2)
• For a flexible request:
– determine minimal number of clusters needed
– fill least-loaded clusters (CF) completely, or balance load (LB) (variation: LB-A)
[Figure: the same job placed with CF and with LB; processors are shown as in use or idle.]
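One possible reading of the two placement rules for flexible requests is sketched below. The function names are hypothetical, and the interpretation of LB (hand out processors one at a time to the chosen cluster with the most idle processors left) is an assumption; the slide gives no details, and LB-A is not modeled at all.

```python
from typing import Dict, List, Optional


def min_clusters_needed(idle: List[int], total: int) -> Optional[int]:
    """Smallest k such that the k least-loaded clusters can hold `total`."""
    covered = 0
    for k, free in enumerate(sorted(idle, reverse=True), start=1):
        covered += free
        if covered >= total:
            return k
    return None                      # the request does not fit at all


def place_flexible(idle: List[int], total: int,
                   policy: str = "CF") -> Optional[Dict[int, int]]:
    """Return {cluster_index: processors_allocated}, or None if no fit."""
    k = min_clusters_needed(idle, total)
    if k is None:
        return None
    # Least-loaded clusters first, i.e. most idle processors first.
    chosen = sorted(range(len(idle)), key=lambda c: -idle[c])[:k]
    alloc = {c: 0 for c in chosen}
    if policy == "CF":               # fill clusters completely, one after another
        remaining = total
        for c in chosen:
            alloc[c] = min(idle[c], remaining)
            remaining -= alloc[c]
    elif policy == "LB":             # ASSUMPTION: even out the idle counts left over
        free = {c: idle[c] for c in chosen}
        for _ in range(total):
            c = max(chosen, key=lambda x: free[x])
            free[c] -= 1
            alloc[c] += 1
    else:
        raise ValueError("policy must be 'CF' or 'LB'")
    return alloc


print(place_flexible([10, 4, 7], 12, "CF"))
print(place_flexible([10, 4, 7], 12, "LB"))
```

For idle counts (10, 4, 7) and a request for 12 processors, two clusters are needed. CF takes 10 processors from the first cluster and 2 from the third, while the LB reading used here takes 8 and 4.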
Scheduling Policies
• First Come First Served
• Fit Processors First Served: search queue for jobs that fit
[Figure: a job queue in front of the system.]
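The difference between the two policies can be illustrated with a deliberately simplified sketch. It assumes a single pool of processors and jobs that only state a total size; the function name `start_jobs` and this simplification are not from the slides.

```python
from typing import List, Tuple


def start_jobs(queue: List[int], idle: int,
               policy: str) -> Tuple[List[int], List[int], int]:
    """Start jobs from `queue` (sizes, head of the queue first).

    FCFS: stop at the first job that does not fit.
    FPFS: keep searching the queue for jobs that do fit.
    Returns (started, still_waiting, idle_left).
    """
    if policy not in ("FCFS", "FPFS"):
        raise ValueError("policy must be 'FCFS' or 'FPFS'")
    started, waiting = [], []
    blocked = False                      # set once FCFS hits a job that does not fit
    for size in queue:
        if not blocked and size <= idle:
            idle -= size
            started.append(size)
        else:
            waiting.append(size)
            if policy == "FCFS":
                blocked = True
    return started, waiting, idle


print(start_jobs([5, 20, 3, 4], 10, "FCFS"))
print(start_jobs([5, 20, 3, 4], 10, "FPFS"))
```

With 10 idle processors and queued jobs of sizes 5, 20, 3, 4, FCFS starts only the first job because the job of size 20 blocks the queue, whereas FPFS also starts the job of size 3.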
Interarrival/Service Times
• Poisson arrival process in simulations
• All tasks in a job have the same service time
• Service-time distributions used:
– Deterministic (mean 1)
– Exponential (mean 1)
– Hyperexponential (mean 1, coeff. of var. 3)
– Derived from the DAS
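A hyperexponential distribution with mean 1 and coefficient of variation 3 can be built in more than one way; the slides do not say which parametrization was used. The sketch below assumes a two-phase hyperexponential with balanced means, a common textbook choice, and all function names are hypothetical.

```python
import math
import random


def h2_balanced_means(mean: float = 1.0, cv: float = 3.0):
    """Two-phase hyperexponential with balanced means (ASSUMED parametrization).

    Returns two (probability, rate) branches such that each branch
    contributes half of the mean.
    """
    c2 = cv * cv
    p1 = 0.5 * (1.0 + math.sqrt((c2 - 1.0) / (c2 + 1.0)))
    p2 = 1.0 - p1
    return (p1, 2.0 * p1 / mean), (p2, 2.0 * p2 / mean)


def h2_moments(branches):
    """Analytic mean and coefficient of variation of the mixture."""
    mean = sum(p / rate for p, rate in branches)
    second = sum(2.0 * p / rate ** 2 for p, rate in branches)
    return mean, math.sqrt(second - mean * mean) / mean


def sample_service_time(branches, rng: random.Random) -> float:
    """Pick a branch with its probability, then draw an exponential time."""
    (p1, rate1), (_, rate2) = branches
    rate = rate1 if rng.random() < p1 else rate2
    return rng.expovariate(rate)


branches = h2_balanced_means(mean=1.0, cv=3.0)
print(h2_moments(branches))
print(sample_service_time(branches, random.Random(1)))
```

The analytic check confirms that this parametrization has the stated mean and coefficient of variation; the sampler is what a simulation would call once per job, since all tasks in a job share the same service time.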
Communication
• We model jobs without and with communication
• With communication:
– tasks alternate between compute and communication phases
– communication phase: all-to-all personalized communication
– time for a single local synchronous message send operation: 0.001
– communication speed ratios considered: 1-100
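To get a feel for what the communication speed ratio does to a co-allocated job, the sketch below estimates the time one task spends sending in a single all-to-all phase. It assumes that a task sends one message to every other task of the job, one after the other, and that a wide-area send costs the local send time multiplied by the ratio; the slides give only the local send time (0.001) and the range of ratios, so this cost model and the function name are illustrative assumptions.

```python
from typing import List


def send_time_per_task(components: List[int],
                       local_send: float = 0.001,
                       speed_ratio: float = 10.0) -> List[float]:
    """Sending time of one task in each component for one all-to-all phase.

    components[i] is the number of tasks of the job in cluster i.
    ASSUMPTION: sends are sequential, one message per peer task.
    """
    total_tasks = sum(components)
    wide_send = speed_ratio * local_send      # wide-area message transfer time
    times = []
    for size in components:
        local_peers = size - 1                # other tasks in the same cluster
        remote_peers = total_tasks - size     # tasks in all other clusters
        times.append(local_peers * local_send + remote_peers * wide_send)
    return times


print(send_time_per_task([4], speed_ratio=10.0))        # single cluster
print(send_time_per_task([2, 2], speed_ratio=10.0))     # two components
```

Under these assumptions, splitting four tasks over two clusters at ratio 10 raises the per-task sending time from 3 local sends to 1 local send plus 2 wide-area sends.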
Single-cluster DAS Statistics
[Figure: two histograms of the number of jobs, one by nodes requested and one by service time.]
• Nodes requested: mean 23.34, coeff. of var. 1.11
• Service time: mean 356.45 (62.66), coeff. of var. 5.37
Performance Evaluation
• Parameters we vary:
– job request structure
– job-component-size distribution
– service-time distribution
– number and sizes of clusters (base case: 4x32)
– placement of unordered and flexible jobs
– scheduling policy
– communication speed ratio
– co-allocation versus no co-allocation
– queueing structure (global/local)
• Performance metrics:
– mean response time (only simulation)
– maximal utilization (analysis and simulation)
Influence of Structure and Size
[Figure: plots of response time versus utilization for total, ordered, and unordered requests.]
distribution          mean     coeff. of var.
U[1,7]                4.000    0.500
D(0.9) on [1,8]       3.996    0.569
D(0.768) on [1,32]    3.996    0.829
U[1,14]               7.500    0.537
D(0.894) on [1,32]    7.476    0.884
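The mean and coefficient of variation of a job-component-size distribution can be computed directly from its probabilities, as in the sketch below. The uniform case follows from the definition. For D(q) the slides say only that the geometric distribution is truncated and adapted to favor powers of 2; the sketch assumes weights proportional to q^s, multiplied by a factor `boost` for the sizes 2, 4, 8, and so on. The factor 3 is a guess that happens to give a mean close to the 3.996 listed for D(0.9) on [1,8]; it is not stated anywhere in the slides.

```python
import math
from typing import Dict, Tuple


def moments(pmf: Dict[int, float]) -> Tuple[float, float]:
    """Return (mean, coefficient of variation) of a size distribution."""
    mean = sum(s * p for s, p in pmf.items())
    second = sum(s * s * p for s, p in pmf.items())
    return mean, math.sqrt(second - mean * mean) / mean


def uniform_pmf(a: int, b: int) -> Dict[int, float]:
    """U[a,b]: every integer size from a to b is equally likely."""
    n = b - a + 1
    return {s: 1.0 / n for s in range(a, b + 1)}


def truncated_geometric_pmf(q: float, b: int, boost: float = 3.0) -> Dict[int, float]:
    """D(q) on [1,b] under an ASSUMED adaptation.

    Weight of size s is q**s, times `boost` when s is 2, 4, 8, ...
    (hypothetical rule; the slides do not define the adaptation).
    """
    weights = {}
    for s in range(1, b + 1):
        is_power_of_two = s >= 2 and (s & (s - 1)) == 0
        weights[s] = q ** s * (boost if is_power_of_two else 1.0)
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


print(moments(uniform_pmf(1, 7)))
print(moments(uniform_pmf(1, 14)))
print(moments(truncated_geometric_pmf(0.9, 8)))
```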
Influence of Communication Speed Ratio
[Figure: response time versus utilization for communication speed ratios 10 and 100. Curves, right to left: total, flexible, unordered, ordered.]
Co-Allocation versus No Co-Allocation (1)
[Figure: response time versus utilization. Curves: flexible, 2 components, 4 components, 1 component.]
• no communication
• unordered jobs
• job size: 4xD(0.9) on [1,8] (fits on a single cluster)
Co-Allocation versus No Co-Allocation (2)
[Figure: response time versus utilization. Curves: LB-A, ratio 5; LB-A, ratio 50; no co-allocation, FF.]
• communication
• flexible jobs
• job size: 4xD(0.9) on [1,8]
An Application on the DAS (1)
• Solves the Poisson equation with a red-black Gauss-Seidel scheme
• Measurements on the DAS (times in ms):

configuration     number of     update    exchange borders,    exchange borders,
on unit square    iterations              single cluster       multicluster
4x2               2436          0.962     0.429                6.6
4x4               2132          0.498     0.387                7.0

• Time for diffusing local errors and computing the global error: 14 ms
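For readers unfamiliar with the scheme, a serial red-black Gauss-Seidel iteration for the Poisson equation on the unit square can be sketched as follows. This is not the measured DAS application: the function name, the zero boundary values, the sign convention $-\Delta u = f$, and the stopping rule on the largest pointwise change are all assumptions made for illustration. In a parallel version, each process would update its own block of the grid and exchange border rows and columns between the two color sweeps.

```python
import math


def red_black_gauss_seidel(n, f, tol=1e-8, max_iters=10_000):
    """Solve -laplace(u) = f on the unit square with u = 0 on the boundary.

    n is the number of interior grid points per dimension (ASSUMED setup).
    Returns (grid of u values including the boundary, iterations used).
    """
    h = 1.0 / (n + 1)
    u = [[0.0] * (n + 2) for _ in range(n + 2)]
    rhs = [[h * h * f(i * h, j * h) for j in range(n + 2)] for i in range(n + 2)]
    for iteration in range(1, max_iters + 1):
        max_change = 0.0
        for colour in (0, 1):            # 0: "red" points, 1: "black" points
            for i in range(1, n + 1):
                for j in range(1, n + 1):
                    if (i + j) % 2 != colour:
                        continue
                    # Points of one colour depend only on points of the other,
                    # so a whole colour sweep could be done in parallel.
                    new = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                  + u[i][j - 1] + u[i][j + 1] + rhs[i][j])
                    max_change = max(max_change, abs(new - u[i][j]))
                    u[i][j] = new
        if max_change < tol:             # stand-in for the global error check
            return u, iteration
    return u, max_iters


def source(x, y):
    # Right-hand side whose exact solution is sin(pi x) sin(pi y).
    return 2.0 * math.pi ** 2 * math.sin(math.pi * x) * math.sin(math.pi * y)


grid, iterations = red_black_gauss_seidel(15, source)
print(iterations, grid[8][8])   # value at the center, exact solution is 1
```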
An Application on the DAS (2)
[Figure: response time versus utilization for total and ordered requests.]
Equal mix of jobs of sizes (2,2,2,2) and (4,4,4,4)
Maximal Utilization (1)
• Assume: constant backlog, ordered jobs, exponential service (no communication)
• Consider: the joint probability distribution of the sizes of jobs in the system
• Result: this distribution is the same
– when the system runs for a long time
– when the system is filled from the empty state
• Use the convolution of the job-size distribution to determine the distribution of the numbers of jobs in the system
• Compute the maximal utilization
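The convolution step can be illustrated for a much simpler setting than the one analyzed on this slide: a single cluster that is filled from the empty state with jobs taken in order until the next one no longer fits. The sketch below computes the distribution of the number of jobs that fit. The single-cluster simplification and the function names are assumptions, and the sketch does not attempt to reproduce the maximal-utilization figures of the talk.

```python
from typing import List


def convolve(p: List[float], q: List[float]) -> List[float]:
    """Distribution of the sum of two independent sizes (index = size)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def jobs_that_fit_distribution(size_pmf: List[float], capacity: int) -> List[float]:
    """P(exactly k jobs fit) when an empty cluster is filled in order.

    size_pmf[s] = P(job size = s), with size_pmf[0] == 0.
    ASSUMPTION: one cluster of `capacity` processors, i.i.d. job sizes.
    """
    fits = [1.0]                 # fits[k] = P(first k jobs need <= capacity)
    sum_pmf = [1.0]              # distribution of the total size of 0 jobs
    for _ in range(capacity):    # at most `capacity` jobs of size >= 1 can fit
        # Sums above the capacity can be dropped: totals only grow.
        sum_pmf = convolve(sum_pmf, size_pmf)[:capacity + 1]
        fits.append(sum(sum_pmf))
    fits.append(0.0)
    return [fits[k] - fits[k + 1] for k in range(len(fits) - 1)]


# Job sizes 1 or 2 with equal probability, cluster of 2 processors.
print(jobs_that_fit_distribution([0.0, 0.5, 0.5], 2))
```

In the small example, two jobs fit only when both have size 1, which happens with probability 0.25; otherwise exactly one job fits.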
Maximal Utilization (2)
• We have an approximation for the maximal utilization for unordered jobs with WF
• We use simulations to validate this approximation
• Capacity loss (1-max. util.) for 4 clusters of size 32, uniform job-component sizes:
a    b     ordered    unordered    unordered    total
           (exact)    (approx.)    (simul.)     (exact)
1    4     0.149      0.050        0.053        0.038
1    5     0.176      0.065        0.067        0.047
1    13    0.345      0.187        0.192        0.120
1    16    0.380      0.233        0.239        0.148