Achieving Application Performance on the Computational Grid
Transcript of Achieving Application Performance on the Computational Grid
Achieving Application Performance
on the Computational Grid
Francine Berman
U. C. San Diego
and
NPACI
The Computing Landscape
• data archives
• networks
• visualization instruments
• MPPs
• clusters
• PCs
• workstations
• wireless
Computing Platforms
• Combining resources “in the box” – focus is on new hardware
• Combining resources as a “virtual box” – focus is on software infrastructure
The Computational Grid
Computational Grid = ensemble of distributed and heterogeneous resources
• Metaphor: the Electric Power Grid
– for users, power is ubiquitous
– you can plug in anywhere
– you don’t need to know where the power is coming from
Better Toast
• On the electric power grid, power is either adequate or it’s not
• On the computational grid, application performance depends on the underlying system state
• Major Grid research and development thrusts:
– Building the Grid
– Programming the Grid
Programming for Performance
• Performance Paradigm: to achieve performance, applications must be designed and implemented to leverage the performance characteristics of the underlying resources.
Performance Characteristics of the Grid
• Resources are distributed, heterogeneous
• Resources shared by multiple users
• Resource performance may be hard to predict
How Can Applications Achieve Performance on the Grid?
• Build programs to be grid-aware
• Leverage deliverable resource performance during execution
– Scheduling is fundamental
• Key Grid scheduling components
– dynamic information
– quantitative and qualitative predictions
– adaptivity
Achieving Application Performance
• Many entities will schedule the application
[Diagram: Grid Application Development System, linking a PSE, whole-program compiler, configurable object program, source application, libraries, real-time performance monitor, dynamic optimizer, software components, service negotiator, and scheduler to the Grid runtime system via negotiation and performance feedback]
Application Scheduling
• Application schedulers must
– perceive the performance impact of system resources on the application
– adapt the application execution schedule to dynamic conditions
– optimize the application schedule for the Grid according to the user’s performance criteria
• Application scheduler tasked with promoting application performance over the performance of other applications and system components
Paradigm for Application Scheduling
• Self-Centered Scheduling: everything in the system is evaluated in terms of its impact on the application.
• performance of each system component can be considered as a measurable quantity
• forecasts of quantities relevant to the application can be manipulated to determine a schedule
• This simple paradigm forms the basis for AppLeS.
AppLeS
• AppLeS = Application-Level Scheduler
– agent-based approach
– each application integrated with its own AppLeS
– each AppLeS develops and implements a custom application schedule
[Diagram: an AppLeS agent combining NWS (Wolski) forecasts, user preferences, and an application performance model in a resource selector, planner, and actuator, running on Grid/cluster resources and infrastructure]
• Joint project with Rich Wolski at U. Tenn.
AppLeS Approach
• Select resources
• For each feasible resource set, plan a schedule
– for each schedule, predict application performance at execution time
– consider both the prediction and its qualitative attributes
• Implement the “best” of the schedules with respect to the user’s performance criteria
– execution time
– convergence
– turnaround time
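The selection loop above can be sketched as follows (illustrative Python; the function names, field names, and the simple cost model are assumptions, not the actual AppLeS interfaces):

```python
# Sketch of the AppLeS selection loop. A "schedule" maps each processor to
# an amount of work; "forecasts" holds NWS-style predictions per processor.
# All names and the simple cost model are illustrative assumptions.

def predict_exec_time(schedule, forecasts):
    """Predicted time: the slowest processor's compute plus communication."""
    return max(
        work / forecasts[proc]["flops"] + forecasts[proc]["comm"]
        for proc, work in schedule.items()
    )

def best_schedule(candidate_schedules, forecasts):
    # Pick the schedule with minimal predicted execution time (one possible
    # user criterion; convergence or turnaround time would substitute a
    # different cost function here).
    return min(candidate_schedules,
               key=lambda s: predict_exec_time(s, forecasts))
```

A fuller agent would also weigh the qualitative attributes of each prediction (for instance, its variability) before committing to a schedule.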
Network Weather Service (Wolski, U. Tenn.)
• The NWS provides dynamic resource information for AppLeS
• NWS is a stand-alone system
• NWS
– monitors current system state
– provides best forecast of resource load from multiple models
[Diagram: NWS architecture: a sensor interface feeding a forecaster that selects among multiple models, with a reporting interface]
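The “best forecast from multiple models” idea can be sketched like this (a minimal illustration, not the NWS algorithm or its API):

```python
# Minimal sketch of multi-model forecasting in the NWS style: run several
# simple predictors over the measurement history and report the forecast
# from whichever model has had the lowest mean absolute error so far.

def last_value(history):
    return history[-1]

def running_mean(history):
    return sum(history) / len(history)

def best_forecast(history, models=(last_value, running_mean)):
    def past_error(model):
        # Mean absolute one-step-ahead error over the history.
        errors = [abs(model(history[:i]) - history[i])
                  for i in range(1, len(history))]
        return sum(errors) / len(errors)
    winner = min(models, key=past_error)
    return winner(history)
```

On a trending series the last-value predictor wins; on noisy but stable data the running mean does, which is the point of keeping several models around.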
Using Forecasting in Scheduling
• How much work should each processor be given?
• Jacobi2D AppLeS solves equations for Area_i
[Figure: an N x N grid partitioned into strips of area Area_i across processors P1, P2, P3]
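One simple answer to “how much work per processor” is to size each strip in proportion to the processor’s forecast speed (an illustrative sketch, not the actual Jacobi2D AppLeS solver):

```python
# Split the rows of an N x N Jacobi2D grid into strips whose sizes are
# proportional to each processor's predicted speed. Illustrative only;
# the real AppLeS solver also accounts for communication costs.

def partition_rows(n_rows, predicted_speeds):
    total = sum(predicted_speeds)
    rows = [int(n_rows * speed / total) for speed in predicted_speeds]
    rows[-1] += n_rows - sum(rows)  # hand any rounding remainder to the last strip
    return rows
```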
[Chart: Fast Ethernet bandwidth at SDSC over one week (Tue through the following Tue), in megabits per second (0 to 70), showing measurements alongside exponential smoothing predictions]
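The exponential smoothing used for those predictions follows the standard recurrence (a generic sketch; NWS chooses and tunes its forecasting models dynamically rather than fixing one):

```python
# Standard exponential smoothing: each new measurement pulls the forecast
# toward itself by a factor alpha. Generic sketch with a fixed alpha,
# whereas NWS adapts its forecasting parameters over time.

def exp_smooth_forecast(measurements, alpha=0.5):
    forecast = measurements[0]
    for m in measurements[1:]:
        forecast = alpha * m + (1 - alpha) * forecast
    return forecast
```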
Good Predictions Promote Good Schedules
• Jacobi2D experiments
[Chart: comparison of execution times (0 to 7 seconds) across problem sizes 1000 to 2000, for compile-time blocked, compile-time irregular strip, and runtime schedules]
SARA: An AppLeS-in-Progress
• SARA = Synthetic Aperture Radar Atlas
– application developed at JPL and SDSC
• Goal: assemble/process files for user’s desired image
– thumbnail image shown to user
– user selects desired bounding box for more detailed viewing
– SARA provides detailed image in a variety of formats
Simple SARA
• AppLeS focuses on the resource selection problem: which site can deliver the data the fastest?
• Goal is to optimize performance by minimizing transfer time
• Code developed by Alan Su
[Diagram: a compute server drawing from three data servers over a network shared by a variable number of users]
• Computation servers and data servers are logical entities, not necessarily different nodes
• Computation assumed to be done at compute servers
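Resource selection here reduces to picking the server with the smallest predicted transfer time (a sketch with made-up names; in practice the bandwidth forecasts would come from the NWS):

```python
# Pick the data server with the smallest predicted transfer time for a
# file of the given size. Names and units are illustrative assumptions.

def fastest_server(file_size_mb, forecast_mbps):
    """forecast_mbps maps server name -> predicted bandwidth (megabits/s)."""
    def transfer_time(server):
        return (file_size_mb * 8) / forecast_mbps[server]  # MB -> megabits
    return min(forecast_mbps, key=transfer_time)
```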
Experimental Setup
• Data for image accessed over shared networks
• Data sets 1.4 to 3 megabytes, representative of SARA file sizes
• Servers used for experiments (reached via the vBNS or the general Internet)
– lolland.cc.gatech.edu
– sitar.cs.uiuc
– perigee.chpc.utah.edu
– mead2.uwashington.edu
– spin.cacr.caltech.edu
Which is “Closer”?
• Sites on the east coast or sites on the west coast?
• Sites on the vBNS or sites on the general Internet?
• Consistently the same site or different sites at different times?
Depends a lot on traffic ...
Preliminary Results
• Experiment with larger data set (3 Mbytes)
• During this time frame, general Internet provides data mostly faster than vBNS
More Preliminary Results
• Experiment with smaller data set (1.4 Mbytes)
• During this time frame, east coast sites provide data mostly faster than west coast sites
9/21/98 Experiments
• Clinton Grand Jury webcast commenced at trial 62
What if File Sizes are Larger?
Storage Resource Broker (SRB)
• SRB provides access to distributed, heterogeneous storage systems
– UNIX, HPSS, DB2, Oracle, ...
– files can be 16MB or larger
– resources accessed via a common SRB interface
An SRB AppLeS
[Diagram: SRB client with an AppLeS agent, consulting the Network Weather Service and reaching an SRB server with its MCAT catalog over the network, backed by distributed physical storage]
• Being developed by Marcio Faerman
• Like Simple SARA, SRB focuses on resource selection
• NWS probe is 64K, SRB file size is 16MB
• How to predict SRB file transfer time?
Predicting Large File Transfer Times
NWS and SRB present distinct behaviors
[Charts: bandwidth (0 to 3 Mbits/s) versus time, Washington St. Louis to UCSD, Dec 3 through Dec 11; one panel shows SRB and NWS measurements, the other adds the “predicted” SRB bandwidth derived from NWS]
Current approach: use linear regression on NWS bandwidth measurements to track SRB behavior
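That approach can be sketched with an ordinary least-squares fit (illustrative Python; the project’s actual predictor and its inputs are not shown here):

```python
# Fit srb_bw ~= slope * nws_bw + intercept by ordinary least squares over
# paired historical observations, then map a fresh 64K NWS probe reading
# to a predicted bandwidth for a 16MB SRB transfer. Names are illustrative.

def fit_linear(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def predict_srb_bandwidth(nws_history, srb_history, nws_now):
    slope, intercept = fit_linear(nws_history, srb_history)
    return slope * nws_now + intercept
```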
Distributed Data Applications
[Diagram: a client, compute servers, and data servers]
• Move the computation or move the data?
• Which compute servers to use?
• Which servers to use for multiple files?
• Simple SARA and SRB representative of a larger class of distributed data applications
• Goal is to develop AppLeS scheduler for “end-to-end” applications
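The first of those questions can be made concrete with a toy cost comparison (every parameter below is an illustrative assumption, not a measured quantity):

```python
# Toy version of "move the computation or move the data?": compare the
# time to ship the input to a compute server against the time to ship
# the result back from a server colocated with the data. Compute time
# is assumed equal either way; all parameters are illustrative.

def move_data_or_compute(data_mb, result_mb, bw_to_compute_mbps, bw_to_client_mbps):
    move_data_time = data_mb * 8 / bw_to_compute_mbps
    move_result_time = result_mb * 8 / bw_to_client_mbps
    return "move data" if move_data_time < move_result_time else "move computation"
```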
A Bushel of AppLeS … almost
• During the first “phase” of the project, we’ve focused on developing AppLeS applications
– Jacobi2D
– DOT
– SRB
– Simple SARA
– magnetohydrodynamics
– CompLib
– INS2D
– Tomography, ...
• What have we learned?
Lessons Learned From AppLeS
• Dynamic information is critical.
[Figure: Jacobi2D, compile-time blocked partitioning versus run-time AppLeS non-uniform strip partitioning]
Lessons Learned from AppLeS
• Program execution and parameters may exhibit a range of performance
Lessons Learned from AppLeS
• Knowing something about the “goodness” of performance predictions can improve scheduling
[Chart: SOR CompLib execution time (0 to 350 seconds) for small, medium, and large problem sizes, comparing SuperAppLeS, AppLeS, and Mentat]
Lessons Learned from AppLeS
• Performance of application sensitive to scheduling policy, data, and system characteristics
Achieving Application Performance on the Grid
• AppLeS uses adaptivity to leverage deliverable resource performance
• Performance impact of all components considered
• AppLeS agents target dynamic, multi-user distributed environments
• AppLeS is a leading project in application scheduling
Related Work
• Application Schedulers
– Mars, Prophet/Gallop, VDCE, ...
• Scheduling Services
– Globus GRAM
• Resource Allocators
– I-Soft, PBS, LSF, Maui Scheduler, Nile, Legion
• PSEs
– Nimrod, NEOS, NetSolve, Ninf
• High-Throughput Schedulers
– Condor
• Performance Steering
– Autopilot, SciRun
New Directions
• AppLeS Templates– distributed data applications
– parameter sweeps
– master/slave applications
– data parallel stencil applications
AppLeS Template Retargeting Engineering Environment
[Diagram: application, performance, scheduling, and deployment modules connected through APIs, supported by the Network Weather Service, dynamic benchmarking, and suite selection]
New Directions
• Expanding AppLeS target execution sites
– interactive clusters (Linux, NT)
– Globus, Legion
– batch systems
– high-throughput clusters (Condor)
– all of the above
New Directions
• Real World Scheduling
• scheduling with
– partial information
– poor information
– dynamically changing information
• Multischeduling
• resource economies
• scheduling “social structure”
The Brave New World
• Design, development, and execution of grid-aware applications
[Diagram: the Grid Application Development System, with PSE, whole-program compiler, configurable object program, libraries, real-time performance monitor, dynamic optimizer, software components, service negotiator, scheduler, and Grid runtime system]
The AppLeS Project
• AppLeS Corps:
– Fran Berman, UCSD
– Rich Wolski, U. Tenn
– Henri Casanova
– Walfredo Cirne
– Marcio Faerman
– Jaime Frey
– Jim Hayes
– Graziano Obertelli
– Gary Shao
– Shava Smallen
– Alan Su
– Dmitrii Zagorodnov
• Thanks to NSF, NASA, NPACI, DARPA, DoD
• AppLeS Home Page: http://www-cse.ucsd.edu/groups/hpcl/apples.html