London e-Science Centre
Application Scheduling in a Grid Environment
Nine month progress talk
Laurie Young
Overview
• Introduction to grid computing
• Work so far…
• Imperial College E-Science Networked Infrastructure (ICENI)
• Scheduling within ICENI
• Optimisation criteria/Scheduling policy
• Scheduling/Mapping algorithms
What is a Grid?
[Figure: a grid linking CPU nodes, a storage node, a scientific instrument, and visualisation/steering software]
What is a Grid Application?
[Figure: tiered data grid for a high-energy physics experiment. Labels recovered from the figure:]
• Online System; Offline Processor Farm, ~20 TIPS
• Tier 0: CERN Computer Centre
• Tier 1: FermiLab (~4 TIPS) and the France, Italy and Germany Regional Centres
• Tier 2: Caltech (~1 TIPS) and four Tier2 Centres (~1 TIPS each)
• Institutes (~0.25 TIPS) with a physics data cache
• Tier 4: Physicist workstations
• Link bandwidths shown: ~PBytes/sec, ~100 MBytes/sec, ~622 Mbits/sec (or Air Freight, deprecated), ~1 MBytes/sec
• There is a “bunch crossing” every 25 nsecs
• There are 100 “triggers” per second
• Each triggered event is ~1 MByte in size
• Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
• 1 TIPS is approximately 25,000 SpecInt95 equivalents
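As a rough check, the trigger rate and event size quoted on this slide can be multiplied out. This is a minimal sketch: the variable names are illustrative, and treating 1 TByte as 10^6 MBytes is an assumption made only for the daily-volume estimate.

```python
# Figures quoted on the slide
bunch_crossing_interval_ns = 25   # one bunch crossing every 25 ns
triggers_per_second = 100         # triggered events per second
event_size_mbytes = 1             # ~1 MByte per triggered event

# 1 second = 1,000,000,000 ns, so 40,000,000 crossings per second
crossings_per_second = 1_000_000_000 // bunch_crossing_interval_ns

# 100 events/s at ~1 MByte each gives ~100 MBytes/sec
data_rate_mbytes_per_s = triggers_per_second * event_size_mbytes

# Fraction of bunch crossings that are kept by the triggers
kept_fraction = triggers_per_second / crossings_per_second

# Assumption: 1 TByte = 1,000,000 MBytes
daily_volume_tbytes = data_rate_mbytes_per_s * 86_400 / 1_000_000

print(crossings_per_second, data_rate_mbytes_per_s, kept_fraction, daily_volume_tbytes)
```

The resulting ~100 MBytes/sec agrees with the ~100 MBytes/sec link label in the figure.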
Current Work
• Development of Supporting Technologies
– Development of EPIC (E-Science Portal @ IC)
• GridFTP (High throughput FTP)
• Grid/Globus submission of jobs to resources
• Development of test application
– Parameter sweep analysis of submarine acoustics
– Multithreaded and Component versions
– Integration with EPIC
ICENI
• IC e-Science Networked Infrastructure
• Developed by LeSC Grid Middleware Group
• Collect and provide relevant Grid meta-data
• Use to define and develop higher-level services
The Iceni, under Queen Boudicca, united the tribes of South-East England in a revolt against the occupying Roman forces in AD60.
ICENI Component Applications
• Each ICENI job is composed of multiple components. Each runs on a different resource
• Each component is connected to at least one other component. Data is passed along these connections
The Scheduling Problem
Given a component application and a (large) network of linked computational resources, what is the best mapping of components onto resources?
Scheduler in ICENI
[Figure: ICENI architecture, showing the App Builder (GUI), Component Repository, Performance Models, Scheduler and Broker, together with the Resources]
Multiple Metrics (1)
• “It is the goal of a scheduler to optimise one or more metrics” (Feitelson & Rudolph)
• Generally one metric is most important
– Application Optimisation
  • Execution time
  • Execution cost
– Host Optimisation
  • Host utilisation
  • Host throughput
  • Interaction latency
Multiple Metrics (2)
• In a Grid environment there are three important application-optimisation metrics
– Start time (b)
– End time (e)
– Cost
• Relative importance varies on a user-by-user and application-by-application basis
Combining Metrics – Benefit Fn
• A Benefit Function maps the metrics we are interested in to a single Benefit Value metric
• Different benefit functions represent different optimisation preferences
B = B(b, e, cost)
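The idea of a benefit function can be sketched in a few lines of Python. The three functions below are hypothetical shapes chosen purely for illustration; they are not the definitions used in the talk, and the names `Metrics`, `cost_benefit`, `time_benefit`, `weighted_benefit` and `best` are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Metrics:
    b: float     # start time
    e: float     # end time
    cost: float  # cost


# A benefit function maps the metrics to a single benefit value.
BenefitFn = Callable[[Metrics], float]


def cost_benefit(m: Metrics) -> float:
    """Hypothetical cost preference: cheaper is better, time is ignored."""
    return -m.cost


def time_benefit(m: Metrics) -> float:
    """Hypothetical time preference: an earlier end time is better."""
    return -m.e


def weighted_benefit(w_time: float, w_cost: float) -> BenefitFn:
    """Hypothetical cost/time preference: weighted sum of end time and cost."""
    def benefit(m: Metrics) -> float:
        return -(w_time * m.e + w_cost * m.cost)
    return benefit


def best(options: Iterable[Metrics], benefit: BenefitFn) -> Metrics:
    """Pick the option with the highest benefit value."""
    return max(options, key=benefit)


# Two made-up schedule options for the same job.
fast_expensive = Metrics(b=0.0, e=10.0, cost=50.0)
slow_cheap = Metrics(b=5.0, e=40.0, cost=5.0)
options = [fast_expensive, slow_cheap]
```

With these made-up numbers, `best(options, cost_benefit)` selects `slow_cheap` and `best(options, time_benefit)` selects `fast_expensive`, so the same pair of options is ranked differently depending on the preference expressed by the benefit function.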
Optimisation Preferences
• Cost Optimisation
• Time Optimisation
• Cost/Time Optimisation
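[Equations not recoverable from the transcript: each of the three preferences defines a benefit B involving the end time e, subject to a condition of the form "if … e … max and … max"]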
Graph Oriented Scheduling (1)
• Applications are described as a graph
– Nodes represent application components
– Edges represent component communication
• Resources are described as a graph
– Nodes represent resources
– Edges represent network connections
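The two graphs described above could be held as plain edge lists. This is a minimal sketch: the node names are taken from later slides, but the edges are assumed for illustration only, since the figures' connections are not recoverable, and the helpers `nodes` and `neighbours` are hypothetical names.

```python
# Application graph: nodes are components, edges are component communication.
# Edges here are assumptions for illustration.
application_edges = [
    ("Design", "Scatter"),
    ("Scatter", "Mesh"),
    ("Mesh", "DRACS"),
    ("DRACS", "Gather"),
    ("Gather", "Analyse"),
]

# Resource graph: nodes are resources, edges are network connections.
# Edges here are assumptions for illustration.
resource_edges = [
    ("Condor pool", "Atlas"),
    ("Atlas", "Saturn"),
    ("Saturn", "Viking"),
]


def nodes(edges):
    """Return the set of all nodes that appear in an edge list."""
    return {node for edge in edges for node in edge}


def neighbours(edges, node):
    """Return the nodes directly connected to `node`, in either direction."""
    return sorted(
        {v for u, v in edges if u == node} | {u for u, v in edges if v == node}
    )


# Every component is connected to at least one other component.
all_connected = all(neighbours(application_edges, n) for n in nodes(application_edges))
```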
Centre Resources
• VOYAGER: Microsoft/Dell Intel Cluster, 32 processor, Giganet
• SATURN: Sun E6800 SMP, 24 processors, Backplane: 9.6GB/s
• PIONEER: Athlon Cluster, 22 processor, 100Mb
• ATLAS: Compaq / Quadrics Cluster, 32 processor, MPI: ~5.7us & >200 MB/s
• CONDOR POOL: ~150 PIII processors
• AP3000: 80 Sparc Ultra II, APNet
• VIKING 1: P4/Linux Cluster, 66 dual node, Myrinet
• VIKING 2: P4/Linux Cluster, 68 dual node, 100Mb
• Storage (capacities shown: 6TB, 1.2TB, 24TB)
Graph Oriented Scheduling (2)
[Figure: a resource graph with nodes Condor pool, Atlas, Saturn and Viking, and an application graph with components Design, Factory, Scatter, three Mesh and three DRACS components, Gather and Analyse]
Graph Oriented Scheduling (3)
[Figure: the application components (Scatter, Gather, Design, Factory, Analyse) placed onto the resource graph (Condor pool, Atlas, Saturn, Viking)]
Schedule Benefit
• Each component and communication has a benefit function
• Each resource and network connection has a predicted time & cost for each component or communication that could be deployed
• Fit the task graph onto the resource graph to get the maximum Total Predicted Benefit
B_t = B(b, e, cost)
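Fitting a task graph onto a resource graph to maximise total predicted benefit can be illustrated with an exhaustive search over a tiny example. This is a sketch under stated assumptions, not the scheduler described in the talk: every predicted time and cost below is made up, a single hypothetical benefit function is shared by all components and communications, and each component is placed on a different resource.

```python
from itertools import permutations

components = ["Design", "Mesh", "Analyse"]
communications = [("Design", "Mesh"), ("Mesh", "Analyse")]  # assumed edges
resources = ["Atlas", "Saturn", "Viking"]

# Hypothetical predicted (time, cost) for each component on each resource.
predicted_run = {
    ("Design", "Atlas"): (3, 2), ("Design", "Saturn"): (3, 5), ("Design", "Viking"): (6, 1),
    ("Mesh", "Atlas"): (10, 4), ("Mesh", "Saturn"): (5, 6), ("Mesh", "Viking"): (8, 2),
    ("Analyse", "Atlas"): (2, 1), ("Analyse", "Saturn"): (2, 3), ("Analyse", "Viking"): (3, 1),
}

# Hypothetical predicted transfer time over each network connection.
predicted_link_time = {
    frozenset(("Atlas", "Saturn")): 1,
    frozenset(("Atlas", "Viking")): 3,
    frozenset(("Saturn", "Viking")): 2,
}


def benefit(time, cost):
    """Hypothetical benefit: less time and less cost are both better."""
    return -(time + cost)


def total_benefit(mapping):
    """Sum the benefit of every component and every communication."""
    total = 0
    for component in components:
        time, cost = predicted_run[(component, mapping[component])]
        total += benefit(time, cost)
    for source, target in communications:
        link = frozenset((mapping[source], mapping[target]))
        total += benefit(predicted_link_time[link], 0)
    return total


def best_mapping():
    """Try every placement of components on distinct resources."""
    candidates = (
        dict(zip(components, placement))
        for placement in permutations(resources, len(components))
    )
    return max(candidates, key=total_benefit)


print(best_mapping(), total_benefit(best_mapping()))
```

Exhaustive search is only workable for very small graphs, because the number of candidate placements grows factorially with the number of components and resources.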
Future Work
• Develop benefit maximisation algorithms
• Test schedulers
– On grid simulators such as SimGrid, GridSim and MicroGrid
– On grid testbeds, such as IC Testbed and the EUDG
• Develop brokering methods
• Define Scheduler-Broker communications
Summary
• Concept of grid computing for HPC/HTC
• ICENI Middleware for utilisation of grids
• Importance of scheduling metrics
• Combining metrics
• Mapping application graphs onto resource graphs
• Optimisation of total benefit
• Need good mapping algorithms…