Security-Driven Heuristics and A Fast Genetic Algorithm for Trusted Grid Job Scheduling
Shanshan Song, Ricky Kwok, and Kai Hwang
University of Southern California, Los Angeles, CA 90089 USA
Presented by Shanshan Song at the IEEE IPDPS’05, Denver, Colorado, April 6, 2005
The work was supported by the NSF ITR Grant 0325409
http://GridSec.usc.edu
Presentation Outline:
- Motivations
- The System Model
- Three security-driven scheduling strategies
  - To bind security to existing time-driven heuristics for parallel job scheduling
- A New Space-Time Genetic Algorithm (STGA)
- Performance Metrics and Workloads
- NAS and PSA Benchmark Results
- Conclusions
Motivations
- Highly shared Grid resources create severe insecurity problems and privacy concerns.
- Most schedulers ignore the risk factor when scheduling a large number of jobs in a risky real-life Grid environment.
Parallel Job Scheduling Scenario in Risky Computational Grids
[Figure labels: Deterministic; Adaptive; Historical database; Security-Driven Model: high-secure and low-secure sites, high-demand and low-demand jobs]
[Figure panels: (a) Secure, (b) Risky, (c) f-Risky; label: Historical database]
The bad thing always could happen (Murphy’s Law)
- "We are scared, let us just wait …"
- "We don’t care, just do it. I am courageous, not a kid anymore …"
- "I calculate too, maybe I am lucky …"
- "I run a calculated risk, but wait a while …"
Three Scheduling Modes
- Secure mode: Allocate jobs only to those Grid sites with security level exceeding the job requirement (SD < SL)
- Risky mode: Allocate jobs to any available Grid sites without checking the risk level or the job demand
- f-risky mode: Allocate jobs to those Grid sites taking at most f risk. E.g.: f = 0.5 (50%)

Risk Scale: 0 ≤ f ≤ 100%
- Secure: P(fail) = 0
- f-Risky: P(fail) ≤ f
- Risky: P(fail) ≤ 100%

The Failure Model:
$$
P(\mathrm{fail}) =
\begin{cases}
0 & \text{if } SD \le SL \\
1 - e^{-\lambda (SD - SL)} & \text{if } SD > SL
\end{cases}
$$
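The failure model and the three modes can be illustrated with a minimal Python sketch. It assumes the exponential form P(fail) = 1 − e^(−λ(SD − SL)) for SD > SL, where SD is the job's security demand and SL the site's security level. The rate `lam`, its default value, and the function names are illustrative assumptions, as is treating SD = SL as safe.

```python
import math

def failure_probability(sd: float, sl: float, lam: float = 1.0) -> float:
    """P(fail) of a job with security demand `sd` on a site with security level `sl`.

    `lam` is an assumed rate parameter; its value is not given on the slide.
    """
    if sd <= sl:
        return 0.0  # the site is at least as secure as the job demands
    return 1.0 - math.exp(-lam * (sd - sl))

def site_is_eligible(sd: float, sl: float, mode: str,
                     f: float = 0.5, lam: float = 1.0) -> bool:
    """Decide whether a site may receive a job under one of the three modes."""
    if mode == "secure":
        return sd <= sl                      # no risk is tolerated
    if mode == "risky":
        return True                          # risk is not checked at all
    if mode == "f-risky":
        return failure_probability(sd, sl, lam) <= f  # take at most f risk
    raise ValueError(f"unknown mode: {mode}")

if __name__ == "__main__":
    for mode in ("secure", "f-risky", "risky"):
        print(mode, site_is_eligible(sd=0.9, sl=0.4, mode=mode))
```

The secure and risky modes are the two ends of the risk scale: in this sketch they behave like f = 0 and f = 100% respectively.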
Scheduling Heuristics under Three Risky Modes
- Min-Min heuristic: For each job, the resource site that gives the earliest expected completion time is determined first. The job that has the minimum earliest expected completion time is determined and then assigned to the corresponding site.
- Sufferage heuristic: The Sufferage heuristic is based on the idea that better mappings can be generated by assigning a site to a job that would "suffer" most in terms of expected completion time if that particular site is not assigned to it.
- Heuristic operational modes: Secure, f-Risky, Risky
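One way to bind a security mode to Min-Min is to filter candidate sites before comparing completion times. The Python sketch below is an illustration under assumptions, not the authors' implementation. The `eligible` predicate is a hypothetical hook, the SD and SL values are invented, and the expected-time values are taken from the Expected Time to Complete Matrix slide.

```python
def min_min(etc, eligible=None):
    """Min-Min over one batch.

    etc[job][site] is the expected execution time of `job` on `site`.
    `eligible(job, site)` is a hypothetical hook encoding the security mode.
    Returns (schedule, ready), where ready[site] is when the site becomes free.
    """
    if eligible is None:
        eligible = lambda job, site: True   # risky mode: no security check
    sites = sorted({s for row in etc.values() for s in row})
    ready = {s: 0.0 for s in sites}
    unscheduled = set(etc)
    schedule = {}
    while unscheduled:
        best = None  # (completion time, job, site)
        for job in sorted(unscheduled):
            for site in sites:
                if site in etc[job] and eligible(job, site):
                    completion = ready[site] + etc[job][site]
                    if best is None or completion < best[0]:
                        best = (completion, job, site)
        if best is None:
            break  # the remaining jobs have no eligible site in this batch
        completion, job, site = best
        schedule[job] = site
        ready[site] = completion
        unscheduled.remove(job)
    return schedule, ready

# Expected times from the slide's matrix (rows there are sites, columns jobs).
etc = {
    "Job1": {"Site1": 3, "Site2": 2, "Site3": 6},
    "Job2": {"Site1": 5, "Site2": 4, "Site3": 9},
    "Job3": {"Site1": 7, "Site2": 3, "Site3": 10},
}
# Hypothetical security demands and security levels, for illustration only.
sd = {"Job1": 0.6, "Job2": 0.3, "Job3": 0.8}
sl = {"Site1": 0.9, "Site2": 0.5, "Site3": 0.7}

risky_schedule, risky_ready = min_min(etc)
secure_schedule, secure_ready = min_min(etc, lambda j, s: sd[j] <= sl[s])
```

With these invented security values, the secure mode gives up the fast but weakly protected Site2 for two of the three jobs, so the batch finishes at time 10 instead of 5.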
Genetic Algorithm (GA)
- Genetic Algorithm (GA) is a popular technique used for searching large solution spaces
- It is powerful for generating good solutions
- It is not widely deployed because of its long computation time

[Figure: Traditional GA vs. STGA in terms of Number of Evolution Iterations. x-axis: Number of Evolution Iterations; y-axis: Solution Quality. Labels: Generate Random Initial Population; STGA Starting Point; Good Solution is found; curves GA and STGA]
How does STGA Work?
STGA: Space-Time Genetic Algorithm
[Figure elements: One batch of jobs, (%%%, ***, ####); Lookup Table with Input and Solution columns, e.g. (%%,**, ###) → (423…56) and (%%,****, ###) → (368…89); solutions (456 … 34) … (167 … 89); Randomly Generated Solutions; Initial Population; GA; Final Solution (123 … 786)]
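The slide does not say how an incoming batch is matched against the lookup table, so the following Python sketch is only a guess at the general idea: seed part of the initial population with stored solutions of similar past batches and fill the rest randomly. The batch signature (a tuple of numbers), the squared Euclidean distance, and every function and parameter name are assumptions made for illustration.

```python
import random

def squared_distance(a, b):
    """Assumed similarity measure between two batch signatures."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def seed_population(batch_signature, lookup_table, num_sites, num_jobs,
                    pop_size, num_seeds, rng):
    """Build an initial population for the GA.

    lookup_table: list of (signature, solution) pairs from earlier batches,
    where solution[j] is the site index chosen for job j. Stored solutions
    are assumed to have length num_jobs.
    """
    ranked = sorted(lookup_table,
                    key=lambda entry: squared_distance(entry[0], batch_signature))
    seeds = [list(solution) for _, solution in ranked[:num_seeds]]
    randoms = [[rng.randrange(num_sites) for _ in range(num_jobs)]
               for _ in range(pop_size - len(seeds))]
    return seeds + randoms

# Hypothetical history of three past batches.
table = [
    ((10.0, 2.0), [0, 1, 2]),
    ((1.0, 1.0), [2, 2, 0]),
    ((5.0, 5.0), [1, 0, 1]),
]
population = seed_population((1.2, 0.9), table, num_sites=3, num_jobs=3,
                             pop_size=6, num_seeds=2, rng=random.Random(0))
```

Starting from solutions that worked for similar batches is what would let the search begin near the "STGA Starting Point" rather than from a purely random population.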
STGA Convergence Time
[Figure: Makespan (seconds) vs. Number of Iterations in STGA; PSA workload, N=1000; x-axis from 0 to 200 iterations, y-axis from 150000 to 170000 seconds]
Converges at 50 iterations: fast!
Performance Metrics and Workloads
- Performance Metrics
  - Makespan, slowdown ratio, and average response time
  - Site utilization
  - Number of failed jobs & number of risk-taking jobs
- Numerical Aerodynamic Simulation (NAS) Workload
  - A package containing three months' worth of sanitized accounting records for the 128-node iPSC/860 located in the Numerical Aerodynamic Simulation (NAS) Systems Division at NASA Ames Research Center.
- Parameter Sweep Application (PSA) Workload
  - Contains a set of independent tasks
  - Each task has some input files for different parameters
Performance Results (Makespan)
- NAS trace workload (16000 jobs, 12 sites)
- Job arrival rate and workload are from trace data
- STGA evolution iterations: 100 (GA: 1000 iterations)
Performance Results (Response Time)
- NAS trace workload (16000 jobs, 12 sites)
- Job arrival rate and workload are from trace data
Performance Results (Utilization)
- NAS trace workload (16000 jobs, 12 sites)
- Job arrival rate and workload are from trace data
Scalability Analysis
- The scalability analysis is conducted on the number of simulated jobs (PSA workload)
- N = 1000, 2000, 5000, and 10000
Conclusions
- The security binding technique can be applied to improve any time-driven heuristic for online scheduling of parallel jobs in an open, risky Grid computing environment.
- The new STGA algorithm works by swiftly generating good scheduling solutions based on prior job execution experience on Grid platforms. Both NAS and PSA benchmark results show the superiority of STGA over the heuristic algorithms applied.
Min-Min and Sufferage Heuristics
- Min-Min heuristic: For each job, the resource site that gives the earliest expected completion time is determined first. The job that has the minimum earliest expected completion time is determined and then assigned to the corresponding site.
- Sufferage heuristic: The Sufferage heuristic is based on the idea that better mappings can be generated by assigning a site to a job that would "suffer" most in terms of expected completion time if that particular site is not assigned to it.

Expected Time to Complete Matrix (the same matrix is shown for both heuristics)
               Job1  Job2  Job3
Site1             3     5     7
Site2             2     4     3
Site3             6     9    10
Suffer value:     1     1     4
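The suffer values can be checked with a short Python sketch. It takes a job's sufferage to be the difference between its second-best and best expected completion times, which reproduces the values 1, 1, 4 for the matrix on this slide. The function names are illustrative, and only the first scheduling decision (all sites idle) is shown, not the full iterative heuristic.

```python
def sufferage_values(etc):
    """Sufferage of each job: second-best minus best expected completion time.

    etc[job][site] is the expected completion time with all sites idle.
    """
    values = {}
    for job, times in etc.items():
        ordered = sorted(times.values())
        values[job] = ordered[1] - ordered[0] if len(ordered) > 1 else 0
    return values

def first_sufferage_pick(etc):
    """Job that would suffer most, paired with its best site."""
    values = sufferage_values(etc)
    job = max(sorted(values), key=lambda j: values[j])
    site = min(sorted(etc[job]), key=lambda s: etc[job][s])
    return job, site

# The slide's matrix, indexed by job and then by site.
etc = {
    "Job1": {"Site1": 3, "Site2": 2, "Site3": 6},
    "Job2": {"Site1": 5, "Site2": 4, "Site3": 9},
    "Job3": {"Site1": 7, "Site2": 3, "Site3": 10},
}
suffer = sufferage_values(etc)      # {'Job1': 1, 'Job2': 1, 'Job3': 4}
pick = first_sufferage_pick(etc)    # ('Job3', 'Site2')
```

All three jobs prefer Site2. Min-Min would give it to Job1, which has the smallest completion time (2), while Sufferage gives it to Job3, which loses the most (4) if it has to go elsewhere.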
Genetic Algorithm Overview
- Genetic Algorithms (GAs) are a popular technique used for searching large solution spaces
- 'selection', 'crossover', and 'mutation' operations
  - Selection: keep good solutions
  - Crossover: global optimization
  - Mutation: local jumping

[Figure: a population of four binary chromosomes shown at four stages, with fitness values per chromosome]
- Initial Population: 0.3 0.6 0.9 0.6
- After selection: 0.9 0.6 0.9 0.6
- After crossover: 1.0 0.4 0.9 0.6
- After mutation: 1.0 0.4 0.8 0.6
How does GA apply to job scheduling?
- What we have:
  - A set of resource sites
  - A number of jobs
- Solution to generate:
  - Job and site mapping

One solution (chromosome in GA):
Job1   Job2   Job3   Job4   Job5
site4  site3  site5  site2  site2

Initial population (size=200):
Job1   Job2   Job3   Job4   Job5
site4  site3  site5  site2  site2
site3  site1  site5  site3  site2
site1  site3  site4  site2  site6
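The encoding can be illustrated with a short Python sketch in which a chromosome is a list whose position j holds the site assigned to job j. The makespan-based cost is a simplifying assumption: jobs on one site run back to back, all sites are idle at time 0, and all jobs are available at once. The function names and the expected-time matrix are illustrative.

```python
import random

def makespan(chromosome, etc):
    """Finish time of the busiest site.

    chromosome[j] is the index of the site that runs job j;
    etc[j][s] is the expected execution time of job j on site s.
    """
    load = [0.0] * len(etc[0])
    for job, site in enumerate(chromosome):
        load[site] += etc[job][site]
    return max(load)

def random_population(size, num_jobs, num_sites, rng):
    """Randomly generated job-to-site mappings, one chromosome per row."""
    return [[rng.randrange(num_sites) for _ in range(num_jobs)]
            for _ in range(size)]

# Hypothetical expected times: 3 jobs (rows) on 3 sites (columns).
etc = [
    [3, 2, 6],
    [5, 4, 9],
    [7, 3, 10],
]
population = random_population(200, num_jobs=3, num_sites=3,
                               rng=random.Random(7))
best = min(population, key=lambda c: makespan(c, etc))
```

A GA that minimizes makespan can use a decreasing function of this cost as its fitness, so that shorter schedules are more likely to survive selection.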