DIOS - compilers

DIOS: Dynamic Instrumentation for (not so) Outstanding Scheduling Blake Sutton & Chris Sosa

description

My DIOS presentation for compilers. This version is aimed at a compiler-oriented audience.

Transcript of DIOS - compilers

Page 1: DIOS - compilers

DIOS: Dynamic Instrumentation for (not so) Outstanding Scheduling Blake Sutton & Chris Sosa

Page 2: DIOS - compilers

Motivation


Page 3: DIOS - compilers

Approach: Adaptive Distributed Scheduler

Centralized global scheduler and distributed local services

Hares monitor machines for “undesirable” events

Hares also gather application-specific info with Pin

Rhino schedules jobs and responds to events from Hares: Migrate, Pause / Resume, Kill / Restart

Page 4: DIOS - compilers

“Pinvolvement”: What it is

Insert new code into apps on the fly: no recompile, operates on a copy, code caching

Our Pintool: routine-level and instruction-level instrumentation

pin -t mytool -- ./myprogram

Borrowed from Luk et al. 2005.
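A local service that starts jobs under Pin has to assemble the command line shown on this slide. The following is a minimal sketch of that step only; the helper name `pin_command` is hypothetical, and the tool and program names are the placeholders from the slide, not real files.

```python
import shlex


def pin_command(tool, program, program_args=()):
    """Build the argv for `pin -t <tool> -- <program> [args...]`.

    Everything after `--` is the application and its own arguments.
    """
    return ["pin", "-t", tool, "--", program, *program_args]


cmd = pin_command("mytool", "./myprogram")
print(shlex.join(cmd))  # pin -t mytool -- ./myprogram
```

Keeping the command as a list (rather than one string) lets it be handed directly to `subprocess.run` without shell quoting issues.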

Page 5: DIOS - compilers

“Pinvolvement”: What it measures

No reliance on hardware-specific performance counters

Want to capture memory behavior over time

Gathered:

Ratio of malloc to free calls

Wall-clock time to execute 10,000,000 insns

Number of memory ops in last 2,000,000 insns
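To make the gathered metrics concrete, here is a small Python model of two of them: the malloc/free ratio and the count of memory ops in a trailing instruction window. It is an illustrative sketch, not the actual Pintool (which would be written against Pin's C++ API); the class and method names are hypothetical, and the wall-clock metric is left out because it depends on real execution time.

```python
from collections import deque


class MemoryProfile:
    """Toy model of the metrics, fed one event at a time."""

    def __init__(self, window=2_000_000):
        self.window = window          # size of the trailing instruction window
        self.insns = 0                # instructions seen so far
        self.mallocs = 0
        self.frees = 0
        self.mem_positions = deque()  # instruction indices of recent memory ops

    def on_call(self, name):
        # Routine-level event: count calls to malloc and free.
        if name == "malloc":
            self.mallocs += 1
        elif name == "free":
            self.frees += 1

    def on_insn(self, is_mem_op):
        # Instruction-level event: advance the counter and record memory ops.
        self.insns += 1
        if is_mem_op:
            self.mem_positions.append(self.insns)
        # Drop memory ops that have fallen out of the trailing window.
        cutoff = self.insns - self.window
        while self.mem_positions and self.mem_positions[0] <= cutoff:
            self.mem_positions.popleft()

    def malloc_free_ratio(self):
        # None until at least one free has been seen (avoids dividing by zero).
        return self.mallocs / self.frees if self.frees else None

    def recent_mem_ops(self):
        return len(self.mem_positions)
```

Storing one index per memory op is simple but memory-hungry at a 2,000,000-instruction window; a real tool would more likely keep per-bucket counts.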

Page 6: DIOS - compilers

Evaluation

Distributed scheduler

Rhino on realitytv13, Hares on realitytv13-16

heatedplate with modified parameters

Hares detect when less than 10% of memory is available and inform Rhino to take action

Rhino reschedules the youngest job at the Hare site

Baseline: Smallest Queues

Pintool: 2 applications from SPLASH-2, plus heatedplate
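The low-memory policy on this slide can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the DIOS implementation: the `Job` and `Host` types and all function names are hypothetical, and the choice to move the job to the other host with the shortest queue is an assumption (the slide only says the youngest job is rescheduled).

```python
from dataclasses import dataclass, field

LOW_MEMORY_FRACTION = 0.10  # "lower than 10% memory available"


@dataclass
class Job:
    name: str
    start_time: float  # larger value means the job started more recently


@dataclass
class Host:
    name: str
    mem_total: int
    mem_free: int
    jobs: list = field(default_factory=list)

    def low_on_memory(self):
        # The check a Hare would perform before notifying Rhino.
        return self.mem_free / self.mem_total < LOW_MEMORY_FRACTION


def handle_low_memory(hosts, reporting):
    """Rhino's response: move the youngest job off the reporting host.

    Returns (job, target_host), or None if nothing was moved.
    """
    others = [h for h in hosts if h is not reporting]
    if not reporting.low_on_memory() or not reporting.jobs or not others:
        return None
    youngest = max(reporting.jobs, key=lambda j: j.start_time)
    # Assumption: the target is the other host with the shortest queue.
    target = min(others, key=lambda h: len(h.jobs))
    reporting.jobs.remove(youngest)
    target.jobs.append(youngest)
    return youngest, target
```

Picking the youngest job limits the work lost if rescheduling means a restart rather than a true migration.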

Page 7: DIOS - compilers

Results: The Good

Scheduler shows potential for improvement

Lower total runtime with simple policy

[Figure: IO Benchmark (100 jobs). Total runtime vs. number of hosts (1 to 4), comparing Shortest Q and Adaptive Shortest Q.]

Page 8: DIOS - compilers

Results: The Bad

Overhead from Pintool is too high to realize gains

Pin isn’t designed for on-the-fly analysis

Could not reattach

Code caching isn’t enough

application   native   only pin   count   malloc/free   # mems   latency
heatedplate   1.00     1.88       2.65    5.43          7.45     7.26
ocean         1.00     1.48       2.87    7.84          6.04     5.81
lu            1.00     1.25       6.27    14.51         7.90     7.64
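One way to read these numbers is to separate the cost of running under Pin at all from the cost added by each kind of instrumentation. The sketch below divides each column by the "only pin" column. It assumes the table entries are runtimes normalized to native execution, which the 1.00 native column suggests but the slide does not state; the function name is hypothetical.

```python
# Values copied from the results table on this slide.
slowdown = {
    "heatedplate": {"only pin": 1.88, "count": 2.65, "malloc/free": 5.43,
                    "# mems": 7.45, "latency": 7.26},
    "ocean":       {"only pin": 1.48, "count": 2.87, "malloc/free": 7.84,
                    "# mems": 6.04, "latency": 5.81},
    "lu":          {"only pin": 1.25, "count": 6.27, "malloc/free": 14.51,
                    "# mems": 7.90, "latency": 7.64},
}


def relative_to_pin(app):
    """Slowdown of each instrumented run relative to the bare-Pin run."""
    base = slowdown[app]["only pin"]
    return {col: round(value / base, 2)
            for col, value in slowdown[app].items()
            if col != "only pin"}


for app in slowdown:
    print(app, relative_to_pin(app))
```

Under that assumption, instruction counting alone costs heatedplate about 1.41x on top of Pin, while malloc/free tracking costs lu about 11.61x on top of Pin.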

Page 9: DIOS - compilers

Results: The “Interesting”

Pintool does capture intriguing info…

Page 10: DIOS - compilers

Other Issues

Condor:

Process migration requires re-linking

Doesn’t support multithreaded applications

Other “user-level” process migration mechanisms have similar requirements

Pin:

Unable to intersperse low and high overhead with Pintool

Even the smallest overhead was not negligible

Up to almost 2x slowdown just using Pin with heatedplate and no extra instrumentation

Scheduling decisions have a bigger impact for long-running jobs

Page 11: DIOS - compilers

Conclusion: the Future of DIOS

Overhead is prohibitive (for now)

Pin needs to support reattach

Lighter instrumentation framework

However, instrumentation can capture aspects of application-specific behavior

Future Work: Pin as a process migration mechanism

Page 12: DIOS - compilers

Questions?