Deadlock Detection
Nov 26, 2012
CS 8803 FPL
Part I
• Static Deadlock Detection
• Reference:
Effective Static Deadlock Detection [ICSE’09]
• An unintended condition in a shared-memory, multi-threaded program in which:
  – a set of threads blocks forever
  – because each thread in the set waits to acquire a lock being held by another thread in the set
• This work: ignore other causes (e.g., wait/notify)
• Example
    // Thread t1
    sync (l1) { sync (l2) { … } }
    // Thread t2
    sync (l2) { sync (l1) { … } }
[Figure: threads t1, t2 and locks l1, l2]
What is a Deadlock?
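The two-thread example can be reproduced without hanging the JVM. The following is a minimal sketch, not part of the original slides: it swaps `sync` blocks for `ReentrantLock.tryLock` with a timeout, and uses two barriers to force the schedule in which each thread already holds its first lock. The class and method names (`LockOrderInversion`, `nested`, `run`) are hypothetical.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderInversion {
    // Holds `first`, then tries `second` with a timeout; returns whether `second` was obtained.
    static boolean nested(ReentrantLock first, ReentrantLock second,
                          CyclicBarrier bothHoldFirst, CyclicBarrier bothTried) {
        first.lock();
        try {
            bothHoldFirst.await();                 // force the schedule: both threads hold their first lock
            boolean got = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (got) second.unlock();
            bothTried.await();                     // keep `first` held until both threads have tried
            return got;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            first.unlock();
        }
    }

    // Runs t1 (l1 then l2) against t2 (l2 then l1); result[i] tells whether thread i got its second lock.
    static boolean[] run() {
        ReentrantLock l1 = new ReentrantLock();
        ReentrantLock l2 = new ReentrantLock();
        CyclicBarrier bothHoldFirst = new CyclicBarrier(2);
        CyclicBarrier bothTried = new CyclicBarrier(2);
        boolean[] result = new boolean[2];
        Thread t1 = new Thread(() -> result[0] = nested(l1, l2, bothHoldFirst, bothTried));
        Thread t2 = new Thread(() -> result[1] = nested(l2, l1, bothHoldFirst, bothTried));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }
}
```

With plain `sync` blocks the same schedule would block both threads forever; here neither thread obtains its second lock, and both give up after the timeout.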
• Today’s concurrent programs are rife with deadlocks
  – 6,500/198,000 (~ 3%) of bug reports in Sun’s bug database at http://bugs.sun.com are deadlocks
• Deadlocks are difficult to detect
  – Usually triggered non-deterministically, on specific thread schedules
  – Fail-stop behavior not guaranteed (some threads may be deadlocked while others continue to run)
• Fixing other concurrency bugs like races can introduce new deadlocks
  – Our past experience with reporting races: developers often ask for a deadlock checker
Motivation
• Based on finding cycles in program’s dynamic or static lock order graph
• Dynamic approaches
  – Inherently unsound
  – Inapplicable to open programs
  – Ineffective without sufficient test input data
• Static approaches
  – Type systems (e.g., Boyapati-Lee-Rinard OOPSLA’02)
    • Annotation burden often significant
  – Model checking (e.g., SPIN)
    • Does not currently scale beyond a few KLOC
  – Dataflow analysis (e.g., Engler & Ashcraft SOSP’03; Williams-Thies-Ernst ECOOP’05)
    • Scalable but highly imprecise
[Figure: threads t1, t2 and locks l1, l2]
Previous Work
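Cycle detection on a lock order graph, the basis of the approaches listed above, can be sketched as follows. This is an illustrative sketch rather than any cited tool's implementation: an edge from `held` to `acquired` records that some thread acquired `acquired` while holding `held`, and a depth-first search looks for a cycle. The class name `LockOrderGraph` and the string lock names are assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LockOrderGraph {
    // held lock -> locks acquired while it was held
    private final Map<String, Set<String>> edges = new HashMap<>();

    void addEdge(String held, String acquired) {
        edges.computeIfAbsent(held, k -> new HashSet<>()).add(acquired);
        edges.computeIfAbsent(acquired, k -> new HashSet<>());
    }

    // True if the graph contains a cycle, i.e. a potential deadlock.
    boolean hasCycle() {
        Set<String> visited = new HashSet<>();
        Set<String> onStack = new HashSet<>();
        for (String lock : edges.keySet()) {
            if (dfs(lock, visited, onStack)) return true;
        }
        return false;
    }

    private boolean dfs(String lock, Set<String> visited, Set<String> onStack) {
        if (onStack.contains(lock)) return true;   // back edge: cycle found
        if (!visited.add(lock)) return false;      // already fully explored
        onStack.add(lock);
        for (String next : edges.get(lock)) {
            if (dfs(next, visited, onStack)) return true;
        }
        onStack.remove(lock);
        return false;
    }
}
```

A cycle is only a candidate: as the following slides argue, whether the threads and locks involved can actually deadlock depends on several further conditions.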
• Deadlock freedom is a complex property
  – can t1,t2 denote different threads?
  – can l1,l4 denote same lock?
  – can t1 acquire locks l1->l2?
  – some more …
[Figure: threads t1, t2 and lock acquisitions l1, l2, l3, l4; l = abstract lock acq., t = abstract thread]
Challenges to Static Deadlock Detection
• Existing static deadlock checkers cannot check all conditions simultaneously and effectively
• But each condition can be checked separately and effectively using existing static analyses
Our Rationale
• Consider all candidate deadlocks in closed program
• Check each of six necessary conditions for each candidate to be a deadlock
• Report candidates that satisfy all six conditions
• Note: Finds only deadlocks involving 2 threads/locks
  – Deadlocks involving > 2 threads/locks rare in practice
[Figure: candidate deadlocks over threads t1, t2 and lock acquisitions l1, l2, l3, l4]
Our Approach
• may-reach(t1,l1,l2)?
• may-alias(l1,l4)?
(sync abbreviates synchronized; numbers in the left margin are source line numbers, as used in the deadlock report)
     class LogManager {
       static LogManager manager = new LogManager();
155:   Hashtable loggers = new Hashtable();
280:   sync boolean addLogger(Logger l) {
         String name = l.getName();
         if (!loggers.put(name, l))
           return false;
         // ensure l’s parents are instantiated
         for (...) {
           String pname = ...;
314:       Logger.getLogger(pname);
         }
         return true;
       }
420:   sync Logger getLogger(String name) {
         return (Logger) loggers.get(name);
       }
     }
     class Logger {
226:   static sync Logger getLogger(String name) {
         LogManager lm = LogManager.manager;
228:     Logger l = lm.getLogger(name);
         if (l == null) {
           l = new Logger(...);
231:       lm.addLogger(l);
         }
         return l;
       }
     }
     class Harness {
       static void main(String[] args) {
 11:     new Thread() { void run() {
 13:       Logger.getLogger(...);
         }}.start();
 16:     new Thread() { void run() {
 18:       LogManager.manager.addLogger(...);
         }}.start();
       }
     }
[Figure annotations: threads t1, t2 and lock acquisitions l1, l2, l3, l4 marked on the code]
Example: jdk1.4 java.util.logging
*** Stack trace of thread <Harness.java:11>:
LogManager.addLogger (LogManager.java:280)
  - this allocated at <LogManager.java:155>
  - waiting to lock {<LogManager.java:155>}
Logger.getLogger (Logger.java:231)
  - holds lock {<Logger.java:0>}
Harness$1.run (Harness.java:13)
*** Stack trace of thread <Harness.java:16>:
Logger.getLogger (Logger.java:226)
  - waiting to lock {<Logger.java:0>}
LogManager.addLogger (LogManager.java:314)
  - this allocated at <LogManager.java:155>
  - holds lock {<LogManager.java:155>}
Harness$2.run (Harness.java:18)
Example Deadlock Report
• Six necessary conditions identified experimentally
  1. Reachable
  2. Aliasing
  3. Escaping
  4. Parallel
  5. Non-reentrant
  6. Non-guarded
• Checked using four incomplete but sound whole-program static analyses
  1. Call-graph analysis
  2. May-alias analysis
  3. Thread-escape analysis
  4. May-happen-in-parallel analysis
• Conditions 1–4
  – Relatively language independent
  – Incomplete but sound checks
• Conditions 5–6
  – Widely-used Java locking idioms
  – Incomplete and unsound checks (a sound check needs must-alias analysis)
Our Approach
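The overall scheme (enumerate candidates, check each necessary condition, report the survivors) can be sketched as a simple filter. This is an illustrative sketch, not the tool's code: the `Candidate` tuple follows the slides' naming (t1 acquires l1 then l2, t2 acquires l3 then l4), while `DeadlockReporter` and the predicate list are hypothetical stand-ins for the six condition checks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Candidate deadlock: thread t1 acquires l1 then l2; thread t2 acquires l3 then l4.
class Candidate {
    final String t1, l1, l2, t2, l3, l4;

    Candidate(String t1, String l1, String l2, String t2, String l3, String l4) {
        this.t1 = t1; this.l1 = l1; this.l2 = l2;
        this.t2 = t2; this.l3 = l3; this.l4 = l4;
    }
}

class DeadlockReporter {
    // Keeps only the candidates that satisfy every necessary condition.
    static List<Candidate> report(List<Candidate> candidates,
                                  List<Predicate<Candidate>> conditions) {
        List<Candidate> reported = new ArrayList<>();
        for (Candidate c : candidates) {
            boolean all = true;
            for (Predicate<Candidate> condition : conditions) {
                if (!condition.test(c)) {   // one failed condition rules the candidate out
                    all = false;
                    break;
                }
            }
            if (all) reported.add(c);
        }
        return reported;
    }
}
```

Because each condition is an independent predicate, each can be backed by a different off-the-shelf analysis, which is the rationale given earlier in the deck.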
• Property: In some execution:
  – can a thread abstracted by t1 reach l1
  – and after acquiring lock at l1, proceed to reach l2 while holding that lock?
  – and similarly for t2, l3, l4
• Solution: Use call-graph analysis
  – k-object-sensitive [Milanova-Rountev-Ryder ISSTA’03]
Condition 1: Reachable
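A much-simplified version of the reachability check can be written as a search over a call graph. The sketch below is context-insensitive (not k-object-sensitive, as the slide's analysis is) and assumes every lock acquisition is a synchronized method, so "reach l2 while holding l1" becomes "the method acquiring l2 is reachable from the method acquiring l1". `MayReach` and the string method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MayReach {
    // True if `to` is reachable from `from` in the call graph (caller -> callees).
    static boolean reaches(Map<String, List<String>> callGraph, String from, String to) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(from);
        while (!work.isEmpty()) {
            String method = work.pop();
            if (method.equals(to)) return true;
            if (!seen.add(method)) continue;       // already expanded
            for (String callee : callGraph.getOrDefault(method, Collections.emptyList())) {
                work.push(callee);
            }
        }
        return false;
    }

    // may-reach(t, l1, l2): the thread's entry point reaches the method that acquires l1,
    // and that method in turn reaches the method that acquires l2.
    static boolean mayReach(Map<String, List<String>> callGraph,
                            String threadEntry, String l1Method, String l2Method) {
        return reaches(callGraph, threadEntry, l1Method)
            && reaches(callGraph, l1Method, l2Method);
    }
}
```

On the java.util.logging example, a call graph with edges Harness$1.run → Logger.getLogger → LogManager.addLogger satisfies the check for the first thread.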
• Property: In some execution:
  – can a lock acquired at l1 be the same as a lock acquired at l4?
  – and similarly for l2, l3
• Solution: Use may-alias analysis
  – k-object-sensitive [Milanova-Rountev-Ryder ISSTA’03]
Condition 2: Aliasing
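Given points-to information, the may-alias check reduces to a set intersection. The sketch below assumes a precomputed map from each lock-acquisition site to the abstract objects it may lock; how that map is computed (for example, by a k-object-sensitive analysis) is outside the sketch. `MayAlias` and the abstract object names are hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

class MayAlias {
    // pointsTo maps a lock-acquisition site to the abstract objects it may lock.
    // Two sites may alias if their points-to sets share at least one abstract object.
    static boolean mayAlias(Map<String, Set<String>> pointsTo, String siteA, String siteB) {
        Set<String> a = pointsTo.getOrDefault(siteA, Collections.emptySet());
        Set<String> b = pointsTo.getOrDefault(siteB, Collections.emptySet());
        return !Collections.disjoint(a, b);
    }
}
```

For a candidate to survive this condition, both `mayAlias(l1, l4)` and `mayAlias(l2, l3)` must hold.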
• Property: In some execution:
  – can a lock acquired at l1 be thread-shared?
  – and similarly for each of l2, l3, l4
• Solution: Use thread-escape analysis
Condition 3: Escaping
• Property: In some execution:
  – can different threads abstracted by t1 and t2
  – simultaneously reach l2 and l4?
• Solution: Use may-happen-in-parallel analysis
  – Does not model full happens-before relation
  – Models only thread fork construct
  – Other conditions model other constructs
Condition 4: Parallel
Benchmark     LOC       Classes  Methods  Syncs  Time
moldyn        31,917    63       238      12     4m48s
montecarlo    157,098   509      3447     190    7m53s
raytracer     32,576    73       287      16     4m51s
tsp           154,288   495      3335     189    7m48s
sor           32,247    57       208      5      4m48s
hedc          160,071   530      3552     204    21m15s
weblech       184,098   656      4620     238    32m02s
jspider       159,494   557      3595     205    15m34s
jigsaw        154,584   497      3346     184    15m23s
ftp           180,904   642      4383     252    35m55s
dbcp          168,018   536      3602     227    16m04s
cache4j       34,603    72       218      7      4m43s
logging       167,923   563      3852     258    9m01s
collections   38,961    124      712      55     5m42s
Benchmarks
Benchmark     Deadlocks (0-cfa)  Deadlocks (k-obj.)  Lock type pairs (total)  Lock type pairs (real)
moldyn        0                  0                   0                        0
montecarlo    0                  0                   0                        0
raytracer     0                  0                   0                        0
tsp           0                  0                   0                        0
sor           0                  0                   0                        0
hedc          7,552              2,358               22                       19
weblech       4,969              794                 22                       19
jspider       725                4                   1                        0
jigsaw        23                 18                  3                        3
ftp           16,259             3,020               33                       24
dbcp          320                16                  4                        3
cache4j       0                  0                   0                        0
logging       4,134              4,134               98                       94
collections   598                598                 16                       16
Experimental Results
Individual Analysis Contributions
• Novel approach to static deadlock detection for Java
  – Checks six necessary conditions for a deadlock
  – Uses four off-the-shelf static analyses
• Neither sound nor complete, but effective in practice
  – Applied to suite of 14 multi-threaded Java programs comprising over 1.5 MLOC
  – Found all known deadlocks as well as previously unknown ones, with few false alarms
Conclusion
Part II
• Dynamic Deadlock Detection
• Reference:
An Effective Dynamic Analysis Technique for Detecting Generalized Deadlocks [FSE’10]
Motivation
• Most previous deadlock detection work has focused on resource deadlocks
• Example
    // Thread T1              // Thread T2
    sync(L1) {                sync(L2) {
      sync(L2) {                sync(L1) {
        …                         …
      }                         }
    }                         }
[Figure: threads T1, T2 and locks L1, L2]
Motivation
• Other kinds of deadlocks, e.g. communication deadlocks, are equally notorious
• Example
    // Thread T1              // Thread T2
    if (!b) {                 b = true;
      sync(L) {               sync(L) {
        L.wait();               L.notify();
      }                       }
    }
    b is initially false
[Figure: T1 executes if(!b), wait L; T2 executes b = true, notify L]
Goal
• Build a dynamic analysis based tool that:
  – detects communication deadlocks
  – scales to large programs
  – has a low false positive rate
Our Initial Effort
• Take cue from existing dynamic analyses for other concurrency errors
• Existing dynamic analyses check for violation of a programming idiom
  – Races:
    • every shared variable is consistently protected by a lock
  – Resource deadlocks:
    • no cycle in lock ordering graph
  – Atomicity violations:
    • atomic blocks should have the pattern (R+B)*N(L+B)*
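As an illustration of idiom-based checking, the race idiom ("every shared variable is consistently protected by a lock") can be checked by intersecting, per variable, the sets of locks held at each access, in the style of lockset-based race detectors. This is a minimal sketch; `LocksetChecker` and its `access` method are hypothetical names, and real detectors also track thread ownership and read/write states.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LocksetChecker {
    // variable -> locks that have been held on every access seen so far
    private final Map<String, Set<String>> candidates = new HashMap<>();

    // Records one access. Returns false once no single lock has protected all accesses,
    // i.e. the "consistently protected by a lock" idiom is violated.
    boolean access(String variable, Set<String> locksHeld) {
        Set<String> locks = candidates.get(variable);
        if (locks == null) {
            locks = new HashSet<>(locksHeld);      // first access: all held locks are candidates
            candidates.put(variable, locks);
        } else {
            locks.retainAll(locksHeld);            // keep only locks held on every access
        }
        return !locks.isEmpty();
    }
}
```

The check never looks at the order of events across threads, which is what lets a single observed run stand in for many schedules.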
Our Initial Effort
• What programming idiom should we check for communication deadlocks?
Our Initial Effort
• Recommended usage of condition variables
    // Thread T1                // Thread T2
    sync (L) {                  sync (L) {
      while (!cond)               cond = true;
        L.wait();                 L.notifyAll();
      assert (cond == true);    }
    }
An Example
• Recommended usage of condition variables
    // Thread T1                // Thread T2
    sync (L) {                  sync (L) {
      while (list.isEmpty())      list.add(...);
        L.wait();                 L.notifyAll();
      … = list.remove();        }
    }
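The recommended idiom, written out as compilable Java, might look like the sketch below. The class `GuardedQueue` and its `demo` helper are hypothetical; the point is that the condition is re-tested in a `while` loop and that every access to the shared list happens inside `synchronized` on the same lock.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class GuardedQueue<T> {
    private final Object lock = new Object();
    private final Deque<T> items = new ArrayDeque<>();

    // Consumer side: wait in a loop until the condition (non-empty list) holds.
    T take() throws InterruptedException {
        synchronized (lock) {
            while (items.isEmpty()) {
                lock.wait();
            }
            return items.removeFirst();
        }
    }

    // Producer side: change the condition and notify while holding the same lock.
    void put(T item) {
        synchronized (lock) {
            items.addLast(item);
            lock.notifyAll();
        }
    }

    // Small driver: one consumer thread takes what the calling thread puts.
    static String demo() {
        GuardedQueue<String> queue = new GuardedQueue<>();
        String[] received = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                received[0] = queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        queue.put("item");
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0];
    }
}
```

The driver terminates whichever thread runs first: if `put` happens before `take`, the consumer sees a non-empty list and never waits.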
Violation of Idiom as Deadlock
• Example
    // Thread T1              // Thread T2
    if (!b)                   b = true;
      sync (L)                sync (L)
        L.wait();               L.notifyAll();
  – Must use while, not if
  – Accesses to b must be inside sync
Satisfaction of Idiom as Deadlock
• Example
    // Thread T1              // Thread T2
    sync (L1)                 sync (L1)
      sync (L2)                 sync (L2)
        while (!b)                L2.notifyAll();
          L2.wait();
  – No violation of idiom, but still deadlocks!
=> Recommended usage pattern (or idiom) based checking does not work
Revisiting Existing Analyses
• Relax the dependencies between relevant events from different threads
  – verify all possible event orderings for errors
  – use data structures to check idioms (vector clocks, lock-graphs etc.) to implicitly verify all event orderings
Revisiting Existing Analyses
• Programming idiom-based checking does not work for communication deadlocks
• Nevertheless, we can explicitly verify all orderings of relevant events for deadlocks
Trace Program
    // Thread T1              // Thread T2
    if (!b) {                 b = true;
      sync (L) {              sync (L) {
        L.wait ();              L.notify ();
      }                       }
    }
    b is initially false
[Figure: observed trace. T1: lock L, wait L, unlock L. T2: lock L, notify L, unlock L.]
Trace Program
[Figure: observed trace. T1: lock L, wait L, unlock L. T2: lock L, notify L, unlock L.]
    Thread T1 {
      lock L;
      wait L;
      unlock L;
    }
    Thread T2 {
      lock L;
      notify L;
      unlock L;
    }
Trace Program
    Thread T1 {               Thread T2 {
      lock L;                   lock L;
      wait L;          ||       notify L;
      unlock L;                 unlock L;
    }                         }
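Explicitly verifying all orderings of a trace program can be sketched as a small state-space search. The code below is an illustrative sketch, not CHECKMATE's implementation: it handles a single lock L, assumes the trace is well formed (wait, notify and unlock are issued only by the lock holder), and reports whether some interleaving ends with an unfinished thread and no enabled event. `TraceProgramExplorer` and its members are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TraceProgramExplorer {
    enum Op { LOCK, WAIT, NOTIFY, UNLOCK }

    // Thread status: running, blocked in wait, or notified and needing to re-acquire L.
    static final int RUN = 0, WAITING = 1, NOTIFIED = 2;

    // True if some interleaving reaches a state where no thread can move but not all have finished.
    static boolean canDeadlock(List<List<Op>> program) {
        int n = program.size();
        return explore(program, new int[n], new int[n], -1, new HashSet<>());
    }

    private static boolean explore(List<List<Op>> program, int[] pc, int[] status,
                                   int owner, Set<String> seen) {
        if (!seen.add(Arrays.toString(pc) + Arrays.toString(status) + owner)) return false;
        boolean anyEnabled = false, allDone = true;
        for (int t = 0; t < pc.length; t++) {
            if (pc[t] == program.get(t).size()) continue;   // thread t has finished
            allDone = false;
            if (status[t] == WAITING) continue;             // blocked until some notify wakes it
            Op op = program.get(t).get(pc[t]);
            boolean needsFreeLock = status[t] == NOTIFIED || op == Op.LOCK;
            if (needsFreeLock && owner != -1) continue;     // L is held by another thread
            anyEnabled = true;
            int[] pc2 = pc.clone();
            int[] status2 = status.clone();
            int owner2 = owner;
            if (needsFreeLock) {                            // acquire, or re-acquire after wait
                owner2 = t; status2[t] = RUN; pc2[t]++;
            } else if (op == Op.UNLOCK) {
                owner2 = -1; pc2[t]++;
            } else if (op == Op.WAIT) {                     // release L and block
                owner2 = -1; status2[t] = WAITING;
            } else {                                        // NOTIFY: wake any one waiter, if present
                pc2[t]++;
                boolean woke = false;
                for (int w = 0; w < status.length; w++) {
                    if (status[w] != WAITING) continue;
                    woke = true;
                    int[] status3 = status2.clone();
                    status3[w] = NOTIFIED;
                    if (explore(program, pc2, status3, owner2, seen)) return true;
                }
                if (woke) continue;                         // otherwise the notify is lost
            }
            if (explore(program, pc2, status2, owner2, seen)) return true;
        }
        return !anyEnabled && !allDone;
    }
}
```

For the trace program above, the search finds the ordering in which T2 runs to completion first, so its notify is lost and T1 then waits forever.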
Trace Program
• Built out of only a subset of events
  – usually much smaller than the original program
• Throws away a lot of dependencies between threads
  – could give false positives
  – but increases coverage
    // Thread T1              // Thread T2
    if (!b) {                 b = true;
      sync (L) {              sync (L) {
        L.wait ();              L.notify ();
      }                       }
    }
    b is initially false
[Figure: trace with added events. T1: if (!b), lock L, wait L, unlock L. T2: b = true, lock L, notify L, unlock L.]
Trace Program: Add Dependencies
[Figure: trace with added events. T1: if (!b), lock L, wait L, unlock L. T2: b = true, lock L, notify L, unlock L.]
    Thread T1 {
      if (!b) {
        lock L;
        wait L;
        unlock L;
      }
    }
    Thread T2 {
      b = true;
      lock L;
      notify L;
      unlock L;
    }
Trace Program: Add Dependencies
Trace Program: Add Predictivity
• Use static analysis to add to the predictive power of the trace program
    // Thread T1              // Thread T2
    @ !b => L.wait()
    if (!b) {                 b = true;
      sync (L) {              sync (L) {
        L.wait ();              L.notify ();
      }                       }
    }
    b is initially false
    Thread T1 {
      if (!b) {
        lock L;
        wait L;
        unlock L;
      }
    }
• Effective for concurrency errors that cannot be detected using a programming idiom
  – communication deadlocks, deadlocks due to exceptions, …
    // Thread T1              // Thread T2
    while (!b) {              try {
      sync (L) {                foo();      // can throw an exception
        L.wait();               b = true;
      }                         sync (L) { L.notify(); }
    }                         } catch (Exception e) {…}
    b is initially false
Trace Program: Other Errors
• Implemented for deadlock detection
  – both communication and resource deadlocks
• Built a prototype tool for Java called CHECKMATE
• Applied to several Java libraries and applications
  – log4j, pool, felix, lucene, jgroups, jruby, …
• Found both previously known and unknown deadlocks (17 in total)
Implementation and Evaluation
Conclusion
• CHECKMATE is a novel dynamic analysis for finding deadlocks
  – both resource and communication deadlocks
• Effective on several real-world Java benchmarks
• Trace program based approach is generic
  – can be applied to other errors, e.g. deadlocks because of exceptions
Did Not Cover Today …
• Deadlock Detection in Message-Passing Programs
  – must model many variants of message sends/receives
• Dynamic Deadlock Avoidance
  – unique to deadlock errors (cannot, e.g., “avoid” buffer overruns)
  – see Dimmunix OSDI’08 paper (http://dimmunix.epfl.ch/)
• Dynamic Deadlock Detection by Controlling Thread Schedules
  – CHESS (http://research.microsoft.com/en-us/projects/chess/)
  – CalFuzzer (http://srl.cs.berkeley.edu/~ksen/calfuzzer/)
• Type-based Deadlock Detection
  – statically check lock-order graph (see OOPSLA’02 paper)