Supervisory+Control+for+ Avoidance+of+Concurrency+Bugs+ · source code control flow graph Petri net...
Transcript of Supervisory+Control+for+ Avoidance+of+Concurrency+Bugs+ · source code control flow graph Petri net...
Supervisory Control for Avoidance of Concurrency Bugs
Stéphane Lafortune University of Michigan
ExCAPE Summer School
12 June 2013
2013/06/12 -‐ ExCAPE Summer School Lafortune 1
Acknowledgements
• Yin Wang, Facebook • Terence Kelly, HP Labs • Hongwei Liao, GE Research • Jason Stanley, Hyoun Kyu Cho, ScoQ Mahlke, University of Michigan
• Ahmed Nazeem, Spyros RevelioTs, Georgia Tech
• à Gadara Project
2013/06/12 -‐ ExCAPE Summer School Lafortune 2
Concurrency Bugs
• MulTcore architectures in computer hardware MulTthreaded programming: Notoriously difficult
“AYer 25 years of research, we’re no closer to solving the ‘parallel programming problem’”
-‐-‐-‐ Tim MaQson (Intel) • Data sharing in mulTthreaded programming:
– Races – Atomicity violaTons – Deadlock
2013/06/12 -‐ ExCAPE Summer School Lafortune 3
Explosion of Research • New libraries, languages, features
– Intel TBB, Erlang, Cilk++, atomic secTons, Trans. Memory, OpenMP
• Tools – StaTc analysis, tesTng tools
• Coverity™, Locksmith • Klee, CHESS, CheckFence
– RunTme analysis • Eraser, Intel Thread Checker™
– Post-‐mortem analysis • Triage, CrashRpt
2013/06/12 -‐ ExCAPE Summer School Lafortune 4
Control Engineering ContribuTon
• Formal approach to model, analyze, and control mulTthreaded programs: controller synthesis
• Model-‐based approach using Supervisory Discrete Control of
Discrete Event Systems – Provably prevent “all” Circular Mutex Wait deadlocks (correctness) – Maximally permissive control (opTmality) – ApplicaTon scenarios:
• Rapid prototype development • Post-‐release bugs
• Modeling formalism: Petri nets – Graphical model of program, no explicit state space enumeraTon – Capture system concurrency and resource allocaTon – Structural properTes can be exploited for controller synthesis
2013/06/12 -‐ ExCAPE Summer School Lafortune 5
Control Engineering and Computer Science
• ConTnuous control applied to computer systems problems; [Hellerstein et al., 2004]
• Discrete event control and soYware systems
– Iordache & Antsaklis – Marchand, RuQen, et al. – Benveniste, et al. – Tripakis et al. – Dingel, Rudie, et al.
2013/06/12 -‐ ExCAPE Summer School Lafortune 6
Outline
• IntroducTon ✓ • The Gadara Approach: Main Features • Synthesis of Control Logic for Deadlock Avoidance: Technical Details
• Case Studies and Control Logic ImplementaTon • Discussion and Conclusion
2013/06/12 -‐ ExCAPE Summer School Lafortune 7
Gadara Project Architecture
C program source code
control flow graph
Petri net control logic
compile
translation control logic
synthesis
inst
rum
enta
tion
Instrumented binary
prog
ram
compile
observe
control
observe
control
observe
control
cont
rol l
ogic
offline online
2013/06/12 -‐ ExCAPE Summer School Lafortune 8
An example of the Gadara process 2013/06/12 -‐ ExCAPE Summer School Lafortune 9
Discrete Event Systems: Key Features • Discrete State Space
• {on, off, broken} • {empty, 1, 2, 3, …} • {ready_to_send, waiTng_for_ack, …}
• Event-‐driven dynamics • {start, stop, break, repair} • {arrival, departure} • {send_packet, receive_ack, Tmeout}
• System trajectories: “traces” (or “strings”) of events • oYen, an infinite set [language]
2013/06/12 -‐ ExCAPE Summer School Lafortune 10
DES Modeling Formalisms (in control engineering)
• Automata (labeled transiTon systems) • simple, intuiTve; analyTcal power • extensions to Tmed and hybrid • lack of structure and scalability
• Petri Nets • more structure; analysis more difficult in general but more powerful for special classes
2013/06/12 -‐ ExCAPE Summer School Lafortune 11
0,ACK,1
1,1,1
ack+1
0,EMPTY,1
lost
1,EMPTY,1
lost
1,ACK,0
new1
0,0,1
timeout+0
repeat0
lost
0,0,0
new0
0,EMPTY,0
losttimeout+0
timeout+1
ack+0
1,EMPTY,0
lost
1,1,0
repeat1
lost timeout+1
NoTon of marked state (double circle)
An Automaton
2013/06/12 -‐ ExCAPE Summer School Lafortune 12
Modeling: Petri Nets Preliminaries • Petri net structure A Petri net is a biparTte graph: two types of nodes
place
transiTon arc
token
In general, each arc has an arc weight. A special case: ordinary net
2013/06/12 -‐ ExCAPE Summer School Lafortune 13
Modeling: Petri Nets Preliminaries • Petri net dynamics
transiTon enabled
fire
t1
t2 t3
p1
p2
p3 p4
2013/06/12 -‐ ExCAPE Summer School Lafortune 14
Modeling: Petri Nets Preliminaries
• Marking (State): |P|X1 vector
• Incidence matrix of PN structure: |P| X |T| matrix, denoted by D
t1
t2 t3
p1
p2
p3 p4
0
10
00
M
⎡ ⎤⎢ ⎥⎢ ⎥=⎢ ⎥⎢ ⎥⎣ ⎦
p1
p2
p3
p4
2013/06/12 -‐ ExCAPE Summer School Lafortune 15
How are DES models obtained?
• First principles modeling (domain knowledge)
• AbstracTon of conTnuous system (bisimulaTon or other property)
• SoYware: directly from source code
2013/06/12 -‐ ExCAPE Summer School Lafortune 16
Concurrent SoYware Failures
• Locks are used to manage concurrent accesses to shared data and avoid races
• Enforcement of atomicity condiTons requires delaying lock release unTl aYer criTcal secTon
• Holding on to locks can lead to deadlock – Circular wait among a set of threads – Mutual exclusion locks: Mutex
• CMW: Circular-‐Mutex-‐Wait deadlocks – Other kinds of locks: Reader-‐Writer
2013/06/12 -‐ ExCAPE Summer School Lafortune 17
Petri Nets and Locks
ready to acquire
acquire lock
hold lock
release lock
done
lock
the PN that models lock acquisition & release
Kavi et al., IJoPP 2002 Murata, Proc. IEEE 1989
2013/06/12 -‐ ExCAPE Summer School Lafortune 18
Modeling: Programs Modeled by Petri Nets (from enhanced Control Flow Graph)
• Place A set of lines of code of program.
• TransiTon Lock allocaTon or release; and program control flow
• Token Thread execuTon or available lock;
MulTple threads are represented by mulTple tokens.
2013/06/12 -‐ ExCAPE Summer School Lafortune 19
Dining Philosophers void * philosopher(void *arg) { … if (RAND_MAX/2 > random()) { /* grab A first */ pthread_mutex_lock(&forkA); pthread_mutex_lock(&forkB); } else { /* grab B first */ pthread_mutex_lock(&forkB); pthread_mutex_lock(&forkA); } eat(); pthread_mutex_unlock(&forkB); pthread_mutex_unlock(&forkA); ... }
int main(int argc, char *argv[]) { ... pthread_create(&p1, NULL, philosopher, NULL); pthread_create(&p2, NULL, philosopher, NULL); ... }
2013/06/12 -‐ ExCAPE Summer School Lafortune 20
Dining Philosophers
void * philosopher(void *arg) { … if (RAND_MAX/2 > random()) { /* grab A first */ pthread_mutex_lock(&forkA); pthread_mutex_lock(&forkB); } else { /* grab B first */ pthread_mutex_lock(&forkB); pthread_mutex_lock(&forkA); } eat(); pthread_mutex_unlock(&forkB); pthread_mutex_unlock(&forkA); ... }
start
lock(A)lock(B)
lock(B)lock(A)
eat()unlock(B)unlock(A)
if else
CFG
2013/06/12 -‐ ExCAPE Summer School Lafortune 21
Dining Philosophers start
lock(A)lock(B)
lock(B)lock(A)
eat()unlock(B)unlock(A)
if else
CFG
••
• •
• •
•
•
if else
lock(A)
lock(B)
lock(B)
lock(A)
unlock(B)
unlock(A)
A
B
No transiTon enabled. Deadlock!
2013/06/12 -‐ ExCAPE Summer School Lafortune 22
Typical Approaches
• Programmers use coarse-‐grain locking and manage lock acquisiTon ordering
• Recent CS literature: – Rx: Restart the program from a recent checkpoint – Healing: Add more locks to limit concurrency, by heurisTcs
– DImmunix: delay lock allocaTon if it has caused deadlock before
2013/06/12 -‐ ExCAPE Summer School Lafortune 23
Modeling: A Deadlock Example in Linux Kernel
Example
2013/06/12 -‐ ExCAPE Summer School Lafortune 24
CMW Deadlock: Petri net viewpoint
2013/06/12 -‐ ExCAPE Summer School Lafortune 25
Analysis of this model, which is restricted to the three threads of interest, reveals two total-‐deadlock markings that are reachable from the iniTal marking: (i) The first total-‐deadlock marking is M1 , where there is one token in p12 , one in p22 , and one in p33 , while all other places are empty. At marking M1 , all three threads are involved in the deadlock. (ii) The second total-‐deadlock marking is M2 , where there is one token in p14 , one in p22 , and one in p03 , while all other places are empty. At marking M2 , only Threads 1 and 2 are involved in the deadlock
2013/06/12 -‐ ExCAPE Summer School Lafortune 26
CMW Deadlocks: Automata viewpoint
Deadlock state
Livelock states
Blocking states
IniTal and Marked state
Non-‐blockingness: no ``red’’ states reachable
Another marked state
2013/06/12 -‐ ExCAPE Summer School Lafortune 27
Supervisory Control for CMW deadlock avoidance
• This is an instance of SSCP from Stravos Tripakis’ lecture this morning: – DES Plant G is the reachability graph of the Petri net under consideraTon;
– The only marked state is the iniTal state – Must find maximally-‐permissive non-‐blocking supervisor – There are uncontrollable events – The soluTon is called the Supremal Controllable Sublanguage in Supervisory Control Theory; [Ramadge & Wonham, 1987]
– Complexity is worst-‐case quadraTc in the number of states of G
2013/06/12 -‐ ExCAPE Summer School Lafortune 28
Uncontrollable transiTons
Example
2013/06/12 -‐ ExCAPE Summer School Lafortune 29
Supervisory Control for CMW deadlock avoidance
• Working with Automata models, or equivalently with the Reachability Graph of the Petri net, presents two challenges: – Off-‐line ComputaJonal Burden: many reachable states to explicitly represent
– Run-‐Jme Overhead: control acTon may need to be updated at each event occurrence: Supervisor: E* à Power(Ec)
2013/06/12 -‐ ExCAPE Summer School Lafortune 30
Supervisory Control for CMW deadlock avoidance
• We have developed two approaches to address the preceding challenges: – ICOG: structure-‐based control logic synthesis – MSCL: classificaTon-‐based control logic synthesis
2013/06/12 -‐ ExCAPE Summer School Lafortune 31
Outline
• IntroducTon ✓ • The Gadara Approach: Main Features ✓ • Synthesis of Control Logic for Deadlock Avoidance: Technical Details
• Case Studies and Control Logic ImplementaTon • Discussion and Conclusion
2013/06/12 -‐ ExCAPE Summer School Lafortune 32
Petri net Control: Enforcing Linear SpecificaTon by Control Place
• Supervision Based on Place Invariants (SBPI) [Yamalidou et al. 1996], [Moody et al. 1998], [Iordache et al. 2006]
(1) Control specification:
(2) Control synthesis: a new control (aka monitor) place pc with
(3) The supervision is maximally permissive.
Tl M b≤
c
TpD l D= −
0 0( ) TcM p b l M= −
(connectivity to the original net) (initial tokens)
2013/06/12 -‐ ExCAPE Summer School Lafortune 33
SBPI-‐based Control
• Take the linear inequality: – Sum[green places] greater than or equal to one
– Results in red “control” (or monitor) place
– Deadlock avoided!
••
•
•
if else
lock(A)
lock(B)
lock(B)
lock(A)
unlock(B)
unlock(A)
A
B
•
2013/06/12 -‐ ExCAPE Summer School Lafortune 34
SBPI-‐based Control
• How was – Sum[green places] greater than of equal to one
obtained?
••
•
•
if else
lock(A)
lock(B)
lock(B)
lock(A)
unlock(B)
unlock(A)
A
B
•
2013/06/12 -‐ ExCAPE Summer School Lafortune 35
• For a program to be deadlock-‐free, we want its Petri net model to be live (e.g., reversible: always able to return to its iniTal state)
• The key is to map liveness of the Petri net, a behavioral property, to a structural property of the net
• Key noTon: Siphon
2013/06/12 -‐ ExCAPE Summer School Lafortune 36
Analysis: Siphon – Structural Property
DefiniTon and Property
{Input Transitions} {Output Transitions}⊆
n Siphon A set of places whose
n Property
If a siphon becomes empty, it remains empty forever à Write a linear inequality that guarantees the siphon is not empty!
2013/06/12 -‐ ExCAPE Summer School Lafortune 37
Dining Philosophers
• •
if else
lock(A)
lock(B)
llck(B)
lock(A)
unlock(B)
unlock(A)
A
B • Green places
form siphon • Deadlock when
siphon is empty
2013/06/12 -‐ ExCAPE Summer School Lafortune 38
Siphon Based Control
• Siphon is a set of places that can lose tokens permanently – structural property – related to deadlock
• Using SBPI, synthesize control place to prevent empty siphon – linear algebra – maximally permissive
••
•
•
if else
lock(A)
lock(B)
lock(B)
lock(A)
unlock(B)
unlock(A)
A
B
•
2013/06/12 -‐ ExCAPE Summer School Lafortune 39
Siphon Based Control
• Control logic is – fine-‐grained – highly concurrent – easy to implement
••
•
•
if else
lock(A)
lock(B)
lock(B)
lock(A)
unlock(B)
unlock(A)
A
B
•
2013/06/12 -‐ ExCAPE Summer School Lafortune 40
What issues arise?
• Establish precise connecTon between liveness of Petri net and certain kinds of bad siphons
• Build a control strategy based on using SBPI to avoid bad siphons
• IteraTons may be necessary: – Convergence must be established
• Uncontrollability of some transiTons must be handled
2013/06/12 -‐ ExCAPE Summer School Lafortune 41
Outline
• IntroducTon ✓ • The Gadara Approach: Main Features ✓ • Synthesis of Control Logic for Deadlock Avoidance: Technical Details
• Case Studies and Control Logic ImplementaTon • Discussion and Conclusion
2013/06/12 -‐ ExCAPE Summer School Lafortune 42
Modeling: DefiniTon of Gadara Nets
• Ordinary net: all arcs have weight 1
CondiTon 1
• Three types of places:
(1) P0: idle places
(2) PR: resource places
(3) PS: operaTon places
2013/06/12 -‐ ExCAPE Summer School Lafortune 43
Modeling: DefiniTon of Gadara Nets
CondiTon 2 • Set of transiTons
CondiTon 3
• Each process subnet is strongly connected state machine: single arc in/out of each transiTon
CondiTon 4 • Branch selecTon is not constrained by resources.
if else
2013/06/12 -‐ ExCAPE Summer School Lafortune 44
Modeling: DefiniTon of Gadara Nets
CondiTon 5 • All resources are conservaTve in the net; noTon of P-‐semiflow.
CondiTon 6
• IniTal marking.
CondiTon 7 • Any operaTon place is involved with at least one type of resource.
2013/06/12 -‐ ExCAPE Summer School Lafortune 45
Analysis: Liveness and Siphons
The (ordinary) Gadara net is live iff contains no empty siphons.
GN 0( , )GR N M
Behavioral
Property
Structural
Property
ImplicaGons: Program is deadlock-‐free (Goal)
Gadara net is live (Behavioral Property)
Gadara net cannot reach a problemaTc siphon (Structural Property)
⇔⇔
[CDC 2009; J-‐DEDS 2013]
2013/06/12 -‐ ExCAPE Summer School Lafortune 46
Control: Challenges
• The need to iterate – Synthesized control place à generalized resource place – Can introduce new siphons, if coupled with exisTng resources – The new siphons are not considered in previous iteraTons
n Non-‐ordinary nets q Ordinary net à add control place à potenTally non-‐ordinary net q The problem of opTmal control in non-‐ordinary net based on siphons
(structural analysis) is not well-‐resolved yet [Li et al. 2009]
2013/06/12 -‐ ExCAPE Summer School Lafortune 47
Controlled Gadara Net
• Control place is a generalized resource place, with its own semiflow
••
•
•
if else
lock(A)
lock(B)
lock(B)
lock(A)
unlock(B)
unlock(A)
A
B
•
2013/06/12 -‐ ExCAPE Summer School Lafortune 48
Why iteraTons are necessary: Automata viewpoint
Deadlock state
Livelock states
Blocking states
IniTal and Marked state
Nonblocking: no ``red’’ states reachable
2013/06/12 -‐ ExCAPE Summer School Lafortune 49
Why the controlled net may not remain ordinary
Supervision Based on Place Invariants (SBPI)
(1) Control specification:
(2) Control synthesis: a new control (aka monitor) place pc with
(3) The supervision is maximally permissive.
Tl M b≤
c
TpD l D= −
0 0( ) TcM p b l M= −
(connectivity to the original net) (initial tokens)
2013/06/12 -‐ ExCAPE Summer School Lafortune 50
Control: Liveness Property -‐ General Case
The Controlled Gadara net is live iff
contains no Resource-Induced Deadly Marked siphons.
cGN 0( , )c
GR N M
The Gadara net is live iff contains no empty siphons.
GN 0( , )GR N M1
2
ImplicaGons (SGll Apply) Program is deadlock-‐free (Goal)
Gadara net is live (Behavioral Property)
Gadara net cannot reach a problemaTc siphon (Structural Property)
⇔⇔
[CDC 2009; J-‐DEDS]
2013/06/12 -‐ ExCAPE Summer School Lafortune 51
Control: The IteraTve Control Methodology – ICOG
Main ProperTes of ICOG: q IteraTve control based on structural analysis q OpTmal: Maximally-‐permissive and Liveness-‐enforcing (MPLE) q Finite convergence
2013/06/12 -‐ ExCAPE Summer School Lafortune 52
The UCCOR Algorithm: Controlling the idenTfied bad siphons
Some Important ProperTes:
q OpTmal: MPLE
q Applies to non-‐ordinary Gadara nets
q Unit arc weights for synthesized control places
• Covering: A generalized marking allowing for ``don’t care’’ components
• Enables a compact representaTon of a set of states that contains the idenTfied deadlock
2013/06/12 -‐ ExCAPE Summer School Lafortune 53
Control: The Customized ICOG-‐O Methodology Main ProperTes of ICOG-‐O:
q IteraTve control based on structural analysis q OpTmal: Maximally-‐permissive and Liveness-‐enforcing (MPLE) q Finite convergence q No redundant control logic and remains ordinary
MIP formulation
UCCOR-O
2013/06/12 -‐ ExCAPE Summer School Lafortune 54
MSCL: AlternaTve Approach
• Minimizing the overhead of the control logic is essenTal in this applicaTon domain
• ICOG does not guarantee that the number of control places is minimized
• MSCL – Marking SeparaTon using linear CLassifier: an alternaTve control logic synthesis method that leverages SBPI, but in a single overall applicaTon with provably minimum number of monitor places – Relies on classificaTon theory to separate safe and unsafe states; requires (parTal) enumeraTon of such states first
– Details omiQed here
2013/06/12 -‐ ExCAPE Summer School Lafortune 55
Outline
• IntroducTon ✓ • The Gadara Approach: Main Features ✓ • Synthesis of Control Logic for Deadlock Avoidance: Technical Details ✓
• Case Studies and Control Logic ImplementaJon • Discussion and Conclusion
2013/06/12 -‐ ExCAPE Summer School Lafortune 56
Linux kernel deadlock
2013/06/12 -‐ ExCAPE Summer School Lafortune 57
Linux kernel deadlock – ICOG
2013/06/12 -‐ ExCAPE Summer School Lafortune 58
Linux kernel deadlock – MSCL pc
2013/06/12 -‐ ExCAPE Summer School Lafortune 59
Linux kernel deadlock – SSCP approach
2013/06/12 -‐ ExCAPE Summer School Lafortune 60
Summary so far
• CMW-‐deadlock avoidance in mulTthreaded programs using DES control theory
• Challenges: modeling, analysis, control, implementaTon
• Modeling: at compile Tme • Analysis and control: exploit Gadara net structure
• ImplementaJon of control places by code instrumentaJon
2013/06/12 -‐ ExCAPE Summer School Lafortune 61
Gadara Project Architecture
C program source code
control flow graph
Petri net control logic
compile
translation control logic synthesis
inst
rum
enta
tion
Instrumented binary
prog
ram
compile
observe
control
observe
control
observe
control
cont
rol l
ogic
offline online
2013/06/12 -‐ ExCAPE Summer School Lafortune 62
void * philosopher(void *arg) { … if (RAND_MAX/2 > random()) { /* grab A first */ pthread_mutex_lock(&forkA); pthread_mutex_lock(&forkB); } else { /* grab B first */ pthread_mutex_lock(&forkB); pthread_mutex_lock(&forkA); } eat(); replenish(&ctrlplace); pthread_mutex_unlock(&forkB); pthread_mutex_unlock(&forkA); ... }
Controlling SoYware ExecuTon: Program InstrumentaTon
my_lock(&forkB, &ctrlplace);
my_lock(&forkA, &ctrlplace);
••
•
•
if else
lock(A)
lock(B)
lock(B)
lock(A)
unlock(B)
unlock(A)
A
B
•
most lock/unlock funcTon calls unaffected, incur no overhead
2013/06/12 -‐ ExCAPE Summer School Lafortune 63
OpenLDAP (Lightweight Directory Access Protocol)
OpenLDAP main() 2013/06/12 -‐ ExCAPE Summer School Lafortune 64
OpenLDAP Deadlock
65
ldap_pvt_thread_rdwr_wlock(&bdb-‐>bi_cache.c_rwlock); /* LOCK(A) */ … ldap_pvt_thread_mutex_lock( &bdb-‐>bi_cache.lru_mutex ); /* LOCK(B) */ … ldap_pvt_thread_rdwr_wunlock(&bdb-‐>bi_cache.c_rwlock); /*UNLOCK(A)*/ … if ( bdb-‐>bi_cache.c_cursize>bdb-‐>bi_cache.c_maxsize ) { for (...; …; …) { … ldap_pvt_thread_rdwr_wlock(&bdb-‐>bi_cache.c_rwlock); /* LOCK(A) */ … ldap_pvt_thread_rdwr_wunlock(&bdb-‐>bi_cache.c_rwlock); /*UNLOCK(A)*/ } … } else { … } ldap_pvt_thread_mutex_unlock( &bdb-‐>bi_cache.lru_mutex ); /*UNLOCK(B)*/ 2013/06/12 -‐ ExCAPE Summer School Lafortune 65
OpenLDAP deadlock
●
control place
●
●
2013/06/12 -‐ ExCAPE Summer School Lafortune 66
/* LOCK(A) */ … ldap_pvt_thread_mutex_lock( &bdb-‐>bi_cache.lru_mutex ); /* LOCK(B) */ … ldap_pvt_thread_rdwr_wunlock(&bdb-‐>bi_cache.c_rwlock); /*UNLOCK(A)*/ … if ( bdb-‐>bi_cache.c_cursize>bdb-‐>bi_cache.c_maxsize ) { for (...; …; …) { … ldap_pvt_thread_rdwr_wlock(&bdb-‐>bi_cache.c_rwlock); /* LOCK(A) */ … ldap_pvt_thread_rdwr_wunlock(&bdb-‐>bi_cache.c_rwlock); /*UNLOCK(A)*/ } … } else { … } ldap_pvt_thread_mutex_unlock( &bdb-‐>bi_cache.lru_mutex ); /*UNLOCK(B)*/
ldap_pvt_thread_rdwr_wlock(&bdb-‐>bi_cache.c_rwlock);
OpenLDAP Deadlock
replenish (&ctrl_place);
replenish(&ctrl_place);
my_lock(&bdb-‐>bi_cache.c_rwlock, &ctrl_place);
Max Permissiveness
Max Concurrency
2013/06/12 -‐ ExCAPE Summer School Lafortune 67
if ( bdb->bi_cache.c_cursize>bdb->bi_cache.c_maxsize ) { for (...; …; …) { … ldap_pvt_thread_rdwr_wlock(&bdb->bi_cache.c_rwlock); /* LOCK(A) */ … ldap_pvt_thread_rdwr_wunlock(&bdb->bi_cache.c_rwlock); /* UNLOCK(A) */ } … } else { … } ldap_pvt_thread_mutex_unlock( &bdb->bi_cache.lru_mutex ); /* UNLOCK(B) */
• Simple fix? • Newly added code violated the lock ordering
again a year later. • The final fix introduced new locks and
rewrote the code completely • Control solution finds and fixes both
deadlocks automatically
Comparing With the Human Fix ldap_pvt_thread_rdwr_wlock(&bdb->bi_cache.c_rwlock); /* LOCK(A) */ … ldap_pvt_thread_mutex_lock( &bdb->bi_cache.lru_mutex ); /* LOCK(B) */ … ldap_pvt_thread_rdwr_wunlock(&bdb->bi_cache.c_rwlock); /*UNLOCK(A) */ …
2013/06/12 -‐ ExCAPE Summer School Lafortune 68
Outline
• IntroducTon ✓ • The Gadara Approach: Main Features ✓ • Synthesis of Control Logic for Deadlock Avoidance: Technical Details ✓
• Case Studies and Control Logic ImplementaTon ✓ • Discussion and Conclusion
2013/06/12 -‐ ExCAPE Summer School Lafortune 69
• Modeling – Model accuracy: conservaTve model (superset) – language features
• Handles: funcTon pointer, recursion • Ignores: setjmp, longjmp, excepTon/signal
– data flow ambiguity: local annotaTons – dynamically selected locks: type analysis
• Control logic synthesis – uncontrollability: constraint transformaTon – scalability: decomposiTon & pruning – completeness: other synchronizaTon primiTves
Challenges for Large Scale SoYware
2013/06/12 -‐ ExCAPE Summer School Lafortune 70
PracTcal Lock/Unlock Pairing
• Challenges – Infeasible paths – Spanning funcTon boundaries – Pointers
• Lock/Unlock pairing – Path sensiTve program analysis using a SAT solver – Combines best effort staTc analysis and dynamic checking
if (x) lock(A); … if (x) unlock(A);
2013/06/12 -‐ ExCAPE Summer School Lafortune 71
SoYware Development Status
• Rudimentary compiler tools – StaTc analysis – InstrumentaTon
• RelaTvely complete control tool suite – Random net benchmark (generator and verifier) – A few control synthesis algorithms – GUI
2013/06/12 -‐ ExCAPE Summer School Lafortune 72
EvaluaTon of ICOG-‐O (MILP): Stress Test
SS: Safe states; US: Unsafe sates (not explicitly generated by ICOG-‐O)
2013/06/12 -‐ ExCAPE Summer School Lafortune 73
EvaluaTon (cont.): Replacing the MIP formulaTon for deadlock
detecTon with a SAT formulaTon
SS USS Time (s) Monit Iter 6,090,459,834 5,005,028,660 8,832.9 2,013 22,245 2,310,010,077 1,849,976,998 1,819.4 1,167 4,948 661,807,580 660,550,680 862.7 1,235 6439 431,175,853 427,807,294 71,781.1* 7,155* 30,000* 182,819,144 165,869,820 1,036.9 1,073 6,929 90,139,807 73,488,022 117.8 586 2,062 59,503,951 51,035,234 274.0 662 4,578 15,567,136 10,212,724 6.0 175 395
2013/06/12 -‐ ExCAPE Summer School Lafortune 74
DetecTng Unsafe States Using SAT Main Features:
q Exploit structural properTes of Gadara nets q No state space enumeraTon q More scalable than MIP-‐based method in general q CompaTble with ICOG-‐O
Key Ideas:
• Enforce total deadlock in the net
• Exploit the semiflow equaTons
Restrict to CMW deadlocks
Enforce total deadlock
Enforce resource semiflows ~ state equaTon
Binary variable p for each operaTon and resource place:
2013/06/12 -‐ ExCAPE Summer School Lafortune 75
TheoreTcal Side: Latest Results
• Extension of Gadara methodology to handle certain types of atomicity violaTons
• Non-‐linear classifiers for: – RW-‐locks
2013/06/12 -‐ ExCAPE Summer School Lafortune 76
Lessons Learned
• Solving soYware problems using control theory is relevant and rewarding
• Bridging research work from different communiTes is the key to success but also very challenging – TheoreTcal soluTons are oYen preferred, although ad hoc ones typically prevail in the literature
– All soluTons are evaluated by the same standard of pracTcality
2013/06/12 -‐ ExCAPE Summer School Lafortune 77
Conclusion
• Control theory provides a new principled foundaTon to handle soYware failures
• Cross-‐pollinaTng advances both control theory and relevant CS areas
• Many open issues and research opportuniTes – CollaboraTon welcome!
2013/06/12 -‐ ExCAPE Summer School Lafortune 78
References
• Gadara Project: please refer to – hQp://gadara.eecs.umich.edu/papers.html – CS papers: Wang et al.: OSDI’08 and POPL’09; Cho et al., CGO’13 – Control engineering papers:
• Liao et al.: J-‐DEDS (2003); IEEE TAC (2003); IEEE TCST (2003) • Nazeem et al.: IEEE TAC (2002)
• See above papers for pointers to relevant literature • [email protected]; [email protected] (H. Liao);
[email protected] (Y. Wang); ne�[email protected] (H.K. Cho)
2013/06/12 -‐ ExCAPE Summer School Lafortune 79