CHESS: Systematic Concurrency Testing Tom Ball, Sebastian Burckhardt, Madan Musuvathi, Shaz Qadeer...

Post on 16-Dec-2015


CHESS: Systematic Concurrency Testing

Tom Ball, Sebastian Burckhardt, Madan Musuvathi, Shaz Qadeer

Microsoft Research
http://research.microsoft.com/CHESS/

Testing concurrent programs is HARD
Rare thread interleavings expose bugs

Coverage problem
• Testing misses thread interleavings that expose errors

Reproducibility problem
• Concurrency bugs == Heisenbugs
• Not reproducible, so hard to debug
• Crash dumps don’t help

Thread interleavings

[Figure: two threads, one running x++; x++; and the other x*=2; x*=2;, with a tree of the x values (starting from 0) reached along each interleaving; the leaves are 8, 6, 5, 4, 3 and 2.]

Concurrency testing today
Concurrency testing == stress testing

Example: testing a concurrent queue
• Create 100 threads performing queue operations
• Run for days/weeks

Stress increases the interleaving variety, but
• Not systematic: might miss interleavings
• Not predictable: cannot find the same error again
• Makes any error found hard to debug

1 Why stress is not sufficient

Concurrency testing: what we need

Methodology and tools to systematically and predictably test thread interleavings

CHESS in a nutshell

Replace the OS scheduler with a demonic scheduler

Systematically explore all scheduling choices

[Figure: a concurrent program running on the Win32 API, with the kernel scheduler replaced by the demonic scheduler.]

CHESS will run this program 6 times exploring all the different interleavings

[Figure: two threads, one running x++; x++; and the other x*=2; x*=2;, with the tree of x values reached along each of the interleavings.]
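A minimal Python sketch of that enumeration. It assumes the two threads are x++; x++; and x*=2; x*=2; with x starting at 0, and it treats every statement as one atomic step. The names INCREMENT, DOUBLE, interleavings and run are illustrative and not part of CHESS.

```python
from itertools import permutations

# Two hypothetical threads, each a list of atomic steps on x.
INCREMENT = [lambda x: x + 1, lambda x: x + 1]   # x++; x++;
DOUBLE = [lambda x: x * 2, lambda x: x * 2]      # x*=2; x*=2;


def interleavings(len_a, len_b):
    """Return every order of thread ids ('A'/'B') that preserves program order."""
    return sorted(set(permutations("A" * len_a + "B" * len_b)))


def run(schedule, x=0):
    """Execute one schedule and return the final value of x."""
    next_step = {"A": iter(INCREMENT), "B": iter(DOUBLE)}
    for thread in schedule:
        x = next(next_step[thread])(x)
    return x


results = {"".join(s): run(s) for s in interleavings(2, 2)}
for schedule, final_x in results.items():
    print(schedule, "->", final_x)
```

Under these assumptions the program has six schedules, and each one ends in a different value of x, which is why a test that only ever sees one or two of them can miss a bug.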

2 Don’t stress, use CHESS

CHESS architecture

[Figure: the program's TestScenario() { … } is driven by a CHESS loop, While (not done) { TestScenario() }. CHESS sits between the program and the Win32 API, above the kernel (threads, scheduler, synchronization objects).]

CHESS runs the scenario in a loop
• Every run takes a different interleaving
• Every run is repeatable

Intercept synch. & threading calls
• To control and introduce nondeterminism

Detect
• Assertion violations
• Deadlocks
• Data races
• Livelocks

CHESS methodology generalizes

Need wrappers for every concurrency API
• CHESS has wrappers for Win32, .NET, Singularity

Wrappers understand the semantics of the API
• Expose nondeterminism in the API

Looking for volunteers to build wrappers for Linux and Java

[Figure: three stacks, each with CHESS interposed between a program and its platform: a .NET program over the .NET CLR, a Win32 program over Win32 / OS, and a Singularity program over Singularity.]

CHESS clients

PCP = Parallel Computing Platform (for multi/many-cores)
• PLINQ: Parallel LINQ
• CDS: Concurrent Data Structures
• STM: Software Transactional Memory
• TPL: Task Parallel Library
• ConcRT: Concurrency RunTime
• CCR: Concurrency Coordination Runtime

Dryad
• Part of COSMOS

Singularity/Midori
• CHESS can systematically test the boot and shutdown process

Stateless model checking [Verisoft ’97]

Systematically enumerate all paths in a state-space graph

Don’t capture program states
• Capturing states is extremely hard for large programs

Effective for message-passing programs

CHESS applies stateless model checking for shared-memory multithreaded programs
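A minimal sketch of the stateless idea, assuming a toy program whose threads are lists of atomic steps. The search remembers only sequences of scheduling choices and re-executes the program from its initial state for each one. fresh_program, replay and explore are hypothetical names, and replaying every prefix is a simplification chosen for brevity, not how CHESS schedules its runs.

```python
def fresh_program():
    """Build the hypothetical program under test in its initial state."""
    state = {"x": 0}
    threads = [
        [lambda s: s.update(x=s["x"] + 1)] * 2,   # thread 0: x++; x++;
        [lambda s: s.update(x=s["x"] * 2)] * 2,   # thread 1: x*=2; x*=2;
    ]
    return state, threads


def replay(prefix):
    """Re-execute from the initial state along `prefix`; nothing is saved."""
    state, threads = fresh_program()
    pc = [0] * len(threads)
    for tid in prefix:
        threads[tid][pc[tid]](state)
        pc[tid] += 1
    enabled = [t for t in range(len(threads)) if pc[t] < len(threads[t])]
    return state, enabled


def explore():
    """Depth-first search over scheduling choices, storing only the choices."""
    finished = []
    stack = [[]]                          # pending schedule prefixes
    while stack:
        prefix = stack.pop()
        state, enabled = replay(prefix)
        if not enabled:                   # every thread has terminated
            finished.append((tuple(prefix), state["x"]))
        for tid in reversed(enabled):     # push so that thread 0 is tried first
            stack.append(prefix + [tid])
    return finished


for schedule, final_x in explore():
    print(schedule, "->", final_x)
```

The trade-off is visible even here: no program state is ever copied or compared, at the cost of re-running the program many times.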

Outline
• Preemption bounding [PLDI ’07]
• Fair stateless model checking [PLDI ’08]
• Sober [CAV ’08, EC2 ’08]
• FeatherLite
• Concurrency Explorer [EC2 ’08]

Outline

Preemption bounding
• Makes CHESS effective on deep state spaces
Fair stateless model checking
Sober
FeatherLite
Concurrency Explorer

State space explosion

[Figure: n threads, each executing k steps (x = 1; … y = k;).]

Number of executions = $O(n^{nk})$

Exponential in both n and k
• Typically: n < 10, k > 100

Limits scalability to large programs

Goal: Scale CHESS to large programs (large k)
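To get a feel for these numbers, the sketch below computes the exact interleaving count for n threads of k atomic steps each, using the standard multinomial formula (nk)! / (k!)^n. That formula is not on the slide; it is included here as an assumption-free combinatorial fact and compared against the quoted bound. The function names are illustrative.

```python
from math import factorial


def interleaving_count(n, k):
    """Exact number of interleavings of n threads that each run k atomic steps."""
    return factorial(n * k) // factorial(k) ** n


def slide_bound(n, k):
    """The n^(nk) upper bound behind the O(n^(nk)) estimate."""
    return n ** (n * k)


for n, k in [(2, 2), (2, 10), (3, 10), (2, 100)]:
    exact = interleaving_count(n, k)
    within = exact <= slide_bound(n, k)
    print(f"n={n} k={k}: {len(str(exact))} decimal digits, within bound: {within}")
```

Even two threads of 100 steps each already give a count with dozens of digits, far beyond what exhaustive enumeration can cover.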

Preemption bounding

[Figure: thread 1 runs x = 1; if (p != 0) { x = p->f; } and thread 2 runs p = 0;. Thread 1 is split between the check and the dereference, with p = 0; scheduled in between; the two context switches are labeled "preemption" and "non-preemption".]

Polynomial state space

Terminating program with fixed inputs and deterministic threads
• n threads, k steps each, c preemptions
• Number of executions $\le \binom{nk}{c} \cdot (n+c)! = O\big((n^2 k)^c \cdot n!\big)$

Exponential in n and c, but not in k

[Figure: the threads' step sequences (x = 1; … y = k;) cut at the chosen preemption points into atomic blocks.]

• Choose c preemption points

• Permute n+c atomic blocks
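The sketch below counts preemption-bounded executions by brute force and compares the count with the bound. It assumes a simplified model in which threads never block, so a context switch is a preemption exactly when the thread being switched away from still has steps left. count_executions and slide_bound are hypothetical names.

```python
from math import comb, factorial


def count_executions(n, k, c):
    """Count interleavings of n threads, k steps each, with at most c preemptions."""
    def go(remaining, current, budget):
        if not any(remaining):
            return 1                      # all threads finished: one execution
        total = 0
        for tid in range(n):
            if remaining[tid] == 0:
                continue
            # Switching away from a thread that could still run is a preemption.
            preempts = (current is not None and tid != current
                        and remaining[current] > 0)
            if preempts and budget == 0:
                continue
            nxt = remaining[:tid] + (remaining[tid] - 1,) + remaining[tid + 1:]
            total += go(nxt, tid, budget - preempts)
        return total
    return go((k,) * n, None, c)


def slide_bound(n, k, c):
    """Upper bound C(nk, c) * (n + c)! on the number of executions."""
    return comb(n * k, c) * factorial(n + c)


for k in (5, 10, 20):
    print(f"n=2 k={k} c=2: {count_executions(2, k, 2)} executions, "
          f"bound {slide_bound(2, k, 2)}")
```

For two threads and c = 2 the count grows quadratically in k in this model, in line with the claim that the bounded search is polynomial in k.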

3 Preemption bounding

Find lots of bugs with 2 preemptions

Program            Lines of code   Bugs
Work Stealing Q    4K              4
CDS                6K              1
CCR                9K              3
ConcRT             16K             4
Dryad              18K             7
APE                19K             4
STM                20K             2
TPL                24K             9
PLINQ              24K             1
Singularity        175K            2
Total                              37

Acknowledgement: testers from PCP team

So, is CHESS unsound?

Soundness: prove that the program is correct for a given input test harness
• Need to exhaustively explore all interleavings

For small programs, CHESS is sound
• Iteratively increase the preemption bound

Preemption bounding helps scale to large programs
• A good “knob” to trade resources for coverage

Better search algorithms give more coverage faster
• Partial-order reduction
• Modular testing of loosely-coupled programs

Outline

Preemption bounding
• Makes CHESS effective on deep state spaces
Fair stateless model checking
• Makes CHESS effective on cyclic state spaces
• Enables CHESS to find liveness violations (livelocks)
Sober
FeatherLite
Concurrency Explorer

Concurrent programs have cyclic state spaces

• Spinlocks
• Non-blocking algorithms
• Implementations of synchronization primitives
• Periodic timers
• …

Thread 1: L1: while (!done) { L2: Sleep(); }
Thread 2: M1: done = 1;

[Figure: state graph over the states (!done, L1), (!done, L2), (done, L1) and (done, L2).]

A demonic scheduler unrolls any cycle ad-infinitum

Thread 1: while (!done) { Sleep(); }
Thread 2: done = 1;

[Figure: execution tree of alternating !done and done nodes, unrolled without end.]

Depth bounding

Prune executions beyond a bounded number of steps

[Figure: the unrolled execution tree of !done and done nodes, cut off at a line labeled "Depth bound".]

Problem 1: Ineffective state coverage

[Figure: a chain of !done nodes extending down to the line labeled "Depth bound".]

Bound has to be large enough to reach the deepest bug
• Typically, greater than 100 synchronization operations

Every unrolling of a cycle redundantly explores reachable state space
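A small sketch of this redundancy for the spin-loop program while (!done) { Sleep(); } running against done = 1;. It assumes each loop iteration and the assignment are single atomic steps, and it keeps explicit states on a stack purely for counting, which a stateless checker would not do. explore is a hypothetical name.

```python
def explore(depth_bound):
    """Enumerate executions of the spin-loop program, pruning at depth_bound steps.

    Returns (completed executions, pruned executions).
    """
    completed, pruned = 0, 0
    # Each entry: (done, spinner_finished, setter_finished, depth)
    stack = [(False, False, False, 0)]
    while stack:
        done, spin_fin, set_fin, depth = stack.pop()
        if spin_fin and set_fin:
            completed += 1
            continue
        if depth == depth_bound:
            pruned += 1                   # cut off by the depth bound
            continue
        if not spin_fin:
            # Spinner checks the flag: it exits if set, otherwise sleeps and loops.
            stack.append((done, done, set_fin, depth + 1))
        if not set_fin:
            # Setter executes done = 1 and terminates.
            stack.append((True, spin_fin, True, depth + 1))
    return completed, pruned


for bound in (2, 5, 100):
    print(f"depth bound {bound}: (completed, pruned) = {explore(bound)}")
```

In this model every completed execution ends in the same final state, yet the number of executions grows with the bound: raising the bound only adds more unrollings of the same cycle.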

Problem 2: Cannot find livelocks
Livelocks: lack of progress in a program

Thread 1: temp = done; while (!temp) { Sleep(); }
Thread 2: done = 1;

Key idea

This test terminates only when the scheduler is fair
• Fairness is assumed by programmers

All cycles in correct programs are unfair
• A fair cycle is a livelock

Thread 1: while (!done) { Sleep(); }
Thread 2: done = 1;

[Figure: state graph with two !done states and two done states.]

We need a fair demonic scheduler

Avoid unrolling unfair cycles
• Effective state coverage

Detect fair cycles
• Find livelocks (violations of fair termination)

[Figure: a concurrent program and its test harness running on the Win32 API, with the demonic scheduler replaced by a fair demonic scheduler.]
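The sketch below illustrates the two goals on the spin-loop examples. It uses a deliberately simplified fairness rule that is an assumption of this sketch and not the CHESS algorithm: after the spinner yields via Sleep(), it may not run again until the other thread has taken a step, unless the other thread has finished. An execution that is still running after max_steps under this fair schedule is flagged as a suspected livelock. All names are hypothetical.

```python
def good_spinner(state):
    """while (!done) Sleep();  One loop iteration per step; True means exit."""
    return state["done"]


def stale_spinner(state):
    """temp = done; while (!temp) Sleep();  The flag is read only once."""
    if "temp" not in state:
        state["temp"] = state["done"]
    return state["temp"]


def explore_fair(spinner, max_steps=20):
    """Explore all fair schedules; return (terminating runs, suspected livelocks)."""
    terminating, livelocks = 0, 0
    # Each entry: (state, spinner_finished, setter_finished, must_wait, steps)
    stack = [({"done": False}, False, False, False, 0)]
    while stack:
        state, spin_fin, set_fin, must_wait, steps = stack.pop()
        if spin_fin and set_fin:
            terminating += 1
            continue
        if steps == max_steps:
            livelocks += 1                # no progress even under fairness
            continue
        # Simplified fairness: a spinner that yielded waits for an enabled setter.
        if not spin_fin and not (must_wait and not set_fin):
            s = dict(state)
            exited = spinner(s)
            stack.append((s, exited, set_fin, not exited, steps + 1))
        if not set_fin:
            s = dict(state)
            s["done"] = True              # setter executes done = 1
            stack.append((s, spin_fin, True, False, steps + 1))
    return terminating, livelocks


print("correct loop:", explore_fair(good_spinner))
print("stale read:  ", explore_fair(stale_spinner))
```

With the fairness rule in place the correct loop needs no depth bound to terminate, while the stale-read variant keeps spinning on a fair schedule and is reported.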

Fair termination allows CHESS to check for arbitrary liveness properties

Example: Good Samaritan assumption
• Forall threads t: $\mathrm{GF}\ \mathit{scheduled}(t) \Rightarrow \mathrm{GF}\ \mathit{yield}(t)$
• A thread when scheduled infinitely often yields the processor infinitely often

Examples of yield:
• Sleep(), ScheduleThread(), asm {rep nop;}
• Thread completion

Thread 1: while (!done) { Sleep(); }
Thread 2: done = 1;

Outline

Preemption bounding
• Makes CHESS effective on deep state spaces
Fair stateless model checking
• Makes CHESS effective on cyclic state spaces
• Enables CHESS to find liveness violations (livelocks)
Sober
• Detect relaxed-memory model errors
• Do not miss behaviors only possible in a relaxed memory model
FeatherLite
Concurrency Explorer

C# Example

volatile bool isIdling;
volatile bool hasWork;

// Consumer thread
void BlockOnIdle() {
    lock (condVariable) {
        isIdling = true;
        if (!hasWork)
            Monitor.Wait(condVariable);
        isIdling = false;
    }
}

// Producer thread
void NotifyPotentialWork() {
    hasWork = true;
    if (isIdling)
        lock (condVariable) {
            Monitor.Pulse(condVariable);
        }
}

Example: Store Buffer Vulnerability

Key pieces of code on previous slide:

volatile int ii = 0;
volatile int hw = 0;

Consumer: Store ii, 1; Load hw, 0
Producer: Store hw, 1; Load ii, 1

On x86, hardware may perform store late
Bug: Producer thread does not notice waiting Consumer, does not send signal
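A minimal simulation of this vulnerability, assuming a simplified store-buffer model: each thread has a FIFO buffer, a store enters the buffer and reaches memory at some later point, and a load reads the thread's own buffered store before falling back to memory. PROGRAMS and outcomes are hypothetical names; ii and hw abbreviate isIdling and hasWork.

```python
PROGRAMS = {
    "consumer": [("store", "ii"), ("load", "hw")],
    "producer": [("store", "hw"), ("load", "ii")],
}


def outcomes(store_buffers):
    """Return all (consumer load of hw, producer load of ii) results."""
    results = set()

    def explore(memory, buffers, pcs, loaded):
        if all(pcs[t] == len(ops) for t, ops in PROGRAMS.items()):
            results.add((loaded["consumer"], loaded["producer"]))
            return
        for t, ops in PROGRAMS.items():
            if buffers[t]:
                # The hardware commits the oldest buffered store to memory.
                var = buffers[t][0]
                explore({**memory, var: 1}, {**buffers, t: buffers[t][1:]},
                        pcs, loaded)
            if pcs[t] == len(ops):
                continue
            kind, var = ops[pcs[t]]
            next_pcs = {**pcs, t: pcs[t] + 1}
            if kind == "store" and store_buffers:
                explore(memory, {**buffers, t: buffers[t] + (var,)},
                        next_pcs, loaded)
            elif kind == "store":
                explore({**memory, var: 1}, buffers, next_pcs, loaded)
            else:
                # A load sees the thread's own buffered store first, then memory.
                value = 1 if var in buffers[t] else memory[var]
                explore(memory, buffers, next_pcs, {**loaded, t: value})

    explore({"ii": 0, "hw": 0}, {t: () for t in PROGRAMS},
            {t: 0 for t in PROGRAMS}, {})
    return results


print("sequentially consistent:", sorted(outcomes(store_buffers=False)))
print("with store buffers:     ", sorted(outcomes(store_buffers=True)))
```

In this model the outcome (0, 0) appears only with store buffers: the Consumer sees no work and waits while the Producer sees no idle Consumer and sends no signal, which matches the bug described above.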

Sober algorithm

Programmers assume sequential consistency (SC)
• Insert synchronizations & fences to counter memory-model relaxations

Sober checks if a program is memory-model safe
• i.e., program has only SC executions in a memory model
• Reports any such violation as an error

Sober is a dynamic monitor that checks if any SC execution can be extended to a non-SC execution

Theorem: CHESS + Sober guarantees memory-model safety

Outline

Preemption bounding
• Makes CHESS effective on deep state spaces
Fair stateless model checking
• Makes CHESS effective on cyclic state spaces
• Enables CHESS to find liveness violations (livelocks)
Sober
• Detect relaxed-memory model errors
• Do not miss behaviors only possible in a relaxed memory model
FeatherLite
• A light-weight data-race detection engine (<20% overhead)
Concurrency Explorer

Outline

Preemption bounding
• Makes CHESS effective on deep state spaces
Fair stateless model checking
• Makes CHESS effective on cyclic state spaces
• Enables CHESS to find liveness violations (livelocks)
Sober
• Detect relaxed-memory model errors
• Do not miss behaviors only possible in a relaxed memory model
FeatherLite
• A light-weight data-race detection engine (<20% overhead)
Concurrency Explorer
• First-class concurrency debugging

Conclusion

Don’t stress, use CHESS
• CHESS binary and papers available at http://research.microsoft.com/CHESS

Stateless model checking is very effective
• Preemption bounding to scale to deep state spaces
• Fair demonic scheduler to handle nonterminating programs

Need better testing and debugging methodologies for concurrent programs

Questions