A Presentation for David Johnson: Can We (Also) Live with Isomorphs?


A Presentation for David Johnson: Can We (Also) Live with Isomorphs?

Franc Brglez (1), Matthias F. Stallmann (1), Jason Osborne (2)

(1) Computer Science, (2) Statistics, NC State University, Raleigh, North Carolina, USA

November 14, 2005

On David’s Pet Peeves*

Pet Peeve No. 10: Hand-tuned algorithm parameters.

“… many papers use different settings for different instances without explaining how those settings were derived …”

“The rule I would apply is the following: If different parameter settings are to be used for different instances, the adjustment process must be well-defined and algorithmic, the adjustment algorithm must be described in the paper, and the time for the adjustment must be included in all reported running times.”

*David S. Johnson, “A Theoretician’s Guide to the Experimental Analysis of Algorithms” Proceedings of the 5th and 6th DIMACS Implementation Challenges, M. Goldwasser, D. S. Johnson, and C. C. McGeoch, Editors, American Mathematical Society, Providence, 2002.

Franc Brglez. A Presentation for David Johnson: Can We (Also) Live with Isomorphs? A talk given as part of the informal workshop on Nov. 14, 2005, during the visit by David Johnson as the Distinguished Lecturer at CS, NCSU, https://people.engr.ncsu.edu/brglez/publications/ OPUS2-2005-isomorph-Johnson_workshop-Brglez-talk.pdf, 2005

Corollary to David’s Pet Peeves ...

… paraphrased from American Scientist, July 2002

Can one replicate these experiments?

Outline

• On Hypothesis Testing
• Component Reliability vs Instance Solvability
• About Instance Isomorphs
• xBed: A Testbed Environment for Experimental Algorithmics
• Case Studies (Highlights of SAT’2003 only)
  - CrossingsBed (Alenex’1999, JEA’2001)
  - SatBed (Sat’2002, Sat’2003, AMAI’2005)
  - LabsBed (FEA’2003, InfoSciences’2004)
  - MinOnesBed (PhD’2004, DAC’2005, …)
  - MaxOnesBed (…)
  - CspBed (PhD’2004, …)
  - BinPackingBed (…)
• Call for more isomorph classes and instances …


On Hypothesis Testing*

• A View from a Computer Scientist

• Random Instances: What’s the Difference?

• A View from an Engineer

• A View from a Skeptic

• A Test of Hypothesis: An Example

• A View and Questions from a Statistician

A View from a Computer Scientist

The experiment with k solvers (e.g. unitwalk below) involves an encapsulation script of a few lines, e.g.

    set N 250; set M 1065
    foreach RN {1215 33 1914 1066 1984 …} {
        set Instance [Gen3Sat $N $M $RN]
        lappend Results [unitwalk $Instance]
    }
    puts "Statistics\n[PostProcess $Results]"

Statistics may appear “significant” due to large number of instances, but …

… what can we really infer once we demonstrate that these “random instances” are all VERY different?
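To make the "random instances" of the script concrete, here is a minimal Python sketch of a seeded uniform random 3-SAT generator. The function name `gen_3sat` and its clause model (three distinct variables per clause, random signs) are assumptions standing in for the `Gen3Sat` call, whose implementation the slides do not show.

```python
import random

def gen_3sat(num_vars, num_clauses, seed):
    """Return a uniform random 3-SAT instance as a list of 3-literal clauses."""
    rng = random.Random(seed)  # one seed gives one reproducible instance
    clauses = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), 3)  # 3 distinct variables
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

# Seeds copied from the slide; N = 250 variables, M = 1065 clauses.
seeds = [1215, 33, 1914, 1066, 1984]
instances = {seed: gen_3sat(250, 1065, seed) for seed in seeds}
for seed, clauses in instances.items():
    print(seed, len(clauses), clauses[0])
```

Every seed yields a structurally unrelated formula with the same N and M, which is why runtime differences across such instances cannot be attributed to the solver alone.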


Random Instances: What’s the Difference?

Experimental results with unitwalk on four 3-SAT instances (N=250, M=1065) taken from a set of 100, with each run repeated 128 times on replicas of each of the four instances:

orders of magnitude differences have been observed …

A View from an Engineer

Combinatorial solver experimental results are still published in the format below … (for SAT and other problems)

time-to-solve (secs)

benchm   sch5   sch6   sch7   total
------   ----   ----   ----   -----
chaff    0.12   5.74   22.3   28.16
sato     0.01   0.04   0.11    0.17
satoL    0.01   0.03   0.09    0.13

ALL instances are of different size and characteristics. Assume many more than the three shown here.

Would a statistician declare sato and satoL equivalent??


A View from a Skeptic

From our presentations at SAT’2002, AMAI’2005:

For each reference (e.g. sch7), generate an equivalence class of at least 32 isomorphs and repeat the experiment for each instance to find the intrinsic variability of state-of-the-art SAT solvers. This gives a total of 9x(32+1) experiments rather than the 9 shown earlier.

PC-class of sch7, 32 instances (initV is the reference instance report from the engineer’s experiment):

solverID   initV   minV   meanV   maxV    maxV/minV
--------   -----   ----   -----   -----   ---------
satoL      0.09    0.06   0.07    0.11    1.83
chaff      22.3    3.51   15.5    69.5    19.8
sato       0.11    0.09   238     t’out   > 39,713

A Test of Hypothesis: An Example

Given the cumulative runtime distribution (solvability) of two solvers on the same isomorph class (of 32): is uw1 “better” than qt1 on this class?

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 100) for uf250..87-qt1 and uf250..87-uw1. Curve annotations (mean/std): 12.3/12.5 exp. d. and 16.7/17.2 exp. d.]

Not really; the 95% confidence interval for the difference is (4.4 +- 7.7) secs (t = 1.17).
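The interval quoted on this slide can be reproduced from the two mean/std annotations alone. The sketch below is one way to do it, assuming 32 runtimes per solver (the stated class size), an unpooled two-sample standard error, and a two-sided 5% critical value of about 2.04; the slides do not say which variant of the t-test was used, so treat these choices as assumptions.

```python
import math

def two_sample_t(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Difference of means, its standard error, and the t statistic."""
    diff = mean_b - mean_a
    std_err = math.sqrt(std_a ** 2 / n_a + std_b ** 2 / n_b)
    return diff, std_err, diff / std_err

# Mean/std pairs from the figure annotations; n = 32 is the isomorph class size.
diff, std_err, t_stat = two_sample_t(12.3, 12.5, 32, 16.7, 17.2, 32)

T_CRIT = 2.04  # assumed two-sided 5% critical value for about 30 degrees of freedom
half_width = T_CRIT * std_err

print(f"difference = {diff:.1f} +- {half_width:.1f} secs, t = {t_stat:.2f}")
```

Because the interval contains zero (equivalently, t is below the critical value), the observed 4.4-second gap is within what the variability inside a single isomorph class can produce.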


A View from a Statistician

• Population: all instances within a defined class
• Sampling scheme: two-stage, with classes and instances within classes
• Parameters of interest: mean runtimes (K solvers)
• Hypotheses: equality of means across solvers
• False positive (Type I) error rate at .05
  - multiplicity adjustments in the case of many solvers or many classes
• Sufficiently powerful experiments require
  - specification of minimum acceptable differences among solvers we hope to detect
  - specification of runtime variances

Few Questions from a Statistician

• What are the sources of variation in runtimes?
  - instances, classes, solvers
• How much variability in runtimes can be attributed to these three factors?
• Is there evidence against null models in which
  - all class means are equal?
  - all solver means are equal?
  - solver differences are constant across classes?
• Can we use a block design approach here? Block designs form groups of plots that are similar within groups but different across groups, then apply all treatments within each block.


Component Reliability vs Instance Solvability*

Two sides of the SAME coin

• replicated components in reliability experiments

• instances of isomorphs in solver experiments

• the same statistical methods for both

Same Coin: Reliability vs Solvability

Side 1 of the coin: Reliability of a critical component (maximize lifetime)
• environment1: 30 mins
• environment2: 30 years

Side 2 of the coin: Solvability of a combinatorial problem (nug30) (minimize runtime)
• local search: 1--60 seconds (1 processor)
• branch-and-bound: 7 days (2510 processors), RELIABLY!!


Example: Reliability vs Solvability

Component (A/C unit of Boeing 720): What lifetime?

[Figure: survival function vs lifetime (log scale, 0.01 to 100) for acFailures, compared with exp(-x); component replicas (213 total). MTTF = 93.3 hours, Std = 107 hours (Proschan, 1963).]

LA-x*y problem: What runtime to optimally place x warehouses to minimize travel to y locations?

[Figure: problem replicas (24 total) obtained by rotating the reference layout (panels include rot. -15 deg and rot. -30 deg, both axes 0 to 40), and solvability vs runtime (log scale, 1 to 100). median = 2, mean = 14.6, std = 25.1.]

Is it still surprising that the “median” is preferred (by some) to mean/std?
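The solvability function used throughout these slides is the empirical cumulative distribution of runtime, the mirror image of a survival function. The following Python sketch computes it and contrasts median with mean on a skewed sample; the runtimes are made up for illustration and are not data from the talk.

```python
import math
import statistics

def solvability(runtimes):
    """Empirical solvability: fraction of instances solved within each observed runtime."""
    ordered = sorted(runtimes)
    n = len(ordered)
    return [(t, (i + 1) / n) for i, t in enumerate(ordered)]

def exponential_solvability(t, mean):
    """Solvability of an exponential runtime model; its survival function is exp(-t/mean)."""
    return 1.0 - math.exp(-t / mean)

# Made-up runtimes (seconds) with one slow replica; not data from the talk.
runtimes = [1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 90.0]
mean = statistics.mean(runtimes)
print("median =", statistics.median(runtimes),
      "mean =", round(mean, 1),
      "std =", round(statistics.stdev(runtimes), 1))
for t, fraction in solvability(runtimes):
    print(f"{t:6.1f} s  solved {fraction:.2f}  exponential model {exponential_solvability(t, mean):.2f}")
```

A single slow replica pulls the mean and the standard deviation far from the typical runtime while leaving the median untouched, which is one reason the median is sometimes reported instead.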

About Instance Isomorphs*

• introduced for learning experiments by H. Simon (1969) … and Tower of Hanoi still being published (ACCSS’2000)

• introduced for performance evaluation of algorithms: ICCAD’98, IWLS’99, SSST’2001, JEA’2001, SAT2002 … (in contrast to Simon, isomorph syntax is strictly invariant)

Instance generation rules for performance evaluation of algorithms depend on the problem domain:
• almost trivial for many graph problems
• need to be “invented” for some problem domains
• may be initiated within the scope of five generic rules


Rules on Constructing Isomorphs

Calculator Dexterity* Test Example: find 2 roots

Define, for a problem domain:

R1: syntax
E1: ax^2 + bx + c

R2: reference instance parameters
E2: a=2, b=-10, c=-100

R3: instance invariants
E3: a: 1 digit, b: 2 digits, c: 3 digits

R4: parameter sampling range
E4: 1 < a < 10, -100 < b < -9, -1000 < c < -99

R5: sampling process to generate each instance
E5:
I    a    b     c
0    2   -10   -100
1    5   -49   -631
2    3   -71   -239
…    …    …     …

*Dexterity is a term referring primarily to the ability of a person to "gracefully" coordinate their movements
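Rules R1 to R5 for the quadratic example translate almost directly into code. The sketch below is a minimal Python rendering: the sampling ranges are the ones stated in E4, while the function names, the uniform sampling, and the seed are illustrative assumptions.

```python
import math
import random

REFERENCE = (2, -10, -100)  # E2: reference instance parameters (a, b, c)

def gen_class(num_isomorphs, seed):
    """Reference instance followed by sampled isomorphs obeying the E3/E4 invariants."""
    rng = random.Random(seed)
    instances = [REFERENCE]
    for _ in range(num_isomorphs):
        a = rng.randint(2, 9)        # 1 < a < 10, one digit
        b = rng.randint(-99, -10)    # -100 < b < -9, two digits
        c = rng.randint(-999, -100)  # -1000 < c < -99, three digits
        instances.append((a, b, c))
    return instances

def roots(a, b, c):
    """Both real roots of ax^2 + bx + c; the discriminant is positive since a > 0 > c."""
    r = math.sqrt(b * b - 4 * a * c)
    return ((-b - r) / (2 * a), (-b + r) / (2 * a))

quadratic_class = gen_class(3, seed=2005)
for index, (a, b, c) in enumerate(quadratic_class):
    print(index, a, b, c, roots(a, b, c))
```

Every instance has the same syntax and the same digit counts, so the keystroke effort is comparable across the class while the actual numbers differ.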

Isomorphs Example: Simple Maze Problem

A single reference maze can induce an equivalence class to better analyze an algorithm’s behavior. The same maze induces large variability in these two algorithms:

• Reference maze: the Left-edge alg. “appears” faster
• Reference maze, reflected: the RDLU alg. “appears” faster
• Reference maze, rotated: both alg. “appear” equivalent


Isomorphs Example: LA Problem

Similar to rotation/translation in the maze, the “complexity” of facility LA problems in the arrangements below is the same.

Where to place 2 warehouses to minimize travel to 7 locations?

[Figure: three panels, both axes 0 to 40: ref problem; ref problem rot. -15 deg; ref problem rot. -30 deg.]

Just the “rotation of x-y” coordinates induces optimization cost variability exceeding two orders of magnitude!

[Figure panels: LA7x2, LA20x3, LA35x2]
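A rotated replica of an LA instance can be generated with a plain coordinate rotation. The Python sketch below rotates a made-up 7-location layout by -15 and -30 degrees about the centre of the 0 to 40 plotting window (the centre and the coordinates are assumptions) and checks that all pairwise distances are unchanged, so any objective built from Euclidean distances has the same optimum value in every replica.

```python
import math

def rotate(points, degrees, center=(20.0, 20.0)):
    """Rotate points about center; negative angles rotate clockwise."""
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    return [(cx + (x - cx) * cos_t - (y - cy) * sin_t,
             cy + (x - cx) * sin_t + (y - cy) * cos_t) for x, y in points]

def pairwise_distances(points):
    """Euclidean distances between all pairs, in a fixed order."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

# Seven made-up locations; not the coordinates of the LA7x2 instance.
reference = [(5, 5), (10, 30), (15, 12), (22, 35), (28, 8), (33, 25), (36, 15)]
rotated_15 = rotate(reference, -15)
rotated_30 = rotate(reference, -30)

for name, layout in [("rot -15", rotated_15), ("rot -30", rotated_30)]:
    drift = max(abs(d - e) for d, e in
                zip(pairwise_distances(reference), pairwise_distances(layout)))
    print(name, "max change in pairwise distance:", drift)
```

Since the geometry is identical, any runtime spread across such replicas reflects the solver's sensitivity to the coordinate representation rather than a change in the problem.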

Isomorphs Example: Graph Problem

Similar to rotation/translation in the maze, the “complexity” of the crossing number problem in the arrangements below is the same.

[Figure: ref instance (crossing no = 80); I-class instance (crossing no = 101); optimum placement (crossing no = 5) for all instances in this class.]

The current generation of algorithms does not scale ... crossing number max/min variability can span orders of magnitude ...


Isomorphs Example: Graph (cont)

Reference graph: 2123 nodes, crossing number = 0
Equivalence class: I, number of instances = 64
Ideally, each instance should be embeddable with 0 crossings, but ...

[Figure: histogram of the crossing number reported for the instances in the class.]

Crossing numbers reported: Min = 20, Max = 1870, Mean = 933.98

Bigraph placement performed with the “dot” heuristic (courtesy of AT&T).

Our work: better performance is attainable by using equivalence class instances while engineering the algorithm (Alenex1999, JEA2001).

Isomorphs Example: SAT Problem

Similar to rotation/translation in the maze, the “complexity” of Boolean circuits in the arrangements below is the same.

[Figure: Ref circuit drawn as the reference instance and as a PC-class instance.]

… and yet, our experiments with I-, C-, P-, PC-class instances have induced orders of magnitude variability in:

• SAT problems (SAT’2002, see also our web-postings ...)
• BDD variable ordering problems (ICCAD’98, SSST’2001)
• logic optimization problems (IWLS’1999)
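For CNF formulas, an isomorph can be produced by relabeling without touching the logical structure. The sketch below assumes that "P" stands for a permutation of variable names (plus clause and literal order) and "C" for complementing the polarity of selected variables; the slides do not spell the class names out, so this reading and the function names are assumptions.

```python
import itertools
import random

def pc_isomorph(clauses, num_vars, seed):
    """Rename variables, complement a random subset, and shuffle literal and clause order."""
    rng = random.Random(seed)
    targets = list(range(1, num_vars + 1))
    rng.shuffle(targets)
    rename = {v: targets[v - 1] for v in range(1, num_vars + 1)}
    flip = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}
    result = []
    for clause in clauses:
        new_clause = []
        for lit in clause:
            var = abs(lit)
            negative = (lit < 0) != flip[var]  # complement toggles the polarity
            new_clause.append(-rename[var] if negative else rename[var])
        rng.shuffle(new_clause)
        result.append(new_clause)
    rng.shuffle(result)
    return result

def count_models(clauses, num_vars):
    """Brute-force count of satisfying assignments (small formulas only)."""
    count = 0
    for bits in itertools.product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            count += 1
    return count

# The 5-clause formula used later in the deck, with x, y, z numbered 1, 2, 3.
reference = [[1, 2, 3], [-1, 2], [-1, 2, 3], [-2, -3], [-1, 3]]
isomorph = pc_isomorph(reference, 3, seed=1)
print(isomorph, count_models(reference, 3), count_models(isomorph, 3))
```

The number of satisfying assignments and the clause-length profile are invariant under these transformations, so a solver faces the same problem in a different syntactic arrangement.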


Isomorphs Example: SAT (cont)

(a) solvability function induced by a normal or a near-normal distribution

[Figure: solvability (%) vs runtime (seconds, 0 to 250). Solver chaff: 129 instances from uuf250-1065_090_PC.]

Solver 'chaff' applied to 129 unsat instances from the class of uuf250-1065_090_PC: median = 127, mean = 131, stdev = 21.7

Goodness-of-fit: 9.32 < chi^2(0.05,6) = 12.59, hence we accept the normal distribution hypothesis at the 5% level of significance.

(b) solvability function induced by an exponential or a near-exponential distribution

[Figure: solvability (%) vs runtime (seconds, 0 to 140). Solver chaff: 129 instances from sched06u_v00826_PC.]

Solver 'chaff' applied to 129 unsat instances from the class of sched06u_v00826_PC: median = 13.0, mean = 19.6, stdev = 22.6

Goodness-of-fit: 7.42 < chi^2(0.05,14) = 23.68, hence we accept the exponential distribution hypothesis at the 5% level of significance.

Same solver, two very different distributions, large variance
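The goodness-of-fit statements above compare a Pearson chi-square statistic with a tabulated critical value. The Python sketch below shows one common way to compute the statistic for an exponential fit. The binning (16 equal-probability bins, giving 16 - 1 - 1 = 14 degrees of freedom after fitting the mean) is an assumption, since the slides report only the degrees of freedom, and the runtimes are synthetic stand-ins.

```python
import math

def chi_square_exponential(samples, num_bins):
    """Pearson chi-square statistic against an exponential fit using the sample mean."""
    n = len(samples)
    mean = sum(samples) / n
    # Interior edges of num_bins equal-probability bins under the fitted exponential.
    edges = [-mean * math.log(1.0 - k / num_bins) for k in range(1, num_bins)]
    observed = [0] * num_bins
    for x in samples:
        observed[sum(1 for edge in edges if x > edge)] += 1
    expected = n / num_bins
    return sum((count - expected) ** 2 / expected for count in observed)

CRITICAL_14_DOF = 23.68  # chi^2(0.05, 14), as quoted on the slide
NUM_BINS = 16            # assumed binning: 16 bins - 1 - 1 fitted parameter = 14 dof

# Synthetic stand-in for 129 runtimes: exponential quantiles with mean 19.6 (not the talk's data).
runtimes = [-19.6 * math.log(1.0 - (i + 0.5) / 129) for i in range(129)]
statistic = chi_square_exponential(runtimes, NUM_BINS)
verdict = "accept" if statistic < CRITICAL_14_DOF else "reject"
print(f"chi-square = {statistic:.2f}, critical = {CRITICAL_14_DOF}: {verdict} exponential hypothesis")
```

The same function applied to runtimes that cluster around a single value produces a statistic far above the critical value, which is the situation in panel (a), where a normal model fits instead.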

Isomorphs Example: BinPacking Problem

Class Signature Invariants

[Figure: NumOfItems and BinWeight vs BinIndex (0 to 11) for ref_31_08_12_00 (signature invariants).]

Reference Instance File

    p bpfi 31 8 1 7
    5 2 3 2 1 4 5 4 1 2 5 3 5 2 1 5 2 5 3 1 4 5 3 1 4 1 4 5 1 3 4

Reference Instance Solution

[Figure: bin contents vs BinIndex (0 to 11) for ref_31_08_12_00_00 (solution).]

Isomorph Instance File

    p bpfi 31 8 1 7
    6 5 2 3 7 4 1 1 3 5 2 1 4 2 2 6 1 2 4 1 4 6 1 2 1 5 1 3 1 3 7

Isomorph Instance Solution

[Figure: bin contents vs BinIndex (0 to 11) for ref_31_08_12_00_01 (solution).]
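A quick consistency check on the two instance files above: both should describe 31 items whose weights exactly fill 12 bins of size 8. The Python sketch below reads the header as `p bpfi <items> <bin size> <min weight> <max weight>` followed by the item weights; that field interpretation is an assumption, since the slides do not document the file format.

```python
def parse_bpfi(text):
    """Parse a bpfi instance, assuming a 6-token header followed by the item weights."""
    tokens = text.split()
    if tokens[:2] != ["p", "bpfi"]:
        raise ValueError("not a bpfi instance")
    num_items, bin_size = int(tokens[2]), int(tokens[3])
    weights = [int(t) for t in tokens[6:]]  # tokens 4 and 5: assumed min/max weight
    return num_items, bin_size, weights

REFERENCE_FILE = """p bpfi 31 8 1 7
5 2 3 2 1 4 5 4 1 2 5 3 5 2 1 5 2 5 3 1 4 5 3 1 4 1 4 5 1 3 4"""
ISOMORPH_FILE = """p bpfi 31 8 1 7
6 5 2 3 7 4 1 1 3 5 2 1 4 2 2 6 1 2 4 1 4 6 1 2 1 5 1 3 1 3 7"""

for name, text in [("reference", REFERENCE_FILE), ("isomorph", ISOMORPH_FILE)]:
    num_items, bin_size, weights = parse_bpfi(text)
    print(name, num_items, len(weights), sum(weights), sum(weights) / bin_size)
```

Both files list 31 weights summing to 96, i.e. exactly 12 full bins of size 8, which matches the instance name ref_31_08_12_00 even though the individual weights differ.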


Isomorphs Example: ListSize=99, BinSize=13

[Figure panels over BinIndex 0 to 30: Reference Instance; Isomorph Instance; Class Signature Invariants (NumOfItems, BinWeight). Each panel is labeled ListSize=99 BinSize=13 WeightTotal=99 BinTotal=31.]

Isomorphs Example: BinPacking Experiments

    % ../ref2bpfi 13 1011
    ListSize = 100089 WeightTotal = 100089 BinTotal=31341
    % ../../xSolver_/packX < uni_000100089_0001_0013_0013_100089.bpf

    100089 0.0769231 1 31765 24900 424 423.538
    100089 0.0769231 1 31758 24810 417 416.923
    100089 0.0769231 1 31760 24879 419 418.538
    100089 0.0769231 1 31753 24837 412 411.615
    100089 0.0769231 1 31758 24781 417 416.538
    100089 0.0769231 1 31758 24781 417 416.538
    100089 0.0769231 1 31746 24779 405 404.462
    100089 0.0769231 1 31752 24826 411 410.077
    100089 0.0769231 1 31754 24858 413 412.769
    100089 0.0769231 1 31759 24940 418 417.462
    100089 0.0769231 1 31759 24746 418 417.308

[Figure: histogram of Excess Bins (from known optimum of 31,341 bins), range 405 to 423.]


xBed: A Testbed Environment for Experimental Algorithmics*

• a schema and an architecture
  - a single MasterBed for users
  - a user-configured UserBed in desired domain
• encapsulation class scripts
• encapsulated solvers
• a userbed configuration flow example
• a userbed schema example
• a configuration file example
• a default posting of results
• web-posted experiments from SatBed

xBed : A Schema and An Architecture

MasterBed/ (AdminPrivileges): Docs/ BIN/ MasterBedConfigFile ClassScript ClassScript_/ ProgA ProgA_ ProgATests xBed

UserBed/ (UserPrivileges): Benchmarks/ BIN/ UserBedConfigFile ClassScript_/ ProgB ProgB_ ProgBTests Run/ RunConfigFile Results/

Any number of encapsulation class scripts and programs.

xBed can invoke any number of (script, program) pairs via RunConfigFiles.


xBed : Encapsulation Class Scripts

Class script    Encapsulated program examples
------------    -----------------------------
xGenRef         sched_classic, ref2bpf, …
xGenMorph       morph4cnf, morph4bpf, …
xGenRand        rand2wsf, rand2bpf, …
xTranslator     wsf2cnf, cnf2lpx, …
xSolver         chaff, cplex, packX, …
xAnalyzer
xVerifier
xGenStat
xViewer

NOTE: Only a single version of a class script can exist under MasterBed/UserBed. However, another version of a program can be maintained under UserBed.

xBed: Encapsulated Solvers

CrossingsBed: • dot (AT&T) • tr01--tr23

SatBed: • chaff • satire • sato • satoL • walksat • unitwalk • qingting • dp0

LabsBed: • es • kl

MinOnesBed: • espresso • scherzo • aura2 • bsolo • umbra • lp_solve • cplex

MaxOnesBed: • umbraMax • lp_solve • cplex • triglav

CspBed: • sub_SAT • wpack • maxsat • LB2 • qtmax

BinPackingBed: • packX


xBED: A UserBed Configuration Flow Example

Tasks: refGen, classGen, solver1 ••• solverN, statsGen, rptGen

Data: …

… that can be created in the user-configuration file which is then given as an argument to the xBed command

xBed: A UserBed Schema Example

Here, we have a testbed archiving a large number of MinOnes problems which are either Unate or Binate.

As shown, isomorphs have already been generated from reference instances which are based on triplets in the ‘Steiner’ family of increasing size.


xBed: A Configuration File Example

    set BenchmarksHome      ../Bench_DemoBedIsValid
    set BenchmarksHomeNode  Bench_DemoBedIsValid/Unate
    set ResultsHome         Test_xSolverArchive/RESULTS-Tmp

    set InstanceDirList "\
        SmallSteiner/_ISOMORPHS/steiner_a0009_CLR \
        SmallSteiner/_ISOMORPHS/steiner_a0015_CLR \
    "
    foreach InstanceDir $InstanceDirList {
        set Script  "xSolver umbra,bb1 $InstanceDir .lpx -NumRepeats=16 "
        set Program "umbra (FileName) "
        lappend CommandPairs [ list $Script $Program ]
        set Script  "xSolver lp_solve,S2 $InstanceDir .lpx -NumRepeats=16 "
        set Program "lp_solve (FileName) "
        lappend CommandPairs [ list $Script $Program ]
    }

The user can keep it simple -- using commands set, lappend, and list only, or -- unleash the full programming environment of tcl

xBed: A Default Posting of Results

The posted results inherit the hierarchy of data instances. The results directory is distinct from instance directory and the results are placed at the leaves of instance directory structure.


SatBed: Web-posted Experiments

[Figure: web-posted SatBed results, with the single instance reported in a traditional experiment marked against its isomorph class. Note the variability range for the isomorph class.]

In mission-critical applications, the number of test instances should be very large ….

Final note: not all SAT solvers return solutions that always pass a verification audit!!

A Call for Isomorph Classes and Instances

Ask not what a solver can do for a set of unrelated instances …

… ask what a solver can do for a set of instances in the same isomorph class!


A Strategy and A Challenge

The Strategy:

• Statistical Methods
• Algorithm Engineering
• Instance Engineering

The Challenge:

Demonstrate at least one solver that dominates all other solvers on instances of more than one isomorph class.

A Case Study: Highlights from SAT’2003*

QingTing: A Fast SAT Solver Using Local Search and Efficient Unit Propagation

Xiao Yu Li, Matthias F. Stallmann, Franc Brglez


Motivation: UnitWalk Algorithm

Incomplete local search algorithm

Uses unit propagation intensively

Competitive with other solvers such as chaff, sato (complete solvers) and WalkSAT, GSAT (incomplete solvers)

This Talk: A Look at Two Benchmarks

Structured benchmark: sched6.cnf (Variables = 808, Clauses = 12024, PC class size = 32)

Random 3-SAT benchmark: uf250-1065-087 (Variables = 250, Clauses = 1065, PC class size = 128)


sched6: UW1 vs Chaff (1)

Comparison between UnitWalk1 and chaff on a structured instance.

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 100) for sched6_chaff. Exponential distribution, mean = 18.0, std deviation = 20.1.]

sched6: UW1 vs Chaff (2)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 100) for sched6_chaff (exp. d., mean/std 18.0/20.1) and sched6_uw1 (exp. d., mean/std 0.20/0.19).]

UW1 solves the instance with less mean runtime and less variation than chaff.


uf250-1065-87: UW1 vs Chaff (1)

Comparison between UnitWalk1 and chaff on a randomly generated instance.

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 10000) for uf250..87_chaff (norm. d., mean/std 69.4/62.0).]

uf250-1065-87: UW1 vs Chaff (2)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 10000) for uf250..87_chaff (norm. d., mean/std 69.4/62.0) and uf250..87_uw1 (exp. d., mean/std 12.3/12.5).]

UW1 outperforms chaff again.


Our Solver: QingTing1

How to improve UnitWalk?

1. Improve its unit propagation
• UnitWalk uses a counter-based adjacency list as its underlying data structure

QingTing1 (QT1) uses two well-known unit propagation techniques used in complete solvers:
• sato’s unit propagation algorithm
• chaff’s watched literal lazy data structure

sched06: QT1 vs UW1 (1)

How much better can we do with better unit propagation techniques on a structured instance?

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 1) for sched6-uw1 (exp. d., mean/std 0.20/0.19).]


sched06: QT1 vs UW1 (2)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 1) for sched6-uw1 (exp. d., mean/std 0.20/0.19) and sched6-qt1 (exp. d., mean/std 0.037/0.027).]

With all other parameters, e.g. flips, approximately the same, QT1 runs five times faster than UW1.

uf250-1065-87: QT1 vs UW1 (1)

How much better can we do on a randomly generated instance?

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 100) for uf250..87-qt1. Curve annotation: exp. d., mean/std 16.7/17.2.]


uf250-1065-87: QT1 vs UW1 (2)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 100) for uf250..87-qt1 and uf250..87-uw1. Curve annotations (mean/std): 12.3/12.5 exp. d. and 16.7/17.2 exp. d.]

t-test result: t = 1.88 < 1.97

Not much! The 5%-level t-test shows no difference. The lack of improvement is due to the cost of maintaining the dynamic data structure.

Switching Strategy

Another local search algorithm: WalkSAT

Observations:
1) QT1 works well on structured instances, but not on random instances
2) WalkSAT-Novelty works well on random SAT instances

Questions:
1) How do we know which one to use (switching strategy)?
2) How do we decide with low cost?


Switching Strategy: Method A (1)

[Flowchart: Formula → Random Assignment → Formula Empty? If yes, terminate; if no, make another random assignment.]

The formula is empty when each clause is empty or satisfied.

Switching Strategy: Method A (2)

Method A measures the # of random assignments and normalizes it wrt the # of variables:

random assignment % = (# of random assignments) / (# of variables)

Repeat the process 128 times and calculate the average.


Method A: An Example

Formula:                    (x y z) (-x y) (-x y z) (-y -z) (-x z)
Random assignment x = 1:    (y) (y z) (-y -z) (z)
Random assignment z = 0:    (y) (y) ( )
Random assignment y = 0:    ( ) ( ) ( )

formula empty !

Method A: Experiments

Group                  Benchmark Name   Random Assignment % (Mean/Std)   Distribution
--------------------   --------------   ------------------------------   ------------
Structured Instances   bw_large_a       0.98/0.05                        near-exp
                       logistics_b      0.98/0.06                        near-exp
Random Instances       uf250-1065-27    0.98/0.06                        near-exp
                       uf250-1065-87    0.96/0.07                        near-exp

Method A is no good. There is no difference!


Switching Strategy: Method B

[Flowchart: Formula → Random Assignment → Unit Propagation → Formula Empty? If yes, terminate; if no, make another random assignment.]

Repeat the procedure 128 times and calculate the average random assignment %.

Method B: On the Same Example

Formula:                      (x y z) (-x y) (-x y z) (-y -z) (-x z)
Random assignment x = 1:      (y) (y z) (-y -z) (z)
Unit propagate with y = 1:    (-z) (z)
Unit propagate with z = 0:    ( ) ( )

formula empty !
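Methods A and B differ only in whether unit propagation runs between random assignments, so both can share one routine. The Python sketch below follows the flowcharts under two assumptions the slides leave open: the next random variable is drawn from the variables still present in unresolved clauses, and when several unit clauses exist the first one found is propagated.

```python
import random

def clause_state(clause, assignment):
    """Return None if the clause is satisfied, else its list of unassigned literals."""
    remaining = []
    for lit in clause:
        var = abs(lit)
        if var in assignment:
            if assignment[var] == (lit > 0):
                return None
        else:
            remaining.append(lit)
    return remaining

def random_assignment_fraction(clauses, num_vars, rng, propagate):
    """# of random assignments / # of variables; Method A (propagate=False) or B (True)."""
    assignment = {}
    random_count = 0
    while True:
        # Keep only clauses that are neither satisfied nor empty.
        open_clauses = [r for r in (clause_state(c, assignment) for c in clauses) if r]
        if not open_clauses:  # formula "empty": every clause is empty or satisfied
            return random_count / num_vars
        unit = None
        if propagate:
            unit = next((r[0] for r in open_clauses if len(r) == 1), None)
        if unit is not None:
            assignment[abs(unit)] = unit > 0  # forced by unit propagation, not counted
        else:
            free_vars = sorted({abs(l) for r in open_clauses for l in r})
            assignment[rng.choice(free_vars)] = rng.random() < 0.5
            random_count += 1

# The slide's example with x, y, z numbered 1, 2, 3.
EXAMPLE = [[1, 2, 3], [-1, 2], [-1, 2, 3], [-2, -3], [-1, 3]]
rng = random.Random(0)
method_a = [random_assignment_fraction(EXAMPLE, 3, rng, False) for _ in range(128)]
method_b = [random_assignment_fraction(EXAMPLE, 3, rng, True) for _ in range(128)]
print("Method A mean:", sum(method_a) / 128, "Method B mean:", sum(method_b) / 128)
```

On this tiny formula Method A always needs at least two random assignments, while Method B never needs more than two because unit propagation fixes the rest, which is the kind of gap the measurement is designed to expose.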


Method B: Experiments (1)

Group                  Benchmark Name   Random Assignment % (Mean/Std)   Distribution
--------------------   --------------   ------------------------------   ------------
Structured Instances   bw_large_a       0.010/0.006                      near-exp
                       logistics_b      0.015/0.002                      normal
Random Instances       uf250-1065-27    0.150/0.025                      normal
                       uf250-1065-87    0.156/0.026                      normal

Random assignment % has large gap between the two groups, but is it because of 3-SAT?

Method B: Experiments (2)

Group                  Benchmark Name     Random Assignment % (Mean/Std)   Distribution
--------------------   ----------------   ------------------------------   ------------
Structured Instances   bw_large_a         0.010/0.006                      near-exp
                       bw_large_a_3sat    0.046/0.024                      near-exp
                       logistics_b        0.015/0.002                      normal
                       logistics_b_3sat   0.049/0.066                      near-exp
Random Instances       uf250-1065-27      0.150/0.025                      normal
                       uf250-1065-87      0.156/0.026                      normal

Random assignment % increases but the gap still exists!

What about other randomly generated 3-SAT?


Method B: Experiments (3)

Group                  Benchmark Name     Random Assignment % (Mean/Std)   Distribution
--------------------   ----------------   ------------------------------   ------------
Structured Instances   bw_large_a         0.010/0.006                      near-exp
                       bw_large_a_3sat    0.046/0.024                      near-exp
                       logistics_b        0.015/0.002                      normal
                       logistics_b_3sat   0.049/0.066                      near-exp
Random Instances       uf250-1065-27      0.150/0.025                      normal
                       uf250-1065-87      0.156/0.026                      normal
                       hgen2-v250-s2      0.159/0.022                      normal
                       hgen2-v250-s500    0.161/0.019                      near-normal

QingTing2 and UnitWalk2

QingTing2
• Switching strategy (low overhead)
• On randomly generated instances, QT2 behaves like WalkSAT-Novelty
• On structured instances, QT2 behaves like QT1

UnitWalk2
• UnitWalk2 is the latest version of UnitWalk and it also does some structure detection


sched6: QT2/QT1 vs UW2/UW1 (1)

What will the improvements be on a structured instance?

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 1) for sched6_uw1 (exp. d., mean/std 0.20/0.19).]

sched6: QT2/QT1 vs UW2/UW1 (2)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 1) for sched6_uw1 (exp. d., mean/std 0.20/0.19) and sched6-qt1 (exp. d., mean/std 0.037/0.027).]

QT1 runs five times faster than UW1


sched6: QT2/QT1 vs UW2/UW1 (3)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 1) for sched6_uw1 (exp. d., mean/std 0.20/0.19), sched6-qt1 (exp. d., mean/std 0.037/0.027), and sched6_uw2 (exp. d., mean/std 0.04/0.03).]

UW2 and UW1 are the same (t-test: t = 1.44 < 1.97)

QT1 runs five times faster than UW1

sched6: QT2/QT1 vs UW2/UW1 (4)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 1) for sched6_uw1 (exp. d., mean/std 0.20/0.19), sched6-qt1 (exp. d., mean/std 0.037/0.027), sched6_uw2 (exp. d., mean/std 0.04/0.03), and sched6_qt2 (exp. d., mean/std 0.04/0.03).]

QT1 and QT2 are the same (t-test: t = 1.18 < 1.97)

UW2 and UW1 are the same (t-test: t = 1.44 < 1.97)

QT1 runs five times faster than UW1


uf250..87: QT2/QT1 vs UW2/UW1 (1)

What about random 3-SAT instances?

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 100) for uf250..87-uw1. Curve annotation: exp. d., mean/std 16.7/17.2.]

uf250..87: QT2/QT1 vs UW2/UW1 (2)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 100) for uf250..87-uw1 and uf250..87-qt1. Curve annotations (mean/std): exp. d. 12.3/12.5 and exp. d. 16.7/17.2.]

UW1 performs the same as QT1 (t-test: t = 1.88 < 1.97)


uf250..87: QT2/QT1 vs UW2/UW1 (3)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 100) for uf250..87-uw1, uf250..87-qt1, and uf250..87_uw2. Curve annotations (mean/std): exp. d. 12.3/12.5, exp. d. 16.7/17.2, exp. d. 0.39/0.29.]

UW2 outperforms UW1 by a factor of 31 ...

UW1 performs the same as QT1 (t-test: t = 1.88 < 1.97)

uf250..87: QT2/QT1 vs UW2/UW1 (4)

[Figure: solvability vs runtime (seconds, log scale, 0.01 to 100) for uf250..87-uw1, uf250..87-qt1, uf250..87_uw2, and uf250..87_qt2. Curve annotations (mean/std): exp. d. 16.7/17.2, exp. d. 12.3/12.5, exp. d. 0.39/0.29, exp. d. 0.31/0.28, exp. d. 0.21/0.19.]

QT2 outperforms UW2 slightly (t-test: t = 2.24 > 1.97)

UW2 outperforms UW1 by a factor of 31 ...

UW1 performs the same as QT1 (t-test: t = 1.88 < 1.97)


Summary and Conclusions (1)

• QingTing1

- Improving UnitWalk

- Faster Unit Propagation

• QingTing2

- Switching strategy

- Combine QingTing1 with WalkSAT

Summary and Conclusions (2)

Future research directions

1. Use switching strategy in a complete solver

2. Use learning techniques in local search algorithms

3. Combine the theory and the experimental methodology