Distributions of Randomized Backtrack Search


Page 1: Distributions of Randomized Backtrack Search

Distributions of Randomized Backtrack Search

• Key Properties:

• I. Erratic behavior of the mean

• II. Distributions have “heavy tails”

Page 2: Distributions of Randomized Backtrack Search

Erratic Behavior of Search Cost: Quasigroup Completion Problem

[Figure: sample mean of the search cost versus number of runs (axis ticks at 500 and 2000). Annotations: sample mean 3500!; Median = 1!]

Page 3: Distributions of Randomized Backtrack Search


Page 4: Distributions of Randomized Backtrack Search

[Figure: proportion of cases solved versus number of backtracks (two panels). Annotations: 75% <= 30; 5% > 100000.]

Page 5: Distributions of Randomized Backtrack Search

Heavy-Tailed Distributions

• … infinite variance … infinite mean

• Introduced by Pareto in the 1920’s --- “probabilistic curiosity.”

• Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.

• Examples: stock market, earthquakes, weather, ...

Page 6: Distributions of Randomized Backtrack Search

Decay of Distributions

• Standard --- Exponential Decay

e.g. Normal: $\Pr[X > x] \sim C e^{-x^2}$, for some $C > 0$, $x > 1$

• Heavy-Tailed --- Power Law Decay

e.g. Pareto-Levy: $\Pr[X > x] \sim C x^{-\alpha}$, $x > 0$

Page 7: Distributions of Randomized Backtrack Search

[Figure: power law decay compared with the exponential decay of a standard distribution (finite mean & variance).]

Page 8: Distributions of Randomized Backtrack Search

Normal, Cauchy, and Levy

Normal - Exponential Decay

Cauchy - Power law Decay

Levy - Power law Decay

Page 9: Distributions of Randomized Backtrack Search

Tail Probabilities (Standard Normal, Cauchy, Levy)

c    Normal        Cauchy    Levy
0    0.5           0.5       1
1    0.1587        0.25      0.6827
2    0.0228        0.1476    0.5205
3    0.001347      0.1024    0.4363
4    0.00003167    0.078     0.3829
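The tail probabilities above can be recomputed from closed forms. This is a minimal sketch, assuming the table reports Pr[X > c] for a standard normal, a standard Cauchy, and a Levy distribution with location 0 and scale 1; the function names are illustrative and not from the slides. Under those assumptions the values agree with the table to about four decimal places.

```python
import math

def normal_tail(c):
    """Pr[X > c] for a standard normal X."""
    return 0.5 * math.erfc(c / math.sqrt(2))

def cauchy_tail(c):
    """Pr[X > c] for a standard Cauchy X."""
    return 0.5 - math.atan(c) / math.pi

def levy_tail(c):
    """Pr[X > c] for a Levy X with location 0 and scale 1 (assumed parameters)."""
    if c <= 0:
        return 1.0  # the Levy distribution is supported on x > 0
    return math.erf(math.sqrt(1.0 / (2.0 * c)))

for c in range(5):
    print(c, normal_tail(c), cauchy_tail(c), levy_tail(c))
```

The normal column shrinks by orders of magnitude per unit step, while the Cauchy and Levy columns shrink slowly, which is the exponential versus power-law contrast.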

Page 10: Distributions of Randomized Backtrack Search

Example of Heavy Tailed Model(Random Walk)

• Random Walk:

• Start at position 0

• Toss a fair coin:

– with each head take a step up (+1)

– with each tail take a step down (-1)

X --- number of steps the random walk takes to return to position 0.
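The return time X of this coin-tossing walk can be sampled directly. The sketch below is illustrative only: the step cap `max_steps`, the seed, and the sample size are assumptions added so the simulation terminates, since the return time of a fair walk has infinite mean.

```python
import random

def return_time(rng, max_steps=10_000):
    """Steps until a fair +1/-1 walk started at 0 first returns to 0.

    Returns None if the walk has not returned within max_steps
    (an illustrative cap, not part of the model).
    """
    position = 0
    for step in range(1, max_steps + 1):
        position += 1 if rng.random() < 0.5 else -1
        if position == 0:
            return step
    return None

rng = random.Random(0)
samples = [return_time(rng) for _ in range(2000)]
print("fraction returning in exactly 2 steps:", samples.count(2) / len(samples))
print("fraction not back within the cap:", samples.count(None) / len(samples))
```

About half of the walks return after two steps, yet a small fraction is still away from zero after thousands of steps.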

Page 11: Distributions of Randomized Backtrack Search

The record of 10,000 tosses of an ideal coin (Feller)

[Figure: path of the walk, with annotations marking a zero crossing and long periods without zero crossing.]

Page 12: Distributions of Randomized Backtrack Search

Random Walk: Heavy-Tails vs. Non-Heavy-Tails

[Figure: unsolved fraction 1-F(x) versus X, the number of steps the walk takes to return to zero (log scale). Curves: Random Walk, Normal(2,1), Normal(2,1000000). Annotations: Median = 2 (50%); 0.1% > 200000.]

Page 13: Distributions of Randomized Backtrack Search

How to Check for “Heavy Tails”?

• Log-log plot of the tail of the distribution should be approximately linear.

• Slope gives the value of $\alpha$:

• $\alpha \le 1$: infinite mean and infinite variance

• $1 < \alpha < 2$: infinite variance
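The log-log check can be sketched as a least-squares fit on the empirical tail. This is a minimal illustration, not the method used for the plots in the slides: the synthetic Pareto data, the seed, and the name `tail_exponent` are assumptions. In practice the fit is usually restricted to the upper tail of the data.

```python
import math
import random

def tail_exponent(samples):
    """Negated least-squares slope of log(1 - F(x)) against log(x)."""
    xs = sorted(samples)
    n = len(xs)
    # Empirical survival at the i-th smallest point: fraction of samples >= it.
    pts = [(math.log(x), math.log((n - i) / n)) for i, x in enumerate(xs)]
    mean_x = sum(px for px, _ in pts) / n
    mean_y = sum(py for _, py in pts) / n
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in pts)
    sxx = sum((px - mean_x) ** 2 for px, _ in pts)
    return -sxy / sxx

rng = random.Random(1)
alpha = 0.5  # assumed true exponent of the synthetic data
# Inverse-transform sampling: Pr[X > x] = x**(-alpha) for x >= 1.
data = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(5000)]
print("estimated alpha:", tail_exponent(data))
```

An estimate below 1 on real run-time data would point to an infinite mean, and an estimate between 1 and 2 to an infinite variance.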

Page 14: Distributions of Randomized Backtrack Search

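Heavy-Tailed Behavior in QCP Domain

[Figure: unsolved fraction (1-F(x)) (log) versus number of backtracks (log). Curve labels: 0.466, 0.319, 0.153, presumably the fitted values of $\alpha$. Annotations: 18% unsolved; 0.002% unsolved.]

$\alpha < 1$ => Infinite mean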

Page 15: Distributions of Randomized Backtrack Search

Formal Models of Heavy-Tailed Behavior in Combinatorial Search

Chen, Gomes, Selman 2001

Page 16: Distributions of Randomized Backtrack Search

Motivation

• Research on heavy tails has been largely based on empirical studies of run time distributions.

• Goal: to provide a formal characterization of tree search models and show under what conditions heavy-tailed distributions can arise.

• Intuition: heavy-tailed behavior arises when

• wrong branching decisions may lead the procedure to explore an exponentially large subtree of the search space that contains no solutions;

• the time to find a solution varies widely across runs, which leads to highly different trees from run to run.

Page 17: Distributions of Randomized Backtrack Search

Balanced vs. Imbalanced Tree Model

• Balanced Tree Model:

• chronological backtrack search model;

• fixed variable ordering;

• random child selection with no propagation mechanisms;

(show demo)

Page 18: Distributions of Randomized Backtrack Search

$E[T(n)] = \dfrac{1 + 2^n}{2}$

$V[T(n)] = \dfrac{2^{2n} - 1}{12}$

The run time distribution of chronological backtrack search on a complete balanced tree is uniform (therefore not heavy-tailed). Both the expected run time and the variance scale exponentially.
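The uniform run time on a balanced tree can be checked by simulation. The sketch below assumes a complete binary tree of depth n with a single solution leaf, and counts the leaves visited by chronological backtracking with random child selection. The names `leaves_visited`, `target`, and `runs` are illustrative, as are the depth and seed.

```python
import random

def leaves_visited(depth, target, rng, lo=0):
    """Chronological backtracking over leaves [lo, lo + 2**depth).

    Returns (number of leaves visited, whether the target leaf was found).
    """
    size = 2 ** depth
    if not (lo <= target < lo + size):
        # No solution below this node: the whole subtree is searched in vain,
        # so all of its leaves are counted without walking them one by one.
        return size, False
    if depth == 0:
        return 1, True
    half = size // 2
    children = [lo, lo + half]
    if rng.random() < 0.5:  # random child selection
        children.reverse()
    visited = 0
    for child_lo in children:
        count, found = leaves_visited(depth - 1, target, rng, child_lo)
        visited += count
        if found:
            return visited, True
    return visited, False  # not reached: the target lies in one of the children

rng = random.Random(4)
n, target = 6, 0  # assumed depth and solution leaf
runs = [leaves_visited(n, target, rng)[0] for _ in range(5000)]
print("sample mean:", sum(runs) / len(runs), "formula:", (1 + 2 ** n) / 2)
```

Each level independently adds either nothing or a full wrong subtree with probability 1/2, which is why the count is uniform on 1..2^n.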

Page 19: Distributions of Randomized Backtrack Search

Balanced Tree Model

– The expected run time and variance scale exponentially in the height of the search tree (number of variables):

$E[T(n)] = \dfrac{1 + 2^n}{2}$, $V[T(n)] = \dfrac{2^{2n} - 1}{12}$

– The run time distribution is uniform (not heavy-tailed).

– Backtrack search on the balanced tree model has no restart strategy with expected polynomial time.

Chen, Gomes & Selman 01

Page 20: Distributions of Randomized Backtrack Search

• How can we improve on the balanced search tree model?

• Very clever search heuristic that leads quickly to the solution node - but that is hard in general;

• Combination of pruning, propagation, dynamic variable ordering that prune subtrees that do not contain the solution, allowing for runs that are short.

• ---> resulting trees may vary dramatically from run to run.

Page 21: Distributions of Randomized Backtrack Search

Formal Model Yielding Heavy-Tailed Behavior

• T - the number of leaf nodes visited up to and including the successful node; b - branching factor

$P[T = b^i] = (1 - p)\,p^i, \quad i \ge 0$

[Figure: search tree for b = 2]

(show demo)
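A model of the form Pr[T = b^i] = (1 - p) p^i for i >= 0 is easy to sample. In this sketch p is read as the probability that a branching step goes wrong. That reading is an interpretation, since the slide does not define p, and the values p = 0.5, b = 2 are illustrative.

```python
import random

def sample_T(p, b, rng):
    """Draw T with Pr[T = b**i] = (1 - p) * p**i for i >= 0."""
    i = 0
    while rng.random() < p:  # each further wrong choice multiplies the cost by b
        i += 1
    return b ** i

rng = random.Random(2)
p, b = 0.5, 2  # assumed parameters
draws = [sample_T(p, b, rng) for _ in range(20000)]
for i in range(6):
    empirical = sum(t >= b ** i for t in draws) / len(draws)
    print(f"Pr[T >= {b**i}]: empirical {empirical:.4f}, model {p**i:.4f}")
```

The tail Pr[T >= b^i] = p^i falls by a constant factor each time T is multiplied by b, which is a straight line on a log-log plot.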

Page 22: Distributions of Randomized Backtrack Search

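• Expected Run Time: $p \ge \frac{1}{b} \Rightarrow E[T] \to \infty$ (infinite expected time)

• Variance: $p \ge \frac{1}{b^2} \Rightarrow V[T] \to \infty$ (infinite variance)

• Tail: $p \ge \frac{1}{b^2} \Rightarrow P[T \ge L] \ge p^2 L^{\log_b p}$ (heavy-tailed)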
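For a model with Pr[T = b^i] = (1 - p) p^i, the mean is a geometric series with ratio pb and the second moment a geometric series with ratio pb^2, so they diverge once p reaches 1/b and 1/b^2 respectively. The partial sums below illustrate this; `partial_moment` and the chosen values of p are assumptions for illustration.

```python
def partial_moment(p, b, power, terms):
    """Sum of (1 - p) * p**i * (b**i)**power over i = 0 .. terms - 1."""
    return sum((1 - p) * p ** i * (b ** i) ** power for i in range(terms))

b = 2
for p in (0.2, 0.3, 0.5):
    m50, m100 = partial_moment(p, b, 1, 50), partial_moment(p, b, 1, 100)
    s50, s100 = partial_moment(p, b, 2, 50), partial_moment(p, b, 2, 100)
    print(f"p={p}: E[T] partial sums {m50:.3g}, {m100:.3g}; "
          f"E[T^2] partial sums {s50:.3g}, {s100:.3g}")
```

With b = 2, p = 0.2 keeps both sums bounded, p = 0.3 (above 1/4) lets only the second moment grow without bound, and p = 0.5 makes the mean itself grow with the number of terms.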

Page 23: Distributions of Randomized Backtrack Search

Bounded Heavy-Tailed Behavior

(show demo)

Page 24: Distributions of Randomized Backtrack Search

No Heavy-tailed behavior for Proving Optimality

Page 25: Distributions of Randomized Backtrack Search

Proving Optimality

Page 26: Distributions of Randomized Backtrack Search

Small-World Vs. Heavy-Tailed Behavior

• Does a Small-World topology (Watts & Strogatz) induce heavy-tail behavior?

The constraint graph of a quasigroup exhibits a small-world topology (Walsh 99)

Page 27: Distributions of Randomized Backtrack Search

Exploiting Heavy-Tailed Behavior

• Heavy Tailed behavior has been observed in several domains: QCP, Graph Coloring, Planning, Scheduling, Circuit synthesis, Decoding, etc.

• Consequence for algorithm design: use restarts or parallel / interleaved runs to exploit the extreme variance in performance.

• Restarts provably eliminate heavy-tailed behavior.

(Gomes et al. 97, Hoos 99, Horvitz 99, Huberman, Lukose and Hogg 97, Karp et al. 96, Luby et al. 93, Rish et al. 97, Walsh 99)

Page 28: Distributions of Randomized Backtrack Search

Super-linear Speedups

[Figure: a row of runs; five are marked X and labeled 10, and one is marked solved.]

Sequential: 50 + 1 = 51 seconds

Parallel: 10 machines --- 1 second, 51 x speedup

Interleaved (1 machine): 10 x 1 = 10 seconds, 5 x speedup
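The speedup arithmetic can be written out explicitly. This sketch assumes one reading of the slide: five unsuccessful runs of 10 seconds each precede a run that succeeds in 1 second, and ten independent runs are available. All variable names are illustrative.

```python
cutoff_seconds = 10      # time spent on each unsuccessful run
failed_runs = 5          # unsuccessful runs tried before the lucky one
lucky_run_seconds = 1    # time the successful run needs
machines = 10            # independent runs available

sequential = failed_runs * cutoff_seconds + lucky_run_seconds
parallel = lucky_run_seconds                  # all runs start at once
interleaved = machines * lucky_run_seconds    # round-robin on one machine

print(sequential, parallel, interleaved)
print("parallel speedup:", sequential / parallel)
print("interleaved speedup:", sequential / interleaved)
```

The parallel speedup of 51 on 10 machines exceeds the machine count, which is what makes it super-linear.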

Page 29: Distributions of Randomized Backtrack Search

Restarts

[Figure: unsolved fraction 1-F(x) versus number of backtracks (log), comparing no restarts with a restart every 4 backtracks. Annotations: 70% unsolved; 250 (62 restarts); 0.001% unsolved.]

Page 30: Distributions of Randomized Backtrack Search

Example of Rapid Restart Speedup (planning)

[Figure: number of backtracks (log) versus cutoff (log); cutoff axis from 1 to 1000000, backtracks axis from 1000 to 1000000. Annotations: 20; 2000 (~100 restarts); 100000 (~10 restarts).]

Page 31: Distributions of Randomized Backtrack Search

Sketch of proof of elimination of heavy tails

• Let’s truncate the search procedure after m backtracks.

• Probability of solving the problem with the truncated version: $p_m = \Pr[X \le m]$, where $X$ is the number of backtracks to solve the problem.

• Run the truncated procedure and restart it repeatedly.

Page 32: Distributions of Randomized Backtrack Search

$Y$ = total number of backtracks with restarts

Number of restarts = $\lfloor Y/m \rfloor \sim \text{Geometric}(p_m)$

$\bar{F}(y) = \Pr[Y > y] \le (1 - p_m)^{\lfloor y/m \rfloor} \le c_1 e^{-c_2 y}$

$Y$ does not have heavy tails
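The effect of a fixed cutoff can be simulated. In this sketch the heavy-tailed cost X is a stand-in drawn from Pr[X = b^i] = (1 - p) p^i with p = 0.5, b = 2, which has an infinite mean. The cutoff m = 4, the seed, and the function names are assumptions for illustration, not taken from the slides.

```python
import random

def sample_X(rng, p=0.5, b=2):
    """Heavy-tailed stand-in for the backtrack count: Pr[X = b**i] = (1-p) p**i."""
    i = 0
    while rng.random() < p:
        i += 1
    return b ** i

def total_with_restarts(rng, m):
    """Total backtracks Y when every run is cut off after m backtracks."""
    total = 0
    while True:
        x = sample_X(rng)
        if x <= m:
            return total + x
        total += m  # run truncated; restart with fresh randomness

rng = random.Random(3)
m = 4
p_m = 1 - 0.5 ** 3  # Pr[X <= 4] = Pr[i <= 2] for p = 0.5, b = 2
ys = [total_with_restarts(rng, m) for _ in range(20000)]
for k in range(1, 4):
    empirical = sum(y > k * m for y in ys) / len(ys)
    print(f"Pr[Y > {k * m}]: empirical {empirical:.5f}, bound {(1 - p_m) ** k:.5f}")
```

Each additional block of m backtracks multiplies the unsolved fraction by 1 - p_m, so the tail of Y decays geometrically even though X itself has no finite mean.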