Understanding Problem Hardness: Recent Developments and Directions

Bart Selman, Cornell University


Introduction & Motivation

Computational Challenges in Planning, Reasoning, Learning, and Adaptation.

What are the characteristics of challenging computational problems?

A Few Examples

Reasoning

many forms of deduction

abduction / diagnosis (e.g. de Kleer 1989)

default reasoning (e.g. Kautz and Selman 1989)

Bayesian inference (e.g. Dagum and Luby 1993)

Planning

domain-dependent and independent (STRIPS) (e.g. Chapman 1987; Gupta and Nau 1991; Bylander 1994)

Learning

neural net “loading” problem (e.g. Blum and Rivest 1989)

Bayesian net learning

decision tree learning

An abundance of negative complexity results for many interesting tasks.

Results often apply to very restricted formalisms, and also to finding approximate solutions.

But these are worst-case results; what about the average case?

Sometimes “surprising” results.

A closer look leads to new insights, algorithms, and solution strategies.

Outline

A --- “Early” results:

phase transitions & computational hardness

B --- Current focus:

--- problem mixtures (tractable / intractable)

--- adding global structure

C --- Future directions and prospects

--- modeling resource constraints

--- adaptive computing

--- deeper theoretical understanding

A. “Early” Results (‘90-’95)

Example Domain: Satisfiability

SAT: Given a formula in propositional calculus, is there an assignment to its variables making it true?

We consider clausal form, e.g.: (a ∨ b ∨ c) ∧ (b ∨ d) ∧ (b ∨ c)

The canonical NP-complete problem.

(“exponential search space”)

Variable Clause Size Model

Polynomial average time in regions:

Ia - Purdom 1987 - backtracking

Ib - Iwama 1989 - counting alg.

Ic - Brown and Purdom 1985 - pure literal rule

II - Franco 1991

III - Franco 1994

Open: region IV

[Figure: regions of the variable-clause-size model; axes: ratio of clauses to variables and average clause length]

Generating Hard Random Formulas

Key: Use fixed-clause-length model. (Mitchell, Selman, and Levesque 1992; Kirkpatrick and Selman 1994)

Critical parameter: ratio of the number of clauses to the number of variables.

Hardest 3SAT problems at ratio = 4.25
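The fixed-clause-length model can be sketched in a few lines. This is a minimal illustration, assuming each clause draws k distinct variables uniformly at random and negates each with probability 1/2; the function name `random_ksat` and the integer encoding of literals (+v / -v) are choices made here, not taken from the talk.

```python
import random

def random_ksat(num_vars, ratio, k=3, seed=None):
    """Draw a random formula with fixed clause length k.

    The number of clauses is ratio * num_vars (rounded). Each clause
    picks k distinct variables and negates each with probability 1/2.
    A literal is a nonzero int: +v for variable v, -v for its negation.
    """
    rng = random.Random(seed)
    num_clauses = round(ratio * num_vars)
    formula = []
    for _ in range(num_clauses):
        variables = rng.sample(range(1, num_vars + 1), k)
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        formula.append(clause)
    return formula

# 20 variables at the ratio the slide names as hardest for 3SAT
formula = random_ksat(num_vars=20, ratio=4.25, seed=0)
print(len(formula), "clauses; first clause:", formula[0])
```

Varying `ratio` while holding `num_vars` fixed gives the family of instances whose hardness is plotted against the clauses-to-variables ratio.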

Hardness of 3SAT

[Figure: hardness of 3SAT against the ratio of clauses to variables (ticks 2 to 5), for 20-, 40-, and 50-variable formulas]

Intuition

At low ratios: few clauses (constraints); many assignments; easily found

At high ratios: many clauses; inconsistencies easily detected

The 4.3 Point

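[Figure: fraction of satisfiable formulas against the ratio of clauses to variables (ticks 2 to 5), for 20-, 40-, and 50-variable formulas; the 50% satisfiable point is marked]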

Mitchell, Selman, and Levesque 1991
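The drop in satisfiability as the ratio grows can be reproduced on a small scale. The sketch below is an assumption-laden illustration, not the experimental setup of the cited work: it uses a bare-bones backtracking procedure with unit propagation (in the spirit of Davis-Putnam style methods) and small formulas, and the names `dpll`, `simplify`, and `fraction_sat` are made up for this example.

```python
import random

def random_3sat(n, m, rng):
    """m random clauses over n variables, 3 distinct variables each."""
    return [tuple(v if rng.random() < 0.5 else -v
                  for v in rng.sample(range(1, n + 1), 3))
            for _ in range(m)]

def simplify(clauses, lit):
    """Set lit to True: drop satisfied clauses, shorten the rest.
    Returns None if some clause becomes empty (a conflict)."""
    out = []
    for c in clauses:
        if lit in c:
            continue
        reduced = tuple(x for x in c if x != -lit)
        if not reduced:
            return None
        out.append(reduced)
    return out

def dpll(clauses):
    """Return True iff the clause list is satisfiable."""
    if not clauses:
        return True
    for c in clauses:                      # unit propagation
        if len(c) == 1:
            reduced = simplify(clauses, c[0])
            return reduced is not None and dpll(reduced)
    lit = clauses[0][0]                    # naive branching choice
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None and dpll(reduced):
            return True
    return False

def fraction_sat(n, ratio, samples, seed=0):
    """Estimate the fraction of satisfiable formulas at a given ratio."""
    rng = random.Random(seed)
    m = round(ratio * n)
    return sum(dpll(random_3sat(n, m, rng)) for _ in range(samples)) / samples

for ratio in (2.0, 3.0, 4.0, 5.0, 6.0):
    print(ratio, fraction_sat(20, ratio, samples=50))
```

With only 20 variables the transition is smooth; the slides indicate it sharpens as the number of variables grows.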

Phase transition 2-, 3-, 4-, 5-, and 6-SAT

Theoretical Status Of Threshold

Very challenging problem ...

Current status:

3SAT threshold lies between 3.003 and 4.6.

(Motwani et al. 1994; Broder et al. 1992; Frieze and Suen 1996; Dubois 1990, 1997; Kirousis et al. 1995; Friedgut 1997; Achlioptas et al. 1999 / related work: Beame, Karp, Pitassi, and Saks 1998; Bollobas, Borgs, Chayes, Han Kim, and Wilson 1999)

Phase transitions in combinatorial problems are an active research area, with fruitful interactions between computer science, physics (approaches from statistical mechanics), and mathematics (combinatorics / random structures).

There is also a close interaction between experimental and theoretical work, with experimental findings quite often confirmed by formal analysis within months to a few years.

Finally, relevance to applications via algorithmic advances and the notion of “critically constrained problems”.

Consequences for Algorithm Design

Phase transition work and the hard instances it produced led to improvements in algorithms:

--- local search methods (e.g., GSAT / Walksat) (Selman et al. 1992; 1996; Min Li 1996; Hoos 1998, etc.)

--- backtrack-style methods (Davis-Putnam and variants / complete) (Crawford 1993; Dubois 1994; Bayardo 1997; Zane 1998, etc.)

Progress

Propositional reasoning and search (SAT):

1990: 100 variables / 200 clauses (constraints)

1998: 10,000 - 100,000 variables / 10^6 clauses

Novel applications:

e.g. in planning (Kautz & Selman),

program debugging (Jackson),

protocol verification (Clarke), and

machine learning (Resende).

B. Current Focus

--- mixtures of problem classes, e.g., 2-SAT and 3-SAT (“moving between P and NP”): the 2+p-SAT model

--- structured instances: perturbed quasi-group completion problems

Focus --- 1) mixtures: 2+p-SAT problem

mixture of binary and ternary clauses

p = fraction ternary

p = 0.0 --- 2-SAT / p = 1.0 --- 3-SAT

What happens in-between?

(Monasson, Zecchina, Kirkpatrick, Selman, and Troyansky, Nature, to appear)
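A 2+p-SAT instance as described above can be generated as follows. This is a minimal sketch assuming the same uniform clause-drawing scheme as in the fixed-clause-length model; the function name `random_2p_sat` and its parameters are illustrative.

```python
import random

def random_2p_sat(num_vars, num_clauses, p, seed=None):
    """Mix binary and ternary clauses; p is the fraction of ternary ones.

    p = 0.0 gives a pure 2-SAT formula, p = 1.0 a pure 3-SAT formula.
    A literal is a nonzero int: +v for variable v, -v for its negation.
    """
    rng = random.Random(seed)
    num_ternary = round(p * num_clauses)
    lengths = [3] * num_ternary + [2] * (num_clauses - num_ternary)
    formula = []
    for k in lengths:
        variables = rng.sample(range(1, num_vars + 1), k)
        formula.append(tuple(v if rng.random() < 0.5 else -v
                             for v in variables))
    return formula

mixed = random_2p_sat(num_vars=50, num_clauses=100, p=0.4, seed=0)
print(sum(len(c) == 3 for c in mixed), "ternary clauses out of", len(mixed))
```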

Phase Transition for 2+p-SAT

[Figures: location of the threshold; computational cost]

Results for 2+p-SAT

p < ~ 0.41 --- model essentially behaves as 2-SAT

search proc. “sees” only binary constraints

smooth, continuous phase transition

p > ~ 0.41 --- behaves as 3-SAT (exponential scaling)

abrupt, discontinuous scaling

Many new, rigorous results (including scaling) by Achlioptas, Bollobas, Borgs, Chayes, Han Kim, and Wilson. (Next talk.)

1) Strategies that exploit tractable substructure with propagation are most effective. (Consistent with the best empirically discovered methods.)

2) In addition, use early branching on critically constrained variables. (The “backbone variables”; suggests use of clustering and statistical learning methods.) (Boyan and Moore 1998)

Focus --- 2) Structure

Proposal: study the influence of global structure on problem hardness.

(Gomes and Selman 1997; 1998)

Quasigroups

Defn.: a pair (Q, *) where Q is a set, and * is a binary operation on Q such that the equations a * x = b and y * a = b are uniquely solvable for every pair of elements a, b in Q.

The multiplication table of its binary operation defines a latin square (i.e., each element of Q appears exactly once in each row/column).

[Figure: example quasigroup of order 4]
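The link between the definition and the latin-square property can be checked directly. The sketch below assumes Q = {0, ..., n-1} and uses the cyclic table (a + b) mod 4 as a stand-in example; it is not the order-4 example from the slide, and the helper names are made up here.

```python
def is_latin_square(table):
    """True iff every symbol 0..n-1 appears exactly once per row and column."""
    n = len(table)
    symbols = set(range(n))
    if not all(len(row) == n and set(row) == symbols for row in table):
        return False
    return all({table[r][c] for r in range(n)} == symbols for c in range(n))

def solve_left(table, a, b):
    """The unique x with a * x = b (look along row a)."""
    return table[a].index(b)

def solve_right(table, a, b):
    """The unique y with y * a = b (look down column a)."""
    return next(y for y in range(len(table)) if table[y][a] == b)

# Assumed example: entry (a, b) of the table is (a + b) mod 4.
cyclic4 = [[(a + b) % 4 for b in range(4)] for a in range(4)]

print(is_latin_square(cyclic4))
print(solve_left(cyclic4, 1, 0), solve_right(cyclic4, 1, 0))
```

Unique solvability of a * x = b is exactly the statement that b occurs once in row a; y * a = b says the same for column a.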

Quasigroup Completion Problem (QCP)

Given a partial latin square, can it be completed?

[Figure: example of a partial latin square]
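A plain backtracking search makes the completion question concrete. This is a minimal sketch, not the search procedure used in the cited experiments: it assumes symbols 0..n-1, marks empty cells with `None`, assumes the pre-filled cells contain no repeated symbol in any row or column, and fills cells in fixed row-major order with no heuristics.

```python
def complete_latin_square(grid):
    """Try to fill the None cells of a partial latin square in place.

    Returns True if a completion exists (grid then holds one), else
    False (grid is left in its original partial state).
    """
    n = len(grid)

    def allowed(r, c, value):
        # value must not already occur in row r or column c
        return (all(grid[r][j] != value for j in range(n)) and
                all(grid[i][c] != value for i in range(n)))

    def solve():
        for r in range(n):
            for c in range(n):
                if grid[r][c] is None:
                    for value in range(n):
                        if allowed(r, c, value):
                            grid[r][c] = value
                            if solve():
                                return True
                            grid[r][c] = None   # undo and backtrack
                    return False                # no value fits this cell
        return True                             # no empty cell left

    return solve()

partial = [[0, None, None],
           [None, None, 0],
           [None, 0, None]]
print(complete_latin_square(partial), partial)

blocked = [[0, 1, None],
           [None, None, 2],
           [None, None, None]]
print(complete_latin_square(blocked))
```

In `blocked`, the last cell of the first row must hold 2, but its column already contains 2, so no completion exists.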

Quasigroup Completion Problem: A Framework for Studying Search

NP-Complete (Colbourn 1983, 1984; Anderson 1985).

Has a regular global structure not found in random instances.

Leads to interesting search problems when structure is perturbed.

Similar to, e.g., structure found in the channel assignment problem for cellular networks.

Computational Cost

On these structured problems, backtrack search methods show so-called heavy-tailed probability distributions (Gomes, Selman & Crato 1997, 1998).

Both very short and very long runs occur much more frequently than one would expect.

[Figure: standard distribution compared with a heavy-tailed cost distribution; x-axis: log(Backtracks), from 1 to 100000]

[Figure: fringe of search tree]

Algorithmic Strategy:

Rapid Random Restarts.

Order of magnitude speedup.

(Gomes et al. 1998; 1999)

Universal strategies (Ertel and Luby 1993; Alt et al. 1996)
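A restart loop with a universal cutoff schedule might look like the sketch below. The slide does not say which schedule it has in mind; the schedule commonly known as the Luby sequence (1, 1, 2, 1, 1, 2, 4, ...) is assumed here. The `toy_search` function is a hypothetical stand-in for a randomized backtrack search: it simply draws a heavy-tailed run length and reports failure when that exceeds the cutoff.

```python
import random

def luby(i):
    """i-th term (1-indexed) of the sequence 1, 1, 2, 1, 1, 2, 4, ..."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)

def run_with_restarts(search, cutoffs, seed=0):
    """Call search(rng, cutoff) once per cutoff until one run succeeds.

    Returns (result, cutoff) for the first success, or None if every
    run hit its cutoff.
    """
    rng = random.Random(seed)
    for cutoff in cutoffs:
        result = search(rng, cutoff)
        if result is not None:
            return result, cutoff
    return None

def toy_search(rng, cutoff):
    """Hypothetical stand-in: run length drawn from a Pareto distribution."""
    run_length = rng.paretovariate(0.5)   # heavy tail, always >= 1
    return run_length if run_length <= cutoff else None

base = 10  # assumed unit: backtracks per schedule step
schedule = [base * luby(i) for i in range(1, 101)]
print([luby(i) for i in range(1, 16)])
print(run_with_restarts(toy_search, schedule))
```

The point of a short cutoff is that a heavy-tailed run which is going badly gets abandoned early, while a fresh run has a good chance of being one of the very short ones.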

Rapid Restarts --- Planning

[Figure: x-axis: log(cutoff), from 1 to 1000000]

Portfolio for heavy-tailed search procedures (2-20 processors)

C. Future directions and prospects

Modeling resource constraints & user requirements / utility

It should be possible to identify optimal restart strategies, possibly adaptive.

--- may need a way of “measuring progress” (Horvitz and Klein 1995; Gomes and Selman 1999)

Adaptive Computing

Combine statistical learning methods with combinatorial search techniques.

First success: STAGE system for local search. (Boyan and Moore 1998)

Extension: train a planner on small instances. (Selman, Kautz, Huang 1999)

Deeper theoretical understanding, with continued interactions with experiments and applications

Summary

During the past few years, we have obtained a much better understanding of the nature of computationally hard problems.

Rich interactions between physics, computer science and mathematics, and between theory, experiments, and applications.

Clear algorithmic progress with room for future improvements (possibly another level of scaling: 10^6 Boolean variables, 10^8 constraints. Further applications.)
