Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Page 1: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Lower Bounds for Property Testing

Luca Trevisan

U.C. Berkeley

Page 2: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Sub-linear Time Algorithms

• This talk:
– algorithms that run in less than linear time (cannot read entire input).
– No preprocessing. (Unstructured data)
– Must be probabilistic and approximate

• For optimization problems:
– Compute numerical apx of optimum cost (and implicit representation of apx solution?)

• For decision problems:
– What is approximation for decision problems?

Page 3: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

(Graph) Property Testing

Testing a property P with accuracy ε in adjacency matrix representation:

• Given graph G that has property P, accept with probability >3/4

• Given graph G that is ε-far from property P, accept with probability <1/4

ε-far = must change an ε-fraction of the adjacency matrix to get property P (add/remove > εn^2 edges)
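
To make the distance notion concrete, here is a minimal brute-force sketch for the property "bipartite". The function name `distance_from_bipartite` is an assumption made for illustration, and the exhaustive search over partitions is only usable on tiny graphs; it is not a testing algorithm.

```python
from itertools import product

def distance_from_bipartite(adj):
    """Fraction of n^2 matrix entries (counted as edges) that must be
    removed to make the graph bipartite. Exponential time: tiny graphs only."""
    n = len(adj)
    best = min(
        # Edges with both endpoints on the same side must be removed.
        sum(adj[i][j] for i in range(n) for j in range(i + 1, n)
            if side[i] == side[j])
        for side in product((0, 1), repeat=n)
    )
    return best / n ** 2

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(distance_from_bipartite(triangle))  # one of three edges must go: 1/9
```

Under this convention a graph is ε-far from bipartite when the returned value exceeds ε.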

Page 4: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Example [GGR,AK]

Testing bipartiteness of a given graph G

• Pick (1/ε)polylog(1/ε) vertices, and check if they induce a bipartite graph; if so accept, otherwise reject

• If G is bipartite then the algorithm accepts with probability 1

• If G is ε-far from bipartite, then whp the algorithm discovers an odd cycle (non-trivial to prove)

• Running time: O((1/ε^2)polylog(1/ε))
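
The sample-and-check procedure above can be sketched as follows. This is an illustrative sketch, not the authors' code: the names `induced_is_bipartite` and `dense_bipartite_tester` are assumptions, and the sample size is left as a parameter because the slide gives it only up to polylog factors.

```python
import random
from collections import deque

def induced_is_bipartite(adj, vertices):
    """BFS 2-coloring of the subgraph induced by `vertices`.
    Returns False as soon as an odd cycle is detected."""
    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in vertices:
                if v != u and adj[u][v]:      # one adjacency-matrix query
                    if v not in color:
                        color[v] = 1 - color[u]
                        queue.append(v)
                    elif color[v] == color[u]:
                        return False
    return True

def dense_bipartite_tester(adj, sample_size, rng=random):
    """Accept iff a random vertex sample induces a bipartite graph."""
    n = len(adj)
    sample = rng.sample(range(n), min(sample_size, n))
    return induced_is_bipartite(adj, sample)
```

Because every induced subgraph of a bipartite graph is bipartite, the sketch never rejects a bipartite input, which matches the one-sided error stated on the slide. Inspecting all pairs in a sample of s vertices costs about s^2 queries.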

Page 5: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Paleontologist’s approach

Page 9: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Lower Bounds [BT]

• Alon-Krivelevich’s algorithm
– has one-sided error, is non-adaptive and has running time (1/ε^2)polylog(1/ε)

• Lower Bounds:
– Ω(1/ε^2) for non-adaptive algorithms
– Ω(1/ε^1.5) for adaptive algorithms
– Both results hold even for two-sided error

Page 10: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Two Distributions

• Gfar: every edge exists with probability ε
– whp it is ε/3-far from bipartite

• Gbip: pick a random partition, then every edge that crosses the partition exists with probability 2ε

• Indistinguishable by non-adaptive algorithms making o(1/ε^2) queries

• Indistinguishable by adaptive algorithms making o(1/ε^1.5) queries
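
A minimal sketch of how the two distributions could be sampled, assuming the accuracy parameter is passed as `eps` with 2·eps ≤ 1. The function names are hypothetical. Note that in both distributions any fixed pair of vertices is an edge with probability eps, since a pair crosses a uniformly random partition with probability 1/2.

```python
import random

def sample_g_far(n, eps, rng):
    """Each of the n-choose-2 possible edges is present independently with probability eps."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < eps:
                adj[i][j] = adj[j][i] = 1
    return adj

def sample_g_bip(n, eps, rng):
    """Random bipartition; each crossing edge is present with probability 2*eps."""
    side = [rng.randrange(2) for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if side[i] != side[j] and rng.random() < 2 * eps:
                adj[i][j] = adj[j][i] = 1
    return adj, side
```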

Page 11: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Bounded Degree Graphs

Testing a property P with accuracy ε in adjacency lists representation:

• Given graph G that has property P, accept with probability >3/4

• Given graph G that is ε-far from property P, accept with probability <1/4

ε-far = must change an ε-fraction of the adjacency list entries to get property P (add/remove > εdn edges)

Page 12: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Bipartiteness [GR]

Testing bipartiteness

• Repeat polylog n times:
– Start at a random point, and pick sqrt(n) random walks of length polylog n; if two of them combine to form an odd cycle reject, otherwise accept

• Analysis:
– in a graph where you need to remove a constant fraction of edges to make it bipartite, the algorithm finds an odd cycle
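
A simplified sketch of the random-walk idea, under stated assumptions: the graph is given as adjacency lists, and "two walks combine to form an odd cycle" is detected by recording the parity of the step at which each vertex is reached from the start. A vertex reached at both an even and an odd step closes an odd-length walk, which must contain an odd cycle. The function names and the explicit parameters `repetitions`, `num_walks`, and `walk_length` are illustrative; the slide fixes them only as polylog n, sqrt(n), and polylog n.

```python
import random

def finds_odd_cycle(adj_list, start, num_walks, walk_length, rng):
    """Run random walks from `start`; report True if some vertex is
    reached with both parities of the step count."""
    parities = {start: {0}}
    for _ in range(num_walks):
        v = start
        for step in range(1, walk_length + 1):
            if not adj_list[v]:          # isolated vertex: walk cannot move
                break
            v = rng.choice(adj_list[v])
            seen = parities.setdefault(v, set())
            seen.add(step % 2)
            if len(seen) == 2:
                return True
    return False

def bounded_degree_bipartite_tester(adj_list, repetitions, num_walks,
                                    walk_length, rng):
    """Reject (return False) if any repetition finds odd-cycle evidence."""
    n = len(adj_list)
    for _ in range(repetitions):
        start = rng.randrange(n)
        if finds_odd_cycle(adj_list, start, num_walks, walk_length, rng):
            return False
    return True
```

In a bipartite graph every vertex is reachable from the start only at steps of one parity, so this sketch never rejects a bipartite input.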

Page 13: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Matching Lower Bound [GR]

• Define two distributions of graphs:
– Gfar: a random Hamiltonian circuit, plus a random matching (whp 1/100-far from bipartite)
– Gbip: a random Hamiltonian circuit, plus a random matching conditioned on making the graph bipartite

• Gfar and Gbip are indistinguishable by algorithms of query complexity o(sqrt(n)).
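
One way to sample the two distributions, sketched under the assumption that n is even. A Hamiltonian circuit on an even number of vertices has a unique bipartition (alternating positions along the circuit), so conditioning the matching on bipartiteness amounts to matching even positions with odd positions. The function name is hypothetical, and the sketch allows a matching edge to coincide with a circuit edge.

```python
import random

def sample_cycle_plus_matching(n, bipartite, rng):
    """Return (order, cycle_edges, matching) for a random Hamiltonian
    circuit plus a random perfect matching on vertices 0..n-1."""
    assert n % 2 == 0 and n >= 4
    order = list(range(n))
    rng.shuffle(order)                       # random Hamiltonian circuit
    cycle_edges = {frozenset((order[i], order[(i + 1) % n]))
                   for i in range(n)}
    if bipartite:
        evens, odds = order[0::2], order[1::2]
        rng.shuffle(odds)                    # match opposite sides only
        matching = list(zip(evens, odds))
    else:
        perm = order[:]
        rng.shuffle(perm)                    # unconstrained perfect matching
        matching = list(zip(perm[0::2], perm[1::2]))
    return order, cycle_edges, matching
```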

Page 14: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Sublinear Time Approximation

Problems restricted to dense instances:

• Max CUT and other graph problems can be approximated within (1+ε) in graphs with at least εn^2 edges in time 2^poly(1/ε) [GGR]

• Max 3SAT can be approximated within (1+ε) in instances with at least εn^3 clauses in time 2^poly(1/ε), and similar results hold for other satisfiability problems [AFKK]

Page 15: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Sub-linear Time Approximation

Problems on bounded-degree instances

• Minimum spanning tree
– given a connected weighted graph of degree d with weights in range {1,…,w}, can approximate MST weight within (1+ε) in time about O(dw/ε^2) [Chazelle, Rubinfeld, T]

Page 16: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

General Goals

• When looking for polynomial-time algorithms:
– Several algorithmic techniques of general applicability
– A general technique to “prove” impossibility (NP-completeness)

• For sublinear-time algorithms:
– General algorithmic techniques?
– Impossibility results?

Page 17: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Dense Graphs

Some general algorithmic results

• All problems with a certain logical representation testable in time dependent only on ε [AFKS]

• All regular languages testable in time dependent only on ε [AFNS]

• Only one one-sided error algorithm [GT] (pick a random subgraph and check it is consistent with the property)
– Adaptivity does not help
– “Only one algorithm” result also for 2-sided error.

Few lower bounds

Page 18: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Bounded-Degree Graphs

Fewer and less general algorithms.

Some results are different from dense case

• adaptivity helps
– No property testable with o(sqrt(n)) non-adaptive queries. Several problems testable with O(1) adaptive queries.

• 2-sided better than 1-sided for natural monotone properties
– Property “being a forest” has no o(sqrt(n)) one-sided algorithm, but has O(1) two-sided algorithm

Few lower bounds

Page 19: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Testing 3-Colorability

• Easy in adjacency matrix representation

• NP-hard in adjacency list representation

• Only for small enough ε
– Can find 3-coloring good for 80% of the edges in a 3-colorable graph using SDP
– NP-hard to find 3-coloring good for 98% (?) of the edges

• Implies non-tight, and conditional, lower bound for query complexity

Page 20: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Other problems

• The query complexity of the following problems is equivalent to the query complexity of testing 3-colorability
– Testing satisfiability of a 3SAT instance (every variable occurs in O(1) clauses, “adjacency list” representation)
– Approximating max cut, vertex cover, independent set, . . ., in bounded-degree graphs
– Approximating Max SAT, Max 2SAT, . . .

• Lower bound of sqrt(n) for all problems implied by [GR] lower bound for testing bipartiteness

Page 21: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Some Results from [BOT]

• For one-sided error algorithms:
– Ω(n) query complexity to distinguish 3-colorable graphs from graphs that are (1/3 - ε)-far
– Lower bound applies to testing problems that are solvable in polynomial time

• For two-sided error algorithms:
– For some ε, Ω(n) query complexity to distinguish 3-colorable graphs from graphs that are ε-far.

Page 22: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Additional Results

• Unconditionally, algorithms running in time o(n) cannot:
– Approximate Max 3SAT better than 7/8
– Approximate Max Cut in bounded-degree graphs better than 16/17
– . . .

• Hastad’97 proved above problems are NP-hard

Page 23: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

The 3-Coloring Lower Bound

• Consider first one-sided error algorithms

• It’s enough to find a graph G that is (1/3 - ε)-far from 3-colorable, but every subgraph of size < δn is 3-colorable
– (for every ε there is a δ such that . . .)

• Then an algorithm of query complexity < δn either accepts G (which is wrong) or rejects some 3-colorable graph (which means the algorithm does not have one-sided error)

Page 24: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

The Graph

• Pick a graph of degree O(1/ε^2) at random (pick so many random matchings)

• Then it is (1/3 - ε)-far whp

• But, for some δ, whp, every subgraph induced by k < δn vertices contains <1.5k edges

• In a minimal non-3-colorable graph, every vertex has degree at least 3

• Every subgraph induced by < δn vertices is 3-colorable

[Erdos]
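
The step from "fewer than 1.5k edges on every k vertices" to "3-colorable" can be made constructive. If every induced subgraph has average degree below 3, it always contains a vertex of degree at most 2; peel such vertices off one at a time, then color them in reverse order, each time avoiding the at most two colors already used by neighbors. The sketch below assumes the graph is a dict mapping each vertex to its set of neighbors; the function name is illustrative.

```python
def three_color_sparse(adj):
    """Greedy 3-coloring by repeatedly peeling a vertex of degree <= 2.
    Returns a dict vertex -> color in {0, 1, 2}, or None if peeling gets
    stuck (some subgraph has minimum degree >= 3)."""
    remaining = {v: set(nb) for v, nb in adj.items()}
    removal_order = []
    while remaining:
        v = next((u for u, nb in remaining.items() if len(nb) <= 2), None)
        if v is None:
            return None
        for u in remaining[v]:
            remaining[u].discard(v)
        del remaining[v]
        removal_order.append(v)
    coloring = {}
    for v in reversed(removal_order):
        # At most two neighbours of v are already colored at this point.
        used = {coloring[u] for u in adj[v] if u in coloring}
        coloring[v] = min({0, 1, 2} - used)
    return coloring
```

For example, a 5-cycle is colored properly, while the complete graph on 4 vertices (minimum degree 3) makes the peeling step fail.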

Page 25: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Explicit Construction

Can the previous construction be derandomized?

• For constants d, ε, δ, and for every sufficiently large n, we can explicitly construct a graph
– on n vertices, with max degree d,
– ε-far from 3-colorable,
– every subset of δn vertices induces a 3-colorable subgraph.

Page 26: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Explicit Construction

• We construct a 3SAT formula such that for constants k, ε’, δ’
– Every variable occurs k times
– No assignment satisfies more than a 1-ε’ fraction of clauses
– Every δ’ fraction of clauses is satisfiable
– Then we use a (slightly new) reduction from 3SAT to 3Coloring

Page 27: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

The Formula

• Fix a degree-d expander graph G=(V,E) such that for every cut (S,V-S) at least min{|S|,|V-S|} edges cross the cut (d=14 is enough)

• Have two variables x_uv and x_vu for each edge (u,v)

• For every vertex v have the (3SAT equivalent of) the constraint
– Σ_u x_uv = 1 + Σ_w x_vw

Page 28: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Structure of the Analysis

• Impossible to satisfy more than a fraction 1/(d+1) of the constraints

• Can always satisfy half of the constraints
– define an auxiliary network
– show that the auxiliary network has no small cut because of expansion
– then there is a large flow
– use large flow to find assignment for subset of constraints

Page 29: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Flow Argument

• Want to satisfy constraints corresponding to vertices in C, with |C| < |V|/2

[Figure: flow network with nodes s, t, V-C, and C]

Construct a flow network with a new source s, a sink t obtained by collapsing V-C, and the vertices in C

Page 30: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Flow Argument

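[Figure: flow network with source s, vertex sets A and C-A, and sink t; edge bundles labeled “|A| edges” and “|C-A| edges”]

• Every cut has size at least |C|

• There is a 0/1 flow of value at least |C|

• Interpreted as an assignment, it satisfies all constraints in C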

Page 31: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Two-Sided Error Algorithms

Need to define two distributions of graphs Gcol and Gfar such that:

• Graphs in Gcol are (almost) always 3-colorable

• Graphs in Gfar are (almost) always far from 3-colorable

• To an algorithm of bounded query complexity, Gcol and Gfar look (almost) the same

Page 32: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Main Step

• Define two distributions Dsat and Dfar of instances of E3LIN-2 (systems over GF(2) with 3 variables per equation)
– Systems in Dsat are always satisfiable
– Systems in Dfar are (almost) always (1/2 - ε)-far from satisfiable
– To an algorithm of bounded query complexity, Dsat and Dfar look the same

• We get Gcol and Gfar using a reduction from approximate E3LIN-2 to approximate 3-coloring

Page 33: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

E3LIN-2

X1 + X3 + X10 = 0 mod 2

X2 + X3 + X4 = 1 mod 2

X1 + X2 + X9 = 0 mod 2

. . .

Page 34: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Main Building Block

• We show that for every c there is a δ such that there exists a left-hand side with
– n variables, cn equations, 3 variables per equation, every variable occurs in 3c equations
– every δn equations are linearly independent

• Pick the left-hand side at random
– repeat 3c times: pick at random a set of n/3 disjoint triples of variables

• Explicit construction?

Page 35: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Distributions

• The left-hand side is always as before

• In Dsat, we pick a random assignment to the variables, and set the right-hand side consistently
– always satisfiable

• In Dfar, we pick the right-hand side uniformly at random
– With high probability, (1/2 - O(1/sqrt(c)))-far
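
The two distributions can be sampled directly from this description. The sketch below assumes n is divisible by 3 and represents an equation as a triple of variable indices plus a right-hand-side bit; the function names are hypothetical. It does not check the linear-independence property of the left-hand side, which the slides only claim to hold for a random choice.

```python
import random

def sample_lhs(n, c, rng):
    """Repeat 3c times: split the n variables into n/3 random disjoint triples."""
    assert n % 3 == 0
    equations = []
    for _ in range(3 * c):
        perm = list(range(n))
        rng.shuffle(perm)
        equations.extend(tuple(perm[i:i + 3]) for i in range(0, n, 3))
    return equations          # c*n equations, each variable in 3c of them

def sample_d_sat(n, c, rng):
    """Right-hand side set consistently with a hidden random assignment."""
    lhs = sample_lhs(n, c, rng)
    assignment = [rng.randrange(2) for _ in range(n)]
    rhs = [(assignment[i] + assignment[j] + assignment[k]) % 2
           for i, j, k in lhs]
    return lhs, rhs, assignment

def sample_d_far(n, c, rng):
    """Right-hand side chosen uniformly at random."""
    lhs = sample_lhs(n, c, rng)
    rhs = [rng.randrange(2) for _ in lhs]
    return lhs, rhs
```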

Page 36: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Indistinguishability

• Two distributions differ only in right-hand side

• In Dfar, the right-hand side is uniformly distributed

• In Dsat, it is δn-wise independent
– Linear independence implies statistical independence

• The two distributions look the same to an algorithm that sees fewer than δn equations
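
The claim that linear independence implies statistical independence can be checked by brute force on a toy system. The sketch below enumerates all assignments and tabulates the resulting right-hand sides; the function name and the two example systems are illustrative choices, not taken from the talk.

```python
from itertools import product
from collections import Counter

def rhs_distribution(equations, n):
    """Count how often each right-hand-side vector arises when the
    assignment to the n variables ranges over all of {0,1}^n."""
    counts = Counter()
    for assignment in product((0, 1), repeat=n):
        rhs = tuple(sum(assignment[i] for i in eq) % 2 for eq in equations)
        counts[rhs] += 1
    return counts

# Three linearly independent equations over GF(2): all 8 right-hand
# sides occur equally often, so the right-hand side is uniform.
independent = [(0, 1, 2), (2, 3, 4), (0, 3, 5)]
print(rhs_distribution(independent, 6))

# Four equations that sum to zero: only half of the 16 right-hand sides
# can occur, so a satisfiable system is distinguishable from a random one.
dependent = [(0, 1, 2), (0, 3, 4), (1, 3, 5), (2, 4, 5)]
print(len(rhs_distribution(dependent, 6)))
```

This is why an algorithm that sees only linearly independent equations learns nothing from the right-hand sides in Dsat: they look exactly like the uniform bits of Dfar.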

Page 37: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Conclusion of the Argument

• No algorithm of “query complexity” o(n) can distinguish satisfiable instances of E3LIN-2 from instances that are (1/2 - ε)-far from satisfiable

• For some ε, no algorithm of query complexity o(n) can distinguish 3-colorable graphs from graphs that are ε-far from 3-colorable.

• No algorithm of query complexity o(n) can approximate Max 3SAT better than 7/8 . . .

Page 38: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Open Questions

• Show that distinguishing 3-colorable graphs from (1/3 - ε)-far graphs requires query complexity Ω(n)
– we can only prove it for one-sided error

• Show that approximating Max SAT better than ¾ and Max CUT better than ½ requires query complexity Ω(n)
– we only know Ω(sqrt(n)) [implicit in GR]
– would “explain” why we need SDP

Page 39: Lower Bounds for Property Testing Luca Trevisan U.C. Berkeley.

Some more open questions

• In adjacency matrix representation, most interesting problems are solvable in constant time (depending only on ε)

• For some problems (e.g. testing triangle-freeness) the analysis uses Szemerédi’s regularity lemma, and the constant is hyper-exponential in 1/ε

• Lower bound is (1/ε)^(log 1/ε), and only for one-sided error

• Alternative analysis / stronger lower bounds?