Metric Embedding with Relaxed Guarantees Ofer Neiman Ittai Abraham Yair Bartal.


Metric Embedding with Relaxed Guarantees

Ofer Neiman

Ittai Abraham, Yair Bartal

Embedding metric spaces

Representation of the metric in a simple and structured space.

Common target spaces: l_p, trees (ultra-metrics).

The price of simplicity: distortion, which is the multiplicative amount by which distances can change.

Goal: find low distortion embeddings.

A tool for approximation algorithms.

Useful for many practical applications.

Metric embedding

Let X, Y be metric spaces with metrics d_X, d_Y respectively.

f : X→Y is an embedding of X into Y.

The distortion of f is the minimal α such that for some c:

$$\forall u,v \in X:\quad c \;\le\; \frac{d_Y(f(u),f(v))}{d_X(u,v)} \;\le\; \alpha \cdot c$$
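The definition of distortion can be checked numerically on a small example. The following is a minimal sketch, assuming a finite point set, an injective map, and the best scale c taken as the smallest expansion ratio; the names `distortion`, `d_cycle`, `d_line` and `to_line` are illustrative and do not come from the talk.

```python
from itertools import combinations

def distortion(points, d_x, d_y, f):
    """Smallest alpha such that c <= d_y(f(u), f(v)) / d_x(u, v) <= alpha * c
    holds for some scale c, i.e. (largest ratio) / (smallest ratio).
    Assumes distinct points and an injective f, so no ratio is zero."""
    ratios = [d_y(f(u), f(v)) / d_x(u, v) for u, v in combinations(points, 2)]
    return max(ratios) / min(ratios)

# Hypothetical example: the 4-cycle with its shortest-path metric, laid out on a line.
cycle = [0, 1, 2, 3]

def d_cycle(u, v):
    return min(abs(u - v), 4 - abs(u - v))

def d_line(a, b):
    return abs(a - b)

def to_line(u):
    return float(u)

# The pair (0, 3) is at distance 1 on the cycle but 3 on the line.
print(distortion(cycle, d_cycle, d_line, to_line))  # 3.0
```

Only the ratio between the extreme expansions matters, which is why the scale c never has to be chosen explicitly.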

Basic results

Every metric space on n points can be embedded into Euclidean space with distortion O(log n) and dimension O(log² n). [Bourgain/LLR]

Every metric space on n points can be embedded into a tree metric with distortion $\Theta(n)$. [Bartal/BLMN/RR]

Problem

The lower bounds on the distortion and the dimension are high, and grow with n.

In some cases, weaker guarantees are acceptable.

Some Alternative Schemes

Probabilistic embedding: considering the expected distortion. [Bartal, FRT]

Ramsey theorems: embedding a large subspace of the original metric. [BFM, BLMN]

Partial embedding: embedding all but a fraction of the distances. [KSW, ABCDGKNS]

Motivation

Estimating latencies (round-trip time) in the internet.

- The distance matrix is almost a metric.

- Embedding heuristics yield surprisingly good results. [Ng+Zhang’02, ST’03, DCKM’04]

Practical network embedding requires:

- Small number of dimensions.

- No centralized coordination.

- Linear number of distance measurements.

Finding the nearest: copy of a file, service from some server, etc.

(1-ε) partial embedding

X, Y are metric spaces.

f : X→Y has (1-ε) partial distortion at most α if there exists a set of pairs G_ε with

$$|G_\varepsilon| \ge (1-\varepsilon)\binom{n}{2}$$

such that for all pairs (u,v) ∈ G_ε:

$$c \;\le\; \frac{d_Y(f(u),f(v))}{d_X(u,v)} \;\le\; \alpha \cdot c$$
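For a concrete finite map, the smallest (1-ε) partial distortion can be found by sorting the per-pair expansion ratios and sliding a window that keeps a (1-ε) fraction of the pairs. This is a sketch under the same assumptions as the definition (finite point set, injective map); the function name `partial_distortion` and the example metric are hypothetical.

```python
from itertools import combinations
import math

def partial_distortion(points, d_x, d_y, f, eps):
    """Smallest alpha such that at least a (1 - eps) fraction of the pairs
    have expansion ratio inside some window [c, alpha * c].
    Assumes distinct points and an injective f, so no ratio is zero."""
    ratios = sorted(d_y(f(u), f(v)) / d_x(u, v) for u, v in combinations(points, 2))
    keep = math.ceil((1 - eps) * len(ratios))        # size of the good set G_eps
    # Any optimal good set is a contiguous run in sorted order.
    return min(ratios[i + keep - 1] / ratios[i]
               for i in range(len(ratios) - keep + 1))

# Hypothetical example: the 4-cycle laid out on a line, as in the distortion example.
cycle = [0, 1, 2, 3]

def d_cycle(u, v):
    return min(abs(u - v), 4 - abs(u - v))

def d_line(a, b):
    return abs(a - b)

def to_line(u):
    return float(u)

print(partial_distortion(cycle, d_cycle, d_line, to_line, 0.0))   # 3.0
print(partial_distortion(cycle, d_cycle, d_line, to_line, 0.25))  # 1.0
```

Ignoring the single stretched pair (0, 3) is enough to bring the distortion of this toy map down from 3 to 1.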

Scaling Embedding

A stronger requirement is a map that will be good for all ε simultaneously.

Definition: an embedding f has scaling distortion D(ε) if for any ε>0, it is a (1-ε) partial embedding with distortion D(ε).

Scaling & Average Distortion

Thm: every metric space has a scaling probabilistic embedding with distortion O(log(1/ε)) into trees.

Thm: every metric space has a scaling embedding with distortion O(log(1/ε)) and dimension O(log n) into Euclidean space.

This implies constant average distortion!

Applications: weighted average problems

sparsest cut, quadratic assignment, linear arrangement, etc.
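The step from scaling distortion to constant average distortion can be sketched as follows. The derivation assumes a non-contracting map with the same scale for every ε, takes average distortion to mean the mean expansion over all N = n(n-1)/2 pairs, and writes the scaling bound as D(ε) ≤ C(1 + ln(1/ε)); the constant C and this exact form are illustrative assumptions, not taken from the slides.

```latex
% Sort the pairs by expansion: \delta_1 \ge \delta_2 \ge \dots \ge \delta_N.
% For \varepsilon_j = j/(2N), fewer than j pairs may exceed D(\varepsilon_j), hence
\delta_j \le D\!\left(\tfrac{j}{2N}\right).
% D is decreasing, so the sum is a right-endpoint Riemann sum below the integral:
\frac{1}{N}\sum_{j=1}^{N}\delta_j
  \;\le\; \frac{1}{N}\sum_{j=1}^{N} D\!\left(\tfrac{j}{2N}\right)
  \;\le\; 2\int_{0}^{1/2} D(x)\,dx
  \;\le\; 2C\int_{0}^{1}\left(1+\ln\tfrac{1}{x}\right)dx
  \;=\; 4C .
```

The key fact is that ln(1/x) is integrable at 0, so the few badly distorted pairs contribute only a constant to the average.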

Previous work

Definition: a metric space X is called λ-doubling if for any r>0, any ball of radius r can be covered by λ balls of radius r/2.

Any λ-doubling metric space X can be embedded into l_2 with (1-ε) partial distortion O(log(·)) [KSW]. (The argument of the logarithm was not recoverable from the extraction.)

Our Results

Partial embedding into l_2 with distortion and dimension O(log(1/ε)).

Partial embedding into trees with distortion $O(1/\sqrt{\varepsilon})$.

Distortion & dimension don’t depend on the size of X!

General theorem converting classical l_p embeddings into the partial model.

Tight lower bounds.

Appeared in FOCS05 together with CDGKS.

Embedding into l_p

Thm: Let $\mathcal{X}$ be any subset-closed family of metric spaces that has, for any X ∈ $\mathcal{X}$ on n points, an embedding φ: X → l_p with

- distortion α(n),

- dimension β(n).

Then φ can be converted into a (1-ε) partial embedding of X with

- distortion $O\!\left(\alpha\!\left(\tfrac{1}{\varepsilon}\log\tfrac{1}{\varepsilon}\right)\right)$,

- dimension $\beta\!\left(\tfrac{1}{\varepsilon}\log\tfrac{1}{\varepsilon}\right) + O\!\left(\log\tfrac{1}{\varepsilon}\right)$.

In practice..

(Two further bounds of the form O(1/·) appeared here; their arguments were not recoverable from the extraction.)

Main Results

- (1-ε) partial embedding of any metric space into l_p. [Bourgain, Matousek, Bartal]

- (1-ε) partial embedding of any negative type metric (l_1 metrics) into l_2. [ARV, ALN]

- (1-ε) partial embedding of any doubling metric into l_p. [KLMN]

- (1-ε) partial embedding of any tree metric into l_2. [Matousek]

Each item was stated with explicit distortion and dimension bounds in terms of 1/ε; the formulas were not recoverable from the extraction.

Definitions

Let r_ε(u) be the minimal radius such that |B(u, r_ε(u))| ≥ εn.

A pair (u,v), w.l.o.g. r_ε(u) ≥ r_ε(v):

- has short distance if d(u,v) < r_ε(u),

- has medium distance if r_ε(u) ≤ d(u,v) < 4·r_ε(u),

- has long distance if 4·r_ε(u) ≤ d(u,v).
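The radius r_ε(u) and the short / medium / long classification translate directly into code. The sketch below assumes a finite point set, closed balls that include u itself, and εn rounded up; the names `r_eps` and `classify` and the sample points are illustrative only.

```python
import math

def r_eps(u, points, d, eps):
    """Minimal radius r such that the closed ball B(u, r) holds at least eps*n points
    (u itself is counted; eps*n is rounded up)."""
    need = max(1, math.ceil(eps * len(points)))
    dists = sorted(d(u, x) for x in points)
    return dists[need - 1]

def classify(u, v, points, d, eps):
    """Classify the pair (u, v) using the larger of the two radii."""
    r = max(r_eps(u, points, d, eps), r_eps(v, points, d, eps))
    duv = d(u, v)
    if duv < r:
        return "short"
    if duv < 4 * r:
        return "medium"
    return "long"

# Hypothetical example: two tight clusters on the line, far apart.
pts = [0, 1, 2, 3, 100, 101, 102, 103]

def d_line(a, b):
    return abs(a - b)

print(classify(0, 1, pts, d_line, 0.5))    # short
print(classify(0, 3, pts, d_line, 0.5))    # medium
print(classify(0, 100, pts, d_line, 0.5))  # long
```

With ε = 0.5 every ball has to capture 4 of the 8 points, so each radius is about the diameter of one cluster and only cross-cluster pairs are long.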

Close Distances

(u,v) is a short pair.

Short pairs are ignored: at most εn² of them.

[Figure: points u, v, w with their balls of radius r_ε(u), r_ε(v), r_ε(w).]

Beacon Based Embedding

Randomly choose $\frac{1}{\varepsilon}\log\frac{1}{\varepsilon}$ beacons = B.

Each point is attached to its nearest beacon.

Some More Bad Points

If d(u,B) > r_ε(u) then every pair in $\{(u,x) : x \in X\}$ is bad.

For each u ∈ X:

$$\Pr\big[\,d(u,B) > r_\varepsilon(u)\,\big] \le \varepsilon$$

With probability ½ there are at most 2εn² bad pairs.

[Figure: points u, v, w with their balls of radius r_ε(u), r_ε(v), r_ε(w).]
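The probability bound and the count of bad pairs can be worked out in two lines. The calculation below assumes that the m beacons are sampled independently and uniformly with replacement and that m = (1/ε) ln(1/ε) with the natural logarithm; both are assumptions made for the sketch, since the slides only say the beacons are chosen randomly.

```latex
% The ball B(u, r_\varepsilon(u)) contains at least \varepsilon n of the n points,
% so each beacon misses it with probability at most 1 - \varepsilon:
\Pr\big[d(u,B) > r_\varepsilon(u)\big]
  \;\le\; (1-\varepsilon)^{m}
  \;\le\; e^{-\varepsilon m}
  \;=\; e^{-\ln(1/\varepsilon)}
  \;=\; \varepsilon .
% Each bad point u spoils at most n pairs (u, x), so
\mathbb{E}\big[\#\text{bad pairs}\big] \;\le\; n \cdot \varepsilon n \;=\; \varepsilon n^{2},
% and by Markov's inequality
\Pr\big[\#\text{bad pairs} > 2\varepsilon n^{2}\big] \;\le\; \tfrac12 .
```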

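Partial Embedding

Use the embedding φ: B→l_p. φ has distortion guarantee of $\alpha\!\left(\tfrac{1}{\varepsilon}\log\tfrac{1}{\varepsilon}\right)$.

The partial embedding is:

$$f = \varphi \oplus h : X \to l_p$$

For u attached to beacon b, f(u) consists of φ(b), with $\beta\!\left(\tfrac{1}{\varepsilon}\log\tfrac{1}{\varepsilon}\right)$ coordinates, followed by h(u), with $k = O\!\left(\log\tfrac{1}{\varepsilon}\right)$ coordinates.

Each coordinate of h(u) is chosen at random from $\{0,1\}\cdot\dfrac{r_\varepsilon(u)}{k^{1/p}}$.

[Figure: points u and v attached to beacons b_u and b_v.]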
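The construction of f can be sketched in code: sample beacons, attach every point to its nearest beacon, and append k random coordinates scaled by the point's radius. This is only an illustration under stated assumptions: the beacon embedding `phi` is supplied by the caller, beacons are sampled with replacement, each extra coordinate is 0 or r_ε(u)/k^(1/p) with equal probability, and the constant in k = O(log(1/ε)) is taken to be 1. The function name `beacon_embedding` is hypothetical.

```python
import math
import random

def beacon_embedding(points, d, eps, phi, p=2, rng=random):
    """Sketch of f(u) = (phi(beacon of u), h(u)) for every u in `points`.

    phi -- caller-supplied embedding of a beacon into l_p (a list of floats)
    rng -- random source exposing .choice (the random module or a random.Random)
    """
    n = len(points)
    m = max(1, math.ceil((1 / eps) * math.log(1 / eps)))   # number of beacons
    beacons = [rng.choice(points) for _ in range(m)]
    k = max(1, math.ceil(math.log2(1 / eps)))              # assumed constant 1 in O(log(1/eps))
    need = max(1, math.ceil(eps * n))
    image = {}
    for u in points:
        b = min(beacons, key=lambda x: d(u, x))             # nearest beacon
        r = sorted(d(u, x) for x in points)[need - 1]       # r_eps(u)
        h = [rng.choice((0.0, r)) / k ** (1 / p) for _ in range(k)]
        image[u] = list(phi(b)) + h
    return image

# Hypothetical usage: 16 points on a line, beacons embedded by their own position.
pts = list(range(16))
img = beacon_embedding(pts, lambda a, b: abs(a - b), 0.25,
                       lambda b: [float(b)], p=2, rng=random.Random(0))
print(len(img), len(img[0]))  # 16 points, each with 1 + k = 3 coordinates
```

Only distances from each point to the beacons and to its εn nearest neighbours are used, which matches the motivation of measuring a small number of distances per node.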

Upper Bound

We assume for the pair (u,v):

- Each point has a beacon in its ball.

- Both u, v are outside each other’s ball.

- The mapping φ is a contraction.

$$d(b_u, b_v) \le 3\,d(u,v)$$

$$\|\varphi(b_u)-\varphi(b_v)\|_p^p \le d(b_u,b_v)^p$$

$$\|h(u)-h(v)\|_p^p \le d(u,v)^p$$

$$\|f(u)-f(v)\|_p^p \le (3^p+1)\,d(u,v)^p$$

[Figure: u and v with balls of radius r_ε(u), r_ε(v) and their beacons b_u, b_v.]

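Lower Bound - Long Distances

d(u,v) ≥ 4·max{r_ε(u), r_ε(v)}

d(b_u, b_v) ≥ d(u,v)/2

$$\|\varphi(b_u)-\varphi(b_v)\|_p^p \ge \left(\frac{d(b_u,b_v)}{\alpha(|B|)}\right)^p$$

$$\|f(u)-f(v)\|_p^p \ge \left(\frac{d(u,v)}{2\,\alpha\!\left(\tfrac{1}{\varepsilon}\log\tfrac{1}{\varepsilon}\right)}\right)^p$$

[Figure: u and v with balls of radius r_ε(u), r_ε(v) and their beacons b_u, b_v.]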

Medium Distances??

There is a problem in this case:

u, v are attached to the same beacon!

The additional coordinates h will guarantee enough contribution.

[Figure: u and v with balls of radius r_ε(u), r_ε(v) sharing one beacon.]

Medium Distances

Pairs satisfying: r_ε(u) ≤ d(u,v) < 4·r_ε(u) [w.l.o.g. r_ε(u) ≥ r_ε(v)].

[Figure: the coordinates of h(u), each r_ε(u) or 0; the coordinates of h(v), each r_ε(v) or 0; and the resulting coordinates of h(u)-h(v).]

With probability ¼ a coordinate of h(u)-h(v) is r_ε(u), so in expectation ¼ of the coordinates will be r_ε(u).

With probability < ε the contribution of the pair (u,v) will be smaller than half its expectation.

Medium Distances

With probability ½, at most 2εn² medium pairs failed, but for the others:

$$\|h(u)-h(v)\|_p^p \;\ge\; \Omega\!\left(r_\varepsilon(u)^p\right) \;=\; \Omega\!\left(d(u,v)^p\right)$$

End of proof!

Coarse Partial Embedding

Another version: ignoring only the short distances (i.e., from each point to its nearest εn neighbors). The dimension increases to O(log(n)·β(1/ε)).

Partial Embedding into Trees

Thm: every metric space has a (1-ε) partial embedding with distortion $O(1/\sqrt{\varepsilon})$ into a tree (ultra-metric).

Ultra-metrics

Metric on leaves of rooted labeled tree.

0 ≤ Δ(D) ≤ Δ(B) ≤ Δ(A).

d(x,y) = Δ(lca(x,y)).

In the example tree:

d(x,y) = Δ(D).

d(x,w) = Δ(B).

d(w,z) = Δ(A).

[Figure: rooted tree with internal nodes labeled Δ(A), Δ(B), Δ(C), Δ(D) and leaves x, y, z, w.]
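The rule d(x,y) = Δ(lca(x,y)) is easy to make concrete. The sketch below builds a small labeled tree consistent with the three example distances (A is the root with children B and C, B has children D and w, D has children x and y, and z hangs under C); that shape and the numeric labels 8, 4, 2, 1 are assumptions chosen for illustration, as are the names `Node` and `ultrametric_distance`.

```python
class Node:
    """Tree node; internal nodes carry a label Delta, leaves carry a name."""
    def __init__(self, label=0, children=(), name=None):
        self.label = label
        self.children = list(children)
        self.name = name

def leaf_paths(root):
    """Map each leaf name to the list of nodes from the root down to that leaf."""
    paths = {}
    def walk(node, path):
        path = path + [node]
        if not node.children:
            paths[node.name] = path
        for child in node.children:
            walk(child, path)
    walk(root, [])
    return paths

def ultrametric_distance(root, x, y):
    """d(x, y) = label of the least common ancestor of leaves x and y."""
    if x == y:
        return 0
    paths = leaf_paths(root)
    lca = None
    for a, b in zip(paths[x], paths[y]):
        if a is not b:
            break
        lca = a          # deepest node shared by both root-to-leaf paths so far
    return lca.label

# Hypothetical labels with Delta(D) <= Delta(B) <= Delta(A).
D = Node(1, [Node(name="x"), Node(name="y")])
B = Node(4, [D, Node(name="w")])
C = Node(2, [Node(name="z")])
A = Node(8, [B, C])

print(ultrametric_distance(A, "x", "y"))  # 1 = Delta(D)
print(ultrametric_distance(A, "x", "w"))  # 4 = Delta(B)
print(ultrametric_distance(A, "w", "z"))  # 8 = Delta(A)
```

Because labels never increase from parent to child, the resulting distance satisfies the strong triangle inequality d(x,z) ≤ max{d(x,y), d(y,z)}.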

Embedding into Ultra-metric

Partition X into 2 sets X_1, X_2.

Create a root labeled Δ = diam(X).

The children of the root are created recursively on X_1, X_2.

B: bad distances for the current level, with |B| ≤ ε|X_1||X_2|.

Using induction, the number of distances we ignore is

$$\varepsilon\binom{|X_1|}{2} + \varepsilon\binom{|X_2|}{2} + |B| \;\le\; \varepsilon\binom{|X|}{2}$$

[Figure: X split into X_1 and X_2 under a root labeled Δ.]

Where to Cut?

Take a point u such that |B(u, Δ/2)| ≤ n/2.

Let i = 1,…,1/ε and let A_i be a ball around u whose radius grows with i. (The exact radius formula was not recoverable from the extraction.)

Let S_i = A_{i+1} - A_i.

We need a “slim” shell: only distances inside the shell are distorted by more than $O(1/\sqrt{\varepsilon})$.

[Figure: nested balls A_1, A_i, A_{i+1} around u.]

Where to Cut?

Case 1: |A_1| < εn.

X_1 = {u}, X_2 = X\{u}

$$|B| \;\le\; |A_1| - 1 \;<\; \varepsilon n - 1 \;\le\; \varepsilon|X_1||X_2|$$

[Figure: root labeled Δ with children u and X\u; balls A_1, A_i around u.]

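Where to Cut?

Case 2: |A_1| ≥ εn.

Choose an i such that |S_i|² ≤ εn|A_i| (up to a constant factor that was not recoverable from the extraction).

Let X_1 = A_{i+½}, X_2 = X\X_1.

$$|B| \;\le\; |S_i|^2 \;\le\; \varepsilon|X_1||X_2|$$

[Figure: nested balls A_1, A_i, A_{i+1} around u; root labeled Δ with children X_1 and X_2.]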

Finding Shell S_i

Assume by contradiction that for all i = 1,…,t, where $t = 1/\sqrt{\varepsilon}$:

|S_i|² > εn|A_i|

Then by induction |A_i| ≥ εn·i²,

which implies |A_t| ≥ n.

End of proof!

Lower Bounds

General method to obtain partial lower bounds from known classical ones.

Thm: given a lower bound α for embedding a family $\mathcal{X}$ into a family $\mathcal{Y}$, i.e. for any n there is X ∈ $\mathcal{X}$ on n points such that any embedding of X requires distortion at least α(n).

Then there is X' ∈ $\mathcal{X}$ for which any (1-ε) partial embedding requires distortion $\Omega\!\left(\alpha\!\left(\Omega(1/\sqrt{\varepsilon})\right)\right)$.

The family $\mathcal{X}$ must be nearly closed under composition!

Main corollaries

• $\Omega\!\left(\frac{\log(1/\varepsilon)}{p}\right)$ distortion for partial embedding into l_p. [LLR, Mat]

• $\Omega(1/\sqrt{\varepsilon})$ distortion for partial embedding into trees. [Bartal/BLMN/RR]

• $\Omega\!\left(\sqrt{\log(1/\varepsilon)}\right)$ distortion for partial embedding of doubling or l_1 metrics into l_2. [NR]

• $\Omega(\log(1/\varepsilon))$ distortion for probabilistic partial embedding into trees. [Bartal]

General idea

Choose X ∈ $\mathcal{X}$ such that $|X| = \frac{1}{3\sqrt{\varepsilon}}$.

For each x ∈ X create a metric C_x with $|C_x| = 3\sqrt{\varepsilon}\,n$, such that

- C_x ∈ $\mathcal{X}$,

- X' contains many “copies” of X.

Let f be a (1-ε) partial embedding that ignores the set of edges I. By definition $|I| \le \varepsilon n^2/2$.

[Figure: X with distances d, and X' in which each point x is replaced by a cluster C_x with internal distances δ.]

Finding a copy of X

T: vertices intersecting less than $\sqrt{\varepsilon}\,n$ edges in I, so $|T| \ge (1-\sqrt{\varepsilon})\,n$.

For each x ∈ X, choose some v_x ∈ C_x ∩ T.

For each pair (v_x, v_y) find t ∈ C_y such that:

$$(v_x, t),\ (v_y, t) \notin I$$

[Figure: v_x in C_x and v_y in C_y, both in T, each incident to fewer than $\sqrt{\varepsilon}\,n$ ignored edges, and a point t in C_y; $|C_x| = |C_y| = 3\sqrt{\varepsilon}\,n$.]

Distortion of the Copy

f has distortion guarantees for both these distances: (v_x, t) and (t, v_y).

d(t, v_y) is negligible:

$$d(v_x,v_y) \le d(v_x,t) + d(t,v_y)$$

$$d(v_x,v_y) \ge d(v_x,t) - d(t,v_y)$$

Its distortion must be at least $\Omega\!\left(\alpha\!\left(\Omega(1/\sqrt{\varepsilon})\right)\right)$.

[Figure: v_x, v_y and t.]