Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P...

23
Improving Kruskal’s Algorithm Vitaly Osipov, Peter Sanders, Johannes Singler Universit ¨ at Karlsruhe (TH) Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Transcript of Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P...

Page 1: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Improving Kruskal’s AlgorithmVitaly Osipov, Peter Sanders, Johannes Singler

Universitat Karlsruhe (TH)

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 2: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Motivation

What is the best (practical) algorithm for Minimum Spanning Trees?

Kruskal? for very sparse graphs Jarnık–Prim? else Karger–Klein–Tarjan, Chazelle, Pettie–Ramachandran,. . . ?

too complicated

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 3: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Kruskal’s Algorithm

Procedure kruskal(E , T : Sequence of Edge, P : UnionFind)sort E by increasing edge weightforeach u, v ∈ E do

if u and v are in different components of P thenadd edge u, v to Tjoin the partitions of u and v in P

Time O ((n + m) log m)

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 4: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Quick-Kruskal (Moret–Shapiro, Paredes–Navarro)

Procedure qKruskal(E , T : Sequence of Edge, P : UnionFind)if m ≤ kruskalThreshold(n, m, |T |) then kruskal(E , T , P)else

pick a pivot p ∈ EE≤:= 〈e ∈ E : e ≤ p〉E>:= 〈e ∈ E : e > p〉qKruskal(E≤, T , P)if |T | = n − 1 then exitqKruskal(E>, T , P)

p

Time: O

(

m + n log2 n)

– random graphs, random edge weights

Θ ((n + m) log m) if there is any heavy MST edge,e.g., Lollipop graph with random edge weights

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 5: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Filter-Kruskal

Procedure filterKruskal(E , T : Sequence of Edge, P : UnionFind)if m ≤ kruskalThreshold(n, m, |T |) then kruskal(E , T , P)else

pick a pivot p ∈ EE≤:= 〈e ∈ E : e ≤ p〉E>:= 〈e ∈ E : e > p〉qKruskal(E≤, T , P)if |T | = n − 1 then exitE>:= filter(E>, P)qKruskal(E>, T , P)

Function filter(E)return 〈u, v ∈ E : u, v are in different components of P〉

p

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 6: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Analysis – Arbitrary Graph, Random Weights

Lemma: It suffices to count edge comparisons

Chan’s sampling lemma:

r lightest edges processed ⇒ P [e ∈ E> survives filtering] ≤nr

Optimistic Analysis:Assume edge with rank i is filtered when r = i.

E[#survivors] ≤ n +∑

i>n

ni

= Θ(

n logmn

)

Partitioning m edges and sorting n log mn edges:

Ω(m + n log n logmn

)

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 7: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Analysis – Upper Bound (Outline)

Idea: Generalize textbook analysis of quicksort.0–1-RV X ij := 1 iff edges with ranks i and j are compared.

E [#comparisons] =∑

i≤m

i<j≤m

P[

Xij = 1]

.

strengthen bound on P[

Xij = 1]

using Chan’s sampling lemma.. . .

. . .O(m + n log n log

mn

) expected comparisons

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 8: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Getting rid of the log m/n? E.g., random graphs

2

2.5

3

3.5

4

4.5

32 64 27 28 29 210211212213214215216217218219220221222

com

paris

ons

/ edg

es m

= n

log(

n)

number of nodes n

filterKruskallog(log(x))

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 9: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Filter-Kruskal+

Function filter+(E , T , P)E ′:= 〈u, v ∈ E : u, v are in different components of P〉T ′:= 〈u, v ∈ E ′ : u or v have degree one in comp. graph 〉T := T ∪ T ′

return E ′ \ T ′

p

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 10: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Linear Time? E.g., random graphs

5

10

15

20

25

210 211 212 213 214 215 216 217 218 219 220 221 222 223 224

num

ber

of c

ompa

rison

s / n

number of nodes n

m=nm=2nm=4n

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 11: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Parallelization

Procedure filterKruskal(E , T : Sequence of Edge, P : UnionFind)if m ≤ kruskalThreshold(n, m, |T |) then

kruskal(E , T , P) // parallel sortelse

pick a pivot p ∈ EE≤:= 〈e ∈ E : e ≤ p〉 // parallelE>:= 〈e ∈ E : e > p〉 // partitioningqKruskal(E≤, T , P)if |T | = n − 1 then exitE>:= filter(E>, P) // parallel removeIfqKruskal(E>, T , P)

Easy: available in the Multi-Core Parallel STL (e.g. g++)

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 12: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Running Time: Random graph with 216 nodes

10

100

1 4 16 64 256 1024

time

/ m [n

s]

number of edges m / number of nodes n

KruskalqKruskalKruskal8

filterKruskal+filterKruskal

filterKruskal8qJPpJP

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 13: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Graph Formatting: Random graph with 222 nodes

10

100

1000

10000

1 2 4 8 16 32

time

/ m [n

s]

number of edges m / number of nodes n

List of edges -> Adjacency ArrayAdjacency Array -> List of Edges

filterKruskalpJP

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 14: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Conclusions

filterKruskal improves on Kruskal and Jarnık–Prim very simple often faster parallelizable needs only edge sequence

Todo: More real world inputs, tight analysis,. . .

Open Problem: What about filterKruskal+ provably linear time? scalable parallelization? Sequential component O

(

n0.6)

? more efficient implementation

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 15: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Thank you!

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 16: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Random graph with 210 nodes

10

100

1000

1 2 4 8 16 32 64 128 256

time

/ m [n

s]

number of edges m / number of nodes n

KruskalqKruskalKruskal8

filterKruskal+filterKruskal

filterKruskal8qJPpJP

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 17: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Random graph with 222 nodes

100

1000

1 2 4 8 16

time

/ m [n

s]

number of edges m / number of nodes n

KruskalqKruskalKruskal8filterKruskal+filterKruskalfilterKruskal8qJPpJP

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 18: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Random graph, n = 216, anti-Prim weights

10

100

1000

10000

100000

1 2 4 8 16 32 64 128 256

time

/ m [n

s]

number of edges m / number of nodes n

KruskalqKruskalKruskal8

qJPpJP

filterKruskal

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 19: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Random geometric graph with 216 nodes

10

100

4 8 16 32 64 128 256

time

/ m [n

s]

number of edges m / number of nodes n

KruskalqKruskalKruskal8filterKruskal+filterKruskalfilterKruskal8qJPpJP

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 20: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Lollipop graph with 210 nodes

10

100

1000

1 2 4 8 16 32 64 128

time

/ m [n

s]

number of edges m / number of nodes n

KruskalqKruskalKruskal8

filterKruskal+filterKruskal

filterKruskal8qJPpJP

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 21: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Lollipop graph with 216 nodes

10

100

1 4 16 64 256 1024

time

/ m [n

s]

number of edges m / number of nodes n

KruskalqKruskalKruskal8filterKruskal+filterKruskalfilterKruskal8qJPpJP

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 22: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Lollipop graph with 222 nodes

32

64

128

256

512

1024

1 2 4 8 16

time

/ m [n

s]

number of edges m / number of nodes n

KruskalqKruskalKruskal8filterKruskal+filterKruskalfilterKruskal8qJPpJP

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm

Page 23: Improving Kruskal’s Algorithm · Kruskal’s Algorithm Procedure kruskal(E,T : Sequence of Edge,P : UnionFind) sort E by increasing edge weight foreach {u,v} ∈ E do if u and v

Image Segmentation Application

10

100

4 8 16

time

/ m [n

s]

m / n

KruskalqKruskalKruskal8

filterKruskal+filterKruskal

filterKruskal8qJPpJP

4 8 16

m / n

Vitaly Osipov, Johannes Singler, Peter Sanders Improving Kruskal’s Algorithm