Incremental Topological Ordering (and Cycle Detection)


Bernhard Haeupler, Telikepalli Kavitha, Rogers Mathew, Siddhartha Sen, and Robert E. Tarjan

Topological Ordering
Input: a directed graph G = (V, E)
Goal: find a total order of V such that
  u < v implies there is no path from v to u in G
If the graph has a cycle, this is impossible.
  Instead, report that a cycle is detected.

[Figure: example directed graph on the vertices A, B, C, D, E]

Unique ordering: A < D < E < B < C

Static algorithms
If G is already known (fixed), this problem is easy:
  Remove one source/sink at a time.
  If there are no sources/sinks, a cycle must exist.
  Runs in O(m + n) time (n vertices, m edges).
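
The source-removal idea above can be sketched in Python as follows. This is a minimal illustration (the approach is commonly known as Kahn's algorithm), not code from the talk. The function name `topological_order` and the edge list `example_edges` are hypothetical; the edges were chosen only so that the result matches the unique ordering A < D < E < B < C.

```python
from collections import deque

def topological_order(vertices, edges):
    """Return a topological order of the graph, or None if it has a cycle."""
    succ = {w: [] for w in vertices}
    indegree = {w: 0 for w in vertices}
    for a, b in edges:
        succ[a].append(b)
        indegree[b] += 1

    # Sources are vertices with no incoming edges.
    sources = deque(w for w in vertices if indegree[w] == 0)
    order = []
    while sources:
        x = sources.popleft()
        order.append(x)              # remove the source x from the graph
        for y in succ[x]:
            indegree[y] -= 1
            if indegree[y] == 0:     # y has become a source
                sources.append(y)

    # Vertices left over means no source remained: a cycle exists.
    return order if len(order) == len(vertices) else None

# Hypothetical edge set (the slides do not list the figure's edges).
example_edges = [("A", "D"), ("D", "E"), ("E", "B"), ("B", "C"), ("A", "B")]
print(topological_order(list("ABCDE"), example_edges))
# prints ['A', 'D', 'E', 'B', 'C']
```

Each vertex and each edge is handled once, which is where the O(m + n) bound comes from.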

What if G changes?
Assume V is fixed, but E could change.
Three possibilities:
  Incremental (edges are only added)
  Decremental (edges are only deleted)
  Dynamic (both)
Here we will deal only with the incremental case.

Goals
A data structure that stores a topological ordering for G
After each edge insertion, updates are fast
  We assume m ≥ n
Compare: running the static algorithm after each edge addition takes O(m^2) total time

Why is this useful?
Two example applications:
1. Evaluation order in spreadsheets
  Topological order gives the order in which to evaluate cells
  Cells may be added, leading to new dependencies
  A cycle represents an error
2. Compilation order
  New files added to a project may introduce new dependencies
  Compilation needs to proceed in topological order

Outline
First approach: limited search
  Improvements
  Analysis
A better way: two-way limited search
  Algorithm overview
  Example
  Semi-ordered search
Search in dense graphs
  Example
Conclusion

Simple approach: “Limited search”
1. Maintain a topological order.
2. After an edge insertion (u, v):
  a) Check whether u < v in the order (if so, nothing needs to change).
  b) Otherwise, run a search from v to u.
  c) If the search succeeds, report a cycle.
  d) Otherwise, we need to fix the order:
    i. Move all visited vertices after unvisited vertices.
    ii. Order the visited vertices.

Making it better
We may have to search most of the graph each time, so this doesn’t improve on O(m^2).
Improvement: when searching from v to u, we can ignore vertices w with u < w.
How much time does it take now?
  If (x, y) is traversed when searching from v, then x < u, so after (u, v) is added it won’t be traversed again from v.
  So over all searches, at most O(mn) edges are traversed.
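
A minimal Python sketch of limited search with the pruning rule above. The class name `LimitedSearchOrder` is hypothetical, and two choices are assumptions of this sketch rather than details from the slides: visited vertices are moved to just after u, and they keep their old relative order. The order is stored in a plain list, so each reordering costs O(n) here.

```python
class LimitedSearchOrder:
    """Topological order maintained under edge insertions by limited search."""

    def __init__(self, vertices):
        self.order = list(vertices)                    # current topological order
        self.pos = {w: i for i, w in enumerate(self.order)}
        self.succ = {w: [] for w in self.order}        # outgoing edges

    def add_edge(self, u, v):
        """Insert (u, v). Return False, leaving the graph unchanged,
        if the edge would close a cycle."""
        if u == v:
            return False
        if self.pos[u] > self.pos[v]:
            limit = self.pos[u]
            visited, stack = {v}, [v]
            while stack:
                x = stack.pop()
                for y in self.succ[x]:
                    if y == u:
                        return False                   # path v -> ... -> u exists
                    # Pruning: vertices after u cannot lead back to u.
                    if y not in visited and self.pos[y] < limit:
                        visited.add(y)
                        stack.append(y)
            # Move the visited vertices to just after u (u is the last
            # unvisited vertex of the prefix), keeping relative order.
            prefix = self.order[:limit + 1]
            kept = [w for w in prefix if w not in visited]
            moved = [w for w in prefix if w in visited]
            self.order[:limit + 1] = kept + moved
            for i in range(limit + 1):
                self.pos[self.order[i]] = i
        self.succ[u].append(v)
        return True

g = LimitedSearchOrder("abc")
g.add_edge("a", "b")         # a already precedes b: no search needed
g.add_edge("c", "a")         # a and b are visited and moved after c
print(g.order)               # prints ['c', 'a', 'b']
print(g.add_edge("b", "c"))  # prints False: this edge would close a cycle
```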

Doing the reordering
Dynamic ordered list problem
  Store a list of distinct elements, supporting:
    Order queries (x < y?)
    Insertions
    Deletions
  Can be done in O(1) time per operation (Dietz & Sleator 1987)
With this structure the total time is O(mn)
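
To make the interface concrete, here is a simplified Python stand-in for a dynamic ordered list. It is not the Dietz and Sleator structure and does not achieve O(1) per operation: it gives each element an integer label and relabels the whole list whenever a gap runs out. The class name `OrderedList`, its method names, and the constant `GAP` are all hypothetical.

```python
class OrderedList:
    """Ordered list with integer labels; order queries compare labels."""
    GAP = 1 << 16

    def __init__(self):
        self._head = object()                # sentinel before the first element
        self._next = {self._head: None}
        self._prev = {}
        self._label = {self._head: 0}

    def insert_after(self, x, new):
        """Insert `new` right after `x` (x=None inserts at the front)."""
        x = self._head if x is None else x
        nxt = self._next[x]
        self._next[x], self._prev[new], self._next[new] = new, x, nxt
        if nxt is not None:
            self._prev[nxt] = new
        lo = self._label[x]
        hi = self._label[nxt] if nxt is not None else lo + 2 * self.GAP
        if hi - lo < 2:
            self._relabel()                  # no free label between x and nxt
        else:
            self._label[new] = (lo + hi) // 2

    def delete(self, x):
        """Unlink x from the list."""
        p, n = self._prev.pop(x), self._next.pop(x)
        self._next[p] = n
        if n is not None:
            self._prev[n] = p
        del self._label[x]

    def before(self, x, y):
        """Order query: does x precede y?"""
        return self._label[x] < self._label[y]

    def _relabel(self):
        # Walk the list and space the labels evenly again: O(n).
        node, label = self._head, 0
        while node is not None:
            self._label[node] = label
            label += self.GAP
            node = self._next[node]

ol = OrderedList()
ol.insert_after(None, "a")
ol.insert_after("a", "b")
ol.insert_after("a", "c")                    # list is now a, c, b
print(ol.before("a", "c"), ol.before("c", "b"))   # prints True True
```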

A better way: two-way search
Instead of just searching forward from v toward u, also search backward from u toward v, until:
  Either a common vertex is found
  Or we are sure there is no path possible
Maintain two sets:
  A contains forward vertices (those reached forward from v) with untraversed outgoing edges
  B contains backward vertices (those reached backward from u) with untraversed incoming edges

Two-way search algorithm
Adding an edge (u, v) with v < u in the order
Initialize:
  Mark v forward, and add it to A if it has an outgoing edge.
  Mark u backward, and add it to B if it has an incoming edge.
Repeat until A is empty, B is empty, or min(A) > max(B):
  Forward search:
    Choose x in A, choose an untraversed edge (x, y), and mark it traversed.
    Delete x from A if it has no untraversed outgoing edges left.
    If y is backward, stop and report a cycle.
    If y is unvisited, mark it forward and add it to A if it has an outgoing edge.
  Backward search: the same as above, with the directions reversed.
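
The steps above can be sketched in Python as follows. This sketch makes two assumptions: it always takes the minimum of A and the maximum of B (using heaps), and it strictly alternates forward and backward steps, checking the stopping condition after every step. The function name `two_way_search` and its return format are hypothetical. The edge list in the usage lines contains only the edges named in the slides of this talk; the full example graph may have more.

```python
import heapq

def two_way_search(order, edges, u, v):
    """Search for a new edge (u, v) where v precedes u in `order`.

    Returns None if the edge would create a cycle; otherwise returns
    (forward, backward, A, B) as found when the search stops.
    """
    pos = {w: i for i, w in enumerate(order)}
    out_edges = {w: [] for w in order}
    in_edges = {w: [] for w in order}
    for a, b in edges:
        out_edges[a].append(b)
        in_edges[b].append(a)

    forward, backward = {v}, {u}
    A = [pos[v]] if out_edges[v] else []   # min-heap of positions
    B = [-pos[u]] if in_edges[u] else []   # max-heap (negated positions)
    next_out = dict.fromkeys(order, 0)     # next untraversed outgoing edge
    next_in = dict.fromkeys(order, 0)      # next untraversed incoming edge

    forward_turn = True
    while A and B and A[0] < -B[0]:
        if forward_turn:
            x = order[A[0]]
            y = out_edges[x][next_out[x]]
            next_out[x] += 1
            if next_out[x] == len(out_edges[x]):
                heapq.heappop(A)           # x has no untraversed edges left
            if y in backward:
                return None                # forward search met backward search
            if y not in forward:
                forward.add(y)
                if out_edges[y]:
                    heapq.heappush(A, pos[y])
        else:
            x = order[-B[0]]
            y = in_edges[x][next_in[x]]
            next_in[x] += 1
            if next_in[x] == len(in_edges[x]):
                heapq.heappop(B)
            if y in forward:
                return None
            if y not in backward:
                backward.add(y)
                if in_edges[y]:
                    heapq.heappush(B, -pos[y])
        forward_turn = not forward_turn

    return (forward, backward,
            [order[i] for i in sorted(A)],
            [order[i] for i in sorted(-j for j in B)])

order = list("avcbehgfud")
edges = [("v", "c"), ("c", "b"), ("f", "u"), ("g", "f")]   # assumed edge set
print(two_way_search(order, edges, "u", "v"))
```

With this assumed edge set, the search stops with forward vertices {v, c, b}, backward vertices {u, f}, A empty, and B = [f].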

Restoring order
The search doesn’t find a cycle
  So either A is empty, B is empty, or min(A) > max(B)
Set t = min(A) (or max(B), or u)
Set F to all forward vertices < t
Set R to all backward vertices > t
Rearrange: if t is not forward (the other case is symmetric):
  Move those in R after t, and those in F after R
  Order those in R and F topologically
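
A minimal Python sketch of the rearrangement for the case where t is not forward. The function name `restore_order` is hypothetical. As a simplifying assumption, R and F each keep their old relative order, which is still topological for the edges among them because the old order was valid before the insertion. The input values are taken from the worked example in this talk.

```python
def restore_order(order, forward, backward, t):
    """Rearrange `order` around the pivot t, assuming t is not forward."""
    pos = {w: i for i, w in enumerate(order)}
    # F: forward vertices before t; R: backward vertices after t.
    F = [w for w in order if w in forward and pos[w] < pos[t]]
    R = [w for w in order if w in backward and pos[w] > pos[t]]
    moved = set(F) | set(R)

    new_order = []
    for w in order:
        if w in moved:
            continue                 # re-inserted right after t below
        new_order.append(w)
        if w == t:
            new_order.extend(R)      # R goes directly after t
            new_order.extend(F)      # F goes after R
    return new_order

old = list("avcbehgfud")
new = restore_order(old, forward={"v", "c", "b"}, backward={"u", "f"}, t="f")
print(", ".join(new))                # prints a, e, h, g, f, u, v, c, b, d
```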

Two-way search Example

[Figure: example graph on the vertices a, b, c, d, e, f, g, h, u, v]

Original topological order: a, v, c, b, e, h, g, f, u, d




Adding new edge (u, v)


A: {v}
B: {u}

Forward step

A: {c}
B: {u}

Backward step

A: {c}
B: {f}

Forward step

A: {} (A is now empty, so we are done)
B: {f}

Restoring order

t = f, F = {v, c, b}, R = {u}


New topological order: a, e, h, g, f, u, v, c, b, d

Which step to perform?
In the example we alternated forward and backward steps
  This is balanced search
To choose a vertex, choose the minimum vertex from A and the maximum from B
  This is ordered search
To implement, make A and B heaps
Theorem 1: Ordered, balanced search does O(m^{3/2}) steps, so O(m^{3/2} log n) total time

Semi-ordered search
Ordered search is too restrictive
Compatible search:
  Balanced search where the backward step is compatible with the preceding forward step
  If the preceding forward step chose vertex w, then the backward step can choose any z as long as w < z
Theorem 2: Compatible search does O(m^{3/2}) steps.

Proof of Theorem 2
The last search does O(m) steps.
For the other searches, we count edge-edge pairs.
Let (w, x) and (y, z) be edges with w < z, where
  (w, x) is traversed in a forward step
  (y, z) is traversed in a backward step
The edge that was added creates a path from z through (u, v) to w
  So these edges are now on the same path
  But they weren’t before

Proof of Theorem 2 (cont’d)
Let there be k backward steps
Sort all edges (w, x) traversed in forward steps on w
Each edge has a “twin” edge (y, z)
  w < z by compatibility
Each edge forms a related pair with its twin, and with the twin of every following edge in the sorted order: k(k+1)/2 = Θ(k^2) pairs

Proof of Theorem 2 (cont’d)
For each search, two cases:
If it does < 2m^{1/2} forward steps:
  Then the total time for all these searches is O(m^{3/2})
Otherwise:
  It increases the number of pairs by Ω(k^2)
  It does k forward steps, increasing the number of edge pairs by Ω(k m^{1/2})
  The total number of edge pairs is O(m^2)
  So these searches also do O(m^{3/2}) steps in total
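
The two cases can be written out as a short calculation. This is a sketch under stated assumptions: k_i denotes the number of forward steps in search i, there are at most m searches (one per inserted edge), each large search creates at least k_i(k_i+1)/2 new related pairs, and a pair that has become related is never counted again. The `align*` environment assumes the amsmath package.

```latex
% k_i = number of forward steps in search i; at most m searches in total.
\begin{align*}
\text{small searches } (k_i < 2m^{1/2}):\quad
  & \sum_i k_i \;<\; m \cdot 2m^{1/2} \;=\; 2m^{3/2}, \\
\text{large searches } (k_i \ge 2m^{1/2}):\quad
  & \text{new pairs in search } i \;\ge\; \frac{k_i(k_i+1)}{2}
    \;>\; \frac{k_i^2}{2} \;\ge\; k_i\, m^{1/2}, \\
  & m^{1/2} \sum_i k_i \;\le\; \binom{m}{2} \;<\; m^2
    \;\Longrightarrow\; \sum_i k_i \;<\; m^{3/2}.
\end{align*}
```

Both sums are O(m^{3/2}), which gives the bound of Theorem 2.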

Semi-ordered search conclusion
Extra flexibility allows O(1) time per step
  Use sets (represented as arrays) instead of heaps
  Use conservative estimates instead of the real min(A), max(B) and t
  Thus O(m^{3/2}) total time.
Lower bound: any local incremental algorithm must spend Ω(n m^{1/2}) time.
  Local: only reorders vertices between v and u.
  So semi-ordered search is optimal for sparse graphs.

Dense graphs
If m = Ω(n^2) then semi-ordered search is no better than regular limited search
New approach: topological search
  Still a two-way search
Main differences:
  Balances vertices instead of edges
  Searches in topological order
  Uses a different reordering method

Topological search
Inserting a new edge (u, v) with u > v
  F = vertices reachable from v going forward, using a current vertex x
  R = vertices from which u is reachable going backward, using a current vertex y
Initialization:
  Set F = {v}, R = {u}, x = v, y = u
Alternate until x = y:
  Replace x by the next vertex in the order until there is an edge (w, x) with w in F. Then add x to F.
  Replace y by the previous vertex in the order until there is an edge (y, z) with z in R. Then add y to R.
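
The alternation above can be sketched in Python as follows. The sketch omits cycle detection, so it assumes the new edge does not create a cycle, and it stops as soon as the two pointers meet. The function name `topological_search` is hypothetical, and the edge list contains only the edges named in the slides of this talk.

```python
def topological_search(order, edges, u, v):
    """Scan inward from v and u along `order`; return (F, R, meeting vertex)."""
    pos = {w: i for i, w in enumerate(order)}
    preds = {w: set() for w in order}
    succs = {w: set() for w in order}
    for a, b in edges:
        succs[a].add(b)
        preds[b].add(a)

    F, R = {v}, {u}
    i, j = pos[v], pos[u]            # indices of the current vertices x and y
    forward_turn = True
    while i < j:
        if forward_turn:
            # Advance x until it has an incoming edge from F (or reaches y).
            i += 1
            while i < j and not (preds[order[i]] & F):
                i += 1
            if i < j:
                F.add(order[i])
        else:
            # Move y back until it has an outgoing edge into R (or reaches x).
            j -= 1
            while i < j and not (succs[order[j]] & R):
                j -= 1
            if i < j:
                R.add(order[j])
        forward_turn = not forward_turn
    return F, R, order[i]

order = list("avcbehgfud")
edges = [("v", "c"), ("c", "b"), ("f", "u"), ("g", "f")]   # assumed edge set
F, R, t = topological_search(order, edges, "u", "v")
print(sorted(F), sorted(R), t)       # prints ['b', 'c', 'v'] ['f', 'g', 'u'] g
```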

Topological Search Example

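[Figure: the example graph on the vertices a, b, c, d, e, f, g, h, u, v, with topological order a, v, c, b, e, h, g, f, u, d]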


Adding new edge (u, v)


F: {v}, x = v
R: {u}, y = u


F: {v, c}, x = c; (v, c) is an edge
R: {u}, y = u


F: {v, c}, x = c
R: {u, f}, y = f; (f, u) is an edge


F: {v, c, b}, x = b; (c, b) is an edge
R: {u, f}, y = f


F: {v, c, b}, x = b
R: {u, f, g}, y = g; (g, f) is an edge


F: {v, c, b}, x moves through e and h to g. Now x = y, so stop.
R: {u, f, g}, y = g

Vertex Reordering
Could do it just like before, using t = x = y as the “pivot” value

F = {v, c, b}, R = {u, f, g}, t = g
New topological order: a, e, h, g, f, u, v, c, b, d

Why not?
Doing the reordering this way doesn’t give good time bounds: Ω(n^3) in the worst case
We need a new reordering method
Sketch:
  Treat F as a queue of “violated” vertices which must be reordered
  Repeat:
    Search through the topological order to find a violated vertex, then add it to F
    Replace that vertex with an invalid vertex
  (Alternate, doing the same for R)

Topological search analysis
Analyzing the running time is tricky
Analysis:
  Runtime is bounded by a constant times the sum of the distances of all vertex moves
  To simplify, break down vertex moves into pairwise vertex swaps
  Potential analysis: any sequence of swaps has total distance O(n^{2.5})
Theorem 3: Topological search spends O(n^{2.5}) time for all edge additions.

Conclusion
Semi-ordered search takes O(m^{3/2}) time
Topological search takes O(n^{2.5}) time
  But its true running time may be O(n^2 log n)
A different algorithm runs in O(n^2 log n) time (Bender, Fineman, and Gilbert 2009)
  Exploits the idea of a partial order
Is the lower bound of Ω(n m^{1/2}) achievable for all graphs?

THE END
Questions?
