Part4 graph algorithms

Part IVGraph Algorithms

1Saturday, April 30, 2011

그래프 (Graph)

(무방향) 그래프 G=(V,E)•V : 노드•E : 노드쌍을 연결하는 에지•개체(object)들 간의 이진관계를 표현•n=|V|, m=|E|

1

Chapter 3

Graphs

Slides by Kevin Wayne.Copyright © 2005 Pearson-Addison Wesley.All rights reserved.

3.1 Basic Definitions and Applications

3

Undirected Graphs

Undirected graph. G = (V, E)

! V = nodes.

! E = edges between pairs of nodes.

! Captures pairwise relationship between objects.

! Graph size parameters: n = |V|, m = |E|.

V = { 1, 2, 3, 4, 5, 6, 7, 8 }

E = { 1-2, 1-3, 2-3, 2-4, 2-5, 3-5, 3-7, 3-8, 4-5, 5-6 }

n = 8

m = 11

4

Some Graph Applications

transportation

Graph

street intersections

Nodes Edges

highways

communication computers fiber optic cables

World Wide Web web pages hyperlinks

social people relationships

food web species predator-prey

software systems functions function calls

scheduling tasks precedence constraints

circuits gates wires

V={1,2,3,4,5,6,7,8}

E={(1,2),(1,3),(2,3),...,(7,8)}

n=8

m=11


방향그래프(Directed Graph) G=(V,E)

•에지 (u,v)는 u로부터 v로의 방향을 가짐

•예: Web graph

가중치(weighted) 그래프

•에지마다 가중치(weight)가 지정

3.5 Connectivity in Directed Graphs

34

Directed Graphs

Directed graph. G = (V, E)

! Edge (u, v) goes from node u to node v.

Ex. Web graph - hyperlink points from one web page to another.

! Directedness of graph is crucial.

! Modern web search engines exploit hyperlink structure to rank web

pages by importance.

35

Graph Search

Directed reachability. Given a node s, find all nodes reachable from s.

Directed s-t shortest path problem. Given two node s and t, what is

the length of the shortest path between s and t?

Graph search. BFS extends naturally to directed graphs.

Web crawler. Start from web page s. Find all web pages linked from s,

either directly or indirectly.

36

Strong Connectivity

Def. Node u and v are mutually reachable if there is a path from u to v

and also a path from v to u.

Def. A graph is strongly connected if every pair of nodes is mutually

reachable.

Lemma. Let s be any node. G is strongly connected iff every node is

reachable from s, and s is reachable from every node.

Pf. ! Follows from definition.

Pf. " Path from u to v: concatenate u-s path with s-v path.

Path from v to u: concatenate v-s path with s-u path. !

s

v

u

ok if paths overlap

방향 그래프와 가중치 그래프


인접행렬 (adjacency matrix)

|V| × |V| matrix

5

World Wide Web

Web graph.

! Node: web page.

! Edge: hyperlink from one page to another.

cnn.com

cnnsi.comnovell.comnetscape.com timewarner.com

hbo.com

sorpranos.com

6

9-11 Terrorist Network

Social network graph.

! Node: people.

! Edge: relationship between two people.

Reference: Valdis Krebs, http://www.firstmonday.org/issues/issue7_4/krebs

7

Ecological Food Web

Food web graph.

! Node = species.

! Edge = from prey to predator.

Reference: http://www.twingroves.district96.k12.il.us/Wetlands/Salamander/SalGraphics/salfoodweb.giff

8

Graph Representation: Adjacency Matrix

Adjacency matrix. n-by-n matrix with Auv = 1 if (u, v) is an edge.

! Two representations of each edge.

! Space proportional to n2.

! Checking if (u, v) is an edge takes !(1) time.

! Identifying all edges takes !(n2) time.

1 2 3 4 5 6 7 8

1 0 1 1 0 0 0 0 0

2 1 0 1 1 1 0 0 0

3 1 1 0 0 1 0 1 1

4 0 1 0 1 1 0 0 0

5 0 1 1 1 0 1 0 0

6 0 0 0 0 1 0 0 0

7 0 0 1 0 0 0 0 1

8 0 0 1 0 0 0 1 0

그래프의 표현

•저장 공간: O(n2)•어떤 노드 v에 인접한 모든 노드 찾기: O(n) 시간•어떤 에지 (u,v)가 존재하는지 검사: O(n) 시간

대칭행렬

A = (aij), where aij =

�1 if (i, j) ∈ E,

0 otherwise.


그래프의 표현

인접리스트 (adjacency list)

•정점 집합을 표현하는 하나의 배열과

•각 정점마다 인접한 정점들의 연결 리스트

•저장 공간: O(n+m)•어떤 노드 v에 인접한 모든 노드 찾기: O(degree(v)) 시간•어떤 에지 (u,v)가 존재하는지 검사: O(degree(u)) 시간

노드개수: 2m


방향그래프

• 인접행렬은 비대칭

• 인접 리스트는 m개의 노드를 가짐


가중치 그래프의 인접행렬 표현

•에지의 존재를 나타내는 값으로 1대신 에지의 가중치를 저장

•에지가 없을 때 혹은 대각선:

•특별히 정해진 규칙은 없으며, 그래프와 가중치가 의미하는 바에 따라서

•예: 가중치가 거리 혹은 비용을 의미하는 경우라면 에지가 없으면 ∞, 대각선은 0.

•예: 만약 가중치가 용량을 의미한다면 에지가 없을때 0, 대각선은 ∞


경로와 연결성

• 무방향 그래프 G=(V,E)에서 노드 u와 노드 v를 연결하는 경로(path)가 존재할 때 v와 u는 서로 연결되어 있다고 말함

• 노든 노드 쌍들이 서로 연결된 그래프를 연결된(connected) 그래프라고 한다.

• 연결요소 (connected component)

9

Graph Representation: Adjacency List

Adjacency list. Node indexed array of lists.

! Two representations of each edge.

! Space proportional to m + n.

! Checking if (u, v) is an edge takes O(deg(u)) time.

! Identifying all edges takes !(m + n) time.

1 2 3

2

3

4 2 5

5

6

7 3 8

8

1 3 4 5

1 2 5 87

2 3 4 6

5

degree = number of neighbors of u

3 7

10

Paths and Connectivity

Def. A path in an undirected graph G = (V, E) is a sequence P of nodes

v1, v2, …, vk-1, vk with the property that each consecutive pair vi, vi+1 is

joined by an edge in E.

Def. A path is simple if all nodes are distinct.

Def. An undirected graph is connected if for every pair of nodes u and

v, there is a path between u and v.

11

Cycles

Def. A cycle is a path v1, v2, …, vk-1, vk in which v1 = vk, k > 2, and the

first k-1 nodes are all distinct.

cycle C = 1-2-4-5-3-1

12

Trees

Def. An undirected graph is a tree if it is connected and does not

contain a cycle.

Theorem. Let G be an undirected graph on n nodes. Any two of the

following statements imply the third.

! G is connected.

! G does not contain a cycle.

! G has n-1 edges.

3개의 연결요소로 구성


너비우선탐색 (BFS)

•입력: 방향 혹은 무방향 그래프 G=(V,E), 그리고 출발노드 s∈V

•출력: 모든 노드 v에 대해서

• d[v] = s로 부터 v까지의 거리(에지 수)

• π[v] = s로부터 v까지의 최단경로상의 v직전 노드

• {(π[v],v) | v∈V}는 s를 루트로 하는 하나의 트리가 됨

we call π[v] the predecessor of v


너비우선탐색 (BFS)

BFS 알고리즘• L0 = {s}• L1 = L0의 모든 이웃 노드들• L2 = L1의 이웃들 중 L0와 L1에 속하지 않는 모든 노드들• Li = Li-1의 이웃들 중 이전 레벨에 속하지 않는 모든 노드들

17

Connectivity

s-t connectivity problem. Given two node s and t, is there a path

between s and t?

s-t shortest path problem. Given two node s and t, what is the length

of the shortest path between s and t?

Applications.

! Friendster.

! Maze traversal.

! Kevin Bacon number.

! Fewest number of hops in a communication network.

18

Breadth First Search

BFS intuition. Explore outward from s in all possible directions, adding

nodes one "layer" at a time.

BFS algorithm.

! L0 = { s }.

! L1 = all neighbors of L0.

! L2 = all nodes that do not belong to L0 or L1, and that have an edge

to a node in L1.

! Li+1 = all nodes that do not belong to an earlier layer, and that have

an edge to a node in Li.

Theorem. For each i, Li consists of all nodes at distance exactly i

from s. There is a path from s to t iff t appears in some layer.

s L1 L2 L n-1

19


Property. Let T be a BFS tree of G = (V, E), and let (x, y) be an edge of

G. Then the level of x and y differ by at most 1.

L0

L1

L2

L3

20

Breadth First Search: Analysis

Theorem. The above implementation of BFS runs in O(m + n) time if

the graph is given by its adjacency representation.

Pf.

! Easy to prove O(n2) running time:

– at most n lists L[i]

– each node occurs on at most one list; for loop runs ! n times

– when we consider node u, there are ! n incident edges (u, v),

and we spend O(1) processing each edge

! Actually runs in O(m + n) time:

– when we consider node u, there are deg(u) incident edges (u, v)

– total time processing edges is "u#V deg(u) = 2m !

each edge (u, v) is counted exactly twicein sum: once in deg(u) and once in deg(v)


너비우선순회 (BFS)

17

Connectivity


between s and t?



Applications.

! Friendster.

! Maze traversal.



18




BFS algorithm.

! L0 = { s }.



to a node in L1.





s L1 L2 L n-1

19




L0

L1

L2

L3

20




Pf.










1

a queue

s=11. insert the start node into queue√



17

Connectivity


between s and t?



Applications.

! Friendster.

! Maze traversal.



18




BFS algorithm.

! L0 = { s }.



to a node in L1.





s L1 L2 L n-1

19




L0

L1

L2

L3

20




Pf.










1

s=12. remove a node from queue3. insert neighbors of it into the queue

2 3

√

√ √



17

Connectivity


between s and t?



Applications.

! Friendster.

! Maze traversal.



18




BFS algorithm.

! L0 = { s }.



to a node in L1.





s L1 L2 L n-1

19




L0

L1

L2

L3

20




Pf.











2

3

√

√ √

√ √

3 4 5



17

Connectivity


between s and t?



Applications.

! Friendster.

! Maze traversal.



18




BFS algorithm.

! L0 = { s }.



to a node in L1.





s L1 L2 L n-1

19




L0

L1

L2

L3

20




Pf.











3

4

√

√ √

√ √

4

5

65

√


너비우선순회

O(n+m) with adjacent list



•s로부터 노드 v까지의 경로 출력하기



•그래프가 disconnected이거나 혹은 방향 그래프라면 BFS에 의해서 모든 노드가 방문되지 않을 수도 있음

•BFS를 반복하여 모든 노드 방문

BFS2( G ){ while there exists unvisited node v BFS(G, v);}


깊이우선탐색 (DFS)

•입력: 방향 혹은 무방향 그래프 G=(V,E). 출발노드 없음

•출력: 모든 노드 v에 대해서

• d[v] = discovery time (처음 발견한 시간)

• f[v] = finishing time (인접한 모든 노드의 방문 완료 시간)

• π[v] = v’s predecessor



96 Algorithms

Figure 3.4 The result of explore(A) on the graph of Figure 3.2.

I

E

J

C

F

B

A

D

G

H

Figure 3.5 Depth-first search.procedure dfs(G)

for all v ! V :visited(v) = false

for all v ! V :if not visited(v): explore(v)

3.2.2 Depth-first searchThe explore procedure visits only the portion of the graph reachable from its starting point.To examine the rest of the graph, we need to restart the procedure elsewhere, at some vertexthat has not yet been visited. The algorithm of Figure 3.5, called depth-first search (DFS),does this repeatedly until the entire graph has been traversed.The first step in analyzing the running time of DFS is to observe that each vertex is

explore’d just once, thanks to the visited array (the chalk marks). During the explorationof a vertex, there are the following steps:1. Some fixed amount of work—marking the spot as visited, and the pre/postvisit.

2. A loop in which adjacent edges are scanned, to see if they lead somewhere new.This loop takes a different amount of time for each vertex, so let’s consider all vertices to-gether. The total work done in step 1 is then O(|V |). In step 2, over the course of the entireDFS, each edge {x, y} ! E is examined exactly twice, once during explore(x) and once dur-ing explore(y). The overall time for step 2 is therefore O(|E|) and so the depth-first search

94 Algorithms

Figure 3.2 Exploring a graph is rather like navigating a maze.

A

C

B

F

D

H I J

K

E

G

L

H

G

DA

C

FKL

J

I

B

E

What parts of the graph are reachable from a given vertex?

To understand this task, try putting yourself in the position of a computer that has just beengiven a new graph, say in the form of an adjacency list. This representation offers just onebasic operation: finding the neighbors of a vertex. With only this primitive, the reachabilityproblem is rather like exploring a labyrinth (Figure 3.2). You start walking from a fixed placeand whenever you arrive at any junction (vertex) there are a variety of passages (edges) youcan follow. A careless choice of passages might lead you around in circles or might cause youto overlook some accessible part of the maze. Clearly, you need to record some intermediateinformation during exploration.This classic challenge has amused people for centuries. Everybody knows that all you

need to explore a labyrinth is a ball of string and a piece of chalk. The chalk prevents looping,by marking the junctions you have already visited. The string always takes you back to thestarting place, enabling you to return to passages that you previously saw but did not yetinvestigate.How can we simulate these two primitives, chalk and string, on a computer? The chalk

marks are easy: for each vertex, maintain a Boolean variable indicating whether it has beenvisited already. As for the ball of string, the correct cyberanalog is a stack. After all, the exactrole of the string is to offer two primitive operations—unwind to get to a new junction (thestack equivalent is to push the new vertex) and rewind to return to the previous junction (popthe stack).Instead of explicitly maintaining a stack, we will do so implicitly via recursion (which

is implemented using a stack of activation records). The resulting algorithm is shown inFigure 3.3.1 The previsit and postvisit procedures are optional, meant for performingoperations on a vertex when it is first discovered and also when it is being left for the lasttime. We will soon see some creative uses for them.

1As with many of our graph algorithms, this one applies to both undirected and directed graphs. In such cases,we adopt the directed notation for edges, (x, y). If the graph is undirected, then each of its edges should be thoughtof as existing in both directions: (x, y) and (y, x).

begin here



•white: not visited•gray: visited, but not finished•black: finished


연결요소

•무방향 그래프의 경우 BFS 혹은 DFS를 이용해서 연결요소를 찾을수 있음

void connected_comp(){ for ( int u=1; u<=n; u++ ) if ( visited[u] == false ) dfs_visit(u,index++);}

void dfs_visit( int u, int index ){ visited[u] = index; for each unvisited neighbor v dfs_visit(v,index);}


강연결성

•방향 그래프 G=(V,E)

•노드 u로부터 v까지 경로가 존재할 경우 u➙v라고 표현

•두 노드 u와 v에 대해서 u➙v이면서 v➙u일 경우 두 노드는 mutually reachable하다고 한다.

•그래프 G의 모든 노드쌍들이 mutually reachable할때 G는 strongly connected라고 한다.


강연결성

임의의 노드 s에 대해서 모든 노드가 s로부터 reachable하고, 모든 노드로부터 s가 reachable하면 G는 강연결(strongly-connected) 그래프이다.

3.5 Connectivity in Directed Graphs

34

Directed Graphs

Directed graph. G = (V, E)

! Edge (u, v) goes from node u to node v.

Ex. Web graph - hyperlink points from one web page to another.

! Directedness of graph is crucial.

! Modern web search engines exploit hyperlink structure to rank web

pages by importance.

35

Graph Search

Directed reachability. Given a node s, find all nodes reachable from s.

Directed s-t shortest path problem. Given two node s and t, what is

the length of the shortest path between s and t?

Graph search. BFS extends naturally to directed graphs.

Web crawler. Start from web page s. Find all web pages linked from s,

either directly or indirectly.

36

Strong Connectivity

Def. Node u and v are mutually reachable if there is a path from u to v

and also a path from v to u.

Def. A graph is strongly connected if every pair of nodes is mutually

reachable.

Lemma. Let s be any node. G is strongly connected iff every node is

reachable from s, and s is reachable from every node.

Pf. ! Follows from definition.

Pf. " Path from u to v: concatenate u-s path with s-v path.

Path from v to u: concatenate v-s path with s-u path. !

s

v

u

ok if paths overlap


강연결성 검사

1. 임의의 노드 s를 선택.

2. G에서 s로부터 BFS 실행.

3. 그래프 GT 에서 s로부터 BFS실행.

4. 모든 노드들이 두번의 BFS에서 방문되면 강연결 그래프.

37

Strong Connectivity: Algorithm

Theorem. Can determine if G is strongly connected in O(m + n) time.

Pf.

! Pick any node s.

! Run BFS from s in G.

! Run BFS from s in Grev.

! Return true iff all nodes reached in both BFS executions.

! Correctness follows immediately from previous lemma. !

reverse orientation of every edge in G

strongly connected not strongly connected

3.6 DAGs and Topological Ordering

39

Directed Acyclic Graphs

Def. An DAG is a directed graph that contains no directed cycles.

Ex. Precedence constraints: edge (vi, vj) means vi must precede vj.

Def. A topological order of a directed graph G = (V, E) is an ordering

of its nodes as v1, v2, …, vn so that for every edge (vi, vj) we have i < j.

a DAG a topological ordering

v2 v3

v6 v5 v4

v7 v1

v1 v2 v3 v4 v5 v6 v7

40

Precedence Constraints

Precedence constraints. Edge (vi, vj) means task vi must occur before vj.

Applications.

! Course prerequisite graph: course vi must be taken before vj.

! Compilation: module vi must be compiled before vj. Pipeline of

computing jobs: output of job vi needed to determine input of job vj.

GT : G의 모든 에지의 방향을 뒤집은 그래프


Directed Acyclic Graph

•DAG는 방향 사이클(directed cycle)이 없는 방향 그래프. 예: 작업들의 우선순위

•위상순서(topological order)

•노드들의 순서화 v1, v2,...,vn, 단, 모든 에지 (vi,vj)에 대해서 i<j가 되도록.

37

Strong Connectivity: Algorithm

Theorem. Can determine if G is strongly connected in O(m + n) time.

Pf.

! Pick any node s.

! Run BFS from s in G.

! Run BFS from s in Grev.

! Return true iff all nodes reached in both BFS executions.

! Correctness follows immediately from previous lemma. !

reverse orientation of every edge in G

strongly connected not strongly connected

3.6 DAGs and Topological Ordering

39

Directed Acyclic Graphs

Def. An DAG is a directed graph that contains no directed cycles.

Ex. Precedence constraints: edge (vi, vj) means vi must precede vj.

Def. A topological order of a directed graph G = (V, E) is an ordering

of its nodes as v1, v2, …, vn so that for every edge (vi, vj) we have i < j.

a DAG a topological ordering

v2 v3

v6 v5 v4

v7 v1

v1 v2 v3 v4 v5 v6 v7

40

Precedence Constraints

Precedence constraints. Edge (vi, vj) means task vi must occur before vj.

Applications.

! Course prerequisite graph: course vi must be taken before vj.

! Compilation: module vi must be compiled before vj. Pipeline of

computing jobs: output of job vi needed to determine input of job vj.


위상순서

computes the indegree of every node;find a node v with indegree zero;enqueue(v);while( !queue_empty() ) { v = dequeue(); print v; check v; for ( each unchecked neighbor w of v ) { decrement the indegree of w by one; if the indegree of w becomes zero enqueue(w); }} O(n+m) with adjacent list


강연결성분strongly-connected component

그래프 G의 강연결성분(SCC) C는 다음과 같은 조건을 만족하는 부그래프(subgraph)이다.

1.C에 속한 모든 노드들은 서로 reachable하다.

2.C에 속하지 않은 어떤 노드도 C에 속한 어떤 노드와 서로reachable하지 않다.

즉, maximal subset of mutually reachable nodes


강연결성분

d

b

f

e

a

c

g

h

b e

da

cfh

gSCC


강연결성분

d

b

f

e

a

c

g

h

b e

da

cfh

g

more than one SCC


강연결성분


그래프 GT

•그래프 GT

•GT = (V, ET), ET = {(u,v) | (v,u) ∈E }•O(n+m) 시간에 GT 의 인접리스트 구성 가능

•관찰: G와 GT 는 동일한 SCC를 가진다. 즉 임의의 두 노드 u와 v가 G에서 서로 reachable하면 GT 에서도 서로 reachable하다.


강연결성분

• 강연결성분들은 서로 disjoint한가?

a SCC another SCC

YES !!

The SCCs partition nodes of graph.


강연결성분

d

b

f

e

a

c

g

h

b e

da

cfh

g

4개의 강연결 성분을 가짐


SCC Graph

d

b

f

e

a

cg

h

be

da

cfh

g

C1

C2

C3

C4

•GSCC = (VSCC, ESCC)•VSCC는 G의 각 SCC마다 하나의 노드를 가짐

•G의 두 SCC간에 에지가 있으면 그 SCC를 표현하는 GSCC

의 두 노드간에 동일 방향의 에지가 존재


SCC Graph

C1C2

C3C4

SCC graph is always a DAG. Why?

d

b

f

e

a

c

g

h

b e

da

cfh

g


SCC 찾기

•가장 간단한 알고리즘

while there remains a node in G { choose a node v in G; perform DFS(v) in G ; perform DFS(v) in Grev; find nodes common to both; remove them from G;}

시간 복잡도 ?


개선된 알고리즘

1. DFS(G)를 호출하여 모든 노드들에 대해서 종료시간 f[u]를 계산한다.

2. GT를 만든다.

3. DFS(GT)를 호출한다. 단 노드들을 1단계에서 계산해둔 f[u]값이 감소하는 순서대로 고려한다.

4. DFS(GT)에 의해서 만들어지는 DFS forest의 각각의 트리가 하나의 SCC가 된다.


13/14 11/16 1/10 8/9

12/15 3/4 2/7 5/6

a b c d

e f g h

DFS(G)의 결과


13/14 11/16 1/10 8/9

12/15 3/4 2/7 5/6

a b c d

e f g h

DFS(GT)

start here

Nodes {b,a,e} form a SCC.


13/14 11/16 1/10 8/9

12/15 3/4 2/7 5/6

a b c d

e f g h

DFS(GT)

start here

Nodes {c,d} form a SCC.


13/14 11/16 1/10 8/9

12/15 3/4 2/7 5/6

a b c d

e f g h

DFS(GT)

start here

Nodes {g,f} form a SCC.


13/14 11/16 1/10 8/9

12/15 3/4 2/7 5/6

a b c d

e f g h

DFS(GT)

start hereNodes {h} form a SCC.


13/14 11/16 1/10 8/9

12/15 3/4 2/7 5/6

a b c d

e f g h

DFS(GT)

4개의 SCC가 존재


Correctness

•임의의 SCC C에 대해서 f(C) = max{ f(u) | u∈C }

•C와 C’를 G의 두 SCC라고 하자.

•만약 u∈C이고 v∈C’인 두 노드 사이에 에지 (u,v)가 존재한다면 f(C)>f(C’)이다.

DFS를 C에 속한 노드에서 먼저 시작한 경우와, 반대의 경우로 나누어 생각해보면 간단히

증명가능


Correctness

•G의 두 SCC C와 C’에 대해 f(C)>f(C’)라고 하자.

•그러면 GT에는 C로부터 C’으로 가는 에지가 존재하지 않는다.

따라서 f[u]가 최대인 노드에서 출발한 DFS는 그 노드가 속한 SCC를 벗

어날 수 없다.


모든 강연결성분 구하기

시간복잡도 O(n+m)


Part4 graph algorithms

Documents

Transcript of Part4 graph algorithms