Algorithms and Data Structures for Data Science Graph ...

27
Department of Computer Science Algorithms and Data Structures for Data Science CS 277 Brad Solomon November 8, 2021 Graph Traversals

Transcript of Algorithms and Data Structures for Data Science Graph ...

Department of Computer Science

Algorithms and Data Structures for Data Science

CS 277 Brad Solomon

November 8, 2021

Graph Traversals

mp_huffman review

Average:

Final Project Mid-Project Check-ins

Mid-project check-ins start next week (November 8th-12th)!

Proposal resubmissions due November 5th

To sign up for a mid-project meeting time:

https://bit.ly/3GzmC3P

Learning Objectives

Introduce graph coloring and graph traversals

Identify utility of traversal labelings

Define key graph functions and discuss implementation details

Graph Implementation: Edge List

v

u

w z

u

v

w

z

Vertex Storage:

Edge Storage:

|V|= n,|E|= m

Graph Implementation: Adjacency Matrix

v

u

w z

u

v

w

z

Vertex Storage:

Edge Storage:

u v w z

u 0 1 1 0

v 1 0 1 0

w 1 1 0 1

z 0 0 1 0

Adjacency List

v

u

w z

Vertex Storage:

Edge Storage:u

v

w

z

v w

u w

v u z

z

d=2

d=2

d=3

d=1

Graph Coloring

5

36

4

2

1

4

Graph Coloring

A

G F

HB

E

D

I

C

Example modified from Princeton (Math Alive) — Author unknown

Graph Coloring

F

C I

BG

H

E

A

D

Example modified from Princeton (Math Alive) — Author unknown

Graph Coloring

An optimal solution can be found — but is NP-complete

Heuristic algorithms have no guarantees (but are usually fast)

Our heuristic has an upper bound but not a clear lower bound

Traversal: Objective: Visit every vertex and every edge in the graph.

Purpose: Search for interesting sub-structures in the graph.

We’ve seen traversal before ….but it’s different:

• Ordered • Obvious Start •

• • •

Traversal: BFS

A

C D

E

B

G H

F

Traversal: BFS v d P Adjacent Edges

A

B

C

D

E

F

G

H

A

C D

E

B

G H

F

Traversal: BFS

G H F E D C B A

v d P Adjacent Edges

A 0 - B C D

B 1 A A C E

C 1 A A B D E F

D 1 A A C F H

E 2 C B C G

F 2 C C D G

G 3 E E F H

H 2 D D G

A

C D

E

B

G H

F

Traversal: BFS

G H F E D B C A

v d P Adjacent Edges

A 0 - C B D

B 1 A A C E

C 1 A A B D E F

D 1 A A C F H

E 2 C B C G

F 2 C C D G

G 3 E E F H

H 2 D D G

A

C D

E

B

G H

F

class vertObj(): def __init__(self, v): self.val = v self.eList = []

# Added for BFS self.visited = 0 self.pred = None self.dist = -1

1 2 3 4 5 6 7 8 9

10 11 12

class edgeObj(): def __init__(self, e): self.v2 = e self.parallel = None

# Added for BFS self.label = 0

1 2 3 4 5 6 7 8 9

10 11 12

BFS AnalysisQ: What do we do if we have a disjoint graph?

Q: Does our implementation detect a cycle?

Q: What is the running time?

Running time of BFS

G H F E D B C A

v d P Adjacent Edges

A 0 - C B D

B 1 A A C E

C 1 A B A D E F

D 1 A A C F H

E 2 C B C G

F 2 C C D G

G 3 E E F H

H 2 D D G

A

C D

E

B

G H

F

BFS ObservationsQ: What is a shortest path from A to H?

Q: What is a shortest path from E to H?

Q: How does a cross edge relate to d?

Q: What structure is made from discovery edges?

v d P Adjacent Edges

A 0 - C B D

B 1 A A C E

C 1 A B A D E F

D 1 A A C F H

E 2 C B C G

F 2 C C D G

G 3 E E F H

H 2 D D G

A

C D

E

B

G H

F

BFS ObservationsObs. 1: BFS can be used to count components.

Obs. 2: BFS can be used to detect cycles.

Obs. 3: In BFS, d provides the shortest distance to every vertex.

Obs. 4: In BFS, the endpoints of a cross edge never differ in distance, d, by more than 1: |d(u) − d(v) | ≤ 1

Traversal: DFS

A

C

D

E

BF

G

H

JI

Traversal: DFS

Discovery Edge

Back Edge

A

C

D

E

BF

G

H

JI

def DFS_recur(self, s): self.V[s].visited = 1

el = self.getEdges(s) for e in el: l = e.label v = self.V[l] if v.visited == 0: self.DFS_recur(l)

1 2 3 4 5 6 7 8 9

10 11 12 13 14 15 16 17 18 19 20 21 22

Traversal: DFS

DFS ObservationsObs. 1: DFS can be used to count components.

Obs. 2: DFS can be used to detect cycles.

Obs. 3: In DFS, distance provides no clear meaning

DFS vs BFSDFS: BFS:

Pros: Pros:

Cons: Cons:

“The Muddy City” by CS Unplugged, Creative Commons BY-NC-SA 4.0