CSC5101 Advanced Algorithms Analysis Lecture 4 (Graph Problems & Elementary Graph algorithms) Slides...

CSC5101 Advanced Algorithms Analysis

Lecture 4 (Graph Problems & Elementary Graph Algorithms)

Slides are prepared by: Syed Imtiaz Ali

Basic Definitions and Applications

• A graph G is simply a way of encoding pairwise relationships among a set of objects:

• It consists of a collection V of nodes and a collection E of edges, each of which "joins" two of the nodes.

• A node in a graph is also frequently called a vertex

• Edges indicate a symmetric relationship between their ends.

• To model asymmetric relationships, we use the closely related notion of a directed graph.

• In a directed graph, each edge e' is an ordered pair (u, v): the roles of u and v are not interchangeable, and we call u the tail of the edge and v the head.

• We say that edge e' leaves node u and enters node v.
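As a minimal sketch of these definitions, an undirected edge can be stored as an unordered pair and a directed edge as an ordered (tail, head) pair. The names `V`, `undirected_E`, `directed_E`, and the two helper functions are illustrative assumptions, not part of the slides.

```python
# Nodes of a small example graph (labels are arbitrary).
V = {"u", "v", "w"}

# Undirected edges: unordered pairs, so {u, v} is the same edge as {v, u}.
undirected_E = {frozenset({"u", "v"}), frozenset({"v", "w"})}

# Directed edges: ordered pairs (tail, head).
directed_E = {("u", "v"), ("v", "w")}


def joins(edges, a, b):
    """Return True if an undirected edge joins a and b."""
    return frozenset({a, b}) in edges


def leaves_and_enters(edges, tail, head):
    """Return True if a directed edge leaves `tail` and enters `head`."""
    return (tail, head) in edges
```

With this encoding, `joins` gives the same answer for either argument order, while `leaves_and_enters` does not.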

Graph shape

Examples of Graphs - Transportation networks.

• The map of routes served by an airline carrier naturally forms a graph:

• The nodes are airports, and there is an edge from u to v if there is a nonstop flight that departs from u and arrives at v.

• In practice, when there is an edge (u, v), there is almost always an edge (v, u), so we can treat the airline route map as an undirected graph with edges joining pairs of airports that have nonstop flights each way.

• Looking at such a graph, we’d quickly notice a few things:

• There are often a small number of hubs with a very large number of incident edges

• It’s possible to get between any two nodes in the graph via a very small number of intermediate stops.

• Other transportation networks can be modeled in a similar way. For example:

• We could take a rail network and have a node for each terminal, and an edge joining u and v if there’s a section of railway track that goes between them without stopping at any intermediate terminal.

Examples of Graphs - Communication networks.

• A collection of connected computers can be modeled as a graph.

• We could have a node for each computer and an edge joining u and v if there is a direct physical link connecting them.

• On the Internet, a node can be the set of all machines controlled by a single Internet service provider, with an edge joining u and v if there is a direct peering relationship between them.

• In wireless networks, nodes are computing devices situated at locations in physical space, and there is an edge from u to v if v is close enough to u to receive a signal from it.

• It's useful to view such a graph as directed, since it may be the case that v can hear u's signal but u cannot hear v's signal (if, for example, u has a stronger transmitter).

Examples of Graphs - Information networks.

• The World Wide Web can be naturally viewed as a directed graph, in which nodes correspond to Web pages and there is an edge from u to v if u has a hyperlink to v.

• The directedness of the graph is crucial here; many pages, for example, link to popular news sites, but these sites clearly do not reciprocate all these links.

• The structure of all these hyperlinks can be used by algorithms to try to infer the most important pages on the Web, a technique employed by most current search engines.

• The hypertextual structure of the Web is anticipated by a number of information networks that predate the Internet by many decades.

• These include the network of cross-references among articles in an encyclopedia or other reference work, and the network of bibliographic citations among scientific papers.

Examples of Graphs - Social networks.

• Given any collection of people who interact:

• The employees of a company

• The students in a high school

• The residents of a small town

• We can define a network whose nodes are people, with an edge joining u and v if they are friends with one another.

• We could have the edges mean a number of different things instead of friendship:

• The undirected edge (u, v) could mean that u and v have had a romantic relationship or a financial relationship

• The directed edge (u, v) could mean that u seeks advice from v, or that u lists v in his or her e-mail address book.

• Social networks are used extensively by sociologists to study the dynamics of interaction among people.

• They can be used to identify the most "influential" people in a company or organization, or to model trust relationships in a financial or political setting.

Paths, Simple Paths, and Cycles

• We define a path in an undirected graph G = (V, E) to be a sequence P of nodes v_1, v_2, ..., v_{k-1}, v_k with the property that each consecutive pair v_i, v_{i+1} is joined by an edge in G. P is often called a path from v_1 to v_k.

• For example, the nodes 4, 2, 1, 7, 8 form a path in the figure.

• A path is simple if all its vertices are distinct from one another.

• A cycle is a path v_1, v_2, ..., v_{k-1}, v_k in which k > 2, the first k - 1 nodes are all distinct, and v_1 = v_k. In other words, the sequence of nodes "cycles back" to where it began.

• All of these definitions carry over naturally to directed graphs, with the following change:

• Each pair of consecutive nodes has the property that (v_i, v_{i+1}) is an edge.

• In other words, the sequence of nodes in the path or cycle must respect the directionality of edges.
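The three definitions above can be checked mechanically. The sketch below assumes the graph is given as a dictionary `adj` mapping each node to the set of nodes it has an edge to; the function names and the small example graph are hypothetical.

```python
def is_path(adj, nodes):
    """True if each consecutive pair in `nodes` is joined by an edge.

    For a directed graph, adj[a] holds only the heads of edges leaving a,
    so this check automatically respects edge direction.
    """
    return len(nodes) >= 1 and all(
        b in adj[a] for a, b in zip(nodes, nodes[1:])
    )


def is_simple(adj, nodes):
    """True if `nodes` is a path whose vertices are all distinct."""
    return is_path(adj, nodes) and len(set(nodes)) == len(nodes)


def is_cycle(adj, nodes):
    """True if `nodes` is a path with k > 2, the first k - 1 nodes
    distinct, and the last node equal to the first."""
    k = len(nodes)
    return (
        k > 2
        and is_path(adj, nodes)
        and len(set(nodes[:-1])) == k - 1
        and nodes[0] == nodes[-1]
    )


# Hypothetical undirected graph: triangle 1-2-3 plus a pendant node 4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
```

Here `[4, 3, 1, 2]` is a simple path and `[1, 2, 3, 1]` is a cycle, while `[4, 1]` is not a path because no edge joins 4 and 1.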

Graph Connectivity and distance

• An undirected graph is connected if, for every pair of nodes u and v, there is a path from u to v.

• A directed graph is strongly connected if, for every two nodes u and v, there is a path from u to v and a path from v to u.

• The distance between two nodes u and v is the minimum number of edges in a u-v path.

• We use the symbol infinity (∞) to denote the distance between nodes that are not connected by a path.

• The term distance here comes from imagining G as representing a communication or transportation network; if we want to get from u to v, we may well want a route with as few "hops" as possible.
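A minimal sketch of these definitions, assuming the graph is a dictionary `adj` from each node to a list of its neighbors. The hop count is computed by exploring outward one step at a time, and `math.inf` stands in for ∞; the function names are illustrative.

```python
import math
from collections import deque


def distance(adj, u, v):
    """Minimum number of edges on a u-v path, or math.inf if none exists."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x]
        for y in adj[x]:
            if y not in dist:          # first time y is reached
                dist[y] = dist[x] + 1
                queue.append(y)
    return math.inf


def is_connected(adj):
    """True if every node of an undirected graph is reachable from the first."""
    nodes = list(adj)
    return all(distance(adj, nodes[0], w) < math.inf for w in nodes)
```

For an undirected graph, reaching every node from a single start node is enough to conclude that every pair of nodes is connected.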

Trees

• An undirected graph is a tree if it is connected and does not contain a cycle.

• Trees are the simplest kind of connected graph: deleting any edge from a tree will disconnect it.

• To think about the structure of a tree T, it is useful to root it at a particular node r.

• More precisely, we "orient" each edge of T away from r; for each other node v, we declare the parent of v to be the node u that directly precedes v on its path from r; we declare w to be a child of v if v is the parent of w.

• More generally, we say that w is a descendant of v (or v is an ancestor of w) if v lies on the path from the root to w; and we say that a node x is a leaf if it has no descendants.
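Rooting can be sketched as a single traversal from r that records, for each node, the node that precedes it on its path from r. The tree below and the names `root_tree` and `leaves` are hypothetical, chosen only for illustration.

```python
def root_tree(adj, r):
    """Orient a tree away from root r; return (parent, children) maps."""
    parent = {r: None}
    children = {v: [] for v in adj}
    stack = [r]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in parent:        # w not reached yet, so u precedes it
                parent[w] = u
                children[u].append(w)
                stack.append(w)
    return parent, children


def leaves(children):
    """Nodes with no children, listed in sorted order."""
    return sorted(v for v, kids in children.items() if not kids)


# Hypothetical tree on nodes 1..5 with edges 1-2, 1-3, 3-4, 3-5.
tree = {1: [2, 3], 2: [1], 3: [1, 4, 5], 4: [3], 5: [3]}
parent, children = root_tree(tree, 1)
```

Rooting the same tree at a different node changes the parent relation: with root 3, node 1 becomes a child of 3 and node 2 a child of 1.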

Representing Graphs

• Standard ways to represent a graph G = (V, E):

• Adjacency lists

• Adjacency matrix.

• Either way applies to both directed and undirected graphs.

• Adjacency-list representation provides a compact way to represent sparse graphs.

• The adjacency-matrix representation is preferred when:

• The graph is dense

• We need to tell quickly whether there is an edge connecting two given vertices.

Representing Graphs – An Example: Undirected graph

Representing Graphs – An Example: Directed graph

Adjacency-list representation

• The adjacency-list representation of a graph G = (V, E) consists of:

• An array Adj that contains one entry for each vertex.

• For each vertex u, an adjacency list Adj[u] (a linked list) with one entry for each edge leaving u.

• For both directed and undirected graphs, the adjacency-list representation has the desirable property that the amount of memory it requires is Θ(V + E).

• It can readily be used to represent weighted graphs, that is, graphs for which each edge has an associated weight.

• It is quite robust in that we can modify it to support many other graph variants.

• A potential disadvantage of the adjacency-list representation is that it provides no quicker way to determine whether a given edge (u, v) is present in the graph than to search for v in the adjacency list Adj[u].
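The following sketch builds an adjacency-list representation using Python lists in place of linked lists. It assumes vertices are numbered 0 to n-1; the function names and the example edge list are illustrative.

```python
def build_adjacency_list(num_vertices, edges, directed=False):
    """Return Adj as a list of lists; vertices are numbered 0..n-1."""
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)   # an undirected edge appears in both lists
    return adj


def has_edge(adj, u, v):
    """Edge test by scanning Adj[u]; time proportional to len(adj[u])."""
    return v in adj[u]


edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
undirected = build_adjacency_list(4, edges)
directed = build_adjacency_list(4, edges, directed=True)
```

The total length of all lists is |E| for a directed graph and 2|E| for an undirected one, which is where the Θ(V + E) memory bound comes from. A weighted variant could store (v, weight) pairs in each list instead of bare vertex numbers.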

Adjacency-matrix representation

• The adjacency-matrix representation of a graph G = (V, E), with vertices numbered 1, 2, ..., |V|, consists of a |V| x |V| matrix A = (a_ij) such that a_ij = 1 if (i, j) ∈ E, and a_ij = 0 otherwise.

• It requires Θ(V^2) memory, independent of the number of edges in the graph.

• Since in an undirected graph, (u, v) and (v, u) represent the same edge, the adjacency matrix A of an undirected graph is its own transpose: A = A^T.

• In some applications, it pays to store only the entries on and above the diagonal of the adjacency matrix, thereby cutting the memory needed to store the graph almost in half

• An adjacency matrix can also represent a weighted graph
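A short sketch of the matrix representation, assuming vertices are numbered from 0 (rather than 1) to suit Python indexing. The helper names are illustrative.

```python
def build_adjacency_matrix(num_vertices, edges, directed=False):
    """Return a |V| x |V| matrix A with A[i][j] = 1 iff (i, j) is an edge."""
    n = num_vertices
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1        # (i, j) and (j, i) are the same edge
    return A


def transpose(A):
    """Return the transpose of a square matrix given as a list of lists."""
    return [list(row) for row in zip(*A)]


def upper_triangle(A):
    """Keep only the entries on and above the diagonal, row by row."""
    return [row[i:] for i, row in enumerate(A)]


edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
A = build_adjacency_matrix(4, edges)
```

For the undirected graph, `A` equals its transpose, and `upper_triangle(A)` keeps n(n+1)/2 of the n^2 entries. Testing for an edge is a single lookup `A[i][j]`, and a weighted graph could store the weight in place of 1.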

Breadth-First Search (BFS) Algorithm

• If there is a path between node s and node t in a graph G, we say s-t connectivity exists.

• BFS is perhaps the simplest algorithm for determining s-t connectivity:

• In BFS, we explore outward from s in all possible directions, adding nodes one "layer" at a time.

• Thus we start with s and include all nodes that are joined by an edge to s: this is the first layer of the search.

• We then include all additional nodes that are joined by an edge to any node in the first layer: this is the second layer.

• We continue in this way until no new nodes are encountered.


• In the example of starting with node 1 as s,

• First layer of the search would consist of nodes 2 and 3

• Second layer would consist of nodes 4, 5, 7, and 8

• Third layer would consist just of node 6.

• The search would stop, since there are no further nodes that could be added

• Note that nodes 9 through 13 are never reached by the search
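The layer-by-layer search can be sketched as follows. The figure is not reproduced in this text, so the edge set below is an assumption, chosen only so that it produces the layers listed above; `bfs_layers` is an illustrative name.

```python
def bfs_layers(adj, s):
    """Return the list of BFS layers reachable from s (layer 0 is [s])."""
    seen = {s}
    layers = [[s]]
    while layers[-1]:
        next_layer = []
        for u in layers[-1]:
            for v in adj[u]:
                if v not in seen:      # v joins the next layer only once
                    seen.add(v)
                    next_layer.append(v)
        layers.append(next_layer)
    return layers[:-1]                 # drop the final empty layer


# Assumed edge set on nodes 1..13, consistent with the layers in the text.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (3, 7), (3, 8),
         (4, 5), (5, 6), (7, 8), (9, 10), (11, 12), (12, 13)]
adj = {v: [] for v in range(1, 14)}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

layers = bfs_layers(adj, 1)
```

On this assumed graph, `layers` is `[[1], [2, 3], [4, 5, 7, 8], [6]]`, and nodes 9 through 13 appear in no layer.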


• To keep track of search progress, the algorithm colors each vertex white, gray, or black.

• All vertices start out white and may later become gray and then black.

• Gray and black vertices have been discovered.

• All vertices adjacent to black vertices have been discovered.

• Gray vertices may have some adjacent white vertices.

• Gray vertices represent the frontier between discovered and undiscovered vertices.


• Breadth-first search constructs a breadth-first tree, initially containing only its root, which is the source vertex s.

• Whenever the search discovers a white vertex v, in the course of scanning the adjacency list of an already discovered vertex u, the vertex v and the edge (u, v) are added to the tree.

• We say that u is the predecessor or parent of v in the breadth-first tree.

• Since a vertex is discovered at most once, it has at most one parent.

• Ancestor and descendant relationships in the breadth-first tree are defined relative to the root s as usual: if u is on the simple path in the tree from the root s to vertex v, then u is an ancestor of v and v is a descendant of u.

BFS Algorithm

• The procedure BFS works as follows.

• With the exception of the source vertex s, lines 1-4:

• paint every vertex white

• set u.d to be infinity for each vertex u

• set the parent of every vertex to be NIL.

• Line 5 paints s gray, since we consider it to be discovered as the procedure begins.

• Line 6 initializes s.d to 0.

• Line 7 sets the predecessor of the source to be NIL.

• Lines 8–9 initialize Q to the queue containing just the vertex s.

BFS Algorithm

• The while loop of lines 10–18 iterates as long as there remain gray vertices, which are discovered vertices that have not yet had their adjacency lists fully examined.

• This while loop maintains the following invariant:

• At the test in line 10, the queue Q consists of the set of gray vertices.

• Prior to the first iteration, the only gray vertex, and the only vertex in Q, is the source vertex s.

• Line 11 determines the gray vertex u at the head of the queue Q and removes it from Q.

BFS Algorithm

• The for loop of lines 12–17 considers each vertex v in the adjacency list of u.

• If v is white, then it has not yet been discovered, and the procedure discovers it by executing lines 14–17.

• The procedure paints vertex v gray, sets its distance v.d to u.d + 1, records u as its parent v.π, and places it at the tail of the queue Q.

• Once the procedure has examined all the vertices on u’s adjacency list, it blackens u in line 18.

BFS Algorithm

• The loop invariant is maintained because whenever a vertex is:

• painted gray (in line 14) it is also enqueued (in line 17)

• dequeued (in line 11) it is also painted black (in line 18).

• The results of breadth-first search may depend upon the order in which the neighbors of a given vertex are visited in line 12: the breadth-first tree may vary, but the distances d computed by the algorithm will not.

BFS Algorithm
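The pseudocode that the line numbers above refer to is not reproduced in this text. The following Python sketch follows the description given above, with comments mapping each statement to the line numbers mentioned. The dictionaries `color`, `d`, and `pi` are an assumed stand-in for the vertex attributes, and `adj` is assumed to map each vertex to its adjacency list.

```python
import math
from collections import deque

WHITE, GRAY, BLACK = "white", "gray", "black"


def bfs(adj, s):
    """Breadth-first search from s; returns (color, d, pi) dictionaries."""
    color = {u: WHITE for u in adj}        # lines 1-4: initialize vertices
    d = {u: math.inf for u in adj}
    pi = {u: None for u in adj}
    color[s] = GRAY                        # line 5: s is discovered
    d[s] = 0                               # line 6
    pi[s] = None                           # line 7
    Q = deque([s])                         # lines 8-9: Q holds just s
    while Q:                               # line 10: gray vertices remain
        u = Q.popleft()                    # line 11: dequeue head of Q
        for v in adj[u]:                   # line 12: scan Adj[u]
            if color[v] == WHITE:          # line 13: v not yet discovered
                color[v] = GRAY            # line 14
                d[v] = d[u] + 1            # line 15
                pi[v] = u                  # line 16
                Q.append(v)                # line 17: enqueue at the tail
        color[u] = BLACK                   # line 18: u fully examined
    return color, d, pi
```

Vertices that are unreachable from s stay white with distance `math.inf`. Changing the order of entries in an adjacency list can change `pi` but not `d`.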

BFS Algorithm Analysis

• We now analyze the running time of BFS on an input graph G = (V, E).

• After initialization, breadth-first search never whitens a vertex, and thus the test in line 13 ensures that each vertex is enqueued at most once, and hence dequeued at most once.

• The operations of enqueuing and dequeuing take O(1) time, and so the total time devoted to queue operations is O(V).

• Because the procedure scans the adjacency list of each vertex only when the vertex is dequeued, it scans each adjacency list at most once.

• Since the sum of the lengths of all the adjacency lists is Θ(E), the total time spent in scanning adjacency lists is O(E).

• The overhead for initialization is O(V), and thus the total running time of the BFS procedure is O(V + E).

• Breadth-first search runs in time linear in the size of the adjacency-list representation of G.
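The two counting arguments above can be observed directly by instrumenting a search. This is an illustrative sketch rather than part of the analysis; the function name and counters are assumptions, and a set of discovered vertices replaces the colors.

```python
from collections import deque


def bfs_operation_counts(adj, s):
    """Run BFS from s; count enqueues and adjacency-list entries scanned."""
    discovered = {s}
    queue = deque([s])
    enqueues, scanned = 1, 0
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            scanned += 1               # one adjacency-list entry examined
            if v not in discovered:
                discovered.add(v)
                queue.append(v)
                enqueues += 1          # each vertex is enqueued at most once
    return enqueues, scanned
```

The first count never exceeds the number of vertices, and the second never exceeds the total length of all adjacency lists, matching the O(V) and O(E) terms.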

Depth-First Search (DFS) Algorithm

• Another method to find the nodes reachable from s is to start from s and try the first edge leading out of it, to a node u.

• Follow the first edge leading out of u, and continue in this way until you reach a "dead end": a node for which you have already explored all its neighbors.

• You’d then backtrack until you got to a node with an unexplored neighbor, and resume from there.

• We call this algorithm depth-first search (DFS), since it explores G by going as deeply as possible and only retreating when necessary.

• It is most easily described in recursive form:

• We can invoke DFS from any starting point but maintain global knowledge of which nodes have already been explored.
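A minimal recursive sketch of this idea, assuming `adj` maps each node to a list of its neighbors. The shared `explored` dictionary plays the role of the global knowledge of explored nodes; the function name and signature are illustrative.

```python
def dfs(adj, u, explored=None):
    """Recursive DFS; returns the nodes reachable from u in visit order."""
    if explored is None:
        explored = {}                  # shared by all recursive calls
    explored[u] = True                 # mark u as explored
    for v in adj[u]:
        if v not in explored:          # follow the first unexplored edge
            dfs(adj, v, explored)
    return list(explored)              # dicts preserve insertion order
```

Because the function recurses once per node along a path, very deep graphs can exceed Python's default recursion limit; an explicit stack avoids that.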

DFS Algorithm

• Procedure DFS works as follows:

• Lines 1–3 paint all vertices white and initialize their π attributes to NIL.

• Line 4 resets the global time counter.

• Lines 5–7 check each vertex in V in turn and, when a white vertex is found, visit it using DFS-VISIT.

• Every time DFS-VISIT (G,u) is called in line 7, vertex u becomes the root of a new tree in the depth-first forest.

• When DFS returns, every vertex u has been assigned a discovery time u.d and a finishing time u.f

DFS Algorithm

• In each call DFS-VISIT(G, u), vertex u is initially white.

• Line 1 increments the global variable time

• Line 2 records the new value of time as the discovery time u.d

• Line 3 paints u gray.

• Lines 4–7 examine each vertex v adjacent to u and recursively visit v if it is white.

• As each vertex v ∈ Adj[u] is considered in line 4, we say that edge (u, v) is explored by the depth-first search.

• Finally, after every edge leaving u has been explored, lines 8–10 paint u black, increment time, and record the finishing time in u.f.

DFS Algorithm
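The pseudocode for DFS and DFS-VISIT is not reproduced in this text. The sketch below follows the description above, with comments giving the line numbers it mentions. Storing the attributes in dictionaries `d`, `f`, and `pi`, and using a nested function for DFS-VISIT, are assumptions made for this example.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"


def dfs(adj):
    """Depth-first search of the whole graph; returns (d, f, pi)."""
    color = {u: WHITE for u in adj}       # DFS lines 1-3
    pi = {u: None for u in adj}
    d, f = {}, {}
    time = 0                              # DFS line 4

    def dfs_visit(u):
        nonlocal time
        time += 1                         # DFS-VISIT line 1
        d[u] = time                       # line 2: discovery time
        color[u] = GRAY                   # line 3
        for v in adj[u]:                  # lines 4-7: explore edge (u, v)
            if color[v] == WHITE:
                pi[v] = u
                dfs_visit(v)
        color[u] = BLACK                  # lines 8-10
        time += 1
        f[u] = time                       # finishing time

    for u in adj:                         # DFS lines 5-7
        if color[u] == WHITE:
            dfs_visit(u)                  # u roots a new depth-first tree
    return d, f, pi
```

Every vertex receives both a discovery time and a finishing time, and the vertices whose `pi` entry stays `None` are the roots of the trees in the depth-first forest.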

DFS Algorithm Analysis

• What is the running time of DFS?

• The loops on lines 1–3 and lines 5–7 of DFS take time Θ(V), exclusive of the time to execute the calls to DFS-VISIT.

• The procedure DFS-VISIT is called exactly once for each vertex, since the vertex u on which DFS-VISIT is invoked must be white and the first thing DFS-VISIT does is paint vertex u gray.

• During an execution of DFS-VISIT(G, u), the loop on lines 4–7 executes |Adj[u]| times.

• Since the sum of |Adj[u]| over all vertices u is Θ(E), the total cost of executing lines 4–7 of DFS-VISIT is Θ(E).

• The running time of DFS is therefore Θ(V + E).

Similarities & Differences between DFS & BFS

• The Similarities:

• Both build the connected component containing s

• Both achieve qualitatively similar levels of efficiency

• The Differences:

• The order of traversal is totally different

• Although both yield a natural rooted tree T on the component containing s, the two trees will generally have very different structures.
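The difference in tree structure shows up even on a very small graph. The sketch below builds both trees as parent maps on a hypothetical four-node cycle; all names are illustrative.

```python
from collections import deque


def bfs_tree(adj, s):
    """Parent map of the BFS tree rooted at s."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def dfs_tree(adj, s):
    """Parent map of the DFS tree rooted at s."""
    parent = {s: None}

    def visit(u):
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                visit(v)

    visit(s)
    return parent


def depth(parent, v):
    """Number of tree edges from the root to v."""
    steps = 0
    while parent[v] is not None:
        v = parent[v]
        steps += 1
    return steps


# Hypothetical cycle on four nodes: 1-2, 2-3, 3-4, 4-1.
cycle = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
b, t = bfs_tree(cycle, 1), dfs_tree(cycle, 1)
```

Both trees span the same four nodes, but the BFS tree is short and bushy (maximum depth 2) while the DFS tree is a single long path (maximum depth 3).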
