CSCE350 Algorithms and Data Structure
Lecture 19
Jianjun Hu
Department of Computer Science and Engineering
University of South Carolina
2009.11.
HW5 Summary
Average: 39.97
Max: 49
Common problems:
Pseudocode of Floyd's Algorithm
The next matrix in the sequence can be written over its predecessor.

ALGORITHM Floyd(W[1..n, 1..n])
// Input: the weight matrix W of a graph
// Output: the distance matrix D of the lengths of the shortest paths
D ← W
for k ← 1 to n do
    for i ← 1 to n do
        for j ← 1 to n do
            D[i, j] ← min{D[i, j], D[i, k] + D[k, j]}
return D
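The pseudocode can be sketched in Python as follows; the 4-vertex digraph at the bottom is my own illustrative example, not one from the slides.

```python
INF = float('inf')  # stands for "no edge"

def floyd(W):
    """Return the matrix of all-pairs shortest-path lengths for weight matrix W.

    W[i][j] is the weight of edge i -> j (INF if absent, 0 on the diagonal).
    As the slide notes, each D^(k) can overwrite its predecessor, so a
    single matrix D suffices.
    """
    n = len(W)
    D = [row[:] for row in W]          # D starts as a copy of W
    for k in range(n):                 # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

# Illustrative 4-vertex digraph (hypothetical example)
W = [[0,   INF, 3,   INF],
     [2,   0,   INF, INF],
     [INF, 7,   0,   1],
     [6,   INF, INF, 0]]
D = floyd(W)
```

The three nested loops make the O(n³) = O(|V|³) complexity mentioned below explicit.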
Dijkstra’s Algorithm
Uses a greedy strategy to find the single-source shortest paths.
Floyd's algorithm finds the all-pairs shortest paths, which may not be necessary in many applications.
Single-source shortest-path problem: find a family of shortest paths from a given source vertex to every other vertex.
Certainly, the all-pairs shortest paths contain the single-source shortest paths, but Floyd's algorithm has O(|V|³) complexity!
Many algorithms can solve this problem; here we introduce Dijkstra's algorithm.
Note: Dijkstra's algorithm only works when all the edge weights are nonnegative.
Dijkstra’s Algorithm on Undirected Graph
Similar to Prim's MST algorithm, with the following differences:
• Start with a tree consisting of one vertex
• "Grow" the tree one vertex/edge at a time to produce a spanning tree
  – Construct a series of expanding subtrees T1, T2, …
• Keep track of the shortest path from the source to each of the vertices in Ti
• At each stage, construct Ti+1 from Ti: add the minimum-weight edge connecting a vertex in the tree (Ti) to one not yet in the tree
  – choose from "fringe" edges (this is the "greedy" step!): the edge (v, w) with the lowest d(s, v) + d(v, w)
• The algorithm stops when all vertices are included
Example: Find the shortest paths starting from vertex a.
[Figure: undirected graph with vertices a, b, c, d, e and weighted edges a–b = 3, b–c = 4, b–d = 2, c–d = 5, c–e = 6, a–d = 7, d–e = 4.]
Step 1:
Tree vertices: a(-,0)
Remaining vertices: b(a,3), c(-,∞), d(a,7), e (-,∞)
Step 2:
Tree vertices: a(-,0), b(a,3),
Remaining vertices: c(-,∞) → c(b,3+4), d(a,7) → d(b,3+2), e(-,∞)
Step 3:
Tree vertices: a(-,0), b(a,3), d(b,5)
Remaining vertices: c(b,3+4), e(-,∞) → e(d,5+4)
Step 4:
Tree vertices: a(-,0), b(a,3), d(b,5), c(b,7)
Remaining vertices: e(d,9)
Step 5:
Tree vertices: a(-,0), b(a,3), d(b,5), c(b,7), e(d,9)
Remaining vertices: none, so the algorithm is done!
Output the Single-Source Shortest Paths
Tree vertices: a(-,0), b(a,3), d(b,5), c(b,7), e(d,9)
from a to b: a-b of length 3
from a to d: a-b-d of length 5
from a to c: a-b-c of length 7
from a to e: a-b-d-e of length 9
Proof of Correctness
By induction:
Assume the first i steps provide shortest paths to the first i vertices.
Prove that step i+1 generates the shortest path from the source to the (i+1)-th vertex.
ALGORITHM Dijkstra(G, s)
// Dijkstra's algorithm for single-source shortest paths
// Input: a weighted connected graph G = (V, E) with nonnegative weights and its vertex s
// Output: the length d_v of a shortest path from s to v and its penultimate vertex p_v, for every vertex v in V
Initialize(Q)                       // initialize vertex priority queue to empty
for every vertex v in V do
    d_v ← ∞; p_v ← null
    Insert(Q, v, d_v)               // initialize vertex priority in the priority queue
d_s ← 0; Decrease(Q, s, d_s)        // update priority of s with d_s
V_T ← ∅
for i ← 0 to |V| − 1 do
    u* ← DeleteMin(Q)               // delete the minimum priority element
    V_T ← V_T ∪ {u*}
    for every vertex u in V − V_T that is adjacent to u* do
        if d_{u*} + w(u*, u) < d_u
            d_u ← d_{u*} + w(u*, u); p_u ← u*
            Decrease(Q, u, d_u)
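The pseudocode above can be sketched in Python with the standard-library heapq module. Since heapq has no Decrease operation, this sketch uses the common substitute of pushing duplicate entries and skipping stale ones on pop; the graph is the worked example from the slides.

```python
import heapq

def dijkstra(graph, source):
    """graph: {vertex: [(neighbor, weight), ...]} with nonnegative weights.
    Returns (dist, prev): shortest distances from source and predecessors."""
    dist = {v: float('inf') for v in graph}
    prev = {v: None for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    done = set()                            # the tree vertices V_T
    while heap:
        d, u = heapq.heappop(heap)          # DeleteMin
        if u in done:
            continue                        # stale duplicate entry; skip it
        done.add(u)
        for v, w in graph[u]:
            if d + w < dist[v]:             # edge relaxation
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))  # stands in for Decrease
    return dist, prev

# Undirected example graph from the slides:
# a-b 3, a-d 7, b-c 4, b-d 2, c-d 5, c-e 6, d-e 4
edges = [('a', 'b', 3), ('a', 'd', 7), ('b', 'c', 4), ('b', 'd', 2),
         ('c', 'd', 5), ('c', 'e', 6), ('d', 'e', 4)]
graph = {v: [] for v in 'abcde'}
for u, v, w in edges:
    graph[u].append((v, w))
    graph[v].append((u, w))

dist, prev = dijkstra(graph, 'a')
```

Following prev back from any vertex reproduces the paths listed in the example (e.g. e ← d ← b ← a, length 9).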
Notes on Dijkstra’s algorithm
Doesn’t work with negative weights
Applicable to both undirected and directed graphs
Efficiency:
• O(|E| log |V|) when the graph is represented by adjacency linked lists and the priority queue is implemented by a min-heap.
• Reason: the whole process is almost the same as Prim's algorithm. Following the same analysis as for Prim's algorithm, we can see that it has the same complexity.
The efficiency can be further improved if a more sophisticated data structure called a Fibonacci heap is used.
Huffman Trees – Coding Problem
Text consists of characters from some n-character alphabet
In communication, we need to code each character by a sequence of bits – codeword
Fixed-length encoding: the code for each character has m bits, and we know that m ≥ log2(n) (why? m bits yield only 2^m distinct codewords, so we need 2^m ≥ n)
Standard seven-bit ASCII code does this
Using shorter bit strings to code a text: variable-length encoding uses shorter codewords for more frequent characters and longer codewords for less frequent ones.
• For example, in Morse telegraph code: e(.), a(.-), q(--.-), z(--..)
• How many bits represent the first character? We want no confusion here.
Huffman Tree – From coding to a binary tree
To avoid confusion, here we consider prefix codes, where no codeword is a prefix of a codeword of another character
This way, we can scan the bit string constructed from a text from the beginning until get the first group of bits that is a valid codeword (for some character)
It is natural to associate the alphabet’s characters with leaves of a binary tree in which all the left edges are labeled by 0 and all the right edges are labeled by 1 (or vice versa)
The codeword of a character (leaf) is the list of labels along the path from the root of the binary tree to this leaf.
A Huffman tree reduces the total bit-string length by assigning shorter codewords to frequent characters and longer codewords to less frequent ones.
Huffman Coding Algorithm
Step 1: Initialize n one node trees and label them with the characters of the alphabet. Record the frequency of each character in its tree’s root to indicate the tree’s weight
Step 2: Repeat the following operation until a single tree is obtained. Find two trees with the smallest weights. Make them the left and right subtrees of a new tree and record the sum of their weights in the root of the new tree as its weight
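The two steps above can be sketched in Python, using heapq for the "find two trees with the smallest weights" operation. The tie-breaking counter is my own addition so that equal-weight trees are never compared directly; with this insertion-order tie-breaking the sketch happens to reproduce the slide's codewords for the example alphabet.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """freqs: {symbol: weight}. Returns {symbol: codeword}, 0 = left, 1 = right."""
    tie = count()                          # tiebreaker: tuples never compare trees
    heap = [(w, next(tie), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)                    # Step 1: n one-node trees keyed by weight
    while len(heap) > 1:                   # Step 2: merge the two lightest trees
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):        # internal node: (left subtree, right subtree)
            walk(tree[0], prefix + '0')
            walk(tree[1], prefix + '1')
        else:                              # leaf: a symbol of the alphabet
            codes[tree] = prefix or '0'    # single-symbol alphabet edge case
    walk(heap[0][2], '')
    return codes

freqs = {'A': 0.35, 'B': 0.1, 'C': 0.2, 'D': 0.2, '_': 0.15}
codes = huffman_codes(freqs)
```

Because codewords are read off root-to-leaf paths, the resulting code is automatically prefix-free.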
Example: alphabet A, B, C, D, _ with frequency
character A B C D _
probability 0.35 0.1 0.2 0.2 0.15
[Figure: Huffman tree construction for the example, step by step.
Merge B(0.1) and _(0.15) into a tree of weight 0.25;
merge C(0.2) and D(0.2) into a tree of weight 0.4;
merge the 0.25 tree and A(0.35) into a tree of weight 0.6;
merge the 0.4 and 0.6 trees into the final tree of weight 1.0.
Left edges are labeled 0 and right edges 1.]
The Huffman code is:

character   A     B     C     D     _
probability 0.35  0.1   0.2   0.2   0.15
codeword    11    100   00    01    101

Therefore,
'DAD' is encoded as 011101, and
10011011011101 is decoded as BAD_AD
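These claims can be checked with a short sketch built on the codeword table above; the greedy left-to-right decode is unambiguous precisely because the code is prefix-free.

```python
# Codeword table from the slides
CODES = {'A': '11', 'B': '100', 'C': '00', 'D': '01', '_': '101'}

def encode(text):
    """Concatenate the codeword of each character."""
    return ''.join(CODES[ch] for ch in text)

def decode(bits):
    """Scan left to right; emit a character at the first codeword match.
    Correct because no codeword is a prefix of another."""
    inverse = {cw: ch for ch, cw in CODES.items()}
    out, buf = [], ''
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ''
    return ''.join(out)
```

For instance, encode('DAD') concatenates 01, 11, 01.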
Notes on Huffman Tree (Coding)
The expected number of bits per character is then
2×0.35 + 3×0.1 + 2×0.2 + 2×0.2 + 3×0.15 = 2.25
Fixed-length encoding needs at least 3 bits for each character (⌈log2 5⌉ = 3)
This is an important technique for file (data) compression
Huffman tree/coding has more general applications:
• Assign n positive numbers w1, w2, …, wn to the n leaves of a binary tree
• We want to minimize the weighted path length Σ_{i=1..n} l_i·w_i, where l_i is the depth of leaf i
• This has particular applications in making decisions: decision trees
Examples of the Decision Tree
[Figure: two yes/no decision trees for identifying n ∈ {1, 2, 3, 4}.
A balanced tree (root question "n > 2?", then "n > 1?" or "n > 3?") asks exactly two questions in every case.
A skewed tree asks nested questions, resolving some outcomes in one question and others in three.
With p1 = p2 = p3 = p4 = 0.25 the balanced tree minimizes the expected number of questions; with p1 = 0.1, p2 = 0.2, p3 = 0.3, p4 = 0.4 a skewed tree that resolves the most probable outcomes first does better.]
Chapter 10: Limitations of Algorithm Power
Basic Asymptotic Efficiency Classes

1        constant
log n    logarithmic
n        linear
n log n  n log n
n²       quadratic
n³       cubic
2ⁿ       exponential
n!       factorial
Polynomial-Time Complexity
Polynomial-time complexity: the complexity of an algorithm is
O(a_b·n^b + a_{b−1}·n^{b−1} + … + a_1·n + a_0) = O(n^b)
with a fixed degree b > 0.
If a problem can be solved in polynomial time, it is usually considered to be theoretically computable on current computers.
When an algorithm's complexity is larger than polynomial, i.e., exponential, it is theoretically considered too expensive to be useful.
This may not exactly reflect its usability in practice!
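To make the polynomial-versus-exponential gap concrete, a quick back-of-the-envelope comparison in Python (n = 100 is my own illustrative choice, not from the slides):

```python
n = 100                  # an illustrative input size
cubic = n ** 3           # a polynomial-time cost: one million steps
exponential = 2 ** n     # an exponential cost: about 1.27e30 steps
```

Even at a billion steps per second, 2^100 steps would take on the order of 4×10^13 years, while n^3 finishes in about a millisecond.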
List of Problems
Sorting
Searching
Shortest paths in a graph
Minimum spanning tree
Primality testing
Traveling salesman problem
Knapsack problem
Chess
Towers of Hanoi …
Lower Bound
Problem A can be solved by algorithms a1, a2,…, ap
Problem B can be solved by algorithms b1, b2,…, bq
We may ask:
• Which algorithm is more efficient? This makes more sense when the compared algorithms solve the same problem.
• Which problem is more complex? We may compare the complexity of the best algorithm for A with that of the best algorithm for B.
For each problem, we want to know the lower bound Ω(·): the best possible efficiency any algorithm for the problem can achieve.
Tight lower bound: we have found an algorithm whose complexity matches this lower bound.