U Kang (2015) 1
Data Structure
Lecture #12: Binary Trees 2 (Chapter 5)
U Kang, Seoul National University
In This Lecture
- Learn the Priority Queue data structure
- Learn the Heap data structure
- Learn the Huffman tree data structure
Priority Queues (1)
Problem: We want a data structure that stores records as they come (insert), but on request, releases the record with the greatest value (removemax)
Example: Scheduling jobs in a multi-tasking operating system.
Priority Queues (2)
Possible solutions:
- Insert appends to an unsorted array or linked list (O(1)); removemax then determines the maximum by scanning the whole list (O(n)).
- Keep a linked list sorted in decreasing order: insert places each element in its correct position (O(n)); removemax simply removes the head of the list (O(1)).
- Use a heap: both insert and removemax are O(log n) operations.
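The heap approach is what Java's built-in java.util.PriorityQueue provides; a minimal sketch (PriorityQueue is a min-heap by default, so a reversed comparator turns it into a max-heap; the class and helper names here are illustrative, not from the lecture):

```java
import java.util.Collections;
import java.util.PriorityQueue;

public class PQDemo {
    // Drain all values in decreasing order using a max-heap
    // (Java's min-heap PriorityQueue plus a reversed comparator)
    static int[] drainMax(int[] jobs) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(Collections.reverseOrder());
        for (int j : jobs) pq.add(j);          // insert: O(log n) each
        int[] out = new int[jobs.length];
        for (int i = 0; i < out.length; i++)
            out[i] = pq.poll();                // removemax: O(log n) each
        return out;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(drainMax(new int[]{42, 7, 88, 15})));
        // -> [88, 42, 15, 7]
    }
}
```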
Heaps
Heap: a complete binary tree with the heap property:
- Min-heap: every node's value is less than or equal to its children's values.
- Max-heap: every node's value is greater than or equal to its children's values.
The values are partially ordered.
Heap representation: normally the array-based complete binary tree representation.
Max Heap Example
Array representation (level by level): 88 85 83 72 73 42 57 6 48 60
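The max-heap property can be checked directly from the array using the index formulas (children of index i are at 2i+1 and 2i+2); a small sketch (isMaxHeap is a hypothetical helper, not part of the lecture code):

```java
public class HeapCheck {
    // Max-heap property: every internal node's value >= each of its children's values
    static boolean isMaxHeap(int[] h) {
        for (int i = 0; i < h.length / 2; i++) {   // positions < n/2 are internal nodes
            int l = 2 * i + 1, r = 2 * i + 2;      // array-based child positions
            if (l < h.length && h[i] < h[l]) return false;
            if (r < h.length && h[i] < h[r]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] heap = {88, 85, 83, 72, 73, 42, 57, 6, 48, 60};
        System.out.println(isMaxHeap(heap));   // true
    }
}
```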
Max Heap Implementation (1)

public class MaxHeap<E extends Comparable<? super E>> {
  private E[] Heap;   // Pointer to the heap array
  private int size;   // Maximum size of the heap
  private int n;      // Number of things in the heap

  public MaxHeap(E[] h, int num, int max)
  { Heap = h; n = num; size = max; buildheap(); }

  public int heapsize() { return n; }

  public boolean isLeaf(int pos)   // Is pos a leaf position?
  { return (pos >= n/2) && (pos < n); }

  public int leftchild(int pos) {  // Left child position
    assert pos < n/2 : "Position has no left child";
    return 2*pos + 1;
  }

  public int rightchild(int pos) { // Right child position
    assert pos < (n-1)/2 : "Position has no right child";
    return 2*pos + 2;
  }

  public int parent(int pos) {     // Parent position
    assert pos > 0 : "Position has no parent";
    return (pos-1)/2;
  }
Building Heaps
Binary tree to heap
[Figure: two examples of converting a binary tree into a heap; the pairs list the exchanges performed: (4-2), (4-1), (2-1), (5-2), (5-4), (6-3), (6-5), (7-5), (7-6) and (5-2), (7-3), (7-1), (6-1)]
Building Heaps
How to build a heap? The "sift down" method:
Both H1 and H2 are heaps. Push R down to its proper position.
[Figure: sifting down flour]
Building Heaps
Siftdown operation
Sift Down

public void buildheap()   // Heapify contents
{ for (int i=n/2-1; i>=0; i--) siftdown(i); }

private void siftdown(int pos) {
  assert (pos >= 0) && (pos < n) : "Illegal heap position";
  while (!isLeaf(pos)) {
    int j = leftchild(pos);
    if ((j < (n-1)) && (Heap[j].compareTo(Heap[j+1]) < 0))
      j++;              // index of the child with the greater value
    if (Heap[pos].compareTo(Heap[j]) >= 0)
      return;
    DSutil.swap(Heap, pos, j);
    pos = j;            // Move down
  }
}
RemoveMax, Insert

public E removemax() {
  assert n > 0 : "Removing from empty heap";
  DSutil.swap(Heap, 0, --n);   // Swap maximum with last value
  if (n != 0) siftdown(0);     // Sift new root down to its place
  return Heap[n];
}

public void insert(E val) {
  assert n < size : "Heap is full";
  int curr = n++;
  Heap[curr] = val;
  // Sift up until curr's parent's key > curr's key
  while ((curr != 0) &&
         (Heap[curr].compareTo(Heap[parent(curr)]) > 0)) {
    DSutil.swap(Heap, curr, parent(curr));
    curr = parent(curr);
  }
}
Example of Root Deletion

Given the initial heap:
[Figure: three heap diagrams showing deletion of the root (97): the last element (83) replaces the root and is then sifted down to its proper position, leaving 93 at the root]
In a heap of N nodes, the maximum distance the root can sift down is ⌈log (N+1)⌉ - 1.
Heap Building Analysis
- Insert into the heap one value at a time:
  - Push each new value down the tree from the root to where it belongs
  - Total cost: Σ_{i=1}^{n} log i = Θ(n log n)
- Starting with a full array, work from the bottom up:
  - Since the nodes below already form heaps, just need to push the current node down (at worst, it goes to the bottom)
  - Most nodes are at the bottom, so they do not have far to go
  - When i is the level of the node counting from the bottom starting with 1, the total cost is Σ_{i=1}^{log n} (i - 1) · n/2^i = Θ(n)
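The two build strategies can be compared empirically by counting swaps; a sketch under the assumption of integer keys (all class and method names here are illustrative). An ascending input is the worst case for repeated insertion, since every new value sifts all the way to the root:

```java
import java.util.Arrays;

public class BuildCount {
    static long swaps;

    // Standard max-heap siftdown on an int array of size n
    static void siftdown(int[] h, int n, int pos) {
        while (2 * pos + 1 < n) {
            int j = 2 * pos + 1;
            if (j + 1 < n && h[j] < h[j + 1]) j++;   // larger child
            if (h[pos] >= h[j]) return;
            int t = h[pos]; h[pos] = h[j]; h[j] = t; swaps++;
            pos = j;
        }
    }

    // Bottom-up heapify: Theta(n) total work
    static long buildBottomUp(int[] h) {
        swaps = 0;
        for (int i = h.length / 2 - 1; i >= 0; i--) siftdown(h, h.length, i);
        return swaps;
    }

    // One insert at a time, sifting up: Theta(n log n) worst case
    static long buildByInsert(int[] a) {
        swaps = 0;
        int[] h = new int[a.length];
        for (int n = 0; n < a.length; n++) {
            h[n] = a[n];
            int c = n;
            while (c > 0 && h[c] > h[(c - 1) / 2]) {  // sift up
                int p = (c - 1) / 2;
                int t = h[c]; h[c] = h[p]; h[p] = t; swaps++;
                c = p;
            }
        }
        return swaps;
    }

    public static void main(String[] args) {
        int n = 1 << 15;
        int[] asc = new int[n];
        for (int i = 0; i < n; i++) asc[i] = i;   // ascending input
        System.out.println("bottom-up: " + buildBottomUp(Arrays.copyOf(asc, n)));
        System.out.println("insert:    " + buildByInsert(Arrays.copyOf(asc, n)));
    }
}
```

The bottom-up count stays proportional to n, while the insertion count grows like n log n, matching the analysis above.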
Huffman Coding Trees
ASCII codes: 8 bits per character.
- Fixed-length coding.
Can take advantage of the relative frequency of letters to save space.
- Variable-length coding
  - E.g., 1 bit for 'e', 6 bits for 'z'
Build the tree with minimum external path weight.

Letter: Z  K  M  C  U  D  L  E
Freq:   2  7  24 32 37 42 42 120
Huffman Coding Trees
- Goal: Build the tree with minimum external path weight.
  - Weighted path length of a leaf: weight × depth of the leaf
  - External path weight: sum of the weighted path lengths of all leaves
- Fact: the Huffman coding tree gives the minimum external path weight
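Construction repeatedly removes the two lowest-weight trees from a priority queue and joins them under a new root whose weight is their sum; a sketch (class and variable names are my own, not the textbook's):

```java
import java.util.PriorityQueue;

public class Huffman {
    static class Node {
        int weight; char letter; Node left, right;
        Node(int w, char c) { weight = w; letter = c; }                       // leaf
        Node(Node l, Node r) { weight = l.weight + r.weight; left = l; right = r; }
        boolean isLeaf() { return left == null; }
    }

    // Build the Huffman tree from letters and their frequencies
    static Node build(char[] letters, int[] freqs) {
        PriorityQueue<Node> pq = new PriorityQueue<>((a, b) -> a.weight - b.weight);
        for (int i = 0; i < letters.length; i++) pq.add(new Node(freqs[i], letters[i]));
        while (pq.size() > 1)
            pq.add(new Node(pq.poll(), pq.poll()));  // join the two lightest trees
        return pq.poll();
    }

    // External path weight: sum over leaves of weight * depth
    static int epw(Node t, int depth) {
        if (t.isLeaf()) return t.weight * depth;
        return epw(t.left, depth + 1) + epw(t.right, depth + 1);
    }

    public static void main(String[] args) {
        char[] ls = {'Z', 'K', 'M', 'C', 'U', 'D', 'L', 'E'};
        int[] fs = {2, 7, 24, 32, 37, 42, 42, 120};
        System.out.println("external path weight = " + epw(build(ls, fs), 0));
    }
}
```

On the slide's frequencies this yields an external path weight of 785, consistent with the code lengths in the table that follows.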
Huffman Tree Construction (1)
Huffman Tree Construction (2)
Assigning Codes

Letter  Freq  Code  Bits
C       32
D       42
E       120
K       7
L       42
M       24
U       37
Z       2

Follow the path from the root to each letter.
Assigning Codes

Letter  Freq  Code    Bits
C       32    1110    4
D       42    101     3
E       120   0       1
K       7     111101  6
L       42    110     3
M       24    11111   5
U       37    100     3
Z       2     111100  6

Follow the path from the root to each letter.
Using Huffman Codes
- To encode a string using Huffman codes, just concatenate the Huffman code for each letter
  - Code for DEED: 10100101
- Given a Huffman-encoded string, can we correctly decode it?
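Encoding is a per-letter table lookup followed by concatenation; a sketch using the code table above (the class name and map are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class HuffEncode {
    // Code table from the slide (letter -> Huffman code)
    static final Map<Character, String> CODE = new HashMap<>();
    static {
        CODE.put('C', "1110");   CODE.put('D', "101");   CODE.put('E', "0");
        CODE.put('K', "111101"); CODE.put('L', "110");   CODE.put('M', "11111");
        CODE.put('U', "100");    CODE.put('Z', "111100");
    }

    // Concatenate the code for each letter
    static String encode(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) sb.append(CODE.get(c));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("DEED"));  // 10100101
    }
}
```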
Coding and Decoding
- Given a Huffman-encoded string, can we correctly decode it?
  - Yes! Thanks to the 'prefix' property of Huffman codes
- A set of codes is said to meet the prefix property if no code in the set is a prefix of another.
  - Thus, each code is uniquely identified
- Decode 1011001110111101:
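Thanks to the prefix property, a decoder can greedily match codes from left to right, since at most one code can match at any position; a sketch using the slide's code table (the class name is illustrative, and the input is assumed to be a valid encoding):

```java
public class HuffDecode {
    // Code table from the slide: {code, letter} pairs
    static final String[][] CODES = {
        {"0", "E"}, {"100", "U"}, {"101", "D"}, {"110", "L"},
        {"1110", "C"}, {"11111", "M"}, {"111100", "Z"}, {"111101", "K"}
    };

    // Greedy prefix matching: unambiguous because no code is a prefix of another
    static String decode(String bits) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < bits.length()) {
            for (String[] c : CODES) {
                if (bits.startsWith(c[0], i)) {
                    out.append(c[1]);
                    i += c[0].length();
                    break;
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("1011001110111101"));  // DUCK
    }
}
```

Tracing the slide's string: 101 -> D, 100 -> U, 1110 -> C, 111101 -> K.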
Expected Cost
- Expected cost = average cost per letter = (Σ_i c_i · f_i) / (Σ_i f_i)
  - c_i = number of bits for letter i
  - f_i = frequency of letter i
- The expected cost is 2.57 bits/letter, which is smaller than log 8 = 3 bits/letter
Letter  Freq  Code    Bits
C       32    1110    4
D       42    101     3
E       120   0       1
K       7     111101  6
L       42    110     3
M       24    11111   5
U       37    100     3
Z       2     111100  6
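The 2.57 bits/letter figure can be recomputed from the table above; a sketch (the class and method names are illustrative):

```java
public class ExpectedCost {
    // Average bits per letter: (sum of c_i * f_i) / (sum of f_i)
    static double expectedCost(int[] freq, int[] bits) {
        int weighted = 0, total = 0;
        for (int i = 0; i < freq.length; i++) {
            weighted += freq[i] * bits[i];   // c_i * f_i
            total += freq[i];                // f_i
        }
        return (double) weighted / total;
    }

    public static void main(String[] args) {
        int[] freq = {32, 42, 120, 7, 42, 24, 37, 2};  // C D E K L M U Z
        int[] bits = {4, 3, 1, 6, 3, 5, 3, 6};
        System.out.printf("%.2f bits/letter%n", expectedCost(freq, bits));  // 2.57
    }
}
```

The weighted sum is 785 bits over 306 letters, giving 785/306 ≈ 2.57.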
Search Tree vs. Trie

- In a BST, the root value splits the key range into everything less than or greater than the key
  - The split points are determined by the data values
- View the Huffman tree as a search tree:
  - All codes starting with 0 are in the left branch; all codes starting with 1 are in the right branch
  - The root splits the key range in half
  - The split points are determined by the data structure, not the data values
  - Such a structure is called a Trie
Summary
- Learn the Priority Queue data structure
- Learn the Heap data structure
- Learn the Huffman tree data structure
Questions?