13 A: External Algorithms II; Disjoint Sets; Java API Supportcs1102s/slides/slides_13_A.bw.pdf ·...
Transcript of 13 A: External Algorithms II; Disjoint Sets; Java API Supportcs1102s/slides/slides_13_A.bw.pdf ·...
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
13 A: External Algorithms II; Disjoint Sets;Java API Support
CS1102S: Data Structures and Algorithms
Martin Henz
April 15, 2009
Generated on Friday 17th April, 2009, 12:37CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support1
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
1 External Sorting
2 Disjoint Sets
3 Java API Support for Data Structures
4 Puzzlers
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support2
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Model for External SortingThe Simple AlgorithmMultiway Merge
1 External SortingModel for External SortingThe Simple AlgorithmMultiway Merge
2 Disjoint Sets
3 Java API Support for Data Structures
4 Puzzlers
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support3
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Model for External SortingThe Simple AlgorithmMultiway Merge
Tapes as Storage
Similar to disks
Access time many orders of magnitude slower than mainmemory
Additional characteristics
Large amounts of data can be read sequentially quite efficiently
Access of previous locations
is extremely slow, as it requires re-winding the tape!
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support4
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Model for External SortingThe Simple AlgorithmMultiway Merge
External Sorting
Main idea
Use tapes sequentially, and read one block from each inputtape tape
Merge blocks
Sort the blocksUse merge procedure from mergesort to merge
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support5
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Model for External SortingThe Simple AlgorithmMultiway Merge
The Simple Algorithm: Overview
Four tapes
Two input tapes; two output tapes
Read and write runs
Read runs from input tape, sort them and write alternatively tooutput tapes
Continue, writing larger runs
Read two runs from each “output” tape, and merge them on thefly, writing alternatively to “input” tapes
Continue
until one tape has all sorted dataCS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support6
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Model for External SortingThe Simple AlgorithmMultiway Merge
Multiway Merge
Why only four tapes?
If we have more than four tapes, we can take advantage ofthem by using multiway merge
How finding the smallest element during merge?
Priority queue!
Each iteration of inner loop
deleteMin to find smallest elementinsert new element from tape from which element was deleted
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support7
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Model for External SortingThe Simple AlgorithmMultiway Merge
Polyphase Merge and Replacement Selection
Polyphase merge: main idea
Make use of fewer tapes, by re-using tapes for reading andwritingLeading to tape organization using k th order Fibonaccinumbers
Replacement selection: main idea
Make use of input tape as output tape, reusing the tapes “onthe fly”
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support8
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
1 External Sorting
2 Disjoint SetsEquivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariantsApplications
3 Java API Support for Data Structures
4 Puzzlers
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support9
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Equivalence Relations
Definition
An equivalence relation is a relation R that satisfies threeproperties:
1 (Reflexive) aRa, for all a ∈ S.2 (Symmetric) aRb if and only if bRa.3 (Transitive) aRb and bRc implies aRc.
Examples
Electrical connectivity (metal wires between points)
Cities belonging to same country
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support10
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
The Dynamic Equivalence Problem
Initial setup
Collection of N disjoint sets, each with one element
Operations
find(a): return the set of which x is element
union(a, b): merge the sets to which a and b belong, sothat find(a) = find(b)
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support11
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Strategies
Fast Find, Slow Union
Use array repres to store equivalence class for each element
find(a): return repres[a]
union(a, b): if repres[x ] = repres[b] then set repres[x ] torepres[a]
Fast Union, Reasonable Find
Union/find data structure
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support12
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Basic Data Structure
Idea
Maintain forest corresponding to equivalence relation
Union
Merge trees
Find
Return root of tree
Observe
Only upward direction needed!
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support13
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Example
Initial setup:
After union(4, 5)
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support14
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Example
After union(4, 5)
After union(6, 7)
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support15
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Example
After union(6, 7)
After union(4, 6)
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support16
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Representation
Idea
Remember parent node only; mark root with −1
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support17
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Variants
Problem
How to choose root for union? Bad choice can lead to longpaths
Union-by-size
Always make the smaller tree a subtree of the larger tree
Analysis
When depth increases, the tree is smaller than the other side.Thus, after union, it is at least twice as large.
Height
less than or equal to log NCS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support18
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Variants
Union-by-height
Always make the shorter tree a subtree of the higher tree
Height
As with union-by-size: O(log N)
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support19
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Path Compression
During find make every node point to root
after find(14)
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support20
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
A Very Slowly Growing Function
Definition
log∗ N is the number of times log needs to be applied to N untilN ≤ 1.
Examples
log∗ 2 = 1
log∗ 4 = 2
log∗ 16 = 3
log∗ 65536 = 4
...
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support21
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Equivalence RelationsThe Dynamic Equivalence ProblemBasic Data StructureVariants
Runtime
Consider variant
Union-by-height combined with path compression
Theorem
The running time of M unions and finds is O(M log∗ N).
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support22
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
1 External Sorting
2 Disjoint Sets
3 Java API Support for Data StructuresCollections, Lists, IteratorsTreesHashingPriorityQueueSorting
4 Puzzlers
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support23
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
The Top-level Collection Interface
public inter face Co l l ec t i on <Any>extends I t e r a b l e <Any>
{i n t s ize ( ) ;boolean isEmpty ( ) ;void c l ea r ( ) ;boolean conta ins ( Any x ) ;boolean add ( Any x ) ; / / s i cboolean remove ( Any x ) ; / / s i cjava . u t i l . I t e r a t o r <Any> i t e r a t o r ( ) ;
}
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support24
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
The List Interface in Collection API
public inter face L i s t <Any>extends Co l l ec t i on <Any>
{Any get ( i n t i dx ) ;Any set ( i n t idx , Any newVal ) ;void add ( i n t idx , Any x ) ;void remove ( i n t i dx ) ;
L i s t I t e r a t o r <Any> l i s t I t e r a t o r ( i n t pos ) ;}
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support25
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
ArrayList and LinkedList
public class Ar rayL i s t <Any>implements L i s t <Any> { . . . }
public class L inkedL is t <Any>implements L i s t <Any> { . . . }
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support26
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
Iterators
public inter face I t e r a t o r <Any> {boolean hasNext ( ) ;Any next ( ) ;void remove ( ) ;
}
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support27
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
ListIterators
public inter face L i s t I t e r a t o r <Any>extends I t e r a t o r <Any>
{boolean hasPrevious ( ) ;Any prev ious ( ) ;void add ( Any x ) ;void set ( Any newVal ) ;
}
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support28
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
TreeSet
Implements Collection
Guarantees O(log N) time for add, remove and contains
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support29
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
AbstractMap<K,V>
Basic operations
V get(K key): Returns the value to which the specified keyis mapped.
V put(K key, V value): Associates the specified value withthe specified key in this map.
Other operations
containsKey(key), containsValue(val), remove(key)
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support30
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
TreeMap
Extends AbstractMap
Guarantees O(log N) time for put, get, containsKey,containsValue, remove
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support31
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
HashMap
Extends AbstractMap
Uses separate chaining with rehashing
Rehashing is governed by initial capacity and load factor,set in constructor
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support32
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
HashSet
Implements Collection using HashMap
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support33
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
PriorityQueue
Implements Collection
Efficient implementation of heap data structureOperation names:
deleteMin is called “poll”insert is called “add”
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support34
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Collections, Lists, IteratorsTreesHashingPriorityQueueSorting
Sorting
Generic sorting supported by class Collections
Uses mergesort in order to minimize number ofcomparisons
Sorting of built-in numerical types supported by classArrays
Uses efficient implementation of quicksort, to takeadvantage of tight inner loop.
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support35
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
1 External Sorting
2 Disjoint Sets
3 Java API Support for Data Structures
4 Puzzlers
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support36
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Last Puzzler: Package Deal
package c l i c k ;public class CodeTalk {
public void d o I t ( ) {printMessage ( ) ;
}void printMessage ( ) {
System . out . p r i n t l n ( ” C l i c k ” ) ;} }
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support37
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Last Puzzler: Package Deal
package hack ;import c l i c k . CodeTalk ;public class TypeI t {
private s t a t i c class C l i c k I t extends CodeTalk {void printMessage ( ) {
System . out . p r i n t l n ( ” Hack ” ) ;} }public s t a t i c void main ( S t r i n g [ ] args ) {
C l i c k I t c l i c k i t = new C l i c k I t ( ) ;c l i c k i t . d o I t ( ) ;
} }
What does clickit . doIt () print? “Click” or “Hack”?
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support38
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Java’s Access Modifiers
public : available wherever the class is available
private : only available within the class
protected : available in subclasses and within same package
none : available within the same package
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support39
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
Access Modifiers Govern Inheritance
Overriding only available methods
A method can be overridden only when it is available accordingto the modifier rules.
Package visibility
Method printMessage can only be overridden within packageclick .
Result
”Click” is printed.
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support40
External SortingDisjoint Sets
Java API Support for Data StructuresPuzzlers
This Week and Beyond
Thursday tutorial: Assignment 9
Friday lecture: CS1102S summary, outlook; questions?
Next week: Reading week, consultation by appointment
27/4 and 28/4: no consultation
29/4, 5pm: Final
CS1102S: Data Structures and Algorithms 13 A: External Algorithms II; Disjoint Sets; Java API Support41