WEEK 5 The Disjoint Set Class Ch 8.1-8.2-8.3-8.4-8.5 CE222 Dr. Senem Kumova Metin 2011-2012.
WEEK 5 The Disjoint Set Class
Ch 8.1-8.2-8.3-8.4-8.5
CE222 Dr. Senem Kumova Metin
2011-2012
OUTLINE
• Definitions
• Dynamic Equivalence Problem
• Operations on Disjoint Sets
• Smart Union Algorithms
• Path compression
Next Week
– Worst Case Analysis
– Example
2CE222_Dr. Senem Kumova Metin
DEFINITIONS
• A set is a collection of objects.
• Set A is a subset of set B if all elements of A are in B.
• Subsets are also sets.
• The union of two sets A and B is the set C that consists of all elements that are in A or in B.
• Two sets are mutually disjoint if they do not have a common element. Such sets are called disjoint sets.
DEFINITIONS
• A partition of a set is a collection of mutually disjoint subsets such that the union of all these subsets is the set itself.
EXAMPLE: S = {1,2,3,4}, A = {1,2}, B = {3,4}, C = {2,3,4}, D = {4}
• Is A, B a partition of S? YES
• Is A, C a partition of S? NO (A and C share the element 2)
• Is A, D a partition of S? NO (the element 3 is in neither A nor D)
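The three answers in the example can be checked mechanically. The following is a minimal C++ sketch, assuming sets of integers; isPartition is a hypothetical helper name that does not come from the course material, and it ignores the usual extra requirement that every part be non-empty.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper (not from the slides): returns true when `parts`
// is a partition of `s`, i.e. the parts are pairwise disjoint and
// their union is exactly `s`.
bool isPartition(const std::set<int>& s,
                 const std::vector<std::set<int>>& parts) {
    std::set<int> seen;
    for (const std::set<int>& part : parts) {
        for (int x : part) {
            if (s.count(x) == 0) return false;         // element outside S
            if (!seen.insert(x).second) return false;  // element in two parts
        }
    }
    return seen == s;  // every element of S must be covered
}
```

Called with S = {1,2,3,4}, the pair A, B is accepted, while A, C fails the disjointness test and A, D fails the coverage test.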
DEFINITIONS
• A relation R is defined on a set S if for every pair of elements (a, b), a, b ∈ S, a R b is either true or false. If a R b is true, then we say that a is related to b.
• An equivalence relation is a relation R that satisfies three properties:
  – (reflexive) a R a, for all a ∈ S.
  – (symmetric) a R b if and only if b R a.
  – (transitive) a R b and b R c implies that a R c.
• An equivalence relation partitions a set into distinct equivalence classes.
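As an illustration of the three properties, here is a small C++ sketch that tests them on a relation stored as a Boolean matrix. The type alias Relation and the function isEquivalence are names assumed for this example only, and the elements of S are assumed to be numbered 0..n-1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed representation: r[a][b] is true exactly when a R b.
using Relation = std::vector<std::vector<bool>>;

// Checks the reflexive, symmetric and transitive properties by brute force.
bool isEquivalence(const Relation& r) {
    const std::size_t n = r.size();
    for (std::size_t a = 0; a < n; ++a) {
        if (!r[a][a]) return false;                   // reflexive
        for (std::size_t b = 0; b < n; ++b) {
            if (r[a][b] != r[b][a]) return false;     // symmetric
            for (std::size_t c = 0; c < n; ++c) {
                if (r[a][b] && r[b][c] && !r[a][c])   // transitive
                    return false;
            }
        }
    }
    return true;
}
```

The check costs O(n^3) time, which is acceptable for a demonstration but not something one would run on a large set.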
The Dynamic Equivalence Problem
• How can we decide for any a and b if a is related to b?
  Answer: store the relation in a two-dimensional array of Boolean variables; the lookup then takes constant time.
• What happens if the relation is not explicitly defined and only some related pairs are given?
• Relations: a1~a3, a3~a5, a4~a5
The Dynamic Equivalence Problem
• The equivalence class of an element a ∈ S is the subset of S that contains all the elements that are related to a.
• To decide whether two members are related, we only need to check whether the two are in the same equivalence class.
• Given the five-element set {a1, a2, a3, a4, a5} and the relations a1~a3, a3~a5, a5~a4: is a1 related to a4?
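The Dynamic Equivalence Problem
• Each equivalence class may be represented by a single object: the representative object.
• For the relations given (a1~a3, a3~a5, a5~a4), the equivalence classes are {a1, a3, a4, a5} and {a2}; any one member of a class can serve as its representative.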
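To make the idea of a representative concrete, the sketch below processes the three given relations and records one representative per element. It is only an illustration: classesAfter is a hypothetical function name, elements a1..a5 are mapped to the indices 1..5, and the first element of each merged pair is arbitrarily chosen to supply the surviving representative.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns rep, where rep[i] is the representative of the class that
// contains a_i after all relations are applied (index 0 is unused).
std::vector<int> classesAfter(int n,
                              const std::vector<std::pair<int, int>>& relations) {
    std::vector<int> rep(n + 1);
    for (int i = 1; i <= n; ++i) rep[i] = i;    // each element starts alone
    for (const auto& r : relations) {
        int keep = rep[r.first];
        int drop = rep[r.second];
        for (int i = 1; i <= n; ++i)
            if (rep[i] == drop) rep[i] = keep;  // merge the two classes
    }
    return rep;
}
```

With the relations (1,3), (3,5), (5,4) on five elements, a1, a3, a4 and a5 end up with the same representative, so a1 is related to a4, while a2 remains in a class of its own.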
Operations on Disjoint Sets
Union (add operation, e.g., add relation a~b)
– Check whether a and b are already related, i.e., whether they are in the same equivalence class.
– If not, merge the two equivalence classes containing a and b into a new equivalence class.
Find
– Return the name (pointer or index of the representative) of the set containing a given element.
A simple Implementation : Example
Consider the following disjoint sets on the ten decimal digits (initially each digit is in its own set). For each step, the find values listed are the roots that UnionSets looks up before it links the two sets.
After UnionSets(1,2) (add relation 1~2)
After UnionSets(2,4): find(2)=>1 and find(4)=>4
After UnionSets(5,6): find(5)=>5 and find(6)=>6
After UnionSets(6,7): find(6)=>5 and find(7)=>7
After UnionSets(2,9): find(2)=>1 and find(9)=>9
After UnionSets(5,1): find(5)=>5 and find(1)=>1
After UnionSets(3,0): find(3)=>3 and find(0)=>0
After UnionSets(0,8): find(0)=>3 and find(8)=>8
After UnionSets(3,5): find(3)=>3 and find(5)=>5
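The sequence of unions in this example can be replayed in code. The sketch below assumes the parent-array representation and the linking rule used in the Tree Implementation slides (the root of the second argument is attached under the root of the first); replayExample is a name made up for this illustration.

```cpp
#include <cassert>
#include <vector>

std::vector<int> parent;  // parent[i] == -1 marks a root

int find(int i) {
    while (parent[i] != -1) i = parent[i];
    return i;
}

void unionSets(int i, int j) {
    i = find(i);
    j = find(j);
    if (i != j) parent[j] = i;  // second root is attached under the first
}

// Replays the nine unions of the example on the digits 0..9.
void replayExample() {
    parent.assign(10, -1);  // every digit starts in its own set
    unionSets(1, 2);
    unionSets(2, 4);
    unionSets(5, 6);
    unionSets(6, 7);
    unionSets(2, 9);
    unionSets(5, 1);
    unionSets(3, 0);
    unionSets(0, 8);
    unionSets(3, 5);
}
```

Under these assumptions all ten digits end up in a single set whose root is 3.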
Tree Implementation
• For simplicity, we will assume we are creating disjoint sets with N integers.
• We will define an array:
    void initialize( int N ) {
        parent = new int[N];
        for ( int i = 0; i < N; ++i ) {
            parent[i] = -1;
        }
    }
• If parent[i] == -1, then i is a root node. Initially, each integer is in its own set.
Tree Implementation : FIND and UNION
    // ITERATIVE
    int find( int i ) {
        while ( parent[i] != -1 )
            i = parent[i];
        return i;
    }

    // RECURSIVE
    int find( int i ) const {
        if ( parent[i] == -1 )
            return i;
        else
            return find( parent[i] );
    }

    void UnionSets( int i, int j ) {
        i = find( i );       // root of i
        j = find( j );       // root of j
        if ( i != j )
            parent[j] = i;   // 2nd set is appended to 1st set
    }
Tree Implementation : Time Complexity
    // ITERATIVE
    int find( int i ) {                // worst case O(N)
        while ( parent[i] != -1 )
            i = parent[i];
        return i;
    }
    void UnionSets( int i, int j ) {   // O(1) apart from the two find calls
        i = find( i );
        j = find( j );
        if ( i != j )
            parent[j] = i;
    }
If we have u Union and f Find operations (counting the find calls made inside UnionSets among the f), the complexity is O(u + fN).
M consecutive operations could take O(MN) time in the worst case.
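The O(N) worst case for find arises when the unions build a single long chain. The sketch below constructs such a chain with the same linking rule as UnionSets and counts the links a find has to follow; chainFindSteps is a hypothetical name used only for this demonstration.

```cpp
#include <cassert>
#include <vector>

// Builds the degenerate chain 0 -> 1 -> ... -> n-1 with n-1 unions and
// returns how many parent links a find on element 0 then has to follow.
int chainFindSteps(int n) {
    std::vector<int> parent(n, -1);
    auto find = [&](int i, int& steps) {
        while (parent[i] != -1) {
            i = parent[i];
            ++steps;
        }
        return i;
    };
    for (int k = 1; k < n; ++k) {
        int unused = 0;
        int i = find(k, unused);      // k is still a root
        int j = find(k - 1, unused);  // root of the chain built so far
        if (i != j) parent[j] = i;    // same rule as UnionSets(k, k-1)
    }
    int steps = 0;
    find(0, steps);
    return steps;
}
```

For n elements the function returns n-1, so a single find on the deepest element already costs time linear in N.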
Array Implementation: Example
Array Implementation: Time Complexity
    int find( int i ) {                // O(1)
        return array[i];
    }
    void UnionSets( int i, int j ) {   // O(N); N is the number of elements
        int rooti = find( i );
        int rootj = find( j );
        for ( int k = 1; k <= N; k++ )
            if ( array[k] == rootj )
                array[k] = rooti;
    }
    void Initialize( int N ) {
        array = new int[N + 1];        // elements are numbered 1..N
        for ( int e = 1; e <= N; e++ )
            array[e] = e;
    }
If we have u Union and f Find operations, the complexity is O(uN + f).
M consecutive operations could take O(MN) time in the worst case.
C++ Implementation from Text Book
    #include <vector>
    using namespace std;

    class DisjSets {
      public:
        DisjSets( int numElements ) : s( numElements ) {
            for ( int j = 0; j < s.size( ); j++ )
                s[j] = -1;
        }
        int find( int x ) const {
            if ( s[x] < 0 ) return x;
            else return find( s[x] );
        }
        void unionSets( int root1, int root2 ) {   // both arguments must be roots
            s[root2] = root1;
        }
      private:
        vector<int> s;   // an array with varying size
    };
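Because unionSets in this class links two roots directly, a caller that only knows two arbitrary elements has to look up their roots first. The sketch below repeats the class in compact form and adds a small wrapper; unionElements is a hypothetical helper and is not part of the textbook class.

```cpp
#include <cassert>
#include <vector>

class DisjSets {
  public:
    explicit DisjSets(int numElements) : s(numElements, -1) {}
    int find(int x) const { return s[x] < 0 ? x : find(s[x]); }
    void unionSets(int root1, int root2) { s[root2] = root1; }

  private:
    std::vector<int> s;  // s[i] < 0 marks a root, otherwise s[i] is the parent
};

// Hypothetical wrapper: merges the sets containing a and b, looking up
// the roots first and skipping the link when they already coincide.
void unionElements(DisjSets& ds, int a, int b) {
    int r1 = ds.find(a);
    int r2 = ds.find(b);
    if (r1 != r2) ds.unionSets(r1, r2);
}
```

Skipping the link when both roots are equal matters: calling unionSets(r, r) would make the root its own parent and find would then recurse forever.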
Linked List Implementation of Disjoint Sets
[Figure: the linked lists after Union(f, c)]
• Each set is represented by a linked list.
• The first object in each linked list serves as its set's representative.
• Each object in the linked list contains
  – a set member,
  – a pointer to the object containing the next set member,
  – a pointer back to the representative.
• Each list maintains pointers head, to the representative, and tail, to the last object in the list.
• Within each linked list, the objects may appear in any order (subject to our assumption that the first object in each list is the representative).
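The representation described in the bullets above can be sketched with index arrays standing in for the pointers. This is an assumption made to keep the example short: ListSets and its members are hypothetical names, elements are the integers 0..n-1, and the union simply appends the second list to the first without any size heuristic.

```cpp
#include <cassert>
#include <vector>

// Index-based sketch of the linked-list representation.
struct ListSets {
    std::vector<int> next;  // next[x]: following member in x's list, -1 at the end
    std::vector<int> rep;   // rep[x]: back pointer to the representative (list head)
    std::vector<int> tail;  // tail[r]: last member of the list headed by r

    explicit ListSets(int n) : next(n, -1), rep(n), tail(n) {
        for (int x = 0; x < n; ++x) rep[x] = tail[x] = x;  // one list per element
    }

    int findSet(int x) const { return rep[x]; }  // O(1)

    // Appends the list containing b to the end of the list containing a.
    void unionSets(int a, int b) {
        int ha = rep[a];
        int hb = rep[b];
        if (ha == hb) return;
        next[tail[ha]] = hb;  // splice the second list after the first
        tail[ha] = tail[hb];
        for (int x = hb; x != -1; x = next[x])
            rep[x] = ha;      // update every back pointer of the appended list
    }
};
```

Find is a single array lookup, while a union has to walk the whole appended list, which is why the choice of which list to append matters for the running time.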
Smart Union Algorithms
• Union by size
  – Make the smaller tree a subtree of the larger.
  – With union by size, the depth of any node is never more than log N, so a find operation is O(log N) and a sequence of M operations takes O(M log N).
  – The worst-case trees are binomial trees.
• Union by height (union by rank)
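Union by size, the first of the two rules listed above, can be sketched as follows. The sketch assumes the common convention of storing the negated tree size in the root's array slot; SizedDisjSets and sizeOf are names chosen for this example and are not taken from the slides.

```cpp
#include <cassert>
#include <vector>

class SizedDisjSets {
  public:
    explicit SizedDisjSets(int n) : s(n, -1) {}  // every root starts with size 1

    int find(int x) const { return s[x] < 0 ? x : find(s[x]); }

    // root1 and root2 must be two distinct roots.
    void unionSets(int root1, int root2) {
        if (s[root2] < s[root1]) {  // root2's tree is larger
            s[root2] += s[root1];   // add the sizes
            s[root1] = root2;       // smaller tree goes under root2
        } else {
            s[root1] += s[root2];
            s[root2] = root1;       // smaller (or equal) tree goes under root1
        }
    }

    int sizeOf(int x) const { return -s[find(x)]; }

  private:
    std::vector<int> s;  // s[i] < 0: i is a root and -s[i] is its tree size
};
```

A node's depth grows only when its tree is attached under a tree that is at least as large, so the merged tree is at least twice the size of the node's old tree; this is where the log N depth bound comes from.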
Smart Union Algorithms
• Union by height (union by rank)
    /* Make the shallow tree a subtree of the deeper */
    // s[root] stores the negated height minus 1 (so -1 for a single node)
    void DisjSets::unionSets( int root1, int root2 ) {
        if ( s[root2] < s[root1] )        // root2 is deeper
            s[root1] = root2;
        else {
            if ( s[root1] == s[root2] )   // update height if same
                s[root1]--;
            s[root2] = root1;
        }
    }
Smart Union Algorithms : Example
Analysis of Smart Union Algorithms
Suppose each list also includes the length of the list and that we always append the smaller list onto the longer (the weighted-union heuristic, comparable to the smart union rules above), with ties broken arbitrarily.
Theorem 2.1: Using the linked-list representation of disjoint sets and the weighted-union heuristic, a sequence of m MAKE-SET, UNION, and FIND-SET operations, n of which are MAKE-SET operations, takes O(m + n lg n) time.
Analysis of Smart Union Algorithms: Theorem 2.1 Proof:
Consider a fixed object x. We know that each time x's representative pointer was updated, x must have started in the smaller set. The first time x's representative pointer was updated, therefore, the resulting set must have had at least 2 members. Similarly, the next time x's representative pointer was updated, the resulting set must have had at least 4 members. Continuing on, we observe that for any k ≤ n, after x's representative pointer has been updated ⌈lg k⌉ times, the resulting set must have at least k members. Since the largest set has at most n members, each object's representative pointer has been updated at most ⌈lg n⌉ times over all the UNION operations. The total time used in updating the n objects is thus O(n lg n).
The time for the entire sequence of m operations follows easily. Each MAKE-SET and FIND-SET operation takes O(1) time, and there are O(m) of them. The total time for the entire sequence is thus O(m + n lg n).
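The counting argument in the proof can be observed experimentally. The sketch below is an illustration under stated assumptions, not the lists from the slides: it stores each set's members in a std::vector instead of a linked list, and WeightedLists and maxUpdatesAfterPairwiseMerges are hypothetical names. It merges n singleton sets in rounds of pairwise unions and reports the largest number of representative updates any single element received.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct WeightedLists {
    std::vector<std::vector<int>> members;  // members[r]: set headed by r
    std::vector<int> rep;                   // rep[x]: representative of x
    std::vector<int> updates;               // updates[x]: times rep[x] changed

    explicit WeightedLists(int n) : members(n), rep(n), updates(n, 0) {
        for (int x = 0; x < n; ++x) {
            members[x].push_back(x);
            rep[x] = x;
        }
    }

    // Weighted union: the smaller set is always appended to the larger one.
    void unionSets(int a, int b) {
        int big = rep[a];
        int small = rep[b];
        if (big == small) return;
        if (members[big].size() < members[small].size()) std::swap(big, small);
        for (int x : members[small]) {
            rep[x] = big;
            ++updates[x];
            members[big].push_back(x);
        }
        members[small].clear();
    }
};

// Requires n >= 1. Merges neighbours at distance 1, 2, 4, ... until one set remains.
int maxUpdatesAfterPairwiseMerges(int n) {
    WeightedLists w(n);
    for (int gap = 1; gap < n; gap *= 2)
        for (int i = 0; i + gap < n; i += 2 * gap)
            w.unionSets(i, i + gap);
    return *std::max_element(w.updates.begin(), w.updates.end());
}
```

For n = 8 the last element is moved once per round, three times in total, which matches lg 8; for other values of n the count stays within the ⌈lg n⌉ bound from the proof.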
Path compression
    int Find( int x ) {
        if ( parent[x] == -1 )                      // x is a root
            return x;
        else
            return parent[x] = Find( parent[x] );   // point x directly at the root
    }
• Any single find can still be O(log N), but later finds on the same path are faster.
• u Unions, f Finds: O(u + f α(f, u))
• α(f, u) is a functional inverse of Ackermann's function.
• N-1 Unions, O(N) Finds: "almost linear" total time.
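Path compression is usually combined with one of the smart union rules. The following is a minimal sketch that pairs it with union by height (rank), using the negative-root convention of the DisjSets class shown earlier; CompressedDisjSets and parentOf are names introduced only for this example.

```cpp
#include <cassert>
#include <vector>

class CompressedDisjSets {
  public:
    explicit CompressedDisjSets(int n) : s(n, -1) {}

    // Find with path compression: every node visited on the way up
    // is re-pointed directly at the root.
    int find(int x) {
        if (s[x] < 0) return x;
        return s[x] = find(s[x]);
    }

    // Union by rank on the sets containing a and b.
    void unionSets(int a, int b) {
        int r1 = find(a);
        int r2 = find(b);
        if (r1 == r2) return;
        if (s[r2] < s[r1]) {
            s[r1] = r2;                    // r2 has the larger rank
        } else {
            if (s[r1] == s[r2]) --s[r1];   // equal ranks: rank of r1 grows
            s[r2] = r1;
        }
    }

    int parentOf(int x) const { return s[x]; }  // for inspection only

  private:
    std::vector<int> s;  // s[i] < 0: i is a root holding -(rank) - 1
};
```

After unionSets(0,1), unionSets(2,3) and unionSets(0,2), element 3 sits two links below the root 0; a single find(3) then rewrites its parent to 0, so later finds on 3 take one step.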