Disjoint (Union-Find) data structure

Posted by
Suppose there are N elements. Some of them connected with each other directly, some of them connected with each other in-directly and some of them not connected with each other.Now we put elements which are connected with each other in a single group.Thus there will be multiple groups.We also want to check two elements connected with each other or not.
A simple example is graph.Nodes of graphs which are connected with each other can be put in a single group or subset. Thus there will be multiple subsets which are called connected components.
Using disjoint data structure (also known as Union-Find) we can manage above situation in efficient way.It includes basically two types of operation.
1. Union : Connect two elements with each other.
2. Find : To Check two elements belongs to same subset or not.
We will manage above using array data structure and we will use each element as index of array. 
Lets see a example :
Assume we have elements { 0,1,2,3,4,5 } and we put these elements into Array in following way. 
Array 0 1 2 3 4 5
Index 0 1 2 3 4 5
UNION (A,B)  :  Find all values in Array which are equal to value of Array [ A ] and replace all these values with value of Array [ B ].
1. UNION (0,1) :  Here A = 0,B = 1. hold values at index 0 and 1 into variables tempA and tempB.
tempA = Arr [ 0 ] = 0
tempB = Arr [ 1 ] = 1
In the Array Only at index 0 has the value which is equal to Array [ A ] i.e tempA. So we change value at index 0 with value of Array [ 1 ] .i.e tempB. Now Array becomes as
Array 1 1 2 3 4 5
Index 0 1 2 3 4 5
2. UNION (2,3) :  Here A = 2,B = 3. hold values at index 2 and 3 into variables tempA and tempB.
tempA = Arr [ 2 ] = 2
tempB = Arr [ 3 ] = 3
In the Array Only at index 2 has the value which is equal to Array [ A ] i.e tempA. So we change value at index 2 with value of Array [ B ] .i.e tempB. Now Array becomes as
Array 1 1 3 3 4 5
Index 0 1 2 3 4 5
3. UNION (2,4) :  Here A = 2,B = 4. hold values at index 2 and 4 into variables tempA and tempB.
tempA = Arr [ 2 ] = 3
tempB = Arr [ 4 ] = 4
In the Array at index 2 and 3 has the value which is equal to Array [ A ] i.e tempA. So we change value at index 2 and 3  with value of Array [ B ] .i.e tempB. Now Array becomes as
Array 1 1 4 4 4 5
Index 0 1 2 3 4 5

Diagram :

Code Snippet :

 

To Check two elements belongs to each other or not we need to compare value at corresponding index. if they are same than elements belongs to each other otherwise not. In above example 2,3 and 4 connected to each other.Elements 2 and 5 not connected to each other. perform_find(…) function can be used for this.
Time Complexity :
We noticed that when we connect an element we need to iterate N times every time.So in case to connect all elements number of iteration will be N*N. Hence worst case time complexity is O(N*N).

Second Approach :

Now we will modify above algorithm by using a concept of root element for every subset. It will remove overhead of iteration of all elements when performing union operation.So a root element of every subset is a specific element for which condition Array [ A ] == A holds true for element A. Means value at index A is equal to A.
When performing Union of A and B we will set root of A as the parent of root of B.It will include A and B in same subset thus both A and B now have same root element which is root of A.
On find operation of A and B we will compare root of A and B, if A and B has same root then they are connected with each other or belongs to same set/subset/group.
Lets see a example :
Assume we have elements { 0,1,2,3,4,5 } and we put these elements into Array in following way. 
Array 0 1 2 3 4 5
Index 0 1 2 3 4 5
1. UNION (0,1) :  Here A = 0,B = 1. hold values at index 0 and 1 into variables rootA and rootB.
rootA = Arr [ 0 ] = 0
rootB = Arr [ 1 ] = 1
Now rootA is equal to A, so rootA is the root of subset { 0 }.Similarly rootB is root of subset { 1 }. Now set rootA as the parent of rootB using Array [ rootB ] = Array [ rootA ].
i.e Array [ 1 ] = Array[ 0 ].
Now array becomes as
Array 0 0 2 3 4 5
Index 0 1 2 3 4 5
2. UNION (1,3) :  Here A = 1,B = 3. hold values at index 1 and 3 into variables rootA and rootB.
rootA = Arr [ 1 ] = 0
rootB = Arr [ 3 ] = 3
In above case rootA is not the root of A because rootA is not equal to A. So to get root element we need to perform following operation
A = rootA
rootA = Array [ A ] = Array [ 0 ] = 0 
i.e  rootA = Array [ Array [ A ] ] = 0
Now rootA is 0 and A is zero both are equal.So rootA is root of subset { 0,1 }.
rootB is equal to B so rootB is root of subset { 3 }.
Again in final step set rootA as the parent of rootB using Array [ rootB ] = Array [ rootA ].
i.e Array [ 3 ] = Array[ 0 ].
Now array becomes as
Array 0 0 2 0 4 5
Index 0 1 2 3 4 5
3. UNION (5,3) :  Here A = 5,B = 3. hold values at index 1 and 3 into variables rootA and rootB.
rootA = Arr [ 5 ] = 5
rootB = Arr [ 3 ] = 0
In above case rootB is not the root of B because rootB is not equal to B. So to get root element we need to perform following operation
B = rootB
rootB = Array [ B ] = Array [ 0 ] = 0 
i.e  rootB = Array [ Array [ B ] ] = 0
Now rootB is 0 and B is 0 both are equal.So rootB is root of subset { 0,1,3 }.
rootA is equal to A so rootA is root of subset { 5 }.
Again in final step set tempA as the parent of tempB using Array [ rootB ] = Array [ rootA ].
i.e Array [ 0 ] = Array [ 5 ].
Now array becomes as
Array 5 0 2 0 4 5
Index 0 1 2 3 4 5

Let’s find root of 3 it will give you more understanding about root element.

A = 3
rootA = Array [ A ] = Array [ 3 ] =  0
A = rootA = 0
rootA = Array [ A ] = Array [ 0 ] = 5
A = rootA = 5
rootA = Array [ A ] = Array [ 5 ] =  5
Above all steps are equal to
rootA = Array [ Array [ Array [ A ] ] ] = 5
At this stage rootA is equal to A and we found root of 3 is 5 which is true since  { 0,1,3,5} belongs to same set.
We noticed that To Find root element we used following approach.
 rootA = Array [ Array [ Array [ Array [  A ] ] ] ]…………………..
Note : By Using above approach we must ensure that it should not create a cycle otherwise finding a root element will stuck in infinity loop.
Diagram :
Code Snippet :
 Time Complexity :
In worst case above approach will reduce time complexity to O(N) since we don’t need to check every element while performing union operation after applying concept of root element.
Third Approach :
In this approach we will maintain a size array which will store number of elements in a subset and also we will change union operation concept, this time we will set root of bigger subset as the parent of root of smaller subset.Further you will notice that it will create a balance tree which will definitely reduce complexity. 
Lets see a example :
Assume we have elements { 0,1,2,3,4,5 } and we put these elements into Array in following way. 
Array 0 1 2 3 4 5
Index 0 1 2 3 4 5
In starting every subset has only one element and there are six subset, store size in a new array Size.
Size 1 1 1 1 1 1
Index 0 1 2 3 4 5
1. UNION (0,1) :  Here A = 0,B = 1. find the roots of both as explained in second approach.
rootA = 0 and rootB = 1. In size Array check size of both subset so
sizeA = Size [ rootA ] = 1
sizeB = Size [ rootB ] = 1
Now check which subset has bigger size. Here both are equal so select any root and make it parent of another root and increase size of parent root.
Let’s make rootA as the parent of rootB this will increase size of rootA subset.
i.e Array [ rootB ] = Array [ rootA ] = Array [ 0 ] = 0
& Size [rootA ] = Size [ rootA ] + Size[ rootB ] =  1 + 1 = 2
& Size [ rootB ] = 0
Now array becomes as
Array 0 0 2 3 4 5
Index 0 1 2 3 4 5
And size array becomes
Size 2 0 1 1 1 1
Index 0 1 2 3 4 5
2. UNION (1,3) :  Here A = 1,B = 3. find the roots of both as explained in second approach.
rootA = 0 and rootB = 3. In size Array check size of both subset so
sizeA = Size [ rootA ] = 2
sizeB = Size [ rootB ] = 1
Now check which subset has bigger size. Here subset with rootA has bigger size of 2 so
Let’s make rootA as the parent of rootB this will increase size of rootA subset.
i.e Array [ rootB ] = Array [ rootA ] = Array [ 0 ] = 0
& Size [rootA ] = Size [ rootA ] + Size[ rootB ] =  2 + 1 = 3
& Size [ rootB ] = 0
Now array becomes as
Array 0 0 2 0 4 5
Index 0 1 2 3 4 5
And size array becomes
Size 3 0 1 0 1 1
Index 0 1 2 3 4 5
3. UNION (5,3) :  Here A = 5,B = 3. find the roots of both as explained in second approach.
rootA = 5 and rootB = 0. In size Array check size of both subset so
sizeA = Size [ rootA ] = 1
sizeB = Size [ rootB ] = 3
Now check which subset has bigger size. Here subset with rootB has bigger size of 3 so
Let’s make rootB as the parent of rootA this will increase size of rootB subset.
i.e Array [ rootA ] = Array [ rootB ] = Array [ 0 ] = 0
& Size [rootB ] = Size [ rootB ] + Size[ rootA ] =  3 + 1 = 4
& Size [ rootA ] = 0
Now array becomes as
Array 0 0 2 0 4 0
Index 0 1 2 3 4 5
And size array becomes
Size 4 0 1 0 1 0
Index 0 1 2 3 4 5

Diagram : 

Code Snippet : find operation and root operations is same as previous and we just modify union operation and init operation.

Above union operation balancing tree so it is also called weighted-union operation.

Time Complexity : Due to balancing tree above approach maintain time complexity to log2N.

Path Compression :

In above all explanation when finding root to A, A pointing to it’s parent element.Now we will modify it such that A points to its grandparent.When this path compression is used with weighted-union operation, it will maintain time complexity to log*N which computes number of times finding logN before reaching N reaches to  1.

Run a simple C program :

Output :
Size array
4 0 1 0 1 0
0 and 5 belongs to same subset
1 and 2 are belogns to different subset

Size array tells us number of subsets and which subset has how many number of elements.

Thanks

Leave a Reply