The UnionFind algorithm groups elements into sets based on a specified property. Each set is non-overlapping, containing unique elements not present in any other set. The pattern uses a disjoint set data structure, such as an array, to keep track of which set each element belongs to.
Each set forms a tree data structure and has a representative element that resides at the root of the tree. Every element in this tree maintains a pointer to its parent. The representative’s parent pointer points to itself. We’ll always reach the set representative if we pick any element in a set and follow its parent pointers.
The algorithm is composed of two methods performed on the disjoint data structure:
find(v)
: This method finds the representative of the set that contains v
.
union(v1, v2)
: This method merges the sets containing v1
and v2
into one set.
In the find(v)
method, we recursively find the parent node of v
by traversing the intermediate nodes sequentially. Supposing we have nodes, this operation can take time in the worst case, where the nodes form a degenerate tree.
We optimize the find(v)
method using path compression to avoid repeatedly traversing the intermediate nodes. On each find(v)
method on a node of a tree, we update that node’s parent to point directly to the root node. This reduces the length of the path of that node to the root, ensuring we don’t have to travel all the intermediate nodes on future find(v)
methods.
Due to this optimization, the find(v)
method will now take when it is called on any previously traversed nodes.
The slides below illustrate how path compression works:
Let’s take a look at a Python implementation for path compression:
class UnionFind:def __init__(self, n):# Initialize the parent array with length nself.parent = [i for i in range(n)]def find(self, v):# The below line is only for printing purposesprint("\tfind(" + str(v) + ")")# If x is not the parent of itself, recursively search for its parentif v != self.parent[v]:self.parent[v] = self.find(self.parent[v])return self.parent[v]def union(self, v1, v2):# Find the parent of x and yp1, p2 = self.find(v1), self.find(v2)# If x and y are already in the same connected component, do nothing# If x and y are not in the same connected componentif p1 != p2:self.parent[p1] = p2# Driver codedef main():nodes = UnionFind(6)nodes.parent = [1, 2, 3, 4, 5, 5]print("parent array before path compression:", nodes.parent)print("\tFinding the root node of 0:")print("\tThe root node of 0 is:", nodes.find(0))print("\nparent array after path compression:", nodes.parent)print("\tFinding the root node of 0:")print("\tThe root node of 0 is:", nodes.find(0))if __name__ == "__main__":main()
In the find(v)
method:
find(v)
method, if the current value of v
is not equal to its respective index, we recursively search for the root parent of v
. After a node matching its respective index has been found, we set all traversed nodes in the parent
array equal to this found node. Finally, we return this node.In the main
function (driver code):
Lines 30–32: We initialize a UnionFind
object with a parent
array of six nodes. We reinitialize the parent
array to the nodes in the example above.
Lines 27–29: We print the parent
array before path compression is applied array and find the root parent of node 0
through the find(0)
method.
Lines 34–36: We print the parent
array after path compression is applied array and again find the root parent of node 0
through the find(0)
method.
Free Resources