What is the UnionFind optimization for path compression?

The UnionFind algorithm groups elements into sets based on a specified property. Each set is non-overlapping, containing unique elements not present in any other set. The pattern uses a disjoint set data structure, such as an array, to keep track of which set each element belongs to.

Each set forms a tree data structure and has a representative element that resides at the root of the tree. Every element in this tree maintains a pointer to its parent. The representative’s parent pointer points to itself. We’ll always reach the set representative if we pick any element in a set and follow its parent pointers.

The algorithm is composed of two methods performed on the disjoint data structure:

find(v): This method finds the representative of the set that contains v.

union(v1, v2): This method merges the sets containing v1 and v2 into one set.

Path compression

In the find(v) method, we recursively find the parent node of v by traversing the intermediate nodes sequentially. Supposing we have nn nodes, this operation can take O(n)O(n) time in the worst case, where the nodes form a degenerate tree.

We optimize the find(v) method using path compression to avoid repeatedly traversing the intermediate nodes. On each find(v) method on a node of a tree, we update that node’s parent to point directly to the root node. This reduces the length of the path of that node to the root, ensuring we don’t have to travel all the intermediate nodes on future find(v) methods.

Due to this optimization, the find(v) method will now take O(1)O(1) when it is called on any previously traversed nodes.

The slides below illustrate how path compression works:

canvasAnimation-image
1 of 10

Example

Let’s take a look at a Python implementation for path compression:

class UnionFind:
def __init__(self, n):
# Initialize the parent array with length n
self.parent = [i for i in range(n)]
def find(self, v):
# The below line is only for printing purposes
print("\tfind(" + str(v) + ")")
# If x is not the parent of itself, recursively search for its parent
if v != self.parent[v]:
self.parent[v] = self.find(self.parent[v])
return self.parent[v]
def union(self, v1, v2):
# Find the parent of x and y
p1, p2 = self.find(v1), self.find(v2)
# If x and y are already in the same connected component, do nothing
# If x and y are not in the same connected component
if p1 != p2:
self.parent[p1] = p2
# Driver code
def main():
nodes = UnionFind(6)
nodes.parent = [1, 2, 3, 4, 5, 5]
print("parent array before path compression:", nodes.parent)
print("\tFinding the root node of 0:")
print("\tThe root node of 0 is:", nodes.find(0))
print("\nparent array after path compression:", nodes.parent)
print("\tFinding the root node of 0:")
print("\tThe root node of 0 is:", nodes.find(0))
if __name__ == "__main__":
main()
Example demonstrating the use of path compresison

Explanation

In the find(v) method:

  • Lines 7–14: In the find(v) method, if the current value of v is not equal to its respective index, we recursively search for the root parent of v. After a node matching its respective index has been found, we set all traversed nodes in the parent array equal to this found node. Finally, we return this node.

In the main function (driver code):

  • Lines 30–32: We initialize a UnionFind object with a parent array of six nodes. We reinitialize the parent array to the nodes in the example above.

  • Lines 27–29: We print the parent array before path compression is applied array and find the root parent of node 0 through the find(0) method.

  • Lines 34–36: We print the parent array after path compression is applied array and again find the root parent of node 0 through the find(0) method.

Free Resources

Copyright ©2025 Educative, Inc. All rights reserved