Union-Find (Disjoint Set)

The Union-Find or Disjoint Set data structure is a data structure that keeps track of a partition of elements into disjoint sets. It supports two primary operations:

  1. Union: Merges two sets containing different elements.
  2. Find: Determines which set a particular element is in.

Union-Find is commonly used to solve problems related to connectivity in graphs, such as checking if two elements are in the same component, detecting cycles, or finding connected components. The data structure can be optimized using Quick Find, Quick Union, and Path Compression techniques.

Basic Union-Find Structure

We represent each element with a unique identifier and maintain two main arrays:

  1. Parent Array: Each element points to its parent. If it points to itself, it’s the root of its set.
  2. Size or Rank Array: This array keeps track of the “size” or “rank” of each set to optimize unions.

Quick Find

Quick Find is an approach where each element directly points to the root of its set. Thus, the find operation is fast (O(1)), but union can be slow (O(n)) because we might need to update multiple elements to ensure they point to the new root.

Quick Find: Find and Union Operations

  • Find: Simply returns the root (or parent) of the element since each element directly points to the root.
  • Union: To merge two sets, iterate over the parent array and set the parent of every element in one set to the root of the other set.
class QuickFindUF:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, p):
        return self.parent[p]

    def union(self, p, q):
        root_p = self.find(p)
        root_q = self.find(q)
        
        if root_p != root_q:
            for i in range(len(self.parent)):
                if self.parent[i] == root_p:
                    self.parent[i] = root_q

Explanation:

  • Find: Constant time, as each element directly points to its root.
  • Union: Potentially slow, as we might have to update many elements to change their roots.

Drawbacks: Quick Find is inefficient for large datasets since union is O(n).

Quick Union

In Quick Union, each element points to a parent, but only the root element points to itself. This approach allows for faster unions, but find operations can be slow if the tree becomes deep.

Quick Union: Find and Union Operations

Find: Traverse up the tree until the root (an element pointing to itself) is found.

Union: Attach the root of one set to the root of the other set.

class QuickUnionUF:
    def __init__(self, n):
        self.parent = list(range(n))
    
    def find(self, p):
        # Follow the chain of parent pointers until reaching the root
        while p != self.parent[p]:
            p = self.parent[p]
        return p
    
    def union(self, p, q):
        root_p = self.find(p)
        root_q = self.find(q)
        
        if root_p != root_q:
            self.parent[root_p] = root_q

Explanation:

  • Find: Traverses up to the root, which can be slow for deep trees.
  • Union: Faster than Quick Find since we only need to update the root of one tree.

Drawbacks: - Trees can become very deep, causing search operations to be slow.

Optimized Union-Find: Path Compression and Union by Rank

To optimize Quick Union, we use Path Compression and Union by Rank. These techniques keep the trees shallow, making both find and union operations almost constant time (O(α(n)), where α is the inverse Ackermann function, which grows very slowly).

Path Compression

Path Compression flattens the tree by making each node in the path point directly to the root. This speeds up future find operations because every node visited will now directly point to the root.

def find(self, p):
    if p != self.parent[p]:
        self.parent[p] = self.find(self.parent[p])  # Path compression
    return self.parent[p]

Union by Rank

Instead of always attaching one tree to another, Union by Rank attaches the smaller tree to the larger one. We maintain a rank array, where the rank represents the depth of the tree.

  1. Attach the root of the smaller rank tree to the root of the larger rank tree.
  2. If both trees have the same rank, attach one to the other and increase the rank by 1.

Complete Optimized Union-Find Implementation

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n  # Rank array for Union by Rank

    def find(self, p):
        if p != self.parent[p]:
            self.parent[p] = self.find(self.parent[p])  # Path compression
        return self.parent[p]

    def union(self, p, q):
        root_p = self.find(p)
        root_q = self.find(q)
        
        if root_p != root_q:
            # Union by Rank
            if self.rank[root_p] < self.rank[root_q]:
                self.parent[root_p] = root_q
            elif self.rank[root_p] > self.rank[root_q]:
                self.parent[root_q] = root_p
            else:
                self.parent[root_q] = root_p
                self.rank[root_p] += 1

Explanation:

  • Find with Path Compression: Flattens the tree by making nodes point directly to the root, improving future accesses.
  • Union by Rank: Balances the tree by attaching the shorter tree to the root of the taller tree, keeping trees shallow.

Efficiency: Both find and union operations run in nearly constant time, O(α(n)), where α is the inverse Ackermann function, which is practically constant for all reasonable inputs.

Practical Applications of Union-Find

  1. Detecting Cycles in Graphs: Union-Find can detect cycles in an undirected graph by checking if two vertices are already in the same set.
  2. Kruskal’s Algorithm for Minimum Spanning Tree: Used to check and merge components in Kruskal’s MST algorithm.
  3. Connected Components in Graphs: Union-Find can efficiently track which nodes are in the same connected component.
  4. Social Network Connectivity: Finding if two people are connected through some direct or indirect friendship.

Example Problem: Detecting Cycles in an Undirected Graph

Using Union-Find with path compression and union by rank, we can detect cycles in an undirected graph by iterating over edges and checking if two vertices belong to the same set.

class Graph:
    def __init__(self, vertices):
        self.vertices = vertices
        self.edges = []
    
    def add_edge(self, u, v):
        self.edges.append((u, v))
    
    def has_cycle(self):
        uf = UnionFind(self.vertices)
        for u, v in self.edges:
            if uf.find(u) == uf.find(v):  # Same root implies a cycle
                return True
            uf.union(u, v)
        return False

# Example usage
graph = Graph(5)
graph.add_edge(0, 1)
graph.add_edge(1, 2)
graph.add_edge(2, 3)
graph.add_edge(3, 4)
graph.add_edge(4, 0)  # Adding a cycle

print("Cycle Detected:" if graph.has_cycle() else "No Cycle Detected")
# Output: Cycle Detected

Explanation: Each edge is processed to check if it forms a cycle. If two nodes are already connected, adding the edge would create a cycle.

Track your progress

Mark this subtopic as completed when you finish reading.