Published on

Minimum Spanning Tree - Kruskal and Prim Algorithms

Authors
  • avatar
    Name
    Balaram Shiwakoti
    Twitter

Minimum Spanning Tree was initially one of those graph problems that seemed abstract to me during DSA preparation. I kept thinking, "Why do we need to connect all vertices with minimum cost?" But once I understood real-world applications like network design, circuit layout, and clustering, the importance became clear. Let me share what I've learned about these fundamental graph algorithms.

Introduction to Minimum Spanning Tree

When I first encountered MST problems, I was confused about why we needed special algorithms. But then I realized that finding the minimum cost to connect all vertices in a graph is a fundamental optimization problem with countless practical applications.

Minimum Spanning Tree (MST) is a subset of edges in a connected, weighted graph that connects all vertices with the minimum total edge weight, without forming any cycles.

MST Properties and Definitions

Spanning Tree Properties

For a graph G = (V, E) with n vertices:

Spanning Tree characteristics:

  • Contains exactly n-1 edges
  • Connects all n vertices
  • Contains no cycles (acyclic)
  • Removing any edge disconnects the graph
  • Adding any edge creates exactly one cycle

Minimum Spanning Tree Properties

MST characteristics:

  • Among all possible spanning trees, has minimum total weight
  • May not be unique (multiple MSTs with same weight possible)
  • Unique if all edge weights are distinct
  • Always exists for connected graphs

Example:

Graph with 4 vertices:
    A ---- 1 ---- B
    |             |
    2             3
    |             |
    C ---- 4 ---- D

Possible spanning trees:
1. Edges: AB(1), AC(2), BD(3) - Total: 6
2. Edges: AB(1), AC(2), CD(4) - Total: 7
3. Edges: AB(1), BD(3), CD(4) - Total: 8

MST: AB(1), AC(2), BD(3) with total weight 6

I remember drawing all possible spanning trees for small graphs to understand this concept!

Kruskal's Algorithm

Algorithm Overview

Kruskal's approach: Greedy algorithm that builds MST by adding edges in order of increasing weight, avoiding cycles.

Key insight: Always choose the minimum weight edge that doesn't create a cycle.

Algorithm Steps

  1. Sort all edges by weight in ascending order
  2. Initialize each vertex as a separate component
  3. For each edge in sorted order:
    • If edge connects different components, add to MST
    • Union the components
  4. Stop when MST has n-1 edges

Implementation Overview

Key Components:

  1. Union-Find Data Structure:

    • Tracks connected components to detect cycles
    • Uses path compression and union by rank for efficiency
    • Essential for Kruskal's algorithm
  2. Algorithm Steps:

    • Sort all edges by weight in ascending order
    • Initialize Union-Find with each vertex as separate component
    • Process edges in order, adding to MST if no cycle created
    • Stop when MST has exactly V-1 edges
  3. Cycle Detection:

    • Use Union-Find to check if vertices are in same component
    • If same component, adding edge would create cycle
    • If different components, safe to add edge and union components

Kruskal Example Walkthrough

Graph:

    A ---- 1 ---- B
    |             |
    2             3
    |             |
    C ---- 4 ---- D

Step-by-step execution:

Initial state:

  • Sorted edges: [(1,A,B), (2,A,C), (3,B,D), (4,C,D)]
  • Components: {A}, {B}, {C}, {D}

Step 1: Process edge (1,A,B)

  • A and B in different components ✓
  • Add to MST: [(A,B,1)]
  • Union A and B: {A,B}, {C}, {D}

Step 2: Process edge (2,A,C)

  • A and C in different components ✓
  • Add to MST: [(A,B,1), (A,C,2)]
  • Union A and C: {A,B,C}, {D}

Step 3: Process edge (3,B,D)

  • B and D in different components ✓
  • Add to MST: [(A,B,1), (A,C,2), (B,D,3)]
  • Union B and D: {A,B,C,D}

Result: MST with edges [(A,B,1), (A,C,2), (B,D,3)], total weight = 6

Prim's Algorithm

Algorithm Overview

Prim's approach: Greedy algorithm that grows MST one vertex at a time, always adding the minimum weight edge that connects the current tree to a new vertex.

Key insight: Start with any vertex and keep adding the cheapest edge to expand the tree.

Algorithm Steps

  1. Start with any vertex in the MST
  2. Repeat until all vertices are included:
    • Find minimum weight edge connecting MST to non-MST vertex
    • Add this edge and vertex to MST
  3. Result is the complete MST

Implementation Overview

Key Components:

  1. Priority Queue (Min-Heap):

    • Stores edges with their weights as priority
    • Always extracts minimum weight edge
    • Essential for efficient vertex selection
  2. Visited Set:

    • Tracks vertices already included in MST
    • Prevents cycles and duplicate processing
    • Ensures each vertex added exactly once
  3. Algorithm Steps:

    • Start with any vertex (typically vertex 0)
    • Add all edges from current MST to priority queue
    • Extract minimum weight edge connecting to new vertex
    • Add new vertex and edge to MST, repeat until complete
  4. Two Implementation Approaches:

    • Adjacency List + Heap: Better for sparse graphs, O(E log V)
    • Adjacency Matrix + Array: Better for dense graphs, O(V²)

Prim Example Walkthrough

Graph: Same as Kruskal example

Step-by-step execution starting from vertex A:

Initial state:

  • MST: {A}
  • Available edges: [(A,B,1), (A,C,2)]
  • Min heap: [(1,A,B), (2,A,C)]

Step 1: Add edge (A,B,1)

  • MST: {A,B}
  • New available edges: [(B,D,3)]
  • Min heap: [(2,A,C), (3,B,D)]

Step 2: Add edge (A,C,2)

  • MST: {A,B,C}
  • New available edges: [(C,D,4)]
  • Min heap: [(3,B,D), (4,C,D)]

Step 3: Add edge (B,D,3)

  • MST: {A,B,C,D}
  • All vertices included

Result: Same MST as Kruskal with total weight = 6

Algorithm Comparison

Time Complexity

AlgorithmTime ComplexitySpace Complexity
KruskalO(E log E)O(V)
Prim (Binary Heap)O(E log V)O(V)
Prim (Fibonacci Heap)O(E + V log V)O(V)
Prim (Adjacency Matrix)O(V²)O(V)

Where V = vertices, E = edges

When to Use Each Algorithm

Use Kruskal when:

  • Graph is sparse (E << V²)
  • Edges are already sorted
  • Working with edge list representation
  • Need to find MST incrementally

Use Prim when:

  • Graph is dense (E ≈ V²)
  • Working with adjacency matrix/list
  • Need to start from specific vertex
  • Memory is limited

I found that understanding these trade-offs was crucial for choosing the right algorithm!

Applications of MST

Network Design

Telecommunications:

  • Design minimum cost network to connect all cities
  • Minimize cable laying costs
  • Ensure connectivity with least infrastructure

Computer Networks:

  • Design LAN topology
  • Minimize network latency
  • Reduce infrastructure costs

Circuit Design

VLSI Design:

  • Connect components with minimum wire length
  • Reduce signal delay
  • Minimize manufacturing cost

Printed Circuit Boards:

  • Route connections efficiently
  • Minimize board size
  • Reduce electromagnetic interference

Clustering and Approximation

Data Clustering:

  • Build hierarchical clusters
  • Find natural groupings in data
  • Approximate solutions to TSP

Image Segmentation:

  • Segment images based on pixel similarity
  • Computer vision applications
  • Medical image analysis

Real-world Example

Problem: Connect 5 cities with minimum road construction cost

Given data:

  • Cities: A, B, C, D, E
  • Construction costs (in millions):
    • A-B: 10M,AC:10M, A-C: 15M, A-D: $20M
    • B-C: 25M,BE:25M, B-E: 30M
    • C-D: 35M,CE:35M, C-E: 40M
    • D-E: $45M

Solution approach:

  1. Apply Kruskal's algorithm to find minimum spanning tree
  2. Sort edges by cost: A-B(10M),AC(10M), A-C(15M), A-D(20M),BC(20M), B-C(25M), B-E($30M)...
  3. Select edges that don't create cycles
  4. Result: A-B, A-C, A-D, B-E with total cost $75M

Advanced Topics

MST Variants

Minimum Bottleneck Spanning Tree:

  • Minimize the maximum edge weight
  • Different from MST but related
  • Applications in network reliability

Degree-Constrained MST:

  • Limit maximum degree of vertices
  • NP-hard problem
  • Approximation algorithms exist

Dynamic MST:

  • Handle edge insertions/deletions
  • Maintain MST efficiently
  • Advanced data structures required

MST Properties and Theorems

Cut Property:

  • For any cut, minimum weight crossing edge is in some MST
  • Foundation for correctness proofs

Cycle Property:

  • For any cycle, maximum weight edge is not in MST
  • Used in optimization algorithms

Uniqueness:

  • MST is unique if all edge weights are distinct
  • Multiple MSTs possible with equal weights

Implementation Optimizations

Kruskal Optimizations

Key optimization techniques:

  • Early termination: Stop when MST has V-1 edges
  • Efficient sorting: Use counting sort for small weight ranges
  • Union-Find optimizations: Path compression and union by rank
  • Edge preprocessing: Remove duplicate edges, keep minimum weights

Prim Optimizations

Key optimization techniques:

  • Efficient priority queue: Use binary heap for O(log V) operations
  • Lazy deletion: Mark vertices as visited instead of removing from heap
  • Dense graph optimization: Use adjacency matrix with simple array for key values
  • Fibonacci heap: Theoretical improvement to O(E + V log V) for very dense graphs

Common Pitfalls and Tips

1. Graph Connectivity

Problem: MST requires connected graph Solution: Check connectivity before applying MST algorithms Method: Use DFS/BFS to verify all vertices are reachable Quick check: Graph needs at least V-1 edges to be connected

2. Handling Duplicate Edges

Problem: Multiple edges between same vertices with different weights Solution: Keep only the edge with minimum weight Method: Use dictionary to track minimum weight for each vertex pair Important: Ensure consistent edge representation (u,v) vs (v,u)

3. Integer Overflow

Problem: Large edge weights can cause overflow in total weight calculation Solution: Use appropriate data types and check for overflow Prevention: Validate input ranges and use 64-bit integers when needed Detection: Check if adding next weight would exceed maximum value

Practice Problems

Here are some problems I found helpful:

  1. Connecting Cities - Basic MST application
  2. Network Delay Time - Modified MST problem
  3. Min Cost to Connect All Points - Coordinate-based MST
  4. Optimize Water Distribution - Real-world MST application
  5. Critical Connections - Bridge finding (related to MST)

Conclusion

Minimum Spanning Tree algorithms are fundamental tools for optimization problems involving connectivity. Key takeaways:

Core Concepts:

  • MST connects all vertices with minimum total weight
  • Greedy algorithms work due to optimal substructure
  • Two main approaches: edge-based (Kruskal) and vertex-based (Prim)

Algorithm Choice:

  • Kruskal: Better for sparse graphs, edge-list representation
  • Prim: Better for dense graphs, adjacency representation
  • Both achieve same result with different approaches

Applications:

  • Network design and optimization
  • Circuit layout and VLSI design
  • Clustering and approximation algorithms
  • Infrastructure planning

For loksewa preparation, focus on:

  • Understanding both Kruskal and Prim algorithms
  • Implementing Union-Find for cycle detection
  • Analyzing time and space complexity
  • Recognizing MST applications in real problems
  • Comparing algorithm trade-offs

Remember: MST algorithms demonstrate the power of greedy approaches when optimal substructure exists. The key insight is that locally optimal choices (minimum weight edges) lead to globally optimal solutions when cycles are avoided.

Understanding MST algorithms provides a foundation for more advanced graph algorithms and optimization techniques!