Graph Algorithm: Minimum Spanning Tree (MST) - Kruskal
Definition
A Minimum Spanning Tree (MST) is a spanning tree of a connected, weighted, undirected graph whose total edge weight is as small as possible.
Kruskal’s algorithm is a greedy algorithm used to construct an MST by:
- sorting all edges in non-decreasing order of weight,
- choosing the smallest edge available,
- and adding it only if it does not form a cycle.
A spanning tree must:
- include all vertices,
- contain exactly edges for a graph with vertices,
- and have no cycles.
Main Content
1. Minimum Spanning Tree (MST)
Meaning and structure
- :
An MST is a tree formed from a graph that connects every vertex using the minimum possible total cost. Since it is a tree, it always has no cycles and exactly edges, where is the number of vertices.
If a graph has multiple possible spanning trees, the MST is the one with the least total edge weight.
Example and intuition
- : Suppose 4 cities are connected by roads with different construction costs. To connect all cities with the least expense, we choose roads that connect the cities while avoiding unnecessary extra roads. The resulting network may look like this:
A --1-- B
| |
3 2
| |
C --4-- D
If we choose edges , , and , the total cost is , and all vertices are connected without cycles.
2. Kruskal’s Algorithm
Greedy strategy
- : Kruskal’s algorithm follows a greedy approach. It always tries to pick the smallest available edge first. The idea is that choosing the smallest safe edge at each step leads to an overall minimum spanning tree. This works because locally optimal choices, when made carefully, produce a globally optimal solution for MST.
Cycle prevention and edge selection
- :
The main challenge is avoiding cycles. When the algorithm considers an edge, it checks whether the two vertices of that edge are already connected through previously chosen edges. If they are, adding the edge would create a cycle, so it is skipped.
To support this efficiently, Kruskal’s algorithm commonly uses the Disjoint Set Union (DSU) or Union-Find data structure.
Key idea:
- Sort edges by weight.
- Pick the smallest edge.
- Add it if it does not create a cycle.
- Repeat until edges are selected.
3. Disjoint Set Union (Union-Find) in Kruskal
Purpose
- : Union-Find helps track which vertices belong to the same connected component. This is crucial for detecting whether adding an edge would form a cycle.
Operations used
- :
- Find: determines which set/component a vertex belongs to.
- Union: merges two sets/components when an edge is added.
If two endpoints of an edge belong to the same set, then they are already connected, and adding this edge would create a cycle.
Example of use:
Suppose vertices are in one set, and is alone. If an edge is considered, it is skipped because and are already in the same set. But if edge is considered, it can be added because it joins two different components.
Working / Process
1. Sort all edges by weight
- List every edge in the graph.
- Arrange them in increasing order of weight.
- This ensures the algorithm always considers the cheapest edge next.
2. Pick edges one by one
- Start from the smallest edge.
- For each edge, check whether its endpoints are in different components.
- If yes, include the edge in the MST.
- If no, reject it because it would form a cycle.
3. Stop when the MST is complete
- Continue until exactly edges have been selected.
- The selected edges now form the minimum spanning tree.
- The total weight of these edges is the minimum possible among all spanning trees.
Illustrative example:
Graph edges with weights:
Sorted order:
Selection:
- Add
- Add
- Skip because it forms a cycle
- Add
MST edges:
Total weight = 7
Advantages / Applications
Simple and easy to implement
- : Kruskal’s algorithm is conceptually straightforward. Sorting edges and adding the smallest safe edge is easy to understand and code.
Efficient for sparse graphs
- : It performs very well when the graph has relatively few edges compared to the number of vertices. This makes it a good choice for sparse networks.
Wide real-world applications
-
: It is used in:
-
designing communication networks,
- building road and railway systems,
- clustering in machine learning,
- image segmentation,
- laying electrical or water pipelines,
- reducing cost in network infrastructure.
Summary
- Kruskal’s algorithm is a greedy method for finding a minimum spanning tree.
- It selects edges in increasing order of weight while avoiding cycles.
- It is commonly implemented using Union-Find to detect connected components efficiently.
Important terms to remember
- Minimum Spanning Tree, spanning tree, greedy algorithm, cycle detection, Union-Find, edge sorting