| title | Minimum Spanning Tree | ||||
|---|---|---|---|---|---|
| tags |
|
Given an undirected weighted connected graph
Minimum spanning tree (MST) is a spanning tree in which the sum of edge weights is minimum. The MST of a graph is not unique in general, there might be more than one spanning tree with the same minimum cost. For example, take a graph where all edges have the same weight, then any spanning tree would be a minimum spanning tree. In problems involving minimum spanning trees where you have to output the tree itself (and not just the minimum cost), it either puts more constraint so the answer is unique, or simply asks for any minimum spanning tree.
 MST of the graph. It spans all nodes of the graph and it is connected.To find the minimum spanning tree of a graph, we will introduce two algorithms. The first one called Prim's algorithm, which is similar to Dijkstra's algorithm. Another algorithm is Kruskal agorithm, which makes use of the disjoint set data structure. Let's discover each one of them in detail!
Prim algorithm is very similar to Dijkstra's shortest path algorithm. In this algorithm we have a set
There is a problem with this implementation, it assumes that the graph is connected. If the graph is not connected this algorithm will be stuck on loop. There is a good visualization for Prim algorithm at [10]. If we use priority queue complexity would be
In Prim algorithm we started with a specific node and then proceeded with choosing the closest neighbor node to our current graph. In Kruskal algorithm, we follow a different strategy; we start building our MST by choosing one edge at a time, and link our (intially separated) nodes together until we connect all of the graph.
To achieve this task, we will start with having all the nodes separated each in a group. In addition, we will have the list of edges from the original graph sorted based on their cost. At each step, we will:
- Pick the smallest available edge (that is not taken yet)
- Link the nodes it connects together, by merging their group into one unified group
- Add the cost of the edge to our answer
However, you may realize in some cases the link we add will connect two nodes from the same group (because they were grouped before by other taken edges), hence violating the spanning tree condition (Acyclic) and more importantly introducing unnecessary edges that adds more cost to the answer. So to solve this problem, we will only add the edges as long as they connect two currently (at the time of processing this edge) separated nodes that belong to different groups, hence completing the algorithm.
The optimality of Kruskal algorithm comes from the fact that we are taking from a sorted list of edges. For more rigorous proof please refer to [11].
So how can we effectively merge the group of nodes and check that which group each node belong? We can utilize disjoint set data structure which will help us to make union and find operations in an amortized constant
typedef pair<int,pair<int,int>> edge;
// represent edge as triplet (w,u,v)
// w is weigth, u and v verticies.
// edge.first is weigth edge.second.first -> u, edge.second.second -> v
typedef vector<edge> weigthed_graph;
/*union - find data structure utilities */
const int maxN = 3005;
int parent[maxN];
int ssize[maxN];
void make_set(int v);
int find_set(int v);
void union_sets(int a, int b);
void init_union_find();
/*Code that finds edges in MST */
void kruskal(vector<edge> &edgeList ){
vector<edge> mst;
init_union_find();
sort(edgeList.begin(),edgeList.end(), \
[](const auto &a, const auto &b) { return a.first< b.first;});
//well this weird syntax is lambda function
// for sorting pairs to respect their first element.
for( auto e: edgeList){
if( find_set(e.second.first )!= find_set(e.second.second)){
mst.push_back(e);
union_sets(e.second.first, e.second.second);
}
}
}To calculate the time complexity, observe how we first sorted the edges, this takes
So in total we have