COT 5405 Advanced Algorithms Chris Lacher Notes 10: Graph Algorithms 2
Assume (G,w) is a connected weighted undirected graph and A is a subgraph of G. A safe edge for A is an edge e of G such that A + {e} is a subset of a MST for (G,w).
MST Algorithm Pattern:

  graph A; // starts out with no edges; becomes an MST
  while (A is not a MST for G)
  {
    // loop invariant: A is a subset of a MST for G
    find safe edge e for A
    add e to A (along with the vertices of e)
  }
  return A
Note that when the loop terminates, A is a MST for G. We need to verify the loop invariant by induction. The base case is trivial, because a graph with no edges is a subset of every MST. The inductive step is embodied in the definition of safe edge: a safe edge for A is an edge e such that A union {e} is again a subset of a MST.
Theorem: Assume that G = (V,E) is a connected undirected graph with weight function w:E-->Reals. Let A be a subgraph of a MST for G. If (S, V - S) is a cut of G that respects A (that is, no edge of A crosses the cut) and e=(u,v) is a minimum weight edge crossing the cut, then e is safe for A.
Proof: Let T be a MST containing A. If e is in T, by definition e is safe.
Suppose e=(u,v) is not in T. Then, because T spans G, there is a path P in T connecting u and v. Because u and v are on opposite sides of the cut, some edge e' = (x,y) in P also crosses the cut. Let T' be the subgraph obtained by removing e' from T and adding e:
T' = T - {e'} + {e}.
The edge e' connects two components of T - {e'}, one containing u and the other v. Thus adding e reconnects these components, forming a new spanning tree. Moreover,
w(T') = w(T) - w(e') + w(e) <= w(T)
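because w(e) is minimal among edges crossing the cut, so w(e) <= w(e'). Since by assumption w(T) is minimal, w(T') is also minimal, so T' is a MST. Because the cut respects A, the edge e' is not in A, so A is a subset of T - {e'} and hence A + {e} is a subset of T'. We have shown that e is safe for A.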
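The edge selection in the theorem can be sketched in a few lines of C++. This is only an illustration: the names Edge, LightEdge, and inS are hypothetical and do not appear elsewhere in these notes, and the cut is assumed to be given as a membership flag per vertex. The function scans every edge and keeps a lightest one whose endpoints lie on opposite sides of the cut.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical edge record: two endpoint indices and a weight.
struct Edge {
    int u, v;
    double w;
};

// Returns the index (into edges) of a minimum-weight edge crossing the
// cut (S, V - S), or -1 if no edge crosses it.
// inS[x] is true exactly when vertex x belongs to S.
int LightEdge(const std::vector<Edge>& edges, const std::vector<bool>& inS) {
    int best = -1;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        const Edge& e = edges[i];
        // An edge crosses the cut when its endpoints are on opposite sides.
        const bool crosses = inS[e.u] != inS[e.v];
        if (crosses && (best < 0 || e.w < edges[best].w)) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

If the cut respects A, the theorem says the edge returned here is safe for A. The linear scan costs O(e) per call; Kruskal and Prim avoid repeating it by organizing the edges ahead of time.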
Corollary: A connected undirected weighted graph has a minimum spanning tree.
We have now shown that a connected undirected graph has a MST. If the graph has n vertices, then so must every MST. By tree theory, the MST must have n-1 edges. These two facts can be used to simplify the loop structure of the algorithm pattern, as follows:
MST Algorithm Pattern:

  graph A; // starts out with no edges; becomes an MST
  int n = number of vertices of G
  for (i = 1; i < n; ++i)
  {
    find safe edge e for A
    add e to A (along with the vertices of e)
  }
  return A
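Two specific MST algorithms depend on the way a safe edge is chosen: Kruskal's algorithm chooses a least-weight edge joining two distinct trees of the forest A, and Prim's algorithm chooses a least-weight edge joining the single growing tree A to a vertex outside it.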
To prove Kruskal is correct, apply Safe Edge Theorem with S = vertices of one tree in the forest.
To prove Prim is correct, apply Safe Edge Theorem with S = vertices of the growing tree.
// Kruskal's MST Algorithm
// G = (V,E,w) is a connected weighted undirected graph

// resources:
int n;                              // number of vertices of G
MinPriorityQueue < EdgeType > E;    // edges of G, prioritized by minimum weight
Set < EdgeType > F;                 // edges of Kruskal's forest
Vector < Set < VertexType > > C;    // vertices of the forest, organized by component

// initialization:
for (each edge e of G)              // (1)
  E.Push(e);                        // ensures we encounter edges by non-decreasing weight
for (x = 0; x < n; ++x)             // (2)
  C[x].Insert(x);                   // start with each vertex its own component

// run:
while (!E.Empty())                  // (3)
{
  e = E.Pop();                      // (4) e = (x,y) is a lightest remaining edge
  cx = index of the component in C that contains x;
  cy = index of the component in C that contains y;
  if (cx != cy)                     // (5) x and y are not connected in F
  {
    F.Insert(e);                    // (6) add e to Kruskal's forest
    C[cx] += C[cy];                 // (7) union C[cy] into C[cx], since connected by e
    C[cy].Clear();                  // (8) make C[cy] empty
  }
}
// F = edges of a MST for G
Runtime: O(e log e) = O(e log n) [n = number of vertices, e = number of edges]
Loop runtime <= O((e + n)(log e + log n)) = O(e log e) because n <= e+1. Note also that Θ(log n) = Θ(log e), because n - 1 <= e <= n^2, which gives the alternate statement.
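For comparison, here is a compact C++ sketch of Kruskal's algorithm. It is not the structure used in the pseudocode above: the vector of vertex sets is replaced by a disjoint-set forest (union by size with path halving), and the priority queue by a single sort, which encounters edges in the same non-decreasing order. The names WeightedEdge, Partition, and Kruskal are hypothetical, and vertices are assumed to be numbered 0..n-1.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical edge record: endpoints x, y and weight w.
struct WeightedEdge {
    int x, y;
    double w;
};

// Disjoint-set forest standing in for the component sets C.
class Partition {
public:
    explicit Partition(int n) : parent_(n), size_(n, 1) {
        std::iota(parent_.begin(), parent_.end(), 0);  // each vertex alone
    }
    // Representative of the component containing x (with path halving).
    int Find(int x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];
            x = parent_[x];
        }
        return x;
    }
    // Merges the components of x and y; returns false if already joined.
    bool Union(int x, int y) {
        x = Find(x);
        y = Find(y);
        if (x == y) return false;
        if (size_[x] < size_[y]) std::swap(x, y);
        parent_[y] = x;  // attach the smaller tree under the larger
        size_[x] += size_[y];
        return true;
    }
private:
    std::vector<int> parent_;
    std::vector<int> size_;
};

// Returns the edges of a minimum spanning forest of the given graph;
// for a connected graph this is a MST with n-1 edges.
std::vector<WeightedEdge> Kruskal(int n, std::vector<WeightedEdge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const WeightedEdge& a, const WeightedEdge& b) {
                  return a.w < b.w;
              });
    Partition components(n);
    std::vector<WeightedEdge> forest;
    for (const WeightedEdge& e : edges) {
        if (forest.size() + 1 == static_cast<std::size_t>(n)) break;  // tree complete
        if (components.Union(e.x, e.y)) forest.push_back(e);  // e is safe
    }
    return forest;
}
```

The sort costs O(e log e) and dominates the nearly linear cost of the union-find operations, so this sketch stays within the O(e log e) = O(e log n) bound stated above.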
class Prim // minor changes from BFS - could be template parameter
{
private: // reference to structure being searched
  GraphBase& g_;   // adjacency list representation (vertices indexed 0,1,...n-1)
  Vertex     s_;   // starting search here

private: // control variables
  Vector < ColorType >        color_;
  // Queue < Vertex >         conQueue_;
  MinPriorityQueue < Vertex > conQueue_; // by distance_

public: // informational variables
  Vector < Vertex > parent_;   // = parent in the tree being grown (BFS tree in the BFS version)
  Vector < int >    distance_; // = weight of the lightest known edge joining the vertex to the tree

public: // methods
  void Init()
  {
    for (each vertex v of g_ except possibly s_)   // (1)
    {
      color_[v] = white;
      distance_[v] = infinity;
      parent_[v] = NIL;
    }
    color_[s_] = gray;
    distance_[s_] = 0;
    parent_[s_] = NIL;
    conQueue_.MakeEmpty();
    conQueue_.Push(s_);
  }

  void Run()
  {
    while (!conQueue_.Empty())                     // (2)
    {
      u = conQueue_.Pop();                         // (3)
      if (color_[u] == black) continue;            // stale entry: u is already in the tree
      for each v in g_.ADJ[u]                      // (4)
      {
        // comparison (==, !=), not assignment; v must still be outside the tree
        if (color_[v] != black && w(u,v) < distance_[v])
        {
          color_[v] = gray;
          // distance_[v] = distance_[u] + 1;      // BFS version
          distance_[v] = w(u,v);                   // Prim's key is the edge weight, not a path length
          parent_[v] = u;
          conQueue_.Push(v);                       // (5) or decrease the key if v is already queued
        } // if
      } // for
      color_[u] = black;
    } // while
  }
};
Runtime: O(e log n)
Thus the total cost of the entire algorithm is O(n log n + e log n) = O(e log n). One final note: if we use a Fibonacci heap to implement the priority queue, we can improve this runtime to O(e + n log n), which is better for non-sparse graphs [where e grows faster than n].
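A runnable C++ sketch of Prim's algorithm follows, under a few stated assumptions. std::priority_queue has no decrease-key operation, so this version pushes a fresh entry whenever a vertex's key improves and discards stale entries when they are popped; that is a substitute for the keyed priority queue in the class above, not the same structure. The names Adjacency and PrimParents are hypothetical, and the graph is assumed to be an adjacency list of (neighbor, weight) pairs with each undirected edge stored in both directions.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (v, w) pairs, one for each edge (u,v) of weight w.
using Adjacency = std::vector<std::vector<std::pair<int, double>>>;

// Returns parent[v] for every vertex: the tree edges are (parent[v], v).
// parent[s] is -1, as is parent[v] for any vertex unreachable from s.
std::vector<int> PrimParents(const Adjacency& adj, int s) {
    const double inf = std::numeric_limits<double>::infinity();
    const std::size_t n = adj.size();
    std::vector<double> key(n, inf);     // lightest known edge into the tree
    std::vector<int> parent(n, -1);
    std::vector<bool> inTree(n, false);  // plays the role of color black

    using Entry = std::pair<double, int>;  // (key, vertex), smallest key first
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> queue;

    key[s] = 0.0;
    queue.push({0.0, s});
    while (!queue.empty()) {
        const int u = queue.top().second;
        queue.pop();
        if (inTree[u]) continue;  // stale entry left by an earlier push
        inTree[u] = true;
        for (const std::pair<int, double>& vw : adj[u]) {
            const int v = vw.first;
            const double w = vw.second;
            // Prim compares the edge weight alone, not a path length.
            if (!inTree[v] && w < key[v]) {
                key[v] = w;
                parent[v] = u;
                queue.push({w, v});
            }
        }
    }
    return parent;
}
```

Because each edge can push at most two entries, the queue never holds more than O(e) items and the sketch runs in O(e log e) = O(e log n), matching the binary-heap bound above rather than the Fibonacci-heap bound.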