Version 07/19/19

Graphs 2: Edge-Weighted Graphs

1 Edge Objects

template < typename N >
class Edge
{
public:
  typedef N Vertex;                      // an unsigned integer type
  Edge ();                               // default constructor so arrays are possible
  Edge (Vertex x, Vertex y, double w);   // preferred initializing constructor
  Vertex x_, y_;
  double w_;
};
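A minimal sketch of how the Edge template might be completed. The namespace sketch and the weight-based comparison operators are assumptions made for illustration (they are not taken from the fsu library); operators of this kind are what would let a predicate such as fsu::GreaterThan<Edge> order edges by weight.

```cpp
namespace sketch  // hypothetical namespace, not fsu
{
  template < typename N >
  class Edge
  {
  public:
    typedef N Vertex;
    Edge () : x_(0), y_(0), w_(0.0) {}                          // the [0,0,0.0] edge
    Edge (Vertex x, Vertex y, double w) : x_(x), y_(y), w_(w) {}
    Vertex x_, y_;
    double w_;
  };

  // Assumption: edges are compared by weight only.
  template < typename N >
  bool operator< (const Edge<N>& a, const Edge<N>& b) { return a.w_ < b.w_; }

  template < typename N >
  bool operator> (const Edge<N>& a, const Edge<N>& b) { return a.w_ > b.w_; }
}
```

With these operators two edges of equal weight are neither less than nor greater than each other, which is all a priority queue needs.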

2 Edge-Weighted Graph Classes

It is fairly straightforward to enhance the Graph APIs to handle edge weights:

template < typename N >
Weighted Graph API
{
public:
  typedef N                                Vertex;
  typedef typename ?????????????????       SetType;
  typedef typename SetType::ConstIterator  AdjIterator;

  void     SetVrtxSize (N n);
  size_t   VrtxSize    () const;
  void     AddEdge     (Vertex from, Vertex to, double weight = 1.0);
  bool     HasEdge     (Vertex from, Vertex to, double& wt) const; // sets wt variable if edge exists
  Edge<N>  GetEdge     (Vertex from, Vertex to) const;             // returns [0,0,0.0] if edge does not exist
  void     SetWeight   (Vertex from, Vertex to, double weight);    // (re)sets weight
  double   GetWeight   (Vertex from, Vertex to) const;             // returns weight if edge exists, 0.0 otherwise
  size_t   EdgeSize    () const;
  size_t   OutDegree   (Vertex v) const;                           // same as Degree for undirected case
  size_t   InDegree    (Vertex v) const;                           // same as OutDegree for undirected case
  ...
  AdjIterator Begin (Vertex v) const;
  AdjIterator End   (Vertex v) const;
  ...
};

There are a fair number of ways to implement these APIs. We will concentrate on two:

2.1 Weight Function

The first is to enhance the graph class already in use by adding a mapping Weight: {edges} -> double that gives the weight of an edge. There are several advantages to this direction, among them:

We can derive the weighted graph class from the unweighted version, thus re-using a substantial part of the base class code.
The weight mapping, implemented as an associative array, provides a way to answer such questions as HasEdge(v,w) in amortized constant time by querying the support table.

The OOP model for this direction looks like this:

namespace fsu
{
  template < typename N >
  class ALUWGraph : public ALUGraph <N>
  {
    ...
  };

  template < typename N >
  class ALDWGraph : public ALDGraph <N>
  {
    ...
  };
} // namespace fsu

Class hierarchy (Option 1):

      ALUG
      /  \
   ALDG  ALUWG
      \
      ALDWG

ALUG, ALDG   = defined in graph.h
ALUWG, ALDWG = defined in wgraph.h

Implementing the weight mapping as a Map or HashMap provides an alternative way to traverse the edges of a weighted graph. The first is the usual "for each vertex x { for each edge from x }":

// standard Graph traversal:
for (x = 0; x < g.VrtxSize(); ++x)
{
  for (AdjIterator i = g.Begin(x); i != g.End(x); ++i)
  {
    // [x,*i,GetWeight(x,*i)] is a weighted edge in g
  }
}

Note however that in the undirected case each edge is encountered twice in this traversal, so some way to disambiguate the two representations is needed. One straightforward way is to consider an edge only when its "from" vertex is less than its "to" vertex:

// standard Graph traversal - undirected case:
for (x = 0; x < g.VrtxSize(); ++x)
{
  for (AdjIterator i = g.Begin(x); i != g.End(x); ++i)
  {
    // if (x < *i) [x,*i,GetWeight(x,*i)] is a unique weighted edge in g
  }
}

Alternatively one can traverse the map itself:

// traverse the weight map:
for (Map_type::Iterator i = Weight.Begin(); i != Weight.End(); ++i)
{
  // Pair p = to_pair((*i).key_);
  // [p.first_,p.second_,(*i).data_] is a unique weighted edge in g
}

Deriving from the unweighted graph classes defined in LIB/graph/graph.h saves a fair amount of re-work, because the weighted graph objects "are" ordinary unweighted graphs and therefore functions and algorithms defined for the parent type can be applied directly to the child type.

In specifying and instantiating the weight mapping, care must be taken with efficiency, both time and space. A simple matrix is out of the question for most practical situations where graphs are big and sparse, for the same reasons we don't typically use adjacency matrices. To keep memory use linear in the size of the graph, O(|V| + |E|), we need some kind of associative array whose domain is restricted to the actual edges of the graph (and not bloated to all possible edges, as a matrix would be). There is little need for "ordered" output of the map itself, so there is little justification for the overhead of a BST-based map. These considerations lead us to an associative array using hash table technology.

One advantage of this design direction is that most of the (unweighted) graph framework itself can be re-used. One pitfall is that in the undirected case an edge {x,y} is the same edge as {y,x}, so there should be only one entry for this edge in the mapping. This nuance can be handled by using two different definitions (one for the directed case, one for the undirected case) of a private member function

static fsu::String Key ( Vertex x , Vertex y );

that produces an unambiguous key for an edge [x,y] whose weight is stored in the weight map. (Hint: see the code illustrating the memoized knapsack problem in the Dynamic Programming Notes. ToHex is probably faster than ToDec.)
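The two definitions of Key can be sketched as follows. This is only an illustration under stated assumptions: std::string stands in for fsu::String, std::unordered_map stands in for the hash-based associative array, a hex-formatting stream stands in for ToHex, and the names DirectedKey, UndirectedKey, and WeightMap are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>

typedef std::size_t Vertex;

// Directed case: [x,y] and [y,x] are different edges, so order is preserved.
std::string DirectedKey (Vertex x, Vertex y)
{
  std::ostringstream os;
  os << std::hex << x << '.' << y;   // the separator keeps (1,0x23) and (0x12,3) distinct
  return os.str();
}

// Undirected case: normalize so the smaller vertex always comes first.
std::string UndirectedKey (Vertex x, Vertex y)
{
  return (x <= y) ? DirectedKey(x, y) : DirectedKey(y, x);
}

// Hypothetical stand-in for the weight mapping of an undirected graph.
typedef std::unordered_map<std::string, double> WeightMap;

void SetWeight (WeightMap& wm, Vertex x, Vertex y, double w)
{
  wm[UndirectedKey(x, y)] = w;
}

// Sets wt and returns true if the edge exists; leaves wt alone otherwise.
bool HasEdge (const WeightMap& wm, Vertex x, Vertex y, double& wt)
{
  WeightMap::const_iterator i = wm.find(UndirectedKey(x, y));
  if (i == wm.end()) return false;
  wt = i->second;
  return true;
}
```

Because both orientations of an undirected edge map to the same key, setting the weight of {7,3} and then querying {3,7} finds the single stored entry.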

2.2 Edges as Objects

The second way we will use to represent weighted graphs is to make a fresh start, re-defining an adjacency list to be a list of Edge object handles - pointers or references to edge objects. Advantages of this approach include:

The implementation includes the unweighted case, by keeping all weights the same value 1.0.
Multiple edges with the same ends, as well as self-loops, are legal in the model.

The OOP model for this direction looks like this:

namespace fsu
{
  template < typename N >
  class ALUWGraph
  {
    ...
  };

  template < typename N >
  class ALDWGraph : public ALUWGraph <N>
  {
    ...
  };
} // namespace fsu

Class hierarchy (Option 2):

   ALUG    ALUWG
    /       /
 ALDG    ALDWG

ALUG, ALDG   = defined in graph.h
ALUWG, ALDWG = defined in wgraph.h

The "edges as objects" implementation requires a separate "edge inventory" E where the set of actual edge objects is defined explicitly. The handles in adjacency lists point into the edge inventory. The edge inventory provides a second way to traverse the edges of a graph:

// standard Graph traversal - undirected case:
for (x = 0; x < g.VrtxSize(); ++x)
{
  for (AdjIterator i = g.Begin(x); i != g.End(x); ++i)
  {
    // if (x < *i) [x,*i,GetWeight(x,*i)] is a unique weighted edge in g
  }
}

// standard Graph traversal - directed case:
for (x = 0; x < g.VrtxSize(); ++x)
{
  for (AdjIterator i = g.Begin(x); i != g.End(x); ++i)
  {
    // [x,*i,GetWeight(x,*i)] is a unique weighted edge in g
  }
}

// traverse the Edge inventory:
for (E::Iterator i = g.inventory_.Begin(); i != g.inventory_.End(); ++i)
{
  // *i is a unique weighted edge in g
}
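A minimal sketch of the "edge inventory" idea for the undirected case, using standard containers. It makes one assumption worth flagging: the handles stored in the adjacency lists are indices into the inventory rather than pointers or references, so that they remain valid when the inventory vector grows. The class name UWGraph and the member TotalWeight are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct Edge { std::size_t x_, y_; double w_; };

class UWGraph   // hypothetical undirected weighted graph
{
public:
  explicit UWGraph (std::size_t n) : adj_(n) {}

  // Parallel edges and self-loops are legal: each call creates a new edge object.
  void AddEdge (std::size_t x, std::size_t y, double w = 1.0)
  {
    Edge e = { x, y, w };
    inventory_.push_back(e);
    std::size_t handle = inventory_.size() - 1;
    adj_[x].push_back(handle);
    if (y != x) adj_[y].push_back(handle);   // a self-loop is listed once
  }

  std::size_t VrtxSize () const { return adj_.size(); }
  std::size_t EdgeSize () const { return inventory_.size(); }

  // Traversing the inventory visits each edge exactly once.
  double TotalWeight () const
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < inventory_.size(); ++i) sum += inventory_[i].w_;
    return sum;
  }

  std::vector<Edge>                        inventory_;  // the edge objects
  std::vector< std::vector<std::size_t> >  adj_;        // handles into inventory_
};
```

Traversing inventory_ needs no "from less than to" test, because each undirected edge is stored there exactly once regardless of how many adjacency lists refer to it.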

2.3 Using a Weight Map

template < typename N >

2.4 Using Edge Objects

template < typename N >

3 Minimum Spanning Trees

A spanning tree for an undirected graph G is a subgraph T of G such that:

T is a tree.
T contains all the vertices of G.

See the spanning tree theorem (Theorem 3 in the Union-Find notes.)

A minimum spanning tree (MST) for a weighted undirected graph G is a spanning tree T whose total edge weight is minimal among all spanning trees of G.

Two algorithms for calculating MSTs are introduced in the Greedy Algorithms notes - Kruskal and Prim. Re-reading those descriptions, you will probably notice that they seem quite easy to follow but also that some details are glossed over - in each case a notion of optimal choice of a next edge to consider. It is in providing an efficient way to produce a "next optimal candidate" that the details become significant.

3.1 MST API

Our approach to MST algorithms is similar to the breadth- and depth-first surveys of the previous chapter. The MST algorithm is a class that attaches itself (via a const reference) to a graph object and houses the results of the algorithm. The actual processing is divided into two steps - an initialization phase embodied in the member function Init and an execution phase embodied in the member function Exec. These ideas are captured in the following API:

template < class G >
MinimumSpanningTree API
{
  typedef G                   Graph;
  typedef typename G::Vertex  Vertex; // an unsigned integer type
  ...
public:
  void   Init ();   // initializes all class variables and control structures
  void   Exec ();   // executes algorithm to completion
  const List<Edge>& MST    () const { return mst_; }
  double            Weight () const { return mstw_; }
  ...
private:
  const G&    g_;    // undirected weighted graph
  List<Edge>  mst_;  // edges of MST (calculated by Exec())
  double      mstw_; // weight of MST (calculated by Exec())
  ...
};

Note that the two private data fields mst_ and mstw_ in the API are to be populated by Init() followed by Exec() and then the resulting data is accessed by the two const member functions: MST() returns a list of the MST edges in "discovery" order, and Weight() returns the total weight of the MST. The details of how a specific MST algorithm operates on a weighted graph are captured in additional private class members.

The reference to a graph object (of type G) must be initialized by the class constructor because it is const. This is the graph on which the algorithm acts.

3.2 Kruskal's MST Algorithm

The idea supporting Kruskal's MST algorithm is to build a minimum spanning tree by starting with the forest F consisting of the vertices of a connected graph G and adding edges to the forest until it is a tree. At that point F will of necessity be a spanning tree of G, since it contains all the vertices. The critical step is the selection of an edge e to add to the growing forest. These are the criteria used by Kruskal:

1. The vertices of e are in different trees of the forest.
2. The weight of e is minimal among all edges satisfying criterion 1.

We can ensure that criterion 2 is satisfied by considering edges in increasing order by weight and adding the edge to F when it satisfies criterion 1. The algorithm can be terminated when the number of edges added is n - 1, where n is the number of vertices. (See Theorem 4 in Section 4 of the notes on Graph Search.)

An efficient way to organize the edges uses a Min Priority Queue of edges with edge weight as priority. Putting all edges into a vector v and initializing v as a priority queue requires Θ(|E|) time. Popping the front of the priority queue requires O(log |E|) time, and thus the runtime of the entire process would be Θ(|E|) + k×O(log |E|) = O(|E| + k×log |E|), where k is the number of edges that are considered before n-1 have been selected.

Adding support for these Kruskal specifics to the generic MST API yields the class definition below. The enhancements specific to Kruskal are the typedefs Edge, Container, Predicate, and PQ and the data members c_, pred_, and pq_:

template < class G >
class Kruskal
{
  typedef G                                           Graph;
  typedef typename G::Vertex                          Vertex;
  typedef fsu::Edge<Vertex>                           Edge;
  typedef fsu::Vector<Edge>                           Container;
  typedef fsu::GreaterThan<Edge>                      Predicate;
  typedef fsu::PriorityQueue<Edge,Container,Predicate> PQ;

public:
  void   Init ( bool verbose = 0 );
  void   Exec ( bool verbose = 0 );
  const fsu::List<Edge>& MST    () const { return mst_; }
  double                 Weight () const { return mstw_; }

  Kruskal ( const G& g ) : g_(g), mst_(), mstw_(0.0), c_(0), pred_(), pq_(c_,pred_)
  {}

private:
  const G&         g_;    // undirected weighted graph
  fsu::List<Edge>  mst_;  // edges of MST (calculated by Exec())
  double           mstw_; // weight of MST (calculated by Exec())
  Container        c_;    // "input" edge set
  Predicate        pred_; // edge prioritizer
  PQ               pq_;   // priority queue package operating on c_ with pred_
};

The predicate is used to organize the container into a priority queue. The Init method accomplishes these operations:

Clear mst_ and c_ and set mstw_ to zero
Add all edges to c_, being careful not to add any edge more than once: [x,y,w] and [y,x,w] represent the same undirected edge
Init pq_
If verbose, output pq_ info

The Exec method follows the Kruskal algorithm:

Use an fsu::Partition object to keep track of forest connectivity, as in the Spanning Tree algorithm.
Repeat:
1. Pop edges from the priority queue until one has vertices in distinct trees of the forest
2. Add the edge to the forest
3. Update the partition object
Until the forest is connected (or until n-1 edges have been added)
If verbose, output pq_ info
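The Init and Exec steps above can be sketched with standard library tools. This is a sketch under stated assumptions, not the fsu implementation: std::make_heap and std::pop_heap stand in for fsu::PriorityQueue, the small Partition class is a hypothetical stand-in for fsu::Partition, and the free function Kruskal folds Init and Exec into one call.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Edge { std::size_t x_, y_; double w_; };

// std::make_heap builds a max-heap, so "greater than" yields a min-heap by weight.
struct HeavierThan
{
  bool operator() (const Edge& a, const Edge& b) const { return a.w_ > b.w_; }
};

// Minimal union-find with path halving (stand-in for fsu::Partition).
class Partition
{
public:
  explicit Partition (std::size_t n) : p_(n)
  {
    for (std::size_t i = 0; i < n; ++i) p_[i] = i;
  }
  std::size_t Find (std::size_t x)
  {
    while (p_[x] != x) { p_[x] = p_[p_[x]]; x = p_[x]; }
    return x;
  }
  void Union (std::size_t x, std::size_t y) { p_[Find(x)] = Find(y); }
private:
  std::vector<std::size_t> p_;
};

// edges: each undirected edge listed once; n: number of vertices.
// Returns the MST edges in discovery order and stores their total in weight.
std::vector<Edge> Kruskal (std::vector<Edge> edges, std::size_t n, double& weight)
{
  std::vector<Edge> mst;
  weight = 0.0;
  HeavierThan pred;
  std::make_heap(edges.begin(), edges.end(), pred);       // Init: linear in |E|
  Partition forest(n);
  while (!edges.empty() && mst.size() + 1 < n)            // stop at n-1 edges
  {
    std::pop_heap(edges.begin(), edges.end(), pred);      // O(log |E|) per pop
    Edge e = edges.back();
    edges.pop_back();
    if (forest.Find(e.x_) != forest.Find(e.y_))           // criterion 1
    {
      mst.push_back(e);
      weight += e.w_;
      forest.Union(e.x_, e.y_);
    }
  }
  return mst;
}
```

Passing the edge vector by value keeps the caller's copy intact while the heap is consumed; a class-based version would keep the container as a data member instead.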

Exercise 1. Show that Kruskal as described is a greedy algorithm.

Exercise 2. Show that the "discovery" order of edges in the Kruskal MST is ordered by increasing weight.

Exercise 3. Show that the runtime of Kruskal is O(|E| + k log |E|) where k is the number of edges that are inspected.

3.3 Prim's MST Algorithm

The idea behind Prim is to build a spanning tree by starting with the tree consisting of a start vertex (we will start at vertex 0) and growing the tree by greedily adding minimal weight edges that have one end in the tree and one end outside the tree, until we run out of edges or the tree has n-1 edges.

template <class G>
class Prim
{
  typedef G                                           Graph;
  typedef typename G::Vertex                          Vertex;
  typedef fsu::Edge<Vertex>                           Edge;
  typedef fsu::Vector<Edge>                           Container;
  typedef fsu::GreaterThan<Edge>                      Predicate;
  typedef fsu::PriorityQueue<Edge,Container,Predicate> PQ;

public:
  void   Init ( bool verbose = 0 );
  void   Exec ( bool verbose = 0 );
  const fsu::List<Edge>& MST    () const { return mst_; }
  double                 Weight () const { return mstw_; }

  Prim (const G& g) : g_(g), mst_(), mstw_(0.0), c_(0), pred_(), pq_(c_,pred_), inTree_(0)
  {}

private:
  const G&           g_;      // undirected weighted graph
  fsu::List<Edge>    mst_;    // "output" edge set
  double             mstw_;   // weight of MST
  Container          c_;      // "dynamic" edge set - discovered but not yet used
  Predicate          pred_;   // edge prioritizer
  PQ                 pq_;     // priority queue package operating on c_ with pred_
  fsu::Vector<bool>  inTree_; // tree vertices
};

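The typedefs and the members pred_ and pq_ are identical to those for Kruskal. The Prim-specific changes or additions are the vector inTree_ and the role of c_, which now holds the edges discovered but not yet used.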

The predicate is used to organize the container into a priority queue. The Init method accomplishes these operations:

Clear mst_ and c_ and set mstw_ to zero
Initialize inTree_ to size n (the number of vertices) with all values false
Add all edges from the start vertex (0) to c_
Init pq_
Set inTree_[0] to true
If verbose, output pq_ info

The Exec method follows the Prim algorithm:

Repeat:
1. Pop edges from the priority queue until one has a vertex (say v) outside the tree
2. Add the edge to the tree
3. Update inTree_[v] to true
4. For each edge e from v, push e onto the priority queue if the other end of e is not in the tree (these are the newly discovered edges)
Until the tree has n-1 edges (or there are no more edges to consider)
If verbose, output pq_ info

The priority queues for Kruskal and Prim behave differently. In Kruskal, the PQ is filled with all edges at the beginning (by Init()) and then edges are popped during execution (in Exec()) until the tree is built. In Prim, the PQ is initialized (in Init()) only by the edges visible from the start vertex and then edges are popped and others pushed during execution (by Exec()).

The bool vector inTree_ is used to keep track of the vertices in the growing MST. Note that all edges in the PQ have at least one vertex in the tree, but you don't know whether it is the larger or smaller numbered end. Use inTree_ to decide whether the other end is in the tree.

Push an edge to the PQ only if its "other" end is not in the tree. This prevents redundant insertions of edges into the PQ. Once inserted, however, the other end could migrate into the tree by a circuitous route, so when it comes to the front of the PQ it has to be tested again to know whether one end is still not in the tree.

Be sure to maintain inTree_ by setting inTree_[x] to true when an edge ending at x is inserted into the MST.
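The Init and Exec steps for Prim, together with the notes on inTree_, can be sketched as follows. The sketch assumes standard containers in place of the fsu ones, represents the graph as plain adjacency lists of edges (the typedef AdjLists is hypothetical), and folds Init and Exec into one free function.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

struct Edge { std::size_t x_, y_; double w_; };

struct HeavierThan   // makes std::priority_queue a min-queue by weight
{
  bool operator() (const Edge& a, const Edge& b) const { return a.w_ > b.w_; }
};

// adj[v] holds every edge incident to v, so an undirected edge appears in two lists.
typedef std::vector< std::vector<Edge> > AdjLists;

std::vector<Edge> Prim (const AdjLists& adj, double& weight)
{
  const std::size_t n = adj.size();
  std::vector<Edge> mst;
  weight = 0.0;
  if (n == 0) return mst;
  std::vector<bool> inTree(n, false);
  std::priority_queue<Edge, std::vector<Edge>, HeavierThan> pq;

  // Init: start at vertex 0 and push the edges visible from it
  inTree[0] = true;
  for (std::size_t i = 0; i < adj[0].size(); ++i) pq.push(adj[0][i]);

  // Exec: pop until an edge has one end outside the tree
  while (!pq.empty() && mst.size() + 1 < n)
  {
    Edge e = pq.top();
    pq.pop();
    if (inTree[e.x_] && inTree[e.y_]) continue;    // other end migrated into the tree
    std::size_t v = inTree[e.x_] ? e.y_ : e.x_;    // the end outside the tree
    mst.push_back(e);
    weight += e.w_;
    inTree[v] = true;
    for (std::size_t i = 0; i < adj[v].size(); ++i)
    {
      const Edge& f = adj[v][i];
      std::size_t other = (f.x_ == v) ? f.y_ : f.x_;
      if (!inTree[other]) pq.push(f);              // newly discovered edge
    }
  }
  return mst;
}
```

On the triangle with edges [0,1,5], [1,2,1], [0,2,6] this sketch discovers [0,1,5] before [1,2,1], a small instance of discovery order that is not increasing by weight.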

Exercise 4. Show that Prim as described is a greedy algorithm.

Exercise 5. Explain why the "discovery" order of the MST edges using Prim is not ordered by increasing edge weight.

Exercise 6. Show that the runtime of Prim is O(|E| log |E|).

4 Edge-Weighted Directed Graphs

4.1 Using Map

4.2 Using Edge Objects

5 Minimum Length Paths

5.1 Minimum length paths

5.2 Negative Weights

5.3 SSSP API

template < class G >
SSSP API
{
  typedef G                   Graph;
  typedef typename G::Vertex  Vertex; // an unsigned integer type
  ...
public:
  void Init (Vertex s); // initializes start at s along with all class variables
  void Exec ();         // executes the algorithm to completion
  const Vector<double>& Distance () const { return d_; }
  const Vector<Vertex>& Parent   () const { return p_; }
  void Path (Vertex x, List<Vertex>& path) const; // constructs path from s to x
  ...
private: // methods
  void Relax (Vertex v); // called by Exec()
  ...
private: // data
  const G&             g_; // directed weighted graph
  fsu::Vector<double>  d_; // distance (calculated by Exec())
  fsu::Vector<Vertex>  p_; // predecessor in search (calculated by Exec())
  ...
};
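The Path method can be understood as a walk backward through the parent vector. The following is a sketch under assumptions, not the course implementation: it is a free function over std::vector rather than a member over fsu containers, it takes the source s as an explicit parameter, and it assumes the null vertex is encoded as parent.size() (one more than the last vertex index).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

typedef std::size_t Vertex;

// parent[v] is the predecessor of v on a shortest path from s.
// Returns true and fills path with s, ..., x if x is reachable from s;
// returns false and leaves path empty otherwise.
bool Path (const std::vector<Vertex>& parent, Vertex s, Vertex x,
           std::vector<Vertex>& path)
{
  const Vertex null = parent.size();      // assumed encoding of "no predecessor"
  path.clear();
  if (s >= null || x >= null) return false;
  Vertex v = x;
  while (v != s)
  {
    path.push_back(v);
    v = parent[v];                        // step toward the source
    if (v >= null || path.size() >= parent.size())   // fell off the tree, or a cycle
    {
      path.clear();
      return false;
    }
  }
  path.push_back(s);
  std::reverse(path.begin(), path.end()); // collected from x back to s
  return true;
}
```

The vertices are collected from x back to s and reversed at the end, which avoids front insertions into the vector.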

5.4 Relaxation

The concept of relaxation applies to edges e = (x,y,w). The idea is to check whether the current known distance to y can be shortened by going through x and if so, update the known distance and also modify the currently known shortest path by going through x to y.

Relax(Edge e)
{
  if (d(x) + w < d(y))   // If passing through x gives a shorter way to get to y
  {                      // then
    d(y) = d(x) + w;     // update the distance estimate d(y)
    parent(y) = x;       // update the path to y
  }
}

In all of our algorithms that use relaxation it is applied to all edges from a given vertex x. So we can simplify the code for Dijkstra, BellFord, A*, and the SSSP for DAGs by defining Relax for vertices:

Relax(Vertex x)
{
  for (each directed edge e = (x,y,w))
  {
    if (d(x) + w < d(y))
    {
      d(y) = d(x) + w;
      parent(y) = x;
      // algorithm-specific house-keeping code here
    }
  }
}

(Note that the if statement inside the loop is the same as Relax(e), apart from the house-keeping step.)

Using this version of Relax simplifies the code for all the SSSP algos. Note however that each SSSP algo may require a small enhancement that assists in managing control structures of the implementation:

For Dijkstra: after y is updated, push y onto the priority queue with the new priority d(y)
For enhanced BellFord: after y is updated, push y onto the (ordinary) queue if necessary (see 5.6.2 below)
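To illustrate the Dijkstra house-keeping step, here is a sketch of an SSSP class built on the vertex version of Relax. Only Relax follows the description above directly. The Exec loop shown is an assumption: it is the common "lazy" formulation that leaves outdated queue entries in place and skips them when popped. The types Arc and Digraph are hypothetical, and all weights are assumed non-negative.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

typedef std::size_t Vertex;
struct Arc { Vertex to_; double w_; };              // directed edge out of some vertex
typedef std::vector< std::vector<Arc> > Digraph;    // g[x] = edges from x

class Dijkstra
{
public:
  explicit Dijkstra (const Digraph& g) : g_(g) {}

  void Init (Vertex s)
  {
    d_.assign(g_.size(), std::numeric_limits<double>::infinity());
    p_.assign(g_.size(), g_.size());                // g_.size() acts as the null vertex
    d_[s] = 0.0;
    pq_ = PQ();
    pq_.push(std::make_pair(0.0, s));
  }

  void Exec ()                                      // assumed lazy formulation
  {
    while (!pq_.empty())
    {
      std::pair<double, Vertex> top = pq_.top();
      pq_.pop();
      if (top.first > d_[top.second]) continue;     // outdated entry: skip it
      Relax(top.second);
    }
  }

  const std::vector<double>& Distance () const { return d_; }
  const std::vector<Vertex>& Parent   () const { return p_; }

private:
  typedef std::pair<double, Vertex> Entry;          // (priority d(y), vertex y)
  typedef std::priority_queue< Entry, std::vector<Entry>, std::greater<Entry> > PQ;

  void Relax (Vertex x)
  {
    for (std::size_t i = 0; i < g_[x].size(); ++i)
    {
      const Vertex y = g_[x][i].to_;
      const double w = g_[x][i].w_;
      if (d_[x] + w < d_[y])
      {
        d_[y] = d_[x] + w;
        p_[y] = x;
        pq_.push(std::make_pair(d_[y], y));         // Dijkstra house-keeping
      }
    }
  }

  const Digraph&       g_;
  std::vector<double>  d_;
  std::vector<Vertex>  p_;
  PQ                   pq_;
};
```

Pushing a fresh entry instead of changing the priority of an existing one keeps the queue simple at the cost of some outdated entries, which the test at the top of the loop discards.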

5.5 Dijkstra's SSSP Algorithm

Exercise 7. Show that Dijkstra as described is a greedy algorithm.

5.6 The Bellman-Ford SSSP Algorithm

The brief coverage of Bellman-Ford in the Cormen text and the course notes doesn't do it justice. This is an important algorithm:

Often used as the default go-to SSSP algorithm
Allows negative weights (and detects negative cycles)
Often faster than Dijkstra in practice, despite its worst case runtime of Ω(|V|×|E|)
Comparatively simple to code

5.6.1 Classic Bellman-Ford - general structure

Bellman-Ford, which we'll abbreviate to BellFord, has the simplicity we saw in the bottom-up versions of dynamic programming algorithms (introduced by Bellman!). It consists of nested fixed-length loops:

for each v ∈ V
{
  for each e ∈ E
  {
    Relax(e)
  }
}

This amounts to: Relax each edge in E n times, where n = |V|, using the same order of edges each time. A negative cycle is detected by the existence of one edge that still needs to be Relaxed after the nested for loops have run. While the simplicity of this is admirable - simpler is always a worthy goal - the fact that we are stuck with Θ(n×|E|) runtime is undesirable and compares quite unfavorably with Dijkstra's worst case runtime of O(|E|×log n). The innovation that speeds BellFord up is presented next.
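The classic nested-loop structure, including the detection pass, can be sketched as a single function. The signature, the edge-list representation, and the use of n as the null vertex are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

typedef std::size_t Vertex;
struct Edge { Vertex x_, y_; double w_; };   // directed edge x -> y with weight w

// Returns false if a negative weight cycle is reachable from s.
// Otherwise d[v] is the shortest distance from s to v (infinity if v is
// unreachable) and p[v] is the predecessor of v (n serves as the null vertex).
bool BellFord (const std::vector<Edge>& edges, std::size_t n, Vertex s,
               std::vector<double>& d, std::vector<Vertex>& p)
{
  d.assign(n, std::numeric_limits<double>::infinity());
  p.assign(n, n);
  d[s] = 0.0;
  for (std::size_t pass = 0; pass < n; ++pass)             // for each v in V
  {
    for (std::size_t i = 0; i < edges.size(); ++i)         // for each e in E
    {
      const Edge& e = edges[i];                            // Relax(e)
      if (d[e.x_] + e.w_ < d[e.y_])
      {
        d[e.y_] = d[e.x_] + e.w_;
        p[e.y_] = e.x_;
      }
    }
  }
  for (std::size_t i = 0; i < edges.size(); ++i)           // detection pass
  {
    if (d[edges[i].x_] + edges[i].w_ < d[edges[i].y_]) return false;
  }
  return true;
}
```

Unreachable vertices cause no trouble here: infinity plus a finite weight is still infinity, so the relaxation test fails for edges leaving them.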

5.6.2 Enhanced BellFord

Don't you just love simplicity in an algorithm? The innovation that makes BellFord competitive with Dijkstra in many real-world graphs is this observation: If no edge (x,y,-) has been updated during pass k of edge relaxations, then (y,z,-) cannot need updating in the next pass. This means we only need to consider a vertex in a pass when a predecessor has been updated in the previous pass. A simple FIFO queue q_ and a boolean-valued vector "onQ_" facilitate this, and the code is in Relax:

Relax(Vertex x) // enhanced for BellFord
{
  for (each edge e = (x,y,w))
  {
    if (d(x) + w < d(y))
    {
      d(y) = d(x) + w;
      parent(y) = x;
      if (!onQ_[y])        // prevents redundant inserts
      {
        q_.Push(y);        // consider y for relaxation in next pass
        onQ_[y] = true;    // record y is on the queue
      }
    }
  }
}

The FIFO q_ is then used to run through the vertices, but only those that might need it. The Exec() portion of the algorithm then looks like this:

void Exec()
{
  Vertex v;
  while (!q_.Empty())
  {
    v = q_.Front();
    q_.Pop();
    onQ_[v] = false;
    Relax(v);
  }
}

Of course you also need to Init(s) for the source vertex s. That process would be:

void Init(Vertex s)
{
  set size and initial values for distance vector d_  // INFINITY
  set size and initial values for parent vector p_    // null vertex (one more than last vertex index)
  set size and initial values for vector onQ_         // bool false
  start_ = s;
  d_[start_] = 0.0;
  q_.Push(start_);
  onQ_[start_] = true;
}

The algo as described above works fine as long as there are no negative weight cycles. We'll take up the negative weight cycle issues in the next section.

Exercise 8. Is BellFord as described a greedy algorithm? If not, is there another classification that applies?

5.7 Negative weight cycles

As discussed in the Cormen textbook, class notes, and most other sources:

Dijkstra depends absolutely on the assumption that no weights are negative.
If negative weights are allowed, it is possible to have a cycle in the graph whose total weight is negative. If a negative weight cycle is reachable from the source vertex s, then no shortest path can exist to any vertex reachable from s: paths can be "shortened" by going around the cycle - which adds edges but subtracts weight.
BellFord works in all cases: If a negative weight cycle is reachable from s it can be detected and "no solution" returned. Otherwise the BellFord algo constructs a tree of shortest paths from s to all reachable vertices.

The classic "nested loops running to completion" version detects the negative cycle defect by making a single pass through the edges (after the nested loops have run to completion): if any edge needs relaxation then there is a negative cycle. The enhanced version is queue-controlled rather than having fixed loop lengths. However the same conclusion holds: If each edge has been relaxed n times and still needs it, there is a negative weight cycle.

Note that a negative weight cycle would make the enhanced version run interminably, because an edge would continue to need relaxation as we wind around the negative cycle. To detect this condition there are a few options:

Add an edge relaxation counter, and if that exceeds n×|E| terminate the main while loop by setting the class variable "hasNegativeCycle_" to true.
Occasionally (about every n queue pops) run a negative cycle detection process in the currently known reachable vertices. (This is a costly call, so we amortize its use.) This cycle detector can report the cycle if desired.

In our code we should use the first option. Just send out the message "negative cycle detected" if the while loop terminates that way.
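The first option can be sketched by adding a counter to the queue-controlled version. This is an illustration under assumptions rather than the course code: standard containers replace the fsu ones, the types Arc and Digraph and the members edgeCount_ and relaxCount_ are hypothetical, and the counter is incremented once for every edge examined by Relax.

```cpp
#include <cstddef>
#include <limits>
#include <queue>
#include <vector>

typedef std::size_t Vertex;
struct Arc { Vertex to_; double w_; };
typedef std::vector< std::vector<Arc> > Digraph;   // g[x] = edges from x

class BellFord
{
public:
  explicit BellFord (const Digraph& g)
    : g_(g), edgeCount_(0), relaxCount_(0), hasNegativeCycle_(false) {}

  void Init (Vertex s)
  {
    const std::size_t n = g_.size();
    d_.assign(n, std::numeric_limits<double>::infinity());
    p_.assign(n, n);                                // n acts as the null vertex
    onQ_.assign(n, false);
    q_ = std::queue<Vertex>();
    edgeCount_ = 0;
    for (std::size_t x = 0; x < n; ++x) edgeCount_ += g_[x].size();
    relaxCount_ = 0;
    hasNegativeCycle_ = false;
    d_[s] = 0.0;
    q_.push(s);
    onQ_[s] = true;
  }

  void Exec ()
  {
    const std::size_t limit = g_.size() * edgeCount_;   // n x |E|
    while (!q_.empty())
    {
      Vertex v = q_.front();
      q_.pop();
      onQ_[v] = false;
      Relax(v);
      if (relaxCount_ > limit) { hasNegativeCycle_ = true; return; }
    }
  }

  bool HasNegativeCycle () const { return hasNegativeCycle_; }
  const std::vector<double>& Distance () const { return d_; }
  const std::vector<Vertex>& Parent   () const { return p_; }

private:
  void Relax (Vertex x)
  {
    for (std::size_t i = 0; i < g_[x].size(); ++i)
    {
      ++relaxCount_;                                // one edge relaxation attempted
      const Vertex y = g_[x][i].to_;
      const double w = g_[x][i].w_;
      if (d_[x] + w < d_[y])
      {
        d_[y] = d_[x] + w;
        p_[y] = x;
        if (!onQ_[y]) { q_.push(y); onQ_[y] = true; }
      }
    }
  }

  const Digraph&       g_;
  std::vector<double>  d_;
  std::vector<Vertex>  p_;
  std::vector<bool>    onQ_;
  std::queue<Vertex>   q_;
  std::size_t          edgeCount_, relaxCount_;
  bool                 hasNegativeCycle_;
};
```

A caller would check HasNegativeCycle() after Exec() and report "negative cycle detected" when it returns true; in that case the distance and parent vectors should not be trusted.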