Dijkstra's Shortest paths

Introduction

The algorithm to compute the shortest path from a source node to each of the other nodes was made famous by the Dutch computer scientist Edsger Dijkstra in the 1950's. It can be used for many applications including what your GPS (Navi) does in computing your road trip. It's easy to think of the nodes as cities and edges as two-way roads connecting them.

Consider some nodes (towns) randomly placed and connected by edges (roads). One of the nodes, labeled S, is the source and we want to calculate the shortest route to all the other nodes. In a early demonstration project Dijstra used the algorithm to find the shortest routes from a city input by the user to several other cities in the Netherlands by using data on the actual roads between these cities. Your GPS or Navi as we call them in Europe basically does the same thing.

The images below show a simple example. From the source node labeled S we want to find the shortes path to each of the four other nodes A, B, C, D using the 5 edges in white. The numbers in green show the distance between the two node that an edge connects. The nodes are labeled in yellow with their letter designator and the shortest distance from node S if it is known. If the distance is not known then a ? is shown. At the start of the computation only one distance is known. The distance from S to S is zero.

images/dij_001.png images/dij_012.png

The image on the right shows the graph as solved. Edges used to get to one or more of the target nodes A,B,C,D from the source node S are now shown in blue. All the target nodes are also labeled with their computed minimum distance from S and shown in red.

To see this in action as a little movie, click on the animation link

Animated (return with the browser's Back button)

The Algorithm

As with the minimum spanning tree, the process is fairly simple. Each node (vertex is another term) is visited only once. Here is a rough outline of the algorithm.

  1. Find the unvisited node closest to the source node. Call it nodc
  2. For each edge from nodc find the neighbor node nodn at the other end of the edge. Calculate the distance from the source to nodn (going through nodc) by adding the nodc's distance from the source to the length of the edge connecting it to nodn. If the new distance is less than nodn's current distance to the source (stored in nodn.dist) or if nodn.dist has not been set, then set nodn.dist to the new calculated distance and set the attribute nodn.prev to the edge connecting it to nodc.
  3. Mark nodc as visited. It will not be of further use in the algorithm.
  4. Repeat until all nodes have been visited.

The Code

As with the other projects in this group, the sarg, camera and pgcon modules are described in the pgame support library If you haven't already done so, you may want to read about them first.

The program code shown below is available as dijkstra.py. The whole package can be downloaded as a zip file.

The nodes are represented by Node instances.

13 class Node :
14     def __init__ (self, name, pos, dist=INFINITY) :
15         self.name = name
16         self.pos  = pos
17         self.dist = dist      # total distance back to source
18         self.color= ORANGE
19         self.edges = []       # edges from this node
20         self.prev = None      # edge leading back to source
21         nodes.append(self)    # global list of all node instances
22
23     def neighbor(self, edge) :  # find node at other end of an edge
24         if edge.node1 == self : return edge.node2
25         else               : return edge.node1
26
27     def __str__(self) :
28         nbors = [self.neighbor(edge).name  for edge in self.edges]
29         return "%s: %s" % (self.name, nbors)
30

The attributes of a node are simply its position, name (uppercase letter), distance from the source node (or INFINITY), color for display, a pointer to its previous node, and a list of edges (defined below).

The node has edges and the method neighbor returns the node at the opposite end of an edge. Overriding the __str__ method makes it easier to view a node and its edges.

A global list nodes contains a reference to each node.

The class Edge is quite similar. The code for it follows:

32 class Edge :
33     def __init__ (self, node1, node2, length, color=WHITE) :
34         self.color = color
35         self.node1 = node1
36         self.node2 = node2
37         self.length = length
38         node1.edges.append(self)
39         node2.edges.append(self)
40         edges.append(self)
41
42     def __str__(self) :
43         return "Edge: %s to %s len=%s" % (self.node1.name,self.node2.name,self.length)

An edge is defined with 2 nodes that it connects, a length and a color. With the creation of an edge instance, each node's edges list is updated. So there is a double link between an edge and a node.

Displaying a graph is done with the function display which is called for each frame in an animation. The globals edges and nodes are lists of the class instances.

45 def display (banner) :
46     pgcon.newScreen()
47     for e in edges :
48         pgcon.lineDraw(e.color, e.node1.pos, e.node2.pos, 1)
49         mid = ((e.node1.pos[0]+e.node2.pos[0])/2, (e.node1.pos[1]+e.node2.pos[1])/2)
50         pgcon.textDraw(GREEN, mid, str(e.length))
51     for node in nodes :
52         if node.dist < INFINITY : label = "%s--%d" % (node.name,node.dist)
53         else                    : label = "%s--?"  %  node.name
54         pgcon.textDraw(node.color, node.pos, label)
55     pgcon.textDraw(WHITE, (pixels/2,20), banner)
56     pgcon.camera.armed = 1
57     pgcon.writeScreen()
58

Line 46 starts a new blank frame to draw on.

In line 47 we start a loop to draw and label each of the edges in the graph. An edge is a line of either white or blue and labeled at the midpoint in green. The midpoint of an edge is computed in line 49. Positions are two element tuples for an (x,y) on the window. The length of the edge is shown in green and centered on the midpoint.

In line 51, each node is drawn. It's simply a label at a position. The label contains the one letter name, a dash, and the shortest distance (so far) from the source node. Labels are drawn in either orange or red. The color red indicates the node has been visited.

The banner, a text message, is drawn in white in the top center of the window. It's used to indicate what is happening in the current frame.

Finally, the camera (if activated) is armed to take a snapshot of the frame as a PNG image.

The Heart of the Code

The function findpaths implements the Dijkstra algorithm. It's about 30 lines of Python and we'll walk through it slowly. Possibly a third of the code is there to handle issues for the display.

59 def findpaths(nodes) :
60     "implements the dijkstra algorithm. Updates nodes dist & prev attributes"
61     source = None
62     ncaptures = 0
63     while nodes :
64         best = 0
65         for test in range(len(nodes)) :
66             if nodes[test].dist < nodes[best].dist : best=test
67         node1 = nodes.pop(best)
68         if not source : source = node1.name

The list of node instances is passed in. The variable source will be set to the name of the is copied to vs (line 61) so that we can visit each node in the correct order and then remove if from vs. This guarentees that we not vist a node more than once.

In the while loop (line 64) we will first find in each pass the unvisited node closest to the source. On the first pass this will be the source node itself whose dist attribute was set to zero. All the others were set to the maximum possible integer which call INFINITY and which shows on the display as a ?. The closest node is the popped from vs and assigned to node1. The popping will make sure no second visit to node1 can take place. The local variable source is set to the name of the first node for use at the last display.

69         node1.color = RED   # Show node visited
70         display("Working from Node %s" % node1.name)

We'll paint our current node node1 temporarily red and display a frame showing that we're working on it. Then for each edge from node1 we will look at the neighbor node2 at the other end of the edge. If node2 has not been visited we calculate its distance from the source (line 74) coming through node1. If the new distance is less than its node2's current distance (INFINITY or a previous setting) then node1 will capture node2 resetting node2.prev to now reference the edge connecting it to node1 and changing node2.dist accordingly.

An edge is colored blue (line 83) when it used in a path, and returned to white if the target node is later captured. (line 77).

Each edge is also only visited once since node1 disappears from nodes and we never look backwards to visited nodes (line 73).

71         for edge in node1.edges :
72             node2 = node1.neighbor(edge)
73             if node2 not in nodes : continue  # already visited
74             newDist = node1.dist + edge.length
75             if newDist < node2.dist :
76                 if node2.prev :            # erase use of earlier
77                     node2.prev.color = WHITE
78                     action = "captures"
79                     ncaptures += 1
80                 else : action = "connects to"
81                 node2.dist = newDist
82                 node2.prev = edge
83                 edge.color = BLUE
84             else :
85                 action = "can't improve dist to"
86             banner = "%s %s %s" % (node1.name,action,node2.name)
87             display(banner)
88     display("Dijkstr's Shortest Paths from %s Done" % source)
89     print "There were %d captures" % ncaptures
90

Finally, when the list nodes is empty we're done. The final frame displays the solution (line 88) and a message how many nodes were captured after being set initially. It's generally only a very few.

Making Graphs

The program can be run with any arbitary graph. A function fillGraph is imported from a seperate file. For the example above we've been using the file graph_000.py which looks like this.

def fillGraph (Node, nodes, Edge, edges) :
   s = Node("S", (200,100), dist=0)
   a = Node("A", (100,200))
   b = Node("B", (250,250))
   c = Node("C", (200,280))
   d = Node("D", (100,300))

   Edge(s,a,4)
   Edge(s,b,5)
   Edge(a,c,3)
   Edge(a,d,2)
   Edge(b,c,1)

The class definitions Node and Edge are passed to it along with references to the global lists nodes and edges which are then populated by fillGraph. The local variables a,b,c,d,s are assigned to each node as it is created. These nodes are then passed via these variables to create the edges. When fillGraph returns, the lists nodes and edges are populated.

Driving the Program

Around the code above we need some support. A Pygame connector pgcon is constructed from the Pgcon class in pglib. The command line is queried via sarg to set control variables;

  • pixels - defines the size of the Pygame window (square)
  • delay - the time in seconds to keep a frame displayed before continuing
  • graph - the Python code containing a function fillgraph This module is imported at line 11 after the program has started running.

There are other command line options for controlling the camera which captures PNG images of the frames and even create a GIF animation from them. See the pgame support library

01 from pgcon import Pgcon, WHITE, RED, GREEN, BLUE, ORANGE
02 from sys   import maxint as INFINITY
03 import sarg
04
05 pgCon = None
06
07 nodes  = []
08 edges  = []
09 pixels = sarg.Int("pixels",600)
10 graph  = sarg.Str("graph","makeGraph")
11 exec("from %s import fillGraph" % graph)
12

.....
91 def main()  :
92     global pgcon
93     pgcon = Pgcon()
94     fillGraph (Node, nodes, Edge, edges)
95     display("Initial Configuration")
96     findpaths(nodes[:])        # pass copy of nodes to keep original
97     pgcon.close()
98
99 if __name__ == "__main__" : main()

The main function gets things going in pretty much the same way as in the kmeans and minspan projects. It uses the imported fillGraph to populate the lists nodes and edges (line 94) and calls findpaths to put things in motion. Although not necessary for how the program is now used, main passes a copy of the nodes list to leave the original intact.

Running Dijkstra

Here is the command line that created the animation you saw above.

$ # Just run the display window
$ python dijkstra.py graph=graph_000 delay=1.5 pixels=400

$ # Run the display, saving the PNG images and create dij.gif at the end
$ python dijkstra.py graph=graph_000 delay=1.5 pixels=400  prefix=dij gif=1 maxpics=50 keep_pngs=1
Just captured frame: 1, dij_001.png
Just captured frame: 2, dij_002.png
Just captured frame: 3, dij_003.png
Just captured frame: 4, dij_004.png
Just captured frame: 5, dij_005.png
Just captured frame: 6, dij_006.png
Just captured frame: 7, dij_007.png
Just captured frame: 8, dij_008.png
Just captured frame: 9, dij_009.png
Just captured frame: 10, dij_010.png
Just captured frame: 11, dij_011.png
Just captured frame: 12, dij_012.png
There were 1 captures
Hit Return to Continue
Making GIF
convert -delay 150 -loop 1 dij_*.png dij.gif
dij.gif has 12 frames

When the message "Hit Return to Continue" is displayed you may need to click the terminal window before hitting the Enter or Return key. With the settings on the command line above the program continues on to create a GIF file with 1.5 seconds between frames. Setting keep_pngs=1 keeps the PNG images from being removed afterwards.

If you follow the individual PNG files or even watch the GIF carefully, you will notice the order that the nodes are visited and the linkages created. Node A sets an initial distance to node C but when node B is visited it captures node C, offering it a shorter distance.

The order in which the nodes are visited is critical. When a node is visited it only considers unvisted neighbors to update. If visited neighbors could be updated things get messy. When they were visited earlier they likely would have updated their neighbors and now those dist values are out of date.

In line 64-66 of dijkstra.py we find the unvisited node closest to the source. In just 3 lines of Python, it is pretty simple. But if we are working with a very large number of nodes things can get slow. This can be reduced at least by using a priority queue where efficient deletes and inserts can be performed. But this requires much more complexity. For a "How it works" demonstration like this it would be overkill.

More Interesting Examples

The liberal use of sarg switches makes it easy to play with the program. First let's use another fillgraph function defined in graph_456.py

$ python dijkstra.py graph=graph_456
There were 1 captures
Hit Return to Continue

You should see an animation (probably bigger) like this

If graph= is omitted then the function the makeGraph.py is used. Its fillGraph creates a graph at random and returns it. It first selects a random seed between 1 and 999 and seeds the random module with it. It prints the seed and the code that would produce the same graph on the terminal window. Here is an example:

$ python dijkstra.py
Seed is 848
# ------------- Copy paste following to graph_848.py
def fillGraph(Node, nodes, Edge, edges) : # seed=848
   s = Node('S', (69, 570), dist=0)
   a = Node('A', (70, 338))
   b = Node('B', (369, 534))
   c = Node('C', (374, 131))
   d = Node('D', (513, 483))
   e = Node('E', (528, 348))

   Edge(s,a,232)
   Edge(s,b,302)
   Edge(a,b,357)
   Edge(a,c,367)
   Edge(b,d,152)
   Edge(b,e,244)
   Edge(c,e,266)
   Edge(c,d,378)
   Edge(d,e,135)
# ------------- End of Copy paste

There were 0 captures
Hit Return to Continue
$

Now the following produces the same graph again as long as you don't change the number of nodes with something like nodes=7.

$ python dijkstra.py seed=848

You can see all the possible sarg switches and their defaults by an expedient

$ grep sarg *.py

and looking at the code around it. I hope it turns out to be useful in exploring the program and also in extending it as you wish. Be careful with the spelling of switch names. Misspelled switches are simply ignored.

Copyright © 2017 Chris Meyers

.