NetworKit Distance Tutorial

NetworKit provides several graph traversal and pathfinding algorithms within the distance module. This notebook covers most of these algorithms, and shows how to use them.

[1]:
import networkit as nk

For this tutorial we will use the same graph, and the same source and target node. We will index the edges of the graph because some algorithms require the edges to be indexed.

[2]:
# Read a graph
G = nk.readGraph("../input/foodweb-baydry.konect", nk.Format.KONECT)
GDir = G
G = nk.graphtools.toUndirected(G)
source = 0
target = 27
G.indexEdges()

Algebraic Distance

Algebraic distance assigns a distance value to pairs of nodes according to their structural closeness in the graph. Algebraic distances will become small within dense subgraphs.

The AlgebraicDistance(G, numberSystems=10, numberIterations=30, omega=0.5, norm=0, withEdgeScores=False) constructor expects a graph followed by the number of systems to use for algebraic iteration and the number of iterations in each system. omega is the overrelaxation parameter while norm is the norm factor of the extended algebraic distance. Set withEdgeScores to True to also compute an array of edge scores, where the score of an edge {u,v} equals ad(u,v).

[3]:
# Initialize algorithm
ad = nk.distance.AlgebraicDistance(G, 10, 100, 1, 1, True)
[4]:
# Run
ad.preprocess()
[4]:
<networkit.distance.AlgebraicDistance at 0x7f6df2a85690>
[5]:
# The algebraic distance between the source and target node
ad.distance(source, target)
[5]:
4.79997134363237

All-Pairs Shortest-Paths (APSP)

The APSP algorithm computes all pairwise shortest-path distances in a given graph. It is implemented by running Dijkstra’s algorithm from each node, or BFS if the graph is unweighted.

The constructor APSP(G) expects a graph.

[6]:
# Initialize algorithm
apsp = nk.distance.APSP(G)
[7]:
# Run
apsp.run()
[7]:
<networkit.distance.APSP at 0x7f6e2ca33910>
[8]:
# The distance from source to target node
print(apsp.getDistance(source, target))
0.0006647776978699999
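
To make the "one traversal per node" idea concrete, here is a minimal pure-Python sketch for the unweighted case. It is an illustration only, not NetworKit's implementation; the adjacency dictionary `toy_adj` and the helper names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def all_pairs_distances(adj):
    """Run one BFS per node; unreachable pairs are simply absent."""
    return {u: bfs_distances(adj, u) for u in adj}

# Hypothetical toy graph: the path 0 - 1 - 2 - 3 plus the chord 1 - 3
toy_adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
toy_apsp = all_pairs_distances(toy_adj)
print(toy_apsp[0][3])  # 0 -> 1 -> 3, so this prints 2
```

For a weighted graph, the BFS would be replaced by Dijkstra's algorithm, as described above.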

Pruned Landmark Labeling

Pruned Landmark Labeling is an alternative to APSP. It computes distance labels by performing a pruned BFS from each node in the graph. Distance labels are then used to quickly compute shortest-path distances between node pairs. This algorithm only works for unweighted graphs.

[9]:
# Initialize the algorithm - in case of weighted graphs, edge weights are ignored
pll = nk.distance.PrunedLandmarkLabeling(G)
[10]:
# Run - this step computes the distance labels
pll.run()
[10]:
<networkit.distance.PrunedLandmarkLabeling at 0x7f6e2c6d0670>
[11]:
# Retrieve the shortest-path distance
print(pll.query(source, target))
2
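
The labeling idea can be sketched in a few lines of pure Python. This is a simplified illustration for undirected, unweighted graphs and makes no claim about how NetworKit implements it. The degree-based node ordering and all names (`build_labels`, `label_distance`, `toy_adj`) are assumptions made for this example.

```python
from collections import deque

INF = float("inf")

def label_distance(labels, s, t):
    """Smallest d(s, hub) + d(hub, t) over hubs shared by both labels."""
    common = labels[s].keys() & labels[t].keys()
    return min((labels[s][h] + labels[t][h] for h in common), default=INF)

def build_labels(adj):
    """Pruned BFS from every node, visiting high-degree nodes first."""
    labels = {u: {} for u in adj}  # node -> {hub: distance to hub}
    for hub in sorted(adj, key=lambda u: -len(adj[u])):
        dist = {hub: 0}
        queue = deque([hub])
        while queue:
            u = queue.popleft()
            # Prune: existing labels already certify a path this short
            if label_distance(labels, hub, u) <= dist[u]:
                continue
            labels[u][hub] = dist[u]
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return labels

# Hypothetical toy graph; node 5 is isolated
toy_adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3], 5: []}
toy_labels = build_labels(toy_adj)
print(label_distance(toy_labels, 0, 4))  # 0 -> 1 -> 3 -> 4, so this prints 3
```

Once the labels are built, a query only intersects two small dictionaries instead of traversing the graph, which is why queries are fast.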

Dynamic Pruned Landmark Labeling

Dynamic Pruned Landmark Labeling quickly updates distance labels after edge insertions.

[12]:
# Initialize the algorithm
dyn_pll = nk.distance.DynPrunedLandmarkLabeling(G)
[13]:
# Run - this step computes the distance labels
dyn_pll.run()
[13]:
<networkit.distance.DynPrunedLandmarkLabeling at 0x7f6df2a86fe0>
[14]:
# Pick two nodes
source, target = 1, 102
print(f"Distance between {source} and {target} before edge insertion: {dyn_pll.query(source, target)}")
Distance between 1 and 102 before edge insertion: 3
[15]:
# Shorten distance between the two nodes
G.addEdge(57, 102)
[15]:
True
[16]:
# Update distance labels
dyn_pll.update(nk.dynamics.GraphEvent(
    nk.dynamics.GraphEventType.EDGE_ADDITION,
    57,  # Source
    102, # Target
    1   # Weight
))
[16]:
<networkit.distance.DynPrunedLandmarkLabeling at 0x7f6df2a86fe0>
[17]:
# New distance between the two nodes
print(f"Distance between {source} and {target} after edge insertion: {dyn_pll.query(source, target)}")
Distance between 1 and 102 after edge insertion: 2
[18]:
# Remove the edge we added before
G.removeEdge(57, 102)
[18]:
<networkit.graph.Graph at 0x7f6df2a80fd0>

Some-Pairs Shortest-Paths (SPSP)

SPSP is an alternative to APSP. It computes the shortest-path distances from a set of user-specified source nodes to all the other nodes of the graph.

The constructor SPSP(G, sources) takes as input a graph and a list of source nodes.

[19]:
# Initialize the algorithm
sources = [0, 1, 2]

spsp = nk.distance.SPSP(G, sources)

# Run
spsp.run()

# Print the distances from the selected sources to the target
for source in sources:
    print("Distance from {:d} to {:d}: {:.3e}".format(source, target, spsp.getDistance(source, target)))
Distance from 0 to 102: 6.643e-04
Distance from 1 to 102: 1.223e-05
Distance from 2 to 102: 1.124e-04

A*

A* is an informed search algorithm: in addition to the path cost accumulated so far, it uses a heuristic to guide the search for the shortest path.

The AStar(G, heu, source, target, storePred=True) constructor expects a graph, a heuristic, and the source and target nodes as mandatory parameters. If storePred is true, the algorithm also stores the predecessors and reconstructs a shortest path from the source to the target. heu is a list of lower bounds of the distance of each node to the target.

As we do not have any prior knowledge about the graph we choose all zeros as a heuristic because zero is always a lower bound of the distance between two nodes. In this case, the A* algorithm is equivalent to Dijkstra.

[20]:
# Initialize algorithm
heuristic = [0 for _ in range(G.upperNodeIdBound())]
astar = nk.distance.AStar(G, heuristic, source, target)
[21]:
# Run
astar.run()
[21]:
<networkit.distance.AStar at 0x7f6df2a71000>
[22]:
# The distance from source to target node
print(astar.getDistance())
# The path from source to target node
print(astar.getPath())
0.00011239908770000002
[46, 105]
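
The role of the heuristic is easier to see in a small pure-Python sketch. It is not NetworKit's implementation: the function `a_star`, the weighted toy graph, and the heuristic values are all made up for illustration, and the heuristic is assumed to never overestimate the remaining distance. Unlike the getPath output above, this sketch returns the full path including both endpoints.

```python
import heapq

def a_star(adj, heuristic, source, target):
    """Return (distance, path); heuristic[u] is a lower bound on the
    distance from u to target."""
    best = {source: 0.0}
    pred = {}
    heap = [(heuristic[source], 0.0, source)]  # (g + h, g, node)
    while heap:
        _, g, u = heapq.heappop(heap)
        if g > best[u]:
            continue  # stale entry, a shorter route to u was found later
        if u == target:
            path = [u]
            while path[-1] != source:
                path.append(pred[path[-1]])
            return g, path[::-1]
        for v, w in adj[u]:
            cand = g + w
            if cand < best.get(v, float("inf")):
                best[v] = cand
                pred[v] = u
                heapq.heappush(heap, (cand + heuristic[v], cand, v))
    return float("inf"), []

# Hypothetical weighted toy graph: node -> [(neighbor, weight), ...]
toy_adj = {
    0: [(1, 1.0), (2, 0.5)],
    1: [(0, 1.0), (3, 1.0)],
    2: [(0, 0.5), (3, 2.0)],
    3: [(1, 1.0), (2, 2.0)],
}
lower_bounds = [1.5, 1.0, 2.0, 0.0]  # assumed lower bounds to node 3
zeros = [0.0, 0.0, 0.0, 0.0]         # zero heuristic, behaves like Dijkstra

print(a_star(toy_adj, lower_bounds, 0, 3))
print(a_star(toy_adj, zeros, 0, 3))
```

Both calls find the same shortest path. A tighter heuristic only changes how many nodes are explored before the target is reached.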

Breadth-First Search (BFS)

BFS is an algorithm for traversing a graph. It starts from the source node u and explores all of u’s neighbor nodes at the present depth before moving on to the nodes at the next depth level. BFS finds the shortest paths from a source to all the reachable nodes of an unweighted graph.

The BFS(G, source, storePaths=True, storeNodesSortedByDistance=False, target=None) constructor expects a graph and a source node as mandatory parameters. If the paths should be stored, set storePaths to True. If storeNodesSortedByDistance is set, a vector of nodes ordered in increasing distance from the source is stored. target is the target node.

[23]:
# Initialize algorithm
bfs = nk.distance.BFS(G, source, True, False, target)
[24]:
# Run
bfs.run()
[24]:
<networkit.distance.BFS at 0x7f6df2a87a60>
[25]:
# The distance from source to target node
print(bfs.distance(target))
# The number of shortest paths between the source and target nodes
print(bfs.numberOfPaths(target))
# Returns a shortest path from source to target
print(bfs.getPath(target))
2.0
4.0
[2, 79, 102]
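
The level-by-level exploration also explains how the number of shortest paths can be counted along the way. The following pure-Python sketch illustrates this on a hypothetical toy graph; the function name `bfs_count_paths` is made up for this example and the code is not NetworKit's implementation.

```python
from collections import deque

def bfs_count_paths(adj, source):
    """Return (dist, count): hop distance and number of shortest paths
    from source to every reachable node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                # First time v is seen: it lies one level below u
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                # Every shortest path to u extends to a shortest path to v
                count[v] += count[u]
    return dist, count

# Hypothetical toy graph: a diamond 0 - {1, 2} - 3 with a tail 3 - 4
toy_adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
toy_dist, toy_count = bfs_count_paths(toy_adj, 0)
print(toy_dist[4], toy_count[4])  # distance 3, reached by 2 shortest paths
```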

Bidirectional BFS

The Bidirectional BFS algorithm explores the graph from both the source and target nodes until the two explorations meet. This version of BFS is more efficient than plain BFS when the target node is known.

The BidirectionalBFS(G, source, target, storePred=True) constructor expects a graph and the source and target nodes as mandatory parameters. If storePred is true, the algorithm also stores the predecessors and reconstructs a shortest path from the source to the target.

[26]:
# Initialize algorithm
biBFS = nk.distance.BidirectionalBFS(G, source, target)
[27]:
# Run
biBFS.run()
[27]:
<networkit.distance.BidirectionalBFS at 0x7f6df2aaaca0>

Unlike BFS, the getPath method does not include the source at the beginning and the target at the end of the returned list.

[28]:
# The path from source to target node (without the endpoints) and its length
print(biBFS.getPath())
print(len(biBFS.getPath()))
[79]
1
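
A minimal pure-Python sketch of the two searches meeting in the middle is shown below. It assumes an undirected, unweighted graph given as a hypothetical adjacency dictionary, only returns the hop distance, and always expands the side that has visited fewer nodes. That balancing rule is a choice made for this sketch, not a statement about NetworKit's internals.

```python
def bidirectional_bfs(adj, source, target):
    """Hop distance between source and target, or inf if disconnected."""
    if source == target:
        return 0
    dist = [{source: 0}, {target: 0}]   # visited nodes of each search
    frontier = [[source], [target]]     # current BFS level of each search
    while frontier[0] and frontier[1]:
        # Expand one full level on the side that has visited fewer nodes
        side = 0 if len(dist[0]) <= len(dist[1]) else 1
        mine, other = dist[side], dist[1 - side]
        next_level = []
        for u in frontier[side]:
            for v in adj[u]:
                if v in other:
                    # The two searches meet on the edge (u, v)
                    return mine[u] + 1 + other[v]
                if v not in mine:
                    mine[v] = mine[u] + 1
                    next_level.append(v)
        frontier[side] = next_level
    return float("inf")

# Hypothetical toy graph: the path 0 - 1 - 2 - 3 - 4 - 5 and an isolated node 6
toy_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4], 6: []}
print(bidirectional_bfs(toy_adj, 0, 5))  # prints 5
```

Because each side only explores about half of the distance, the two searches together usually touch far fewer nodes than a single BFS from the source.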

Dijkstra

Dijkstra’s algorithm finds the shortest path from a source node to a target node. It builds a tree of shortest paths from the source to all other nodes in the graph, and thus finds the shortest paths from the source to all the reachable nodes of a weighted graph.

The Dijkstra(G, source, storePaths=True, storeNodesSortedByDistance=False, target=None) constructor expects a graph and a source node as mandatory parameters. If the paths should be stored, set storePaths to True. If storeNodesSortedByDistance is set, a vector of nodes ordered in increasing distance from the source is stored. target is the target node.

[29]:
# Initialize algorithm
dijkstra = nk.distance.Dijkstra(G, source, True, False, target)
[30]:
# Run
dijkstra.run()
[30]:
<networkit.distance.Dijkstra at 0x7f6df2a87df0>
[31]:
# The distance from source to target node
print(dijkstra.distance(target))
# The number of shortest paths between the source and target nodes
print(dijkstra.numberOfPaths(target))
# Returns a shortest path from source to target
print(dijkstra.getPath(target))
0.00011239908770000002
1.0
[2, 46, 105, 102]
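
The shortest-path tree mentioned above can be made explicit with a small pure-Python sketch based on a binary heap. The weighted toy graph and the helper names (`dijkstra`, `path_to`) are hypothetical, and non-negative edge weights are assumed; this is an illustration, not NetworKit's code.

```python
import heapq

def dijkstra(adj, source):
    """Return (dist, pred): distances from source and the predecessor of
    each reached node in the shortest-path tree."""
    dist = {source: 0.0}
    pred = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry, u was already settled with a shorter distance
        for v, w in adj[u]:
            cand = d + w
            if cand < dist.get(v, float("inf")):
                dist[v] = cand
                pred[v] = u
                heapq.heappush(heap, (cand, v))
    return dist, pred

def path_to(pred, source, target):
    """Walk the tree backwards from target to source."""
    path = [target]
    while path[-1] != source:
        path.append(pred[path[-1]])
    return path[::-1]

# Hypothetical weighted toy graph: node -> [(neighbor, weight), ...]
toy_adj = {
    0: [(1, 4.0), (2, 1.0)],
    1: [(0, 4.0), (2, 2.0), (3, 5.0)],
    2: [(0, 1.0), (1, 2.0)],
    3: [(1, 5.0)],
}
toy_dist, toy_pred = dijkstra(toy_adj, 0)
print(toy_dist[3], path_to(toy_pred, 0, 3))  # 8.0 [0, 2, 1, 3]
```

The direct edge 0 - 1 has weight 4, but the detour through node 2 costs only 3, so the tree routes node 1 (and therefore node 3) through node 2.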

Bidirectional Dijkstra

The Bidirectional Dijkstra algorithm explores the graph from both the source and target nodes until the two explorations meet. This version of Dijkstra is more efficient than the conventional Dijkstra when the target node is known.

The BidirectionalDijkstra(G, source, target, storePred=True) constructor expects a graph and the source and target nodes as mandatory parameters. If storePred is true, the algorithm also stores the predecessors and reconstructs a shortest path from the source to the target.

[32]:
# Initialize algorithm
biDij = nk.distance.BidirectionalDijkstra(G, source, target)
[33]:
# Run
biDij.run()
[33]:
<networkit.distance.BidirectionalDijkstra at 0x7f6df2aa9f30>

Unlike Dijkstra, the getPath method does not include the source at the beginning and the target at the end of the returned list.

[34]:
# The distance from source to target node
print(biDij.getDistance())
# The path from source to target node
print(biDij.getPath())
0.00011239908770000002
[46, 105]

Commute Time Distance

This class computes the Euclidean Commute Time Distance between each pair of nodes for an undirected unweighted graph.

The CommuteTimeDistance(G, tol=0.1) constructor expects a graph as a mandatory parameter. The optional parameter tol is the tolerance parameter used for approximation.

[35]:
# Initialize algorithm
ctd = nk.distance.CommuteTimeDistance(G)
[36]:
# Run
ctd.run()
[36]:
<networkit.distance.CommuteTimeDistance at 0x7f6df2abc400>
[37]:
# The distance from source to target node
print(ctd.distance(source, target))
465.3162793062133

To compute the commute time distance between a single pair of nodes, use the runSinglePair(u, v) method.

[38]:
ctd.runSinglePair(source,target)
[38]:
465.3162793062133

Diameter

This algorithm computes or estimates the diameter of a given graph. It is based on the ExactSumSweep algorithm presented by Michele Borassi, Pierluigi Crescenzi, Michel Habib, Walter A. Kosters, Andrea Marino, Frank W. Takes: http://www.sciencedirect.com/science/article/pii/S0304397515001644.

The Diameter(G, algo=DiameterAlgo.AUTOMATIC, error=1.0, nSamples=0) constructor expects a graph as mandatory parameter. algo specifies the choice of diameter algorithm, error is the maximum allowed relative error (set it to 0 for the exact diameter), and nSamples is the number of samples to be used. algo can be one of: 0 (automatic), 1 (exact), 2 (estimatedRange), 3 (estimatedSamples), 4 (estimatedPedantic).

Note that the input graph must be connected, otherwise the resulting diameter will be infinite. As the graph we are using is not connected, we shall extract the largest connected component from it and then compute the diameter of the resulting graph.

[39]:
# Extract largest connected component
newGraph = nk.components.ConnectedComponents.extractLargestConnectedComponent(G, True)
newGraph.numberOfNodes()
[39]:
128
[40]:
# Initialize algorithm to compute the exact diameter of the input graph
diam = nk.distance.Diameter(newGraph,algo=1)
[41]:
# Run
diam.run()
[41]:
<networkit.distance.Diameter at 0x7f6df2abc760>
[42]:
# Get diameter of graph
diam.getDiameter()
[42]:
(72, 0)

The return value of getDiameter is a pair of integers: the lower bound and the upper bound of the diameter. When the exact diameter is computed, it is the first value of the pair.

Eccentricity

The eccentricity of a node u is defined as the distance to the farthest node from node u. In other words, it is the longest shortest-path starting from node u.

The eccentricity of a node can be computed by calling the getValue(G, v) method and passing a graph and a node. The method returns the node farthest from v and the length of the shortest path between v and that node.

[43]:
# Run
nk.distance.Eccentricity.getValue(G, source)
[43]:
(123, 3)
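
As an illustration of the definition, the eccentricity of a node in an unweighted graph can be read off a single BFS: it is the depth of the last level reached. The sketch below is pure Python on a hypothetical toy graph and only considers nodes reachable from the start node; the function name `eccentricity` is made up for this example.

```python
from collections import deque

def eccentricity(adj, source):
    """Return (farthest node, hop distance) among nodes reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    farthest = source
    while queue:
        u = queue.popleft()
        if dist[u] > dist[farthest]:
            farthest = u
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return farthest, dist[farthest]

# Hypothetical toy graph: the path 0 - 1 - 2 - 3 with an extra leaf 4 attached to 1
toy_adj = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
print(eccentricity(toy_adj, 0))  # (3, 3): node 3 is three hops away from node 0
```

When several nodes are equally far away, this sketch reports the first one that the BFS encounters.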

Effective Diameter

The effective diameter is defined as the average number of edges needed to reach a given ratio of all other nodes.

The EffectiveDiameter(G, ratio=0.9) constructor expects an undirected graph and the ratio of nodes that should be connected. The ratio must be in the interval (0,1].

[44]:
# Initialize algorithm
ed = nk.distance.EffectiveDiameter(G)
[45]:
# Run
ed.run()
[45]:
<networkit.distance.EffectiveDiameter at 0x7f6df2abccd0>
[46]:
# Get effective diameter
ed.getEffectiveDiameter()
[46]:
2.0546875
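
The definition can be spelled out with a brute-force pure-Python sketch. It rests on several assumptions that are not taken from NetworKit: the graph is connected and unweighted, a node counts itself as reached at distance 0, and the number of nodes to reach is rounded up to ceil(ratio * n). The function name and the toy graph are hypothetical.

```python
import math
from collections import deque

def effective_diameter(adj, ratio=0.9):
    """Average, over all start nodes, of the smallest number of hops
    within which at least ceil(ratio * n) nodes are reached."""
    n = len(adj)
    needed = math.ceil(ratio * n)
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < needed:
            raise ValueError("graph is not connected enough for this ratio")
        # After sorting, position needed - 1 holds the required hop count
        total += sorted(dist.values())[needed - 1]
    return total / n

# Hypothetical toy graph: the path 0 - 1 - 2 - 3 - 4
toy_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(effective_diameter(toy_adj, ratio=0.9))
print(effective_diameter(toy_adj, ratio=0.5))
```

With ratio 0.9 every node of this path has to reach all five nodes, so the result equals the average eccentricity. A smaller ratio ignores the most distant nodes and yields a smaller value.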

Effective Diameter Approximation

This class approximates the effective diameter according to the algorithm presented in the paper “A Fast and Scalable Tool for Data Mining in Massive Graphs” by Palmer, Gibbons and Faloutsos.

The EffectiveDiameterApproximation(G, ratio=0.9, k=64, r=7) constructor expects an undirected graph, the ratio of nodes that should be connected, the number of parallel approximations k to get a more robust result, and the number of bits r that should be added to the bitmask. The more bits are added to the bitmask, the higher the accuracy. The ratio must be in the interval (0,1].

[47]:
# Initialize algorithm
eda = nk.distance.EffectiveDiameterApproximation(G)
[48]:
# Run
eda.run()
[48]:
<networkit.distance.EffectiveDiameterApproximation at 0x7f6df2abcc70>
[49]:
# Get effective diameter
eda.getEffectiveDiameter()
[49]:
2.015625

Reverse BFS

This class does a reverse breadth-first search (following the incoming edges of a node) on a directed graph from a given source node.

The ReverseBFS(G, source, storePaths=True, storeNodesSortedByDistance=False, target=None) constructor expects a graph and a source node as mandatory parameters. If the paths should be stored, set storePaths to True. If storeNodesSortedByDistance is set, a vector of nodes ordered in increasing distance from the source is stored. target is the target node.

[50]:
# Initialize algorithm
rbfs = nk.distance.ReverseBFS(G, source, True, False, target)
[51]:
# Run
rbfs.run()
[51]:
<networkit.distance.ReverseBFS at 0x7f6df2abcf10>
[52]:
# The distance from source to target node
print(rbfs.distance(target))
# The number of shortest paths between source and target
print(rbfs.numberOfPaths(target))
2.0
4.0
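
On a directed graph, following incoming edges means that the search answers the question "from which nodes can the source be reached, and in how many hops?". The pure-Python sketch below illustrates this on a hypothetical directed toy graph given as out-neighbor lists; the name `reverse_bfs` is made up for this example and the code is not NetworKit's implementation.

```python
from collections import deque

def reverse_bfs(out_adj, source):
    """dist[v] is the length of a shortest directed path from v to source."""
    # Build in-neighbor lists by flipping every edge u -> v
    in_adj = {u: [] for u in out_adj}
    for u, targets in out_adj.items():
        for v in targets:
            in_adj[v].append(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in in_adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical directed toy graph: 0 -> 1 -> 2 and 3 -> 1
toy_out_adj = {0: [1], 1: [2], 2: [], 3: [1]}
print(reverse_bfs(toy_out_adj, 2))  # nodes 0 and 3 both reach node 2 in two hops
print(reverse_bfs(toy_out_adj, 0))  # no edge enters node 0, so only node 0 itself
```

On an undirected graph every edge can be followed in both directions, so a reverse BFS visits the same nodes at the same distances as a regular BFS.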