Group Centrality Tutorial

This notebook covers the group centrality algorithms implemented in NetworKit. Group centrality measures the relative importance of a group of nodes within a graph.

The documentation of NetworKit group centrality algorithms is available in the centrality module.

[1]:
import networkit as nk
[2]:
# Read graph
G = nk.readGraph("../input/karate.graph", nk.Format.METIS)

Group Betweenness

Group betweenness measures the importance of a group \(S\) as the fraction of shortest paths connecting pairs of non-group members that pass through \(S\).

The constructor ApproxGroupBetweenness(G, groupSize, epsilon) expects as inputs an undirected graph, the desired group size groupSize, and the accuracy of the approximation epsilon.

[3]:
# Initalize algorithm
btwn = nk.centrality.ApproxGroupBetweenness(G, 10, 0.1)
[4]:
# Run
btwn.run()
[4]:
<networkit.centrality.ApproxGroupBetweenness at 0x7fc56b5d5f30>

After running the algorithm, we can extract some information about the group betweenness centrality, e.g. the group with the highest approximate betweenness centrality score.

[5]:
# Group with highest approximate betweenness centrality
btwn.groupMaxBetweenness()
[5]:
[0, 33, 32, 2, 1, 31, 23, 27, 13, 8]

One may also have a group of nodes and be interested in the group’s betweeness centrality score. The function `scoreOfGroup(group, normalized=False) <>`__ returns the betweenness centrality score of the list of nodes in group.

[6]:
# Create an arbitrary group of nodes
group = [7, 8, 11, 20]

# Get betweenness centrality score of group
btwn.scoreOfGroup(group)
[6]:
59

Group Closeness

GroupCloseness measures the importance of a group \(S\) as the inverse of the sum of the distances from all the nodes outside \(S\) to their nearest node in \(S\).

In NetworKit, GroupCloseness implements an heuristic greedy algorithm to compute a group of nodes with high group closeness centrality. The algorithm was introduced in “Scaling up Group Closeness Maximization” Bergamini et al., ALENEX 2018.

The constructor GroupCloseness(G, k=1, H=0) expects an unweighted graph and the desired group size k. If H > 0, all BFSs will explore the graph up to distance H. This can speed up the algorithm, at the cost of a lower solution quality.

[7]:
# Initalize algorithm
close = nk.centrality.GroupCloseness(G, 10)
[8]:
# Run
close.run()
[8]:
<networkit.centrality.GroupCloseness at 0x7fc56b536d10>

groupMaxCloseness() returns the group of k nodes computed by the algorithm.

[9]:
# Computed group of nodes
close.groupMaxCloseness()
[9]:
[0, 33, 31, 16, 5, 30, 29, 28, 27, 26]

scoreOfGroup(group) takes a group of nodes as input, and returns the group’s closeness centrality score. The score is returned as a value between 0 and 1: a score of 1 indicates maximum group closeness centrality.

[10]:
# Create an arbitrary group of nodes
group = [7, 8, 11, 20]

# Get closeness centrality score of group
close.scoreOfGroup(group)
[10]:
0.5357142857142857

Local Search for Group Closeness

A faster local search algorithm for group closeness is the Grow-Shrink algorithm introduced in Local Search for Group Closeness Maximization on Big Graphs, Angriman et al., IEEE BigData 2019.

The algorithm starts from an arbitrary group of vertices and performs local improvements until a local optimum is reached. You need to pass as input a graph and an initial set of nodes. Further, you can specify whether or not to use the extended version of the algorithm (use the extended parameter) and how many insertions/removals to perform at each iteration (insertion parameter). See the paper for further details about those parameters.

[11]:
def closenessOfGroup(group):
    distances = []
    nk.traversal.Traversal.BFSfrom(G, group, lambda u, dist: distances.append(dist))
    return 1. / sum(distances)
[12]:
# Create a random group of vertices
k, group = 5, set()
while (len(group) < k):
    group.add(nk.graphtools.randomNode(G))

print("Initial group", list(group), "has score {:.3f}".format(closenessOfGroup(group)))
Initial group [32, 3, 6, 13, 18] has score 0.025
[13]:
gs = nk.centrality.GroupClosenessGrowShrink(G, group).run()
group = gs.groupMaxCloseness()
print("Final group", group, "has score {:.3f}".format(closenessOfGroup(group)))
Final group [4, 23, 3, 6, 32] has score 0.027

The LS-Restrict algorithm for group closeness only allows swaps between vertices in the group and their neighbors outside. Because it considers less candidates for a swap it is faster than Grow-Shrink, but might yield results with lower quality.

[14]:
# Create a random group of vertices
k, group = 5, set()
while (len(group) < k):
    group.add(nk.graphtools.randomNode(G))

print("Initial group", list(group), "has score {:.3f}".format(closenessOfGroup(group)))
Initial group [10, 18, 19, 28, 29] has score 0.021
[15]:
gs = nk.centrality.GroupClosenessLocalSwaps(G, group).run()
group = gs.groupMaxCloseness()
print("Final group", group, "has score {:.3f}".format(closenessOfGroup(group)))
Final group [0, 29, 19, 28, 18] has score 0.026
[16]:
# Create a random group of vertices
k, group = 5, set()
while (len(group) < k):
    group.add(nk.graphtools.randomNode(G))

print("Initial group", list(group), "has score {:.3f}".format(close.scoreOfGroup(group)))
Initial group [15, 20, 26, 28, 29] has score 0.460
[17]:
gs = nk.centrality.GroupClosenessLocalSearch(G, group).run()
group = gs.groupMaxCloseness()
print("Final group", group, "has score {:.3f}".format(close.scoreOfGroup(group)))
Final group [33, 0, 25, 5, 10] has score 1.000

GroupClosenessLocalSearch evaluates all possible single-swaps between vertices in the group and the vertices outside. Its time complexity is \(O(k \cdot (n - k))\), compared to the aforementioned algorithms for group closeness it is slower, but it yields groups with higher group closeness score.

Group Harmonic Closeness

Group Harmonic Closeness interprets the importance of a group \(S\) as the harmonic sum of the distances from the nodes outside \(S\) to their nearest node in \(S\).

In NetworKit, GroupHarmonicCloseness implements an heuristic approximation algorithm for group harmonic closeness centrality. The algorithm was introduced in “Group-Harmonic and Group-Closeness Maximization – Approximation and Engineering” Angriman et al., ALENEX 2021.

[18]:
# Create a random group of vertices
k, group = 5, set()
while (len(group) < k):
    group.add(nk.graphtools.randomNode(G))

print("Initial group", list(group), "has score {:.3f}".format(close.scoreOfGroup(group)))
Initial group [33, 4, 13, 23, 24] has score 0.806
[19]:
hClose = nk.centrality.GroupHarmonicCloseness(G, 10).run()
hClose.groupMaxHarmonicCloseness()
[19]:
[33, 0, 24, 16, 5, 29, 28, 27, 26, 25]

Group Degree

Group Degree measures the importance of a group of nodes \(S\) as the number of non group members that can be reached from \(S\) in one hop. The algorithm implemented in NetworKit is a greedy algorithm that find an approximation of the group with the highest group degree centrality.

The constructor GroupDegree(G, k = 1, countGroupNodes = True) expects a graph, the desired group size k, and a boolean countGroupNodes stating if nodes inside the group should be counted in the centrality score. By default, the nodes are included in the centrality score. Note that, in order to have a fixed \((1 - 1/e)\) approximation guarantee, countGroupNodes must be set to False.

[20]:
# Initalize algorithm
degree = nk.centrality.GroupDegree(G, 10)
[21]:
# Run
degree.run()
[21]:
<networkit.centrality.GroupDegree at 0x7fc56b537400>

groupMaxDegree() returns the group of k nodes computed by the algorithm.

[22]:
# Group with high degree centrality
degree.groupMaxDegree()
[22]:
[33, 0, 31, 6, 16, 5, 27, 25, 24, 23]

scoreOfGroup(group) takes a group of nodes as input parameter, and returns the group’s degree centrality score.

[23]:
# Create an arbitrary group of nodes
group = [7, 8, 11, 20]

# Get degree centrality score of group
degree.scoreOfGroup(group)
[23]:
11