The index sums up the reciprocals of the logarithm of the degree of all common neighbors of u and v.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

AdjustedRandIndex proposed by Hoffman et al. with natural threshold of 0.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

class networkit.linkprediction.AlgebraicDistanceIndex(G, numberSystems, numberIterations, omega=0.5, norm=2)

Algebraic distance assigns a distance value to pairs of nodes according to their structural closeness in the graph.

Parameters
• G (networkit.Graph) – The graph to work on. Can be set to None and default is None.

• numberSystems (int) – Number of vectors/systems used for algebraic iteration.

• numberIterations (int) – Number of iterations in each system.

• omega (float, optional) – Overrelaxation parameter. Default: 0.5

• norm (int, optional) – The norm factor of the extended algebraic distance. Maximum norm is realized by setting the norm to 0. Default: 2

preprocess()

Executes necessary initializations.

Starting with random initialization, compute for all numberSystems “diffusion” systems the situation after numberIterations iterations of overrelaxation with overrelaxation parameter omega.

Notes

Needs to be called before algdist delivers meaningful results!

The CommonNeighborsIndex calculates the number of common neighbors of a node-pair in a given graph.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

Bases: `object`

Abstract base class for evaluation curves.

The evualation curves are generated based on the predictions calculated by the link predictor and a testGraph to compare against.

Parameters

testGraph (networkit.Graph) – Graph containing the links to use for evaluation. Can be set to None and default is None.

getAreaUnderCurve(curve=([], []))

Returns the area under the most recently calculated or optionally the given curve by using the trapezoidal rule.

Note that if there is no curve specified or the lists of the given curves are empty than the area under the most recently calculated curve will be returned.

Parameters

curve (tuple(list(float), list(float)), optional) – Curve whose AUC to determine. Default: Empty lists.

Returns

The area under the given curve.

Return type

float

getCurve(predictions, numThresholds=1000)

Returns a pair of X- and Y-lists describing the evaluation curve generated from the given predictions.

The latest y-value will be used as a tie-breaker in case there are multiple y-values for one x-value. Note that the given number of thresholds is an upper bound for the number of points returned. This is due to the fact that multiple y-values can map to one x-value in which case the tie-breaking behaviour described above will intervene.

Parameters
• predictions (list(tuple(tuple(int, int), float))) – Predictions to evaluate.

• numThresholds (int, optional) – The number of thresholds to use the metric on. Default: 1000

Returns

A pair of lists where the first list contains all x-values and the second one contains the corresponding y-value.

Return type

tuple(list(float), list(float))

setTestGraph(newTestGraph)

Sets a new graph to use as ground truth for evaluation.

Note that this won’t reset the most recently calculated curve and as a consequence getAreaUnderCurve() will still behave as expected by returning the AUC of the most recent curve.

Parameters

newTestGraph (networkit.Graph) – New graph to use as ground truth.

Implementation of the Jaccard index which normalizes the Common Neighbors Index.

This is done through dividing the number of common neighbors by the number of nodes in the neighboorhood-union.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

class networkit.linkprediction.KatzIndex(G=None, maxPathLength=5, dampingValue=0.005)

Implementation of the Katz index.

Katz index assigns a pair of nodes a similarity score that is based on the sum of the weighted number of paths of length l where l is smaller than a given limit.

Parameters
• G (networkit.Graph, optional) – The graph to operate on. Default: None

• maxPathLength (int, optional) – Maximal length of the paths to consider. Default: 5

• dampingValue (float, optional) – Used to exponentially damp every addend of the sum. Should be in (0, 1]. Default: 0.005

Bases: `object`

Abstract base class for link predictors.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None.

run(u, v)

Returns a score indicating the likelihood of a future link between the given nodes.

Prior to calling this method a graph should be provided through the constructor or by calling setGraph. Note that only undirected graphs are accepted. There is also no lower or upper bound for scores and the actual range of values depends on the specific link predictor implementation. In case u == v a 0 is returned. If suitable this method might make use of parallelization to enhance performance.

Parameters
• u (int) – First node in graph.

• v (int) – Second node in graph.

Returns

A prediction-score indicating the likelihood of a future link between the given nodes.

Return type

float

runAll()

Runs the link predictor on all currently unconnected node-pairs.

Possible self-loops are also excluded. The method makes use of parallelisation.

Returns

A list of pairs containing all currently unconnected node-pairs as the first elements and the corresponding scores as the second elements. The list is sorted ascendingly by node-pair.

Return type

list(tuple(int, int))

runOn(nodePairs)

Executes the run-method on al given node-pairs and returns a list of predictions.

The result is a list of pairs where the first element is the node-pair and it’s second element the corresponding score generated by the run-method. The method makes use of parallelisation.

Parameters

nodePairs (list(tuple(int, int))) – Node-pairs to run the predictor on.

Returns

A list of pairs containing the given node-pair as the first element and it’s corresponding score as the second element. The list is sorted ascendingly by node-pair.

Return type

list(tuple(int, float))

setGraph(newGraph)

Sets the graph to work on.

Parameters

newGraph (networkit.Graph) – The graph to work on.

Bases: `object`

Filters given predictions based on some criterion and returns a list of node-pairs that fulfill the given criterion.

This can be used to determine which node-pairs should actually be interpreted as future links and which shouldn’t.

Returns the first numLinks highest scored node-pairs.

Parameters
• predictions (list(tuple(tuple(int, int), float))) – Predictions to filter.

• numLinks (int) – Number of top-scored node-pairs to return.

Returns

The first numLinks highest scored node-pairs.

Return type

list(tuple(int, int))

Returns the first percentageLinks percent of the highest scores node-pairs.

Parameters
• predictions (list(tuple(tuple(int, int), float))) – Predictions to filter.

• percentageLinks (float) – Percentage of highest scored node-pairs to return.

Returns

The first percentageLinks percent of the highest scores node-pairs.

Return type

list(tuple(int, int))

static byScore(predictions, minScore)

Returns the node-pairs whose scores are at least equal to the given minScore.

Parameters
• predictions (list(tuple(tuple(int, int), float))) – Predictions to filter.

• minScore (float) – Minimal score that the returned node-pairs should have.

Returns

A list of node-pairs whose scores are at least equal to the given minScore.

Return type

list(tuple(int, int))

Bases: `object`

Allows the user to find missing links in the given graph.

The absent links to find are narrowed down by providing a distance that the nodes of the missing links should have. For example in case of distance 2 only node-pairs that would close a triangle in the given graph get returned.

Parameters

G (networkit.Graph) – The graph to find missing links in.

findAtDistance(k)

Returns all missing links in the graph that have distance k.

Note that a distance of k actually means that there are k different links on the path of the two nodes that are connected through that path.

Parameters

k (int) – Distance of the absent links.

Returns

An ascendingly sorted list of node-pairs where there is a missing link of distance k between the two nodes.

Return type

list(tuple(int, int))

findFromNode(u, k)

Returns all missing links in the graph that have distance k and are connected to u.

Note that a distance of k actually means that there are k different links on the path of the two nodes that are connected through that path.

Parameters
• u (int) – Node to find missing links from.

• k (int) – Distance of the absent links.

Returns

A list of node-pairs where there is a missing link of distance k between the given node u and another node in the graph.

Return type

list(tuple(int, int))

Assigns a distance value to pairs of nodes according to the overlap of their neighborhoods.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

Bases: `object`

Provides basic operations on neighborhoods in a given graph.

static getCommonNeighbors(G, u, v)

Returns a list containing the node-ids of all common neighbors of u and v.

Parameters
• G (networkit.Graph) – Graph to obtain common neighbors from.

• u (int) – First node.

• v (int) – Second node.

Returns

A list containing the node-ids of all common neighbors of u and v.

Return type

list(int)

static getNeighborsUnion(G, u, v)

Returns the union of the neighboorhoods of u and v.

Parameters
• G (networkit.Graph) – Graph to obtain neighbors-union from.

• u (int) – The first node.

• v (int) – The second node.

Returns

A list containing all the nodes in the neighboorhood-union of u and v.

Return type

list(int)

Implementation of the Neighbors Measure Index.

This index is also known as Friends Measure and simply returns the number of connections between neighbors of the given nodes u and v.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

Provides points that define the Precision-Recall curve for a given set of predictions.

Based on the generated points the area under the curve can be calculated with the trapzoidal rule.

Bases: `object`

Allows the sorting of predictions by score or node-pair.

static sortByNodePair(predictions)

Sorts the predictions ascendingly by node-pair.

This means for example (0, 0) < (0, 1) and (1, 1) < (1, 0).

Parameters

predictions (list(tuple(tuple(int, int), float))) – The predictions to sort.

static sortByScore(predictions)

Sorts the given predictions descendingly by score.

In case there is a tie the node-pairs are used as a tie-breaker by sorting them ascendingly on the first node and on a tie ascendingly by the second node.

Parameters

predictions (list(tuple(tuple(int, int), float))) – The predictions to sort.

Implementation of the Preferential Attachment Index.

The run-method simply calculates the product of the number of nodes in the neighborhoods regarding the given nodes.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

Provides points that define the Receiver Operating Characteristic curve for a given set of predictions.

Based on the generated points the area under the curve can be calculated with the trapzoidal rule.

Parameters

testGraph (networkit.Graph, optional) – Graph containing the links to use for evaluation. Default: None

Bases: `object`

Provides methods to randomly sample a number of edges from a given graph.

Returns a graph that contains numLinks links from the given graph G.

The links are randomly selected from G until the given count is reached.

Parameters
• G (networkit.Graph) – The graph to construct the training graph from.

• numLinks (int) – Number of links the returned graph should consist of.

Returns

A graph that contains the given number of links from G.

Return type

networkit.Graph

static byPercentage(G, percentage)

Returns a graph that contains percentage percent of links form the given graph G.

The links are randomly selected from G until the given percentage is reached.

Parameters
• G (networkit.Graph) – The graph to construct the training graph from.

• percentage (float) – Percentage of links regarding the number of links in the given graph that should be in the returned graph.

Returns

A graph that contains the given percentage of links from G.

Return type

networkit.Graph

Implementation of the ResourceAllocationIndex.

The index is similar to Adamic/Adar and sums up the reciprocals of the degree of all common neighbors of u and v.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

Index to determine whether two nodes are in the same community.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

Implementation of the Total Neighbors Index.

This index is also known as Total Friends Index and returns the number of nodes in the neighborhood-union of u and v.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

Index that simply returns the degree of the first given node.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

Index that simply returns the degree of the second given node.

Parameters

G (networkit.Graph, optional) – The graph to work on. Default: None

Returns a numpy-array containing the generated scores from the predictors for the given node-pairs.

Parameters
• nodePairs (list(tuple(int, int))) – Node-pairs to get the samples for.

• *linkPredictors (tuple()) – List of link predictors to use for sample-generation.

Returns

A numpy-array of shape (#nodePairs, #linkPredictors) containing the generated scores from the predictors for the given node-pairs.

Return type

numpy.array

Returns a numpy-array containing the labels of the given node-pairs.

The labels are defined as follows: 1 = link, 0 = absent link.

Parameters
• nodePairs (list(tuple(int, int))) – Node-pairs to get the labels for.

• G (networkit.Graph) – Graph which provides ground truth for the labels.

Returns

A numpy-array containing the labels of the given node-pairs.

Return type

numpy.array