NetworKit graph I/O tutorial

This notebook guides you through reading graphs from files and writing graphs to files using NetworKit. The following topics are covered:
- Reading a graph from a file and writing a graph to a file
- Using the different file formats supported by NetworKit
- Converting a graph to a specific format

[1]:
import networkit as nk

Reading a graph from a file

NetworKit supports several graph formats, which are listed in graphio.Format. Most formats support both reading and writing, but a few support only one of the two. If you read large graphs often, it is recommended to convert them to the NetworkitBinaryGraph format, as this is currently the fastest reader available in NetworKit. For information on how to convert graphs between formats in NetworKit, see the section on converting graphs.

SNAP file format

The (optional) first line of a file denotes the problem line p <0 or 1-indexed>.

The problem line is followed by a list of exactly m edges. The format is <u v w> for a weighted graph, and <u v> for an unweighted graph.
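
To make the edge-line layout described above concrete, here is a minimal pure-Python sketch that parses `<u v>` and `<u v w>` lines. It is not NetworKit's reader: the function name parse_edge_lines, the default weight of 1.0, and the choice to skip lines starting with '#' are assumptions made for illustration.

```python
def parse_edge_lines(lines):
    """Parse '<u v>' or '<u v w>' lines into (u, v, weight) tuples.

    Assumptions of this sketch: lines starting with '#' are comments,
    a line whose first field is 'p' is the optional problem line, and
    unweighted edges get a default weight of 1.0.
    """
    edges = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#") or fields[0] == "p":
            continue  # skip blank lines, comments and the problem line
        u, v = int(fields[0]), int(fields[1])
        w = float(fields[2]) if len(fields) > 2 else 1.0
        edges.append((u, v, w))
    return edges


sample = ["# toy graph", "0 1", "1 2 2.5", "", "2 0"]
edges = parse_edge_lines(sample)
print(edges)
```

Mixing weighted and unweighted lines in one file is done here only to show both layouts side by side.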

The SNAPGraphReader(directed = False, remapNodes = True, nodeCount = 0) constructor takes three optional arguments: directed is true if the graph is directed; remapNodes indicates whether node ids should be remapped so that the resulting node ids are consecutive; and nodeCount is the number of nodes in the graph, which is used to preallocate memory.
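
What remapping to consecutive node ids means can be sketched in plain Python. The helper remap_to_consecutive below is hypothetical, and assigning new ids in order of first appearance is an assumption of this sketch; the order NetworKit uses internally is not stated here.

```python
def remap_to_consecutive(edges):
    """Map arbitrary node ids to 0..n-1 in order of first appearance."""
    mapping = {}
    remapped = []
    for u, v in edges:
        for node in (u, v):
            if node not in mapping:
                mapping[node] = len(mapping)  # next unused consecutive id
        remapped.append((mapping[u], mapping[v]))
    return remapped, mapping


# Three edges whose node ids are far apart.
remapped, mapping = remap_to_consecutive([(30, 1412), (30, 3352), (3352, 1412)])
print(remapped)  # [(0, 1), (0, 2), (2, 1)]
print(mapping)   # {30: 0, 1412: 1, 3352: 2}
```

Without remapping, a graph with these three nodes would need at least 3353 node slots to cover the largest id, which is why remapping is useful for files with sparse ids.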

Reading a file in the SNAP format using the default constructor values can be done like this:

[2]:
G = nk.readGraph("../input/wiki-Vote.txt", nk.Format.SNAP)

You can now access the graph object via G. Alternatively, you can explicitly use the SNAPGraphReader class as follows:

[3]:
G = nk.graphio.SNAPGraphReader().read("../input/wiki-Vote.txt")

To pass other values to the SNAPGraphReader, create a SNAPGraphReader object with the desired arguments and then call its read method, as shown below.

[4]:
snapReader = nk.graphio.SNAPGraphReader(True, False, 7115)
G = snapReader.read("../input/wiki-Vote.txt")

If we want to write G to a file, we can use writeGraph(), and pass G, the path to the file the graph should be written to and the format to the method.

[5]:
import os

if not os.path.isdir('./output/'):
    os.makedirs('./output')
nk.writeGraph(G,"./output/wikiSNAP", nk.Format.SNAP)
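
The check-then-create pattern for the output directory recurs in the cells of this notebook. As a possible shortcut, os.makedirs accepts exist_ok=True, which makes the call safe to repeat. The helper name ensure_dir is hypothetical, and the temporary directory is used only so that the sketch leaves nothing behind.

```python
import os
import tempfile


def ensure_dir(path):
    """Create `path` (and any missing parents) if it does not exist yet."""
    os.makedirs(path, exist_ok=True)
    return path


# Demonstrated inside a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = ensure_dir(os.path.join(tmp, "output"))
    ensure_dir(out_dir)  # calling it a second time is harmless
    created = os.path.isdir(out_dir)
print(created)
```

In the notebook itself, a call such as ensure_dir('./output') before writeGraph would play the same role as the isdir check.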

EdgeList file format

The EdgeList file format is a simple format that stores one edge per line. The EdgeList file format has several variations, which differ in the character used to separate the two nodes of an edge and in the ID of the first node. The constructor EdgeListReader(separator, firstNode, commentPrefix = "#", continuous = True, directed = False) takes five parameters that dictate the exact format of the edge list. NetworKit provides five standard EdgeListReaders:

  1. EdgeListSpaceZero: the separator is a whitespace and the first node's ID is 0.

  2. EdgeListSpaceOne: the separator is a whitespace and the first node's ID is 1.

  3. EdgeListTabZero: the separator is a tab and the first node's ID is 0.

  4. EdgeListTabOne: the separator is a tab and the first node's ID is 1.

  5. EdgeListCommaOne: the separator is a comma and the first node's ID is 1.

Reading can be done in the same way as in the previous example. You can specify a different format for the EdgeListReader by calling its constructor and passing the desired values to it. Assuming we want to use '$' as the separator, the first node's ID is 0, and comments are prefixed by a semicolon, we can do the following:

[6]:
# Specify separator, firstNode and commentPrefix for the EdgeListReader
edgeListReader = nk.graphio.EdgeListReader('$', 0, ';')
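
To illustrate what the separator, firstNode and commentPrefix parameters control, the following plain-Python sketch parses two edge-list variants. It is not the NetworKit implementation: read_edge_list is a hypothetical helper, and shifting ids so that the first node becomes 0 is an assumption of this sketch.

```python
def read_edge_list(lines, separator, first_node, comment_prefix="#"):
    """Parse edge-list lines into zero-based (u, v) pairs."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(comment_prefix):
            continue  # skip blank lines and comments
        u, v = line.split(separator)[:2]
        # Shift ids so that `first_node` maps to 0.
        edges.append((int(u) - first_node, int(v) - first_node))
    return edges


dollar_lines = ["; toy graph", "0$1", "1$2", "2$0"]  # '$' separator, first node 0
tab_one_lines = ["1\t2", "2\t3", "3\t1"]             # tab separator, first node 1
dollar_edges = read_edge_list(dollar_lines, "$", 0, ";")
tab_one_edges = read_edge_list(tab_one_lines, "\t", 1)
print(dollar_edges)
print(tab_one_edges)
```

Both inputs describe the same triangle, so the two printed edge lists are identical even though the files look different.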

Reading a file with one of the existing EdgeListReaders, e.g. EdgeListTabOne, can be done by calling the readGraph method and specifying the format as EdgeListTabOne:

[7]:
G = nk.readGraph("../input/example.edgelist", nk.Format.EdgeListTabOne)

We can write G to a file by calling the writeGraph() method, and passing G, the path to the file the graph should be written to, and the format to writeGraph().

[8]:
import os

if not os.path.isdir('./output'):
    os.makedirs('./output')
nk.writeGraph(G, './output/example.edgelist.TabOne', nk.Format.EdgeListTabOne)

METIS file format

The METIS format stores a graph of N nodes in a file of N+1 lines. The first line lists the number of nodes and the number of edges, separated by whitespace. If the first line contains more than two values, the extra values describe the weights. Each of the following N lines contains the adjacency list of one node. Comment lines begin with a "%" sign. A file in METIS format can be read using the readGraph method or by explicitly using the METISGraphReader class:

[9]:
G = nk.readGraph("../input/celegans_metabolic.graph", nk.Format.METIS)
# Alternative:
metisReader = nk.graphio.METISGraphReader()
G = metisReader.read("../input/celegans_metabolic.graph")
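
The N+1 line layout of an unweighted METIS file can be sketched in plain Python. The helper to_metis is hypothetical and only illustrates the layout; it does not handle weights or comment lines. Node ids in METIS files start at 1, and each undirected edge appears in the adjacency lists of both endpoints.

```python
def to_metis(adjacency):
    """Serialize an undirected, unweighted graph to METIS text.

    `adjacency` maps the 1-based node ids 1..N to lists of neighbor ids.
    """
    n = len(adjacency)
    # Every undirected edge is listed twice, once per endpoint.
    m = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    lines = [f"{n} {m}"]
    for node in range(1, n + 1):
        lines.append(" ".join(str(v) for v in adjacency[node]))
    return "\n".join(lines) + "\n"


# A triangle (1, 2, 3) with one extra node 4 attached to node 3.
triangle_plus_tail = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
metis_text = to_metis(triangle_plus_tail)
print(metis_text)
```

The header line of the result is "4 4" (four nodes, four edges), followed by one adjacency line per node.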

Writing a file in METIS format is the same as for the other formats we have seen so far:

[10]:
import os

if not os.path.isdir('./output/'):
    os.makedirs('./output')
nk.writeGraph(G,"./output/celegans_metabolicMETIS", nk.Format.METIS)

GML file format

The GML (Graph Modelling Language) format is a text-based file format for graphs that describes nodes and edges as nested key-value lists. For more details, please refer to the GML format specification. Reading a file in GML is done in the same way as reading other formats.

[11]:
G = nk.readGraph("../input/jazz2_directed.gml", nk.Format.GML)
# Alternative:
gmlReader = nk.graphio.GMLGraphReader()
G = gmlReader.read("../input/jazz2_directed.gml")

Writing a file in GML format is the same as for the other formats we have seen so far:

[12]:
import os

if not os.path.isdir('./output/'):
    os.makedirs('./output')
nk.writeGraph(G,"./output/jazz2_directedGML", nk.Format.GML)

GraphViz/DOT file format

NetworKit currently only supports writing of the DOT file format. More information on the DOT file format can be found here. We can read a graph in any format, and then write it in DOT as follows:

[13]:
import os
# Read graph in GML
G = nk.readGraph("../input/jazz2_directed.gml", nk.Format.GML)

if not os.path.isdir('./output/'):
    os.makedirs('./output')
# Write G in GraphViz/DOT
nk.writeGraph(G,"./output/jazz2_directedGraphViz", nk.Format.GraphViz)
# Write G in DOT
nk.writeGraph(G,"./output/jazz2_directedDOT", nk.Format.DOT)
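
For readers unfamiliar with DOT, the following plain-Python sketch shows roughly what a DOT description of a small graph looks like. The helper to_dot is hypothetical, and the exact text NetworKit writes may differ in naming and layout; only the general DOT syntax ("graph" with "--" for undirected edges, "digraph" with "->" for directed edges) is illustrated.

```python
def to_dot(edges, directed=False, name="G"):
    """Return a DOT description of the given (u, v) edge list."""
    keyword, arrow = ("digraph", "->") if directed else ("graph", "--")
    body = "".join(f"  {u} {arrow} {v};\n" for u, v in edges)
    return f"{keyword} {name} {{\n{body}}}\n"


dot_directed = to_dot([(0, 1), (1, 2)], directed=True)
dot_undirected = to_dot([(0, 1)])
print(dot_directed)
```

A file with this content can be rendered by the GraphViz command-line tools.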

LFR

Graphs in LFR are identical to those in the EdgeListTabOne format. Therefore, in order to read a graph in LFR, the EdgeListTabOne reader is used. Refer to the section on the EdgeList file format for details about the EdgeListTabOne reader. Alternatively, you can read the graph by specifying LFR as the format. In this case, NetworKit calls the EdgeListTabOne reader internally. The same goes for writing a graph to a file in the LFR file format.

[14]:
G = nk.readGraph("../input/network_overlapping.dat", nk.Format.LFR)

import os
if not os.path.isdir('./output/'):
    os.makedirs('./output')
nk.writeGraph(G,"./output/network_overlapping.dat", nk.Format.LFR)

KONECT file format

The reader KONECTGraphReader(remapNodes = False, handlingmethod = DISCARD_EDGES) takes two parameters. Node ids are remapped to consecutive ids if remapNodes is set to true. If your graph contains multiple edges between the same pair of nodes, handlingmethod specifies how NetworKit should handle them. handlingmethod can take any of the following three values:
- DISCARD_EDGES = 0: the first occurrence of an edge is kept and all following occurrences are discarded
- SUM_WEIGHTS_UP = 1: if an edge occurs again, its weight is added to the existing edge
- KEEP_MINIUM_WEIGHT = 2: the edge with the lowest weight is kept

To read a graph that contains multiple edges and sum up the weights of those edges, you can pass the parameters to the KONECTGraphReader as follows:

[15]:
konectReader = nk.graphio.KONECTGraphReader(True, 1)
G = konectReader.read("../input/foodweb-baydry.konect")
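
The effect of the three handlingmethod values can be sketched in plain Python. This is not the KONECTGraphReader implementation: merge_multi_edges is a hypothetical helper, the strategies are selected by name rather than by number, and the graph is treated as undirected for simplicity.

```python
METHODS = ("DISCARD_EDGES", "SUM_WEIGHTS_UP", "KEEP_MINIUM_WEIGHT")


def merge_multi_edges(edges, method):
    """Collapse parallel edges; `edges` is a list of (u, v, weight) tuples."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    merged = {}
    for u, v, w in edges:
        key = (min(u, v), max(u, v))  # undirected: (0, 1) equals (1, 0)
        if key not in merged:
            merged[key] = w  # first occurrence is always stored
        elif method == "SUM_WEIGHTS_UP":
            merged[key] += w
        elif method == "KEEP_MINIUM_WEIGHT":
            merged[key] = min(merged[key], w)
        # DISCARD_EDGES: later occurrences are ignored
    return merged


multi = [(0, 1, 3.0), (1, 0, 1.0), (1, 2, 2.0)]
discarded = merge_multi_edges(multi, "DISCARD_EDGES")
summed = merge_multi_edges(multi, "SUM_WEIGHTS_UP")
minimum = merge_multi_edges(multi, "KEEP_MINIUM_WEIGHT")
print(discarded, summed, minimum)
```

For the duplicated edge between nodes 0 and 1, the three strategies yield the weights 3.0, 4.0 and 1.0 respectively, while the single edge between nodes 1 and 2 keeps its weight in all cases.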

NetworKit currently only supports reading of the KONECT file format. If you want to write your graph to a file, you can write the graph in another format of your choice.

GraphToolBinary file format

The GraphToolBinaryReader reads graphs written in the binary format described here. The graph's properties are stored in the file, and therefore no constructor arguments are passed to the reader. Reading a graph from a file in the GraphToolBinary format can be done like this:

[16]:
G = nk.readGraph("../input/power.gt", nk.Format.GraphToolBinary)
# Alternative:
graphToolReader = nk.graphio.GraphToolBinaryReader()
G = graphToolReader.read("../input/power.gt")

When writing the graph to the file, the writer GraphToolBinaryWriter(littleEndianness = True) expects a Boolean value indicating the endianness of the machine. Set littleEndianness to true if you are running a little endian machine. The example below shows how you can pass the endianness to the GraphToolBinaryWriter.

[17]:
import os
if not os.path.isdir('./output/'):
    os.makedirs('./output')
nk.writeGraph(G,"./output/power.gt", nk.Format.GraphToolBinary, littleEndianness=True)
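
If you are unsure about the endianness of your machine, Python's standard library can report it. The sketch below uses sys.byteorder and cross-checks the result with struct; the variable name little_endianness simply mirrors the writer's parameter and is otherwise arbitrary.

```python
import struct
import sys

# sys.byteorder is "little" on little-endian machines and "big" otherwise.
little_endianness = sys.byteorder == "little"

# Cross-check: pack the integer 1 in native byte order. On a little-endian
# machine the least significant byte comes first, so the first byte is 1.
first_byte_is_low = struct.pack("=I", 1)[0] == 1

print(little_endianness, first_byte_is_low)
```

The resulting Boolean could then be passed as the littleEndianness argument instead of a hard-coded True.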

ThrillBinary file format

The ThrillGraphBinaryReader(n) reads a graph format consisting of a serialized DIA of vector from the Thrill format. The constructor optionally takes a 64-bit unsigned integer n, which is the number of nodes in the graph. Reading is more efficient if the reader knows the number of nodes.

[18]:
G = nk.readGraph("../input/celegans_metabolic.thrill", nk.Format.ThrillBinary)
# Alternative:
thrillBinaryReader = nk.graphio.ThrillGraphBinaryReader()
G = thrillBinaryReader.read("../input/celegans_metabolic.thrill")

Using the ThrillBinaryWriter, writing is similar to the other writers:

[19]:
import os
if not os.path.isdir('./output/'):
    os.makedirs('./output')
nk.writeGraph(G,"./output/foodweb-baydry.thrill", nk.Format.ThrillBinary)

NetworkitBinaryGraph file format

The NetworkitBinaryGraph is a custom binary NetworKit file format for reading and writing graphs. It is not only much faster than existing formats, it is also compressed. The graph properties are stored directly in the file.

[20]:
G = nk.readGraph("../input/foodweb-baydry.nkbg003", nk.Format.NetworkitBinary)
# Alternative:
networkitBinaryReader = nk.graphio.NetworkitBinaryReader()
G = networkitBinaryReader.read("../input/foodweb-baydry.nkbg003")

The NetworkitBinaryWriter(chunks = 32, weightsType = NetworkitBinaryWeights::AUTO_DETECT) constructor takes two optional parameters. The NetworkitBinaryWriter groups nodes into chunks, which reduces the space needed to save a graph. Furthermore, it takes the type of weights as an optional parameter. If none is passed, the NetworkitBinaryWriter detects the type of weights automatically. weightsType can be any of the following options:
- none = 0: the graph is not weighted
- unsignedFormat = 1: the weights are unsigned integers
- signedFormat = 2: the weights are signed integers
- doubleFormat = 3: the weights are doubles
- floatFormat = 4: the weights are floats
- autoDetect
You can pass the number of chunks and the type of weights to the writer as follows (assuming signed weights):

[21]:
import os
if not os.path.isdir('./output/'):
    os.makedirs('./output')
nk.writeGraph(G,"./output/foodweb-baydry.nkbg003", nk.Format.NetworkitBinary, chunks=16, NetworkitBinaryWeights=2)

Convert graphs to other formats

Not all graph formats support reading and writing alike, and therefore, one may want to convert a graph to a different format. For example, if you want to convert '../input/wiki-Vote.txt' in the SNAP format to GML and save it in the ./output directory, you can either use the convertGraph(fromFormat, toFormat, fromPath, toPath=None) function from graphio:

[22]:
nk.graphio.convertGraph(nk.Format.SNAP, nk.Format.GML, "../input/wiki-Vote.txt", "output/example.gml")
converted ../input/wiki-Vote.txt to output/example.gml

or you can pass the new format to the writeGraph method:

[23]:
import os

if not os.path.isdir('./output/'):
    os.makedirs('./output')
nk.writeGraph(G,"./output/example.gml", nk.Format.GML)
WARNING:root:overriding given file