Shared-Memory Parallel Nucleus Decomposition

Organization

This repository contains code for our shared-memory parallel (r, s) nucleus decomposition algorithms. Note that the repository uses the Graph-Based Benchmark Suite (GBBS) for parallel primitives and benchmarks.

The benchmarks/NucleusDecomposition/ directory contains the main executable, NucleusDecomposition_main.

Compilation

Compiler:

g++ >= 7.4.0 with pthread support for the ParlayLib scheduler
g++ >= 7.4.0 with support for Cilk Plus

Build system:

Bazel 2.0.0

To build, navigate to the benchmarks/NucleusDecomposition/ directory, and run:

$ bazel build :NucleusDecomposition_main

Most optionality from GBBS applies. In particular, to compile benchmarks for graphs with more than 2^32 edges, the LONG command-line parameter should be set.

The default compilation is the ParlayLib scheduler (Homegrown). To compile with the Cilk Plus scheduler instead of the Homegrown scheduler, use the Bazel configuration --config=cilk. To compile using OpenMP instead, use the Bazel configuration --config=openmp. To compile serially instead, use the Bazel configuration --config=serial.

Note that the default compilation mode in bazel is to build optimized binaries (stripped of debug symbols). You can compile debug binaries by supplying -c dbg to the bazel build command.

The following command cleans the directory:

$ bazel clean  # removes all executables

Graph Format

The applications take as input the adjacency graph format used by GBBS.

The adjacency graph format starts with a sequence of offsets one for each vertex, followed by a sequence of directed edges ordered by their source vertex. The offset for a vertex i refers to the location of the start of a contiguous block of out edges for vertex i in the sequence of edges. The block continues until the offset of the next vertex, or the end if i is the last vertex. All vertices and offsets are 0 based and represented in decimal. The specific format is as follows:

AdjacencyGraph
<n>
<m>
<o0>
<o1>
...
<o(n-1)>
<e0>
<e1>
...
<e(m-1)>

This file is represented as plain text.

Using SNAP Graphs

Graphs from the SNAP dataset collection are used in our experiments. We use a tool from the GBBS that converts the most common SNAP graph format to the adjacency graph format that GBBS accepts. Usage example:

# Download a graph from the SNAP collection.
wget https://snap.stanford.edu/data/wiki-Vote.txt.gz
gzip --decompress ${PWD}/wiki-Vote.txt.gz
# Run the SNAP-to-adjacency-graph converter.
# Run with Bazel:
bazel run //utils:snap_converter -- -s -i ${PWD}/wiki-Vote.txt -o <output file>

Running Code

The applications take the input graph as input, as well as flags to specify the parameters of the (r, s) nucleus decomposition algorithm and desired optimizations. Note that the -s flag must be set to indicate a symmetric (undirected) graph, and the -rounds 1 argument must be passed in.

The options for arguments are:

--rClique followed by an integer specifying r
--sClique followed by an integer specifying s
--numberOfLevels followed by a string, which must be ONE_LEVEL, TWO_LEVEL, or MULTI_LEVEL
--numberOfMultiLevels followed by an integer which must be less than or equal to r, specifying the number of levels desired if the MULTI_LEVEL option is chosen (this argument is ignored if ONE_LEVEL or TWO_LEVEL is chosen)
--inverseIndexMap followed by a string, which must be BINARY_SEARCH or STORED_POINTERS (this option is ignored if ONE_LEVEL is chosen)
--contiguousSpace, which specifies that contiguous space should be used for the parallel hash table T if set (this argument is ignored if ONE_LEVEL or STORED_POINTERS are chosen, both of which require contiguous space)
--relabel, which specifies that the graph should be relabeled if set
--updateAggregation followed by a string, which must be SIMPLE_ARRAY, LIST_BUFFER, or HASH_TABLE
--compact, which specifies that the graph should be relabeled if set (this argument only applies to r = 2 and s = 3, assuming TWO_LEVEL, STORED_POINTERS, and HASH_TABLE)

Example Usage

The main executable is NucleusDecomposition_main in the benchmarks/NucleusDecomposition/ directory.

After navigating to the benchmarks/NucleusDecomposition/ directory, a template command is:

$ bazel run :NucleusDecomposition_main -- -s -rounds 1 --rClique 3 --sClique 4 --numberOfLevels TWO_LEVEL --inverseIndexMap STORED_POINTERS --relabel --updateAggregation LIST_BUFFER </path/to/input/graph>

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
benchmarks		benchmarks
bin		bin
external		external
gbbs		gbbs
inputs		inputs
internal_tools		internal_tools
pybindings		pybindings
scripts		scripts
utils		utils
.bazelrc		.bazelrc
.gitignore		.gitignore
.gitmodules		.gitmodules
CODEOWNERS		CODEOWNERS
LICENSE		LICENSE
Nucleus_Decomposition_full_version.pdf		Nucleus_Decomposition_full_version.pdf
README.md		README.md
WORKSPACE		WORKSPACE
gbbs.py		gbbs.py
makefile.variables		makefile.variables

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Shared-Memory Parallel Nucleus Decomposition

Organization

Compilation

Graph Format

Running Code

About

Releases

Packages

Languages

License

jeshi96/arb-nucleus-decomp

Folders and files

Latest commit

History

Repository files navigation

Shared-Memory Parallel Nucleus Decomposition

Organization

Compilation

Graph Format

Running Code

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages