Optimizing Breadth-First Search on Modern Energy-Efficient Multicore CPUs

This project contains two implementations of Breadth-First Search (BFS): an OpenMP implementation and a pthreads implementation. Both implementations focus on optimizing BFS performance on modern energy-efficient multicore CPUs by addressing memory access patterns and synchronization overheads.

Thesis | Slides

(If you are looking for the version presented at the SpeedCode FastCode Challenge 2025 (paper), please check out the speedcode-challenge branch.)

Overview

Breadth-First Search (BFS) is a fundamental algorithm for graph analysis, but its performance on modern multicore CPUs is generally limited by irregular memory access.

The OpenMP implementation uses an optimized data structure to store graph data, called MergedCSR. By co-locating vertex metadata with its adjacency list, this format improves spatial locality and reduces cache misses.The pthreads implementation uses the MergedCSR structure and some custom components to further improve performance. Namely, it includes a custom implementation of a thread pool, a chunk-based frontier with dynamic work-stealing for load balancing, and a lightweight, custom barrier for scalable synchronization.

Both implementations were evaluated against the GAP Benchmark Suite (included as a submodule) on a diverse set of graphs across x86, RISC-V, and ARM platforms. The results demonstrate that the MergedCSR data structure significantly improves memory performance, enabling a geomean speedup of up to 1.5x over the baseline on large-diameter graphs. Furthermore, the explicit pthreads implementation exhibits superior scalability due to its custom synchronization and outperforms the GAP benchmark with a geomean of 2.28x on road networks and 1.87x on random geometric graphs.

Requirements

A C++17 compatible compiler (e.g., GCC, Clang).
CMake (version 3.10 or newer).
Make.
OpenMP for parallel execution.

The project has been tested with g++ 13.0.0 and OpenMP 4.5.

Installation

Clone the repository:

git clone <your-repository-url>
cd <repository-name>

Initialize and update the submodules (distributed_mmio and gapbs):
```
git submodule update --init --recursive
```

SbatchMan and MtxMan

This project is compatible with SbatchMan and MtxMan for job scheduling and matrix management. The SbatchMan configuration files are provided in the scripts directory. The matrix files used for the evaluation be found in the scripts/matrices.yaml directory.

Running

Pthreads

The Pthreads implementation can be run from the root directory.

./pthreads/bin/bfs -f <graph-file.mtx> -n <number-of-runs>

-f: Path to the input graph file in Matrix Market (.mtx) format.
-n: Number of BFS runs to execute.
-s: Specify a source vertex ID. If not provided, a random source is chosen.

OpenMP

The OpenMP implementation can also be run from the root directory.

OMP_NUM_THREADS=<threads> ./openmp/bin/bfs <graph-file.mtx> <runs> <implementation>

OMP_NUM_THREADS: Set the number of OpenMP threads.
<graph-file.mtx>: Path to the input graph file.
<runs>: Number of BFS runs.
<implementation>: The BFS algorithm to use. Options are reference, merged_csr_distances, merged_csr_parents.

GAP Benchmark Suite (GAPBS)

The GAPBS implementation is located in the gapbs directory.

cd gapbs
OMP_NUM_THREADS=<threads> ./bfs -f <graph-file> -n <iterations>

OMP_NUM_THREADS: Set the number of OpenMP threads.
-f: Path to the input graph file. GAPBS supports various formats like .el, .mtx, .gr, and its own serialized .sg format.
-n: Number of iterations to run.

Name		Name	Last commit message	Last commit date
Latest commit History 92 Commits
distributed_mmio @ 91c53e9		distributed_mmio @ 91c53e9
figures		figures
gapbs		gapbs
openmp		openmp
plots		plots
pthreads		pthreads
scripts		scripts
.gitignore		.gitignore
.gitmodules		.gitmodules
README.md		README.md
slides.pdf		slides.pdf
thesis.pdf		thesis.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Optimizing Breadth-First Search on Modern Energy-Efficient Multicore CPUs

Overview

Requirements

Installation

SbatchMan and MtxMan

Running

Pthreads

OpenMP

GAP Benchmark Suite (GAPBS)

About

Uh oh!

Releases

Packages

Languages

HicrestLaboratory/MergedCSR

Folders and files

Latest commit

History

Repository files navigation

Optimizing Breadth-First Search on Modern Energy-Efficient Multicore CPUs

Overview

Requirements

Installation

SbatchMan and MtxMan

Running

Pthreads

OpenMP

GAP Benchmark Suite (GAPBS)

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages