Concurrent Skip Lists in C++

University Project at TU Wien | Advanced Multiprocessor Programming (184.726)

This project focuses on the implementation, benchmarking, and performance analysis of concurrent skip lists using C++ with OpenMP and Pthreads. The objectives are to develop a sequential skip list as a baseline and explore three concurrent implementations: coarse-grained, fine-grained, and lock-free.

Team Members

  • Florian Frech
  • Tobias Jechtl
  • Lorenz Moser

Implementations

We explore four distinct approaches to synchronizing access to the same probabilistic data structure:

  1. Sequential (sequential)

    • A standard, non-thread-safe Skip List.
    • Serves as the baseline for evaluating the performance of the concurrent versions.
    • Complexity: expected O(log n) time for the fundamental operations.
  2. Coarse-Grained Locking (coarse_grained)

    • Serializes all access through a single global OpenMP lock (see the first sketch after this list).
    • Thread safety follows directly, since every operation acquires the same lock.
    • Scalability is limited under contention because all threads are serialized.
  3. Fine-Grained Locking (fine_grained)

    • Implements "Lazy Synchronization" using hand-over-hand locking with pthread_mutex.
    • Locks individual nodes and the links between them rather than the whole list, reducing contention and improving performance.
    • Uses a two-step deletion: Logical marking followed by physical unlinking.
  4. Lock-Free (lock_free)

    • Based on Herlihy & Shavit's design using Atomic Markable References (a C++ sketch of this primitive follows the list).
    • Relies on atomic compare-and-swap (CAS) operations to synchronize without locks.
    • Remains correct under concurrent modification and, being lock-free, guarantees that some thread always makes progress.
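
For orientation, the coarse-grained variant can be pictured as a thin locking wrapper around the sequential list. The sketch below is illustrative only and not the code in src/: std::set stands in for the sequential skip list so the example compiles on its own, and the class name is a placeholder.

    #include <omp.h>
    #include <set>

    // Illustrative coarse-grained pattern: one global OpenMP lock serializes
    // every operation. std::set stands in for the sequential skip list so the
    // sketch is self-contained; CoarseLockedSet is a placeholder name.
    class CoarseLockedSet {
        std::set<int> data_;   // stand-in for the sequential skip list
        omp_lock_t lock_;

    public:
        CoarseLockedSet()  { omp_init_lock(&lock_); }
        ~CoarseLockedSet() { omp_destroy_lock(&lock_); }

        bool insert(int key) {
            omp_set_lock(&lock_);
            bool inserted = data_.insert(key).second;
            omp_unset_lock(&lock_);
            return inserted;
        }

        bool contains(int key) {
            omp_set_lock(&lock_);
            bool found = data_.count(key) > 0;
            omp_unset_lock(&lock_);
            return found;
        }

        // remove() follows the same lock / operate / unlock pattern.
    };

Because every thread funnels through the same lock, scalability is bounded by that lock, which is exactly the contention effect visible in the coarse-grained results below.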

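C++ has no direct counterpart to the AtomicMarkableReference used in Herlihy & Shavit's lock-free list, so a common way to realize it is to pack the mark bit into the least significant bit of the successor pointer and update pointer and mark with one compare-and-swap. The sketch below illustrates that idea only; it is not taken from src/, and all names are placeholders. In the lazy (fine-grained) scheme, the corresponding logical mark is simply a boolean flag updated while the node's lock is held.

    #include <atomic>
    #include <cstdint>

    // Illustrative "atomic markable reference": the deletion mark lives in the
    // least significant pointer bit, which is free because node objects are at
    // least 2-byte aligned. Pointer and mark are read and CAS'd as one word.
    template <typename Node>
    class MarkableRef {
        std::atomic<std::uintptr_t> bits_{0};

        static std::uintptr_t pack(Node* ptr, bool mark) {
            return reinterpret_cast<std::uintptr_t>(ptr) |
                   static_cast<std::uintptr_t>(mark);
        }

    public:
        // Load pointer and mark together in a single atomic read.
        Node* get(bool& mark) const {
            std::uintptr_t b = bits_.load(std::memory_order_acquire);
            mark = (b & 1u) != 0;
            return reinterpret_cast<Node*>(b & ~std::uintptr_t{1});
        }

        // One CAS over (pointer, mark): insert keeps the mark false, a logical
        // delete flips it to true, and physical unlinking later swings the
        // predecessor's reference past the marked node.
        bool compare_and_set(Node* expected, Node* desired,
                             bool expected_mark, bool desired_mark) {
            std::uintptr_t exp = pack(expected, expected_mark);
            return bits_.compare_exchange_strong(
                exp, pack(desired, desired_mark), std::memory_order_acq_rel);
        }
    };
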
Project Structure

.
├── assets/                 # Evaluation plots
├── incl/                   # Header files (.hpp)
├── src/                    # Source implementation (.cpp)
├── tests/                  # Integration tests for correctness
├── scripts/                # Python scripts for plotting
├── Makefile                # Build automation
└── README.md               # Documentation

Build & Usage

Prerequisites

  • Compiler: g++ with C++17 support
  • Libraries: OpenMP, pthread
  • Build Tool: Make

Compilation

To build the main executable:

make all

To build and run the correctness tests:

make tests

Running Benchmarks

You can run specific pre-configured benchmark suites using the Makefile:

make bench-sequential
make bench-coarse_grained
make bench-fine_grained
make bench-lock_free

Custom Execution (CLI)

You can run custom scenarios using the generated build/skiplist executable.

Example: Run a lock-free benchmark with 4 threads, a 50/50 insert/remove mix, for 5 seconds:

./build/skiplist --kind lock_free --threads 4 --time-interval 5 --mix 0.5,0.5,0.0 --output-print 1

Parameters:

--kind             Implementation: sequential, coarse_grained, fine_grained, or lock_free (default: sequential)
--threads          Number of concurrent threads (default: 1)
--mix              Operation mix insert,remove,contains (default: 0.1,0.1,0.8)
--key-range        Range of keys, e.g., 0,100000 (default: 0,100000)
--key-range-type   Key distribution: common, disjoint, or per-thread (default: common)
--time-interval    Duration of the benchmark in seconds (default: 1)

Results & Evaluation

The following charts illustrate the Throughput Speed-Up relative to the single-threaded parallel baselines. These results reflect the corrected evaluation of our implementation.
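
Concretely, for an implementation run with p threads the speed-up is

    speed-up(p) = throughput(p) / throughput(1),

where throughput(1) is measured with the same parallel implementation run on a single thread, i.e. each concurrent variant serves as its own baseline.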

Experimental Setup:

  • Duration: 5 seconds per run.
  • Threads: Exponential steps (2^0 to 2^6).
  • Workloads:
    • Read-Heavy: 10% Insert, 10% Remove, 80% Contains (--mix 0.1,0.1,0.8).
    • Balanced: 40% Insert, 40% Remove, 20% Contains (--mix 0.4,0.4,0.2).

1. Coarse-Grained Locking

Coarse Grained Results

Observation: The coarse-grained implementation suffers from significant contention. The speed-up peaks at approximately 2.0x with 4 threads and then degrades as thread count increases, regardless of whether key ranges are common or disjoint.

2. Fine-Grained Locking

Fine Grained Results

Observation: Fine-grained locking demonstrates much better scalability. For the disjoint, read-heavy workload (green line), we observe a speed-up of approximately 45x at 40 threads. However, overhead becomes apparent at very high thread counts (2^6).

3. Lock-Free

Lock-Free Results

Observation: The Lock-Free implementation achieves the highest scalability. In the disjoint, read-heavy scenario (green line), it reaches a speed-up of over 50x. Even in high-contention scenarios (common key range), it maintains an upward trend, outperforming the lock-based approaches.


References

  1. Herlihy, M., & Shavit, N. (2020). The Art of Multiprocessor Programming. Newnes.
  2. Pugh, W. (1990). Skip Lists: A Probabilistic Alternative to Balanced Trees. Communications of the ACM.
  3. OpenMP Architecture Review Board. (2015). OpenMP Application Program Interface Version 4.5.
