University Project at TU Wien | Advanced Multiprocessor Programming (184.726)
This project focuses on the implementation, benchmarking, and performance analysis of concurrent skip lists using C++ with OpenMP and Pthreads. The objectives are to develop a sequential skip list as a baseline and explore three concurrent implementations: coarse-grained, fine-grained, and lock-free.
- Florian Frech
- Tobias Jechtl
- Lorenz Moser
We explore four distinct approaches to managing state in a probabilistic data structure:
-
Sequential (
sequential)- A standard, non-thread-safe Skip List.
- Serves as the baseline for evaluating the performance of the concurrent versions.
- Complexity: Expected O(log n) time complexity for fundamental operations.
-
Coarse-Grained Locking (
coarse_grained)- Uses a single global
OpenMPlock to serialize all access. - Ensures thread safety by protecting all operations with a single global lock.
- Limits scalability under high contention due to serialization.
- Uses a single global
-
Fine-Grained Locking (
fine_grained)- Implements "Lazy Synchronization" using hand-over-hand locking with
pthread_mutex. - Locks individual nodes and their relationships to reduce contention and improve performance.
- Uses a two-step deletion: Logical marking followed by physical unlinking.
- Implements "Lazy Synchronization" using hand-over-hand locking with
-
Lock-Free (
lock_free)- Based on Herlihy & Shavit's design using Atomic Markable References.
- Leverages atomic compare-and-set operations to achieve synchronization without locks.
- Guarantees correctness even in the presence of concurrent modifications.
.
├── assets/ # Evaluation plots
├── incl/ # Header files (.hpp)
├── src/ # Source implementation (.cpp)
├── tests/ # Integration tests for correctness
├── scripts/ # Python scripts for plotting
├── Makefile # Build automation
└── README.md # Documentation
- Compiler:
g++(supports C++17) - Libraries:
OpenMP,pthread - Build Tool:
Make
To build the main executable:
make allTo build and run the correctness tests:
make testsYou can run specific pre-configured benchmark suites using the Makefile:
make bench-sequential
make bench-coarse_grained
make bench-fine_grained
make bench-lock_freeYou can run custom scenarios using the generated build/skiplist executable.
Example: Run a Lock-Free benchmark with 4 threads, a 50/50 insert/delete mix, for 5 seconds:
./build/skiplist --kind lock_free --threads 4 --time-interval 5 --mix 0.5,0.5,0.0 --output-print 1Parameters:
| Flag | Description | Default |
|---|---|---|
--kind |
Implementation (sequential, coarse_grained, fine_grained, lock_free) |
sequential |
--threads |
Number of concurrent threads | 1 |
--mix |
Operation mix: insert,remove,contains |
0.1,0.1,0.8 |
--key-range |
Range of keys (e.g., 0,100000) |
0,100000 |
--key-range-type |
Key distribution: common, disjoint, or per-thread |
common |
--time-interval |
Duration of the benchmark in seconds | 1 |
The following charts illustrate the Throughput Speed-Up relative to the single-threaded parallel baselines. These results reflect the corrected evaluation of our implementation.
Experimental Setup:
- Duration: 5 seconds per run.
- Threads: Exponential steps (2^0 to 2^6).
- Workloads:
- Read-Heavy: 10% Insert, 10% Remove, 80% Contains.
- Balanced: 40% Insert, 40% Remove, 20% Contains.
Observation: The coarse-grained implementation suffers from significant contention. The speed-up peaks at approximately 2.0x with 4 threads and then degrades as thread count increases, regardless of whether key ranges are common or disjoint.
Observation: Fine-grained locking demonstrates much better scalability. For the disjoint, read-heavy workload (green line), we observe a speed-up of approximately 45x at 40 threads. However, overhead becomes apparent at very high thread counts (2^6).
Observation: The Lock-Free implementation achieves the highest scalability. In the disjoint, read-heavy scenario (green line), it reaches a speed-up of over 50x. Even in high-contention scenarios (common key range), it maintains an upward trend, outperforming the lock-based approaches.
- Herlihy, M., & Shavit, N. (2020). The Art of Multiprocessor Programming. Newnes.
- Pugh, W. (1990). Skip Lists: A Probabilistic Alternative to Balanced Trees. Communications of the ACM.
- OpenMP Architecture Review Board. (2015). OpenMP Application Program Interface Version 4.5.


