Multi-Core Cache Contention & Parallel Scaling Analysis

Technologies: C/C++, OpenMP, Linux (WSL)


Problem Statement

Parallel programs often fail to achieve expected speedup on multi-core processors. This project investigates hardware-level causes of performance degradation, specifically cache line contention (false sharing) and scalability limits using OpenMP workloads.


Experiment 1: False Sharing

Each thread incremented its own counter. Memory was then padded so each counter occupied a separate cache line.

Threads: 16
Normal: 0.0156784 sec
Padded: 0.0102941 sec

Observation: Performance improved by approximately 1.5x without changing the algorithm, confirming cache line contention.


Experiment 2: Parallel Scaling

Execution time was measured for a fixed workload as the thread count increased.

Threads 1 : 7.235e-06 sec
Threads 2 : 0.000234615 sec
Threads 4 : 0.000237547 sec
Threads 8 : 0.000262015 sec
Threads 16: 0.0179845 sec

Observation: Increasing threads did not improve performance. Higher thread counts increased runtime due to coherence overhead and scheduling cost.


Key Findings

  • Adjacent thread variables caused cache contention
  • Separating data across cache lines improved performance
  • Parallel scalability has practical limits
  • Hardware coherence overhead can serialize execution

Conclusion

Parallel performance depends not only on algorithm design but also on memory layout and CPU cache behavior. False sharing significantly limits real-world scalability of multi-threaded programs.


How to Run

g++ false_sharing.cpp -O2 -fopenmp -o fs
./fs

g++ scaling.cpp -O2 -fopenmp -o scaling
./scaling
