- Darshan Dinesh Kumar (dd3888)
The end of Moore's Law necessitates innovative solutions for improving application performance, rather than attempting to pack more transistors onto a chip or increase the CPU frequency. In this direction, multicore and multiprocessor systems are now ubiquitous and are ideal candidates for improving performance, as they enable the execution of parallel, multithreaded programs. However, these parallel applications are often held back by the memory allocator, which can throttle performance and scalability. The allocator can also introduce problems such as false sharing and fragmentation, which can become considerable performance bottlenecks. This project presents a scalable memory allocator that can be used with such parallel applications (including those developed with OpenMP). It introduces per-thread heaps with memory ownership, a design that scales efficiently while almost eliminating false sharing and minimizing fragmentation. The results generated with the developed OpenMP benchmark programs are promising: the proposed allocator exhibits considerable improvements in scalability, false-sharing avoidance, and fragmentation compared to the malloc and Hoard memory allocators.
- All the Source Code is placed in the src directory
- The src directory includes the 3 implemented incremental versions of the memory allocator organized as nested directories as follows:
- 1_Sequential_Memory_Allocator: This corresponds to the single-threaded-only implementation
- 2_Concurrent_Memory_Allocator: This corresponds to the Serial Single Heap with global lock implementation
- 3_Concurrent_Scalable_Memory_Allocator: This corresponds to the finalized implementation with per-thread heaps
- Each of the above 3 directories contains the following files:
- custom_mem_alloc.h: A header file exposing the interface of the required functions
- custom_mem_alloc.cpp: The implementation file where all the functionalities are implemented
- client_mem_alloc.cpp: A Sanity Client file that verifies the behavior of the implemented functionalities
- Makefile: The Makefile to build the executable for a particular version of the memory allocator
- The steps to execute the Sanity Client for each of the above implementations are as follows (an example session appears after the list):
- `cd` into the directory corresponding to the version that you need to verify
- Execute the `make clean` command to remove the previous object files and executable
- Execute the `make` command to generate the executable for the Sanity Client linked with that specific implementation of the memory allocator
- Execute the `./<executable_name>` command to run the executable, where <executable_name> refers to the name of the executable generated by the above Makefile
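Put together, a typical session for the finalized version might look like the following; the exact executable name depends on what the Makefile produces, so it is left as a placeholder.

```sh
# Build and run the Sanity Client for the per-thread-heap allocator version
cd src/3_Concurrent_Scalable_Memory_Allocator
make clean            # remove previous object files and the executable
make                  # build the Sanity Client linked with this allocator version
./<executable_name>   # substitute the executable name produced by the Makefile
```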
- Note that the 3_Concurrent_Scalable_Memory_Allocator version is the final version used for the benchmark experiments and for generating the reported results
- All the Benchmark programs used for experimentation are placed within the Benchmark_Experiments directory
- This directory is further organized into the following directories:
- comparison_testing_malloc: Benchmark testing for the malloc memory allocator
- comparison_testing_hoard: Benchmark testing for the hoard memory allocator
- comparison_testing_my_mem_alloc: Benchmark testing for the my_mem_alloc memory allocator which corresponds to the 3_Concurrent_Scalable_Memory_Allocator version implemented as part of this project
- Each of the above comparison_testing directories is organized as follows:
- 1_Speed: Contains the benchmarks for the Speed metric
- 2_Scalability: Contains the benchmarks for the Scalability metric
- 3_False_Sharing: Contains the benchmarks for the False Sharing metric
- 4_Fragmentation: Contains the benchmarks for the Fragmentation metric
- Note that when running the Scalability benchmark, the number of threads must be passed as a command-line argument
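For example, assuming the thread count is passed as the first command-line argument (the executable name below is a placeholder):

```sh
# Run the Scalability benchmark with 8 threads
./<scalability_executable> 8
```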
- In order to run the benchmarks for the malloc memory allocator, follow the steps given below (a consolidated example appears after the list):
- Compile the benchmark program using the command `g++ -fopenmp <benchmark_program>`
- Run the generated executable using the command `./<executable_name>`
- While running the Scalability benchmark, the number of threads needs to be passed as a command-line argument
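Put together, a session for one of the malloc benchmarks might look like the following; the benchmark file name is a placeholder, and the 1_Speed directory is used purely as an example.

```sh
# Compile and run a benchmark against the system malloc
cd Benchmark_Experiments/comparison_testing_malloc/1_Speed
g++ -fopenmp <benchmark_program>.cpp   # without -o, g++ names the executable a.out
./a.out                                # for the Scalability benchmark (in 2_Scalability), also pass the thread count, e.g. ./a.out 8
```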
- In order to run the benchmarks for the hoard memory allocator, follow the steps given below (a full example session appears after the list):
- Clone the source code for Hoard using the command `git clone https://github.com/emeryberger/Hoard`
- `cd` into the src directory of the cloned repository
- Run the `make` command to generate the libhoard.so shared object
- Run the command `export LD_LIBRARY_PATH=<path_to_so>:$LD_LIBRARY_PATH`
- Compile the benchmark program using the command `g++ -c -fopenmp <benchmark_program>` to generate the .o file
- Link the .o with the .so using the command `g++ -fopenmp <benchmark_program>.o -L<path_to_so> -lhoard`
- Run the generated executable using the command `./<executable_name>`
- While running the Scalability benchmark, the number of threads needs to be passed as a command-line argument
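As a consolidated example, with the benchmark file name and paths as placeholders (the exact location of libhoard.so depends on where the repository is cloned):

```sh
# Build libhoard.so from the Hoard sources
git clone https://github.com/emeryberger/Hoard
cd Hoard/src
make                                              # generates libhoard.so
export LD_LIBRARY_PATH=$(pwd):$LD_LIBRARY_PATH    # let the loader find libhoard.so at run time

# Compile the benchmark, link it against Hoard, and run it
cd <path_to_benchmark_directory>
g++ -c -fopenmp <benchmark_program>.cpp
g++ -fopenmp <benchmark_program>.o -L<path_to_so> -lhoard   # without -o, the executable is named a.out
./a.out                                           # pass the thread count when running the Scalability benchmark
```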
- In order to run the benchmarks for the my_mem_alloc memory allocator, follow the steps given below (a consolidated example appears after the list):
- Ensure that the custom_mem_alloc.h from the 3_Concurrent_Scalable_Memory_Allocator directory is placed in the same directory as the benchmark program, or change the include path in the benchmark program accordingly
- Copy the custom_mem_alloc.o generated using make in 3_Concurrent_Scalable_Memory_Allocator to the directory containing the benchmark program
- Compile the benchmark program using the command `g++ -c -fopenmp <benchmark_program>`
- Link the .o files using the command `g++ -fopenmp <benchmark_program>.o custom_mem_alloc.o`
- Run the generated executable using the command `./<executable_name>`
- While running the Scalability benchmark, the number of threads needs to be passed as a command-line argument
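A consolidated example under the same assumptions (the benchmark file name and the path back to the repository's src directory are placeholders, and custom_mem_alloc.o has already been built with make in 3_Concurrent_Scalable_Memory_Allocator):

```sh
# Place the custom allocator's header and object file next to the benchmark program
cp <path_to_repo>/src/3_Concurrent_Scalable_Memory_Allocator/custom_mem_alloc.h .
cp <path_to_repo>/src/3_Concurrent_Scalable_Memory_Allocator/custom_mem_alloc.o .

# Compile the benchmark, link it against the custom allocator, and run it
g++ -c -fopenmp <benchmark_program>.cpp
g++ -fopenmp <benchmark_program>.o custom_mem_alloc.o   # without -o, the executable is named a.out
./a.out                                                 # pass the thread count when running the Scalability benchmark
```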
- The details regarding the implementation are included as part of the Project Report titled Multicore_Project_Report_dd3888_Darshan_Dinesh_Kumar.pdf