OpenCL Multi‑Device Memory Bandwidth Analyzer is a C++ benchmarking tool that measures memory bandwidth performance across all OpenCL devices available on a system.
The program automatically detects all OpenCL platforms and devices (GPU, CPU, accelerators) and performs several tests to evaluate:
• Host → Device memory bandwidth
• Device → Host memory bandwidth
• Kernel global memory throughput
This allows developers and researchers to quickly identify which compute device provides the best OpenCL memory performance.
The project is lightweight, dependency‑minimal, and designed for reproducible benchmarking.
Below is example output from a real run on a laptop GPU system.
```text
====================================================================================================
 PLATFORM #0
====================================================================================================
  Name          : NVIDIA CUDA
  Vendor        : NVIDIA Corporation
  Version       : OpenCL 3.0 CUDA 13.1.86
  Devices       : 1

  DEVICE #0
    Name          : NVIDIA GeForce RTX 4070 Laptop GPU
    Type          : GPU
    Version       : OpenCL 3.0 CUDA
    Driver        : 591.44
    Compute Units : 36
    Global Mem    : 8187 MB
    Write BW      : 11.39 GB/s
    Read BW       : 12.27 GB/s
    Kernel BW     : 13343.97 GB/s
    Status        : PASS
```
The program evaluates multiple devices and reports the measured bandwidth and status.
OpenCL (Open Computing Language) is an open standard for parallel computing across heterogeneous hardware.
OpenCL allows programs to run compute workloads on:
• GPUs
• CPUs
• integrated GPUs
• FPGAs
• accelerators
OpenCL separates programs into two parts: host code and device code.

**Host code** runs on the CPU and is responsible for:
• discovering OpenCL platforms and devices
• allocating memory buffers
• compiling kernels
• launching compute kernels

**Device code** (kernels) runs on the compute device (GPU / CPU) and performs massively parallel operations.
OpenCL is used in many real‑world applications and frameworks.
Examples include:
| Software | Use Case |
|---|---|
| Blender | GPU rendering |
| DaVinci Resolve | Video processing |
| Darktable | Photo processing |
| OpenCV | Image processing |
| Intel oneAPI | Heterogeneous computing |
| AMD ROCm | GPU compute |
| Scientific HPC tools | Simulations |
This tool performs three different measurements.

**Host → Device bandwidth** measures transfer speed from CPU to GPU, implemented with `clEnqueueWriteBuffer`.

**Device → Host bandwidth** measures transfer speed from GPU to CPU, implemented with `clEnqueueReadBuffer`.
**Kernel global memory bandwidth** uses a custom OpenCL kernel that repeatedly reads and writes global memory, simulating heavy GPU memory traffic.

Example kernel:

```c
__kernel void memory_copy_test(__global const uchar* src,
                               __global uchar* dst,
                               const uint iterations)
{
    /* Illustrative body: each work-item copies its byte repeatedly. */
    size_t gid = get_global_id(0);
    for (uint i = 0; i < iterations; ++i)
        dst[gid] = src[gid];
}
```
The project intentionally uses minimal dependencies.
Main API used: `CL/cl.h`
Used for:
• platform enumeration
• device discovery
• memory allocation
• kernel compilation
• kernel execution
| Library | Purpose |
|---|---|
| `iostream` | console output |
| `vector` | data containers |
| `string` | device information |
| `algorithm` | sorting results |
| `numeric` | averaging |
| `chrono` | performance timing |
| `iomanip` | formatted printing |
```text
opencl-multidevice-bandwidth-analyzer
│
├── src
│   └── main.cpp
│
├── doc
│   └── image1.png
│
├── include
│   └── CL
│       └── cl.h
│
├── lib
│   └── OpenCL.lib
│
├── README.md
├── LICENSE
└── .gitignore
```
`src/main.cpp` contains the C++ benchmark implementation.

Main responsibilities:
• OpenCL platform discovery
• device enumeration
• memory transfer benchmarks
• kernel execution
• device ranking

`doc/` contains documentation assets such as screenshots used in the README.
Install OpenCL drivers appropriate for your hardware:

• NVIDIA: install the latest GPU driver (https://developer.nvidia.com/opencl)
• Intel: install the Intel oneAPI Base Toolkit
• AMD: install ROCm or the AMD GPU drivers
Clone the repository:

```shell
git clone https://github.com/YOUR_USERNAME/opencl-multidevice-bandwidth-analyzer.git
cd opencl-multidevice-bandwidth-analyzer
```

Build on Linux:

```shell
g++ src/main.cpp -O2 -lOpenCL -o bandwidth_analyzer
```

Build on Windows (MSVC):

```shell
cl src\main.cpp OpenCL.lib
```

Run:

```shell
./bandwidth_analyzer
```

or on Windows:

```shell
bandwidth_analyzer.exe
```
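As an alternative to the one-line compiler invocations, a minimal CMake build could look like the following. This is a sketch under the assumption that the repository layout matches the tree above; the project does not ship a `CMakeLists.txt`, so all names here are suggestions:

```cmake
cmake_minimum_required(VERSION 3.10)
project(bandwidth_analyzer CXX)

# Locate the OpenCL headers and library via CMake's built-in FindOpenCL module.
find_package(OpenCL REQUIRED)

add_executable(bandwidth_analyzer src/main.cpp)
target_link_libraries(bandwidth_analyzer PRIVATE OpenCL::OpenCL)
```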
The program will automatically detect all OpenCL devices and run the benchmark.
Current limitations:
• Only global memory bandwidth is tested
• No local/shared memory benchmarks
• No compute FLOPS test
• No multi‑GPU concurrent benchmarking
• Results may vary due to PCIe bandwidth or driver differences
Possible future extensions:
• GPU compute FLOPS benchmark
• shared/local memory benchmark
• OpenCL event profiling
• CSV export of results
• graphical charts for comparison
• CUDA vs OpenCL comparison mode
• multi‑GPU parallel testing
Sayed Ahmadreza Razian, PhD

• LinkedIn: https://www.linkedin.com/in/ahmadrezarazian/
• Google Scholar: https://scholar.google.com/citations?user=Dh9Iy2YAAAAJ
• Email: AhmadrezaRazian@gmail.com

Feel free to contact me for collaboration or questions.
