This repository provides the implementation and resources used in the research paper:
"Estimating Power Consumption of GPU Application Using Machine Learning Tool"
by Gargi Alavani Prabhu, Tanish Desai, Sharvil Potdar, Nayan Gogari, Snehanshu Saha and Santonu Sarkar
Cite as:
Gargi Alavani Prabhu, Tanish Desai, Sharvil Potdar, Nayan Gogari, Snehanshu Saha and Santonu Sarkar, "Estimating Power Consumption of GPU Application Using Machine Learning Tool", 2024.
Accurately predicting the power consumption of GPU kernels is crucial for optimizing performance and energy efficiency in high-performance computing. This project provides a benchmark suite and the PowerAPI tool for profiling GPU kernels and predicting their power usage with machine learning techniques. The suite includes a diverse set of CUDA kernels and datasets, enabling robust training and evaluation of power models. PowerAPI offers an easy-to-use interface for instrumenting CUDA applications and collecting power/energy data, facilitating automated and reproducible measurements for research and development.

The repository is organized as follows:
- PowerAPI/: Source code and interface for GPU power measurement.
- Benchmark Suite/: Collection of sample CUDA kernels and benchmarks.
- Each sample folder contains a README with build/run instructions and key concepts.
- Datasets/: Example datasets for training and evaluating power prediction models.
- Supplementary File.pdf: Additional details and supporting material.
Requirements and tooling:

- CUDA Toolkit 11.0+: For compiling and running GPU benchmarks.
- PowerAPI: Custom API for measuring GPU power and energy.
- C++/C/CUDA/SWIG/Makefile/Python/Shell: Multi-language implementation for flexibility and extensibility.

To instrument an application with PowerAPI:

1. Add PowerAPI to the library path: `export LD_LIBRARY_PATH="/path/of/Folder/PowerAPI"`
2. Import PowerAPI in your CUDA file. At the start of your `.cu` file, add: `#include "GPUDevice.h"`
3. Instrument your kernel code. Create a `GPUDevice` object and wrap your kernel calls (a complete minimal sketch appears after the sample output below):

   ```cpp
   GPUDevice g1 = GPUDevice(<GPU Device ID>, <Kernel Name>, <Grid Size>, <Block Size>);
   g1.startReading();
   // <<CUDA kernel calls>>
   g1.stopReading();
   ```
4. Run your code. Output is saved in a text file named after the kernel and reports:
   - Power in watts
   - Energy in millijoules
Sample output:

```
KernelName,GridSize,BlockSize,MaxPower,MinPower,AvgPower,Time,Energy,
reluKernel, 86436, 256, 84, 64, 76, 67603944.000000, 5154,
```
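
For orientation, the snippet below is a minimal sketch of a fully instrumented `.cu` file, assuming a trivial `reluKernel`, GPU device ID `0`, and integer grid/block sizes passed to the `GPUDevice` constructor. The exact constructor argument types and the point at which the output file is written are assumptions based on the steps above, not details taken from the PowerAPI sources.

```cuda
// Illustrative sketch only: GPUDevice argument types and output behavior
// are assumptions inferred from the usage steps above.
#include <cuda_runtime.h>
#include "GPUDevice.h"

// Trivial example kernel: element-wise ReLU over a float array.
__global__ void reluKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] > 0.0f ? data[i] : 0.0f;
}

int main() {
    const int n = 1 << 20;
    const int blockSize = 256;
    const int gridSize  = (n + blockSize - 1) / blockSize;

    float *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    // Wrap the kernel launch with PowerAPI readings (device ID 0 assumed).
    GPUDevice g1 = GPUDevice(0, "reluKernel", gridSize, blockSize);
    g1.startReading();
    reluKernel<<<gridSize, blockSize>>>(d_data, n);
    cudaDeviceSynchronize();   // make sure the kernel has finished
    g1.stopReading();          // power/energy are saved to a file named after the kernel

    cudaFree(d_data);
    return 0;
}
```

Compile with `nvcc` and link against the PowerAPI library, then run with `LD_LIBRARY_PATH` set as in step 1; the exact link flags depend on the repository's build setup.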

To build the benchmark samples:

- Windows: Use the provided Visual Studio solution files (`*_vs<version>.sln`).
- Linux: Use the makefiles in each sample directory:

  ```sh
  cd <sample_dir>
  make
  ```

  Options:
  - `TARGET_ARCH=<arch>`: Target a specific CPU architecture
  - `dbg=1`: Build with debug symbols
  - `SMS="A B ..."`: Build for specific SM architectures
  - `HOST_COMPILER=<host_compiler>`: Use a custom host compiler
See individual sample READMEs in Benchmark Suite/ for details and supported architectures.
For questions or contributions, please refer to the repository issues or contact the authors via their GitHub profiles.