Agekyan/GPU-Computing

GPU-Computing

This repository contains code examples and resources for GPU computing using CUDA and OpenCL.

Introduction

GPU computing uses Graphics Processing Units (GPUs) to run many computations in parallel, dramatically speeding up workloads that can be parallelized. The goal here is to parallelize many simple operations, applying techniques such as memory coalescing to speed up computation. This repository contains personal example code from a GPU Computing course, run on NVIDIA A40 GPUs on a supercomputer's GPU compute nodes accessed remotely.
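To make the idea of memory coalescing concrete, here is a minimal sketch (not taken from the course materials) of a vector-add kernel. Consecutive threads access consecutive array elements, so each warp's loads and stores combine into a small number of memory transactions:

```cuda
// Minimal coalescing sketch: thread i touches element i, so a warp's
// 32 threads read 32 adjacent floats in one coalesced transaction.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)
        c[i] = a[i] + b[i];                         // coalesced access pattern
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);   // unified memory keeps the sketch short
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;                       // threads per block
    int blocks = (n + threads - 1) / threads; // enough blocks to cover n
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %.1f\n", c[0]);  // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

A strided pattern (e.g. thread i accessing element i * stride) would instead scatter each warp's accesses across many transactions, which is exactly what coalescing-aware code avoids.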

Local Setup

  1. Install the necessary drivers for your GPU.
  2. Install the CUDA Toolkit (for NVIDIA GPUs) or an OpenCL SDK (for AMD or Intel GPUs).
  3. Set up your IDE or text editor for CUDA or OpenCL development.
  4. Clone this repository to your local machine:

git clone https://github.com/Agekyan/GPU-Computing.git

Below is the command to request an interactive session on a GPU compute node via the SLURM job scheduler:

srun --account=<ACCOUNT_NAME> \
     --partition=<GPU_PARTITION_NAME> \
     --nodes=1 \
     --gpus-per-node=1 \
     --ntasks=1 \
     --cpus-per-task=16 \
     --mem=20g \
     --pty bash

Or, if you have environment variables set for your account and partition, you can use:

srun --account=$ACCOUNT \
     --partition=$PARTITION \
     --nodes=1 \
     --gpus-per-node=1 \
     --ntasks=1 \
     --cpus-per-task=16 \
     --mem=20g \
     --pty bash

Explanation of the SLURM options, and other options to consider

  • --account=<ACCOUNT_NAME>: Specifies the account to be charged for the job.
  • --partition=<GPU_PARTITION_NAME>: Specifies the partition (queue) to submit the job to. This should be a GPU-enabled partition.
  • --nodes=1: Requests one node for the job.
  • --gpus-per-node=1: Requests one GPU per node.
  • --ntasks=1: Runs a single task (here, the interactive shell).
  • --cpus-per-task=16: Allocates 16 CPU cores to the task.
  • --mem=20g: Allocates 20 GB of memory for the job.
  • --pty bash: Starts an interactive bash session on the allocated resources.
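For non-interactive work, the same kind of resource request (one GPU, 16 CPU cores, 20 GB of memory) can be expressed as a batch script submitted with sbatch. This is a sketch: the account and partition names are placeholders as above, and the module name and walltime limit vary by cluster:

```
#!/bin/bash
#SBATCH --account=<ACCOUNT_NAME>
#SBATCH --partition=<GPU_PARTITION_NAME>
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=20g
#SBATCH --time=00:30:00           # walltime limit; adjust to your cluster

module load cuda                  # module name varies by site
nvcc -o vector_add vector_add.cu  # compile an example from this repo
srun ./vector_add                 # run on the allocated GPU
```

Submit it with `sbatch <script_name>.sh`; SLURM queues the job and runs it when the requested resources become available, which is often more convenient than holding an interactive session open.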

Ray, SkyPilot, Kubernetes, Transformer Lab GPU Orchestration, NVIDIA Base Command Platform, LLsub, and other tools also manage GPU resources, but SLURM remains the standard in academic and research environments. Kubernetes is more common in production environments, while Ray and SkyPilot target distributed computing workloads.
