Implement CUDA tensor operations #7

@kingjuno

Title: Implement CUDA tensor operations

Description:

We need at least the following tensor operations implemented in CUDA. The rest can be implemented later.

Tasks:

  • Implement cpu_to_cuda to copy tensor data from CPU to CUDA.
  • Implement cuda_to_cpu to copy tensor data from CUDA to CPU.
  • Implement free_cuda to free CUDA memory allocated for tensor data.
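A minimal sketch of what the three memory helpers could look like. The `Tensor` struct, its fields, and the `CUDA_CHECK` macro are assumptions made for illustration (the issue does not specify the real tensor layout or signatures); only `cudaMalloc`, `cudaMemcpy`, and `cudaFree` are actual CUDA runtime calls.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical minimal tensor; the project's real struct may differ.
struct Tensor {
    float* data;     // host pointer or device pointer, see on_device
    int size;        // number of elements
    bool on_device;  // true when data points to CUDA memory
};

// Abort with a readable message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Copy tensor data from CPU to CUDA and release the host buffer.
// Assumes the host buffer was allocated with malloc.
void cpu_to_cuda(Tensor* t) {
    assert(t != nullptr);
    if (t->on_device) return;
    size_t bytes = (size_t)t->size * sizeof(float);
    float* d_data = nullptr;
    CUDA_CHECK(cudaMalloc((void**)&d_data, bytes));
    CUDA_CHECK(cudaMemcpy(d_data, t->data, bytes, cudaMemcpyHostToDevice));
    free(t->data);
    t->data = d_data;
    t->on_device = true;
}

// Copy tensor data from CUDA to CPU and release the device buffer.
void cuda_to_cpu(Tensor* t) {
    assert(t != nullptr);
    if (!t->on_device) return;
    size_t bytes = (size_t)t->size * sizeof(float);
    float* h_data = (float*)malloc(bytes);
    if (h_data == nullptr) {
        fprintf(stderr, "host allocation failed\n");
        exit(EXIT_FAILURE);
    }
    CUDA_CHECK(cudaMemcpy(h_data, t->data, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(t->data));
    t->data = h_data;
    t->on_device = false;
}

// Free CUDA memory allocated for tensor data.
void free_cuda(float* data) {
    CUDA_CHECK(cudaFree(data));
}
```

In this sketch each transfer takes ownership of the old buffer, so a tensor never holds a host and a device copy at the same time. Keeping both copies and tracking which one is current would be an equally valid design.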

Kernel and Host Functions:

  • Implement add_tensor_cuda_kernel for elementwise tensor addition.
  • Implement add_broadcasted_tensor_cuda_kernel for elementwise tensor addition with broadcasting.
  • Implement sub_broadcasted_tensor_cuda_kernel for elementwise tensor subtraction with broadcasting.
  • Implement sum_tensor_cuda_kernel for summing elements in a tensor.
  • Implement max_tensor_cuda_kernel for finding the maximum value in a tensor.
  • Implement min_tensor_cuda_kernel for finding the minimum value in a tensor.
  • Implement sub_tensor_cuda_kernel for elementwise tensor subtraction.
  • Implement elementwise_mul_tensor_cuda_kernel for elementwise tensor multiplication.
  • Implement scalar_mul_tensor_cuda_kernel for scalar multiplication of a tensor.
  • Implement scalar_div_tensor_cuda_kernel for dividing a scalar by each element of a tensor.
  • Implement tensor_div_scalar_cuda_kernel for dividing each element of a tensor by a scalar.
  • Implement tensor_div_tensor_cuda_kernel for elementwise tensor division.
  • Implement matmul_tensor_cuda_kernel for matrix multiplication of two tensors.
  • Implement batched_matmul_tensor_cuda_kernel for batched matrix multiplication.
  • Implement broadcasted_batched_matmul_tensor_cuda_kernel for batched matrix multiplication with broadcasting.
  • Implement tensor_pow_scalar_cuda_kernel for raising tensor elements to a scalar power.
  • Implement scalar_pow_tensor_cuda_kernel for raising a scalar base to the power of each tensor element.
  • Implement log_tensor_cuda_kernel for computing the logarithm of each element in a tensor.
  • Implement equal_tensor_cuda_kernel for checking elementwise equality between two tensors.
  • Implement equal_broadcasted_tensor_cuda_kernel for checking elementwise equality between two tensors with broadcasting.
  • Implement ones_like_tensor_cuda_kernel for creating a tensor of ones with the same shape as the input tensor.
  • Implement zeros_like_tensor_cuda_kernel for creating a tensor of zeros with the same shape as the input tensor.
  • Implement transpose_1D_tensor_cuda_kernel for transposing a 1D tensor.
  • Implement transpose_2D_tensor_cuda_kernel for transposing a 2D tensor.
  • Implement transpose_3D_tensor_cuda_kernel for transposing a 3D tensor.
  • Implement assign_tensor_cuda_kernel for assigning data to a tensor.
  • Implement make_contiguous_tensor_cuda_kernel for making a tensor contiguous in memory.
  • Implement sin_tensor_cuda_kernel for applying the sine function elementwise to a tensor.
  • Implement cos_tensor_cuda_kernel for applying the cosine function elementwise to a tensor.
  • Implement sigmoid_tensor_cuda_kernel for applying the sigmoid function elementwise to a tensor.
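To give an idea of the kernel and host-function split, here is a rough sketch of the plain and broadcasted addition kernels. Only the two kernel names come from the list above. The parameter lists, the `BroadcastDesc` struct, `MAX_DIMS`, `make_broadcast_desc`, `add_broadcasted_host`, and `check` are hypothetical. The sketch also assumes float data, contiguous inputs, and shapes already padded to the same number of dimensions.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <cstdio>
#include <cstdlib>

#define MAX_DIMS 8  // assumption: upper bound on tensor rank

// Elementwise out[i] = a[i] + b[i] for same-shape contiguous tensors.
__global__ void add_tensor_cuda_kernel(const float* a, const float* b, float* out, int n) {
    // Grid-stride loop: correct for any n, whatever the launch size.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += blockDim.x * gridDim.x)
        out[i] = a[i] + b[i];
}

// Hypothetical descriptor passed by value to the kernel.
struct BroadcastDesc {
    int ndim;
    int out_shape[MAX_DIMS];
    int a_strides[MAX_DIMS];  // element strides, 0 on broadcast dimensions
    int b_strides[MAX_DIMS];
};

// Addition with broadcasting: each output index is unravelled into
// per-dimension coordinates, then mapped to input offsets via the strides.
__global__ void add_broadcasted_tensor_cuda_kernel(const float* a, const float* b, float* out,
                                                   BroadcastDesc d, int n) {
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += blockDim.x * gridDim.x) {
        int rem = i, ai = 0, bi = 0;
        for (int k = d.ndim - 1; k >= 0; --k) {
            int idx = rem % d.out_shape[k];
            rem /= d.out_shape[k];
            ai += idx * d.a_strides[k];
            bi += idx * d.b_strides[k];
        }
        out[i] = a[ai] + b[bi];
    }
}

// Build the descriptor on the host; a size-1 dimension gets stride 0.
BroadcastDesc make_broadcast_desc(const int* a_shape, const int* b_shape, int ndim) {
    assert(ndim > 0 && ndim <= MAX_DIMS);
    BroadcastDesc d = {};
    d.ndim = ndim;
    int a_run = 1, b_run = 1;
    for (int k = ndim - 1; k >= 0; --k) {
        assert(a_shape[k] > 0 && b_shape[k] > 0);
        assert(a_shape[k] == b_shape[k] || a_shape[k] == 1 || b_shape[k] == 1);
        d.out_shape[k] = a_shape[k] > b_shape[k] ? a_shape[k] : b_shape[k];
        d.a_strides[k] = (a_shape[k] == 1) ? 0 : a_run;
        d.b_strides[k] = (b_shape[k] == 1) ? 0 : b_run;
        a_run *= a_shape[k];
        b_run *= b_shape[k];
    }
    return d;
}

// Exit with a message if a CUDA runtime call failed.
static void check(cudaError_t err) {
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Host function: copy inputs to the device, launch the kernel, copy back.
void add_broadcasted_host(const float* h_a, const int* a_shape, const float* h_b,
                          const int* b_shape, int ndim, float* h_out) {
    BroadcastDesc d = make_broadcast_desc(a_shape, b_shape, ndim);
    int na = 1, nb = 1, n = 1;
    for (int k = 0; k < ndim; ++k) { na *= a_shape[k]; nb *= b_shape[k]; n *= d.out_shape[k]; }
    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    check(cudaMalloc((void**)&d_a, na * sizeof(float)));
    check(cudaMalloc((void**)&d_b, nb * sizeof(float)));
    check(cudaMalloc((void**)&d_out, n * sizeof(float)));
    check(cudaMemcpy(d_a, h_a, na * sizeof(float), cudaMemcpyHostToDevice));
    check(cudaMemcpy(d_b, h_b, nb * sizeof(float), cudaMemcpyHostToDevice));
    int threads = 256, blocks = (n + threads - 1) / threads;
    add_broadcasted_tensor_cuda_kernel<<<blocks, threads>>>(d_a, d_b, d_out, d, n);
    check(cudaGetLastError());
    check(cudaDeviceSynchronize());
    check(cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost));
    check(cudaFree(d_a)); check(cudaFree(d_b)); check(cudaFree(d_out));
}
```

For example, adding a tensor of shape (2, 3) holding 0..5 to a tensor of shape (1, 3) holding 10, 20, 30 would reuse the single row of the second tensor for both output rows. The subtraction, multiplication, division, and equality kernels could follow the same pattern with only the per-element expression changed.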

Labels: enhancement
