Open
Labels: enhancement (New feature or request)
Title: Implement CUDA tensor operations
Description:
We need at least the following tensor operations implemented in CUDA. The rest can be implemented later.
Tasks:
- Implement `cpu_to_cuda` to copy tensor data from CPU to CUDA.
- Implement `cuda_to_cpu` to copy tensor data from CUDA to CPU.
- Implement `free_cuda` to free CUDA memory allocated for tensor data.
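
As a starting point, the three memory tasks above could be sketched roughly as follows. This is only a sketch under stated assumptions: the `Tensor` struct, its `on_device` flag, and the `check_cuda` helper are hypothetical and not taken from the existing code base, and host buffers are assumed to come from `malloc`. Only standard CUDA runtime calls (`cudaMalloc`, `cudaMemcpy`, `cudaFree`) are used.

```cuda
#include <assert.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical minimal tensor layout, assumed for illustration only.
typedef struct {
    float *data;    // element buffer, in host or device memory
    int size;       // total number of elements
    int on_device;  // 1 if data lives in CUDA memory, 0 if on the host
} Tensor;

// Abort with a readable message if a CUDA runtime call fails.
static void check_cuda(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Free a device buffer previously allocated with cudaMalloc.
void free_cuda(float *data) {
    check_cuda(cudaFree(data), "cudaFree");
}

// Move the tensor's data from host memory to device memory.
void cpu_to_cuda(Tensor *t) {
    assert(t != NULL);
    if (t->on_device) return;  // already on the GPU, nothing to do
    size_t bytes = (size_t)t->size * sizeof(float);
    float *device_data = NULL;
    check_cuda(cudaMalloc((void **)&device_data, bytes), "cudaMalloc");
    check_cuda(cudaMemcpy(device_data, t->data, bytes, cudaMemcpyHostToDevice),
               "cudaMemcpy (host to device)");
    free(t->data);  // assumes the host buffer was allocated with malloc
    t->data = device_data;
    t->on_device = 1;
}

// Move the tensor's data from device memory back to host memory.
void cuda_to_cpu(Tensor *t) {
    assert(t != NULL);
    if (!t->on_device) return;  // already on the host, nothing to do
    size_t bytes = (size_t)t->size * sizeof(float);
    float *host_data = (float *)malloc(bytes);
    if (host_data == NULL) {
        fprintf(stderr, "malloc failed\n");
        exit(EXIT_FAILURE);
    }
    check_cuda(cudaMemcpy(host_data, t->data, bytes, cudaMemcpyDeviceToHost),
               "cudaMemcpy (device to host)");
    free_cuda(t->data);
    t->data = host_data;
    t->on_device = 0;
}
```

Tracking the location in the tensor itself (here the assumed `on_device` flag) keeps repeated transfers harmless, but the real tensor type may already record the device in a different way.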
Kernel and Host Functions:
- Implement `add_tensor_cuda_kernel` for elementwise tensor addition.
- Implement `add_broadcasted_tensor_cuda_kernel` for elementwise tensor addition with broadcasting.
- Implement `sub_broadcasted_tensor_cuda_kernel` for elementwise tensor subtraction with broadcasting.
- Implement `sum_tensor_cuda_kernel` for summing the elements of a tensor.
- Implement `max_tensor_cuda_kernel` for finding the maximum value in a tensor.
- Implement `min_tensor_cuda_kernel` for finding the minimum value in a tensor.
- Implement `sub_tensor_cuda_kernel` for elementwise tensor subtraction.
- Implement `elementwise_mul_tensor_cuda_kernel` for elementwise tensor multiplication.
- Implement `scalar_mul_tensor_cuda_kernel` for multiplying a tensor by a scalar.
- Implement `scalar_div_tensor_cuda_kernel` for dividing a scalar by each element of a tensor.
- Implement `tensor_div_scalar_cuda_kernel` for dividing each element of a tensor by a scalar.
- Implement `tensor_div_tensor_cuda_kernel` for elementwise tensor division.
- Implement `matmul_tensor_cuda_kernel` for matrix multiplication of two tensors.
- Implement `batched_matmul_tensor_cuda_kernel` for batched matrix multiplication.
- Implement `broadcasted_batched_matmul_tensor_cuda_kernel` for batched matrix multiplication with broadcasting.
- Implement `tensor_pow_scalar_cuda_kernel` for raising each tensor element to a scalar power.
- Implement `scalar_pow_tensor_cuda_kernel` for raising a scalar base to the power of each tensor element.
- Implement `log_tensor_cuda_kernel` for computing the logarithm of each element in a tensor.
- Implement `equal_tensor_cuda_kernel` for checking elementwise equality between two tensors.
- Implement `equal_broadcasted_tensor_cuda_kernel` for checking elementwise equality between two tensors with broadcasting.
- Implement `ones_like_tensor_cuda_kernel` for creating a tensor of ones with the same shape as the input tensor.
- Implement `zeros_like_tensor_cuda_kernel` for creating a tensor of zeros with the same shape as the input tensor.
- Implement `transpose_1D_tensor_cuda_kernel` for transposing a 1D tensor.
- Implement `transpose_2D_tensor_cuda_kernel` for transposing a 2D tensor.
- Implement `transpose_3D_tensor_cuda_kernel` for transposing a 3D tensor.
- Implement `assign_tensor_cuda_kernel` for assigning data to a tensor.
- Implement `make_contiguous_tensor_cuda_kernel` for making a tensor contiguous in memory.
- Implement `sin_tensor_cuda_kernel` for applying the sine function elementwise to a tensor.
- Implement `cos_tensor_cuda_kernel` for applying the cosine function elementwise to a tensor.
- Implement `sigmoid_tensor_cuda_kernel` for applying the sigmoid function elementwise to a tensor.
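
To make the kernel/host split concrete, here is a rough sketch of two of the kernels listed above: an elementwise one (`add_tensor_cuda_kernel`) and a reduction (`sum_tensor_cuda_kernel`). The kernel signatures, the host wrappers `add_tensor_cuda` and `sum_tensor_cuda`, the `launch_check` helper, and the `THREADS_PER_BLOCK` value are all assumptions made for illustration; the final interfaces may differ. The sum uses a shared-memory tree reduction per block followed by `atomicAdd`, which is one common approach rather than the required one.

```cuda
#include <assert.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Assumed block size; must be a power of two for the reduction below.
#define THREADS_PER_BLOCK 256

// Elementwise addition: one thread per output element.
__global__ void add_tensor_cuda_kernel(const float *a, const float *b,
                                       float *out, int size) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < size) {
        out[i] = a[i] + b[i];
    }
}

// Sum reduction: each block reduces its slice in shared memory, then
// thread 0 adds the block's partial sum to *result atomically.
// *result must be zeroed before the launch.
__global__ void sum_tensor_cuda_kernel(const float *data, float *result,
                                       int size) {
    __shared__ float partial[THREADS_PER_BLOCK];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    partial[tid] = (i < size) ? data[i] : 0.0f;
    __syncthreads();
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            partial[tid] += partial[tid + stride];
        }
        __syncthreads();
    }
    if (tid == 0) {
        atomicAdd(result, partial[0]);
    }
}

// Report launch or execution errors and wait for the kernel to finish.
static void launch_check(const char *name) {
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", name, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Hypothetical host wrapper; a, b and out must be device pointers.
void add_tensor_cuda(const float *a, const float *b, float *out, int size) {
    assert(size >= 0);
    if (size == 0) return;
    int blocks = (size + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    add_tensor_cuda_kernel<<<blocks, THREADS_PER_BLOCK>>>(a, b, out, size);
    launch_check("add_tensor_cuda_kernel");
}

// Hypothetical host wrapper; data and result must be device pointers.
void sum_tensor_cuda(const float *data, float *result, int size) {
    assert(size >= 0);
    cudaMemset(result, 0, sizeof(float));  // accumulator starts at zero
    if (size == 0) return;
    int blocks = (size + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    sum_tensor_cuda_kernel<<<blocks, THREADS_PER_BLOCK>>>(data, result, size);
    launch_check("sum_tensor_cuda_kernel");
}
```

The other elementwise kernels (subtraction, multiplication, `sin`, `cos`, sigmoid and so on) would likely follow the same index-and-bounds-check pattern as the addition kernel, while `max` and `min` could reuse the reduction structure with a different combine step. Note that floating-point `atomicAdd` makes the summation order nondeterministic, so results can differ slightly between runs on large inputs.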