Open
Description
Device: NVIDIA GeForce RTX 4090 D
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0
gcc version 13.2.0 (Ubuntu 13.2.0-23ubuntu4)
torch 2.3.0
Python 3.10
Running the example code fails at the line f = host.predict_async(prompt_tokens, 32) with the following error: RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasGemmEx(handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`
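To help narrow this down, here is a minimal sketch that runs a single half-precision matrix multiply on the GPU using only PyTorch, outside of the example code. It assumes that a plain fp16 matmul reaches the same kind of cuBLAS half-precision GEMM call named in the error, which may not hold for every shape or build. The function name `check_fp16_gemm` and the 64x64 sizes are hypothetical choices for illustration and are not taken from the project.

```python
def check_fp16_gemm(m=64, n=64, k=64):
    """Run one fp16 matmul on the GPU and report what happened.

    Returns "ok", "failed: <message>", or "skipped: <reason>".
    The name and default sizes are illustrative assumptions.
    """
    try:
        import torch
    except ImportError:
        return "skipped: torch is not installed"

    if not torch.cuda.is_available():
        return "skipped: no CUDA device visible"

    try:
        # Two half-precision operands on the GPU.
        a = torch.randn(m, k, device="cuda", dtype=torch.float16)
        b = torch.randn(k, n, device="cuda", dtype=torch.float16)
        c = a @ b
        # CUDA kernels run asynchronously; synchronize so that any
        # error surfaces here rather than at a later, unrelated call.
        torch.cuda.synchronize()
    except RuntimeError as exc:
        return f"failed: {exc}"
    return "ok"


if __name__ == "__main__":
    print(check_fp16_gemm())
```

If this prints a `failed:` message containing CUBLAS_STATUS_NOT_SUPPORTED, the problem is likely in the torch/CUDA/driver combination rather than in the example code itself. If it prints `ok`, the failure may depend on the specific shapes or strides used inside predict_async.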