build error and runtime error #2

Environment:
Device: NVIDIA GeForce RTX 4090 D
CUDA toolkit: Cuda compilation tools, release 12.2, V12.2.91 (Build cuda_12.2.r12.2/compiler.32965470_0)
gcc: 13.2.0 (Ubuntu 13.2.0-23ubuntu4)
torch: 2.3.0
Python: 3.10

When running the example code, the call `f = host.predict_async(prompt_tokens, 32)` fails with the following error:

`RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling cublasGemmEx(handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`
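
To help narrow this down, here is a minimal sketch that checks whether a plain half-precision matrix multiply works through PyTorch on the same machine. It is independent of the project's example code. The function name `check_fp16_gemm` and the matrix size are my own choices, not part of the project. I am assuming that an fp16 matmul in PyTorch exercises a similar cuBLAS path to the failing `cublasGemmEx` call, which may not hold for every torch build. The script also prints the CUDA version torch was built for, which can differ from the system `nvcc` version listed above.

```python
import importlib.util


def check_fp16_gemm(size: int = 64) -> str:
    """Run a small fp16 matmul on the GPU and report the outcome as a string.

    Hypothetical helper for diagnosis only; not part of the project.
    """
    if importlib.util.find_spec("torch") is None:
        return "skipped: torch is not installed"
    import torch

    if not torch.cuda.is_available():
        return "skipped: CUDA is not available to torch"

    # Versions as seen by torch; the CUDA version here is the one torch
    # was built for, not necessarily the system toolkit version.
    print("torch:", torch.__version__, "| built for CUDA:", torch.version.cuda)
    print("device:", torch.cuda.get_device_name(0),
          "| compute capability:", torch.cuda.get_device_capability(0))

    a = torch.randn(size, size, device="cuda", dtype=torch.float16)
    b = torch.randn(size, size, device="cuda", dtype=torch.float16)
    try:
        c = a @ b
        # Force the kernel to finish so any CUDA error surfaces here.
        torch.cuda.synchronize()
    except RuntimeError as exc:
        return f"failed: {exc}"
    return f"ok: result shape {tuple(c.shape)}, dtype {c.dtype}"


if __name__ == "__main__":
    print(check_fp16_gemm())
```

If this reports `failed` with the same `CUBLAS_STATUS_NOT_SUPPORTED` message, the problem is more likely in the torch/CUDA/driver setup than in the example code. If it reports `ok`, the failure is probably specific to how the project calls into cuBLAS.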
