build error and runtime error #2

@15963064649

Device: NVIDIA GeForce RTX 4090 D
CUDA compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0
gcc version 13.2.0 (Ubuntu 13.2.0-23ubuntu4)
torch 2.3.0
Python 3.10

Running the example code fails at `f = host.predict_async(prompt_tokens, 32)` with the following error:

RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasGemmEx(handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`
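
To help narrow this down, here is a minimal sketch that runs a single half-precision matmul in plain PyTorch, outside the example code. It assumes the failing `cublasGemmEx` call (fp16 inputs, fp32 compute) is the kind of GEMM an fp16 `torch` matmul would issue; that is a guess based on the error message, not something confirmed from the project source. `check_fp16_gemm` is a hypothetical helper name and the 64x64 shapes are arbitrary.

```python
import importlib.util


def check_fp16_gemm(m=64, n=64, k=64):
    """Run one fp16 matmul on the GPU and return a short status string.

    Hypothetical diagnostic helper; not part of the project in this issue.
    """
    # Skip cleanly when torch is missing so the script still runs anywhere.
    if importlib.util.find_spec("torch") is None:
        return "skipped: torch is not installed"
    import torch

    if not torch.cuda.is_available():
        return "skipped: no CUDA device visible to torch"

    # fp16 operands on the GPU, matching the CUDA_R_16F inputs in the error.
    a = torch.randn(m, k, device="cuda", dtype=torch.float16)
    b = torch.randn(k, n, device="cuda", dtype=torch.float16)
    try:
        c = a @ b
        # CUDA work is asynchronous; synchronize so any error surfaces here.
        torch.cuda.synchronize()
    except RuntimeError as exc:
        return f"failed: {exc}"
    return f"ok: result shape {tuple(c.shape)} on {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(check_fp16_gemm())
```

If this reports `ok`, the problem is more likely tied to the specific shapes, strides, or alignment the example passes to the GEMM; if it reports `failed` with the same cuBLAS status, the problem is probably in the torch/CUDA setup rather than in the example code.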
