There is an OOM error #1

@gokasiko

I followed the steps on a server with 120 GB of CPU memory, but training fails with an OOM error and the process is killed.
Before the process was terminated, I saw that all 120 GB of CPU memory had been consumed while GPU memory was not used at all.
How much CPU memory is needed to run the model?
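
In case it helps narrow down where host memory runs out, here is a minimal sketch of how peak CPU memory could be logged at a few points in a run. It is not taken from this repository: `peak_rss_gib`, `log_memory`, and the stage labels are hypothetical names, and the list allocation is only a stand-in for the real data-loading step. It uses Python's standard `resource` module, so it assumes a Unix-like system.

```python
import resource
import sys


def peak_rss_gib() -> float:
    """Return this process's peak resident set size in GiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 ** 3 if sys.platform == "darwin" else 1024 ** 2
    return peak / divisor


def log_memory(stage: str) -> str:
    """Print and return a one-line memory report for a named stage."""
    line = f"[{stage}] peak host RSS: {peak_rss_gib():.2f} GiB"
    print(line)
    return line


if __name__ == "__main__":
    log_memory("start")
    data = [0] * 10_000_000  # stand-in for the real data-loading step
    log_memory("after loading data")
```

Calling `log_memory` before and after each preprocessing or loading step would show which stage accounts for most of the growth before the process is killed.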
