HMM Breaks Process Memory Metrics reporting #106

@sidewinder12s

Is this a known issue?

I have a sample CUDA workload that allocates memory.
Before it starts, I set up nv-hostengine to run as root and then call StartPIDWatch.

In another process running within a container, I then run the sample CUDA workload, which collects its own process statistics with GetProcessInfo.

If I run this on A10 and T4 GPUs with the 580 series driver, I get GlobalUsed memory metrics.

When I run this on an L4 GPU with HMM addressing mode enabled, GlobalUsed always reports 0. Is this a known deficiency? Even with an updated driver, kernel, etc., HMM isn't enabled on T4 and A10 GPUs, and we don't see this issue there.
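
For anyone trying to reproduce this, here is a minimal sketch of how the addressing mode could be confirmed on each host. It assumes `nvidia-smi -q` prints an `Addressing Mode : <value>` line per GPU, as recent drivers appear to do. The helper names `parse_addressing_modes` and `query_addressing_modes` are hypothetical and not part of DCGM or any NVIDIA tooling.

```python
import re
import subprocess
from typing import List

# Assumption: `nvidia-smi -q` emits one "Addressing Mode : <value>" line per GPU.
_ADDRESSING_MODE = re.compile(r"^\s*Addressing Mode\s*:\s*(\S+)\s*$", re.MULTILINE)


def parse_addressing_modes(smi_output: str) -> List[str]:
    """Return the value of every 'Addressing Mode' line found in the text."""
    return _ADDRESSING_MODE.findall(smi_output)


def query_addressing_modes() -> List[str]:
    """Run `nvidia-smi -q` and parse its output (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "-q"], capture_output=True, text=True, check=True
    )
    return parse_addressing_modes(result.stdout)


if __name__ == "__main__":
    # Illustrative sample text, not captured from a real machine.
    sample = (
        "GPU 00000000:00:1E.0\n"
        "    Product Name                          : Example GPU\n"
        "    Addressing Mode                       : HMM\n"
    )
    print(parse_addressing_modes(sample))
```

Calling `query_addressing_modes()` on the L4 host and on the T4/A10 hosts would show which of them actually have HMM active when GlobalUsed reads 0.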
