fix(utils): add GPU warm-up for profiling #62

Open
YutongJau wants to merge 1 commit into aliyun:master from YutongJau:fix/gpu-warmup
@YutongJau

Motivation

The first execution of measure_model includes significant overhead from CUDA context initialization and lazy kernel loading, which heavily skews the profiling data. For example, the Emb layer measured roughly 35,000 ms on the first run versus a stable 550 ms afterwards. These skewed measurements become inaccurate inputs to the AIOB simulator.

Changes

  • Implemented a 10-step warm-up loop in utils/utils.py to ensure the GPU is fully initialized before profiling.
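The warm-up pattern described above can be sketched roughly as follows. This is an illustrative, framework-agnostic sketch and not the code in utils/utils.py: the names `measure_with_warmup`, `step`, `timed_steps`, and `sync` are hypothetical. It assumes the profiled work can be wrapped in a zero-argument callable, and that on a GPU a blocking call such as `torch.cuda.synchronize` would be passed as `sync` so that asynchronous kernels finish before the timer is read.

```python
import time
from typing import Callable, List, Optional


def measure_with_warmup(
    step: Callable[[], None],
    warmup_steps: int = 10,
    timed_steps: int = 5,
    sync: Optional[Callable[[], None]] = None,
) -> List[float]:
    """Run `step` untimed for `warmup_steps`, then return per-step latencies in ms.

    `sync` is a hypothetical hook: on a GPU it should block until all queued
    work is done (for example torch.cuda.synchronize); on CPU it can be None.
    """

    def _sync() -> None:
        if sync is not None:
            sync()

    # Warm-up: absorbs one-time costs (context creation, lazy kernel loading).
    for _ in range(warmup_steps):
        step()
    _sync()  # make sure warm-up work has finished before timing starts

    timings_ms: List[float] = []
    for _ in range(timed_steps):
        start = time.perf_counter()
        step()
        _sync()  # wait for asynchronous work so the interval covers it
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms
```

Calling `measure_with_warmup(lambda: model(batch), sync=torch.cuda.synchronize)` would be one way to use it, where `model` and `batch` are placeholders for whatever measure_model actually profiles.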

Impact

Eliminates cold-start outliers and improves profiling accuracy.

Verification (Environment: NVIDIA RTX 4090):

| Metric (ms) | No Warm-up | 10-step Warm-up | Status     |
|-------------|------------|-----------------|------------|
| Emb         | 35,479.4   | 550.0           | Stabilized |
| Param       | 4,974.6    | 1,496.5         | Stabilized |

@CLAassistant

CLAassistant commented Feb 2, 2026

CLA assistant check
All committers have signed the CLA.

