Open
Description
The simplest approach is to do this in the trainer: iterate over model.variables(), group the parameters by their model.layers.N name prefix, and log a gradient norm per layer. This relies on parameter names and extract_layer_index in shared/modeling/src/distro.rs rather than on state stored in the block. Keeping layer_idx inside the block only helps if you want to log forward activations or other per-layer debug metrics from within the block; on its own it does not give you grad norms.
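A minimal sketch of the name-based grouping, using only the standard library. Everything here is an assumption for illustration: `layer_index` is a hypothetical stand-in for `extract_layer_index` (its real signature is not shown in this issue), and gradients are represented as flat `f32` slices rather than the trainer's actual tensor type.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `extract_layer_index`: pulls N out of a
/// parameter name containing "model.layers.N.". Returns None for
/// parameters that do not belong to a numbered layer (e.g. embeddings).
fn layer_index(name: &str) -> Option<usize> {
    let rest = name.split("model.layers.").nth(1)?;
    let digits = rest.split('.').next()?;
    digits.parse().ok()
}

/// Computes one L2 gradient norm per layer from (name, flat gradient) pairs.
/// Squared sums are accumulated across all parameters of a layer, and the
/// square root is taken once at the end.
fn per_layer_grad_norms<'a, I>(grads: I) -> BTreeMap<usize, f64>
where
    I: IntoIterator<Item = (&'a str, &'a [f32])>,
{
    let mut sq_sums: BTreeMap<usize, f64> = BTreeMap::new();
    for (name, grad) in grads {
        // Parameters outside model.layers.N are skipped in this sketch.
        if let Some(idx) = layer_index(name) {
            let sq: f64 = grad.iter().map(|&g| (g as f64) * (g as f64)).sum();
            *sq_sums.entry(idx).or_insert(0.0) += sq;
        }
    }
    sq_sums.into_iter().map(|(idx, s)| (idx, s.sqrt())).collect()
}
```

In the trainer, the (name, gradient) pairs would come from the entries of model.variables() after the backward pass, and the resulting map can be logged once per step. Nothing in this path needs layer_idx to be stored on the block.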