
Add per-layer gradient norm logging #9

@plugyawn

Description


The simplest way is in the trainer: iterate over model.variables(), group parameters by their model.layers.N prefix, and log one gradient norm per layer. That relies on parameter names and extract_layer_index in shared/modeling/src/distro.rs rather than on block state. Keeping layer_idx inside the block only helps if you want to log forward activations or other per-layer debug metrics from inside the block; it doesn't give you gradient norms by itself.
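A minimal sketch of the grouping step described above, assuming model.variables() yields (name, gradient) pairs; here the gradients are stand-in Vec<f32> buffers, and this extract_layer_index is a hypothetical re-implementation modeled on the one in shared/modeling/src/distro.rs, not the actual function:

```rust
use std::collections::BTreeMap;

// Hypothetical helper mirroring extract_layer_index in
// shared/modeling/src/distro.rs: pulls N out of "model.layers.N.<rest>".
fn extract_layer_index(name: &str) -> Option<usize> {
    let rest = name.strip_prefix("model.layers.")?;
    rest.split('.').next()?.parse().ok()
}

// Group gradients by layer index and compute one L2 norm per layer.
// `grads` stands in for whatever (name, gradient) pairs model.variables()
// yields in the real trainer.
fn per_layer_grad_norms(grads: &[(String, Vec<f32>)]) -> BTreeMap<usize, f32> {
    let mut sq_sums: BTreeMap<usize, f32> = BTreeMap::new();
    for (name, grad) in grads {
        if let Some(idx) = extract_layer_index(name) {
            // Accumulate squared entries so multiple tensors in the same
            // layer (attn, mlp, norms) fold into a single layer norm.
            let sq: f32 = grad.iter().map(|g| g * g).sum();
            *sq_sums.entry(idx).or_insert(0.0) += sq;
        }
    }
    sq_sums.into_iter().map(|(idx, sq)| (idx, sq.sqrt())).collect()
}

fn main() {
    let grads = vec![
        ("model.layers.0.attn.weight".to_string(), vec![3.0, 4.0]),
        ("model.layers.1.mlp.weight".to_string(), vec![5.0, 12.0]),
        // No model.layers.N prefix (e.g. embeddings), so it is skipped.
        ("model.embed.weight".to_string(), vec![1.0]),
    ];
    for (idx, norm) in per_layer_grad_norms(&grads) {
        println!("layer {idx}: grad_norm = {norm}");
    }
}
```

In the real trainer the per-tensor squared sum would come from the gradient tensor's own norm op rather than a Vec, but the name-based grouping is the same.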
