From ecdd8f45208341a07e8e77d8786699cbfbdbd06c Mon Sep 17 00:00:00 2001
From: Chen Cui
Date: Wed, 4 Mar 2026 10:37:35 -0800
Subject: [PATCH] Add W&B report

Signed-off-by: Chen Cui
---
 examples/models/gpt_oss/README.md | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/examples/models/gpt_oss/README.md b/examples/models/gpt_oss/README.md
index 29b0e4fddd..6713779f21 100644
--- a/examples/models/gpt_oss/README.md
+++ b/examples/models/gpt_oss/README.md
@@ -94,19 +94,16 @@ cfg.dataset.path_to_cache = "/path/to/cache"
 
 Preprocess your data using the [DCLM data preprocessing tutorial](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/tutorials/data/dclm).
 
-W&B report coming soon.
-
 ### Supervised Fine-Tuning (SFT)
 
 See the [slurm_sft.sh](slurm_sft.sh) script for full parameter fine-tuning. The recipe uses sequence packing by default.
 
-W&B report coming soon.
-
 ### Parameter-Efficient Fine-Tuning (PEFT) with LoRA
 
 See the [slurm_peft.sh](slurm_peft.sh) script for LoRA fine-tuning. The recipe uses sequence packing by default.
 
-W&B report coming soon.
+### Expected Training Dynamics
+We provide a [Weights & Biases report](https://api.wandb.ai/links/nvidia-nemo-fw-public/xs3rmk4t) for the expected loss curves and grad norms.
 
 ## Inference
 
@@ -130,4 +127,4 @@ TP×PP×EP must equal `--nproc_per_node`. Adjust parallelism to match your SFT r
 
 ## Evaluation
 
-Coming soon.
\ No newline at end of file
+Coming soon.