Update main #3

Gastron · 2025-07-08T10:20:47Z

No description provided.

* Add model and data loading from minio
* Add deepspeed config example
* Validate starting of sync process, escape and quote path argument
* Wait 1s before checking if sync process started
* Fix for quotes in checkpointsRemote
* Update readme and other edits for clarity
* Update workloads/llm-finetune-llama-factory/helm/README.md

Co-authored-by: Aku Rouhe <akurouhe@amd.com>

---------

Co-authored-by: Aku Rouhe <akurouhe@amd.com>

emeirola and others added 12 commits June 26, 2025 13:12

Update silogen finetuning config doc (#353)

2e67e79

Correct overrides names and contents (#352)

e18a78f

VeRL finetuning workload (#340)

7a0eecb

* veRL GRPO finetuning ROCm example workload
* Refactor and complete VeRL workload
* Fix comments and typos

---------

Co-authored-by: Emil Eirola <emil.eirola@amd.com>

Evaluation Metrics MLFlow (#356)

bbe8dc9

* Add basic MLFlow export
* Upgrade ROCm image.
* Fix nested folders for artifacts on MLFlow EVEN BETTER!
* Fix extra f-string

---------

Co-authored-by: Sander Bijl de Vroe <Sander.BijldeVroe@amd.com>

Fix prefix use for LLM-as-a-judge evaluation. Upgrade ROCm images. Ha…

f86238a

…rmonise model names. Harmonise arg parser. (#354)

* Fix erroneously removed LLM client URL prefix.

Quote paths and escape chars in mc mirror (#335)

b46cb10

* Quote paths and escape chars in mc mirror
* Fix handling of minio paths
* Fix handling of quotes in echo statements

remove obsolete startupProbe code (#351)

ef6a559

update JupyterLab entrypoint for default kernel (#361)

e6309d1

Add on-boarding documentation for pre-commit (#349)

5d7f64a

* Add on-boarding documentation for pre-commit
* clarify cd in docs and fix <br />
* small edit

Fix auto single process and remove PLACEHOLDER dataset (#290)

c12377e

WandB Artifact downloader (#370)

e2fa226

* WandB downloader
* Make it work
* Correct override name, always mount
* No ephemeral storage, just emptyDir

Gastron requested a review from Brednas July 8, 2025 10:22

sarooshsh approved these changes Jul 8, 2025

Brednas approved these changes Jul 8, 2025

Gastron merged commit a77b46c into main Jul 8, 2025
2 checks passed
