Sandbox (+FA2 support, SharedContext for CompositeLoss, FP32 logits trainer side, ...) by joanvelja · Pull Request #17 · hallerite/ludic

joanvelja · 2025-12-28T14:44:39Z

The main contribution is the Sandbox backend (in src/ludic/envs/code_exec). Both the backend and the test file have READMEs explaining them. It has been thoroughly tested on HPC using podman-hpc, a daemonless wrapper, instead of Docker. It is still to be tested on Docker-based setups, though I do not expect crashes or other problems, since the orchestration logic is shared across sandbox hosts. Details on how the Sandbox works are in examples/code_exec/README.md.

The backend auto-detects the sandbox type (docker, podman-hpc), and support for further sandbox types (e.g., Singularity) can be added incrementally.
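For readers unfamiliar with this kind of auto-detection, here is a minimal sketch of one way it could work: probe PATH for each known runtime and take the first one found. This is not the code from the PR. The function name `detect_sandbox_backend`, the `CANDIDATE_BACKENDS` tuple, and the preference order (docker before podman-hpc) are all assumptions made for illustration.

```python
import shutil
from typing import Callable, Optional, Sequence

# Hypothetical preference order: the PR does not say which runtime wins
# when more than one is installed.
CANDIDATE_BACKENDS: Sequence[str] = ("docker", "podman-hpc")


def detect_sandbox_backend(
    candidates: Sequence[str] = CANDIDATE_BACKENDS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return the first candidate whose executable is found on PATH.

    `which` is injectable so the lookup can be faked in tests.
    """
    for name in candidates:
        if which(name) is not None:
            return name
    tried = ", ".join(candidates)
    raise RuntimeError(f"no supported container runtime found (tried: {tried})")
```

Adding another runtime in this sketch amounts to appending its executable name to the candidate tuple, which matches the extensibility the description mentions.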

The PR also contains a few QoL improvements: Flash-Attn support (automated, hardware-aware); FP32-upcast logits (trainer side only; the inference side requires patching vLLM or fetching someone else's patch (Minimax?)); ScaleRL (Meta's RL scaling-laws objective); and HybridCreditAssignment (mean computed within group, std computed within batch). It also fixes a bug in composite losses that caused a memory spike because logprobs were computed separately for every LossFn in the composite loss object; a new SharedContext avoids recomputing these shared values.
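The shared-context idea can be sketched in plain Python as a per-batch cache that every loss term reads from, so that an expensive value such as logprobs is computed once per batch rather than once per term. This is an illustration only, not the PR's implementation. The `get` method, the `CompositeLoss` constructor shape, and the two toy loss terms are assumptions, and lists of floats stand in for tensors.

```python
from typing import Any, Callable, Dict, List


class SharedContext:
    """Per-batch cache for values that several loss terms need (sketch)."""

    def __init__(self) -> None:
        self._cache: Dict[str, Any] = {}

    def get(self, key: str, compute: Callable[[], Any]) -> Any:
        # Compute on first request only; later requests reuse the cached value.
        if key not in self._cache:
            self._cache[key] = compute()
        return self._cache[key]


class CompositeLoss:
    """Sums loss terms that all receive the same SharedContext (sketch)."""

    def __init__(self, terms: List[Callable[[SharedContext], float]]) -> None:
        self.terms = terms

    def __call__(self, ctx: SharedContext) -> float:
        return sum(term(ctx) for term in self.terms)


# Toy demonstration: count how often the "expensive" step actually runs.
calls = {"logprobs": 0}


def expensive_logprobs() -> List[float]:
    calls["logprobs"] += 1
    return [-0.1, -0.7, -1.2]


def policy_term(ctx: SharedContext) -> float:
    lp = ctx.get("logprobs", expensive_logprobs)
    return -sum(lp) / len(lp)


def penalty_term(ctx: SharedContext) -> float:
    lp = ctx.get("logprobs", expensive_logprobs)
    return 0.01 * sum(x * x for x in lp)


loss = CompositeLoss([policy_term, penalty_term])
total = loss(SharedContext())  # a fresh context per batch
```

After this runs, `calls["logprobs"]` is 1 even though two terms requested the value. Creating a fresh `SharedContext` per batch keeps cached values from leaking across batches.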

joanvelja and others added 30 commits December 19, 2025 13:32

Sandbox: CodeExecEnv

5a137ab

Merge pull request #2 from hallerite/master

fd45541

type token traces & rollout extras; add some docs

smoketests for sandbox

3dedcb9

smoketests for sandbox

88b6275

API sync

63730f0

podman quirks

8206a5d

APPS trainer example + baseline (to be tested on HPC)

a0a9e8f

debugging isambard headaches

3b89fb4

i hate CUDA

d36c569

i hate CUDA

b93cc8c

revamp venv

11b9c51

some breaking changes, some other deprecation warnings torn down

6de833b

minimizing deps for podman-hpc

5ee2e57

Double async loop mistake

b4c4478

HF login

db8c528

Circuit breaker

812fa90

Merge pull request #3 from joanvelja/huggingface-login

5cd2fe7

HF login

Drain pipe

2e375f6

drainer -fix

db9087e

wandb to ignore

7655ecf

sandboxing issues

3f82ee3

contention

8b90ecd

config

4e6870a

Big change: from serial to batched test exec

64fc363

ignore checkpoints data

6b5c245

memory limit cgroup clash

d757aae

Merge branch 'apps_batched' of https://github.com/joanvelja/ludic int…

4871f2b

…o apps_batched

update

3be35a4

cache bug

a231a69

hangs.. inspecting

904f47a

joanvelja and others added 17 commits December 23, 2025 14:45

update

13c656e

deprecated concurrency args

c9c9bdd

parallel baby

8249153

subtle exec pool bug

1f0b1dd

new optim benching: volume

23749f6

Bind mount feature: 3 exec calls --> 1

7f805a9

visualization efforts

6fcaad1

Merge remote-tracking branch 'upstream/master' into dual-model

8c9aa56

Update the API with Hallerite's changes (grad_accum, algos) — for dry…

70cddbb

… run

lost in translation: podman workspace dir update

3bfdca3

path problems with dir

92ad2cf

path problems with dir

778731d

clean readme for sandbox execution

4c4225e

Memory efficient KL-div + ScaleRL recipe

e38448c

Flash attention

f53dabb

Merge pull request #5 from joanvelja/dual-model

848a822

Finished Sandbox + QoL (FP32 logits trainer side; flash attention support; shared context for CompositeLoss)

Cleanup personal files

321e8d7

joanvelja wants to merge 47 commits into hallerite:master from joanvelja:master
