feat: configurable evolver features and refactored loss returns #128
Merged
Conversation
vijk777 (Collaborator) commented on Jan 23, 2026
- add tv norm regularization
- add feature flag for evolver zero initialization
- refactor to use dict[LossType, Tensor] returns from the train step, which makes the code cleaner. no detectable compute overhead.
add a `zero_init` flag to EvolverParams to control whether the evolver
starts as the identity function (z_{t+1} = z_t). this provides training
stability but may slow dynamics learning.
reconstruction_warmup_epochs was already configurable in TrainingConfig;
it freezes the evolver while training the encoder/decoder on the reconstruction loss.
both features can now be toggled via config or CLI overrides:
- --evolver_params.zero_init false
- --training.reconstruction_warmup_epochs 10
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
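A minimal sketch of how a `zero_init` flag like this is typically wired up: the final layer of the evolver network is zero-initialized so the predicted update Δz starts at zero and the evolver begins as the identity map (z_{t+1} = z_t). Apart from `EvolverParams` and `zero_init`, the field names, layer sizes, and `stim_dim` argument are assumptions, not the repo's actual implementation.

```python
import torch
import torch.nn as nn
from dataclasses import dataclass

@dataclass
class EvolverParams:
    latent_dim: int = 256
    hidden_dim: int = 512   # hypothetical field name/size
    zero_init: bool = True  # start evolver as identity: z_{t+1} = z_t

class Evolver(nn.Module):
    """Predicts the latent update delta_z, so z_{t+1} = z_t + delta_z."""

    def __init__(self, params: EvolverParams, stim_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(params.latent_dim + stim_dim, params.hidden_dim),
            nn.ReLU(),
            nn.Linear(params.hidden_dim, params.latent_dim),
        )
        if params.zero_init:
            # zero the final layer so delta_z == 0 at initialization,
            # i.e. the evolver starts as the identity map
            nn.init.zeros_(self.net[-1].weight)
            nn.init.zeros_(self.net[-1].bias)

    def forward(self, z_t: torch.Tensor, stim: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z_t, stim], dim=-1))  # delta_z
```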
implement option B TV-norm regularization: directly penalize the
magnitude of evolver updates (Δz) using the L1 norm. this stabilizes
dynamics and prevents explosive rollouts during long-horizon evolution.
changes:
- add `tv_reg_loss` parameter to EvolverParams (default: 0.0)
- compute tv loss as ||Δz||₁ at each evolver step
- add TV_LOSS to LossType enum and logging
- conditional computation: only compute delta_z explicitly when tv_reg_loss > 0
- update all config files with tv_reg_loss (default 0.0, typical: 1e-5 to 1e-3)
implementation:
- when tv_reg_loss > 0: explicitly compute delta_z = evolver(z_t, stim)
then accumulate tv_loss += ||delta_z||₁ * coeff before updating z_{t+1} = z_t + delta_z
- when tv_reg_loss = 0: use original path for efficiency
typical usage:
python latent.py exp latent_20step.yaml --evolver_params.tv_reg_loss 0.0001
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
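A minimal sketch of the conditional Δz path described above: only the `tv_reg_loss` coefficient and the ||Δz||₁ accumulation come from the commit; the rollout function, its signature, and the tensor shapes are illustrative assumptions.

```python
import torch

def rollout(evolver, z0, stims, tv_reg_loss: float = 0.0):
    """Roll the latent forward, optionally accumulating a TV penalty on delta_z."""
    z = z0
    tv_loss = z0.new_zeros(())  # scalar accumulator on the same device/dtype
    zs = []
    for stim in stims:
        if tv_reg_loss > 0:
            # explicit delta_z path: penalize ||delta_z||_1, then take the step
            delta_z = evolver(z, stim)
            tv_loss = tv_loss + tv_reg_loss * delta_z.abs().sum(dim=-1).mean()
            z = z + delta_z
        else:
            # original path: no extra tensors kept around when the penalty is off
            z = z + evolver(z, stim)
        zs.append(z)
    return torch.stack(zs, dim=1), tv_loss  # [batch, steps, latent], scalar
```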
add benchmark comparing tuple, namedtuple, and dict return types with torch.compile to determine the best approach for loss returns.
results (cpu, reduce-overhead mode):
- tuple: 236.69 ± 24.02 µs/iter (baseline)
- namedtuple: 245.24 ± 21.82 µs/iter (+3.6% overhead)
- dict (enum keys): 244.83 ± 19.62 µs/iter (+3.4% overhead)
- dict (str keys): 322.72 ± 157.59 µs/iter (+36.4%, high variance)
conclusion: namedtuple has negligible overhead (<4%) and provides semantic access, type safety, and flexibility to omit unused fields.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
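A rough sketch of how such a return-type microbenchmark might be set up with torch.compile; the step functions, sizes, and timing loop here are illustrative, not the benchmark script added in this PR.

```python
import time
import torch

def step_tuple(x):
    recon = (x ** 2).mean()
    reg = x.abs().mean()
    return recon + reg, recon, reg

def step_dict(x):
    recon = (x ** 2).mean()
    reg = x.abs().mean()
    return {"total": recon + reg, "recon": recon, "reg": reg}

def bench(fn, x, iters=200, warmup=20):
    compiled = torch.compile(fn, mode="reduce-overhead")
    for _ in range(warmup):          # warm up so compilation is excluded
        compiled(x)
    if x.is_cuda:
        torch.cuda.synchronize()     # needed for accurate GPU timing
    t0 = time.perf_counter()
    for _ in range(iters):
        compiled(x)
    if x.is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - t0) / iters * 1e6  # µs/iter

x = torch.randn(256, 1000)
print(f"tuple: {bench(step_tuple, x):.1f} µs/iter")
print(f"dict:  {bench(step_dict, x):.1f} µs/iter")
```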
fix benchmark to use realistic training step computation instead of trivial mean/std operations. previous version was too small and showed compiled code being slower than uncompiled (nonsensical).
changes:
- simulate encoder/decoder with matrix multiplies and relu
- add multiple loss computations (recon, l1 reg, temporal smoothness)
- use batch_size=256, neurons=1000, latent=256 (realistic sizes)
- ensure proper cuda synchronization
- add compilation speedup metrics
this should show proper speedup from torch.compile and an accurate overhead comparison between tuple, namedtuple, and dict returns.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
replace tuple returns with dict[LossType, Tensor] for semantic access and programmatic tensorboard logging.
changes:
- train_step_nocompile: returns dict, built incrementally with losses[LossType.X] = value
- train_step_reconstruction_only_nocompile: returns dict with only the computed losses (total, recon, reg)
- loss accumulation: updated to work with dict instead of tuple indexing
- tensorboard logging: now programmatic, iterating with loss_type.name.lower()
benefits:
- semantic access: losses[LossType.RECON] instead of loss_tuple[1]
- flexible returns: warmup only returns computed losses
- programmatic logging: automatically logs all loss components
- type safe: enum keys prevent typos
benchmark showed dict with enum keys has <2% overhead vs tuple on gpu.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
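A minimal sketch of the dict[LossType, Tensor] pattern and the programmatic TensorBoard logging it enables, assuming a simplified LossType enum and loss computation; the actual train step in this repo computes different terms and the function signatures here are hypothetical.

```python
from enum import Enum, auto

import torch
from torch import Tensor
from torch.utils.tensorboard import SummaryWriter

class LossType(Enum):
    TOTAL = auto()
    RECON = auto()
    REG = auto()
    TV_LOSS = auto()

def train_step(x: Tensor, x_hat: Tensor, z: Tensor, tv_loss: Tensor) -> dict[LossType, Tensor]:
    # build the dict incrementally; callers can omit components they don't compute
    losses: dict[LossType, Tensor] = {}
    losses[LossType.RECON] = torch.nn.functional.mse_loss(x_hat, x)
    losses[LossType.REG] = z.abs().mean()
    losses[LossType.TV_LOSS] = tv_loss
    losses[LossType.TOTAL] = sum(losses.values())
    return losses

# programmatic logging: every returned loss component gets its own scalar tag
writer = SummaryWriter()

def log_losses(losses: dict[LossType, Tensor], step: int) -> None:
    for loss_type, value in losses.items():
        writer.add_scalar(f"loss/{loss_type.name.lower()}", value.item(), step)
```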
test completed; results confirmed dict/namedtuple have <2% overhead. keeping results in git history for reference.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
document that dict[LossType, Tensor] has <2% overhead vs tuple, based on gpu benchmarking with realistic computation.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>