…ld trade under the value
# Conflicts:
#   gymrl/portfolio_env.py
#   gymrl/train_ppo_allocator.py
💡 Codex Review
Here are some automated review suggestions for this pull request.
    and metadata
    and manager.metadata_matches(metadata, metadata_requirements)
):
Legacy Toto caches never match due to missing device key
When a cache created before this change is loaded, load_metadata(..., device_token) correctly falls back to the legacy metadata.json, but the normalization block only populates device_variant and leaves device unset. Because metadata_requirements always includes {"device": device_token}, metadata_matches will fail for every legacy cache even though the weights are still valid. The code then downloads and recompiles the model instead of re‑using the existing cache, defeating the intended backwards compatibility and forcing users to rebuild caches unnecessarily. Consider defaulting metadata["device"] to the normalized token when it is absent so that existing caches can still be reused.
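The suggested fix can be sketched as follows. This is a minimal illustration, not the code in src/models/toto_wrapper.py: `normalize_legacy_metadata` and this standalone `metadata_matches` are hypothetical stand-ins, and the `model_id` key is invented for the example. It assumes matching means every required key equals the stored value.

```python
def normalize_legacy_metadata(metadata, device_token):
    """Return a copy of the metadata with missing device fields filled in.

    Hypothetical helper: the real normalization block may differ.
    """
    normalized = dict(metadata)
    # Legacy caches predate the "device" key, so default it to the
    # normalized token instead of leaving it unset.
    normalized.setdefault("device", device_token)
    normalized.setdefault("device_variant", device_token)
    return normalized


def metadata_matches(metadata, requirements):
    """Assumed semantics: every required key must equal the stored value."""
    return all(metadata.get(key) == value for key, value in requirements.items())


# A legacy cache entry that only carries device_variant (illustrative data).
legacy = {"model_id": "toto", "device_variant": "cuda"}
requirements = {"device": "cuda"}

matches_before = metadata_matches(legacy, requirements)
matches_after = metadata_matches(
    normalize_legacy_metadata(legacy, "cuda"), requirements
)
```

Using `setdefault` leaves metadata written by newer caches untouched, so only entries that lack the key are affected.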
Summary
- Normalized legacy Toto cache metadata by always resolving the device token and mirroring it to the device_variant field so cached weights remain reusable. src/models/toto_wrapper.py L490-L509
Testing
⚠️ pytest tests/test_toto_wrapper.py (fails: PyTorch not installed in environment)
@codex merge in main, re-test, and look at the CI failure here. CI needs to clone toto and external/kronos, both from lee101/ (I have forks of them). The uv pip install -e . part failed; it should be run in the toto dir.
198c06f to 4b5c43f
…tosave bug
- s89: honest=1.413 (+496.7% ann), new #14 in crypto70 leaderboard
- s90: honest=0.880 (+259.6% ann), positive but below save threshold
- s91: honest=1.628 (+609.3% ann), new #11, between s74 and s77
Infrastructure fixes:
- Replaced bash autosave with Python (bash 'while read' loop + python3 subprocess was eating stdin bytes, causing seeds to be silently skipped)
- Created /tmp/monitor_crypto70_long_s61_120.py for 1800s sweep eval
- Updated /tmp/run_crypto70_long_s61_120.sh to start monitor on chain
- Reordered long queue: s71→s87→s70→s83→s78→s80→s85→s66→s77→s91→s74...
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
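The stdin problem named in the infrastructure fixes can be illustrated with a small sketch. The commit does not show the actual scripts, so this assumes the usual cause: a child process started inside a `while read` loop inherits the loop's stdin and consumes lines meant for later iterations. Driving the loop from Python and passing `stdin=subprocess.DEVNULL` avoids that. The function name `run_for_each_seed` and the child command are hypothetical.

```python
import subprocess
import sys


def run_for_each_seed(seeds, child_code):
    """Run one child process per seed without sharing the parent's stdin."""
    results = {}
    for seed in seeds:
        completed = subprocess.run(
            [sys.executable, "-c", child_code, seed],
            stdin=subprocess.DEVNULL,  # the child cannot eat the parent's input
            capture_output=True,
            text=True,
            check=True,  # raise instead of silently skipping a failed seed
        )
        results[seed] = completed.stdout.strip()
    return results


# Stand-in child that greedily reads stdin, as an evaluation script might.
CHILD = "import sys; sys.stdin.read(); print('done', sys.argv[1])"

results = run_for_each_seed(["s89", "s90", "s91"], CHILD)
```

Because the seed list lives in a Python list rather than on a shared stdin stream, a greedy child can no longer cause seeds to be skipped.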
…+3063%), s513/s804 ROBUST
New 5bps production champions discovered:
- s416 (s401-500): +4091% ann, Sortino=6.18, 5bps>8bps ROBUST, NEW #11!
- s321 (s301-400): +3063% ann, Sortino=5.07
- s513 (s501-600): +2525% ann, Sortino=5.50, 5bps>8bps ROBUST
- s804 (s801-900): +1943% ann, Sortino=5.19, 5bps>8bps ROBUST
- s192 (s121-200): +1886% ann, Sortino=4.67, 5bps>8bps ROBUST
- s415 (s401-500): +1757% ann, Sortino=4.79, 5bps>8bps ROBUST
- s807 (s801-900): +934% ann, Sortino=6.37, 5bps>8bps ROBUST (high Sortino!)
- s191 (s121-200): +1457% ann, Sortino=3.90, ROBUST
s401-500 is the HOTTEST range: s404(#4)+s416(#11)+s410(#21)+s415(#31) = 4 champions
Added crypto70_5bps_leaderboard.csv, an automated 5bps production leaderboard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New 5bps verified champions from continued monitoring:
- s827: 5bps=+4801% ann (pool=5.31≈honest=5.37, VERY ROBUST 5bps>>8bps) #11 all-time
- s534: 5bps=+3598% ann (pool=3.85, honest=5.21→5bps=4.94, 5% drop)
- s737: 5bps=+2228% ann (pool=3.16≈honest=3.16, ROBUST 5bps>>8bps)
- s822: 5bps=+1485% ann (ROBUST, tri-consistent)
- s826: 5bps=+1298% ann (pool=0→5bps=+1298%, extreme generalization)
- s735: 5bps=+1035% ann (VERY ROBUST, 5bps=2.32 vs 8bps=1.84)
- s532: 5bps=+765% ann (ROBUST, pool=0.46→765%)
- s635: 5bps=+870% ann (ROBUST, tight cluster)
- s637: 5bps=+841% ann (moderate overfitter)
- s828: 5bps=+838% ann (ROBUST, pool=0.12→838%)
- s829: 5bps=+700% ann (ROBUST)
- s738: 5bps=+380% ann
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
More 5bps verified champions:
- s446: 5bps=+6536% ann (VERY ROBUST 5bps>>8bps) NEW #10 all-time
- s827: 5bps=+4801% ann (VERY ROBUST, pool≈honest≈5bps) #11
- s534: 5bps=+3598% ann (5bps >> pool, 5% drop from 8bps)
- s347: 5bps=+3383% ann (moderate overfitter 79% ratio)
- s737: 5bps=+2228% ann (ROBUST 5bps>>8bps)
- s830: 5bps=+1795% ann (tri-consistent, ROBUST)
- s822: 5bps=+1485% ann
- s826: 5bps=+1298% ann (pool=0 → +1298%)
- s251: 5bps=+1252% ann (ROBUST, pool=0.65→1252%)
- s253: 5bps=+1248% ann (ROBUST, pool≈honest→5bps even higher)
- s735: 5bps=+1035% ann (VERY ROBUST, 5bps=2.32 vs pool/8bps=1.84)
- s348: 5bps=+949% ann (VERY ROBUST, each eval step higher)
- s829: 5bps=+700% ann, s532: +765%, s635: +870%, s637: +841%
- s828: 5bps=+838% ann (ROBUST, pool=0.12→838%)
- s539: 5bps=+603% ann (ROBUST, pool=-0.12→603%)
- s738: 5bps=+380% ann
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
https://chatgpt.com/codex/tasks/task_e_68f9ab8897d48333b836e6d3221979f3