Skip to content

Add device-aware caching for Toto and Kronos#11

Open
lee101 wants to merge 204 commits intomainfrom
codex/enable-cpu-and-gpu-support-for-models
Open

Add device-aware caching for Toto and Kronos#11
lee101 wants to merge 204 commits intomainfrom
codex/enable-cpu-and-gpu-support-for-models

Conversation

@lee101
Copy link
Owner

@lee101 lee101 commented Oct 23, 2025

Summary

  • add device-aware variant support to the model cache with device token helpers and legacy fallbacks
  • update Toto and Kronos wrappers to normalise device names, persist metadata per device, and gracefully handle missing Toto packages
  • extend the cache test suite to cover CPU/GPU separation and new device token logic

Testing

  • pytest tests/test_model_cache_variants.py tests/test_toto_cache.py tests/test_kronos_wrapper_unit.py

https://chatgpt.com/codex/tasks/task_e_68f9ab8897d48333b836e6d3221979f3

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 463 to 465
and metadata
and manager.metadata_matches(metadata, metadata_requirements)
):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Legacy Toto caches never match due to missing device key

When a cache created before this change is loaded, load_metadata(..., device_token) correctly falls back to the legacy metadata.json, but the normalization block only populates device_variant and leaves device unset. Because metadata_requirements always includes {"device": device_token}, metadata_matches will fail for every legacy cache even though the weights are still valid. The code then downloads and recompiles the model instead of re‑using the existing cache, defeating the intended backwards compatibility and forcing users to rebuild caches unnecessarily. Consider defaulting metadata["device"] to the normalized token when it is absent so that existing caches can still be reused.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codex ok fix

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

  • Normalized legacy Toto cache metadata by always resolving the device token and mirroring it to the device_variant field so cached weights remain reusable. src/models/toto_wrapper.pyL490-L509

Testing

  • ⚠️ pytest tests/test_toto_wrapper.py (fails: PyTorch not installed in environment)

View task →

@lee101
Copy link
Owner Author

lee101 commented Oct 30, 2025

@codex merge in main and re test and look ci failure here need to clone toto and external/kronos both from lee101/ I have forks on them... uv pip install -e . Part failed should be in toto dir

@chatgpt-codex-connector
Copy link

Summary

  • Defaulted legacy Toto cache metadata to the current device token and mirrored it into device_variant, ensuring legacy caches still satisfy device-aware requirements. src/models/toto_wrapper.pyL490-L505

Testing

  • pytest tests/test_model_cache_variants.py tests/test_toto_cache.py tests/test_kronos_wrapper_unit.py

View task →

@lee101 lee101 force-pushed the main branch 2 times, most recently from 198c06f to 4b5c43f Compare March 24, 2026 07:45
lee101 added a commit that referenced this pull request Mar 25, 2026
…tosave bug

- s89: honest=1.413 (+496.7% ann) — new #14 in crypto70 leaderboard
- s90: honest=0.880 (+259.6% ann) — positive but below save threshold
- s91: honest=1.628 (+609.3% ann) — new #11, between s74 and s77

Infrastructure fixes:
- Replaced bash autosave with Python (bash 'while read' loop + python3
  subprocess was eating stdin bytes, causing seeds to be silently skipped)
- Created /tmp/monitor_crypto70_long_s61_120.py for 1800s sweep eval
- Updated /tmp/run_crypto70_long_s61_120.sh to start monitor on chain
- Reordered long queue: s71→s87→s70→s83→s78→s80→s85→s66→s77→s91→s74...

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lee101 added a commit that referenced this pull request Mar 25, 2026
…+3063%), s513/s804 ROBUST

New 5bps production champions discovered:
- s416 (s401-500): +4091% ann, Sortino=6.18, 5bps>8bps ROBUST — NEW #11!
- s321 (s301-400): +3063% ann, Sortino=5.07
- s513 (s501-600): +2525% ann, Sortino=5.50, 5bps>8bps ROBUST
- s804 (s801-900): +1943% ann, Sortino=5.19, 5bps>8bps ROBUST
- s192 (s121-200): +1886% ann, Sortino=4.67, 5bps>8bps ROBUST
- s415 (s401-500): +1757% ann, Sortino=4.79, 5bps>8bps ROBUST
- s807 (s801-900): +934% ann, Sortino=6.37, 5bps>8bps ROBUST (high Sortino!)
- s191 (s121-200): +1457% ann, Sortino=3.90, ROBUST

s401-500 is the HOTTEST range: s404(#4)+s416(#11)+s410(#21)+s415(#31) = 4 champions
Added crypto70_5bps_leaderboard.csv — automated 5bps production leaderboard

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lee101 added a commit that referenced this pull request Mar 25, 2026
New 5bps verified champions from continued monitoring:
- s827: 5bps=+4801% ann (pool=5.31≈honest=5.37, VERY ROBUST 5bps>>8bps) #11 all-time
- s534: 5bps=+3598% ann (pool=3.85, honest=5.21→5bps=4.94, 5% drop)
- s737: 5bps=+2228% ann (pool=3.16≈honest=3.16, ROBUST 5bps>>8bps)
- s822: 5bps=+1485% ann (ROBUST, tri-consistent)
- s826: 5bps=+1298% ann (pool=0→5bps=+1298%, extreme generalization)
- s735: 5bps=+1035% ann (VERY ROBUST, 5bps=2.32 vs 8bps=1.84)
- s532: 5bps=+765% ann (ROBUST, pool=0.46→765%)
- s635: 5bps=+870% ann (ROBUST, tight cluster)
- s637: 5bps=+841% ann (moderate overfitter)
- s828: 5bps=+838% ann (ROBUST, pool=0.12→838%)
- s829: 5bps=+700% ann (ROBUST)
- s738: 5bps=+380% ann

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lee101 added a commit that referenced this pull request Mar 25, 2026
More 5bps verified champions:
- s446: 5bps=+6536% ann (VERY ROBUST 5bps>>8bps) NEW #10 all-time
- s827: 5bps=+4801% ann (VERY ROBUST, pool≈honest≈5bps) #11
- s534: 5bps=+3598% ann (5bps >> pool, 5% drop from 8bps)
- s347: 5bps=+3383% ann (moderate overfitter 79% ratio)
- s737: 5bps=+2228% ann (ROBUST 5bps>>8bps)
- s830: 5bps=+1795% ann (tri-consistent, ROBUST)
- s822: 5bps=+1485% ann
- s826: 5bps=+1298% ann (pool=0 → +1298%)
- s251: 5bps=+1252% ann (ROBUST, pool=0.65→1252%)
- s253: 5bps=+1248% ann (ROBUST, pool≈honest→5bps even higher)
- s735: 5bps=+1035% ann (VERY ROBUST, 5bps=2.32 vs pool/8bps=1.84)
- s348: 5bps=+949% ann (VERY ROBUST, each eval step higher)
- s829: 5bps=+700% ann, s532: +765%, s635: +870%, s637: +841%
- s828: 5bps=+838% ann (ROBUST, pool=0.12→838%)
- s539: 5bps=+603% ann (ROBUST, pool=-0.12→603%)
- s738: 5bps=+380% ann

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant