Skip to content

feat(stream): internal bet sync feed and renderer health#1065

Open
rndrntwrk wants to merge 23 commits intomainfrom
enoomian/stream-bet-sync
Open

feat(stream): internal bet sync feed and renderer health#1065
rndrntwrk wants to merge 23 commits intomainfrom
enoomian/stream-bet-sync

Conversation

@rndrntwrk
Copy link

Summary

This is the Hyperscapes source-side half of the stream-to-bet sync upgrade.

It makes the streaming lifecycle authoritative for betting consumers, hardens the source-side duel/market bridge, and publishes renderer-health signals so downstream consumers can distinguish healthy live arena frames from degraded initialization shells.

Associated Hyperbet consumer PR: HyperscapeAI/hyperbet#28

What This Adds

  • authenticated internal bet-sync surfaces at /api/internal/bet-sync/state and /api/internal/bet-sync/events
  • sequence-aware source payloads for betting consumers, including renderer health and degraded reason metadata
  • stronger DuelBettingBridge lifecycle handling so the active market stays aligned with the live streaming duel and is not recreated during resolution
  • stricter streaming scheduler timing for PM-integrated flows
  • renderer-readiness hardening in streaming mode so the loading/finalizing overlay only clears when the arena is actually ready
  • visible fallback avatars and related client resilience improvements when upstream character assets fail to load
  • soak/capture-path hardening for the source render lane

Why This PR Exists

Hyperbet already has a paired consumer hardening branch, but the consumer cannot be frontier-grade if the source stream is not authoritative and explicit about health.

This PR keeps Hyperscapes as the canonical duel lifecycle source and gives Hyperbet a stable source contract to consume without relying on delayed public spectator snapshots as the primary sync input.

Validation

  • unit coverage for the internal betting feed route
  • unit coverage for DuelBettingBridge lifecycle behavior
  • streaming client tests covering renderer readiness / loading screen behavior
  • integration coverage for the streaming duel scheduler path

Draft / Merge Notes

This PR should remain draft until the paired Hyperbet PR and the end-to-end soak both pass together.

Do not merge this PR independently of HyperscapeAI/hyperbet#28.

@claude
Copy link

claude bot commented Mar 20, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Good work on making the streaming lifecycle authoritative for betting consumers. The architecture is sound — separating the source-side feed, adding sequence-aware payloads, and hardening the DuelBettingBridge lifecycle. Here's detailed feedback:


Security Concerns

1. Timing-unsafe token comparison (streaming.ts)
assertBettingAuth uses token === requiredToken which is vulnerable to timing side-channel attacks. For auth tokens protecting a betting feed, use a constant-time comparison:

import { timingSafeEqual } from "node:crypto";

const isValid = token.length === requiredToken.length &&
  timingSafeEqual(Buffer.from(token), Buffer.from(requiredToken));

2. Rate limiting disabled on betting routes
Both /api/internal/bet-sync/state and /api/internal/bet-sync/events have rateLimit: false. Even for internal endpoints, this opens the door to resource exhaustion. Consider at least a generous rate limit rather than none.

3. Access-Control-Allow-Origin: * on internal betting SSE
The /events endpoint sets Access-Control-Allow-Origin: *. If these are truly internal endpoints, the CORS header should be restricted or omitted. If a consumer needs CORS, use a specific allowed origin.

4. getDatabaseSystem uses any (streaming.ts:L649)

const getDatabaseSystem = (): { getDb?: () => any } | null => ...

This violates the project's no-any rule. Type the return of getDb properly, even if it's a generic database interface.


Potential Bugs

5. Sequence mismatch in captureBettingFrame
The payload is initially built with seq: bettingSequence + 1, serialized for dedup comparison, then bettingSequence is incremented, and the frame is constructed with seq: bettingSequence. The serialized string used for dedup thus contains a seq that matches the final frame — but only by coincidence (pre-increment + 1 == post-increment). This is fragile and confusing. Consider incrementing bettingSequence first, then building the payload once with the correct value.

6. getDockerComposeCommand doesn't use resolved binary
In scripts/cdn.mjs, the getDockerComposeCommand function returns the hardcoded string 'docker compose' even when DOCKER_BIN resolves to a non-default path. The returned command won't use the resolved binary:

// Line 89: always returns 'docker compose', not `${DOCKER_BIN} compose`
return 'docker compose'

7. selectReplayDelivery returns "bootstrap" when client is ahead of server
When sinceSeq > latestFrame.seq (client has seen frames the server's buffer no longer has — e.g. after a server restart), the function falls through findIndex returning -1 and returns mode: "bootstrap". This is semantically misleading — it's actually a gap/reset scenario. Consider returning "reset" mode so the consumer knows its state is stale.


Code Quality

8. Duplicated resolveDockerBinary function
resolveDockerBinary is copy-pasted into both docker-manager.ts and scripts/cdn.mjs with minor differences (existsSync vs accessSync). Extract to a shared utility to avoid drift.

9. handleStreamingResolution double-fetches market
The method fetches this.activeMarkets.get(data.duelId), finds it missing, calls reconcileLiveCycle() which may create the market, then fetches it again with this.activeMarkets.get(data.duelId). The intermediate reconcileLiveCycle call could create AND resolve the market in one pass (since the cycle is already in RESOLUTION phase), making the second fetch return null. The control flow here is hard to reason about — consider making the intent explicit.

10. reconcileLiveCycle runs unconditionally
The reconciliation loop starts at init and ticks every 1s regardless of whether streaming mode is active. Since getStreamingDuelScheduler() can return null when streaming is off, this is wasted work. Consider starting the loop only when streaming is active or guarding with an early return + longer sleep when the scheduler is null.


Performance

11. Array.shift() on replay buffer is O(n)
Both bettingReplayFrames and the existing replayFrames use .shift() to evict old entries. With a buffer of up to 2048 frames, this is O(n) per eviction. A ring buffer or deque would be more efficient, though this may be acceptable given the push interval.

12. findIndex linear scan in selectReplayDelivery
Since frames are ordered by seq, a binary search would be more efficient for large replay buffers. Minor concern at current buffer sizes.


Test Coverage

Tests are solid for the new code — betting feed payload building, replay delivery modes, DuelBettingBridge lifecycle, and phaseVersion advancement are all covered. A few gaps:

  • assertBettingAuth — no test coverage for the auth middleware (valid token, invalid token, missing token, missing env var scenarios)
  • ensureBettingSourceEpoch — the DB persistence/recovery path is untested
  • removeBettingClient / SSE lifecycle — no test for client disconnect cleanup or betting loop start/stop behavior
  • shouldDismissStreamingLoading — only 2 of the possible combinations are tested; consider adding cases for connected: false and terrainReady: false

Minor Nits

  • LoadingScreen effect deps fix ([world, completionStage]) is correct and good to see
  • The packageManager field addition (bun@1.3.1) looks intentional but check that CI and all contributors are on 1.3.1 — this field triggers corepack enforcement
  • betCloseTime clamping with Math.min in DuelArenaOraclePublisher is a nice defensive touch

Summary

The core architecture is well-designed — sequence-aware SSE feeds, source epoch persistence, and the reconciliation loop are all solid patterns. The main items to address before merge are the timing-unsafe auth comparison, the any type violation, the docker compose binary resolution bug, and the sequence construction fragility in captureBettingFrame. The rest are improvements worth considering.

@claude
Copy link

claude bot commented Mar 20, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Good work on this PR — the approach of making Hyperscape the authoritative source for duel lifecycle and publishing renderer-health signals is solid. Here's detailed feedback:


Code Quality & Best Practices

Strengths:

  • Clean extraction of pure functions (deriveStreamingRendererHealth, shouldDismissStreamingLoading, buildBettingFeedPayload, selectReplayDelivery) that are easy to test and reason about.
  • Good use of useMemo for rendererHealth derivation to avoid unnecessary recomputation.
  • The DuelBettingBridge refactor into createOrSyncMarket / resolveMarket consolidates previously duplicated logic nicely.
  • The reconciliation loop with reconcileInFlight guard prevents overlapping async work.

Issues:

  1. any type in getDatabaseSystem (streaming.ts ~line 701):

    const getDatabaseSystem = (): {
      getDb?: () => any;
    } | null => world.getSystem("database") as { getDb?: () => any } | null;

    Per CLAUDE.md: "NO any types are allowed — ESLint will reject them." This needs a proper type for the DB return value or at minimum a more specific type assertion.

  2. Duplicated validation logic across client and server: isActiveStreamingPhase, requiresArenaPositions, and arena position validation appear in both StreamingMode.tsx and streaming.ts. Consider extracting these into the shared package to maintain a single source of truth and prevent drift.

  3. Repeated window as unknown as { ... } type casts in GameClient.tsx and StreamingMode.tsx: The window augmentation type for __HYPERSCAPE_STREAM_READY__ and __HYPERSCAPE_STREAM_RENDERER_HEALTH__ is duplicated in at least 5 places. A single declare global { interface Window { ... } } in a shared .d.ts file would be cleaner and safer.

  4. data.winnerId! / data.loserId! non-null assertions in handleDuelResult (DuelBettingBridge.ts ~line 2758-2759): These assertions are safe here because of the early return guard, but they're unnecessary — the guard already narrowed the types. Use the narrowed values directly.


Potential Bugs

  1. clearFallbackAvatar disposes shared geometries (PlayerRemote.ts): The clearFallbackAvatar method traverses children and calls child.geometry.dispose(), but ensureFallbackAvatar clones fallbackHeadGeometry and fallbackBeaconGeometry via .clone(). However, capsuleGeometry (used for the body) is also cloned — good. Just verify capsuleGeometry is an existing module-level geometry that should be cloned rather than shared, since disposing a shared geometry would corrupt other meshes using it.

  2. onInitError?.(null) is called before world.init() (GameClient.tsx line 53/415): This clears any previous error state right before attempting init. If the component re-renders and re-runs this effect, the error could flash away and back. Consider only clearing the error on successful init (after world.init() resolves) rather than optimistically before.

  3. selectReplayDelivery returns "bootstrap" when sinceSeq > latestSeq (streaming-betting-feed.ts ~line 1383): If a consumer sends a since seq that's ahead of the buffer (e.g., from a future epoch), it gets a bootstrap response with no frames. This silently resets the consumer. Consider whether a "reset" mode would be more appropriate here to signal the epoch mismatch.

  4. Race in ensureBettingSourceEpoch: The init promise is cached, but if the DB read fails and falls through to the catch block setting bettingSourceEpoch = Date.now(), the bettingSourceEpochReady flag is set inside the try block's success path but also needs to be set in the catch path. Looking more carefully — it's set after the catch, inside the async IIFE. This looks correct but the flow is subtle. A comment would help.


Performance Considerations

  1. probeRendererHealth calls pageRef.evaluate() on every poll (stream-to-rtmp.ts): This executes document.body.innerText.slice(0, 1024) which forces a full layout reflow in the browser. At the 2-second status interval this is probably fine, but the 30-second status log also calls refreshRendererHealthSnapshot, so there's a potential double-probe within the same cycle. The .catch(() => undefined) on line 1059 prevents errors but doesn't deduplicate.

  2. captureBettingFrame serializes the full payload on every push interval tick just to compare against lastSerializedBettingState. For high-frequency polling this is fine, but if the payload grows, consider a cheaper change-detection mechanism (e.g., comparing a hash of just the changing fields like phase, phaseVersion, hp values).

  3. Betting replay buffer uses Array.shift() in a while loop (streaming.ts ~line 1771): Array.shift() is O(n) for each call. With the current STREAMING_SSE_REPLAY_BUFFER size this is negligible, but if the buffer grows, consider a ring buffer or deque.


Security Concerns

  1. Token passed via query string (?token= / ?streamToken=) in assertBettingAuth: Query string tokens are logged in server access logs, browser history, and intermediary proxies. The Bearer header path is preferred. Consider deprecating or removing the query string auth path, or at minimum documenting the risk. If SSE clients can't set headers (e.g., native EventSource), this is a known trade-off — but worth a comment.

  2. Rate limiting explicitly disabled on betting routes:

    config: { rateLimit: false }

    The betting SSE and bootstrap endpoints disable rate limiting entirely. While these are internal endpoints behind auth, a compromised or buggy consumer could still overwhelm the server. Consider a generous but non-zero rate limit.

  3. Access-Control-Allow-Origin: * on the betting SSE endpoint (streaming.ts ~line 1975): For an internal/authenticated endpoint, this is overly permissive. Consider restricting to the known Hyperbet origin(s).


Test Coverage

Good coverage provided:

  • Unit tests for deriveStreamingRendererHealth and shouldDismissStreamingLoading with multiple scenarios.
  • Unit tests for buildBettingFeedPayload and selectReplayDelivery covering edge cases.
  • DuelBettingBridge tests covering the full lifecycle (create → lock → resolve) and reconciliation.
  • Integration test for phaseVersion incrementing.
  • LoadingScreen test for custom completion stage.

Gaps:

  • No test coverage for assertBettingAuth — should verify both valid/invalid tokens and the query string fallback path.
  • No test for deriveBettingRendererHealth on the server side (the function that merges external RTMP status with cycle guardrails).
  • No test for the probeRendererHealth fallback path (when explicitHealth is null).
  • No test for the PlayerRemote fallback avatar lifecycle (ensure → clear → retry).
  • The docker-manager.ts changes (resolveDockerBinary) have no test coverage — consider at least a unit test for the resolution logic.

Minor Nits

  • packageManager: "bun@1.3.1" in root package.json: The CLAUDE.md says v1.1.38+. Is 1.3.1 intentional? If so, consider updating the docs.
  • The hideUI conditional spread for bg/hdr in GameClient.tsx (line 40-45) changes the base environment config semantically — streaming mode now has no background/HDR. Make sure this is intentional and doesn't cause visual artifacts in the 3D scene.
  • normalizedCriticalErrorReason in stream-to-rtmp.ts has a fallback to "canvas_missing" when hasCriticalErrorUi is false, but it's only ever called when criticalUiVisible is true or for fallback reasons — the naming is slightly misleading.

Summary

This is a well-structured PR that meaningfully hardens the streaming-to-betting pipeline. The main items to address before merge are the any type violation, the duplicated validation logic across packages, and the security considerations around CORS and query-string auth on the internal endpoints. The test coverage is solid for the core logic but has some gaps in the infrastructure and resilience paths.

@claude
Copy link

claude bot commented Mar 20, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Good work on this PR — the overall design is solid. Making Hyperscape the authoritative source for duel lifecycle and renderer health is the right architectural call. Below are findings organized by severity.


Security

1. Token accepted via query string (streaming.ts ~line 688)
The assertBettingAuth function accepts tokens via ?token= / ?streamToken= query parameters. Query params are logged in access logs, browser history, and proxy caches. For an internal betting feed carrying financial data, this is a meaningful leak vector. Consider requiring Authorization: Bearer header only, or at minimum documenting that the query param path is for SSE connections only (where custom headers aren't supported by EventSource).

2. Auth bypass when STREAMING_VIEWER_ACCESS_TOKEN is unset (streaming.ts ~line 674)
If the env var is empty/unset, assertBettingAuth returns true unconditionally — the betting feed is fully open. This is fine for local dev but dangerous if accidentally deployed without the token. Consider logging a warning at startup when this token is missing in production (NODE_ENV=production).

3. streaming-betting-auth.ts — early-return on length mismatch leaks token length
The hasValidBettingFeedToken function returns false early when lengths differ (line 15-17), before calling timingSafeEqual. This is a deliberate tradeoff (since timingSafeEqual requires equal-length buffers), but it does leak the token length via timing. The standard mitigation is to HMAC both values with a fixed key and compare the HMACs, which always produces equal-length outputs. Low severity for an internal feed, but worth noting.


Potential Bugs

4. clearFallbackAvatar disposes shared geometry (PlayerRemote.ts)
clearFallbackAvatar traverses and calls child.geometry.dispose() on every mesh child. But fallbackHeadGeometry and fallbackBeaconGeometry are module-level singletons created with new THREE.SphereGeometry(...) / new THREE.CylinderGeometry(...). In ensureFallbackAvatar, these are .clone()'d — so this is safe if .clone() is always used. However, capsuleGeometry (used for the body) is also .clone()'d. Double-check that no code path skips the clone. If the shared geometry is ever accidentally passed directly, disposing it will corrupt all other fallback avatars. Consider adding a defensive check or using a factory function instead of cloning module-level singletons.

5. selectReplayDelivery returns "bootstrap" when sinceSeq === latestFrame.seq (lines 228-235)
When the client is exactly caught up (sinceSeq === latestFrame.seq), the binary search finds low >= frames.length and returns mode "bootstrap" with the latest frame. This is functionally correct but semantically odd — a "caught up" response would be clearer. Consumers should handle this, but it's a potential source of confusion.

6. Race in reconcileLiveCyclecreateOrSyncMarket (DuelBettingBridge.ts)
reconcileLiveCycle is guarded by reconcileInFlight, but handleStreamingAnnouncement and handleStreamingFightStart also call createOrSyncMarket / reconcileLiveCycle without acquiring the same guard. If an event-driven handler and the reconciliation timer fire near-simultaneously, you could get duplicate initRound calls to the Solana operator. The !existing check in createOrSyncMarket mitigates this for the common case, but there's a TOCTOU window between reading this.activeMarkets.get() and this.activeMarkets.set() since initRound is async.

7. Non-null assertions in handleDuelResult (DuelBettingBridge.ts ~line 452-453)

winnerId: data.winnerId!,
loserId: data.loserId!,

These are safe because of the early return on !data.winnerId || !data.loserId, but TypeScript non-null assertions after an as cast are fragile. Since the guard check is already there, consider using the guarded variables directly or restructuring to avoid the !.


Performance

8. captureBettingFrame serializes the full payload on every poll tick (streaming.ts)
Every STREAMING_SSE_PUSH_INTERVAL_MS tick, captureBettingFrame calls JSON.stringify(payload) and compares it to lastSerializedBettingState for dedup. For payloads with agent stats that change every tick (e.g., damageDealtThisFight, currentHp), this comparison will almost always fail during fights, so the dedup won't help. Consider a cheaper dirty-flag approach using phaseVersion + a hash of mutable fields instead of full JSON comparison.

9. Fallback avatar creates new materials per instance (PlayerRemote.ts)
ensureFallbackAvatar creates 3 new MeshBasicNodeMaterial instances per player. If many players fail avatar loading simultaneously (e.g., CDN outage), this could create a lot of GPU materials. Consider sharing materials by color from a small cache.


Code Quality

10. Duplicated validation logic between client and server
deriveStreamingRendererHealth (client, StreamingMode.tsx) and deriveCycleGuardrailReason (server, streaming.ts) implement similar validation (agent presence, HP validity, arena position checks) independently. If these diverge, the client and server will disagree on renderer health. Consider extracting shared validation into @hyperscape/shared or at least adding a comment cross-referencing the two.

11. hideUI used as proxy for "streaming mode" in GameClient.tsx (line 369-375)
The environment background/HDR is skipped when hideUI is true, but hideUI is a general-purpose flag. A more explicit flag like streamingMode or skipEnvironment would better express intent and avoid surprising behavior if hideUI is used for other purposes (e.g., a cinematic cutscene mode).

12. packageManager field bump (package.json)
Adding "packageManager": "bun@1.3.1" is unrelated to the streaming/betting changes. Consider splitting this into its own commit for cleaner history, or at least noting it in the PR description.


Test Coverage

Tests are solid — good unit coverage for auth, feed payload construction, replay delivery, bridge lifecycle, and renderer health derivation. A few gaps:

  • No test for the assertBettingAuth function — specifically the query-param fallback path and the "no token configured" bypass.
  • No test for deriveBettingRendererHealth on the server side (the server's renderer health derivation that reads RTMP status and capture stats).
  • DuelBettingBridge tests use (bridge as any) to call private methods directly. This is pragmatic but brittle — if method names change, tests break silently. Consider exposing a test-only interface or testing through public event emission.

Summary

This is a well-structured PR with good separation between the betting feed protocol, auth, and bridge lifecycle. The main concerns are: (1) the timing side-channel in auth, (2) the potential for duplicate on-chain calls from concurrent reconciliation, and (3) the duplicated validation logic between client and server. None are blockers for a draft PR, but items 2 and 6 should be addressed before merging.

@claude
Copy link

claude bot commented Mar 20, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Good work on establishing the source-side contract for the betting sync pipeline. The architecture is sound — making Hyperscape authoritative for duel lifecycle and publishing renderer health signals is the right approach. Here's detailed feedback:


Security

1. Stream token exposed in URL (StreamingMode.tsx)
The streamToken is passed as a URL query parameter:

const streamToken = new URLSearchParams(window.location.search).get("streamToken");

Tokens in URLs get logged in browser history, server access logs, CDN logs, and potentially referrer headers. Consider moving to a POST body, Authorization header, or a session-based approach instead. The same pattern appears in stream-to-rtmp.ts where the token is appended to the URL.

2. Token comparison in streaming-betting-auth.ts — well done
Using timingSafeEqual with length pre-check is the correct approach. No issues here.

3. Authorization header case sensitivity (streaming.ts)
The assertBettingAuth helper checks request.headers.authorization and expects "Bearer " prefix. While Fastify normalizes header names to lowercase, the Bearer scheme comparison should technically be case-insensitive per RFC 7235. Minor, but worth noting.

4. Input validation gaps (streaming.ts)

  • STREAMING_SSE_REPLAY_BUFFER and similar env vars have no upper bound — a misconfiguration could exhaust memory.
  • The /api/internal/bet-sync/events limit query param is parsed but only clamped on one path; verify all consumer paths clamp consistently.

Potential Bugs

5. Race condition in DuelBettingBridge.reconcileLiveCycle()
The reconcileInFlight boolean guard prevents concurrent reconciliation, but if reconcileLiveCycle() throws, the flag is never reset:

if (this.reconcileInFlight) return;
this.reconcileInFlight = true;
// ... if an exception occurs here, reconcileInFlight stays true forever

Wrap the body in try/finally to ensure the flag resets on error.

6. clearFallbackAvatar() disposes shared geometry (PlayerRemote.ts)
fallbackHeadGeometry and fallbackBeaconGeometry are module-level shared instances, but clearFallbackAvatar() calls child.geometry.dispose() on traversal. If the fallback meshes use these shared geometries directly (without .clone()), disposing them would corrupt all future fallback avatars. I see .clone() is used in ensureFallbackAvatar() — good — but double-check that capsuleGeometry (also cloned) follows the same pattern and isn't inadvertently shared elsewhere.

7. Timer closure captures in resolveMarket (DuelBettingBridge.ts)
The setTimeout inside resolveMarket captures local variables. If resolveMarket is called twice for the same duel rapidly (e.g., duplicate resolution events), the delayed callback could fire against stale state. Consider deduplicating or guarding the timeout.

8. onInitError?.(null) called before world.init() (GameClient.tsx)

onInitError?.(null);  // clears error
await world.init(config);  // might throw

If the previous error was real and init throws again, the brief null signal could mislead consumers into thinking the renderer recovered. Move onInitError?.(null) to after successful init.


Code Quality

9. Unsafe type escapes
Several places use as unknown as to bypass type checking:

  • streaming.ts line ~419: as unknown as Entity | null
  • DuelBettingBridge.ts line ~142: (this.world as unknown as { solanaArenaOperator?: ... })
  • StreamingMode.tsx line ~787: improper union in as cast

Per CLAUDE.md, any is forbidden but these double-cast escape hatches effectively achieve the same thing. Consider defining proper interfaces for these access patterns (e.g., a WorldWithArenaOperator type).

10. Unbounded collections

  • streaming.ts: sseClients and bettingClients Maps grow until clients disconnect. A malicious or buggy consumer that connects but never reads could hold memory indefinitely. Consider adding a max client count or idle timeout.
  • DuelBettingBridge.ts: marketHistory is capped at 100 entries (good), but activeMarkets has no cap. Unlikely to be an issue in practice but worth a defensive check.

11. Hardcoded timeouts in stream-to-rtmp.ts
Multiple magic numbers (5000, 20000, 90000ms, MAX_RECONNECT = 10, forced browser restart at 1 hour) are scattered throughout. These would benefit from being pulled into a configuration object at the top of the file, similar to how STREAMING_TIMING is structured in the scheduler.


Performance

12. Replay buffer cleanup uses shift() in a loop (streaming.ts)

while (this.replayFrames.length > maxSize) {
  this.replayFrames.shift();
}

Array.shift() is O(n). If the buffer is large and many frames exceed the limit, this becomes O(n²). Consider using splice(0, excess) for a single O(n) operation, or switch to a ring buffer.

13. getInventorySnapshot() called per request with no caching (streaming.ts)
For high-frequency polling consumers, this could add up. Consider caching the snapshot with a short TTL or computing it only on inventory change events.

14. Fallback avatar geometry clones (PlayerRemote.ts)
Each fallback avatar clones geometries. If many players simultaneously fail avatar loads (e.g., CDN outage), this creates many geometry copies. The shared instances could be used directly since MeshBasicNodeMaterial is per-instance — just don't dispose the geometry on cleanup (only dispose the materials).


Test Coverage

The test suite covers the core paths well:

  • streaming-betting-auth.test.ts — token matching/rejection ✓
  • streaming-betting-feed.test.ts — payload schema, replay delivery modes ✓
  • DuelBettingBridge.test.ts — market lifecycle, reconciliation, idempotent resolution ✓
  • StreamingMode.test.ts — renderer readiness, loading screen behavior ✓
  • StreamingDuelScheduler.test.ts — phase version tracking ✓

Gaps to consider:

  • No test for the reconcileInFlight race condition (item 5 above)
  • No test for SSE client cleanup/eviction under pressure
  • No test for assertBettingAuth with malformed headers (empty string, no space after Bearer, etc.)
  • LoadingScreen.test.tsx is added but the test scope relative to the streaming overlay changes is unclear

Summary

This is a solid foundation for the bet-sync contract. The architecture correctly keeps Hyperscape as the authoritative source. Main items to address before merge:

  1. Fix the reconcileInFlight flag leak (bug — try/finally)
  2. Fix the onInitError?.(null) ordering (bug — move after successful init)
  3. Consider moving streamToken out of URL query params (security)
  4. Consider adding bounds to SSE client connections and env var buffer sizes (hardening)

The rest are quality-of-life improvements that could be addressed in follow-up work. Nice test coverage for a feature of this scope.

@claude
Copy link

claude bot commented Mar 20, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Good work on this PR — the overall direction of making Hyperscape the authoritative source for duel lifecycle events and publishing renderer health is sound. Here's detailed feedback organized by area.


Code Quality & Architecture

Strengths:

  • Clean separation of the betting feed into its own modules (streaming-betting-auth.ts, streaming-betting-feed.ts, streamingGuardrails.ts)
  • The streamingGuardrails.ts shared utility is well-designed — reusable across client and server without duplication
  • Good use of timing-safe comparison for token validation (timingSafeEqual on SHA-256 digests)
  • The DuelBettingBridge refactor into createOrSyncMarket / resolveMarket is a clear improvement over the previous inline logic
  • Comprehensive test coverage for the new modules

Issues & Suggestions:

  1. Deep relative import crossing package boundariesStreamingMode.tsx imports directly from ../../../shared/src/utils/rendering/streamingGuardrails. This bypasses the package's public API and will break if shared's internal structure changes. Use the @hyperscape/shared package export instead (which you've already set up in shared/src/index.ts):

    // ❌ packages/client/src/screens/StreamingMode.tsx
    import { deriveStreamingGuardrailReason, ... } from "../../../shared/src/utils/rendering/streamingGuardrails";
    
    // ✅ 
    import { deriveStreamingGuardrailReason, ... } from "@hyperscape/shared";

    Same issue in packages/server/src/routes/streaming.ts which imports from "../../../shared/src/utils/rendering/streamingGuardrails".

  2. Duplicated resolveDockerBinary() function — The exact same function exists in both packages/server/src/infrastructure/docker/docker-manager.ts and scripts/cdn.mjs. Extract it to a shared utility or at minimum keep them in sync. The implementations also diverge slightly (existsSync vs fs.accessSync with X_OK) — the cdn.mjs version checking execute permission is actually more correct.

  3. window global type casts are repeated — The WindowWithEnv & { __HYPERSCAPE_STREAM_READY__?: boolean; __HYPERSCAPE_STREAM_RENDERER_HEALTH__?: ... } pattern is cast in 4+ places across GameClient.tsx and StreamingMode.tsx. Consider defining a single StreamingWindow type and reusing it.


Potential Bugs

  1. reconcileLiveCycle does not catch errors from createOrSyncMarket — The test "resets reconcileInFlight even when a reconcile pass throws" explicitly expects reconcileLiveCycle() to reject, which means runScheduledReconciliation will swallow the error silently in the finally block and schedule the next reconciliation. This is fine for resilience, but createOrSyncMarket calls this.solanaOperator.initRound() which could repeatedly fail. Consider adding error logging in runScheduledReconciliation's catch path.

  2. Race between captureBettingFrame and ensureBettingSourceEpochcaptureBettingFrame reads bettingSourceEpoch synchronously, but ensureBettingSourceEpoch sets it asynchronously. If captureBettingFrame is called before ensureBettingSourceEpoch resolves (e.g., from startBettingLoopsIfNeededsetInterval), it will use the pre-init Date.now() value instead of the persisted epoch. The handleBettingEvents handler does await ensureBettingSourceEpoch() before adding the client, but the push interval may fire before the bootstrap frame is captured. This is a narrow window but could produce frames with an inconsistent sourceEpoch.

  3. shouldDismissStreamingLoading relaxes worldReady — The new logic allows dismissal when !worldReady && hasStreamingState. The comment says "allows dismissal once the server state is ready even before worldReady flips" — is this intentional? This means the loading screen can dismiss before the 3D world is actually initialized, which could show an empty/incomplete scene for a frame or two.


Performance Considerations

  1. Renderer health probing on 2-second intervalrefreshRendererHealthSnapshot calls page.evaluate() which involves a CDP round-trip. At 2s intervals this is fine, but the 30-second status log also calls it (await refreshRendererHealthSnapshot(page).catch(() => undefined)), which could overlap with the interval probe. Consider debouncing or caching the last result for a short TTL.

  2. trimReplayFrames uses splice(0, removeCount) — For large replay buffers (up to 8192 frames), splice(0, N) shifts the entire array. This is called on every frame capture. For high-frequency feeds this could cause GC pressure. A ring buffer or linked list would be more efficient, though this may be premature optimization at current scale.

  3. Module-level geometry allocations in PlayerRemote.tsfallbackHeadGeometry and fallbackBeaconGeometry are allocated at module import time. This is fine if the module is always loaded, but if tree-shaking is involved, these allocations persist even if PlayerRemote is never instantiated. Minor concern, but worth noting.


Security

  1. Token in URL fragment vs query string — Moving streamToken from query params to the URL fragment (#streamToken=...) and then scrubbing it is a good security improvement — fragments aren't sent to the server in HTTP requests and won't appear in server logs. Well done.

  2. Auth bypass when STREAMING_VIEWER_ACCESS_TOKEN is unsetassertBettingAuth returns true when no required token is configured. The production warning log is good, but the internal bet-sync endpoints (/api/internal/bet-sync/*) are fully open in non-production environments. Consider whether these should require auth even in development, or at minimum document this behavior.

  3. Rate limits on betting endpoints are generous — 240 req/min on bootstrap and 60 req/min on SSE connect. The SSE endpoint already has BETTING_SSE_MAX_CLIENTS (32 default), but a single client could open/close connections rapidly to exhaust the rate limit without ever hitting the client cap. This is low risk for an internal API but worth considering.


Test Coverage

Good coverage for:

  • streaming-betting-auth.ts — token extraction, timing-safe comparison, query vs header modes
  • streaming-betting-feed.ts — payload building, replay delivery modes
  • DuelBettingBridge — streaming lifecycle (announcement → fight → resolution), reconciliation, error recovery
  • StreamingMode — renderer health derivation, loading dismissal logic
  • LoadingScreen — custom completion stage
  • StreamingDuelSchedulerphaseVersion increments

Gaps:

  • No tests for probeRendererHealth or refreshRendererHealthSnapshot in stream-to-rtmp.ts
  • No tests for deriveBettingRendererHealth in streaming.ts (the server-side health derivation that composes guardrails + external RTMP status + capture stats)
  • No tests for the assertBettingAuth middleware behavior (auth bypass when token is unset, CORS header injection)
  • The PlayerRemote fallback avatar code (geometry creation, cleanup, retry delay) has no unit tests — though this may be hard to test without a full Three.js environment

Minor Nits

  • streaming.ts is already quite large and this PR adds ~650 lines to it. Consider whether the betting feed handler logic could live in its own route file that's registered by the streaming routes.
  • The DuelBettingBridge constructor takes World but the mock world type in tests is Parameters<typeof DuelBettingBridge>[0] — this works but DuelBettingBridge is a class, not a function. It should be ConstructorParameters<typeof DuelBettingBridge>[0].
  • hasValidStreamingGuardrailAgentSnapshot checks agent.hp > agent.maxHp as invalid — during RESOLUTION phase, could HP legitimately be 0 and maxHp > 0? It allows hp === 0 which is correct, just confirming that's intentional.

Summary

This is a well-structured PR that solves a real problem (source-authoritative duel lifecycle for betting consumers). The main actionable items are:

  1. Fix the cross-package deep imports (items 1) — these will likely cause issues
  2. Add error logging in reconciliation (item 4) — silent failures in on-chain operations are dangerous
  3. Verify the worldReady relaxation is intentional (item 6)
  4. Consider extracting betting routes from the growing streaming.ts file

The test coverage is solid for the core logic. Good to merge once the above items are addressed and the paired Hyperbet PR is validated end-to-end.

@claude
Copy link

claude bot commented Mar 20, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

@claude
Copy link

claude bot commented Mar 20, 2026

Security Findings

  1. Auth bypass when STREAMING_VIEWER_ACCESS_TOKEN is unset (Medium) - In streaming.ts, assertBettingAuth returns true when requiredToken is falsy. If the env var is unset, the internal betting endpoints are completely unauthenticated. Consider returning 503 when the token is unset in production.

  2. Stream token in URL fragment vs query param (Low) - Moving streamToken to the URL fragment is a good security improvement. The scrubStreamingAccessTokenFromUrl() cleanup is nice. However, the token is still accepted from query params as a fallback -- ensure this does not lead to token leakage in server logs.

@claude
Copy link

claude bot commented Mar 20, 2026

Potential Bugs

  1. reconcileLiveCycle unhandled rejection (Medium) - In DuelBettingBridge.ts, handleStreamingFightStart calls this.reconcileLiveCycle() directly without a catch. If it throws, the rejection is unhandled. Consider wrapping with try/catch.

  2. useEffect dependency list in GameClient.tsx includes hideUI (Low) - hideUI changing could re-trigger the entire world initialization effect. If hideUI is expected to be stable, consider documenting that or removing it from deps.

  3. selectReplayDelivery bootstrap naming (Low) - When sinceSeq equals the latest frame seq, returns mode bootstrap with the latest frame. A caught-up client receives a redundant frame on reconnect. The naming is slightly misleading.

@claude
Copy link

claude bot commented Mar 20, 2026

Code Quality

  1. Duplicated resolveDockerBinary function - Same logic in docker-manager.ts and cdn.mjs with subtly different behavior (existsSync vs fs.accessSync with X_OK). Extract to shared utility to avoid drift.

  2. Large streaming.ts growing further - This PR adds ~665 lines to an already substantial file. The betting feed logic (frame capture, SSE broadcast, auth, bootstrap/events) could be extracted into streaming-betting-routes.ts.

  3. Repeated agent fixture boilerplate - StreamingMode.test.ts repeats full agent objects (~15 fields) in every test case. A createTestAgent(overrides) helper would reduce noise, similar to createCycle() in streaming-betting-feed.test.ts.

@claude
Copy link

claude bot commented Mar 20, 2026

Performance, Test Coverage, and Summary

Performance (Low): probeRendererHealth runs page.evaluate every 2s reading document.body.innerText (forces layout reflow). Current cadence is fine. trimReplayFrames uses splice(0, N) which is O(N) but fine with max buffer of 8192.

Test Coverage: Good unit tests for streaming-betting-auth, streaming-betting-feed, DuelBettingBridge, StreamingMode, LoadingScreen, StreamingDuelScheduler. Missing: deriveBettingRendererHealth branches, probeRendererHealth fallback path, handleBettingEvents SSE lifecycle.

Minor Nits: toAgentSnapshot uses agent.characterId for id (server) while toGuardrailAgent uses agent.id (client) -- asymmetry worth a comment. BETTING_SOURCE_EPOCH_STORAGE_KEY could live in a shared constants file.

Summary: Solid work. Main actionable items: (1) tighten auth-bypass-when-unset for production, (2) add try/catch around reconcileLiveCycle() in handleStreamingFightStart, (3) consider extracting betting routes from streaming.ts.

@claude
Copy link

claude bot commented Mar 20, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Overall Assessment

This is a substantial, well-structured PR that adds an internal betting sync feed, renderer health signaling, and hardens the streaming duel lifecycle. The code quality is generally high with good separation of concerns and solid test coverage for the new functionality. Below are specific findings organized by severity.


Security

Good practices observed:

  • timingSafeEqual for token comparison in streaming-betting-auth.ts — prevents timing attacks
  • Token is hashed before comparison via SHA-256 digest
  • Stream token is scrubbed from the URL after extraction (scrubStreamingAccessTokenFromUrl) — prevents leakage via referrer headers or browser history
  • Token moved from query string to URL hash fragment, reducing server-side logging exposure
  • SSE client caps (STREAMING_SSE_MAX_CLIENTS, BETTING_SSE_MAX_CLIENTS) prevent resource exhaustion
  • Upper bounds added to environment-controlled buffer sizes (replay buffer, max pending bytes)

Concern — Auth bypass when STREAMING_VIEWER_ACCESS_TOKEN is unset:
In assertBettingAuth (streaming.ts), if STREAMING_VIEWER_ACCESS_TOKEN is not set, the function returns true (open access). While there's a production warning log at registration time, the internal betting feed endpoints are still fully accessible without authentication. Consider failing closed in production — returning 403 when the token is missing in NODE_ENV=production rather than just logging a warning.

Minor — allowQueryToken on SSE endpoint:
The /api/internal/bet-sync/events SSE endpoint allows streamToken via query parameter (allowQueryToken: true). Query params can appear in server access logs and CDN/proxy logs. The bootstrap endpoint correctly restricts to Bearer header only. Consider whether the SSE endpoint truly needs query-param fallback or if clients can use Bearer headers.


Potential Bugs

  1. verifiedUserId scope change in connection-handler.ts (lines ~1348-1367): Moving the let verifiedUserId declaration outside the if block changes its scope but it's only assigned inside one branch. Verify that the variable isn't unintentionally used as null in code paths below that previously wouldn't have had access to it. This looks intentional for a broader refactor, but the diff context is narrow — worth confirming.

  2. reconcileLiveCycle throws but runScheduledReconciliation catches: The reconcileLiveCycle method can throw (it's not wrapped in try/catch internally — the finally block just resets reconcileInFlight). The scheduled reconciliation catches the error, but the test "resets reconcileInFlight even when a reconcile pass throws" confirms the throw propagates. This means handleStreamingFightStart and getResolvableStreamingMarket, which call reconcileLiveCycle directly without a catch, will propagate unhandled rejections from createOrSyncMarket. The void prefix on handleStreamingFightStart swallows the rejection silently — consider adding explicit error handling in those call sites.

  3. selectReplayDelivery edge case: When sinceSeq > latestFrame.seq, a "reset" is returned. This could happen if a client reconnects with a stale Last-Event-ID from a previous server epoch. The behavior seems intentional, but worth documenting that clients must handle the reset event type to avoid missing state transitions.


Performance Considerations

  1. Reconciliation loop runs continuously (DuelBettingBridge): The reconciliation timer runs every config.reconcileIntervalMs (default 1s) indefinitely, even when no streaming scheduler is active. When no scheduler exists, it backs off to 5s. This is fine for a streaming server but consider adding a way to pause reconciliation entirely when streaming mode is known to be inactive.

  2. probeRendererHealth calls page.evaluate on every status snapshot (every 2s in the status interval, plus every 30s in the status log): Each page.evaluate is a cross-process IPC call to Chrome. The current approach is reasonable for 2s intervals, but be aware this adds latency to the status snapshot writes. The refreshRendererHealthSnapshot catch-all is good defensive programming.

  3. trimReplayFrames uses splice(0, removeCount): For large buffers, Array.splice at index 0 is O(n) due to element shifting. The current upper bound of 8192 frames makes this acceptable, but if the buffer grows, consider a ring buffer. Not a problem at current scale.

  4. Betting frame deduplication via JSON serialization comparison (captureBettingFrame): Comparing JSON.stringify output on every push interval tick is simple and effective. The payloads are small enough that this is fine, but note that Date.now() in emittedAt means frames will almost never deduplicate unless nothing else changes. The emittedAt is set before serialization, so duplicate detection only works if the rest of the payload is identical AND the timestamps happen to match — which won't happen. Consider excluding emittedAt from the dedup comparison, or moving emittedAt assignment after the dedup check.


Code Quality

Strengths:

  • Clean extraction of streamingGuardrails.ts into shared package — good reuse across client and server
  • StreamingWindow type consolidation eliminates scattered as unknown as casts
  • DuelBettingBridge refactored from monolithic handlers into composable createOrSyncMarket/resolveMarket — much easier to reason about
  • Docker binary resolution (resolveDockerBinary) is pragmatic and handles macOS/Linux edge cases well
  • Test coverage is thorough: auth, feed payload construction, replay delivery, bridge lifecycle, renderer health derivation, loading screen behavior
  • The useEffect dependency arrays in StreamingMode.tsx are now correct (previously missing deps)

Suggestions:

  • Duplicate resolveDockerBinary: The same function appears in both docker-manager.ts and scripts/cdn.mjs. Consider extracting to a shared utility to avoid drift.
  • (bridge as any) in tests: The DuelBettingBridge tests access private methods via as any. Consider exposing a test-only interface or making the tested methods protected/package-private to improve type safety in tests.
  • ServerRuntime test changes: The test updates (30Hz → 2Hz, catch-up → no catch-up) are a significant behavioral change to the server runtime. These appear to be documenting an already-shipped change, but verify this doesn't conflict with other in-flight work on the tick system.

Test Coverage

Good coverage for the new functionality:

  • streaming-betting-auth.test.ts — token extraction and validation
  • streaming-betting-feed.test.ts — payload construction and replay delivery modes
  • DuelBettingBridge.test.ts — full lifecycle: announcement → fight → resolution, reconciliation, abort, error recovery
  • StreamingMode.test.ts — renderer health derivation and loading dismiss logic
  • LoadingScreen.test.tsx — custom completion stage
  • StreamingDuelScheduler.test.ts — phaseVersion incrementation

Missing coverage worth considering:

  • No tests for deriveBettingRendererHealth (server-side renderer health derivation from external RTMP snapshots)
  • No tests for the SSE client lifecycle (connect, replay delivery, heartbeat, disconnect, capacity limits)
  • No tests for scrubStreamingAccessTokenFromUrl or getStreamingAccessToken URL parsing

Minor Nits

  • packages/procgen/tsconfig.json maps @hyperscape/shared to a single .d.ts file (../shared/build/framework.d.ts). This is fragile — if the shared package build output changes, this path silently breaks. Consider using the package's actual types field from package.json.
  • The uWebSockets.js.d.ts type declaration file is a useful addition but may drift from the actual uWS API. Consider adding a comment noting which uWS version it targets.

Summary

This is a well-engineered PR that significantly improves the streaming-to-betting pipeline reliability. The main areas to address before merge are:

  1. Auth posture in production when STREAMING_VIEWER_ACCESS_TOKEN is unset (fail open vs. fail closed)
  2. Betting frame dedupemittedAt in the serialized comparison defeats deduplication
  3. Unhandled rejections from reconcileLiveCycle in non-scheduled call sites
  4. Confirm the verifiedUserId scope change in connection-handler.ts is intentional

The draft status and paired-PR requirement with hyperbet#28 is the right call — this should soak together before merging.

@claude
Copy link

claude bot commented Mar 21, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Solid PR overall — well-structured streaming lifecycle hardening with good test coverage for the new surfaces. Here are findings organized by severity.


Security

1. Token in URL fragment is an improvement, but scrubbing has a race window
scrubStreamingAccessTokenFromUrl() runs in a useEffect — the token is visible in the URL bar for potentially multiple render cycles. If any analytics/telemetry scripts capture window.location on page load, the token leaks. Consider scrubbing synchronously before React mounts (e.g., in the module scope or an inline <script>).

2. streaming-betting-auth.ts — timing-safe comparison is done correctly
Using timingSafeEqual on SHA-256 digests is the right approach. Good work here.

3. STREAMING_VIEWER_ACCESS_TOKEN reuse for betting feed auth
The same env var gates both the viewer WebSocket and the internal betting feed. If a streaming viewer token leaks (e.g., via URL logs), it also grants access to the betting feed. Consider a dedicated BETTING_FEED_ACCESS_TOKEN for separation of concerns, or at least document this coupling.

4. Docker command injection surface in docker-manager.ts
this.config.containerName and this.config.postgresUser are interpolated directly into shell strings passed to execAsync. If these ever come from user input, this is a command injection vector. Currently they appear to be hardcoded defaults, but consider using spawn with argument arrays instead of execAsync with string interpolation for defense in depth.


Potential Bugs

5. captureBettingFrame calls deriveBettingRendererHealth twice
In captureBettingFrame, buildBettingFeedPayload is called twice — once for dedup comparison and once for the actual frame. Each call invokes deriveBettingRendererHealth(cycle) which reads external status files and capture stats. Between the two calls, the renderer health could change, causing the dedup check to pass but the actual frame to contain different data than what was compared. Consider computing renderer health once and passing it to both calls.

// Current (two separate derivations):
const dedupePayload = buildBettingFeedPayload({ ..., rendererHealth: deriveBettingRendererHealth(cycle) });
// ... dedup check ...
const payload = buildBettingFeedPayload({ ..., rendererHealth: deriveBettingRendererHealth(cycle) }); // could differ!

6. bettingBootstrapAuthPreHandler returns reply on failure but has void return type

const bettingBootstrapAuthPreHandler = async (
    request: FastifyRequest,
    reply: FastifyReply,
  ): Promise<void> => {
    if (!(await assertBettingAuth(request, reply, { allowQueryToken: false }))) {
      return reply; // This returns a FastifyReply from a Promise<void> function
    }
  };

Same for bettingEventsAuthPreHandler. While Fastify may handle this gracefully, returning reply from a void handler is semantically wrong and could cause issues with strict type checking or future Fastify versions. Should be just return; since assertBettingAuth already sends the response.

7. verifiedUserId scope widening in connection-handler.ts
Moving let verifiedUserId outside the if block widens its scope, but I don't see it being used in the outer scope in the diff. If it's needed elsewhere (not shown), this is fine — otherwise it's an unintentional leak of mutable state.

8. Missing hasStreamingState check in shouldDismissStreamingLoading
shouldDismissStreamingLoading requires hasStreamingState to be true, but deriveStreamingRendererHealth can report ready: true even when hasStreamingState is false (during IDLE phase). This inconsistency between the two functions could cause the renderer health to report "ready" while the loading overlay is still showing.


Performance

9. trimReplayFrames uses splice(0, N) which is O(N) array shifting
For the SSE replay buffer that can hold up to 8192 frames, repeatedly splicing from the front is expensive. Consider using a ring buffer or tracking a start offset instead. This applies to both the existing SSE replay and the new betting replay.

10. probeRendererHealth runs page.evaluate every 2 seconds
The status snapshot interval calls refreshRendererHealthSnapshot which does a full page.evaluate(). While 2s is reasonable, the evaluate includes DOM text scanning (document.body.innerText.slice(0, 1024)) which forces layout. If the page is under heavy rendering load, this could cause frame drops. Consider using page.evaluateHandle or caching the DOM probe result.

11. Reconciliation loop in DuelBettingBridge runs even when streaming scheduler is inactive
When no streaming scheduler exists, the loop backs off to 5s intervals but never stops. Consider only starting the reconciliation loop when a streaming scheduler is detected, or stopping it after sustained inactivity.


Code Quality

12. Duplicated resolveDockerBinary() in docker-manager.ts and cdn.mjs
The same Docker binary resolution logic appears in both files. Consider extracting to a shared utility.

13. streaming.ts is getting very large (~1700+ lines after this PR)
The file now handles public SSE, betting SSE, betting bootstrap, betting auth, renderer health derivation, and more. Consider splitting the betting feed routes into their own file (e.g., streaming-betting-routes.ts) to improve maintainability.

14. Test files use (bridge as any) for private method access
DuelBettingBridge.test.ts accesses private methods via (bridge as any).handleStreamingAnnouncement(...). While pragmatic, consider exposing a test-only interface or using event emission to trigger these paths through the public API — especially since the bridge already listens for world events.

15. probeRendererHealth duplicates type definitions
The page.evaluate callback in stream-to-rtmp.ts re-declares the __HYPERSCAPE_STREAM_RENDERER_HEALTH__ shape inline rather than importing StreamingWindowRendererHealth from the client. Since this runs in a Playwright evaluate context (no imports), this is understandable but worth a comment explaining why.

16. Shared fallback geometry objects in PlayerRemote.ts
fallbackHeadGeometry and fallbackBeaconGeometry are module-level singletons that get cloned per player. This is a good pattern — just ensure capsuleGeometry (used in ensureFallbackAvatar) is also imported correctly and not disposed elsewhere.


Test Coverage

Good coverage for the new code paths:

  • ✅ Betting auth token extraction and validation
  • ✅ Betting feed payload construction and replay delivery
  • ✅ DuelBettingBridge streaming lifecycle (announcement → fight → resolution)
  • ✅ Reconciliation loop error recovery
  • ✅ Streaming renderer health derivation
  • ✅ Loading screen completion stage customization

Missing coverage worth adding:

  • deriveBettingRendererHealth (server-side health derivation with external RTMP status fallback)
  • scrubStreamingAccessTokenFromUrl edge cases (no token, hash-only, search-only)
  • SSE client capacity limits (503 responses at max clients)
  • selectReplayDelivery with empty sinceSeq and boundary conditions at oldestSeq

Minor Nits

  • The uWebSockets.js.d.ts type declarations are a good addition — consider checking if @types/uwebsockets or the package's own types could be used instead to reduce maintenance burden.
  • ServerRuntime.test.ts changes (30Hz → 2Hz) seem unrelated to the streaming/betting work. If this is a separate behavioral change, it might be cleaner in its own commit.
  • The @hyperscape/shared dependency addition to procgen and the tsconfig path mapping should be verified against the circular dependency note in CLAUDE.md (AUDIT-004).

Summary

This is a well-thought-out PR that correctly makes the streaming lifecycle authoritative for betting consumers. The renderer health signaling, sequence-aware payloads, and reconciliation loop are solid design choices. The main areas to address before merge are the double renderer-health derivation bug (#5), the void return type mismatch (#6), and the security considerations around token handling (#1, #3). The rest are improvements that can be addressed incrementally.

@claude
Copy link

claude bot commented Mar 21, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Overall this is a well-structured PR with solid security practices (timing-safe token comparison, URL token scrubbing, execFile hardening). Below are findings organized by severity.


Security

1. Command injection in scripts/cdn.mjs via DOCKER_BIN ⚠️
The docker-manager migration from execexecFile is a great security improvement, but cdn.mjs still uses execSync() with string interpolation from resolveDockerBinary(). If DOCKER_BIN is set to a malicious value, execSync will evaluate it in a shell. Should use execFileSync to match the hardening done in docker-manager.ts.

2. Auth bypass in non-production environments
assertBettingAuth returns true with zero authentication when NODE_ENV !== "production" and no token is configured. Any staging environment exposed to a network would have unauthenticated betting feed access. Consider requiring explicit opt-in for the dev bypass (e.g., a BETTING_FEED_SKIP_AUTH=true flag).

3. Token in query parameter on SSE endpoint (Low — documented trade-off)
The EventSource limitation is well-documented in comments. The SHA-256 digest comparison mitigates timing attacks. Good.


Bugs & Correctness

4. Dedup key includes volatile rendererHealth.updatedAt 🐛
buildBettingFeedDedupKey zeroes out emittedAt but does not zero out rendererHealth.updatedAt. Since updatedAt is Date.now() and changes every cycle, dedup will rarely match — defeating its purpose. Should be zeroed alongside emittedAt.

5. Missing await on resolveMarket in handleDuelResult 🐛
DuelBettingBridge.tshandleDuelResult uses void this.resolveMarket(market, ...) which discards the promise. If resolveMarket throws, it becomes an unhandled rejection. The streaming path (handleStreamingResolution) properly awaits the same call. Should be consistent.

6. reconcileLiveCycle resolution guard is fragile
The logic depends on canCreateMarketForStreamingPhase("RESOLUTION") returning false to prevent market re-creation after resolution. This works today but is a subtle invariant — a future change to allowed phases could silently break it. Consider adding an explicit "already resolved this cycle" guard.


Performance

7. Synchronous file read on hot path ⚠️
readExternalRtmpStatusSnapshot uses fs.readFileSync inside deriveBettingRendererHealth, called every push interval (250ms–1000ms). Sync I/O on the event loop at this cadence risks blocking under load. Should use fs.readFile (async) or cache the result with a TTL.

8. Array.splice(0, N) in replay buffer (Low — bounded)
streaming-sse-buffer.ts uses splice(0, N) which is O(N). The 8192 frame cap makes this acceptable. The code comment correctly notes a ring buffer would be better at scale.


Code Quality

9. .js file in TypeScript codebase
resolveDockerBinary.js is a plain JavaScript file. The project is TypeScript-only — this should be .ts with proper types rather than JSDoc annotations.

10. Dual route registration
Both /api/streaming/betting/* and /api/internal/bet-sync/* map to the same handlers. This is fine for migration but doubles the surface area. Consider documenting which path is canonical and adding a deprecation plan for the other.

11. No any types found
Clean across all new files — consistent with the project's strict typing rules.


Test Coverage

12. Missing happy-path test for betting routes
streaming-betting-routes.test.ts only tests negative/security cases (query-token rejection, capacity exhaustion, production fail-closed). No test verifies that a valid authenticated request returns a 200 bootstrap response.

13. Private method testing anti-pattern in DuelBettingBridge.test.ts
Uses as unknown as DuelBettingBridgeTestHarness to access private methods. This is fragile — renaming private methods silently breaks tests. Consider testing through public API or adding explicit test hooks.

14. Incomplete streamingAccessToken test coverage
Only resolveStreamingAccessTokenFromHref is tested (~40% of module exports). primeStreamingAccessTokenFromWindow and getStreamingAccessToken (the caching/window layer) are untested.

15. StreamingMode component untested
Tests cover exported pure functions (shouldDismissStreamingLoading, deriveStreamingRendererHealth) but the React component's state transitions, WebSocket handling, and fade-out behavior have no coverage.

16. Boilerplate duplication
StreamingMode.test.ts repeats full AgentInfo objects (~30 lines) in every test case. streaming-betting-routes.test.ts copy-pastes the 15-option config 3 times. Both should use shared factory functions.

17. Mock usage vs. CLAUDE.md "NO MOCKS" rule
Multiple test files use vi.mock and synthetic test doubles. For pure utility functions and auth boundary tests, this is a reasonable exception. However, the PR would benefit from companion Playwright integration tests for the full SSE betting feed flow and renderer health reporting chain.


Positive Highlights

  • Token scrubbing from URL (streamingAccessToken.ts) — history.replaceState to remove tokens from URL bar prevents leakage in browser history, referrer headers, and screenshots
  • execexecFile migration in docker-manager.ts — eliminates shell injection surface
  • ElizaOS v2 adapter fix — clean refactor replacing monkey-patched createLogs with proper log method
  • AgentManager.executeBehaviorTick — new public method eliminates (manager as any) cast in tests
  • useEffect dependency fix in LoadingScreen.tsx — the empty [] was a React bug
  • Bounded replay buffer with documented trade-offs

Summary

The PR is well-architected with good separation of concerns between the feed layer, auth, health derivation, and bridge lifecycle. The most actionable items are:

  1. Fix command injection vector in cdn.mjs (fix: camera system, player movement, UI design #1)
  2. Make readExternalRtmpStatusSnapshot async (resource gathering and depletion for woodcutting #7)
  3. Zero rendererHealth.updatedAt in dedup key (Game Loop Flow v1 #4)
  4. await instead of void on resolveMarket in handleDuelResult (Game Loop Flow v2 #5)
  5. Add happy-path test for betting routes (pick-up-interaction-fixes #12)

@claude
Copy link

claude bot commented Mar 21, 2026

Code Review: feat(stream): internal bet sync feed and renderer health

Thorough review of this 4943-line addition across ~50 files. Overall the PR is well-structured with good security practices. Below are the findings organized by severity.


Security

1. Viewer token fallback weakens betting feed isolation (streaming-betting-auth.ts)
resolveBettingFeedAccessToken falls back from BETTING_FEED_ACCESS_TOKEN to STREAMING_VIEWER_ACCESS_TOKEN. This means any stream viewer implicitly has access to the internal betting feed - these have different trust boundaries. The viewer token is passed in URLs (query/hash), so if it leaks, all internal betting endpoints are exposed. The warning log acknowledges this is temporary, but there is no enforcement of a migration deadline.

2. Auth bypass scope is broader than necessary (streaming-betting-auth.ts)
shouldSkipBettingFeedAuth checks NODE_ENV !== "production". Staging or pre-prod environments that do not set NODE_ENV=production would have auth completely bypassable via BETTING_FEED_SKIP_AUTH=true. Consider restricting to NODE_ENV === "development" || NODE_ENV === "test" only.

3. Query-string token on SSE events endpoint (streaming-betting-routes.ts)
allowQueryToken: true on the SSE events route is a pragmatic tradeoff (EventSource cannot set headers), but query params appear in server access logs, proxy logs, and Referer headers. The client-side URL scrubbing (streamingAccessToken.ts) is well done, but server-side log exposure remains. Consider documenting this as an accepted risk or using short-lived tokens.

Positive: Timing-safe token comparison via timingSafeEqual on SHA-256 digests is correctly implemented. The exec to execFile migration in docker-manager.ts and cdn.mjs eliminates shell injection vectors - nice improvement. Auth fails closed in production when no token is configured.


Potential Bugs

4. No runtime validation on event payloads in DuelBettingBridge
Event payloads are cast with as { duelId: string, ... } throughout the bridge (e.g., handleStreamingAnnouncement, handleStreamingEnd, handlePhaseChange). No runtime validation is performed. For a betting/financial system, a malformed payload (e.g., missing winnerId on resolution) could cause incorrect market state. Consider a lightweight runtime validator, even if just checking required fields exist.

5. externalStatusRefreshInterval leaks on hot reload (streaming-betting-routes.ts)
The interval at line ~3125 is stored in a closure variable, but the guard (if (externalStatusFile && !externalStatusRefreshInterval)) uses the freshly-null local variable on each call to registerStreamingBettingRoutes. If the function is called multiple times (e.g., dev server restart), the old interval leaks.

6. selectReplayDelivery returns misleading mode (streaming-betting-feed.ts)
When sinceSeq equals the latest sequence (client is caught up), the function returns mode: "bootstrap" with frames: []. The mode name "bootstrap" is misleading for a caught-up client - an explicit "up-to-date" or empty replay would be clearer.

7. Placeholder test assertions (ServerRuntime.test.ts)
expect(true).toBe(true) placeholder assertions test nothing and should either assert real values or be removed.


Code Quality

8. Tests use mocks extensively - violates CLAUDE.md "NO MOCKS" rule
Multiple new test files use vi.mock() and vi.fn(): streaming-betting-routes.test.ts, streaming-betting-renderer-health.test.ts, DuelBettingBridge.test.ts, StreamingMode.component.test.tsx, LoadingScreen.test.tsx. CLAUDE.md states "NO MOCKS - Use real Hyperscape instances with Playwright." The pure function unit tests are arguably fine, but the component/integration tests with mocked systems violate project philosophy. If this is an intentional carve-out for unit-level tests, the project testing rules should be updated.

9. registerStreamingBettingRoutes is ~580 lines with ~15 mutable closure variables
This function manages SSE broadcast, auth, replay buffering, external status I/O, renderer health, and database persistence all via closures. Consider extracting the SSE broadcast engine into a reusable class (it duplicates patterns from streaming.ts) and separating the health/status concern.

10. as unknown as used extensively as type-safety escape hatch
While technically avoiding any, patterns like world.getSystem("database") as unknown as DatabaseSystemLike and bridge as unknown as DuelBettingBridgeTestHarness are equally type-unsafe. The test harness pattern that accesses private methods will silently break if method names change.

11. PUBLIC_DISABLE_WEBGPU in StreamingWindow type (streamingWindow.ts)
Including PUBLIC_DISABLE_WEBGPU?: string in the type definition signals that WebGPU disabling is an accepted configuration path, which contradicts the CLAUDE.md rule: "NEVER use --disable-webgpu or forceWebGL flags."


Performance

12. Double JSON.stringify per frame (streaming-betting-routes.ts)
Every captureBettingFrame call serializes the full payload once for buildBettingFeedDedupKey and once more for Buffer.byteLength(JSON.stringify(payload)). At 250ms push intervals, this is two full serializations per tick. Cache the serialized form.

13. probeRendererHealth has no concurrency guard (stream-to-rtmp.ts)
page.evaluate() is called every 2 seconds with no debounce. If the browser is slow to respond, calls can stack up. Consider a simple boolean guard or skipping the probe if the previous one has not resolved.

14. trimReplayFrames uses splice(0, N) - O(N) reindex (streaming-sse-buffer.ts)
The inline comment acknowledges this should become a ring buffer. Acceptable at current scale (2048 frames), but worth tracking.

15. Fallback avatar geometries cloned per player (PlayerRemote.ts)
cloneFallbackGeometry creates new buffer copies per player. If many players fail avatar loading simultaneously (e.g., streaming viewer), this creates per-player GPU buffer uploads. The geometries are not modified, so they could be shared via instancing.


Positive Highlights

  • Token scrubbing from the browser URL bar before React/telemetry can observe it - good security hygiene
  • phaseVersion monotonic counter enables idempotent dedup on the consumer side
  • Replay buffer enforces both frame count and byte size limits, preventing unbounded memory growth
  • Fallback avatar retry throttling with 15s cooldown prevents hammering failed loads
  • Clean separation of auth, feed logic, transport, and routes
  • bettingSourceEpoch from database ensures sequence continuity across restarts

Summary

The security posture is solid (timing-safe comparison, URL scrubbing, fail-closed auth, shell injection fixes). The main areas to address before merge:

  1. Viewer token fallback giving betting feed access to all stream viewers (security boundary concern)
  2. No runtime validation on event payloads in a financial/betting bridge (correctness risk)
  3. Extensive mock usage violating project testing rules (or update the rules)
  4. Placeholder test assertions that test nothing
  5. Double JSON serialization per betting frame (easy performance win)

Good work on the overall architecture - the source-of-truth model with sequence-aware replay is well designed.

@claude
Copy link

claude bot commented Mar 21, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Good work on a substantial PR -- the architecture is well-considered, the auth hardening is a clear improvement, and the test coverage is solid for a feature of this scope. Below are findings organized by severity.


Security

HIGH -- postMessage origin not validated (packages/client/src/index.tsx)
The comment says in production, should check against allowed origins but event.origin is never validated. Any page that embeds the client in an iframe can send a HYPERSCAPE_AUTH message and inject an arbitrary authToken, agentId, characterId, and followEntity. This is a real attack vector if the client is ever embedded cross-origin.

MEDIUM -- sessionToken accepted via URL params (packages/client/src/index.tsx)
While authToken is correctly excluded from URL params, sessionToken is still accepted via URL and could leak through referrer headers and browser history. Consider scrubbing it like the streaming access token does.

MEDIUM -- INTERNAL_BET_SYNC_ALLOWED_ORIGIN not validated (streaming-betting-routes.ts)
The env var is written directly as the Access-Control-Allow-Origin header with no format validation. If misconfigured to *, it re-opens the wildcard CORS hole that this PR is trying to close.

LOW -- Query-string tokens for SSE appear in access logs
Acknowledged in comments, but worth documenting log-scrubbing requirements for production since the SSE EventSource API forces the token into the URL.

LOW -- stream-to-rtmp.ts logs access token to stdout (~line 760)
Using game page URL will print the streaming access token in logs. Consider redacting.

LOW -- --disable-web-security in stream-to-rtmp.ts
Disables same-origin policy for the capture browser. Acceptable in Docker/CI but should be documented as a risk.


Bugs

No-op ternary: reason (StreamingDuelScheduler/index.ts ~line 1415)
reason: winReason === "kill" ? "death" : "death" always evaluates to "death". Either a copy-paste oversight or the ternary should be replaced with just reason: "death".

No-op ternary: display (GameClient.tsx ~line 486)
display: ui.visible ? "block" : "block" -- both branches produce "block". The false case was likely intended to be "none".

GPU context lost effect is a complete no-op (GameClient.tsx lines 312-334)
The event listener registration is commented out and the cleanup returns an empty function. Should either be completed or removed.

parseReplayCursor prioritizes ?since= over Last-Event-Id (streaming-betting-routes.ts)
The SSE spec has the browser automatically send Last-Event-Id on reconnect. Preferring the query param over the header could cause duplicate frame delivery if the query param is stale from initial connection.

Stale closure on worldReady in StreamingMode.tsx (handleSetup callback)
worldReady is captured at callback creation time but not listed in useCallback deps. The callback always sees worldReady === false. Benign today (React deduplicates identical setState calls) but a latent bug if logic becomes conditional on the value.


Race Conditions and State Management

No concurrency guard on createOrSyncMarket (DuelBettingBridge.ts)
Multiple concurrent calls can pass the activeMarkets.has() / hasResolvedMarket guards simultaneously, leading to duplicate initRound() on-chain calls. Consider a per-duelId lock or pending-promise cache.

handleStreamingAbort deletes market without on-chain cleanup (DuelBettingBridge.ts ~line 659)
If initRound was already called on Solana, aborting the streaming cycle deletes the local market entry but leaves the on-chain market unresolvable. Either add cancellation logic or document that the Solana program handles orphaned markets.

_endCycleInProgress flag lifecycle is fragile (StreamingDuelScheduler/index.ts)
The flag is only reset inside a setTimeout nested inside a .finally() on the cleanup promise. If destroy() is called during the delay window, recovery depends on destroy() manually resetting the flag -- which it does, but the chain is fragile.


Performance and Memory

Dual storage of payload + payloadJson in replay frames (streaming-betting-routes.ts)
Each BettingFeedFrame stores both the parsed object and the JSON string. Only payloadJson is needed for SSE delivery. The payload object is only used for the latest frame in bootstrap. This roughly doubles per-frame memory.

No backpressure on SSE raw.write() (streaming-betting-routes.ts ~line 450)
write() return value (false = buffer full) is not checked. The writableLength pre-check is a reasonable approximation but writes can still exceed the threshold.

~200 lines of inline CSS re-parsed every render (LoadingScreen.tsx)
The entire style block with interpolated progress values is re-injected on every render. Consider using CSS custom properties or inline style for dynamic parts only.

Spectator entity poll interval has no upper bound (EmbeddedGameClient.tsx ~line 438)
The 1-second setInterval runs indefinitely if the target entity never spawns. The 20-second timeout only logs a warning but does not clear the interval.


Code Quality

SSE utilities duplicated between streaming.ts and streaming-betting-routes.ts
formatSseEvent, writeSseMessage, writeSseEvent, removeSseClient are near-identical in both files. Extract to a shared SSE utility module to reduce maintenance burden.

Empty catch blocks in StreamingMode.tsx (lines ~776, 788, 818)
try/catch with empty catch swallows all errors silently. At minimum, these should log at debug level for diagnosability.

as unknown as DatabaseSystemLike double cast (streaming-betting-routes.ts ~line 494)
Suggests the World.getSystem return type is too loose. Not blocking but worth tracking.

Duplicated createCycle test helper
The same factory is copy-pasted across streaming-betting-feed.test.ts, streaming-betting-renderer-health.test.ts, and partially in DuelBettingBridge.test.ts. Extract to a shared test fixture.

isDevelopment unused in types.ts (StreamingDuelScheduler/types.ts ~line 178)
Declared but never referenced -- dead code.

Docker manager default password check (docker-manager.ts ~line 267)
pwd === "hyperscape" checks against the wrong string -- the actual default is "hyperscape_dev_password". The read-actual-password-from-container fallback will never trigger for the real default.


Test Coverage Gaps

The test suite is good overall -- proper unit tests with mocks (appropriate for unit tests per project conventions), and the integration test covers the scheduler state machine. Key gaps:

  • No test for SSE /events actually streaming data -- only the 503 capacity case is tested
  • No CORS header or rate limiting tests for betting routes
  • No positive test for shouldDismissStreamingLoading returning true -- only false cases tested
  • LoadingScreen.test.tsx -- single test case; no tests for default text, progress rendering, or world.off cleanup on unmount
  • No concurrent reconciliation test for DuelBettingBridge (reconcileInFlight guard)
  • No test for deriveStreamingRendererHealth when connected: false

Minor / Nits

  • lockMarket timeout fires even if market was already locked early by a streaming:fight:start event -- harmless but the timer could be cleared
  • postMessage to parent uses wildcard origin for HYPERSCAPE_READY -- benign payload but best practice is to restrict
  • loadRuntimeEnv (GameClient.tsx) appends a new script on every call with no deduplication guard
  • PlayerRemote.ts line ~504: non-null assertion on getHeadToHeight() could propagate NaN to chat bubble position
  • Promise.resolve().then(...) for position restoration after bake (PlayerRemote.ts ~line 585) is fragile if the bake is async

Summary

This is a well-structured PR with good separation of concerns between auth, feed mechanics, and lifecycle management. The main items to address before merge:

  1. Security: Validate postMessage origin in index.tsx (HIGH)
  2. Bugs: Fix the three dead-code/no-op ternaries (reason, display, GPU context effect)
  3. Race condition: Add concurrency guard to createOrSyncMarket
  4. On-chain safety: Clarify or handle orphaned Solana markets on streaming abort

The rest are improvements that can be addressed incrementally. Nice work on the timing-safe token comparison, the URL token scrubbing, and the thorough test coverage for the betting feed builder functions.

@claude
Copy link

claude bot commented Mar 21, 2026

Code Review: feat(stream): internal bet sync feed and renderer health

Thorough review of the ~6,000-line diff. Overall this is a well-structured PR with strong auth improvements and good test coverage. Below are findings organized by severity.


Critical

1. --disable-web-security in streaming capture browser (stream-to-rtmp.ts)
The Chrome launch flags include --disable-web-security. The added comment rationalizes it ("the browser runs a trusted local page"), but this disables same-origin policy entirely. If the capture browser ever navigates to or is redirected to an untrusted URL (e.g., a compromised game server URL), all cross-origin protections are gone. Consider removing this flag and using proper CORS configuration instead.

2. SSE endpoint accepts token via query string (streaming-betting-routes.ts)
The /api/internal/bet-sync/events SSE endpoint sets allowQueryToken: true, meaning BETTING_FEED_ACCESS_TOKEN can appear as ?streamToken= in the URL. Query parameters are logged in access logs, proxy logs, CDN logs, and browser history. The comment acknowledges this ("Production access logs and reverse proxies must redact streamToken from URLs"), but documentation-only mitigation is fragile. Consider:

  • Adding server-side log redaction as code, not just a comment
  • Exploring Authorization header-based auth for SSE (some SSE libraries support this)

3. Auth bypass in NODE_ENV=test (streaming-betting-auth.ts)
When NODE_ENV=test and BETTING_FEED_SKIP_AUTH=true, auth is entirely bypassed. If a staging or CI environment inadvertently sets NODE_ENV=test (common in some CI setups), the betting feed becomes unauthenticated. Consider using a more specific env check (e.g., a dedicated HYPERSCAPE_TEST_MODE flag) or removing the runtime bypass entirely and handling it in test setup.


Important

4. Unhandled async rejection in lockMarket via setTimeout (DuelBettingBridge.ts)
lockMarket is now async (calls this.solanaOperator.lockMarket()), but the setTimeout-based lock timer calls it with void this.lockMarket(...). If lockMarket throws, the promise rejection is swallowed silently. Add a .catch() handler or use try/catch in the timeout callback to log the error.

5. Module-level singleton externalStatusPollers Map (streaming-betting-routes.ts)
The externalStatusPollers Map lives at module scope with reference-counted setInterval timers. If registerStreamingBettingRoutes is called multiple times (hot-reload, tests), pollers accumulate. The refcount mechanism mitigates this, but if close() is not called, intervals leak. Consider tying poller lifecycle to the Fastify instance via onClose hooks (which it partially does, but verify all paths).

6. verifiedUserId scope hoisting (connection-handler.ts)
verifiedUserId is hoisted from inside the authenticated block to the outer scope. If the auth branch is not taken (anonymous spectator), it remains null. Downstream code must correctly handle null. The change looks correct but is fragile — consider adding a type annotation like let verifiedUserId: string | null = null to make the contract explicit.


Minor

7. Duplicate readiness-probe logic (stream-to-rtmp.ts)
After the refactor, waitForStreamReadiness calls probeRendererHealth in a loop but still has its own secondary canvas/boot-UI gate. This is redundant with logic inside probeRendererHealth and could drift over time.

8. nextBettingClientId unbounded counter (streaming-betting-routes.ts)
Monotonically incremented with no upper bound. In a long-running server this will eventually exceed Number.MAX_SAFE_INTEGER. Extremely unlikely in practice but technically incorrect — could use BigInt or reset on wrap.

9. Regex fallback in redactStreamingSecretsFromUrl may leave dangling ?
When streamToken is the only query parameter and URL parsing fails, the regex fallback replaces ?streamToken=value with ?, leaving a trailing ?. The primary new URL() path handles this correctly, so this only affects malformed URLs.

10. useEffect dependency array in GameClient.tsx
The useEffect that initializes the world now depends on hideUI and streamingMode. If these props change at runtime, the world will be torn down and re-initialized. Verify this is intentional or memoize/stabilize these values.


Positive Observations

  • Auth upgrade: The refactoring from URL-parameter-based tokens to postMessage with origin validation is a significant security improvement.
  • Timing-safe comparison: timingSafeEqual is correctly used for token comparison in hasValidBettingFeedToken (via SHA-256 digest comparison), preventing timing attacks.
  • Shell injection fix: The execSync/exec to execFileSync/execFile migration in docker-manager.ts and cdn.mjs eliminates shell injection vectors. Good catch.
  • URL scrubbing: The streaming access token is removed from the URL bar via history.replaceState before React or telemetry can observe it.
  • Log redaction: redactStreamingSecretsFromUrl properly strips secrets from log output.
  • Test coverage: Comprehensive unit tests cover auth, feed construction, replay delivery, renderer health, and the betting bridge lifecycle. Tests properly validate both happy and error paths.
  • No any types: The diff consistently uses unknown with proper narrowing or Record<string, unknown> — compliant with the project's no-any rule.

Summary

The PR is well-written with thoughtful security hardening (especially the execFile migration and postMessage auth). The main areas to address before merge are: (1) the --disable-web-security flag rationalization, (2) query-string token exposure in SSE, (3) the NODE_ENV=test auth bypass risk, and (4) the unhandled async rejection in the lockMarket timeout path. The rest are minor items that can be addressed incrementally.

@claude
Copy link

claude bot commented Mar 21, 2026

Code Review: PR #1065feat(stream): internal bet sync feed and renderer health

Overall Assessment

This is a well-structured PR that adds a significant feature (internal betting sync feed with renderer health monitoring) while making meaningful security improvements (token scrubbing from URLs, origin allowlisting, removal of --disable-web-security, timing-safe token comparison). Code quality is high with strong typing throughout and comprehensive unit test coverage.

That said, there are a few issues worth addressing before merge.


Critical

C1. --no-sandbox in capture browser launch args
packages/server/src/streaming/captureBrowserPolicy.ts:28

--no-sandbox is included in default Chromium launch args. While this may be required for headless Linux GPU containers, it disables Chromium's process sandboxing. If the capture browser is redirected to an attacker-controlled page, the sandbox would normally limit damage.

Suggestion: Gate --no-sandbox behind an explicit opt-in environment variable (e.g., CAPTURE_DISABLE_SANDBOX=true) and document the security tradeoff. The resolveAllowedCaptureOrigins navigation guard helps mitigate this, but defense-in-depth is preferable.

C2. Aborted on-chain market has no cancellation path
packages/server/src/systems/DuelScheduler/DuelBettingBridge.ts (~handleStreamingAbort)

When a streaming duel is aborted after onChainInitialized === true, the code logs a warning that the operator doesn't support cancellation and the round is "locally terminal only." This means real funds could be locked in an on-chain market with no resolution.

Suggestion: At minimum, emit a dedicated alert/metric for monitoring. Ideally, implement a cancelRound path on the Solana operator or set up an automated intervention queue. This is flagged in the code but is a funds-at-risk scenario that warrants a tracking issue.


Important

I1. resolveMarket setTimeout callback could fire after destroy()
packages/server/src/systems/DuelScheduler/DuelBettingBridge.ts (resolveMarket method)

The resolveTimer is added to this.pendingTimeouts and properly cleared in destroy(). However, the callback captures references to this.solanaOperator and this.world which could be stale if the bridge is destroyed between scheduling and callback firing (edge case with rapid destroy + clearTimeout race).

Suggestion: Add an if (this.destroyed) guard at the start of the setTimeout callback body.

I2. SSE connections — consider per-IP concurrent connection tracking
packages/server/src/routes/streaming-betting-routes.ts:556-632

While maxClients caps total connections and eventsRateLimit applies per-IP, an attacker could rapidly connect and disconnect to exhaust resources. The auth + rate limit check runs before reply.hijack() (good), but there's no per-IP concurrent connection limit.

Suggestion: Add per-IP concurrent connection tracking to prevent a single origin from consuming all maxClients slots.

I3. Module-level singleton externalStatusPollers map
packages/server/src/routes/streaming-betting-routes.ts:771

externalStatusPollers is a module-level Map with reference counting. If routes are registered multiple times (tests, hot-reload), interval timers could leak. The tests do verify cleanup, but shared mutable state at module scope is fragile.

Suggestion: Consider making the poller map instance-scoped (passed as option to the route plugin) rather than module-scoped.

I4. PlayerRemote fallback avatar added directly to scene instead of entity node
packages/shared/src/entities/player/PlayerRemote.ts (~ensureFallbackAvatar)

The fallback avatar is added to world.stage.scene directly, bypassing the entity's node hierarchy. The update() method manually copies position/quaternion, creating a one-frame-behind lag and bypassing the entity's visibility/culling systems.

Suggestion: Consider adding to this.node and letting the scene graph handle transforms naturally. If scene-level attachment is necessary (e.g., node not yet in scene), document why.


Minor

M1. ServerRuntime test tick rate changed from 30Hz to 2Hz
packages/shared/src/systems/server/__tests__/ServerRuntime.test.ts

This is a significant behavioral change to the test but isn't mentioned in the PR description. If intentional, call it out — if it's just a flaky-test workaround, it may mask timing issues.

M2. biomeConfig fields removed without migration note
packages/asset-forge/src/components/WorldBuilder/TerrainPreview.tsx, types.ts

Five biome config fields (mountainHeightThreshold, mountainWeightBoost, etc.) are removed. If any persisted configs reference these fields, they would fail silently. Worth confirming no saved configs depend on them.

M3. Duplicate route registration for legacy compatibility
packages/server/src/routes/streaming-betting-routes.ts:636-662

Both /api/streaming/betting/* and /api/internal/bet-sync/* point to the same handlers. Intentional for backward compatibility, but consider adding a deprecation header to the legacy routes to signal consumers to migrate.

M4. GameClient effect dependency array change
packages/client/src/screens/GameClient.tsx:444

The init effect now only depends on [world] since wsUrl, onSetup, and onInitError are stored in refs. This is correct but could surprise future maintainers — a brief comment explaining why would be helpful.


Security Positives (things done well)

  • Token moved from query string to hash fragment (not sent to servers in HTTP requests)
  • redactStreamingSecretsFromUrl used consistently in log statements
  • --disable-web-security explicitly removed and tested to not be present
  • Timing-safe token comparison in auth
  • shouldSkipBettingFeedAuth correctly requires BOTH NODE_ENV === "development" AND explicit env var
  • PLAYWRIGHT_TEST wildcard origin fallback properly gated

Test Coverage Assessment

Strengths: Auth edge cases, replay buffer delivery modes, bridge lifecycle (create → lock → resolve, create → abort), renderer health derivation, and StreamingMode component loading overlay all have solid coverage.

Gaps to consider:

  • No test for SSE reconnection behavior on mid-stream disconnect
  • No test for concurrent SSE client exhaustion and recovery
  • probeRendererHealth in stream-to-rtmp.ts has no unit test (the shouldAcceptCaptureReadiness helper is tested, which covers most logic)
  • No integration test for the full betting feed SSE flow (connect → receive frames → disconnect)

Good work overall. The security posture improvements are welcome and the test coverage is solid for a PR of this scope. The main items to address before merge are C1 (sandbox opt-in) and C2 (on-chain market cancellation tracking), plus the destroy() guard in I1.

@claude
Copy link

claude bot commented Mar 21, 2026

PR Review: feat(stream): internal bet sync feed and renderer health

Overall Assessment

This is a substantial, well-structured PR that significantly improves the security posture and reliability of the streaming/betting pipeline. The architecture — separating auth, feed construction, SSE transport, and renderer health into distinct modules — is clean and follows good separation of concerns. Test coverage is solid across the critical paths.


Security (Mostly Positive — Several Important Hardening Wins)

Improvements worth highlighting:

  1. Token scrubbing (streamingAccessToken.ts): Synchronous hash-fragment extraction + history.replaceState before React mounts prevents tokens from leaking into browser history, referrer headers, or telemetry. Well done.

  2. Origin-validated postMessage (embeddedAuth.ts): The old code accepted HYPERSCAPE_AUTH from any origin. The new code validates event.source === window.parent, checks event.origin against an explicit allowlist, and rejects *, null, and javascript: protocols. Major improvement.

  3. Capture browser policy (captureBrowserPolicy.ts): Removing --disable-web-security from default Chromium args and making --no-sandbox opt-in via CAPTURE_DISABLE_SANDBOX are important fixes. The navigation allowlist on framenavigated prevents redirect-to-malicious-origin attacks.

  4. Command injection fix (docker-manager.ts, cdn.mjs): Replacing exec (shell-interpreted) with execFile (no shell interpolation) eliminates a command injection vector where containerName could contain shell metacharacters. Good catch.

  5. Timing-safe token comparison (streaming-betting-auth.ts): Using timingSafeEqual with SHA-256 digests for BETTING_FEED_ACCESS_TOKEN. Correct approach.

Minor concerns:

  • parseExternalRtmpStatusSnapshot reads a file path from the RTMP_STATUS_FILE env var with no path validation. Since controlling env vars already implies server compromise, this is acceptable — but adding a path prefix check (e.g., must be under a known directory) would be defense-in-depth.
  • normalizeString in embeddedAuth.ts uses .trim() which normalizes Unicode whitespace — the max-length check provides defense in depth, so this is fine in practice.

Code Quality

Strong points:

  • No any types found in any production code — fully compliant with the project's strict typing rule.
  • Proper payload validation in DuelBettingBridge: The old code used payload as { ... } unsafe casts. The refactor introduces typed parse functions (parseDuelScheduledPayload, parseStreamingAnnouncementPayload, etc.) that validate all fields.
  • Pure, testable functions extracted for deriveStreamingRendererHealth, shouldDismissStreamingLoading, and the streaming guardrail validators — all with comprehensive unit tests.
  • Bug fix in GameClient.tsx: display: ${ui.visible ? "block" : "block"} (always "block" regardless of state) is now correctly using resolveGameClientUiDisplay(ui.visible).

Suggestions:

  • externalStatusPollers module-level Map in streaming-betting-routes.ts is global mutable state. The onClose hook mitigates resource leaks, but if closeRoutes() never fires (e.g., unhandled crash), pollers could leak. Consider adding a setInterval health check that clears stale entries, or document the cleanup contract explicitly.

  • DuelBettingBridge test harness accesses private methods via asTestHarness(). The newer executeBehaviorTick pattern on AgentManager is cleaner. Consider migrating the bridge tests to use a similar pattern for less fragile testing.


Potential Bugs

  1. Fallback avatar retry timing (PlayerRemote.ts): When applyAvatar fails, _nextAvatarRetryAt is set to Date.now() + 15_000. The safety-net check in update() then gates on Date.now() >= this._nextAvatarRetryAt. This is correct, but if the avatar server is persistently down, the player will retry every 15s indefinitely with no backoff. Consider exponential backoff with a cap (e.g., 15s → 30s → 60s → 60s).

  2. Fallback avatar scene attachment: In ensureFallbackAvatar(), the fallback root is added to scene (if available) or this.node. When added to the scene directly, position sync happens in update() by copying this.node.position. If update() isn't called on the same frame (e.g., entity is culled), the fallback could appear at the wrong position for one frame. This is cosmetic and low-risk.

  3. SSE reconnect after reset mode: In selectReplayDelivery, when sinceSeq > latestFrame.seq, the client receives mode: "reset" with the full buffer. If the client doesn't handle this mode correctly and re-sends the same stale sinceSeq, it could enter a reset loop. This depends on the Hyperbet consumer implementation — worth validating in end-to-end testing.


Performance

  • readFileSync → async read for RTMP status snapshots in streaming.ts: Good fix, prevents event loop blocking on I/O.
  • useMemo for wsUrl in StreamingMode.tsx: Prevents URL object re-creation on every render.
  • useRef for stable callbacks in GameClient.tsx: Prevents the world init effect from re-running when callback identities change. Correct pattern.
  • SSE capacity limits (STREAMING_SSE_MAX_CLIENTS, STREAMING_SSE_MAX_PENDING_BYTES): Good resource exhaustion protection.
  • Bounded replay buffer (capped at 8192): Prevents unbounded memory growth.

No performance concerns identified.


Test Coverage

Test coverage is comprehensive for the new code:

  • streaming-betting-auth.test.ts — covers dev skip, token validation, missing tokens
  • streaming-betting-feed.test.ts — covers replay delivery modes, frame ordering
  • streaming-betting-renderer-health.test.ts — covers health derivation edge cases
  • streaming-betting-routes.test.ts — covers SSE lifecycle, CORS, capacity limits
  • DuelBettingBridge.test.ts — covers lifecycle transitions, reconciliation, error paths
  • StreamingMode.component.test.tsx / StreamingMode.test.ts — covers renderer health derivation, loading dismissal logic
  • embeddedAuth.test.ts — covers origin validation, field validation, wildcard rejection
  • streamingAccessToken.test.ts — covers token extraction, URL scrubbing

The tests follow the project convention of testing real behavior rather than mocking internals.


Summary

This PR is well-executed. The security improvements alone (token scrubbing, origin validation, capture browser hardening, command injection fix) make it a significant upgrade. Code quality is high, types are strict, and test coverage is thorough.

Recommended before merge:

  1. Validate the SSE reset mode handling end-to-end with the paired Hyperbet PR
  2. Consider exponential backoff for fallback avatar retries
  3. End-to-end soak as noted in the PR description

This is ready for merge once the paired Hyperbet PR (HyperscapeAI/hyperbet#28) and soak testing pass.

@rndrntwrk rndrntwrk marked this pull request as ready for review March 21, 2026 08:24
@claude
Copy link

claude bot commented Mar 21, 2026

Code Review: feat(stream): internal bet sync feed and renderer health

Security

Positives:

  • Removed --disable-web-security default from capture browser launch and made --no-sandbox opt-in via CAPTURE_DISABLE_SANDBOX — meaningful hardening.
  • Betting feed uses its own dedicated BETTING_FEED_ACCESS_TOKEN with timingSafeEqual on SHA-256 digests (timing-attack resistant). Auth separation from STREAMING_VIEWER_ACCESS_TOKEN is correct.
  • Stream tokens moved to URL hash fragment (invisible to server referrer/logs) and immediately scrubbed via history.replaceState. Log output sanitized via redactStreamingSecretsFromUrl.
  • Embedded auth now validates event.source === window.parent and checks origin against an explicit allowlist instead of wildcard postMessage. Wildcard fallback only permitted in dev/Playwright.
  • execexecFile in Docker manager and cdn.mjs eliminates shell injection vectors.

Concerns:

  • parseExternalRtmpStatusSnapshot in streaming-betting-routes.ts reads a JSON file from RTMP_STATUS_FILE with minimal shape validation then spreads it into API responses. If the file is tampered with, arbitrary JSON gets forwarded to clients. Consider stricter schema validation on the parsed output.
  • abortCaptureForUnexpectedNavigation calls process.exit(1) — any unexpected redirect (e.g., auth flow) kills the capture process. Probably intentional, but worth an operational note or comment.

Potential Bugs

  • Hash fragment parsing fragility (streamingAccessToken.ts): new URLSearchParams(url.hash.replace(/^#/, "")) assumes the hash is strictly key=value&... format. A hash like #/route?streamToken=abc would silently fail to find the token. This may be fine for the current controlled use case but is brittle if the hash format ever changes.
  • deriveStreamingRendererHealth calls Date.now() inside useMemo: The updatedAt timestamp only updates when memo dependencies change, not on every render. The field may be stale relative to wall-clock time. Not a bug per se, but the semantics are surprising.
  • reconcileInFlight boolean guard in DuelBettingBridge: A second concurrent caller to reconcileLiveCycleFrom will silently skip reconciliation if the first is in-flight. If the second call carries new state that should trigger a different outcome, it is silently dropped. Compare this with createOrSyncInFlight which uses promise chaining to serialize properly. Consider whether reconcile needs the same treatment.
  • worldReadyRef set in two places (StreamingMode.tsx): Set both in onWorldReady callback and in the streaming:state:update handler. Duplicated state transitions make it easy to introduce inconsistencies — consider extracting to a single helper.

Code Quality

Positives:

  • Replaced as any / as unknown as { ... } casts in ElizaDuelBot.ts and ModelAgentSpawner.ts with properly typed access through public API. Good cleanup.
  • Fallback avatar lifecycle in PlayerRemote.ts correctly handles geometry clone ownership via OWNED_FALLBACK_GEOMETRY_KEY and properly disposes materials/geometry in clearFallbackAvatar.
  • Deduplication via buildBettingFeedDedupKey prevents duplicate SSE frames by normalizing away timestamps.

Concerns:

  • streaming-betting-routes.ts is 1006 lines — SSE lifecycle, authentication middleware, external status polling, source epoch persistence, and replay buffer management all in one file. Consider splitting into at least: route definitions, SSE transport, and external status poller.
  • Test files use vi.mock/vi.fnCLAUDE.md states "NO MOCKS" and requires real Hyperscape instances with Playwright. The pure-logic unit tests (auth token handling, SSE replay selection, health derivation) are arguably fine, but StreamingMode.component.test.tsx mocking out the entire GameClient is closer to the boundary where the CLAUDE.md policy applies. Worth a team discussion on where the line is drawn.
  • DuelBettingBridgeTestHarness casts private methods as public for testing. This creates tight coupling between tests and implementation internals — changes to private method signatures will break tests in non-obvious ways. Consider testing through public methods or exposing a formal test-only interface.

Performance

  • bettingReplayFrames.splice(0, removeCount) in streaming-sse-buffer.ts is O(n) for every eviction pass with STREAMING_SSE_REPLAY_BUFFER up to 8192. The code comment acknowledges this and suggests a ring buffer if scale grows — acceptable for now.
  • probeRendererHealth reads document.body?.textContent every 2 seconds via page.evaluate, which forces text content computation of the entire body DOM. For a game page with complex DOM, this could cause layout thrashing. Consider a lighter probe that only checks window.__RENDERER_HEALTH__ or similar global without reading DOM text.

Summary

This is a well-structured PR with strong security improvements across the board (token separation, URL scrubbing, origin validation, execFile, sandbox hardening). The main areas I'd suggest addressing before merge:

  1. Split streaming-betting-routes.ts — 1006 lines is too many concerns in one file
  2. Add schema validation for the external RTMP status JSON before forwarding to clients
  3. Lighten the renderer health probe — drop the textContent read, use only the window global
  4. Resolve the reconcileInFlight coalescing question — verify that silently dropping concurrent reconciliation calls is always safe

The DuelBettingBridge lifecycle logic, fallback avatar system, and embedded auth hardening are all solid work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant