Add replay-from-trace and scenario-authoring tooling#587

Merged
Chris0Jeky merged 12 commits into main from test/332-trace-replay-scenario-authoring on Mar 30, 2026

@Chris0Jeky
Owner

Summary

  • Adds a trace data model (types/trace.ts) and a recorder composable (useTraceRecorder) that captures user action sequences during normal use
  • Adds a replay engine (traceReplay.ts) with play/pause/stop/seek and adjustable speed for re-executing recorded traces
  • Adds scenario JSON schema validation (scenarioSchema.ts) with type-safe step validation for all 7 step types
  • Adds a form-based scenario editor with add/remove/reorder steps, a JSON view toggle, import/export, and live validation
  • Adds DevToolsView, which combines trace replay and the scenario editor and is gated behind the devTools feature flag (disabled by default)
  • 44 new unit tests covering trace recorder, replay engine, and scenario validation
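To make the recorder half of this summary concrete, here is a minimal sketch of a trace shape and a recorder with the start/stop behaviour described in this PR (a second start is ignored with a warning, stop without an active recording returns null). The interface fields, the `createTraceRecorder` name, and the injected clock are assumptions for illustration; the actual definitions live in types/trace.ts and useTraceRecorder and may differ.

```typescript
// Hypothetical shapes; the real definitions live in types/trace.ts.
interface TraceAction {
  id: string
  type: string // e.g. 'click' or 'input' (assumed values)
  label: string
  timestamp: number // ms since epoch
  payload?: Record<string, unknown>
}

interface Trace {
  id: string
  startedAt: number
  endedAt: number
  durationMs: number
  actions: TraceAction[]
}

// Module-scoped counter: IDs are unique within a session, not globally.
let idCounter = 0
const nextId = (prefix: string): string => `${prefix}-${++idCounter}`

// Plain-function stand-in for the composable; `now` is injectable for tests.
function createTraceRecorder(now: () => number = Date.now) {
  let current: Trace | null = null

  return {
    isRecording: (): boolean => current !== null,

    start(): void {
      if (current) {
        console.warn('Recording already in progress; ignoring start().')
        return
      }
      const t = now()
      current = { id: nextId('trace'), startedAt: t, endedAt: t, durationMs: 0, actions: [] }
    },

    record(type: string, label: string, payload?: Record<string, unknown>): void {
      if (!current) return // not recording: drop the action
      current.actions.push({ id: nextId('action'), type, label, timestamp: now(), payload })
    },

    stop(): Trace | null {
      if (!current) return null // nothing to stop
      const finished = current
      finished.endedAt = now()
      finished.durationMs = finished.endedAt - finished.startedAt
      current = null
      return finished
    },
  }
}
```

Injecting the clock keeps durationMs deterministic in unit tests without fake timers; a Vue composable would additionally expose the state through refs.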

Closes #332

Test plan

  • Vitest unit tests for trace recorder (9 tests), replay engine (11 tests), and scenario validation (24 tests)
  • npm run typecheck passes
  • npm run build passes
  • Full test suite passes (1218 tests, 126 files)
  • Manual: enable devTools flag in settings, navigate to /workspace/dev-tools, record a trace, replay it, edit a scenario

@Chris0Jeky
Owner Author

Self-Review Findings

Trace data sensitivity

  • Trace payloads can contain selectors, API endpoints, and input values. Since this is internal-facing tooling gated behind a disabled-by-default feature flag, the risk is low. Traces are stored only in component state (not persisted to disk or the backend), and export is an explicit user action.
  • Console logging happens only during active replay and is limited to action labels and types; payloads are never logged.

Edge cases reviewed

  • Empty trace replay: handled -- completes immediately with completed status
  • Double start recording: warns and ignores (tested)
  • Stop without recording: returns null (tested)
  • Invalid speed (0 or negative): ignored (tested)
  • Seek out of bounds: ignored (tested)
  • Action handler throws: sets error state (tested)
  • Dispose during playback: clears timers (tested)
  • Import malformed JSON: shows error message, does not crash
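The replay-related guards in this list can be sketched as a small state holder. This is a simplified, synchronous illustration and not the code in traceReplay.ts: the `ReplayCursor` class, its field names, and the `step()` method are hypothetical, and the timer scheduling, pause, and dispose logic of the real engine are left out.

```typescript
type ReplayStatus = 'idle' | 'playing' | 'paused' | 'completed' | 'error'

interface ReplayAction {
  label: string
  timestamp: number // ms, as captured at record time
}

// Hypothetical, timer-free stand-in for the replay engine's guard logic.
class ReplayCursor {
  status: ReplayStatus = 'idle'
  index = 0
  speed = 1
  error: string | null = null
  private readonly actions: ReplayAction[]
  private readonly handler: (action: ReplayAction) => void

  constructor(actions: ReplayAction[], handler: (action: ReplayAction) => void) {
    this.actions = actions
    this.handler = handler
  }

  play(): void {
    this.error = null
    if (this.actions.length === 0) {
      this.status = 'completed' // empty trace: completes immediately
      return
    }
    if (this.status === 'completed') this.index = 0 // replay after completion restarts
    this.status = 'playing'
  }

  setSpeed(speed: number): void {
    if (!Number.isFinite(speed) || speed <= 0) return // invalid speed: ignored
    this.speed = speed
  }

  seekTo(index: number): void {
    const inBounds = Number.isInteger(index) && index >= 0 && index < this.actions.length
    if (!inBounds) return // out of bounds: ignored
    this.index = index
  }

  // Delay to wait before running the action at `index`, scaled by speed.
  delayBefore(index: number): number {
    const prev = this.actions[index - 1]
    const curr = this.actions[index]
    if (!prev || !curr) return 0
    return Math.max(0, curr.timestamp - prev.timestamp) / this.speed
  }

  // Runs one action; a real engine would call this from a timer callback.
  step(): void {
    const action = this.actions[this.index]
    if (this.status !== 'playing' || action === undefined) return
    try {
      this.handler(action)
    } catch (err) {
      this.status = 'error' // handler threw: surface it as error state
      this.error = err instanceof Error ? err.message : String(err)
      return
    }
    this.index += 1
    if (this.index >= this.actions.length) this.status = 'completed'
  }
}
```

Ignoring invalid input silently, rather than throwing, matches the "ignored" wording above and keeps UI sliders and seek bars from having to pre-validate every value.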

Potential improvements (not blocking)

  1. The importTrace function does minimal validation -- it checks for id and actions array but doesn't validate individual action shapes. Acceptable for internal tooling.
  2. The scenario editor's dynamic param editing casts to Record<string, unknown>, which works but loses type safety on individual param fields. A per-type param editor component would be more robust but is over-engineering for this internal tool.
  3. Trace recordings are in-memory only (lost on page reload). LocalStorage persistence could be added as a follow-up.
  4. The idCounter in useTraceRecorder is module-scoped, so IDs are unique within a session but not globally. Acceptable for internal use.
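If item 1 is ever picked up, per-action validation could look roughly like the sketch below. It is not the current importTrace implementation: `parseTrace`, the `ImportedAction` fields, and the error messages are assumptions chosen for illustration.

```typescript
// Hypothetical minimal shapes; the real trace types may carry more fields.
interface ImportedAction {
  id: string
  type: string
  timestamp: number
}

interface ImportedTrace {
  id: string
  actions: ImportedAction[]
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

function isImportedAction(value: unknown): value is ImportedAction {
  return (
    isRecord(value) &&
    typeof value.id === 'string' &&
    typeof value.type === 'string' &&
    typeof value.timestamp === 'number' &&
    Number.isFinite(value.timestamp)
  )
}

// Parses and validates a trace file's text; throws with a specific message on failure.
function parseTrace(json: string): ImportedTrace {
  const data: unknown = JSON.parse(json) // throws SyntaxError on malformed JSON
  if (!isRecord(data) || typeof data.id !== 'string' || !Array.isArray(data.actions)) {
    throw new Error('Trace must have a string id and an actions array.')
  }
  const badIndex = data.actions.findIndex((action) => !isImportedAction(action))
  if (badIndex !== -1) {
    throw new Error(`Action at index ${badIndex} has an invalid shape.`)
  }
  return { id: data.id, actions: data.actions as ImportedAction[] }
}
```

Reporting the index of the first bad action gives the person importing a hand-edited file something actionable, which a generic failure message does not.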

No issues found that warrant fixing before merge.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a suite of internal developer tools for recording and replaying user action traces, as well as a scenario editor for authoring test cases. It includes a new useTraceRecorder composable, a TraceReplayEngine utility, and a DevToolsView component protected by a feature flag. While the implementation is comprehensive, the review identifies issues with the scenario editor's parameter handling, where numeric values like durationMs are incorrectly treated as strings, leading to validation failures. Additionally, the feedback suggests improving error visibility during file imports by logging caught exceptions to the console to assist in debugging.

Comment on lines +585 to +590
<input
:value="(step.params as unknown as Record<string, unknown>)[key]"
type="text"
class="w-full bg-zinc-900 border border-zinc-600 rounded px-2 py-1 text-xs text-zinc-200"
@input="(step.params as unknown as Record<string, unknown>)[key] = ($event.target as HTMLInputElement).value"
/>

high

The generic input for scenario step parameters treats all values as strings. This will cause a validation error for numeric parameters like durationMs in a wait step, which is expected to be a number. The input should handle numeric types correctly by checking the parameter key and converting the value to a number if necessary.

                <input
                  :value="(step.params as any)[key]"
                  :type="key === 'durationMs' ? 'number' : 'text'"
                  class="w-full bg-zinc-900 border border-zinc-600 rounded px-2 py-1 text-xs text-zinc-200"
                  @input="(step.params as any)[key] = key === 'durationMs' ? Number(($event.target as HTMLInputElement).value) : ($event.target as HTMLInputElement).value"
                />
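As an alternative to the inline ternary, the conversion could be pulled into a small helper so the template stays readable and the rule is unit-testable. This is a sketch under assumptions: `coerceParamInput` and `NUMERIC_PARAM_KEYS` are hypothetical names, and durationMs is the only numeric key known from this review.

```typescript
// Keys whose values should be stored as numbers (assumed list; extend as needed).
const NUMERIC_PARAM_KEYS: ReadonlySet<string> = new Set(['durationMs'])

// Converts raw <input> text to the value that should be stored on step.params.
function coerceParamInput(key: string, raw: string): string | number {
  if (!NUMERIC_PARAM_KEYS.has(key)) return raw
  const parsed = Number(raw)
  // Number('') is 0 and Number('abc') is NaN; keep the raw text in both cases
  // so that schema validation can flag the field instead of hiding the problem.
  return raw.trim() === '' || Number.isNaN(parsed) ? raw : parsed
}
```

In the template, the @input handler would then assign `coerceParamInput(key, ($event.target as HTMLInputElement).value)`. Keeping unparseable text as a string is a design choice of this sketch; the suggestion above would store 0 for an empty field.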

Comment on lines +67 to +69
} catch {
  traceError.value = 'Failed to import trace file.'
}

medium

When catching an error during file import, it's helpful for debugging to log the actual error to the console, especially for a developer tool. This provides more context than just a generic error message in the UI.

    } catch (err) {
      traceError.value = 'Failed to import trace file.'
      console.error('Failed to import trace:', err)
    }

Comment on lines +181 to +183
} catch {
  scenarioErrors.value = [{ path: '', message: 'Failed to read file.' }]
}

medium

Similar to the trace import, logging the actual error to the console when a scenario import fails will aid in debugging. This is particularly useful for a developer-facing tool.

    } catch (err) {
      scenarioErrors.value = [{ path: '', message: 'Failed to read file.' }]
      console.error('Failed to import scenario:', err)
    }
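The `{ path, message }` entries in scenarioErrors suggest that validation reports one error per offending field. A minimal sketch of what that could look like for a wait step follows; `validateWaitStep`, the path format, and the messages are assumptions, not the contents of scenarioSchema.ts. It also shows why a string durationMs, as produced by the original text input, fails validation.

```typescript
interface ValidationError {
  path: string
  message: string
}

// Hypothetical validator for a single 'wait' step at position `index`.
function validateWaitStep(step: unknown, index: number): ValidationError[] {
  const base = `steps[${index}]`
  if (typeof step !== 'object' || step === null || Array.isArray(step)) {
    return [{ path: base, message: 'Step must be an object.' }]
  }
  const { type, params } = step as { type?: unknown; params?: unknown }
  const errors: ValidationError[] = []
  if (type !== 'wait') {
    errors.push({ path: `${base}.type`, message: 'Expected type "wait".' })
  }
  if (typeof params !== 'object' || params === null) {
    errors.push({ path: `${base}.params`, message: 'Params must be an object.' })
    return errors
  }
  const { durationMs } = params as { durationMs?: unknown }
  if (typeof durationMs !== 'number' || !Number.isFinite(durationMs) || durationMs < 0) {
    errors.push({
      path: `${base}.params.durationMs`,
      message: 'durationMs must be a non-negative number.',
    })
  }
  return errors
}
```

Returning an array instead of throwing lets the editor show every problem at once, which fits the live-validation behaviour described in the PR summary.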

- Use dynamic :type binding to detect numeric params (e.g. durationMs)
  and render type="number" input with Number() conversion on input
- Add console.error logging in trace import catch block
- Add console.error logging in scenario import catch block

Cover non-Error throw path, replay-after-completion restart, seekTo
during active play, pause no-op when idle, elapsed time reporting,
playback speed timing, assert text-equals validation, null params,
non-object steps, and all createBlankStep type defaults.

…ranch coverage threshold

Cover numeric-to-string mapping, out-of-range fallback, and
case-insensitive string normalization for command run status,
restore status, proposal status, source type, and risk level.
Also test featureFlagStore restore with empty localStorage.
@Chris0Jeky
Owner Author

Fixes applied

Gemini code review findings (DevToolsView.vue)

  1. HIGH: Numeric params treated as strings -- Dynamic params input now uses :type binding that detects numeric values (e.g. durationMs) and renders type="number" with Number() conversion on input, preventing string coercion.

  2. MEDIUM: Missing console.error on import failures -- Added console.error('Failed to import trace:', err) and console.error('Failed to import scenario:', err) in the respective catch blocks for trace and scenario file imports.

CI coverage threshold failure

Branch coverage was 70.28%, below the 71% threshold. Added 66 new tests across 8 files:

  • traceReplay.spec.ts (+9 tests): non-Error throw path, replay-after-completion restart, seekTo during play, pause no-op, elapsed time, playback speed, resume from paused, seekTo while paused, buildState zero elapsed
  • scenarioSchema.spec.ts (+20 tests): text-equals assertion, visible/hidden assertions, invalid expectation, null params, non-object steps, missing description, valid fill/api-seed/store-dispatch/wait steps, non-string description, non-array tags, delayMs edge cases, createBlankStep for all 5 remaining types
  • useTraceRecorder.spec.ts (+6 tests): unique trace IDs, durationMs updates, endedAt timestamp, unique action IDs, metadata fields, action type recording
  • featureFlagStore.spec.ts (+1 test): restore with empty localStorage
  • ops.spec.ts (new, 10 tests): normalizeCommandRunStatus numeric/string/fallback
  • archive.spec.ts (new, 4 tests): normalizeRestoreStatus numeric/string/fallback
  • automation.spec.ts (new, 12 tests): normalizeProposalStatus/SourceType/RiskLevel
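The three branches these normalizer tests target (numeric mapping, out-of-range fallback, case-insensitive string matching) can be illustrated with a generic helper. This is a sketch, not the project's normalizeCommandRunStatus or normalizeRiskLevel: `createEnumNormalizer`, the zero-based numeric mapping, and the risk level names are all assumptions.

```typescript
// Builds a normalizer that maps numbers (by index) or strings (case-insensitively)
// onto a fixed list of canonical values, falling back when nothing matches.
function createEnumNormalizer<T extends string>(values: readonly T[], fallback: T) {
  return (input: unknown): T => {
    if (typeof input === 'number') {
      // Out-of-range, negative, fractional, and NaN indexes all read as undefined.
      return values[input] ?? fallback
    }
    if (typeof input === 'string') {
      const wanted = input.trim().toLowerCase()
      return values.find((value) => value.toLowerCase() === wanted) ?? fallback
    }
    return fallback
  }
}

// Hypothetical canonical values, used only to exercise the helper.
const RISK_LEVELS = ['Low', 'Medium', 'High'] as const
const normalizeRiskLevelSketch = createEnumNormalizer(RISK_LEVELS, 'Low')
```

With this shape, each branch needs only one or two assertions, for example an in-range number, an out-of-range number, a differently cased string, and a non-string, non-number input.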

Verification

  • All 129 test files pass (1284 tests)
  • Branch coverage: 71.28% (above 71% threshold)
  • Typecheck: clean
  • Build: clean

Chris0Jeky and others added 2 commits March 30, 2026 00:56
…d, and roles utils

Cover previously untested branches: out-of-range numeric fallbacks,
empty/whitespace inputs, missing payload segments, and normalizer edge
cases. Brings branch coverage from 70.61% to 71.06%, passing the 71%
CI threshold.
@Chris0Jeky Chris0Jeky merged commit 6fedf2a into main Mar 30, 2026
18 checks passed
@github-project-automation github-project-automation bot moved this from Pending to Done in Taskdeck Execution Mar 30, 2026
@Chris0Jeky Chris0Jeky deleted the test/332-trace-replay-scenario-authoring branch March 30, 2026 00:13

Development

Successfully merging this pull request may close these issues.

TST-22: Add replay-from-trace and scenario-authoring follow-through
