Add codemode: typechecked TypeScript execution with tools #16

jonastemplestein · 2025-12-05T20:19:10Z

Summary

Adds codemode execution to the agent: the LLM can write TypeScript code blocks that are typechecked and executed, with access to tools for file I/O, shell commands, web fetching, and secrets.

Features

Code Execution

<codemode>...</codemode> blocks are typechecked with strict TypeScript
Executed in a Bun subprocess with streamed stdout/stderr
Type errors trigger agent retry (agent sees errors and can fix)
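
For readers unfamiliar with the block format, here is a minimal sketch of how <codemode> blocks could be pulled out of an assistant response. The helper name parseCodemodeBlocks and the regex are assumptions for illustration; the actual parser lives in src/codemode.model.ts and may differ.

```typescript
// Hypothetical helper (not the PR's implementation): return the trimmed body
// of every <codemode>...</codemode> block found in an assistant response.
function parseCodemodeBlocks(text: string): string[] {
  // Non-greedy match so several blocks in one response stay separate.
  const pattern = /<codemode>([\s\S]*?)<\/codemode>/g
  const blocks: string[] = []
  let match: RegExpExecArray | null
  while ((match = pattern.exec(text)) !== null) {
    blocks.push((match[1] ?? "").trim())
  }
  return blocks
}

const response = "Sure.\n<codemode>\nconsole.log(1)\n</codemode>"
console.log(parseCodemodeBlocks(response))
```

A response with no tags yields an empty array, which is the case the later CodemodeValidationErrorEvent is meant to catch.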

Available Tools

t.sendMessage(msg) — show message to USER (no agent turn)
t.readFile(path) / t.writeFile(path, content) — file I/O
t.exec(command) — run shell commands
t.fetch(url) — HTTP requests
t.getSecret(name) — access secrets (hidden from LLM)
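
To make the tool surface concrete, the sketch below models a subset of the tools as a TypeScript interface and runs a codemode-style function against an in-memory fake. The signatures, the return types, and the makeFakeTools helper are assumptions; the real Tools type is generated into types.ts and is not reproduced here. t.exec and t.fetch are left out to keep the fake small.

```typescript
// Assumed shape of a subset of the tools; the PR's real signatures may differ.
interface Tools {
  sendMessage(msg: string): Promise<void>
  readFile(path: string): Promise<string>
  writeFile(path: string, content: string): Promise<void>
  getSecret(name: string): Promise<string | undefined>
}

interface FakeToolsHandle {
  tools: Tools
  files: Map<string, string>
  messages: string[]
}

// Hypothetical in-memory fake, useful only for trying a block without side effects.
function makeFakeTools(secrets: Record<string, string>): FakeToolsHandle {
  const files = new Map<string, string>()
  const messages: string[] = []
  const tools: Tools = {
    sendMessage: async (msg) => {
      messages.push(msg)
    },
    readFile: async (path) => {
      const content = files.get(path)
      if (content === undefined) throw new Error(`No such file: ${path}`)
      return content
    },
    writeFile: async (path, content) => {
      files.set(path, content)
    },
    getSecret: async (name) => secrets[name]
  }
  return { tools, files, messages }
}

// What an LLM-written block might look like: explicit annotations everywhere,
// user-facing output via sendMessage, nothing written to console.log.
async function exampleBlock(t: Tools): Promise<void> {
  const token: string | undefined = await t.getSecret("API_TOKEN")
  await t.writeFile("status.txt", token === undefined ? "token missing" : "token present")
  const status: string = await t.readFile("status.txt")
  await t.sendMessage(`status.txt says: ${status}`)
}
```

Because the block only calls t.sendMessage and never console.log, it would not request another agent turn under the rules described in the next section.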

Agent Loop

console.log() output triggers another agent turn
t.sendMessage() output goes to user, no turn
Most tasks complete in one turn
Max 3 iterations before forced stop
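
The stdout versus sendMessage rule can be sketched as a small function that builds the result event. This assumes the decision is made purely on whether stdout is non-empty and that sendMessage output arrives on stderr (as one of the commit messages states); the field names on the event are illustrative.

```typescript
type TriggerAgentTurn = "after-current-turn" | "never"

// Illustrative event shape; the PR's CodemodeResultEvent may carry other fields.
interface CodemodeResultEvent {
  _tag: "CodemodeResult"
  stdout: string
  stderr: string
  exitCode: number
  triggerAgentTurn: TriggerAgentTurn
}

// Assumption: any console.log() output (stdout) means the agent wants to see
// the result and take another turn; stderr alone does not.
function toResultEvent(stdout: string, stderr: string, exitCode: number): CodemodeResultEvent {
  const hasAgentOutput = stdout.trim().length > 0
  return {
    _tag: "CodemodeResult",
    stdout,
    stderr,
    exitCode,
    triggerAgentTurn: hasAgentOutput ? "after-current-turn" : "never"
  }
}
```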

Event Model

Added triggerAgentTurn property to all events
LLM triggered by any event with triggerAgentTurn="after-current-turn"
CodemodeResultEvent persisted with stdout/stderr for conversation history

Files Added

src/codemode.model.ts — code block parsing, event types
src/codemode.service.ts — orchestrates parse → typecheck → execute
src/codemode.repository.ts — stores generated code files
src/code-executor.service.ts — runs code in Bun subprocess
src/typechecker.service.ts — TypeScript type checking
test/codemode.e2e.test.ts — end-to-end tests

All 75 tests pass.

Note

Adds codemode enabling LLM-emitted TypeScript to be typechecked and executed via Bun with tools, persisting results to drive an agent loop across CLI/UI/HTTP and a new per-context storage layout.

Codemode (core):
- Add codemode pipeline: parse <codemode> blocks, typecheck (TS), execute via Bun subprocess, stream events.
- New services: CodemodeService, TypecheckService, CodeExecutor, CodemodeRepository.
- New events: CodemodeResultEvent, CodemodeValidationErrorEvent and codemode streaming events; add triggerAgentTurn to events.
- LLM prompt building updated to include codemode results/validation.
Context/Agent:
- ContextService.addEvents supports codemode and emits codemode events; swaps in codemode system prompt when enabled.
- Agent continuation driven by triggerAgentTurn with persisted CodemodeResult.
CLI/UI:
- Add codemode run subcommand; integrate codemode into chat flow with an agent loop (iteration cap) and colored streaming output.
- TUI updated to render codemode results/validation and ignore ephemeral codemode stream events.
HTTP/Adapters:
- SSE streaming and LayerCode adapter generalized to ContextOrCodemodeEvent.
Persistence:
- Contexts now stored per-directory with events.yaml; repository APIs updated (getContextDir, listing by directories).
Tests/Config:
- Add E2E and unit tests for codemode; vitest config includes src/**/*.test.ts.
- ESLint ignores .mini-agent/**.

Written by Cursor Bugbot for commit 8f32ec5. This will update automatically on new commits.

cursor · 2025-12-05T20:22:59Z

src/context.service.ts

+
            if (newPersistedInputs.length > 0) {
-              const allEvents = [...existingEvents, ...newPersistedInputs]
+              const allEvents = [...eventsWithPrompt, ...newPersistedInputs]


Bug: System prompt permanently overwritten when codemode enabled

The ensureCodemodePrompt function replaces the user's original system prompt with CODEMODE_SYSTEM_PROMPT, and this modified event list gets persisted to disk via repo.save(). This permanently overwrites the user's custom system prompt in the context file, even though the codemode prompt replacement is likely intended only for the current LLM request, not for permanent storage. Subsequent loads of the context will return the codemode prompt instead of the user's original.
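
One possible shape of a fix, sketched with plain data types: keep the list that is persisted separate from the list that is sent to the LLM, and apply the prompt swap only to the latter. The event shapes, the placeholder prompt text, and the helper names withCodemodePrompt and planSave are assumptions, not code from this PR.

```typescript
type ContextEvent =
  | { _tag: "SystemPrompt"; content: string }
  | { _tag: "UserMessage"; content: string }

// Placeholder text; the real CODEMODE_SYSTEM_PROMPT is much longer.
const CODEMODE_SYSTEM_PROMPT = "You write TypeScript inside <codemode> blocks."

// Returns a copy with the system prompt swapped; the input array is untouched.
function withCodemodePrompt(events: readonly ContextEvent[]): ContextEvent[] {
  return events.map((e): ContextEvent =>
    e._tag === "SystemPrompt" ? { ...e, content: CODEMODE_SYSTEM_PROMPT } : e
  )
}

// Persist the user's original events; only the LLM request sees the swap.
function planSave(
  existingEvents: readonly ContextEvent[],
  newPersistedInputs: readonly ContextEvent[]
): { toPersist: ContextEvent[]; toSendToLLM: ContextEvent[] } {
  const toPersist = [...existingEvents, ...newPersistedInputs]
  return { toPersist, toSendToLLM: withCodemodePrompt(toPersist) }
}
```

With this split, whatever is handed to repo.save() still contains the user's custom system prompt on the next load.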


cursor · 2025-12-05T20:22:59Z

src/context.service.ts

+                      })
+                      yield* persistEvent(result)
+                      return result as ContextOrCodemodeEvent
+                    })


Bug: Multiple codeblocks lose earlier output on later failure

When processing multiple codeblocks in a single assistant response, the tracking variables (stdout, stderr, exitCode, typecheckFailed) accumulate across all blocks. If an early block executes successfully but a later block fails typechecking, the typecheckFailed path at lines 175-185 discards all accumulated stdout by setting stdout: "", losing output from the successful earlier blocks. Similarly, exitCode only captures the last ExecutionComplete event, so if an earlier block fails (exit code 1) but a later block succeeds (exit code 0), the failure is masked.
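
A sketch of one way to avoid both problems: record an outcome per block and combine them at the end, instead of sharing mutable tracking variables. The BlockOutcome shape and the rule "first non-zero exit code wins" are assumptions chosen for illustration, as is recording a typecheck failure as a non-zero exit code.

```typescript
// Assumed per-block record; a typecheck failure is modeled here as exitCode 1
// with the compiler diagnostics placed in stderr.
interface BlockOutcome {
  stdout: string
  stderr: string
  exitCode: number
}

interface CombinedOutcome {
  stdout: string
  stderr: string
  exitCode: number
}

function combineOutcomes(outcomes: readonly BlockOutcome[]): CombinedOutcome {
  // Keep the first failure so a later successful block cannot mask it.
  const firstFailure = outcomes.find((o) => o.exitCode !== 0)
  return {
    // Concatenate rather than overwrite, so earlier output survives a later failure.
    stdout: outcomes.map((o) => o.stdout).join(""),
    stderr: outcomes.map((o) => o.stderr).join(""),
    exitCode: firstFailure ? firstFailure.exitCode : 0
  }
}
```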

cursor · 2025-12-05T22:28:52Z

src/code-executor.service.ts

+              })
+            )
+          )
+        )


Bug: No execution timeout allows infinite loops to hang agent

The subprocess execution in CodeExecutor has no timeout mechanism. The stream waits indefinitely for process.exitCode to resolve. If LLM-generated code contains an infinite loop (e.g., while(true){}), a blocking operation that never completes, or code that hangs waiting for input, the agent will hang forever with no way to recover except killing the process. This could make the agent unresponsive until manually terminated.
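
A minimal sketch of a timeout guard, written against plain promises rather than the Effect-based stream the executor actually uses. The withTimeout helper is hypothetical; onTimeout stands in for whatever kills the subprocess.

```typescript
// Hypothetical guard: rejects if `work` has not settled within `ms` milliseconds.
// `onTimeout` is where the caller would kill the subprocess.
function withTimeout<T>(work: Promise<T>, ms: number, onTimeout: () => void): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      onTimeout()
      reject(new Error(`Execution timed out after ${ms}ms`))
    }, ms)
    work.then(
      (value) => {
        clearTimeout(timer)
        resolve(value)
      },
      (error) => {
        clearTimeout(timer)
        reject(error)
      }
    )
  })
}
```

The rejection could then be surfaced to the agent as a failed result, so a runaway while(true){} costs one turn instead of hanging the session.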

cursor · 2025-12-05T23:19:25Z

src/http.ts

-  // Stream SSE events directly - provide services to remove context requirements
+  // Filter to InputEvent only (exclude SystemPromptEvent which isn't an InputEvent)
+  const isUserMessage = (e: ScriptInputEvent): e is UserMessageEvent => Schema.is(UserMessageEvent)(e)
+  const events: Array<InputEvent> = parsedEvents.filter(isUserMessage)


Bug: HTTP handler drops valid FileAttachmentEvent inputs

The HTTP event filter only keeps UserMessageEvent using isUserMessage, but InputEvent includes FileAttachmentEvent and CodemodeResultEvent as valid input types. File attachment events would be dropped even if they were properly parsed. Additionally, ScriptInputEvent in server.service.ts only unions UserMessageEvent and SystemPromptEvent, preventing FileAttachmentEvent from being accepted via HTTP at all, making the HTTP endpoint unable to handle file attachments that the CLI supports.

Additional Locations (1)

src/server.service.ts#L16-L17
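
A sketch of a broader filter, using plain discriminated unions instead of the Schema-based guards in src/http.ts. The _tag values and field names are assumptions, and CodemodeResultEvent is omitted for brevity even though the comment above lists it as an InputEvent.

```typescript
// Assumed event shapes, for illustration only.
interface UserMessageEvent {
  _tag: "UserMessage"
  content: string
}
interface FileAttachmentEvent {
  _tag: "FileAttachment"
  path: string
}
interface SystemPromptEvent {
  _tag: "SystemPrompt"
  content: string
}

// Widened so that file attachments can be parsed from HTTP input at all.
type ScriptInputEvent = UserMessageEvent | FileAttachmentEvent | SystemPromptEvent
type InputEvent = UserMessageEvent | FileAttachmentEvent

// Accept every InputEvent variant instead of only UserMessageEvent.
const isInputEvent = (e: ScriptInputEvent): e is InputEvent =>
  e._tag === "UserMessage" || e._tag === "FileAttachment"

function toInputEvents(parsedEvents: readonly ScriptInputEvent[]): InputEvent[] {
  return parsedEvents.filter(isInputEvent)
}
```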

cursor · 2025-12-06T21:08:24Z

src/cli/chat-ui.ts


    if (result._tag === "completed") {
+      if (needsContinuation) {
+        return yield* runAgentContinuation(contextName, contextService, chat, mailbox)


Bug: Missing iteration limit allows unbounded agent loop recursion

The runAgentContinuation function recursively calls itself at line 252 without any iteration limit, unlike the CLI's runEventStream which uses MAX_AGENT_LOOP_ITERATIONS = 15. This unbounded recursion occurs whenever needsContinuation is true, which happens when CodemodeResultEvent or CodemodeValidationErrorEvent has triggerAgentTurn === "after-current-turn". If the LLM consistently produces typechecking errors, outputs to stdout via console.log(), or fails to include <codemode> tags, the agent loop will recurse indefinitely, potentially causing stack overflow or resource exhaustion in the interactive chat UI.

Additional Locations (1)

src/cli/chat-ui.ts#L173-L175
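
A sketch of the same continuation logic as a bounded loop instead of recursion. The runTurn callback is a stand-in for one addEvents round trip and returns whether another turn is needed; the helper name and the return shape are assumptions. The cap of 15 mirrors the CLI's MAX_AGENT_LOOP_ITERATIONS.

```typescript
const MAX_AGENT_LOOP_ITERATIONS = 15

interface LoopSummary {
  iterations: number
  hitLimit: boolean
}

// Hypothetical replacement for the recursive call: iterate until no
// continuation is requested or the cap is reached.
async function runBoundedAgentLoop(
  runTurn: () => Promise<boolean>,
  maxIterations: number = MAX_AGENT_LOOP_ITERATIONS
): Promise<LoopSummary> {
  let iterations = 0
  let needsContinuation = true
  while (needsContinuation && iterations < maxIterations) {
    needsContinuation = await runTurn()
    iterations += 1
  }
  // If a continuation is still wanted here, the loop stopped because of the cap.
  return { iterations, hitLimit: needsContinuation }
}
```

A loop also keeps stack depth constant, so the cap is the only thing that has to be tuned.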

Implements the core services needed for codemode functionality:

- codemode.model.ts: Event schemas and <codemode> block parsing
- codemode.repository.ts: Stores generated code in timestamped directories
- typechecker.service.ts: TypeScript compiler API wrapper for validation
- code-executor.service.ts: Bun subprocess execution with streaming output
- codemode.service.ts: Orchestrates parse/store/typecheck/execute workflow

Also adds error types, wires layers into main.ts, and updates vitest config to include colocated tests in src/.

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Add --codemode / -x flag to enable code block processing
- Add handleCodemodeEvent for colored terminal output
- Refactor codemode.service to use Stream.unwrap for cleaner control flow
- Fix typecheck failure handling to properly stop execution
- Add E2E tests proving the full pipeline works

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Add CodemodeResultEvent with toLLMMessage() returning user role
- Update PersistedEvent union to include CodemodeResultEvent
- Update eventsToPrompt to handle CodemodeResult as user message
- Integrate codemode execution into ContextService.addEvents
- CodemodeResult is now persisted and included in next LLM request
- Refactor CLI to use unified event handling

The codemode workflow now:

1. Assistant responds with <codemode> blocks
2. Code is parsed, typechecked, and executed
3. stdout/stderr captured as CodemodeResultEvent
4. Event persisted to context
5. Next LLM request includes it as user message

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Add getSecret tool to codemode tools interface for retrieving secrets hidden from LLM (implementation in code-executor.service.ts)
- Implement CodemodeResult return type with endTurn and data fields
- Add agent loop that re-calls LLM when endTurn=false
- Create CODEMODE_SYSTEM_PROMPT explaining tools and agent loop
- Context service swaps in codemode prompt when -x flag is used
- Add e2e tests for getSecret and CodemodeResult parsing

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Remove codemodeOption from CLI
- Always pass codemode: true to ContextService.addEvents
- Update test to expect output contains "i" (via tools.log) rather than exact match

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

The generated codemode files use `Promise<CodemodeResult>` in their function signature but the import only included `Tools`. Added `CodemodeResult` to the import statement. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Made system prompt more emphatic about requiring explicit type annotations (noImplicitAny is enabled)
- When typecheck fails, create CodemodeResultEvent with endTurn=false so the LLM can see the errors and retry with fixed code

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

The system prompt now explicitly states that Tools and CodemodeResult are automatically available (imports are auto-prepended), preventing LLMs from adding duplicate imports that cause TypeScript errors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Add triggerAgentTurn enum property ("after-current-turn" | "never") to all persisted events
- LLM triggered by any event with triggerAgentTurn="after-current-turn", not just UserMessage
- Remove endTurn and data fields from CodemodeResultEvent
- Add t.sendMessage() tool for user-facing output (stderr, no agent turn)
- Add t.fetch() tool for web requests
- console.log() triggers agent turn; sendMessage() doesn't
- Simplify system prompt: most tasks are single-turn
- All 75 tests pass

🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>

Internalize Scope in CodeExecutor by using Stream.unwrapScoped to manage subprocess lifecycle. This removes Scope from all public interfaces making the stream easier to consume. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- Add `mini-agent codemode run <path>` CLI command for executing codemode blocks
- Refactor CodeExecutor to call CLI command instead of inline bun -e
- Move agent loop from ContextService to CLI layer for cleaner separation
- Increase max agent loop iterations from 3 to 15
- Add utility tools from kathmandu: calculate, now, sleep
- Add __CODEMODE_RESULT__ marker for cleaner stdout parsing

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Dead code - appendLog was never called. Codeblock directories now only contain:

- index.ts (generated code)
- types.ts (tool type definitions)
- tsconfig.json (TypeScript config)

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

These files were accidentally committed during development and should not be tracked in the repository. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

- chat-ui.ts now uses contextService.addEvents with codemode: true instead of calling streamLLMResponse directly
- Add CodemodeValidationErrorEvent to detect when LLM doesn't output codemode tags, triggering retry with chastising error message
- Add feed item renderers for CodemodeResult and CodemodeValidationError in opentui-chat.tsx
- Update llm.ts to include CodemodeValidationErrorEvent in prompt conversion
- Add agent continuation loop in chat-ui for codemode follow-ups

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

cursor · 2025-12-06T21:28:14Z

src/cli/commands.ts

+    while (
+      lastCodemodeResult &&
+      lastCodemodeResult.triggerAgentTurn === "after-current-turn" &&
+      iteration < MAX_AGENT_LOOP_ITERATIONS


Bug: CLI agent loop ignores validation error retry trigger

The agent loop in runEventStream only tracks lastCodemodeResult (CodemodeResultEvent) but ignores CodemodeValidationErrorEvent. When the LLM responds without codemode tags, a CodemodeValidationErrorEvent is emitted and persisted with triggerAgentTurn: "after-current-turn", but since lastCodemodeResult remains undefined, the while loop condition fails and the agent never retries. The chat-ui.ts correctly handles this via triggersContinuation checking both event types.
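
A sketch of a loop condition that considers both event types, in the spirit of the triggersContinuation check mentioned for chat-ui.ts. The event shapes and the lastContinuationTrigger helper are assumptions for illustration.

```typescript
type TriggerAgentTurn = "after-current-turn" | "never"

// Assumed minimal shapes of the events the loop has to inspect.
interface CodemodeResultEvent {
  _tag: "CodemodeResult"
  stdout: string
  triggerAgentTurn: TriggerAgentTurn
}
interface CodemodeValidationErrorEvent {
  _tag: "CodemodeValidationError"
  message: string
  triggerAgentTurn: TriggerAgentTurn
}
interface AssistantMessageEvent {
  _tag: "AssistantMessage"
  content: string
}

type LoopEvent = CodemodeResultEvent | CodemodeValidationErrorEvent | AssistantMessageEvent

// True for either codemode event type when it asks for another agent turn.
function triggersContinuation(e: LoopEvent): boolean {
  return (
    (e._tag === "CodemodeResult" || e._tag === "CodemodeValidationError") &&
    e.triggerAgentTurn === "after-current-turn"
  )
}

// Hypothetical replacement for tracking only lastCodemodeResult: the loop
// continues if the most recent codemode-related event requests it.
function lastContinuationTrigger(events: readonly LoopEvent[]): LoopEvent | undefined {
  for (let i = events.length - 1; i >= 0; i--) {
    const e = events[i]
    if (e !== undefined && e._tag !== "AssistantMessage") {
      return triggersContinuation(e) ? e : undefined
    }
  }
  return undefined
}
```

The while condition in runEventStream could then test `lastContinuationTrigger(events) !== undefined && iteration < MAX_AGENT_LOOP_ITERATIONS`, which covers the missing-tags retry case as well.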

cursor bot reviewed Dec 5, 2025


jonastemplestein changed the title ~~Replace endTurn with triggerAgentTurn~~ Add codemode: typechecked TypeScript execution with tools Dec 5, 2025

jonastemplestein force-pushed the codemode-execution branch from e3a539c to e560d87 Compare December 5, 2025 20:48

cursor bot reviewed Dec 5, 2025


jonastemplestein force-pushed the codemode-execution branch from df65165 to 63e8675 Compare December 6, 2025 20:59

cursor bot reviewed Dec 6, 2025


jonastemplestein and others added 14 commits December 6, 2025 21:17

Remove test artifact files (example.txt, output.txt)

2277f26


jonastemplestein force-pushed the codemode-execution branch from 63e8675 to 8f32ec5 Compare December 6, 2025 21:22

cursor bot reviewed Dec 6, 2025

