
feat: add headless gRPC server for external agent integration#278

Merged
kevincodex1 merged 17 commits into Gitlawb:main from NikitaBabenko:gRPC
Apr 6, 2026

Conversation

Contributor

@NikitaBabenko NikitaBabenko commented Apr 3, 2026

Summary

  • what changed:
  1. Added a headless gRPC server implementation (src/grpc/server.ts) that runs alongside the original React Ink CLI.
  2. Defined a bidirectional streaming Protobuf contract (src/proto/openclaude.proto) for real-time interaction (streaming text tokens, intercepting tool calls, and requesting permission for actions).
  3. Created an interactive CLI test client (scripts/grpc-cli.ts) to demonstrate and validate the gRPC communication.
  4. Added entrypoint scripts (dev:grpc and dev:grpc:cli) to package.json and updated the README.md with instructions for headless usage.
  • why it changed:
  1. To allow openclaude's powerful agentic capabilities (tools, bash, file editing, workspace context) to be integrated into external applications, CI/CD pipelines, IDE plugins, or bots. By decoupling the core QueryEngine from the React Ink UI, developers can now build custom clients in any language using standard gRPC patterns while retaining the Human-in-the-Loop permission flow.
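Pieced together from the messages and fields referenced throughout this PR (session_id, `reserved 3`, and the tool_use_id numbering come from the review thread and commit messages), here is a hedged sketch of the streaming contract; every other name, type, and field number is an assumption, not the actual src/proto/openclaude.proto:

```proto
syntax = "proto3";

// Illustrative sketch only — field names/numbers not quoted in the PR
// review are assumptions.
service OpenClaude {
  rpc Chat (stream ClientMessage) returns (stream ServerMessage);
}

message ClientMessage {
  oneof payload {
    ChatRequest request = 1;
    ActionResponse action_response = 2;
    CancelSignal cancel = 3;
  }
}

message ChatRequest {
  string prompt = 1;
  string model = 2;
  reserved 3;            // former provider field, removed during review
  string session_id = 4; // resumes stored history across streams
}

message ActionResponse {
  string request_id = 1;
  string decision = 2;   // e.g. "yes" / "no"
}

message CancelSignal {}

message ServerMessage {
  oneof payload {
    StreamEvent stream_event = 1;       // streaming text tokens
    ToolCallStart tool_start = 2;
    ToolCallResult tool_result = 3;
    ActionRequired action_required = 4; // Human-in-the-Loop permission prompt
    Done done = 5;
  }
}

message StreamEvent    { string text_chunk = 1; }
message ToolCallStart  { string tool_name = 1; string arguments_json = 2; string tool_use_id = 3; }
message ToolCallResult { string tool_name = 1; string content = 2; string tool_use_id = 4; }
message ActionRequired { string request_id = 1; string tool_name = 2; }
message Done           { string full_text = 1; int32 input_tokens = 2; int32 output_tokens = 3; }
```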

Impact

  • user-facing impact:
  1. None for existing users. The original interactive CLI (openclaude) works exactly as before. Users who want to build integrations can now start the headless server via bun run dev:grpc.
  • developer/maintainer impact:
  1. Minimal intrusion. We rely purely on QueryEngine.ts and getTools(). We injected a MACRO polyfill in the standalone start-grpc.ts entrypoint to mimic the bundler's behavior, ensuring we don't disrupt the existing build.ts process.

Testing

  • bun install --frozen-lockfile
  • bun run build
  • bun run smoke
  • focused tests: Tested bidirectional streaming locally using the new grpc-cli.ts client. Verified that streaming tokens (stream_event), tool calls (tool_start), tool results (tool_result), and permission prompts (action_required) are successfully intercepted and processed back and forth without crashing.
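The event flow exercised above (stream_event, tool_start, tool_result, action_required, done) can be sketched as a minimal client-side dispatcher, the way a CLI client like scripts/grpc-cli.ts might render events. The message shapes here are illustrative assumptions, not the actual generated types:

```typescript
// Hypothetical shapes for the server->client events named in this PR;
// field names beyond those quoted in the thread are assumptions.
type ServerMessage =
  | { stream_event: { text_chunk: string } }
  | { tool_start: { tool_name: string; arguments_json: string } }
  | { tool_result: { tool_name: string; content: string } }
  | { action_required: { tool_name: string; request_id: string } }
  | { done: { full_text: string } };

// Turn each streamed event into a human-readable line.
function renderEvent(msg: ServerMessage): string {
  if ("stream_event" in msg) return msg.stream_event.text_chunk;
  if ("tool_start" in msg)
    return `[tool] ${msg.tool_start.tool_name}(${msg.tool_start.arguments_json})`;
  if ("tool_result" in msg) return `[result] ${msg.tool_result.tool_name}`;
  if ("action_required" in msg)
    return `[permission?] ${msg.action_required.tool_name}`;
  return `[done] ${msg.done.full_text}`;
}
```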

Notes

  • provider/model path tested: Tested with OpenRouter (anthropic/claude-3.5-sonnet) by passing environment variables.
  • screenshots attached (if UI changed): N/A (Headless service additions).
  • follow-up work or known limitations:
    1. Multi-turn conversation context is preserved within a single gRPC stream via initialMessages. Reconnecting on a new stream starts a fresh session (session_id is accepted by the proto but not yet used server-side for cross-stream persistence).
    2. The gRPC endpoint is unauthenticated and defaults to localhost:50051 (loopback only). Use GRPC_HOST=0.0.0.0 to explicitly expose on all interfaces. Future iterations could add grpc.Metadata authorization.
    3. We rely on Object.assign(globalThis, { MACRO: ... }) in start-grpc.ts because the standalone entrypoint skips the esbuild injection step that normally defines MACRO at build time.

@NikitaBabenko NikitaBabenko marked this pull request as ready for review April 3, 2026 14:59
Collaborator

@Vasanthdev2004 Vasanthdev2004 left a comment


This is a promising direction, but I don't think it's safe to merge as-is.

Two blocker-level issues from my side:

  1. The PR adds new dependencies without updating bun.lock.
    package.json:56-57 adds @grpc/grpc-js and @grpc/proto-loader, but the PR does not include a bun.lock update. In the review worktree, bun install --frozen-lockfile fails immediately and requires regenerating the lockfile. Since this repo uses Bun and commits bun.lock, I think the lockfile update needs to be part of the PR.

  2. The server binds an unauthenticated gRPC endpoint to 0.0.0.0 by default, while the README says localhost:50051.

    • README.md:211 says the service starts on localhost:50051
    • src/grpc/server.ts:36-43 actually binds to 0.0.0.0:${port} with createInsecure()

    That's a real security/behavior mismatch. Exposing an unauthenticated tool-executing endpoint on all interfaces by default is not something we should ship casually. At minimum this should default to 127.0.0.1/localhost, with any broader bind requiring an explicit opt-in.

I also tried the actual startup path locally via bun run scripts/start-grpc.ts, and the current implementation failed to bind with Failed to listen at 0.0.0.0, so I would definitely want the bind behavior tightened up before merge.
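The loopback-by-default policy requested here can be isolated into a small helper. `resolveBindAddress` and `GRPC_PORT` are illustrative names, not taken from the PR:

```typescript
// Sketch of the requested bind policy: default to loopback, and require an
// explicit GRPC_HOST opt-in to expose the unauthenticated endpoint more
// widely. resolveBindAddress and GRPC_PORT are hypothetical names.
function resolveBindAddress(env: Record<string, string | undefined>): string {
  const host = env.GRPC_HOST ?? "127.0.0.1"; // loopback unless explicitly overridden
  const port = env.GRPC_PORT ?? "50051";
  return `${host}:${port}`;
}
```

The resulting address would then be handed to grpc-js's `server.bindAsync(address, ServerCredentials.createInsecure(), callback)`.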


fix: address PR review feedback for gRPC server

- Update bun.lock for new dependencies (frozen-lockfile CI fix)
- Add multi-turn session persistence via initialMessages
- Replace hardcoded done payload with real token counts
- Default bind to localhost instead of 0.0.0.0
@NikitaBabenko
Contributor Author

@Vasanthdev2004 Thanks for the thorough review! All four issues addressed:

  1. bun.lock — updated, bun install --frozen-lockfile passes now
  2. Session persistence — new QueryEngine per request with initialMessages from previous turn. Multi-turn context works within a stream
  3. Done payload — real full_text and token counts extracted from the result message
  4. Bind address — defaults to localhost, opt-in via GRPC_HOST=0.0.0.0

Ready for re-review

Collaborator

@Vasanthdev2004 Vasanthdev2004 left a comment


Two blocker-level issues still stand out for me on the latest #278 head:

  1. scripts/start-grpc.ts:13-21 only calls enableConfigs() and then starts the gRPC server. It skips the normal startup prep that the main CLI does before running: applying safe managed env vars, loading saved provider profiles, hydrating stored credentials/tokens, and validating the resulting provider config. In practice that means the headless server can start in a different auth/provider state from normal OpenClaude, especially for saved profile flows and any credential path that relies on startup hydration. Since the README positions this as a usable headless entrypoint, I think it needs startup parity with the normal CLI bootstrap before merge.

  2. src/grpc/server.ts:179-180 handles cancel by just calling call.end(), but it never interrupts the in-flight QueryEngine. QueryEngine explicitly exposes interrupt() for this (src/QueryEngine.ts:1158-1159). As written, a client cancel closes the stream but can still leave the underlying model/tool execution running in the server process. For a headless integration API, that is a real behavior bug, not just a missing refinement.

The earlier blockers around lockfile/default bind/done payload look addressed, but I'd want these two fixed before calling the gRPC path mergeable.
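The cancel semantics asked for here reduce to an ordering constraint: interrupt the in-flight engine before ending the stream. A minimal sketch with illustrative types (not the actual server code):

```typescript
// Mock-level sketch of the requested cancel handler. A client cancel must
// stop model/tool work via interrupt() *before* the stream is closed;
// calling only call.end() leaves the engine running as an orphan.
interface Engine { interrupt(): void }
interface Call { end(): void }

function handleCancel(engine: Engine | null, call: Call): void {
  engine?.interrupt(); // stop in-flight model/tool execution first
  call.end();          // then close the stream to the client
}
```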

NikitaBabenko and others added 2 commits April 4, 2026 11:37
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace enableConfigs() with await init() in start-grpc.ts for full
  bootstrap parity with the main CLI (env vars, CA certs, mTLS, proxy,
  OAuth, Windows shell)
- Call engine.interrupt() before call.end() in the cancel handler so
  in-flight model/tool execution is actually stopped
- Show done.full_text in the CLI client when no text_chunk was received,
  preventing silent drops when streaming is unavailable
Collaborator

@Vasanthdev2004 Vasanthdev2004 left a comment


I rechecked the latest head after the recent fixes, and the earlier startup/cancel blockers look addressed. I still have two blocker-level API-contract issues before I can approve this:

  1. ChatRequest.provider is still ignored by the server.
    The proto exposes provider as an optional request field in src/proto/openclaude.proto:35, but src/grpc/server.ts:64-107 never reads req.provider at all. The only request-scoped routing that currently happens is userSpecifiedModel / fallbackModel from req.model, and the file still has a comment saying provider should be configured elsewhere. For an external integration surface, exposing a provider field that the server silently ignores is a correctness bug.

  2. session_id is still a dead field, so the published session contract does not actually work.
    ClientMessage defines session_id at the top level in src/proto/openclaude.proto:15-27, but src/grpc/server.ts never uses it. Session history is kept only in the per-stream previousMessages array (src/grpc/server.ts:56-57), so reconnecting with the same session_id does not resume anything. The bundled CLI client also writes session_id into request instead of the top-level ClientMessage envelope (scripts/grpc-cli.ts:107-110), which reinforces that the field is not currently wired end-to-end.

The recent fixes definitely moved this closer, but I still need the request/session fields in the proto to either be implemented authoritatively or removed from the public contract before I can call it mergeable.

NikitaBabenko and others added 2 commits April 4, 2026 18:22
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move session_id from ClientMessage into ChatRequest to fix proto-loader
  oneofs encoding bug and make the field functional
- Implement in-memory session store so reconnecting with the same
  session_id resumes conversation context across streams
- Remove ChatRequest.provider — per-request provider routing requires
  global process.env mutation, unsafe for concurrent clients; provider
  is configured via env vars at server startup
Collaborator

@Vasanthdev2004 Vasanthdev2004 left a comment


Rechecked the current head. The earlier provider/session field problems do look fixed, and I also reran bun run build plus bun run smoke in the review worktree. I also verified the gRPC server now binds on localhost:50051 when started locally.

I still see two blocker-level issues before I'd call this mergeable:

  1. scripts/start-grpc.ts still does not mirror the normal CLI bootstrap path. It only calls init() (scripts/start-grpc.ts:13-21), but the normal CLI also hydrates Gemini/GitHub secure tokens, resolves the saved provider profile into startup env, and validates provider configuration before first use (src/entrypoints/cli.tsx:71-96). That means the headless server can still start in the wrong auth/provider state for saved-profile setups that rely on secure token hydration or startup profile resolution.

  2. The gRPC contract for tool results is still inconsistent. The proto says ToolCallResult.tool_name (src/proto/openclaude.proto:71-75), but the server is filling that field with block.tool_use_id (src/grpc/server.ts:146-151). So clients receive a UUID in the field named tool_name, and the bundled CLI displays that value as if it were the tool name. If you want correlation, the API needs a separate tool_use_id field; otherwise this field should carry the actual tool name.
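The fix the commit below describes (a toolNameById map filled in canUseTool) can be sketched as follows; the function names and result shape are illustrative, not the actual server code:

```typescript
// Record the human-readable tool name when permission is requested, then
// look it up by tool_use_id when emitting the result, so tool_name carries
// the real name (e.g. "Bash") and the UUID travels in a separate field.
const toolNameById = new Map<string, string>();

function onCanUseTool(toolUseId: string, toolName: string): void {
  toolNameById.set(toolUseId, toolName);
}

function buildToolResult(toolUseId: string, content: string) {
  return {
    tool_name: toolNameById.get(toolUseId) ?? "unknown", // real tool name
    tool_use_id: toolUseId,                              // correlation ID
    content,
  };
}
```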

fix(grpc): mirror CLI auth bootstrap in start-grpc and fix tool_name field

scripts/start-grpc.ts now runs the same provider/auth bootstrap as the
normal CLI entrypoint: enableConfigs, safe env vars, Gemini/GitHub token
hydration, saved-profile resolution with warn-and-fallback, and provider
validation before the server binds.

ToolCallResult.tool_name was being populated with the tool_use_id UUID.
Added a toolNameById map (filled in canUseTool) so tool_name now carries
the actual tool name (e.g. "Bash"). The UUID moves to a new tool_use_id
field (proto field 4) for client-side correlation.
Collaborator

@Vasanthdev2004 Vasanthdev2004 left a comment


Rechecked the current head.

I reran:

  • bun run build
  • bun run smoke

Both pass, and the earlier startup-profile / tool-result-name issues are definitely improved. I still see two blocker-level problems before I'd merge it:

  1. The gRPC contract still doesn't actually support end-to-end tool call correlation.

ToolCallResult now includes tool_use_id and comments that it is a correlation ID matching ToolCallStart (src/proto/openclaude.proto:71-76), but ToolCallStart still has no corresponding tool_use_id field at all (src/proto/openclaude.proto:64-68).

That means an external client still can't reliably match a tool_result back to a specific tool_start when the same tool is used multiple times. Right now the server only emits:

  • tool_start.tool_name
  • tool_start.arguments_json

If correlation is part of the public contract, ToolCallStart needs the same ID too.

  2. The server still keeps work running when the client closes the stream without sending an explicit CancelSignal.

The only place that interrupts the QueryEngine is the protocol-level cancel branch (src/grpc/server.ts:197-201). In call.on('end'), the current code just nulls the local reference and clears pending requests (src/grpc/server.ts:215-219).

So if a client disconnects or closes the stream mid-generation without first sending a CancelSignal, the underlying model/tool work can keep running in the server process even though the client is gone.

Those two feel like real API/lifecycle issues rather than polish, so I'd still want another pass here before merge.
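The disconnect cleanup the review converges on can be sketched as a single handler: interrupt the engine and deny any pending permission prompts so canUseTool unblocks. All names here are illustrative:

```typescript
// Sketch of the stream-close cleanup: when the client disconnects without a
// CancelSignal, stop orphaned model/tool work and resolve every pending
// permission Promise with "no" so the engine can terminate.
type PermissionResolver = (decision: "yes" | "no") => void;

function onStreamEnd(
  engine: { interrupt(): void } | null,
  pendingPermissions: Map<string, PermissionResolver>,
): void {
  engine?.interrupt(); // stop orphaned model/tool work
  for (const resolve of pendingPermissions.values()) {
    resolve("no"); // deny so canUseTool unblocks
  }
  pendingPermissions.clear();
}
```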

fix(grpc): add tool_use_id to ToolCallStart and interrupt engine on stream close

Two blocker-level issues flagged in code review:

- ToolCallStart was missing tool_use_id, making it impossible for clients
  to correlate tool_start events with tool_result when the same tool runs
  multiple times. Added tool_use_id = 3 to the proto message and populated
  it from the toolUseID parameter in canUseTool.

- On stream close without an explicit CancelSignal the server only nulled
  the engine reference, leaving the underlying model/tool work running
  as an orphan. Added engine.interrupt() in the call.on('end') handler
  to stop work immediately when the client disconnects.
fix(grpc): resolve pending promises on disconnect and guard post-cancel writes

Four lifecycle and contract issues identified during proactive review:

- Pending permission Promises in canUseTool would hang forever if the
  client disconnected mid-stream. On call 'end', all pending resolvers
  are now called with 'no' so the engine can unblock and terminate.

- The done message and session save could fire after call.end() when
  a CancelSignal arrived mid-generation. Added an `interrupted` flag
  set on both cancel and stream close to gate all post-loop writes.

- The session map had no eviction policy, allowing unbounded memory
  growth. Capped at MAX_SESSIONS=1000 with FIFO eviction of the
  oldest entry.

- Field 3 was silently absent from ChatRequest. Added `reserved 3`
  to document the gap and prevent accidental reuse in future.
Comment thread src/grpc/server.ts
sessionId = req.session_id || ''

// Load previous messages from session store (cross-stream persistence)
if (sessionId && this.sessions.has(sessionId)) {
Collaborator


This still leaks conversation history across different session IDs on the same stream. On the current head, previousMessages is only replaced when the incoming session_id already exists in this.sessions; otherwise it keeps whatever was accumulated from the prior request on the stream. Direct repro with a mocked QueryEngine: send one request with session_id = 's1', let it persist one message, then send a second request on the same stream with a brand-new session_id = 's2'. The second QueryEngine is constructed with the first request's history in initialMessages instead of starting fresh. That breaks the published session contract because changing to a new session ID should not inherit another session's transcript.

Collaborator

@Vasanthdev2004 Vasanthdev2004 left a comment


Rechecked the latest head 98339c33ba3480b484bd30752e70ffaf6c9602c6 against current origin/main.

A lot of the earlier blocker feedback is addressed on this head:

  • bun.lock is updated and bun install --frozen-lockfile succeeds
  • the server now defaults to localhost
  • session_id and tool_use_id are wired into the proto/server path
  • startup bootstrap is much closer to the normal CLI path
  • disconnect cleanup now interrupts the engine and resolves pending permission prompts

I still can't approve it because there is one real session-contract bug left:

  1. src/grpc/server.ts leaks conversation history across different session IDs on the same stream.
    On the current head, previousMessages is only replaced when the incoming session_id already exists in this.sessions; otherwise it keeps whatever was accumulated from the prior request on the stream.

    Direct repro on this head with a mocked QueryEngine:

    • send one request with session_id = 's1'
    • let it persist one message
    • send a second request on the same stream with a brand-new session_id = 's2'
    • the second QueryEngine is constructed with the first request's history in initialMessages instead of starting fresh

    That breaks the published session contract because changing to a new session ID should not inherit another session's transcript.

Fresh verification on this head:

  • bun install --frozen-lockfile -> success
  • bun test src/commands/model/model.test.tsx src/utils/model/providers.test.ts -> 12 pass
  • bun run build -> success
  • bun run smoke -> success
  • direct mocked repro of the session leak above -> reproduced

I also checked basic startup separately: this Windows machine cannot bind even a minimal standalone @grpc/grpc-js server on localhost, so I am not using local bind failure as evidence against this PR. The blocker above is branch-specific and directly reproduced.

fix(grpc): reset previousMessages on each new request to prevent session history leak

previousMessages was declared at stream scope and only overwritten when
the incoming session_id already existed in the session store. A second
request on the same stream with a new session_id would silently inherit
the first request's conversation history in initialMessages instead of
starting fresh, violating the session contract.

Fix: reset previousMessages to [] at the start of each ChatRequest
before the session-store lookup.
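The fix reduces to resetting the per-request history before the session-store lookup, so a new session_id never inherits the previous request's transcript. The shapes here are illustrative:

```typescript
// Reset history on every ChatRequest, then resume only if the incoming
// session_id is actually known to the store. A stream-scoped variable that
// is merely *overwritten* on store hits is what caused the leak.
const sessionStore = new Map<string, string[]>();

function loadHistoryForRequest(sessionId: string): string[] {
  let previousMessages: string[] = []; // reset on every ChatRequest
  if (sessionId && sessionStore.has(sessionId)) {
    previousMessages = sessionStore.get(sessionId)!; // resume known session
  }
  return previousMessages;
}
```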
fix(grpc): reset interrupted flag between requests and guard against concurrent ChatRequest

Two stream-scoped state bugs found during proactive audit:

- The `interrupted` flag was never reset between requests on the same
  stream. If the first request was cancelled, all subsequent requests
  would silently skip the done message, causing the client to hang.

- A second ChatRequest arriving while the first was still processing
  would overwrite the engine reference, corrupting the lifecycle of
  both requests. Now returns ALREADY_EXISTS error instead. Engine is
  nulled after the for-await loop completes so subsequent requests
  can proceed normally.
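The two per-request resets described above can be sketched as a small state machine: reject an overlapping ChatRequest (gRPC's ALREADY_EXISTS is status code 6) and clear the interrupted flag so a cancelled request does not poison later ones. Names and shapes are illustrative:

```typescript
// Per-stream lifecycle guard: one in-flight ChatRequest at a time, with the
// interrupted flag scoped to the request rather than the stream.
const ALREADY_EXISTS = 6; // gRPC status code

interface StreamState {
  engineBusy: boolean;
  interrupted: boolean;
}

function beginChatRequest(state: StreamState): { code: number } | null {
  if (state.engineBusy) return { code: ALREADY_EXISTS }; // overlapping request
  state.engineBusy = true;
  state.interrupted = false; // reset per request, not per stream
  return null; // null = request accepted
}

function finishChatRequest(state: StreamState): void {
  state.engineBusy = false; // allow subsequent requests on this stream
}
```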
Collaborator

@Vasanthdev2004 Vasanthdev2004 left a comment


Rechecked the latest head 466d4cf33222f3e0529c8a1b9f85bbd9f77274a8 against current origin/main.

The previous session-contract blocker is fixed on this revision.

What I verified on this head:

  • direct mocked repro of the old leak is fixed:
    • first request on a stream with session_id = 's1'
    • second request on the same stream with a brand-new session_id = 's2'
    • QueryEngine initial messages are now [] then [], so the new session starts fresh instead of inheriting the prior transcript
  • direct mocked repro of the intended persistence path still works:
    • first stream stores history for session_id = 's1'
    • second stream with the same session_id = 's1' receives that history in initialMessages
  • direct mocked repro of normal sequential requests on a single stream still works:
    • first request completes with done
    • second request on the same stream also completes with its own done
  • direct mocked repro of overlapping requests returns the expected ALREADY_EXISTS error while allowing the first request to finish normally
  • bun install --frozen-lockfile -> success
  • bun test src/commands/model/model.test.tsx src/utils/model/providers.test.ts -> 12 pass
  • bun run build -> success
  • bun run smoke -> success

I didn't find a remaining branch-specific blocker on the current head.

Residual risk is still coverage rather than demonstrated breakage: there are no dedicated automated tests for src/grpc/server.ts or the session/lifecycle paths, so the behavior above is currently protected by direct repro rather than a checked-in test file.

@kevincodex1 kevincodex1 merged commit 26eef92 into Gitlawb:main Apr 6, 2026
1 check passed
euxaristia pushed a commit to euxaristia/openclaude that referenced this pull request Apr 13, 2026