Conversation

@Wangmerlyn (Owner) commented Dec 9, 2025

Summary by CodeRabbit

  • New Features

    • HTTP transport option for the MCP server (JSON‑RPC over HTTP) in addition to the existing stdio mode.
    • New remote-inspection commands to query status and list GPUs.
    • CLI option to select transport mode and configure HTTP host/port.
  • Documentation

    • Expanded MCP guide with HTTP examples, client config snippets, and remote deployment tips (SSH tunnel, reverse proxy).
    • Reorganized getting-started and contributor guidance; updated site navigation and README links.


coderabbitai bot commented Dec 9, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds an HTTP JSON‑RPC transport to the MCP server while retaining stdio mode; introduces CLI flags (--mode/--host/--port); extends KeepGPUServer with status() and list_gpus() and adds a quiet parameter to stop_keep(); updates docs with MCP HTTP usage, client examples, and remote-access guidance.

Changes

Documentation — README.md, docs/contributing.md, docs/getting-started.md, docs/guides/mcp.md, docs/index.md, mkdocs.yml
Adds MCP HTTP docs and examples, SSH-tunnel guidance, a minimal HTTP client YAML, reorganizes usage/getting-started content and navigation, and enables the tabbed Markdown extension.

MCP Server Implementation — src/keep_gpu/mcp/server.py
Adds the HTTP JSON‑RPC handler _JSONRPCHandler, the runners run_http() and run_stdio(), and a CLI main() with mode/host/port; introduces status(job_id=None) and list_gpus() methods and extends stop_keep(..., quiet=False); _handle_request() remains the central dispatcher.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant HTTP_Server as HTTP Handler
    participant KeepGPUServer as Server

    Client->>HTTP_Server: HTTP POST / with JSON‑RPC payload
    HTTP_Server->>KeepGPUServer: _handle_request(parsed request)
    KeepGPUServer->>KeepGPUServer: execute method (start_keep / stop_keep / status / list_gpus)
    KeepGPUServer-->>HTTP_Server: JSON‑RPC response (result or error)
    HTTP_Server-->>Client: HTTP 200 with JSON‑RPC body
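The request/response flow in the diagram can be exercised end to end with a toy handler built from the same stdlib pieces the PR uses (`BaseHTTPRequestHandler`, a `TCPServer` subclass with `allow_reuse_address`). The echo logic below is purely illustrative and is not the PR's actual dispatcher.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler
from socketserver import TCPServer


class _EchoRPCHandler(BaseHTTPRequestHandler):
    # Toy stand-in for the PR's _JSONRPCHandler: parse a JSON-RPC body
    # from a POST and answer with a JSON result (here, echoing the method).
    def do_POST(self):
        length = int(self.headers.get("content-length", "0"))
        payload = json.loads(self.rfile.read(length).decode())
        data = json.dumps({"result": {"echoed": payload.get("method")}}).encode()
        self.send_response(200)
        self.send_header("content-type", "application/json")
        self.send_header("content-length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, fmt, *args):
        return  # keep the demo quiet


class _Server(TCPServer):
    allow_reuse_address = True  # easier restarts, as in the PR


httpd = _Server(("127.0.0.1", 0), _EchoRPCHandler)  # port 0 = pick a free port
threading.Thread(target=httpd.serve_forever, daemon=True).start()

port = httpd.server_address[1]
req = urllib.request.Request(
    f"http://127.0.0.1:{port}/",
    data=json.dumps({"method": "list_gpus", "params": {}}).encode(),
    headers={"content-type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read().decode())
httpd.shutdown()
print(reply)  # {'result': {'echoed': 'list_gpus'}}
```

Any HTTP client that can POST JSON plays the Client role from the diagram; no MCP-specific tooling is required.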

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Inspect src/keep_gpu/mcp/server.py for JSON‑RPC compliance, HTTP handler correctness, threading and graceful shutdown.
  • Verify stop_keep(..., quiet=False) is backward compatible and status()/list_gpus() return shapes match docs.
  • Confirm CLI arg parsing defaults and examples in docs are accurate.

Possibly related PRs

Poem

🐇 I hop between queues and tiny threads,

I carry JSON calls to sleepy heads.
Stdio hums and HTTP sings,
GPUs dance when orchestration rings.
A carrot-coded cheer for new wings.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check — ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: The title accurately summarizes the main changes: adding HTTP server mode to MCP and documentation for remote usage.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, above the required 80.00% threshold.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 38fdd49 and c84f3eb.

📒 Files selected for processing (1)
  • mkdocs.yml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • mkdocs.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build
  • GitHub Check: Run pre-commit checks
  • GitHub Check: Build documentation


gemini-code-assist bot commented

Summary of Changes

Hello @Wangmerlyn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the keep-gpu-mcp-server by adding an HTTP server mode. This new capability allows for more flexible and remote management of GPU resources, moving beyond the previous stdin/stdout-only interaction. The changes include comprehensive updates to the documentation to ensure users can easily configure and utilize this new network-accessible interface, particularly for cluster and remote setups, while also refining server behavior for cleaner shutdowns.

Highlights

  • HTTP Server Mode: The MCP (Model Context Protocol) server now supports an HTTP mode, allowing it to expose its JSON-RPC interface over a network, in addition to the existing stdin/stdout method.
  • Remote Usage Documentation: Extensive documentation has been added across README.md, docs/contributing.md, and docs/getting-started.md to guide users on how to set up and utilize the HTTP server for remote and cluster environments, including examples for client configuration and SSH tunneling.
  • Server Refactoring: The keep-gpu-mcp-server script has been refactored to incorporate argparse for command-line argument parsing, enabling users to easily select between stdio and http modes, and configure HTTP host and port.
  • Improved Shutdown Logging: The stop_keep and shutdown methods in the server have been updated to allow for quieter operation, suppressing log messages during cleanup to avoid unnecessary output.
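The argparse refactor described above implies a small CLI surface. A sketch of what that parser likely looks like follows; the flag names (--mode/--host/--port) are from this PR, and the defaults (stdio, 127.0.0.1, 8765) are taken from the review comments below rather than verified against the source.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the keep-gpu-mcp-server CLI surface.
    parser = argparse.ArgumentParser(prog="keep-gpu-mcp-server")
    parser.add_argument("--mode", choices=["stdio", "http"], default="stdio",
                        help="transport: stdio (default) or http")
    parser.add_argument("--host", default="127.0.0.1",
                        help="HTTP bind address (http mode only)")
    parser.add_argument("--port", type=int, default=8765,
                        help="HTTP port (http mode only)")
    return parser


args = build_parser().parse_args([])  # no flags -> stdio, the old behavior
http_args = build_parser().parse_args(["--mode", "http", "--port", "9000"])
print(args.mode, http_args.host, http_args.port)  # stdio 127.0.0.1 9000
```

Defaulting `--mode` to stdio keeps existing MCP client configurations working unchanged, while the localhost default for `--host` avoids accidental network exposure.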

gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable feature by adding an HTTP server mode for the MCP endpoint, complementing the existing stdio mode. The implementation is straightforward, leveraging Python's standard library for the HTTP server. The documentation has been updated comprehensively across the README, getting started guide, and contributing guide to reflect these new capabilities, including examples for remote usage and security considerations like SSH tunneling.

I have a couple of suggestions for the implementation in src/keep_gpu/mcp/server.py to improve robustness and clarity. Additionally, a significant consideration for future development is that the KeepGPUServer class is not currently thread-safe. While the server uses a single-threaded TCPServer, any future switch to a multi-threaded server could introduce race conditions when accessing shared session data. I recommend making KeepGPUServer thread-safe by using a threading.Lock to protect access to the _sessions dictionary.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/keep_gpu/mcp/server.py (1)

149-171: HTTP handler implementation is functional, but address static analysis hints.

The HTTP request handler correctly implements JSON-RPC over POST. However, static analysis flags several issues:

  1. Lines 152, 169: The noqa directives are unused according to current linter configuration. Remove them unless you're explicitly suppressing these rules in your config.
  2. Line 159: Catching blind Exception is flagged but acceptable in an HTTP handler for robustness.
  3. Line 169: The unused parameters format and args are required by the log_message signature override, so this is a false positive from the linter.

Apply this diff to remove unused noqa directives:

-    def do_POST(self):  # noqa: N802
+    def do_POST(self):
         length = int(self.headers.get("content-length", "0"))
         body = self.rfile.read(length).decode()
         try:
             payload = json.loads(body)
             response = _handle_request(self.server.keepgpu_server, payload)  # type: ignore[attr-defined]
             status = 200
         except Exception as exc:  # pragma: no cover - defensive
             response = {"error": {"message": str(exc)}}
             status = 400
         data = json.dumps(response).encode()
         self.send_response(status)
         self.send_header("content-type", "application/json")
         self.send_header("content-length", str(len(data)))
         self.end_headers()
         self.wfile.write(data)
 
-    def log_message(self, format, *args):  # noqa: A003
+    def log_message(self, format, *args):
         return
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 1dd2b1f and 3551735.

📒 Files selected for processing (4)
  • README.md (2 hunks)
  • docs/contributing.md (1 hunks)
  • docs/getting-started.md (1 hunks)
  • src/keep_gpu/mcp/server.py (6 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
src/keep_gpu/mcp/server.py

152-152: Unused noqa directive (non-enabled: N802)

Remove unused noqa directive

(RUF100)


159-159: Do not catch blind exception: Exception

(BLE001)


169-169: Unused method argument: format

(ARG002)


169-169: Unused method argument: args

(ARG002)


169-169: Unused noqa directive (non-enabled: A003)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (11)
docs/contributing.md (2)

45-45: Verify security implications of the example binding.

The example uses --host 0.0.0.0, which exposes the server on all network interfaces. While this is necessary for remote access, ensure users understand the security implications. The guidance in lines 51-53 helps, but consider whether the default in the code (127.0.0.1) should be emphasized as the safe default for local-only usage.


51-53: LGTM! Good security guidance.

The remote usage tip appropriately warns about security and provides practical alternatives (auth/reverse-proxy or SSH tunnel). The SSH tunnel example is correct and actionable.
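For readers who have not seen the pattern, an SSH local-forward of the kind the docs recommend typically looks like the fragment below; user@gpu-host and port 8765 are placeholders, not values verified against this repository's docs.

```shell
# Forward local port 8765 to the MCP HTTP server bound to localhost
# on the GPU host; the MCP client then talks to http://127.0.0.1:8765/.
ssh -N -L 8765:127.0.0.1:8765 user@gpu-host
```

With the tunnel in place, the server can stay bound to 127.0.0.1 on the remote machine, so nothing is exposed on untrusted networks.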

docs/getting-started.md (1)

80-109: Documentation is clear and comprehensive.

The HTTP transport and remote usage sections provide good coverage of the new functionality. The examples are consistent with the implementation and other documentation files.

One structural note: this section appears between the MCP endpoint documentation and the "Editable dev install" section (line 110), which may disrupt the document flow. Consider moving it to maintain logical grouping.

README.md (3)

95-95: LGTM! Clear description of default mode.

The addition of "(default)" clarifies the transport mode hierarchy.


99-102: HTTP mode documentation is consistent.

The example matches the implementation and is consistent with other documentation files. Same security consideration as docs/contributing.md applies: --host 0.0.0.0 exposes on all interfaces.


115-134: LGTM! Comprehensive remote usage guidance.

The HTTP client configuration examples are clear, and the remote/SSH tunnel guidance appropriately emphasizes security considerations. The examples are actionable and consistent across documentation files.

src/keep_gpu/mcp/server.py (5)

1-10: LGTM! Docstring accurately updated.

The docstring correctly documents both transport modes and the complete method list.


18-21: LGTM! Appropriate standard library imports.

The new imports are all from the standard library and necessary for HTTP server functionality.


78-96: LGTM! Clean addition of quiet parameter.

The optional quiet parameter effectively suppresses logging during automatic shutdown (line 122) while maintaining normal logging for explicit calls. The implementation is clean and backward-compatible.


187-209: HTTP server implementation is functional but verify the threading approach.

The HTTP server correctly uses TCPServer with graceful shutdown. A few observations:

  1. Line 203: thread.join() on a daemon thread running serve_forever() will block indefinitely. This appears intentional—the only exit path is KeyboardInterrupt (line 204), which triggers the shutdown sequence. This is acceptable for a long-running service.

  2. Line 192: Dynamically attaching keepgpu_server to the httpd instance works but isn't type-safe. This is acknowledged with the type: ignore comment.

  3. Security: The default host="127.0.0.1" is safe (localhost only), but documentation examples frequently use 0.0.0.0 with security warnings. Consider emphasizing the safe default more prominently in docs.

Verify that the threading approach behaves as expected:

  • Confirm daemon threads clean up properly during shutdown
  • Test that KeyboardInterrupt is handled gracefully in various environments

212-228: LGTM! Clean CLI entry point with safe defaults.

The main function correctly:

  • Provides sensible defaults (stdio mode, localhost binding, standard port)
  • Routes to appropriate transport based on mode
  • Uses standard argparse for CLI argument handling

The default host="127.0.0.1" is a safe choice that prevents accidental network exposure.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/keep_gpu/mcp/server.py (3)

78-80: quiet flag integration and shutdown behavior look good; consider documenting quiet explicitly.

The quiet: bool = False parameter preserves existing behavior by default, and using quiet=True in shutdown() avoids noisy logs during interpreter teardown while still releasing sessions. As a small follow‑up, you might want to mention the quiet option in the “Supported methods” header or a docstring so Python callers know it exists (JSON‑RPC clients can still ignore it).

Also applies to: 85-86, 94-96, 120-125


98-118: status / list_gpus shapes are consistent and client‑friendly.

Per‑job vs “all jobs” responses have stable keys, you don’t raise on missing job_id (nice for polling clients), and list_gpus() cleanly wraps get_gpu_info() under a "gpus" key. If you ever want symmetry with stop_keep, you could optionally add a "message" on the inactive status case, but the current contract is already reasonable.
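To illustrate why stable, non-raising shapes are friendly for polling clients, here is a small consumer sketch. The field names beyond job_id and the "gpus"/"jobs" wrappers are assumptions for illustration, not the verified response schema.

```python
# Hypothetical response payloads, shaped per the review's description:
# a per-job status, an all-jobs status, and list_gpus wrapping under "gpus".
single = {"job_id": "abc123", "active": True}
all_jobs = {"jobs": [{"job_id": "abc123", "active": True}]}
gpus = {"gpus": [{"index": 0, "utilization": 12, "memory_used_mb": 512}]}


def is_active(status_payload, job_id):
    # Poll-friendly helper: a missing job simply reads as inactive rather
    # than raising, matching the "don't raise on missing job_id" note above.
    if status_payload.get("job_id") == job_id:
        return bool(status_payload.get("active"))
    return any(
        j.get("job_id") == job_id and j.get("active")
        for j in status_payload.get("jobs", [])
    )


print(is_active(single, "abc123"), is_active(all_jobs, "missing"))  # True False
```

Because absent jobs map to False instead of an exception, a polling loop needs no try/except around every status call.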


149-171: HTTP JSON‑RPC handler is solid; minor lint/clarity tweaks available.

The narrowed exception list in do_POST makes 400s clearly client‑side (decode/length issues) while delegating real server errors to _handle_request, which is great. Ruff’s hints about unused noqa and arguments suggest a tiny cleanup:

-class _JSONRPCHandler(BaseHTTPRequestHandler):
-    server_version = "KeepGPU-MCP/0.1"
-
-    def do_POST(self):  # noqa: N802
+class _JSONRPCHandler(BaseHTTPRequestHandler):
+    server_version = "KeepGPU-MCP/0.1"
+
+    def do_POST(self) -> None:
         try:
             length = int(self.headers.get("content-length", "0"))
             body = self.rfile.read(length).decode("utf-8")
@@
-        self.wfile.write(data)
-
-    def log_message(self, format, *args):  # noqa: A003
-        return
+        self.wfile.write(data)
+
+    def log_message(self, _format, *_args) -> None:
+        return
  • Dropping # noqa: N802/A003 addresses the RUF100 “unused noqa” warnings.
  • Renaming to _format, *_args keeps the override signature compatible while making it explicit that the arguments are intentionally unused and avoiding ARG002.

You may also optionally log 400‑level “bad request” errors at debug or info level if you want easier diagnosis of malformed client input.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 3551735 and 9e4b35b.

📒 Files selected for processing (1)
  • src/keep_gpu/mcp/server.py (6 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
src/keep_gpu/mcp/server.py

152-152: Unused noqa directive (non-enabled: N802)

Remove unused noqa directive

(RUF100)


169-169: Unused method argument: format

(ARG002)


169-169: Unused method argument: args

(ARG002)


169-169: Unused noqa directive (non-enabled: A003)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (6)
src/keep_gpu/mcp/server.py (6)

4-9: Docstring now matches the expanded API surface.

The header comment correctly describes both transports and all four JSON‑RPC methods (start_keep, stop_keep, status, list_gpus); no changes needed here.


18-21: New stdlib imports are appropriate and all used.

argparse, threading, BaseHTTPRequestHandler, and TCPServer are each exercised in the new CLI/HTTP paths with no unused imports or layering concerns.


128-146: New methods are correctly wired through _handle_request.

The dispatcher now routes status and list_gpus alongside the existing methods, and params = payload.get("params", {}) or {} gives a safe empty‑dict default even when JSON‑RPC sends null. Centralized error handling keeps both stdio and HTTP transports consistent.


173-185: run_stdio helper cleanly preserves the existing stdio contract.

The loop still processes one JSON‑RPC object per line, uses _handle_request for routing, and flushes responses promptly; extracting it into a named function for reuse from main() is a nice structural improvement.
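The one-object-per-line contract described above can be demonstrated with in-memory streams; the loop below is a minimal sketch of that contract, not the PR's run_stdio implementation.

```python
import io
import json


def run_stdio_loop(handle, stdin, stdout):
    # One JSON-RPC object per line; any parse or dispatch failure becomes
    # a JSON error object so the client always gets one line back per line sent.
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            response = handle(json.loads(line))
        except Exception as exc:
            response = {"error": {"message": str(exc)}}
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()  # flush promptly so the client never blocks on buffering


# Demo with in-memory streams and a trivial handler: one good line, one bad.
out = io.StringIO()
run_stdio_loop(
    lambda payload: {"result": payload["method"]},
    io.StringIO('{"method": "status"}\nnot-json\n'),
    out,
)
print(out.getvalue())
```

The malformed second line yields an error object on its own line instead of crashing the loop, which is what keeps the transport predictable for MCP-style clients.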


187-210: run_http correctly starts and tears down the HTTP server.

Using a small _Server subclass with allow_reuse_address = True makes quick restarts easier, and running serve_forever() in a background thread with shutdown/close logic in finally gives a clean path on KeyboardInterrupt while still blocking the main thread via join() for CLI usage. Overall, the lifecycle and cleanup of both the HTTP server and KeepGPUServer look correct.


212-229: CLI entry point is straightforward and backward‑compatible.

--mode defaults to stdio, preserving previous behavior, while --host/--port are clearly documented as HTTP‑mode options; delegating directly to run_stdio/run_http keeps main() thin and easy to extend later.

coderabbitai bot commented Dec 9, 2025

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #61

coderabbitai bot added a commit that referenced this pull request Dec 9, 2025
Docstrings generation was requested by @Wangmerlyn.

* #60 (comment)

The following files were modified:

* `src/keep_gpu/mcp/server.py`
coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/guides/mcp.md (1)

8-10: Revise sentence structure to avoid repetitive opening.

Lines 8–10 all begin with "You," which reduces readability. Vary the sentence structure for better flow.

Consider this revision:

 - You run KeepGPU from an agent (LangChain, custom orchestrator, etc.) instead of a shell.
-- You want to keep GPUs alive on a remote box over TCP rather than stdio.
-- You need a quick way to list GPU utilization/memory via the same interface.
+- GPUs need to stay alive on a remote box over TCP rather than stdio.
+- Quick GPU utilization/memory inspection is required via the same interface.

(Or similar rewording that diversifies the sentence openings.)

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 9e4b35b and a4a524e.

📒 Files selected for processing (4)
  • docs/getting-started.md (4 hunks)
  • docs/guides/mcp.md (1 hunks)
  • docs/index.md (1 hunks)
  • mkdocs.yml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/getting-started.md
🧰 Additional context used
🪛 LanguageTool
docs/guides/mcp.md

[style] ~10-~10: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...emote box over TCP rather than stdio. - You need a quick way to list GPU utilizatio...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🔇 Additional comments (3)
mkdocs.yml (1)

14-26: Navigation structure looks good.

The rename from "Guides" to "Usage" improves clarity, and the new MCP Server entry is appropriately positioned alongside CLI and Python API documentation.

docs/index.md (1)

35-43: Documentation updates align well with navigation changes.

The new MCP Server link and updated sidebar flow description are consistent with the mkdocs.yml restructuring. The description accurately conveys the purpose of the MCP guide.

docs/guides/mcp.md (1)

12-85: Documentation is comprehensive and well-structured.

The guide effectively covers both stdio and HTTP usage patterns, includes practical examples, client configurations, and remote deployment guidance. The tabbed interface makes it easy to switch between transport modes. Response examples help developers understand expected outputs.

coderabbitai bot and others added 2 commits December 9, 2025 16:56
* 📝 Add docstrings to `mcp-http-transport`

Docstrings generation was requested by @Wangmerlyn.

* #60 (comment)

The following files were modified:

* `src/keep_gpu/mcp/server.py`

* chore: format mcp server docstrings

* style: shorten mcp server docstrings for hooks

* docs: restore detailed mcp server docstrings

* docs: rephrase mcp guide for docstyle hook

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Wang Siyuan <wsy0227@sjtu.edu.cn>
coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/keep_gpu/mcp/server.py (2)

150-156: Explicitly justify or narrow the broad Exception catch in shutdown

Catching a blanket Exception is reasonable here given interpreter teardown, but it does trip BLE001 and can hide unexpected bugs.

Two options that keep behavior while making intent explicit:

-    def shutdown(self) -> None:
-        """Stop all sessions quietly; ignore errors during interpreter teardown."""
-        try:
-            self.stop_keep(None, quiet=True)
-        except Exception:  # pragma: no cover - defensive
-            # Avoid noisy errors during interpreter teardown
-            return
+    def shutdown(self) -> None:
+        """Stop all sessions quietly; ignore errors during interpreter teardown."""
+        try:
+            self.stop_keep(None, quiet=True)
+        except Exception:  # noqa: BLE001
+            # Avoid noisy errors during interpreter teardown; best‑effort only.
+            return

If you prefer not to use noqa, consider narrowing this to the specific exceptions you expect from stop_keep and its release() calls.


190-219: Tidy up lint issues in HTTP handler and make intent explicit

Ruff hints about unused noqa directives and unused log_message arguments can be resolved with small, behavior‑preserving tweaks:

-class _JSONRPCHandler(BaseHTTPRequestHandler):
+class _JSONRPCHandler(BaseHTTPRequestHandler):
     server_version = "KeepGPU-MCP/0.1"
 
-    def do_POST(self):  # noqa: N802
+    def do_POST(self):
         """
         Handle an HTTP JSON-RPC request and write a JSON response.
@@
-    def log_message(self, format, *args):  # noqa: A003
-        """Suppress default request logging."""
-        return
+    def log_message(self, format, *args):
+        """Suppress default request logging from BaseHTTPRequestHandler."""
+        # Mark arguments as used to satisfy linters; we intentionally drop logs.
+        del format, args
+        return

This:

  • Removes now‑unnecessary # noqa comments flagged by RUF100.
  • Silences ARG002 without changing the override semantics.
  • Keeps the improved error handling in do_POST intact.

If you expect untrusted clients, you may also want to consider enforcing a maximum content-length to avoid reading arbitrarily large request bodies; that can be added later without changing the API.

docs/guides/mcp.md (1)

8-11: Optional: vary wording in the “When to use this” bullets

Minor style nit: all three bullets start with “You…”. If you care about polishing the prose, consider rephrasing one or two (e.g., “Run KeepGPU from an agent…”, “Keep GPUs alive on a remote box over TCP…”) to address the LanguageTool hint. Content itself is clear.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between a4a524e and 88d56c8.

📒 Files selected for processing (2)
  • docs/guides/mcp.md (1 hunks)
  • src/keep_gpu/mcp/server.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/keep_gpu/mcp/server.py (1)
tests/mcp/test_server.py (1)
  • release (16-17)
🪛 LanguageTool
docs/guides/mcp.md

[style] ~10-~10: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...emote box over TCP rather than stdio. - You need a quick way to list GPU utilizatio...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 Ruff (0.14.8)
src/keep_gpu/mcp/server.py

154-154: Do not catch blind exception: Exception

(BLE001)


193-193: Unused noqa directive (non-enabled: N802)

Remove unused noqa directive

(RUF100)


217-217: Unused method argument: format

(ARG002)


217-217: Unused method argument: args

(ARG002)


217-217: Unused noqa directive (non-enabled: A003)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (9)
src/keep_gpu/mcp/server.py (6)

55-70: Docstring for start_keep is clear and matches behavior

The parameter/return/exception docs align with the implementation and make the JSON‑RPC surface easy to consume. No changes needed here.


94-126: stop_keep quiet flag wiring and responses look consistent

The quiet flag is correctly honored in both single‑job and “stop all” paths, and the returned shapes ({"stopped": [...]} plus optional "message") are stable for JSON‑RPC clients. This is a clean extension of the existing API.


160-187: Central JSON‑RPC dispatcher looks solid

_handle_request cleanly routes methods, normalizes params to a dict, and wraps server failures into a structured "error" while logging the exception. The shapes returned here are consistent with the docs and examples.


222-235: Stdio mode behavior matches the JSON‑RPC contract

run_stdio’s “one JSON object per line” contract, plus mapping any parsing/dispatch failure to a JSON error response, is simple and predictable for MCP‑style clients. For now the broad except is acceptable given it’s a CLI transport and _handle_request already logs server‑side exceptions.


237-263: HTTP server thread lifecycle and shutdown sequencing are reasonable

Running serve_forever() in a dedicated thread, joining in the main thread, and then calling httpd.shutdown()/server.shutdown() in finally gives a sensible Ctrl‑C story for the CLI and ensures controllers are released on exit. Dropping daemon=True (per the earlier review) removes ambiguity about thread lifetime without changing behavior.


265-283: CLI entry point aligns with docs and keeps defaults safe

The main() function exposes --mode, --host, and --port with sane defaults (stdio / localhost / 8765) and matches the examples in docs/guides/mcp.md. That keeps the stdio use case frictionless while making HTTP explicit.

docs/guides/mcp.md (3)

14-37: Method list and quick‑start examples accurately reflect the server API

The stdio/HTTP examples, method signatures (start_keep, stop_keep, status, list_gpus), and parameter semantics (optional job_id to target/stop vs list all) all line up with KeepGPUServer and _handle_request. This doc should be immediately usable for MCP clients.


57-76: Remote/cluster guidance and security note are on point

The section on running HTTP mode on a GPU host, pointing clients at it, and preferring SSH tunnels or a reverse proxy on untrusted networks matches the server’s lack of built‑in auth/TLS and sets good expectations for production use.


79-85: Response examples match the JSON‑RPC shapes returned by the server

The sample result structures for start_keep, stop_keep, status, status (all jobs), and list_gpus are consistent with KeepGPUServer’s return values. They deliberately show a subset of fields in params/GPU info, which is fine as long as readers understand real responses may contain more keys.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/guides/mcp.md (1)

8-10: Reduce repetition in bullet point phrasing.

All three bullet points begin with "You," which is a stylistic repetition. Vary the phrasing for better readability—for example, use imperatives ("Run KeepGPU…", "Keep GPUs alive…") or restructure some as outcomes.

Consider rephrasing as:

- You run KeepGPU from an agent (LangChain, custom orchestrator, etc.) instead of a shell.
- You want to keep GPUs alive on a remote box over TCP rather than stdio.
- You need a quick way to list GPU utilization/memory by way of the same interface.
+ Run KeepGPU from an agent (LangChain, custom orchestrator, etc.) instead of a shell.
+ Keep GPUs alive on a remote box over TCP rather than stdio.
+ List GPU utilization/memory by way of the same interface.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 88d56c8 and 832f7c8.

📒 Files selected for processing (1)
  • docs/guides/mcp.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/guides/mcp.md

[style] ~10-~10: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...emote box over TCP rather than stdio. - You need a quick way to list GPU utilizatio...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.18.1)
docs/guides/mcp.md

67-67: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)


71-71: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)


78-78: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)


86-86: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Run pre-commit checks
  • GitHub Check: Build documentation
  • GitHub Check: build
🔇 Additional comments (4)
docs/guides/mcp.md (4)

14-43: Well-structured quick-start examples.

The stdio and HTTP mode examples are clear and practical. Method signatures with optional parameter hints (using ?) effectively communicate the API contract.


45-62: Clear and practical MCP client configuration examples.

Both stdio and HTTP adapter configurations are accurate and follow MCP conventions. The contrast between command-based and URL-based adapters is well-illustrated.
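A side-by-side sketch of the two adapter styles, following the key names used in the doc's examples (whether a given MCP client accepts exactly these keys depends on the client):

```yaml
servers:
  keepgpu-local:            # stdio: the client spawns the process itself
    command: keep-gpu-mcp-server
    adapter: stdio
  keepgpu-remote:           # http: the client connects to a running server
    url: http://localhost:8765/
    adapter: http
```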


67-82: Address markdown code block style inconsistency.

Per markdownlint rules, code blocks should use indented style (four spaces) rather than fenced backticks. Lines 67, 71, and 78 use fenced code blocks; they should be indented to match the linter configuration.

Additionally, the security guidance (SSH tunneling, reverse-proxy recommendation) is excellent and well-placed.

Verify whether your project's markdownlint configuration enforces indented code blocks. If that is a project-wide standard, update these fenced code blocks to use indentation instead:

- Run on the GPU host:
-   ```bash
-   keep-gpu-mcp-server --mode http --host 0.0.0.0 --port 8765
-   ```
+ Run on the GPU host:
+     keep-gpu-mcp-server --mode http --host 0.0.0.0 --port 8765

If fenced code blocks are acceptable, confirm that the linter configuration should permit them.


84-92: Apply consistent markdown code block style to responses section.

The JSON response examples are helpful and clearly demonstrate the expected output for each MCP method. However, the code block at line 86 uses fenced backticks instead of the indented style expected by your markdownlint configuration.

Update the code block to use indentation (four spaces) instead of fenced backticks to resolve the MD046 linting issue.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
README.md (1)

122-134: SSH tunnel guidance is technically sound but could be more explicit about client-side usage.

The SSH tunnel command syntax is correct, but the documentation could clarify that after running the tunnel command locally, the client connects to localhost:8765 on the client machine (which tunnels to the remote server). Additionally, consider being more explicit that --host 0.0.0.0 makes the server network-accessible and should only be used with appropriate trust boundaries or behind auth/proxy.

Consider updating the guidance to include a step-by-step SSH tunnel workflow:

- Remote/SSH tunnel example (HTTP):

  ```bash
  keep-gpu-mcp-server --mode http --host 0.0.0.0 --port 8765
  ```

On your local client machine, establish an SSH tunnel to the remote server:

```bash
ssh -L 8765:localhost:8765 user@gpu-box
```

Then configure your MCP client to connect to the tunneled endpoint:

```yaml
servers:
  keepgpu:
    url: http://localhost:8765/
    adapter: http
```

Security Note: The --host 0.0.0.0 option makes the server accessible on all network interfaces. For untrusted networks, either run the server behind an authentication layer/reverse-proxy or use the SSH tunnel pattern above to restrict access.



📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 832f7c8 and 38fdd49.

📒 Files selected for processing (2)
  • README.md (2 hunks)
  • docs/contributing.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/contributing.md
🔇 Additional comments (4)
README.md (4)

95-102: Clear documentation of both stdio and HTTP modes.

The examples are well-formatted and the progression from default stdin/stdout behavior to explicit HTTP exposure is intuitive.


115-121: HTTP client config example is clear and well-formatted.

The YAML structure correctly shows the HTTP adapter option with the appropriate localhost URL reference.


139-139: The contributing documentation link is valid.

The file docs/contributing.md exists and contains proper documentation, confirming the markdown hyperlink format in README.md is correctly implemented.


107-107: [rewritten comment]
[classification tag]


@Wangmerlyn Wangmerlyn merged commit e995d3f into main Dec 9, 2025
5 checks passed