[mcp] feat: add HTTP server mode and remote usage docs #60
Conversation
**Note**: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

**Walkthrough**

Adds an HTTP JSON-RPC transport to the MCP server while retaining stdio mode; introduces CLI flags (`--mode`/`--host`/`--port`); extends `KeepGPUServer` with `status()` and `list_gpus()` and adds a `quiet` parameter to `stop_keep()`; updates docs with MCP HTTP usage, client examples, and remote-access guidance.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant HTTP_Server as HTTP Handler
    participant KeepGPUServer as Server
    Client->>HTTP_Server: HTTP POST / with JSON-RPC payload
    HTTP_Server->>KeepGPUServer: _handle_request(parsed request)
    KeepGPUServer->>KeepGPUServer: execute method (start_keep / stop_keep / status / list_gpus)
    KeepGPUServer-->>HTTP_Server: JSON-RPC response (result or error)
    HTTP_Server-->>Client: HTTP 200 with JSON-RPC body
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)

✨ Finishing touches: 🧪 Generate unit tests (beta)

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
Summary of Changes

Hello @Wangmerlyn, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant enhancement to the MCP server.

Highlights
Code Review
This pull request introduces a valuable feature by adding an HTTP server mode for the MCP endpoint, complementing the existing stdio mode. The implementation is straightforward, leveraging Python's standard library for the HTTP server. The documentation has been updated comprehensively across the README, getting started guide, and contributing guide to reflect these new capabilities, including examples for remote usage and security considerations like SSH tunneling.

I have a couple of suggestions for the implementation in `src/keep_gpu/mcp/server.py` to improve robustness and clarity. Additionally, a significant consideration for future development is that the `KeepGPUServer` class is not currently thread-safe. While the server uses a single-threaded `TCPServer`, any future switch to a multi-threaded server could introduce race conditions when accessing shared session data. I recommend making `KeepGPUServer` thread-safe by using a `threading.Lock` to protect access to the `_sessions` dictionary.
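The `threading.Lock` suggestion above can be sketched as follows. `LockedSessions` and its methods are hypothetical stand-ins for the PR's `KeepGPUServer`, shown only to illustrate the locking pattern, not the actual implementation.

```python
import threading
from typing import Dict, List, Optional


class LockedSessions:
    """Sketch: guard the shared _sessions dict with a lock so a future
    multi-threaded server stays race-free."""

    def __init__(self) -> None:
        self._sessions: Dict[str, object] = {}
        self._lock = threading.Lock()

    def register(self, job_id: str, controller: object) -> None:
        with self._lock:
            self._sessions[job_id] = controller

    def stop_keep(self, job_id: Optional[str] = None) -> List[str]:
        # Pop entries under the lock; if controller release() can block,
        # call it outside the critical section instead.
        with self._lock:
            ids = [job_id] if job_id is not None else list(self._sessions)
            return [i for i in ids if self._sessions.pop(i, None) is not None]
```

With this shape, switching `TCPServer` to `ThreadingTCPServer` later would not introduce races on the session map.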
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/keep_gpu/mcp/server.py (1)
149-171: HTTP handler implementation is functional, but address static analysis hints.

The HTTP request handler correctly implements JSON-RPC over POST. However, static analysis flags several issues:

- Lines 152, 169: The `noqa` directives are unused according to the current linter configuration. Remove them unless you're explicitly suppressing these rules in your config.
- Line 159: Catching blind `Exception` is flagged but acceptable in an HTTP handler for robustness.
- Line 169: The unused parameters `format` and `args` are required by the `log_message` signature override, so this is a false positive from the linter.

Apply this diff to remove the unused noqa directives:
```diff
-    def do_POST(self):  # noqa: N802
+    def do_POST(self):
         length = int(self.headers.get("content-length", "0"))
         body = self.rfile.read(length).decode()
         try:
             payload = json.loads(body)
             response = _handle_request(self.server.keepgpu_server, payload)  # type: ignore[attr-defined]
             status = 200
         except Exception as exc:  # pragma: no cover - defensive
             response = {"error": {"message": str(exc)}}
             status = 400
         data = json.dumps(response).encode()
         self.send_response(status)
         self.send_header("content-type", "application/json")
         self.send_header("content-length", str(len(data)))
         self.end_headers()
         self.wfile.write(data)

-    def log_message(self, format, *args):  # noqa: A003
+    def log_message(self, format, *args):
         return
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- `README.md` (2 hunks)
- `docs/contributing.md` (1 hunks)
- `docs/getting-started.md` (1 hunks)
- `src/keep_gpu/mcp/server.py` (6 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
src/keep_gpu/mcp/server.py
152-152: Unused noqa directive (non-enabled: N802)
Remove unused noqa directive
(RUF100)
159-159: Do not catch blind exception: Exception
(BLE001)
169-169: Unused method argument: format
(ARG002)
169-169: Unused method argument: args
(ARG002)
169-169: Unused noqa directive (non-enabled: A003)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build
- GitHub Check: Build documentation
- GitHub Check: Run pre-commit checks
🔇 Additional comments (11)
docs/contributing.md (2)
45-45: Verify security implications of the example binding.

The example uses `--host 0.0.0.0`, which exposes the server on all network interfaces. While this is necessary for remote access, ensure users understand the security implications. The guidance in lines 51-53 helps, but consider whether the default in the code (`127.0.0.1`) should be emphasized as the safe default for local-only usage.
51-53: LGTM! Good security guidance.

The remote usage tip appropriately warns about security and provides practical alternatives (auth/reverse-proxy or SSH tunnel). The SSH tunnel example is correct and actionable.
docs/getting-started.md (1)
80-109: Documentation is clear and comprehensive.

The HTTP transport and remote usage sections provide good coverage of the new functionality. The examples are consistent with the implementation and other documentation files.
One structural note: this section appears between the MCP endpoint documentation and the "Editable dev install" section (line 110), which may disrupt the document flow. Consider moving it to maintain logical grouping.
README.md (3)
95-95: LGTM! Clear description of default mode.

The addition of "(default)" clarifies the transport mode hierarchy.
99-102: HTTP mode documentation is consistent.

The example matches the implementation and is consistent with the other documentation files. The same security consideration as docs/contributing.md applies: `--host 0.0.0.0` exposes the server on all interfaces.
115-134: LGTM! Comprehensive remote usage guidance.

The HTTP client configuration examples are clear, and the remote/SSH tunnel guidance appropriately emphasizes security considerations. The examples are actionable and consistent across documentation files.
src/keep_gpu/mcp/server.py (5)
1-10: LGTM! Docstring accurately updated.

The docstring correctly documents both transport modes and the complete method list.

18-21: LGTM! Appropriate standard library imports.

The new imports are all from the standard library and necessary for HTTP server functionality.
78-96: LGTM! Clean addition of `quiet` parameter.

The optional `quiet` parameter effectively suppresses logging during automatic shutdown (line 122) while maintaining normal logging for explicit calls. The implementation is clean and backward-compatible.
187-209: HTTP server implementation is functional but verify the threading approach.

The HTTP server correctly uses `TCPServer` with graceful shutdown. A few observations:

- Line 203: `thread.join()` on a daemon thread running `serve_forever()` will block indefinitely. This appears intentional: the only exit path is `KeyboardInterrupt` (line 204), which triggers the shutdown sequence. This is acceptable for a long-running service.
- Line 192: Dynamically attaching `keepgpu_server` to the httpd instance works but isn't type-safe. This is acknowledged with the `type: ignore` comment.
- Security: The default `host="127.0.0.1"` is safe (localhost only), but documentation examples frequently use `0.0.0.0` with security warnings. Consider emphasizing the safe default more prominently in the docs.

Verify that the threading approach behaves as expected:
- Confirm daemon threads clean up properly during shutdown
- Test that KeyboardInterrupt is handled gracefully in various environments
212-228: LGTM! Clean CLI entry point with safe defaults.

The main function correctly:
- Provides sensible defaults (stdio mode, localhost binding, standard port)
- Routes to appropriate transport based on mode
- Uses standard argparse for CLI argument handling
The default `host="127.0.0.1"` is a safe choice that prevents accidental network exposure.
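The CLI surface described in this comment can be sketched with `argparse`. This is an illustrative stand-in for the PR's `main()`, using only the flags and defaults the review describes (stdio mode, localhost binding, port 8765).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of the keep-gpu-mcp-server CLI: stdio by default, HTTP opt-in."""
    parser = argparse.ArgumentParser(prog="keep-gpu-mcp-server")
    parser.add_argument(
        "--mode", choices=("stdio", "http"), default="stdio",
        help="transport mode; stdio preserves the original behavior",
    )
    parser.add_argument(
        "--host", default="127.0.0.1",
        help="HTTP bind address; the localhost default avoids accidental exposure",
    )
    parser.add_argument("--port", type=int, default=8765, help="HTTP port")
    return parser
```

Keeping `127.0.0.1` as the default means exposure on all interfaces requires an explicit `--host 0.0.0.0`, which matches the security guidance in the docs.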
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/keep_gpu/mcp/server.py (3)
78-80: `quiet` flag integration and shutdown behavior look good; consider documenting `quiet` explicitly.

The `quiet: bool = False` parameter preserves existing behavior by default, and using `quiet=True` in `shutdown()` avoids noisy logs during interpreter teardown while still releasing sessions. As a small follow-up, you might want to mention the `quiet` option in the "Supported methods" header or a docstring so Python callers know it exists (JSON-RPC clients can still ignore it).

Also applies to: 85-86, 94-96, 120-125
98-118: `status`/`list_gpus` shapes are consistent and client-friendly.

Per-job vs "all jobs" responses have stable keys, you don't raise on a missing `job_id` (nice for polling clients), and `list_gpus()` cleanly wraps `get_gpu_info()` under a `"gpus"` key. If you ever want symmetry with `stop_keep`, you could optionally add a `"message"` on the inactive `status` case, but the current contract is already reasonable.
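The "don't raise on missing `job_id`" contract described above can be illustrated with a small sketch. The function and field names here are assumptions for illustration, not the PR's exact schema.

```python
from typing import Dict, Optional


def job_status(sessions: Dict[str, dict], job_id: Optional[str] = None) -> dict:
    """Stable-keyed status responses that never raise on an unknown job_id,
    so polling clients can query freely without error handling."""
    if job_id is not None:
        params = sessions.get(job_id)
        return {"job_id": job_id, "active": params is not None, "params": params or {}}
    return {
        "jobs": [
            {"job_id": jid, "active": True, "params": params}
            for jid, params in sessions.items()
        ]
    }
```

A client can then poll `job_status(..., "maybe-gone")` in a loop and simply check the `"active"` flag.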
149-171: HTTP JSON-RPC handler is solid; minor lint/clarity tweaks available.

The narrowed exception list in `do_POST` makes 400s clearly client-side (decode/length issues) while delegating real server errors to `_handle_request`, which is great. Ruff's hints about unused `noqa` directives and arguments suggest a tiny cleanup:

```diff
-class _JSONRPCHandler(BaseHTTPRequestHandler):
-    server_version = "KeepGPU-MCP/0.1"
-
-    def do_POST(self):  # noqa: N802
+class _JSONRPCHandler(BaseHTTPRequestHandler):
+    server_version = "KeepGPU-MCP/0.1"
+
+    def do_POST(self) -> None:
         try:
             length = int(self.headers.get("content-length", "0"))
             body = self.rfile.read(length).decode("utf-8")
@@
-        self.wfile.write(data)
-
-    def log_message(self, format, *args):  # noqa: A003
-        return
+        self.wfile.write(data)
+
+    def log_message(self, _format, *_args) -> None:
+        return
```
- Dropping `# noqa: N802`/`A003` addresses the RUF100 "unused noqa" warnings.
- Renaming to `_format`, `*_args` keeps the override signature compatible while making it explicit that the arguments are intentionally unused, avoiding ARG002.

You may also optionally log 400-level "bad request" errors at debug or info level if you want easier diagnosis of malformed client input.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `src/keep_gpu/mcp/server.py` (6 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
src/keep_gpu/mcp/server.py
152-152: Unused noqa directive (non-enabled: N802)
Remove unused noqa directive
(RUF100)
169-169: Unused method argument: format
(ARG002)
169-169: Unused method argument: args
(ARG002)
169-169: Unused noqa directive (non-enabled: A003)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build documentation
- GitHub Check: Run pre-commit checks
- GitHub Check: build
🔇 Additional comments (6)
src/keep_gpu/mcp/server.py (6)
4-9: Docstring now matches the expanded API surface.

The header comment correctly describes both transports and all four JSON-RPC methods (`start_keep`, `stop_keep`, `status`, `list_gpus`); no changes needed here.
18-21: New stdlib imports are appropriate and all used.

`argparse`, `threading`, `BaseHTTPRequestHandler`, and `TCPServer` are each exercised in the new CLI/HTTP paths with no unused imports or layering concerns.
128-146: New methods are correctly wired through `_handle_request`.

The dispatcher now routes `status` and `list_gpus` alongside the existing methods, and `params = payload.get("params", {}) or {}` gives a safe empty-dict default even when JSON-RPC sends `null`. Centralized error handling keeps both stdio and HTTP transports consistent.
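The dispatch pattern described above can be sketched as follows. `handle_request` is a hypothetical stand-in for the PR's `_handle_request`; the `params` normalization line is quoted from the review, the rest is illustrative.

```python
import logging

logger = logging.getLogger(__name__)


def handle_request(server: object, payload: dict) -> dict:
    """Resolve the JSON-RPC method by name, normalize null params to {},
    and convert any server-side failure into a structured error response."""
    method = str(payload.get("method"))
    params = payload.get("params", {}) or {}  # JSON-RPC clients may send null
    handler = getattr(server, method, None) if not method.startswith("_") else None
    if handler is None:
        return {"error": {"message": f"unknown method: {method}"}}
    try:
        return {"result": handler(**params)}
    except Exception as exc:
        logger.exception("method %s failed", method)
        return {"error": {"message": str(exc)}}
```

Because both transports funnel through one dispatcher like this, stdio and HTTP clients see identical result/error shapes.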
173-185: `run_stdio` helper cleanly preserves the existing stdio contract.

The loop still processes one JSON-RPC object per line, uses `_handle_request` for routing, and flushes responses promptly; extracting it into a named function for reuse from `main()` is a nice structural improvement.
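The one-object-per-line stdio contract can be sketched like this. The function signature (taking a `dispatch` callable and explicit streams) is an assumption made so the sketch is testable; the PR's `run_stdio` presumably reads `sys.stdin` directly.

```python
import json
import sys
from typing import Callable, TextIO


def run_stdio(dispatch: Callable[[dict], dict],
              stdin: TextIO = sys.stdin, stdout: TextIO = sys.stdout) -> None:
    """Read one JSON-RPC object per line, write one JSON response per line,
    flushing after each so a blocking client sees the reply immediately."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            response = dispatch(json.loads(line))
        except Exception as exc:
            response = {"error": {"message": str(exc)}}
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()
```

Malformed input yields a JSON error line rather than killing the loop, which matches the behavior the review describes.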
187-210: `run_http` correctly starts and tears down the HTTP server.

Using a small `_Server` subclass with `allow_reuse_address = True` makes quick restarts easier, and running `serve_forever()` in a background thread with shutdown/close logic in `finally` gives a clean path on `KeyboardInterrupt` while still blocking the main thread via `join()` for CLI usage. Overall, the lifecycle and cleanup of both the HTTP server and `KeepGPUServer` look correct.
212-229: CLI entry point is straightforward and backward-compatible.

`--mode` defaults to `stdio`, preserving previous behavior, while `--host`/`--port` are clearly documented as HTTP-mode options; delegating directly to `run_stdio`/`run_http` keeps `main()` thin and easy to extend later.
Note: Docstrings generation - SUCCESS
Docstrings generation was requested by @Wangmerlyn.

* #60 (comment)

The following files were modified:

* `src/keep_gpu/mcp/server.py`
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/guides/mcp.md (1)
8-10: Revise sentence structure to avoid repetitive opening.

Lines 8–10 all begin with "You," which reduces readability. Vary the sentence structure for better flow.
Consider this revision:
```diff
 - You run KeepGPU from an agent (LangChain, custom orchestrator, etc.) instead of a shell.
-- You want to keep GPUs alive on a remote box over TCP rather than stdio.
-- You need a quick way to list GPU utilization/memory via the same interface.
+- GPUs need to stay alive on a remote box over TCP rather than stdio.
+- Quick GPU utilization/memory inspection is required via the same interface.
```

(Or similar rewording that diversifies the sentence openings.)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- `docs/getting-started.md` (4 hunks)
- `docs/guides/mcp.md` (1 hunks)
- `docs/index.md` (1 hunks)
- `mkdocs.yml` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/getting-started.md
🧰 Additional context used
🪛 LanguageTool
docs/guides/mcp.md
[style] ~10-~10: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...emote box over TCP rather than stdio. - You need a quick way to list GPU utilizatio...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build
- GitHub Check: Run pre-commit checks
- GitHub Check: Build documentation
🔇 Additional comments (3)
mkdocs.yml (1)
14-26: Navigation structure looks good.

The rename from "Guides" to "Usage" improves clarity, and the new MCP Server entry is appropriately positioned alongside CLI and Python API documentation.
docs/index.md (1)
35-43: Documentation updates align well with navigation changes.

The new MCP Server link and updated sidebar flow description are consistent with the mkdocs.yml restructuring. The description accurately conveys the purpose of the MCP guide.
docs/guides/mcp.md (1)
12-85: Documentation is comprehensive and well-structured.

The guide effectively covers both stdio and HTTP usage patterns, and includes practical examples, client configurations, and remote deployment guidance. The tabbed interface makes it easy to switch between transport modes. Response examples help developers understand expected outputs.
* 📝 Add docstrings to `mcp-http-transport`

  Docstrings generation was requested by @Wangmerlyn.

  * #60 (comment)

  The following files were modified:

  * `src/keep_gpu/mcp/server.py`

* chore: format mcp server docstrings
* style: shorten mcp server docstrings for hooks
* docs: restore detailed mcp server docstrings
* docs: rephrase mcp guide for docstyle hook

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Wang Siyuan <wsy0227@sjtu.edu.cn>
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/keep_gpu/mcp/server.py (2)
150-156: Explicitly justify or narrow the broad `Exception` catch in `shutdown`.

Catching a blanket `Exception` is reasonable here given interpreter teardown, but it does trip BLE001 and can hide unexpected bugs. Two options that keep behavior while making intent explicit:
```diff
-    def shutdown(self) -> None:
-        """Stop all sessions quietly; ignore errors during interpreter teardown."""
-        try:
-            self.stop_keep(None, quiet=True)
-        except Exception:  # pragma: no cover - defensive
-            # Avoid noisy errors during interpreter teardown
-            return
+    def shutdown(self) -> None:
+        """Stop all sessions quietly; ignore errors during interpreter teardown."""
+        try:
+            self.stop_keep(None, quiet=True)
+        except Exception:  # noqa: BLE001
+            # Avoid noisy errors during interpreter teardown; best-effort only.
+            return
```

If you prefer not to use `noqa`, consider narrowing this to the specific exceptions you expect from `stop_keep` and its `release()` calls.
190-219: Tidy up lint issues in the HTTP handler and make intent explicit.

Ruff hints about unused `noqa` directives and unused `log_message` arguments can be resolved with small, behavior-preserving tweaks:

```diff
 class _JSONRPCHandler(BaseHTTPRequestHandler):
     server_version = "KeepGPU-MCP/0.1"

-    def do_POST(self):  # noqa: N802
+    def do_POST(self):
         """
         Handle an HTTP JSON-RPC request and write a JSON response.
@@
-    def log_message(self, format, *args):  # noqa: A003
-        """Suppress default request logging."""
-        return
+    def log_message(self, format, *args):
+        """Suppress default request logging from BaseHTTPRequestHandler."""
+        # Mark arguments as used to satisfy linters; we intentionally drop logs.
+        del format, args
+        return
```

This:

- Removes the now-unnecessary `# noqa` comments flagged by RUF100.
- Silences ARG002 without changing the override semantics.
- Keeps the improved error handling in `do_POST` intact.

If you expect untrusted clients, you may also want to consider enforcing a maximum `content-length` to avoid reading arbitrarily large request bodies; that can be added later without changing the API.

docs/guides/mcp.md (1)
8-11: Optional: vary wording in the "When to use this" bullets.

Minor style nit: all three bullets start with "You…". If you care about polishing the prose, consider rephrasing one or two (e.g., "Run KeepGPU from an agent…", "Keep GPUs alive on a remote box over TCP…") to address the LanguageTool hint. The content itself is clear.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- `docs/guides/mcp.md` (1 hunks)
- `src/keep_gpu/mcp/server.py` (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/keep_gpu/mcp/server.py (1)
tests/mcp/test_server.py (1)
- `release` (16-17)
🪛 LanguageTool
docs/guides/mcp.md
[style] ~10-~10: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...emote box over TCP rather than stdio. - You need a quick way to list GPU utilizatio...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🪛 Ruff (0.14.8)
src/keep_gpu/mcp/server.py
154-154: Do not catch blind exception: Exception
(BLE001)
193-193: Unused noqa directive (non-enabled: N802)
Remove unused noqa directive
(RUF100)
217-217: Unused method argument: format
(ARG002)
217-217: Unused method argument: args
(ARG002)
217-217: Unused noqa directive (non-enabled: A003)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build documentation
- GitHub Check: build
- GitHub Check: Run pre-commit checks
🔇 Additional comments (9)
src/keep_gpu/mcp/server.py (6)
55-70: Docstring for `start_keep` is clear and matches behavior.

The parameter/return/exception docs align with the implementation and make the JSON-RPC surface easy to consume. No changes needed here.
94-126: `stop_keep` quiet flag wiring and responses look consistent.

The `quiet` flag is correctly honored in both single-job and "stop all" paths, and the returned shapes (`{"stopped": [...]}` plus an optional `"message"`) are stable for JSON-RPC clients. This is a clean extension of the existing API.
160-187: Central JSON-RPC dispatcher looks solid.

`_handle_request` cleanly routes methods, normalizes `params` to a dict, and wraps server failures into a structured `"error"` while logging the exception. The shapes returned here are consistent with the docs and examples.
222-235: Stdio mode behavior matches the JSON-RPC contract.

`run_stdio`'s "one JSON object per line" contract, plus mapping any parsing/dispatch failure to a JSON error response, is simple and predictable for MCP-style clients. For now the broad `except` is acceptable given it's a CLI transport and `_handle_request` already logs server-side exceptions.
237-263: HTTP server thread lifecycle and shutdown sequencing are reasonable.

Running `serve_forever()` in a dedicated thread, joining in the main thread, and then calling `httpd.shutdown()`/`server.shutdown()` in `finally` gives a sensible Ctrl-C story for the CLI and ensures controllers are released on exit. Dropping `daemon=True` (per the earlier review) removes ambiguity about thread lifetime without changing behavior.
265-283: CLI entry point aligns with docs and keeps defaults safe.

The `main()` function exposes `--mode`, `--host`, and `--port` with sane defaults (stdio / localhost / 8765) and matches the examples in docs/guides/mcp.md. That keeps the stdio use case frictionless while making HTTP explicit.

docs/guides/mcp.md (3)
14-37: Method list and quick-start examples accurately reflect the server API.

The stdio/HTTP examples, method signatures (`start_keep`, `stop_keep`, `status`, `list_gpus`), and parameter semantics (an optional `job_id` to target/stop vs list all) all line up with `KeepGPUServer` and `_handle_request`. This doc should be immediately usable for MCP clients.
57-76: Remote/cluster guidance and security note are on point.

The section on running HTTP mode on a GPU host, pointing clients at it, and preferring SSH tunnels or a reverse proxy on untrusted networks matches the server's lack of built-in auth/TLS and sets good expectations for production use.
79-85: Response examples match the JSON-RPC shapes returned by the server.

The sample `result` structures for `start_keep`, `stop_keep`, `status`, `status` (all jobs), and `list_gpus` are consistent with `KeepGPUServer`'s return values. They deliberately show a subset of fields in `params`/GPU info, which is fine as long as readers understand real responses may contain more keys.
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/guides/mcp.md (1)
8-10: Reduce repetition in bullet point phrasing.

All three bullet points begin with "You," which is a stylistic repetition. Vary the phrasing for better readability—for example, use imperatives ("Run KeepGPU…", "Keep GPUs alive…") or restructure some as outcomes.
Consider rephrasing as:
```diff
-- You run KeepGPU from an agent (LangChain, custom orchestrator, etc.) instead of a shell.
-- You want to keep GPUs alive on a remote box over TCP rather than stdio.
-- You need a quick way to list GPU utilization/memory by way of the same interface.
+- Run KeepGPU from an agent (LangChain, custom orchestrator, etc.) instead of a shell.
+- Keep GPUs alive on a remote box over TCP rather than stdio.
+- List GPU utilization/memory by way of the same interface.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `docs/guides/mcp.md` (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/guides/mcp.md
[style] ~10-~10: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...emote box over TCP rather than stdio. - You need a quick way to list GPU utilizatio...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🪛 markdownlint-cli2 (0.18.1)
docs/guides/mcp.md
67-67: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
71-71: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
78-78: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
86-86: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run pre-commit checks
- GitHub Check: Build documentation
- GitHub Check: build
🔇 Additional comments (4)
docs/guides/mcp.md (4)
14-43: Well-structured quick-start examples.

The stdio and HTTP mode examples are clear and practical. Method signatures with optional parameter hints (using `?`) effectively communicate the API contract.
45-62: Clear and practical MCP client configuration examples.

Both stdio and HTTP adapter configurations are accurate and follow MCP conventions. The contrast between command-based and URL-based adapters is well-illustrated.
67-82: Address markdown code block style inconsistency.

Per the markdownlint rules, code blocks should use indented style (four spaces) rather than fenced backticks. Lines 67, 71, and 78 use fenced code blocks; they should be indented to match the linter configuration.
Additionally, the security guidance (SSH tunneling, reverse-proxy recommendation) is excellent and well-placed.
Verify whether your project's markdownlint configuration enforces indented code blocks. If that is a project-wide standard, update these fenced code blocks to use indentation instead:
````diff
-Run on the GPU host:
-
-```bash
-keep-gpu-mcp-server --mode http --host 0.0.0.0 --port 8765
-```
+Run on the GPU host:
+
+    keep-gpu-mcp-server --mode http --host 0.0.0.0 --port 8765
````

If fenced code blocks are acceptable, confirm that the linter configuration should permit them.
84-92: Apply consistent markdown code block style to the responses section.

The JSON response examples are helpful and clearly demonstrate the expected output for each MCP method. However, the code block at line 86 uses fenced backticks instead of the indented style expected by your markdownlint configuration.
Update the code block to use indentation (four spaces) instead of fenced backticks to resolve the MD046 linting issue.
Actionable comments posted: 0
🧹 Nitpick comments (1)
README.md (1)
122-134: SSH tunnel guidance is technically sound but could be more explicit about client-side usage.

The SSH tunnel command syntax is correct, but the documentation could clarify that after running the tunnel command locally, the client connects to `localhost:8765` on the client machine (which tunnels to the remote server). Additionally, consider being more explicit that `--host 0.0.0.0` makes the server network-accessible and should only be used with appropriate trust boundaries or behind auth/proxy.

Consider updating the guidance to include a step-by-step SSH tunnel workflow:
Remote/SSH tunnel example (HTTP). On the GPU host:

```bash
keep-gpu-mcp-server --mode http --host 0.0.0.0 --port 8765
```

On your local client machine, establish an SSH tunnel to the remote server:

```bash
ssh -L 8765:localhost:8765 user@gpu-box
```

Then configure your MCP client to connect to the tunneled endpoint:

```yaml
servers:
  keepgpu:
    url: http://localhost:8765/
    adapter: http
```

Security Note: The `--host 0.0.0.0` option makes the server accessible on all network interfaces. For untrusted networks, either run the server behind an authentication layer/reverse-proxy or use the SSH tunnel pattern above to restrict access.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 832f7c8007b9b33a623f06aaa61e8652f8e13ca8 and 38fdd4946e586421b3cc5343d3ec0ceff1fce602.

📒 Files selected for processing (2)

- `README.md` (2 hunks)
- `docs/contributing.md` (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

- docs/contributing.md

🔇 Additional comments (4)

README.md (4)

95-102: Clear documentation of both stdio and HTTP modes.

The examples are well-formatted and the progression from default stdin/stdout behavior to explicit HTTP exposure is intuitive.

115-121: HTTP client config example is clear and well-formatted.

The YAML structure correctly shows the HTTP adapter option with the appropriate localhost URL reference.

139-139: The contributing documentation link is valid.

The file `docs/contributing.md` exists and contains proper documentation, confirming the markdown hyperlink format in README.md is correctly implemented.
Summary by CodeRabbit
New Features
Documentation