Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Walkthrough

This PR updates Llama Stack docs and notebooks to replace third‑party LLM API secret guidance with vLLM OpenAI‑compatible endpoints (VLLM_URL), add vLLM/KServe tool‑calling guidance and predictor args, consolidate MCP examples into the main quickstart, and refactor the quickstart notebook to support dual tool integration paths (client-side and MCP).

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Notebook
    participant FastMCP
    participant Client as LlamaStackClient
    participant vLLM as vLLM/KServe
    rect rgba(200,230,255,0.5)
        Note over Notebook,FastMCP: TOOL_OPTION selects tool integration path
    end
    User->>Notebook: run notebook (choose TOOL_OPTION)
    alt Client-side tool
        Notebook->>Client: register local client tool (`@client_tool`) -> AGENT_TOOLS
        Client->>vLLM: send agent/model request (may include tool call)
        vLLM-->>Client: model response / tool invocation result
        Client-->>Notebook: stream results
    else MCP tool (FastMCP)
        Notebook->>FastMCP: spawn MCP server (background process)
        Notebook->>Client: register MCP toolgroup (provider_id=model-context-protocol) -> AGENT_TOOLS
        Client->>FastMCP: invoke MCP tool (tool metadata)
        FastMCP->>vLLM: forward/coordinate model calls
        vLLM-->>FastMCP: model response
        FastMCP-->>Client: tool result
        Client-->>Notebook: stream results
    end
    Notebook->>FastMCP: cleanup (terminate server)
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
⚠️ Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Ruff (0.15.6)
docs/public/llama-stack/llama-stack_quickstart.ipynb: Unexpected end of JSON input
🧹 Nitpick comments (3)
docs/public/llama-stack/llama-stack_quickstart.ipynb (3)
197-214: Consider adding a readiness check instead of a fixed sleep.

The `time.sleep(1.5)` at line 203 is a race condition: the MCP server may not be ready to accept connections yet, especially on slower systems. Consider polling the endpoint for readiness.

💡 Suggested improvement for server readiness

```diff
 mcp_process = Process(target=_run_mcp_weather_server, daemon=True)
 mcp_process.start()
 import socket
 import time
-time.sleep(1.5)
+# Wait for server to be ready (up to 10 seconds)
+for _ in range(20):
+    time.sleep(0.5)
+    try:
+        with socket.create_connection(("127.0.0.1", 8002), timeout=1):
+            break
+    except OSError:
+        pass
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/public/llama-stack/llama-stack_quickstart.ipynb` around lines 197 - 214, Replace the fixed sleep that waits for the MCP server with an active readiness poll: after starting mcp_process (the Process created with target _run_mcp_weather_server), repeatedly attempt an HTTP GET to the MCP endpoint (use MCP_SERVER_URL if set or construct http://{host}:8002/mcp like the current logic) with a short timeout and small backoff, stop polling when you receive a successful response (2xx) or after a reasonable overall timeout, and only then set os.environ["MCP_SERVER_URL"] and print the success message; keep the same host/port (/mcp) and error/timeout handling so the notebook doesn’t hang indefinitely.
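A runnable sketch of such a readiness poll, written as a standalone function. The throwaway local listener below merely stands in for the MCP server so the sketch is self-contained; the real notebook would poll the MCP host and port instead.

```python
import socket
import time

def wait_for_port(host, port, timeout=10.0, interval=0.25):
    """Poll until a TCP port accepts connections, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(interval)
    return False

# Demonstration against a throwaway local listener standing in for the MCP server.
server = socket.socket()
server.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

ready = wait_for_port("127.0.0.1", port, timeout=5)
server.close()
print(ready)  # True once the listener accepts connections
```

Unlike a fixed `time.sleep`, the poll returns as soon as the server is reachable and fails with a bounded timeout instead of hanging or racing.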
289-298: Redundant import and client initialization.

`LlamaStackClient` is imported and `client` is initialized earlier at lines 91-96. This cell re-imports `LlamaStackClient` and `Agent`, but then uses the existing `client` variable from the earlier cell. Consider removing the redundant import or clarifying that cells are designed to be run independently.

Note: If the intent is cell independence for users who skip Option A/B setup cells, this is acceptable but could be documented.

💡 Suggested simplification

```diff
 import os
-from llama_stack_client import LlamaStackClient, Agent
+from llama_stack_client import Agent

 if "AGENT_TOOLS" not in globals():
     raise RuntimeError("AGENT_TOOLS is missing. Run the matching setup cell(s) in section 2 for the selected TOOL_OPTION.")

 base_url = os.getenv("LLAMA_STACK_URL", "http://localhost:8321")
 print(f"Connecting to Server: {base_url}")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/public/llama-stack/llama-stack_quickstart.ipynb` around lines 289 - 298, The cell re-imports LlamaStackClient and Agent and references the previously-created client, causing redundancy or confusion; either remove the duplicate imports and rely on the earlier client variable (keep AGENT_TOOLS and base_url handling) or make the cell self-contained by checking/creating a client when missing (e.g., test for existing client variable and call LlamaStackClient(...) to initialize it) so references to LlamaStackClient, Agent, client, AGENT_TOOLS and base_url are consistent; update cell text to document the chosen approach (dependent vs standalone cells).
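If cell independence is the goal, a guarded initialization keeps both behaviors: reuse an existing `client` when present, create one otherwise. In this sketch `make_client` is a stand-in for `LlamaStackClient` so the snippet runs without the package installed; the real cell would call the actual constructor.

```python
import os

def make_client(base_url):
    # Stand-in for LlamaStackClient(base_url=...); placeholder for this sketch.
    return {"base_url": base_url}

base_url = os.getenv("LLAMA_STACK_URL", "http://localhost:8321")

# Reuse the client created by an earlier cell if present; otherwise create one
# so this cell also works when the earlier setup cells were skipped.
if "client" not in globals():
    client = make_client(base_url)

print(f"Connecting to Server: {client['base_url']}")
```

Either way the cell behaves the same on a linear run, and a skipped-ahead run still gets a working client.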
205-210: IP detection may be unreliable in containerized environments.

The `socket.gethostbyname(socket.gethostname())` approach may return unexpected results in containers or when multiple network interfaces exist. The fallback to the `MCP_SERVER_HOST` env var is good, but consider documenting this limitation more prominently or defaulting to `127.0.0.1` with a clearer note.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/public/llama-stack/llama-stack_quickstart.ipynb` around lines 205 - 210, The IP autodetection using socket.gethostbyname(socket.gethostname()) to set MCP_SERVER_URL is unreliable in containers/multi-interface hosts; update the logic around MCP_SERVER_URL so it prefers an explicit MCP_SERVER_HOST env var (MCP_SERVER_HOST) or defaults to 127.0.0.1 when MCP_SERVER_HOST is missing, and add a brief inline comment or documentation note next to the MCP_SERVER_URL assignment explaining the container/network limitation and instructing users to set MCP_SERVER_HOST when running inside containers; reference the MCP_SERVER_URL assignment and the socket.gethostbyname/socket.gethostname calls when making the change.
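The simplest shape of that default, sketched below. Port 8002 is taken from the notebook; `MCP_SERVER_HOST` is the documented override for container or multi-interface setups.

```python
import os

# Prefer an explicit MCP_SERVER_HOST; otherwise default to loopback rather
# than autodetecting an interface address, which can be wrong in containers.
host = os.getenv("MCP_SERVER_HOST", "127.0.0.1")
mcp_server_url = f"http://{host}:8002/mcp"
print(mcp_server_url)
```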
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a79c809c-a938-492e-9d3a-875a7b08b56f
📒 Files selected for processing (4)
- docs/en/llama_stack/install.mdx
- docs/en/llama_stack/quickstart.mdx
- docs/public/llama-stack/llama-stack_quickstart.ipynb
- docs/public/llama-stack/llama-stack_quickstart_mcp.ipynb
💤 Files with no reviewable changes (1)
- docs/public/llama-stack/llama-stack_quickstart_mcp.ipynb
Deploying alauda-ai with Cloudflare Pages

| Latest commit: | e807a06 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://2e569e01.alauda-ai.pages.dev |
| Branch Preview URL: | https://fix-ai-23784.alauda-ai.pages.dev |
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/public/llama-stack/llama-stack_quickstart.ipynb`:
- Around line 233-240: The code currently overwrites
os.environ["MCP_SERVER_URL"] without preserving the caller's original value;
capture the original value (e.g., orig = os.environ.get("MCP_SERVER_URL"))
before setting MCP_SERVER_URL and set os.environ["MCP_SERVER_URL"] to the
computed value, then in the corresponding cleanup/teardown section restore the
environment to its prior state (if orig is None delete the key, otherwise set it
back to orig); apply the same capture-and-restore pattern for the other
occurrence referenced (the block around the later cleanup).
- Around line 104-106: The notebook can reuse a stale AGENT_TOOLS when
TOOL_OPTION is switched; update the TOOL_OPTION selector cell to reset/clear
AGENT_TOOLS (e.g., set AGENT_TOOLS = None or {}) whenever TOOL_OPTION is
assigned, and add a validation immediately before the agent construction step to
assert AGENT_TOOLS matches the chosen TOOL_OPTION (or raise an explicit error)
so reruns fail fast and the wrong backend is never used.
- Around line 114-116: The shared FastAPI test flow calls requests.post(...) but
the requests import is currently inside the Option A-only cell (which may be
skipped); move the requests import out of the Option A-only cell into a
common/non-optional setup cell (or add an explicit requests import in the API
test cell) so requests is always defined for the shared flow; update references
around get_weather and AGENT_TOOLS cells accordingly to ensure requests is
available regardless of TOOL_OPTION.
- Around line 183-241: The notebook spawns multiprocessing.Process with a nested
target (_run_mcp_weather_server) which is not picklable on spawn-based kernels
and uses a brittle time.sleep(5) readiness wait; move the server entrypoint out
of the cell into a top-level, importable function (e.g., module-level
run_mcp_weather_server) or launch the FastMCP server via subprocess instead of
Process, then replace the fixed sleep with an explicit readiness poll against
the MCP endpoint (MCP_SERVER_URL) with retry/backoff and timeout before setting
os.environ and printing; update usages of mcp_process, get_weather_mcp, and the
Process creation to reference the new top-level entrypoint.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d37c39e0-1029-46c4-8892-583e0d623e9e
📒 Files selected for processing (2)
- docs/en/llama_stack/install.mdx
- docs/public/llama-stack/llama-stack_quickstart.ipynb
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/en/llama_stack/install.mdx
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/public/llama-stack/llama-stack_quickstart.ipynb`:
- Around line 255-261: Wrap the call to
socket.gethostbyname(socket.gethostname()) in a try/except that catches
socket.gaierror and socket.herror (and optionally a broad Exception) and on
exception fall back to using os.getenv("MCP_SERVER_HOST", "127.0.0.1") for
_host; then proceed to build MCP_SERVER_URL as before (look for the
MCP_SERVER_URL assignment and the _host logic around socket.gethostbyname to
apply this change).
- Around line 296-309: AGENT_TOOLS is using an invalid dict format for MCP;
after calling client.toolgroups.register(...) you must fetch the actual tool
objects or identifiers and pass those to Agent(tools=...), e.g. call
client.tools.list(toolgroup_id=toolgroup_id) and assign AGENT_TOOLS to that
returned list (or build a list of explicit MCP identifiers like
["mcp::filesystem::read_file"]) so Agent(tools=AGENT_TOOLS) receives valid tool
entries instead of the current {"type":..., "server_label":...,
"server_url":...} structure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3519d066-0229-46b8-8681-c7fd9604a6a6
📒 Files selected for processing (1)
docs/public/llama-stack/llama-stack_quickstart.ipynb
/test-pass