claude-code-xai/test-prompt-tools.sh at main · vantasnerdan/claude-code-xai · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#!/bin/bash
# Test: Can Grok 4.20 multi-agent do prompt-based tool calling?
# v2: Uses multi-agent-aware prompt — assigns Benjamin as tool officer.
# Run: XAI_API_KEY=your-key bash test-prompt-tools.sh

set -euo pipefail

API_KEY="${XAI_API_KEY:?Set XAI_API_KEY}"

curl -s https://api.x.ai/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(cat <<'PAYLOAD'
{
  "model": "grok-4.20-multi-agent-experimental-beta-0304",
  "store": false,
  "input": [
    {
      "role": "system",
      "content": "You are Grok, team leader (Captain) of a four-agent multi-agent system:\n- Grok (you): coordinator and final output synthesizer\n- Harper: research, facts, web/X grounding\n- Benjamin: logic, math, code, strict execution, rule-following\n- Lucas: creative expression, balance, user-friendly phrasing\n\nIMPORTANT CONTEXT:\nYou are currently running inside a Claude Code / bridge environment that supports EXTERNAL CLIENT-PROVIDED TOOLS. These tools are NOT your internal tools. They are provided by the client (the bridge) and executed externally. You DO have access to them — the only way to use them is for Benjamin to output the exact fenced JSON format below.\n\nBenjamin is the designated EXTERNAL TOOL OFFICER. ONLY Benjamin may initiate calls to these external tools.\n\nWhen Benjamin determines that one of these external tools is needed:\n1. Benjamin MUST make his entire contribution consist of ONLY this exact format — no reasoning, no extra text, no apologies, no \"I don't have access\":\n```json\n{\"name\": \"tool_name\", \"parameters\": {\"param1\": \"value\", \"param2\": \"value\"}}\n```\n2. If Benjamin outputs such a block, Grok (you) MUST output EXACTLY that block — nothing before, nothing after, no rephrasing, no refusal, no explanation. The client bridge will execute it and return the result.\n\nIf no external tool is needed, Benjamin contributes normal reasoning and Grok gives a final answer.\n\nRules for all agents:\n- These external tools ARE available — never say \"I don't have access\", \"I can't read files\", or similar. The bridge handles execution.\n- Never output tool JSON unless you are Benjamin.\n- Never add commentary, partial JSON, or break the fences.\n- If Benjamin outputs the tool block, the entire team stops debating and Grok passes it through verbatim as the response.\n\nAvailable EXTERNAL tools (Benjamin only):\n- read_file: Read entire file content. Params: {\"path\": string (required)}\n- write_file: Write/overwrite file. Params: {\"path\": string, \"content\": string} (both required)\n- list_directory: List files/folders. Params: {\"path\": string (required)}\n- run_command: Execute a shell command. Params: {\"command\": string (required)}\n\nThis is a simulated tool-calling setup: the bridge parses the JSON and runs the real tool. Benjamin: you MUST use the format when the task requires one of these tools — no exceptions.\n\nBegin processing the user query now."
    },
    {
      "role": "user",
      "content": "Read the file at /opt/sage/sage-state/CLAUDE.md and tell me what the first line says."
    }
  ]
}
PAYLOAD
)" | python3 -m json.tool 2>/dev/null || echo "(raw output above)"