From 53c27dd03a12024cf072564c057dc23985bfbd49 Mon Sep 17 00:00:00 2001 From: firstdata-dev Date: Wed, 1 Apr 2026 01:28:48 +0800 Subject: [PATCH 1/7] docs: add MCP tool limitations, report_feedback example, and description quality guidelines MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## What this PR does Adds comprehensive Limitations documentation for all 5 MCP tools based on verified testing and schema analysis. Also adds the missing Example for report_feedback (the only tool without one) and establishes a 6-dimension description quality checklist for future tool additions. ## Changes ### SKILL.md — MCP Tools Reference (new section) - Common Limitations: authentication, daily quota, network dependency - search_source: 200 max results, keyword substring matching behavior, space-in-keyword pitfall, domain substring matching, no boolean operators - get_source: silent error behavior (isError:false with error objects), recommended batch size - ask_agent: query constraints, non-idempotent, 2-8s response time, web_search trigger warning - get_access_guide: incomplete instruction coverage, 3-20s response time, operation specificity requirement - report_feedback: message length, non-idempotent, two usage examples (broken link + outdated content) ### Description Quality Guidelines (new section) - Core principle: 'Write it right before writing it all' - 6-dimension checklist for PR review ### mcp-tool-descriptions-draft.md (new file) - Server-side description text ready to paste into Python code - Verification evidence table with test results and schema references ## Verification Evidence Every limitation is backed by schema analysis or live testing: - search_source limit 200: inputSchema maximum:200 - Keywords not auto-tokenized: tested ['中国 GDP']→0, ['中国','GDP']→173 - get_source silent error: tested invalid ID returns error object, isError:false - ask_agent timing: 3 runs measured 1.8s, 2.9s, 7.4s - get_access_guide timing: 3 
runs measured 3.0s, 17.6s, 19.1s - Token quota: TokenVerifyResponse schema has quota_allowed/remaining_daily - Trial quota 30/day: verified via /api/trial/session-info ## 6-Dimension Self-Assessment (post-change) | Dimension | search_source | get_source | ask_agent | get_access_guide | report_feedback | |-----------|:---:|:---:|:---:|:---:|:---:| | Purpose | ✅ | ✅ | ✅ | ✅ | ✅ | | Guidelines | ✅ | ✅ | ✅ | ✅ | ✅ | | Examples | ✅ | ✅ | ✅ | ✅ | ✅ (NEW) | | Limitations | ✅ (NEW) | ✅ (NEW) | ✅ (NEW) | ✅ (NEW) | ✅ (NEW) | | Parameters | ✅ | ✅ | ✅ | ✅ | ✅ | | Return Format | ✅ | ✅ | ✅ | ✅ | ✅ | Target: 5/5 tools × 6/6 dimensions = 30/30 ✅ Refs: MCP Search Quality Research #5, arXiv 2602.14878, arXiv 2602.18914 --- skills/firstdata/SKILL.md | 88 ++++++++++++++++++ .../firstdata/mcp-tool-descriptions-draft.md | 89 +++++++++++++++++++ 2 files changed, 177 insertions(+) create mode 100644 skills/firstdata/mcp-tool-descriptions-draft.md diff --git a/skills/firstdata/SKILL.md b/skills/firstdata/SKILL.md index 498f608..0627f96 100644 --- a/skills/firstdata/SKILL.md +++ b/skills/firstdata/SKILL.md @@ -86,6 +86,94 @@ Or add manually to your MCP config: Once connected, browse the tool list provided by the firstdata MCP and select the appropriate tool based on your needs. +## MCP Tools Reference + +The FirstData MCP server provides 5 tools. Below is a reference with usage guidelines, limitations, and examples. + +### Common Limitations (all tools) + +- **Authentication required**: All tools require a valid API key (JWT token) via `Authorization: Bearer ` header. +- **Daily call quota**: API usage is subject to a per-token daily call quota. Remaining quota is tracked server-side; exceeding the limit will result in rejected requests. +- **Network dependency**: All tools make HTTP calls to the FirstData server (`firstdata.deepminer.com.cn`). Network latency and server availability affect response times. 
+ +### Tool: `search_source` + +**Purpose**: Unified data source search tool supporting keyword search, structured filtering, pagination, and multiple output modes. + +**Limitations**: +- Maximum **200** results per query (`limit` parameter range: 1–200, default: 20). +- ⚠️ **Keywords are NOT auto-tokenized by spaces.** Each keyword in the array is matched as an exact substring. `["中国 GDP"]` returns 0 results — use `["中国", "GDP"]` instead. +- Keyword matching is **substring-based**, not semantic search. Keywords are matched against source metadata fields (name, description, tags, content). +- The `domain` parameter uses **substring matching**, not exact enum matching (e.g., `"finance"` matches `"public-finance"`, `"finance"`, `"financial-markets"`). +- No boolean operators (AND/OR/NOT). Multiple keywords in the array are combined with AND logic. +- Response time: typically **~1 second**. + +### Tool: `get_source` + +**Purpose**: Retrieve full details for specific data sources by their IDs. + +**Limitations**: +- Invalid `source_id` values do NOT cause an error response (`isError: false`). Instead, the result array includes `{"id": "xxx", "error": "Not found"}` for each invalid ID alongside valid results. Callers must check individual items for `error` fields rather than relying solely on `isError`. +- No schema-level limit on the number of `source_ids` per request, but performance with large batches (50+) is unverified. Recommended: ≤20 IDs per request. +- The `fields` parameter filters returned fields; when omitted, all fields are returned. + +### Tool: `ask_agent` + +**Purpose**: LLM-powered intelligent search agent for complex, cross-domain, or ambiguous queries that require multi-step reasoning. + +**Limitations**: +- Query length: 2–1,000 characters. +- Maximum results: 1–20 (default: 5). +- **Non-idempotent**: Same query may return different results across calls (LLM reasoning varies). +- **Response time: typically 2–8 seconds** (involves LLM inference). 
May take longer (10–30+ seconds) when the agent triggers `web_search` for external information. +- Internally uses LangChain ReAct agent with `jq` for local data queries plus optional `web_search`. The web search step is not user-controllable. +- **Use `search_source` instead** for simple keyword matching or structured filtering — it is faster, deterministic, and cheaper. + +### Tool: `get_access_guide` + +**Purpose**: Generate detailed access instructions for a specific data source using RAG (Retrieval-Augmented Generation). + +**Limitations**: +- **Not all data sources have instruction libraries.** If a source has no pre-built instructions, results will be empty or irrelevant. +- Invalid `source_id` returns `{"error": "数据源 xxx 不存在"}`. +- `top_k` range: 1–5 (default: 3). +- **Response time is highly variable: 3–20 seconds**, depending on RAG retrieval complexity and server load. +- Retrieval quality depends heavily on the specificity of the `operation` parameter. Vague descriptions yield lower-quality matches. Use specific action verbs and entity names (e.g., "查询2024年M2货币供应量数据" rather than "查数据"). + +### Tool: `report_feedback` + +**Purpose**: Submit user feedback to the development team when FirstData has a confirmed issue. + +**Limitations**: +- `feedback_message` length: 10–2,000 characters. +- **Non-idempotent**: Duplicate calls create duplicate feedback entries. Do not retry on success. +- Only use when a genuine issue is confirmed (missing source, incorrect data, broken functionality). Do not use as a general comment channel. 
+ +**Examples**: + +``` +# Example 1: Broken link +feedback_message="链接失效:数据源 china-pbc 的 data_url (https://www.pbc.gov.cn/diaochatongjisi/116219/index.html) 返回 404,无法访问数据页面。检索关键词:中国货币供应量" + +# Example 2: Outdated or inaccurate content +feedback_message="数据内容过时:数据源 worldbank-open-data 的 update_frequency 标注为 quarterly,但实际数据页面显示最近更新日期为 2024-06-30,已超过 6 个月未更新。建议核实并修正更新频率或标注数据滞后情况" +``` + +## Description Quality Guidelines + +When adding or modifying MCP tool descriptions, follow these principles (based on [MCP tool description quality research](https://arxiv.org/abs/2602.14878)): + +**Core principle: "Write it right before writing it all"** — Functionality accuracy (+11.6% impact) matters ~8× more than Conciseness (+1.5%). + +**6-dimension checklist** (check all before submitting): + +- [ ] **Purpose**: Is the tool's function clearly stated in the first sentence? +- [ ] **Guidelines**: Are usage scenarios and when-to-use / when-not-to-use rules included? +- [ ] **Examples**: Are typical input/output examples provided? +- [ ] **Limitations**: Are constraints, edge cases, and known limitations documented? +- [ ] **Parameters**: Are all parameters described with types, ranges, and defaults? +- [ ] **Return Format**: Is the response structure documented? + ## Community FirstData is an open-source project — join us in building the authoritative data source knowledge base for agents: diff --git a/skills/firstdata/mcp-tool-descriptions-draft.md b/skills/firstdata/mcp-tool-descriptions-draft.md new file mode 100644 index 0000000..b20cc91 --- /dev/null +++ b/skills/firstdata/mcp-tool-descriptions-draft.md @@ -0,0 +1,89 @@ +# MCP Tool Descriptions — Server-Side Draft + +> **Purpose**: This file contains the exact text to be added to each tool's description in the MCP server Python code. +> After PR review approval, copy these Limitations sections into the server-side tool description strings. +> These texts must remain **identical** to the corresponding sections in `SKILL.md`. 
+ +--- + +## search_source — Add to description + +``` +**Limitations:** +- Maximum 200 results per query (limit range: 1–200, default: 20) +- Keywords are NOT auto-tokenized by spaces. Each keyword is matched as an exact substring. ["中国 GDP"] returns 0 results — use ["中国", "GDP"] instead +- Keyword matching is substring-based, not semantic search +- domain parameter uses substring matching, not exact enum matching +- No boolean operators (AND/OR/NOT). Multiple keywords use AND logic +- Subject to daily API call quota per token +``` + +## get_source — Add to description + +``` +**Limitations:** +- Invalid source_id does NOT set isError=true. Returns {"id": "xxx", "error": "Not found"} in the result array. Callers must check individual items for error fields +- No schema-level limit on source_ids count, but recommended ≤20 per request for reliable performance +- Subject to daily API call quota per token +``` + +## ask_agent — Add to description + +``` +**Limitations:** +- Query length: 2–1,000 characters +- Maximum results: 1–20 (default: 5) +- Non-idempotent: same query may return different results (LLM reasoning varies) +- Response time: typically 2–8 seconds; may reach 10–30+ seconds when web_search is triggered +- Subject to daily API call quota per token +``` + +## get_access_guide — Add to description + +``` +**Limitations:** +- Not all data sources have instruction libraries. Sources without pre-built instructions return empty or irrelevant results +- Invalid source_id returns {"error": "数据源 xxx 不存在"} +- top_k range: 1–5 (default: 3) +- Response time is highly variable: 3–20 seconds depending on RAG retrieval complexity and server load +- Retrieval quality depends on specificity of the operation parameter. 
Use specific action verbs and entity names +- Subject to daily API call quota per token +``` + +## report_feedback — Add to description + Example + +``` +**Limitations:** +- feedback_message length: 10–2,000 characters +- Non-idempotent: duplicate calls create duplicate feedback entries. Do not retry on success +- Subject to daily API call quota per token + +**示例:** +- 链接失效反馈: feedback_message="链接失效:数据源 china-pbc 的 data_url 返回 404,无法访问。检索关键词:中国货币供应量" +- 数据过时反馈: feedback_message="数据内容过时:数据源 worldbank-open-data 的 update_frequency 标注为 quarterly,但实际已超过 6 个月未更新" +``` + +--- + +## Verification Evidence + +Each limitation is backed by one of these sources: + +| Limitation | Source | +|---|---| +| search_source limit: 1–200 | inputSchema `maximum: 200, minimum: 1` | +| Keywords not auto-tokenized | Tested: `["中国 GDP"]` → 0 results; `["中国", "GDP"]` → 173 results | +| Substring matching | Tested: `["中国GDP"]` → 1 result (exact substring); `["GDP"]` → 100 results | +| domain substring matching | inputSchema description: "领域关键词,子串匹配" | +| get_source silent error | Tested: invalid ID returns `{"id":"xxx","error":"Not found"}` with `isError: false` | +| get_source mixed valid/invalid | Tested: valid IDs return data, invalid return error objects, no request interruption | +| ask_agent query length | inputSchema `minLength: 2, maxLength: 1000` | +| ask_agent max_results | inputSchema `minimum: 1, maximum: 20, default: 5` | +| ask_agent non-idempotent | annotations `idempotentHint: false` | +| ask_agent response time | Tested 3 runs: 7.4s, 2.9s, 1.8s | +| get_access_guide invalid source | Tested: returns `{"error": "数据源 xxx 不存在"}` | +| get_access_guide top_k | inputSchema `minimum: 1, maximum: 5, default: 3` | +| get_access_guide response time | Tested 3 runs: 3.0s, 17.6s, 19.1s | +| report_feedback message length | inputSchema `minLength: 10, maxLength: 2000` | +| Daily call quota exists | TokenVerifyResponse schema: `quota_allowed`, `remaining_daily` fields | +| Trial quota: 
30/day | Tested via `/api/trial/session-info`: `total_calls: 30` | From 4f772711af38a45f5c14a871c435134f7e195380 Mon Sep 17 00:00:00 2001 From: firstdata-dev Date: Wed, 1 Apr 2026 01:30:15 +0800 Subject: [PATCH 2/7] refine: keyword wording (guiding > restrictive) + quota query limitation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Address review feedback: 1. Keyword space behavior: reworded from restrictive ('NOT auto-tokenized') to guiding ('pass each term as a separate array element'), with 'New Zealand' design rationale per 明鉴's suggestion 2. Token quota: added explicit note that no client-facing API exists to query remaining quota at runtime, per 明鉴's question --- skills/firstdata/SKILL.md | 4 ++-- skills/firstdata/mcp-tool-descriptions-draft.md | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/skills/firstdata/SKILL.md b/skills/firstdata/SKILL.md index 0627f96..8f7d7f3 100644 --- a/skills/firstdata/SKILL.md +++ b/skills/firstdata/SKILL.md @@ -93,7 +93,7 @@ The FirstData MCP server provides 5 tools. Below is a reference with usage guide ### Common Limitations (all tools) - **Authentication required**: All tools require a valid API key (JWT token) via `Authorization: Bearer ` header. -- **Daily call quota**: API usage is subject to a per-token daily call quota. Remaining quota is tracked server-side; exceeding the limit will result in rejected requests. +- **Daily call quota**: API usage is subject to a per-token daily call quota. Quota varies by API key tier (trial accounts: 30 calls/day). There is currently no client-facing API to query remaining quota at runtime — callers should implement their own usage tracking if quota management is needed. - **Network dependency**: All tools make HTTP calls to the FirstData server (`firstdata.deepminer.com.cn`). Network latency and server availability affect response times. 
### Tool: `search_source` @@ -102,7 +102,7 @@ The FirstData MCP server provides 5 tools. Below is a reference with usage guide **Limitations**: - Maximum **200** results per query (`limit` parameter range: 1–200, default: 20). -- ⚠️ **Keywords are NOT auto-tokenized by spaces.** Each keyword in the array is matched as an exact substring. `["中国 GDP"]` returns 0 results — use `["中国", "GDP"]` instead. +- ⚠️ **Each keyword is matched as an independent substring — pass each search term as a separate array element.** For example, use `["中国", "GDP"]` (173 results) instead of `["中国 GDP"]` (0 results). This is by design to preserve multi-word terms like `"New Zealand"` or `"World Bank"`. - Keyword matching is **substring-based**, not semantic search. Keywords are matched against source metadata fields (name, description, tags, content). - The `domain` parameter uses **substring matching**, not exact enum matching (e.g., `"finance"` matches `"public-finance"`, `"finance"`, `"financial-markets"`). - No boolean operators (AND/OR/NOT). Multiple keywords in the array are combined with AND logic. diff --git a/skills/firstdata/mcp-tool-descriptions-draft.md b/skills/firstdata/mcp-tool-descriptions-draft.md index b20cc91..50fd327 100644 --- a/skills/firstdata/mcp-tool-descriptions-draft.md +++ b/skills/firstdata/mcp-tool-descriptions-draft.md @@ -11,11 +11,11 @@ ``` **Limitations:** - Maximum 200 results per query (limit range: 1–200, default: 20) -- Keywords are NOT auto-tokenized by spaces. Each keyword is matched as an exact substring. ["中国 GDP"] returns 0 results — use ["中国", "GDP"] instead +- Each keyword is matched as an independent substring — pass each search term as a separate array element. Use ["中国", "GDP"] instead of ["中国 GDP"]. This preserves multi-word terms like "New Zealand" or "World Bank" - Keyword matching is substring-based, not semantic search - domain parameter uses substring matching, not exact enum matching - No boolean operators (AND/OR/NOT). 
Multiple keywords use AND logic -- Subject to daily API call quota per token +- Subject to daily API call quota per token (no client-facing API to query remaining quota) ``` ## get_source — Add to description From a9ad08cf2d542ca0e03d4a63c68d116aaec40699 Mon Sep 17 00:00:00 2001 From: firstdata-dev Date: Wed, 1 Apr 2026 01:30:49 +0800 Subject: [PATCH 3/7] refine: source_ids batch size as practical guideline, not hard limit --- skills/firstdata/SKILL.md | 2 +- skills/firstdata/mcp-tool-descriptions-draft.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/skills/firstdata/SKILL.md b/skills/firstdata/SKILL.md index 8f7d7f3..53ad4e2 100644 --- a/skills/firstdata/SKILL.md +++ b/skills/firstdata/SKILL.md @@ -114,7 +114,7 @@ The FirstData MCP server provides 5 tools. Below is a reference with usage guide **Limitations**: - Invalid `source_id` values do NOT cause an error response (`isError: false`). Instead, the result array includes `{"id": "xxx", "error": "Not found"}` for each invalid ID alongside valid results. Callers must check individual items for `error` fields rather than relying solely on `isError`. -- No schema-level limit on the number of `source_ids` per request, but performance with large batches (50+) is unverified. Recommended: ≤20 IDs per request. +- No schema-level limit on the number of `source_ids` per request, but performance with large batches (50+) is unverified. As a practical guideline (not a hard limit), consider batching in groups of ~20. - The `fields` parameter filters returned fields; when omitted, all fields are returned. ### Tool: `ask_agent` diff --git a/skills/firstdata/mcp-tool-descriptions-draft.md b/skills/firstdata/mcp-tool-descriptions-draft.md index 50fd327..14df482 100644 --- a/skills/firstdata/mcp-tool-descriptions-draft.md +++ b/skills/firstdata/mcp-tool-descriptions-draft.md @@ -23,7 +23,7 @@ ``` **Limitations:** - Invalid source_id does NOT set isError=true. 
Returns {"id": "xxx", "error": "Not found"} in the result array. Callers must check individual items for error fields -- No schema-level limit on source_ids count, but recommended ≤20 per request for reliable performance +- No schema-level limit on source_ids count, but large batch performance is unverified. Practical guideline (not a hard limit): batch in groups of ~20 - Subject to daily API call quota per token ``` From 32a2bab335d66e2e7931c119b8a8d42b43d5e553 Mon Sep 17 00:00:00 2001 From: firstdata-dev Date: Wed, 1 Apr 2026 01:32:37 +0800 Subject: [PATCH 4/7] fix: align report_feedback examples between SKILL.md and draft Draft had shortened versions of the examples; now both files have identical text as required by the draft file's own header. --- skills/firstdata/mcp-tool-descriptions-draft.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/skills/firstdata/mcp-tool-descriptions-draft.md b/skills/firstdata/mcp-tool-descriptions-draft.md index 14df482..eb0f62a 100644 --- a/skills/firstdata/mcp-tool-descriptions-draft.md +++ b/skills/firstdata/mcp-tool-descriptions-draft.md @@ -59,8 +59,8 @@ - Subject to daily API call quota per token **示例:** -- 链接失效反馈: feedback_message="链接失效:数据源 china-pbc 的 data_url 返回 404,无法访问。检索关键词:中国货币供应量" -- 数据过时反馈: feedback_message="数据内容过时:数据源 worldbank-open-data 的 update_frequency 标注为 quarterly,但实际已超过 6 个月未更新" +- 链接失效反馈: feedback_message="链接失效:数据源 china-pbc 的 data_url (https://www.pbc.gov.cn/diaochatongjisi/116219/index.html) 返回 404,无法访问数据页面。检索关键词:中国货币供应量" +- 数据过时反馈: feedback_message="数据内容过时:数据源 worldbank-open-data 的 update_frequency 标注为 quarterly,但实际数据页面显示最近更新日期为 2024-06-30,已超过 6 个月未更新。建议核实并修正更新频率或标注数据滞后情况" ``` --- From 611c081f41770e57e74e206d00d88e5bbfee3b5f Mon Sep 17 00:00:00 2001 From: firstdata-dev Date: Wed, 1 Apr 2026 01:33:46 +0800 Subject: [PATCH 5/7] fix: align examples (short version) + add quota query mechanism MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 1. 
Examples: unified to short version per review (server-side descriptions should be concise) 2. Quota: replaced 'no client-facing API' with actual mechanism — Token verification API (POST /api/token/verify) returns remaining_daily, but this is a separate HTTP call, not available via MCP tool invocation --- skills/firstdata/SKILL.md | 8 ++++---- skills/firstdata/mcp-tool-descriptions-draft.md | 6 +++--- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/skills/firstdata/SKILL.md b/skills/firstdata/SKILL.md index 53ad4e2..d68ab1e 100644 --- a/skills/firstdata/SKILL.md +++ b/skills/firstdata/SKILL.md @@ -93,7 +93,7 @@ The FirstData MCP server provides 5 tools. Below is a reference with usage guide ### Common Limitations (all tools) - **Authentication required**: All tools require a valid API key (JWT token) via `Authorization: Bearer ` header. -- **Daily call quota**: API usage is subject to a per-token daily call quota. Quota varies by API key tier (trial accounts: 30 calls/day). There is currently no client-facing API to query remaining quota at runtime — callers should implement their own usage tracking if quota management is needed. +- **Daily call quota**: API usage is subject to a per-token daily call quota. Quota varies by API key tier (trial accounts: 30 calls/day). MCP tool calls do not return remaining quota information. To check quota, use the Token verification API (`POST /api/token/verify`) which returns `remaining_daily` in the response — this is a separate HTTP call, not available through MCP tool invocation. - **Network dependency**: All tools make HTTP calls to the FirstData server (`firstdata.deepminer.com.cn`). Network latency and server availability affect response times. ### Tool: `search_source` @@ -153,10 +153,10 @@ The FirstData MCP server provides 5 tools. 
Below is a reference with usage guide ``` # Example 1: Broken link -feedback_message="链接失效:数据源 china-pbc 的 data_url (https://www.pbc.gov.cn/diaochatongjisi/116219/index.html) 返回 404,无法访问数据页面。检索关键词:中国货币供应量" +feedback_message="链接失效:数据源 china-pbc 的 data_url 返回 404,无法访问数据页面。检索关键词:中国货币供应量" -# Example 2: Outdated or inaccurate content -feedback_message="数据内容过时:数据源 worldbank-open-data 的 update_frequency 标注为 quarterly,但实际数据页面显示最近更新日期为 2024-06-30,已超过 6 个月未更新。建议核实并修正更新频率或标注数据滞后情况" +# Example 2: Outdated content +feedback_message="数据内容过时:数据源 worldbank-open-data 的 update_frequency 标注为 quarterly,但实际已超过 6 个月未更新" ``` ## Description Quality Guidelines diff --git a/skills/firstdata/mcp-tool-descriptions-draft.md b/skills/firstdata/mcp-tool-descriptions-draft.md index eb0f62a..500ec4a 100644 --- a/skills/firstdata/mcp-tool-descriptions-draft.md +++ b/skills/firstdata/mcp-tool-descriptions-draft.md @@ -15,7 +15,7 @@ - Keyword matching is substring-based, not semantic search - domain parameter uses substring matching, not exact enum matching - No boolean operators (AND/OR/NOT). Multiple keywords use AND logic -- Subject to daily API call quota per token (no client-facing API to query remaining quota) +- Subject to daily API call quota per token. 
MCP tool calls do not return remaining quota; use Token verification API (POST /api/token/verify, returns remaining_daily) to check ``` ## get_source — Add to description @@ -59,8 +59,8 @@ - Subject to daily API call quota per token **示例:** -- 链接失效反馈: feedback_message="链接失效:数据源 china-pbc 的 data_url (https://www.pbc.gov.cn/diaochatongjisi/116219/index.html) 返回 404,无法访问数据页面。检索关键词:中国货币供应量" -- 数据过时反馈: feedback_message="数据内容过时:数据源 worldbank-open-data 的 update_frequency 标注为 quarterly,但实际数据页面显示最近更新日期为 2024-06-30,已超过 6 个月未更新。建议核实并修正更新频率或标注数据滞后情况" +- 链接失效反馈: feedback_message="链接失效:数据源 china-pbc 的 data_url 返回 404,无法访问数据页面。检索关键词:中国货币供应量" +- 数据过时反馈: feedback_message="数据内容过时:数据源 worldbank-open-data 的 update_frequency 标注为 quarterly,但实际已超过 6 个月未更新" ``` --- From 3f072086f672238d04a3b91563be24c2fa52f43c Mon Sep 17 00:00:00 2001 From: firstdata-dev Date: Wed, 1 Apr 2026 01:35:39 +0800 Subject: [PATCH 6/7] =?UTF-8?q?fix:=20AND=E2=86=92OR=20logic=20(verified),?= =?UTF-8?q?=20draft=20header=20wording,=20add=20OR=20evidence?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Critical fix: - Multiple keywords use OR logic, NOT AND. Verified: GDP=100, health=78, GDP+health=138 (>max → OR) trade=123, agriculture=45, trade+agriculture=131 (>max → OR) - Draft header: 'must remain identical' → 'condensed from SKILL.md, semantics must match' - Added OR logic verification to evidence table --- skills/firstdata/SKILL.md | 2 +- skills/firstdata/mcp-tool-descriptions-draft.md | 7 ++++--- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/skills/firstdata/SKILL.md b/skills/firstdata/SKILL.md index d68ab1e..4a667a9 100644 --- a/skills/firstdata/SKILL.md +++ b/skills/firstdata/SKILL.md @@ -105,7 +105,7 @@ The FirstData MCP server provides 5 tools. 
Below is a reference with usage guide - ⚠️ **Each keyword is matched as an independent substring — pass each search term as a separate array element.** For example, use `["中国", "GDP"]` (173 results) instead of `["中国 GDP"]` (0 results). This is by design to preserve multi-word terms like `"New Zealand"` or `"World Bank"`. - Keyword matching is **substring-based**, not semantic search. Keywords are matched against source metadata fields (name, description, tags, content). - The `domain` parameter uses **substring matching**, not exact enum matching (e.g., `"finance"` matches `"public-finance"`, `"finance"`, `"financial-markets"`). -- No boolean operators (AND/OR/NOT). Multiple keywords in the array are combined with AND logic. +- No boolean operators (AND/OR/NOT). Multiple keywords in the array are combined with **OR logic** (results matching any keyword are returned, deduplicated). - Response time: typically **~1 second**. ### Tool: `get_source` diff --git a/skills/firstdata/mcp-tool-descriptions-draft.md b/skills/firstdata/mcp-tool-descriptions-draft.md index 500ec4a..5ab1afb 100644 --- a/skills/firstdata/mcp-tool-descriptions-draft.md +++ b/skills/firstdata/mcp-tool-descriptions-draft.md @@ -1,8 +1,8 @@ # MCP Tool Descriptions — Server-Side Draft -> **Purpose**: This file contains the exact text to be added to each tool's description in the MCP server Python code. +> **Purpose**: This file contains the text to be added to each tool's description in the MCP server Python code. > After PR review approval, copy these Limitations sections into the server-side tool description strings. -> These texts must remain **identical** to the corresponding sections in `SKILL.md`. +> Content is condensed from `SKILL.md` for server-side use. Semantics must match; formatting may differ slightly for plain-text context. --- @@ -14,7 +14,7 @@ - Each keyword is matched as an independent substring — pass each search term as a separate array element. 
Use ["中国", "GDP"] instead of ["中国 GDP"]. This preserves multi-word terms like "New Zealand" or "World Bank" - Keyword matching is substring-based, not semantic search - domain parameter uses substring matching, not exact enum matching -- No boolean operators (AND/OR/NOT). Multiple keywords use AND logic +- No boolean operators (AND/OR/NOT). Multiple keywords use OR logic (results matching any keyword are returned, deduplicated) - Subject to daily API call quota per token. MCP tool calls do not return remaining quota; use Token verification API (POST /api/token/verify, returns remaining_daily) to check ``` @@ -73,6 +73,7 @@ Each limitation is backed by one of these sources: |---|---| | search_source limit: 1–200 | inputSchema `maximum: 200, minimum: 1` | | Keywords not auto-tokenized | Tested: `["中国 GDP"]` → 0 results; `["中国", "GDP"]` → 173 results | +| Multiple keywords use OR logic | Tested: `["GDP"]`→100, `["health"]`→78, `["GDP","health"]`→138 (>max, confirmed OR) | | Substring matching | Tested: `["中国GDP"]` → 1 result (exact substring); `["GDP"]` → 100 results | | domain substring matching | inputSchema description: "领域关键词,子串匹配" | | get_source silent error | Tested: invalid ID returns `{"id":"xxx","error":"Not found"}` with `isError: false` | From ec3b785374f53c48ebc3c895fc14e09a33c36bd8 Mon Sep 17 00:00:00 2001 From: firstdata-dev Date: Wed, 1 Apr 2026 01:36:14 +0800 Subject: [PATCH 7/7] add: search_source response time (~1s) to draft MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per 明鉴 review: Agent needs response time info for all tools, not just the slow ones, to make informed tool selection decisions. 
--- skills/firstdata/mcp-tool-descriptions-draft.md | 1 + 1 file changed, 1 insertion(+) diff --git a/skills/firstdata/mcp-tool-descriptions-draft.md b/skills/firstdata/mcp-tool-descriptions-draft.md index 5ab1afb..4c2a8f7 100644 --- a/skills/firstdata/mcp-tool-descriptions-draft.md +++ b/skills/firstdata/mcp-tool-descriptions-draft.md @@ -15,6 +15,7 @@ - Keyword matching is substring-based, not semantic search - domain parameter uses substring matching, not exact enum matching - No boolean operators (AND/OR/NOT). Multiple keywords use OR logic (results matching any keyword are returned, deduplicated) +- Response time: typically ~1 second - Subject to daily API call quota per token. MCP tool calls do not return remaining quota; use Token verification API (POST /api/token/verify, returns remaining_daily) to check ```
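Reviewer note on the `get_source` limitations documented in this series: because invalid IDs come back as `{"id": "xxx", "error": "Not found"}` with `isError: false`, callers need a per-item check, and the ~20-ID batching guideline needs a small chunking step. The following is a minimal client-side sketch, assuming the result array has already been parsed into a list of dicts. The helper names `batched` and `partition_results` are hypothetical (they are not part of the FirstData server or any MCP SDK), and how `get_source` itself is invoked is deliberately left out.

```python
from typing import Any, Iterator


def batched(source_ids: list[str], size: int = 20) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` IDs.

    The default of 20 mirrors the practical guideline in SKILL.md;
    it is not a hard limit enforced by the schema.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(source_ids), size):
        yield source_ids[start:start + size]


def partition_results(
    items: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[str]]:
    """Split a get_source result array into found sources and missing IDs.

    Items carrying an "error" key are treated as per-item failures,
    since the response-level isError flag stays false for invalid IDs.
    """
    found = [item for item in items if "error" not in item]
    missing = [item["id"] for item in items if "error" in item]
    return found, missing
```

A caller would loop over `batched(ids)`, pass each group to `get_source`, and run `partition_results` on every returned array before using the data, rather than relying on `isError` alone.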