feat(jd,taobao,cnki): add JD, Taobao, and CNKI adapters#248

Open
Muuuun wants to merge 5 commits into jackwener:main from Muuuun:feat/jd-taobao-cnki-adapters

@Muuuun Muuuun commented Mar 22, 2026

Summary

  • Add 11 new adapters for JD (京东), Taobao (淘宝), and CNKI (知网)
  • Both JD and Taobao have complete shopping workflows: search, detail, reviews, add-to-cart, cart

JD (京东) — 5 commands

| Command | Description |
| --- | --- |
| jd search <query> | Product search (title, price, shop, SKU) |
| jd detail <sku> | Product detail (ratings, review tags, shop) |
| jd reviews <sku> | User review extraction |
| jd add-cart <sku> | Add to cart via cart.jd.com/gate.action |
| jd cart | View cart contents via JD cart API |

JD uses fully obfuscated CSS classes — extraction uses div[data-sku] attributes and text pattern matching.

Taobao (淘宝) — 5 commands

| Command | Description |
| --- | --- |
| taobao search <query> | Product search with sort options (default/sale/price) |
| taobao detail <id> | Product detail (title, price, shop, location) |
| taobao reviews <id> | User review extraction |
| taobao add-cart <id> | Add to cart via button click |
| taobao cart | View cart contents |

Taobao uses obfuscated CSS with semantic prefixes (e.g. title--xxx, priceInt--xxx, realSales--xxx). The adapter matches via [class*="prefix--"] selectors. Item IDs are extracted from data-spm-act-id attributes.
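
The prefix-matching approach described above could be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: `prefixSelector` and `SEARCH_FIELDS` are hypothetical names, and only the prefixes themselves (title, priceInt, realSales) come from the PR description.

```typescript
// Hypothetical helper: builds a substring selector for a hashed class name
// such as "title--abc123". Not taken from the PR's source.
function prefixSelector(prefix: string): string {
  // Reject anything that could break out of the quoted attribute value.
  if (!/^[A-Za-z][A-Za-z0-9_-]*$/.test(prefix)) {
    throw new Error(`unexpected class prefix: ${prefix}`);
  }
  return `[class*="${prefix}--"]`;
}

// Field names are illustrative; the prefixes are the ones the PR mentions.
const SEARCH_FIELDS: Record<string, string> = {
  title: prefixSelector('title'),
  price: prefixSelector('priceInt'),
  sales: prefixSelector('realSales'),
};
```

One caveat with `[class*=...]`: it is a substring match, so `title--` would also match a class such as `subtitle--xyz` if the page ever used one. Scoping the query to a single result card would limit that risk.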

Note: Taobao requires login in the automation window.

CNKI (知网) — 1 command

| Command | Description |
| --- | --- |
| cnki search <query> | Chinese academic paper search via oversea.cnki.net |

Test plan

  • npx tsc --noEmit — type check passed
  • npx vitest run src/ — all unit tests passed
  • opencli validate — 86 CLI definitions validated, 0 errors
  • JD: search ✅, detail ✅, reviews ✅, add-cart ✅, cart ✅
  • Taobao: search ✅, detail ✅
  • CNKI: search ✅

🤖 Generated with Claude Code

Mu Qiao and others added 5 commits March 22, 2026 12:29
JD (京东) — full shopping workflow:
- jd/search: product search with price, sales, shop, SKU
- jd/detail: product detail with ratings, review tags, shop info
- jd/reviews: user review extraction
- jd/add-cart: add to cart via gate.action API
- jd/cart: view cart contents via JD cart API

Taobao (淘宝) — search with clean field extraction:
- taobao/search: uses CSS class prefix matching (title--, priceInt--,
  realSales--, shopName--, procity--) to cleanly extract structured
  data from Taobao's obfuscated DOM

CNKI (知网) — overseas portal:
- cnki/search: uses oversea.cnki.net to avoid domestic access
  restrictions, extracts results from search table DOM

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Complete the Taobao shopping workflow to match JD feature parity:
- taobao/detail: product info with title, price, shop, location
- taobao/reviews: user review extraction from product page
- taobao/add-cart: add to cart via button click automation
- taobao/cart: view cart contents

Also update taobao/search to extract item_id from data-spm-act-id
attribute and generate proper item.taobao.com URLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- detail: fix price extraction (regex from text), shop name matching,
  add spec listing
- cart: rewrite using text section parsing (split by "移入收藏"),
  extract price from split ¥/digits lines
- add-cart: fix navigation (go via taobao.com for session cookies)
- reviews: use tmall/taobao rate API (currently returns empty as
  API requires MTOP signing — documented limitation)
- search: extract item_id from data-spm-act-id, generate proper URLs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Reviews: bypass CORS by injecting a <script> tag with JSONP callback
to call rate.tmall.com API directly from the product page context.
Extracts sellerId from page HTML, constructs rate API URL, and
parses the JSONP response for user, content, date, and SKU spec.

Cart: parse cart items by splitting text on "移入收藏" delimiters,
extracting product title, split-digit prices (¥/digits), and specs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add --spec flag for specifying product variants when adding to cart.

Usage:
  opencli taobao add-cart <id> --spec "SG90 180度 小5孔"

- Multiple keywords are space-separated, matched against spec options
- Each spec group selects the option with the most keyword matches
- Without --spec, auto-selects the first available option per group
- Polls for cart confirmation dialog (handles async UI)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
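
The --spec matching rules in the last commit message (space-separated keywords, one pick per spec group, fallback to the first option) can be illustrated with a small sketch. The `SpecGroup` shape and the `pickSpecs` name are assumptions made for this example, as is the tie-breaking rule (the earliest option wins); the PR does not state how ties are resolved.

```typescript
// Hypothetical shape for one variant group on a product page.
interface SpecGroup {
  name: string;
  options: string[]; // assumed non-empty
}

// For each group, pick the option containing the most keywords.
// With no keywords, or no matches, the first option is chosen.
function pickSpecs(groups: SpecGroup[], spec?: string): string[] {
  const keywords = (spec ?? '').split(/\s+/).filter(Boolean);
  return groups.map((group) => {
    let best = group.options[0];
    let bestScore = 0;
    for (const option of group.options) {
      const score = keywords.filter((kw) => option.includes(kw)).length;
      if (score > bestScore) {
        best = option;
        bestScore = score;
      }
    }
    return best;
  });
}
```

Because the fallback silently picks the first option, a caller may want to report which options were chosen before anything is added to the cart.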

@Astro-Han Astro-Han left a comment

Impressive scope — complete shopping workflows for both JD and Taobao, including spec selection. A few concerns, some more serious than others:

page.evaluate injection — user input interpolated into JS strings

Several adapters embed kwargs.sku / kwargs.id directly inside page.evaluate template strings without validation:

// jd/detail.ts
{ field: 'SKU', value: '${kwargs.sku}' }

// taobao/detail.ts
location.href = 'https://item.taobao.com/item.htm?id=${kwargs.id}'

If the value contains a single quote or backtick, it breaks the script; a crafted value can inject arbitrary JS in the page's authenticated context. Since sku and id are always numeric, a simple guard like if (!/^\d+$/.test(kwargs.sku)) throw ... before interpolation would close this.

Affected files: jd/detail.ts, jd/add-cart.ts, taobao/detail.ts, taobao/reviews.ts, taobao/add-cart.ts, taobao/search.ts.
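
A minimal sketch of the suggested guard, assuming it lives in a shared helper that each affected adapter calls before building its script string. `assertNumericId` and `buildTaobaoItemUrl` are hypothetical names, not functions from this PR.

```typescript
// Hypothetical shared guard: returns the value as a string only if it is all digits.
function assertNumericId(value: unknown, name: string): string {
  const text = String(value);
  if (!/^\d+$/.test(text)) {
    // JSON.stringify keeps quotes and control characters visible in the message.
    throw new Error(`${name} must be numeric, got ${JSON.stringify(text)}`);
  }
  return text;
}

// Example use: validate first, then interpolate the validated value.
function buildTaobaoItemUrl(id: unknown): string {
  return `https://item.taobao.com/item.htm?id=${assertNumericId(id, 'id')}`;
}
```

For values that cannot be restricted to digits (a search query, for example), embedding `JSON.stringify(value)` into the script instead of a hand-quoted string is one common way to avoid breaking out of the string literal.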

add-cart — write operations with no dry-run

Both jd/add-cart and taobao/add-cart modify real shopping carts on execution. taobao/add-cart also auto-selects the first available spec when --spec is omitted, which could surprise users. Consider adding a --dry-run flag that shows what would be added without committing the action.

taobao/reviews.ts — JSONP script injection

The JSONP callback creates a global window[cbName] and injects a <script> tag, but neither the script element nor the callback is reliably cleaned up on all paths (success, error, timeout). The 10s timeout and callback deletion can also race. Minor, but worth a cleanup pass.
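
One way to make the cleanup race-free is to route success, error, and timeout through a single "settle once" path. The sketch below is DOM-free so it can be read and run in isolation; `settleOnce` and `Teardown` are hypothetical names, and the wiring to `window[cbName]` and the `<script>` element is left to the caller.

```typescript
type Teardown = () => void;

// Runs the teardown exactly once, whichever of success, error, or timeout
// happens first; any later outcome is ignored.
function settleOnce<T>(
  start: (resolve: (value: T) => void, reject: (err: Error) => void) => Teardown,
  timeoutMs: number,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    let settled = false;
    let teardown: Teardown | undefined;

    const finish = (deliver: () => void): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      if (teardown) teardown();
      deliver();
    };

    const timer = setTimeout(
      () => finish(() => reject(new Error(`timed out after ${timeoutMs} ms`))),
      timeoutMs,
    );

    const cleanup = start(
      (value) => finish(() => resolve(value)),
      (err) => finish(() => reject(err)),
    );
    if (settled) cleanup(); // start() settled synchronously
    else teardown = cleanup;
  });
}
```

In the reviews adapter, `start` would register the global callback, append the script element with its `onerror` wired to `reject`, and return a function that deletes the callback and removes the element.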

Taobao two-step navigation

All 5 Taobao adapters do goto('https://www.taobao.com') → wait(2) → evaluate location.href = target. This is presumably to establish session cookies before navigating to item pages. A brief comment explaining the rationale would help future maintainers understand whether this can be simplified.

jd/cart.ts — hardcoded delivery region

The cart API URL includes area=22_1930_50948_52157, which locks prices/availability to a specific region. Worth documenting as a known limitation or making it configurable.
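
If the region were made configurable, a sketch along these lines might work. `withArea` and `DEFAULT_AREA` are hypothetical names, and the four-segment `digits_digits_digits_digits` format is only inferred from the single value quoted above, so the validation may be too strict.

```typescript
// The value currently hardcoded in the cart URL, per the review comment.
const DEFAULT_AREA = '22_1930_50948_52157';

// Sets or replaces the `area` query parameter on a caller-supplied URL.
function withArea(apiUrl: string, area: string = DEFAULT_AREA): string {
  // Assumed format: four underscore-separated numeric segments.
  if (!/^\d+(_\d+){3}$/.test(area)) {
    throw new Error(`unexpected area code: ${area}`);
  }
  const url = new URL(apiUrl);
  url.searchParams.set('area', area);
  return url.toString();
}
```

A command-line option could then pass the user's region through, with the current value kept as the default so existing behaviour does not change.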

Tests

This is the second consecutive PR (after #243) with zero test coverage. 11 new commands — including 2 write operations — with no E2E entries. Per TESTING.md, browser+auth commands should have entries in browser-auth.test.ts (at minimum verifying graceful failure when not logged in). The add-cart commands especially need test coverage to ensure they don't silently "succeed" in unauthenticated sessions.
