Use the bundled fetcher to sync to the latest Content Taxonomy files from the official IAB GitHub repository. It will locate the latest 2.x and 3.x datasets and normalize them into this tool’s schemas.
@@ -147,11 +156,21 @@ Replace the stub `data/*.json` with your **full IAB catalogs** (include `id`, `l
Gate releases on accuracy deltas so behavior stays stable for audits.

Minimal starter (`scripts/gold.json`):

```json
[{"in_label":"Sports","topic_ids":["483"]}]
```

```python
# scripts/eval.py (toy example): python scripts/eval.py <pred.json> <gold.json>
import json
import sys

def load(path):
    # Map each in_label to its set of topic IDs.
    with open(path) as f:
        return {r.get("in_label"): set(r.get("topic_ids", [])) for r in json.load(f)}

pred, gold = load(sys.argv[1]), load(sys.argv[2])
tp = fp = fn = 0
for label, g in gold.items():  # only labels present in gold are scored
    p = pred.get(label, set())
    tp += len(g & p); fp += len(p - g); fn += len(g - p)
print({"tp": tp, "fp": fp, "fn": fn})
```
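
To gate a release on accuracy deltas, the raw counts printed by the toy evaluator can be turned into precision, recall, and F1 and compared against a stored baseline. The following is a minimal sketch, not part of the tool: the function names `prf` and `gate`, the `max_drop` tolerance of 0.01, and the counts used in the demo are all assumptions made for illustration.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts; 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def gate(current, baseline, max_drop=0.01):
    """Return the metrics that fell more than max_drop below the baseline."""
    return [name for name in baseline if baseline[name] - current[name] > max_drop]


# Made-up counts standing in for two runs of the evaluator.
baseline = prf(tp=90, fp=10, fn=10)
current = prf(tp=85, fp=10, fn=15)

failed = gate(current, baseline)
# A CI job could exit non-zero whenever this list is non-empty.
print("regressed metrics:", failed if failed else "none")
```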

---

## 🛠️ Updating Catalogs
@@ -347,6 +384,34 @@ Commit with a version bump and note `taxonomy_version` in your release notes.

---

## 🔐 Security & operations

- Local-first: processing happens on your machine; no external APIs needed.
- No PII required; CSV/JSON is processed in memory.
- Air-gapped: prebundle the ST model and run `iab-mapper` fully offline.
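
As a rough sketch of the air-gapped setup, the snippet below assumes that "ST model" means a Sentence-Transformers model that has already been copied to a local directory. `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` are the Hugging Face environment variables that disable network lookups; the helper names `configure_offline` and `load_model` are hypothetical, and how `iab-mapper` itself is pointed at the bundled directory is not covered here.

```python
import os
from pathlib import Path


def configure_offline(model_dir):
    """Verify the local model bundle exists and switch Hugging Face libraries offline."""
    path = Path(model_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"model bundle not found: {path}")
    # Set these before the Hugging Face libraries are first imported.
    os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: no network calls
    os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: local files only
    return path


def load_model(model_dir):
    """Load the bundled model (requires the sentence-transformers package)."""
    path = configure_offline(model_dir)
    from sentence_transformers import SentenceTransformer  # lazy import, after env setup
    return SentenceTransformer(str(path))
```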

---

## 🤝 Using Mixpeek API (optional)

If you prefer to manage catalogs, outputs, and audits centrally, you can run the mapping locally and then persist the results via Mixpeek for auditability.

```http
# 1) create collection
POST /collections { "name": "iab-taxonomy" }

# 2) create 'document' with 2.x codes
POST /collections/{id}/documents { "document_id":"iab-2x", "properties": { ... } }

# 3) run taxonomy feature extractor (2.x → 3.0)
POST /collections/{id}/documents/{doc}/features { "extractor":"taxonomy", "params":{"target_version":"3.0"} }

# 4) fetch enriched doc
GET /collections/{id}/documents/{doc}
```
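
The four calls above can be assembled from Python with the standard library alone. This sketch only builds the requests and does not send them: the base URL is a placeholder rather than the real host, authentication is omitted, `build_request` and `plan` are hypothetical helper names, and the `properties` payload is left for the caller to supply because it is elided in the listing.

```python
import json
import urllib.request

BASE_URL = "https://mixpeek.example.invalid"  # placeholder, not the real API host


def build_request(method, path, body=None):
    """Build (but do not send) a JSON request for one step of the sequence."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def plan(collection_id, document_id, properties):
    """Return the four requests in order; collection_id would come from step 1's response."""
    docs = f"/collections/{collection_id}/documents"
    return [
        build_request("POST", "/collections", {"name": "iab-taxonomy"}),
        build_request("POST", docs, {"document_id": document_id, "properties": properties}),
        build_request("POST", f"{docs}/{document_id}/features",
                      {"extractor": "taxonomy", "params": {"target_version": "3.0"}}),
        build_request("GET", f"{docs}/{document_id}"),
    ]


# Empty properties here; fill them in according to your own schema.
for req in plan("COLLECTION_ID", "iab-2x", {}):
    print(req.get_method(), req.full_url)
```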

See also: [Taxonomy Mapper tool](/tools/iab-taxonomy-mapper), [Taxonomy audit tool](/tools/taxonomy-audit), [Video guide](/education/videos/taxonomies-guide), and the landing page at [mxp.co/taxonomy](https://mxp.co/taxonomy).

## 🧯 Troubleshooting
- **No matches:** lower `--fuzzy-cut` or enable `--use-embeddings`.
- **Weird matches:** raise thresholds; add synonyms into `synonyms_*.json`.
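
To see why lowering a fuzzy cutoff turns "no matches" into matches, here is a small illustration using Python's standard `difflib`. It is not the matcher `iab-mapper` uses: the catalog labels are made up, the `fuzzy_match` helper is hypothetical, and the 0..1 cutoff scale is an assumption that may differ from the scale `--fuzzy-cut` expects.

```python
import difflib

CATALOG = ["Sports", "Automotive", "Food & Drink"]  # illustrative labels only


def fuzzy_match(label, catalog, cutoff):
    """Return catalog labels whose similarity ratio to label is at least cutoff (0..1)."""
    lowered = {entry.lower(): entry for entry in catalog}
    hits = difflib.get_close_matches(label.lower(), list(lowered), n=3, cutoff=cutoff)
    return [lowered[hit] for hit in hits]


# "sport" vs "sports" scores about 0.91, so a 0.95 cutoff rejects it and 0.80 accepts it.
print(fuzzy_match("Sport", CATALOG, cutoff=0.95))
print(fuzzy_match("Sport", CATALOG, cutoff=0.80))
```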
@@ -359,10 +424,10 @@ Commit with a version bump and note `taxonomy_version` in your release notes.