4 changes: 2 additions & 2 deletions .github/copilot-instructions.md
@@ -122,7 +122,7 @@ Built-in Tools (zoom, crop) + Provider Tools (detect, classify, etc.)
- **Error modes**: retry (default), skip, fail
- **Observability**: Full tracing and metrics integration

**Testing:** 336 comprehensive tests in `test_tool_schema.py`, `test_tool_registry.py`, `test_agent_loop.py`, `test_tool_prompts.py`, `test_tool_call_parser.py`, etc. All passing with zero regressions.
**Testing:** 342 comprehensive tests in `test_tool_schema.py`, `test_tool_registry.py`, `test_agent_loop.py`, `test_tool_prompts.py`, `test_tool_call_parser.py`, etc. All passing with zero regressions.

**Documentation:** See `docs/VLM_TOOL_CALLING_SUMMARY.md` for complete architecture details, design decisions, limitations, and future roadmap.

@@ -152,7 +152,7 @@ pytest tests/test_track_node.py -v # Track graph node (39 tests)

# VLM tool-calling test suites (v1.7.0)
pytest tests/test_tool_schema.py -v # Tool schema (33 tests)
pytest tests/test_tool_registry.py -v # Tool registry (44 tests)
pytest tests/test_tool_registry.py -v # Tool registry (49 tests)
pytest tests/test_agent_loop.py -v # Agent loop (51 tests)
pytest tests/test_tool_prompts.py -v # Tool prompts (18 tests)
pytest tests/test_tool_call_parser.py -v # Tool call parser (51 tests)
2 changes: 2 additions & 0 deletions .gitignore
@@ -167,6 +167,8 @@ runs/
# Dev Test Artifacts
examples/inference/outputs/
bm_test/
example_test.sh
example_test.ps1

# Auto Claude data directory
.auto-claude/
20 changes: 20 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,26 @@ Versions follow [Semantic Versioning](https://semver.org/).

---

## [1.9.1] - 2026-03-08

### Changed

- Refactored graph flow notation from `→` to `>` in all examples, scripts, and documentation for consistency with the DSL operator syntax
- Updated expected output structure descriptions in examples and docs to match the new `>` notation

### Added

- `ToolRegistry` now requires `text_prompts` for zero-shot providers (GroundingDINO, OWL-ViT, CLIP) and raises `ValueError` when they are missing
- Improved tool schema generation: zero-shot providers automatically include a `text_prompts` parameter in their generated `ToolSchema`
- Tests for zero-shot provider detection and `text_prompts` schema requirement in `test_tool_registry.py`
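
The new contract can be sketched as follows — the function and marker names here are illustrative stand-ins, not MATA's actual internals:

```python
# Hypothetical sketch of the zero-shot text_prompts requirement described above;
# names are illustrative, not MATA's real API.
ZERO_SHOT_MARKER = "ZeroShot"

def check_text_prompts(provider_class_name: str, text_prompts) -> None:
    """Raise ValueError when a zero-shot provider is used without the
    text prompts it needs to ground detection."""
    if ZERO_SHOT_MARKER in provider_class_name and not text_prompts:
        raise ValueError(
            f"{provider_class_name} is a zero-shot provider and requires "
            "non-empty text_prompts (e.g. ['cat', 'dog'])."
        )
```

Supervised detectors pass through unchanged; only class names carrying the zero-shot marker trigger the check.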

### Fixed

- SAM adapter: minor issue where prompt-less calls could silently produce empty masks instead of raising a clear error
- Video tracking examples: corrected frame iteration and output path handling in `examples/track/`

---

## [1.9.0] - 2026-03-02

### Added
70 changes: 3 additions & 67 deletions README.md
@@ -1171,72 +1171,7 @@ export MATA_CONFIG=/path/to/config.json

## 🛣️ Roadmap

### ✅ Completed (v1.9.0 - Current)

#### **OCR / Text Extraction** — Five backends, graph nodes, evaluation pipeline

- ✅ **Five OCR backends**: EasyOCR (80+ languages), PaddleOCR (multilingual), Tesseract (classic), GOT-OCR2 (HuggingFace end-to-end), TrOCR (HuggingFace line-level)
- ✅ **`mata.run("ocr", ...)` API**: Unified entry point — `model=` selects backend or HuggingFace ID
- ✅ **`mata.load("ocr", ...)` API**: Returns persistent adapter for repeated inference
- ✅ **`OCRResult` type**: `.full_text`, `.regions` (list of `TextRegion` with bbox + score + text)
- ✅ **Multi-format export**: `.save("out.txt")`, `.save("out.csv")`, `.save("out.json")`, `.save("overlay.png")`
- ✅ **`filter_by_score()`**: Confidence-threshold filtering on OCR results
- ✅ **`OCRText` graph artifact**: Strongly-typed artifact for graph pipelines
- ✅ **`OCR` graph node**: Accepts `Image` or `ROIs` input, aggregates per-crop results with `instance_id` correlation
- ✅ **`ExtractROIs` graph node**: Crops detection regions for downstream OCR
- ✅ **`OCRWrapper`**: Protocol-based capability wrapper enabling OCR as a graph provider
- ✅ **VLM tool integration**: `"ocr"` registered in `ToolRegistry` and `TASK_SCHEMA_DEFAULTS` for agent mode
- ✅ **UniversalLoader routing**: Bare engine names (`"easyocr"`, `"paddleocr"`, `"tesseract"`) routed via `_EXTERNAL_OCR_ENGINES`; HuggingFace OCR IDs routed through `_load_from_huggingface()`
- ✅ **Optional dependencies**: EasyOCR, PaddleOCR, Tesseract declared as optional extras in `pyproject.toml`
- ✅ **71 evaluation tests**: `test_eval_ocr.py` — all passing, zero regressions against 4307+ total
- ⏳ **`mata.val("ocr", ...)` evaluation**: `OCRMetrics` (word accuracy, character accuracy, precision, recall, F1) with COCO-Text JSON dataset loader | PENDING for v1.9.1 release due to dataset licensing review.

### ✅ Completed (v1.8)

#### **Object Tracking** — ByteTrack + BotSort

- ✅ **Vendored ByteTrack**: Zero-dependency implementation in `src/mata/trackers/` (no yolox/ultralytics)
- ✅ **Vendored BotSort**: IoU + Global Motion Compensation (GMC via sparse optical flow)
- ✅ **`mata.track()` API**: One-liner video/stream/webcam/image-dir tracking
- ✅ **`mata.load("track", ...)` API**: Returns `TrackingAdapter` for persistent per-frame tracking
- ✅ **Multiple source types**: Video files, RTSP streams, webcams, image directories, single images
- ✅ **Track ID rendering**: `show_track_ids=True` with deterministic per-track colors
- ✅ **Trajectory trails**: `show_trails=True` — PIL-native polyline history rendering
- ✅ **CSV/JSON export**: MOT-compatible CSV export, multi-frame JSON with metadata
- ✅ **Graph node upgrade**: `Track` node uses vendored trackers, `BotSortWrapper` added
- ✅ **Graph presets**: BotSort variants added to surveillance/driving presets
- ✅ **YAML config**: Tracker settings in `~/.mata/models.yaml` under `track:` task
- ✅ **687 tests**: All passing, zero regressions against 4047+ total

### ✅ Completed (v1.6)

#### **Graph System Architecture** - Multi-task workflows with parallel execution

- ✅ **Artifact Type System**: Strongly-typed vision primitives (Image, Detections, Masks, Keypoints, Tracks, ROIs)
- ✅ **Task Graph Builder**: Fluent API for composing multi-task pipelines (Detect → Segment → Pose)
- ✅ **Parallel Execution**: Automatic parallelization of independent tasks (1.5-3x speedup, 41x in benchmarks)
- ✅ **Conditional Branching**: Result-driven workflow control with If/else, HasLabel, CountAbove, ScoreAbove
- ✅ **Temporal Processing**: Video inference with BYTETrack/IoU tracking and frame policies
- ✅ **Capability Providers**: Protocol-based model registry with lazy loading
- ✅ **VLM Graph Nodes**: VLMDescribe, VLMDetect, VLMQuery, PromoteEntities for Entity→Instance workflows
- ✅ **Visualization Nodes**: Native Annotate and NMS nodes reusing existing PIL/matplotlib backends
- ✅ **Pre-built Presets**: 8 graph presets (detection+segmentation, scene analysis, VLM workflows, tracking)
- ✅ **Observability**: Metrics collection, execution tracing, and provenance tracking
- ✅ **`mata.infer()` API**: New public API for graph execution with flat provider dicts
- ✅ **Backward Compatibility**: 100% compatible with existing `mata.load()`/`mata.run()` APIs
- ✅ **Comprehensive Testing**: 2185 tests, >80% coverage

### ✅ Completed (v1.5.3)

- ✅ **Multi-Task Support**: Detection, classification, segmentation, depth estimation, vision-language models (VLM)
- ✅ **Zero-Shot Capabilities**: CLIP (classify), GroundingDINO/OWL-ViT (detect), SAM/SAM3 (segment)
- ✅ **Vision-Language Models**: Image captioning, VQA, visual understanding with Qwen3-VL - February 2026
- ✅ **Universal Loader**: llama.cpp-style loading with 5-strategy auto-detection
- ✅ **Multi-Format Runtime**: PyTorch, ONNX Runtime, TorchScript, Torchvision support
- ✅ **Torchvision CNN Detection**: Apache 2.0 licensed models (RetinaNet, Faster R-CNN, FCOS, SSD) - February 2026
- ✅ **Export & Visualization**: JSON/CSV/image/crops with dual backends (PIL/matplotlib)
- ✅ **Plugin Removal**: Simplified architecture, -1,268 lines of legacy code
- ✅ **Comprehensive Testing**: 405 tests (exceeded 202+ target), 60-85% coverage
> **For a full history of completed features, see [CHANGELOG.md](CHANGELOG.md).**

### 🔄 In Progress

@@ -1245,7 +1180,7 @@ export MATA_CONFIG=/path/to/config.json
- ⏳ **ReID model integration**: Feature embeddings via HuggingFace ReID models
- ⏳ **Cross-camera tracking**: Match track IDs across camera feeds
- ⏳ **BotSort ReID mode**: Enable `with_reid=true` in botsort config
- **Status**: Planned for v1.9
- **Status**: Planned for v1.9.x

#### **2. KACA Integration** - MIT-licensed CNN detection with PyTorch and ONNX support

@@ -1261,6 +1196,7 @@ export MATA_CONFIG=/path/to/config.json
- 🔄 **Model Recommendations**: Suggest best models based on task and hardware constraints
- 🔄 **Batch Model Download**: Pre-download common models for air-gapped environments
- 🔄 **Enhanced Search**: Filter by task, license, performance metrics
- **Status**: Planned for v2.x

### ⏳ Planned (v2.0 - Q2 2026)

35 changes: 33 additions & 2 deletions docs/VLM_TOOL_CALLING_SUMMARY.md
@@ -1,9 +1,10 @@
# VLM Tool-Calling Agent System — Architecture Summary

**Version**: 1.7.0
**Version**: 1.7.1
**Implementation Date**: February 16, 2026
**Last Updated**: March 8, 2026
**Status**: ✅ Production Ready
**Test Coverage**: 336 comprehensive tests, all passing
**Test Coverage**: 342 comprehensive tests, all passing

---

@@ -270,6 +271,31 @@ mata.infer(

---

### 8. **Provider-Aware Schema Generation for Zero-Shot Models** _(v1.7.1)_

**Decision**: `ToolRegistry` introspects the actual provider at registration time and upgrades `text_prompts` to `required=True` for zero-shot adapters.

**Rationale**:

- **VLM must know `text_prompts` is required** — The default `TASK_SCHEMA_DEFAULTS["detect"]` marks `text_prompts` as optional (correct for supervised detectors like RT-DETR, YOLO). But zero-shot models (GroundingDINO, OWL-ViT) **cannot run without class names**. If the schema shows the parameter as optional, the VLM's system prompt will say _"optional"_ and the agent will omit it, causing a `TypeError` or `InvalidInputError` at execution time.
- **Zero-shot contract is enforced at the adapter level** — `HuggingFaceZeroShotDetectAdapter.predict()` keeps `text_prompts` as a required positional argument. The fix is upstream: make the _schema_ match the adapter's actual contract.
- **Clean detection via class name** — All MATA zero-shot adapters have `"ZeroShot"` in their class name. `_is_zero_shot_provider()` unwraps one layer of wrapper (e.g., `DetectorWrapper.adapter`) and checks the underlying class name — no new class attributes or protocol changes needed.
- **`TASK_SCHEMA_DEFAULTS` stays generic** — The shared default schema is not modified; customization happens per-provider at `ToolRegistry` construction time.

**Agentic chain this enables**:

```
VLM: "I see an unknown object. Let me classify it."
→ classifier(region=[80,120,220,300]) → "cat (0.92)"
VLM: "It's a cat. Let me find all cats using the detector."
→ detector(text_prompts="cat") → 2 cats detected
VLM: "Found 2 cats at [80,120,220,300] and [300,130,440,280]. Summary..."
```

**Implementation**: `_is_zero_shot_provider()` + upgraded `_schema_for_capability(capability, tool_name, provider)` in `src/mata/core/tool_registry.py` (v1.7.1).

---

### 7. **Multi-Format Tool Call Parsing**

**Decision**: Support fenced blocks (` ```tool_call `), XML (`<tool_call>`), and raw JSON.
@@ -560,6 +586,11 @@ result = AgentResult(
- **Impact**: VLMs may output `"0.5"` instead of `0.5` for floats
- **Solution**: Comprehensive type coercion in `validate_tool_call()`
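
The coercion step can be sketched as follows — a simplified stand-in for the real `validate_tool_call()` logic:

```python
def coerce_value(value, target_type):
    """Best-effort coercion for VLM-emitted strings like "0.5" or "true"."""
    if isinstance(value, target_type):
        return value
    if target_type is bool and isinstance(value, str):
        # bool("false") would be True, so strings need explicit handling.
        return value.strip().lower() in ("true", "1", "yes")
    try:
        return target_type(value)  # e.g. float("0.5")
    except (TypeError, ValueError):
        raise ValueError(f"cannot coerce {value!r} to {target_type.__name__}")
```

Values already of the right type pass through untouched, so well-behaved model outputs pay no cost.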

#### ~~4. Zero-Shot Detector Omits `text_prompts`~~ _(Fixed — v1.7.1)_

- **Was**: `TASK_SCHEMA_DEFAULTS["detect"]` marked `text_prompts` as optional, causing the VLM to omit it. Zero-shot adapters require it, so the call failed with `TypeError`.
- **Fix**: `ToolRegistry._schema_for_capability()` now introspects the actual provider via `_is_zero_shot_provider()` and upgrades `text_prompts` to `required=True` for zero-shot adapters. The VLM's system prompt now correctly says the parameter is required, so the agent always populates it from its own reasoning.

---

## Future Directions
4 changes: 2 additions & 2 deletions examples/classify/basic_classification.py
@@ -1,6 +1,6 @@
"""Basic Classification Examples — MATA Framework

Progressive examples: one-shot → load/reuse → model comparison → filtering.
Progressive examples: one-shot > load/reuse > model comparison > filtering.
Run: python examples/classify/basic_classification.py
"""

@@ -31,7 +31,7 @@ def load_and_reuse():
for _ in range(2):
result = classifier.predict(get_image())
top1 = result.get_top1()
print(f" {top1.label_name}: {top1.score * 100:.2f}%")
print(f" to {top1.label_name}: {top1.score * 100:.2f}%")


# === Section 3: Access Results (.get_top1, top-5 predictions) ===
2 changes: 1 addition & 1 deletion examples/classify/clip_zeroshot.py
@@ -150,7 +150,7 @@ def example4_batch_classification():
for name, img in images:
result = classifier.predict(img, text_prompts=text_prompts, top_k=2)
top2 = [(p.label_name, f"{p.score:.4f}") for p in result.predictions]
print(f" {name:15s} {top2}")
print(f" {name:15s} to {top2}")


def main():
6 changes: 4 additions & 2 deletions examples/depth/basic_depth.py
@@ -38,7 +38,7 @@ def example_depth_v1():
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
result.save(OUTPUT_DIR / "depth_v1.png", colormap="magma")
result.save(OUTPUT_DIR / "depth_v1.json")
print(f"Saved {OUTPUT_DIR}/depth_v1.png and depth_v1.json")
print(f"Saved to {OUTPUT_DIR}/depth_v1.png and depth_v1.json")


# === Section 2: One-Shot Depth (Depth Anything V2) ===
@@ -61,7 +61,7 @@ def example_depth_v2():
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
result.save(OUTPUT_DIR / "depth_v2.png", colormap="magma")
result.save(OUTPUT_DIR / "depth_v2.json")
print(f"Saved {OUTPUT_DIR}/depth_v2.png and depth_v2.json")
print(f"Saved to {OUTPUT_DIR}/depth_v2.png and depth_v2.json")


# === Section 3: Load Once, Predict Many ===
@@ -105,6 +105,8 @@ def main():
except Exception as exc:
print(f" [error] load-once: {exc}")

print("\nDone.")


if __name__ == "__main__":
main()
6 changes: 3 additions & 3 deletions examples/detect/basic_detection.py
@@ -87,12 +87,12 @@ def section_export(output_dir: Path):
# Save to .json file
json_path = output_dir / "detections.json"
result.save(str(json_path))
print(f"[export] Saved JSON {json_path}")
print(f"[export] Saved JSON to {json_path}")

# Save annotated image (overlay bboxes on the source image)
img_path = output_dir / "detections_overlay.jpg"
result.save(str(img_path))
print(f"[export] Saved image {img_path}")
print(f"[export] Saved image to {img_path}")


# === Section 6: Config Aliases ===
@@ -104,7 +104,7 @@ def section_config_aliases():
mata.register_model("detect", "my-rtdetr", "PekingU/rtdetr_r50vd", threshold=0.6)

detector = mata.load("detect", "my-detr")
print(f"[alias] Loaded 'my-detr' {detector.__class__.__name__}")
print(f"[alias] Loaded 'my-detr' as {detector.__class__.__name__}")

# Config-file aliases work the same way — set them in .mata/models.yaml
# and load by name without calling register_model() in code.
19 changes: 10 additions & 9 deletions examples/detect/zeroshot_detection.py
@@ -88,7 +88,7 @@ def example_grounding_dino():
output_image = draw_detections(image.copy(), result, text_prompts)
output_path = "examples/images/output_grounding_dino.jpg"
output_image.save(output_path)
print(f"\n Saved visualization to: {output_path}")
print(f"\n Saved visualization to: {output_path}")

return result

@@ -119,7 +119,7 @@ def example_owlvit_v2():
output_image = draw_detections(image.copy(), result, text_prompts)
output_path = "examples/images/output_owlvit_v2.jpg"
output_image.save(output_path)
print(f"\n Saved visualization to: {output_path}")
print(f"\n Saved visualization to: {output_path}")

return result

@@ -153,7 +153,7 @@ def example_batch_processing():
for instance in result.instances:
print(f" - {instance.label_name}: {instance.score:.3f}")

print(f"\n Processed {len(images)} images in batch")
print(f"\n Processed {len(images)} images in batch")

return results

@@ -214,9 +214,9 @@ def example_model_comparison():
print(f" OWL-ViT v2: {len(result_owlv2.instances)} objects")

print("\n[Results] Model comparison:")
print(f" ├─ GroundingDINO: {len(result_gdino.instances)} detections")
print(f" ├─ OWL-ViT v1: {len(result_owlv1.instances)} detections")
print(f" └─ OWL-ViT v2: {len(result_owlv2.instances)} detections")
print(f" - GroundingDINO: {len(result_gdino.instances)} detections")
print(f" - OWL-ViT v1: {len(result_owlv1.instances)} detections")
print(f" - OWL-ViT v2: {len(result_owlv2.instances)} detections")

return result_gdino, result_owlv1, result_owlv2

@@ -238,17 +238,18 @@ def main():
example_model_comparison()

print("\n" + "=" * 70)
print(" All examples completed successfully!")
print(" All examples completed successfully!")
print("=" * 70)
print("\nNext steps:")
print(" 1. Check the output images in examples/images/")
print(" 2. Try with your own images")
print(" 3. Experiment with different text prompts")
print(" 4. Explore the GroundingDINOSAM pipeline: examples/segment/grounding_sam_pipeline.py")
print(" 4. Explore the GroundingDINO then SAM pipeline: examples/segment/grounding_sam_pipeline.py")
print()
print("Done")

except Exception as e:
print(f"\n Error: {e}", file=sys.stderr)
print(f"\n Error: {e}", file=sys.stderr)
import traceback
traceback.print_exc()
sys.exit(1)
2 changes: 1 addition & 1 deletion examples/graph/README.md
@@ -18,7 +18,7 @@ These examples demonstrate the fundamental graph system capabilities:

| Example | Description | Key Features |
| ------------------------------------------- | -------------------------------------------- | ------------------------------------------------ |
| ✅ [simple_pipeline.py](simple_pipeline.py) | Detection → Filter → Segmentation → Fuse | `mata.infer()`, `Graph.then()`, basic pipeline |
| ✅ [simple_pipeline.py](simple_pipeline.py) | Detection > Filter > Segmentation > Fuse | `mata.infer()`, `Graph.then()`, basic pipeline |
| [parallel_tasks.py](parallel_tasks.py) | Parallel detection + classification + depth | `Graph.parallel()`, `ParallelScheduler`, speedup |
| [video_tracking.py](video_tracking.py) | Video processing with object tracking | `VideoProcessor`, `Track`, frame policies |
| [vlm_workflows.py](vlm_workflows.py) | VLM grounded detection & scene understanding | `VLMDetect`, `PromoteEntities`, VLM presets |