27 changes: 25 additions & 2 deletions .github/copilot-instructions.md
@@ -21,7 +21,7 @@ Circular dependency is intentionally broken by patching `memory._on_evict = load`

All services live on `app.state`: `settings`, `registry`, `runtime_manager`, `session_manager`. Access them in route handlers via `request.app.state.<name>`.

**ModelRegistry** maps HuggingFace repo IDs → tasks in a JSON sidecar (`data_dir/model_registry.json`). Models are fetched from the HF cache via `huggingface_hub`; the registry is populated by `POST /v1/models/pull`.
**ModelRegistry** maps model IDs → `{task, source}` in a JSON sidecar (`data_dir/model_registry.json`). HuggingFace models (`source="hf"`) are fetched via `huggingface_hub`; pip-based OCR backends (`source="pip"`) are installed via `mataserver/core/pip_installer.py`. The registry supports both old flat format (`{"model": "task"}`) and new dict-of-dicts format (`{"model": {"task": "...", "source": "..."}}`), auto-migrating on read. The registry is populated by `POST /v1/models/pull`.
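
The read-time auto-migration described above can be sketched as follows. This is a hypothetical standalone helper, not the actual `ModelRegistry` code; the function name and the source-inference heuristic (IDs containing `/` are HF repos, bare names are pip backends) are assumptions:

```python
def migrate(raw: dict) -> dict:
    """Normalize a registry mapping to the dict-of-dicts format.

    Old flat entries like {"model": "task"} become
    {"model": {"task": "...", "source": "..."}}; entries already in the
    new format pass through unchanged.
    """
    migrated = {}
    for model_id, value in raw.items():
        if isinstance(value, str):  # old flat format: value is the task name
            source = "hf" if "/" in model_id else "pip"
            migrated[model_id] = {"task": value, "source": source}
        else:  # already dict-of-dicts
            migrated[model_id] = value
    return migrated

print(migrate({"datamata/rtdetr-l": "detect"}))
```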

## Key Conventions

@@ -102,4 +102,27 @@ Two-step: `POST /v1/sessions` creates a session → `WS /v1/stream/{session_id}`
| `mataserver/schemas/requests.py` | `InferParams`, `to_mata_kwargs()`, `SUPPORTED_TASKS` |
| `mataserver/core/result_converter.py` | MATA `VisionResult` → `InferResponse` dispatch |
| `mataserver/api/deps.py` | Auth dependencies (HTTP + WebSocket) |
| `mataserver/models/registry.py` | Persistent HF model ID → task map |
| `mataserver/models/registry.py` | Persistent model ID → task + source map |
| `mataserver/core/backend_catalog.py` | Static catalog of pip-based OCR backends |
| `mataserver/core/pip_installer.py` | Pip install helper for non-HF backends |

### Backend Catalog (pip-based backends)

`mataserver/core/backend_catalog.py` is a **static Python catalog** (not JSON/YAML) that maps short backend names to installation metadata. This prevents arbitrary pip installs from user input.

```python
from mataserver.core.backend_catalog import lookup, is_cataloged, get_source_type

entry = lookup("easyocr") # CatalogEntry or None
is_cataloged("easyocr") # True
get_source_type("easyocr") # "pip"
get_source_type("org/model") # "hf"
```

Currently cataloged pip backends: `easyocr`, `paddleocr`, `tesseract`.

When adding a new pip backend:

1. Add a `CatalogEntry` to `_CATALOG` in `backend_catalog.py`.
2. `pull.py` and `mataserver/api/v1/models.py` dispatch automatically.
3. Register a result converter with `@_register("ocr")` in `result_converter.py` if needed.
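
The steps above can be illustrated with a minimal, self-contained version of the catalog pattern. The `CatalogEntry` fields here are assumptions for illustration; consult `backend_catalog.py` for the real definition:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Sketch of a catalog entry; real fields may differ."""
    name: str
    pip_packages: tuple  # packages passed to the pip installer
    default_task: str = "ocr"


# Static dict keyed by short backend name -- the allowlist that
# prevents arbitrary pip installs from user input.
_CATALOG = {
    "easyocr": CatalogEntry("easyocr", ("easyocr",)),
    "paddleocr": CatalogEntry("paddleocr", ("paddlepaddle", "paddleocr")),
    "tesseract": CatalogEntry("tesseract", ("pytesseract",)),
}


def lookup(name: str):
    """Return the CatalogEntry for a cataloged name, else None."""
    return _CATALOG.get(name)


def is_cataloged(name: str) -> bool:
    return name in _CATALOG


def get_source_type(name: str) -> str:
    """Cataloged short names are pip backends; everything else is HF."""
    return "pip" if is_cataloged(name) else "hf"


print(get_source_type("easyocr"))    # pip
print(get_source_type("org/model"))  # hf
```

Adding a new backend is then a one-line change to `_CATALOG`, which is why the dispatch in `pull.py` and the API layer needs no edits.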
20 changes: 20 additions & 0 deletions Dockerfile
@@ -72,4 +72,24 @@ ENV MATA_SERVER_PORT=8110
ENV MATA_SERVER_DATA_DIR=/var/lib/mataserver
ENV PYTHONPATH=/usr/local/lib/python3.11/site-packages

# Optional: Pre-install OCR backends into the image at build time.
# Pre-baking avoids runtime pip installs and removes the need for outbound internet
# access in the container. Uncomment the backends you need.
#
# EasyOCR:
# RUN pip install --no-cache-dir easyocr
#
# PaddleOCR:
# RUN pip install --no-cache-dir paddlepaddle paddleocr
#
# Tesseract (requires system binary + Python binding):
# RUN apt-get update && apt-get install -y --no-install-recommends tesseract-ocr \
# && rm -rf /var/lib/apt/lists/* \
# && pip install --no-cache-dir pytesseract
#
# After installing pip packages, register each backend so it appears in the registry:
# RUN mataserver pull easyocr --task ocr
# RUN mataserver pull paddleocr --task ocr
# RUN mataserver pull tesseract --task ocr

ENTRYPOINT ["mataserver", "serve"]
62 changes: 49 additions & 13 deletions README.md
@@ -93,16 +93,16 @@ curl http://localhost:8110/v1/health

The `mataserver` console script provides commands for server management and model operations.

| Command | Description |
| ------------------------------ | ---------------------------------------------- |
| `mataserver serve` | Start the inference server |
| `mataserver pull <m> --task T` | Download and register a model from HuggingFace |
| `mataserver list` | List all registered models (alias: `ls`) |
| `mataserver show <m>` | Show detailed info for a model |
| `mataserver rm <m>` | Remove a model from the registry |
| `mataserver load <m>` | Preload a model into memory (alias: `warmup`) |
| `mataserver stop <m>` | Unload a model from memory |
| `mataserver version` | Print version (also: `mataserver -v`) |
| Command | Description |
| ------------------------------ | ------------------------------------------------------------------ |
| `mataserver serve` | Start the inference server |
| `mataserver pull <m> --task T` | Download/install and register a model (HuggingFace or pip backend) |
| `mataserver list` | List all registered models (alias: `ls`) |
| `mataserver show <m>` | Show detailed info for a model |
| `mataserver rm <m>` | Remove a model from the registry |
| `mataserver load <m>` | Preload a model into memory (alias: `warmup`) |
| `mataserver stop <m>` | Unload a model from memory |
| `mataserver version` | Print version (also: `mataserver -v`) |

For full usage details, argument references, and examples, see [docs/api.md](docs/api.md#cli).

@@ -167,17 +167,53 @@ curl http://localhost:8110/v1/health
{ "status": "ok", "version": "0.1.0", "gpu_available": false }
```

### Pull a model from HuggingFace
### Pull a model

```bash
# HuggingFace model
curl -X POST http://localhost:8110/v1/models/pull \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{"source": "hf://datamata/rtdetr-l"}'
-d '{"model": "datamata/rtdetr-l", "task": "detect"}'
```

```json
{ "status": "pulling", "model": "datamata/rtdetr-l" }
{ "status": "pulled", "model": "datamata/rtdetr-l" }
```

```bash
# Pip-based OCR backend
curl -X POST http://localhost:8110/v1/models/pull \
-H "Authorization: Bearer your-api-key" \
-H "Content-Type: application/json" \
-d '{"model": "easyocr", "task": "ocr"}'
```

Or via the CLI:

```bash
# HuggingFace detection model (example: RT-DETR with ResNet-18 backbone)
mataserver pull PekingU/rtdetr_r18vd --task detect

# HuggingFace classification model (example: ResNet-50)
mataserver pull microsoft/resnet-50 --task classify

# HuggingFace segmentation model (example: Mask2Former Swin-Tiny trained on COCO)
mataserver pull facebook/mask2former-swin-tiny-coco-instance --task segment

# HuggingFace depth estimation model (example: Depth Anything V2 Small)
mataserver pull depth-anything/Depth-Anything-V2-Small-hf --task depth

# HuggingFace vision-language model (VLM)
mataserver pull Qwen/Qwen3-VL-2B-Instruct --task vlm

# HuggingFace OCR model
mataserver pull stepfun-ai/GOT-OCR-2.0-hf --task ocr

# Pip-installed OCR backends
mataserver pull easyocr --task ocr
mataserver pull paddleocr --task ocr
mataserver pull tesseract --task ocr # requires tesseract system binary
```
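
Taken together, these examples exercise the two install paths. A compact sketch of the dispatch follows; the task names are taken from the examples on this page, but the function and its routing rule are inferred for illustration, not copied from `pull.py`:

```python
# Illustrative routing for `pull <model> --task T`.
SUPPORTED_TASKS = {"detect", "classify", "segment", "depth", "vlm", "ocr"}
PIP_BACKENDS = {"easyocr", "paddleocr", "tesseract"}


def plan_pull(model: str, task: str) -> tuple:
    """Return (source, action) describing how a pull would be fulfilled."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"unsupported task: {task}")
    if model in PIP_BACKENDS:
        # Cataloged short name -> routed through the pip installer.
        return ("pip", f"pip install {model}")
    # Anything else is treated as a HuggingFace repo ID.
    return ("hf", f"download {model} via huggingface_hub")


print(plan_pull("easyocr", "ocr")[0])               # pip
print(plan_pull("datamata/rtdetr-l", "detect")[0])  # hf
```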

### Single-shot inference (base64 JSON)