| 1 | +# Technical Specification: Open WebUI Integration with qr-sampler |
| 2 | + |
| 3 | +## Difficulty: Medium |
| 4 | + |
| 5 | +The core integration is straightforward (adding an Open WebUI service to Docker Compose). The difficulty lies in getting parameter passthrough and user experience right while preserving qr-sampler's deployment flexibility. |
| 6 | + |
| 7 | +## Technical Context |
| 8 | + |
| 9 | +- **Language**: Python 3.10+, YAML (Docker Compose), Markdown |
| 10 | +- **Existing infrastructure**: Deployment profiles in `deployments/` with Docker Compose + `.env.example` + README pattern |
| 11 | +- **Key dependency**: Open WebUI (`ghcr.io/open-webui/open-webui:main`) — a self-hosted ChatGPT-style web UI |
| 12 | +- **Connection method**: Open WebUI connects to vLLM via its OpenAI-compatible API (`/v1` endpoint) |
| 13 | +- **Parameter flow**: Open WebUI request → Filter Function `inlet()` injects `qr_*` keys → vLLM `/v1/chat/completions` → `SamplingParams.extra_args` → qr-sampler `resolve_config()` |
| 14 | + |
| 15 | +## Decisions (User-Confirmed) |
| 16 | + |
| 17 | +1. **Option A selected**: Add Open WebUI to every deployment profile using Docker Compose `profiles: ["ui"]` |
| 18 | +2. **Filter Function included**: Ship a pre-built Open WebUI Filter Function for qr-sampler parameter control via admin Valves UI |
| 19 | +3. **README prominence**: Add a recommended "Try the Web UI" section to the main README |
| 20 | + |
| 21 | +## Architecture Overview |
| 22 | + |
| 23 | +``` |
| 24 | +┌─────────────────────┐ |
| 25 | +│ Open WebUI │ ← Users chat here (port 3000) |
| 26 | +│ (profiles: ["ui"]) │ |
| 27 | +└──────────┬──────────┘ |
| 28 | + │ HTTP (OpenAI-compatible) |
| 29 | + │ |
| 30 | + │ Filter Function injects qr_* keys |
| 31 | + │ into request body before forwarding |
| 32 | + │ |
| 33 | +┌──────────▼──────────┐ gRPC ┌──────────────────┐ |
| 34 | +│ vLLM │ ◄────────────► │ Entropy Server │ |
| 35 | +│ + qr-sampler │ │ (optional) │ |
| 36 | +│ (port 8000) │ │ (port 50051) │ |
| 37 | +└─────────────────────┘ └──────────────────┘ |
| 38 | +``` |
| 39 | + |
| 40 | +## Implementation Approach |
| 41 | + |
| 42 | +### Part 1: Docker Compose profiles (all deployment profiles) |
| 43 | + |
| 44 | +Add an `open-webui` service with `profiles: ["ui"]` to each `docker-compose.yml`. This ensures: |
| 45 | +- `docker compose up` — unchanged behavior, Open WebUI does NOT start |
| 46 | +- `docker compose --profile ui up` — starts Open WebUI alongside everything else |
| 47 | + |
| 48 | +Service definition (identical across profiles): |
| 49 | + |
| 50 | +```yaml |
| 51 | + open-webui: |
| 52 | + image: ghcr.io/open-webui/open-webui:main |
| 53 | + profiles: ["ui"] |
| 54 | + ports: |
| 55 | + - "${OPEN_WEBUI_PORT:-3000}:8080" |
| 56 | + environment: |
| 57 | + OPENAI_API_BASE_URL: "http://vllm:8000/v1" |
| 58 | + OPENAI_API_KEY: "unused" |
| 59 | + WEBUI_AUTH: "${OPEN_WEBUI_AUTH:-false}" |
| 60 | + volumes: |
| 61 | + - open-webui-data:/app/backend/data |
| 62 | + depends_on: |
| 63 | + - vllm |
| 64 | + restart: unless-stopped |
| 65 | +``` |
| 66 | + |
| 67 | +Plus `open-webui-data:` in the `volumes:` section. |
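
The opt-in behavior described in Part 1 can be checked mechanically. The sketch below is illustrative only: it assumes you have captured the output of `docker compose config --services` and `docker compose --profile ui config --services` (one service name per line) as strings. The function names `parse_services` and `check_profile_gating` are hypothetical helpers, not part of qr-sampler.

```python
def parse_services(output: str) -> set[str]:
    """Parse `docker compose config --services` output (one name per line)."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def check_profile_gating(
    default_output: str,
    ui_output: str,
    ui_service: str = "open-webui",
) -> list[str]:
    """Return a list of problems; an empty list means the gating looks right."""
    default_services = parse_services(default_output)
    ui_services = parse_services(ui_output)
    problems: list[str] = []

    # Without the profile, the UI service must not be resolved at all.
    if ui_service in default_services:
        problems.append(f"{ui_service} starts without --profile ui")

    # With the profile, the UI service must be present.
    if ui_service not in ui_services:
        problems.append(f"{ui_service} missing when --profile ui is set")

    # Enabling the profile should only add services, never drop existing ones.
    dropped = default_services - ui_services
    if dropped:
        problems.append(f"services lost under --profile ui: {sorted(dropped)}")

    return problems
```

Running the two `docker compose` commands in each profile directory and feeding their output to `check_profile_gating` should yield an empty list for every profile.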
| 68 | + |
| 69 | +### Part 2: Open WebUI Filter Function for qr-sampler |
| 70 | + |
| 71 | +Open WebUI stores functions in its SQLite database. They cannot be auto-loaded from `.py` files. The approach: |
| 72 | + |
| 73 | +1. **Ship the filter as two files**: |
| 74 | + - `examples/open-webui/qr_sampler_filter.py` — the source code (readable, editable) |
| 75 | + - `examples/open-webui/qr_sampler_filter.json` — Open WebUI import-ready JSON format |
| 76 | + |
| 77 | +2. **Import workflow** (documented in README): |
| 78 | + - Open `http://localhost:3000` → Admin Panel → Functions |
| 79 | + - Click Import → select `qr_sampler_filter.json` |
| 80 | + - Toggle "Global" to apply to all models |
| 81 | + - Configure parameters via the Valves gear icon |
| 82 | + |
| 83 | +3. **Filter function design**: |
| 84 | + - Type: `filter` with `inlet()` method |
| 85 | + - Valves expose all per-request-overridable qr-sampler fields from `_PER_REQUEST_FIELDS` |
| 86 | + - `inlet()` injects `qr_*` keys as top-level fields in the request body |
| 87 | + - vLLM maps unknown top-level keys to `SamplingParams.extra_args` |
| 88 | + - qr-sampler's `resolve_config()` picks them up transparently |
| 89 | + |
| 90 | +**Valve fields** (matching `_PER_REQUEST_FIELDS` in `config.py`): |
| 91 | + |
| 92 | +| Valve | Type | Default | Maps to | |
| 93 | +|-------|------|---------|---------| |
| 94 | +| `enable_qr_sampling` | `bool` | `True` | (controls whether filter injects anything) | |
| 95 | +| `temperature_strategy` | `Literal["fixed", "edt"]` | `"fixed"` | `qr_temperature_strategy` | |
| 96 | +| `fixed_temperature` | `float` | `0.7` | `qr_fixed_temperature` | |
| 97 | +| `top_k` | `int` | `50` | `qr_top_k` | |
| 98 | +| `top_p` | `float` | `0.9` | `qr_top_p` | |
| 99 | +| `sample_count` | `int` | `20480` | `qr_sample_count` | |
| 100 | +| `log_level` | `Literal["none", "summary", "full"]` | `"summary"` | `qr_log_level` | |
| 101 | +| `diagnostic_mode` | `bool` | `False` | `qr_diagnostic_mode` | |
| 102 | + |
| 103 | +Infrastructure fields (`entropy_source_type`, `grpc_*`, `fallback_mode`) are deliberately excluded: they cannot be changed per request and are controlled by environment variables. |
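
A minimal sketch of the injection logic, to make the Valve-to-`qr_*` mapping in the table concrete. It is not the shipped filter. To stay runnable with the standard library alone it uses a `dataclass` as a stand-in for the pydantic `Valves` model that an Open WebUI filter would normally define, and it omits the optional extra arguments Open WebUI can pass to `inlet()`. Letting a `qr_*` key already present in the request win over the Valve value (`setdefault`) is an assumption of this sketch, not something the spec decides.

```python
from dataclasses import asdict, dataclass


@dataclass
class Valves:
    """Stand-in for the filter's Valves; defaults mirror the table above."""

    enable_qr_sampling: bool = True
    temperature_strategy: str = "fixed"  # "fixed" or "edt"
    fixed_temperature: float = 0.7
    top_k: int = 50
    top_p: float = 0.9
    sample_count: int = 20480
    log_level: str = "summary"  # "none", "summary", or "full"
    diagnostic_mode: bool = False


class Filter:
    def __init__(self) -> None:
        self.valves = Valves()

    def inlet(self, body: dict) -> dict:
        """Add qr_* keys as top-level fields of the outgoing request body."""
        if not self.valves.enable_qr_sampling:
            return body  # master switch off: forward the request untouched

        for name, value in asdict(self.valves).items():
            if name == "enable_qr_sampling":
                continue  # controls the filter itself; has no qr_* counterpart
            # Assumption: a qr_* key set explicitly by the caller takes precedence.
            body.setdefault(f"qr_{name}", value)
        return body
```

With default Valves, `Filter().inlet({"model": "m", "messages": []})` returns the same body with seven extra keys, `qr_temperature_strategy` through `qr_diagnostic_mode`.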
| 104 | + |
| 105 | +### Part 3: Documentation |
| 106 | + |
| 107 | +- **`deployments/README.md`**: Add `--profile ui` to the quick start section |
| 108 | +- **Each profile's README**: Add "Web UI (optional)" section with usage + filter import instructions |
| 109 | +- **`README.md`**: Add a prominent "Web UI" section recommending Open WebUI, linking to the filter function and deployment docs |
| 110 | +- **`examples/open-webui/README.md`**: Detailed guide for the filter function (what it does, how to import, how to configure Valves) |
| 111 | + |
| 112 | +## Files to Create |
| 113 | + |
| 114 | +| File | Purpose | |
| 115 | +|------|---------| |
| 116 | +| `examples/open-webui/qr_sampler_filter.py` | Human-readable filter source code | |
| 117 | +| `examples/open-webui/qr_sampler_filter.json` | Open WebUI importable JSON | |
| 118 | +| `examples/open-webui/README.md` | Filter function documentation | |
| 119 | + |
| 120 | +## Files to Modify |
| 121 | + |
| 122 | +| File | Change | |
| 123 | +|------|--------| |
| 124 | +| `deployments/urandom/docker-compose.yml` | Add `open-webui` service + volume | |
| 125 | +| `deployments/urandom/.env.example` | Add `OPEN_WEBUI_PORT`, `OPEN_WEBUI_AUTH` | |
| 126 | +| `deployments/urandom/README.md` | Add "Web UI (optional)" section | |
| 127 | +| `deployments/firefly-1/docker-compose.yml` | Add `open-webui` service + volume | |
| 128 | +| `deployments/firefly-1/.env.example` | Add `OPEN_WEBUI_PORT`, `OPEN_WEBUI_AUTH` | |
| 129 | +| `deployments/firefly-1/README.md` | Add "Web UI (optional)" section | |
| 130 | +| `deployments/_template/docker-compose.yml` | Add `open-webui` service + volume | |
| 131 | +| `deployments/_template/.env.example` | Add `OPEN_WEBUI_PORT`, `OPEN_WEBUI_AUTH` | |
| 132 | +| `deployments/_template/README.md` | Mention UI option | |
| 133 | +| `deployments/README.md` | Add `--profile ui` to quick start | |
| 134 | +| `README.md` | Add prominent "Web UI" section | |
| 135 | + |
| 136 | +## Verification |
| 137 | + |
| 138 | +1. **Compose syntax**: `docker compose --profile ui config` in each profile directory — validates YAML |
| 139 | +2. **Default behavior**: `docker compose config` (no profile) — confirm Open WebUI is NOT listed in resolved services |
| 140 | +3. **Filter function**: Verify `qr_sampler_filter.json` is valid JSON and contains the full source code |
| 141 | +4. **Filter Valves**: Verify that each Valve field name, once prefixed with `qr_`, matches a key that `resolve_config()` accepts (`enable_qr_sampling` is filter-only and has no `qr_*` counterpart) |
| 142 | +5. **Manual test**: If Docker + GPU available — `docker compose --profile ui up`, open `http://localhost:3000`, import filter, chat, check vLLM logs for qr-sampler activity |
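
Checks 3 and 4 lend themselves to a small script. The sketch below makes two assumptions that should be confirmed against a real export: that the import file is a JSON array (or a single object) whose entries carry the filter source in a `content` field, and that the set of accepted `qr_*` keys is supplied by the caller (for example, derived from `_PER_REQUEST_FIELDS`). Both function names are hypothetical.

```python
import json


def check_filter_export(json_text: str, source_text: str) -> list[str]:
    """Check that the export parses and embeds the .py source verbatim."""
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]

    # Assumption: an export is a list of entries; tolerate a single object too.
    entries = data if isinstance(data, list) else [data]
    contents = [e.get("content", "") for e in entries if isinstance(e, dict)]

    problems: list[str] = []
    if not any(c.strip() == source_text.strip() for c in contents):
        problems.append("no entry's 'content' matches the .py source")
    return problems


def check_valve_keys(
    valve_names: list[str],
    accepted_keys: set[str],
    control_only: tuple[str, ...] = ("enable_qr_sampling",),
) -> list[str]:
    """Return the qr_* keys the filter would inject that are not accepted."""
    return sorted(
        f"qr_{name}"
        for name in valve_names
        if name not in control_only and f"qr_{name}" not in accepted_keys
    )
```

An empty list from both functions means the JSON and the `.py` source are in sync and every injected key is one qr-sampler recognizes.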
| 143 | + |
| 144 | +## Data Model / API / Interface Changes |
| 145 | + |
| 146 | +None. This is a deployment infrastructure + documentation addition. No Python source code in `src/qr_sampler/` is modified. No tests are affected. The filter function is an Open WebUI plugin, not part of the qr-sampler package. |