Modern, reactive web interface for exploring Ollama models, browsing a scraped public catalog, pulling variants with streaming progress, and managing locally installed models.
What exactly is this? You'll find a short, visual explanation in what_is/WHAT_IS.md (incl. screenshots).
- Features
- Repository Layout
- Prerequisites
- Quick Start (UI Only)
- Host Resolution Logic
- API Routes Overview
- Frontend Architecture
- Python Scraper
- Development Workflow
- Deployment
- Troubleshooting
- Roadmap / Ideas
- Contributing
- License
- At A Glance
- Disclaimer / Infos
- Release Notes
- Browse locally installed Ollama models (name, size, digest, modified date)
- Pull / re-pull models (streamed NDJSON progress with derived percentage)
- Delete installed models
- Searchable remote model catalog (slug, name & capabilities filtering)
- Expandable variant lists with size info and one-click pull
- Global pull lock (avoids concurrent overwriting / race conditions)
- Host configuration (cookie + header + env fallback resolution)
- Consistent gradient UI theme + custom scrollbars
- Toast notifications (success / error / info)
- Lightweight state management with Zustand & React Query caching
- Python scraper (separate directory) to periodically refresh the catalog JSON
- Model playground to test two models at the same time
ollama-ui/ # Next.js (App Router) application
src/app/ # Pages & API routes
src/lib/ # Environment + utility helpers
src/store/ # Zustand stores (pull logs, toast, etc.)
models.json # Scraped catalog file (copied/updated manually)
Scraper/ # Python async scraper producing models.json
You run / build only inside ollama-ui/. The Python scraper is optional and only needed when you want to regenerate the catalog file.
- Node.js 18.18+ or 20+ (recommended LTS)
- pnpm (preferred) OR npm / yarn / bun
- Python 3.11+ (only if you run the scraper)
- A reachable Ollama server (local or remote) exposing its HTTP API (`/api/pull`, `/api/tags`, etc.)
cd ollama-ui
pnpm install # or npm install / yarn
pnpm dev        # start dev server on http://localhost:3000

If you already have an Ollama instance running locally at the default fallback (see below), the Installed Models list should populate. Otherwise set the host in the UI or via environment.
Order of precedence (first valid wins):
- Request header: `x-ollama-host`
- Browser cookie: `ollama_host` (set via the Host form)
- Environment: `OLLAMA_HOST` or `NEXT_PUBLIC_OLLAMA_HOST`
- Hardcoded fallback in `src/lib/env.ts`
Validation enforces a full http:// or https:// URL.
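For orientation, a minimal sketch of this precedence chain in TypeScript (the function name, helper, and fallback URL are hypothetical; the real logic lives in src/lib/env.ts and may differ):

```ts
// resolveOllamaHost.ts: illustrative sketch only; names and the fallback URL are assumptions.
const FALLBACK_HOST = "http://localhost:11434"; // hypothetical hardcoded fallback

function isValidHost(value?: string | null): value is string {
  return !!value && /^https?:\/\//.test(value); // must be a full http(s) URL
}

export function resolveOllamaHost(req: Request): string {
  // 1. Explicit request header wins
  const header = req.headers.get("x-ollama-host");
  if (isValidHost(header)) return header;

  // 2. Cookie set via the Host form
  const rawCookie = req.headers.get("cookie")?.match(/(?:^|;\s*)ollama_host=([^;]+)/)?.[1];
  const cookie = rawCookie ? decodeURIComponent(rawCookie) : undefined;
  if (isValidHost(cookie)) return cookie;

  // 3. Environment variables
  const env = process.env.OLLAMA_HOST ?? process.env.NEXT_PUBLIC_OLLAMA_HOST;
  if (isValidHost(env)) return env;

  // 4. Hardcoded fallback
  return FALLBACK_HOST;
}
```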
Use the Host box on the Models page, enter the full URL (e.g. http://localhost:11434) and press "Set host". The cookie persists for 7 days.
Create .env.local in ollama-ui/:
OLLAMA_HOST=http://localhost:11434
Restart dev server.
Send a custom header (useful for testing):
curl -H "x-ollama-host: http://other-host:11434" http://localhost:3000/api/models
Base path: /api
| Route | Method | Purpose | Notes |
|---|---|---|---|
| `/api/models` | GET | List installed models + tags | Wraps Ollama `/api/tags` (server-side implementation not shown here) |
| `/api/models/pull` | POST | Stream pull of a model or model:variant | Returns NDJSON; enriches lines with a percentage when possible |
| `/api/models/delete` | POST | Remove a model | Body: `{ model: "name" }` |
| `/api/models/catalog` | GET | Filtered catalog from `models.json` | Query: `q`, `limit` (0 = all) |
| `/api/config/ollama-host` | GET/POST | Get or set resolved host | POST body: `{ host: string }` |
| Other routes (chat, stream, lamas, ps, tools/*) | – | Additional functionality (not all documented yet) | Future docs TBD |
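As a usage example, the catalog route can be called from any HTTP client. A small TypeScript sketch, assuming the UI runs on http://localhost:3000 and using only the documented `q` / `limit` query parameters (the helper name and response shape are assumptions):

```ts
// Hypothetical helper: fetch up to `limit` catalog entries matching `query`.
async function searchCatalog(query: string, limit = 20) {
  const url = `http://localhost:3000/api/models/catalog?q=${encodeURIComponent(query)}&limit=${limit}`;
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Catalog request failed: ${res.status}`);
  return res.json(); // entries mirror models.json (see the Python Scraper section)
}

searchCatalog("llama").then((models) => console.log(models));
```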
`/api/models/pull` emits newline-delimited JSON objects. Each line may contain:
{ status, digest?, total?, completed?, percentage? }
If total & completed exist but percentage is missing, the proxy computes and injects it.
Client logic (React) merges these events into a progress bar; a final { done: true } is appended.
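A minimal sketch of consuming that stream in the browser, assuming the event shape above (the request body `{ model }` and the helper name are assumptions, not the actual client code):

```ts
// Hypothetical client-side consumer of the NDJSON pull stream.
async function pullModel(model: string, onProgress: (pct: number) => void) {
  const res = await fetch("/api/models/pull", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model }), // body shape is an assumption
  });
  if (!res.body) throw new Error("No response body to stream");

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // Process complete lines; keep any trailing partial line in the buffer.
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      if (!line.trim()) continue;
      const event = JSON.parse(line); // { status, digest?, total?, completed?, percentage?, done? }
      const pct =
        event.percentage ??
        (event.total && event.completed ? (event.completed / event.total) * 100 : undefined);
      if (pct !== undefined) onProgress(pct);
      if (event.done) return;
    }
  }
}
```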
- Next.js App Router: Server + edge runtime mixing (pull uses Edge for low latency, catalog read uses Node for FS access).
- React Query: Data caching & stale control for models and catalog.
- Zustand Stores: Lightweight stores for pull logs & toast queue.
- Streaming: Manual `ReadableStream` consumption with incremental parsing of NDJSON lines.
- Styling: Tailwind CSS (v4) + custom gradients + scrollbar styling (WebKit + Firefox).
- Components: Reusable `<Button />` with variants (`primary`, `outline`, `danger`, etc.).
State highlights:
- `anyPullActive` prevents concurrent pulls.
- `expandedVariants[slug]` toggles the full variant list per model.
- Progress derived from the last event for the active model.
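A rough sketch of what such a Zustand store could look like; everything beyond `anyPullActive` and `expandedVariants` is a hypothetical name, not the actual store in src/store/:

```ts
import { create } from "zustand";

// Hypothetical pull-state store mirroring the highlights above; not the real implementation.
interface PullState {
  anyPullActive: boolean;
  expandedVariants: Record<string, boolean>;
  lastEvent?: { status: string; completed?: number; total?: number; percentage?: number };
  startPull: () => void;
  finishPull: () => void;
  toggleVariants: (slug: string) => void;
  recordEvent: (event: PullState["lastEvent"]) => void;
}

export const usePullStore = create<PullState>((set) => ({
  anyPullActive: false,
  expandedVariants: {},
  startPull: () => set({ anyPullActive: true }),
  finishPull: () => set({ anyPullActive: false, lastEvent: undefined }),
  toggleVariants: (slug) =>
    set((s) => ({ expandedVariants: { ...s.expandedVariants, [slug]: !s.expandedVariants[slug] } })),
  recordEvent: (event) => set({ lastEvent: event }),
}));
```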
Location: Scraper/
Purpose: Crawl public model pages, produce models.json with:
- `scraped_at`
- For each model: `slug`, `name`, `pulls`, `pulls_text`, `capabilities[]`, `blurb`, `description`, `tags_count`, `variants[]` (each variant: tag, size, size_text, context tokens, input tokens)
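Expressed as TypeScript types for orientation (field names come from the list above; optionality, the variant field names for context/input tokens, and the top-level `models` key are assumptions):

```ts
// Approximate shape of models.json, for orientation only; optionality is assumed.
interface CatalogVariant {
  tag: string;
  size?: number;       // raw size (unit assumed)
  size_text?: string;  // human-readable size
  context?: string;    // context tokens (field name assumed)
  input?: string;      // input tokens (field name assumed)
}

interface CatalogModel {
  slug: string;
  name: string;
  pulls?: number;
  pulls_text?: string;
  capabilities: string[];
  blurb?: string;
  description?: string;
  tags_count?: number;
  variants: CatalogVariant[];
}

interface Catalog {
  scraped_at: string;     // timestamp of the scrape
  models: CatalogModel[]; // top-level key name is an assumption
}
```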
cd Scraper
python -m venv .venv && source .venv/bin/activate # one time
pip install -r requirements.txt
python ollama_scraper.py # full scrape
python ollama_scraper.py --limit 50   # first 50 models for a quick test

Output: out/models.json. Copy or move that file into ollama-ui/models.json (overwrite existing) so the catalog endpoint serves it.
Use cron or a CI workflow to periodically update the file. Example cron entry (daily at 02:30):
30 2 * * * /usr/bin/bash -lc 'cd /path/to/repo/Scraper && source .venv/bin/activate && python ollama_scraper.py && cp out/models.json ../ollama-ui/models.json'
Common scripts:
pnpm dev # start dev w/ Turbopack
pnpm build # production build
pnpm start # run built app
pnpm lint # eslint (uses flat config)
pnpm format    # prettier write

After updating models.json, no restart is strictly required (the catalog route reads the file on each request), and the browser cache is bypassed anyway (`cache: 'no-store'`). Just refresh.
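As a rough illustration of why no restart is needed, a catalog route can read models.json from disk on every request. A sketch under that assumption (not the project's actual route code; the `catalog.models` access and filter logic are assumptions):

```ts
// app/api/models/catalog/route.ts: illustrative sketch, not the project's actual code.
import { promises as fs } from "node:fs";
import path from "node:path";
import { NextResponse } from "next/server";

export const runtime = "nodejs"; // filesystem access requires the Node runtime

export async function GET(req: Request) {
  const { searchParams } = new URL(req.url);
  const q = (searchParams.get("q") ?? "").toLowerCase();
  const limit = Number(searchParams.get("limit") ?? "0");

  // Read models.json fresh on every request, so a file update is picked up without a restart.
  const raw = await fs.readFile(path.join(process.cwd(), "models.json"), "utf8");
  const catalog = JSON.parse(raw);

  let models = (catalog.models ?? []).filter(
    (m: { slug: string; name: string; capabilities?: string[] }) =>
      !q ||
      m.slug.toLowerCase().includes(q) ||
      m.name.toLowerCase().includes(q) ||
      (m.capabilities ?? []).some((c) => c.toLowerCase().includes(q)),
  );
  if (limit > 0) models = models.slice(0, limit);

  return NextResponse.json(models, { headers: { "Cache-Control": "no-store" } });
}
```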
You can deploy like any standard Next.js app (Vercel, Docker, etc.). Requirements:
- Ensure `models.json` is present in the build output (it is read at runtime, so keep it in the project root of the app).
- Provide the `OLLAMA_HOST` environment variable or rely on a user-set cookie.
- If deploying serverless, note: the catalog route uses the Node runtime (filesystem). Ensure the hosting platform supports reading that static file at runtime.
This repository now includes a multi-stage Dockerfile at the repo root that:
- Builds the Next.js app (standalone) with Node 20.
- Uses the official `ollama/ollama:latest` image as the final base.
- Copies the standalone server + static assets + `models.json`.
- Starts both Ollama (`ollama serve`) and the UI (`node server.js`) via `start.sh`.
Build & run:
docker build -t ollama-ui:latest .
docker run --rm -p 11434:11434 -p 3000:3000 ollama-ui:latest

Then open http://localhost:3000 (UI) and the Ollama API at http://localhost:11434.
To persist Ollama models and the UI database outside the container, mount host directories as volumes:
docker run --rm -p 11434:11434 -p 3000:3000 \
-v /path/to/ollama-models:/root/.ollama \
-v /path/to/ollama-ui-data:/app/data \
ollama-ui:latest

- `/root/.ollama`: stores all pulled Ollama models (can be reused across containers/updates)
- `/app/data`: stores the SQLite database (`app.db`) for UI state (profiles, logs, etc.)
Docker Compose Example:
services:
ollama-ui:
image: ollama-ui:latest
build: .
ports:
- "11434:11434"
- "3000:3000"
volumes:
- /path/to/ollama-models:/root/.ollama
- /path/to/ollama-ui-data:/app/data
volumes: {}

Override the default host the UI uses:
docker run --rm -e OLLAMA_HOST=http://localhost:11434 -p 11434:11434 -p 3000:3000 ollama-ui:latest

You can use prebuilt images from GitHub Container Registry (GHCR):
Pull and run:
docker pull ghcr.io/chrizzo84/ollamaui:latest
docker run --rm -p 11434:11434 -p 3000:3000 ghcr.io/chrizzo84/ollamaui:latest

If you want to disable the bundled Ollama server and point only to an external one, you can adapt start.sh to skip `ollama serve` and only run `node server.js`.
Ollama can leverage GPUs inside the same container. Usage differs by platform:
NVIDIA (Linux)

Prerequisites: Install the NVIDIA Container Toolkit on the host.
docker run --rm \
--gpus=all \
-p 11434:11434 -p 3000:3000 \
-v ollama_models:/root/.ollama \
ollama-ui:latest

Limit GPU visibility (e.g. only GPU 0):
docker run --rm --gpus 'device=0' -p 11434:11434 -p 3000:3000 ollama-ui:latest

Docker Compose Example (docker-compose.yml at repo root):
services:
ollama-ui:
image: ollama-ui:latest
build: .
ports:
- "11434:11434"
- "3000:3000"
volumes:
- ollama_models:/root/.ollama
deploy:
resources:
reservations:
devices:
- capabilities: [gpu]
environment:
- OLLAMA_HOST=http://localhost:11434
volumes:
ollama_models:

Apple Silicon (Metal)

Metal acceleration is available natively when running Ollama directly on macOS. Docker GPU passthrough for Metal is not currently supported in the same way; prefer running Ollama on the host and pointing the container UI to it:
docker run --rm -e OLLAMA_HOST=http://host.docker.internal:11434 -p 3000:3000 ollama-ui:latest

AMD ROCm
If your base image / host supports ROCm and ollama/ollama adds ROCm builds in the future, you would expose the devices similarly (e.g. --device=/dev/dri); consult the upstream Ollama documentation.
Verify GPU usage after starting:
docker exec -it <container> ollama ps

Or on the host: nvidia-smi (NVIDIA) while a model runs.
| Symptom | Cause | Fix |
|---|---|---|
| Installed list empty | Wrong host / unreachable Ollama | Set correct host; test curl <host>/api/tags |
| Pull stuck at 0% | Upstream not streaming `completed`/`total` yet | Wait; incomplete events still appear in the log |
| Host not persisting | Cookies blocked | Allow site cookies or set via env variable |
- Persist catalog search & expansion state (localStorage)
- Per-variant progress indicator (when layers known)
- Multi-pull queue (sequential)
- Download speed & ETA estimation
- Keyboard shortcuts (focus search, abort pull)
- Fork & clone
- Create a branch: `feat/my-feature`
- Run `pnpm dev` and implement
- Ensure lint passes: `pnpm lint`
- Open a PR with description & screenshots
Distributed under the MIT License. See the LICENSE file for full text.
| Stack | Key Tools |
|---|---|
| Framework | Next.js App Router (Edge + Node runtime) |
| Data | React Query, NDJSON streaming |
| State | Zustand |
| Styling | Tailwind CSS v4, custom gradients, motion via Framer Motion |
| Backend Integrations | Ollama HTTP API |
| Scraping | Python (httpx, BeautifulSoup, tenacity) |
Happy hacking! Pull, explore, iterate.
Disclaimer: Vibe Coding & Copilot

This app was created exclusively through Vibe Coding, basically just as a test of GPT-5 via GitHub Copilot.
The code is more or less unreviewed, spontaneous, and full of AI magic.
If you find bugs, feel free to keep them or just continue developing with the vibe.
Docker Native Module Challenge: better-sqlite3
The Challenge: Native module better-sqlite3 failed in Docker with "invalid ELF header" error
The Problem: Architecture mismatch between build environment (macOS ARM64) and runtime (Linux ARM64)
Failed Solutions: Standard pnpm rebuild, copying pre-built modules, multi-stage builds
The Solution: Manual runtime compilation using node-gyp with full build dependencies
AI Collaboration: Problem solved through iterative debugging with Claude 3.5 Sonnet
Key Learning: Native modules require careful architecture-specific compilation in containerized environments
# The winning approach: Manual node-gyp compilation at runtime
RUN cd /app/node_modules/.pnpm/better-sqlite3@*/node_modules/better-sqlite3 && \
npm install node-gyp -g && \
node-gyp configure --module_name=better_sqlite3 --module_path=./build && \
node-gyp build

See the latest changes and release notes here.