Skip to content

TAURAAI/taura

Repository files navigation

Taura - Instant Visual Recall

Find any photo, PDF page, or document in milliseconds while you type.

Companion Build & Release

Table of Contents

Overview

Taura combines multi-modal embeddings with time and place heuristics to surface the exact media you are thinking of - directly from any text box (overlay) or the desktop companion. Taura also has a keyboard built in Kotlin for Android, which is still a WIP.


Watch the demo video ↗

🏗 Architecture Overview

Core components:

  • Companion App (Tauri v2 / React) - local indexing, UI, optional local vector db.
  • API Gateway (Go + Fiber) - search and sync orchestration, auth, metrics.
  • Embedder (Python FastAPI) - SigLIP So400M (1152-dim), multi-crop + panorama tiling.
  • Postgres + pgvector - primary vector store (1152-d), IVFFlat lists=100.
  • Workers (future) - PDF page rasterization, keyframes, transcripts.

Data model (simplified): users / media / media_vecs (vector[1152]) / auth tables (identities, sessions, API tokens), orgs, audit logs.

🛠 Local Development

Prerequisites

Docker Desktop, Node.js 18+, pnpm, Rust toolchain, Go 1.23+, Python 3.9+.

One-liner (infra + companion only)

pnpm install; pnpm run dev:infra; pnpm run dev:companion

Full Stack (in parallel via Turbo)

pnpm install
pnpm run dev:infra
pnpm run dev:full

Individual Services

pnpm run dev:infra        # Postgres + pgvector
pnpm run dev:api-gateway  # Go API (localhost:8080)
pnpm run dev:embedder     # FastAPI embedder (localhost:9000)
pnpm run dev:companion    # Desktop overlay + settings

Database Schema Apply

psql -h localhost -U postgres -d taura -f packages/schema/pg.sql

Python Setup

curl -LsSf https://astral.sh/uv/install.sh | sh   # Install uv once per machine
cd services/embedder
UV_EXTRA_INDEX_URL="https://download.pytorch.org/whl/cpu" \
  uv sync --python "$(uv python find --max 3.12 --min 3.10)"
source .venv/bin/activate  # On Windows PowerShell use: .venv/Scripts/Activate.ps1
uv run uvicorn app.main:app --reload --port 9000

The extra index URL surfaces CPU-only PyTorch wheels. Remove the variable if you have CUDA drivers locally.

Deployment (current state)

  • API Gateway (Go Fiber): deployed on a dedicated VM behind HTTPS. Probes for pgvector IVFFlat are configurable via SEARCH_IVFFLAT_PROBES (capped at lists=100).
  • Embedding microservice (Python FastAPI): deployed on Runpod A40 GPU pod. Uses SigLIP So400M (1152-dim) with multi-crop and panorama tiling.
  • Postgres + pgvector: managed on the VM or a hosted Postgres instance, with media_vecs dimension 1152 and IVFFlat index lists=100.

🔐 Privacy Modes (Design)

  • Strict-Local: Only metadata + (optionally) text embeddings leave device; images embedded locally (future).
  • Hybrid (default for MVP): Images/PDF pages sent (or presigned) to server for embedding; only vectors + thumbs stored.

🔎 Retrieval Flow

  1. User types a query - API Gateway calls the Embedder to embed text (1152-dim).
  2. Gateway runs pgvector ANN (IVFFlat, cosine, probes configurable) with filters (modality, time range, geo, album) and selects a candidate pool.
  3. Gateway applies lightweight heuristic rerank (time decay or window, geo boosts, modality prior) and returns top N with metadata and thumbnail URIs.
  4. If vector search is low confidence or empty, a keyword fallback (URI, album, source) is used and results are reranked.
  5. Optional: clients can apply additional reranking using @taura-ai/retrieval-core (hybridSearch) if desired.

🧪 API (Currently Implemented)

GET  /healthz                       -> { status }
POST /auth/google                   -> verify Google id_token, upsert user, return { user_id }
POST /user/upsert                   -> create or fetch user by email (simple helper)
POST /sync                          -> JSON batch (per-item inline) (legacy path)
POST /sync/stream (NDJSON)          -> High-throughput streaming ingest + enqueue embeds
POST /search                        -> ANN + rerank + filters + fallback
GET  /stats?user_id=EMAIL|UUID      -> { user_id, media_count, embedded_count, last_indexed_at }

Embedder Service (FastAPI):
GET  /healthz                       -> { status: ok }
POST /warmup                        -> run model warmup (text+image)
POST /embed/text                    -> { vec, diag }
POST /embed/text/batch              -> { vecs[] }
POST /embed/image (multipart|json)  -> { vec, diag }
POST /embed/image/batch             -> { vecs[], errors[], diagnostics[] }

🖥 Overlay UX Shortcuts

  • Esc: hide overlay
  • Click result: open file & hide overlay

📦 Build & Release (Planned CI)

Tagged release (vX.Y.Z) will trigger multi-platform build (Windows .msi/.exe, macOS .dmg/.app, Linux AppImage/Deb/RPM) via GitHub Actions using @tauri-apps/cli. Artifacts uploaded to a draft GitHub Release; optional notarization/signing steps can be added later.

Download installers from GitHub Releases and install:

Local Production Build

cd apps/companion
pnpm build && pnpm tauri:dev # (replace with tauri build once signing configured)

📁 Repo Layout (excerpt)

apps/companion        # Tauri desktop (React + Vite + Tailwind 4)
services/api-gateway  # Go Fiber gateway
services/embedder     # FastAPI embedding service
packages/schema       # SQL migrations / schema
packages/retrieval-core # Typed SDK (client, rerank, sql helpers)

🚀 Roadmap (Next Milestones)

  • Implement /stats handler (media counts, last indexed timestamps)
  • Thumbnail generation + storage (incl. PDF raster pages)
  • PDF page splitting & per-page embedding
  • Add auth token issuance (JWT or PASETO) and session renewal
  • Strict-Local mode (local embedding fallback / model packaging)
  • Metrics + tracing (OpenTelemetry) and p95 dashboards
  • Evaluation harness (Recall@10, MRR) with curated test set
  • Android IME integration calling /search (debounced)
  • Cross-encoder rerank (top-K ~200) optional toggle
  • Rate limiting & abuse protections
  • Keyframe extraction for video, transcript ingestion for audio/video
  • Encryption / secure at-rest local cache (future)

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

No packages published

Contributors 2

  •  
  •