Cache Ollama embedding dimensions on startup #24

@johnlanda

Summary

The Ollama embedder probes embedding dimensions by making a full embed API call on every mctl serve and mctl up invocation. Cache the discovered dimensions to eliminate 100-400ms of startup latency.

Context

In internal/embedder/ollama.go, NewOllamaEmbedder() calls probeDimensions() which sends a POST to /api/embed with a probe text just to discover the output vector size. This happens every time the embedder is initialized, adding 100-400ms to cold start. The dimensions for a given model are constant and don't need re-probing.

Key files:

  • internal/embedder/ollama.go: NewOllamaEmbedder(), probeDimensions() (line ~113)
  • internal/manifest/manifest.go: manifest config where dimensions could be stored
  • internal/lockfile/lockfile.go: lockfile metadata where dimensions could be cached

Acceptance Criteria

  • Embedding dimensions are cached after first probe (in lockfile metadata or dedicated cache file)
  • Subsequent startups skip the probe API call when cache is valid
  • Cache is invalidated when the Ollama model name changes
  • Fallback to probing if cache file is missing or corrupted
  • Startup with the Ollama provider is measurably faster, by roughly the 100-400ms the probe call currently costs

Technical Approach

  1. After probeDimensions(), store the result in lockfile [meta] section (e.g., embedding_dimensions = 1024) or a separate cache file at ~/.mycelium/cache/ollama-dims.json
  2. On startup, check cache first: if model name matches and dimensions are cached, skip probe
  3. If model name differs or cache is missing, probe and update cache
  4. Lockfile approach is simpler (already read on startup); cache file approach is more decoupled

Dependencies

None — standalone improvement.

Out of Scope

  • Caching Ollama model availability checks (checkModelAvailable)
  • Other embedder initialization optimizations
