Feature request: --batch-size / --limit flag for embed command #215

## Problem

When running `qmd embed` on a CPU-only machine (no GPU; AVX512 available), the process crashes with a `SessionReleasedError` after processing ~190 chunks. The GGUF embedding model runs out of memory on large collections.

## Current behavior

`qmd embed` attempts to embed all un-embedded chunks in a single run. There is no way to limit how many chunks are processed per invocation.

## Proposed solution

Add a `--batch-size <n>` or `--limit <n>` flag to `qmd embed` that caps the number of chunks processed per run. This would allow:

- CPU-only users to run embed in multiple passes without OOM crashes
- Easy integration with cron jobs that retry automatically
- Graceful degradation on resource-constrained machines
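
To make the intended semantics concrete, here is a minimal Python sketch of a per-run cap. It is not qmd's code: `embed_pending`, `embed_one`, and `simulate_passes` are hypothetical names, and the "store" is a plain dict standing in for whatever qmd uses to track embedded chunks.

```python
from typing import Callable, Iterable, Optional


def embed_pending(
    pending: Iterable[str],
    embed_one: Callable[[str], None],
    limit: Optional[int] = None,
) -> int:
    """Embed at most `limit` pending chunks and return how many were embedded.

    `limit=None` keeps today's behavior (process everything in one run).
    All names here are hypothetical; this only illustrates the proposed flag.
    """
    if limit is not None and limit < 0:
        raise ValueError("limit must be non-negative")
    done = 0
    for chunk in pending:
        if limit is not None and done >= limit:
            break  # cap reached; remaining chunks wait for the next run
        embed_one(chunk)
        done += 1
    return done


def simulate_passes(total_chunks: int, limit: int) -> int:
    """Count how many capped runs are needed to embed `total_chunks` chunks."""
    if limit <= 0:
        raise ValueError("limit must be positive to make progress")
    # False = not yet embedded; a stand-in for the real index.
    store = {f"chunk-{i}": False for i in range(total_chunks)}

    def mark_embedded(chunk: str) -> None:
        store[chunk] = True

    passes = 0
    while True:
        pending = [c for c, embedded in store.items() if not embedded]
        if not pending:
            return passes
        embed_pending(pending, mark_embedded, limit=limit)
        passes += 1
```

With a collection the size of the one described under Environment (~456 vectors) and a cap of 100, `simulate_passes(456, 100)` finishes in 5 runs, each one bounded in size.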

## Workaround

I currently run `qmd embed` repeatedly in a retry loop. Because it skips already-embedded chunks, each pass picks up where the previous one crashed. This works, but it is inelegant.
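
For anyone who needs the same workaround, the retry loop can be sketched as a small Python wrapper. It assumes, without confirmation from the qmd docs, that `qmd embed` exits with a non-zero code when it crashes and with zero once nothing is left to embed. `embed_until_done`, `max_attempts`, and `delay_seconds` are hypothetical names chosen for this example.

```python
import shutil
import subprocess
import sys
import time
from typing import Callable


def run_qmd_embed() -> int:
    """Run `qmd embed` once and return its exit code."""
    # shutil.which honors PATHEXT on Windows, so .exe/.cmd shims are found.
    exe = shutil.which("qmd")
    if exe is None:
        raise FileNotFoundError("qmd was not found on PATH")
    return subprocess.run([exe, "embed"]).returncode


def embed_until_done(
    run_once: Callable[[], int] = run_qmd_embed,
    max_attempts: int = 10,
    delay_seconds: float = 5.0,
) -> bool:
    """Re-run the embed step until it exits cleanly or attempts run out.

    Assumption: exit code 0 means the run completed; anything else is
    treated as a crash worth retrying.
    """
    for attempt in range(1, max_attempts + 1):
        code = run_once()
        if code == 0:
            return True
        print(f"attempt {attempt} exited with code {code}", file=sys.stderr)
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before the next pass
    return False
```

Calling `embed_until_done()` from a scheduled task or cron job gives the retry behavior described above. The attempt cap keeps a persistent failure from looping forever.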

## Environment

- Windows 11, CPU-only (no GPU), AVX512
- Collection: ~175 files, ~456 vectors
- Installed via Bun
