feat(retain): verbatim, chunks modes and named retain strategies#593

Merged
nicoloboschi merged 15 commits into main from feat/verbatim-extraction-mode
Mar 17, 2026
@nicoloboschi commented Mar 16, 2026

Summary

  • `verbatim` extraction mode: each chunk stored as a single memory unit with original text preserved exactly — the LLM still runs to extract entities, temporal info, and location (saves tokens by omitting the `what` field from the LLM response schema)
  • `chunks` extraction mode: completely bypasses the LLM — chunks stored as-is with zero token usage; only user-provided entities are used
  • Named retain strategies: define reusable sets of config overrides (e.g. `"fast"`, `"detailed"`) in bank config; select per-request or per-item via a `strategy` field
  • Per-item strategy: both `MemoryItem` and `FileRetainMetadata` accept a `strategy` field that overrides the request-level strategy for that item
  • Bank config UI: Retain Strategies editor and Default Strategy input added to the bank config page
  • All new modes and strategy fields are configurable globally (env), per-tenant, and per-bank

How it works

verbatim mode

  1. `VERBATIM_FACT_EXTRACTION_PROMPT` instructs the LLM to extract only metadata (`who`, `when`, `where`, `fact_type`, `entities`) — no `what` field in the schema
  2. `_collapse_to_verbatim()` post-processing enforces 1 fact per chunk, overrides `fact_text` with the raw chunk text, merges entities from any extra LLM facts
  3. Rest of the pipeline (embeddings, entity linking, semantic/temporal links) runs unchanged
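
The collapse in step 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `fact_extraction.py`: the `Fact` dataclass and the `collapse_to_verbatim` signature are hypothetical, and the real `_collapse_to_verbatim()` presumably carries more metadata (temporal info, location, `fact_type`).

```python
from dataclasses import dataclass, field


@dataclass
class Fact:
    # Hypothetical stand-in for the extracted-fact model used by the pipeline.
    fact_text: str = ""
    entities: list[str] = field(default_factory=list)


def collapse_to_verbatim(chunk_text: str, facts: list[Fact]) -> Fact:
    """Return exactly one fact for the chunk, carrying the raw chunk text."""
    merged: list[str] = []
    for fact in facts:
        for entity in fact.entities:
            if entity not in merged:  # keep first-seen order, drop duplicates
                merged.append(entity)
    # The fact text always comes from the chunk, never from the LLM output.
    return Fact(fact_text=chunk_text, entities=merged)
```

Even when the LLM returns several facts for one chunk, the result is a single memory whose text is byte-for-byte the chunk, with the union of all extracted entities.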

chunks mode

  • `_extract_facts_chunks()` returns one `ExtractedFact` per chunk using pure Python, no LLM queue, zero token usage
  • User-provided entities from `RetainContent.entities` are the only entity source
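
A rough sketch of the LLM-free path described above. The names `ChunkFact`, `split_into_chunks`, and `extract_facts_chunks` are hypothetical, and the fixed-width splitter is an assumption made only to keep the example self-contained; the real chunker in the retain pipeline may split differently.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkFact:
    # Hypothetical stand-in for ExtractedFact.
    fact_text: str
    entities: list[str] = field(default_factory=list)


def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Naive fixed-width splitter, used here only for illustration."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def extract_facts_chunks(text, chunk_size, user_entities=None):
    """Build one fact per chunk without any LLM call.

    Returns (facts, tokens_used); tokens_used is always 0 on this path.
    """
    entities = list(user_entities or [])
    facts = [
        ChunkFact(fact_text=chunk, entities=list(entities))
        for chunk in split_into_chunks(text, chunk_size)
    ]
    return facts, 0
```

Because no model is involved, the only entities attached to each fact are the ones the caller supplied, mirroring the role of `RetainContent.entities`.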

Named strategies

  • Defined as `retain_strategies: { name: { field: value, ... } }` in bank/tenant config
  • Strategies are additive overrides on top of the resolved config — only hierarchical fields are applied
  • Strategy resolution order: `request.items[i].strategy` → `request.strategy` → `retain_default_strategy` → bank/global config
  • Strategy overrides can include `retain_extraction_mode`, `retain_chunk_size`, `entity_labels`, `entities_allow_free_form`, and any other hierarchical config field
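
To make the resolution order concrete, here is a small sketch. The `effective_strategy` helper is hypothetical, the config values are illustrative only, and treating an empty string as "not set" is an assumption; the config is written as a Python dict mirroring the `retain_strategies: { name: { field: value, ... } }` shape.

```python
from typing import Optional

# Illustrative bank config; strategy names and override values are made up.
bank_config = {
    "retain_default_strategy": "fast",
    "retain_strategies": {
        "fast": {"retain_extraction_mode": "chunks"},
        "detailed": {"retain_extraction_mode": "verbatim", "retain_chunk_size": 2000},
    },
}


def effective_strategy(
    item_strategy: Optional[str],
    request_strategy: Optional[str],
    default_strategy: Optional[str],
) -> Optional[str]:
    """Pick the first strategy name that is set.

    Order: per-item, then request-level, then retain_default_strategy.
    None means no strategy applies and the plain bank/global config is used.
    """
    for name in (item_strategy, request_strategy, default_strategy):
        if name:
            return name
    return None
```

With this config, an item that sets `strategy: "detailed"` is retained in `verbatim` mode even if the request as a whole asks for `"fast"`.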

Changes

  • `config.py`: add `verbatim`, `chunks` to `RETAIN_EXTRACTION_MODES`; add `retain_default_strategy`, `retain_strategies` hierarchical fields
  • `config_resolver.py`: add `apply_strategy()` function; validate empty strategy name keys on update
  • `fact_extraction.py`: add `VerbatimExtractedFact` schema, `VERBATIM_FACT_EXTRACTION_PROMPT`, `_collapse_to_verbatim()`, `_extract_facts_chunks()`
  • `memory_engine.py`: thread `strategy` param through retain pipeline and file convert handler
  • `http.py`: add `strategy` to `MemoryItem`, `FileRetainMetadata`; group items by effective strategy; add `operation_ids` to `RetainResponse`
  • `bank-config-view.tsx`: Retain Strategies editor with named strategy tabs, Default Strategy selector, entity fields merged into strategy form, confirmation dialog for removal
  • `bank-selector.tsx`: strategy field in Add Document dialog (per-file in tabbed collapsible for file uploads)
  • `route.ts` / TypeScript SDK: fix `strategy` forwarding through control plane proxy and SDK layer
  • `openapi.json` + generated TypeScript client: regenerated with `strategy` fields
  • `configuration.md`: document all new modes and strategies with JSON examples
  • `tests/test_retain.py`: unit tests for collapse, strategy application, `chunks` mode, verbatim (LLM), per-item strategy grouping, and end-to-end integration test

Also fixes (pre-existing)

  • `main.py`: missing reranker fields in CLI override block
  • `cross_encoder.py`: ty type error on module attribute assignment

Adds retain_extraction_mode="verbatim" that stores each chunk as-is
without LLM summarization. The LLM still runs to extract entities,
temporal info, and location for full indexability — only the fact text
is replaced with the original chunk content (one memory per chunk).

Useful for RAG-style indexing and benchmarks where original text
must be preserved in memory.

- Add "verbatim" to RETAIN_EXTRACTION_MODES in config.py
- Add VERBATIM_FACT_EXTRACTION_PROMPT with instructions to preserve text
- Add _collapse_to_verbatim() post-processing to enforce 1 fact/chunk
- Expose in bank config UI dropdown with updated description
- Update configuration.md docs with verbatim mode description
- Add unit test for _collapse_to_verbatim and integration test via LLM
- Fix pre-existing main.py CLI override missing new reranker fields
- Fix pre-existing cross_encoder.py ty type error via setattr

Instead of asking the LLM to echo the chunk text back into 'what' and
then discarding it, verbatim mode now uses a dedicated schema
(VerbatimExtractedFact) that omits the 'what' field altogether.
The LLM only returns metadata (entities, temporal info, location, who),
saving output tokens and avoiding any risk of paraphrasing before the
backfill.

- Add VerbatimExtractedFact / VerbatimFactExtractionResponse models
- Verbatim mode skips causal-relations section (nothing to relate causally)
- _extract_facts_from_chunk: allow missing 'what' in verbatim mode,
  set combined_text="" (backfilled by _collapse_to_verbatim)
- Update verbatim prompt to say DO NOT include 'what'

Zero-LLM retain mode: chunks are stored as-is with no LLM call, no
entity extraction, and no temporal indexing. Embeddings still run for
semantic search. User-provided entities via RetainContent.entities
are the sole source of entity data.

Early return placed before the batch-API check so no LLM queue or
concurrency locks are acquired.

- Add "index_only" to RETAIN_EXTRACTION_MODES
- Add _extract_facts_index_only() with pure Python chunking path
- Add to UI dropdown and update description
- Update configuration.md with index_only docs and table entry
- Add unit test asserting zero token usage and exact text preservation

Allows mixing extraction modes in a single bank via named strategies.
Each strategy is a set of hierarchical config overrides (extraction_mode,
chunk_size, entity_labels, entities_allow_free_form, etc.) applied on
top of the resolved bank config at retain time.

- retain_strategies: dict of strategy_name → config overrides (bank config)
- retain_default_strategy: default strategy when none specified (bank config)
- strategy field on /retain request: per-call override
- apply_strategy() in config_resolver applies overrides via dataclasses.replace()
- strategy propagates through retain_batch_async → _retain_batch_async_internal
  and through the async worker task payload
- Any hierarchical field is overridable per strategy, including entity_labels
  and entities_allow_free_form
- Docs updated with strategy configuration example and RRF fairness note
- Unit test for apply_strategy covering overrides, unknown strategy, and
  non-hierarchical field filtering
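
A minimal sketch of how overrides might be applied with `dataclasses.replace()`. The `RetainConfig` class, its default values, and `HIERARCHICAL_FIELDS` are hypothetical, and returning the config unchanged for an unknown strategy is an assumption (the commit only states that the unit test covers that case).

```python
import dataclasses
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RetainConfig:
    # Hypothetical subset of the resolved config; defaults are placeholders.
    retain_extraction_mode: str = "default"
    retain_chunk_size: int = 3000
    api_key: str = "secret"  # stands in for a non-hierarchical field
    retain_strategies: dict = field(default_factory=dict)


# Assumed allow-list of fields a strategy may override.
HIERARCHICAL_FIELDS = {"retain_extraction_mode", "retain_chunk_size"}


def apply_strategy(config: RetainConfig, name) -> RetainConfig:
    """Return a copy of config with the named strategy's overrides applied."""
    overrides = config.retain_strategies.get(name)
    if not overrides:
        return config  # assumption: unknown or empty strategy is a no-op
    # Drop any key that is not hierarchical before applying.
    allowed = {k: v for k, v in overrides.items() if k in HIERARCHICAL_FIELDS}
    return dataclasses.replace(config, **allowed)
```

`dataclasses.replace()` builds a new instance, so the resolved bank config itself is never mutated by a per-request strategy.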
@nicoloboschi force-pushed the feat/verbatim-extraction-mode branch from 93da6c6 to 100c0c8 on March 17, 2026 13:55
- Add `strategy` field to `MemoryItem` so individual items in a retain
  request can override the request-level strategy
- Add `strategy` field to `FileRetainMetadata` for per-file strategy
  override in file retain requests
- Group memory items by effective strategy in `api_retain`; each group
  is processed as a separate batch, results are aggregated
- Thread strategy through `submit_async_file_retain` →
  `_handle_file_convert_retain` → retain task payload
- Add `operation_ids` to `RetainResponse` for async requests with
  mixed per-item strategies
- Add `test_strategy_overrides_extraction_mode_for_index_only`: unit
  test verifying a named strategy with index_only bypasses the LLM
- Add `test_retain_request_per_item_strategy_field`: unit test for
  per-item strategy grouping logic
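
The grouping step can be pictured with the sketch below. It is not the code in `api_retain`: items are modeled as plain dicts, and `group_by_strategy` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Optional


def group_by_strategy(items: list[dict], request_strategy: Optional[str] = None) -> dict:
    """Bucket items by effective strategy, preserving first-seen order.

    An item's own "strategy" wins; otherwise the request-level value is used.
    A key of None means no strategy applies to that group.
    """
    groups = defaultdict(list)
    for item in items:
        groups[item.get("strategy") or request_strategy].append(item)
    return dict(groups)
```

Each resulting group would then be submitted as its own batch, which is why an async request with mixed per-item strategies can yield several operation IDs.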
- Add StrategiesEditor component: per-strategy cards with name input and
  JSON overrides textarea; supports add/remove; validates JSON inline
- Add Default Strategy text input (retain_default_strategy)
- Update RetainEdits type and retainSlice() to include both new fields
- Regenerate OpenAPI spec (retain_strategies, retain_default_strategy,
  per-item strategy on MemoryItem/FileRetainMetadata, operation_ids on
  RetainResponse)
@nicoloboschi changed the title from "feat(retain): add verbatim extraction mode" to "feat(retain): verbatim, index_only modes and named retain strategies" on Mar 17, 2026
…ialog

- Strategy form now includes entity section (free form toggle + entity labels editor)
- Default strategy selector moved outside tab panel, above strategy chips
- Strategy tabs redesigned with underline indicator style for clarity
- Removing a strategy now asks for confirmation via AlertDialog
- Fix tab re-render bug when typing strategy name (skipSyncRef)
- Add strategy field to Add New Document dialog (text + per-file for uploads)
- File upload collapsible uses same Document/Tags/Source tabbed layout
- API: validate empty strategy names in config_resolver
- api.ts: add strategy field to retain and uploadFiles types
- route.ts: extract and forward `strategy` from request body to retainBatch
- TypeScript SDK: accept and forward `strategy` in retainBatch options and per-item
- config_resolver.py: validate empty strategy name keys on update
- bank-config-view.tsx: merge entity fields into RetainStrategyForm, redesign strategy tabs with underline style, add confirmation dialog for removal, fix tab-reset-on-typing with skipSyncRef, move default strategy selector outside panel
- bank-selector.tsx: add strategy field to Add Document dialog (per-file in tabbed collapsible)
- test_retain.py: add end-to-end integration test verifying named strategy application (index_only = 0 LLM tokens)
…t/MemoryItem

- Regenerate OpenAPI spec to include strategy field in RetainRequest and MemoryItem
- Regenerate TypeScript client from updated spec
- Add strategy to MemoryItemInput interface
- Remove (item as any) cast now that strategy is properly typed
@nicoloboschi changed the title from "feat(retain): verbatim, index_only modes and named retain strategies" to "feat(retain): verbatim, chunks modes and named retain strategies" on Mar 17, 2026
@nicoloboschi merged commit e4f8a15 into main on Mar 17, 2026
32 of 38 checks passed