perf(semantic): run batch overview generation and file summaries concurrently (#840)
Open
ahmedhesham6 wants to merge 2 commits into volcengine:main
Conversation
**perf(semantic): run batch overview generation and file summaries concurrently**

The semantic processor generates directory overviews by splitting large directories into batches of 50 files and calling the VLM for each batch. Previously, both file summary generation in `_process_memory_directory` and batch overview generation in `_batched_generate_overview` ran sequentially, causing directories with 1000+ files to take 15+ minutes as each VLM call blocked the next.

This change runs both operations concurrently using `asyncio.gather`, bounded by the existing `max_concurrent_llm` semaphore:

- `_process_memory_directory`: changed files now generate summaries in parallel instead of awaiting each one sequentially. Cached summaries are still reused for unchanged files.
- `_batched_generate_overview`: all batch prompts are pre-built, then dispatched concurrently via `asyncio.gather`, with the LLM semaphore controlling concurrency. Batch ordering is preserved via an indexed list.

With `max_concurrent_llm=20`, a 1000-file directory that previously took ~15 minutes for the batch step now completes in ~23 seconds (~40x improvement). The final merge step remains sequential as it depends on all batches completing.
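The commit message describes a specific dispatch pattern: pre-build all prompts, fire them concurrently, and keep batch order via an indexed list. A minimal, hypothetical sketch of that pattern follows; the helper names (`fake_vlm_call`, the prompt format) are stand-ins, not the project's real functions.

```python
import asyncio

async def fake_vlm_call(prompt: str) -> str:
    await asyncio.sleep(0.01)  # stands in for VLM latency
    return f"overview({prompt})"

async def batched_generate_overview(files, llm_sem, batch_size=50):
    # Pre-build every batch prompt before dispatching any calls.
    batches = [files[i:i + batch_size] for i in range(0, len(files), batch_size)]
    prompts = [f"batch {i}: {len(b)} files" for i, b in enumerate(batches)]

    results = [None] * len(prompts)  # indexed list preserves batch order

    async def run_batch(idx, prompt):
        async with llm_sem:  # bounded by max_concurrent_llm
            results[idx] = await fake_vlm_call(prompt)

    await asyncio.gather(*(run_batch(i, p) for i, p in enumerate(prompts)))
    # The final merge stays sequential: it depends on every batch result.
    return "\n".join(results)
```

With `asyncio.Semaphore(20)` and 20 batches, every call runs in a single wave, which is consistent with the runtime dropping to roughly the latency of one VLM call.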
Bortlesboat reviewed on Mar 22, 2026:
…ormatting

Thread `llm_sem` through `_generate_overview` and `_batched_generate_overview` so callers can share a single semaphore across the full pipeline, preventing concurrent calls from exceeding the intended concurrency limit.
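The review suggestion amounts to creating one semaphore at the top of the pipeline and passing it into every coroutine that makes VLM calls, so summary and overview work can never exceed the intended limit combined. A hypothetical illustration, with function names standing in for the PR's real methods:

```python
import asyncio

async def generate_file_summary(name: str, llm_sem) -> str:
    async with llm_sem:  # acquire the shared limit before each VLM call
        await asyncio.sleep(0.01)  # stands in for a VLM call
        return f"summary:{name}"

async def generate_overview(names, llm_sem) -> str:
    async with llm_sem:  # the SAME semaphore bounds overview calls too
        await asyncio.sleep(0.01)
        return f"overview of {len(names)} files"

async def run_pipeline(names, max_concurrent_llm=20):
    # One semaphore, created once, threaded through every callee.
    llm_sem = asyncio.Semaphore(max_concurrent_llm)
    summaries = await asyncio.gather(
        *(generate_file_summary(n, llm_sem) for n in names)
    )
    overview = await generate_overview(names, llm_sem)
    return summaries, overview
```

If each function instead created its own semaphore, concurrent callers could each run `max_concurrent_llm` calls at once, multiplying the effective limit.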
Description
The semantic processor generates directory overviews by splitting large directories into batches of 50 files and calling the VLM for each batch. Previously, both file summary generation in `_process_memory_directory` and batch overview generation in `_batched_generate_overview` ran sequentially, so each VLM call blocked the next. For directories with 1000+ files (common in memory directories like `entities`, `events`, and `preferences`), this caused a single queue item to take 15+ minutes, blocking the entire semantic queue.

This change runs both operations concurrently using `asyncio.gather`, bounded by the existing `max_concurrent_llm` semaphore.

Related Issue
N/A — discovered during production usage with large memory directories (1000+ entity memories).
Type of Change
Changes Made
- `_process_memory_directory`: Changed file summary generation for modified/added files to run concurrently via `asyncio.gather` instead of sequential `await` in a loop. Cached summaries for unchanged files are still reused. Order is preserved via a pre-allocated indexed list.
- `_batched_generate_overview`: All batch prompts are pre-built in the existing loop, then dispatched concurrently via `asyncio.gather`. Each VLM call is bounded by `async with llm_sem` to respect `max_concurrent_llm`. Batch ordering is preserved via an indexed list. The final merge step remains sequential as it depends on all batches completing.

Testing
Ran with `max_concurrent_llm=20` against a directory with 1,214 memory files split into 20 batches.

Before (sequential): `memories/entities` (1,000 files, 20 batches) took ~15 minutes for the batch overview step alone.
After (concurrent): the same directory took ~23 seconds for the batch overview step (~40x improvement).

Verified against `memories/entities`, `memories/cases`, and `memories/patterns`.

Checklist
Additional Notes
The semantic queue processes items sequentially (one at a time). When a single memory directory with 1000+ files enters the queue, it blocks all other items for the duration of its processing. This change does not alter that single-consumer behavior — it only parallelizes the VLM calls within a single queue item.
The `max_concurrent_llm` semaphore (configured via `vlm.max_concurrent` in `ov.conf`) controls the degree of parallelism. The default of 100 is appropriate for most VLM providers. The change is fully backward-compatible: with `max_concurrent_llm=1`, behavior is identical to sequential execution.
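The cache-reuse and backward-compatibility claims can be sketched together: unchanged files take their summaries from a cache, changed files are summarized concurrently under the semaphore, and a `Semaphore(1)` degrades gracefully to sequential execution. This is a hypothetical sketch; the helper names and the cache shape are assumptions, not the project's real API.

```python
import asyncio

async def summarize_file(path: str) -> str:
    await asyncio.sleep(0.01)  # stands in for a VLM call
    return f"summary of {path}"

async def process_memory_directory(paths, cache: dict, llm_sem):
    results = [None] * len(paths)  # indexed list preserves file order
    tasks = []

    for idx, path in enumerate(paths):
        if path in cache:  # unchanged file: reuse the cached summary
            results[idx] = cache[path]
        else:
            async def run(i=idx, p=path):  # defaults pin loop variables
                async with llm_sem:  # bounded by max_concurrent_llm
                    results[i] = await summarize_file(p)
                    cache[p] = results[i]
            tasks.append(run())

    # With asyncio.Semaphore(1) this runs one call at a time, matching
    # the old sequential behavior exactly.
    await asyncio.gather(*tasks)
    return results
```

The default-argument trick (`i=idx, p=path`) pins each iteration's values; closing over the loop variables directly would make every task see the last iteration's path.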