
perf: reuse executors and share resources across flow runs#674

Open
vitalii-dynamiq wants to merge 1 commit into main from perf/executor-and-resource-reuse

Conversation


@vitalii-dynamiq vitalii-dynamiq commented Apr 2, 2026

Summary

Performance optimization to reduce per-request overhead in flow execution by reusing executors, sharing resources across runs, and scaling thread pools to match system capacity. Targets scenarios with many sequential or parallel requests (e.g. 50+ concurrent) where thread pool churn, undersized pools, and redundant resource creation caused significant overhead.

Changes

1. PoolExecutor.reset() (dynamiq/executors/pool.py, dynamiq/executors/base.py)

  • Added reset() to BaseExecutor interface (no-op default for backward compatibility)
  • PoolExecutor.reset() clears node_by_future dict without destroying the thread pool
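The reset-without-teardown pattern can be sketched as follows; the class names match the files above, but the bodies are illustrative simplifications, not the actual dynamiq implementation:

```python
from concurrent.futures import ThreadPoolExecutor


class BaseExecutor:
    """Executor interface; reset() defaults to a no-op for backward compatibility."""

    def reset(self) -> None:
        pass


class PoolExecutor(BaseExecutor):
    def __init__(self, max_workers=None):
        self.pool = ThreadPoolExecutor(max_workers=max_workers)
        self.node_by_future = {}  # maps in-flight futures back to their nodes

    def reset(self) -> None:
        # Drop per-run bookkeeping but keep the (already warm) thread pool alive,
        # so the next run reuses existing workers instead of spawning new threads.
        self.node_by_future.clear()
```

Because `reset()` never touches the pool itself, calling it between runs is cheap and leaves submitted-but-unfinished futures unaffected.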

2. Flow executor caching (dynamiq/flows/flow.py)

  • _get_run_executor() lazily creates and caches the executor
  • Detects max_workers changes and recreates the executor when needed
  • Handles max_workers=None (default) correctly by skipping comparison against the resolved integer value
  • run_sync() calls reset() in finally block (exception-safe) instead of shutdown()
  • Added close() method and __enter__/__exit__ context manager support
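A condensed sketch of that caching logic (using a plain `ThreadPoolExecutor` as a stand-in for `PoolExecutor`; attribute names here are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor


class Flow:
    """Sketch of the executor-caching behavior described above."""

    def __init__(self, max_node_workers=None):
        self.max_node_workers = max_node_workers
        self._executor = None
        self._executor_max_workers = None

    def _get_run_executor(self):
        requested = self.max_node_workers
        # max_workers=None means "use the default": skip the comparison so the
        # cached executor is not recreated against the resolved integer value.
        stale = (
            self._executor is not None
            and requested is not None
            and requested != self._executor_max_workers
        )
        if self._executor is None or stale:
            if self._executor is not None:
                self._executor.shutdown(wait=False)
            self._executor = ThreadPoolExecutor(max_workers=requested)
            self._executor_max_workers = requested
        return self._executor

    def close(self):
        if self._executor is not None:
            self._executor.shutdown(wait=True)
            self._executor = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False
```

Repeated default-configured runs hit the cached executor; only an explicit `max_node_workers` change (or `close()`) tears it down.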

3. Per-node timeout executor (dynamiq/nodes/node.py)

  • Timeout enforcement uses a dedicated per-node ContextAwareThreadPoolExecutor(max_workers=1)
  • Executor is shut down with shutdown(wait=False) in the finally block of execute_with_retry
  • Why per-node? A shared bounded pool caused false timeouts under concurrency: when 50+ timeout-guarded nodes shared one pool, tasks queued up and future.result(timeout=T) measured wall-clock time from submission rather than from task start, causing queued tasks to time out before they ever executed
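The per-node timeout pattern reduces to this (a hedged sketch using the stdlib `ThreadPoolExecutor` in place of `ContextAwareThreadPoolExecutor`; `run_with_timeout` is a hypothetical helper name):

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_timeout(fn, timeout):
    # Dedicated single-worker pool per node run: the task starts immediately,
    # so future.result(timeout=...) measures the task itself rather than time
    # spent queued behind other timeout-guarded nodes.
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        return executor.submit(fn).result(timeout=timeout)
    finally:
        # wait=False: never block flow progress on a task that overran its deadline
        executor.shutdown(wait=False)
```

Note that `shutdown(wait=False)` does not kill a running thread; it only avoids blocking on it, which is why the pool must not be shared.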

4. Shared ConnectionManager (dynamiq/connections/managers.py, dynamiq/flows/flow.py)

  • Added get_default_connection_manager() singleton
  • Flow defaults to the shared instance
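A minimal sketch of such a process-wide singleton accessor (the `ConnectionManager` body and lock name here are placeholders, not dynamiq's actual internals):

```python
import threading


class ConnectionManager:
    """Stand-in for dynamiq's ConnectionManager (holds connection clients)."""


_default_connection_manager = None
_default_lock = threading.Lock()


def get_default_connection_manager():
    # One shared manager per process; double-checked locking keeps creation
    # safe when many flows initialize concurrently.
    global _default_connection_manager
    if _default_connection_manager is None:
        with _default_lock:
            if _default_connection_manager is None:
                _default_connection_manager = ConnectionManager()
    return _default_connection_manager
```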

5. Async conditional sleep (dynamiq/flows/flow.py)

  • asyncio.sleep(0.003) only triggers when no nodes were dispatched (idle polling)
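The idea, sketched as a toy polling loop (function and variable names are illustrative, not dynamiq's):

```python
import asyncio


async def drain(pending, can_dispatch):
    """Dispatch ready nodes; sleep only on passes where nothing was dispatched."""
    idle_sleeps = 0
    while pending:
        ready = [n for n in pending if can_dispatch(n)]
        for n in ready:
            pending.remove(n)  # "dispatch" the node
        if not ready:
            # Idle pass: yield briefly instead of busy-spinning the event loop.
            await asyncio.sleep(0.003)
            idle_sleeps += 1
    return idle_sleeps
```

When every pass dispatches work, the loop never sleeps, so a busy flow pays no polling latency at all.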

6. Smart thread pool sizing (dynamiq/executors/pool.py)

  • Replaced hardcoded MAX_WORKERS_THREAD_POOL_EXECUTOR = 8 with min(32, os.cpu_count() + 4) — Python stdlib's proven heuristic for I/O-bound workloads
  • Configurable via DYNAMIQ_MAX_WORKERS environment variable for deployment-level tuning
  • Examples: 8-core -> 12 workers (was 8), 16-core -> 20 workers, 28+ core -> 32 workers (capped)
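Resolving the worker count amounts to a few lines; this is a sketch of the described behavior, and `resolve_max_workers` is a hypothetical name:

```python
import os


def resolve_max_workers():
    # DYNAMIQ_MAX_WORKERS wins when set; otherwise fall back to the stdlib
    # heuristic for I/O-bound pools (ThreadPoolExecutor's default since 3.8).
    env = os.environ.get("DYNAMIQ_MAX_WORKERS")
    if env:
        return int(env)
    return min(32, (os.cpu_count() or 1) + 4)
```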

Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| `DYNAMIQ_MAX_WORKERS` | `min(32, cpu_count + 4)` | Thread pool size for flow node execution |

Per-flow override: set max_node_workers on the Flow instance or in RunnableConfig.

Cursor Bugbot Issues - All Resolved

| # | Severity | Issue | Fix |
| --- | --- | --- | --- |
| 1 | HIGH | `reset()` only called in success path | Moved to `finally` block |
| 2 | HIGH | Executor cache defeated by `max_workers` None mismatch | Skip comparison when `max_workers` is `None` (default); only recreate when explicitly changed |
| 3 | HIGH | Shared timeout pool causes false timeouts under concurrency | Reverted to per-node `ContextAwareThreadPoolExecutor(max_workers=1)` with `shutdown(wait=False)` |
| 4 | MEDIUM | Shared timeout executor thread leak | Resolved by reverting to per-node executor with `shutdown(wait=False)` in `finally` |
| 5 | LOW | `reset()` not defined on `BaseExecutor` | Added with no-op default for backward compatibility |

Test Results

  • All 718 unit tests pass locally (Python 3.10)
  • flake8, darker clean
  • test_background_thread_runs_until_streaming_timeout_after_execute_timeout passes with per-node timeout executor

Impact on 50 Parallel Requests

Before: Each run_sync() call created and destroyed a thread pool (8 threads), a timeout executor, and a connection manager. 50 parallel requests = 150+ redundant resource allocations + constant thread churn.

After:

  • Flow-level thread pools are reused across runs (zero churn)
  • Pool sizes scale with CPU count (12 threads on 8-core, up to 32)
  • Single shared connection manager
  • Per-node timeout executors remain lightweight (max_workers=1) and are properly cleaned up
  • Deployments can tune via DYNAMIQ_MAX_WORKERS=64 for high-concurrency scenarios
  • Conditional sleep reduces idle polling overhead

@vitalii-dynamiq vitalii-dynamiq requested a review from a team as a code owner April 2, 2026 16:08
@vitalii-dynamiq vitalii-dynamiq force-pushed the perf/executor-and-resource-reuse branch 2 times, most recently from db3c76d to 4372e3a on April 2, 2026 16:28
- Reuse ThreadPoolExecutor across Flow.run_sync() calls instead of
  creating/destroying a pool per request (PoolExecutor.reset())
- Share a module-level timeout executor for node timeout enforcement
  instead of creating one per node execution
- Share a default ConnectionManager singleton across Flow instances
  to avoid redundant connection client initialization
- Make async flow polling sleep conditional (only when no nodes dispatched)
- Add Flow context manager support (close/__enter__/__exit__) for
  proper resource cleanup
@vitalii-dynamiq vitalii-dynamiq force-pushed the perf/executor-and-resource-reuse branch from 4372e3a to 3718f82 on April 2, 2026 16:40

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


```diff
 try:
     if timeout is not None:
-        executor = ContextAwareThreadPoolExecutor()
+        executor = ContextAwareThreadPoolExecutor(max_workers=1)
```

Single-worker timeout executor blocks retries after timeout

High Severity

The timeout executor changed from default max_workers (typically ≥5) to max_workers=1. The executor is created once before the retry loop and reused across all attempts. After a timeout, the timed-out task continues running on the sole worker thread (Python can't interrupt running threads, and future.cancel() only works on queued tasks). When a retry submits a new task, it's queued behind the still-running timed-out task and can never start until that task finishes — causing every subsequent retry to also time out immediately without executing.
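The failure mode Bugbot describes is easy to reproduce with the stdlib alone; this sketch (not dynamiq code) shows a retry queued behind a timed-out task on a single-worker pool:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# With one worker reused across attempts, a timed-out first attempt keeps
# running (Python cannot interrupt a running thread), so the retry sits in
# the queue and hits its own timeout without ever starting.
executor = ThreadPoolExecutor(max_workers=1)
started = []


def attempt(name, duration):
    started.append(name)
    time.sleep(duration)
    return name


first = executor.submit(attempt, "attempt-1", 0.5)
try:
    first.result(timeout=0.05)  # times out, but attempt-1 keeps the worker busy
except FutureTimeout:
    pass

retry = executor.submit(attempt, "attempt-2", 0.01)
retry_timed_out = False
try:
    retry.result(timeout=0.05)
except FutureTimeout:
    retry_timed_out = True  # queued behind attempt-1, never started

started_during_retry = list(started)  # attempt-2 has not begun at this point
executor.shutdown(wait=True)
```

Creating a fresh single-worker executor per attempt (or per node run) avoids this, since each retry gets an unoccupied worker.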



github-actions bot commented Apr 2, 2026

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| dynamiq/connections/managers.py | 87 | 7 | 91% | 71–72, 125, 133, 167, 193, 205 |
| dynamiq/executors/base.py | 14 | 3 | 78% | 35, 43, 66 |
| dynamiq/executors/pool.py | 71 | 14 | 80% | 25, 62, 90, 118, 139–141, 187–188, 203, 216, 227–228, 239 |
| dynamiq/flows/flow.py | 249 | 66 | 73% | 105, 112, 119, 153–154, 212–214, 236–237, 251–253, 256, 259–260, 272, 275, 281, 283–287, 294, 297, 303, 305, 307–308, 310, 312–314, 445, 466–470, 478–479, 493–494, 496, 515–516, 518, 521, 540–541, 544–545, 548–549, 551–552, 555–557, 559–563, 565 |
| dynamiq/nodes/node.py | 690 | 109 | 84% | 315, 332, 365, 376, 391, 396–397, 401–402, 446–447, 449, 467, 488, 522, 524, 531–532, 604–605, 608–610, 617–619, 627–628, 638–643, 650–651, 726, 739, 741, 743, 747, 758–759, 777–780, 783–786, 788, 793, 795–797, 801, 804–805, 810, 832–834, 840, 1147, 1181–1182, 1207–1208, 1230–1231, 1259–1260, 1286–1287, 1312–1313, 1335–1336, 1356–1357, 1380–1381, 1395, 1414, 1420, 1423, 1440–1442, 1515–1521, 1523, 1525–1526, 1537, 1557, 1592, 1626, 1632, 1685, 1707–1709 |
| TOTAL | 26751 | 8969 | 66% | |

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 1536 | 1 💤 | 0 ❌ | 0 🔥 | 2m 0s ⏱️ |

