
perf: non-blocking tracing dispatch and flow polling optimization#673

Open
vitalii-dynamiq wants to merge 3 commits into main from
perf/tracing-and-flow-polling-optimization

Conversation


@vitalii-dynamiq vitalii-dynamiq commented Apr 2, 2026

Summary

Surgical performance optimizations targeting two high-impact bottlenecks identified under high parallel load (20/50/100 concurrent requests).

Changes

1. Non-blocking trace dispatch (dynamiq/clients/dynamiq.py)

Problem: DynamiqTracingClient.trace() called _send_traces_sync() on the caller thread, issuing a blocking requests.post() with a 60s timeout. Under high concurrency this blocked execution threads and added significant latency to every node/flow completion callback.

Fix:

  • Added a background daemon thread with a SimpleQueue for trace dispatch
  • trace() now enqueues runs instead of blocking on HTTP POST
  • Added close() method + atexit hook for graceful shutdown
  • The async path is unchanged (already non-blocking via loop.create_task)
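
The dispatch pattern described above can be sketched as follows. This is a minimal, self-contained illustration: the `TraceDispatcher` name and the callback wiring are assumptions, not the actual `DynamiqTracingClient` code.

```python
import atexit
import threading
from queue import SimpleQueue

_SENTINEL = object()

class TraceDispatcher:
    """Hypothetical sketch of fire-and-forget trace dispatch."""

    def __init__(self, send_fn):
        self._send = send_fn          # e.g. the blocking HTTP POST
        self._queue = SimpleQueue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()
        atexit.register(self.close)   # flush on interpreter shutdown

    def _run(self):
        while True:
            item = self._queue.get()  # blocks until work arrives
            if item is _SENTINEL:
                break
            self._send(item)

    def trace(self, runs):
        # Fire-and-forget: enqueue instead of blocking on the POST.
        self._queue.put(runs)

    def close(self):
        # Signal the worker and wait briefly for it to drain the queue.
        self._queue.put(_SENTINEL)
        self._worker.join(timeout=5)
```

Because `SimpleQueue` is FIFO, any runs enqueued before `close()` are sent before the sentinel stops the worker.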

2. Reuse httpx.AsyncClient (dynamiq/clients/dynamiq.py)

Problem: request() created a new httpx.AsyncClient context manager per call, paying TCP/TLS handshake overhead every time.

Fix:

  • Lazily create a shared httpx.AsyncClient with double-checked locking
  • Reused across all async trace calls for the lifetime of the client
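
The double-checked locking described above looks roughly like this; `_StubAsyncClient` stands in for `httpx.AsyncClient` so the sketch runs without httpx, and all names are illustrative.

```python
import threading

class _StubAsyncClient:
    """Stand-in for httpx.AsyncClient so this sketch is self-contained."""

class TracingClientSketch:
    def __init__(self):
        self._client = None
        self._lock = threading.Lock()

    def get_client(self):
        # Double-checked locking: the common case skips the lock once the
        # shared client exists; the second check inside the lock prevents
        # two threads from both creating a client.
        if self._client is None:
            with self._lock:
                if self._client is None:
                    self._client = _StubAsyncClient()
        return self._client
```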

3. Conditional flow polling sleep (dynamiq/flows/flow.py)

Problem: time.sleep(0.003) fired unconditionally on every iteration of the sync flow polling loop, even after futures.wait(FIRST_COMPLETED) had already blocked waiting for real work. This added an unnecessary 3ms of latency per iteration.

Fix:

  • Guard time.sleep(0.003) so it only fires when no futures completed (empty results dict)
  • The sleep still prevents busy-waiting when no nodes are ready
  • The async flow's asyncio.sleep(0.003) is left as-is since it properly yields to the event loop
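
The guarded-sleep loop can be sketched as below. The real loop in dynamiq/flows/flow.py tracks node futures; the wait timeout and the task bookkeeping here are assumptions made to keep the example self-contained.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def run_polling(tasks):
    """Run callables and gather results, sleeping only when idle."""
    results = {}
    with ThreadPoolExecutor() as pool:
        pending = {pool.submit(fn): name for name, fn in tasks.items()}
        while pending:
            done, _ = wait(pending, timeout=0.05, return_when=FIRST_COMPLETED)
            completed = {pending.pop(f): f.result() for f in done}
            results.update(completed)
            if not completed:
                # Only back off when nothing finished this iteration, so
                # completed work is processed without the extra 3ms.
                time.sleep(0.003)
    return results
```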

Impact

| Bottleneck | Before | After |
| --- | --- | --- |
| Sync trace flush | Blocks caller thread up to 60s | Fire-and-forget via background thread |
| Async HTTP client | New TCP connection per trace call | Persistent connection pool |
| Flow polling sleep | 3ms added after every iteration | 3ms only when idle (no completed work) |

Testing

  • All 718 unit tests pass (1 pre-existing failure in test_example_yaml_files_load due to missing OPENAI_API_KEY - unrelated)
  • flake8: clean
  • bandit: clean
  • darker (black + isort): clean

Note

Medium Risk
Introduces background-thread/queue-based trace dispatch and shared httpx.AsyncClient lifecycle management, which can affect shutdown behavior, resource cleanup, and trace delivery under concurrency. The flow-loop sleep gating is low risk but changes timing in synchronous execution.

Overview
Improves performance in tracing and synchronous flow execution by making sync DynamiqTracingClient.trace() fire-and-forget via a background daemon thread/SimpleQueue, with new close()/aclose() cleanup and an atexit hook to flush on shutdown.

Reworks async HTTP usage to reuse a lazily created shared httpx.AsyncClient, including loop-change detection to avoid reusing loop-bound locks across successive event loops.

Reduces unnecessary latency in Flow.run_sync() by only calling time.sleep(0.003) when no node results completed in the polling loop.

Written by Cursor Bugbot for commit 47ea564.

## Changes

### dynamiq/clients/dynamiq.py
- Add background daemon thread with SimpleQueue for trace dispatch.
  trace() now enqueues runs instead of issuing a blocking
  requests.post on the caller thread, removing up to 60s of
  latency per flush under high concurrency.
- Reuse a single httpx.AsyncClient (lazy, double-checked lock)
  instead of creating one per _send_traces_async call, saving
  TCP/TLS handshake overhead.
- Add close() + atexit hook for graceful shutdown.

### dynamiq/flows/flow.py
- Guard time.sleep(0.003) so it only fires when no futures
  completed (empty results dict). futures.wait(FIRST_COMPLETED)
  already blocks when work is in progress, so the unconditional
  3ms sleep added unnecessary per-iteration latency.
@vitalii-dynamiq vitalii-dynamiq requested a review from a team as a code owner April 2, 2026 15:53

github-actions bot commented Apr 2, 2026

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| dynamiq/clients/dynamiq.py | 131 | 92 | 29% | 22, 48–52, 55–58, 64–67, 75–79, 83–84, 86–91, 93–97, 101–104, 112–113, 118, 127–129, 137–140, 149, 152, 157, 160, 162–171, 174–177, 180–186, 188–192, 194, 196, 200–201, 206, 215–216, 224–225, 227–230, 232–234 |
| dynamiq/flows/flow.py | 232 | 58 | 75% | 104, 111, 118, 152–153, 211–213, 235, 238, 244, 246–250, 257, 260, 266, 268, 270–271, 273, 275–277, 322, 430–434, 442–443, 457–458, 460, 479–480, 482, 485, 504–505, 508–509, 512–513, 515–516, 519–521, 523–527, 529 |
| TOTAL | 26747 | 8978 | 66% | |

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 1536 | 1 💤 | 0 ❌ | 0 🔥 | 1m 47s ⏱️ |

dynamiq-bot added 2 commits April 2, 2026 16:20
Resolves two Cursor Bugbot issues:

1. (Medium) asyncio.Lock created in __init__ binds to the first event loop and raises RuntimeError when reused across different loops (e.g. successive asyncio.run() calls in test suites). Fixed by lazily initialising both the lock and client per event loop, tracked via weakref to reliably detect loop changes even when CPython reuses memory addresses.

2. (Low) Shared httpx.AsyncClient was never closed - close() only shut down the sync background thread. Added best-effort cleanup in close() and a new async aclose() method.
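
The per-loop lazy initialization with weakref tracking described in fix 1 can be sketched as follows. Names are illustrative; the real client rebuilds its shared httpx.AsyncClient alongside the lock.

```python
import asyncio
import weakref

class LoopAwareState:
    """Per-event-loop lazy lock, sketching the weakref approach above."""

    def __init__(self):
        self._lock = None
        self._loop_ref = None

    def get_lock(self):
        loop = asyncio.get_running_loop()
        current = self._loop_ref() if self._loop_ref is not None else None
        if self._lock is None or current is not loop:
            # First use, or the loop changed. A dead weakref also counts
            # as a change, even if a new loop reuses the old loop's
            # memory address (the CPython reuse problem mentioned above).
            self._lock = asyncio.Lock()
            self._loop_ref = weakref.ref(loop)
        return self._lock
```

Within one event loop the same lock is returned; across successive `asyncio.run()` calls a fresh loop-bound lock is created instead of raising `RuntimeError`.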

cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


```python
pass
self._async_client = None
self._async_client_lock = asyncio.Lock()
self._async_client_loop = weakref.ref(current_loop)
```

Await inside threading.Lock causes event loop deadlock

High Severity

_get_async_client calls await old_client.aclose() while holding self._async_init_mutex (a threading.Lock). When the await suspends the coroutine, any other coroutine scheduled on the same event loop that reaches with self._async_init_mutex: will call the blocking threading.Lock.acquire(), freezing the event loop thread entirely. Since the event loop is frozen, the original coroutine's aclose() can never complete — resulting in a deadlock. The await and associated cleanup need to happen outside the with self._async_init_mutex: block.
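
The fix the review calls for can be sketched as below: perform only the reference swap under the threading.Lock and await the cleanup afterwards. `_DummyClient` stands in for `httpx.AsyncClient`, and `ClientHolder`/`swap_client` are hypothetical names.

```python
import asyncio
import threading

class _DummyClient:
    """Stand-in for httpx.AsyncClient; only aclose() matters here."""
    closed = False

    async def aclose(self):
        self.closed = True

class ClientHolder:
    def __init__(self):
        self._mutex = threading.Lock()
        self._client = _DummyClient()

    async def swap_client(self):
        # Do only the reference swap while holding the threading.Lock.
        # Awaiting while holding it would let another coroutine on the
        # same loop block in Lock.acquire(), freezing the loop thread
        # and deadlocking the pending aclose().
        with self._mutex:
            old, self._client = self._client, _DummyClient()
        await old.aclose()  # cleanup happens outside the critical section
        return old
```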

Triggered by project rule: Bugbot Rules for Dynamiq

