@fcostaoliveira (Collaborator) commented Feb 6, 2026

Note

Medium Risk
Touches shutdown/cleanup and libevent timer/bufferevent lifecycle across threads; incorrect teardown ordering could cause leaks, hangs, or use-after-free during stop conditions.

Overview
Ensures benchmarks shut down promptly by adding a new forceful stop path that tears down all libevent resources and pending work instead of waiting for normal completion.

This introduces force_stop() on client, client_group, and shard_connection to delete timers, disable/free bufferevents, and drain pending requests; m_pending_resp is made atomic to avoid benign races when polled from the main thread. The main runner now uses force_stop() on Ctrl+C and when --test-time elapses, printing how many threads are being stopped and waiting until they report finished; the interrupt message assertion in tests_oss_simple_flow.py is updated accordingly.
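As a rough illustration of why the pending-response counter is made atomic, the sketch below has a worker thread update a counter while the main thread polls it. The type `pending_counter_sketch` and the function `run_and_poll` are hypothetical stand-ins and are not taken from the PR. The only point is that concurrent reads and writes of a `std::atomic<int>` are well defined, whereas the same access pattern on a plain `int` would be a data race.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for a connection's pending-response counter.
struct pending_counter_sketch {
    std::atomic<int> pending{0};

    void request_sent()      { pending.fetch_add(1, std::memory_order_relaxed); }
    void response_received() { pending.fetch_sub(1, std::memory_order_relaxed); }
    int  snapshot() const    { return pending.load(std::memory_order_relaxed); }
};

// A worker sends and completes `requests` requests while the calling
// thread polls the counter. Returns the final count, or -1 if a polled
// value was ever larger than the single request that can be in flight.
inline int run_and_poll(int requests) {
    pending_counter_sketch c;
    std::thread worker([&c, requests] {
        for (int i = 0; i < requests; ++i) {
            c.request_sent();
            c.response_received();
        }
    });

    int max_seen = 0;
    for (int i = 0; i < 1000; ++i) {
        int v = c.snapshot();          // safe to read while the worker writes
        if (v > max_seen) max_seen = v;
    }

    worker.join();
    return max_seen <= 1 ? c.snapshot() : -1;
}
```

Relaxed ordering is enough here only because the polled value is used as a progress indicator; it would not be enough if other data had to be published together with the counter.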

Written by Cursor Bugbot for commit 871b6bc. This will update automatically on new commits.

}

// Break the event loop to stop processing
event_base_loopbreak(m_base);

Race condition between force_stop and event loop callbacks

High Severity

client_group::force_stop() frees resources (via shard_connection::force_stop()) while the worker thread's event loop may still be executing callbacks that use those resources. The event_base_loopbreak() is called AFTER resources are freed at lines 752-755, but the worker thread could be inside a callback like handle_timer_event() accessing m_event_timer, m_bev, etc. that are being freed. Since libevent thread-safety isn't enabled (no evthread_use_pthreads() call), this creates a use-after-free race condition.
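One possible way to avoid this kind of race is to let the main thread only request the stop, and have the worker thread, which owns the resources, perform the teardown itself once it leaves its loop. The sketch below shows that ownership pattern with the standard library only. `worker_sketch`, `request_stop()` and `stop_worker_safely()` are hypothetical names, and the polling loop merely stands in for the libevent event loop. With real libevent, the equivalent would presumably involve enabling thread support through `evthread_use_pthreads()` before the base is created, so that `event_base_loopbreak()` may be called from another thread, and then freeing timers and bufferevents on the worker after the dispatch call returns.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for a worker that owns event-loop resources.
// None of these names come from the PR.
class worker_sketch {
public:
    // Called from the main thread: only flips a flag, never frees anything.
    void request_stop() { m_stop.store(true, std::memory_order_release); }

    // Runs on the worker thread; the loop plays the role of the event loop.
    void run() {
        m_resource = new int(0);   // stands in for timers and bufferevents
        while (!m_stop.load(std::memory_order_acquire)) {
            ++*m_resource;         // a "callback" touching the resource
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        // Teardown happens on the thread that uses the resource, after the
        // loop has exited, so no callback can still be running.
        delete m_resource;
        m_resource = nullptr;
        m_finished.store(true, std::memory_order_release);
    }

    bool finished() const { return m_finished.load(std::memory_order_acquire); }

    // Only meaningful after the worker thread has been joined.
    bool resource_released() const { return m_resource == nullptr; }

private:
    std::atomic<bool> m_stop{false};
    std::atomic<bool> m_finished{false};
    int* m_resource = nullptr;     // written only by the worker thread
};

// Drives the sketch: start a worker, ask it to stop, wait for it.
inline bool stop_worker_safely() {
    worker_sketch w;
    std::thread t([&w] { w.run(); });
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    w.request_stop();
    t.join();
    return w.finished() && w.resource_released();
}
```

The design choice is that exactly one thread ever touches the resources, so the stop path needs no locking around them; the cost is that the stop takes effect only when the worker next checks the flag.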

Additional Locations (1)


}
if (active_threads > 0) {
fprintf(stderr, "[RUN #%u] Waiting for %u threads to finish...\n", run_id, active_threads);
}

Duplicated thread waiting loop code

Low Severity

The waiting loop that polls active_threads and prints status messages is duplicated identically in two places: once for Ctrl+C handling and once for test-time completion. This creates maintenance burden since any future bug fix or enhancement would need to be applied to both locations, risking inconsistent behavior if one is missed.
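The duplication could be removed by moving the loop into a single helper that both the Ctrl+C path and the test-time path call. The following is a minimal sketch under assumptions: the name `wait_for_threads`, its signature, and the choice to report only when the count changes are all hypothetical, and the real code may count active threads differently. Only the format string mirrors the snippet above.

```cpp
#include <chrono>
#include <cstdio>
#include <functional>
#include <thread>

// Hypothetical helper; the name and signature are not from the PR.
// Polls count_active() until it returns 0, printing a status line each
// time the number of active threads changes. Returns the number of polls.
inline unsigned wait_for_threads(unsigned run_id,
                                 const std::function<unsigned()>& count_active,
                                 std::chrono::milliseconds poll_interval) {
    unsigned polls = 0;
    unsigned last_reported = 0;
    for (;;) {
        unsigned active = count_active();
        ++polls;
        if (active == 0) {
            break;  // every thread has reported that it finished
        }
        if (active != last_reported) {
            std::fprintf(stderr,
                         "[RUN #%u] Waiting for %u threads to finish...\n",
                         run_id, active);
            last_reported = active;
        }
        std::this_thread::sleep_for(poll_interval);
    }
    return polls;
}
```

Both call sites would then reduce to one call each, passing a callable that counts the threads that have not yet finished, so a later fix to the waiting logic lands in one place.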

Additional Locations (1)

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

// This is more aggressive than interrupt() - it forcefully cleans up all events

// Set the stop flag first - this will cause the run() loop to exit
m_stop_requested.store(true, std::memory_order_relaxed);

Unused variable m_stop_requested is set but never read

Low Severity

The m_stop_requested atomic variable is declared in client.h, initialized in the constructor, and set to true in force_stop(), but it's never read anywhere. The comment states "this will cause the run() loop to exit" but run() simply calls event_base_dispatch(m_base) without checking this flag. The actual mechanism that stops the loop is event_base_loopbreak() called later. This is dead code that should be removed.
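If the flag is kept rather than removed, something has to read it. One option, sketched below with hypothetical names (`client_sketch`, `stop_requested()`, `handle_event()`), is to have callbacks check the flag and return early once a stop has been requested. This is an illustration of the idea and not the PR's code; the real handlers and their signatures may differ.

```cpp
#include <atomic>

// Hypothetical stand-in for the client class; not the PR's implementation.
class client_sketch {
public:
    // Mirrors the store shown in the snippet above.
    void force_stop() { m_stop_requested.store(true, std::memory_order_relaxed); }

    bool stop_requested() const {
        return m_stop_requested.load(std::memory_order_relaxed);
    }

    // Stand-in for a callback such as a timer handler: it does no work
    // once a stop has been requested. Returns true if the event was handled.
    bool handle_event() {
        if (stop_requested()) {
            return false;
        }
        ++m_events_handled;
        return true;
    }

    unsigned events_handled() const { return m_events_handled; }

private:
    std::atomic<bool> m_stop_requested{false};
    unsigned m_events_handled = 0;
};
```

With a check like this the comment about the flag would describe real behaviour; if no reader is added, deleting the member and its comment is the simpler fix.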

Additional Locations (1)
