Description
Found a deadlock issue in SH test (more details in SH issue#88):
- Thread 1 (nuraft-reconfigure): During the `replace_member` process, after removing the old member, `nuraft` acquires the `nuraft` lock to trigger a reconfiguration and clears the `snapshot_sync_ctx`. The cleanup operation requires the current `user_snp_ctx` to stop, which in turn depends on all pending prefetch blobs being read. However, this operation is blocked, waiting for an I/O reactor to handle the read.
- Thread 2 (IO reactor worker 1): This thread calls `monitor_replace_member_replication_status`, detects that the replace member task is completed, and attempts to reset the quorum size. However, it is blocked waiting for the `nuraft` lock, which is held by Thread 1. At the same time, Thread 2 holds the `m_rd_map_mtx` mutex.
- Thread 3 (IO reactor worker 2): This thread calls `gc_repl_reqs`, which attempts to acquire the `m_rd_map_mtx` mutex held by Thread 2. As a result, Thread 3 is blocked.
Since both I/O reactor threads (Thread 2 and Thread 3) are blocked, no I/O operations can proceed. This prevents Thread 1 from completing the read operation required to release the nuraft lock, leading to a deadlock.
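To make the dependency chain easier to check, here is a small, self-contained model of the three threads above. It is only a sketch: `ThreadState`, `WaitKind`, `can_progress` and the two scenario helpers are hypothetical names for illustration and are not part of HomeStore, IOMgr or nuraft. It also simplifies lock semantics to "a waiter can proceed once the holder can proceed".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// What a thread is currently waiting for (illustrative model only).
enum class WaitKind { None, OnThread, OnAnyReactor };

struct ThreadState {
    std::string name;
    bool is_reactor;  // true if this thread can serve I/O reads
    WaitKind wait;
    int holder;       // index of the thread holding the awaited lock (OnThread only)
};

// Fixed-point computation: a thread can progress if it waits on nothing,
// if the holder of its lock can progress, or (for OnAnyReactor) if at
// least one reactor thread can progress.
std::vector<bool> can_progress(const std::vector<ThreadState>& ts) {
    std::vector<bool> ok(ts.size(), false);
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t i = 0; i < ts.size(); ++i) {
            if (ok[i]) continue;
            bool can = false;
            switch (ts[i].wait) {
            case WaitKind::None:
                can = true;
                break;
            case WaitKind::OnThread:
                assert(ts[i].holder >= 0 &&
                       static_cast<std::size_t>(ts[i].holder) < ts.size());
                can = ok[static_cast<std::size_t>(ts[i].holder)];
                break;
            case WaitKind::OnAnyReactor:
                for (std::size_t j = 0; j < ts.size(); ++j)
                    if (ts[j].is_reactor && ok[j]) can = true;
                break;
            }
            if (can) { ok[i] = true; changed = true; }
        }
    }
    return ok;
}

// The situation described in this issue: both reactors are stuck behind locks.
std::vector<ThreadState> deadlock_scenario() {
    return {
        {"nuraft-reconfigure", false, WaitKind::OnAnyReactor, -1},  // Thread 1
        {"io-reactor-1", true, WaitKind::OnThread, 0},              // waits for nuraft lock
        {"io-reactor-2", true, WaitKind::OnThread, 1},              // waits for m_rd_map_mtx
    };
}

// Same waits, but the two timer callbacks run on non-reactor threads,
// so one reactor stays free to serve the pending read.
std::vector<ThreadState> isolated_scenario() {
    return {
        {"nuraft-reconfigure", false, WaitKind::OnAnyReactor, -1},
        {"timer-thread-1", false, WaitKind::OnThread, 0},
        {"timer-thread-2", false, WaitKind::OnThread, 1},
        {"io-reactor-1", true, WaitKind::None, -1},
    };
}
```

In this model, `can_progress(deadlock_scenario())` marks no thread as able to run, while `can_progress(isolated_scenario())` marks every thread as able to run, because the free reactor can serve the read that Thread 1 is waiting on.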
Since monitor_replace_member_replication_status and gc_repl_reqs are not typical write/read operations, should we consider isolating them from the default IOMgr workers? Below are the timers currently using default IOMgr workers:
- m_rdev_gc_timer_hdl: Triggers gc_repl_reqs and gc_repl_devs every minute.
- m_rdev_fetch_timer_hdl: Triggers fetch_pending_data every second.
- m_flush_durable_commit_timer_hdl: Triggers flush_durable_commit_lsn every 500ms.
- m_replace_member_sync_check_timer_hdl: Triggers monitor_replace_member_replication_status every minute.
- m_res_audit_timer_hdl: Triggers trigger_truncate every 2 minutes.
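One possible shape for the isolation is a dedicated thread that owns these periodic callbacks, so that a callback blocked on a lock never occupies an I/O reactor. The sketch below uses only the C++ standard library. `HousekeepingThread` and its methods are hypothetical names, not an existing IOMgr or HomeStore API, and a real change would more likely use whatever worker or reactor facilities IOMgr already provides.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: runs periodic callbacks on its own thread.
class HousekeepingThread {
public:
    using Clock = std::chrono::steady_clock;

    ~HousekeepingThread() { stop(); }

    // Register a callback to run roughly every `period`.
    void add_task(std::chrono::milliseconds period, std::function<void()> fn) {
        assert(period.count() > 0);
        {
            std::lock_guard<std::mutex> lg(m_mtx);
            m_tasks.push_back(Task{period, Clock::now() + period, std::move(fn)});
        }
        m_cv.notify_all();  // let the worker recompute its wake-up time
    }

    void start() { m_thread = std::thread([this] { run(); }); }

    void stop() {
        {
            std::lock_guard<std::mutex> lg(m_mtx);
            m_stop = true;
        }
        m_cv.notify_all();
        if (m_thread.joinable()) m_thread.join();
    }

private:
    struct Task {
        std::chrono::milliseconds period;
        Clock::time_point next_due;
        std::function<void()> fn;
    };

    void run() {
        std::unique_lock<std::mutex> lk(m_mtx);
        while (!m_stop) {
            const auto now = Clock::now();
            auto wake = now + std::chrono::seconds(1);  // idle poll interval
            std::vector<std::function<void()>> due;
            for (auto& t : m_tasks) {
                if (t.next_due <= now) {
                    due.push_back(t.fn);
                    t.next_due = now + t.period;
                }
                wake = std::min(wake, t.next_due);
            }
            lk.unlock();
            for (auto& fn : due) fn();  // callbacks run without holding m_mtx
            lk.lock();
            if (!m_stop) m_cv.wait_until(lk, wake);  // the loop re-checks state on wake-up
        }
    }

    std::mutex m_mtx;
    std::condition_variable m_cv;
    std::vector<Task> m_tasks;
    bool m_stop{false};
    std::thread m_thread;
};
```

With something like this, the minute-level timers (for example the ones driving gc_repl_reqs and monitor_replace_member_replication_status) could be registered via `add_task` instead of on the default IOMgr workers. A callback that blocks would then delay only other housekeeping tasks, not reads. Whether fetch_pending_data and flush_durable_commit_lsn should move as well is a separate question, since they run far more often and may depend on reactor context.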