Commit e13f49a
fix(gateway): synchronize with worker before notify in ResourceChangeNotifier::shutdown
Classic lost-wakeup race between shutdown() and worker_loop():
1. worker locks queue_mutex_
2. worker checks predicate (flag=false, queue empty) -> false
3. shutdown() CAS flag -> true (NOT holding queue_mutex_)
4. shutdown() notify_one() - worker NOT yet enqueued on CV
5. worker enters wait(lock): atomic unlock+enqueue+sleep
6. worker sleeps forever; main blocks in worker_thread_.join()
Even though shutdown_flag_ uses seq_cst atomics, the worker's predicate
check and entry into wait() are not atomic with respect to the flag
modification. The notify can arrive before the worker is enqueued.
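To make the window concrete, here is a minimal sketch of what a predicate wait amounts to when written out by hand. `WaitState` and `wait_for_shutdown` are hypothetical names used only for illustration; the real worker_loop() also inspects the queue, which is omitted here.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-ins for the members named in the commit message.
struct WaitState {
    std::mutex queue_mutex;
    std::condition_variable cv;
    std::atomic<bool> shutdown_flag{false};
};

// Hand-written equivalent of cv.wait(lock, pred).
// Returns how many times the thread entered the blocking wait.
inline int wait_for_shutdown(WaitState& s) {
    int waits = 0;
    std::unique_lock<std::mutex> lock(s.queue_mutex);  // step 1: lock
    while (!s.shutdown_flag.load()) {                  // step 2: predicate check
        // Race window: a flag store followed by notify_one() from a thread
        // that does not hold queue_mutex can land here. The notification is
        // lost because this thread is not yet waiting on the CV.
        ++waits;
        s.cv.wait(lock);  // step 5: atomically unlock, enqueue, sleep
    }
    return waits;
}
```

The atomicity guarantee covers only the unlock and enqueue inside `wait(lock)`. Nothing ties the earlier predicate read to it except the mutex, and the mutex only helps if the notifying thread also takes it.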
Fix: briefly acquire queue_mutex_ between setting the flag and notifying.
This guarantees the worker is either still outside its critical section
(will observe the new flag on the next lock) or already enqueued on the
CV (notify_one will wake it).
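The fix can be sketched as follows. This is not the actual patch (the diff body is not reproduced on this page) but a reduced illustration under assumptions: `NotifierSketch`, `cv_` and `stopped()` are hypothetical, the queue is left out, and only shutdown_flag_, queue_mutex_, worker_thread_, worker_loop() and shutdown() are names taken from the message above.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical reduction of ResourceChangeNotifier (queue omitted).
class NotifierSketch {
public:
    NotifierSketch() : worker_thread_([this] { worker_loop(); }) {}
    ~NotifierSketch() { shutdown(); }

    void shutdown() {
        bool expected = false;
        if (shutdown_flag_.compare_exchange_strong(expected, true)) {
            // Synchronize with the worker: this lock can only be acquired
            // while the worker is not between its predicate check and wait().
            // Afterwards the worker either has not locked yet (it will see
            // the flag) or is already enqueued on the CV (notify wakes it).
            { std::lock_guard<std::mutex> sync(queue_mutex_); }
            cv_.notify_one();
        }
        if (worker_thread_.joinable()) worker_thread_.join();
    }

    bool stopped() const { return shutdown_flag_.load(); }

private:
    void worker_loop() {
        std::unique_lock<std::mutex> lock(queue_mutex_);
        cv_.wait(lock, [this] { return shutdown_flag_.load(); });
    }

    std::mutex queue_mutex_;
    std::condition_variable cv_;
    std::atomic<bool> shutdown_flag_{false};
    std::thread worker_thread_;  // declared last: starts after the other members exist
};
```

Calling `shutdown()` twice from the same thread is harmless in this sketch: the second call fails the CAS and finds the thread no longer joinable. Concurrent calls from different threads would need extra care around `join()`, which the sketch does not attempt.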
Manifested as a TSan-specific hang in DoubleShutdownIsSafe (worker spawn
+ immediate shutdown hits the race window).
1 parent 7068ad0 commit e13f49a
1 file changed
Lines changed: 7 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 105-107 | 105-107 | unchanged context |
| | 108-114 | + 7 added lines |
| 108-110 | 115-117 | unchanged context |