Conversation
WalkthroughIncrease coroutine stack size to 8MB, add guard-page and 16-byte alignment for coroutine stacks, introduce a Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant SSM as ServiceStateMachine
participant BoostCtx as Boost.Context
participant Executor
Client->>SSM: schedule task / request resume (uses weak_from_this())
Executor->>SSM: invoke resume lambda (captures weak_this)
alt SSM alive
SSM->>BoostCtx: callcc -> allocate stack (8MB) with guard page, store _coroSink
BoostCtx-->>SSM: coroutine runs (control transferred)
SSM->>BoostCtx: yield/resume via stored _coroSink (validate client/_coroSink)
BoostCtx-->>SSM: returns control
else SSM expired
Executor-->>Executor: lambda no-op after weak_ptr lock fails
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Organization UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/mongo/db/kill_sessions_local.cpp (1)
116-116: Reconsider the log level for coroutine flow messages.These messages describe normal coroutine execution flow (resume/yield operations), not error conditions. Elevating them to
error()level may:
- Pollute error logs with non-error information
- Make it harder to identify actual errors
- Trigger unnecessary alerts in monitoring systems
Consider using
LOG(1)orLOG(3)for debug-level logging instead, or remove these logs entirely if they were added temporarily for debugging the crash investigation.🔎 Suggested fix
- error() << "abortArbitraryTransactionIfExpired call resume."; + LOG(3) << "abortArbitraryTransactionIfExpired call resume.";- error() << "abortArbitraryTransactionIfExpired call yield."; + LOG(3) << "abortArbitraryTransactionIfExpired call yield.";Also applies to: 134-134
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/mongo/db/kill_sessions_local.cpp(3 hunks)src/mongo/transport/service_state_machine.cpp(2 hunks)src/mongo/transport/service_state_machine.h(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 211
File: src/mongo/db/modules/eloq/cmake/build_eloq_log_service.cmake:26-80
Timestamp: 2025-09-25T12:24:06.434Z
Learning: The user githubzilla implemented a fix in commit fe98aaf to address the MEMORY state incorrectly triggering RocksDB discovery in build_eloq_log_service.cmake. This was a control flow issue where MEMORY builds were falling into an else branch that still attempted RocksDB discovery.
🔇 Additional comments (4)
src/mongo/transport/service_state_machine.h (1)
285-287: Stack size increase is well-justified.The increase from 3.2MB to 8MB with the explanatory comment clearly addresses the stack overflow concerns mentioned in the PR objectives. The documentation helps future maintainers understand the rationale.
src/mongo/transport/service_state_machine.cpp (2)
58-58: Appropriate header inclusion for alignment logic.The
<cstdint>header is required foruintptr_tused in the stack pointer alignment at line 752.
735-758: Excellent fix for the stack alignment and bounds issue.The rework correctly addresses the core problem described in the PR:
- Guard-page setup: Reserves the first OS page as PROT_NONE (set in constructor at line 272).
- Boundary calculation: Computes usable range
[bottom, top)wherebottom = _coroStack + _osPageSize.- Alignment: Aligns
spdown to 16-byte boundary using bitwise masking, which is required for x86-64 and ARMv8 ABIs.- Size recomputation (key fix):
sc.size = sp - bottomensures the usable stack range[sp - size, sp)never overlaps the guard page, even after alignment adjustment.The invariant at line 754 provides a safety check. Given typical page sizes (4KB) and the stack size (8MB), the invariant should hold even with the maximum 15-byte alignment reduction.
src/mongo/db/kill_sessions_local.cpp (1)
77-79: Stack size increase is consistent and well-documented.The change aligns with the updated
kCoroStackSizeinservice_state_machine.hand maintains consistency across coroutine implementations.
ServiceExecutorCoroutine::coroutineResumeFunctor() and coroutineLongResumeFunctor() captured the task parameter by reference, leaving lambdas with a dangling reference after the function returned. Under load this could corrupt memory and crash later (e.g. in munmap). Capture Task by value (copy) inside the returned functors to ensure correct lifetime.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/mongo/transport/service_executor_coroutine.cpp (2)
319-324: Excellent fix for the dangling reference issue!The change correctly captures
taskby value (via copy + move) instead of by reference, eliminating the dangling reference when the lambda outlives the function scope.However, consider moving
taskwhen passing it toresumeTaskto avoid an unnecessary copy:- return [thd_group = &_threadGroups[threadGroupId], task = std::move(taskCopy)]() mutable { - thd_group->resumeTask(task); + return [thd_group = &_threadGroups[threadGroupId], task = std::move(taskCopy)]() mutable { + thd_group->resumeTask(std::move(task)); };Since
resumeTasktakesTaskby value and internally moves it, passingtaskas an lvalue causes a copy. Moving directly from the lambda's capturedtaskis more efficient, especially ifTask(likelystd::function<void()>) has non-trivial copy semantics.
330-334: Good fix for the dangling reference issue!The capture-by-value approach correctly prevents dangling references. Same performance suggestion as in
coroutineResumeFunctor: movetaskwhen passing it toenqueueTask:- return [thd_group = &_threadGroups[threadGroupId], task = std::move(taskCopy)]() mutable { - thd_group->enqueueTask(task); + return [thd_group = &_threadGroups[threadGroupId], task = std::move(taskCopy)]() mutable { + thd_group->enqueueTask(std::move(task)); };This avoids copying
taskwhenenqueueTask(which takesTaskby value) is called.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/mongo/transport/service_executor_coroutine.cpp(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 211
File: src/mongo/db/modules/eloq/cmake/build_eloq_log_service.cmake:26-80
Timestamp: 2025-09-25T12:24:06.434Z
Learning: The user githubzilla implemented a fix in commit fe98aaf to address the MEMORY state incorrectly triggering RocksDB discovery in build_eloq_log_service.cmake. This was a control flow issue where MEMORY builds were falling into an else branch that still attempted RocksDB discovery.
Fix potential use-after-free during coroutine resume/yield by avoiding raw this captures in queued resume tasks. Use weak_from_this() for _resumeTask and thread-group migration functor so queued callbacks become no-ops after ServiceStateMachine destruction. Additionally, disable Client coroutine functors and clear SSM coroutine callbacks in _cleanupSession() to prevent yield/resume calls racing with coroutine stack teardown.
ServiceStateMachine stored _coroYield as a member functor but captured &sink from the boost::context::callcc lambda, leaving a dangling reference once the lambda frame changed/ended. This could crash in jump_fcontext() with unreadable backtraces. Store the caller continuation (sink) in a new member _coroSink and implement _coroYield in terms of _coroSink.resume(). Clear _coroSink during reset/cleanup/destruction to avoid resuming stale continuations.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/mongo/transport/service_state_machine.cpp (1)
758-781: Excellent fix for the stack alignment bug—this is the core of the PR.The rewrite correctly addresses the critical issue described in the PR objectives:
What was fixed:
- Line 770-771: Explicitly computes
bottom(guard page boundary) andtop(stack end)- Line 774-775: Aligns
spdown to a 16-byte boundary (required for x86-64 and mandatory for ARMv8/A64 ABIs)- Line 779: Crucially recomputes
size = sp - bottomto reflect the usable range after alignmentWhy this matters:
Previously, aligningspdown without recomputingsizemeant the logical stack range[sp - old_size, sp)could extend belowbottominto the guard page or unallocated memory, causing segmentation faults.Now, the usable stack range is correctly
[sp - size, sp) = [bottom, sp), which never overlaps the protected guard page at[_coroStack, bottom).Line 777: The
invariant(sp > bottom)is a good safety check ensuring alignment doesn't cause overlap.Optional: Add compile-time check for stack size vs page size
While extremely unlikely in practice (8MB stack vs typical 4KB page), consider adding a static assertion for defensive programming:
boost::context::stack_context ServiceStateMachine::_coroStackContext() { + static_assert(kCoroStackSize > 2 * 4096, // Assume max page size of 4KB + "Coroutine stack must be substantially larger than OS page size"); boost::context::stack_context sc;This would catch any future misconfiguration at compile time.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/mongo/transport/service_state_machine.cpp(8 hunks)src/mongo/transport/service_state_machine.h(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 211
File: src/mongo/db/modules/eloq/cmake/build_eloq_log_service.cmake:26-80
Timestamp: 2025-09-25T12:24:06.434Z
Learning: The user githubzilla implemented a fix in commit fe98aaf to address the MEMORY state incorrectly triggering RocksDB discovery in build_eloq_log_service.cmake. This was a control flow issue where MEMORY builds were falling into an else branch that still attempted RocksDB discovery.
🧬 Code graph analysis (1)
src/mongo/transport/service_state_machine.cpp (1)
src/mongo/db/client.cpp (6)
releaseCurrent(164-167)releaseCurrent(164-164)haveClient(160-162)haveClient(160-160)getCurrent(151-153)getCurrent(151-151)
🔇 Additional comments (6)
src/mongo/transport/service_state_machine.h (1)
291-294: Excellent design to prevent dangling references.Storing the caller continuation as a member variable ensures that yield/resume functors never capture references to stack-local continuation variables, eliminating a class of potential use-after-free bugs. This is a solid improvement to the coroutine lifecycle management.
src/mongo/transport/service_state_machine.cpp (5)
58-58: LGTM: Include added for stack pointer alignment.The
<cstdint>include is necessary foruintptr_tused in the stack alignment calculation (Line 775).
281-281: LGTM: Proper cleanup of continuation member.Resetting
_coroSinkin both the destructor andreset()ensures stale continuations cannot be accidentally resumed, preventing use-after-free scenarios.Also applies to: 306-306
549-583: Excellent use-after-free protection for queued coroutine tasks.The switch from raw
thiscaptures toweak_from_this()is a critical safety improvement. Resume tasks can be queued and executed asynchronously, so capturing a weak pointer ensures that if the ServiceStateMachine is destroyed before the task executes, the callback safely becomes a no-op rather than accessing freed memory.The pattern is correctly applied:
- Capture
weak_from_this()in the lambda- Lock to obtain a
shared_ptr- Early return if the SSM has been destroyed
- Only then access SSM members
591-614: LGTM: Coroutine continuation handling is now lifetime-safe.The changes correctly implement the design outlined in the header comments:
- Line 598: Stores the caller continuation (
sink) as_coroSinkmember, preventing dangling references to stack-local variables- Lines 602-610: The yield functor properly:
- Captures
weak_from_this()for safety- Releases the current client (Line 608) before yielding
- Resumes via the stored
_coroSinkmember (Line 609)- Updates
_coroSinkwith the resumed continuation- Line 613: Returns the stored continuation
This prevents both use-after-free and dangling reference issues in coroutine lifecycle management.
813-824: LGTM: Enhanced teardown sequence prevents coroutine races during destruction.The early disabling of coroutine functors (Lines 815-817) is an important safety measure that prevents any yield/resume calls from racing with the destruction of the SSM and its coroutine stack. Clearing all coroutine-related members (Lines 818-823), including the new
_coroSink, ensures complete cleanup.This defensive approach eliminates potential use-after-free scenarios during the teardown phase.
| // Coroutine stack size: Increased from 3.2MB to 8MB to prevent stack overflow | ||
| // in complex database operations (deep recursion, large BSON processing, etc.) | ||
| static constexpr size_t kCoroStackSize = 8192 * 1024; // 8MB |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for kCoroStackSize and related stack size references in codebase
echo "=== Searching for kCoroStackSize references ==="
rg -n "kCoroStackSize" --type=cpp
echo -e "\n=== Searching for coroutine stack size related code ==="
rg -n "CoroStackSize|coroutine.*stack|stack.*size" --type=cpp -i | head -50
echo -e "\n=== Checking service_state_machine.h file context ==="
rg -n -B5 -A5 "kCoroStackSize" --type=cpp
echo -e "\n=== Searching for tests related to coroutine stack or memory ==="
rg -n "coroutine|stack.*overflow|memory.*profile" --type=cpp -g '*test*' -g '*unittest*' | head -30Repository: eloqdata/eloqdoc
Length of output: 13016
Document memory impact analysis and load testing for 8MB coroutine stack increase.
The 2.5x increase in coroutine stack size (3.2MB → 8MB) is implemented consistently across two critical code paths: service_state_machine.h and kill_sessions_local.cpp. Each mmap allocation of 8MB per connection introduces measurable memory overhead for deployments with thousands of concurrent connections.
The code rationale (preventing stack overflow in deep recursion and large BSON processing) is documented in both locations, but the following verification gaps remain:
- No load testing results documenting memory consumption at scale
- No configurable option to adjust stack size for different deployment requirements
- No profiling data confirming actual stack usage justifies the 8MB threshold
These should be addressed in deployment documentation or as follow-up work, particularly for high-connection-count deployments where memory efficiency is critical.
🤖 Prompt for AI Agents
In src/mongo/transport/service_state_machine.h around lines 285 to 287, the
hardcoded coroutine stack size was increased to 8MB but lacks documentation,
load-test data, profiling evidence, and a runtime/configurable option; update
the code and repo as follows: add a clear note in the file header referencing an
external design/doc page that contains memory impact analysis and load-test
results for thousands of connections, expose the stack size as a configurable
parameter (compile-time constant fallback plus runtime override via config file
or env var/CLI flag) with validated bounds and a sensible default of 8MB, add a
short profiling hook or sample code comment showing how to capture maximum stack
usage during tests, and add a task in the repo docs (or CHANGELOG) linking to
completed profiling and load-test artifacts so operators can evaluate and tune
stack size for high-connection deployments.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/mongo/transport/service_state_machine.cpp (1)
772-795: Stack alignment logic is correct; consider adding minimum size check.The 16-byte alignment (lines 788-789) and recomputed
sc.size = sp - bottom(line 793) correctly ensure the usable range[sp - size, sp)never overlaps the guard page. The invariant at line 791 guards against alignment underflow.However, after alignment you could lose up to 15 bytes. If
kCoroStackSizeis small or_osPageSizeis large, the usable stack might be insufficient.Consider adding a minimum size assertion
invariant(sp > bottom); +constexpr size_t kMinStackSize = 4096; // or another safe threshold +invariant(static_cast<size_t>(sp - bottom) >= kMinStackSize); sc.sp = sp; sc.size = static_cast<size_t>(sp - bottom);
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/mongo/transport/service_state_machine.cpp(10 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 211
File: src/mongo/db/modules/eloq/cmake/build_eloq_log_service.cmake:26-80
Timestamp: 2025-09-25T12:24:06.434Z
Learning: The user githubzilla implemented a fix in commit fe98aaf to address the MEMORY state incorrectly triggering RocksDB discovery in build_eloq_log_service.cmake. This was a control flow issue where MEMORY builds were falling into an else branch that still attempted RocksDB discovery.
🧬 Code graph analysis (1)
src/mongo/transport/service_state_machine.cpp (1)
src/mongo/db/client.cpp (8)
setCurrent(169-172)setCurrent(169-169)haveClient(160-162)haveClient(160-160)releaseCurrent(164-167)releaseCurrent(164-164)getCurrent(151-153)getCurrent(151-151)
🔇 Additional comments (6)
src/mongo/transport/service_state_machine.cpp (6)
280-282: LGTM: Proper cleanup order.Clearing
_coroSinkbefore unmapping the coroutine stack ensures the continuation doesn't hold references to memory being freed.
478-483: LGTM: Correctly disables yield after request completion.The comment accurately describes the safety contract: coroutine functors are only valid during active request processing within the
callcc()context. Clearing them after_coroStatus = EmptyensuresClient::coroutineFunctors()returnsUnavailablebetween requests, preventing misuse.
555-593: Excellent UAF prevention using weak_ptr capture.The weak pointer pattern across
_resumeTask(lines 558-583) and_coroMigrateThreadGroup(lines 589-593) prevents use-after-free during asynchronous resume execution. The early-return guards (lines 560-562, 590) and null-checks before invoking functors (lines 566-571) provide solid defense-in-depth.
606-624: LGTM: Robust yield implementation with safety invariants.Storing the sink continuation as
_coroSink(line 608) prevents dangling references to stack-local variables. The weak-capture in_coroYield(lines 612-616) combined with invariants (lines 618-621) ensure yielding only occurs within a valid, active coroutine context.
803-808: LGTM: Defensive guards during migration.Checking functor validity before invoking
_coroResumeand_coroYieldprevents null dereference if concurrent teardown cleared the functors. This is consistent with the guarded pattern in_runNextInGuard(lines 566-571).
831-842: Excellent: Aggressive functor teardown prevents races.Setting
CoroutineFunctors::Unavailable(line 834) before clearing individual members (lines 836-841) ensures no code can invoke yield/resume during SSM destruction. This is critical for preventing use-after-free when the coroutine stack is unmapped in the destructor.
|
Ubuntu seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
fix(coro): align stack pointer correctly and adjust stack size
The previous stack initialization logic failed to recompute the stack size
after aligning the stack pointer (SP) down to a 16-byte boundary. This
created a potential "Bounds Mismatch" where the logical stack bottom
(sp - size) could extend into the guard page or unallocated memory.
Changes:
spby masking down to 16-byte alignment (Required for x86-64and mandatory for ARMv8/A64).
sc.sizebased on the difference between the alignedspand thebottom(guard page boundary).protected guard page.
High Address
+--------------------------+ <--- _coroStack + kCoroStackSize (Original Top)
| Alignment Gap (0-15B) | : This space is discarded by "sp = top & ~15"
+--------------------------+ <--- sc.sp (Aligned Stack Pointer)
| | ^
| | |
| Usable Stack | | sc.size = (sp - bottom)
| [sp - size, sp) | |
| | |
| | v
+--------------------------+ <--- bottom (_coroStack + _osPageSize)
| |
| Guard Page | : Protected by PROT_NONE
| (Safety Zone) |
| |
+--------------------------+ <--- _coroStack (Allocation Base)
Low Address
Summary by CodeRabbit
Bug Fixes
Refactor
Chores
✏️ Tip: You can customize this high-level summary in your review settings.
Note
Fixes coroutine stack alignment/size and task lifetimes, and hardens ServiceStateMachine coroutine yield/resume and teardown to prevent crashes/UAF.
sc.sizewith a guard page; add_coroSinkmember and use it instead of referencing stack-local continuations; rethrowforced_unwind.weak_from_this()for queued resume/migrate functors; add null checks before invoking_coroResume/_coroYield; disable functors and clear state in reset/destructor/cleanup.8192 * 1024bytes (8MB) inServiceStateMachineandCoroCtx.Taskby value incoroutineResumeFunctor/coroutineLongResumeFunctorto avoid dangling references.log()toerror()during yield/resume.Written by Cursor Bugbot for commit 3b25393. This will update automatically on new commits. Configure here.