Refactor flush logic to avoid cache manager deadlocks by JoneKone · Pull Request #1 · JoneKone/Ext4Fsd

JoneKone · 2025-10-25T12:11:11Z

Summary

Release file resources before invoking the cache manager flush so that cache operations run outside driver locks.
Gather volume busy ranges before issuing cache flushes, to avoid calling CcFlushCache while holding Ext4Fsd locks.
Update shutdown flushing to drop the VCB resource before cache flushes and reacquire it afterwards.

Testing

Not run (driver stack and verifier tooling unavailable in container)

https://chatgpt.com/codex/tasks/task_e_68fcb0f6c698833391e035c8896ec6ad

JoneKone · 2025-10-25T12:12:24Z

Summary

Released each FCB’s main resource before invoking CcFlushCache in Ext2FlushFile, then reacquired it to clear flags only after the cache operations succeed. This prevents the cache manager from deadlocking against Ext4Fsd locks held across the flush.
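The drop-flush-reacquire ordering described above can be illustrated with a small user-mode sketch. This is not the driver code: `demo_fcb`, `demo_cache_flush`, `demo_flush_file` and `DEMO_FCB_DIRTY` are hypothetical stand-ins, the integer `lock_held` merely models the FCB main resource, and the assumption that the caller enters with the lock held is mine.

```c
#include <assert.h>

#define DEMO_FCB_DIRTY 0x1u   /* hypothetical flag bit, not an Ext4Fsd constant */

typedef struct {
    int      lock_held;     /* models the FCB main resource (1 = held) */
    unsigned flags;         /* models per-file state flags */
    int      flush_status;  /* value the simulated cache flush returns */
} demo_fcb;

/* Stand-in for the cache manager call; it must never run with the lock held. */
static int demo_cache_flush(const demo_fcb *fcb)
{
    assert(!fcb->lock_held);
    return fcb->flush_status;
}

/* Assumed contract: the caller holds the lock on entry and on return. */
static int demo_flush_file(demo_fcb *fcb)
{
    int status;

    assert(fcb->lock_held);
    fcb->lock_held = 0;               /* release before the cache call */
    status = demo_cache_flush(fcb);   /* runs outside the lock */
    fcb->lock_held = 1;               /* reacquire afterwards */

    if (status == 0)
        fcb->flags &= ~DEMO_FCB_DIRTY;  /* clear only after a successful flush */

    return status;
}
```

In this model a failed flush leaves the dirty flag set, so a later flush attempt would still see the file as needing work.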

Updated the top-level Ext2Flush and shutdown paths to drop and later reacquire the VCB main resource around file and volume flush calls so that cache work executes outside driver-held locks.

Reworked Ext2FlushVcb to capture buffered ranges while internal locks are held and defer CcFlushCache invocations until after those resources are released, keeping BCB activity outside Ext4Fsd lock scopes.
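A rough user-mode model of this capture-then-flush approach is sketched below. All names (`demo_volume`, `demo_range`, `demo_flush_volume`, and so on) and the fixed capacity of 8 ranges are assumptions made for illustration; the real Ext2FlushVcb works on kernel structures and real resources, which this sketch does not reproduce.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_RANGES 8   /* arbitrary capacity for the sketch */

typedef struct {
    long long offset;
    unsigned  length;
} demo_range;

typedef struct {
    int        lock_held;                 /* models the internal lock */
    demo_range busy[DEMO_MAX_RANGES];     /* ranges that need flushing */
    size_t     busy_count;
    unsigned   flushed_bytes;             /* bookkeeping for the demo only */
} demo_volume;

static void demo_lock(demo_volume *v)   { assert(!v->lock_held); v->lock_held = 1; }
static void demo_unlock(demo_volume *v) { assert(v->lock_held);  v->lock_held = 0; }

/* Stand-in for a cache flush of one range; asserts no lock is held. */
static void demo_cache_flush_range(demo_volume *v, demo_range r)
{
    assert(!v->lock_held);
    v->flushed_bytes += r.length;
}

/* Phase 1: copy the ranges under the lock. Phase 2: flush after unlocking. */
static size_t demo_flush_volume(demo_volume *v)
{
    demo_range snapshot[DEMO_MAX_RANGES];
    size_t n, i;

    demo_lock(v);
    n = v->busy_count;
    for (i = 0; i < n; i++)
        snapshot[i] = v->busy[i];
    demo_unlock(v);

    for (i = 0; i < n; i++)
        demo_cache_flush_range(v, snapshot[i]);

    return n;   /* number of ranges flushed */
}
```

The snapshot costs a copy, but it means the flush loop never needs the lock, which is the property the refactor is after.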

Testing

⚠️ Not run (kernel-mode verifier and heavy-load test harnesses are unavailable in this environment)

Flush path trace

During shutdown the driver now updates metadata while holding Vcb->MainResource, releases it before Ext2FlushFiles/Ext2FlushVolume, and reacquires it only after cache flushes complete, ensuring CcFlushCache is reached with no VCB locks held.

Ext2Flush follows the same pattern for user-initiated flushes: it releases the VCB resource before calling Ext2FlushFiles/Ext2FlushVolume. Ext2FlushFiles still serializes per-FCB access and calls Ext2FlushFile, which now drops and reacquires each FCB resource around the cache manager call.

Volume-level flushing records outstanding buffer-head ranges while holding internal locks, then performs the cache flushes once those locks are dropped, so cache/BCB activity no longer nests inside Ext4Fsd resources.

Reproduction notes

The requested deadlock reproduction with Driver Verifier could not be executed because the container environment lacks kernel-mode driver deployment and verification tooling.

Refactor flush paths to avoid cache deadlocks

ba09db3

JoneKone added the codex label Oct 25, 2025 — with ChatGPT Codex Connector

JoneKone wants to merge 1 commit into master from codex/analyze-flush-sequence-and-deadlock-in-ext4fsd
