Add WAL for direct deployment state recovery #4106
varundeepsaini wants to merge 9 commits into databricks:main
@denik @andrewnester the pr is ready for review
@denik @andrewnester bumping again ^^
@denik fixed the build failures, could you approve the workflows
@denik could you approve the workflow
@andrewnester @denik bump ^^
denik left a comment:
Thanks - looks good. Left a few comments.
@denik could you approve the workflows
Commit: 6b1c292
17 interesting tests: 10 SKIP, 7 RECOVERED
Top 24 slowest tests (at least 2 minutes):
tests finally ran
@denik bump ^^
acceptance/bundle/artifacts/artifact_upload_with_no_library_reference/test.toml (outdated, resolved)
acceptance/bundle/artifacts/shell/default/out.deploy.direct.txt (outdated, resolved)
@denik could you please run the ci
Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com>
Signed-off-by: Varun Deep Saini <deepsainivarun@gmail.com>
@denik I looked at the test failures; rebasing should resolve them, and I have done that.
An authorized user can trigger integration tests manually by following the instructions below: Trigger: Inputs:
Checks will be approved automatically on success.
@denik
	if len(plan.Plan) == 0 {
		// Avoid creating state file if nothing to deploy
		if b.StateDB.RecoveredFromWAL() {
			if err := b.StateDB.Finalize(); err != nil {
Why do we need to call Finalize() here? We already called DeploymentState.Open() which already saved the state, right?
	]

	[[Repls]]
	Old = 'Updating deployment state...\n'
When I said we should not have this difference I meant it should not be printed in the first place, not that we should hide it :)
Hi @varundeepsaini, thanks a lot, this is great work. There are additional considerations, e.g. the recently added state migration. I'm going to take over this PR as I'd like to have WAL functionality in.
Sure @denik |
Closes: #4090
Changes
Add write-ahead log (WAL) to record state changes during direct deployment, enabling recovery of partial state if deployment is interrupted.
Added an Offset to KillCaller: it now starts killing the process only after Offset successful requests to the endpoint.
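The Offset behaviour can be pictured with a small counter. This is only an illustrative sketch: `killAfter` and `ShouldKill` are hypothetical names, not the actual KillCaller implementation in the test server, and the assumption is that the first Offset requests succeed and every later one triggers the kill.

```go
package main

import "fmt"

// killAfter is a hypothetical stand-in for a KillCaller rule with an Offset.
type killAfter struct {
	Offset int // number of requests allowed to succeed before killing starts
	seen   int // number of requests observed so far
}

// ShouldKill counts the current request and reports whether it should
// kill the caller. The first Offset requests return false.
func (k *killAfter) ShouldKill() bool {
	k.seen++
	return k.seen > k.Offset
}

func main() {
	k := &killAfter{Offset: 2}
	for i := 1; i <= 4; i++ {
		fmt.Printf("request %d: kill=%v\n", i, k.ShouldKill())
	}
}
```

With an Offset of 0 the sketch kills on the first request, which matches the behaviour described before the Offset was added.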
Why
Today, if deployment is interrupted before Finalize(), no state is saved, and created resources become orphaned. The WAL writes each state change immediately to disk and replays them on restart.
Tests
Tests added for WAL save/replay, delete/replay, finalize cleanup, and edge cases.