Fix pg rewind by dAdAbird · Pull Request #559 · percona/pg_tde

dAdAbird · 2026-03-31T19:06:28Z

The rewind might update bloks on target's file with that of the source.
In that case, encrypted files will end up with different blocks
encrypted with different internal keys, leading to data corruption.
To prevent this, we have to decryp blocks coming from the source and
re-encrypt them with the target's internal key.

Along with that, providers might have been changed, primary keys
rotated, etc., after the divergence of clusters. Generally, it's not a
problem because we would replace the keys and providers on the target
with those of the source. But it will render internal keys that we have
used for the partial blocks unreadable. To fix this, we have to
re-encrypt such internal keys with source's primary keys.

So the sequence is next:

Copy source's pg_tde dir into a tmp location. All following reads
and writes of source keys and providers happen in this tmp dir.
When we have partial updates, check for the source key. And
re-encrypt blocks if the key exists.
Save the target key into the source _keys file replacning the
existing one.
Replace target's pg_tde dir with the tmp one.

See commit messages for additional info.

It's still a draft laking:

Tests.
Refactoring. The 2nd commit should be split in two. Plus I didn't much think on names etc.
Need to clean-up the tmp dir on failure.
We maybe need to merge WAL key as well as rewind sometimes keeps one targets WAL segment from before the diverge. But is not a hard fix but I'm not sure if we need it and I don't want to have useless code sitting around. I need to think more.
We may or may not want to sole issues with _fsm of partially updated relations. Currently they end-up with wrong key (read currpted). But server successfully resolve it by just zeroing corrupted pages: Fixed

2026-03-31 18:57:23.703 UTC [116297] LOG:  page verification failed, calculated checksum 39771 but expected 16415
2026-03-31 18:57:23.703 UTC [116297] CONTEXT:  WAL redo at 0/30D1C70 for Heap/INSERT: off: 28, flags: 0x08; blkref #0: rel 1663/16384/16434, blk 14
2026-03-31 18:57:23.703 UTC [116297] LOG:  invalid page in block 2 of relation "base/16384/16434_fsm"; zeroing out page
2026-03-31 18:57:23.703 UTC [116297] CONTEXT:  WAL redo at 0/30D1C70 for Heap/INSERT: off: 28, flags: 0x08; blkref #0: rel 1663/16384/16434, blk 14
2026-03-31 18:57:23.703 UTC [116297] WARNING:  invalid page in block 2 of relation "base/16384/16434_fsm"; zeroing out page
2026-03-31 18:57:23.703 UTC [116297] CONTEXT:  WAL redo at 0/30D1C70 for Heap/INSERT: off: 28, flags: 0x08; blkref #0: rel 1663/16384/16434, blk 14
2026-03-31 18:57:23.737 UTC [116297] LOG:  completed backup recovery with redo LSN 0/30AB4B8 and end LSN 0/3649278

Ah, and all changes only for pg18 so far.

The rewind might update bloks on target's file with that of the source. In that case, encrypted files will end up with different blocks encrypted with different internal keys, leading to data corruption. To prevent this, we have to decryp blocks coming from the source and re-encrypt them with the target's internal key. Along with that, providers might have been changed, primary keys rotated, etc., after the divergence of clusters. Generally, it's not a problem because we would replace the keys and providers on the target with those of the source. But it will render internal keys that we have used for the partial blocks unreadable. To fix this, we have to re-encrypt such internal keys with source's primary keys. So the sequence is next: 1. Copy source's pg_tde dir into a tmp location. All following reads and writes of source keys and providers happen in this tmp dir. 2. When we have partial updates, check for the source key. And re-encrypt blocks if the key exists. 3. Save the target key into the source _keys file replacning the existing one. 4. Replace target's pg_tde dir with the tmp one.

It also fixes an issue when encrypted files might remain untouched by rewind on the target. That means they are encrypted with the target's key wich vanishes after the pg_tde replace. So we need to ensure keys for such files. Meaning if this is an encrypted relation, we save its target key into the source's keys.

Do what "dry-run" prescribes, everything, but modify the target directory.

We rewrite the internal key for relations when partially re-encrypting blocks. That makes its FSM and VM fork unreadable as they are still encrypted with the old key. To fix that, we re-encrypt such forks with the proper key after we finish processing the main fork file. As pg_rewind processes files in the order of operation types (see file_action_t) and whole-file copies occur before any partial writes, we assume that for files already in the target datadir, we rewrite them in-place.

codecov-commenter · 2026-04-03T18:06:31Z

Codecov Report

❌ Patch coverage is 60.47619% with 83 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.32%. Comparing base (a2ae945) to head (7924f61).
⚠️ Report is 7 commits behind head on main.

❌ Your project status has failed because the head coverage (75.95%) is below the target coverage (90.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #559      +/-   ##
==========================================
- Coverage   58.85%   57.32%   -1.53%     
==========================================
  Files          69       69              
  Lines       10822    10868      +46     
  Branches     1870     2676     +806     
==========================================
- Hits         6369     6230     -139     
+ Misses       3578     3356     -222     
- Partials      875     1282     +407

Components	Coverage Δ
access	`80.67% <ø> (-1.24%)`	⬇️
bin	`63.76% <ø> (-4.24%)`	⬇️
catalog	`77.86% <ø> (-6.78%)`	⬇️
common	`76.47% <ø> (+4.24%)`	⬆️
encryption	`64.53% <100.00%> (+5.71%)`	⬆️
keyring	`65.12% <ø> (-2.69%)`	⬇️
src	`87.33% <ø> (-2.99%)`	⬇️
smgr	`90.20% <100.00%> (-1.08%)`	⬇️

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

dAdAbird added 3 commits March 31, 2026 20:56

pg_rewind: fix dry-run mode with TDE

bc50798

Do what "dry-run" prescribes, everything, but modify the target directory.

dAdAbird force-pushed the fix_pg_rewind branch from 7d41bcd to bc50798 Compare April 2, 2026 17:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix pg rewind#559

Fix pg rewind#559
dAdAbird wants to merge 4 commits intopercona:mainfrom
dAdAbird:fix_pg_rewind

dAdAbird commented Mar 31, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dAdAbird commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov-commenter commented Apr 3, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

dAdAbird commented Mar 31, 2026 •

edited

Loading