Draft
The rewind might update blocks in the target's file with those from the source. In that case, an encrypted file would end up with blocks encrypted with different internal keys, leading to data corruption. To prevent this, we have to decrypt blocks coming from the source and re-encrypt them with the target's internal key.

Along with that, providers might have been changed, primary keys rotated, etc., after the clusters diverged. Generally, that's not a problem because we replace the keys and providers on the target with those of the source. But it renders the internal keys that we used for the partial blocks unreadable. To fix this, we have to re-encrypt such internal keys with the source's primary keys.

So the sequence is as follows:
1. Copy the source's pg_tde dir into a tmp location. All following reads and writes of source keys and providers happen in this tmp dir.
2. When we have partial updates, check for the source key, and re-encrypt the blocks if the key exists.
3. Save the target key into the source _keys file, replacing the existing one.
4. Replace the target's pg_tde dir with the tmp one.
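The block re-encryption and key re-wrapping described above can be illustrated with a small model. This is only a sketch under loud assumptions: `toy_crypt` is a hash-based XOR stand-in and not the cipher pg_tde uses, and the function names, the key values, and the 16-byte block size are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 16  # toy size for the demo, not PostgreSQL's real block size


def keystream(key: bytes, block_no: int, length: int) -> bytes:
    """Derive a per-block keystream from the key and block number (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        seed = key + block_no.to_bytes(8, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(seed).digest()
        counter += 1
    return out[:length]


def toy_crypt(key: bytes, block_no: int, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, block_no, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


def reencrypt_block(source_key: bytes, target_key: bytes, block_no: int, block: bytes) -> bytes:
    """Decrypt a block that came from the source, then encrypt it with the target's key."""
    plain = toy_crypt(source_key, block_no, block)
    return toy_crypt(target_key, block_no, plain)


def rewrap_internal_key(wrapped: bytes, target_primary: bytes, source_primary: bytes) -> bytes:
    """Unwrap an internal key with the target's primary key and wrap it with the source's."""
    internal = toy_crypt(target_primary, 0, wrapped)
    return toy_crypt(source_primary, 0, internal)


# Hypothetical keys for the demo.
source_key = b"source-internal-key"
target_key = b"target-internal-key"

# Target file: three blocks, all encrypted with the target's internal key.
plain_blocks = [b"A" * BLOCK_SIZE, b"B" * BLOCK_SIZE, b"C" * BLOCK_SIZE]
target_file = [toy_crypt(target_key, i, b) for i, b in enumerate(plain_blocks)]

# Rewind brings block 1 from the source, encrypted with the source's internal key.
new_plain = b"X" * BLOCK_SIZE
incoming = toy_crypt(source_key, 1, new_plain)
target_file[1] = reencrypt_block(source_key, target_key, 1, incoming)

# The whole file is still readable with a single key.
decrypted = [toy_crypt(target_key, i, b) for i, b in enumerate(target_file)]

# The target's internal key must stay readable once the source's primary key takes over.
target_primary = b"target-primary-key"
source_primary = b"source-primary-key"
wrapped_on_target = toy_crypt(target_primary, 0, target_key)
wrapped_for_source = rewrap_internal_key(wrapped_on_target, target_primary, source_primary)
```

Writing `incoming` into the file without `reencrypt_block` would leave block 1 encrypted with a different key than blocks 0 and 2, which is the corruption scenario the description starts from.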
It also fixes an issue where encrypted files may remain untouched by rewind on the target. Such files are encrypted with the target's key, which vanishes after the pg_tde dir is replaced. So we need to preserve the keys for such files: if this is an encrypted relation, we save its target key into the source's keys.
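The key-directory handling (copy the source's pg_tde dir to a tmp location, carry over the target keys that are still needed, then swap the directory in) might be modelled roughly as follows. This is a sketch, not the pg_tde implementation: the JSON file `keys.json`, the relation names, and the `rewind_keys` helper are hypothetical stand-ins for the real _keys file format and code.

```python
import json
import os
import shutil
import tempfile

KEYS_FILE = "keys.json"  # hypothetical stand-in for the real _keys file


def load_keys(tde_dir):
    with open(os.path.join(tde_dir, KEYS_FILE)) as f:
        return json.load(f)


def save_keys(tde_dir, keys):
    with open(os.path.join(tde_dir, KEYS_FILE), "w") as f:
        json.dump(keys, f)


def rewind_keys(source_tde, target_tde, relations_kept_on_target):
    """Replace the target's key dir with the source's, keeping the target keys still in use."""
    target_keys = load_keys(target_tde)

    # Step 1: work on a temporary copy of the source's directory.
    work = tempfile.mkdtemp()
    tmp_tde = os.path.join(work, "pg_tde")
    shutil.copytree(source_tde, tmp_tde)

    # Steps 2-3: relations whose data stays encrypted with a target key
    # (partially updated or untouched) get that key saved into the copy.
    keys = load_keys(tmp_tde)
    for rel in relations_kept_on_target:
        if rel in target_keys:  # unencrypted relations have no key to carry over
            keys[rel] = target_keys[rel]
    save_keys(tmp_tde, keys)

    # Step 4: swap the prepared directory in.
    shutil.rmtree(target_tde)
    shutil.move(tmp_tde, target_tde)
    shutil.rmtree(work)


# Demo with throwaway directories and made-up relation names.
root = tempfile.mkdtemp()
source_tde = os.path.join(root, "source", "pg_tde")
target_tde = os.path.join(root, "target", "pg_tde")
os.makedirs(source_tde)
os.makedirs(target_tde)
save_keys(source_tde, {"rel_a": "src-a", "rel_b": "src-b"})
save_keys(target_tde, {"rel_a": "tgt-a", "rel_c": "tgt-c"})

# rel_a: partially updated, rel_c: untouched but encrypted, rel_plain: not encrypted.
rewind_keys(source_tde, target_tde, ["rel_a", "rel_c", "rel_plain"])
result = load_keys(target_tde)
source_after = load_keys(source_tde)
shutil.rmtree(root)
```

In this model `rel_c` is the untouched encrypted file: without the carry-over step its key would disappear together with the old target directory.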
Do everything that "dry-run" prescribes, except modifying the target directory.
We rewrite the internal key for relations when partially re-encrypting blocks. That makes their FSM and VM forks unreadable, as they are still encrypted with the old key. To fix that, we re-encrypt such forks with the proper key after we finish processing the main fork file. As pg_rewind processes files in the order of operation types (see file_action_t) and whole-file copies occur before any partial writes, we assume that files already in the target datadir can be rewritten in place.
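The fork fix-up can be pictured with a toy model in which a relation is a dict of forks, each a list of encrypted blocks. Everything here is an assumption made for illustration: the XOR cipher, the `finish_relation` helper, and the key names are hypothetical and do not reflect pg_tde's actual code or on-disk layout.

```python
import hashlib


def toy_crypt(key: bytes, block_no: int, data: bytes) -> bytes:
    """Toy XOR cipher (not pg_tde's); the same call encrypts and decrypts."""
    ks = hashlib.sha256(key + block_no.to_bytes(8, "big")).digest()
    if len(data) > len(ks):
        raise ValueError("toy cipher handles blocks of up to 32 bytes")
    return bytes(a ^ b for a, b in zip(data, ks))


def reencrypt_fork(blocks, old_key, new_key):
    """Rewrite every block of a fork in place: decrypt with old_key, encrypt with new_key."""
    return [toy_crypt(new_key, i, toy_crypt(old_key, i, b)) for i, b in enumerate(blocks)]


def finish_relation(forks, old_key, new_key):
    """Run after the main fork is processed: bring the FSM and VM forks to the new key."""
    for name in ("fsm", "vm"):
        if name in forks:
            forks[name] = reencrypt_fork(forks[name], old_key, new_key)


old_key = b"old-internal-key"
new_key = b"new-internal-key"

fsm_plain = [b"fsm-page-0", b"fsm-page-1"]
vm_plain = [b"vm-page-0"]
forks = {
    # The main fork is assumed to be readable with new_key already.
    "main": [toy_crypt(new_key, 0, b"heap-page-0")],
    # The auxiliary forks are still encrypted with the old key.
    "fsm": [toy_crypt(old_key, i, b) for i, b in enumerate(fsm_plain)],
    "vm": [toy_crypt(old_key, i, b) for i, b in enumerate(vm_plain)],
}

finish_relation(forks, old_key, new_key)

fsm_read = [toy_crypt(new_key, i, b) for i, b in enumerate(forks["fsm"])]
vm_read = [toy_crypt(new_key, i, b) for i, b in enumerate(forks["vm"])]
```

Skipping `finish_relation` would leave the FSM and VM blocks decrypting to garbage under `new_key`, which corresponds to the unreadable forks the commit message mentions.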
Codecov Report
❌ Your project status has failed because the head coverage (75.95%) is below the target coverage (90.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## main #559 +/- ##
==========================================
- Coverage 58.85% 57.32% -1.53%
==========================================
Files 69 69
Lines 10822 10868 +46
Branches 1870 2676 +806
==========================================
- Hits 6369 6230 -139
+ Misses 3578 3356 -222
- Partials 875 1282 +407
See commit messages for additional info.
It's still a draft, lacking:
We may or may not want to solve the issues with the _fsm of partially updated relations. Currently they end up with the wrong key (read: corrupted), but the server successfully resolves this by just zeroing the corrupted pages.
Fixed