This repository was archived by the owner on Apr 2, 2024. It is now read-only.
Closed
This change implements invalidation of the series cache, along with mechanisms that prevent data from being ingested on the basis of stale cache information.

In principle, the cache invalidation mechanism works as follows:

In the database, we track two values: `current_epoch` and `delete_epoch`. Both are unix timestamps (stored in a BIGINT field for backwards compatibility) and are updated every time rows in the series table are deleted. `current_epoch` is set from `now()`, and `delete_epoch` is set from `now() - epoch_duration`, where `epoch_duration` is a configurable parameter.

When a series row is to be deleted, we do not delete it immediately. Instead, we set the `delete_epoch` column of the series row to the `current_epoch` timestamp (the time at which we decided that it will be deleted). Once `epoch_duration` has elapsed, the row is removed.

When the connector starts, it reads `current_epoch` from the database and stores this value with the series cache as `cache_current_epoch`. The connector periodically fetches the ids of series rows where `delete_epoch` is not null, together with `current_epoch`. It removes these entries from the cache and updates `cache_current_epoch` to the fetched value of `current_epoch`.

As the connector receives samples to be inserted, it tracks the smallest value of `cache_current_epoch` that it saw for that batch. When it inserts the samples of a batch into the database, it asserts (in the database) that the cache which was read from was not stale. This is expressed as: `cache_current_epoch > delete_epoch`. If this is not the case, the insert is aborted.

This is a companion change to [1], which implements the database-side logic required for cache invalidation.

[1]: timescale/promscale_extension#529
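The connector-side flow described above can be sketched roughly as follows. This is a minimal, single-goroutine illustration, not the code in this PR: `SeriesCache`, `Batch`, `Refresh`, `Observe`, and `CheckEpoch` are hypothetical names, the database is not involved at all, and the real assertion is performed in the database rather than in Go.

```go
package main

import (
	"errors"
	"fmt"
)

// Epoch is a unix timestamp in seconds (a BIGINT on the database side).
type Epoch int64

// ErrStaleCache signals that a batch was built from cache data that may
// refer to series rows which have since been removed.
var ErrStaleCache = errors.New("series cache is stale, aborting insert")

// SeriesCache is a hypothetical, non-concurrent stand-in for the series cache.
type SeriesCache struct {
	ids               map[string]int64 // series key -> series id
	cacheCurrentEpoch Epoch            // current_epoch as of the last refresh
}

// NewSeriesCache stores the current_epoch read at connector startup.
func NewSeriesCache(startEpoch Epoch) *SeriesCache {
	return &SeriesCache{ids: map[string]int64{}, cacheCurrentEpoch: startEpoch}
}

// Put records a series id under its key.
func (c *SeriesCache) Put(key string, id int64) { c.ids[key] = id }

// Get returns the cached id and the cache epoch at the time of the lookup.
func (c *SeriesCache) Get(key string) (int64, Epoch, bool) {
	id, ok := c.ids[key]
	return id, c.cacheCurrentEpoch, ok
}

// Refresh applies one periodic poll: it evicts the ids whose delete_epoch
// is not null and advances cache_current_epoch to the fetched current_epoch.
func (c *SeriesCache) Refresh(deletedIDs []int64, currentEpoch Epoch) {
	deleted := make(map[int64]struct{}, len(deletedIDs))
	for _, id := range deletedIDs {
		deleted[id] = struct{}{}
	}
	for key, id := range c.ids {
		if _, gone := deleted[id]; gone {
			delete(c.ids, key)
		}
	}
	c.cacheCurrentEpoch = currentEpoch
}

// Batch tracks the smallest cache epoch observed while it was assembled.
type Batch struct {
	minEpoch Epoch
	seen     bool
}

// Observe records the cache epoch seen for one cache lookup.
func (b *Batch) Observe(e Epoch) {
	if !b.seen || e < b.minEpoch {
		b.minEpoch, b.seen = e, true
	}
}

// MinEpoch returns the smallest epoch observed so far.
func (b *Batch) MinEpoch() Epoch { return b.minEpoch }

// CheckEpoch mirrors the assertion cache_current_epoch > delete_epoch.
func CheckEpoch(batchEpoch, deleteEpoch Epoch) error {
	if batchEpoch > deleteEpoch {
		return nil
	}
	return ErrStaleCache
}

func main() {
	cache := NewSeriesCache(1000)
	cache.Put(`cpu{host="a"}`, 1)
	cache.Put(`cpu{host="b"}`, 2)

	var batch Batch
	if _, epoch, ok := cache.Get(`cpu{host="a"}`); ok {
		batch.Observe(epoch) // looked up before the refresh
	}

	cache.Refresh([]int64{1}, 5000) // series 1 was marked for deletion
	if _, epoch, ok := cache.Get(`cpu{host="b"}`); ok {
		batch.Observe(epoch) // looked up after the refresh
	}

	// delete_epoch = now() - epoch_duration, here 5000 - 3600 (assumed values).
	fmt.Println(CheckEpoch(batch.MinEpoch(), 5000-3600))
}
```

Tracking the minimum epoch per batch means a single lookup against an old cache state is enough to reject the whole batch, which matches the conservative behaviour described above.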
Closing in favor of #1752.