This repository was archived by the owner on Apr 2, 2024. It is now read-only.

Implement series cache invalidation #1742

Closed
JamesGuthrie wants to merge 1 commit into master from jg/cache-the-simple-way

@JamesGuthrie
Member

This change implements invalidation of the series cache, and mechanisms to prevent the ingestion of data based on stale cache information.

In principle, the cache invalidation mechanism works as follows:

In the database, we track two values: `current_epoch` and `delete_epoch`. Both are Unix timestamps (stored in a BIGINT field for reasons of backwards compatibility) and are updated every time rows in the series table are deleted: `current_epoch` is set from `now()`, and `delete_epoch` is set from `now() - epoch_duration`, where `epoch_duration` is a configurable parameter.

When a series row is to be deleted, we do not delete it immediately. Instead, we set the `delete_epoch` column of the series row to the `current_epoch` timestamp (the time at which we decided that it will be deleted). Once `epoch_duration` has elapsed, the row is removed.
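
The two-phase deletion described above can be modelled in a few lines. This is only an in-memory Python sketch, not the SQL from the companion extension change: the names `SeriesTable`, `advance_epoch`, `mark_for_deletion`, and `remove_expired` are hypothetical, and it assumes that both marking and removal advance the epochs and that a row is removed once its `delete_epoch` column is at or below the global `delete_epoch`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SeriesTable:
    """Toy stand-in for the series table plus the two global epoch values."""

    epoch_duration: int          # seconds; the configurable parameter
    current_epoch: int = 0       # Unix timestamp
    delete_epoch: int = 0        # Unix timestamp
    # series id -> per-row delete_epoch (None means "not marked for deletion")
    rows: Dict[int, Optional[int]] = field(default_factory=dict)

    def advance_epoch(self, now: int) -> None:
        # current_epoch is set from now(), delete_epoch from now() - epoch_duration.
        self.current_epoch = now
        self.delete_epoch = now - self.epoch_duration

    def mark_for_deletion(self, series_id: int, now: int) -> None:
        # Phase 1: record when we decided to delete the row; do not remove it yet.
        self.advance_epoch(now)
        self.rows[series_id] = self.current_epoch

    def remove_expired(self, now: int) -> List[int]:
        # Phase 2: remove rows that were marked at least epoch_duration ago.
        self.advance_epoch(now)
        expired = [
            sid
            for sid, marked_at in self.rows.items()
            if marked_at is not None and marked_at <= self.delete_epoch
        ]
        for sid in expired:
            del self.rows[sid]
        return expired
```

With `epoch_duration=3600`, a row marked at timestamp 10000 survives a `remove_expired` call at 12000 and is removed by a call at 13600 or later.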

When the connector starts, it reads `current_epoch` from the database and stores this value with the series cache as `cache_current_epoch`. The connector periodically fetches the ids of series rows where `delete_epoch` is not null, together with `current_epoch`. It removes these entries from the cache and updates `cache_current_epoch` to the value of `current_epoch` that was fetched from the database.
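
A minimal sketch of the connector-side refresh step, written in Python for brevity rather than as the connector's actual code. The class `SeriesCache` and its methods are hypothetical, the cache is assumed to map a label-set key to a series id, and the database fetch is represented by plain arguments to `refresh`.

```python
from typing import Dict, Iterable, Optional


class SeriesCache:
    """Toy series cache: label-set key -> series id, plus the epoch it was read at."""

    def __init__(self, current_epoch: int) -> None:
        # On startup the connector reads current_epoch from the database.
        self.cache_current_epoch = current_epoch
        self._ids_by_labels: Dict[str, int] = {}

    def put(self, labels: str, series_id: int) -> None:
        self._ids_by_labels[labels] = series_id

    def get(self, labels: str) -> Optional[int]:
        return self._ids_by_labels.get(labels)

    def refresh(self, marked_ids: Iterable[int], current_epoch: int) -> None:
        # marked_ids: ids of series rows whose delete_epoch column is not null.
        marked = set(marked_ids)
        self._ids_by_labels = {
            labels: sid
            for labels, sid in self._ids_by_labels.items()
            if sid not in marked
        }
        # Only after evicting the marked entries may the cache claim the newer epoch.
        self.cache_current_epoch = current_epoch
```

Evicting before updating `cache_current_epoch` matters in this sketch: the epoch is a claim that the cache contains no series marked for deletion as of that time.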

As the connector receives samples to be inserted, it tracks the smallest value of `cache_current_epoch` that it saw for that batch. When it inserts the batch into the database, it asserts (in the database) that the cache it read from was not stale. This is expressed as: `cache_current_epoch > delete_epoch`. If this is not the case, the insert is aborted.
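
The batch-level check can be sketched as follows. This is an illustrative Python model, not the connector's implementation: `Batch`, `insert_batch`, and `StaleCacheError` are hypothetical names, and the assertion that the PR performs inside the database is modelled here as an ordinary function that receives the database's `delete_epoch` as an argument.

```python
from typing import List, Optional, Tuple


class StaleCacheError(Exception):
    """Raised when a batch was built from cache state that may be stale."""


class Batch:
    def __init__(self) -> None:
        self.samples: List[Tuple[int, float]] = []
        self.min_cache_epoch: Optional[int] = None

    def add(self, series_id: int, value: float, cache_current_epoch: int) -> None:
        # Track the smallest cache epoch seen while resolving series ids.
        self.samples.append((series_id, value))
        if self.min_cache_epoch is None or cache_current_epoch < self.min_cache_epoch:
            self.min_cache_epoch = cache_current_epoch


def insert_batch(batch: Batch, delete_epoch: int, sink: List[Tuple[int, float]]) -> int:
    """Append the batch to sink if the staleness assertion holds; return the row count."""
    if batch.min_cache_epoch is None:
        return 0  # empty batch, nothing to check or insert
    # The assertion from the description: cache_current_epoch > delete_epoch.
    if not batch.min_cache_epoch > delete_epoch:
        raise StaleCacheError(
            f"cache epoch {batch.min_cache_epoch} is not newer than "
            f"delete_epoch {delete_epoch}"
        )
    sink.extend(batch.samples)
    return len(batch.samples)
```

Tracking the minimum rather than the latest epoch is the conservative choice: if any sample in the batch was resolved against an older cache state, the whole batch is judged by that older state.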

This is a companion change to [1], which implements the database-side logic required for cache invalidation.

Description

Merge requirements

Please take into account the following non-code changes that you may need to make with your PR:

  • CHANGELOG entry for user-facing changes
  • Updated the relevant documentation

@JamesGuthrie requested a review from a team as a code owner on November 4, 2022, 15:31
[1]: timescale/promscale_extension#529
@JamesGuthrie force-pushed the jg/cache-the-simple-way branch from ede85e2 to 7e8395f on November 11, 2022, 15:55
@JamesGuthrie
Member Author

Closing in favor of #1752
