Switch from logical to time-based epoch#512
Conversation
796a8cf to
d4a0dc1
Compare
This change adjusts the existing logical cache epoch mechanism to be time-based. Effectively the time-based epoch mechanism uses the current system time (`now()`) to determine the current epoch. The primary advantage of this mechanism is that it allows for exact time-based decisions to be made on when to drop series data (the logical epoch mechanism relied on the fact that the epoch was incremented on a fixed time schedule). The `_prom_catalog.global_epoch` table contains one row tracking the `current_epoch`, and `delete_epoch`, both TIMESTAMPTZ. This table replaces `_prom_catalog.ids_epoch`. The `delete_epoch` column in the `prom_data_series.<metric_name>` table was replaced with the `mark_for_deletion_epoch` column. `current_epoch` contains the timestamp of the current epoch, and is updated every time we delete series. `mark_for_deletion_epoch` contains either NULL, or a TIMESTAMPTZ, which is the value of `current_epoch` when the decision was made to delete the series in the future. `delete_epoch` contains the maximum value of `mark_for_deletion_epoch` for rows in `prom_data_series.<metric_name>` which were deleted.
d4a0dc1 to
8c0759d
Compare
This is a companion change to [1]. That change modifies the database schema to switch to a time-based epoch. This change propagates the time-based epoch into the Promscale connector. For more details about the rationale behind this change, see [1]. [1]: timescale/promscale_extension#512
jgpruitt
left a comment
There was a problem hiding this comment.
Consider my feedback optional. Not marking "Approve" because I'd like someone who's worked on the caching proof to also review.
migration/idempotent/001-base.sql
Outdated
| --dump/restore issues. | ||
| EXECUTE format('CREATE INDEX series_labels_%s ON prom_data_series.%I USING GIN (labels)', NEW.id, NEW.table_name); | ||
| EXECUTE format('CREATE INDEX series_delete_epoch_id_%s ON prom_data_series.%I (delete_epoch, id) WHERE delete_epoch IS NOT NULL', NEW.id, NEW.table_name); | ||
| EXECUTE format('CREATE INDEX series_mark_for_deletion_epoch_id_%s ON prom_data_series.%I (mark_for_deletion_epoch, id) WHERE mark_for_deletion_epoch IS NOT NULL', NEW.id, NEW.table_name); |
There was a problem hiding this comment.
Consider including the id column instead of indexing it. It seems to make for a smaller index that is faster for inserts and about the same for queries. Then again, the difference may not be big enough to matter.
i.e. (mark_for_deletion_epoch) include (id) where mark_for_deletion_epoch is not null)
here's the scratch work i did to compare
https://gist.github.com/jgpruitt/e6a03b791269179ff25f878a8a333ce8
There was a problem hiding this comment.
and this assumes that we're only wanting to select ids, not search by them, which could be an incorrect assumption
There was a problem hiding this comment.
TIL about tablesample. Thank you for sharing this analysis!
I also took a look at this: https://use-the-index-luke.com/blog/2019-04/include-columns-in-btree-indexes, which is an excellent explainer on the structural difference between INCLUDE and not.
I've made this change.
| -- this ensures that pristine and migrated DBs have the same values | ||
| -- it's also important that current_epoch > delete_epoch | ||
| INSERT INTO _prom_catalog.global_epoch (current_epoch, delete_epoch, is_unique) | ||
| VALUES ('2020-01-01 00:00:00 UTC', '1970-01-01 00:00:00 UTC', true); |
There was a problem hiding this comment.
I personally like '-infinity' better than the 1970 timestamp for sentinels like this, but I admit that's wholly subjective.
There was a problem hiding this comment.
you could use VALUES ('epoch', '-infinity', true)
feels more like proper constants to me
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-DATETIME-SPECIAL-VALUES
There was a problem hiding this comment.
ooh. I like that suggestion. Will adopt.
Because we have exclusive access to the database, we know that we can remove all series rows where delete epoch is not NULL. This optimization ensures that we don't have to rewrite the series table, and we know that the partial index will be empty.
| , max(mark_for_deletion_epoch) as max_deletion_time | ||
| FROM deleted_series d, | ||
| LATERAL unnest(d.labels) as labels(label) | ||
| $query$, metric_schema, metric_table, metric_series_table, deletion_time) INTO label_array, max_deletion_time; |
There was a problem hiding this comment.
Where did max_deletion_time come from? I don't remember it appearing either in specs or in Mat's proof.
| UPDATE _prom_catalog.ids_epoch | ||
| SET (current_epoch, last_update_time) = (next_epoch, now()) | ||
| WHERE current_epoch < next_epoch; | ||
| UPDATE _prom_catalog.global_epoch e |
There was a problem hiding this comment.
I realize this isn't the final protocol implementation, nonetheless I want to point out that this update happens at a different point and with different locking in the proven protocol.
| END IF; | ||
| SELECT current_epoch, last_update_time INTO present_epoch, last_updated FROM | ||
| _prom_catalog.ids_epoch LIMIT 1; | ||
| SELECT e.current_epoch, e.delete_epoch INTO present_epoch, delete_epoch FROM |
There was a problem hiding this comment.
I'm not sure this line is even necessary now, when 183 doesn't reference it.
This is a companion change to [1]. That change modifies the database schema to switch to a time-based epoch. This change propagates the time-based epoch into the Promscale connector. For more details about the rationale behind this change, see [1]. [1]: timescale/promscale_extension#512
This is a companion change to [1]. That change modifies the database schema to switch to a time-based epoch. This change propagates the time-based epoch into the Promscale connector. For more details about the rationale behind this change, see [1]. [1]: timescale/promscale_extension#512
This is a companion change to [1]. That change modifies the database schema to switch to a time-based epoch. This change propagates the time-based epoch into the Promscale connector. For more details about the rationale behind this change, see [1]. [1]: timescale/promscale_extension#512
This is a companion change to [1]. That change modifies the database schema to switch to a time-based epoch. This change propagates the time-based epoch into the Promscale connector. For more details about the rationale behind this change, see [1]. [1]: timescale/promscale_extension#512
|
This PR has been superseded by #529 |
Description
This change adjusts the existing logical cache epoch mechanism to be
time-based.
Effectively the time-based epoch mechanism uses the current system time
(
now()) to determine the current epoch. The primary advantage of thismechanism is that it allows for exact time-based decisions to be made on
when to drop series data (the logical epoch mechanism relied on the fact
that the epoch was incremented on a fixed time schedule).
The
_prom_catalog.global_epochtable contains one row tracking thecurrent_epoch, anddelete_epoch, both TIMESTAMPTZ. This tablereplaces
_prom_catalog.ids_epoch.The
delete_epochcolumn in theprom_data_series.<metric_name>tablewas replaced with the
mark_for_deletion_epochcolumn.current_epochcontains the timestamp of the current epoch, and isupdated every time we delete series.
mark_for_deletion_epochcontains either NULL, or a TIMESTAMPTZ, whichis the value of
current_epochwhen the decision was made to delete theseries in the future.
delete_epochcontains the maximum value ofmark_for_deletion_epochfor rows in
prom_data_series.<metric_name>which were deleted.Merge requirements
Please take into account the following non-code changes that you may need to make with your PR: