From a407f4e2858aa36cbefc5781d5fd060efa1397af Mon Sep 17 00:00:00 2001
From: gmarouli
Date: Wed, 14 Jan 2026 14:06:05 +0200
Subject: [PATCH 1/4] Introduce downsampling methods

---
 .../data-streams/downsampling-concepts.md | 32 +++++++++++++------
 .../data-streams/run-downsampling.md | 12 ++++---
 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/manage-data/data-store/data-streams/downsampling-concepts.md b/manage-data/data-store/data-streams/downsampling-concepts.md
index 5252dd4d46..1ef14d1c31 100644
--- a/manage-data/data-store/data-streams/downsampling-concepts.md
+++ b/manage-data/data-store/data-streams/downsampling-concepts.md
@@ -27,7 +27,7 @@ In a time series data stream, a single document is created for each timestamp. T
 :alt: time series metric anatomy
 :::
 
-For the most current data, the metrics series typically has a low sampling time interval, to optimize for queries that require a high data resolution.
+For the most current data, the metrics series typically has a low sampling time interval to optimize for queries that require a high data resolution.
 
 :::{image} /manage-data/images/elasticsearch-reference-time-series-original.png
 :alt: time series original
@@ -51,15 +51,7 @@ Downsampling is applied to the individual backing indices of the TSDS. The downs
 
     For example, a TSDS index that contains metrics sampled every 10 seconds can be downsampled to an hourly index. All documents within a given hour interval are summarized and stored as a single document in the downsampled index.
 2. For each new document, copies all [time series dimensions](time-series-data-stream-tsds.md#time-series-dimension) from the source index to the target index. Dimensions in a TSDS are constant, so this step happens only once per bucket.
-3. For each [time series metric](time-series-data-stream-tsds.md#time-series-metric) field, computes aggregations for all documents in the bucket.
-
-    * `gauge` field type:
-        * `min`, `max`, `sum`, and `value_count` are stored as type `aggregate_metric_double`.
-    * `counter` field type:
-        * the last value is stored and the type is preserved.
-    * `histogram` field type: {applies_to}`stack: preview 9.3` {applies_to}`serverless: preview`
-        * individual histograms are merged into a single histogram that is stored, preserving the type. The `histogram` field type uses the [T-Digest](elasticsearch://reference/aggregations/search-aggregations-metrics-percentile-aggregation.md) algorithm.
-
+3. For each [time series metric](time-series-data-stream-tsds.md#time-series-metric) field, it computes the downsampled values based on the [downsampling method](#downsampling-methods).
 4. For all other fields, copies the most recent value to the target index.
 5. Replaces the original index with the downsampled index, then deletes the original index.
 
@@ -71,6 +63,26 @@ You can downsample a downsampled index. The subsequent downsampling interval mus
 
 % TODO ^^ consider mini table in step 3; refactor generally
 
+### Downsampling methods [downsampling-methods]
+
+The downsampling method is the technique used to reduce multiple values within the same bucket into a single representative value. Two distinct methods exist:
+
+* `last_value`: {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga`
+  This method increases the sampling interval by storing only the most recent value for each metric in the same bucket. While this reduces data accuracy, it offers the benefit of conserving storage space. It applies to all metric types.
+
+* `aggregate`:
+  This method preserves data accuracy by computing and storing statistical aggregations for all documents within the bucket, though it requires more storage space. It applies to each metric type in the following way:
+  * `gauge` field type:
+    * `min`, `max`, `sum`, and `value_count` are stored as type `aggregate_metric_double`.
+  * `counter` field type:
+    * the last value is stored and the type is preserved.
+  * `histogram` field type: {applies_to}`stack: preview 9.3` {applies_to}`serverless: preview`
+    * individual histograms are merged into a single histogram that is stored, preserving the type. The `histogram` field type uses the [T-Digest](elasticsearch://reference/aggregations/search-aggregations-metrics-percentile-aggregation.md) algorithm.
+
+:::{tip}
+When downsampling a downsampled index, you need to use the same downsampling method as the source index.
+:::
+
 ### Source and target index field mappings [downsample-api-mappings]
 
 Fields in the target downsampled index are created with the same mapping as in the source index, with one exception: `time_series_metric: gauge` fields are changed to `aggregate_metric_double`.
diff --git a/manage-data/data-store/data-streams/run-downsampling.md b/manage-data/data-store/data-streams/run-downsampling.md
index d4c67a3e55..af4f465f7d 100644
--- a/manage-data/data-store/data-streams/run-downsampling.md
+++ b/manage-data/data-store/data-streams/run-downsampling.md
@@ -37,11 +37,13 @@ To downsample a time series using a [data stream lifecycle](/manage-data/lifecyc
 
 * Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval.
 * Set `after` to the minimum time to wait after an index rollover, before running downsampling.
+* (Optional) Set `downsampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods) (`last_value` or `aggregate`), or leave it unspecified to use the default method (`aggregate`). {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga`
 
 ```console
 PUT _data_stream/my-data-stream/_lifecycle
 {
   "data_retention": "7d",
+  "downsampling_method": "aggregate",
   "downsampling": [
     {
       "after": "1m",
@@ -81,14 +83,16 @@ PUT _ilm/policy/datastream_policy
             "max_age": "5m"
           },
           "downsample": {
-            "fixed_interval": "5m"
+            "fixed_interval": "5m",
+            "sampling_method": "aggregate"
           }
         }
       },
       "warm": {
         "actions": {
           "downsample": {
-            "fixed_interval": "1h"
+            "fixed_interval": "1h",
+            "sampling_method": "aggregate"
           }
         }
       }
@@ -96,8 +100,8 @@ PUT _ilm/policy/datastream_policy
   }
 }
 ```
-Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval. The downsample action runs after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed.
-
+Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval. The downsample action runs after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed.
+(Optional) Set `sampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods) (`last_value` or `aggregate`), or leave it unspecified to use the default method (`aggregate`). {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga`
 :::
 
 ::::

From 4707c1a15639bf092e7d8959321225b568333de1 Mon Sep 17 00:00:00 2001
From: gmarouli
Date: Wed, 14 Jan 2026 14:16:44 +0200
Subject: [PATCH 2/4] Small fixes

---
 manage-data/data-store/data-streams/run-downsampling.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/manage-data/data-store/data-streams/run-downsampling.md b/manage-data/data-store/data-streams/run-downsampling.md
index af4f465f7d..a8c0c811b2 100644
--- a/manage-data/data-store/data-streams/run-downsampling.md
+++ b/manage-data/data-store/data-streams/run-downsampling.md
@@ -37,7 +37,7 @@ To downsample a time series using a [data stream lifecycle](/manage-data/lifecyc
 
 * Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval.
 * Set `after` to the minimum time to wait after an index rollover, before running downsampling.
-* (Optional) Set `downsampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods) (`last_value` or `aggregate`), or leave it unspecified to use the default method (`aggregate`). {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga`
+* (Optional) Set `downsampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods), or leave it unspecified to use the default method (`aggregate`). {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga`
 
 ```console
 PUT _data_stream/my-data-stream/_lifecycle
@@ -100,8 +100,9 @@ PUT _ilm/policy/datastream_policy
   }
 }
 ```
-Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval. The downsample action runs after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed.
-(Optional) Set `sampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods) (`last_value` or `aggregate`), or leave it unspecified to use the default method (`aggregate`). {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga`
+
+* Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval. The downsample action runs after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed.
+* (Optional) Set `sampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods), or leave it unspecified to use the default method (`aggregate`). {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga`
 :::
 
 ::::

From 2614e109c553e288b36ddec65ce8ec963a1b16bb Mon Sep 17 00:00:00 2001
From: Mary Gouseti
Date: Wed, 14 Jan 2026 16:39:30 +0200
Subject: [PATCH 3/4] Update manage-data/data-store/data-streams/downsampling-concepts.md

Co-authored-by: Vlada Chirmicci
---
 manage-data/data-store/data-streams/downsampling-concepts.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/manage-data/data-store/data-streams/downsampling-concepts.md b/manage-data/data-store/data-streams/downsampling-concepts.md
index 1ef14d1c31..7cbd6da815 100644
--- a/manage-data/data-store/data-streams/downsampling-concepts.md
+++ b/manage-data/data-store/data-streams/downsampling-concepts.md
@@ -80,7 +80,7 @@ The downsampling method is the technique used to reduce multiple values within t
     * individual histograms are merged into a single histogram that is stored, preserving the type. The `histogram` field type uses the [T-Digest](elasticsearch://reference/aggregations/search-aggregations-metrics-percentile-aggregation.md) algorithm.
 
 :::{tip}
-When downsampling a downsampled index, you need to use the same downsampling method as the source index.
+When downsampling a downsampled index, use the same downsampling method as the source index.
 :::
 
 ### Source and target index field mappings [downsample-api-mappings]

From 701e76796b74f2665ae9c79ed88025fa5cd260c7 Mon Sep 17 00:00:00 2001
From: gmarouli
Date: Wed, 14 Jan 2026 18:34:20 +0200
Subject: [PATCH 4/4] Fix versions

---
 manage-data/data-store/data-streams/run-downsampling.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/manage-data/data-store/data-streams/run-downsampling.md b/manage-data/data-store/data-streams/run-downsampling.md
index a8c0c811b2..85de491723 100644
--- a/manage-data/data-store/data-streams/run-downsampling.md
+++ b/manage-data/data-store/data-streams/run-downsampling.md
@@ -37,7 +37,7 @@ To downsample a time series using a [data stream lifecycle](/manage-data/lifecyc
 
 * Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval.
 * Set `after` to the minimum time to wait after an index rollover, before running downsampling.
-* (Optional) Set `downsampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods), or leave it unspecified to use the default method (`aggregate`). {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga`
+* {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga` (Optional) Set `downsampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods), or leave it unspecified to use the default method (`aggregate`).
 
 ```console
 PUT _data_stream/my-data-stream/_lifecycle
@@ -102,7 +102,7 @@ PUT _ilm/policy/datastream_policy
 ```
 
 * Set `fixed_interval` to your preferred level of granularity. The original time series data will be aggregated at this interval. The downsample action runs after the index is rolled over and the [index time series end time](elasticsearch://reference/elasticsearch/index-settings/time-series.md#index-time-series-end-time) has passed.
-* (Optional) Set `sampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods), or leave it unspecified to use the default method (`aggregate`). {applies_to}`stack: preview 9.3` {applies_to}`serverless: ga`
+* {applies_to}`stack: preview 9.3` (Optional) Set `sampling_method` to your preferred [downsampling method](/manage-data/data-store/data-streams/downsampling-concepts.md#downsampling-methods), or leave it unspecified to use the default method (`aggregate`).
 :::
 
 ::::
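
Reviewer note: the difference between the two downsampling methods described in the series can be sketched for a single bucket of `gauge` samples. This is an illustrative Python sketch only, not Elasticsearch code. The `AggregateMetricDouble` class and the `downsample_gauge` function are hypothetical names introduced here, and the sketch assumes that "most recent" means the sample with the latest timestamp in the bucket.

```python
from dataclasses import dataclass


@dataclass
class AggregateMetricDouble:
    """Hypothetical stand-in for a stored `aggregate_metric_double` value."""
    min: float
    max: float
    sum: float
    value_count: int


def downsample_gauge(samples, method="aggregate"):
    """Reduce one bucket of (timestamp, value) gauge samples to one value.

    `method` mirrors the two documented options; `aggregate` is the default.
    """
    if not samples:
        raise ValueError("bucket is empty")
    if method == "last_value":
        # Keep only the value of the sample with the latest timestamp.
        return max(samples, key=lambda s: s[0])[1]
    if method == "aggregate":
        # Keep min, max, sum, and value_count over all samples in the bucket.
        values = [value for _, value in samples]
        return AggregateMetricDouble(min(values), max(values), sum(values), len(values))
    raise ValueError(f"unknown downsampling method: {method}")


bucket = [(0, 2.0), (10, 5.0), (20, 3.0)]
print(downsample_gauge(bucket, "last_value"))  # 3.0
print(downsample_gauge(bucket, "aggregate"))
```

With `last_value`, the bucket collapses to one number, so the peak of 5.0 is no longer recoverable. With `aggregate`, the four stored fields still allow `min`, `max`, `sum`, and average queries over the bucket, at the cost of storing more per document.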