9 changes: 8 additions & 1 deletion api/TimeAddressableMediaStore.yaml
@@ -2111,7 +2111,14 @@ paths:

The response will include a PUT URL that a client uses to upload the Media Object.
The client is expected to register the Flow Segment using the [/flows/{flowId}/segments](#/operations/POST_flows-flowId-segments) endpoint once the upload is complete.
Service implementations need to handle situations where Objects were uploaded but no Flow Segment was registered successfully.

Clients MAY request Objects in batches to reduce the number of HTTP requests made to the Service.
Clients are not expected to use all of the Objects they requested.
Objects will likely go unused in cases such as shutdown of ingesting clients, the end of ingested live streams, and unexpected network congestion.
Clients SHOULD, however, adapt the number of Objects they request such that they may reasonably expect to use them before the timeout advertised in [`min_object_timeout` at the `/service`](#/operations/GET_service) endpoint, which is subject to a specified minimum (see service endpoint schema).

Service implementations need to handle situations where Objects are not used, and where content was uploaded but no Flow Segment was registered successfully.
In these circumstances, Services should garbage collect Objects after the timeout advertised in [`min_object_timeout` at the `/service`](#/operations/GET_service) endpoint.

When making requests to the provided `put_url`, clients should include credentials if the provided URL is on the same origin as the API itself, akin to the `same-origin` mode in the [WhatWG Fetch Standard](https://fetch.spec.whatwg.org/#concept-request-credentials-mode).
operationId: POST_flows-flowId-storage
4 changes: 3 additions & 1 deletion api/examples/service-get-200.json
@@ -10,5 +10,7 @@
"name": "webhooks",
"docs": "https://bbc.github.io/tams/7.0/index.html#/operations/POST_webhooks"
}
]
],
"min_object_timeout": "600:0",
"min_presigned_url_timeout": "610:0"
}
6 changes: 3 additions & 3 deletions api/schemas/flow-segment.json
@@ -20,17 +20,17 @@
},
"ts_offset":
{
"description": "The timestamp offset between the sample timestamps stored in the media file and the corresponding timestamp in the Segment, ie. ts_offset = segment ts - media object ts. Assumed to be 0:0 if not set. Format as described by the [Timestamp](../schemas/timestamp#top) type",
"description": "The timestamp offset between the sample timestamps stored in the media file and the corresponding timestamp in the Segment, ie. ts_offset = segment ts - media object ts. Assumed to be 0:0 if not set. Format as described by the [Timestamp](#/schemas/timestamp) type",
"$ref": "timestamp.json"
},
"timerange":
{
"description": "The timerange for the samples contained in the Segment. The timerange start is always inclusive. If samples have a duration then the timerange end is exclusive and covers at least the duration of the last sample. The exclusive timerange end will typically be set to the timestamp of the next sample. If the samples don't have a duration then the timerange end is inclusive. Format is described by the [TimeRange](../schemas/timerange#top) type. Note that where temporal re-ordering is used, the timerange and samples refers to the presentation timeline.",
"description": "The timerange for the samples contained in the Segment. The timerange start is always inclusive. If samples have a duration then the timerange end is exclusive and covers at least the duration of the last sample. The exclusive timerange end will typically be set to the timestamp of the next sample. If the samples don't have a duration then the timerange end is inclusive. Format is described by the [TimeRange](#/schemas/timerange) type. Note that where temporal re-ordering is used, the timerange and samples refers to the presentation timeline.",
"$ref": "timerange.json"
},
"last_duration":
{
"description": "The difference between the exclusive end of the `timerange` and the last sample timestamp. Format as described by the [Timestamp](../schemas/timestamp#top) type, but cannot be negative",
"description": "The difference between the exclusive end of the `timerange` and the last sample timestamp. Format as described by the [Timestamp](#/schemas/timestamp) type, but cannot be negative",
"$ref": "timestamp.json"
},
"object_timerange": {
2 changes: 1 addition & 1 deletion api/schemas/http-request.json
@@ -7,7 +7,7 @@
],
"properties": {
"url": {
"description": "The URL to make the request to",
"description": "The URL to make the request to. Where this URL is pre-signed, it SHALL remain valid for the timeframe advertised in [`min_presigned_url_timeout` at the `/service`](#/operations/GET_service) endpoint, which is subject to a specified minimum (see service endpoint schema).",
"type": "string"
},
"body": {
2 changes: 1 addition & 1 deletion api/schemas/object-core.json
@@ -28,7 +28,7 @@
"type": "string"
},
"presigned": {
"description": "If `true`, this URL is pre-signed. If this parameter is unset, the URL is NOT pre-signed.",
"description": "If `true`, this URL is pre-signed. If this parameter is unset, the URL is NOT pre-signed. The presigned URL SHALL remain valid for the timeframe advertised in [`min_presigned_url_timeout` at the `/service`](#/operations/GET_service) endpoint, which is subject to a specified minimum (see service endpoint schema).",
"type": "boolean"
},
"label": {
13 changes: 12 additions & 1 deletion api/schemas/service.json
@@ -5,7 +5,8 @@
"required":
[
"type",
"api_version"
"api_version",
"min_object_timeout"
],
"properties":
{
@@ -43,6 +44,16 @@
{
"$ref": "event-stream-common.json"
}
},
"min_object_timeout":
{
"description": "The minimum timeframe within which a Media Object created by this Service must be registered against a Flow Segment before it is garbage collected. Services SHOULD allow a small grace period beyond the advertised value to account for latency in assigning the Objects and returning them to the Client. This timeout MUST be `300:0` (i.e. 5 minutes) or greater. Clients MUST be capable of reaching this minimum performance level. Clients SHOULD adapt to this value by balancing how many object URLs to request per page against how fast they will be used. Services MAY allow this value to be configured at deploy-time. Format as described by the [Timestamp](#/schemas/timestamp) type. For more information, see the documentation of the [`/flows/{flowId}/storage`](#/operations/POST_flows-flowId-storage) endpoint.",
"$ref": "timestamp.json"
},
"min_presigned_url_timeout":
{
"description": "The minimum timeframe for which pre-signed URLs generated by this Service are valid (both for writing and reading content). This attribute MUST be set on implementations that support pre-signed URLs. Services SHOULD allow a small grace period beyond the advertised value to account for latency in generating pre-signed URLs and returning them to the Client. This timeout MUST be `30:0` (i.e. 30 seconds) or greater. The value of this parameter MUST be less than or equal to `min_object_timeout` to avoid Objects being garbage collected while their pre-signed PUT URLs are still valid. Clients MUST be capable of reaching this minimum performance level. Clients SHOULD adapt to this value by taking care to request URLs that will be used in a timely manner. Services MAY allow this value to be configured at deploy-time. Format as described by the [Timestamp](#/schemas/timestamp) type.",
"$ref": "timestamp.json"
}
}
}
2 changes: 1 addition & 1 deletion api/schemas/webhook-get.json
@@ -14,7 +14,7 @@
],
"properties": {
"error": {
"description": "Provides more information for the error status, as described by the [Error](../schemas/error#top) type",
"description": "Provides more information for the error status, as described by the [Error](#/schemas/error) type",
"$ref": "error.json"
},
"status": {
1 change: 1 addition & 0 deletions docs/README.md
@@ -80,6 +80,7 @@ For more information on how we use ADRs, see [here](./adr/README.md).
| [0041](./adr/0041-require-explicit-framerate.md) | Requiring explicit frame rates |
| [0042](./adr/0042-uncontrolled-object-instance-labels.md) | Make `label` Mandatory for Uncontrolled Object Instances |
| [0043](./adr/0043-signalling-retention-time.md) | Signalling retention time |
| [0044](./adr/0044-signalling-timeouts.md) | Signalling timeout periods |
| [0046](./adr/0046-governance.md) | Governance |

\* Note: ADR 0004a was the unintended result of a number clash in the early development of TAMS which wasn't caught before publication
139 changes: 139 additions & 0 deletions docs/adr/0044-signalling-timeouts.md
@@ -0,0 +1,139 @@
---
status: "proposed"
---
# Signalling of timeout periods

## Context and Problem Statement

Following the creation of [ADR0043 Signalling Retention Time](./0043-signalling-retention-time.md), it was identified that there are two existing types of resource timeouts in TAMS that are not explicitly signalled.

The first is the garbage collection of Media Objects.
The spec is currently vague on how this should be implemented.
Objects are garbage collected both when they are no longer referenced by any Flow Segments, and if they are never registered against a Flow Segment.
Garbage collection where an Object is no longer referenced by a Flow Segment should happen immediately as permissions can no longer be derived from a parent Flow.
Garbage collection where an Object is never registered must happen after a period of time to allow for the opportunity for the Object to be registered against a Flow Segment.
Currently the specification only states the following:

> Service implementations need to handle situations where Objects were uploaded but no Flow Segment was registered successfully.

This means there are no explicit expectations on how much time a client has to make use of an Object.

Secondly, TAMS currently makes no recommendations on the expiry time of pre-signed URLs.
Common approaches include this information in the URL itself.
But this cannot be relied upon.
This means there are no explicit expectations on how much time a client has to make use of a pre-signed URL.

This ADR revisits both of these topics and considers potential approaches to provide explicit expectations regarding these timeouts.

## Considered Options

* Option 1a: Signal Object garbage collection timeout via the `/service` metadata
* Option 1b: Specify a fixed Object garbage collection timeout in the specification
* Option 1c: Do not specify an Object garbage collection timeout
* Option 1d: Client and Service negotiate Object garbage collection timeout
* Option 2a: Signal presigned URL expiry time via the `/service` metadata
* Option 2b: Specify presigned URL expiry time in the specification
* Option 2c: Do not specify a presigned URL expiry time
* Option 2d: Client and Service negotiate presigned URL expiry time

## Decision Outcome

Chosen options: Option 1a and Option 2a.

These options will be implemented such that the API specification could be extended in future to support Options 1d and 2d if required.

The minimum timeout for presigned URLs has been selected as 30 seconds.
This is to allow for long distance, poor quality connections.
In particular, latency to opposite sides of Earth is on the order of 250ms and can be much higher.

The minimum timeout for Media Objects has been selected as 5 minutes.
This is to allow for the upload of moderately sized Objects over poor quality connections.

The timeout for Objects is significantly higher than that for presigned URLs: an upload or download only needs to be initiated before the presigned URL times out, whereas Object upload and registration against a Flow Segment must both be completed before the Object timeout.
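
As a rough illustration of how these minimums interact, the following Python sketch checks a `/service` response body against them. It assumes the `<seconds>:<nanoseconds>` Timestamp format, and that `min_presigned_url_timeout` must not exceed `min_object_timeout`, as the service schema in this PR requires. The function names `timestamp_to_nanos` and `check_service_timeouts` are hypothetical and not part of TAMS.

```python
NANOS_PER_SECOND = 1_000_000_000


def timestamp_to_nanos(value: str) -> int:
    """Convert a '<seconds>:<nanoseconds>' Timestamp string to nanoseconds."""
    seconds_text, nanos_text = value.split(":")
    negative = seconds_text.startswith("-")
    total = abs(int(seconds_text)) * NANOS_PER_SECOND + int(nanos_text)
    return -total if negative else total


# Minimums stated in this ADR: 5 minutes for Objects, 30 seconds for URLs.
MIN_OBJECT_TIMEOUT = timestamp_to_nanos("300:0")
MIN_PRESIGNED_URL_TIMEOUT = timestamp_to_nanos("30:0")


def check_service_timeouts(service: dict) -> list[str]:
    """Return a list of problems found in the advertised timeouts (hypothetical helper)."""
    if "min_object_timeout" not in service:
        return ["min_object_timeout is required"]

    problems = []
    object_timeout = timestamp_to_nanos(service["min_object_timeout"])
    if object_timeout < MIN_OBJECT_TIMEOUT:
        problems.append("min_object_timeout is below 300:0")

    # Only Services that support pre-signed URLs advertise this value.
    presigned_text = service.get("min_presigned_url_timeout")
    if presigned_text is not None:
        presigned_timeout = timestamp_to_nanos(presigned_text)
        if presigned_timeout < MIN_PRESIGNED_URL_TIMEOUT:
            problems.append("min_presigned_url_timeout is below 30:0")
        if presigned_timeout > object_timeout:
            problems.append("min_presigned_url_timeout exceeds min_object_timeout")
    return problems
```

A Client could run such a check once at start-up and refuse to ingest against a Service that reports any problems.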

### Implementation

Implemented in [PR #166](https://github.com/bbc/tams/pull/166)

## Pros and Cons of the Options

### Option 1a: Signal Object garbage collection timeout via the `/service` metadata

This option would see a parameter added to the metadata at the `/service` endpoint that a Service shall use to communicate the minimum time Objects will be available for after first creation.
Clients should upload content to Objects and register them against segments within this timeframe.
The specification would include a minimum value allowing Clients to validate they meet a minimum level of performance.

* Good, because it provides a clear contract between Services and Clients regarding the timeframe in which Objects must be used
* Good, because it allows for Services to set that timeframe based on their implementation/requirements
* Good, because it provides clear performance requirements for Clients
* Bad, because it requires Clients to adapt to that signalled timeframe (or at least meet the minimum level)
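
One way a Client might adapt to the signalled timeframe is to size its storage requests from the advertised timeout. The sketch below is illustrative only: `objects_per_request`, the `safety_factor` margin and the `max_batch` cap are assumptions made for this example and are not defined by TAMS.

```python
import math


def objects_per_request(
    min_object_timeout_s: float,
    seconds_per_object: float,
    safety_factor: float = 0.5,
    max_batch: int = 100,
) -> int:
    """Estimate how many Objects can be requested and still used in time.

    seconds_per_object is the time the Client expects to spend producing,
    uploading and registering one Object (an assumed, Client-measured value).
    """
    if seconds_per_object <= 0:
        raise ValueError("seconds_per_object must be positive")
    # Only plan to use part of the window, leaving headroom for congestion.
    usable_window = min_object_timeout_s * safety_factor
    count = math.floor(usable_window / seconds_per_object)
    # Always request at least one Object, and never more than the cap.
    return max(1, min(count, max_batch))
```

For example, with the minimum timeout of 300 seconds and roughly 10 seconds per Object, `objects_per_request(300, 10)` suggests requesting 15 Objects at a time. When a single Object takes longer than the usable window, the function still returns 1, so the Client would need to handle that case separately.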

### Option 1b: Specify a fixed Object garbage collection timeout in the specification

This option would see the TAMS specification specify the minimum time Objects will be available for after first creation.
Clients should upload content to Objects and register them against segments within this timeframe.

* Good, because it provides a clear contract between Services and Clients regarding the timeframe in which Objects must be used
* Good, because it doesn't require Clients to adapt to a signalled timeframe
* Bad, because it doesn't allow for Services to set that timeframe based on their implementation/requirements
* Bad, because this "minimum" would not allow Clients to rely on availability beyond the specified time, even if the Service allows use significantly beyond the specified timeframe

### Option 1c: Do not specify an Object garbage collection timeout

This option would see no changes to the TAMS specification.
The current statement in the specification - that service implementations should handle the case where Objects aren't registered - would remain.

* Good, because it allows for Services to set that timeframe based on their implementation/requirements
* Bad, because there is no clear contract between Services and Clients regarding the timeframe in which Objects must be used
* Bad, because Clients cannot adapt to this potentially variable timeframe

### Option 1d: Client and Service negotiate Object garbage collection timeout

This option would see Clients and Services actively negotiate Object garbage collection timeouts.
This would likely take the form of minimum and maximum values being advertised by the Service at its `/service` endpoint.
Clients would then specify a value in this range when requesting Object allocation.

* Good, because it provides a clear contract between Services and Clients regarding the timeframe in which Objects must be used
* Good, because it allows for Services to set the permitted range of timeframes based on their implementation/requirements
* Good, because it allows Clients to request a timeframe appropriate to their requirements
* Bad, because it requires Clients and Services to adapt to that negotiated timeframe
* Bad, because it adds significant complexity to the API and implementations
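
To make the negotiation idea concrete, the sketch below shows how a Client might pick a value inside an advertised range. This option was not chosen, so no such fields exist in the API; `choose_requested_timeout` and its parameters are purely hypothetical.

```python
def choose_requested_timeout(
    desired_s: float,
    advertised_min_s: float,
    advertised_max_s: float,
) -> float:
    """Clamp the Client's preferred timeout into the Service's advertised range."""
    if advertised_min_s > advertised_max_s:
        raise ValueError("advertised minimum exceeds advertised maximum")
    return max(advertised_min_s, min(desired_s, advertised_max_s))
```

A Service would still need to validate the requested value, which is part of the added complexity noted above.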

### Option 2a: Signal presigned URL expiry time via the `/service` metadata

This option would see a parameter added to the metadata at the `/service` endpoint that a Service shall use to communicate the minimum time pre-signed URLs will be valid for.
The specification would include a minimum value allowing Clients to validate they meet a minimum level of performance.

* Good, because it provides a clear contract between Services and Clients regarding the timeframe over which pre-signed URLs are valid.
* Good, because it allows for Services to set that timeframe based on their implementation/requirements
* Good, because it provides clear performance requirements for Clients
* Bad, because it requires Clients to adapt to that signalled timeframe
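
A Client adapting to this value could track when each pre-signed URL was received and request a fresh one when the advertised minimum is about to elapse. The following is a minimal sketch under that assumption; `PresignedUrl`, `is_probably_valid` and the `margin_s` allowance are hypothetical names introduced for illustration.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class PresignedUrl:
    url: str
    received_at: float  # time.monotonic() reading taken when the URL arrived


def is_probably_valid(
    entry: PresignedUrl,
    min_presigned_url_timeout_s: float,
    now: Optional[float] = None,
    margin_s: float = 5.0,
) -> bool:
    """Return True if the URL should still be within its guaranteed lifetime.

    The Service may keep the URL valid for longer, but the Client can only
    rely on the advertised minimum.
    """
    current = time.monotonic() if now is None else now
    age = current - entry.received_at
    # Keep a margin so a request is not started right at the deadline.
    return age + margin_s < min_presigned_url_timeout_s
```

Using a monotonic clock avoids wall-clock adjustments affecting the age calculation; the `now` parameter exists only to make the function easy to test.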

### Option 2b: Specify presigned URL expiry time in the specification

This option would see the TAMS specification specify the minimum time pre-signed URLs will be valid for.

* Good, because it provides a clear contract between Services and Clients regarding the timeframe over which pre-signed URLs are valid.
* Good, because it doesn't require Clients to adapt to a signalled timeframe
* Bad, because it doesn't allow for Services to set that timeframe based on their implementation/requirements
* Bad, because this "minimum" would not allow Clients to rely on availability beyond the specified time, even if the Service allows use significantly beyond the specified timeframe

### Option 2c: Do not specify a presigned URL expiry time

This option would see no changes to the TAMS specification.

* Good, because it allows for Services to set that timeframe based on their implementation/requirements
* Bad, because there is no clear contract between Services and Clients regarding the timeframe over which pre-signed URLs are valid
* Bad, because Clients cannot adapt to this potentially variable timeframe

### Option 2d: Client and Service negotiate presigned URL expiry time

This option would see Clients and Services actively negotiate the time pre-signed URLs will be valid for.
This would likely take the form of minimum and maximum values being advertised by the Service at its `/service` endpoint.
Clients would then specify a value in this range when requesting pre-signed URLs.

* Good, because it provides a clear contract between Services and Clients regarding the timeframe over which pre-signed URLs are valid.
* Good, because it allows for Services to set the permitted range of timeframes based on their implementation/requirements
* Good, because it allows Clients to request a timeframe appropriate to their requirements
* Bad, because it requires Clients and Services to adapt to that negotiated timeframe
* Bad, because it adds significant complexity to the API and implementations