-
Notifications
You must be signed in to change notification settings - Fork 448
MSC4140: add dedicated endpoint for delayed events #19354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Use a new, dedicated endpoint for scheduling delayed events. This should deprecate the `delay` query parameter on the `/send` and `/state` endpoints.
| request_body = parse_and_validate_json_object_from_request( | ||
| request, self.DelayedEventBodyModel | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a Pydantic model to validate the request body is convenient & seems safer than manual parsing, but has the downside of giving verbose error messages when the body doesn't match the expected shape.
For example, this is the error given for requests to this servlet with an empty body:
{
"errcode": "M_BAD_JSON",
"error": "2 validation errors for DelayedEventBodyModel\ndelay\n Field required [type=missing, input_value={}, input_type=dict]\n For further information visit https://errors.pydantic.dev/2.12/v/missing\ncontent\n Field required [type=missing, input_value={}, input_type=dict]\n For further information visit https://errors.pydantic.dev/2.12/v/missing"
}There is also the quirk of the errcode being M_INVALID_PARAM for one missing key in the body, but M_BAD_JSON for multiple missing keys.
Is there a good way to trim down errors like this? Or is manual parsing of the request body preferred?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pydantic parsing is preferred over shotgun parsing. As for the bad errors, it might be good to bring up in the backend chapter meeting tomorrow 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conclusion from the meeting is that there are indeed other endpoints that use Pydantic validation, and therefore may return the same kind of verbose error messages. So adding more such errors is no worse than the current situation.
But at some point soon, I'll file an issue/PR for something to trim down the long Pydantic error messages, at least to remove the “visit errors.pydantic.dev” part.
| event_type: str, | ||
| state_key: str | None, | ||
| content: JsonDict, | ||
| txn_id: str | None = "mid1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks sketchy. We could have a random or auto-incremented value created in the function.
Example:
synapse/tests/rest/client/utils.py
Lines 432 to 433 in 5a3362c
| if txn_id is None: | |
| txn_id = "m%s" % (str(time.time())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I let the txn_id be absent when this path is used for a POST request. With that said, perhaps the desired method could be passed into this function to make that more clear, and to use your suggested auto-generation for a PUT with no given txn_id.
| @classmethod | ||
| def get_delayed_event_path_and_body( | ||
| cls, | ||
| room_id: str, | ||
| delay: int, | ||
| event_type: str, | ||
| state_key: str | None, | ||
| content: JsonDict, | ||
| txn_id: str | None = "mid1", | ||
| ) -> tuple[str, JsonDict]: | ||
| path = ( | ||
| f"rooms/{room_id}/{'send' if state_key is None else 'state'}/{event_type}" | ||
| ) | ||
| if state_key is not None: | ||
| path += f"/{state_key}" | ||
| elif txn_id is not None: | ||
| path += f"/{txn_id}" | ||
| path += f"?org.matrix.msc4140.delay={delay}" | ||
| return path, content |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by this function. I thought we were moving away from re-using the existing /_matrix/client/v3/rooms/{roomId}/state/{eventType}/{stateKey} endpoint in favor of PUT /_matrix/client/v3/rooms/{roomId}/delayed_event/{eventType}/{txnId} (unstable: PUT /_matrix/client/unstable/org.matrix.msc4140/rooms/{roomId}/delayed_event/{eventType}/{txnId})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the intention is indeed to move away from the /state/-based endpoint, but I figured that it is useful to keep the test coverage on that endpoint while the MSC gets ironed out.
There is another test class which tests the new endpoint, which works by subclassing this test class & overriding this classmethod to swap out the /state/ endpoint with the new /delayed_event/ one.
|
|
||
| def _get_path_for_delayed_send(room_id: str, event_type: str, delay_ms: int) -> str: | ||
| return f"rooms/{room_id}/send/{event_type}?org.matrix.msc4140.delay={delay_ms}" | ||
| def _get_delayed_event_request_args( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a better name. Looks like this does more than grab request args
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "args" in question are the args to make_request, which admittedly isn't very clear. How about refactoring this to include the make_request call & renaming it appropriately?
| channel.json_body, | ||
| ) | ||
|
|
||
| @parameterized.expand((-2000, 0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should explain these invalid values. // Invalid because negative, etc
For better test names, I also tend to do this:
synapse/tests/handlers/test_sliding_sync.py
Lines 4941 to 4956 in 5a3362c
| @parameterized.expand( | |
| [ | |
| # Test with a normal arbitrary type (no special meaning) | |
| ("arbitrary_type", "type", set()), | |
| # Test with membership | |
| ("membership", EventTypes.Member, set()), | |
| # Test with lazy-loading room members | |
| ("lazy_loading_membership", EventTypes.Member, {StateValues.LAZY}), | |
| ] | |
| ) | |
| def test_limit_retained_previous_state_keys( | |
| self, | |
| _test_label: str, | |
| event_type: str, | |
| extra_state_keys: set[str], | |
| ) -> None: |
| def _get_delayed_event_request_args( | ||
| room_id: str, | ||
| delay: int, | ||
| event_type: str, | ||
| state_key: str | None, | ||
| content: JsonDict, | ||
| txn_id: str | None = "mid1", | ||
| ) -> tuple[str, bytes, JsonDict]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should mandate keyword args to be used as there are too many args to keep straight and because there multiple args with the same type in a row which is really easy to mix up.
| def _get_delayed_event_request_args( | |
| room_id: str, | |
| delay: int, | |
| event_type: str, | |
| state_key: str | None, | |
| content: JsonDict, | |
| txn_id: str | None = "mid1", | |
| ) -> tuple[str, bytes, JsonDict]: | |
| def _get_delayed_event_request_args( | |
| *, | |
| room_id: str, | |
| delay: int, | |
| event_type: str, | |
| state_key: str | None, | |
| content: JsonDict, | |
| txn_id: str | None = "mid1", | |
| ) -> tuple[str, bytes, JsonDict]: |
| max_event_delay_ms = hs.config.server.max_event_delay_ms | ||
| ( | ||
| RoomDelayedEventRestServlet(hs, max_event_delay_ms) | ||
| if max_event_delay_ms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since max_event_delay_ms is an int and can be 0 if we fail to shotgun validate it in the future.
| if max_event_delay_ms | |
| if max_event_delay_ms is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Values below 0 are already guarded against in synapse/config/server.py, but I agree that it's not obvious here. Perhaps it can be typed as a Pydantic PositiveInt.
|
|
||
| def _raise_delayed_events_unsupported() -> NoReturn: | ||
| raise SynapseError( | ||
| HTTPStatus.BAD_REQUEST, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
400 Bad Request indicates a client error. But in this case, it's a server problem so we should probably opt for something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. How about M_FORBIDDEN or M_RESOURCE_LIMIT_EXCEEDED (and a 403 HTTP response code in both cases)?
| "Delayed events are not supported on this server", | ||
| Codes.UNKNOWN, | ||
| { | ||
| "org.matrix.msc4140.errcode": "M_MAX_DELAY_UNSUPPORTED", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see M_MAX_DELAY_UNSUPPORTED in the MSC. I see M_MAX_DELAY_EXCEEDED 🤔
And this should live under synapse/api/errors.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a different error that's not in the spec. It's also poorly named, for that matter...
What might make the most sense is to not use a unique error for this, but to treat it as the same as the requesting user having too many outstanding delayed events. The MSC defines an error for that, and is technically what's happening here if you consider disabling delayed events as the same as setting the per-user delayed event limit to 0.
|
|
||
| class RoomDelayedEventRestServletUnsupported(RoomDelayedEventRestServletBase): | ||
| async def _do(self, *_: Any) -> NoReturn: | ||
| _raise_delayed_events_unsupported() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we use this pattern elsewhere?
I feel like usually, we just don't register the servlets at all if the feature is not configured. Although the feedback isn't necessarily bad.
Feels like something we can just handle in the normal servlet though. And then we don't need the RoomDelayedEventRestServletBase separation either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this pattern is elsewhere, but I liked the idea of not having to check the config value on every call to the servlet. Not sure how useful that is, though, especially if there's interest in allowing some server config values to be mutable (which would necessitate rereading them).
| request_body = parse_and_validate_json_object_from_request( | ||
| request, self.DelayedEventBodyModel | ||
| ) | ||
| _check_delay_within_max(request_body.delay, self._max_event_delay_ms) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need to check that this is > 0?
Feels like this validation logic should be more shared. Could even be part of the Pydantic model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A > 0 check isn't needed here, because it's already effectively done by the model (since delay is annotated as a PositiveInt).
It's also not needed in the servlets for the /send/ & /state/ endpoints with the delay query parameter, as they call _parse_request_delay which does a > 0 check directly.
Use a new, dedicated endpoint for scheduling delayed events. This should deprecate the
delayquery parameter on the/sendand/stateendpoints.See:
Pull Request Checklist
EventStoretoEventWorkerStore.".code blocks.