-
Notifications
You must be signed in to change notification settings - Fork 245
DRIVERS-3269 Move aggregation tests out of tailable awaitData cursor #1887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
DRIVERS-3269 Move aggregation tests out of tailable awaitData cursor #1887
Conversation
| - client: *client | ||
| events: [] | ||
|
|
||
| - description: "error on aggregate if maxAwaitTimeMS is greater than timeoutMS" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no client-side validation that would produce an error here: https://github.com/mongodb/specifications/blob/master/source/client-side-operations-timeout/client-side-operations-timeout.md#non-tailable-cursors
The validation of maxAwaitTimeMS vs timeoutMS only comes into play when the cursor is tailable + awaitData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Crud spec says:
This options only applies to aggregations which return a TAILABLE_AWAIT cursor. Drivers SHOULD always send this value, if the cursor is not a TAILABLE_AWAIT cursor the server will ignore it.
It sounds like we should not validate the values in case of non-tailable cursors. Should we revert the expected result to be succeeded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like we should not validate the values in case of non-tailable cursors. Should we revert the expected result to be succeeded?
The server will reject getMore w/ maxTimeMS for non-awaitData cursors. Notably, the only team to have synced this is the Java Driver:
Looks like in Java we pass the test because its a client validation on the aggregation iterable which triggers when setting maxAwaitTimeMS.
Node skips, but claims
We have the same validation [as Java] [...] but it doesn't trigger unless awaitData is provided.
I think these tests are valid. Without upfront validation, this could happen:
- User creates aggregate cursor with maxAwaitTimeMS
- First batch returns fine
- N-th call needs getMore
- getMore fails with confusing server error mid-iteration (DRIVERS-2868)
- User has already processed some documents
This fits the mantra exception: "Exceptions should be rare: for cases where the server might not error and correctness is at stake." The server does error, but it errors mid-iteration, which is arguably a correctness/UX issue. Failing fast is probably better than failing halfway through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand your proposed scenario, a getMore throws an error because it includes maxTimeMS. But attaching maxTimeMS to getMore is already prohibited by the CSOT spec:
Drivers MUST NOT append a maxTimeMS field to getMore commands.
To clarify my comment about Node having this validation: I still do not understand why this test passes in Java (this seems like a bug imo). The spec is pretty clear this validation is only relevant for tailable awaitData cursors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC the above was the argument made in a CSOT triage and I currently don't have the bandwidth to determine if the spec is mistaken / there's a gap in rationale. More specifically, perhaps it was to avoid checking pipeline stages for the $changeStream cases and just do a catch-all validation for simplicity, which IMO is arguably reasonable.
Notably, the part of the spec you quoted is sort of ambiguous-y to me in the sense that you can use maxAwaitTimeMS for aggregate using a $changeStream pipeline. I understand that's tailable but you'd have to inspect client input to realize that. I'm not sure that's a reasonable driver-level requirement. The alternatives are a high-level catch-all validation or accepting the pathological case above.
If this isn't satisfactory, suggest we remove these tests as an amendment to DRIVERS-3092, closing DRIVERS-3269 as "won't do". I don't think we should impress opinions on not doing client-level validations for driver team. I can open a separate drivers ticket to clarify why the pathological case is legitimate.
Edit: differentiate possible motives
Edit: clarify that it's reasonable to do a catch-all
Edit: Clean up intention
Edit: Remove confusing comment about $out / $merge
Edit: point out spec wrongness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possible follow-up: https://jira.mongodb.org/browse/DRIVERS-3383
DRIVERS-3269
Please complete the following before merging:
clusters).