Skip to content

Conversation

@prestonvasquez
Copy link
Member

@prestonvasquez prestonvasquez commented Jan 28, 2026

DRIVERS-3006

Please complete the following before merging:

  • Is the relevant DRIVERS ticket in the PR title?
  • Update changelog.
  • Test changes in at least one language driver. (See Go Driver Implementation)
  • Test these changes against all server versions and topologies (including standalone, replica set, and sharded
    clusters).

Summary

The DRIVERS ticket reports a contradiction in the following two requirements from specifications:

[1] If a next call fails with a timeout error, drivers MUST NOT invalidate the change stream. The subsequent next call MUST perform a resume attempt to establish a new change stream on the server.

[2] If a resume is required for a next call on a change stream, the timeout MUST apply to the entirety of the initial getMore and all commands sent as part of the resume attempt.

While there appears to be no contradiction, the CSOT change stream test in question should be extended to include case [2].

Go Driver Implementation

Notably, the Go Driver skips this test:

TestUnifiedSpec/client-side-operations-timeout/tests/change-streams.json/timeoutMS_applies_to_full_resume_attempt_in_a_next_call
    skip.go:874: Skipping due to known failure: "Ensure change streams resume with CSOT failure ([GODRIVER-3380](https://jira.mongodb.org/browse/GODRIVER-3380))"

GODRIVER-3380 provides a non-UST repro, you can run it directly here. As noted in the ticket, the fix for this is fairly straightforward in the Go Driver. In our case, [1] is just checking that we don’t invalidate on a timeout error [not current behavior]:

func (cs *ChangeStream) next(ctx context.Context, nonBlocking bool) bool {
	if cs.err != nil && !errors.Is(cs.err, context.DeadlineExceeded) {
		return false
	}

	// ...
}

However, even if we made this change, the unified spec test responsible for testing this would still fail since it would be an anti-pattern in Go to use the deadline for a context passed to collection.Watch() for each call to collection.Next(), which is an implicit [at least] requirement of both the specs (see above) and the unified spec test in question:

      - name: iterateUntilDocumentOrError
        object: *changeStream
        expectError:
          isTimeoutError: true

This commit implements the following acceptance criteria from GODRIVER-3480 required to test the changes:

The Go Driver team should consider updating our internal test runner to apply timeouts given to cursor constructors to be applied iteratively, as that is the expectation.

prestonvasquez/mongo-go-driver@808463a

  Combine "change stream can be iterated again if previous iteration
  times out" and "timeoutMS is refreshed for next call after a timeout error" into
  a single test that validates both spec requirements:

  - [resumption] If a next call fails with a timeout error, drivers MUST
    NOT invalidate the change stream. The subsequent next call MUST perform
    a resume attempt to establish a new change stream on the server.

  - [refresh] If a resume is required for a next call on a change
    stream, the timeout MUST apply to the entirety of the initial getMore and all
    commands sent as part of the resume attempt.

  The new test blocks the resume aggregate for 150ms to prove the
  timeout is refreshed (a fresh 200ms budget, not exhausted from the previous
  call).
@prestonvasquez prestonvasquez marked this pull request as ready for review January 28, 2026 21:04
@prestonvasquez prestonvasquez requested a review from a team as a code owner January 28, 2026 21:04
@prestonvasquez
Copy link
Member Author

@dariakp A member of the Node Driver team reported DRIVERES-3006, would someone from that team like to review the summary and validate the suggested UST updates?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant