Open
This started as research into the ephemeral flakes in ci:part2 seen across runs in this repo, as described in #594. That PR is an in-depth exploration of the origin of the bug, what was seen, how often, and even what we should be testing (eventual consistency or strict completeness), and is highly recommended reading for context. The flake is not only something that shows up in automated tests; it is a real issue that could also be seen in production.
The initial #594 PR intended to backport the change from the fanout branch, which relaxed constraints in the tests. That did not have a significant effect on test success and, more importantly, does not really test what happens in a real situation.
Why ci:part2 is "flaky"
The failing test, index > operations > search > redundancy > can search while keeping minimum amount of replicas in packages/programs/data/document/document/test/index.spec.ts, was asserting immediate completeness (collected.length === count) while the system was still rebalancing/syncing. In CI, a distributed index.search(fetch=count) can transiently short-read due to timing (indexing lag and/or missed remote RPC responses), producing the familiar signature. See more: #594
What this does:
Changes:
Local verification:
Links:
AI Summary:
