dekaf: Fix task names containing periods #2588
Draft
jshearer wants to merge 8 commits into dekaf/collection_reset_with_e2e_tests from dekaf/fix-task-name-period-auth
Dekaf previously required TLS and MSK IAM authentication for all upstream Kafka connections, making local development and testing difficult. This adds support for plaintext connections via URL scheme detection:

* `tcp://host:port` connects without TLS, `tls://host:port` uses TLS (default)
* `--upstream-auth=none` flag skips SASL authentication entirely
* `KafkaClientAuth::from_msk_region(None)` creates no-auth mode

Example local usage: `dekaf --default-broker-urls tcp://localhost:29092 --upstream-auth=none ...`
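The scheme detection described above can be sketched roughly as follows. This is a minimal illustration, not Dekaf's actual code: `Transport` and `parse_broker_url` are hypothetical names, and treating a bare `host:port` as TLS is an assumption about what "(default)" means.

```rust
/// Transport chosen for an upstream broker connection (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Transport {
    Plaintext,
    Tls,
}

/// Splits a broker URL into its transport and `host:port` address.
/// `tcp://` selects plaintext and `tls://` selects TLS. A bare `host:port`
/// falls back to TLS, which is an assumption made for this sketch.
pub fn parse_broker_url(url: &str) -> Result<(Transport, &str), String> {
    let (transport, addr) = match url.split_once("://") {
        Some(("tcp", rest)) => (Transport::Plaintext, rest),
        Some(("tls", rest)) => (Transport::Tls, rest),
        Some((scheme, _)) => return Err(format!("unsupported scheme: {scheme}")),
        None => (Transport::Tls, url),
    };
    if addr.is_empty() {
        return Err("missing host:port".to_string());
    }
    Ok((transport, addr))
}
```

With this sketch, `parse_broker_url("tcp://localhost:29092")` yields the plaintext transport and the address `localhost:29092`, while an unrecognized scheme such as `http://` is rejected.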
Used for testing
…ka errors

It's possible for a collection to exist in the control plane without having any extant journals. This can happen when the capture task is failing or hasn't emitted any documents, and more frequently during a collection reset. Previously, Dekaf treated this the same as a missing collection, causing consumers to receive non-retryable errors or inconsistent behavior.

Introduces `CollectionStatus` enum to distinguish three states:

* `Ready`: binding exists and journals are available
* `NotFound`: binding doesn't exist in the materialization spec
* `NotReady`: binding exists but journals aren't available yet

For `NotReady`, we'll use `LeaderNotAvailable` (a retryable error) to cause consumers to retry with backoff until the journals become available. They will eventually give up.
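A rough sketch of the three-state distinction and how it could map onto Kafka error codes. The variant names come from the commit message. Everything else is an assumption for illustration: the `classify` and `error_code_for` helpers are hypothetical, and mapping `NotFound` to `UNKNOWN_TOPIC_OR_PARTITION` is a guess, since the commit only states what `NotReady` returns.

```rust
/// The three binding states named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CollectionStatus {
    Ready,
    NotFound,
    NotReady,
}

// Kafka protocol error codes.
pub const NONE: i16 = 0;
pub const UNKNOWN_TOPIC_OR_PARTITION: i16 = 3;
pub const LEADER_NOT_AVAILABLE: i16 = 5;

/// Hypothetical classifier: a binding with zero journals is `NotReady`,
/// not `NotFound`.
pub fn classify(binding_exists: bool, journal_count: usize) -> CollectionStatus {
    match (binding_exists, journal_count) {
        (false, _) => CollectionStatus::NotFound,
        (true, 0) => CollectionStatus::NotReady,
        (true, _) => CollectionStatus::Ready,
    }
}

/// Hypothetical mapping to the error code sent to the consumer.
/// Only the `NotReady` arm is taken from the commit message.
pub fn error_code_for(status: CollectionStatus) -> i16 {
    match status {
        CollectionStatus::Ready => NONE,
        // Assumption: the commit does not say which code `NotFound` uses.
        CollectionStatus::NotFound => UNKNOWN_TOPIC_OR_PARTITION,
        CollectionStatus::NotReady => LEADER_NOT_AVAILABLE,
    }
}
```

Keeping `NotReady` separate means a collection in the middle of a reset surfaces as `LEADER_NOT_AVAILABLE` instead of being reported as missing.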
This is mainly for e2e tests so we can set a low TTL and avoid waiting around for too long for changes to propagate.
* Run Dekaf e2e tests as separate step because `nexttest-run` messes with local stack state
* Make `local:data-plane` idempotent
* `ci:dekaf-e2e` now assumes `local:stack` etc are up rather than explicitly depending on it
* mise: log systemd output if failure
* mise: also log agent logs on failure
* nexttest: exclude e2e tests by default, and run them with `--profile dekaf-e2e` instead
couple of non-covered tests over
Task names containing periods (e.g. `foo.com/tenant/task`) were being corrupted during authentication. The issue: `decode_safe_name()` replaces `.` with `%` then attempts percent-decoding, so `foo.com` becomes `foo%com`, and `%co` is invalid hex, leaving the name corrupted.

The `decode_safe_name` function was added in #1516 to support dot-encoded JSON config objects as usernames (e.g. `{}` encoded for platforms that don't allow `{` in usernames). When materialization task-based auth was introduced in #1665, the dot-encoding was never removed despite no longer having a valid use case: users provide task names via SASL auth username which doesn't have the same limitations as topic names.

This PR removes the `decode_safe_name()` call from `authenticate()`. It's still used by `from_downstream_topic_name` for decoding Kafka topic names in order to avoid breaking backwards compatibility.
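The corruption can be reproduced with a small stand-alone sketch. This is not the real `decode_safe_name`: `decode_safe_name_sketch` and `percent_decode_lenient` are hypothetical stand-ins that assume a lenient decoder which passes invalid `%XX` sequences through unchanged, consistent with the `foo%com` result described above.

```rust
/// Lenient percent-decoding: valid `%XX` sequences become bytes,
/// anything else is passed through unchanged.
fn percent_decode_lenient(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            let hi = (bytes[i + 1] as char).to_digit(16);
            let lo = (bytes[i + 2] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// Hypothetical stand-in for the dot-decoding step: replace `.` with `%`,
/// then percent-decode.
pub fn decode_safe_name_sketch(name: &str) -> String {
    percent_decode_lenient(&name.replace('.', "%"))
}
```

Under these assumptions, a dot-encoded `{}` (`.7B.7D`) decodes as intended, but `foo.com/tenant/task` comes back as `foo%com/tenant/task` because `%co` is not a valid hex escape.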
The session SASL auth error check used `error_code > 0`, which only catches positive error codes like `SaslAuthenticationFailed` (58). This missed negative error codes like `UnknownServerError` (-1), causing the client to think authentication succeeded when it actually failed.

* Change `error_code > 0` to `error_code != 0` to catch all non-success codes
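The difference between the two comparisons can be shown in isolation. The function names below are hypothetical and only illustrate the check, not the client code itself.

```rust
// Kafka protocol error codes referenced in the commit message.
pub const UNKNOWN_SERVER_ERROR: i16 = -1;
pub const SASL_AUTHENTICATION_FAILED: i16 = 58;

/// The old check: misses negative error codes.
pub fn is_error_old(error_code: i16) -> bool {
    error_code > 0
}

/// The fixed check: any non-zero code is a failure.
pub fn is_error_fixed(error_code: i16) -> bool {
    error_code != 0
}

/// Hypothetical wrapper that turns a SASL response code into a `Result`,
/// carrying the raw error code on failure.
pub fn check_sasl_response(error_code: i16) -> Result<(), i16> {
    if is_error_fixed(error_code) {
        Err(error_code)
    } else {
        Ok(())
    }
}
```

Both checks agree on `0` and `58`, but only the fixed one rejects `-1`.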
Task names containing periods (e.g. `foo.com/tenant/task`) were being corrupted during authentication. The issue: `decode_safe_name()` replaces `.` with `%` then attempts percent-decoding, so `foo.com` becomes `foo%com`, and `%co` is invalid hex, leaving the name corrupted.

The `decode_safe_name` function was added in #1516 to support dot-encoded JSON config objects as usernames (e.g. `{}` encoded for platforms that don't allow `{` in usernames). When materialization task-based auth was introduced in #1665, the dot-encoding was never removed despite no longer having a valid use case: we're no longer trying to do things like stuff JSON objects into usernames.

This PR removes the `decode_safe_name()` call from `authenticate()`. It's still used by `from_downstream_topic_name` for decoding Kafka topic names in order to avoid breaking backwards compatibility. It also fixes a small bug in Dekaf's Kafka API client (which the e2e test infrastructure uses) that was treating an error code of `-1` as non-terminal.

NOTE: This will break any user using `%`-encoded or `.`-encoded SASL usernames.