@jshearer jshearer commented Jan 7, 2026

No description provided.

@jshearer jshearer force-pushed the dekaf/group_management_e2e_tests branch from 0d7b8a5 to 0c8a977 on January 7, 2026 19:47
@jshearer jshearer changed the title from "Dekaf: E2E tests for group management APIs" to "WIP Dekaf: E2E tests for group management APIs" on Jan 7, 2026
@jshearer jshearer force-pushed the dekaf/group_management_e2e_tests branch 4 times, most recently from 5168186 to 27eab7d on January 7, 2026 22:44
Dekaf previously required TLS and MSK IAM authentication for all upstream
Kafka connections, making local development and testing difficult. This adds
support for plaintext connections via URL scheme detection:

* `tcp://host:port` connects without TLS, `tls://host:port` uses TLS (default)
* `--upstream-auth=none` flag skips SASL authentication entirely
* `KafkaClientAuth::from_msk_region(None)` creates no-auth mode

Example local usage:
  dekaf --default-broker-urls tcp://localhost:29092 --upstream-auth=none ...
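
The scheme detection described above can be sketched roughly as follows. This is a minimal illustration, not Dekaf's actual code: `Transport` and `parse_broker_url` are hypothetical names, and treating a URL with no scheme as TLS is an assumed reading of "(default)".

```rust
/// Transport chosen for an upstream broker connection.
/// (Hypothetical type, used only for this sketch.)
#[derive(Debug, PartialEq, Eq)]
enum Transport {
    Plaintext,
    Tls,
}

/// Splits a broker URL into its transport and `host:port` address.
/// Assumption: a URL without a scheme falls back to TLS.
fn parse_broker_url(url: &str) -> Result<(Transport, &str), String> {
    match url.split_once("://") {
        Some(("tcp", addr)) => Ok((Transport::Plaintext, addr)),
        Some(("tls", addr)) => Ok((Transport::Tls, addr)),
        Some((scheme, _)) => Err(format!("unsupported scheme: {scheme}")),
        None => Ok((Transport::Tls, url)),
    }
}
```

With this shape, `parse_broker_url("tcp://localhost:29092")` selects the plaintext transport, and an unrecognized scheme is rejected up front rather than silently falling back to TLS.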
…ka errors

It's possible for a collection to exist in the control plane without having
any extant journals. This can happen when the capture task is failing or
hasn't emitted any documents yet, and more frequently during a collection reset.
Previously, Dekaf treated this the same as a missing collection, causing
consumers to receive non-retryable errors or inconsistent behavior.

Introduces `CollectionStatus` enum to distinguish three states:

* `Ready`: binding exists and journals are available
* `NotFound`: binding doesn't exist in the materialization spec
* `NotReady`: binding exists but journals aren't available yet

For `NotReady`, we'll use `LeaderNotAvailable` (a retryable error) to cause
consumers to retry with backoff until the journals become available.
They will eventually give up.
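
A rough sketch of how the three states might map onto Kafka error codes. Only the `CollectionStatus` variants and the use of `LeaderNotAvailable` for `NotReady` come from the description above; `classify`, `error_for`, the `KafkaError` enum, and the mapping of `NotFound` to `UnknownTopicOrPartition` are assumptions made for illustration. The numeric values are the standard Kafka protocol codes for those two errors.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CollectionStatus {
    /// Binding exists and journals are available.
    Ready,
    /// Binding doesn't exist in the materialization spec.
    NotFound,
    /// Binding exists but journals aren't available yet.
    NotReady,
}

/// Subset of Kafka protocol error codes relevant to this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KafkaError {
    UnknownTopicOrPartition = 3,
    LeaderNotAvailable = 5,
}

/// Hypothetical classifier: inputs are simplified to a flag and a count.
fn classify(binding_exists: bool, journal_count: usize) -> CollectionStatus {
    match (binding_exists, journal_count) {
        (false, _) => CollectionStatus::NotFound,
        (true, 0) => CollectionStatus::NotReady,
        (true, _) => CollectionStatus::Ready,
    }
}

/// Error returned to the consumer, if any.
/// Assumption: `NotFound` is reported as `UnknownTopicOrPartition`.
fn error_for(status: CollectionStatus) -> Option<KafkaError> {
    match status {
        CollectionStatus::Ready => None,
        CollectionStatus::NotFound => Some(KafkaError::UnknownTopicOrPartition),
        CollectionStatus::NotReady => Some(KafkaError::LeaderNotAvailable),
    }
}
```

Keeping `NotReady` separate from `NotFound` is what lets a collection that is mid-reset surface as a transient condition instead of a missing topic.
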
This is mainly for e2e tests so we can set a low TTL and
avoid waiting around for too long for changes to propagate.
* Run Dekaf e2e tests as separate step because `nexttest-run` messes with local stack state
* Make `local:data-plane` idempotent
* `ci:dekaf-e2e` now assumes `local:stack` etc are up rather than explicitly depending on it
* mise: log systemd output if failure
* mise: also log agent logs on failure
* nextest: exclude e2e tests by default, and run them with
`--profile dekaf-e2e` instead
are identical.

Now instead of truncating the task name, we hash it.

NOTE: This would normally be a breaking change, but we already have a
convenient upgrade point with leader epochs. The change is gated behind
having a leader epoch, so once that's deployed and every consumer has
upgraded, we can remove the old codepath entirely.
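
To illustrate why hashing helps, here is a small sketch comparing a truncated identifier with a hashed one. The hash function, the output format, and the length limit actually used by Dekaf are not stated here; FNV-1a and the helper names `truncated_id` and `hashed_id` are stand-ins chosen for this example.

```rust
/// 64-bit FNV-1a, used as a stand-in for whichever hash is actually used.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Old approach (hypothetical shape): keep only the first `max_len` characters.
fn truncated_id(task_name: &str, max_len: usize) -> String {
    task_name.chars().take(max_len).collect()
}

/// New approach (hypothetical shape): a fixed-width hex digest of the full name.
fn hashed_id(task_name: &str) -> String {
    format!("{:016x}", fnv1a_64(task_name.as_bytes()))
}
```

Two task names that differ only after the truncation point, such as `acme/prod/materialize-orders-a` and `acme/prod/materialize-orders-b` cut to 20 characters, produce the same truncated identifier but different hashed ones, and the hashed form has a fixed length regardless of the input.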
@jshearer jshearer force-pushed the dekaf/group_management_e2e_tests branch from 27eab7d to 6b4314d on January 8, 2026 16:11
@jshearer jshearer changed the title from "WIP Dekaf: E2E tests for group management APIs" to "Dekaf: E2E tests for group management APIs" on Jan 13, 2026
@jshearer jshearer self-assigned this Jan 13, 2026
2 participants