194 changes: 194 additions & 0 deletions 124-sidecar-addition-proposal.md
@@ -0,0 +1,194 @@
# Proposal: Native Sidecar Container Support for Strimzi Components

## Summary
Provide a brief summary of the feature you are proposing to add to Strimzi.

This proposal adds first‑class, **declarative sidecar container** support to Strimzi-managed workloads (starting with Kafka brokers), configured via CRDs under `spec.kafka.template.pod.sidecarContainers`. Users can attach monitoring, logging, security, and networking sidecars without webhooks or forks. The design introduces a stable CRD schema (`SidecarContainer`), a validation and conversion pipeline, and generic interfaces so other Strimzi components (KafkaConnect, KafkaBridge, MirrorMaker2, etc.) can adopt it.
Member

What does a validation and conversion pipeline mean? We do not have anything like this.


## Current situation
Describe the current capability Strimzi has in this area.

Strimzi currently lacks native sidecar support. Users rely on:
1. Mutating webhooks/admission controllers
2. Forking and patching Strimzi code
3. Ephemeral containers for ad‑hoc troubleshooting

Limitations of these approaches include fragile upgrades, higher operational complexity, lack of declarative control, and weak integration with Strimzi lifecycle and security primitives.
Member

Maybe you can elaborate more on this? I think that:

  • It will always be fragile and have high operational complexity, because you will depend heavily on the Strimzi internals in any case. I struggle to see a sensible use-case that would not depend on some Strimzi internals. The only non-fragile implementation is built-in support for very specific selected sidecars instead of generic sidecar support.
  • Lack of declarative control can be easily worked around by configuring things for example through annotations or custom resources.

But for those I have at least an idea what you might mean. For the rest - weak integration with Strimzi lifecycle and security primitives - I struggle to understand them. So maybe you can get a bit more into the details.


## Motivation
Explain the motivation why this should be added, and what value it brings.

**Business value**
- Operational flexibility: let users add sidecars tailored to their own requirements
Member

I'm not sure flexibility on its own is a value. It is a problem because it makes things unsupportable and unmaintainable. So there needs to be some goal to justify the flexibility.

- Simplified ops: manage sidecars natively via Strimzi CRDs and GitOps
- Cloud‑native patterns: support log forwarders, APM agents, service‑mesh proxies
Member

Leaving service mesh aside, these are not cloud-native patterns to me, and they have better solutions, for example a DaemonSet for collecting logs.

Service mesh is an interesting use-case. However, service meshes seem to move away from expensive sidecars. They also rely on injecting sidecars rather than having them preconfigured. And the things I mention below on proxies would apply to them. And there is no service mesh supporting Kafka as far as I know. So frankly, I do not think this proposal helps with it in any way.


**Representative use cases**
- Monitoring/Observability: Prometheus exporters, APM agents (Datadog, New Relic)
- Logging/Auditing: Fluent Bit/Filebeat; audit collectors
- Security/Compliance: Vault agent; cert rotation helpers
- Networking: traffic analyzers/proxies
- Data tooling: lightweight data checks or protocol helpers
Comment on lines +26 to +31
Member

I'm not sure how much these usecases really justify the proposal ...

Monitoring/Observability: prometheus exporters, APM agents (Datadog, New Relic)

There is built-in support for Prometheus metrics. The future is likely OpenTelemetry support. These are generally recognized things supported by Strimzi as well as by the majority of observability platforms. You do not need any additional Prometheus exporters and the like.

Logging/Auditing: Fluent Bit/Filebeat; audit collectors

The cloud native solution is through Daemon Sets and container logging. That provides the performance and scale needed. Using file logs and sidecars is not the right pattern.

Security/Compliance: Vault agent; cert rotation helpers

This is not something a sidecar can help with. It would need much deeper integration into Strimzi.

Networking: traffic analyzers/proxies
Data tooling: lightweight data checks or protocol helpers

I think these are the valid points. For things such as Kafka-aware proxies, you would need some kind of support from Strimzi. But that support cannot consist only of adding a sidecar container. You need to handle the proper routing, advertised listeners, authentication / encryption offloading. So while this is the use-case where this proposal would make most sense, it is also the one where it is in my opinion completely insufficient. And moreover, adopting this proposal might prevent these use-cases in the future because it fixes the API.


## Proposal
Provide an introduction to the proposal. Use sub sections to call out considerations, possible delivery mechanisms etc.

This proposal introduces a **custom `SidecarContainer` CRD model** (stable, Fabric8‑independent) and a **component‑agnostic sidecar pipeline**:
1) Schema & API surface (`SidecarContainer`) kept minimal and serializable
2) Validation in CR processing; conversion to Fabric8 `Container` at runtime
3) Component interface (`SidecarInterface`) + `SidecarUtils` for reuse
4) PodTemplate placement for per‑pool (KRaft) configuration
5) Automatic NetworkPolicy enrichment for declared sidecar ports
6) Volume access model supporting Kafka data (read‑only), ConfigMaps/Secrets, and `emptyDir`

### Rationale for Custom `SidecarContainer` Abstraction
We use a custom `SidecarContainer` instead of Fabric8 `io.fabric8.kubernetes.api.model.Container` in CRDs.

**Technical justification**
- **Problem – CRD generation incompatibility**: Fabric8's `Container` exposes `IntOrString` fields (e.g., ports, probe handlers). Strimzi's CRD generator cannot reliably serialize `IntOrString` to OpenAPI v3, breaking validation and schema generation.
- **Solution – Custom abstraction**: `SidecarContainer` mirrors essential fields with simple types (String, List, standard objects). Probes use a Strimzi type (`SidecarProbe`) instead of Fabric8 `Probe`.

**Supported configuration (concise)**
- Core: `name`, `image`, `imagePullPolicy`, `command`, `args`
- Env & storage: `env` (ContainerEnvVar), `volumeMounts`
- Networking: `ports` (ContainerPort)
- Resources: `resources` (ResourceRequirements) **mandatory**
- Security: `securityContext` (SecurityContext)
- Health: `livenessProbe`, `readinessProbe` (custom `SidecarProbe`)

**Benefits**
- **CRD compatibility**: Ensures generated CRDs are valid and parseable by Kubernetes
- **Simplified API surface**: Exposes only the configuration options relevant to Strimzi sidecars
- **Maintainability**: Decouples Strimzi's API from potential breaking changes in Fabric8's container model
- **Type safety**: Avoids runtime serialization errors caused by ambiguous field types

### API design (small snippet)
```java
@Description("Sidecar container alongside main component")
public class SidecarContainer {
    private String name;
    private String image;
    private List<ContainerPort> ports;
    private List<EnvVar> env;
    private ResourceRequirements resources; // mandatory
    private List<VolumeMount> volumeMounts;
    private SecurityContext securityContext;
}
```

**CRD location in PodTemplate (Kafka example)**
```yaml
spec:
  kafka:
    template:
      pod:
        sidecarContainers:
          - name: fluent-bit
            image: fluent/fluent-bit:2.2.0
            resources:
              limits: { cpu: "200m", memory: "256Mi" }
```
Member

We have the container template outside of the Pod template. So likely the sidecars would also end outside of the Pod template and not inside it.

### Validation & conversion pipeline
**Hierarchy**: (1) CRD schema → (2) Strimzi API validation → (3) component rules → (4) pod creation.
**Key rules**: unique names; no conflict with Strimzi-managed names; resource limits required; ports must not collide with component‑reserved ports; volume mounts must exist and avoid conflicts.
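
The key rules above can be sketched as a plain validation pass. This is only an illustration under stated assumptions, not the operator's implementation: the `Sidecar` class is a hypothetical, simplified stand-in for the proposed `SidecarContainer`, the reserved name `kafka` and reserved ports 9090 and 9091 are assumed values chosen for the example, and the volume-mount rule is left out for brevity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SidecarValidationSketch {
    // Assumed values for illustration; real reserved sets would come from each component.
    static final Set<String> RESERVED_NAMES = Set.of("kafka");
    static final Set<Integer> RESERVED_PORTS = Set.of(9090, 9091);

    /** Hypothetical, simplified stand-in for the proposed SidecarContainer. */
    static final class Sidecar {
        final String name;
        final boolean hasResourceLimits;
        final List<Integer> ports;

        Sidecar(String name, boolean hasResourceLimits, List<Integer> ports) {
            this.name = name;
            this.hasResourceLimits = hasResourceLimits;
            this.ports = ports;
        }
    }

    /** Returns one message per violated rule; an empty list means all checks passed. */
    static List<String> validate(List<Sidecar> sidecars) {
        List<String> errors = new ArrayList<>();
        Set<String> seenNames = new HashSet<>();
        for (Sidecar s : sidecars) {
            if (!seenNames.add(s.name)) {
                errors.add("duplicate sidecar name: " + s.name);
            }
            if (RESERVED_NAMES.contains(s.name)) {
                errors.add("name is reserved by the operator: " + s.name);
            }
            if (!s.hasResourceLimits) {
                errors.add("resource limits are required: " + s.name);
            }
            for (int port : s.ports) {
                if (RESERVED_PORTS.contains(port)) {
                    errors.add("port " + port + " is reserved: " + s.name);
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<Sidecar> sidecars = List.of(
            new Sidecar("fluent-bit", true, List.of(2020)),
            new Sidecar("kafka", false, List.of(9091)));
        // The second entry breaks three rules: reserved name, missing limits, reserved port.
        validate(sidecars).forEach(System.out::println);
    }
}
```

Collecting every violation instead of failing on the first one is a design choice of this sketch; it lets a user fix a custom resource in one pass.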

**Interface & utils (signatures only)**
```java
public interface SidecarInterface {
    void validateSidecarContainers(...);
    List<Container> createSidecarContainers(PodTemplate pod);
}

// Conversion
List<Container> convertSidecarContainers(PodTemplate pod);
```

### Network policies & volumes
- **NetworkPolicy**: Extract sidecar ports from all pools; add ingress rules for those ports by default. Users can further restrict via standard policies.
Member

You cannot restrict the Network Policy rules once they are opened by the operator. So this does not sound like a good approach. You would either need to make it fully configurable or leave it up to the users.

- **Volumes**: Support read‑only Kafka data mounts for log access, user‑defined `volumes` in `pod.template`, Secrets/ConfigMaps, and `emptyDir`.
Member

The Kafka data mounts should not be used for any logs. And ensuring that the mounts are read-only might not be simple.


### SidecarProbe (overview)
A Strimzi probe type used by sidecars to avoid `IntOrString` in CRDs and keep schemas simple.
Member

Why not reuse the Fabric8 classes?


**Compact class sketch**
```java
@JsonInclude(NON_NULL)
@JsonPropertyOrder({ "execCommand", "httpGetPath", "httpGetPort", "httpGetScheme",
    "tcpSocketPort", "initialDelaySeconds", "timeoutSeconds", "periodSeconds",
    "successThreshold", "failureThreshold" })
public class SidecarProbe implements UnknownPropertyPreserving {
    private List<String> execCommand;
    private String httpGetPath, httpGetPort, httpGetScheme, tcpSocketPort;
    private int initialDelaySeconds = 15, timeoutSeconds = 5, periodSeconds = 10;
    private int successThreshold = 1, failureThreshold = 3;
    private Map<String,Object> additionalProperties;
}
```
**Highlights**
- Supports `exec`, `httpGet`, `tcpSocket` styles via simple String fields (no `IntOrString`).
- Sensible defaults: `initialDelaySeconds=15`, `timeoutSeconds=5`, `periodSeconds=10`.
- Min validations (e.g., timeout/period/success/failure thresholds ≥ 1).
- Preserves unknown properties for forward compatibility.
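
Because the port fields are plain strings, the conversion step has to decide whether a value such as `httpGetPort` is a port number or a named container port. The following is a minimal sketch of one way that decision could be made, assuming a hypothetical `PortRef` holder in place of the Kubernetes int-or-string type; neither `PortRef` nor `parsePort` is taken from the proposal or the implementation PR, and named ports are not checked against Kubernetes naming rules here.

```java
final class ProbePortSketch {
    /** Hypothetical stand-in for an int-or-string port: exactly one field is non-null. */
    static final class PortRef {
        final Integer number;
        final String name;

        private PortRef(Integer number, String name) {
            this.number = number;
            this.name = name;
        }
    }

    /** Interprets an all-digit string as a port number and anything else as a port name. */
    static PortRef parsePort(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("port must not be empty");
        }
        String value = raw.trim();
        boolean numeric = value.chars().allMatch(c -> c >= '0' && c <= '9');
        if (!numeric) {
            return new PortRef(null, value); // treated as a named container port
        }
        // Rejecting more than five digits first keeps Integer.parseInt from overflowing.
        if (value.length() > 5) {
            throw new IllegalArgumentException("port out of range: " + value);
        }
        int number = Integer.parseInt(value);
        if (number < 1 || number > 65535) {
            throw new IllegalArgumentException("port out of range: " + value);
        }
        return new PortRef(number, null);
    }

    public static void main(String[] args) {
        System.out.println(parsePort("8080").number);  // numeric port
        System.out.println(parsePort("metrics").name); // named port
    }
}
```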

### Minimal examples
**Single sidecar (log forwarder)**
```yaml
spec:
  kafka:
    template:
      pod:
        sidecarContainers:
          - name: fluent-bit
            image: fluent/fluent-bit:2.2.0
            resources:
              limits: { cpu: "200m", memory: "256Mi" }
```
**Per‑pool configuration (KRaft)**
```yaml
kind: KafkaNodePool
spec:
  roles: [broker]
  template:
    pod:
      sidecarContainers:
        - name: log-forwarder
          image: fluent/fluent-bit:2.2.0
          resources:
            limits: { cpu: "200m", memory: "256Mi" }
```

## Affected/not affected projects
Call out the projects in the Strimzi organisation that are/are not affected by this proposal.

**Affected (Phase 1)**
- `strimzi-kafka-operator` (api: new `SidecarContainer`; operator: `SidecarInterface`, `SidecarUtils`, KafkaCluster integration; crd‑generator updates)

**Future (Phase 2+)**
- KafkaConnect, KafkaBridge, MirrorMaker2, EntityOperator (implement interface; add reserved ports/names)
Comment on lines +164 to +168
Member

What are these phases? You never talked about them before.


**Not affected**
- strimzi-kafka-oauth, test-container, drain-cleaner, access-operator (no changes)

## Compatibility
Call out any future or backwards compatibility considerations this proposal has accounted for.

- **Backward compatible**: Field is optional; no change for existing CRs.
- **Forward compatible**: Stable custom model avoids Fabric8 `IntOrString` issues in schema generation.
- **Rolling behavior**: Add/modify/remove sidecars via standard rolling updates.
- **Versioning**: Introduce in Strimzi ≥ 0.50.0; other components can adopt incrementally.

## Rejected alternatives
Call out options that were considered while creating this proposal, but then later rejected, along with reasons why.

1) **Expose Fabric8 `Container` directly** — rejected due to OpenAPI/CRD stability and `IntOrString` schema issues.
2) **Separate `KafkaSidecar` CRD** — rejected for not having a generic placeholder for future expandability.
3) **Operator‑level auto‑injection** — rejected to keep explicit user control and avoid risky implicit coupling.
4) **InitContainer/DaemonSet patterns** — do not satisfy long‑running, pod‑local sidecar use cases.
5) **Helm‑only customization** — not portable; breaks on operator‑managed rollouts.
Comment on lines +184 to +188
Member

I do not follow what these mean. You should probably elaborate on them.


## References
References (informative)
- Strimzi Pod Templates (docs)
- Kubernetes multi‑container pods (docs)
- Implementation PR: strimzi/strimzi‑kafka‑operator#12121