
Backend neutral API services#70

Open
solsson wants to merge 43 commits into main from backend-neutral-api-services

Conversation

solsson (Collaborator) commented Mar 16, 2026

  • adds y-s3-api.blobs.svc.cluster.local
  • adds y-bootstrap.kafka.svc.cluster.local
  • the blobstore backend (VersityGW by default) now runs in the blobs namespace.
  • the kafka backend (Redpanda) now runs in the kafka namespace.
  • new convention for topic creation: reference http://y-kustomize.ystack.svc.cluster.local/v1/kafka/setup-topic-job/base-for-annotations.yaml as a resource
  • new convention for bucket creation: reference http://y-kustomize.ystack.svc.cluster.local/v1/blobs/setup-bucket-job/base-for-annotations.yaml as a resource
  • y-cluster-converge-ystack should now represent a rule set for yolean.se/module-part labels and ordered bases in ./k3s/
  • y-cluster-sudoers now seems to handle sudo-rs well enough
  • y-kubefwd adds --domain for non-local contexts, for http://y-skaffold.ystack.prod1 etc.
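A consumer base following these conventions might look like the sketch below. The commonAnnotations keys are illustrative assumptions; the actual keys are defined by the respective job bases.

```yaml
# Hypothetical consumer kustomization (annotation keys are assumed)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- http://y-kustomize.ystack.svc.cluster.local/v1/blobs/setup-bucket-job/base-for-annotations.yaml
- http://y-kustomize.ystack.svc.cluster.local/v1/kafka/setup-topic-job/base-for-annotations.yaml
commonAnnotations:
  yolean.se/bucket-name: myapp-data
  yolean.se/topic-name: myapp-events
```

The consumer never names the backend: the jobs talk to y-s3-api.blobs and y-bootstrap.kafka, so swapping VersityGW for MinIO or Redpanda for another Kafka implementation does not touch consumer bases.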

solsson and others added 30 commits March 14, 2026 13:50
Adds a readiness check after cluster creation to prevent TLS handshake
timeouts when kubectl apply runs before k3s has finished initializing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allows skipping the image cache and containerd load steps while still
running converge, useful when Docker networking is flaky or images are
already loaded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce y-s3-api.blobs:80 (ExternalName to minio or versitygw) and
y-bootstrap.kafka:9092 (ClusterIP selecting redpanda pods) so consumers
don't need to know which implementation backs S3 or Kafka.

Namespace resources are now managed in dedicated nn-namespace-* bases
(00-ystack, 01-blobs, 02-kafka, 03-monitoring) instead of being bundled
with workload bases, preventing accidental namespace deletion on
kubectl delete -k.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SWS (static-web-server) serves kustomize base files from secrets.
When versitygw is installed it creates the secret
y-kustomize.blobs.setup-bucket-job, mounted into SWS at
/blobs/setup-bucket-job/. Consumers reference individual resources:

  resources:
  - http://y-kustomize.ystack.svc.cluster.local/blobs/setup-bucket-job/setup-bucket-job.yaml

The setup-bucket-job uses y-s3-api.blobs abstraction so consumers
don't need to know the S3 backend. Kustomize treats HTTP directory
URLs as git repos, so individual file URLs are used instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Versioned base URLs at /v1/blobs/setup-bucket-job/
- setup-bucket-job.yaml now includes a credentials Secret (name: bucket)
  alongside the Job, so consumers get endpoint+creds after setup
- builds-registry-versitygw adapted: reads S3 config from
  builds-registry-bucket secret instead of hardcoded blobs-versitygw
- No longer depends on registry/generic,versitygw; uses registry/generic
  + versitygw/defaultsecret + y-kustomize HTTP resource directly
- SWS: enable symlinks and hidden files for k8s secret mounts, add
  --health flag
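A workload consuming the resulting Secret might look like this sketch; the key names inside the Secret are assumptions, only the Secret name "bucket" comes from the commit above.

```yaml
# Sketch: pod spec fragment reading endpoint+creds from the Secret
# named "bucket" created by setup-bucket-job (key names assumed)
containers:
- name: app
  image: example/app
  envFrom:
  - secretRef:
      name: bucket
```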

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Canonical URL: /v1/{category}/{job}/base-for-annotations.yaml
- Rename versitygw secret from "minio" to "versitygw-server" with
  root-prefixed keys to clarify these are admin credentials
- Move per-generator disableNameSuffixHash from global generatorOptions
- Add y-kustomize/openapi/openapi.yaml (OpenAPI 3.1) specifying the
  API contract for any y-kustomize implementation
- Add TODO_VALIDATE.md with design for spec-based validation job
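The renamed secret might be shaped like the following sketch; the key names and values are assumptions derived only from the "root-prefixed keys" description above.

```yaml
# Hypothetical shape of the renamed admin-credentials secret
apiVersion: v1
kind: Secret
metadata:
  name: versitygw-server
stringData:
  ROOT_ACCESS_KEY: changeme   # key names assumed, values illustrative
  ROOT_SECRET_KEY: changeme
```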

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run y-k8s-ingress-hosts after y-kustomize HTTPRoute is created so
kustomize can resolve the hostname. Curl-based readiness loops with
short timeouts gate any step that depends on y-kustomize HTTP resources.

After versitygw creates the blobs secret, restart y-kustomize and
wait for the base-for-annotations.yaml endpoint before applying
builds-registry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add yolean.se/module-part=gateway label to all Gateway, HTTPRoute,
GRPCRoute resources and the y-kustomize stack (Deployment, Service,
HTTPRoute).

Pass 1 applies only gateway-labeled resources from all bases using
label selector, then runs y-k8s-ingress-hosts to update /etc/hosts.
Pass 2 does the full apply including bases that depend on y-kustomize
HTTP resources being reachable from the host.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
y-k8s-ingress-hosts now reads yolean.se/override-ip from the ystack
gateway annotation, removing the need to pass -override-ip through
the call chain. The new --ensure flag combines check + write.

Converge persists YSTACK_OVERRIDE_IP env as a gateway annotation
after pass 1, then calls --ensure. Provision scripts set the env var
instead of managing hosts separately. Also adds 30s timeout on
y-kustomize host reachability and renders bases to temp files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ingress-hosts --check now verifies IP matches, not just hostname
  presence, detecting stale entries from previous provisioners
- --ensure appends -write when check fails (was missing)
- Fix dry-run label filtering: drop --server-side from dry-run=client
  (incompatible combination caused all gateway resources to be skipped)
- Increase curl connect-timeout to 10s (macOS mDNS adds 5s delay for
  .cluster.local hostnames) and overall timeout to 60s
- Add --request-timeout=5s to kubectl calls in ingress-hosts
- Fix y-image-list-ystack to extract from BASES array (was grepping
  for removed apply_base function)
- Update multipass provision to use converge's built-in --ensure

Tested: full multipass provision converge completes successfully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create kafka/common with secretGenerator for y-kustomize.kafka.setup-topic-job,
mirroring the blobs pattern in versitygw/common. The base serves a Secret
(kafka-bootstrap with broker endpoint) and a Job (setup-topic using rpk).

Add k3s/09-kafka-common to deploy the secret in ystack namespace.
Update converge to apply it and validate y-kustomize serves kafka bases.
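Sketched, the base might serve objects shaped like these; the Secret key and Job internals are assumptions, while the names and the broker endpoint come from the commits above.

```yaml
# Sketch of what kafka/common's generated base provides (shapes assumed)
apiVersion: v1
kind: Secret
metadata:
  name: kafka-bootstrap
stringData:
  bootstrap: y-bootstrap.kafka:9092   # key name assumed
---
apiVersion: batch/v1
kind: Job
metadata:
  name: setup-topic
# the Job's container runs rpk against the bootstrap endpoint
```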

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
and not require the user running it to be in the sudo group.

Also gives up (once again) on sudo-rs,
as we still haven't found rules that work for us on that sudo impl.
- Move kafka/redpanda/kafka/kustomization.yaml to kafka/redpanda/ (depth-2 convention)
- Upgrade redpanda image to v24.2.14@sha256:a91cddd8a93181b85107a3cde0beebb
- Use fully qualified image name (docker.redpanda.com/redpandadata/redpanda)
- Add k3s/10-redpanda/ base with redpanda-image component
- Include redpanda in converge-ystack (deploy + rollout wait)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename versitygw/ to blobs-versitygw/ and minio/ to blobs-minio/,
deploy to blobs namespace instead of ystack. y-s3-api becomes a
direct ClusterIP service selecting pods (removes ExternalName
indirection). Registry and other ystack consumers use
y-s3-api.blobs.svc.cluster.local.

- Remove intermediate blobs-versitygw/blobs-minio services
- Add k3s/09-blobs-common for y-kustomize secret in ystack namespace
- Remove versitygw from k3s/40-buildkit (not buildkit's concern)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CPU limits cause throttling even when the node has spare capacity,
hurting latency and throughput without meaningful benefit.
Memory limits are kept.
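The resulting resources stanza would be shaped like this sketch (values illustrative):

```yaml
# CPU limit dropped to avoid CFS throttling; memory limit kept
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    memory: 256Mi
```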

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consolidate 23 k3s bases into 12 using numbered ranges (0*-6*),
rewrite converge script as a generic 10-step phase loop that
discovers bases by directory listing and uses label selectors
(config, services, gateway) instead of hardcoded base names.

Bases with -disabled suffix are skipped by convention.
Deferred bases (referencing y-kustomize HTTP) are detected
automatically and rendered/applied after y-kustomize restart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Split 21-ystack-config into category-grouped bases so folder names
reflect the namespace relationship. Move namespace declarations from
individual resource yamls to kustomization.yaml.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…headers

Feature bases now own their target namespace, making them
portable across provisioners. Only k3s/60-builds-registry
retains namespace: ystack (cross-namespace override for
blobs-versitygw/defaultsecret, documented with comment).

Extract gateway/ feature base from k3s/20-ystack-core.

Add yaml-language-server schema comment and apiVersion/kind
to all kustomization files touched in this branch.
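A kustomization header following this convention, assuming the schemastore kustomization schema:

```yaml
# yaml-language-server: $schema=https://json.schemastore.org/kustomization.json
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
```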

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Namespace bases are not subject to consolidation — keep one per
namespace for clarity and to support selective provisioning.

Remove yaml-language-server schema directive from kustomize
Component (redpanda-image) since schemastore has no Component
schema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Simplify to inline render-and-apply piping. The only base with
y-kustomize HTTP dependency (60-builds-registry) renders naturally
in step 8 after y-kustomize is restarted in step 7.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
with very strange json payloads on stdout

and we might not need emulation with the new cluster strategy
Mounted secrets refresh in-place so no restart is needed, supporting
repeated converge. Replace until-loops with curl --retry flags.
Drop step numbering. Add --teardown-prune flag to lima provision.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consistent log attribution when scripts call each other.
Also fix setup-bucket job name in validate and update converge
completion message.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use static-web-server based example (two layers) instead of node app.
Verify build+push via registry API instead of deploying a pod.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Envoy proxy was needed before Traefik supported GRPCRoute.
Clients connect directly through the gateway on port 80 with
fallback to port 8547 for remote clusters using y-kubefwd.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add annotation-driven bucket-name to setup-bucket-job base so
consumers can use commonAnnotations instead of JSON patches.
Create registry/builds-bucket and registry/builds-topic feature
bases. Remove legacy generic,kafka (pixy-based notifications).
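The difference for consumers, sketched below; the patch path and the annotation key are illustrative assumptions.

```yaml
# Before (sketch): each consumer carried a JSON patch against the Job
patches:
- target: { kind: Job, name: setup-bucket }
  patch: |-
    - op: replace
      path: /spec/template/spec/containers/0/env/0/value
      value: builds
# After: one annotation on the referenced base (key assumed)
commonAnnotations:
  yolean.se/bucket-name: builds
```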

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch from render pipe to kubectl apply -k. Only tolerate
"no objects passed to apply" from label-selector phases.
All other errors are now fatal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
solsson and others added 13 commits March 16, 2026 21:30
Replace multi-pass label selector approach with single-pass converge
that waits for deployment rollouts between digit groups. Bootstrap
y-kustomize with empty secrets (09-*) so it can start before real
secrets arrive at 3*/4*. Split 20-ystack-core into 20-gateway and
29-y-kustomize. Rename bases to use consistent naming: httproute→gateway,
grpcroute→gateway, common→y-kustomize. Update all kustomization.yaml
references to match renamed paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace pod exec + wget with direct curl to registry service in
validate. Use faster retry params (20x2s) across both scripts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use &&/|| chains so curl failures flow into report instead of aborting.
Add retries to tags check. Extract REGISTRY_HOST variable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Avoids polluting .local DNS domain when forwarding from remote
clusters. Skipped when context is "local" or --domain is explicit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The -E (preserve environment) flag is rejected by scoped NOPASSWD
sudoers rules. These scripts pass all needed config via command-line
flags, not environment variables, so -E is unnecessary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Provision scripts (k3d, multipass, lima) accept --exclude=SUBSTRING
(default: monitoring) and forward it to y-cluster-converge-ystack which
filters out k3s bases matching the substring. YSTACK_OVERRIDE_IP env var
replaced by --override-ip flag on converge-ystack.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace hardcoded CRD waits (gateway, prometheus-operator) with
kubectl wait --for=condition=Established crd --all. Replace hardcoded
namespace list for rollout status with dynamic discovery of namespaces
that have deployments. This makes --exclude work for any base without
hanging on missing CRDs or namespaces.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…delay

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Provisions an Ubuntu cloud image VM with k3s via QEMU/KVM.
Uses cloud-init for SSH access, port-forwarding for API server.
Runs y-cluster-converge-ystack with Gateway API and y-k8s-ingress-hosts,
followed by y-cluster-validate-ystack.

The VM disk can be exported as a VMware appliance via --export-vmdk.

Prerequisites: qemu-system-x86 qemu-utils cloud-image-utils, /dev/kvm

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detection order: qemu (if qemu-system-x86_64, qemu-img, cloud-localds,
and /dev/kvm are all present) > multipass > k3d (docker).

Also add ystack-qemu to y-cluster-local-detect for teardown support.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>