Skip to content

Conversation

@rh-amarin
Copy link

@rh-amarin rh-amarin commented Dec 19, 2025

Changes WIF configuration from using Google Service Accounts to use directly Kubernetes Service Accounts

It uses the developer name as the basis for the namespace
Changes the names of the topics & subscriptions to add the namespace as prefix, to avoid collisions on the shared GCP project

Summary by CodeRabbit

  • Refactor
    • Simplified infrastructure configuration by enabling Pub/Sub by default.
    • Migrated authentication to use workload identity federation instead of explicit service accounts.
    • Updated subscription naming to align with topic names for consistency.
    • Removed namespace configuration variable in favor of alternative configuration approach.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 19, 2025

Walkthrough

This pull request refactors the Terraform infrastructure to migrate from explicit Google Service Accounts to Kubernetes Workload Identity (WIF) authentication. The changes remove Google Service Account resource definitions for Sentinel and Adapter, convert IAM bindings to use WIF principal strings, eliminate related variables and outputs, update the pubsub module interface (removing sentinel_sa_name and adapter_sa_name inputs, changing adapter_k8s_sa_name value, and replacing the kubernetes_namespace variable with developer_name), update subscription naming logic to derive from topic name, and enable pubsub by default.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Key areas requiring attention:
    • service_accounts.tf: Verify all IAM member bindings correctly use the workload identity principal format (principals://iam.googleapis.com/...) and that no stray service account email references remain
    • Module input changes (main.tf): Confirm the namespace mapping change from var.kubernetes_namespace to var.developer_name is intentional and aligns with downstream module usage
    • Subscription naming logic: Validate that the new subscription name derivation from local.topic_name instead of var.namespace maintains expected naming consistency
    • Variable/output consolidation: Ensure removed variables and outputs do not break external dependencies or downstream Helm integrations that may have relied on service account email outputs

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Configure WIF for hyperfleet pubsub' directly and specifically addresses the main objective of the PR: updating the Workload Identity Federation configuration for the hyperfleet pubsub infrastructure.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
terraform/modules/pubsub/service_accounts.tf (1)

6-11: The Workload Identity principal format is correct for GKE Workload Identity Federation.

Since April 2024, GKE allows using Kubernetes service accounts directly as principals in IAM bindings. The principal string format in this code follows the documented pattern and most Google Cloud APIs, including Pub/Sub, support direct resource access to Kubernetes service account principals.

The workload pool identifier ${var.project_id}.svc.id.goog is the default workload pool that GKE uses when Workload Identity is enabled, so the code is correct as written. Verify that:

  1. Workload Identity is enabled on the cluster and the node pool has GKE_METADATA mode enabled
  2. The var.project_id variable matches the GKE cluster's project ID exactly
  3. The namespace and service account name variables match the actual Kubernetes resources
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4350f4d and 6c08132.

📒 Files selected for processing (8)
  • terraform/README.md (0 hunks)
  • terraform/main.tf (1 hunks)
  • terraform/modules/pubsub/main.tf (1 hunks)
  • terraform/modules/pubsub/outputs.tf (0 hunks)
  • terraform/modules/pubsub/service_accounts.tf (1 hunks)
  • terraform/modules/pubsub/variables.tf (0 hunks)
  • terraform/outputs.tf (0 hunks)
  • terraform/variables.tf (1 hunks)
💤 Files with no reviewable changes (4)
  • terraform/README.md
  • terraform/outputs.tf
  • terraform/modules/pubsub/outputs.tf
  • terraform/modules/pubsub/variables.tf
🔇 Additional comments (4)
terraform/variables.tf (1)

89-93: Pubsub now enabled by default - verify this aligns with rollout strategy.

Changing the default from false to true means Pub/Sub resources will be provisioned for all developers automatically. In a shared GCP project, this could:

  1. Increase resource usage and costs
  2. Create topics/subscriptions for developers who don't need them
  3. Require quotas to accommodate all developers

Ensure this aligns with your rollout strategy and that the shared project has sufficient quotas.

terraform/modules/pubsub/service_accounts.tf (2)

20-33: LGTM: Adapter IAM bindings correctly migrated to Workload Identity.

The adapter's IAM bindings have been successfully migrated from using explicit Google Service Accounts to Workload Identity principals. Both subscriber and viewer roles use consistent principal strings that match the pattern established for Sentinel.

The migration ensures:

  • No orphaned Google Service Accounts
  • Direct K8s ServiceAccount to GCP IAM binding
  • Consistent security model across Sentinel and Adapter

42-57: DLQ permissions still use GCP service account - verify this is intentional.

The dead letter queue permissions use the GCP-managed Pub/Sub service account (service-{PROJECT_NUMBER}@gcp-sa-pubsub.iam.gserviceaccount.com) rather than Workload Identity principals.

This is correct because the Pub/Sub service itself (not a K8s workload) needs to:

  1. Publish failed messages to the DLQ topic
  2. Acknowledge messages from the main subscription for DLQ processing
terraform/main.tf (1)

61-62: Verify that Kubernetes ServiceAccounts match the configured names.

The service account names have been updated:

  • sentinel_k8s_sa_name = "sentinel"
  • adapter_k8s_sa_name = "adapter" (changed from the module default of "hyperfleet-adapter")

These names are used in Workload Identity IAM bindings. Verify that the Helm chart (hyperfleet-chart) creates ServiceAccounts with exactly these names in the hyperfleet-system namespace. Any mismatch will cause Workload Identity authentication failures.

namespace = var.kubernetes_namespace
resource_type = "clusters"
project_id = var.gcp_project_id
namespace = var.developer_name
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check variable definition and any validation rules for developer_name
rg -nP 'variable\s+"developer_name"' --type=tf -A15
# Check other usages of developer_name to understand the pattern
rg -nP 'var\.developer_name' --type=tf -C3

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 1694


Add validation to enforce Kubernetes namespace naming conventions on var.developer_name.

The developer_name variable currently lacks input validation. Kubernetes namespace names must be valid RFC 1123 DNS labels, which can only contain alphanumeric characters and hyphens (max 253 characters), with names unable to start with a hyphen and alpha characters only lowercase. Since developer usernames typically contain uppercase letters, underscores, or dots, the variable should enforce a regex pattern to validate compatibility before it's used as a namespace name. This prevents deployment failures from invalid Kubernetes resource names.

🤖 Prompt for AI Agents
In terraform/main.tf around line 57, var.developer_name is used as a Kubernetes
namespace but lacks validation; add a validation rule to the variable
declaration (typically in variables.tf or main.tf) that enforces the RFC1123 DNS
label constraints: only lowercase alphanumeric characters and hyphens, must
start and end with an alphanumeric character, and length <= 253 characters;
implement this by adding a regex-based validation (and an explanatory error
message) to the developer_name variable so invalid usernames (uppercase,
underscores, dots, leading/trailing hyphens, or too long) are rejected at plan
time.

locals {
topic_name = var.topic_name != "" ? var.topic_name : "${var.namespace}-${var.resource_type}"
subscription_name = var.subscription_name != "" ? var.subscription_name : "${var.namespace}-adapter"
subscription_name = var.subscription_name != "" ? var.subscription_name : "${local.topic_name}-sub"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -path "*/terraform/modules/pubsub/main.tf" -type f

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 115


🏁 Script executed:

cat -n ./terraform/modules/pubsub/main.tf | head -50

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 2231


🏁 Script executed:

rg -i "namespace.*adapter|adapter.*namespace" --type=tf

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 726


🏁 Script executed:

cat -n ./terraform/modules/pubsub/variables.tf | grep -A 5 -B 5 "namespace.*adapter"

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 394


🏁 Script executed:

rg "adapter" --type=tf -i | head -30

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 2515


🏁 Script executed:

git log --oneline -n 20 -- terraform/modules/pubsub/main.tf

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 102


🏁 Script executed:

git show 6c08132:terraform/modules/pubsub/main.tf | head -50

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 1881


🏁 Script executed:

git log --all --oneline -- terraform/modules/pubsub/main.tf | head -10

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 179


🏁 Script executed:

git show ec8f9c1:terraform/modules/pubsub/main.tf | grep -A 5 "subscription_name"

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 552


🏁 Script executed:

rg "hyperfleet-adapter|namespace.*adapter" --type=tf --type=yaml --type=json --type=md -i | grep -v ".git"

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 1068


🏁 Script executed:

cat -n terraform/README.md | grep -A 3 -B 3 "adapter"

Repository: openshift-hyperfleet/hyperfleet-infra

Length of output: 717


Breaking change confirmed: subscription naming convention has been updated without migration mechanism.

The subscription naming has changed from "${var.namespace}-adapter" to "${local.topic_name}-sub", which forces a new resource to be created. This will cause Terraform to plan the destruction of existing subscriptions and creation of new ones, potentially orphaning GCP resources.

To safely migrate existing deployments:

  • Add a Terraform moved block to map the old subscription address to the new one, preventing resource destruction
  • Update documentation (README.md) to reflect the new naming pattern
  • Provide clear guidance for users to run terraform state mv if using versions prior to Terraform 1.1.0

Without these steps, existing deployments will experience service disruption and resource waste.

🤖 Prompt for AI Agents
In terraform/modules/pubsub/main.tf around line 3, the subscription naming was
changed from "${var.namespace}-adapter" to "${local.topic_name}-sub" which will
force replacement of existing resources; add a terraform "moved" block in the
module's root module (or in module registry as appropriate) mapping the old
resource address to the new one to preserve state (e.g., map
module.<module_name>.google_pubsub_subscription.<old_name> to
module.<module_name>.google_pubsub_subscription.<new_name>), update README.md to
document the new naming pattern and include step-by-step migration instructions,
and add a note instructing operators using Terraform <1.1.0 to run "terraform
state mv" manually with the exact addresses shown in the moved block so existing
subscriptions are not destroyed and re-created.

@rh-amarin
Copy link
Author

Closing this in favor of #4

@rh-amarin rh-amarin closed this Jan 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant