Terraform Deployment

Eric Fitzgerald edited this page Apr 8, 2026 · 6 revisions

This guide covers deploying TMI infrastructure using Terraform across four cloud providers: AWS, OCI (Oracle Cloud), GCP (Google Cloud), and Azure.

Overview

TMI provides 8 ready-to-use Terraform templates -- public and private variants for each cloud provider:

| Provider | Public Template | Private Template |
|----------|-----------------|------------------|
| AWS      | aws-public      | aws-private      |
| OCI      | oci-public      | oci-private      |
| GCP      | gcp-public      | gcp-private      |
| Azure    | azure-public    | azure-private    |

Public templates are low-cost, "kick the tires" deployments with no deletion protection, designed for easy setup and teardown.

Private templates are for organizations deploying TMI internally -- no public ingress, outbound NAT only, with deletion protection enabled.

Choosing a Template

Public Templates

  • Internet-accessible load balancer
  • Minimal cost (free tier where available)
  • No deletion protection -- easy terraform destroy
  • build_mode=dev with verbose logging
  • First user auto-promoted to admin
  • Best for: evaluation, demos, small teams

Private Templates

  • No public ingress -- internal load balancer only
  • Outbound NAT for reaching OAuth providers and container registries
  • Deletion protection enabled
  • build_mode=production
  • Best for: enterprise deployments, organizations with private network requirements
  • Deployer is responsible for establishing user connectivity (VPN, bastion, etc.) and IdP integration

Architecture

All templates deploy the same core components:

Load Balancer (HTTP/HTTPS + WebSocket)
        ↓
Kubernetes Cluster (single node)
  ├── TMI Server Pod (1 replica)
  └── Redis Pod (separate Deployment)
        ↓
Managed PostgreSQL Database
        ↓
Secrets Manager / Vault

Key constraints:

  • A single TMI server instance, always -- TMI is stateful and not designed for high availability (HA)
  • Multi-arch container images -- amd64 + arm64 support via Docker buildx
  • Provider-native services -- each template uses the cloud provider's managed database, secrets, logging, and certificate services
  • TMI logs to stdout -- each provider's Kubernetes log collection ships those logs to that provider's logging service

Provider-Specific Details

AWS

| Resource      | Public                              | Private                                    |
|---------------|-------------------------------------|--------------------------------------------|
| Kubernetes    | EKS + 1× t3.medium managed node     | EKS + 1× t3.medium, private API endpoint   |
| Database      | RDS PostgreSQL db.t3.micro          | RDS PostgreSQL db.t3.small, private subnet |
| Load Balancer | ALB (internet-facing)               | ALB (internal)                             |
| Secrets       | AWS Secrets Manager                 | AWS Secrets Manager                        |
| Logging       | CloudWatch via Fluent Bit DaemonSet | Same                                       |
| Registry      | ECR                                 | ECR                                        |
| Est. Cost     | ~$140-150/mo                        | ~$150-180/mo                               |

OCI (Oracle Cloud)

| Resource      | Public                                 | Private                                  |
|---------------|----------------------------------------|------------------------------------------|
| Kubernetes    | OKE + 1× VM.Standard.A1.Flex (arm64)   | OKE + 1× VM.Standard.A1.Flex, private API |
| Database      | ADB Always Free (23ai)                 | ADB non-free (private endpoint support)  |
| Load Balancer | OCI Flexible LB (10 Mbps free)         | Internal OCI LB                          |
| Secrets       | OCI Vault                              | OCI Vault                                |
| Logging       | OCI Logging (Unified Monitoring Agent) | Same                                     |
| Registry      | OCIR                                   | OCIR                                     |
| Est. Cost     | ~$0 (Always Free eligible)             | ~$50-80/mo                               |

OCI is the only one of the four providers whose public template fits entirely within a permanent free tier: it can run indefinitely at zero cost using Always Free resources (Ampere arm64 nodes).

GCP (Google Cloud)

| Resource      | Public                                | Private                        |
|---------------|---------------------------------------|--------------------------------|
| Kubernetes    | GKE Autopilot                         | GKE Autopilot, private cluster |
| Database      | Cloud SQL PostgreSQL db-custom-1-3840 | Cloud SQL, private IP only     |
| Load Balancer | GKE-managed ingress                   | Internal GKE ingress           |
| Secrets       | Secret Manager                        | Secret Manager                 |
| Logging       | Cloud Logging (built-in)              | Same                           |
| Registry      | Artifact Registry                     | Artifact Registry              |
| Est. Cost     | ~$80-100/mo                           | ~$100-130/mo                   |

Azure

| Resource      | Public                           | Private                                                                  |
|---------------|----------------------------------|--------------------------------------------------------------------------|
| Kubernetes    | AKS Free + 1× B2s                | AKS Standard + 1× B2ms, private cluster                                  |
| Database      | PostgreSQL Flexible Server B1ms  | Flexible Server B2s, VNet integration                                    |
| Load Balancer | NGINX Ingress Controller         | Internal NGINX                                                           |
| Secrets       | Key Vault                        | Key Vault                                                                |
| Logging       | Azure Monitor Container Insights | Same                                                                     |
| Registry      | ACR Basic                        | ACR Basic (public access -- Basic SKU does not support private endpoints) |
| Est. Cost     | ~$75-85/mo                       | ~$120-150/mo                                                             |

Prerequisites

Required Tools

  • Terraform 1.5+
  • kubectl (for verifying deployment)
  • Docker (for building container images)
  • Cloud provider CLI: aws, oci, gcloud, or az depending on provider

Container Images

TMI container images must be pushed to the provider's container registry before deployment. The templates accept image URLs as variables.

# Build and push multi-arch images (amd64 + arm64)
make build-containers-multiarch

# Or build for local platform only (testing)
make build-containers-multiarch-local

Quick Start

1. Choose a Template

cd terraform/environments/<provider>-<variant>
# Example: cd terraform/environments/oci-public

2. Configure

cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your values

Each template requires provider-specific credentials and identifiers. See the terraform.tfvars.example file for required and optional variables.

3. Deploy

terraform init
terraform plan -out=tfplan
terraform apply tfplan
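
The three commands above can be wrapped so that a failed step stops the sequence and an empty plan skips the apply. This is a minimal sketch, not part of the TMI repository: the function name tmi_deploy is hypothetical. It relies only on the standard Terraform subcommands and the documented -detailed-exitcode and -out flags of terraform plan.

```shell
# Hypothetical wrapper (not part of the TMI repository): init, plan, and
# apply only when the saved plan contains changes.
# Run it from inside the chosen template directory.
tmi_deploy() {
  terraform init || return 1
  # With -detailed-exitcode, plan exits 0 (no changes), 1 (error),
  # or 2 (changes present).
  tmi_rc=0
  terraform plan -detailed-exitcode -out=tfplan || tmi_rc=$?
  case "$tmi_rc" in
    0) echo "no changes to apply" ;;
    2) terraform apply tfplan ;;
    *) echo "terraform plan failed" >&2; return 1 ;;
  esac
}
```

Applying the saved tfplan file means the apply step executes exactly the plan that was produced, rather than re-planning at apply time.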

4. Configure kubectl

# The exact command is in the terraform output
terraform output kubernetes_config_command
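
Because the output holds a command rather than a kubeconfig, it still has to be run. The sketch below prints the command for review and then executes it. It assumes that kubernetes_config_command contains a single shell command, and the function name tmi_configure_kubectl is hypothetical.

```shell
# Hypothetical helper: show the kubeconfig command, then run it.
# Assumes the kubernetes_config_command output holds one shell command.
tmi_configure_kubectl() {
  tmi_cmd="$(terraform output -raw kubernetes_config_command)" || return 1
  [ -n "$tmi_cmd" ] || { echo "output is empty" >&2; return 1; }
  echo "running: $tmi_cmd"
  # eval executes whatever the output contains, so only use this with
  # state you trust.
  eval "$tmi_cmd"
}
```

If you prefer not to use eval, copy the printed command and run it by hand instead.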

5. Verify

kubectl get pods -n tmi
curl "$(terraform output -raw tmi_api_endpoint)/"
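
Right after an apply, the pods and the load balancer may need some time before the endpoint answers, so a single curl can fail even on a healthy deployment. A small retry loop helps here. The helper below is a generic sketch and the name tmi_retry is hypothetical.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: tmi_retry <attempts> <delay-seconds> <command> [args...]
tmi_retry() {
  tmi_attempts=$1
  tmi_delay=$2
  shift 2
  tmi_try=1
  while [ "$tmi_try" -le "$tmi_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $tmi_try/$tmi_attempts failed" >&2
    # Pause between attempts, but not after the final one.
    if [ "$tmi_try" -lt "$tmi_attempts" ]; then
      sleep "$tmi_delay"
    fi
    tmi_try=$((tmi_try + 1))
  done
  return 1
}
```

For example, tmi_retry 30 10 curl -fsS "$(terraform output -raw tmi_api_endpoint)/" polls the endpoint every 10 seconds for up to 30 attempts. For the pods themselves, kubectl wait --for=condition=Ready pods --all -n tmi --timeout=300s is an alternative that blocks until they report Ready.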

Configuration

Environment Variables

Templates set default environment variables via a Kubernetes ConfigMap. Public templates include verbose logging; private templates use production defaults.

Common defaults (all templates, both public and private):

| Variable                                  | Value   |
|-------------------------------------------|---------|
| TMI_AUTH_AUTO_PROMOTE_FIRST_USER          | true    |
| TMI_LOGGING_ALSO_LOG_TO_CONSOLE           | true    |
| TMI_LOGGING_REDACT_AUTH_TOKENS            | true    |
| TMI_LOGGING_SUPPRESS_UNAUTHENTICATED_LOGS | true    |
| TMI_SERVER_INTERFACE                      | 0.0.0.0 |
| TMI_SERVER_PORT                           | 8080    |

Public template additions (added when build_mode=dev):

| Variable                           | Value |
|------------------------------------|-------|
| TMI_AUTH_BUILD_MODE                | dev   |
| TMI_AUTH_EVERYONE_IS_A_REVIEWER    | true  |
| TMI_LOGGING_LOG_API_REQUESTS       | true  |
| TMI_LOGGING_LOG_API_RESPONSES      | true  |
| TMI_LOGGING_LOG_WEBSOCKET_MESSAGES | true  |

Private template additions:

| Variable            | Value      |
|---------------------|------------|
| TMI_AUTH_BUILD_MODE | production |

Custom Environment Variables

Add custom environment variables (IdP configuration, timeouts, etc.) via the extra_env_vars variable in your .tfvars:

extra_env_vars = {
  "OAUTH_PROVIDERS_GOOGLE_CLIENT_ID"     = "your-client-id"
  "OAUTH_PROVIDERS_GOOGLE_CLIENT_SECRET" = "your-client-secret"
  "TMI_WEBSOCKET_INACTIVITY_TIMEOUT"     = "600"
}

Secrets

The JWT secret, database password, and Redis password are generated automatically by Terraform's random_password resource and stored in the provider's secrets manager. The generated values are recorded in Terraform state, so they stay the same across subsequent terraform apply runs.

WebSocket Support

TMI uses WebSockets for real-time collaborative diagram editing. All templates configure load balancer idle timeouts to support long-lived WebSocket connections:

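| Provider | Mechanism                                                          | Idle Timeout |
|----------|--------------------------------------------------------------------|--------------|
| AWS      | ALB idle_timeout.timeout_seconds                                   | 3600s        |
| OCI      | LB connection-idle-timeout annotation (when TLS is configured)     | 300s         |
| GCP      | BackendConfig timeoutSec                                           | 3600s        |
| Azure    | NGINX proxy-read-timeout / proxy-send-timeout                      | 3600s        |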

WebSocket Origin Checking

  • Public templates (build_mode=dev): All WebSocket origins accepted
  • Private templates (build_mode=production): Origins must match the request host, TLS subject name, or values in the WEBSOCKET_ALLOWED_ORIGINS environment variable

If your client hostname differs from the TMI API hostname in a private deployment, set:

extra_env_vars = {
  "WEBSOCKET_ALLOWED_ORIGINS" = "https://tmi.internal.example.com"
}
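
To decide whether the setting is needed, it can help to compare the origin of the client URL with that of the API URL. The sketch below is a conservative approximation, not TMI's actual server-side check: it compares scheme, host, and port, assumes both arguments are full URLs containing "://", and uses the hypothetical names tmi_origin and tmi_needs_allowed_origin.

```shell
# Hypothetical helper: reduce a full URL to its origin (scheme://host[:port]).
tmi_origin() {
  tmi_scheme="${1%%://*}"
  tmi_rest="${1#*://}"
  # Everything from the first slash onward is path, not origin.
  echo "$tmi_scheme://${tmi_rest%%/*}"
}

# Succeeds (exit 0) when the client and API URLs have different origins,
# the case in which WEBSOCKET_ALLOWED_ORIGINS would have to be set.
# This approximates, and may be stricter than, the server's own check.
tmi_needs_allowed_origin() {
  [ "$(tmi_origin "$1")" != "$(tmi_origin "$2")" ]
}
```

For instance, tmi_needs_allowed_origin https://ux.internal.example.com https://tmi.internal.example.com succeeds, which suggests adding the client origin to WEBSOCKET_ALLOWED_ORIGINS.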

Post-Deployment Steps

Required for All Templates

  1. Verify deployment: kubectl get pods -n tmi -- confirm TMI and Redis pods are running
  2. Test connectivity: curl <tmi_api_endpoint>/ -- should return TMI version info

Required for Public Templates

  1. Register OAuth provider: Configure at least one OAuth identity provider:
    • Set the callback URL at the IdP to <tmi_external_url>/oauth2/callback
    • Add the IdP's client ID/secret via extra_env_vars and re-apply
  2. DNS (optional): Point a custom domain at the load balancer IP/DNS
  3. TLS (optional): Configure a certificate via the provider's certificate service
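
The callback URL in step 1 is the external URL followed by /oauth2/callback. A trailing slash on the external URL is an easy way to end up with a mismatched value at the IdP, so the sketch below normalizes it first. The function name tmi_callback_url is hypothetical.

```shell
# Hypothetical helper: build the OAuth callback URL from the external URL.
tmi_callback_url() {
  # Drop one trailing slash so the result never contains "//oauth2".
  echo "${1%/}/oauth2/callback"
}
```

Calling tmi_callback_url https://tmi.example.com/ and tmi_callback_url https://tmi.example.com both print https://tmi.example.com/oauth2/callback, which is the value to register with the identity provider.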

Required for Private Templates

  1. Establish user connectivity: Set up VPN, bastion, or other access method so users can reach the internal load balancer
  2. Register OAuth provider: Same as public, but callback URL uses the internal URL
  3. DNS: Configure internal DNS to resolve the TMI hostname to the internal LB IP
  4. TLS: Deploy a certificate and configure the LB for HTTPS
  5. WebSocket origins: Set WEBSOCKET_ALLOWED_ORIGINS if client hostname differs from API hostname

CORS

TMI's CORS middleware sets Access-Control-Allow-Origin: * -- all origins are accepted. No CORS configuration is needed at the infrastructure level. The tmi-ux client can be served from any domain and call the TMI API without issues.

For stricter CORS in production, a TMI server code change would be needed (not part of the Terraform templates).

Directory Structure

terraform/
├── environments/
│   ├── aws-public/          # AWS internet-accessible, minimal cost
│   ├── aws-private/         # AWS internal, no public ingress
│   ├── oci-public/          # OCI Always Free tier
│   ├── oci-private/         # OCI internal deployment
│   ├── gcp-public/          # GCP Autopilot, minimal cost
│   ├── gcp-private/         # GCP private cluster
│   ├── azure-public/        # Azure Free AKS, minimal cost
│   └── azure-private/       # Azure private cluster
└── modules/
    ├── certificates/        # TLS certificates (aws, azure, gcp, oci)
    ├── compute/             # Container compute (oci only)
    ├── database/            # Managed PostgreSQL (aws, azure, gcp, oci)
    ├── dns/                 # DNS records (aws only -- Route 53)
    ├── kubernetes/          # K8s cluster + workloads (aws, azure, gcp, oci)
    ├── logging/             # Log collection (aws, azure, gcp, oci)
    ├── network/             # VPC/VNet/VCN (aws, azure, gcp, oci)
    └── secrets/             # Secrets management (aws, azure, gcp, oci)

Kubernetes Service Choices

| Provider | K8s Service              | Node Type                    | Notes                                                       |
|----------|--------------------------|------------------------------|-------------------------------------------------------------|
| AWS      | EKS + managed node group | t3.medium                    | Fargate costs more and doesn't support true CPU bursting    |
| OCI      | OKE + managed node pool  | VM.Standard.A1.Flex (arm64)  | Virtual nodes blocked (no init containers)                  |
| GCP      | GKE Autopilot            | Auto-provisioned             | Pay-per-pod, no node management                             |
| Azure    | AKS                      | B2s (public) / B2ms (private) | Virtual nodes blocked (no init containers)                  |

Terraform State

Templates use local state by default. Each template includes a commented-out remote backend block with provider-appropriate configuration:

  • AWS: S3 backend
  • OCI: HTTP backend with OCI Object Storage (via Pre-Authenticated Request)
  • GCP: GCS backend
  • Azure: Azure Blob Storage backend

Uncomment and configure the backend block in main.tf for remote state management.
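
If the template has already been applied with local state, that state has to be moved to the new backend after the block is uncommented. Terraform's init -migrate-state flag does this. The sketch below also keeps an extra copy of the local state file first. The function name tmi_migrate_state and the copy's filename are hypothetical.

```shell
# Hypothetical helper: copy the local state aside, then migrate it to the
# backend configured in main.tf. Run it from the template directory.
tmi_migrate_state() {
  if [ -f terraform.tfstate ]; then
    cp terraform.tfstate terraform.tfstate.pre-migration || return 1
  fi
  # init -migrate-state asks for confirmation before copying state.
  terraform init -migrate-state
}
```

Once the migration succeeds and terraform plan shows no unexpected changes, the extra copy can be deleted. State files contain the generated secrets, so store any copy accordingly.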

Tearing Down

Public Templates

terraform destroy

No deletion protection is set -- all resources are removed cleanly.

Private Templates

Private templates have deletion protection enabled on databases. To tear down:

  1. Disable deletion protection (varies by provider)
  2. Run terraform destroy

Troubleshooting

Pods Not Starting

kubectl describe pod -n tmi <pod-name>
kubectl logs -n tmi <pod-name>

WebSocket Connections Dropping

Check load balancer idle timeout configuration. TMI's default WebSocket inactivity timeout is 300 seconds -- the LB idle timeout must be >= this value.
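
The rule can be checked mechanically. The sketch below compares the idle timeouts listed in the WebSocket Support table against a chosen inactivity timeout. The function names are hypothetical and the per-provider values are copied from that table, so adjust them if your template differs.

```shell
# Hypothetical helper: succeed when the LB idle timeout (seconds) is at
# least the WebSocket inactivity timeout (seconds).
tmi_timeout_ok() {
  [ "$1" -ge "$2" ]
}

# Check each provider's idle timeout from the WebSocket Support table
# against the inactivity timeout given as the first argument.
tmi_check_timeouts() {
  for tmi_pair in aws:3600 oci:300 gcp:3600 azure:3600; do
    tmi_provider="${tmi_pair%%:*}"
    tmi_lb="${tmi_pair##*:}"
    if tmi_timeout_ok "$tmi_lb" "$1"; then
      echo "$tmi_provider: ok"
    else
      echo "$tmi_provider: LB idle timeout ${tmi_lb}s is below ${1}s"
    fi
  done
}
```

With the default of 300 seconds every provider passes. Running tmi_check_timeouts 600, the value used in the extra_env_vars example for TMI_WEBSOCKET_INACTIVITY_TIMEOUT, flags OCI, whose load balancer timeout is 300 seconds.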

OAuth Callback Failures

Ensure TMI_OAUTH_CALLBACK_URL matches the URL registered with your identity provider. This is automatically set from the load balancer endpoint, but if you're using a custom domain, update it via extra_env_vars.

Private Template -- Cannot Reach K8s API

During provisioning, private templates open temporary public access to the K8s API. If terraform apply fails partway through, that access may stay open. Re-run terraform apply to finish provisioning and revoke the access, or run terraform destroy to clean up.

Multi-Arch Container Images

The OCI Always Free tier uses Ampere ARM64 nodes, while the other providers typically use x86_64. TMI supports both via multi-architecture container images:

# Build and push multi-arch images
make build-containers-multiarch

# Build for local platform only (testing)
make build-containers-multiarch-local

The Chainguard base images and Go cross-compilation with CGO_ENABLED=0 make it possible to build both architectures from one source tree. Oracle-tagged builds, which require CGO, remain x86_64 only.
