Contributor

@jkroll-deepgram jkroll-deepgram commented Nov 7, 2025

Proposed changes

Adds optional support for the Billing container, providing airgapped license management and usage tracking for Deepgram self-hosted deployments.

This implementation provides enterprise customers with a robust, HA-capable, airgapped licensing solution while maintaining full backward compatibility with existing cloud-connected deployments.

Key Features & Implementation Choices

1. Deployment Flexibility

  • Three supported deployment patterns:
    • Pattern 1 (Cloud/Connected): API/Engine → License Proxy → license.deepgram.com (default, no changes)
    • Pattern 2 (Airgapped Direct): API/Engine → Billing (License Proxy disabled)
    • Pattern 3 (Airgapped with Caching): API/Engine → License Proxy → Billing (optional caching layer)
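The three patterns can be summarized as a small decision function. This is only an illustrative sketch, not the chart's template logic: `billing.enabled` and `licenseProxy.upstream.useBilling` are values named in this PR, while the function name, the `license_proxy_enabled` parameter, and the fallback for unlisted combinations are assumptions made for the example.

```python
from typing import List


def license_route(billing_enabled: bool,
                  license_proxy_enabled: bool,
                  proxy_use_billing: bool = False) -> List[str]:
    """Return the hops a license request takes after leaving API/Engine.

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    if not billing_enabled:
        # Pattern 1 (Cloud/Connected): the existing default, unchanged.
        return ["License Proxy", "license.deepgram.com"]
    if license_proxy_enabled and proxy_use_billing:
        # Pattern 3 (Airgapped with Caching): License Proxy fronts Billing.
        return ["License Proxy", "Billing"]
    # Pattern 2 (Airgapped Direct). Treating every other combination as
    # "direct" is an assumption of this sketch.
    return ["Billing"]


if __name__ == "__main__":
    print(" -> ".join(["API/Engine"] + license_route(True, True, True)))
```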

2. High Availability (HA) Support

  • StatefulSet architecture with stable network identities for multi-replica deployments
  • Per-pod persistent journal storage via volumeClaimTemplates (ReadWriteOnce by default)
  • Optional shared storage via billing.journal.existingPvcName for EFS/NFS (ReadWriteMany)
  • Configurable replica count (default: 1, supports N replicas for redundancy)
  • Each replica maintains its own usage journal to avoid write conflicts (EBS), or writes to separate subdirectories (EFS)
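The two storage layouts differ only in where each replica writes its journal. Here is a minimal sketch of that distinction, assuming the shared-volume case uses one subdirectory per pod name. The exact subdirectory naming is an assumption, as are the function name and the example pod name `billing-0`; `/journal/journal` is the journal path given in this PR.

```python
from pathlib import PurePosixPath

# Mount point inferred from the /journal/journal path in this PR.
JOURNAL_MOUNT = PurePosixPath("/journal")


def journal_path(pod_name: str, shared_volume: bool) -> PurePosixPath:
    """Return where a Billing replica would write its journal (illustrative)."""
    if shared_volume:
        # Shared ReadWriteMany volume (e.g. EFS): each replica writes under
        # its own subdirectory so replicas never share a file. The layout
        # <mount>/<pod_name>/journal is an assumption of this sketch.
        return JOURNAL_MOUNT / pod_name / "journal"
    # Per-pod PVC from volumeClaimTemplates: the volume is private to the
    # pod, so a fixed path cannot collide with another replica.
    return JOURNAL_MOUNT / "journal"


if __name__ == "__main__":
    for pod in ("billing-0", "billing-1"):  # hypothetical StatefulSet pod names
        print(pod, journal_path(pod, shared_volume=True))
```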

3. Persistent Usage Tracking

  • Journal file (/journal/journal) persists all usage data for billing and compliance
  • EFS-backed PVCs (default: 1Gi, configurable storage class and size)
  • Critical data protection: journals survive pod restarts, node failures, and cluster migrations
  • Flexible storage: Supports both EBS (per-pod PVCs) and EFS (shared PVC) for different HA patterns

4. Secure Secret Management

  • Two-tier secret architecture:
    • global.deepgramLicenseSecretRef: License key (env var: DEEPGRAM_LICENSE_KEY)
    • billing.licenseFile.secretRef: License file (mounted as /license/license.dg)
  • Runtime config rendering via sed in initContainers for secure key injection
  • Configurable init container image (global.initContainer.image) with ubuntu:22.04 default
  • Airgapped-friendly: init container image can be mirrored to private registries
  • Secrets never exposed in ConfigMaps or logs
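The runtime rendering step amounts to a plain text substitution performed after the Secret has been exposed to the init container. The chart does this with sed; the Python below only mirrors the idea. The placeholder token `__LICENSE_KEY__` and the TOML key in the example are hypothetical, and `DEEPGRAM_LICENSE_KEY` is the environment variable named above.

```python
import os
from typing import Mapping, Optional

# Hypothetical placeholder token; the real template may use a different one.
PLACEHOLDER = "__LICENSE_KEY__"


def render_config(template: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace the placeholder with the license key read from the environment.

    Equivalent in spirit to `sed "s/<placeholder>/<key>/g"`: the key only
    ever appears in the rendered file, never in the template itself.
    """
    env = os.environ if env is None else env
    key = env["DEEPGRAM_LICENSE_KEY"]  # raises KeyError if the secret is missing
    return template.replace(PLACEHOLDER, key)


if __name__ == "__main__":
    # The key name `license_key` is made up for the example.
    template = 'license_key = "__LICENSE_KEY__"\n'
    print(render_config(template, {"DEEPGRAM_LICENSE_KEY": "example-key"}), end="")
```

Keeping the template in the ConfigMap and the key in a Secret means the ConfigMap can be inspected or logged without revealing the key.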

5. Minimal Container Design

  • Ultra-minimal billing image (no shell, no tar) for security and size optimization
  • Debug workflow provided via ephemeral debug pods for journal file access (documented in samples/airgapped.md)
  • Consistent with Deepgram's security-first container design philosophy

6. Seamless License Proxy Integration

  • New configuration flag: licenseProxy.upstream.useBilling
    • When true: License Proxy forwards requests to Billing
    • When false: License Proxy uses license.deepgram.com (default)
  • Allows optional License Proxy as a caching/request aggregation layer in airgapped environments

7. Automatic Configuration Management

  • Conditional TOML generation in API/Engine ConfigMaps based on billing.enabled
  • Dynamic service discovery for billing-internal headless service
  • Automatic placeholder injection: DEEPGRAM_API_KEY=airgapped-mode for API/Engine/License Proxy to satisfy internal checks
  • Backward compatibility: existing deployments unaffected when billing.enabled: false

8. Resource Management

  • Default resource limits match License Proxy (configurable per-deployment)
  • Node affinity/selectors for workload isolation (e.g., k8s.deepgram.com/node-type=billing)
  • Tolerations for specialized node pools

9. Comprehensive Documentation

  • New airgapped guide: samples/airgapped.md with step-by-step instructions
    • Quick start with minimal configuration examples
    • Manual journal retrieval (debug pod method)
    • Automated backup via CronJob (with S3/local storage options)
    • Multi-replica considerations (EBS vs EFS)
    • Troubleshooting and best practices
  • Sample configurations:
    • 07-basic-setup-aws-airgapped.values.yaml - Complete airgapped deployment values
    • 07-basic-setup-aws-airgapped.cluster-config.yaml - EKS cluster setup for airgapped deployments
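The documented backup procedure lives in samples/airgapped.md. As a rough illustration of the local-storage variant only, a backup job could copy the journal to a timestamped file so that successive runs never overwrite each other. The function name, file naming scheme, and directory arguments below are assumptions, not the script shipped with the chart.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def backup_journal(journal: Path, backup_dir: Path,
                   now: Optional[datetime] = None) -> Path:
    """Copy the journal file to backup_dir under a UTC-timestamped name."""
    now = now or datetime.now(timezone.utc)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Example name: journal-20260121T195600Z (naming scheme is hypothetical).
    target = backup_dir / f"journal-{now:%Y%m%dT%H%M%SZ}"
    shutil.copy2(journal, target)  # copy2 also preserves file timestamps
    return target
```

For example, `backup_journal(Path("/journal/journal"), Path("/backup"))` would be run from a pod that mounts both the journal volume and a backup volume; the `/backup` mount is likewise hypothetical.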

Migration Path

  • Zero breaking changes for existing customers
  • Opt-in only: set billing.enabled: true and configure secrets
  • Side-by-side compatibility: License Proxy and Billing can coexist during testing
  • Clear migration guide in samples/airgapped.md

Technical Highlights

  • Kubernetes Resources: StatefulSet, Headless Service, ConfigMap, RBAC, PVCs
  • Storage Requirements: EFS CSI driver (AWS) or equivalent storage provisioner, configurable storage class
  • Secret Types: generic (license key), generic (license file as file mount)
  • Init Container Pattern: Configurable image (default ubuntu:22.04) with sed for runtime secret injection
  • Journal Format: Newline-delimited JSON (NDJSON) with base64-encoded, signed usage events
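To make the journal format concrete, here is a minimal reader for NDJSON lines carrying base64-encoded events. The record layout is not specified in this PR, so the field names `payload` and `signature` are purely hypothetical, and the sketch does not attempt to verify signatures.

```python
import base64
import json
from typing import Dict, Iterable, List


def read_journal(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Parse NDJSON journal lines into decoded events (illustrative only).

    Assumes each line is a JSON object with a base64 `payload` field and an
    optional `signature` field; both names are assumptions of this sketch.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        record = json.loads(line)
        events.append({
            "payload": base64.b64decode(record["payload"]),
            "signature": record.get("signature"),
        })
    return events


if __name__ == "__main__":
    sample = json.dumps({
        "payload": base64.b64encode(b"usage-event").decode("ascii"),
        "signature": "example-signature",
    })
    print(read_journal([sample]))
```

Because each line is an independent JSON document, a journal that is still being appended to can be read line by line without loading the whole file.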

Testing Completed

  • End-to-end airgapped deployment on EKS 1.32
  • Successful transcription requests with billing tracking
  • Journal file generation and retrieval (both manual and automated methods)
  • Multi-replica StatefulSet behavior
  • License Proxy integration (Pattern 3)
  • Init container secret injection with sed
  • Backward compatibility ("cloud" self-hosted deployments unaffected)

Types of changes

What types of changes does your code introduce to the Deepgram self-hosted resources?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update or tests (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have tested my changes in my local self-hosted environment
    • Please describe your testing setup and methodology here

I brought up a deployment with this Helm chart and the 07 airgapped samples, served a test STT request, and confirmed that the usage was logged in the journal file.

  • I have added necessary documentation (if appropriate)

Further comments

@jkroll-deepgram jkroll-deepgram marked this pull request as ready for review November 13, 2025 23:58

@jkroll-deepgram I see you already fixed the envsubst and API key errors. I also hit one more error, related to Flux. I think the config expects script_path and socket_path to be present:

Error: TOML parse error at line 26, column 1
   |
26 | [flux]
   | ^^^^^^
missing field `script_path`

Contributor Author

@mk2134226 that error is odd to me: script_path and socket_path are not Deepgram values or feature flags. Do those have any contextual meaning to you?

Contributor Author

@mk2134226, I got clarity that those are Deepgram Flux parameters, but we don't expect users to need to set them; we set the defaults. Let me know if this is still an ongoing issue that is being surfaced.

@jkroll-deepgram jkroll-deepgram requested a review from bd-g January 13, 2026 01:58
Collaborator

@bd-g bd-g left a comment

LGTM!

Most of my comments are nits that would be nice to clean up, but can be done in a subsequent PR since they're internal-only details and aren't breaking changes. That said, I have a couple comments that might be better to address before merging:

  • Removing the licenseProxy.upstream.useBilling option
  • Double-checking that multiple billing replicas work as expected

Anyways, well done!

@jkroll-deepgram jkroll-deepgram merged commit a9abc39 into main Jan 21, 2026
2 checks passed
@jkroll-deepgram jkroll-deepgram deleted the billing-helm branch January 21, 2026 19:56
@jkroll-deepgram jkroll-deepgram mentioned this pull request Jan 21, 2026
7 tasks

4 participants