Note: This is my personal infrastructure management system, made public as a portfolio showcase. It demonstrates patterns for infrastructure-as-code, state-driven architecture, and zero-config service discovery.
A personal infrastructure management system for AI projects that combines:
- Terraform Infrastructure - Modular IaC for GCP, provisioning storage, compute, and networking resources
- Python Utilities - Zero-config access to infrastructure via state-driven discovery
- Neon PostgreSQL - Serverless database integration with async client
- Makefile Automation - Simple commands for managing ephemeral and persistent resources
✨ Zero Configuration: `from databot import storage; bucket = storage('dev')` - no manual config needed
🚀 Modular Infrastructure: Reusable Terraform modules for storage, service accounts, workbenches, cloud run
⚡ Ephemeral Resources: Spin up/down expensive compute with `make workbench.up` / `make workbench.down`
🔐 Secure by Default: Auto-generated service accounts, per-environment isolation, credential management
📦 State-Driven Discovery: Python automatically discovers infrastructure from Terraform state files
The project uses a state-driven architecture where infrastructure discovery happens automatically:
- Terraform provisions GCP resources → writes `terraform.tfstate`
- State files contain infrastructure metadata (bucket names, service accounts, etc.)
- Python package reads state files at runtime to discover available resources
- User code accesses infrastructure through simple APIs with zero manual configuration
This eliminates the need for configuration files or hard-coded resource names. The Python code always knows what infrastructure exists by reading Terraform's state.
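To make the pattern concrete, here is a minimal sketch of what reading discovery data out of `terraform.tfstate` can look like. It assumes the standard state-file JSON layout; the output name `bucket_name` is illustrative, and the real logic lives in `StateLoader` (shown later in the examples).

```python
import json
from pathlib import Path

def read_tf_outputs(state_path: str) -> dict:
    """Return Terraform outputs as a {name: value} dict from a local state file."""
    state = json.loads(Path(state_path).read_text())
    # A state file stores outputs as {"name": {"value": ..., "type": ...}}
    return {name: out["value"] for name, out in state.get("outputs", {}).items()}

outputs = read_tf_outputs("terraform/environments/dev/terraform.tfstate")
bucket_name = outputs.get("bucket_name")  # illustrative output name
```

Because the state is the single source of truth, renaming a resource in Terraform is reflected in Python on the next run with no config change.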
databot3000/
├── terraform/
│   ├── modules/              # Reusable Terraform modules
│   │   ├── gcp-apis/         # Enable GCP APIs
│   │   ├── storage/          # Cloud Storage buckets
│   │   ├── service-account/  # IAM service accounts
│   │   ├── workbench/        # Vertex AI Workbench (GPU instances)
│   │   └── cloud-run/        # Serverless containers
│   └── environments/
│       ├── dev/              # Ephemeral development resources
│       └── prod/             # Persistent production resources
├── src/databot/
│   ├── core/                 # State loader (reads terraform.tfstate)
│   ├── storage/              # GCS bucket interface
│   ├── neondb/               # Async PostgreSQL client
│   ├── auth/                 # Service account authentication
│   └── config.py             # Configuration discovery
└── tests/                    # Pytest test suite
# Storage: discovered from Terraform state
from databot import storage

bucket = storage('dev')
bucket.upload_json({'data': [1, 2, 3]}, 'results.json')

# Database: async PostgreSQL with Neon
from databot import neondb

async with neondb("myproject", "neondb") as db:
    users = await db.fetch("SELECT * FROM users WHERE active = $1", True)

# Setup
make install            # Install dependencies (uv)
make test               # Run pytest suite

# Infrastructure
make dev                # Initialize dev environment
make terraform.plan     # Preview infrastructure changes
make terraform.apply    # Deploy infrastructure
make terraform.destroy  # Tear down all resources

# Ephemeral Resources (cost optimization)
make workbench.up       # Spin up Vertex AI Workbench
make workbench.down     # Destroy Workbench to save $$$

Dev (`terraform/environments/dev/`):
- Ephemeral resources (`force_destroy = true`)
- 90-day auto-delete on storage
- Workbench defaults to STOPPED state
- Service accounts with keys for local development
Prod (`terraform/environments/prod/`):
- Persistent resources with deletion protection
- Versioned storage buckets
- Archive bucket with cold storage migration
- Workload Identity (no service account keys)
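Both modes converge on Application Default Credentials (ADC) in Python, so the same client code runs unchanged locally and in prod. A minimal sketch, assuming the `auth` package delegates to ADC (an assumption about the design, not the package's actual code):

```python
# Sketch: credential discovery via Application Default Credentials (ADC).
# Locally, ADC reads the key file named by GOOGLE_APPLICATION_CREDENTIALS;
# under Workload Identity on GCP it fetches short-lived credentials from the
# metadata server, so no key ever touches disk.
import google.auth
from google.cloud import storage as gcs

credentials, project_id = google.auth.default()
client = gcs.Client(project=project_id, credentials=credentials)
```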
- Infrastructure: Terraform >= 1.0, GCP (Cloud Storage, Vertex AI, Cloud Run)
- Language: Python >= 3.11, asyncpg for PostgreSQL
- Database: Neon serverless PostgreSQL
- Tooling: uv for dependency management, pytest for testing
- Compute: Modal Labs for ad-hoc serverless workloads (planned)
from databot import storage

# Automatically discovers bucket from terraform state
bucket = storage('dev')
bucket.upload_file('data.csv', 'datasets/data.csv')
files = bucket.list_files(prefix='datasets/')
from datetime import datetime, timedelta
from databot import neondb

async with neondb("databot") as db:
    # Insert with parameterized queries
    await db.execute(
        "INSERT INTO logs (event, timestamp) VALUES ($1, $2)",
        "model_trained", datetime.now()
    )

    # Fetch with filters
    recent = await db.fetch(
        "SELECT * FROM logs WHERE timestamp > $1",
        datetime.now() - timedelta(days=7)
    )

from databot import DatabotConfig, StateLoader
# High-level discovery
config = DatabotConfig(environment='dev')
buckets = config.get_bucket_names()
service_account = config.get_service_account_email()

# Low-level state access
loader = StateLoader('terraform/environments/dev/terraform.tfstate')
outputs = loader.get_outputs()
workbenches = loader.get_google_workbench_instances()

- `CLAUDE.md` - Development guide for Claude Code
- `terraform/README.md` - Terraform modules and usage
- `src/databot/README.md` - Python API reference
- Terraform >= 1.0
- Python >= 3.11
- Google Cloud Project with billing
- gcloud CLI for authentication
Version: 0.1.0 | Status: Active Development | Last Updated: November 2025