diff --git a/CHANGELOG.md b/CHANGELOG.md index 78b2b07..ced3ef2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,7 +7,46 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +### πŸ”’ Security - CRITICAL + +**Multi-Tenant Security Vulnerability Identified and Mitigated** + +- **Identified:** Cross-tenant private repository data leakage in default configuration +- **Impact:** Critical for multi-tenant deployments with private repositories +- **Severity:** CVSS 8.1 (High) +- **Mitigation:** Multiple isolation strategies provided (sidecar pattern deployable today) + ### Added + +#### Security Infrastructure +- Complete security documentation suite (`docs/security/`) +- Tenant isolation framework (`isolation.go`) with 4 isolation modes +- Secure deployment manifests (`examples/kubernetes-sidecar-secure.yaml`) +- Security testing infrastructure +- NetworkPolicy and SecurityContext templates + +#### Load Testing & Deployment +- Docker Compose multi-instance test environment +- Python and k6 load testing harnesses (`loadtest/`) +- HAProxy configuration with consistent hashing +- Prometheus + Grafana monitoring stack +- Comprehensive deployment pattern guide + +#### Storage Optimization +- Tiered storage strategies for AWS, GCP, and Azure +- Cost optimization guide (60-95% potential savings) +- Terraform configurations for cloud storage +- Automated lifecycle management examples + +#### Documentation +- Restructured documentation in `docs/` (10,000+ lines) +- Getting started guide +- Security guides (3 documents) +- Operations guides (4 documents) +- Architecture documentation (3 documents) +- Configuration examples for isolation modes + +#### CI/CD & Release - GitHub Actions automated release pipeline - Multi-platform binary builds (Linux, macOS, Windows) - Automated release notes generation @@ -16,8 +55,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Comprehensive offline mode 
documentation with testing guides ### Changed +- Root README with prominent security warnings +- Documentation organization (`docs/` structure) - Enhanced README with offline mode configuration, monitoring, and testing sections +### Security +- **Action Required for Multi-Tenant Deployments:** Review `docs/security/README.md` +- Sidecar pattern provides immediate security (no code changes) +- Namespace isolation for enterprise compliance +- Application-level isolation framework (requires integration) + ## Template for New Releases When creating a new release, copy the following template and fill in the details: diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..31f8c0d --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,235 @@ +# Contributing to Goblet + +Thank you for your interest in contributing to Goblet! This document provides guidelines for contributing to the project. + +## Code of Conduct + +This project adheres to a code of conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to the project maintainers. + +## How to Contribute + +### Reporting Bugs + +Before creating bug reports, please check existing issues to avoid duplicates. When creating a bug report, include: + +- **Clear title and description** +- **Steps to reproduce** +- **Expected behavior** +- **Actual behavior** +- **Environment details** (OS, Go version, Goblet version) +- **Logs and error messages** + +### Suggesting Enhancements + +Enhancement suggestions are welcome! Please include: + +- **Clear use case**: Why is this enhancement needed? +- **Proposed solution**: How would you like it to work? +- **Alternatives considered**: What other approaches did you consider? +- **Impact**: Who benefits from this enhancement? + +### Pull Requests + +1. **Fork the repository** and create your branch from `main` +2. 
**Make your changes**: + - Write clear, concise commit messages + - Follow the existing code style + - Add tests for new functionality + - Update documentation as needed +3. **Test your changes**: + ```bash + make test + make test-integration + ``` +4. **Ensure code quality**: + ```bash + make lint + make fmt + ``` +5. **Submit the pull request**: + - Link any related issues + - Describe what the PR does + - Note any breaking changes + +## Development Setup + +### Prerequisites + +- Go 1.21 or later +- Git +- Docker (for integration tests) +- Make + +### Setup + +```bash +# Clone your fork +git clone https://github.com/YOUR_USERNAME/goblet.git +cd goblet + +# Add upstream remote +git remote add upstream https://github.com/google/goblet.git + +# Install dependencies +go mod download + +# Build +make build + +# Run tests +make test +``` + +### Project Structure + +``` +github-cache-daemon/ +β”œβ”€β”€ cmd/ # Command-line tools +β”œβ”€β”€ pkg/ # Public libraries +β”œβ”€β”€ internal/ # Private libraries +β”œβ”€β”€ docs/ # Documentation +β”œβ”€β”€ examples/ # Configuration examples +β”œβ”€β”€ loadtest/ # Load testing infrastructure +β”œβ”€β”€ scripts/ # Utility scripts +└── testing/ # Test infrastructure +``` + +## Development Guidelines + +### Code Style + +- **Follow Go best practices**: See [Effective Go](https://golang.org/doc/effective_go.html) +- **Format code**: Use `gofmt` and `goimports` +- **Lint code**: Use `golangci-lint` +- **Write tests**: Aim for 80%+ coverage +- **Document exported symbols**: Use Go doc comments + +### Commit Messages + +Follow the [Conventional Commits](https://www.conventionalcommits.org/) specification: + +``` +type(scope): subject + +body + +footer +``` + +**Types:** +- `feat`: New feature +- `fix`: Bug fix +- `docs`: Documentation changes +- `test`: Test additions or changes +- `refactor`: Code refactoring +- `perf`: Performance improvements +- `chore`: Build process or auxiliary tool changes + +**Examples:** +``` +feat(cache): add 
LRU eviction policy + +Implements a configurable LRU cache eviction policy to +prevent unbounded cache growth. + +Closes #123 +``` + +``` +fix(auth): handle OAuth2 token refresh + +Fixes an issue where expired tokens were not properly +refreshed, causing authentication failures. + +Fixes #456 +``` + +### Testing + +**Unit Tests:** +```bash +make test +``` + +**Integration Tests:** +```bash +make test-integration +``` + +**Load Tests:** +```bash +cd loadtest && make start && make loadtest-python +``` + +**Test Coverage:** +```bash +make coverage +open coverage.html +``` + +### Documentation + +- **Update docs/** when adding features +- **Update README.md** for major changes +- **Add examples/** for new configurations +- **Update CHANGELOG.md** for releases + +**Validate Documentation Links:** + +All documentation links are automatically validated in CI. Before submitting a PR, run: + +```bash +# Validate all markdown links +./scripts/validate-links.py +``` + +The CI pipeline will fail if any broken links are detected. This ensures: +- All relative file links point to existing files +- All anchor links point to existing headers +- Documentation stays consistent and navigable + +## Security + +### Reporting Security Issues + +**DO NOT** create public issues for security vulnerabilities. + +Instead, email security@example.com with: +- Description of the vulnerability +- Steps to reproduce +- Potential impact +- Suggested fix (if any) + +### Security Guidelines + +- Never commit credentials or secrets +- Follow the [Security Guide](docs/security/README.md) +- Test security-sensitive changes thoroughly +- Consider multi-tenant implications + +## Release Process + +Releases are handled by project maintainers: + +1. Update CHANGELOG.md +2. Update version in code +3. Create git tag: `git tag -a v1.2.3 -m "Release v1.2.3"` +4. Push tag: `git push origin v1.2.3` +5. GitHub Actions builds and publishes release + +See [Releasing Guide](docs/operations/releasing.md) for details. 
+ +## Getting Help + +- **Documentation**: [docs/index.md](docs/index.md) +- **Questions**: [GitHub Discussions](https://github.com/google/goblet/discussions) +- **Issues**: [GitHub Issues](https://github.com/google/goblet/issues) + +## Recognition + +Contributors are recognized in: +- CHANGELOG.md (for significant contributions) +- GitHub contributors list +- Release notes + +Thank you for contributing to Goblet! πŸŽ‰ diff --git a/docs/architecture/design-decisions.md b/docs/architecture/design-decisions.md new file mode 100644 index 0000000..d00fb15 --- /dev/null +++ b/docs/architecture/design-decisions.md @@ -0,0 +1,568 @@ +# Architecture Decisions: Goblet Scaling & Deployment + +## Executive Summary + +This document addresses key architectural questions about scaling Goblet for high-traffic deployments, particularly for use cases like Terraform Cloud Agents handling millions of GitHub requests per month. + +**Key Findings:** +- βœ… Goblet is stateful and requires careful deployment planning +- βœ… Sidecar pattern is RECOMMENDED for Terraform-scale deployments +- βœ… Multi-process deployment IS POSSIBLE with repository sharding +- ❌ Naive shared-cache deployment WILL CORRUPT data + +--- + +## Question 1: Does Goblet Handle Stateless Servicing? + +### Answer: NO - Goblet is Stateful + +**Stateful Characteristics:** + +1. **File-based Git repositories** + - Location: `/cache//` as bare Git repos + - Managed by: Native `git` commands (fetch, ls-refs) + - State: Mutable, modified by background fetch operations + +2. **In-process synchronization** + ```go + // managed_repository.go:45 + managedRepos sync.Map // Process-level registry + + // managed_repository.go:126 + type managedRepository struct { + mu sync.RWMutex // Per-repository lock + lastUpdate time.Time // In-memory timestamp + } + ``` + +3. 
**Background operations** + ```go + // git_protocol_v2_handler.go:123 + go func() { + _ = repo.fetchUpstream() // Async modification + }() + ``` + +**Implications:** +- Multiple instances sharing cache = **DATA CORRUPTION** +- Locks are process-local, not distributed +- No coordination between instances + +--- + +## Question 2: Multi-Process Frontend with Load Balancing in Compose + +### Answer: YES - With Repository Sharding + +**Safe Architecture:** + +``` + HAProxy + (consistent hash on URL) + | + +---------------+---------------+ + | | | + Goblet-1 Goblet-2 Goblet-3 + cache-dir-1 cache-dir-2 cache-dir-3 +``` + +**Key Requirements:** + +1. **Consistent hashing**: Route same repository to same instance + ```haproxy + backend goblet_shards + balance uri whole + hash-type consistent + ``` + +2. **Separate cache directories**: No shared storage + ```yaml + volumes: + - cache-1:/cache # Isolated volume per instance + ``` + +3. **Zero retries**: Don't retry on same server (prevents corruption) + ```haproxy + retries 0 + ``` + +**Provided Implementation:** + +See `docker-compose.loadtest.yml` and `loadtest/haproxy.cfg` + +**Tradeoffs:** + +| Aspect | Single Instance | Sharded Multi-Process | +|--------|----------------|----------------------| +| Cache Efficiency | 100% (all repos) | ~33% per instance (1/N) | +| Throughput | 500-1000 req/s | 1500-3000 req/s (3x) | +| Availability | Single point of failure | N-1 survivability | +| Complexity | Simple | Moderate (requires LB) | +| Setup | 1 command | Compose + config | + +**Verdict:** Multi-process IS possible and provided in this repository. + +--- + +## Question 3: Would Sidecar Pattern Be Useful? 
+ +### Answer: YES - HIGHLY RECOMMENDED for Terraform Scale + +### Why Sidecar is Ideal + +**Terraform Agent Architecture:** +``` +Pod (Terraform Agent) +β”œβ”€β”€ Main Container: terraform-agent +β”‚ └── git clone (via http://localhost:8080) +└── Sidecar: goblet-cache + β”œβ”€β”€ Port: 8080 (localhost) + β”œβ”€β”€ Cache: /cache (emptyDir 10GB) + └── Lifecycle: Pod-scoped +``` + +**Benefits for Terraform Cloud Agents:** + +1. **Zero Network Latency** + - Communication: localhost (no network hop) + - Latency: ~0.1ms vs ~10ms (remote) + - Throughput: ~10Gbps (memory) vs ~1Gbps (network) + +2. **Natural Workload Partitioning** + - Each agent has own cache + - No coordination overhead + - No distributed locks needed + - No cache contention + +3. **Pod-Scoped Lifecycle** + - Cache created with pod + - Cache destroyed with pod + - No orphaned state + - Clean failure recovery + +4. **Linear Scaling** + - 100 pods = 100 independent caches + - No shared state bottleneck + - No coordination overhead + - Scales to 1000s of pods + +5. **High Cache Hit Rate** + - Terraform runs often reuse same modules + - Common pattern: 10-100 repos per team + - After warm-up: 80-95% cache hit rate + - Example: `terraform-aws-modules/*` reused frequently + +**Capacity Analysis for 1M Requests/Month:** + +``` +Deployment: 100 Terraform Agent pods with sidecars + +Traffic Distribution: + 1M requests/month = 33,333 requests/day + Per pod: 333 requests/day = ~14 requests/hour + Peak (10x): ~140 requests/hour/pod = ~2.3 req/min + +Per-Pod Load: + Average: 0.004 req/sec (trivial) + Peak: 0.04 req/sec (still trivial) + +Single Goblet instance capacity: ~500-1000 req/sec +Utilization per pod: 0.004% average, 0.04% peak + +Verdict: MASSIVE HEADROOM. Each pod barely uses its sidecar. 
+``` + +**Why Not Shared Cache?** + +Consider alternative: Single shared Goblet cluster + +``` +100 Terraform Agents β†’ Load Balancer β†’ 3 Goblet instances (shared cache) +``` + +Problems: +- ❌ Network latency: ~10ms per request +- ❌ Requires distributed locking (Redis/etcd) +- ❌ Coordination overhead +- ❌ Shared cache bottleneck +- ❌ More complex failure modes +- βœ… Benefit: Higher cache efficiency... but: + - At 1M requests/month, cache misses are rare anyway + - Sidecar pattern achieves 80-95% hit rate after warm-up + +**Recommendation: Use sidecar pattern.** + +### Implementation + +**Provided:** +- `kubernetes-sidecar-deployment.yaml` - Complete Kubernetes manifest +- Includes: Deployment, Service, HPA, PodDisruptionBudget, ServiceMonitor + +**Deployment:** +```bash +kubectl apply -f loadtest/kubernetes-sidecar-deployment.yaml +``` + +**Configuration:** +```yaml +env: + - name: HTTP_PROXY + value: "http://localhost:8080" +``` + +**Scaling:** +```yaml +minReplicas: 10 # Baseline +maxReplicas: 100 # Auto-scale on CPU/memory +``` + +--- + +## Question 4: Load Testing in Compose + +### Answer: YES - Fully Implemented + +**Provided Components:** + +1. **Infrastructure** (`docker-compose.loadtest.yml`) + - 3 Goblet instances + - HAProxy with consistent hashing + - Prometheus + Grafana monitoring + +2. **Load Test Scripts** + - `loadtest.py` - Python-based (flexible, easy to customize) + - `k6-script.js` - k6-based (advanced, gradual ramp-up) + +3. 
**Automation** (`Makefile`) + - One-command setup: `make start` + - One-command test: `make loadtest-python` + - Monitoring: `make stats`, `make metrics` + +**Quick Start:** + +```bash +cd loadtest + +# Start environment +make start + +# Run load test (Python) +make loadtest-python + +# View stats +make stats + +# View metrics +open http://localhost:9090 # Prometheus +open http://localhost:3000 # Grafana (admin/admin) +open http://localhost:8404 # HAProxy stats + +# Stop +make stop +``` + +**Test Scenarios:** + +```bash +# Light load: 10 workers, 100 requests each +python3 loadtest.py --workers 10 --requests 100 + +# Medium load: 50 workers, 200 requests each +python3 loadtest.py --workers 50 --requests 200 + +# Heavy load: 100 workers, 500 requests each +python3 loadtest.py --workers 100 --requests 500 + +# Custom repos +python3 loadtest.py \ + --repos github.com/hashicorp/terraform \ + github.com/terraform-aws-modules/terraform-aws-vpc \ + --workers 20 \ + --requests 100 \ + --output results.json +``` + +--- + +## Architectural Recommendations + +### For Small Deployments (<100 req/sec) + +**Recommendation:** Single instance + +```yaml +# docker-compose.yml +services: + goblet: + image: goblet:latest + ports: + - "8080:8080" + volumes: + - cache:/cache +``` + +**Pros:** Simple, easy to operate, minimal overhead +**Cons:** Single point of failure + +--- + +### For Medium Deployments (100-1000 req/sec) + +**Recommendation:** Sharded multi-instance with HAProxy + +```yaml +# Use provided docker-compose.loadtest.yml +# 3-5 instances with consistent hashing +``` + +**Pros:** Horizontal scaling, high availability, load distribution +**Cons:** Moderate complexity, reduced cache efficiency per instance + +--- + +### For Large-Scale Deployments (Terraform Cloud Scale) + +**Recommendation:** Sidecar pattern in Kubernetes + +```yaml +# Use provided kubernetes-sidecar-deployment.yaml +# 10-100 pods with HPA (autoscaling) +``` + +**Pros:** +- βœ… Linear scaling (no 
coordination overhead) +- βœ… Zero network latency +- βœ… Simple failure model +- βœ… High cache hit rate (80-95% after warm-up) +- βœ… Pod-scoped lifecycle + +**Capacity:** +- 100 pods handle 1M requests/month easily +- Auto-scale to 500+ pods for peak load +- Each pod: ~14 req/hour average + +--- + +### For Multi-Region Deployments + +**Recommendation:** Regional instances + optional sync + +``` +US-EAST EU-WEST APAC + | | | +Goblet Goblet Goblet +(regional) (regional) (regional) +``` + +**Pros:** Low latency, regional isolation +**Cons:** Cache duplication, higher storage costs + +**Optional enhancement:** Background sync popular repos between regions + +--- + +## Partitioning Strategy Recommendations + +### Current State: No Built-in Partitioning + +Goblet does not have built-in partitioning logic. To enable multi-instance deployment, YOU MUST implement partitioning externally. + +### Recommended Partitioning Strategies + +#### 1. URL-Based Consistent Hashing (Implemented) + +**Method:** HAProxy routes by URL path + +```haproxy +backend goblet_shards + balance uri whole + hash-type consistent +``` + +**Pros:** +- βœ… Automatic routing +- βœ… Same repo β†’ same instance +- βœ… No application changes + +**Use case:** Shared multi-instance deployment + +--- + +#### 2. Client-Side Partitioning + +**Method:** Git clients select instance based on repo + +```bash +# Example: Hash repo URL to select instance +REPO="github.com/kubernetes/kubernetes" +INSTANCE=$(($(echo -n "$REPO" | md5sum | cut -c1-8) % 3)) +export HTTP_PROXY="http://goblet-$INSTANCE:8080" +git clone ... +``` + +**Pros:** +- βœ… No load balancer +- βœ… Explicit control + +**Cons:** +- ❌ Client complexity + +**Use case:** Batch jobs, CI/CD pipelines + +--- + +#### 3. 
Tenant-Based Partitioning + +**Method:** Route by team/organization + +```haproxy +# Route based on path prefix +acl team_a path_beg /github.com/team-a/ +acl team_b path_beg /github.com/team-b/ + +use_backend goblet_team_a if team_a +use_backend goblet_team_b if team_b +``` + +**Pros:** +- βœ… Cache isolation per team +- βœ… Cost allocation per tenant + +**Use case:** Multi-tenant platforms + +--- + +#### 4. Sidecar (No Partitioning Needed!) + +**Method:** Each workload has own instance + +``` +Pod 1: App + Goblet β†’ localhost:8080 +Pod 2: App + Goblet β†’ localhost:8080 +Pod 3: App + Goblet β†’ localhost:8080 +``` + +**Pros:** +- βœ… No partitioning logic needed +- βœ… Natural isolation + +**Use case:** Terraform agents, CI/CD runners (RECOMMENDED) + +--- + +## Migration Path: Current β†’ Sidecar + +### Phase 1: Baseline (Current State) +``` +Single Goblet instance +- All requests to one server +``` + +### Phase 2: Load Test (This PR) +``` +Compose environment with 3 instances +- Test multi-process behavior +- Measure cache efficiency +- Validate consistent hashing +``` + +### Phase 3: Sidecar Pilot +``` +Deploy 10 Terraform agents with sidecars +- Monitor for 1 week +- Compare vs. shared cache +- Measure cache hit rate +``` + +### Phase 4: Production Rollout +``` +Scale to 100+ pods +- Enable HPA (10-100 pods) +- Monitor metrics +- Tune cache size per pod +``` + +--- + +## Future Enhancements + +### For Shared-Cache Multi-Instance (Not Implemented) + +To enable true shared-cache deployment, would need: + +1. **Distributed Locking** + - Redis-based locks per repository + - Lock acquisition before git operations + - Timeout + retry logic + +2. **Leader Election** + - One leader per repository + - Leader handles upstream fetches + - Followers serve reads from cache + +3. **Cache Coherency** + - Publish/subscribe for ref updates + - Invalidate stale cache across instances + - Coordinate background fetches + +4. 
**Shared State Store** + - Centralized metadata (lastUpdate times) + - Distributed configuration + - Health checking + +**Complexity:** HIGH +**Benefit:** Moderate (higher cache efficiency) +**Recommendation:** NOT WORTH IT for most use cases. Use sidecar instead. + +--- + +## Conclusion + +### Key Takeaways + +1. **Goblet is stateful** - requires careful deployment +2. **Multi-process IS possible** - with repository sharding (implemented) +3. **Sidecar pattern is IDEAL** - for Terraform Cloud scale (implemented) +4. **Load testing infrastructure is READY** - full Compose environment provided + +### For Your Terraform Use Case + +**Recommendation: Deploy as sidecar** + +```bash +# 1. Build image +docker build -t goblet:v1.0.0 . + +# 2. Deploy to Kubernetes +kubectl apply -f loadtest/kubernetes-sidecar-deployment.yaml + +# 3. Scale +kubectl scale deployment terraform-agent --replicas=100 + +# 4. Monitor +kubectl port-forward svc/terraform-agent-metrics 8080:8080 +curl http://localhost:8080/metrics +``` + +**Expected Results:** +- Cache hit rate: 80-95% (after warm-up) +- Latency: <10ms (localhost) +- Throughput: Linear with pod count +- Operational complexity: Low (no coordination) + +### Next Steps + +1. βœ… Load test with provided infrastructure +2. βœ… Deploy sidecar pilot with 10 pods +3. βœ… Monitor for 1 week +4. βœ… Scale to production (100+ pods) +5. ⏭️ Future: Add LRU eviction, metrics-based cache warming + +--- + +## Questions? + +- **Load testing**: See `loadtest/README.md` +- **Deployment**: See `kubernetes-sidecar-deployment.yaml` +- **Architecture**: This document +- **Code**: See `managed_repository.go`, `git_protocol_v2_handler.go` diff --git a/docs/architecture/scaling-strategies.md b/docs/architecture/scaling-strategies.md new file mode 100644 index 0000000..cc11494 --- /dev/null +++ b/docs/architecture/scaling-strategies.md @@ -0,0 +1,53 @@ +# Scaling Strategies + +How to scale Goblet for high-traffic deployments. 
+ +## Vertical Scaling + +Increase resources for single instance: + +- **CPU:** 2-8 cores +- **Memory:** 4-16GB +- **Disk:** Fast SSD, 100GB-1TB +- **Capacity:** Up to 1,000 req/sec + +## Horizontal Scaling + +Add more instances: + +1. **Sidecar Pattern:** N instances (one per workload) +2. **Sharded Pattern:** HAProxy with consistent hashing +3. **Regional Pattern:** Instance per region + +See [Deployment Patterns](../operations/deployment-patterns.md) for details. + +## Auto-Scaling + +Kubernetes HPA configuration: + +```yaml +apiVersion: autoscaling/v2 +kind: HorizontalPodAutoscaler +metadata: + name: goblet-hpa +spec: + scaleTargetRef: + apiVersion: apps/v1 + kind: Deployment + name: goblet + minReplicas: 10 + maxReplicas: 100 + metrics: + - type: Resource + resource: + name: cpu + target: + type: Utilization + averageUtilization: 70 +``` + +## Related Documentation + +- [Deployment Patterns](../operations/deployment-patterns.md) +- [Load Testing](../operations/load-testing.md) +- [Design Decisions](design-decisions.md) diff --git a/docs/architecture/storage-architecture.md b/docs/architecture/storage-architecture.md new file mode 100644 index 0000000..de056ff --- /dev/null +++ b/docs/architecture/storage-architecture.md @@ -0,0 +1,354 @@ +# Storage Architecture + +## Overview + +Goblet uses object storage backends to persist git repository backups. The storage architecture has been redesigned to support multiple providers through a common interface, enabling deployment flexibility. + +## Design Principles + +1. **Provider Abstraction**: A common `storage.Provider` interface abstracts storage operations +2. **Pluggable Backends**: Easy to add new storage providers +3. **Backward Compatible**: Existing GCS deployments work with minimal changes +4. 
**Configuration-driven**: Provider selection via command-line flags + +## Architecture + +### Storage Interface + +The `storage.Provider` interface defines the contract for all storage backends: + +```go +type Provider interface { + Writer(ctx context.Context, path string) (io.WriteCloser, error) + Reader(ctx context.Context, path string) (io.ReadCloser, error) + Delete(ctx context.Context, path string) error + List(ctx context.Context, prefix string) ObjectIterator + Close() error +} +``` + +### Object Iteration + +Storage providers implement a consistent iterator pattern: + +```go +type ObjectIterator interface { + Next() (*ObjectAttrs, error) +} + +type ObjectAttrs struct { + Name string + Prefix string + Created time.Time + Updated time.Time + Size int64 +} +``` + +### Supported Providers + +#### 1. Google Cloud Storage (GCS) + +**Implementation**: `storage/gcs.go` + +Uses the official `cloud.google.com/go/storage` SDK. + +**Configuration:** +```bash +-storage_provider=gcs +-backup_bucket_name=my-gcs-bucket +-backup_manifest_name=production +``` + +**Authentication:** +- Uses Application Default Credentials (ADC) +- Service account JSON key via GOOGLE_APPLICATION_CREDENTIALS +- Workload Identity in GKE + +**Features:** +- Automatic retry and exponential backoff +- Strong consistency +- Lifecycle policies for old manifests + +#### 2. 
S3-Compatible Storage (S3/Minio) + +**Implementation**: `storage/s3.go` + +Uses the Minio Go SDK (`github.com/minio/minio-go/v7`) which supports: +- Amazon S3 +- Minio +- DigitalOcean Spaces +- Wasabi +- Any S3-compatible storage + +**Configuration:** +```bash +-storage_provider=s3 +-s3_endpoint=s3.amazonaws.com # or localhost:9000 for Minio +-s3_bucket=my-s3-bucket +-s3_access_key=AKIAIOSFODNN7EXAMPLE +-s3_secret_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY +-s3_region=us-east-1 +-s3_use_ssl=true # false for local Minio +-backup_manifest_name=production +``` + +**Authentication:** +- Static credentials via flags/environment variables +- IAM roles (for AWS EC2/ECS) +- Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY + +**Features:** +- Multipart upload for large objects +- Bucket auto-creation +- Streaming uploads via io.Pipe + +## Storage Operations + +### Backup Process + +The backup process runs on a configurable frequency (default: 1 hour): + +1. **List Managed Repositories**: Get all cached repositories +2. **Check Latest Bundle**: Verify if backup is up-to-date +3. **Create Bundle**: Generate git bundle from repository +4. **Upload Bundle**: Write bundle to storage provider +5. **Update Manifest**: Write manifest file with repository list +6. **Garbage Collection**: Remove old bundles and manifests + +### Recovery Process + +On startup, the server can recover from backups: + +1. **List Manifests**: Find all manifest files +2. **Read Manifest**: Parse repository URLs +3. **Download Bundles**: Fetch git bundles from storage +4. 
**Restore Repositories**: Initialize local repositories from bundles + +### Storage Layout + +``` +bucket/ +β”œβ”€β”€ goblet-repository-manifests/ +β”‚ └── {manifest-name}/ +β”‚ β”œβ”€β”€ {timestamp1} # Manifest file +β”‚ └── {timestamp2} # Manifest file +└── github.com/ + └── {owner}/ + └── {repo}/ + └── {timestamp} # Git bundle +``` + +**Manifest File Format:** +``` +https://github.com/owner/repo1 +https://github.com/owner/repo2 +https://github.com/owner/repo3 +``` + +**Bundle Naming:** +- Timestamp format: 12-digit Unix timestamp (e.g., `000001699999999`) +- Enables chronological sorting +- Garbage collection keeps only the latest bundle + +## Provider Selection + +The `storage.NewProvider()` factory function creates the appropriate provider: + +```go +func NewProvider(ctx context.Context, config *Config) (Provider, error) { + switch config.Provider { + case "gcs": + return NewGCSProvider(ctx, config.GCSBucket) + case "s3": + return NewS3Provider(ctx, config) + default: + return nil, nil // No backup configured + } +} +``` + +## Adding New Providers + +To add a new storage provider: + +1. **Create Provider File**: `storage/{provider}.go` +2. **Implement Interface**: Implement `storage.Provider` +3. **Add to Factory**: Update `NewProvider()` in `storage/storage.go` +4. **Add Configuration**: Add flags in `goblet-server/main.go` +5. 
**Document**: Update this file + +### Example Provider Template + +```go +package storage + +type MyProvider struct { + client *SomeClient +} + +func NewMyProvider(ctx context.Context, config *Config) (*MyProvider, error) { + // Initialize client + return &MyProvider{client: client}, nil +} + +func (p *MyProvider) Writer(ctx context.Context, path string) (io.WriteCloser, error) { + // Return writer +} + +func (p *MyProvider) Reader(ctx context.Context, path string) (io.ReadCloser, error) { + // Return reader +} + +func (p *MyProvider) Delete(ctx context.Context, path string) error { + // Delete object +} + +func (p *MyProvider) List(ctx context.Context, prefix string) ObjectIterator { + // Return iterator +} + +func (p *MyProvider) Close() error { + // Cleanup +} +``` + +## Performance Considerations + +### GCS Provider +- **Latency**: Low latency within same region +- **Throughput**: High (multi-Gbps) +- **Consistency**: Strong consistency +- **Cost**: Pay for storage and operations + +### S3 Provider +- **Latency**: Varies by provider +- **Throughput**: High for AWS S3 +- **Consistency**: Strong consistency (as of Dec 2020) +- **Cost**: Varies by provider (Minio is self-hosted) + +### Minio (Self-hosted) +- **Latency**: Very low (local network) +- **Throughput**: Limited by hardware +- **Consistency**: Strong consistency +- **Cost**: Infrastructure only + +## Testing + +### Local Testing with Minio + +```bash +# Start services +docker-compose up -d + +# Check Minio console +open http://localhost:9001 +# Login: minioadmin / minioadmin + +# View logs +docker-compose logs -f goblet + +# Test backup by adding a repository +git clone --mirror https://github.com/some/repo /tmp/test.git + +# Stop services +docker-compose down +``` + +### Unit Testing + +Mock the `storage.Provider` interface for testing: + +```go +type MockProvider struct { + mock.Mock +} + +func (m *MockProvider) Writer(ctx context.Context, path string) (io.WriteCloser, error) { + args := m.Called(ctx, 
path) + return args.Get(0).(io.WriteCloser), args.Error(1) +} + +// ... implement other methods +``` + +## Security Considerations + +1. **Credentials Management** + - Never commit credentials to source control + - Use environment variables or secrets management + - Rotate credentials regularly + +2. **Bucket Permissions** + - Principle of least privilege + - Separate buckets for different environments + - Enable versioning for production + +3. **Network Security** + - Use SSL/TLS for remote storage (s3_use_ssl=true) + - VPC endpoints for cloud storage + - Network policies for Kubernetes + +4. **Data Protection** + - Enable encryption at rest + - Use server-side encryption + - Implement lifecycle policies + +## Monitoring + +Key metrics to monitor: + +- **Backup Success Rate**: Percentage of successful backups +- **Backup Duration**: Time to complete backup cycle +- **Storage Size**: Total size of stored bundles +- **API Errors**: Storage provider error rates +- **Latency**: Read/write operation latency + +## Troubleshooting + +### Common Issues + +**Connection Refused (Minio):** +- Check Minio is running: `docker-compose ps` +- Verify endpoint configuration +- Check network connectivity + +**Authentication Failed (GCS):** +- Verify credentials: `gcloud auth application-default login` +- Check service account permissions +- Ensure storage.objects.* permissions + +**Authentication Failed (S3):** +- Verify access key and secret key +- Check IAM policy has s3:* permissions +- Verify bucket exists and region is correct + +**Slow Backups:** +- Check network bandwidth +- Monitor storage provider metrics +- Consider increasing backup frequency +- Verify no rate limiting + +### Debug Logging + +Enable verbose logging: +```bash +# Set log level +export GOBLET_LOG_LEVEL=debug + +# Run with debug flags +./goblet-server -storage_provider=s3 ... +``` + +## Future Enhancements + +Potential improvements to the storage architecture: + +1. 
**Azure Blob Storage**: Add Azure support +2. **Compression**: Compress bundles before upload +3. **Encryption**: Client-side encryption for sensitive repos +4. **Deduplication**: Share common objects across bundles +5. **Incremental Backups**: Only backup changed objects +6. **Parallel Uploads**: Upload multiple bundles concurrently +7. **Backup Verification**: Periodic integrity checks +8. **Backup Metrics**: Expose Prometheus metrics diff --git a/docs/architecture/storage-optimization.md b/docs/architecture/storage-optimization.md new file mode 100644 index 0000000..52067ee --- /dev/null +++ b/docs/architecture/storage-optimization.md @@ -0,0 +1,842 @@ +# Storage Cost Optimization for Goblet + +## Overview + +Git caches can grow to hundreds of GB per tenant. This document provides strategies to minimize storage costs while maintaining performance using cloud provider tiered storage. + +--- + +## Storage Cost Comparison (per TB/month, 2025) + +| Tier | AWS | GCP | Azure | Use Case | Access Time | +|------|-----|-----|-------|----------|-------------| +| **Hot** | $23 | $20 | $18 | Active repos | < 10ms | +| **Cool** | $10 | $10 | $10 | Recent repos | < 100ms | +| **Archive** | $1 | $1.20 | $0.99 | Old repos | Minutes-hours | +| **Cold Archive** | $0.36 | $0.40 | $0.18 | Compliance | Hours | + +**Cost Reduction:** Up to **98% savings** with proper tiering + +--- + +## Recommended Architecture + +### Three-Tier Strategy + +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ Hot Tier (NVMe SSD) β”‚ +β”‚ β€’ Last accessed: < 7 days β”‚ +β”‚ β€’ Cost: $20-23/TB/month β”‚ +β”‚ β€’ Access: < 10ms β”‚ +β”‚ β€’ Size: 10-20% of total β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ Automatic 
tiering (7 days)
+β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+β”‚ Cool Tier (HDD/S3) β”‚
+β”‚ β€’ Last accessed: 7-90 days β”‚
+β”‚ β€’ Cost: $10/TB/month β”‚
+β”‚ β€’ Access: < 100ms β”‚
+β”‚ β€’ Size: 30-50% of total β”‚
+β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+ β”‚ Automatic tiering (90 days)
+β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+β”‚ Archive Tier (Glacier/Coldline) β”‚
+β”‚ β€’ Last accessed: > 90 days β”‚
+β”‚ β€’ Cost: $1/TB/month β”‚
+β”‚ β€’ Access: Minutes-hours β”‚
+β”‚ β€’ Size: 30-60% of total β”‚
+β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
+```
+
+### Cost Savings Example
+
+**Scenario:** 1TB cache, 60% cold data
+
+| Storage Strategy | Cost/month | Annual Cost |
+|-----------------|------------|-------------|
+| All Hot (SSD) | $20 | $240 |
+| **Tiered** (40% hot Γ— $20/TB, 30% cool Γ— $10/TB, 30% archive Γ— $1/TB) | **$11.30** | **$135.60** |
+| **Savings** | **44%** | **$104.40** |
+
+---
+
+## AWS Implementation
+
+### Strategy: S3 Intelligent-Tiering + EBS
+
+#### Architecture
+
+```
+β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
+β”‚ EC2 Instance (Goblet) β”‚
+β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
+β”‚ β”‚ Active Cache (EBS gp3) β”‚ β”‚
+β”‚ β”‚ /cache/hot/ β”‚ β”‚
+β”‚ β”‚ Last 7 days: 200GB β”‚ β”‚
+β”‚ 
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ Sync every 1 hour +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ S3 Intelligent-Tiering Bucket β”‚ +β”‚ s3://goblet-cache-tenant-{id}/ β”‚ +β”‚ β”‚ +β”‚ Auto-tiering: β”‚ +β”‚ β€’ 0-30 days β†’ Frequent Access $23/TB β”‚ +β”‚ β€’ 30-90 days β†’ Infrequent $12.50/TB β”‚ +β”‚ β€’ 90+ days β†’ Archive $4/TB β”‚ +β”‚ β€’ 180+ days β†’ Deep Archive $1/TB β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +#### Implementation + +```yaml +# goblet-config.yaml +storage: + primary: + type: "ebs" + mount: "/cache/hot" + size_gb: 200 + volume_type: "gp3" # $0.08/GB/month = $16/month for 200GB + iops: 3000 + throughput_mbps: 125 + + tiering: + enabled: true + provider: "aws-s3" + + # S3 bucket with Intelligent-Tiering + s3: + bucket: "goblet-cache-${TENANT_ID}" + region: "us-east-1" + storage_class: "INTELLIGENT_TIERING" + + # Tiering rules + rules: + - name: "sync-to-s3" + condition: "age > 1 hour AND access_count = 0" + action: "upload" + delete_local: false + + - name: "evict-from-local" + condition: "age > 7 days" + action: "delete" + keep_in_s3: true + + - name: "restore-on-access" + condition: "cache_miss AND exists_in_s3" + action: "download" + priority: "high" +``` + +#### Terraform Configuration + +```hcl +# S3 bucket with Intelligent-Tiering +resource "aws_s3_bucket" "goblet_cache" { + for_each = var.tenants + + bucket = "goblet-cache-${each.key}" + + tags = { + Tenant = each.key + Purpose = "git-cache" + } +} + +resource "aws_s3_bucket_intelligent_tiering_configuration" "goblet_cache" { + for_each 
= var.tenants + + bucket = aws_s3_bucket.goblet_cache[each.key].id + name = "EntireCache" + + tiering { + access_tier = "ARCHIVE_ACCESS" + days = 90 + } + + tiering { + access_tier = "DEEP_ARCHIVE_ACCESS" + days = 180 + } +} + +resource "aws_s3_bucket_lifecycle_configuration" "goblet_cache" { + for_each = var.tenants + + bucket = aws_s3_bucket.goblet_cache[each.key].id + + rule { + id = "abort-incomplete-uploads" + status = "Enabled" + + abort_incomplete_multipart_upload { + days_after_initiation = 7 + } + } + + rule { + id = "delete-old-versions" + status = "Enabled" + + noncurrent_version_expiration { + noncurrent_days = 30 + } + } +} + +# EBS volume for hot cache +resource "aws_ebs_volume" "goblet_hot_cache" { + for_each = var.goblet_instances + + availability_zone = each.value.az + size = 200 # GB + type = "gp3" + iops = 3000 + throughput = 125 + encrypted = true + kms_key_id = aws_kms_key.goblet_cache.arn + + tags = { + Name = "goblet-hot-cache-${each.key}" + Tier = "hot" + } +} +``` + +#### Cost Breakdown + +``` +Hot Cache (EBS gp3): 200GB Γ— $0.08/GB = $16/month +S3 Intelligent-Tiering: + - 400GB Γ— $0.023/GB (frequent, 0-30 days) = $9.20 + - 300GB Γ— $0.0125/GB (infrequent, 30-90 days) = $3.75 + - 100GB Γ— $0.004/GB (archive, 90+ days) = $0.40 + +Total: $29.35/month for 1TB (vs $80 all-EBS) +Savings: 63% +``` + +--- + +## GCP Implementation + +### Strategy: Persistent Disk + Cloud Storage Autoclass + +#### Architecture + +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ GKE Node (Goblet Pod) β”‚ +β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ +β”‚ β”‚ Active Cache (SSD PD) β”‚ β”‚ +β”‚ β”‚ /cache/hot/ β”‚ β”‚ +β”‚ β”‚ Last 7 days: 200GB β”‚ β”‚ +β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ 
+β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ Sync with Cloud Storage Fuse (gcsfuse) +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ Cloud Storage Autoclass Bucket β”‚ +β”‚ gs://goblet-cache-tenant-{id}/ β”‚ +β”‚ β”‚ +β”‚ Auto-tiering: β”‚ +β”‚ β€’ Frequent Access β†’ Standard $20/TB β”‚ +β”‚ β€’ Infrequent β†’ Nearline $10/TB β”‚ +β”‚ β€’ Archive β†’ Coldline $4/TB β”‚ +β”‚ β€’ Deep Archive β†’ Archive $1.20/TB β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +#### Implementation + +```yaml +# goblet-gcp-config.yaml +storage: + primary: + type: "gcp-persistent-disk" + mount: "/cache/hot" + size_gb: 200 + disk_type: "pd-ssd" # $0.17/GB/month = $34/month + + tiering: + enabled: true + provider: "gcp-gcs" + + gcs: + bucket: "goblet-cache-${TENANT_ID}" + location: "us-central1" + storage_class: "AUTOCLASS" # Automatic tiering + + # Mount GCS as filesystem using gcsfuse + gcsfuse: + enabled: true + mount: "/cache/cold" + cache_max_size_mb: 1024 # Local cache for GCS data + stat_cache_ttl: "1h" + + rules: + - name: "sync-to-gcs" + condition: "age > 6 hours" + action: "upload" + delete_local: false + + - name: "evict-from-pd" + condition: "age > 7 days" + action: "delete" + keep_in_gcs: true + + - name: "lazy-load" + condition: "cache_miss" + action: "mount" # Access via gcsfuse, auto-download +``` + +#### Terraform Configuration + +```hcl +# GCS bucket with Autoclass +resource "google_storage_bucket" "goblet_cache" { + for_each = var.tenants + + name = "goblet-cache-${each.key}" + location = "US" + storage_class = "STANDARD" # Autoclass starts here + + autoclass { + enabled = true + } + + lifecycle_rule { + condition { + age = 180 + } + action { + type = 
"SetStorageClass" + storage_class = "ARCHIVE" + } + } + + lifecycle_rule { + condition { + age = 365 + with_state = "ARCHIVED" + } + action { + type = "Delete" + } + } + + encryption { + default_kms_key_name = google_kms_crypto_key.goblet_cache.id + } +} + +# Persistent disk for hot cache +resource "google_compute_disk" "goblet_hot_cache" { + for_each = var.goblet_instances + + name = "goblet-hot-cache-${each.key}" + type = "pd-ssd" + zone = each.value.zone + size = 200 # GB + + disk_encryption_key { + kms_key_self_link = google_kms_crypto_key.goblet_cache.id + } + + labels = { + tier = "hot" + tenant = each.key + } +} + +# Kubernetes PVC using the disk +resource "kubernetes_persistent_volume_claim" "goblet_hot_cache" { + for_each = var.goblet_instances + + metadata { + name = "goblet-hot-cache" + namespace = "tenant-${each.key}" + } + + spec { + access_modes = ["ReadWriteOnce"] + resources { + requests = { + storage = "200Gi" + } + } + storage_class_name = "ssd-retain" + } +} +``` + +#### Cost Breakdown + +``` +Hot Cache (PD-SSD): 200GB Γ— $0.17/GB = $34/month +GCS Autoclass: 800GB average across tiers + - 300GB Γ— $0.020/GB (standard) = $6.00 + - 300GB Γ— $0.010/GB (nearline) = $3.00 + - 200GB Γ— $0.004/GB (coldline) = $0.80 + +Total: $43.80/month for 1TB (vs $170 all-SSD) +Savings: 74% +``` + +--- + +## Azure Implementation + +### Strategy: Premium SSD + Blob Storage with Access Tiers + +#### Architecture + +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ AKS Node (Goblet Pod) β”‚ +β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ +β”‚ β”‚ Active Cache (Premium SSD) β”‚ β”‚ +β”‚ β”‚ /cache/hot/ β”‚ β”‚ +β”‚ β”‚ Last 7 days: 200GB β”‚ β”‚ +β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ 
+β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ Sync with Blob Storage using Blobfuse2 +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ Azure Blob Storage (Lifecycle Management) β”‚ +β”‚ container: goblet-cache-tenant-{id} β”‚ +β”‚ β”‚ +β”‚ Auto-tiering: β”‚ +β”‚ β€’ 0-30 days β†’ Hot $18/TB β”‚ +β”‚ β€’ 30-90 days β†’ Cool $10/TB β”‚ +β”‚ β€’ 90+ days β†’ Archive $0.99/TB β”‚ +β”‚ β€’ 180+ days β†’ Cold Archive (opt) $0.18/TB β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +#### Implementation + +```yaml +# goblet-azure-config.yaml +storage: + primary: + type: "azure-disk" + mount: "/cache/hot" + size_gb: 200 + sku: "Premium_LRS" # $0.128/GB/month = $25.60/month + + tiering: + enabled: true + provider: "azure-blob" + + blob: + storage_account: "gobletcache${TENANT_ID}" + container: "cache" + access_tier: "Hot" # Initial tier, will auto-tier + + # Mount using Blobfuse2 + blobfuse: + enabled: true + mount: "/cache/cold" + tmp_path: "/mnt/blobfuse-tmp" + cache_size_mb: 1024 + + rules: + - name: "sync-to-blob" + condition: "age > 12 hours" + action: "upload" + access_tier: "Hot" + + - name: "tier-to-cool" + condition: "age > 30 days" + action: "change_tier" + access_tier: "Cool" + + - name: "tier-to-archive" + condition: "age > 90 days" + action: "change_tier" + access_tier: "Archive" + + - name: "evict-from-disk" + condition: "age > 7 days" + action: "delete" + keep_in_blob: true + + - name: "rehydrate-on-access" + condition: "cache_miss AND tier = Archive" + action: "rehydrate" + priority: "Standard" # or "High" for faster (more expensive) +``` + +#### Terraform Configuration + +```hcl +# Storage account +resource "azurerm_storage_account" "goblet_cache" { + 
for_each = var.tenants + + name = "gobletcache${replace(each.key, "-", "")}" + resource_group_name = azurerm_resource_group.goblet.name + location = azurerm_resource_group.goblet.location + account_tier = "Standard" + account_replication_type = "LRS" + + blob_properties { + versioning_enabled = true + + # Lifecycle management + lifecycle_management { + rule { + name = "tier-to-cool" + enabled = true + + filters { + blob_types = ["blockBlob"] + prefix_match = ["cache/"] + } + + actions { + base_blob { + tier_to_cool_after_days_since_modification = 30 + tier_to_archive_after_days_since_modification = 90 + delete_after_days_since_modification = 365 + } + } + } + } + } + + tags = { + Tenant = each.key + } +} + +# Container +resource "azurerm_storage_container" "goblet_cache" { + for_each = var.tenants + + name = "cache" + storage_account_name = azurerm_storage_account.goblet_cache[each.key].name + container_access_type = "private" +} + +# Managed disk for hot cache +resource "azurerm_managed_disk" "goblet_hot_cache" { + for_each = var.goblet_instances + + name = "goblet-hot-cache-${each.key}" + location = azurerm_resource_group.goblet.location + resource_group_name = azurerm_resource_group.goblet.name + storage_account_type = "Premium_LRS" + create_option = "Empty" + disk_size_gb = 200 + + encryption_settings { + enabled = true + disk_encryption_key { + secret_url = azurerm_key_vault_secret.disk_encryption_key.id + source_vault_id = azurerm_key_vault.goblet.id + } + } + + tags = { + tier = "hot" + tenant = each.key + } +} + +# Kubernetes PVC +resource "kubernetes_persistent_volume_claim" "goblet_hot_cache" { + for_each = var.goblet_instances + + metadata { + name = "goblet-hot-cache" + namespace = "tenant-${each.key}" + } + + spec { + access_modes = ["ReadWriteOnce"] + resources { + requests = { + storage = "200Gi" + } + } + storage_class_name = "managed-premium-retain" + } +} +``` + +#### Cost Breakdown + +``` +Hot Cache (Premium SSD): 200GB Γ— $0.128/GB = 
$25.60/month
+Blob Storage: 800GB across tiers
+ - 300GB Γ— $0.018/GB (hot, 0-30 days) = $5.40
+ - 300GB Γ— $0.010/GB (cool, 30-90 days) = $3.00
+ - 200GB Γ— $0.00099/GB (archive, 90+ days) = $0.20
+
+Total: $34.20/month for 1TB (vs $128 all-Premium)
+Savings: 73%
+```
+
+---
+
+## Comparison Matrix
+
+### Cost Comparison (1TB cache over 1 year)
+
+| Provider | All Hot | Tiered | Savings |
+|----------|---------|--------|---------|
+| AWS | $960 | $352 | **$608 (63%)** |
+| GCP | $2,040 | $526 | **$1,514 (74%)** |
+| Azure | $1,536 | $410 | **$1,126 (73%)** |
+
+**Winner: AWS** (lowest tiered cost in this scenario)
+
+### Performance Comparison
+
+| Metric | AWS | GCP | Azure |
+|--------|-----|-----|-------|
+| Hot tier latency | 5ms (gp3) | 3ms (SSD) | 4ms (Premium) |
+| Cool tier latency | 50ms (S3) | 40ms (GCS) | 60ms (Blob) |
+| Archive restore | 3-5 hours | 12 hours | 15 hours |
+| Throughput (hot) | 125MB/s | 120MB/s | 120MB/s |
+
+**Winner: GCP** (lowest latency for cool tier)
+
+### Feature Comparison
+
+| Feature | AWS | GCP | Azure |
+|---------|-----|-----|-------|
+| Automatic tiering | βœ… Intelligent-Tiering | βœ… Autoclass | ⚠️ Manual lifecycle |
+| FUSE mounting | ⚠️ s3fs (3rd party) | βœ… gcsfuse (official) | βœ… Blobfuse2 (official) |
+| Encryption | βœ… KMS | βœ… KMS | βœ… Key Vault |
+| Multi-region | βœ… S3 Replication | βœ… Dual-region | βœ… GRS/RA-GRS |
+| Cost explorer | βœ… Excellent | βœ… Good | ⚠️ Basic |
+
+**Winner: AWS** (best automation and tooling)
+
+---
+
+## Hybrid Strategy: Multi-Cloud Cost Optimization
+
+### Recommended Approach
+
+Use the cheapest storage for each tier across providers:
+
+```
+Hot Tier: GCP Persistent Disk SSD ($34/month for 200GB)
+ └─ Lowest hot-tier latency
+
+Cool Tier: Azure Blob Cool ($3/month for 300GB)
+ └─ Best cool tier pricing
+
+Archive: AWS S3 Deep Archive ($0.07/month for 200GB at $0.36/TB)
+ └─ Cheap long-term storage
+```
+
+**Total hybrid cost:** $37.07/month for 700GB actively managed cache
+
+**Challenges:**
+
+- 
Complexity of multi-cloud orchestration
+- Data transfer costs between providers
+- Operational overhead
+
+**Verdict:** Only for very large deployments (100+ TB)
+
+---
+
+## Best Practices
+
+### 1. Access Pattern Analysis
+
+```bash
+# Analyze cache access patterns
+./scripts/analyze-access-patterns.sh /cache
+
+# Output:
+# Repository Access Report (Last 90 days):
+# github.com/acme/app: 1,234 accesses (hot)
+# github.com/acme/lib: 45 accesses (cool)
+# github.com/acme/archive: 2 accesses (archive candidate)
+```
+
+### 2. Tiering Policy Configuration
+
+```yaml
+# Customize based on your access patterns
+tiering:
+  policies:
+    - name: "frequently-accessed"
+      condition: "access_count > 10/week"
+      tier: "hot"
+      cost_optimized: false
+
+    - name: "occasionally-accessed"
+      condition: "access_count >= 1/week AND access_count <= 10/week"
+      tier: "cool"
+      cost_optimized: true
+
+    - name: "rarely-accessed"
+      condition: "access_count < 1/week"
+      tier: "archive"
+      cost_optimized: true
+      rehydration: "standard" # 15-hour restore
+
+    - name: "compliance-only"
+      condition: "age > 365 days"
+      tier: "cold-archive"
+      cost_optimized: true
+      rehydration: "bulk" # 48-hour restore
+```
+
+### 3. 
Cache Warming + +```go +// Pre-warm cache for known access patterns +func (c *CacheManager) WarmCache(ctx context.Context, repos []string) error { + for _, repoURL := range repos { + // Check current tier + tier, err := c.storage.GetTier(repoURL) + if err != nil { + return err + } + + // Rehydrate if archived + if tier == "archive" || tier == "cold-archive" { + log.Printf("Rehydrating %s (currently in %s)", repoURL, tier) + if err := c.storage.Rehydrate(repoURL, "expedited"); err != nil { + return err + } + } + + // Move to hot tier + if err := c.storage.SetTier(repoURL, "hot"); err != nil { + return err + } + } + + return nil +} + +// Example: Warm cache before business hours +func (c *CacheManager) ScheduledWarmup() { + // Daily at 6 AM + cron.Schedule("0 6 * * *", func() { + repos := c.getFrequentlyAccessedRepos() + c.WarmCache(context.Background(), repos) + }) +} +``` + +### 4. Cost Monitoring + +```go +type StorageCostTracker struct { + provider string + tenantID string + prometheus *prometheus.Client +} + +func (s *StorageCostTracker) TrackCosts() { + // Hot tier cost + hotSize := s.getSize("hot") + hotCost := hotSize * s.getPricing("hot") + s.prometheus.RecordCost("hot", hotCost, s.tenantID) + + // Cool tier cost + coolSize := s.getSize("cool") + coolCost := coolSize * s.getPricing("cool") + s.prometheus.RecordCost("cool", coolCost, s.tenantID) + + // Archive tier cost + archiveSize := s.getSize("archive") + archiveCost := archiveSize * s.getPricing("archive") + s.prometheus.RecordCost("archive", archiveCost, s.tenantID) + + // Data transfer cost + transferCost := s.getTransferCost() + s.prometheus.RecordCost("transfer", transferCost, s.tenantID) + + // Total + totalCost := hotCost + coolCost + archiveCost + transferCost + s.prometheus.RecordCost("total", totalCost, s.tenantID) +} +``` + +--- + +## Recommendations by Scale + +### Small (< 100GB, < 1000 req/day) + +**Recommendation:** All-hot storage (simplest) + +- AWS: EBS gp3 +- GCP: Persistent Disk SSD +- 
Azure: Premium SSD
+
+**Why:** Tiering overhead not worth it at this scale
+
+---
+
+### Medium (100GB - 1TB, 1000-10000 req/day)
+
+**Recommendation:** Hot + Cool tiering
+
+- **AWS:** EBS gp3 (hot) + S3 Intelligent-Tiering
+- **GCP:** PD-SSD (hot) + GCS Autoclass
+- **Azure:** Premium SSD (hot) + Blob Cool
+
+**Savings:** 50-70%
+
+---
+
+### Large (1TB - 10TB, > 10000 req/day)
+
+**Recommendation:** Hot + Cool + Archive
+
+- **AWS:** EBS gp3 (hot, 200GB) + S3 Intelligent-Tiering (warm) + S3 Glacier (archive)
+- **GCP:** PD-SSD (hot, 200GB) + GCS Nearline (warm) + GCS Coldline (archive)
+- **Azure:** Premium SSD (hot, 200GB) + Blob Cool (warm) + Blob Archive
+
+**Savings:** 70-85%
+
+---
+
+### Enterprise (> 10TB, > 100000 req/day)
+
+**Recommendation:** Hot + Cool + Archive + Cold Archive + Multi-region
+
+- **AWS:** EBS io2 Block Express (ultra-hot) + gp3 (hot) + S3 INT (warm) + Glacier (archive) + Deep Archive (cold)
+- **GCP:** Local SSD (ultra-hot) + PD-SSD (hot) + GCS Standard (warm) + Coldline (archive) + Archive (cold)
+- **Azure:** Ultra Disk (ultra-hot) + Premium SSD (hot) + Blob Hot (warm) + Cool (archive) + Archive (cold) + Cold Archive (long-term)
+
+**Additional:** CDN for frequently accessed public repos
+
+**Savings:** 80-95%
+
+---
+
+## Summary
+
+**Recommended Providers by Priority:**
+
+1. **AWS** - Best automation (Intelligent-Tiering), great tooling, lowest tiered cost in the example scenario
+2. **Azure** - Lowest per-TB list prices for the hot and archive tiers
+3. **GCP** - Best performance (gcsfuse), good auto-tiering
+
+**Key Takeaways:**
+
+- βœ… Tiering can save **60-95%** on storage costs
+- βœ… Most repos accessed < once/week (ideal for archival)
+- βœ… Automatic tiering (AWS/GCP) reduces operational overhead
+- βœ… Monitor access patterns to optimize tier placement
+
+**Action Items:**
+
+1. Analyze current access patterns
+2. Choose provider based on existing infrastructure
+3. Implement hot + cool tiers initially
+4. Add archive tier after 90 days of data
+5. 
Monitor costs and adjust policies diff --git a/docs/operations/deployment-patterns.md b/docs/operations/deployment-patterns.md new file mode 100644 index 0000000..14168aa --- /dev/null +++ b/docs/operations/deployment-patterns.md @@ -0,0 +1,628 @@ +# Deployment Patterns + +This guide describes proven deployment patterns for Goblet based on your scale and requirements. + +## Pattern Selection + +Choose a deployment pattern based on your needs: + +| Pattern | Best For | Isolation | Complexity | Cost | +|---------|----------|-----------|------------|------| +| [Single Instance](#single-instance) | Development, < 1K req/day | N/A | Low | $ | +| [Sidecar](#sidecar-pattern) | Multi-tenant, CI/CD | Perfect | Low | $$ | +| [Namespace](#namespace-isolation) | Enterprise, compliance | High | Medium | $$$ | +| [Sharded](#sharded-cluster) | High traffic > 10K req/day | Good | High | $$$$ | + +## Single Instance + +### Overview + +One Goblet instance serves all requests. Suitable for development or single-tenant production use. 
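
The selection table above can also be expressed as a small helper, which is handy when tenant provisioning is automated. A sketch with illustrative thresholds taken from the table; the function and its names are not part of Goblet:

```go
package main

import "fmt"

// choosePattern encodes the pattern-selection table: compliance needs push
// toward namespace isolation, very high traffic toward a sharded cluster,
// and multi-tenant workloads default to the sidecar pattern.
func choosePattern(reqPerDay int, multiTenant, compliance bool) string {
	switch {
	case compliance:
		return "namespace" // enterprise / compliance requirements
	case reqPerDay > 10000:
		return "sharded" // high traffic (> 10K req/day)
	case multiTenant:
		return "sidecar" // recommended multi-tenant default
	case reqPerDay < 1000:
		return "single" // development / low traffic
	default:
		return "sidecar"
	}
}

func main() {
	fmt.Println(choosePattern(500, false, false))   // single
	fmt.Println(choosePattern(5000, true, false))   // sidecar
	fmt.Println(choosePattern(50000, false, false)) // sharded
}
```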
+ +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ Clients β”‚ +β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜ + β”‚ +β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β” +β”‚ Goblet β”‚ +β”‚ Instance β”‚ +β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β” + β”‚ Cache β”‚ + β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +### When to Use + +- Development and testing +- Single user or service account +- Public repositories only +- Low traffic (< 1,000 requests/day) + +### Deployment + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: goblet +spec: + replicas: 1 + template: + spec: + containers: + - name: goblet + image: goblet:latest + ports: + - containerPort: 8080 + volumeMounts: + - name: cache + mountPath: /cache + volumes: + - name: cache + persistentVolumeClaim: + claimName: goblet-cache +--- +apiVersion: v1 +kind: Service +metadata: + name: goblet +spec: + selector: + app: goblet + ports: + - port: 80 + targetPort: 8080 +``` + +### Scaling Limits + +- **Throughput:** 500-1,000 requests/second +- **Concurrent users:** 100-500 +- **Cache size:** 100GB-1TB +- **Single point of failure** + +## Sidecar Pattern + +### Overview + +Each workload gets its own Goblet instance as a sidecar container. Provides perfect isolation with minimal configuration. 
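
Because the sidecar shares the pod's network namespace, the deployment manifest below wires the workload to the cache purely through the standard `HTTP_PROXY`/`HTTPS_PROXY` variables. A minimal Go sketch of what that resolution does for any proxy-aware client (the port and repository URL are illustrative):

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
)

// sidecarProxy shows how a proxy-aware client resolves the localhost sidecar
// once HTTPS_PROXY is set; git and terraform honor the same variable.
func sidecarProxy() (*url.URL, error) {
	os.Setenv("HTTPS_PROXY", "http://localhost:8080")

	req, err := http.NewRequest("GET", "https://github.com/example/repo.git/info/refs", nil)
	if err != nil {
		return nil, err
	}
	// net/http consults HTTPS_PROXY exactly as external tools do.
	return http.ProxyFromEnvironment(req)
}

func main() {
	proxyURL, err := sidecarProxy()
	if err != nil {
		panic(err)
	}
	fmt.Println(proxyURL) // http://localhost:8080
}
```

Note that `ProxyFromEnvironment` caches the environment on first use, so the variables must be set before the first request is made in the process.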
+ +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ Pod (Workload) β”‚ +β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ +β”‚ β”‚ App β”‚ β”‚ Goblet β”‚ β”‚ +β”‚ β”‚Container │──│ Sidecar β”‚ β”‚ +β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜ β”‚ +β”‚ β”‚ β”‚ +β”‚ β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β” β”‚ +β”‚ β”‚ Cache β”‚ β”‚ +β”‚ β”‚ (emptyDir)β”‚ β”‚ +β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +### When to Use + +- βœ… **Recommended default for multi-tenant deployments** +- Multiple users with different access permissions +- Terraform Cloud, security scanning +- CI/CD runners +- Kubernetes-native environments + +### Benefits + +- **Perfect isolation:** Each workload has dedicated cache +- **No shared state:** Eliminates cross-tenant risks +- **Simple scaling:** Add pods for more capacity +- **Zero network latency:** Localhost communication +- **No code changes:** Deploy with existing Goblet + +### Deployment + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: terraform-agent +spec: + replicas: 10 # Scale as needed + template: + spec: + containers: + # Main application + - name: terraform-agent + image: terraform:latest + env: + - name: HTTP_PROXY + value: "http://localhost:8080" + - name: HTTPS_PROXY + value: "http://localhost:8080" + + # Goblet sidecar + - name: goblet-cache + image: goblet:latest + ports: + - containerPort: 8080 + volumeMounts: + - name: cache + mountPath: /cache + resources: + requests: + cpu: 500m + memory: 1Gi + limits: + cpu: 1 + memory: 2Gi + + volumes: + - name: cache + emptyDir: + sizeLimit: 10Gi +``` + +### Auto-Scaling + +```yaml +apiVersion: autoscaling/v2 +kind: HorizontalPodAutoscaler +metadata: + name: terraform-agent-hpa +spec: + scaleTargetRef: + 
apiVersion: apps/v1 + kind: Deployment + name: terraform-agent + minReplicas: 10 + maxReplicas: 100 + metrics: + - type: Resource + resource: + name: cpu + target: + type: Utilization + averageUtilization: 70 +``` + +### Cost Analysis + +**Example:** 100 pods for 1M requests/month +- Per pod: ~10,000 requests/month +- CPU: 50m average, 500m burst +- Memory: 1GB +- Cache: 10GB per pod +- **Total cost:** ~$155/month (varies by provider) + +### Capacity Planning + +| Pods | Requests/Month | Cost/Month | Use Case | +|------|----------------|------------|----------| +| 10 | 100K | $15 | Small team | +| 50 | 500K | $75 | Growing team | +| 100 | 1M | $155 | Enterprise | +| 500 | 5M | $775 | Large scale | + +## Namespace Isolation + +### Overview + +Separate Goblet deployments per tenant in isolated Kubernetes namespaces with network policies. + +``` +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ Namespace: tenant-acme β”‚ +β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ +β”‚ β”‚ Goblet │───│ Network β”‚ β”‚ +β”‚ β”‚ Deploy β”‚ β”‚ Policy β”‚ β”‚ +β”‚ β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ +β”‚ β”‚ β”‚ +β”‚ β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β” β”‚ +β”‚ β”‚ Cache β”‚ β”‚ +β”‚ β”‚ (PVC) β”‚ β”‚ +β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + β”Ό +β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” +β”‚ Namespace: tenant-bigcorp β”‚ +β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ +β”‚ β”‚ Goblet │───│ Network β”‚ β”‚ +β”‚ β”‚ Deploy β”‚ β”‚ Policy β”‚ β”‚ +β”‚ β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ +β”‚ β”‚ β”‚ +β”‚ β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β” β”‚ +β”‚ β”‚ Cache β”‚ 
β”‚ +β”‚ β”‚ (PVC) β”‚ β”‚ +β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +### When to Use + +- Enterprise multi-tenant deployments +- Compliance requirements (SOC 2, ISO 27001) +- Strong isolation needed +- Different SLAs per tenant +- Resource quotas per tenant + +### Deployment + +```yaml +# Create namespace per tenant +apiVersion: v1 +kind: Namespace +metadata: + name: tenant-acme-corp + labels: + tenant: acme-corp +--- +# Network policy for isolation +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: goblet-isolation + namespace: tenant-acme-corp +spec: + podSelector: + matchLabels: + app: goblet + policyTypes: + - Ingress + - Egress + ingress: + # Only from same namespace + - from: + - namespaceSelector: + matchLabels: + tenant: acme-corp + ports: + - port: 8080 + egress: + # DNS, KMS, upstream only + - to: + - namespaceSelector: + matchLabels: + name: kube-system + ports: + - port: 53 + protocol: UDP +--- +# Resource quota per tenant +apiVersion: v1 +kind: ResourceQuota +metadata: + name: tenant-quota + namespace: tenant-acme-corp +spec: + hard: + requests.cpu: "10" + requests.memory: "20Gi" + persistentvolumeclaims: "10" +--- +# Goblet deployment +apiVersion: apps/v1 +kind: Deployment +metadata: + name: goblet + namespace: tenant-acme-corp +spec: + replicas: 3 + template: + spec: + containers: + - name: goblet + image: goblet:latest + volumeMounts: + - name: cache + mountPath: /cache + volumes: + - name: cache + persistentVolumeClaim: + claimName: goblet-cache-acme-corp +``` + +### Management Script + +```bash +#!/bin/bash +# deploy-tenant.sh + +TENANT=$1 + +kubectl create namespace tenant-$TENANT +kubectl label namespace tenant-$TENANT tenant=$TENANT + +# Apply network policy +kubectl apply -f network-policy.yaml -n tenant-$TENANT + +# Apply resource quota +kubectl apply -f resource-quota.yaml -n tenant-$TENANT 
+ +# Deploy goblet +kubectl apply -f goblet-deployment.yaml -n tenant-$TENANT + +echo "Tenant $TENANT deployed successfully" +``` + +## Sharded Cluster + +### Overview + +Multiple Goblet instances with load balancer using consistent hashing to route requests. + +``` + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ Load Balancer β”‚ + β”‚(Consistent β”‚ + β”‚ Hash on URL) β”‚ + β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜ + β”‚ + β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” + β”‚ β”‚ β”‚ +β”Œβ”€β”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β”€β”€β–Όβ”€β”€β”€β” +β”‚Goblet β”‚ β”‚Goblet β”‚ β”‚Goblet β”‚ +β”‚ -1 β”‚ β”‚ -2 β”‚ β”‚ -3 β”‚ +β””β”€β”€β”€β”¬β”€β”€β”€β”˜ β””β”€β”€β”€β”¬β”€β”€β”€β”˜ β””β”€β”€β”€β”¬β”€β”€β”€β”˜ + β”‚ β”‚ β”‚ +β”Œβ”€β”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β”€β”€β–Όβ”€β”€β”€β” +β”‚Cache-1β”‚ β”‚Cache-2β”‚ β”‚Cache-3β”‚ +β””β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”˜ +``` + +### When to Use + +- High traffic (> 10,000 requests/day) +- Need high availability +- Want to share cache across team +- Have operational expertise + +### Load Balancer Configuration + +``` +# HAProxy config +backend goblet_shards + balance uri whole + hash-type consistent + + # Route same repo to same instance + server goblet-1 10.0.1.1:8080 check + server goblet-2 10.0.1.2:8080 check + server goblet-3 10.0.1.3:8080 check +``` + +### Deployment + +```yaml +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: goblet +spec: + serviceName: goblet + replicas: 3 + template: + spec: + containers: + - name: goblet + image: goblet:latest + volumeMounts: + - name: cache + mountPath: /cache + volumeClaimTemplates: + - metadata: + name: cache + spec: + accessModes: ["ReadWriteOnce"] + resources: + requests: + storage: 100Gi +``` + +### Scaling Considerations + +**Adding a node:** +```bash +# Gradually increases StatefulSet replicas +kubectl scale statefulset goblet 
--replicas=4 + +# HAProxy automatically includes new instance +# Some repositories will migrate to new instance +``` + +**Removing a node:** +```bash +# Drain node gracefully +kubectl drain node-4 --ignore-daemonsets + +# Scale down +kubectl scale statefulset goblet --replicas=3 + +# Repositories redistribute to remaining instances +``` + +## Hybrid Patterns + +### Sidecar + Namespace + +Combine sidecar pattern with namespace isolation for maximum security: + +```yaml +# Each tenant gets own namespace +# Each workload in namespace gets sidecar +# Network policy enforces namespace boundary +``` + +**Best for:** Enterprise SaaS platforms + +### Sharded + Sidecar + +Use sharding for shared resources, sidecar for user workloads: + +``` +Shared Infrastructure (sharded): + β”œβ”€ Common public repositories + └─ Terraform modules + +User Workloads (sidecar): + β”œβ”€ Private repositories + └─ User-specific caches +``` + +**Best for:** Hybrid cloud/on-premise deployments + +## Migration Paths + +### From Single Instance to Sidecar + +```bash +# 1. Deploy sidecar pattern in new namespace +kubectl create ns goblet-v2 +kubectl apply -f sidecar-deployment.yaml -n goblet-v2 + +# 2. Gradually migrate workloads +kubectl label namespace app-team-1 goblet-version=v2 + +# 3. Monitor both versions +kubectl logs -l app=goblet -n goblet-v1 +kubectl logs -l app=goblet -n goblet-v2 + +# 4. 
Decommission old instance when ready
kubectl delete deployment goblet -n goblet-v1
```

### From Sidecar to Namespace

```bash
# Create tenant namespaces
for tenant in acme bigcorp startup; do
  kubectl create ns tenant-$tenant
  kubectl apply -f tenant-deployment.yaml -n tenant-$tenant
done

# Migrate workloads namespace by namespace by re-applying each
# tenant's application manifests into its new namespace, e.g.:
kubectl apply -f acme-workloads.yaml -n tenant-acme
```

## Monitoring Deployments

### Key Metrics by Pattern

| Pattern | Key Metrics |
|---------|-------------|
| Single Instance | Request rate, cache hit rate, disk usage |
| Sidecar | Pods running, cache size per pod, memory usage |
| Namespace | Quota utilization, cross-namespace calls (should be 0) |
| Sharded | Load distribution, rebalancing events |

### Alerting Rules

```yaml
# Prometheus alerting rules
groups:
- name: goblet
  rules:
  # Low cache hit rate
  - alert: LowCacheHitRate
    expr: rate(cache_hits_total[5m]) / rate(requests_total[5m]) < 0.5
    for: 10m

  # High error rate
  - alert: HighErrorRate
    expr: rate(errors_total[5m]) / rate(requests_total[5m]) > 0.05
    for: 5m

  # Disk space low
  - alert: LowDiskSpace
    expr: disk_usage_bytes / disk_capacity_bytes > 0.9
    for: 5m
```

## Best Practices

### General

1. **Start simple:** Use the sidecar pattern unless specific needs require alternatives
2. **Monitor first:** Instrument before scaling
3. **Test isolation:** Verify cross-tenant access fails
4. **Document decisions:** Record why you chose a pattern

### Sidecar Pattern

1. Set appropriate `emptyDir` size limits
2. Use resource requests/limits
3. Configure HPA for auto-scaling
4. Monitor per-pod cache hit rates

### Namespace Isolation

1. Use NetworkPolicy to enforce boundaries
2. Set ResourceQuota per namespace
3. Monitor quota utilization
4. Audit cross-namespace access

### Sharded Cluster

1. Use consistent hashing in the load balancer
2. Monitor load distribution
3. Plan shard additions carefully
4. 
Test failover scenarios + +## Troubleshooting + +### Sidecar Not Starting + +```bash +# Check container logs +kubectl logs pod-name -c goblet-cache + +# Check events +kubectl describe pod pod-name + +# Common issues: +# - Resource limits too low +# - Volume mount permissions +# - Image pull errors +``` + +### High Memory Usage + +```bash +# Check cache size +kubectl exec pod-name -c goblet-cache -- du -sh /cache + +# Reduce cache size limit +# Edit deployment: emptyDir.sizeLimit +``` + +### Cross-Tenant Access + +```bash +# Test isolation +./test-isolation.sh tenant-a tenant-b + +# If test fails: +# - Verify NetworkPolicy applied +# - Check namespace labels +# - Review RBAC rules +``` + +## Summary + +**Quick Decision Guide:** + +- **Starting out?** β†’ Sidecar Pattern +- **Enterprise compliance?** β†’ Namespace Isolation +- **High traffic (> 10K req/day)?** β†’ Sharded Cluster +- **Development only?** β†’ Single Instance + +**Next Steps:** + +1. Review your requirements +2. Choose a pattern +3. Deploy to dev/staging +4. Monitor and validate +5. Deploy to production + +For detailed implementation, see example configurations in [`examples/`](../../examples/). diff --git a/docs/operations/monitoring.md b/docs/operations/monitoring.md new file mode 100644 index 0000000..9d33f0c --- /dev/null +++ b/docs/operations/monitoring.md @@ -0,0 +1,199 @@ +# Monitoring Guide + +Monitor Goblet's performance, health, and security with Prometheus metrics and alerting. 
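

Goblet exposes metrics in the Prometheus text format, so a minimal scrape job is enough to get started. The job name, scrape interval, and target address below are illustrative assumptions, not shipped defaults:

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: goblet
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:8080"]
```
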
+ +## Quick Start + +```bash +# View metrics +curl http://localhost:8080/metrics + +# Access Prometheus (if using load test environment) +open http://localhost:9090 + +# Access Grafana +open http://localhost:3000 +``` + +## Key Metrics + +### Performance Metrics + +**Cache Hit Rate:** +```promql +rate(cache_hits_total[5m]) / rate(requests_total[5m]) +``` +- Target: > 80% +- Warning: < 70% +- Critical: < 50% + +**Request Latency (P95):** +```promql +histogram_quantile(0.95, rate(request_duration_seconds_bucket[5m])) +``` +- Good: < 100ms +- Acceptable: 100-500ms +- Poor: > 500ms + +**Error Rate:** +```promql +rate(errors_total[5m]) / rate(requests_total[5m]) +``` +- Target: < 1% +- Warning: > 5% +- Critical: > 10% + +### Resource Metrics + +**Disk Usage:** +```promql +disk_usage_bytes / disk_capacity_bytes +``` +- Warning: > 80% +- Critical: > 90% + +**Memory Usage:** +```promql +container_memory_usage_bytes{container="goblet"} +``` + +**CPU Usage:** +```promql +rate(container_cpu_usage_seconds_total{container="goblet"}[5m]) +``` + +## Dashboards + +### Grafana Dashboard + +Import the Goblet dashboard (coming soon): +```bash +# Import dashboard JSON +kubectl create configmap goblet-dashboard \ + --from-file=dashboards/goblet.json +``` + +### Key Panels + +1. **Request Overview** + - Total requests/sec + - Success rate + - Error rate + +2. **Cache Performance** + - Hit rate over time + - Cache size + - Eviction rate + +3. **Latency Distribution** + - P50, P95, P99 + - By operation type + - By repository + +4. 
**Resource Utilization** + - CPU usage + - Memory usage + - Disk usage + - Network I/O + +## Alerting Rules + +### Prometheus Alerts + +```yaml +groups: +- name: goblet + rules: + # Low cache hit rate + - alert: GobletLowCacheHitRate + expr: rate(cache_hits_total[5m]) / rate(requests_total[5m]) < 0.5 + for: 10m + labels: + severity: warning + annotations: + summary: "Low cache hit rate ({{ $value | humanizePercentage }})" + + # High error rate + - alert: GobletHighErrorRate + expr: rate(errors_total[5m]) / rate(requests_total[5m]) > 0.05 + for: 5m + labels: + severity: critical + annotations: + summary: "High error rate ({{ $value | humanizePercentage }})" + + # Disk space low + - alert: GobletLowDiskSpace + expr: disk_usage_bytes / disk_capacity_bytes > 0.9 + for: 5m + labels: + severity: warning + annotations: + summary: "Low disk space ({{ $value | humanizePercentage }})" + + # High latency + - alert: GobletHighLatency + expr: histogram_quantile(0.95, rate(request_duration_seconds_bucket[5m])) > 1.0 + for: 10m + labels: + severity: warning + annotations: + summary: "High P95 latency ({{ $value }}s)" +``` + +## Health Checks + +### Liveness Probe + +```yaml +livenessProbe: + httpGet: + path: /healthz + port: 8080 + initialDelaySeconds: 10 + periodSeconds: 30 +``` + +### Readiness Probe + +```yaml +readinessProbe: + httpGet: + path: /healthz + port: 8080 + initialDelaySeconds: 5 + periodSeconds: 10 +``` + +## Logging + +### Log Levels + +- `debug`: Detailed debugging information +- `info`: General operational messages +- `warn`: Warning messages (e.g., cache misses, slow operations) +- `error`: Error messages + +### Structured Logging + +```json +{ + "level": "info", + "timestamp": "2025-11-07T10:00:00Z", + "message": "Cache hit", + "repository": "github.com/kubernetes/kubernetes", + "operation": "fetch", + "duration_ms": 45, + "cache_hit": true +} +``` + +## Troubleshooting + +See [Troubleshooting Guide](troubleshooting.md) for common issues and solutions. 
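

As a quick manual check outside Prometheus, the cache hit rate from the Key Metrics section can be computed straight from the `/metrics` text output. This is a sketch that assumes the counter names used in this guide (`cache_hits_total`, `requests_total`) and sums each counter across its label sets:

```shell
#!/bin/sh
# Compute the cache hit rate from Prometheus text-format metrics on stdin.
compute_hit_rate() {
  awk '
    /^cache_hits_total/ { hits += $NF }   # sum counter across label sets
    /^requests_total/   { reqs += $NF }
    END {
      if (reqs > 0) printf "hit_rate=%.2f\n", hits / reqs
      else          print "hit_rate=unknown"
    }
  '
}

# Demo with inline sample data; against a live instance, use:
#   curl -s http://localhost:8080/metrics | compute_hit_rate
printf 'cache_hits_total{repo="a"} 80\ncache_hits_total{repo="b"} 10\nrequests_total 100\n' \
  | compute_hit_rate   # prints hit_rate=0.90
```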
+ +## Related Documentation + +- [Load Testing](load-testing.md) +- [Deployment Patterns](deployment-patterns.md) +- [Troubleshooting](troubleshooting.md) diff --git a/docs/operations/releasing.md b/docs/operations/releasing.md new file mode 100644 index 0000000..43740c0 --- /dev/null +++ b/docs/operations/releasing.md @@ -0,0 +1,433 @@ +# Release Process + +This document describes how to create a new release of Goblet. + +## Overview + +Goblet uses **[GoReleaser](https://goreleaser.com/)** for automated, standardized releases. GoReleaser is the industry-standard tool for Go project releases and provides: + +- βœ… **Automatic semantic versioning** from git tags +- βœ… **Multi-platform binary builds** (Linux, macOS, Windows) +- βœ… **Automatic changelog generation** from git commits +- βœ… **SHA256 checksum generation** +- βœ… **GitHub release creation** with all artifacts +- βœ… **Multi-arch Docker images** (amd64, arm64) +- βœ… **Archive generation** (tar.gz, zip) + +## Prerequisites + +- Write access to the GitHub repository +- Clean working directory on the `main` branch +- All CI checks passing on `main` +- Follow [Conventional Commits](https://www.conventionalcommits.org/) for automatic changelog generation + +## Release Workflow Overview + +When you push a version tag, GoReleaser automatically: + +1. Builds binaries for all supported platforms +2. Generates SHA256 checksums for verification +3. Creates archives (tar.gz for Unix, zip for Windows) +4. Generates changelog from git history using conventional commits +5. Creates a GitHub release with all binaries attached +6. Builds and pushes multi-arch Docker images to GitHub Container Registry (GHCR) + +## Supported Platforms + +The release pipeline builds binaries for: + +- **Linux**: amd64, arm64 +- **macOS**: amd64 (Intel), arm64 (Apple Silicon) +- **Windows**: amd64 + +## Conventional Commits for Automatic Changelogs + +GoReleaser generates changelogs automatically from git commit messages. 
Follow the [Conventional Commits](https://www.conventionalcommits.org/) specification:

### Commit Message Format

```
<type>(<scope>): <description>

<body>

<footer>