story(boot-server): implement boot asset streaming endpoints #613

@Zaba505

Description

Implement HTTP endpoints for streaming large boot assets (kernel and initrd images) from Cloud Storage to bare metal servers. These endpoints are called by servers after receiving the iPXE boot script, and must efficiently stream multi-megabyte files through the WireGuard VPN tunnel.

Performance is critical: target under 100 ms to first byte when initiating a stream, and sustain throughput suited to large downloads (typically 50-150 MB per boot).

Acceptance Criteria

  • Kernel streaming endpoint created in services/boot-server/endpoint/assets.go:
    • GET /assets/{image_id}/kernel
    • Alternative: GET /kernels/{image_name}.img
  • Initrd streaming endpoint created:
    • GET /assets/{image_id}/initrd
    • Alternative: GET /initrd/{image_name}.img
  • Implements OpenAPI-first handler pattern (z5labs/humus):
    • RequestBody() method defining OpenAPI request schema
    • Responses() method defining OpenAPI response schemas (including file streaming)
    • ServeHTTP() method implementing HTTP handler logic
  • Cloud Storage integration:
    • Cloud Storage client from app context
    • Stream files from configured bucket (e.g., gs://boot-images/)
    • Efficient streaming (use io.Copy or chunked reads, avoid loading entire file in memory)
    • Handle Cloud Storage errors gracefully
  • Path parameter validation:
    • Image ID or image name validation
    • Returns HTTP 404 Not Found if asset doesn't exist
    • Prevent path traversal attacks
  • Source IP validation (security):
    • Validate request originates from WireGuard VPN subnet
    • Returns HTTP 403 Forbidden if outside allowed IP ranges
  • Response headers:
    • Content-Type: application/octet-stream
    • Content-Length: file size (if known)
    • Content-Disposition: attachment; filename="..." (optional)
    • Cache-Control: appropriate caching headers
  • Performance optimization:
    • Stream files efficiently (target < 100ms to first byte)
    • Support HTTP Range requests for resume capability (optional)
    • Consider Cloud Storage signed URLs for direct download (alternative approach)
  • OpenTelemetry instrumentation:
    • Log asset download requests with image ID, file size
    • Metrics: download count, bytes transferred, latency, success/failure rate
    • Trace context propagation
  • Error handling:
    • Proper HTTP status codes (403, 404, 500, 503)
    • Structured error responses
    • Logging with context
  • Unit tests:
    • Test path validation and sanitization
    • Test Cloud Storage integration (mock storage client)
    • Test streaming logic
    • Test error cases (missing file, storage errors)
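
The path-parameter and source-IP checks in the criteria above might look something like the following sketch. Everything named here is an assumption for illustration: the `validImageID` and `fromVPN` helpers, the allow-list pattern for image IDs, and the `10.8.0.0/24` subnet, which is a placeholder for whatever WireGuard range the deployment actually configures.

```go
package main

import (
	"fmt"
	"net/netip"
	"regexp"
)

// imageIDPattern is an assumed allow-list: lowercase letters, digits, '_' and
// '-', 1 to 63 characters. Because '/' and '.' are not allowed, inputs such as
// "../etc/passwd" can never match.
var imageIDPattern = regexp.MustCompile(`^[a-z0-9][a-z0-9_-]{0,62}$`)

// validImageID reports whether id is safe to use when building an object name.
func validImageID(id string) bool {
	return imageIDPattern.MatchString(id)
}

// vpnSubnet is a placeholder; the real WireGuard range would come from config.
var vpnSubnet = netip.MustParsePrefix("10.8.0.0/24")

// fromVPN reports whether remoteAddr (in "host:port" form, as found in
// http.Request.RemoteAddr) falls inside the allowed prefix.
func fromVPN(remoteAddr string, allowed netip.Prefix) bool {
	ap, err := netip.ParseAddrPort(remoteAddr)
	if err != nil {
		return false // unparseable peers are rejected
	}
	// Unmap converts IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) back to IPv4
	// so they compare correctly against an IPv4 prefix.
	return allowed.Contains(ap.Addr().Unmap())
}

func main() {
	for _, id := range []string{"ubuntu-24", "../etc/passwd", ""} {
		fmt.Printf("%q valid=%v\n", id, validImageID(id))
	}
	for _, addr := range []string{"10.8.0.5:51234", "203.0.113.9:443"} {
		fmt.Printf("%s allowed=%v\n", addr, fromVPN(addr, vpnSubnet))
	}
}
```

A handler could return 403 when `fromVPN` fails and 404 when `validImageID` fails, before touching storage. Note that `RemoteAddr` reflects the direct TCP peer, so this check only makes sense if requests reach the service without an intermediate proxy rewriting the source address.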

Related Issues

Implements ADR-0005 - Network Boot Infrastructure Implementation on Google Cloud

Depends on #605 (API documentation)
Depends on #611 (boot server scaffold)
Related to #612 (boot script endpoint calls these URLs)
Related to #601

Metadata

Assignees: No one assigned
Labels: enhancement (New feature or request), go (Pull requests that update Go code)
Projects: No projects
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests