Time Estimate: 2 hours
Language: Your choice of Python or TypeScript/Node.js
Build a small HTTP service that ingests vendor signals from multiple sources (webhooks, status pages), normalizes them into a unified schema, and routes critical events to downstream destinations.
This exercise simulates a real-world platform engineering challenge: building reliable, observable infrastructure that integrates multiple third-party services.
We encourage you to use AI to build this, but be prepared to answer detailed questions about the codebase, why you made certain decisions, and related topics.
Your service must:
- Ingest Stripe webhook events with signature verification
- Poll external status pages (Statuspage.io format) for Spreedly and Braze
- Normalize all signals into a unified internal event schema
- Route critical events to a mock PagerDuty destination
- Store recent events in memory with idempotency
- Expose a query API for recent events
External Resources
Accepts Stripe webhook events with signature verification.
Requirements:
- Verify webhook signature using Stripe's signing secret
- Return `400` if signature verification fails
- Return `200` with `{ received: true }` on success
- Normalize the event and store it
- Route to PagerDuty if severity is critical
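The verification step above could be sketched as follows, assuming Stripe's documented scheme (HMAC-SHA256 over `{timestamp}.{raw_body}` using the signing secret). In practice you would likely use the official SDK helper (`stripe.Webhook.construct_event` in Python, `stripe.webhooks.constructEvent` in Node); the key point either way is to verify against the raw request body bytes, not a re-serialized JSON payload.

```python
import hashlib
import hmac
import time

def verify_stripe_signature(raw_body: bytes, sig_header: str,
                            secret: str, tolerance: int = 300) -> bool:
    """Verify a Stripe-Signature header ("t=<ts>,v1=<sig>") against the raw body."""
    try:
        parts = dict(p.strip().split("=", 1) for p in sig_header.split(","))
        timestamp = int(parts["t"])
        candidate = parts["v1"]
    except (ValueError, KeyError):
        return False
    if abs(time.time() - timestamp) > tolerance:
        return False  # stale timestamp: basic replay protection
    signed_payload = f"{timestamp}.".encode() + raw_body
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, candidate)
```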
Environment Variables:
- `STRIPE_WEBHOOK_SECRET` (required)
- `STRIPE_API_KEY` (optional, not needed for verification)
Example request:
```bash
curl -X POST http://localhost:3000/ingest/stripe \
  -H "Content-Type: application/json" \
  -H "Stripe-Signature: t=..." \
  -d @fixtures/stripe/payment_failed.json
```
Fetches and processes Statuspage.io `summary.json` from configured URLs.

Requirements:
- Fetch from `SPREEDLY_STATUS_SUMMARY_URL` and `BRAZE_STATUS_SUMMARY_URL`
- If the URLs are not configured, fall back to local fixtures
- Parse incidents and components
- Normalize and store events
- Route critical incidents
Response:

```json
{
  "fetched": {
    "spreedly": 2,
    "braze": 1
  },
  "stored": 3,
  "routed": 1
}
```

Environment Variables:
- `SPREEDLY_STATUS_SUMMARY_URL` (optional)
- `BRAZE_STATUS_SUMMARY_URL` (optional)
Mock PagerDuty destination endpoint.
Requirements:
- Accept normalized events
- Log receipt with structured logging
- Store in memory for verification
- Return `202 Accepted`
Query recent normalized events.
Query Parameters:
- `limit` (default: 50): number of events to return
Response:
```json
{
  "events": [
    {
      "event_id": "evt_123",
      "source": "stripe",
      "kind": "payment",
      "severity": "critical",
      "service": "stripe",
      "summary": "Payment failed for pi_abc",
      "description": null,
      "started_at": "2024-01-15T10:30:00Z",
      "resolved_at": null,
      "routed": true,
      "delivered_to": ["pagerduty"],
      "raw": { ... }
    }
  ]
}
```
Health check endpoint.
Response:
```json
{ "ok": true }
```

All vendor signals must be normalized to this schema:
```
{
  event_id: string,              // Unique identifier
  source: "stripe" | "spreedly_status" | "braze_status",
  kind: "incident" | "status" | "payment",
  severity: "info" | "warning" | "critical",
  service: string,               // Service name
  summary: string,               // Short description
  description: string | null,    // Detailed info
  started_at: string,            // ISO-8601 timestamp
  resolved_at: string | null,    // ISO-8601, or null if ongoing
  raw: unknown                   // Original event for debugging
}
```

Validation:
- Use schema validation (Zod, Pydantic, etc.)
- Enforce types strictly
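If you choose the Python stack, the schema above might be expressed as a Pydantic model (a sketch; the field names and literal values are taken directly from the schema, the class name is illustrative):

```python
from typing import Any, Literal, Optional

from pydantic import BaseModel

class NormalizedEvent(BaseModel):
    """Unified internal event schema; invalid literals raise a ValidationError."""
    event_id: str
    source: Literal["stripe", "spreedly_status", "braze_status"]
    kind: Literal["incident", "status", "payment"]
    severity: Literal["info", "warning", "critical"]
    service: str
    summary: str
    description: Optional[str] = None
    started_at: str   # ISO-8601 timestamp
    resolved_at: Optional[str] = None
    raw: Any = None   # original vendor payload for debugging
```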
Stripe Events (https://docs.stripe.com/)
Support at least these event types:
| Event Type | Severity | Kind | Notes |
|---|---|---|---|
| `payout.failed` | critical | payment | Financial impact |
| `payment_intent.payment_failed` | warning | payment | May need investigation |
Mapping:
- `event_id` = Stripe's `event.id`
- `started_at` = convert `event.created` (Unix seconds) to ISO-8601
- `resolved_at` = null (webhooks are point-in-time)
- `summary` = `{event.type}: {object.id}`
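The mapping above could be sketched as follows (a hedged Python sketch; the severity lookup comes from the Stripe events table, and the function name is illustrative):

```python
from datetime import datetime, timezone

# Severity per Stripe event type, per the table above; anything else is "info"
SEVERITY_BY_TYPE = {
    "payout.failed": "critical",
    "payment_intent.payment_failed": "warning",
}

def normalize_stripe_event(event: dict) -> dict:
    """Map a raw Stripe webhook event to the unified internal schema."""
    obj = event["data"]["object"]
    started_at = (
        datetime.fromtimestamp(event["created"], tz=timezone.utc)
        .isoformat()
        .replace("+00:00", "Z")  # emit a "Z"-suffixed ISO-8601 timestamp
    )
    return {
        "event_id": event["id"],
        "source": "stripe",
        "kind": "payment",
        "severity": SEVERITY_BY_TYPE.get(event["type"], "info"),
        "service": "stripe",
        "summary": f"{event['type']}: {obj['id']}",
        "description": None,
        "started_at": started_at,
        "resolved_at": None,  # webhooks are point-in-time
        "raw": event,
    }
```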
Typical summary.json structure:
```json
{
  "incidents": [
    {
      "id": "abc123",
      "name": "API Degradation",
      "status": "investigating",
      "impact": "major",
      "created_at": "2024-01-15T10:00:00Z",
      "updated_at": "2024-01-15T10:15:00Z",
      "resolved_at": null
    }
  ],
  "components": [
    {
      "id": "comp_1",
      "name": "API",
      "status": "operational",
      "updated_at": "2024-01-15T09:00:00Z"
    }
  ]
}
```

Incident Mapping:
| Impact | Severity |
|---|---|
| critical, major | critical |
| minor | warning |
| none, maintenance | info |
Component Status Mapping:
| Status | Create Event? | Severity |
|---|---|---|
| `operational` | No | - |
| `degraded_performance` | Yes | warning |
| `partial_outage` | Yes | critical |
| `major_outage` | Yes | critical |
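Both mapping tables above are small enough to express as lookup dictionaries; a sketch (dictionary and function names are illustrative):

```python
from typing import Optional

# Statuspage incident impact -> internal severity, per the incident table above
INCIDENT_SEVERITY = {
    "critical": "critical",
    "major": "critical",
    "minor": "warning",
    "none": "info",
    "maintenance": "info",
}

# Component statuses that warrant an event, with their severity;
# "operational" is deliberately absent (no event is created)
COMPONENT_SEVERITY = {
    "degraded_performance": "warning",
    "partial_outage": "critical",
    "major_outage": "critical",
}

def incident_severity(impact: str) -> str:
    """Map an incident impact to severity, defaulting to 'info' for unknowns."""
    return INCIDENT_SEVERITY.get(impact, "info")

def component_severity(status: str) -> Optional[str]:
    """Return a severity for a component status, or None if no event is needed."""
    return COMPONENT_SEVERITY.get(status)
```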
Events should be routed based on severity:
| Severity | Route to PagerDuty? | Condition |
|---|---|---|
| critical | Yes | Always |
| warning | Conditional | Only if ROUTE_WARNING=true |
| info | No | Never |
Delivery:
- Best-effort: log failures but don't block ingestion
- Mark delivery status in stored event metadata
- Deliver via an internal call to `POST /destinations/pagerduty`
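The routing decision in the table above could be sketched as (assuming the `ROUTE_WARNING` environment variable described there; the function name is illustrative):

```python
import os

def should_route(severity: str) -> bool:
    """Decide whether a normalized event should be routed to PagerDuty."""
    if severity == "critical":
        return True  # critical: always route
    if severity == "warning":
        # warning: only when explicitly enabled via environment
        return os.getenv("ROUTE_WARNING", "false").lower() == "true"
    return False  # info: never route
```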
Requirements:
- Track seen `event_id` values in memory
- If a duplicate is detected:
  - Return `200 { received: true, deduped: true }`
  - Do NOT route/deliver again
  - Do NOT create a duplicate in the event store
Implementation suggestion:
- Use a Set or Map: `event_id -> first_seen_timestamp`
- Include dedup information in the `/events` response metadata if useful
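One way to combine the in-memory store and the dedup map is a single class; a sketch (the class name, `add` return convention, and `max_events` cap are all illustrative choices, not part of the spec):

```python
import time
from typing import Dict, List

class EventStore:
    """In-memory event store with event_id-based idempotency."""

    def __init__(self, max_events: int = 1000):
        self.seen: Dict[str, float] = {}  # event_id -> first_seen_timestamp
        self.events: List[dict] = []
        self.max_events = max_events

    def add(self, event: dict) -> bool:
        """Store the event; return False (and store nothing) on a duplicate."""
        event_id = event["event_id"]
        if event_id in self.seen:
            return False  # caller responds with deduped: true and skips routing
        self.seen[event_id] = time.time()
        self.events.append(event)
        del self.events[:-self.max_events]  # retain only the most recent events
        return True

    def recent(self, limit: int = 50) -> List[dict]:
        """Return the most recent events, newest last."""
        return self.events[-limit:]
```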
Add an optional endpoint that demonstrates how you would integrate AI/LLM capabilities:
Input:
```json
{
  "text": "Multiple payment failures detected across APAC region..."
}
```

Output:

```json
{
  "summary": "Payment failures in APAC",
  "suggested_severity": "critical"
}
```

Implementation:
- Use a deterministic stub (keyword matching, no real API calls)
- In documentation, describe how you WOULD integrate a real LLM:
- Prompt engineering
- PII/secrets redaction
- Audit logging
- Fallback behavior
- Cost controls
- Latency considerations
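A deterministic stub along those lines might look like this (the keyword lists and first-sentence summarization are illustrative assumptions, not part of the spec):

```python
# Keyword lists are an assumption for this stub; a real deployment would
# replace this whole function with an LLM call behind the same interface.
CRITICAL_KEYWORDS = ("outage", "failure", "down", "unavailable")
WARNING_KEYWORDS = ("degraded", "latency", "slow", "intermittent")

def triage_stub(text: str) -> dict:
    """Keyword-based triage: deterministic, no network calls."""
    lowered = text.lower()
    if any(keyword in lowered for keyword in CRITICAL_KEYWORDS):
        severity = "critical"
    elif any(keyword in lowered for keyword in WARNING_KEYWORDS):
        severity = "warning"
    else:
        severity = "info"
    summary = text.split(".")[0][:80]  # naive first-sentence, length-capped summary
    return {"summary": summary, "suggested_severity": severity}
```

Keeping the stub behind the same request/response shape as the planned LLM integration means the documentation points (redaction, fallback, cost controls) can be discussed against a concrete interface.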
Option 1: TypeScript/Node.js
- Node 18+
- Fastify or Express
- Zod for validation
- Pino for structured logging
- Jest or Vitest for tests
Option 2: Python
- Python 3.10+
- FastAPI or Flask
- Pydantic for validation
- structlog for structured logging
- pytest for tests
- ✅ Structured logging (JSON format)
- ✅ Request correlation IDs
- ✅ Strict type checking (TypeScript strict mode or mypy)
- ✅ Input validation on all endpoints
- ✅ Proper error handling with appropriate status codes
- ✅ At least 3 meaningful unit tests
- ✅ Docker support (Dockerfile + docker-compose.yml)
- ✅ Environment variable configuration
- ✅ `.env.example` file
Provide example fixture files:
Stripe `payout.failed` event:

```json
{
  "id": "evt_1abc",
  "object": "event",
  "type": "payout.failed",
  "created": 1705315200,
  "data": {
    "object": {
      "id": "po_123",
      "amount": 10000,
      "currency": "usd",
      "failure_message": "Insufficient funds"
    }
  }
}
```

Stripe `payment_intent.payment_failed` event (`fixtures/stripe/payment_failed.json`):

```json
{
  "id": "evt_2def",
  "object": "event",
  "type": "payment_intent.payment_failed",
  "created": 1705315800,
  "data": {
    "object": {
      "id": "pi_456",
      "amount": 5000,
      "currency": "usd",
      "last_payment_error": {
        "message": "Card declined"
      }
    }
  }
}
```

Spreedly status summary:

```json
{
  "page": {
    "id": "spreedly",
    "name": "Spreedly Status"
  },
  "incidents": [
    {
      "id": "inc_spreedly_1",
      "name": "Payment Gateway Latency",
      "status": "investigating",
      "impact": "major",
      "created_at": "2024-01-15T10:00:00Z",
      "updated_at": "2024-01-15T10:30:00Z",
      "resolved_at": null
    }
  ],
  "components": [
    {
      "id": "comp_api",
      "name": "API",
      "status": "operational",
      "updated_at": "2024-01-15T09:00:00Z"
    }
  ]
}
```

Braze status summary:

```json
{
  "page": {
    "id": "braze",
    "name": "Braze Status"
  },
  "incidents": [],
  "components": [
    {
      "id": "comp_dashboard",
      "name": "Dashboard",
      "status": "degraded_performance",
      "updated_at": "2024-01-15T11:00:00Z"
    }
  ]
}
```

Your submission must include:
- **Setup Instructions**
  - Dependencies installation
  - Environment configuration
  - How to run locally
  - How to run with Docker
- **Usage Examples**
  - curl commands for each endpoint
  - How to trigger ingestion with fixtures
  - How to verify routing behavior
- **Architecture Overview**
  - High-level system diagram (ASCII or description)
  - Key design decisions
  - Data flow explanation
- **Testing**
  - How to run tests
  - What is tested
  - Test coverage approach
- **Security Considerations**
  - Stripe signature verification (why the raw body matters)
  - Secret management
  - Input validation
- **Production Readiness Discussion**
  - What would you add for production?
  - Tradeoffs made for this exercise
  - Scalability considerations
  - Observability improvements
  - Persistence strategy
  - Queue/retry mechanisms
  - Rate limiting
We'll evaluate based on:
- ✅ All endpoints work as specified
- ✅ Stripe signature verification works
- ✅ Normalization is correct
- ✅ Routing logic is accurate
- ✅ Idempotency is implemented
- ✅ Clean, readable code
- ✅ Proper error handling
- ✅ Type safety
- ✅ Structured logging
- ✅ Input validation
- ✅ Meaningful unit tests
- ✅ Test coverage of critical paths
- ✅ Tests are runnable
- ✅ Clear setup instructions
- ✅ Architecture explanation
- ✅ Production considerations
- ✅ Docker works correctly
- ✅ Environment configuration
- ✅ Easy to run locally
- Push your code to a private GitHub repository
- Invite the reviewers (we'll provide usernames)
- Include:
- All source code
- Tests
- Fixtures
- Dockerfile + docker-compose.yml
- README.md
- .env.example
Do NOT include:
- `node_modules/` or virtual environments
- Real secrets or API keys
- Build artifacts
This exercise is designed to take 2 hours. We understand you may not have time to implement everything perfectly.
Prioritize in this order:
- Core endpoints working (stripe ingestion, status polling, events query)
- Proper normalization and routing
- Tests for critical logic
- Documentation
- Bonus features (AI hook)
If short on time:
- Mock external HTTP calls in tests
- Document what you would improve given more time
- Focus on demonstrating your thought process
If anything is unclear, please email us. We'll respond within 24 hours.
Good luck! 🚀