This service is designed to never crash from application-level errors. It logs all errors and continues serving requests.
Location: src/middleware/async-handler.ts
What it catches:
- Async/await errors in route handlers
- Promise rejections in route logic
- Database query failures
- Service call failures
Example:
router.post(
"/decisions/:node",
asyncHandler(async (req, res) => {
// If this throws or rejects, asyncHandler catches it
const decision = await makeDecision(params);
res.json(decision);
})
);Result: Error forwarded to global error handler, request gets error response, service continues.
Location: src/middleware/error-handler.ts
What it catches:
- All errors forwarded by asyncHandler
- Validation errors (zod)
- Custom errors (NotFoundError, BadRequestError, etc.)
- Unexpected errors in middleware
Behavior:
- CustomError instances: Returns structured error with appropriate status code (400, 404, etc.)
- Unknown errors: Logs full stack trace, returns generic 500 response (doesn't leak internals)
Security: Never exposes internal error details to clients.
Location: src/server.ts
process.on("uncaughtException", (error) => {
// Log fatal error
// Exit after 1 second (allows log flush)
});When it triggers:
- Synchronous errors outside request context
- Programming errors (referencing undefined, etc.)
Why it exits: Uncaught exceptions leave the process in undefined state. Best practice is to crash and let orchestrator (Docker, Railway, PM2) restart.
process.on("unhandledRejection", (reason) => {
// Log error
// Continue running (don't exit)
});When it triggers:
- Promise rejections outside request context
- Background job failures
- Timer callback failures
Why it continues: Unhandled rejections don't corrupt process state. We log and continue for maximum uptime.
process.on("SIGTERM", () => {
// Stop accepting new connections
// Wait for existing requests to complete
// Exit cleanly
});When it triggers:
- Deployment/restart (Railway, Kubernetes)
- Manual Ctrl+C
Result: No dropped requests during deployment.
try {
await migrate(db, { migrationsFolder: "./drizzle" });
} catch (error) {
baseLogger.fatal("Database migration failed");
process.exit(1); // Don't start with broken DB
}Why it exits: If migrations fail, DB schema is wrong. Better to crash and alert than serve broken responses.
┌─────────────────────────────────────────────────────────────┐
│ Request Arrives │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Route Handler (wrapped in asyncHandler) │
│ ┌─────────────────────────────────────────────┐ │
│ │ await makeDecision(params) ← Error thrown │ │
│ └─────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
▼ Error caught by asyncHandler
┌─────────────────────────────────────────────────────────────┐
│ Global Error Handler (errorHandler middleware) │
│ ┌─────────────────────────────────────────────┐ │
│ │ if (CustomError) │ │
│ │ → 400/404 with structured error │ │
│ │ else │ │
│ │ → 500 generic error │ │
│ └─────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Response sent, service continues, next request processed │
└─────────────────────────────────────────────────────────────┘
curl -X POST http://localhost:3000/decisions/finance \
-H "Content-Type: application/json" \
-d '{"action":"invalid_action","data":{}}'Expected: 400 Bad Request, service continues
curl -X POST http://localhost:3000/decisions/finance \
-H "Content-Type: application/json" \
-d '{"data":{}}'Expected: 400 Validation Error, service continues
curl -X POST http://localhost:3000/outcomes/00000000-0000-0000-0000-000000000000 \
-H "Content-Type: application/json" \
-d '{"metrics":{"won":true}}'Expected: 404 Not Found, service continues
curl -X POST http://localhost:3000/decisions/finance \
-H "Content-Type: application/json" \
-d '{broken json'Expected: 400 Bad Request (Express JSON parser), service continues
# Stop docker: docker-compose down
# Try request
curl http://localhost:3000/decisions/finance -X POST -H "Content-Type: application/json" -d '{"action":"approve_discount","data":{"discount_pct":0.1,"margin_pct":0.25}}'Expected: 500 Internal Server Error (logged), service continues attempting reconnect
Check logs for error patterns:
# Count errors by type
grep "error" logs.json | jq .err.type | sort | uniq -c
# Find unhandled errors
grep "Unhandled error in request" logs.json
# Check for crashes (shouldn't find any)
grep "uncaughtException" logs.jsonrouter.get("/health", async (req, res) => {
try {
await db.execute(sql`SELECT 1`); // Check DB
res.json({ ok: true, db: "connected" });
} catch (error) {
res.status(503).json({ ok: false, db: "disconnected" });
}
});For external services (LLM calls, Slack, etc.):
- Fail fast after N consecutive errors
- Prevent cascading failures
- Return cached/fallback responses
For transient errors:
- Retry DB queries (connection timeouts)
- Exponential backoff for LLM calls
- Idempotency keys prevent duplicate work
rate(errors_total[5m]) > 10→ Alertunhandled_rejections > 0→ Alertuncaught_exceptions > 0→ Page on-call
✅ Won't crash:
- Route handler errors
- Validation errors
- Database query failures
- Unhandled promise rejections
❌ Will crash (by design):
- Database migration failure at startup
- Out of memory (OOM)
- Segmentation fault (native code bugs)
uncaughtException(process state corrupted)
For production, run multiple instances behind a load balancer. If one crashes, others continue serving.
| Error Type | Handler | Result | Service State |
|---|---|---|---|
| Route error | asyncHandler → errorHandler | 4xx/5xx response | ✅ Running |
| Validation error | errorHandler | 400 response | ✅ Running |
| DB query error | asyncHandler → errorHandler | 500 response | ✅ Running |
| Unhandled rejection | process.on | Logged | ✅ Running |
| Uncaught exception | process.on | Logged, exit | ❌ Crashes (restarts) |
| Migration error | try/catch | Exit | ❌ Doesn't start |
Bottom line: Normal application errors won't crash the service. Only catastrophic failures cause restarts, which orchestrators handle automatically.