Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
152 changes: 152 additions & 0 deletions SECURITY_AUDIT.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,152 @@
# Security Audit Report

**Date:** 2026-02-06
**Scope:** Full codebase security review of Editorial Assistant v3.0
**Auditor:** Claude Code (automated)

---

## Executive Summary

This audit reviewed the entire codebase (~108 source files, ~8,200 lines of Python service code plus React frontend) for security vulnerabilities. **7 issues were found and fixed**, and **4 advisory findings** are documented below for future consideration.

---

## Fixed Vulnerabilities

### 1. CRITICAL: Path Traversal in File Upload

**File:** `api/routers/upload.py:117`
**Risk:** An attacker could upload a file with a crafted filename (e.g., `../../etc/cron.d/backdoor`) to write files outside the `transcripts/` directory, potentially achieving remote code execution.

**Root cause:** `file.filename` from the HTTP multipart upload was used directly in `TRANSCRIPTS_DIR / file.filename` without any sanitization.

**Fix:** Added `sanitize_upload_filename()` that:
- Strips directory components using `PurePosixPath.name`
- Removes null bytes, `..` sequences, and leading dots
- Restricts to safe characters (alphanumeric, hyphens, underscores, spaces, dots)
- Added a belt-and-suspenders `is_relative_to()` check on the resolved path

### 2. HIGH: Command Injection via `shell=True`

**File:** `api/routers/system.py:91`
**Risk:** `subprocess.Popen(command, shell=True, ...)` interprets the command through a shell. While the commands were hardcoded in this version, this pattern is dangerous — any future change that introduces user input into the command string would enable remote code execution.

**Fix:** Replaced with:
- An `ALLOWED_COMPONENTS` allowlist that maps component keys to pre-defined command lists
- `subprocess.Popen(command_list, ...)` (list form, no `shell=True`)
- Only the two known components (`run_worker.py`, `watch_transcripts.py`) can be started

### 3. HIGH: Server-Side Request Forgery (SSRF) in Ingest Scanner

**File:** `api/routers/ingest.py:266-316`
**Risk:** The `/api/ingest/scan` endpoint accepted a `base_url` query parameter that was passed directly to `IngestScanner`, which makes HTTP requests to that URL. An attacker could supply internal network addresses (e.g., `http://169.254.169.254/` for cloud metadata, `http://localhost:8000/api/...` for internal APIs) to probe the internal network.

**Fix:** Added an `ALLOWED_INGEST_HOSTS` allowlist (`mmingest.pbswi.wisc.edu`) and URL parsing validation. Requests to any other host are rejected with a 400 error.

### 4. MEDIUM: Error Message Information Disclosure

**Files:** `api/routers/chat_prototype.py:96`, `api/routers/ingest.py:316,545,572`, `api/routers/upload.py:184`
**Risk:** Exception details (including internal paths, stack traces, database errors, and API error messages) were passed directly to HTTP responses via `detail=str(e)`. This leaks implementation details that help attackers understand the system.

**Fix:** Replaced all `detail=str(e)` patterns with generic error messages. Full exception details are still logged server-side for debugging.

### 5. MEDIUM: Airtable Formula Injection

**Files:** `api/services/airtable.py:103,158`, `mcp_server/server.py:324`
**Risk:** Media ID values were interpolated directly into Airtable `filterByFormula` strings without escaping. A media ID containing `'` could alter the formula logic. While this is a read-only query, it could still be used to extract unintended data.

**Fix:** Added `_escape_airtable_formula_value()` that escapes backslashes and single quotes before interpolation into formula strings. Applied to both single and batch lookup methods.

### 6. MEDIUM: Gemini API Key Exposed in URL

**File:** `api/services/llm.py:853`
**Risk:** The Gemini API key was passed as a URL query parameter (`?key=<api_key>`). URL query parameters appear in:
- HTTP server access logs
- Proxy logs
- Error messages and stack traces
- Browser history (if applicable)
- Network monitoring tools

**Fix:** Moved the API key to the `x-goog-api-key` HTTP header, which is Google's supported authentication header for the Gemini API.

### 7. MEDIUM: Missing Security Headers and Overly Permissive CORS

**File:** `api/main.py:83-94`
**Risk:**
- `allow_methods=["*"]` and `allow_headers=["*"]` reduced the effectiveness of CORS protections
- No security headers were set (X-Content-Type-Options, X-Frame-Options, etc.)

**Fix:**
- Tightened CORS to explicit method and header lists
- Added security headers middleware: `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`, `X-XSS-Protection`, `Referrer-Policy`, `Permissions-Policy`

---

## Advisory Findings (Not Fixed — Require Architecture Decisions)

### A1. HIGH: No Authentication or Authorization

**Impact:** The entire API is unauthenticated. All endpoints — including system management (restart/stop workers), configuration changes, file uploads, and chat (which costs money via LLM calls) — are accessible to anyone who can reach the server.

**Context:** The API is exposed to the internet via Cloudflare Tunnel (`config/cloudflared.yml`). While Cloudflare Access can provide authentication at the tunnel level, no application-level auth exists as a defense-in-depth measure.

**Recommendation:** Implement at least one of:
1. **Cloudflare Access** (zero-trust) with JWT validation in the API
2. **API key authentication** middleware for all non-health endpoints
3. **OAuth2/OIDC** if multi-user access is planned

Priority: System management endpoints (`/api/system/*`) should be protected first.

### A2. MEDIUM: No Rate Limiting

**Impact:** No rate limiting on any endpoint. An attacker could:
- Exhaust LLM API budgets via rapid chat messages
- Cause denial of service via upload flooding
- Trigger excessive ingest server scanning

**Recommendation:** Add `slowapi` or similar rate limiting middleware. Critical endpoints to protect first: `/api/chat/message`, `/api/upload/transcripts`, `/api/ingest/scan`.

### A3. MEDIUM: WebSocket Without Authentication

**File:** `api/routers/websocket.py:101`
**Impact:** The WebSocket endpoint accepts all connections without authentication. Anyone can connect and receive real-time job status updates.

**Recommendation:** Add token-based authentication to WebSocket connections (e.g., via query parameter token validated against a session).

### A4. LOW: Hardcoded sys.path for Shared Module

**Files:** `api/main.py:13`, `api/services/airtable.py:17`, `mcp_server/server.py:40`
**Impact:** `sys.path.insert(0, str(Path.home() / "Developer/the-lodge/scripts"))` inserts a hardcoded external path. If this directory is writable by another user or process, it could be used to inject malicious code via a fake `keychain_secrets.py`.

**Recommendation:** Use environment variables or a proper package installation (`pip install -e ../the-lodge`) instead of sys.path manipulation.

---

## Positive Security Observations

The codebase has several good security practices already in place:

1. **Airtable write protection** - Strict read-only policy with a single controlled exception for screengrab attachment (additive-only, audit-logged, idempotent)
2. **LLM cost caps** - Run cost caps, per-token cost limits, and model allowlists prevent runaway spending
3. **Path traversal checks in MCP server** - `mcp_server/server.py:1122-1124` correctly uses `resolve().relative_to()` to validate file paths
4. **Output file allowlist** - `api/routers/jobs.py:348-379` uses both an extension allowlist and `is_relative_to` check
5. **Parameterized SQL queries** - All database queries use SQLAlchemy's parameterized `text()` with `:param` binding, preventing SQL injection
6. **Pydantic validation** - Request models use Pydantic for type validation and constraints
7. **Secrets via Keychain** - Production secrets are stored in macOS Keychain, not in code or `.env` files
8. **`.gitignore` coverage** - `.env`, `dashboard.db`, `transcripts/`, `OUTPUT/`, and `logs/` are all gitignored

---

## Files Modified

| File | Changes |
|------|---------|
| `api/routers/upload.py` | Added `sanitize_upload_filename()`, path traversal protection, generic error messages |
| `api/routers/system.py` | Replaced `shell=True` with allowlist + list-form subprocess |
| `api/routers/chat_prototype.py` | Generic error messages instead of `str(e)` |
| `api/routers/ingest.py` | SSRF host allowlist, generic error messages |
| `api/services/airtable.py` | Formula value escaping for Airtable queries |
| `api/services/llm.py` | Gemini API key moved from URL to header |
| `api/main.py` | Security headers middleware, tightened CORS methods/headers |
| `mcp_server/server.py` | Formula value escaping for Airtable query |
18 changes: 15 additions & 3 deletions api/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@

from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi import FastAPI, Request, Response
from fastapi.middleware.cors import CORSMiddleware

from api.services import database
Expand Down Expand Up @@ -89,11 +89,23 @@ async def lifespan(app: FastAPI):
"https://cardigan.bymarkriechers.com", # Cloudflare Tunnel
],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
allow_methods=["GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"],
allow_headers=["Content-Type", "Authorization", "X-Requested-With"],
)


@app.middleware("http")
async def add_security_headers(request: Request, call_next):
"""Add security headers to all responses."""
response: Response = await call_next(request)
response.headers["X-Content-Type-Options"] = "nosniff"
response.headers["X-Frame-Options"] = "DENY"
response.headers["X-XSS-Protection"] = "1; mode=block"
response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
response.headers["Permissions-Policy"] = "camera=(), microphone=(), geolocation=()"
return response


@app.get("/")
async def root():
"""Health check endpoint."""
Expand Down
2 changes: 1 addition & 1 deletion api/routers/chat_prototype.py
Original file line number Diff line number Diff line change
Expand Up @@ -93,4 +93,4 @@ async def send_chat_message(request: ChatRequest) -> ChatResponse:

except Exception as e:
logger.error(f"Chat error: {e}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Chat request failed: {str(e)}")
raise HTTPException(status_code=500, detail="Chat request failed")
22 changes: 18 additions & 4 deletions api/routers/ingest.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,11 +27,17 @@
import logging
from datetime import datetime
from typing import List, Optional
from urllib.parse import urlparse

from fastapi import APIRouter, HTTPException, Query
from pydantic import BaseModel
from sqlalchemy import text

# Allowlist of permitted ingest server hostnames to prevent SSRF
ALLOWED_INGEST_HOSTS = {
"mmingest.pbswi.wisc.edu",
}

from api.models.ingest import IngestConfigResponse, IngestConfigUpdate
from api.services.database import get_session
from api.services.ingest_config import (
Expand Down Expand Up @@ -293,6 +299,14 @@ async def trigger_scan(
scan_base_url = base_url or config.server_url
scan_dirs = directories.split(",") if directories else config.directories

# SSRF protection: validate base_url against allowlist
parsed = urlparse(scan_base_url)
if parsed.hostname not in ALLOWED_INGEST_HOSTS:
raise HTTPException(
status_code=400,
detail=f"Invalid ingest server host. Allowed: {', '.join(ALLOWED_INGEST_HOSTS)}",
)

scanner = IngestScanner(base_url=scan_base_url, directories=scan_dirs)
result = await scanner.scan()

Expand All @@ -313,7 +327,7 @@ async def trigger_scan(
logger.error(f"Scan failed: {e}")
# Record failed scan
await record_scan_result(success=False)
raise HTTPException(status_code=500, detail=str(e))
raise HTTPException(status_code=500, detail="Scan failed")


@router.get("/status", response_model=IngestStatusResponse)
Expand Down Expand Up @@ -542,7 +556,7 @@ async def attach_screengrab(file_id: int) -> AttachResponse:
)
except Exception as e:
logger.error(f"Attach failed for file {file_id}: {e}")
raise HTTPException(status_code=500, detail=str(e))
raise HTTPException(status_code=500, detail="Screengrab attachment failed")


@router.post("/screengrabs/attach-all", response_model=BatchAttachResponse)
Expand All @@ -569,7 +583,7 @@ async def attach_all_screengrabs() -> BatchAttachResponse:
)
except Exception as e:
logger.error(f"Batch attach failed: {e}")
raise HTTPException(status_code=500, detail=str(e))
raise HTTPException(status_code=500, detail="Batch attachment failed")


@router.post("/screengrabs/{file_id}/ignore")
Expand Down Expand Up @@ -760,7 +774,7 @@ async def queue_transcript(file_id: int) -> QueueTranscriptResponse:
file_id=file_id,
media_id=download_result.get("media_id"),
local_path=download_result.get("local_path"),
error=f"Failed to create job: {e}",
error="Failed to create job",
)


Expand Down
29 changes: 22 additions & 7 deletions api/routers/system.py
Original file line number Diff line number Diff line change
Expand Up @@ -78,17 +78,32 @@ def _kill_process(pattern: str) -> bool:
return False


def _start_component(command: str, log_file: str) -> bool:
"""Start a component in background."""
ALLOWED_COMPONENTS = {
"run_worker.py": ["./venv/bin/python", "run_worker.py"],
"watch_transcripts.py": ["./venv/bin/python", "watch_transcripts.py"],
}


def _start_component(component_key: str, log_file: str) -> bool:
"""Start a component in background.

Only components defined in ALLOWED_COMPONENTS can be started.
Uses list-form subprocess (no shell=True) to prevent command injection.
"""
if component_key not in ALLOWED_COMPONENTS:
return False

try:
project_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
log_path = os.path.join(project_dir, "logs", log_file)

# Ensure logs directory exists
os.makedirs(os.path.dirname(log_path), exist_ok=True)

command_list = ALLOWED_COMPONENTS[component_key]

with open(log_path, "a") as log:
subprocess.Popen(command, shell=True, cwd=project_dir, stdout=log, stderr=log, start_new_session=True)
subprocess.Popen(command_list, cwd=project_dir, stdout=log, stderr=log, start_new_session=True)
return True
except Exception:
return False
Expand Down Expand Up @@ -118,7 +133,7 @@ async def restart_worker():
_kill_process("run_worker.py")

# Start new worker
success = _start_component("./venv/bin/python run_worker.py", "worker.log")
success = _start_component("run_worker.py", "worker.log")

if success:
return RestartResponse(success=True, message="Worker restarted successfully")
Expand All @@ -134,7 +149,7 @@ async def restart_watcher():
_kill_process("watch_transcripts.py")

# Start new watcher
success = _start_component("./venv/bin/python watch_transcripts.py", "watcher.log")
success = _start_component("watch_transcripts.py", "watcher.log")

if success:
return RestartResponse(success=True, message="Watcher restarted successfully")
Expand Down Expand Up @@ -169,7 +184,7 @@ async def start_worker():
if _find_process("run_worker.py"):
return RestartResponse(success=False, message="Worker is already running")

success = _start_component("./venv/bin/python run_worker.py", "worker.log")
success = _start_component("run_worker.py", "worker.log")

if success:
return RestartResponse(success=True, message="Worker started")
Expand All @@ -184,7 +199,7 @@ async def start_watcher():
if _find_process("watch_transcripts.py"):
return RestartResponse(success=False, message="Watcher is already running")

success = _start_component("./venv/bin/python watch_transcripts.py", "watcher.log")
success = _start_component("watch_transcripts.py", "watcher.log")

if success:
return RestartResponse(success=True, message="Watcher started")
Expand Down
Loading
Loading