This document explains the system design, data flow, technical decisions, trade-offs, and scaling strategy for StreamMind AI.
StreamMind AI follows a modular monorepo architecture with clear package boundaries. Each package has a single responsibility and communicates through well-defined TypeScript interfaces.
┌─────────────────────────────────────────────────────────────┐
│ PRESENTATION LAYER │
│ │
│ Chrome Extension (Manifest V3) │
│ React + TypeScript + Vite + Tailwind CSS + shadcn/ui │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌───────────────────┐ │
│ │ SetupView │ │ SearchView │ │ Movie Cards │ │
│ │ (BYOK Flow) │ │ (Query UI) │ │ (Result Display) │ │
│ └─────────────┘ └──────┬──────┘ └───────────────────┘ │
│ │ │
│ ┌───────────────────────┴──────────────────────────────┐ │
│ │ API Client (fetch) — sends API key per-request │ │
│ └───────────────────────┬──────────────────────────────┘ │
└──────────────────────────┼──────────────────────────────────┘
│ HTTPS
┌──────────────────────────┼──────────────────────────────────┐
│ APPLICATION LAYER │
│ │
│ Fastify API Server │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Route Handlers (validate request, call services) │ │
│ │ ├── POST /api/recommend │ │
│ │ └── GET /health │ │
│ └──────────────────────┬───────────────────────────────┘ │
│ │ │
│ ┌──────────────────────┴───────────────────────────────┐ │
│ │ Recommendation Service (business logic coordinator) │ │
│ │ 1. Sanitize input │ │
│ │ 2. Fetch catalog → TMDB │ │
│ │ 3. Orchestrate LLM → OpenAI / Anthropic / Google │ │
│ │ 4. Validate output (Zod + catalog cross-check) │ │
│ │ 5. Enrich results with full movie details │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
┌──────────────────────────┼──────────────────────────────────┐
│ DOMAIN LAYER (Packages) │
│ │
│ ┌─────────────────┐ ┌──────────────────────────────────┐ │
│ │ catalog-core │ │ llm-adapter │ │
│ │ │ │ │ │
│ │ CatalogProvider │ │ LLMProvider (interface) │ │
│ │ TMDBService │ │ LLMOrchestrator │ │
│ │ MovieNormalizer │ │ PromptBuilder │ │
│ │ │ │ ResponseValidator │ │
│ │ │ │ OpenAIAdapter │ │
│ └────────┬────────┘ └──────────────┬───────────────────┘ │
│ │ │ │
│ ┌────────┴──────────────────────────┴───────────────────┐ │
│ │ shared-types │ │
│ │ Zod schemas + TypeScript types + Config schemas │ │
│ └───────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
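The shared-types package at the bottom of the diagram is the contract every other package depends on. A minimal sketch of what those shared definitions might look like — the names here are illustrative, not the actual package exports:

```typescript
// Hypothetical shapes for the shared-types package — illustrative only.
export interface Movie {
  id: number;            // TMDB movie ID
  title: string;
  overview: string;
  genres: string[];
  posterPath: string | null;
  voteAverage: number;   // 0-10 TMDB rating
}

// The lightweight reference handed to the LLM: just enough to pick from
// the catalog, small enough to keep the prompt cheap.
export type MovieReference = Pick<Movie, "id" | "title">;

export function toReference(movie: Movie): MovieReference {
  return { id: movie.id, title: movie.title };
}
```

Because both catalog-core and llm-adapter import from shared-types, the two sides of the pipeline can never drift out of sync on what a `Movie` is.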
1. User types: "Mind-bending sci-fi like Inception"
2. Extension → POST /api/recommend
{
query: "Mind-bending sci-fi like Inception",
apiKey: "sk-...", // Passed per-request, NEVER stored
provider: "openai",
model: "gpt-4o-mini",
maxResults: 5
}
3. Backend → InputSanitizer
- Trim whitespace
- Check length limits
- Detect prompt injection patterns
- Remove control characters
4. Backend → CatalogProvider (TMDB)
- Fetch top 100 popular movies
- Normalize to internal Movie format
- Extract MovieReference[] (id + title)
5. Backend → LLMOrchestrator
a. PromptBuilder creates system + user prompts
- System prompt constrains to JSON + catalog only
- User prompt includes query + full catalog list
b. OpenAIAdapter sends to OpenAI API
c. ResponseValidator:
- Parse JSON from LLM output
- Validate against Zod schema
- Cross-check movie IDs against catalog
- Detect hallucinations
d. If invalid → retry once → fail gracefully
6. Backend → Enrichment
- Fetch full movie details for recommended IDs
- Include poster paths, genres, ratings
7. Response → Extension
{
success: true,
data: {
recommendations: [...],
enrichedMovies: [...],
provider: "openai",
model: "gpt-4o-mini",
processingTimeMs: 2340,
tokensUsed: 1847
}
}
8. Extension renders MovieCards with animated entrance
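Step 3 above (InputSanitizer) reduces to a pure function. A minimal sketch — the length limit and injection patterns below are illustrative examples, not the rules StreamMind actually ships:

```typescript
const MAX_QUERY_LENGTH = 500; // assumed limit, not the real config value

// Illustrative prompt-injection heuristics — a real deny-list would be broader.
const INJECTION_PATTERNS = [
  /ignore (all |previous |prior )*instructions/i,
  /you are now/i,
  /system prompt/i,
];

type SanitizeResult = { ok: true; query: string } | { ok: false; reason: string };

export function sanitizeQuery(raw: string): SanitizeResult {
  // 1. Trim whitespace and strip control characters.
  const query = raw.trim().replace(/[\u0000-\u001f\u007f]/g, "");
  // 2. Enforce length limits.
  if (query.length === 0) return { ok: false, reason: "empty query" };
  if (query.length > MAX_QUERY_LENGTH) return { ok: false, reason: "query too long" };
  // 3. Reject obvious injection attempts before the query reaches the prompt.
  if (INJECTION_PATTERNS.some((p) => p.test(query))) {
    return { ok: false, reason: "possible prompt injection" };
  }
  return { ok: true, query };
}
```

Returning a discriminated union instead of throwing keeps the route handler's happy path and rejection path explicit.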
Decision: Use a pnpm + Turborepo monorepo with strict package boundaries.
Why:
- Each package can be tested and built independently.
- Clear dependency graph: shared-types feeds catalog-core and llm-adapter, which feed recommendation-core, which feeds the api.
- Easy to refactor into separate services later (SaaS migration path).
- Single repo for development velocity.
Trade-off: Slightly more complex initial setup vs. long-term maintainability.
Decision: Users provide their own LLM API key. No server-side key storage.
Why:
- Zero infrastructure cost for AI inference.
- User controls their spending.
- No key management liability.
- Open-source friendly — anyone can run it.
Trade-off: Requires user to have an API key. Mitigated by clear setup UX.
Decision: Define an LLMProvider interface. All providers implement it.
Why:
- Adding new providers = adding one adapter class.
- Orchestrator logic is provider-agnostic.
- Easy to test with mock providers.
- Follows SOLID's Dependency Inversion Principle: the orchestrator depends on the abstraction, not on any concrete provider.
Trade-off: Slightly more abstraction overhead vs. direct API calls.
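Under this decision, the provider abstraction might look like the sketch below. Method and type names are illustrative, not the actual package API:

```typescript
// A hypothetical shape for the LLMProvider interface described above.
export interface LLMRequest {
  systemPrompt: string;
  userPrompt: string;
  model: string;
  apiKey: string; // forwarded per-request, never retained by the adapter
}

export interface LLMProvider {
  readonly name: string;
  // Returns the raw model output (expected to be JSON text).
  complete(request: LLMRequest): Promise<string>;
}

// A mock provider makes the orchestrator trivially testable without network calls.
export class MockProvider implements LLMProvider {
  readonly name = "mock";
  constructor(private readonly cannedResponse: string) {}
  async complete(_request: LLMRequest): Promise<string> {
    return this.cannedResponse;
  }
}
```

Adding Anthropic or Google support is then one new class implementing `complete`; the orchestrator never changes.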
Decision: Every LLM response is validated with Zod AND cross-checked against the catalog.
Why:
- LLMs hallucinate. Period.
- A movie recommendation that doesn't exist is worse than no recommendation.
- This is the #1 differentiator from "just call ChatGPT" projects.
- Demonstrates responsible AI engineering.
Trade-off: Extra processing time + potential retry. Worth it for correctness.
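The catalog cross-check — the hallucination guard — reduces to set membership. A minimal sketch with illustrative names (Zod handles the schema half in the real package; this shows only the cross-check):

```typescript
interface Recommendation {
  movieId: number;
  reason: string;
}

export interface CrossCheckResult {
  valid: Recommendation[];
  hallucinated: Recommendation[]; // IDs the LLM invented — dropped, never shown
}

export function crossCheckCatalog(
  recommendations: Recommendation[],
  catalogIds: ReadonlySet<number>,
): CrossCheckResult {
  const valid: Recommendation[] = [];
  const hallucinated: Recommendation[] = [];
  for (const rec of recommendations) {
    // Every recommended ID must exist in the catalog the LLM was shown.
    (catalogIds.has(rec.movieId) ? valid : hallucinated).push(rec);
  }
  return { valid, hallucinated };
}
```

If `hallucinated` is non-empty the orchestrator can retry once with the invalid IDs called out, then fail gracefully rather than serve an invented title.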
Decision: Chrome Extension (Manifest V3) as the primary interface.
Why:
- Meets streaming users where they are (in the browser).
- Manifest V3 is the current Chrome extension standard.
- Service workers provide lifecycle management.
- chrome.storage.local keeps API keys sandboxed per extension, inaccessible to web pages.
Trade-off: Platform-specific (Chrome initially). Future: Firefox, web app.
| Threat | Mitigation |
|---|---|
| API key theft | Keys stored in chrome.storage.local (sandboxed per extension, inaccessible to web pages). Forwarded per-request over HTTPS, never stored server-side. |
| Prompt injection | InputSanitizer detects injection patterns. Combined with output validation. |
| LLM hallucination | ResponseValidator cross-checks every movie ID against TMDB catalog. |
| API abuse | Rate limiting (configurable per-route). |
| XSS via LLM output | All LLM output is parsed as JSON, never rendered as HTML. |
| Man-in-the-middle | HTTPS enforced for all external API calls. |
1. User enters key in extension popup
2. Key saved to chrome.storage.local (sandboxed per extension; isolated from web pages, though not encrypted at rest)
3. On each request: key read → sent to backend → forwarded to LLM → discarded
4. Backend NEVER stores, logs, or caches the key
5. User can revoke at any time via the extension settings
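The backend side of step 3 can be sketched as pure functions: the key exists only as a request-scoped value that becomes an outbound Authorization header, and anything that might reach a logger sees only a redacted form. Function names are illustrative; the Bearer scheme matches OpenAI's actual auth:

```typescript
// The API key lives only in the request scope: it is turned into an
// outbound header and never assigned to anything longer-lived.
export function buildProviderHeaders(apiKey: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiKey}`, // OpenAI-style bearer auth
    "Content-Type": "application/json",
  };
}

// Redact before anything touches a logger — only a short prefix survives.
export function redactKey(apiKey: string): string {
  return apiKey.length > 7 ? `${apiKey.slice(0, 3)}…` : "…";
}
```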
Phase 1 — Current:
Extension → Fastify API → TMDB + OpenAI

Phase 2 — Add a cache layer:
Extension → Fastify API → Redis Cache → TMDB
                                      → OpenAI

Phase 3 — SaaS migration:
Extension/Web → API Gateway → Recommendation Service
                            → User Service
                            → Billing Service
                            → Vector DB (embeddings)
                            → Redis (cache + rate limit)
                            → PostgreSQL (user data)
The monorepo structure enables this migration path: each package can become an independent deployable service.
| Component | Technology | Why |
|---|---|---|
| Monorepo | pnpm + Turborepo | Fast, efficient, great workspace support |
| Backend | Fastify | Consistently faster than Express in benchmarks, built-in schema validation, TypeScript-first |
| Types | Zod | Runtime validation + TypeScript inference in one |
| Extension | React + Vite + Manifest V3 | Modern DX, fast builds, current Chrome standard |
| Styling | Tailwind + shadcn pattern | Utility-first, composable, dark mode native |
| LLM | OpenAI (initial) | Best JSON mode support, most reliable structured output |
| Catalog | TMDB | Free tier, comprehensive data, well-documented API |
- Vector Database (Pinecone/Weaviate): For semantic movie matching beyond keyword search.
- User Taste Graph: Track preferences over time for personalized recommendations.
- Hybrid Engine: Combine collaborative filtering + LLM reasoning.
- Multi-Region: Deploy API closer to users with edge functions.
- A/B Testing: Compare LLM provider quality for recommendations.
Last updated: February 2026