Architecture

Overview

The Audio Analysis MCP Server is a FastMCP application that exposes audio analysis capabilities as Model Context Protocol (MCP) tools over Streamable HTTP. It wraps widely used ML libraries (Whisper, pyannote.audio, librosa, Hugging Face Transformers) behind a stateless API, so no request depends on server-side session state.

Component Diagram

┌─────────────────────────────────────────────────┐
│              MCP Client (Claude)                 │
└──────────────────────┬──────────────────────────┘
                       │ Streamable HTTP
┌──────────────────────▼──────────────────────────┐
│            FastMCP Server (server.py)            │
│          HTTP /health  ·  /mcp endpoint          │
├─────────────────────────────────────────────────┤
│                  Tools Layer                     │
│  ┌────────────┐  ┌───────────┐  ┌────────────┐  │
│  │orchestrat. │  │transcribe │  │ diarize    │  │
│  │(pipeline)  │  │           │  │            │  │
│  ├────────────┤  ├───────────┤  ├────────────┤  │
│  │  prosody   │  │ patterns  │  │ sentiment  │  │
│  └────────────┘  └───────────┘  └────────────┘  │
├─────────────────────────────────────────────────┤
│               Processors Layer                   │
│   WhisperProc · DiarizationProc · ProsodyProc    │
│      PatternsProc · SentimentProc                │
├─────────────────────────────────────────────────┤
│           Model Management (loader.py)           │
│    GPU detection · VRAM mode · Model caching     │
└─────────────────────────────────────────────────┘
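
To make the split between the Tools and Processors layers concrete, here is a minimal stdlib-only sketch. Every name in it (`TranscriptResult`, `WhisperProcessor`, `transcribe_tool`) is hypothetical, the processor returns canned output instead of running Whisper, and a dataclass stands in for the Pydantic models the server uses.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TranscriptResult:
    """Stand-in for a Pydantic response model (hypothetical fields)."""
    text: str
    language: str


class WhisperProcessor:
    """Processors layer: owns ML inference and knows nothing about MCP."""

    def run(self, audio_path: str) -> TranscriptResult:
        # A real processor would run a Whisper model here; this stub returns
        # canned output so the sketch runs without any ML libraries.
        return TranscriptResult(text=f"<transcript of {audio_path}>", language="en")


def transcribe_tool(audio_path: str, processor: WhisperProcessor) -> str:
    """Tools layer: owns the MCP-facing interface and JSON serialization."""
    result = processor.run(audio_path)
    return json.dumps(asdict(result))
```

Because the tool only calls `processor.run`, the processor could be replaced by a test double or a different model without touching the MCP-facing code.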

Data Flow

1. Request: the MCP client sends a tool call containing an audio file path.
2. Validation: the server checks that the file exists and that its format and duration are acceptable.
3. Routing: a single-tool call goes directly to its processor; a full_analysis call goes through the orchestrator pipeline.
4. Inference: models are loaded on demand, on the GPU if one is available and on the CPU otherwise.
5. Response: structured Pydantic models are serialized to JSON.
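
The validation and routing steps above could look roughly like the following sketch. It is not the server's code: `SUPPORTED_SUFFIXES`, `validate_audio`, and `route` are hypothetical names, the list of formats is an assumption, the duration check is omitted, and processors are modeled as plain callables.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical allow-list; the server's real set of formats is not stated here.
SUPPORTED_SUFFIXES = {".wav", ".mp3", ".flac"}

Processor = Callable[[str], dict]


def validate_audio(path: str) -> Path:
    """Check existence and format; the duration check is left out of this sketch."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"audio file not found: {path}")
    if p.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported audio format: {p.suffix}")
    return p


def route(tool: str, path: str, processors: Dict[str, Processor]) -> dict:
    """Send a single-tool call to its processor, or run the whole pipeline."""
    audio = str(validate_audio(path))
    if tool == "full_analysis":
        # Orchestrator path: run every registered processor and merge results.
        return {name: proc(audio) for name, proc in processors.items()}
    if tool not in processors:
        raise KeyError(f"unknown tool: {tool}")
    return {tool: processors[tool](audio)}
```

Validating before routing means both the single-tool path and the pipeline path reject bad input in the same way.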

Key Design Decisions

| Decision | Rationale |
| --- | --- |
| Stateless HTTP transport | Eliminates session affinity issues on server restart |
| Feature flags | Deploy with partial capabilities (e.g., transcription-only) |
| Processors ↔ Tools separation | Tools own MCP interface; processors own ML inference |
| Low-VRAM sequential loading | Enables deployment on 4–6 GB GPUs without OOM errors |
| Pydantic models throughout | Runtime validation + automatic JSON schema for MCP tool signatures |
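
One way to picture low-VRAM sequential loading is a cache that keeps at most one model resident at a time. The `ModelCache` class below is a hypothetical illustration, not the contents of loader.py: loaders are plain callables, and actually freeing GPU memory is left as a comment because it depends on the ML framework in use.

```python
from typing import Callable, Dict, List


class ModelCache:
    """Hypothetical cache; in low-VRAM mode at most one model stays resident."""

    def __init__(self, loaders: Dict[str, Callable[[], object]], low_vram: bool):
        self._loaders = loaders
        self._low_vram = low_vram
        self._resident: Dict[str, object] = {}

    def get(self, name: str) -> object:
        if name in self._resident:
            return self._resident[name]
        if self._low_vram:
            # Drop references to other models before loading the next one.
            # A real server would also need to release GPU memory here.
            self._resident.clear()
        self._resident[name] = self._loaders[name]()
        return self._resident[name]

    def resident(self) -> List[str]:
        """Names of the models currently held in the cache."""
        return sorted(self._resident)
```

With `low_vram=False` the same class behaves as an ordinary cache and keeps every loaded model, trading memory for fewer reloads.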

Configuration

All settings are environment-variable driven via pydantic-settings. See docs/CONFIGURATION.md for the complete reference.
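
As a rough illustration of environment-driven settings and feature flags, the sketch below reads booleans from a mapping using only the standard library. It is a stand-in for pydantic-settings, not a use of it, and the variable names (`ENABLE_TRANSCRIPTION`, `ENABLE_DIARIZATION`, `LOW_VRAM`) are invented for the example; the real names are in docs/CONFIGURATION.md.

```python
import os
from dataclasses import dataclass
from typing import Mapping

_TRUE = {"1", "true", "yes", "on"}


def _flag(env: Mapping[str, str], key: str, default: bool) -> bool:
    """Parse a boolean variable, falling back to the default when unset."""
    raw = env.get(key)
    return default if raw is None else raw.strip().lower() in _TRUE


@dataclass(frozen=True)
class Settings:
    # Hypothetical fields and variable names, chosen for illustration only.
    enable_transcription: bool
    enable_diarization: bool
    low_vram: bool

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        return cls(
            enable_transcription=_flag(env, "ENABLE_TRANSCRIPTION", True),
            enable_diarization=_flag(env, "ENABLE_DIARIZATION", True),
            low_vram=_flag(env, "LOW_VRAM", False),
        )
```

Passing an explicit mapping, for example `Settings.from_env({"ENABLE_DIARIZATION": "false"})`, makes a transcription-only configuration easy to exercise in tests without touching the process environment.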
