The Audio Analysis MCP Server is a FastMCP application that exposes audio analysis capabilities as Model Context Protocol (MCP) tools over Streamable HTTP. It wraps established ML libraries (Whisper, pyannote.audio, librosa, Hugging Face Transformers) behind a stateless API.

    ┌─────────────────────────────────────────────────┐
    │               MCP Client (Claude)               │
    └──────────────────────┬──────────────────────────┘
                           │ Streamable HTTP
    ┌──────────────────────▼──────────────────────────┐
    │           FastMCP Server (server.py)            │
    │          HTTP /health · /mcp endpoint           │
    ├─────────────────────────────────────────────────┤
    │                   Tools Layer                   │
    │   ┌────────────┐ ┌───────────┐ ┌────────────┐   │
    │   │orchestrat. │ │transcribe │ │  diarize   │   │
    │   │(pipeline)  │ │           │ │            │   │
    │   ├────────────┤ ├───────────┤ ├────────────┤   │
    │   │  prosody   │ │ patterns  │ │ sentiment  │   │
    │   └────────────┘ └───────────┘ └────────────┘   │
    ├─────────────────────────────────────────────────┤
    │                Processors Layer                 │
    │   WhisperProc · DiarizationProc · ProsodyProc   │
    │          PatternsProc · SentimentProc           │
    ├─────────────────────────────────────────────────┤
    │          Model Management (loader.py)           │
    │    GPU detection · VRAM mode · Model caching    │
    └─────────────────────────────────────────────────┘

- Request: The MCP client sends a tool call containing an audio file path.
- Validation: File existence, format, and duration are validated.
- Routing: A single-tool call goes directly to its processor; full_analysis goes through the orchestrator pipeline.
- Inference: Models are loaded on demand (GPU if available, CPU fallback otherwise).
- Response: Structured Pydantic models are serialized to JSON.
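As a rough illustration of the flow above, the following sketch validates a path, routes the call either to one stub processor or to every processor for full_analysis, and serializes the result to JSON. It uses only the standard library: the dataclass stands in for the Pydantic response models, the processors are stubs rather than ML inference, and every name here (handle_tool_call, SUPPORTED_FORMATS, the format list itself) is hypothetical. Duration validation is left out because it requires decoding the audio.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path

# Assumed set of accepted extensions; the real server may differ.
SUPPORTED_FORMATS = {".wav", ".mp3", ".flac"}


@dataclass
class ToolResult:
    """Stand-in for a Pydantic response model."""
    tool: str
    file: str
    data: dict


def validate_audio(path: str) -> Path:
    """Check that the file exists and has a supported extension."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"audio file not found: {path}")
    if p.suffix.lower() not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {p.suffix}")
    return p


# Stub processors: placeholders for the ML-backed processors.
def transcribe(audio: Path) -> dict:
    return {"text": "<transcript placeholder>"}


def prosody(audio: Path) -> dict:
    return {"pitch_hz": None}


PROCESSORS = {"transcribe": transcribe, "prosody": prosody}


def handle_tool_call(tool: str, path: str) -> str:
    """Validate, route, and serialize one tool call."""
    audio = validate_audio(path)
    if tool == "full_analysis":
        # Orchestrator path: run every processor and merge the results.
        data = {name: proc(audio) for name, proc in PROCESSORS.items()}
    elif tool in PROCESSORS:
        # Single-tool path: call the matching processor directly.
        data = PROCESSORS[tool](audio)
    else:
        raise KeyError(f"unknown tool: {tool}")
    return json.dumps(asdict(ToolResult(tool=tool, file=audio.name, data=data)))
```

Calling handle_tool_call("full_analysis", path) returns one JSON document with a key per processor, while a single-tool call returns only that processor's output.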
| Decision | Rationale |
|---|---|
| Stateless HTTP transport | Eliminates session affinity issues on server restart |
| Feature flags | Deploy with partial capabilities (e.g., transcription-only) |
| Processors ↔ Tools separation | Tools own MCP interface; processors own ML inference |
| Low-VRAM sequential loading | Enables deployment on 4–6 GB GPUs without OOM errors |
| Pydantic models throughout | Runtime validation + automatic JSON schema for MCP tool signatures |
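
The low-VRAM row can be pictured with a small cache that keeps at most one model resident at a time. This is a minimal sketch under stated assumptions, not the contents of loader.py: ModelCache, its get method, and the factory callables are hypothetical, and a real loader would also have to release GPU memory explicitly when evicting a model.

```python
from typing import Any, Callable, Dict, List


class ModelCache:
    """Caches loaded models; in low-VRAM mode at most one stays resident."""

    def __init__(self, low_vram: bool) -> None:
        self.low_vram = low_vram
        self._loaded: Dict[str, Any] = {}

    def get(self, name: str, factory: Callable[[], Any]) -> Any:
        """Return the cached model, loading it via factory on first use."""
        if name in self._loaded:
            return self._loaded[name]
        if self.low_vram:
            # Drop references to every other model before loading the next,
            # so only one model occupies memory at a time.
            self._loaded.clear()
        self._loaded[name] = factory()
        return self._loaded[name]

    @property
    def resident(self) -> List[str]:
        """Names of the models currently held in the cache."""
        return sorted(self._loaded)
```

With low_vram=True, requesting a second model evicts the first, trading reload time for a lower peak memory footprint; with low_vram=False, every model stays cached after its first load.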
All settings are environment-variable driven via pydantic-settings. See docs/CONFIGURATION.md for the complete reference.
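
To show the general shape of environment-driven settings with feature flags, here is a standard-library stand-in. It does not use pydantic-settings, and the variable names (AUDIO_MCP_ENABLE_TRANSCRIPTION and so on) and fields are invented for illustration; the actual names are in docs/CONFIGURATION.md.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUE_VALUES = {"1", "true", "yes", "on"}


def _flag(env: Mapping[str, str], key: str, default: bool) -> bool:
    """Read a boolean flag, falling back to default when the key is unset."""
    raw = env.get(key)
    return default if raw is None else raw.strip().lower() in _TRUE_VALUES


@dataclass(frozen=True)
class Settings:
    # Hypothetical flags; each one could gate registration of a tool.
    enable_transcription: bool = True
    enable_diarization: bool = True
    low_vram: bool = False

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "Settings":
        """Build settings from a mapping, defaulting to the process environment."""
        env = os.environ if env is None else env
        return cls(
            enable_transcription=_flag(env, "AUDIO_MCP_ENABLE_TRANSCRIPTION", True),
            enable_diarization=_flag(env, "AUDIO_MCP_ENABLE_DIARIZATION", True),
            low_vram=_flag(env, "AUDIO_MCP_LOW_VRAM", False),
        )
```

A transcription-only deployment would then amount to setting the diarization flag (and any similar flags) to false before starting the server.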