Profile-driven MCP server for Google Cloud Text-to-Speech: define one profile per app/client so each tool always speaks with the right voice and settings.
Exposes three tools to any MCP client:
- tts_speak: synthesize text to audio and auto-play it
- tts_doctor: run diagnostics on auth, profile, and playback
- tts_stop: stop any currently playing audio
Voice, language, model, and format are locked per profile — the LLM can only control text content, speaking rate, and pitch.
pip install tts-mcp
Or with uvx (no install needed):
uvx tts-mcp --help
Requirements:
- Python 3.11+
- A Google Cloud project with the Cloud Text-to-Speech API enabled
- Google offers a generous free tier: up to 4 million characters/month (roughly 84 hours of English speech at a normal pace) for Standard and WaveNet voices, and 1 million characters/month (roughly 21 hours) for Neural2, Polyglot, Chirp 3: HD, and Studio voices. That is more than enough for most individual use. See TTS pricing for details.
- Google Cloud CLI (gcloud) for authentication
- macOS uses afplay for playback by default (configurable via profile)
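The free-tier hour estimates above can be reproduced with a little arithmetic. The sketch below is illustrative only: the characters-per-minute rate is an assumption derived from the "4 million characters is roughly 84 hours" figure, and the helper names are hypothetical, not part of tts-mcp.

```python
# Assumption: speaking pace implied by "4 million characters ~ 84 hours",
# i.e. roughly 794 characters per minute. Real pace varies by voice and text.
CHARS_PER_MINUTE = 4_000_000 / (84 * 60)


def hours_of_speech(characters: int, chars_per_minute: float = CHARS_PER_MINUTE) -> float:
    """Estimate hours of synthesized audio for a given character count."""
    return characters / chars_per_minute / 60


def free_tier_remaining(used: int, quota: int) -> int:
    """Characters left in the monthly free tier (never negative)."""
    return max(quota - used, 0)


if __name__ == "__main__":
    print(f"4M characters: about {hours_of_speech(4_000_000):.0f} hours")
    print(f"1M characters: about {hours_of_speech(1_000_000):.0f} hours")
```

At the same assumed pace, the 1 million character tier works out to a quarter of the 4 million character tier, which matches the 21-hour figure.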
gcloud auth application-default login
gcloud auth application-default set-quota-project YOUR_PROJECT_ID
This stores credentials at ~/.config/gcloud/application_default_credentials.json, which the TTS client discovers automatically. No environment variables needed.
tts-mcp --init
${EDITOR:-vi} ~/.config/tts-mcp/profiles.json
This creates a starter config at ~/.config/tts-mcp/profiles.json with example profiles for every Google TTS voice tier. Edit it to pick your voice, format, and playback settings.
The server finds the profiles file automatically — no --profiles flag needed for the common case. The search order is:
1. --profiles flag or TTS_MCP_PROFILES_PATH env var (explicit override)
2. ~/.config/tts-mcp/profiles.json (XDG standard, created by tts-mcp --init)
After running tts-mcp --init, each client config only needs --profile to select which profile that client uses.
claude mcp add --transport stdio --scope user \
  speech -- tts-mcp --profile claude
Edit ~/.config/opencode/opencode.jsonc (the entry to add appears at the end of this document):
Edit ~/.codex/config.toml:
[mcp_servers.speech]
command = "tts-mcp"
args = ["--profile", "codex"]
Any client config can use uvx instead of installing globally:
{
  "command": "uvx",
  "args": ["--update", "tts-mcp", "--profile", "opencode"]
}
In any MCP-enabled client, prompt naturally:
- Summarize this and read it aloud.
- Stop talking.
Tool names may appear prefixed by the client (e.g. speech_tts_speak, speech_tts_stop).
The package installs four commands. Each supports --help for full details.
For normal usage, you only need tts-mcp --init plus your MCP client setup above; the commands below are mostly for diagnostics or manual testing.
tts-mcp --init # create starter config at ~/.config/tts-mcp/profiles.json
tts-mcp --init --force # overwrite existing config
tts-mcp --doctor # diagnostics: auth, profile, voice, player
tts-mcp --profile casual # start MCP server with a specific profile
Defaults:
- --profiles: TTS_MCP_PROFILES_PATH env var or "" (then auto-discovery runs)
- --profile: TTS_MCP_PROFILE_NAME env var or "" (then default_profile is used)
- --doctor, --init, --force: false
tts-speak --text "Hello world" --voice en-US-Chirp3-HD-Fenrir --format wav --out hello.wav
tts-speak --text-file notes.txt --voice en-US-Neural2-D --format mp3 --out notes.mp3
tts-speak --ssml --text "<speak>Hello <break time='500ms'/> world</speak>" --out ssml.wav
echo "Piped text" | tts-speak --voice en-US-Casual-K --out piped.ogg
Options: --text, --text-file, --ssml, --voice, --language, --model, --format (mp3/ogg/wav), --speaking-rate, --pitch, --out, --usage-log.
Defaults:
- --voice: ""
- --language: en-US
- --model: ""
- --format: mp3
- --speaking-rate: 1.0
- --pitch: 0.0
- --out: "" (auto-generates YYYYMMDD-HHMMSS-ms.ext in the current directory, local timezone)
- --usage-log: usage_log.csv
- input: if neither --text nor --text-file is provided, the CLI reads piped stdin or prompts for text
tts-voices # list en-US voices (default language)
tts-voices --language en-US # filter by language
tts-voices --language en-US --family Chirp3 # filter by family
tts-voices --limit 5 # limit results
Defaults:
- --language: en-US
- --family: "" (no family filter)
- --limit: 0 (no limit)
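The interaction between --family and --limit can be pictured with a small filter over voice names. This is only a sketch: `filter_voices` is a hypothetical function, and matching the family as a case-insensitive substring of the voice name is an assumption; the real command may filter on metadata returned by the API instead.

```python
# Voice names taken from the examples in this document.
SAMPLE_VOICES = [
    "en-US-Chirp3-HD-Fenrir",
    "en-US-Neural2-D",
    "en-US-Casual-K",
]


def filter_voices(names: list[str], family: str = "", limit: int = 0) -> list[str]:
    """Keep voices whose name contains `family`; limit 0 means no limit."""
    matches = [n for n in names if not family or family.lower() in n.lower()]
    return matches[:limit] if limit > 0 else matches


if __name__ == "__main__":
    print(filter_voices(SAMPLE_VOICES, family="Chirp3"))
    print(filter_voices(SAMPLE_VOICES, limit=2))
```

Treating an empty family and a zero limit as "no filter" mirrors the defaults listed above.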
tts-batch --text-file test.txt --out-dir ./samples
tts-batch --text-file test.txt --families Chirp3,Neural2 --language en-US --format wav
tts-batch --text-file test.txt --limit 3 # first 3 matching voices only
Defaults:
- --families: "" (no family filter)
- --language: en-US
- --format: mp3
- --out-dir: ./out
- --speaking-rate: 1.0
- --pitch: 0.0
- --limit: 0 (all matching voices)
- --text-file: required
Profiles are defined in a JSON file (see profiles.example.json):
{
  "default_profile": "opencode",
  "profiles": {
    "opencode": {
      "voice": "en-US-Chirp3-HD-Fenrir",
      "language": "en-US",
      "model": "models/chirp3-hd",
      "format": "wav",
      "speaking_rate": 1.0,
      "pitch": 0.0,
      "output_dir": "~/.local/share/tts-mcp/out",
      "usage_log": "~/.local/share/tts-mcp/usage_log.csv",
      "autoplay": true,
      "player_command": ["afplay", "{file}"]
    }
  }
}
Each profile locks: voice, language, model, format, output_dir, usage_log, autoplay, and player_command. Only speaking_rate and pitch can be overridden per tool call.
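One way to picture the locking rule is a merge step that accepts only the two overridable fields. The sketch below is illustrative and not the server's implementation: `select_profile`, `apply_overrides`, and `LLM_OVERRIDABLE` are hypothetical names, and raising an error on a locked field (rather than silently ignoring it) is an assumption.

```python
# Fields the text above says a tool call may override.
LLM_OVERRIDABLE = {"speaking_rate", "pitch"}


def select_profile(config: dict, name: str = "") -> dict:
    """Pick the named profile, falling back to default_profile when name is empty."""
    chosen = name or config["default_profile"]
    try:
        return dict(config["profiles"][chosen])
    except KeyError:
        raise ValueError(f"unknown profile: {chosen!r}") from None


def apply_overrides(profile: dict, **overrides) -> dict:
    """Return a copy of the profile with per-call speaking_rate/pitch applied."""
    locked = set(overrides) - LLM_OVERRIDABLE
    if locked:
        # Assumption: attempts to change locked fields are rejected outright.
        raise ValueError(f"locked by profile: {sorted(locked)}")
    merged = dict(profile)
    merged.update({k: v for k, v in overrides.items() if v is not None})
    return merged


if __name__ == "__main__":
    config = {
        "default_profile": "opencode",
        "profiles": {"opencode": {"voice": "en-US-Chirp3-HD-Fenrir", "speaking_rate": 1.0, "pitch": 0.0}},
    }
    print(apply_overrides(select_profile(config), speaking_rate=1.2))
```

Working on a copy keeps the loaded profile unchanged between tool calls, so one call's overrides never leak into the next.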
- Auth errors: run gcloud auth application-default login, or confirm GOOGLE_APPLICATION_CREDENTIALS is set.
- No audio: verify the player binary (e.g. afplay) exists, or change player_command in your profile.
- Tool timeout: playback is non-blocking, but if timeouts persist, increase the client's tool_timeout.
- Run diagnostics: tts-mcp --doctor checks auth, profile, voice, and player.
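To make the "No audio" and "non-blocking" points concrete, here is a minimal sketch of how a player_command such as ["afplay", "{file}"] might be expanded and launched without waiting for playback to finish. The function names are hypothetical and the behavior is an assumption about a reasonable design, not the package's actual code.

```python
import shutil
import subprocess


def build_player_argv(player_command: list[str], audio_path: str) -> list[str]:
    """Substitute the {file} placeholder in every argument."""
    return [part.replace("{file}", audio_path) for part in player_command]


def play_async(player_command: list[str], audio_path: str) -> subprocess.Popen:
    """Start the player and return immediately (playback continues in the background)."""
    argv = build_player_argv(player_command, audio_path)
    if not argv:
        raise ValueError("player_command is empty")
    # Same check a diagnostic could run: is the player binary on PATH?
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"player binary not found: {argv[0]}")
    return subprocess.Popen(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


if __name__ == "__main__":
    print(build_player_argv(["afplay", "{file}"], "/tmp/hello.wav"))
```

Because `Popen` returns a process handle right away, a caller could later stop playback with `proc.terminate()`, which is one plausible way to support a stop operation.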
git clone git@github.com:that-lucas/tts-mcp.git
cd tts-mcp
make setup # creates venv, installs package + dev deps, sets git hooks
make test # run pytest
make lint # run ruff check + format check
See CONTRIBUTING.md for details.
License: MIT
OpenCode entry for ~/.config/opencode/opencode.jsonc:
{
  "mcp": {
    "speech": {
      "type": "local",
      "command": ["tts-mcp", "--profile", "opencode"],
      "enabled": true,
      "timeout": 120000
    }
  }
}