feat(voice): add CLI args for extra player args, TTS provider, and port #323
Description
Overview
Adds configuration options for passing extra arguments to audio players, overriding the TTS provider, and setting the server port via CLI flags. This enables voice notifications to work inside devcontainers and other containerized environments where audio requires special configuration (e.g., PulseAudio socket mounting).
Problem
The voice server previously:
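- Relied on hardcoded player paths (`/usr/bin/mpg123`, `/snap/bin/mpv`) that don't work when players are installed elsewhere
- Had no way to pass extra player arguments (e.g., `-o pulse` in containers)
- Ran `which` on every audio playback instead of caching the result
Solution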
New CLI Flags
- `--extra-args=<args>`
- `--tts-provider=<provider>` (short form: `-t`)
- `--port=<port>` (short form: `-p`)
New Environment Variable
- `VOICE_SERVER_EXTRA_ARGS`
Dynamic Player Detection (Cached)
Replaced hardcoded paths with `which`-based detection, cached at startup:
- `detectPlayer()` runs once at startup and stores its result in `DETECTED_PLAYER`
- `findPlayer('mpg123')` / `findPlayer('mpv')` find the player regardless of install location
- No `which` calls during audio playback
Changes
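The cached detection described above might look roughly like the sketch below. It is an illustration, not the code from this PR: the probe order (mpg123 before mpv), the `DetectedPlayer` shape, the log wording, and the injectable `find` parameter (added only so the logic can be exercised without real binaries) are all assumptions.

```typescript
import { execSync } from "node:child_process";

type PlayerName = "mpg123" | "mpv";

// Assumed shape of the cached result: which player was found and where.
interface DetectedPlayer {
  type: PlayerName;
  path: string;
}

// Resolve a player binary on PATH via `which`; null when it is not installed.
function findPlayer(name: PlayerName): string | null {
  try {
    const out = execSync(`which ${name}`, {
      encoding: "utf8",
      stdio: ["ignore", "pipe", "ignore"], // keep `which` errors out of the server log
    });
    const path = out.trim();
    return path.length > 0 ? path : null;
  } catch {
    return null; // `which` exits non-zero when nothing is found
  }
}

// Probe each supported player once and return the first hit.
// The `find` parameter is hypothetical; it defaults to the real lookup.
function detectPlayer(
  find: (name: PlayerName) => string | null = findPlayer,
): DetectedPlayer | null {
  for (const type of ["mpg123", "mpv"] as const) {
    const path = find(type);
    if (path) return { type, path };
  }
  return null;
}

// Evaluated once at module load, so playback never has to shell out to `which`.
const DETECTED_PLAYER = detectPlayer();

console.log(
  DETECTED_PLAYER
    ? `Detected player: ${DETECTED_PLAYER.type} (${DETECTED_PLAYER.path})`
    : "No supported audio player found",
);
```

Caching in a module-level constant trades freshness for speed: a player installed after startup is only picked up on restart.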
Packs/kai-voice-system/src/voice/server.ts
Imports & CLI Parsing:
- `parseArgs` import from the Node.js `util` module
- `execSync` import for running `which` commands
- CLI flags: `--extra-args`, `--tts-provider`/`-t`, `--port`/`-p`
- Port resolution: `cliArgs.values.port || process.env.PAI_VOICE_PORT || "8888"`
Player Detection:
- `findPlayer(name)` function for dynamic player detection via `which`
- `detectPlayer()` function that runs once at startup
- `DETECTED_PLAYER` constant caching the detected player path and type
- `playAudio()` updated to use the cached `DETECTED_PLAYER` instead of calling `findPlayer()` repeatedly
Configuration Functions:
- `getExtraArgs()` function to resolve CLI/env args with precedence
- `TTS_PROVIDER` updated to accept a CLI override via `--tts-provider`/`-t`
Logging:
Packs/kai-voice-system/README.md
Usage Examples
Environment Variable:
CLI Flags:
Devcontainer Setup:
{
  "mounts": [
    "source=/run/user/1000/pulse,target=/run/user/1000/pulse,type=bind"
  ],
  "containerEnv": {
    "PULSE_SERVER": "unix:/run/user/1000/pulse/native"
  }
}
Common Use Cases
- `VOICE_SERVER_EXTRA_ARGS="-o pulse"`
- `VOICE_SERVER_EXTRA_ARGS="-o alsa -a hw:1,0"`
- `--tts-provider=google` or `-t google`
- `--port=9000` or `-p 9000`
Logging Output
Startup: Shows detected player and configured extra args
Runtime: Logs full player command when extra args are used
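As a rough sketch of the runtime behaviour, the snippet below resolves extra args with CLI-over-env precedence and logs the full command only when extra args are present. The parameterized `getExtraArgs` signature, the `splitArgs` and `buildPlayerCommand` helpers, the placement of extra args before the audio file, the sample paths, and the log wording are assumptions made for illustration; the PR's actual code may differ.

```typescript
// Split a flat string such as "-o alsa -a hw:1,0" into argv entries.
// Simplification: whitespace splitting only, no shell-style quote handling.
function splitArgs(raw: string | undefined): string[] {
  return raw ? raw.trim().split(/\s+/).filter(Boolean) : [];
}

// The CLI value wins; the environment variable is the fallback.
// An empty CLI value falls through to the environment, mirroring the `||` chain used for the port.
function getExtraArgs(
  cliValue: string | undefined,
  envValue: string | undefined,
): string[] {
  return splitArgs(cliValue || envValue);
}

// Assemble the argv for the player process: binary, extra args, then the file.
function buildPlayerCommand(
  playerPath: string,
  extraArgs: string[],
  audioFile: string,
): string[] {
  return [playerPath, ...extraArgs, audioFile];
}

// Example: no --extra-args flag, VOICE_SERVER_EXTRA_ARGS="-o pulse" in the environment.
const resolvedArgs = getExtraArgs(undefined, "-o pulse");
const playerCommand = buildPlayerCommand("/usr/bin/mpg123", resolvedArgs, "/tmp/voice.mp3");

// Only log the full command when extra args are in play.
if (resolvedArgs.length > 0) {
  console.log(`Player command: ${playerCommand.join(" ")}`);
  // prints "Player command: /usr/bin/mpg123 -o pulse /tmp/voice.mp3"
}
```

Passing the command as an argv array rather than a single shell string avoids quoting problems if it is later handed to a process spawner.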
Test Plan
- `VOICE_SERVER_EXTRA_ARGS="-o pulse"` - verify startup log shows extra args
- `--extra-args="-o alsa"` - verify CLI overrides env var
- `--tts-provider=google` - verify TTS provider switches
- `-t elevenlabs` - verify short flag works
- `--port=9000` - verify server starts on port 9000
- `-p 9000` - verify short flag works for port
- `findPlayer()` locates mpg123/mpv correctly on Linux
Breaking Changes
None. All changes are additive and backward compatible.
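To illustrate why existing setups should be unaffected, here is a minimal sketch of flag parsing with Node's `parseArgs`, using the port fallback chain quoted in the change list. The `resolveConfig` wrapper, its `env` parameter, and the `ttsProviderOverride` field name are hypothetical and exist only to make the behaviour easy to check; strict parsing (the `parseArgs` default) is also an assumption.

```typescript
import { parseArgs } from "node:util";

// Hypothetical wrapper: parse the new flags and resolve the effective settings.
function resolveConfig(
  argv: string[],
  env: Record<string, string | undefined>,
) {
  const cliArgs = parseArgs({
    args: argv,
    options: {
      "extra-args": { type: "string" },
      "tts-provider": { type: "string", short: "t" },
      port: { type: "string", short: "p" },
    },
  });

  return {
    // Same chain as in the PR: flag, then PAI_VOICE_PORT, then the old default.
    port: cliArgs.values.port || env.PAI_VOICE_PORT || "8888",
    // Undefined when the flag is absent, so provider selection stays as before.
    ttsProviderOverride: cliArgs.values["tts-provider"],
  };
}

// With no flags, the result depends only on the environment and the default.
console.log(resolveConfig([], {}));
// With flags, the CLI values take precedence.
console.log(resolveConfig(["-p", "9000", "-t", "google"], { PAI_VOICE_PORT: "7000" }));
```

Because every flag is optional and each one only sits in front of an existing fallback, a server started without any of them resolves the same values it did before.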
Dependencies
This depends partially on #322, where `.env` loading was enhanced.
Generated with Claude Code