Add CLI binary interface with stdin pipeline support for Kitten TTS #78

andkirby · 2025-11-08T16:18:24Z

Summary

Added complete command-line interface for Kitten TTS
Implemented stdin pipeline support for text input
Added executable kitten-tts binary script
Created comprehensive CLI with all major features
No changes in core files

Key Features Added

Command-line interface with comprehensive argument parsing
Stdin pipeline support - can read text from pipes (e.g., echo "text" | kitten-tts)
Executable binary script for direct usage
Audio fade-out with customizable duration (default: 0.2s)
Text preprocessing - automatic dots suffix to prevent abrupt audio cutoff
Multi-format audio output (WAV, FLAC, OGG support)
Voice selection for all 8 available voices (expr-voice-2/m/f through expr-voice-5/m/f)
Speech speed control with float values
System audio playback across platforms (macOS, Linux, Windows)
Comprehensive help documentation with usage examples
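
The fade-out and dots-suffix features listed above can be pictured with a minimal sketch. It is not the code from kittentts/cli.py: the function names, the linear ramp shape, and the use of plain Python lists (the real code more likely works on NumPy arrays) are assumptions for illustration. Only the 0.2s default comes from the description.

```python
from __future__ import annotations

DOTS_SUFFIX = "..."  # assumed suffix; the PR only says "dots suffix"


def add_trailing_dots(text: str) -> str:
    """Make the text end in a dots suffix so synthesis does not stop abruptly."""
    base = text.rstrip().rstrip(".")
    if not base:
        return text.strip()
    return base + DOTS_SUFFIX


def apply_fade_out(samples: list[float], sample_rate: int,
                   duration: float = 0.2) -> list[float]:
    """Return a copy of samples whose last `duration` seconds ramp to silence."""
    fade_len = min(len(samples), int(sample_rate * duration))
    faded = list(samples)
    if fade_len <= 0:
        return faded
    start = len(faded) - fade_len
    for i in range(fade_len):
        # Gain falls linearly and reaches 0.0 on the final sample.
        faded[start + i] *= 1.0 - (i + 1) / fade_len
    return faded
```

A linear ramp is the simplest choice; a cosine or exponential curve would be an equally plausible design.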

Files Changed

kitten-tts - New executable wrapper script
kittentts/cli.py - Complete CLI implementation with pipeline support. On installation, the binary is placed at the path venv/kitten-tts

Usage Examples

# Basic usage with argument
kitten-tts "Hello world"

# Pipeline usage (stdin)
echo "Hello world" | ./kitten-tts
cat file.txt | kitten-tts --output audio.wav

# With specific voice and fade-out
kitten-tts "Hello world" --voice expr-voice-2-f --fade-out 0.3

# Save to file with custom speed
kitten-tts "Hello world" --output hello.wav --speed 1.2

# List available voices
kitten-tts --list-voices
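
For readers who want to see how the flags in these examples fit together, here is a rough argparse sketch. The flag names are taken from the examples above. The voice names are inferred from "expr-voice-2-f" and the stated count of 8, and the default voice and default speed are assumptions. The actual parser in kittentts/cli.py may differ.

```python
import argparse

# Assumed naming scheme: expr-voice-2-m, expr-voice-2-f, ..., expr-voice-5-f
VOICES = [f"expr-voice-{n}-{g}" for n in range(2, 6) for g in ("m", "f")]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the options shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="kitten-tts",
        description="Text-to-speech from the command line (illustrative sketch).",
    )
    # Positional text is optional so that input can come from stdin instead.
    parser.add_argument("text", nargs="?", help="text to speak; omit to read stdin")
    parser.add_argument("--voice", choices=VOICES, default=VOICES[0],
                        help="voice name (default here is an assumption)")
    parser.add_argument("--output", help="write audio to this file instead of playing it")
    parser.add_argument("--speed", type=float, default=1.0,
                        help="speech speed multiplier (default here is an assumption)")
    parser.add_argument("--fade-out", type=float, default=0.2,
                        help="fade-out duration in seconds")
    parser.add_argument("--list-voices", action="store_true",
                        help="print the available voices and exit")
    return parser
```

With this sketch, `build_parser().parse_args(["Hello world", "--voice", "expr-voice-2-f", "--fade-out", "0.3"])` yields a namespace whose `fade_out` attribute is the float 0.3.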

- Add executable kitten-tts wrapper script
- Add kittentts/cli.py with full command-line interface
- Configure console script entry point in pyproject.toml
- Implement audio fade-out with customizable duration (default: 0.2s)
- Add automatic dots suffix to prevent audio cutoff
- Support all available voices, speed control, and audio formats
- Add joblib dependency for proper package installation
- Include comprehensive help documentation and examples

Features:
- Text-to-speech synthesis via command line
- Multiple voice options (expr-voice-2/m/f through expr-voice-5/m/f)
- Adjustable speech speed and fade-out duration
- Audio file output (WAV, FLAC, OGG) or direct playback
- Automatic text preprocessing to prevent abrupt cutoffs

- Implemented pipeline/stdin reading functionality
- Added support for piping text to kitten-tts command
- Updated help documentation with pipeline usage examples
- Enhanced error handling for stdin operations
- Maintained backward compatibility with argument-based input

Usage examples:
echo "hello world" | ./kitten-tts
cat text_file.txt | ./kitten-tts --output audio.wav
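
One common way to support both argument and piped input, as this commit describes, is sketched below. The function name `resolve_text` and the error messages are hypothetical. The sketch only shows the general pattern of preferring the argument and falling back to stdin when it is not a terminal.

```python
import sys
from typing import Optional, TextIO


def resolve_text(arg_text: Optional[str], stdin: TextIO = sys.stdin) -> str:
    """Return the text to speak, taken from the argument or from piped stdin."""
    if arg_text:
        # Argument-based input keeps working exactly as before.
        return arg_text.strip()
    if stdin.isatty():
        # Nothing was piped in, so there is no input to read.
        raise ValueError("no text given: pass an argument or pipe text on stdin")
    text = stdin.read().strip()
    if not text:
        raise ValueError("stdin was empty")
    return text
```

Checking `isatty()` first avoids blocking on an interactive terminal when the user simply forgot to pass any text.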

- Added comprehensive CLI usage section
- Documented installation and setup steps for CLI
- Listed all CLI features and available voices
- Added examples for both argument and stdin/pipeline usage
- Organized Python API and CLI sections separately
- Updated features list to highlight CLI functionality

- Organized CLI documentation in a collapsible details section
- Added structured subsections (Installation, Basic Usage, Advanced Options)
- Improved readability with better organization
- Maintained all CLI features and examples
- Made README more concise while preserving comprehensive information

Major improvements:

🚀 CLI Performance:
- Implement lazy imports for instant help display (0.04s vs 2.2s)
- Add optimized entry point that only loads heavy dependencies when needed
- Refactor CLI into separate entry and processing modules

🎵 Audio System Enhancements:
- Add direct audio streaming with sounddevice library
- Implement fallback to system temp directory for temp files
- Fix permission issues when running from root directory
- Add proper temp file cleanup and error handling

📦 Package Structure:
- Update pyproject.toml to use optimized entry point
- Make package imports lazy to improve startup performance
- Add sounddevice as optional streaming dependency

💡 User Experience:
- Help commands now appear instantly
- Audio works from any directory including root
- Graceful fallback when sounddevice unavailable
- Maintains full CLI functionality with all existing features
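
The lazy-import and graceful-fallback ideas in this commit can be illustrated with a small sketch. The helper names `optional_import` and `choose_playback_backend` and the backend labels are hypothetical and do not come from the PR. The point is only that a heavy or optional module is imported inside a function, at the moment it is needed, rather than at module load time.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module on demand; return None if it cannot be loaded."""
    try:
        return importlib.import_module(name)
    except (ImportError, OSError):
        # OSError covers packages whose native library is missing at import time.
        return None


def choose_playback_backend() -> str:
    """Pick direct streaming when sounddevice loads, else a temp-file player."""
    if optional_import("sounddevice") is not None:
        return "sounddevice"
    return "system-player"
```

Because nothing heavy is imported at the top of the module, a code path such as printing help text never pays the import cost, which is the effect the 0.04s vs 2.2s figure describes.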

iamgroot42 and others added 13 commits August 5, 2025 14:11

Typing, minor README edits, gitignore

ae44a48

Minor edits

5f9fe40

Trim generated audio based on edge silence

3883bdf

Remove duplicate packaging files; use Hatchling as packaging backend

8e72130

Remove unnecessary misaki dependency

03853c7

* Remove the `misaki` dependency, but directly depend on `phonemizer-fork` instead.
* Do the side-effect phonemizer initialization call by hand

syntax highlighting

0d7d96e

Simplify CLI section title

ece72ef

- Changed 'Click to expand CLI usage instructions' to 'CLI Usage Instructions'
- More concise and cleaner collapsible section header

Update checklist in README for CLI support

6b76cde

andkirby marked this pull request as draft November 8, 2025 21:40

andkirby marked this pull request as ready for review November 8, 2025 21:42

andkirby mentioned this pull request Nov 8, 2025

Synthesized speech tends to swallow sounds at the end (合成语音末尾容易吞音) #72

Open

andkirby added 6 commits November 8, 2025 22:57

Resolve merge conflicts in README.md

271b7e9

- Combined formatting improvements from both branches
- Kept comprehensive CLI documentation
- Maintained proper spacing and structure

Merge branch 'patch-1'

be0849e

Merge fix-pkg branch and resolve conflicts

7e41f05

- Keep CLI script entry point from main
- Use Hatchling version configuration from fix-pkg
- Remove requirements.txt in favor of pyproject.toml dependencies

Merge branch 'fix-trim'

d7d97ec

Merge remove-unnecessary-dep branch and resolve conflicts

6c2e403

- Remove misaki and huggingface_hub dependencies
- Add phonemizer-fork dependency
- Clean up duplicate packaging files

Add old_trim parameter to generate method for backward compatibility

fba1326

4 participants
