Add CLI binary interface with stdin pipeline support for Kitten TTS #78
Open
andkirby wants to merge 19 commits into KittenML:main from andkirby:main
* Remove the `misaki` dependency, but directly depend on `phonemizer-fork` instead.
* Do the side-effect phonemizer initialization call by hand
- Add executable kitten-tts wrapper script
- Add kittentts/cli.py with full command-line interface
- Configure console script entry point in pyproject.toml
- Implement audio fade-out with customizable duration (default: 0.2s)
- Add automatic dots suffix to prevent audio cutoff
- Support all available voices, speed control, and audio formats
- Add joblib dependency for proper package installation
- Include comprehensive help documentation and examples

Features:
- Text-to-speech synthesis via command line
- Multiple voice options (expr-voice-2/m/f through expr-voice-5/m/f)
- Adjustable speech speed and fade-out duration
- Audio file output (WAV, FLAC, OGG) or direct playback
- Automatic text preprocessing to prevent abrupt cutoffs
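The fade-out mentioned in this commit could look roughly like the sketch below. It is an illustration only, not the code in kittentts/cli.py: the function name `apply_fade_out`, the linear ramp, and the plain-list sample format are all assumptions. Only the 0.2s default comes from the commit message.

```python
def apply_fade_out(samples, sample_rate, duration=0.2):
    """Return a copy of `samples` with a linear fade-out over the last `duration` seconds.

    Hypothetical helper: the name, signature, and linear ramp are assumptions.
    """
    out = list(samples)
    # Never fade more samples than the clip actually contains.
    n_fade = min(len(out), int(sample_rate * duration))
    if n_fade <= 0:
        return out
    start = len(out) - n_fade
    for i in range(n_fade):
        # Gain falls linearly and reaches exactly 0.0 on the final sample.
        gain = 1.0 - (i + 1) / n_fade
        out[start + i] *= gain
    return out


# Ten samples at 10 Hz with a 0.5 s fade: the last five samples are attenuated.
faded = apply_fade_out([1.0] * 10, sample_rate=10, duration=0.5)
print(faded)
```

Ending the ramp at zero is what avoids an audible click at the end of the clip. A real implementation would more likely operate on a NumPy array than on a Python list.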
- Implemented pipeline/stdin reading functionality
- Added support for piping text to kitten-tts command
- Updated help documentation with pipeline usage examples
- Enhanced error handling for stdin operations
- Maintained backward compatibility with argument-based input

Usage examples:
    echo "hello world" | ./kitten-tts
    cat text_file.txt | ./kitten-tts --output audio.wav
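One way to support both argument-based and piped input is sketched below. This is a hedged illustration, not the PR's implementation: the function name `resolve_text` and its error messages are hypothetical, and it assumes the argument takes precedence over stdin when both are present.

```python
import io
import sys


def resolve_text(arg_text, stdin=None):
    """Pick the text to synthesize: the CLI argument if given, otherwise piped stdin.

    Hypothetical helper; the precedence rule and messages are assumptions.
    """
    stdin = sys.stdin if stdin is None else stdin
    if arg_text:
        # Argument-based input keeps working exactly as before.
        return arg_text
    if stdin.isatty():
        # Nothing was piped in and no argument was given.
        raise ValueError("no text given: pass it as an argument or pipe it on stdin")
    text = stdin.read().strip()
    if not text:
        raise ValueError("stdin was empty")
    return text


# Simulate a pipe with an in-memory stream; StringIO.isatty() returns False.
print(resolve_text(None, stdin=io.StringIO("hello world\n")))
```

Checking `isatty()` before reading keeps the command from blocking on an interactive terminal when the user simply forgot to pass any text.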
- Added comprehensive CLI usage section
- Documented installation and setup steps for CLI
- Listed all CLI features and available voices
- Added examples for both argument and stdin/pipeline usage
- Organized Python API and CLI sections separately
- Updated features list to highlight CLI functionality
- Organized CLI documentation in a collapsible details section
- Added structured subsections (Installation, Basic Usage, Advanced Options)
- Improved readability with better organization
- Maintained all CLI features and examples
- Made README more concise while preserving comprehensive information
- Changed 'Click to expand CLI usage instructions' to 'CLI Usage Instructions'
- More concise and cleaner collapsible section header
Major improvements:

🚀 CLI Performance:
- Implement lazy imports for instant help display (0.04s vs 2.2s)
- Add optimized entry point that only loads heavy dependencies when needed
- Refactor CLI into separate entry and processing modules

🎵 Audio System Enhancements:
- Add direct audio streaming with sounddevice library
- Implement fallback to system temp directory for temp files
- Fix permission issues when running from root directory
- Add proper temp file cleanup and error handling

📦 Package Structure:
- Update pyproject.toml to use optimized entry point
- Make package imports lazy to improve startup performance
- Add sounddevice as optional streaming dependency

💡 User Experience:
- Help commands now appear instantly
- Audio works from any directory including root
- Graceful fallback when sounddevice unavailable
- Maintains full CLI functionality with all existing features
- Combined formatting improvements from both branches
- Kept comprehensive CLI documentation
- Maintained proper spacing and structure
- Keep CLI script entry point from main
- Use Hatchling version configuration from fix-pkg
- Remove requirements.txt in favor of pyproject.toml dependencies
- Remove misaki and huggingface_hub dependencies
- Add phonemizer-fork dependency
- Clean up duplicate packaging files
Summary
kitten-tts binary script
Key Features Added
echo "text" | kitten-tts)
audio cutoff
through expr-voice-5/m/f)
Files Changed
- kitten-tts - New executable wrapper script
- kittentts/cli.py - Complete CLI implementation with pipeline support. The system installs the binary at the path venv/kitten-tts
Usage Examples