Commit 24d652c

Polished README with Rick Rubin aesthetic
1 parent 064ac6d commit 24d652c

1 file changed

Lines changed: 94 additions & 77 deletions

README.md

@@ -1,120 +1,137 @@
 <p align="center">
-<img src="assets/banner.jpeg" alt="speak - High performance CLI tool your agent can use to generate life like speech, real time on Apple Silicon" width="100%">
+<img src="assets/banner.jpeg" alt="speak - Talk to your Claude" width="100%">
 </p>
 
-A fast CLI tool for AI agents to convert their text output to speech using Chatterbox TTS on Apple Silicon.
+```
+┌─────────────────────────────────────────────────────────────┐
+│ │
+│ ███████╗██████╗ ███████╗ █████╗ ██╗ ██╗ │
+│ ██╔════╝██╔══██╗██╔════╝██╔══██╗██║ ██╔╝ │
+│ ███████╗██████╔╝█████╗ ███████║█████╔╝ │
+│ ╚════██║██╔═══╝ ██╔══╝ ██╔══██║██╔═██╗ │
+│ ███████║██║ ███████╗██║ ██║██║ ██╗ │
+│ ╚══════╝╚═╝ ╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝ │
+│ │
+│ Talk to your Claude. │
+│ │
+└─────────────────────────────────────────────────────────────┘
+```
 
-## Install as Agent Skill
+<p align="center">
+<strong>Voice cloning. Long documents. Audiobook quality. Local & private.</strong>
+</p>
 
-Add this skill to Claude Code, Cursor, Windsurf, and other AI agents:
+<p align="center">
+<code>speak article.md --stream</code> → Audio starts in seconds
+</p>
 
-```bash
-npx skills add EmZod/speak
-```
+---
 
-## Quick Start
+## Install
 
+**For AI Agents** (Claude Code, Cursor, Windsurf):
 ```bash
-git clone https://github.com/EmZod/speak.git
-cd speak
-bun install
-
-# First run auto-installs Python dependencies
-bun run src/index.ts "Hello, world!" --play
+npx skills add EmZod/speak
 ```
 
-Create an alias for easier access:
+**CLI:**
 ```bash
+git clone https://github.com/EmZod/speak.git
+cd speak && bun install
 alias speak="bun run $(pwd)/src/index.ts"
 ```
 
-## Requirements
+**Requirements:** macOS Apple Silicon · Bun · Python 3.10+ · sox (`brew install sox`)
 
-- macOS with Apple Silicon (M Series)
-- [Bun](https://bun.sh)
-- Python 3.10+
-- sox (for long documents): `brew install sox`
+---
 
-## Basic Usage
+## Usage
 
 ```bash
 speak "Hello, world!" --play # Generate and play
-speak article.md --stream # Stream long content
-speak --clipboard --play # Read from clipboard
+speak article.md --stream # Stream long content
 speak document.md --output out.wav # Save to file
+speak --clipboard --play # Read from clipboard
 ```
 
-## Key Features
+---
 
-```bash
-# Long documents - auto-chunk for reliability
-speak book.md --auto-chunk --output book.wav
+## Voice Cloning
 
-# Resume interrupted generation
-speak --resume manifest.json
+Clone any voice from a 10-30 second sample:
 
-# Batch processing
-speak *.md --output-dir ~/Audio/
+```bash
+# Use your cloned voice
+speak "Hello" --voice ~/.chatter/voices/morgan_freeman.wav --play
+```
 
-# Estimate duration before generating
-speak --estimate document.md
+---
 
-# Concatenate audio files
-speak concat part1.wav part2.wav --out combined.wav
+## Long Documents
+
+```bash
+speak book.md --auto-chunk --output book.wav # Auto-chunk for reliability
+speak --resume manifest.json # Resume interrupted generation
+speak *.md --output-dir ~/Audio/ # Batch processing
+speak --estimate document.md # Estimate duration first
 ```
 
-## Commands
+---
 
-| Command | Description |
-|---------|-------------|
-| `speak <text\|file>` | Generate speech |
-| `speak health` | Check system status |
-| `speak models` | List available models |
-| `speak concat <files>` | Combine audio files |
-| `speak daemon kill` | Stop TTS server |
-
-## Common Options
-
-| Option | Description |
-|--------|-------------|
-| `--play` | Play after generation |
-| `--stream` | Stream as it generates |
-| `--output <path>` | Output file or directory |
-| `--auto-chunk` | Chunk long documents |
-| `--estimate` | Show duration estimate |
-| `--dry-run` | Preview without generating |
+## Commands
 
-## Documentation
+```
+speak <text|file> Generate speech
+speak health Check system status
+speak models List available models
+speak concat <files> Combine audio files
+speak daemon kill Stop TTS server
+```
 
-- **[docs/usage.md](docs/usage.md)** - Complete usage guide
-- **[docs/configuration.md](docs/configuration.md)** - Config file, environment variables, shell setup
-- **[docs/troubleshooting.md](docs/troubleshooting.md)** - Common issues and fixes
-- **[SKILL.md](SKILL.md)** - Agent-optimized reference
-- **[CHANGELOG.md](CHANGELOG.md)** - Version history
-- **[.agentic/](.agentic/)** - Agentic engineering artifacts (optimization reports, focus group tests)
+---
 
-## Development
+## Options
 
-```bash
-bun install # Install dependencies
-bun test # Run tests
-bun run typecheck # Type check
+```
+--play Play after generation
+--stream Stream as it generates
+--output Output file or directory
+--voice Custom voice file (WAV)
+--auto-chunk Chunk long documents
+--estimate Show duration estimate
+--dry-run Preview without generating
 ```
 
-## For AI Agents
+---
 
-**Recommended:** Install via the skills registry:
-```bash
-npx skills add EmZod/speak
-```
+## Performance
 
-Or manually copy [SKILL.md](SKILL.md) to your agent's skills directory:
-```bash
-cp SKILL.md ~/.claude/skills/speak-tts/SKILL.md
 ```
+Long documents ████████████████████ Streaming, auto-chunk
+Voice cloning ████████████████████ Any voice from sample
+Emotion tags ████████████████████ [laugh], [sigh], etc.
+Quality ████████████████████ Audiobook grade
+```
+
+---
+
+## See Also
+
+Need instant audio (~90ms)? Try [**speakturbo**](https://github.com/EmZod/Speak-Turbo).
 
-See [AGENTS.md](AGENTS.md) for setup details.
+---
 
-## License
+## Documentation
+
+| File | Content |
+|------|---------|
+| [SKILL.md](SKILL.md) | Full usage guide for agents |
+| [docs/usage.md](docs/usage.md) | Complete CLI reference |
+| [docs/troubleshooting.md](docs/troubleshooting.md) | Common issues & fixes |
+| [AGENTS.md](AGENTS.md) | Architecture & development |
+
+---
 
-MIT
+<p align="center">
+<sub>MIT License · Built on <a href="https://github.com/resemble-ai/chatterbox">Chatterbox TTS</a></sub>
+</p>
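
The flags listed under Options in the new README can be assembled programmatically before invoking the CLI. The following is a minimal sketch, not code from this repository: `SpeakOptions` and `buildSpeakArgs` are hypothetical names, and only the flag strings (`--play`, `--stream`, `--voice`, `--auto-chunk`, `--output`) are taken from the README.

```typescript
// Hypothetical helper, not part of the speak repository.
// Only the flag strings below come from the README's Options block.
interface SpeakOptions {
  play?: boolean;      // --play: play after generation
  stream?: boolean;    // --stream: stream as it generates
  voice?: string;      // --voice: custom voice file (WAV)
  autoChunk?: boolean; // --auto-chunk: chunk long documents
  output?: string;     // --output: output file or directory
}

// Turn an input (text or file path) plus options into an argument list.
function buildSpeakArgs(input: string, opts: SpeakOptions = {}): string[] {
  const args: string[] = [input];
  if (opts.play) args.push("--play");
  if (opts.stream) args.push("--stream");
  if (opts.voice !== undefined) args.push("--voice", opts.voice);
  if (opts.autoChunk) args.push("--auto-chunk");
  if (opts.output !== undefined) args.push("--output", opts.output);
  return args;
}

const args = buildSpeakArgs("book.md", { autoChunk: true, output: "book.wav" });
console.log(["speak", ...args].join(" "));
// prints "speak book.md --auto-chunk --output book.wav"
```

Keeping the arguments as an array rather than a single string avoids shell-quoting problems when the input text contains spaces. Passing the array to a process spawner assumes `speak` resolves as an executable on `PATH`; the shell alias shown in the Install section would not be visible to a spawned process.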
