feat: native voice replies and first-party TTS #198

## Summary
HybridClaw has inbound audio transcription and can send generated audio files back through some channels, but it does not ship a first-party TTS runtime config or built-in speech synthesis provider.

## Why
Voice is already partially present in the product surface. A native TTS path would make voice interactions feel complete instead of requiring custom scripts or external MCP wrappers.

## Proposed scope
- Add first-party `tts.*` runtime configuration.
- Add a speech synthesis provider abstraction with pluggable backends, and ship at least one built-in backend.
- Support generating outbound audio replies directly from agent text.
- Add per-channel delivery rules for supported platforms.
- Allow voice preferences per agent/session where practical.
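
To make the provider abstraction concrete, here is a minimal sketch of a pluggable backend interface with a name-based registry. It is written in Python purely for illustration. Every name in it (`SpeechRequest`, `TtsBackend`, `TtsRegistry`, `FakeBackend`) is hypothetical and not an existing HybridClaw API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


@dataclass(frozen=True)
class SpeechRequest:
    """Hypothetical request shape: agent text plus voice preferences."""
    text: str
    voice: str = "default"
    audio_format: str = "ogg"


class TtsBackend(Protocol):
    """Anything that can turn a SpeechRequest into encoded audio bytes."""

    def synthesize(self, request: SpeechRequest) -> bytes: ...


class FakeBackend:
    """Stand-in backend for tests; returns a deterministic byte payload."""

    def synthesize(self, request: SpeechRequest) -> bytes:
        return f"{request.voice}:{request.audio_format}:{request.text}".encode("utf-8")


class TtsRegistry:
    """Maps a configured backend name to a factory that builds the backend."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], TtsBackend]] = {}

    def register(self, name: str, factory: Callable[[], TtsBackend]) -> None:
        self._factories[name] = factory

    def create(self, name: str) -> TtsBackend:
        if name not in self._factories:
            known = ", ".join(sorted(self._factories)) or "none"
            raise ValueError(f"unknown TTS backend {name!r} (registered: {known})")
        return self._factories[name]()


registry = TtsRegistry()
registry.register("fake", FakeBackend)
backend = registry.create("fake")
audio = backend.synthesize(SpeechRequest(text="hello", voice="narrator"))
```

Keeping backend selection behind a registry means a second backend can be added later without touching the call sites that request audio.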

## Candidate UX
- `/tts on|off`
- `/tts voice <name>`
- gateway config for provider, voice, format, and max duration
- optional "reply in voice" channel/session setting
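
One way the commands above could map onto per-session settings, as a rough sketch. The `TtsSettings` fields mirror the gateway config items listed (provider, voice, format, max duration), but the field names, defaults, and the single-token voice name are assumptions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TtsSettings:
    """Hypothetical settings shape; defaults are placeholders."""
    enabled: bool = False
    provider: str = "fake"
    voice: str = "default"
    audio_format: str = "ogg"
    max_duration_seconds: int = 60


def apply_tts_command(settings: TtsSettings, message: str) -> TtsSettings:
    """Return updated settings for a `/tts ...` message, or raise ValueError."""
    parts = message.strip().split()
    if not parts or parts[0] != "/tts":
        raise ValueError("not a /tts command")
    args = parts[1:]
    if len(args) == 1 and args[0] in ("on", "off"):
        return replace(settings, enabled=(args[0] == "on"))
    if len(args) == 2 and args[0] == "voice":
        return replace(settings, voice=args[1])
    raise ValueError("usage: /tts on|off or /tts voice <name>")


session = TtsSettings()
session = apply_tts_command(session, "/tts on")
session = apply_tts_command(session, "/tts voice narrator")
```

Returning a new immutable settings object per command keeps session state easy to log and to test.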

## Implementation notes
- Start with one reliable provider/backend and a clean abstraction for future backends.
- Reuse the existing media delivery path instead of inventing a separate outbound transport.
- Ensure generated files are treated as sensitive transient artifacts and cleaned up correctly.
- Consider Discord file delivery and WhatsApp voice-note/PTT support separately so the first version can ship incrementally.
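
The cleanup requirement could look roughly like the sketch below: generated audio goes into a private temp file that is removed once delivery finishes, whether or not delivery succeeded. `deliver_media` is a hypothetical stand-in for the existing media delivery path, not a real HybridClaw function.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator, List


@contextmanager
def transient_audio_file(audio: bytes, suffix: str = ".ogg") -> Iterator[str]:
    """Write audio to a private temp file and delete it when the block exits."""
    # mkstemp creates the file readable and writable only by the creating user.
    fd, path = tempfile.mkstemp(prefix="tts-", suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(audio)
        yield path
    finally:
        # Runs after successful delivery and after delivery errors alike.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass


delivered: List[bytes] = []


def deliver_media(path: str) -> None:
    """Hypothetical stand-in for the existing media delivery path."""
    with open(path, "rb") as handle:
        delivered.append(handle.read())


with transient_audio_file(b"fake-audio") as audio_path:
    deliver_media(audio_path)
```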

## Acceptance criteria
- A user can enable TTS and receive spoken replies without custom tooling.
- At least one built-in provider/backend is documented and tested.
- Generated audio is delivered through supported channels using the existing media pipeline.
- Config and transcripts make it clear when a text reply was synthesized to audio.
- Tests cover provider selection, generation failure paths, and channel delivery behavior.
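
A sketch of how the failure-path and transcript criteria could be exercised. The fallback to plain text when generation fails is an assumption, not something this issue specifies. The delivery mode names and `OutboundReply` shape are illustrative only.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical per-channel rules; channel names come from this issue,
# the delivery mode labels are illustrative.
DELIVERY_MODES = {"discord": "file_attachment", "whatsapp": "voice_note"}


@dataclass(frozen=True)
class OutboundReply:
    text: str
    audio: Optional[bytes]
    delivery_mode: str  # "text", "file_attachment", or "voice_note"
    synthesized: bool   # recorded so transcripts can show audio was generated


def build_reply(
    text: str,
    channel: str,
    tts_enabled: bool,
    synthesize: Callable[[str], bytes],
) -> OutboundReply:
    """Build an audio reply when possible, otherwise a plain text reply."""
    mode = DELIVERY_MODES.get(channel)
    if not tts_enabled or mode is None:
        return OutboundReply(text, None, "text", False)
    try:
        audio = synthesize(text)
    except Exception:
        # Assumed behavior: a generation failure degrades to a text reply.
        return OutboundReply(text, None, "text", False)
    return OutboundReply(text, audio, mode, True)


def broken_backend(text: str) -> bytes:
    raise RuntimeError("backend unavailable")


ok = build_reply("hi", "discord", True, lambda t: t.encode("utf-8"))
failed = build_reply("hi", "whatsapp", True, broken_backend)
```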
