Feature Request: Low-latency mode for screen reader use (<200ms) #291

@aradix85

Hi,
First: I sent a tip via Ko-fi and offered to donate my voice for a Dutch TTS voice. Kokoro's quality is impressive.
I'm blind and use NVDA (a screen reader). I've been testing Kokoro to see whether it could replace the TTS engines we currently use, which are more than 20 years old (eSpeak, Tiflotecnica/old Nuance). The difference in voice quality is night and day, but latency is a blocker.
My benchmarks (Core Ultra 7 258V, 32GB RAM, Intel Arc 140V GPU):
| Model | Short phrase latency | Verdict |
|---|---|---|
| FP32 ONNX (CPU) | ~500ms | Too slow |
| INT8 ONNX (CPU) | ~1100ms | Even slower (wrong codepath?) |
| OpenVINO GPU | Failed | Dynamic STFT shapes not supported |
Why screen readers are different:
Most TTS use cases (audiobooks, podcasts, video narration) tolerate 500ms+ latency easily. Screen readers are unique: we generate thousands of tiny utterances per hour ("button", "edit", "link", "checkbox checked"). Each must feel instant or navigation becomes unbearable.
Target latency: <200ms on an average laptop CPU (i5/Ryzen 5, no GPU)
Reference: eSpeak achieves 5-10ms, Tiflotecnica ~50-150ms.
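For anyone who wants to check numbers against this target, here is a minimal sketch of a short-phrase latency benchmark. It is not the script used for the figures above. `synthesize` is a hypothetical callable standing in for whatever TTS call is being tested, and `fake_synthesize` is only a placeholder so the sketch runs without a model.

```python
import math
import statistics
import time
from typing import Callable, Dict, Iterable

LATENCY_BUDGET_MS = 200.0  # target latency from this request


def measure_latency_ms(synthesize: Callable[[str], object],
                       phrases: Iterable[str],
                       repeats: int = 5) -> Dict[str, object]:
    """Time every call to synthesize() and summarize the samples in milliseconds."""
    samples = []
    for phrase in phrases:
        for _ in range(repeats):
            start = time.perf_counter()
            synthesize(phrase)
            samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    median_ms = statistics.median(samples)
    return {
        "median_ms": median_ms,
        "p95_ms": samples[p95_index],
        "within_budget": median_ms < LATENCY_BUDGET_MS,
    }


def fake_synthesize(text: str) -> bytes:
    """Placeholder for a real TTS call: sleeps about 1 ms per character."""
    time.sleep(0.001 * len(text))
    return b""


if __name__ == "__main__":
    # Typical screen reader utterances: very short and very frequent.
    phrases = ["button", "edit", "link", "checkbox checked"]
    print(measure_latency_ms(fake_synthesize, phrases))
```

Reporting the 95th percentile next to the median matters for this use case: with thousands of utterances per hour, occasional slow calls are noticeable even when the median looks fine.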
The gap:
The blind community is stuck with 20-year-old robotic voices because neural TTS is too slow. We don't typically have gaming PCs with dedicated GPUs. A neural TTS optimized for screen readers would help millions of users worldwide.
What might help:
• Static shapes export for OpenVINO compatibility
• Streaming mode (start audio while still generating)
• Lighter model variant specifically for low-latency use
• Guidance on optimal CPU inference settings
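To illustrate the streaming idea from the list above, here is a rough sketch of chunk-level streaming: the text is split at punctuation and audio for each chunk is yielded as soon as it is ready, so playback can begin before the whole utterance has been generated. This is not Kokoro's API. `synthesize_chunk` is a hypothetical callable that turns one text chunk into audio bytes, and the splitting rule is a simple assumption.

```python
import re
from typing import Callable, Iterator, List


def split_into_chunks(text: str) -> List[str]:
    """Split after sentence or clause punctuation so each chunk can be synthesized alone."""
    parts = re.split(r"(?<=[.!?;,])\s+", text.strip())
    return [part for part in parts if part]


def stream_synthesis(text: str,
                     synthesize_chunk: Callable[[str], bytes]) -> Iterator[bytes]:
    """Lazily yield audio per chunk; later chunks are not synthesized until requested."""
    for chunk in split_into_chunks(text):
        yield synthesize_chunk(chunk)


if __name__ == "__main__":
    def fake_synthesize_chunk(chunk: str) -> bytes:
        # Placeholder for a real model call: returns the text as bytes.
        return chunk.encode("utf-8")

    for audio in stream_synthesis("Save file. Are you sure? Yes, continue.",
                                  fake_synthesize_chunk):
        # A real player would queue each buffer for playback here.
        print(len(audio), "bytes ready")
```

Chunking like this only reduces time-to-first-audio for longer text. Single-word utterances such as "button" are one chunk, so they still depend on the raw per-call latency of the model.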
I understand this is a hard problem. Just wanted to share this use case and data in case it's useful for future development.
Thanks for building Kokoro - hoping one day it can power screen readers too.
