feat: Kokoro 82M TTS engine + voice profile type system #325
Add Kokoro-82M as a new TTS engine: 82M params, CPU realtime, 8 languages, Apache 2.0. Unlike cloning engines, Kokoro uses pre-built voice styles, which required a new profile type system to support non-cloning engines cleanly.

Kokoro engine:
- New kokoro_backend.py implementing TTSBackend protocol
- 50 built-in voices across en/es/fr/hi/it/pt/ja/zh
- KPipeline API with language-aware G2P routing via misaki
- PyInstaller bundling for misaki, language_tags, espeakng_loader, en_core_web_sm

Voice profile type system:
- New voice_type column: 'cloned' | 'preset' | 'designed' (future)
- Preset profiles store engine + voice ID instead of audio samples
- default_engine field on profiles: auto-selects engine on profile pick
- Create Voice dialog: toggle between 'Clone from audio' and 'Built-in voice'
- Edit dialog shows preset voice info instead of sample list for preset profiles
- Engine selector locks to preset engine when preset profile is selected
- Profile grid filters by engine: shows Kokoro voices when Kokoro selected
- Custom empty state when no preset profiles exist for selected engine

Bug fixes:
- Fix relative audio paths in DB causing 404s in production builds
- config.set_data_dir() now resolves to absolute paths
- Startup migration converts existing relative paths to absolute

Also updates PROJECT_STATUS.md and tts-engines.mdx developer guide.
📝 Walkthrough
Adds Kokoro 82M as a preset (non-cloning) TTS engine across backend and frontend: new Kokoro backend, DB/profile schema for preset/designed voices, preset seeding/listing APIs, profile-aware engine filtering and auto-switching in the UI, and packaging/build updates to include Kokoro dependencies.
Sequence Diagram(s)
sequenceDiagram
participant Browser
participant Frontend
participant API
participant DB
participant KokoroBackend
Browser->>Frontend: select profile / start generation
Frontend->>API: GET /profiles/{id}
API->>DB: fetch profile (includes preset_engine, preset_voice_id, default_engine)
DB-->>API: profile row
API-->>Frontend: profile response
Frontend->>Frontend: isProfileCompatibleWithEngine(profile, currentEngine)
alt incompatible
Frontend->>Frontend: set engine = firstAvailableOption
Frontend->>UIStore: setSelectedEngine(engine)
end
Frontend->>API: POST /generate (engine=kokoro, preset_voice_id, effects_chain)
API->>KokoroBackend: generate(text, voice_prompt=preset reference, effects_chain)
KokoroBackend->>KokoroBackend: load model/pipeline (lazy)
KokoroBackend-->>API: audio bytes
API-->>Frontend: stream/return audio
Frontend-->>Browser: play audio
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/src/components/VoiceProfiles/ProfileForm.tsx (1)
640-654: ⚠️ Potential issue | 🟠 Major
Make cloned-profile creation fail atomically when sample upload breaks.
This error path toasts the failure, but then still falls through to the shared success cleanup. The modal closes and the user is left with a newly created profile that has no samples, with no retry path from the same form state.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/VoiceProfiles/ProfileForm.tsx` around lines 640 - 654, The sample upload catch block currently toasts the error but then continues to the success cleanup (setProfileFormDraft, form.reset, setEditingProfileId, setOpen) leaving an empty profile; change this so the catch for sampleError performs an atomic rollback: call the profile-deletion API for the newly created profile (use the created profile identifier available as data.id or similar), await its result, and if deletion fails show another toast indicating rollback failure; after successful rollback return early from the function so you do NOT call setProfileFormDraft, form.reset, setEditingProfileId, or setOpen. Ensure the deletion call is properly awaited and errors are handled to avoid swallowing failures.
🧹 Nitpick comments (3)
backend/build_binary.py (1)
231-268: Collapse the PyInstaller manifest into one shared source.
This Kokoro/misaki/spaCy bundle list now exists here and in `backend/voicebox-server.spec`. Keeping two hand-edited manifests in sync is brittle; the usual failure mode is one build path shipping fine while the other misses a runtime asset. Pull the hidden-import / collect-all definitions into a shared helper and have both entry points consume it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/build_binary.py` around lines 231 - 268, Extract the repeated PyInstaller arguments into a single callable/constant and have both build scripts import it: create a function or constant named get_shared_pyinstaller_options (or SHARED_PYINSTALLER_OPTIONS) that returns the list of flags shown (all "--hidden-import", "--collect-all", "--copy-metadata" entries for kokoro, misaki, language_tags, espeakng_loader, en_core_web_sm, loguru, etc.), replace the inlined list in the current build_binary.py (the block that contains kokoro/misaki/spacy entries) with a call/import of that shared symbol, and update the other consumer (the code that generates voicebox-server.spec) to import and extend/consume the same shared symbol so there is a single source of truth for hidden-import/collect-all entries. Ensure the shared helper is a plain Python module that both build scripts can import and include tests or a quick local build check to confirm the same assets are bundled.
backend/backends/kokoro_backend.py (2)
156-160: Potential race condition in concurrent `load_model` calls.
Two concurrent calls can both pass the `if self._model is not None` check before either completes `_load_model_sync`, resulting in redundant model loads and a resource leak of the first model instance.
This mirrors the known latent design issue in `PyTorchTTSBackend` (tracked for future follow-up). Consider adding an `asyncio.Lock` to serialize model loading if concurrent access is expected.
🔒 Proposed fix using asyncio.Lock
 class KokoroTTSBackend:
     """Kokoro-82M TTS backend — tiny, fast, CPU-friendly."""

     def __init__(self):
         self._model = None
         self._pipelines: dict = {}  # lang_code -> KPipeline
         self._device: Optional[str] = None
         self.model_size = "default"
+        self._load_lock = asyncio.Lock()
 ...
     async def load_model(self, model_size: str = "default") -> None:
         """Load the Kokoro model."""
-        if self._model is not None:
-            return
-        await asyncio.to_thread(self._load_model_sync)
+        async with self._load_lock:
+            if self._model is not None:
+                return
+            await asyncio.to_thread(self._load_model_sync)
Based on learnings: "the model reload/unload race condition... is a pre-existing latent design issue... Fixing it requires an asyncio.Lock or active-ops counter."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/kokoro_backend.py` around lines 156 - 160, The load_model method can race when called concurrently because both callers may see self._model is None and invoke _load_model_sync; add an asyncio.Lock (e.g., self._load_lock) on the backend class, initialize it in the constructor, and wrap the check-and-load sequence in an async with self._load_lock: block inside load_model so only one coroutine runs _load_model_sync while others await the lock and then return early if self._model was set by the first loader.
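The locking pattern proposed above can be exercised in isolation. The following is a minimal sketch, not the project's backend: `LazyBackend`, its `load_count` counter, and the fake `_load_model_sync` body are hypothetical stand-ins, and only the `asyncio.Lock` plus `asyncio.to_thread` structure mirrors the suggestion.

```python
import asyncio
import time


class LazyBackend:
    """Hypothetical stand-in for a TTS backend with a lazily loaded model."""

    def __init__(self) -> None:
        self._model = None
        self.load_count = 0  # how many times the expensive load actually ran
        self._load_lock = asyncio.Lock()

    def _load_model_sync(self) -> None:
        # Simulates a slow, blocking model load (assumption: real loads block).
        time.sleep(0.05)
        self.load_count += 1
        self._model = object()

    async def load_model(self) -> None:
        # The check happens inside the lock, so callers that waited for the
        # lock see the model set by the first loader and return early.
        async with self._load_lock:
            if self._model is not None:
                return
            await asyncio.to_thread(self._load_model_sync)


async def main() -> int:
    backend = LazyBackend()
    # Five concurrent callers; only the first should trigger a load.
    await asyncio.gather(*(backend.load_model() for _ in range(5)))
    return backend.load_count


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Without the lock, all five callers in this sketch would pass the `None` check before the first load finishes, which is the redundant-load scenario the comment describes.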
20-20: Unused import: `os`
The `os` module is imported but not used anywhere in this file.
🧹 Proposed fix
 import asyncio
 import logging
-import os
 from typing import Optional
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/kokoro_backend.py` at line 20, Remove the unused top-level import "import os" from the module (the unused import statement in kokoro_backend.py); delete that import line and run the linter/formatter to ensure no leftover references remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/components/Generation/EngineModelSelector.tsx`:
- Around line 53-64: getAvailableOptions currently returns ENGINE_OPTIONS for
cloned profiles causing invalid profile/engine combos; update
getAvailableOptions to filter ENGINE_OPTIONS using the same compatibility rules
as isProfileCompatibleWithEngine (or explicitly filter by CLONING_ENGINES when
selectedProfile.voice_type === 'cloned'), keep the existing preset branch that
filters by selectedProfile.preset_engine, and return only options whose
opt.engine is in CLONING_ENGINES (or passes
isProfileCompatibleWithEngine(selectedProfile, opt.engine)) so the dropdown
cannot select incompatible engines.
In `@app/src/components/VoiceProfiles/ProfileForm.tsx`:
- Around line 1125-1154: The Select currently lists preset-only engines (e.g.,
"kokoro") even for sample-based/cloned profiles; update the options rendering in
the Default Engine Select so that preset-only engines are omitted when the
profile is sample-based (check editingProfile?.voice_type === 'sample' or
voiceSource === 'sample'), e.g., only render the SelectItem for "kokoro" when
editingProfile?.voice_type !== 'sample' (or voiceSource !== 'sample');
additionally, when loading an editingProfile, validate defaultEngine and call
setDefaultEngine('') if the current value is a now-disallowed engine to avoid
persisting an invalid choice (references: defaultEngine, setDefaultEngine,
editingProfile, voiceSource).
In `@app/src/components/VoiceProfiles/ProfileList.tsx`:
- Around line 58-71: The UI promises a "default voice will be used" but
useGenerationForm still hard-fails when no profile is selected; update the hook
to provide a real fallback instead of rejecting or change the UI copy.
Specifically, in the useGenerationForm hook (function useGenerationForm in
app/src/lib/hooks/useGenerationForm.ts) modify the validation /
getSelectedProfile logic so that when selectedProfile is missing and the engine
is a preset (isPresetEngine true) it returns or injects a Kokoro/default profile
object (with the engine and default voice fields) or bypasses the hard error
path and allows submission with a noted fallback; alternatively, if you prefer
the UI change, update ProfileList.tsx text to remove the misleading "The default
voice will be used" line so it accurately reflects that a profile must be
created/selected.
In `@backend/database/migrations.py`:
- Around line 185-223: In _resolve_relative_paths, stop resolving relative paths
against the process CWD; instead obtain the configured data directory (e.g. via
your existing config accessor such as get_data_dir() or settings.data_dir) and
join it with the stored relative path: replace p = Path(path_val); resolved =
p.resolve() with resolved = (Path(data_dir) / path_val).resolve() (or if no
data_dir is available, fall back to the SQLite DB file parent) before calling
resolved.exists() and performing the UPDATE for the table/column pairs in
path_columns; ensure you import or access the config value and keep the
idempotent behavior for already absolute paths.
In `@backend/routes/profiles.py`:
- Around line 134-149: When seeding presets, don't skip creating a profile just
because the desired name (profile_name) exists; instead first ensure there isn't
already a profile with the same (preset_engine, preset_voice_id) and if that
pair is absent, generate a unique name by appending a numeric suffix to
profile_name until db.query(DBVoiceProfile).filter_by(name=unique_name).first()
is false, then create DBVoiceProfile with that unique_name; update references to
profile_name in the DBVoiceProfile constructor to use the unique_name and retain
checks against preset_engine and preset_voice_id to keep seeding idempotent.
In `@backend/services/profiles.py`:
- Around line 427-442: Validate preset/designed profiles before returning: when
voice_type == "preset", check that profile.preset_engine and
profile.preset_voice_id are present and that profile.preset_engine matches the
requested engine (the `engine` param); if not, raise/return a clear validation
error. Similarly, when voice_type == "designed", ensure profile.design_prompt
exists and (if designed profiles are engine-specific) that any required engine
constraint matches `engine`; otherwise return a validation error. Use the
existing symbols voice_type, preset_engine, preset_voice_id, design_prompt and
engine to locate the checks and fail fast with explicit errors instead of
returning incomplete dicts.
In `@docs/notes/PROJECT_STATUS.md`:
- Around line 419-427: The "Kokoro-82M" bullet in the "Previously Prioritized —
Now Done" section is contradictory ("In progress" inside a "Now Done" list);
update the Kokoro-82M line (the bullet containing "Kokoro-82M") so its status
reflects completion (e.g., change "Kokoro-82M — In progress" to "~~Kokoro-82M~~
**Shipped**" or similar) or move it out of this "Now Done" section into an
appropriate "In progress" section so the document is consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f379bea6-fdf5-4f4f-9211-a49146c1f7b3
⛔ Files ignored due to path filters (1)
tauri/src-tauri/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
- app/src/components/Generation/EngineModelSelector.tsx
- app/src/components/Generation/FloatingGenerateBox.tsx
- app/src/components/Generation/GenerationForm.tsx
- app/src/components/ServerSettings/ModelManagement.tsx
- app/src/components/VoiceProfiles/ProfileCard.tsx
- app/src/components/VoiceProfiles/ProfileForm.tsx
- app/src/components/VoiceProfiles/ProfileList.tsx
- app/src/lib/api/client.ts
- app/src/lib/api/types.ts
- app/src/lib/constants/languages.ts
- app/src/lib/hooks/useGenerationForm.ts
- app/src/stores/uiStore.ts
- backend/backends/__init__.py
- backend/backends/kokoro_backend.py
- backend/build_binary.py
- backend/config.py
- backend/database/migrations.py
- backend/database/models.py
- backend/models.py
- backend/requirements.txt
- backend/routes/profiles.py
- backend/services/profiles.py
- backend/voicebox-server.spec
- docs/content/docs/developer/tts-engines.mdx
- docs/notes/PROJECT_STATUS.md
function getAvailableOptions(selectedProfile?: VoiceProfileResponse | null) {
  if (!selectedProfile) return ENGINE_OPTIONS;

  const voiceType = selectedProfile.voice_type || 'cloned';

  if (voiceType === 'preset') {
    // Preset profiles lock to their specific engine
    const presetEngine = selectedProfile.preset_engine;
    return ENGINE_OPTIONS.filter((opt) => opt.engine === presetEngine);
  }

  return ENGINE_OPTIONS;
Match available options to the compatibility rules below.
getAvailableOptions() still returns Kokoro for cloned profiles, even though isProfileCompatibleWithEngine() correctly says cloned voices only work with CLONING_ENGINES. Right now the selector can still drive the form into an invalid profile/engine combination.
💡 Keep the dropdown consistent with the helper
 function getAvailableOptions(selectedProfile?: VoiceProfileResponse | null) {
   if (!selectedProfile) return ENGINE_OPTIONS;
   const voiceType = selectedProfile.voice_type || 'cloned';
   if (voiceType === 'preset') {
     // Preset profiles lock to their specific engine
     const presetEngine = selectedProfile.preset_engine;
     return ENGINE_OPTIONS.filter((opt) => opt.engine === presetEngine);
   }
-  return ENGINE_OPTIONS;
+  return ENGINE_OPTIONS.filter((opt) => CLONING_ENGINES.has(opt.engine));
 }
<FormItem>
  <FormLabel>Default Engine</FormLabel>
  <Select
    value={defaultEngine || '_none'}
    onValueChange={(v) => {
      setDefaultEngine(v === '_none' ? '' : v);
    }}
    disabled={
      voiceSource === 'builtin' || editingProfile?.voice_type === 'preset'
    }
  >
    <FormControl>
      <SelectTrigger>
        <SelectValue placeholder="No preference" />
      </SelectTrigger>
    </FormControl>
    <SelectContent>
      <SelectItem value="_none">No preference</SelectItem>
      <SelectItem value="qwen">Qwen3-TTS</SelectItem>
      <SelectItem value="luxtts">LuxTTS</SelectItem>
      <SelectItem value="chatterbox">Chatterbox</SelectItem>
      <SelectItem value="chatterbox_turbo">Chatterbox Turbo</SelectItem>
      <SelectItem value="tada">TADA</SelectItem>
      <SelectItem value="kokoro">Kokoro 82M</SelectItem>
    </SelectContent>
  </Select>
  <p className="text-xs text-muted-foreground">
    Auto-selects this engine when the profile is chosen.
  </p>
</FormItem>
Don't offer preset-only engines as a cloned profile's default.
This dropdown currently lets a sample-based profile save default_engine="kokoro", even though Kokoro only works with preset voices. Selecting that profile later will auto-pick an engine that can't use its samples.
) : filteredProfiles.length === 0 && isPresetEngine ? (
  <Card>
    <CardContent className="flex flex-col items-center justify-center py-12">
      <Music className="h-12 w-12 text-muted-foreground mb-4" />
      <p className="text-muted-foreground mb-2">
        No {ENGINE_NAMES[selectedEngine] ?? selectedEngine} voices created yet.
      </p>
      <p className="text-sm text-muted-foreground mb-4">
        The default voice will be used. Create a profile to choose a specific voice.
      </p>
      <Button onClick={() => setDialogOpen(true)}>
        <Sparkles className="mr-2 h-4 w-4" />
        Create {ENGINE_NAMES[selectedEngine] ?? selectedEngine} Voice
      </Button>
The preset empty state promises a fallback that the form still rejects.
Lines 65-67 say "The default voice will be used", but app/src/lib/hooks/useGenerationForm.ts Lines 64-71 still hard-fail when no profile is selected. Either wire up a real Kokoro default-voice path or change this copy so users aren't told they can proceed without creating/selecting a profile.
# Skip name collisions
if db.query(DBVoiceProfile).filter_by(name=profile_name).first():
    continue

profile = DBVoiceProfile(
    id=str(uuid.uuid4()),
    name=profile_name,
    description=f"Kokoro preset voice — {display_name} ({gender})",
    language=lang,
    voice_type="preset",
    preset_engine="kokoro",
    preset_voice_id=voice_id,
    created_at=datetime.utcnow(),
    updated_at=datetime.utcnow(),
)
db.add(profile)
Generate a unique preset name instead of skipping the voice.
A pre-existing profile named Bella currently blocks the Kokoro Bella preset from ever being seeded, even though that (preset_engine, preset_voice_id) does not exist yet. Appending a suffix keeps the seed exhaustive and idempotent.
💡 One way to keep every preset seedable
-# Skip name collisions
-if db.query(DBVoiceProfile).filter_by(name=profile_name).first():
-    continue
+candidate_name = profile_name
+suffix = 2
+while db.query(DBVoiceProfile).filter_by(name=candidate_name).first():
+    candidate_name = f"{profile_name} ({engine} {suffix})"
+    suffix += 1
+profile_name = candidate_name
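The seeding rule in this comment (skip only when the engine/voice pair already exists, otherwise pick a free name) can be sketched without a database. This is an illustration under assumptions: rows are plain dicts rather than `DBVoiceProfile` objects, `unique_profile_name` and `plan_seed` are hypothetical helpers, and the suffix format `"Name (2)"` is one possible choice rather than the one in the suggested diff.

```python
from typing import Iterable, Optional


def unique_profile_name(base: str, taken: Iterable[str]) -> str:
    """Return base, or base with the first free numeric suffix (starting at 2)."""
    taken_set = set(taken)
    if base not in taken_set:
        return base
    suffix = 2
    while f"{base} ({suffix})" in taken_set:
        suffix += 1
    return f"{base} ({suffix})"


def plan_seed(existing: list, engine: str, voice_id: str, name: str) -> Optional[dict]:
    """Return the profile to insert, or None if this preset is already seeded."""
    for row in existing:
        if row.get("preset_engine") == engine and row.get("preset_voice_id") == voice_id:
            return None  # idempotent: the (engine, voice) pair already exists
    return {
        "name": unique_profile_name(name, (row["name"] for row in existing)),
        "voice_type": "preset",
        "preset_engine": engine,
        "preset_voice_id": voice_id,
    }
```

With an existing user profile named "Bella", the first call for a Kokoro voice of the same name yields a suffixed name, and a second call for the same pair returns `None` once the first row has been stored.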
voice_type = getattr(profile, "voice_type", None) or "cloned"

# ── Preset profiles: return engine-specific voice reference ──
if voice_type == "preset":
    return {
        "voice_type": "preset",
        "preset_engine": profile.preset_engine,
        "preset_voice_id": profile.preset_voice_id,
    }

# ── Designed profiles: return text description (future) ──
if voice_type == "designed":
    return {
        "voice_type": "designed",
        "design_prompt": profile.design_prompt,
    }
Reject invalid or engine-mismatched preset/designed profiles here.
This branch now returns preset/designed prompt dicts without checking that the required metadata is present or that the requested engine matches the profile. A preset profile with missing preset_engine / preset_voice_id, or a Kokoro preset used with engine='qwen', can fail deeper in generation instead of returning a clear validation error here.
🔧 Suggested guard clauses
 if voice_type == "preset":
+    if not profile.preset_engine or not profile.preset_voice_id:
+        raise ValueError(f"Preset profile {profile_id} is missing preset metadata")
+    if engine != profile.preset_engine:
+        raise ValueError(
+            f"Profile {profile_id} only supports the {profile.preset_engine} engine"
+        )
     return {
         "voice_type": "preset",
         "preset_engine": profile.preset_engine,
         "preset_voice_id": profile.preset_voice_id,
     }
 if voice_type == "designed":
+    if not profile.design_prompt:
+        raise ValueError(f"Designed profile {profile_id} is missing a design prompt")
     return {
         "voice_type": "designed",
         "design_prompt": profile.design_prompt,
     }
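The guard clauses suggested for `backend/services/profiles.py` can be tried out as a self-contained function. This is a sketch under assumptions: the `Profile` dataclass stands in for the stored row, `build_voice_prompt` is a hypothetical name, and the cloned branch is deliberately left out because the excerpt does not show it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    """Hypothetical stand-in for the stored profile row."""
    id: str
    voice_type: Optional[str] = None
    preset_engine: Optional[str] = None
    preset_voice_id: Optional[str] = None
    design_prompt: Optional[str] = None


def build_voice_prompt(profile: Profile, engine: str) -> dict:
    """Fail fast on incomplete or engine-mismatched preset/designed profiles."""
    voice_type = profile.voice_type or "cloned"

    if voice_type == "preset":
        if not profile.preset_engine or not profile.preset_voice_id:
            raise ValueError(f"Preset profile {profile.id} is missing preset metadata")
        if engine != profile.preset_engine:
            raise ValueError(
                f"Profile {profile.id} only supports the {profile.preset_engine} engine"
            )
        return {
            "voice_type": "preset",
            "preset_engine": profile.preset_engine,
            "preset_voice_id": profile.preset_voice_id,
        }

    if voice_type == "designed":
        if not profile.design_prompt:
            raise ValueError(f"Designed profile {profile.id} is missing a design prompt")
        return {"voice_type": "designed", "design_prompt": profile.design_prompt}

    # The cloned path is not shown in the review excerpt, so it is omitted here.
    raise NotImplementedError("cloned profiles are outside the scope of this sketch")
```

Raising at this layer means a Kokoro preset requested with `engine="qwen"` produces a clear validation message instead of an error deeper in generation.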
### ~~Previously Prioritized — Now Done~~

- ~~#258 — Chatterbox Turbo~~ **Merged**
- ~~#99 — Chunked TTS~~ **Superseded by #266, merged**
- ~~#88 — CORS restriction~~ **Merged**
- ~~#161 — Docker deployment~~ **Merged**
- ~~#234 — Queue system~~ **Addressed by #269, merged**
- ~~HumeAI TADA~~ **Shipped** (PR #296)
- ~~Kokoro-82M~~ **In progress**
Fix the contradictory Kokoro status line.
This section is labeled "Now Done", but the Kokoro bullet still says "In progress". Readers won't be able to tell whether that item shipped or not.
Actionable comments posted: 1
♻️ Duplicate comments (1)
app/src/components/Generation/EngineModelSelector.tsx (1)
44-50: ⚠️ Potential issue | 🟠 Major
Reapply profile-compatible filtering in `getAvailableOptions`.
Line 48 currently ignores `selectedProfile` and returns all engines, which allows invalid profile/engine combinations (and breaks preset-engine locking behavior).
💡 Suggested fix
-function getAvailableOptions(_selectedProfile?: VoiceProfileResponse | null) {
-  return ENGINE_OPTIONS;
+function getAvailableOptions(selectedProfile?: VoiceProfileResponse | null) {
+  if (!selectedProfile) return ENGINE_OPTIONS;
+  return ENGINE_OPTIONS.filter((opt) =>
+    isProfileCompatibleWithEngine(selectedProfile, opt.engine),
+  );
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/components/Generation/EngineModelSelector.tsx` around lines 44 - 50,
getAvailableOptions currently ignores the selectedProfile and returns all
ENGINE_OPTIONS, allowing invalid profile/engine combos; update it to return
ENGINE_OPTIONS filtered by compatibility with the provided selectedProfile
(e.g., ENGINE_OPTIONS.filter(e => isEngineCompatibleWithProfile(selectedProfile,
e)) or check selectedProfile.supportedEngines/allowedEngines/compatibleEngineIds),
and preserve any preset-engine locking behavior (use the profile's preset lock
flag or existing helper to enforce locked engine selection when present).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/components/Generation/EngineModelSelector.tsx`:
- Around line 111-116: The engine fallback logic currently performs a side
effect during render (using setTimeout) — move this into a useEffect: compute
currentEngineAvailable from availableOptions and selectValue, then inside a
useEffect that depends on [availableOptions, selectValue, form] call
handleEngineChange(form, availableOptions[0].value) when !currentEngineAvailable
and availableOptions.length > 0; remove the setTimeout-based call from the
render path to avoid state changes during render and React Strict Mode warnings.
---
Duplicate comments:
In `@app/src/components/Generation/EngineModelSelector.tsx`:
- Around line 44-50: getAvailableOptions currently ignores the selectedProfile
and returns all ENGINE_OPTIONS, allowing invalid profile/engine combos; update
it to return ENGINE_OPTIONS filtered by compatibility with the provided
selectedProfile (e.g., ENGINE_OPTIONS.filter(e =>
isEngineCompatibleWithProfile(selectedProfile, e)) or check
selectedProfile.supportedEngines/allowedEngines/compatibleEngineIds), and
preserve any preset-engine locking behavior (use the profile's preset lock flag
or existing helper to enforce locked engine selection when present).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5ed93f58-56eb-4b61-a5d8-d1f001783beb
📒 Files selected for processing (2)
- app/src/components/Generation/EngineModelSelector.tsx
- backend/database/migrations.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/database/migrations.py
// If current engine isn't in available options, auto-switch to first available
const currentEngineAvailable = availableOptions.some((opt) => opt.value === selectValue);
if (!currentEngineAvailable && availableOptions.length > 0) {
  // Defer to avoid setting state during render
  setTimeout(() => handleEngineChange(form, availableOptions[0].value), 0);
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify render-phase side effect currently exists and useEffect is absent.
rg -n --type=tsx -C3 'setTimeout\(\(\) => handleEngineChange\(form, availableOptions\[0\]\.value\), 0\)' app/src/components/Generation/EngineModelSelector.tsx
rg -n --type=tsx 'useEffect\(' app/src/components/Generation/EngineModelSelector.tsx
Repository: jamiepine/voicebox
Length of output: 117
🏁 Script executed:
fd -name "EngineModelSelector.tsx" app/src/components/
Repository: jamiepine/voicebox
Length of output: 292
🏁 Script executed:
cat -n app/src/components/Generation/EngineModelSelector.tsx | head -130 | tail -50
Repository: jamiepine/voicebox
Length of output: 2341
🏁 Script executed:
rg -n "setTimeout" app/src/components/Generation/EngineModelSelector.tsx
Repository: jamiepine/voicebox
Length of output: 143
🏁 Script executed:
rg -n "useEffect" app/src/components/Generation/EngineModelSelector.tsx
Repository: jamiepine/voicebox
Length of output: 44
Move engine fallback logic to useEffect to avoid side effects during render.
Lines 113–116 schedule a form update via setTimeout during the component render. Side effects should not execute in the render phase—this can cause unpredictable state transitions and issues in React Strict Mode. Move this logic into a useEffect hook:
+import { useEffect } from 'react';
 import type { UseFormReturn } from 'react-hook-form';
@@
-  // If current engine isn't in available options, auto-switch to first available
-  const currentEngineAvailable = availableOptions.some((opt) => opt.value === selectValue);
-  if (!currentEngineAvailable && availableOptions.length > 0) {
-    // Defer to avoid setting state during render
-    setTimeout(() => handleEngineChange(form, availableOptions[0].value), 0);
-  }
+  useEffect(() => {
+    // Compute availability inside the effect so it is not referenced after removal
+    const currentEngineAvailable = availableOptions.some((opt) => opt.value === selectValue);
+    if (!currentEngineAvailable && availableOptions.length > 0) {
+      handleEngineChange(form, availableOptions[0].value);
+    }
+  }, [selectValue, availableOptions, form, handleEngineChange]);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-  // If current engine isn't in available options, auto-switch to first available
-  const currentEngineAvailable = availableOptions.some((opt) => opt.value === selectValue);
-  if (!currentEngineAvailable && availableOptions.length > 0) {
-    // Defer to avoid setting state during render
-    setTimeout(() => handleEngineChange(form, availableOptions[0].value), 0);
-  }
+import { useEffect } from 'react';
+import type { UseFormReturn } from 'react-hook-form';
+  useEffect(() => {
+    const currentEngineAvailable = availableOptions.some((opt) => opt.value === selectValue);
+    if (!currentEngineAvailable && availableOptions.length > 0) {
+      handleEngineChange(form, availableOptions[0].value);
+    }
+  }, [selectValue, availableOptions, form, handleEngineChange]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/components/Generation/EngineModelSelector.tsx` around lines 111 -
116, The engine fallback logic currently performs a side effect during render
(using setTimeout) — move this into a useEffect: compute currentEngineAvailable
from availableOptions and selectValue, then inside a useEffect that depends on
[availableOptions, selectValue, form] call handleEngineChange(form,
availableOptions[0].value) when !currentEngineAvailable and
availableOptions.length > 0; remove the setTimeout-based call from the render
path to avoid state changes during render and React Strict Mode warnings.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/routes/generations.py`:
- Around line 236-242: The code selects engine from data.engine or
profile.default_engine / profile.preset_engine and then calls
get_tts_backend_for_engine(engine); instead, validate that the resolved engine
string is a supported/known engine before doing backend lookup (e.g., compare
the resolved value against the allowed engines list or let a helper
is_supported_engine(engine) perform validation) and if invalid return a
client-friendly 4xx error (bad request) rather than letting
get_tts_backend_for_engine raise/return a 500; update the resolution logic
around the engine variable and the call site of get_tts_backend_for_engine to
perform this check and return an appropriate validation error when
profile.default_engine or profile.preset_engine contain stale/unknown values.
- Around line 280-289: After deserializing profile.effects_chain with
_json.loads into effects_chain_config, validate that it is a list of dicts
before calling apply_effects to avoid downstream errors in
apply_effects/build_pedalboard; specifically, after the try/except that sets
effects_chain_config, check isinstance(effects_chain_config, list) and that
every item is an instance of dict (or the expected mapping type) and if the
check fails set effects_chain_config = None (or skip applying effects) so
apply_effects(audio, sample_rate, effects_chain_config) only runs with the
validated structure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0532274b-a066-482f-8d59-c91341939806
📒 Files selected for processing (1)
backend/routes/generations.py
engine = (
    data.engine
    or getattr(profile, "default_engine", None)
    or getattr(profile, "preset_engine", None)
    or "qwen"
)
tts_model = get_tts_backend_for_engine(engine)
Validate profile-derived engine before backend lookup.
Line 236 now trusts profile.default_engine / profile.preset_engine (DB strings). If either value is stale/invalid, Line 242 can fail and return a 500 instead of a client-facing validation error.
Proposed fix
 engine = (
     data.engine
     or getattr(profile, "default_engine", None)
     or getattr(profile, "preset_engine", None)
     or "qwen"
 )
-tts_model = get_tts_backend_for_engine(engine)
+try:
+    tts_model = get_tts_backend_for_engine(engine)
+except Exception as exc:
+    raise HTTPException(
+        status_code=400,
+        detail=f"Unsupported engine '{engine}' from request/profile",
+    ) from exc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/routes/generations.py` around lines 236 - 242, The code selects
engine from data.engine or profile.default_engine / profile.preset_engine and
then calls get_tts_backend_for_engine(engine); instead, validate that the
resolved engine string is a supported/known engine before doing backend lookup
(e.g., compare the resolved value against the allowed engines list or let a
helper is_supported_engine(engine) perform validation) and if invalid return a
client-friendly 4xx error (bad request) rather than letting
get_tts_backend_for_engine raise/return a 500; update the resolution logic
around the engine variable and the call site of get_tts_backend_for_engine to
perform this check and return an appropriate validation error when
profile.default_engine or profile.preset_engine contain stale/unknown values.
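The allowlist variant mentioned in the prompt could look roughly like the sketch below. It assumes an explicit set of engine IDs: only `qwen` and `kokoro` are named in this PR, so `SUPPORTED_ENGINES` is illustrative, and `UnsupportedEngineError` and `resolve_engine` are hypothetical names that the route would map to a 400 response.

```python
from typing import Optional

# Assumption: illustrative allowlist; the real list would come from the
# backend registry and include every engine the app ships.
SUPPORTED_ENGINES = frozenset({"qwen", "kokoro"})


class UnsupportedEngineError(ValueError):
    """Hypothetical error that the route would translate into an HTTP 400."""


def is_supported_engine(engine: str) -> bool:
    """Return True only for engine IDs the backend registry knows about."""
    return engine in SUPPORTED_ENGINES


def resolve_engine(
    requested: Optional[str],
    default_engine: Optional[str] = None,
    preset_engine: Optional[str] = None,
) -> str:
    """Apply the same precedence as the route, then validate before lookup."""
    engine = requested or default_engine or preset_engine or "qwen"
    if not is_supported_engine(engine):
        # Stale DB values end up here instead of failing inside backend lookup.
        raise UnsupportedEngineError(
            f"Unsupported engine '{engine}' from request/profile"
        )
    return engine
```

Compared with wrapping `get_tts_backend_for_engine` in a broad `except Exception`, validating up front keeps genuine backend failures from being reported as client errors.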
try:
    effects_chain_config = _json.loads(profile.effects_chain)
except Exception:
    effects_chain_config = None

if effects_chain_config:
    from ..utils.effects import apply_effects

    audio = apply_effects(audio, sample_rate, effects_chain_config)
Validate deserialized effects_chain shape before applying effects.
Line 281 accepts any JSON value; only decode failures are caught. If stored JSON is not List[Dict], Line 288 can fail inside apply_effects/build_pedalboard.
Proposed fix
 elif profile.effects_chain:
     import json as _json
     try:
-        effects_chain_config = _json.loads(profile.effects_chain)
-    except Exception:
+        parsed = _json.loads(profile.effects_chain)
+        if isinstance(parsed, list) and all(isinstance(e, dict) for e in parsed):
+            effects_chain_config = parsed
+        else:
+            logger.warning(
+                "Ignoring invalid effects_chain format for profile %s",
+                data.profile_id,
+            )
+            effects_chain_config = None
+    except _json.JSONDecodeError:
+        logger.warning(
+            "Ignoring unparsable effects_chain for profile %s",
+            data.profile_id,
+        )
         effects_chain_config = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/routes/generations.py` around lines 280 - 289, After deserializing
profile.effects_chain with _json.loads into effects_chain_config, validate that
it is a list of dicts before calling apply_effects to avoid downstream errors in
apply_effects/build_pedalboard; specifically, after the try/except that sets
effects_chain_config, check isinstance(effects_chain_config, list) and that
every item is an instance of dict (or the expected mapping type) and if the
check fails set effects_chain_config = None (or skip applying effects) so
apply_effects(audio, sample_rate, effects_chain_config) only runs with the
validated structure.
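The shape check can also be pulled into a small helper so the route only ever sees a validated value. A minimal sketch, assuming the stored value is a JSON string; `parse_effects_chain` is a hypothetical name, not an existing function in the codebase.

```python
import json
from typing import Any, Dict, List, Optional


def parse_effects_chain(raw: Optional[str]) -> Optional[List[Dict[str, Any]]]:
    """Return the effects chain only if it decodes to a list of dicts."""
    if not raw:
        return None
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Unparsable JSON: skip effects rather than fail the generation.
        return None
    # Valid JSON can still be the wrong shape (object, number, list of scalars).
    if isinstance(parsed, list) and all(isinstance(item, dict) for item in parsed):
        return parsed
    return None
```

With this helper the route would call `apply_effects` only when the result is a non-empty list, which matches the existing `if effects_chain_config:` guard.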
Summary
- New voice profile type system (`cloned`/`preset`/`designed`) to support engines that use built-in voices instead of zero-shot cloning
Kokoro Engine
- `backend/backends/kokoro_backend.py`: full `TTSBackend` protocol implementation
- `KPipeline` API with language-aware G2P routing (misaki)
- PyInstaller bundling for `misaki`, `language_tags`, `espeakng_loader`, `en_core_web_sm` (spacy model)
Voice Profile Type System
Kokoro doesn't do traditional voice cloning — it uses pre-built style vectors. Rather than bolting on a temporary workaround, this PR introduces a proper type system for voice profiles that will also support Qwen CustomVoice (text-described voices) in the future.
Schema: New columns on the `profiles` table: `voice_type`, `preset_engine`, `preset_voice_id`, `design_prompt`, `default_engine`. Idempotent migration runs on startup.
Create Voice dialog: Toggle between "Clone from audio" (existing flow) and "Built-in voice" (pick from Kokoro's 50 voices in a grid). Both flows produce a profile that appears in the same grid.
Profile ↔ Engine interaction:
- `default_engine` on any profile auto-selects the engine when that profile is picked
Edit dialog: Preset profiles show their assigned voice info instead of the sample list. Default engine dropdown available on both create and edit for all profile types.
Effects pre-fill: Profile default effects auto-populate the generation bar effects dropdown, matching against known presets or showing "Profile default".
Bug Fix: Relative Audio Paths
`config.set_data_dir()` now `.resolve()`s to absolute paths. A startup migration converts existing relative `audio_path` values in `generations`, `generation_versions`, `profile_samples`, and `profiles` to absolute. Fixes 404s in production builds where CWD ≠ data directory.
Files Changed (26 files, +1015 / -292)
New: `backend/backends/kokoro_backend.py`
Backend: `backends/__init__.py`, `build_binary.py`, `config.py`, `database/migrations.py`, `database/models.py`, `models.py`, `requirements.txt`, `routes/profiles.py`, `services/profiles.py`
Frontend: `EngineModelSelector.tsx`, `FloatingGenerateBox.tsx`, `GenerationForm.tsx`, `ModelManagement.tsx`, `ProfileCard.tsx`, `ProfileForm.tsx`, `ProfileList.tsx`, `client.ts`, `types.ts`, `languages.ts`, `useGenerationForm.ts`, `uiStore.ts`
Docs: `tts-engines.mdx`, `PROJECT_STATUS.md`
Summary by CodeRabbit
New Features
Improvements
Documentation