Backend refactor: modular architecture, style guide, tooling (#285)
Phase 1 - delete dead code:
- studio.py, migrate_add_instruct.py, utils/validation.py
- duplicate _profile_to_response in main.py; duplicate asyncio import
- pointless _get_profiles_dir/_get_generations_dir wrappers
- duplicate LANGUAGE_CODE_TO_NAME and WHISPER_HF_REPOS constants

Phase 2 - extract backends/base.py with shared utilities:
- is_model_cached() replaces 7 copy-pasted HF cache checks
- get_torch_device() replaces 5 device detection methods
- combine_voice_prompts() replaces 5 identical implementations
- model_load_progress() ctx manager replaces progress boilerplate in all backends
- patch_chatterbox_f32() replaces identical monkey-patches in both chatterbox backends

Net -1078 lines across the backend.
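The `model_load_progress()` helper is named above but not shown in the PR text. A minimal sketch of what such a context manager might look like (the reporting callback and message wording are assumptions, not the actual backend code):

```python
from contextlib import contextmanager
import time

@contextmanager
def model_load_progress(model_name: str, report=print):
    """Report start, success (with elapsed time), or failure around a model load."""
    report(f"loading {model_name}...")
    start = time.monotonic()
    try:
        yield
    except Exception:
        report(f"failed to load {model_name}")
        raise
    else:
        report(f"loaded {model_name} in {time.monotonic() - start:.1f}s")

# Usage: each backend wraps its load step instead of duplicating the boilerplate
messages: list[str] = []
with model_load_progress("demo-model", report=messages.append):
    pass  # backend-specific load would happen here
```

Centralizing this in backends/base.py is what lets one context manager replace the per-backend progress boilerplate.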
…akefile
- Add backend/STYLE_GUIDE.md covering formatting, imports, types, docstrings, comments, error handling, async, logging, and naming conventions
- Add pyproject.toml with ruff linter/formatter config (ERA, FIX, isort, pyupgrade)
- Extract generation service (Phase 3): unified run_generation() replaces three duplicated closures; serial queue moved to services/task_queue.py
- Delete Makefile in favor of justfile; update all references
- Add Python lint/format/test commands to justfile (check-python, fix-python, test)
- Install ruff, pytest, pytest-asyncio as dev tools in setup-python
- Update REFACTOR_PLAN.md with Phase 3 and Phase 7 completion
Split the 2,578-line main.py (90 routes) into 12 domain-specific router modules under routes/. main.py is now a 45-line entry point.

New structure:
- app.py: FastAPI instance, CORS, startup/shutdown, safe_content_disposition
- routes/: health, profiles, channels, generations, history, transcription, stories, effects, audio, models, tasks, cuda
- services/cuda.py: moved from cuda_download.py

Also includes the Phase 5 database/ package (from a parallel agent):
- database/__init__.py re-exports all symbols for backward compat
- database/models.py, session.py, migrations.py, seed.py

All 90 routes verified registered, and the app imports cleanly.
…ils/
Move 9 business-logic modules from the backend root into services/: channels, effects, history, profiles, stories, versions, export_import, transcribe, tts. Move platform_detect.py into utils/. The backend root now contains only infrastructure (app, main, config, server, models, build_binary) and docs. All 94 routes verified.
…277)
On Windows, Voicebox shelled out to netstat.exe on startup to check for existing server processes. On systems with corrupted DLLs, netstat fails with 0xc0000142, causing an infinite loading loop. Replace it with:
- TcpStream::connect_timeout() for port-in-use checks (pure Rust)
- PowerShell Get-NetTCPConnection for port-to-PID lookup (built-in cmdlet)
- tasklist for process name verification (unchanged)

Closes #277
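The pure-socket approach described above avoids spawning any external process. As an illustration (not the PR's actual Rust code), here is a Python analogue of the `TcpStream::connect_timeout()` port-in-use check:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.25) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Probing with a direct connect (rather than shelling out to netstat)
    means no child process can fail with a loader error like 0xc0000142.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # connection refused / timed out: nothing is listening
        return False
```

The same check in Rust is `TcpStream::connect_timeout(&addr, timeout).is_ok()`.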
…guide
Replace the outdated backend README (473 lines of stale API docs and a pre-refactor file tree) with a concise architecture document covering module structure, request flow, backend selection, API domain overview, and development commands. Delete REFACTOR_PLAN.md -- all phases are complete. Update STYLE_GUIDE.md to remove refactor-plan references and replace the verbose target layout with the current actual structure.
Compiled Xcode asset catalog gets regenerated every build. No reason to track it.
Note: Reviews paused. This branch is under active development; to avoid overwhelming you with review comments from an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Replaces the Makefile with a justfile, adds a FastAPI app factory and many new routers, splits the database into models/migrations/session/seed, introduces a ModelConfig-backed backend registry and shared backend helpers, adds a serial generation queue and run_generation orchestration, updates frontend engine/model selection, and removes legacy docs/examples.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant API as FastAPI
    participant DB
    participant Queue as TaskQueue
    participant Worker as GenerationWorker
    participant Backend as TTSBackend
    participant Store as Storage
    Client->>API: POST /generations (GenerationRequest)
    API->>DB: INSERT Generation (status=loading_model)
    API->>Queue: enqueue_generation(coroutine)
    API-->>Client: 200 GenerationResponse (id)
    Queue->>Worker: deliver coroutine
    Worker->>Backend: ensure model loaded (engine, model_size)
    Backend->>Store: load/cache model assets
    Backend-->>Worker: model ready
    Worker->>Backend: run TTS generation (chunked)
    Worker->>Store: write WAV file(s)
    Worker->>DB: UPDATE Generation status -> generating/completed
    Client->>API: GET /generations/{id}/status (SSE)
    API->>DB: poll updates -> stream status events
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 18
Note: Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
🟡 Minor comments (14)
backend/README.md-22-35 (1)
22-35: ⚠️ Potential issue | 🟡 Minor: Add fence languages to markdown code blocks (Line 22, Line 39, Line 112).

These fences should declare a language (use `text`) to satisfy markdownlint MD040.

Suggested doc-only fix:

-```
+```text
 backend/
   app.py       # FastAPI app factory, CORS, lifecycle events
   main.py      # Entry point (imports app, runs uvicorn)
   config.py    # Data directory paths and configuration
   models.py    # Pydantic request/response schemas
   server.py    # Tauri sidecar launcher, parent-pid watchdog
@@
-```
+```text
 HTTP request
   -> routes/    (validate input, parse params)
   -> services/  (business logic, database queries, orchestration)
   -> backends/  (TTS/STT inference)
   -> utils/     (audio processing, effects, caching)
@@
-```
+```text
 {data_dir}/
   voicebox.db        # SQLite database
   profiles/{id}/     # Voice samples per profile
   generations/       # Generated audio files
   cache/             # Voice prompt cache (memory + disk)
   backends/          # Downloaded CUDA binary (if applicable)

Also applies to: 39-45, 112-119
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/README.md` around lines 22 - 35, Update the three markdown code fences that contain the directory listing, the HTTP request flow, and the {data_dir}/ listing so they declare a language; replace the opening triple backticks (```) with triple backticks plus "text" (```text) for the blocks starting with "backend/", "HTTP request", and "{data_dir}/" respectively to satisfy markdownlint MD040; ensure only the fence header is changed and the block contents remain identical.backend/README.md-55-55 (1)
55-55: ⚠️ Potential issue | 🟡 Minor: Prevent markdown strong-style parsing on the `__init__` filename (Line 55).

`**backends/__init__.py**` can trigger MD050 due to the underscores. Wrap the filename in backticks inside the bold text.

Suggested doc-only fix:

-**backends/__init__.py** -- Protocol definitions (`TTSBackend`, `STTBackend`), model config registry, and factory functions. Adding a new engine means implementing the protocol and registering a config entry.
+**`backends/__init__.py`** -- Protocol definitions (`TTSBackend`, `STTBackend`), model config registry, and factory functions. Adding a new engine means implementing the protocol and registering a config entry.
Verify each finding against the current code and only fix it if needed. In `@backend/README.md` at line 55, Change the bolded filename token to use inline code so the Markdown linter won't parse underscores as emphasis: replace the bold text "**backends/__init__.py**" with bolded inline-code "**`backends/__init__.py`**" in the README (the line describing backends/__init__.py, where the protocols TTSBackend and STTBackend are referenced).tauri/src-tauri/src/main.rs-17-18 (1)
17-18: ⚠️ Potential issue | 🟡 Minor: Update the helper docs to match actual behavior.

Lines 17-18 mention `TcpListener::bind`, but this function only uses PowerShell + `tasklist`. The comment is now misleading for future maintenance.
Verify each finding against the current code and only fix it if needed. In `@tauri/src-tauri/src/main.rs` around lines 17 - 18, The doc comment currently claims the helper uses TcpListener::bind, but the implementation actually invokes PowerShell and tasklist to find a voicebox process and return its PID; update the helper's documentation comment to remove the TcpListener::bind reference and clearly state that it runs a PowerShell tasklist scan (using tasklist) to locate the voicebox process and returns the PID when found, and mention any platform constraint (Windows) if applicable so future maintainers (and the function that follows) see accurate behavior.CHANGELOG.md-69-79 (1)
69-79: ⚠️ Potential issue | 🟡 Minor: Consolidate entries under a single `Unreleased` section.

The file now has two `## [Unreleased]` headers (Line 8 and Line 62). Please merge these new bullets into one section to avoid ambiguity for readers/tools.

🧩 Suggested cleanup:

-## [Unreleased]
-
-### Fixed
-- Audio export failing when Tauri save dialog returns object instead of string path
-- OpenAPI client generator script now documents the local backend port and avoids an unused loop variable warning
-
-### Added
-- **justfile** - Comprehensive development workflow automation with commands for setup, development, building, testing, and code quality checks
-  - Cross-platform support (macOS, Linux, Windows)
-  - Python version detection and compatibility warnings
-  - Self-documenting help system with `just --list`
-
-### Changed
-- **README** - Updated Quick Start with justfile-based setup instructions
-
-### Removed
-- **Makefile** - Replaced by justfile (cross-platform, simpler syntax)
+## [Unreleased]
+### Fixed
+- **Profile Name Validation** - Added proper validation to prevent duplicate profile names ([`#134`](https://github.com/jamiepine/voicebox/issues/134))
+  - Users now receive clear error messages when attempting to create or update profiles with duplicate names
+  - Improved error handling in create and update profile API endpoints
+  - Added comprehensive test suite for duplicate name validation
+- Audio export failing when Tauri save dialog returns object instead of string path
+- OpenAPI client generator script now documents the local backend port and avoids an unused loop variable warning
+
+### Added
+- **justfile** - Comprehensive development workflow automation with commands for setup, development, building, testing, and code quality checks
+  - Cross-platform support (macOS, Linux, Windows)
+  - Python version detection and compatibility warnings
+  - Self-documenting help system with `just --list`
+
+### Changed
+- **README** - Updated Quick Start with justfile-based setup instructions
+
+### Removed
+- **Makefile** - Replaced by justfile (cross-platform, simpler syntax)
Verify each finding against the current code and only fix it if needed. In `@CHANGELOG.md` around lines 69 - 79, There are duplicate "## [Unreleased]" sections; merge the entries under both headers into a single Unreleased section by combining the bullets for "**justfile**", the "Changed" entry for "**README**", and the "Removed" entry for "**Makefile**" into one consolidated "## [Unreleased]" header, remove the extra duplicate header, and ensure the combined bullets maintain their original ordering and formatting (including sub-bullets like "Cross-platform support" and "Python version detection") so tools/readers see only one Unreleased block.backend/routes/tasks.py-41-42 (1)
41-42: ⚠️ Potential issue | 🟡 Minor: Preserve the exception chain with `from e`.

Per static analysis, exceptions raised within an `except` block should chain to preserve the original traceback.

🐛 Proposed fix:

 except Exception as e:
-    raise HTTPException(status_code=500, detail=f"Failed to clear cache: {str(e)}")
+    raise HTTPException(status_code=500, detail=f"Failed to clear cache: {e!s}") from e
Verify each finding against the current code and only fix it if needed. In `@backend/routes/tasks.py` around lines 41 - 42, In the except block that catches Exception and raises an HTTPException (the block that currently does: except Exception as e: raise HTTPException(status_code=500, detail=f"Failed to clear cache: {str(e)}")), preserve the exception chain by re-raising the HTTPException with "from e" (i.e., raise HTTPException(status_code=500, detail=...) from e) so the original traceback is kept; update that raise in the clear-cache route handler in backend/routes/tasks.py accordingly.backend/routes/audio.py-9-9 (1)
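To see why `from e` matters: it stores the original exception on `__cause__`, so the root cause survives in tracebacks even after the handler re-raises. A small self-contained demonstration (toy stand-ins, not FastAPI's actual `HTTPException`):

```python
class HTTPException(Exception):
    """Toy stand-in for FastAPI's HTTPException, just for this demo."""
    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code

def clear_cache():
    raise OSError("disk error")  # simulated low-level failure

def clear_cache_route():
    try:
        clear_cache()
    except Exception as e:
        # "from e" records the original error on __cause__, so logs and
        # debuggers can still see the root cause behind the 500 response
        raise HTTPException(500, f"Failed to clear cache: {e}") from e
```

Without `from e`, Python still sets `__context__` implicitly, but explicit chaining signals intent and silences the lint warning.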
9-9: ⚠️ Potential issue | 🟡 Minor: Unused import `models`.

The `models` module is imported but never used in this file.

🧹 Proposed fix:

-from .. import models
 from ..services import history
Verify each finding against the current code and only fix it if needed. In `@backend/routes/audio.py` at line 9, The import models at the top of backend/routes/audio.py is unused; remove the line "from .. import models" (or otherwise use the imported symbol in functions that require it) so the file no longer contains an unused import and linting/flake8 warnings are resolved.backend/services/export_import.py-97-97 (1)
97-97: ⚠️ Potential issue | 🟡 Minor: Unused variable `profile_dir`.

This variable is assigned but never used in `export_profile_to_zip`. It appears to be leftover from refactoring.

🧹 Proposed fix:

-    profile_dir = config.get_profiles_dir() / profile_id
     for sample in samples:
Verify each finding against the current code and only fix it if needed. In `@backend/services/export_import.py` at line 97, In export_profile_to_zip remove the unused assignment to profile_dir (the expression config.get_profiles_dir() / profile_id) since it’s leftover from refactoring; locate the line assigning profile_dir inside the export_profile_to_zip function and delete it, or if the path is actually needed use the existing variables/use the path where required instead of introducing profile_dir.backend/app.py-106-120 (1)
106-120: ⚠️ Potential issue | 🟡 Minor: Database session not properly closed on all paths.

Using `next(get_db())` bypasses the generator's `finally` block cleanup. If an exception occurs before `db.close()`, the session may leak. Consider using a context manager pattern.

🛡️ Proposed fix:

 try:
     from sqlalchemy import text as sa_text
-    db = next(get_db())
-    result = db.execute(
-        sa_text(
-            "UPDATE generations SET status = 'failed', "
-            "error = 'Server was shut down during generation' "
-            "WHERE status = 'generating'"
+    from contextlib import contextmanager
+
+    @contextmanager
+    def db_session():
+        db = database.SessionLocal()
+        try:
+            yield db
+        finally:
+            db.close()
+
+    with db_session() as db:
+        result = db.execute(
+            sa_text(
+                "UPDATE generations SET status = 'failed', "
+                "error = 'Server was shut down during generation' "
+                "WHERE status = 'generating'"
+            )
         )
-    )
-    if result.rowcount > 0:
-        print(f"Marked {result.rowcount} stale generation(s) as failed")
-    db.commit()
-    db.close()
+        if result.rowcount > 0:
+            print(f"Marked {result.rowcount} stale generation(s) as failed")
+        db.commit()

Or simpler, keep the `try`/`finally` with `db.close()` already in place and ensure exception paths also close:

 db = next(get_db())
-result = db.execute(
-    sa_text(
-        "UPDATE generations SET status = 'failed', "
-        "error = 'Server was shut down during generation' "
-        "WHERE status = 'generating'"
+try:
+    result = db.execute(
+        sa_text(
+            "UPDATE generations SET status = 'failed', "
+            "error = 'Server was shut down during generation' "
+            "WHERE status = 'generating'"
+        )
     )
-)
-if result.rowcount > 0:
-    print(f"Marked {result.rowcount} stale generation(s) as failed")
-db.commit()
-db.close()
+    if result.rowcount > 0:
+        print(f"Marked {result.rowcount} stale generation(s) as failed")
+    db.commit()
+finally:
+    db.close()
Verify each finding against the current code and only fix it if needed. In `@backend/app.py` around lines 106 - 120, The code calls next(get_db()) which bypasses the generator's cleanup; fix by capturing the generator (e.g. gen = get_db()) and then db = next(gen), wrap the DB work in try/finally and in finally call gen.close() (and/or db.close() if still needed) so the generator's finally block always runs on exceptions; update the block around get_db(), db, and commit/close to use this try/finally pattern to guarantee session cleanup.backend/database/__init__.py-24-44 (1)
24-44: ⚠️ Potential issue | 🟡 Minor: Keep `__all__` sorted to satisfy Ruff.

RUF022 is already firing on Lines 24-44, so this file will stay noisy in `ruff check` until the export list is ordered consistently.
Verify each finding against the current code and only fix it if needed. In `@backend/database/__init__.py` around lines 24 - 44, The __all__ export list in this module is unsorted and triggers RUF022; reorder the entries in the __all__ list (the string names such as "AudioChannel", "Base", "ChannelDeviceMapping", "EffectPreset", "Generation", "GenerationVersion", "ProfileChannelMapping", "ProfileSample", "Project", "SessionLocal", "_db_path", "Story", "StoryItem", "VoiceProfile", "engine", "get_db", "init_db") into a consistent alphabetical order (respecting your project's case-sensitivity rules) so Ruff stops flagging it; update the existing __all__ declaration (not its values elsewhere) to use the sorted sequence.docs/plans/ADDING_TTS_ENGINES.md-194-214 (1)
194-214: ⚠️ Potential issue | 🟡 Minor: Annotate these fenced blocks with `bash`.

markdownlint MD040 is already flagging the snippets on Lines 194, 206, and 212, so the new docs will stay noisy until those fences declare a language.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/ADDING_TTS_ENGINES.md` around lines 194 - 214, The fenced code blocks showing shell commands (the block with "pip install --no-deps chatterbox-tts", the block showing "linacodec @ git+https://...", and the block with "--find-links https://k2-fsa.github.io/icefall/piper_phonemize.html") should be annotated as bash to satisfy markdownlint MD040; update each triple-backtick fence to ```bash so the snippets that include the pip install line, the git+https URIs, and the --find-links line are explicitly marked as shell/bash code blocks.docs/plans/PROJECT_STATUS.md-53-56 (1)
53-56: ⚠️ Potential issue | 🟡 Minor: This architecture table is already stale after the refactor.

The `backend/main.py` row still describes the pre-refactor 2.8k-line monolith, even though this PR turns it into a small bootstrap. The new hard-coded source line ranges in the neighboring rows will rot just as fast, so I'd update the table and drop exact line numbers while you're here.
Verify each finding against the current code and only fix it if needed. In `@docs/plans/PROJECT_STATUS.md` around lines 53 - 56, Update the architecture table to reflect the refactor: change the `backend/main.py` row to describe it as a small bootstrap/entrypoint (not a ~2850-line monolith), remove all hard-coded source line ranges (e.g., the `:32-101`, `:17-29,153-366`, `:382-426` annotations) and instead reference symbols or files only (for example list `TTSBackend` Protocol, `ModelConfig` dataclass, and the engine registry in `backend/backends/__init__.py` without line numbers), and update the TTS factory row to describe the thread-safe engine registry (double-checked locking) generically so the table stays accurate after future refactors.backend/routes/health.py-31-35 (1)
31-35: ⚠️ Potential issue | 🟡 Minor: Use `call_later()` to ensure the shutdown callback actually runs.

`asyncio.create_task(shutdown_async())` creates a task with only a weak reference from the event loop. If this reference is the only one after the handler returns, the task can be garbage-collected before the 0.1-second delay elapses, making the `/shutdown` endpoint a no-op. This is explicitly documented in Python's asyncio docs as a known issue with fire-and-forget tasks.

Suggested fix:

-    async def shutdown_async():
-        await asyncio.sleep(0.1)
-        os.kill(os.getpid(), signal.SIGTERM)
-
-    asyncio.create_task(shutdown_async())
+    asyncio.get_running_loop().call_later(0.1, os.kill, os.getpid(), signal.SIGTERM)
Verify each finding against the current code and only fix it if needed. In `@backend/routes/health.py` around lines 31 - 35, The shutdown task may be garbage-collected before its sleep runs because asyncio.create_task(shutdown_async()) is a weak reference; replace that scheduling with the event loop's call_later to guarantee execution: get the current loop (e.g. via asyncio.get_event_loop() or asyncio.get_running_loop()) and use loop.call_later(0.1, lambda: asyncio.create_task(shutdown_async())), or alternatively move the os.kill call into a non-async callback and schedule that with loop.call_later(0.1, shutdown_callback) so the shutdown sequence in shutdown_async() reliably executes; update references to shutdown_async and the scheduling call accordingly.backend/routes/profiles.py-179-182 (1)
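The difference is easy to demonstrate: `loop.call_later()` keeps a strong reference to its callback until it fires, so the scheduled work cannot be collected away. A minimal sketch, using a list append in place of `os.kill`:

```python
import asyncio

async def main() -> list[str]:
    events: list[str] = []
    loop = asyncio.get_running_loop()
    # call_later holds a strong reference to the callback until it fires,
    # unlike a bare asyncio.create_task(), which the loop references weakly
    loop.call_later(0.01, events.append, "fired")
    await asyncio.sleep(0.05)  # keep the loop alive past the delay
    return events

events = asyncio.run(main())
```

Extra positional arguments to `call_later` are passed straight to the callback, which is why `loop.call_later(0.1, os.kill, os.getpid(), signal.SIGTERM)` needs no wrapper function.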
179-182: ⚠️ Potential issue | 🟡 Minor: Potential `None` filename causing a `Path(None).suffix` error.

`file.filename` can be `None` for some upload scenarios. Unlike line 114, which handles this with `file.filename or ""`, this line does not.

🐛 Proposed fix:

-    with tempfile.NamedTemporaryFile(delete=False, suffix=Path(file.filename).suffix) as tmp:
+    with tempfile.NamedTemporaryFile(delete=False, suffix=Path(file.filename or "").suffix or ".tmp") as tmp:
Verify each finding against the current code and only fix it if needed. In `@backend/routes/profiles.py` around lines 179 - 182, The code calls Path(file.filename).suffix without guarding against file.filename being None; change the NamedTemporaryFile suffix creation to use a safe fallback (e.g., Path(file.filename or "").suffix or a default suffix) so Path never receives None. Update the tempfile.NamedTemporaryFile(...) call that uses Path(file.filename).suffix, keep the same tmp_path/tmp variables and ensure file.read() and tmp.write(content) behavior is unchanged.backend/backends/base.py-113-120 (1)
113-120: ⚠️ Potential issue | 🟡 Minor: Type inconsistency in the DirectML branch: the function declares `-> str` but returns a device object.

The function signature explicitly specifies `-> str`, but `torch_directml.device(0)` returns a `torch.device`-like object, while all other branches return string literals (`"cuda"`, `"cpu"`, `"xpu"`, `"mps"`).

Although PyTorch's `.to()` accepts both strings and device objects at runtime, this violates the return type contract and can confuse static type checkers. Align the return type or convert the DirectML device to a string representation.
Verify each finding against the current code and only fix it if needed. In `@backend/backends/base.py` around lines 113 - 120, The DirectML branch (guarded by allow_directml) currently returns a torch_directml.device object (via torch_directml.device(0)) while the function signature is -> str; convert the device object to a string before returning (e.g., use str(...) or another string representation) so the branch returns the same type as the other branches ("cuda"/"cpu"/"xpu"/"mps") and keep the signature as -> str; update the torch_directml.device(0) return to return a string and ensure any callers expecting strings still work.
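One way to honor the `-> str` contract is to stringify the device object before returning it. A simplified, torch-free sketch of the selection logic (capability flags are passed in for illustration; the real helper queries torch's availability probes directly):

```python
class FakeDirectMLDevice:
    """Stand-in for the object torch_directml.device(0) returns."""
    def __str__(self) -> str:
        return "privateuseone:0"

def pick_device(caps: dict[str, bool], directml_device=None) -> str:
    """Resolve a device preference order, always returning a plain string.

    `caps` maps backend name -> availability, standing in for the real
    torch.cuda.is_available()-style checks.
    """
    for name in ("cuda", "xpu", "mps"):
        if caps.get(name):
            return name
    if directml_device is not None:
        return str(directml_device)  # stringify to keep the -> str contract
    return "cpu"
```

Since `.to()` accepts device strings, converting to `str` changes nothing at runtime while satisfying type checkers.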
🧹 Nitpick comments (27)
backend/services/stories.py (3)
5-5: Consider using built-in generics per the style guide.

The style guide for Python 3.12+ recommends built-in generics (`list[T]`, `T | None`) instead of `typing.List` and `typing.Optional`. This can be addressed incrementally as files are touched during the refactor.

♻️ Suggested change:

-from typing import List, Optional
+from typing import TYPE_CHECKING

Then update function signatures:
- `List[StoryResponse]` → `list[StoryResponse]`
- `Optional[StoryDetailResponse]` → `StoryDetailResponse | None`
Verify each finding against the current code and only fix it if needed. In `@backend/services/stories.py` at line 5, Replace typing generics with built-in generics: change the import from typing.List and typing.Optional usage to use Python 3.12+ built-ins. Update the top-level import that currently references List/Optional and modify function signatures that return List[StoryResponse] to return list[StoryResponse] and any occurrences of Optional[StoryDetailResponse] to StoryDetailResponse | None; search for symbols StoryResponse and StoryDetailResponse and update their annotations accordingly.
57-61: The `getattr` usage suggests defensive coding for nullable columns.

Using `getattr(item, "version_id", None)` and similar patterns implies these fields might not always exist on `DBStoryItem`. If they are defined columns in the ORM model, direct attribute access would be cleaner and more type-safe. If they're genuinely optional (added in a migration), consider adding type hints to the model.
Verify each finding against the current code and only fix it if needed. In `@backend/services/stories.py` around lines 57 - 61, The code uses defensive getattr calls (e.g., getattr(item, "version_id", None), getattr(item, "trim_start_ms", 0), getattr(item, "trim_end_ms", 0)) when building the story dict; if these are defined ORM columns on DBStoryItem, replace getattr with direct attribute access (item.version_id, item.trim_start_ms, item.trim_end_ms) to be clearer and more type-safe, otherwise annotate DBStoryItem to mark these fields as Optional[int] (or the correct type) and keep getattr only if model truly may lack the attribute; update usages in the function that constructs the story (references: DBStoryItem, version_id, trim_start_ms, trim_end_ms) accordingly so the code and type hints align.
95-96: `datetime.utcnow()` is deprecated in Python 3.12.

Consider using `datetime.now(timezone.utc)` instead, which is the recommended replacement. This applies to all occurrences in the file.

♻️ Suggested change:

-from datetime import datetime
+from datetime import datetime, timezone

 # Then replace usages:
-created_at=datetime.utcnow(),
+created_at=datetime.now(timezone.utc),
Verify each finding against the current code and only fix it if needed. In `@backend/services/stories.py` around lines 95 - 96, Replace all uses of datetime.utcnow() in this file with timezone-aware UTC timestamps by calling datetime.now(datetime.timezone.utc); specifically update the assignments to created_at and updated_at (and any other occurrences) to use datetime.now(datetime.timezone.utc) and ensure datetime.timezone is referenced or imported so timestamps are timezone-aware and compliant with Python 3.12.backend/services/task_queue.py (2)
30-31: Replace `traceback.print_exc()` with `logger.exception()` per the style guide.

The style guide specifies using `logger.exception()` inside `except` blocks, as it captures the traceback automatically and integrates with the logging infrastructure.

♻️ Proposed fix:

+import logging
+
+logger = logging.getLogger(__name__)
+
 # Keep references to fire-and-forget background tasks to prevent GC
 _background_tasks: set = set()

 try:
     await coro
 except Exception:
-    traceback.print_exc()
+    logger.exception("Generation task failed")
 finally:
     _generation_queue.task_done()

After this change, you can remove the `import traceback` line.
Verify each finding against the current code and only fix it if needed. In `@backend/services/task_queue.py` around lines 30 - 31, Replace the bare traceback.print_exc() call in the except block of backend.services.task_queue (the except in the code containing traceback.print_exc()) with logger.exception(...) so the exception and stack trace are logged via the module logger; update the message to provide context if needed and then remove the now-unused import traceback from the top of the file. Ensure logger is the correct logger instance used in task_queue (or import/get it) so logger.exception is available.
24-33: Consider adding graceful shutdown support for the worker.

`_generation_worker` runs an infinite loop with no shutdown mechanism. If the application needs to shut down cleanly (e.g., waiting for in-flight generation to complete), there's no way to signal the worker to stop. This may be acceptable for now but could cause issues with graceful shutdown during development reloads or production deployments.
Verify each finding against the current code and only fix it if needed. In `@backend/services/task_queue.py` around lines 24 - 33, The worker _generation_worker currently loops forever and needs a shutdown mechanism; add a module-level asyncio.Event (e.g., _generation_shutdown_event) or a sentinel object and implement a stop function (e.g., stop_generation_worker) that signals the event or enqueues the sentinel to wake the loop; change _generation_worker to await either _generation_queue.get() or _generation_shutdown_event.wait() (or check for the sentinel after get()), break out of the loop when shutdown is signaled but ensure in-flight coro is awaited and _generation_queue.task_done() is called so pending tasks are drained and the worker exits cleanly; update any startup code to create the event and use stop_generation_worker during shutdown.backend/STYLE_GUIDE.md (1)
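A sentinel-based shutdown for such a queue worker can be sketched as follows (illustrative names, not the actual `task_queue.py` code): the worker drains normal items, and a sentinel object pushed after the real work tells it to exit cleanly.

```python
import asyncio

_STOP = object()  # sentinel: tells the worker to exit after draining

async def worker(queue: asyncio.Queue, results: list) -> None:
    while True:
        item = await queue.get()
        try:
            if item is _STOP:
                break
            await item(results)  # each queued item is an async callable here
        finally:
            queue.task_done()

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    results: list = []
    task = asyncio.create_task(worker(queue, results))

    async def job(out: list) -> None:
        out.append("done")

    await queue.put(job)
    await queue.put(_STOP)  # enqueue the sentinel after the real work
    await task              # worker finishes in-flight items, then exits
    return results

results = asyncio.run(main())
```

Because the sentinel sits behind all queued jobs, in-flight and pending work completes before the loop breaks, which is exactly the graceful-shutdown property the comment asks for.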
372-385: Add a language specifier to the code block.

The project layout code block is missing a language specifier, which triggers a markdownlint warning (MD040). Adding a language helps with rendering and accessibility.

📝 Proposed fix:

 ## Project Layout

-```
+```text
 backend/
   app.py  # FastAPI app factory, CORS, lifecycle events
Verify each finding against the current code and only fix it if needed. In `@backend/STYLE_GUIDE.md` around lines 372 - 385, Update the project layout code fence in the STYLE_GUIDE.md project layout block so the opening triple backticks include a language specifier (e.g., change from ``` to ```text or ```plaintext) for the backend/ listing code block; locate the project layout code block (the one starting with "backend/" and the list of files like app.py, main.py, config.py) and modify its opening fence to include the language specifier to satisfy markdownlint MD040.backend/services/channels.py (1)
84-84: Consider using `datetime.now(timezone.utc)` instead of `datetime.utcnow()`.

`datetime.utcnow()` is deprecated as of Python 3.12. The recommended approach is:

 from datetime import datetime, timezone
 # ...
 created_at=datetime.now(timezone.utc),

This can be addressed in a follow-up if the project targets Python 3.12+.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/channels.py` at line 84, Replace the use of datetime.utcnow() for created_at with a timezone-aware timestamp: import timezone from datetime and set created_at to datetime.now(timezone.utc) in the code that constructs the channel (look for the created_at=datetime.utcnow() assignment in backend/services/channels.py, e.g., where the channel or record is created); update the import to include timezone (from datetime import datetime, timezone) and use datetime.now(timezone.utc).backend/tests/test_generation_download.py (1)
196-254: Test Case 2 is defined but never called.

`test_generation_with_fresh_download()` is not invoked in `main()`. If this is intentional (to avoid slow model downloads during regular test runs), consider adding a comment or removing the unused code. Alternatively, you could add a CLI flag to optionally run it.
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_generation_download.py` around lines 196 - 254, The test function test_generation_with_fresh_download is defined but never called from main; either invoke it conditionally from main or mark it as intentionally skipped. Fix by updating main to optionally run test_generation_with_fresh_download when a flag/env var is set (e.g., RUN_SLOW_TESTS or a CLI --run-slow-tests) and ensure the call awaits the coroutine (await test_generation_with_fresh_download()), or add a clear comment above the function explaining it is intentionally unused and remove it if not needed; reference test_generation_with_fresh_download and main to locate where to add the conditional invocation or the explanatory comment.backend/services/export_import.py (2)
349-349: Redundant import of `config`.

`config` is already imported at module level (line 19). This inner import is unnecessary.

♻️ Proposed fix
```diff
     from pathlib import Path
     import tempfile
     import shutil
     from datetime import datetime
-    from .. import config
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/export_import.py` at line 349, The function/file contains a redundant inner import of config (the statement "from .. import config") even though config is already imported at module scope; remove the duplicate inner import so only the module-level import remains and ensure any references use the existing module-level name (config) to avoid shadowing or re-importing.
417-417: Prefer standard import over `__import__`.

Using `__import__('uuid')` inline is non-idiomatic. Consider importing `uuid` at the module level with other standard library imports.

♻️ Proposed fix
Add to imports at top of file:
```python
import uuid
```

Then update line 417:
```diff
-    new_generation_id = str(__import__('uuid').uuid4())
+    new_generation_id = str(uuid.uuid4())
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/export_import.py` at line 417, Replace the inline dynamic import usage in the new_generation_id assignment with a standard module import: add "import uuid" to the module imports and change the assignment that uses __import__('uuid').uuid4() to call uuid.uuid4() (keeping the str(...) conversion) so the code uses the idiomatic, module-level uuid import; update the statement that sets new_generation_id accordingly.

backend/routes/tasks.py (2)
66-75: Repeated encapsulation violation.

Same issue as lines 24-27: directly accessing private `_lock` and `_progress` attributes. A public accessor method like `get_progress(model_name)` would be cleaner.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/tasks.py` around lines 66 - 75, The code is directly accessing private attributes progress_manager._lock and progress_manager._progress in two places; replace these private-access patterns with a public accessor on ProgressManager (e.g., implement and call progress_manager.get_progress(model_name)) so callers no longer touch _lock/_progress; update the two blocks that read pm_data and error to call progress_manager.get_progress(model_name) (or an equivalent thread-safe method) which should handle locking and return the progress dict or None.
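To make the suggestion concrete, here is a minimal stdlib-only sketch of what such a public accessor could look like. The class and attribute names mirror the ones quoted in this review; the real `ProgressManager` in this codebase may differ:

```python
import threading

class ProgressManager:
    """Minimal sketch; only the pieces relevant to the accessor are shown."""

    def __init__(self):
        self._lock = threading.Lock()
        self._progress = {}

    def update_progress(self, model_name, data):
        with self._lock:
            self._progress[model_name] = data

    def get_progress(self, model_name):
        # Thread-safe public accessor: callers never touch _lock/_progress.
        with self._lock:
            return self._progress.get(model_name)

pm = ProgressManager()
pm.update_progress("parakeet", {"percent": 42})
print(pm.get_progress("parakeet"))  # {'percent': 42}
print(pm.get_progress("missing"))   # None
```

Routes then call `get_progress(model_name)` and get back either the progress dict or `None`, with locking handled inside the class.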
24-27: Encapsulation violation: accessing private attributes.

Directly manipulating `progress_manager._lock`, `_progress`, `_last_notify_time`, and `_last_notify_progress` breaks encapsulation. Consider adding a `clear()` method to the progress manager class instead.

♻️ Suggested approach
Add a public method to ProgressManager:
```python
def clear(self):
    """Clear all progress state."""
    with self._lock:
        self._progress.clear()
        self._last_notify_time.clear()
        self._last_notify_progress.clear()
```

Then simplify the endpoint:
```diff
     task_manager.clear_all()
-
-    with progress_manager._lock:
-        progress_manager._progress.clear()
-        progress_manager._last_notify_time.clear()
-        progress_manager._last_notify_progress.clear()
+    progress_manager.clear()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/tasks.py` around lines 24 - 27, You are directly accessing private attributes of ProgressManager (progress_manager._lock, _progress, _last_notify_time, _last_notify_progress); add a public ProgressManager.clear() method that acquires the internal lock and clears those internal collections, then replace the direct manipulation in the route with a single call to progress_manager.clear(); reference the ProgressManager class and the route where progress_manager is used to locate the change.

backend/database/session.py (2)
72-78: Guard against uninitialized `SessionLocal`.

If `get_db()` is called before `init_db()`, `SessionLocal` will be `None` and `SessionLocal()` will fail with an unhelpful error. Adding a guard improves debuggability.

🛡️ Proposed fix
```diff
 def get_db():
     """Yield a database session (FastAPI dependency)."""
+    if SessionLocal is None:
+        raise RuntimeError("Database not initialized. Call init_db() first.")
     db = SessionLocal()
     try:
         yield db
     finally:
         db.close()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/database/session.py` around lines 72 - 78, get_db currently calls SessionLocal() without checking if SessionLocal has been initialized, causing a confusing failure if init_db wasn't run; add a guard at the start of get_db to verify SessionLocal is not None and raise a clear RuntimeError (or similar) instructing to call init_db(or initialize the DB) when SessionLocal is uninitialized, so callers get a helpful error instead of a cryptic exception from SessionLocal().
50-50: Use `is_(True)` for SQLAlchemy boolean comparison.

Static analysis flags `== True`, and while SQLAlchemy requires an expression (not a truthy check), using `.is_(True)` is more explicit and avoids the linter warning.

♻️ Proposed fix
```diff
-    default_channel = db.query(AudioChannel).filter(AudioChannel.is_default == True).first()
+    default_channel = db.query(AudioChannel).filter(AudioChannel.is_default.is_(True)).first()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/database/session.py` at line 50, The boolean comparison in the query uses AudioChannel.is_default == True which triggers static analysis; update the filter call used when building default_channel (the db.query(AudioChannel).filter(...).first() expression) to use SQLAlchemy's is_ syntax: replace the equality check with AudioChannel.is_default.is_(True) so the filter becomes filter(AudioChannel.is_default.is_(True)).

backend/app.py (2)
87-91: Optional: simplify `elif` chain after `return`.

Static analysis notes the `elif` is unnecessary after `return`. This is purely stylistic but could be cleaner.

♻️ Proposed simplification
```diff
         return f"CUDA ({device_name})"
-    elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
+    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
         return "MPS (Apple Silicon)"
-    elif backend_type == "mlx":
+    if backend_type == "mlx":
         return "Metal (Apple Silicon via MLX)"
     return "None (CPU only)"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app.py` around lines 87 - 91, The trailing elifs after an earlier return are stylistically unnecessary; in the function that determines the backend (the branch using torch.backends.mps.is_available() and the check backend_type == "mlx"), replace the `elif` checks following the prior `return` with plain `if` checks so control flow is clearer—specifically change the `elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():` and `elif backend_type == "mlx":` to `if ...` while keeping the same return values ("MPS (Apple Silicon)", "Metal (Apple Silicon via MLX)") and the final `return "None (CPU only)"`.
97-98: Consider migrating to lifespan context manager.
`@app.on_event("startup")` and `@app.on_event("shutdown")` are deprecated in FastAPI. The recommended approach is the `lifespan` context manager. This works for now but should be updated in the future.

Also applies to: 149-150
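As a stdlib-only illustration of the lifespan pattern (the names here are hypothetical, and the `serve` driver stands in for the framework, which accepts such a callable via `FastAPI(lifespan=...)`), the two event hooks collapse into one async context manager:

```python
import asyncio
from contextlib import asynccontextmanager

events = []

@asynccontextmanager
async def lifespan(app):
    # code formerly in @app.on_event("startup") runs before the yield
    events.append("startup")
    yield
    # code formerly in @app.on_event("shutdown") runs after the yield
    events.append("shutdown")

async def serve(app=None):
    # the framework drives this context around the application's lifetime
    async with lifespan(app):
        events.append("handling requests")

asyncio.run(serve())
print(events)  # ['startup', 'handling requests', 'shutdown']
```

One advantage over paired event handlers is that startup and shutdown share local state naturally across the `yield`, with no module-level globals needed.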
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app.py` around lines 97 - 98, The code uses the deprecated @application.on_event("startup")/@application.on_event("shutdown") decorators (e.g., startup_event and corresponding shutdown handler); replace them by implementing an async lifespan context manager function (e.g., async def lifespan(application): ... yield ...) that contains the startup logic previously in startup_event and the teardown logic previously in the shutdown handler, and pass that lifespan callable into the FastAPI application construction (application = FastAPI(lifespan=lifespan)) so all startup/shutdown activities run inside the lifespan context.

backend/backends/luxtts_backend.py (2)
130-131: Type hints removed from `combine_voice_prompts`.

The method signature lost its type annotations (`audio_paths: List[str]`, `reference_texts: List[str]`) compared to the original. Consider preserving them for IDE support and static analysis.

♻️ Proposed fix
```diff
-    async def combine_voice_prompts(self, audio_paths, reference_texts):
+    async def combine_voice_prompts(
+        self, audio_paths: list[str], reference_texts: list[str]
+    ) -> tuple[np.ndarray, str]:
         return await _combine_voice_prompts(audio_paths, reference_texts, sample_rate=24000)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/luxtts_backend.py` around lines 130 - 131, The combine_voice_prompts method lost its parameter type annotations; restore the original signature to include audio_paths: List[str] and reference_texts: List[str] (and add an appropriate return type annotation if present previously) so IDEs and static analysis can infer types; update the async def combine_voice_prompts declaration that delegates to _combine_voice_prompts to use those annotations and ensure List is imported from typing if not already.
10-10: Replace `typing.Tuple` with built-in `tuple` type hint.

The project requires Python 3.12+, which supports PEP 585 syntax. Use `tuple` directly instead of importing `Tuple` from `typing`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/luxtts_backend.py` at line 10, Remove Tuple from the typing import and switch any Tuple[...] annotations to the built-in tuple[...] PEP 585 syntax: update the import line "from typing import Optional, Tuple" to only import Optional (e.g., "from typing import Optional") and replace all uses of the symbol "Tuple" in type hints with "tuple" (for example change "Tuple[int, str]" to "tuple[int, str]") within luxtts_backend.py.

backend/database/models.py (1)
7-9: Switch the base import to the SQLAlchemy 2.x API.
`sqlalchemy.ext.declarative.declarative_base()` is the legacy import path in 2.x. Using `sqlalchemy.orm.declarative_base()` keeps the new database package off the deprecation path.

Suggested fix
```diff
-from sqlalchemy.ext.declarative import declarative_base
+from sqlalchemy.orm import declarative_base
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/database/models.py` around lines 7 - 9, Replace the legacy declarative_base import with the SQLAlchemy 2.x API: import declarative_base from sqlalchemy.orm and keep the existing Base = declarative_base() assignment (i.e., update the import that currently references sqlalchemy.ext.declarative to sqlalchemy.orm.declarative_base to avoid deprecation warnings for the Base symbol).

backend/routes/profiles.py (2)
305-306: `datetime.utcnow()` is deprecated in Python 3.12+.

Consider using `datetime.now(timezone.utc)` for timezone-aware UTC timestamps.

♻️ Proposed fix
```diff
+from datetime import datetime, timezone
 ...
-    profile.updated_at = datetime.utcnow()
+    profile.updated_at = datetime.now(timezone.utc)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/profiles.py` around lines 305 - 306, Replace the deprecated call to datetime.utcnow() used when updating profile.updated_at with a timezone-aware UTC timestamp: use datetime.now(datetime.timezone.utc) (ensure datetime.timezone is imported) so profile.updated_at is set to a tz-aware datetime before calling db.commit(); update any related imports or tests that assume naive datetimes.
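The practical difference behind both `utcnow()` findings in this review is that aware timestamps carry `tzinfo` and refuse comparison with naive ones, which is why a migration should switch a whole code path at once rather than mix styles. A quick illustration:

```python
from datetime import datetime, timezone

naive = datetime.utcnow()            # deprecated in 3.12; tzinfo is None
aware = datetime.now(timezone.utc)   # recommended; tzinfo is timezone.utc

print(naive.tzinfo, aware.tzinfo)    # None UTC

# Comparing naive and aware datetimes raises TypeError.
try:
    _ = naive < aware
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)  # False
```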
27-32: Redundant exception handlers with identical behavior.

Both `ValueError` and generic `Exception` handlers return 400. Either combine them or use different status codes.

♻️ Proposed fix
```diff
 async def create_profile(
     data: models.VoiceProfileCreate,
     db: Session = Depends(get_db),
 ):
     """Create a new voice profile."""
     try:
         return await profiles.create_profile(data, db)
     except ValueError as e:
-        raise HTTPException(status_code=400, detail=str(e))
-    except Exception as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise HTTPException(status_code=400, detail=str(e)) from e
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e)) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/profiles.py` around lines 27 - 32, The current try/except around await profiles.create_profile(data, db) has redundant handlers returning 400 for both ValueError and Exception; simplify by keeping a specific ValueError handler that raises HTTPException(status_code=400, detail=str(e)) and change the generic except Exception handler to raise HTTPException(status_code=500, detail="Internal server error") (or include str(e) if you want the message) so unexpected errors map to 500; update the except blocks in the try/except that surrounds profiles.create_profile to reflect this change.

backend/models.py (1)
149-152: TranscriptionRequest language pattern is very restrictive.

The transcription endpoint only accepts `en` or `zh`, but Whisper supports many more languages. Consider aligning with the broader language set or removing the pattern restriction to allow any language hint.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/models.py` around lines 149 - 152, The TranscriptionRequest model currently restricts language to only "en" or "zh" via Field(..., pattern="^(en|zh)$"); update the TranscriptionRequest.language declaration to either remove the pattern constraint or expand it to the broader set of Whisper-supported language codes so callers can pass any valid language hint; locate the language: Optional[str] field on the TranscriptionRequest class and modify its Field(...) invocation accordingly, ensuring any validators or docs consuming that pattern are updated or removed to match the new, less restrictive behavior.

backend/routes/effects.py (1)
101-104: Add exception chaining for better tracebacks.

When re-raising exceptions as `HTTPException`, chain them with `from e` to preserve the original traceback for debugging.

♻️ Proposed fix
```diff
     try:
         return effects_mod.create_preset(data, db)
     except ValueError as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise HTTPException(status_code=400, detail=str(e)) from e
```

Apply the same pattern to lines 122 and 135.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/effects.py` around lines 101 - 104, The except blocks that catch ValueError and re-raise HTTPException (e.g., the try/except around effects_mod.create_preset in backend/routes/effects.py) should chain the original exception by raising HTTPException(...) from e so the original traceback is preserved; update the raise statements in that handler and the analogous handlers (the ones around the other effects_mod calls at the locations referenced, e.g., the blocks near the other two try/excepts) to use "from e" when re-raising.

backend/routes/stories.py (1)
29-32: Broad exception handling may mask unexpected errors.

Catching all `Exception` types and returning 400 could hide unexpected server errors (e.g., database connection issues) that should return 500.

♻️ Proposed fix with exception chaining
```diff
 async def create_story(
     data: models.StoryCreate,
     db: Session = Depends(get_db),
 ):
     """Create a new story."""
     try:
         return await stories.create_story(data, db)
+    except ValueError as e:
+        raise HTTPException(status_code=400, detail=str(e)) from e
     except Exception as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/stories.py` around lines 29 - 32, The current broad except block around stories.create_story(data, db) catches all Exception and returns HTTP 400; change it to catch only expected, domain-specific exceptions (e.g., validation errors or known stories.create_story exceptions) and raise HTTPException(status_code=400, detail=...) for those, and for anything else re-raise or convert to HTTPException(status_code=500, detail="Internal server error") while preserving exception chaining (use "from e") so stack traces are retained; update the except clause(s) around stories.create_story to reference the specific exception classes you expect and add a fallback that produces a 500 with exception chaining.

backend/backends/__init__.py (1)
360-372: Accessing private `_is_model_cached` method on backend instances.

The `ensure_model_cached_or_raise` function accesses `backend._is_model_cached()`, which is a private method (prefixed with `_`). This creates tight coupling and could break if backend implementations change.

Consider either making this method part of the public `TTSBackend` protocol or adding a public wrapper.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/__init__.py` around lines 360 - 372, The code currently calls the private backend._is_model_cached() inside ensure_model_cached_or_raise which couples callers to a private API; add a public method on the TTSBackend protocol (e.g., is_model_cached(self, model_size: Optional[str] = None) -> bool) or a public wrapper on the backend implementations, implement that method across concrete backends, and then replace all usages of backend._is_model_cached(...) in ensure_model_cached_or_raise with the new public backend.is_model_cached(...) (preserving the model_size argument for the "qwen" branch and none for others) so callers rely only on the public contract.

backend/backends/base.py (2)
60-65: Consider simplifying loop with `all()` comprehension.

Per static analysis, the loop can be simplified.
♻️ Simplified version
```diff
         if required_files:
-            # Check that every required filename exists somewhere in snapshots
-            for fname in required_files:
-                if not any(snapshots_dir.rglob(fname)):
-                    return False
-            return True
+            return all(any(snapshots_dir.rglob(fname)) for fname in required_files)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/base.py` around lines 60 - 65, The loop that checks every filename in required_files can be simplified to a single all(...) expression: replace the for-loop that uses if not any(snapshots_dir.rglob(fname)): return False followed by return True with a single return all(any(snapshots_dir.rglob(fname)) for fname in required_files) so it preserves the same semantics while being concise; look for the block referencing required_files and snapshots_dir.rglob in the function in base.py and swap the loop for that all(...) comprehension.
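The loop and the `all(...)` comprehension are behaviorally identical. A quick check against a throwaway directory tree (the file names here are made up for the demo):

```python
import tempfile
from pathlib import Path

# Hypothetical snapshots layout containing one of two required files.
snapshots_dir = Path(tempfile.mkdtemp())
(snapshots_dir / "snap1").mkdir()
(snapshots_dir / "snap1" / "model.safetensors").touch()

def cached_loop(required_files):
    # Original loop form
    for fname in required_files:
        if not any(snapshots_dir.rglob(fname)):
            return False
    return True

def cached_all(required_files):
    # Suggested all() form
    return all(any(snapshots_dir.rglob(fname)) for fname in required_files)

# Present file passes, missing file fails, identically in both forms.
for req in (["model.safetensors"], ["model.safetensors", "config.json"]):
    print(cached_loop(req), cached_all(req))
```

Both forms also share the same short-circuit behavior: `all()` stops at the first missing file, just as the loop returns early.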
195-221: Context manager enter/exit calls are manually managed outside try block.

The `tracker_context.__enter__()` on line 196 is called before the `try` block begins. If an exception occurs between lines 196 and 208 (e.g., in `task_manager.start_download`), the `__exit__` in the `finally` block will still be called, but the exception won't be properly propagated through the context manager's exception handling.

Consider using a nested `with` statement or ensuring the entire yield is wrapped properly.

♻️ Safer pattern using nested with
```diff
-    tracker_context = tracker.patch_download()
-    tracker_context.__enter__()
-
-    if not is_cached:
-        task_manager.start_download(model_name)
-        progress_manager.update_progress(...)
-
-    try:
-        yield tracker_context
-    except Exception as e:
-        ...
-    finally:
-        tracker_context.__exit__(None, None, None)
+    with tracker.patch_download() as tracker_context:
+        if not is_cached:
+            task_manager.start_download(model_name)
+            progress_manager.update_progress(...)
+
+        try:
+            yield tracker_context
+        except Exception as e:
+            progress_manager.mark_error(model_name, str(e))
+            task_manager.error_download(model_name, str(e))
+            raise
+        else:
+            if not is_cached:
+                progress_manager.mark_complete(model_name)
+                task_manager.complete_download(model_name)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/base.py` around lines 195 - 221, The manual call to tracker_context.__enter__ is outside the try so exceptions during setup (e.g., task_manager.start_download or progress_manager.update_progress) bypass the context manager's __exit__; replace the manual enter/exit with a proper with block (use "with tracker.patch_download() as tracker_context:") and move the try/except/else/finally logic so the yield occurs inside that with-block; ensure task_manager.start_download and progress_manager.update_progress remain before the yield but inside the with so any exception triggers tracker_context.__exit__ and the existing error reporting (progress_manager.mark_error / task_manager.error_download) still runs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 826cea17-3abb-4d17-b132-56fc1514395c
⛔ Files ignored due to path filters (2)
- `tauri/src-tauri/Cargo.lock` is excluded by `!**/*.lock`
- `tauri/src-tauri/gen/Assets.car` is excluded by `!**/gen/**`
📒 Files selected for processing (68)
- `.gitignore`
- `CHANGELOG.md`
- `Makefile`
- `PATCH_NOTES.md`
- `app/src/components/Generation/EngineModelSelector.tsx`
- `app/src/components/Generation/FloatingGenerateBox.tsx`
- `app/src/components/Generation/GenerationForm.tsx`
- `app/src/components/Sidebar.tsx`
- `backend/README.md`
- `backend/STYLE_GUIDE.md`
- `backend/app.py`
- `backend/backends/__init__.py`
- `backend/backends/base.py`
- `backend/backends/chatterbox_backend.py`
- `backend/backends/chatterbox_turbo_backend.py`
- `backend/backends/luxtts_backend.py`
- `backend/backends/mlx_backend.py`
- `backend/backends/pytorch_backend.py`
- `backend/build_binary.py`
- `backend/database.py`
- `backend/database/__init__.py`
- `backend/database/migrations.py`
- `backend/database/models.py`
- `backend/database/seed.py`
- `backend/database/session.py`
- `backend/example_usage.py`
- `backend/main.py`
- `backend/migrate_add_instruct.py`
- `backend/models.py`
- `backend/pyproject.toml`
- `backend/routes/__init__.py`
- `backend/routes/audio.py`
- `backend/routes/channels.py`
- `backend/routes/cuda.py`
- `backend/routes/effects.py`
- `backend/routes/generations.py`
- `backend/routes/health.py`
- `backend/routes/history.py`
- `backend/routes/models.py`
- `backend/routes/profiles.py`
- `backend/routes/stories.py`
- `backend/routes/tasks.py`
- `backend/routes/transcription.py`
- `backend/services/__init__.py`
- `backend/services/channels.py`
- `backend/services/cuda.py`
- `backend/services/effects.py`
- `backend/services/export_import.py`
- `backend/services/generation.py`
- `backend/services/history.py`
- `backend/services/profiles.py`
- `backend/services/stories.py`
- `backend/services/task_queue.py`
- `backend/services/transcribe.py`
- `backend/services/tts.py`
- `backend/services/versions.py`
- `backend/studio.py`
- `backend/tests/test_generation_download.py`
- `backend/tests/test_profile_duplicate_names.py`
- `backend/utils/chunked_tts.py`
- `backend/utils/effects.py`
- `backend/utils/platform_detect.py`
- `backend/utils/validation.py`
- `docs/overview/troubleshooting.mdx`
- `docs/plans/ADDING_TTS_ENGINES.md`
- `docs/plans/PROJECT_STATUS.md`
- `justfile`
- `tauri/src-tauri/src/main.rs`
💤 Files with no reviewable changes (6)
- backend/database.py
- backend/migrate_add_instruct.py
- backend/utils/validation.py
- Makefile
- backend/example_usage.py
- backend/studio.py
```python
class GenerationVersion(Base):
    """A version of a generation's audio (original, processed, alternate takes)."""

    __tablename__ = "generation_versions"

    id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
    generation_id = Column(String, ForeignKey("generations.id"), nullable=False)
    label = Column(String, nullable=False)
    audio_path = Column(String, nullable=False)
    effects_chain = Column(Text, nullable=True)
    source_version_id = Column(String, ForeignKey("generation_versions.id"), nullable=True)
    is_default = Column(Boolean, default=False)
    created_at = Column(DateTime, default=datetime.utcnow)
```
Enforce one default GenerationVersion per generation.
is_default is only a boolean right now, so concurrent regenerate/set-default flows can persist multiple defaults for the same generation_id. Any code that later loads “the default version” becomes nondeterministic once that happens.
Suggested fix
```diff
-from sqlalchemy import Column, String, Integer, Float, DateTime, Text, ForeignKey, Boolean
+from sqlalchemy import Column, String, Integer, Float, DateTime, Text, ForeignKey, Boolean, Index, text
@@
 class GenerationVersion(Base):
     """A version of a generation's audio (original, processed, alternate takes)."""
     __tablename__ = "generation_versions"
+    __table_args__ = (
+        Index(
+            "uq_generation_versions_default",
+            "generation_id",
+            unique=True,
+            sqlite_where=text("is_default = 1"),
+        ),
+    )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/database/models.py` around lines 99 - 111, Add a DB-enforced
uniqueness so only one GenerationVersion per generation can have
is_default=True: in the GenerationVersion model add a unique partial
index/constraint on generation_id where is_default is true (e.g.,
Index("uq_generation_default", GenerationVersion.generation_id, unique=True,
postgresql_where=(GenerationVersion.is_default==True)) or UniqueConstraint with
postgresql_where) and add a migration that creates this partial index; also
ensure the migration resolves any existing multiple defaults (pick one or clear
extras) so the constraint can be applied cleanly.
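A self-contained way to see the suggested partial unique index in action against in-memory SQLite (the model is trimmed to the relevant columns for the demo, and this sketch assumes SQLAlchemy 2.x is available):

```python
from sqlalchemy import (Boolean, Column, Index, Integer, String,
                        create_engine, text)
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class GenerationVersion(Base):
    __tablename__ = "generation_versions"
    # Unique only among rows where is_default is true
    __table_args__ = (
        Index(
            "uq_generation_versions_default",
            "generation_id",
            unique=True,
            sqlite_where=text("is_default = 1"),
        ),
    )
    id = Column(Integer, primary_key=True)
    generation_id = Column(String, nullable=False)
    is_default = Column(Boolean, default=False)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as s:
    s.add_all([
        GenerationVersion(generation_id="g1", is_default=True),
        GenerationVersion(generation_id="g1", is_default=False),  # non-default: allowed
    ])
    s.commit()
    s.add(GenerationVersion(generation_id="g1", is_default=True))  # second default
    try:
        s.commit()
        second_default_allowed = True
    except IntegrityError:
        s.rollback()
        second_default_allowed = False

print(second_default_allowed)  # False: the index rejects a second default
```

Any number of non-default versions per generation remains allowed; only a second `is_default=True` row for the same `generation_id` violates the index.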
Backend:
- Generation service reports 'loading_model' status only when model is not yet in memory, then 'generating' once inference starts
- Migrate hf_offline_patch.py from print() to logging module
- Update ADDING_TTS_ENGINES.md for post-refactor file paths

Frontend:
- HistoryTable shows 'Loading model...' vs 'Generating...' based on step
- FloatingGenerateBox: replace instruct toggle + inline effects editor with an effects preset dropdown (third dropdown after language and engine)
- Instruct UI removed for now (form field preserved for future models)
- Remove focus ring from Select component globally
Actionable comments posted: 3
🧹 Nitpick comments (5)
docs/plans/ADDING_TTS_ENGINES.md (2)
9-9: Use hyphenation for the compound adjective (real-time).

Line 9 should use “real-time sink” (not “real time sink”) for correct technical prose.
Proposed wording tweak
```diff
-Adding an engine touches ~10 files across 4 layers. The backend protocol work is straightforward — the real time sink is dependency hell, upstream library bugs, and PyInstaller bundling.
+Adding an engine touches ~10 files across 4 layers. The backend protocol work is straightforward — the real-time sink is dependency hell, upstream library bugs, and PyInstaller bundling.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/ADDING_TTS_ENGINES.md` at line 9, Replace the unhyphenated phrase "real time sink" with the compound-adjective form "real-time sink" in the document (update the instance of "real time sink" found in the ADDING_TTS_ENGINES.md content to "real-time sink").
194-214: Add language identifiers to fenced code blocks to satisfy MD040.

The fences starting at Line 194, Line 206, and Line 212 are missing a language tag.
Lint-compliant markdown fences
# In justfile (NOT requirements.txt):
 pip install --no-deps chatterbox-tts

 # In requirements.txt — list the transitive deps:
 conformer
 diffusers
 omegaconf
 # ... etc
@@
-```
+```text
 linacodec @ git+https://github.com/user/repo.git
 Zipvoice @ git+https://github.com/user/repo.git
@@
-```
+```text
 --find-links https://k2-fsa.github.io/icefall/piper_phonemize.html
````

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.
In `@docs/plans/ADDING_TTS_ENGINES.md` around lines 194 - 214, The markdown fenced code blocks in docs/plans/ADDING_TTS_ENGINES.md that contain "pip install --no-deps chatterbox-tts", the block listing "linacodec @ git+https://github.com/user/repo.git" / "Zipvoice @ git+https://github.com/user/repo.git", and the block with "--find-links https://k2-fsa.github.io/icefall/piper_phonemize.html" are missing language identifiers; update the first fence to use a shell/bash tag (e.g. `bash`) and the latter two to use a neutral text tag (e.g. `text`) so the markdown linter MD040 is satisfied and syntax highlighting is stable.

app/src/components/Generation/FloatingGenerateBox.tsx (1)

52-56: Differentiate preset-loading failure from “no effects” state.

Right now, if preset fetch fails, users only see the default option and can’t tell whether there are truly no presets or the request failed.

Suggested refactor

```diff
-  const { data: effectPresets } = useQuery({
+  const {
+    data: effectPresets,
+    isLoading: isEffectPresetsLoading,
+    isError: isEffectPresetsError,
+  } = useQuery({
     queryKey: ['effectPresets'],
     queryFn: () => apiClient.listEffectPresets(),
   });
```

```diff
   <Select
     value={selectedPresetId || 'none'}
     onValueChange={(value) =>
       setSelectedPresetId(value === 'none' ? null : value)
     }
+    disabled={isEffectPresetsLoading || isEffectPresetsError}
   >
     <SelectTrigger className="h-8 text-xs bg-card border-border rounded-full hover:bg-background/50 transition-all">
-      <SelectValue placeholder="No effects" />
+      <SelectValue
+        placeholder={
+          isEffectPresetsLoading
+            ? 'Loading effects...'
+            : isEffectPresetsError
+            ? 'Effects unavailable'
+            : 'No effects'
+        }
+      />
     </SelectTrigger>
```

Also applies to: 366-383

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@app/src/components/Generation/FloatingGenerateBox.tsx` around lines 52 - 56, The effect presets fetch currently only uses the returned data (effectPresets) so a failed request looks identical to an empty list; update the useQuery call in FloatingGenerateBox to destructure isLoading and isError (e.g. const { data: effectPresets, isLoading, isError } = useQuery(...)) and then change the dropdown rendering logic to (1) show a loading state when isLoading, (2) show a distinct disabled option like "Failed to load presets" or surface an inline error/toast when isError, and (3) retain the existing "No presets" / default option only when isError is false and effectPresets is an empty array; apply the same change to the similar block referenced around the 366-383 area so both places differentiate network errors from an empty result.

backend/services/generation.py (2)

95-119: Add an else clause for defensive completeness.

If an invalid mode somehow bypasses type checking at runtime, `final_path` would be undefined and cause a `NameError` at line 125. Adding an explicit else branch improves robustness.

🛡️ Proposed defensive fix

```diff
     elif mode == "regenerate":
         final_path = _save_regenerate(
             generation_id=generation_id,
             version_id=version_id,
             audio=audio,
             sample_rate=sample_rate,
             save_audio=save_audio,
             db=bg_db,
         )
+    else:
+        raise ValueError(f"Unknown generation mode: {mode}")
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/services/generation.py` around lines 95 - 119, The code paths for mode ("generate"/"retry"/"regenerate") assign final_path via _save_generate, _save_retry, or _save_regenerate but lack a fallback, so an unexpected mode can leave final_path undefined; add an explicit else branch after the existing elifs that handles invalid mode (e.g., log the error or raise a ValueError/RuntimeError including the invalid mode) to fail fast and avoid a NameError later, referencing the local variable final_path and the mode variable to make the error clear.

88-91: Clarify: `regenerate` always normalizes regardless of the `normalize` parameter.

The condition `if normalize or mode == "regenerate"` means regenerate mode ignores the caller's `normalize=False`. If this is intentional, consider documenting it in the docstring. If not, the condition should respect the parameter for all modes.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/services/generation.py` around lines 88 - 91, The current check "if normalize or mode == 'regenerate'" forces normalization in regenerate mode even when normalize=False; update the logic so normalization respects the normalize parameter for all modes (e.g., only call normalize_audio(audio) when normalize is True), or if the intent is to always normalize on regenerate, make that explicit in the function's docstring by documenting that mode=='regenerate' overrides normalize; reference the variables and functions involved: mode, normalize, and normalize_audio to locate and change the condition or add the docstring note.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.
Inline comments:
In @app/src/components/History/HistoryTable.tsx:
- Around line 397-399: The aria-label mismatch comes from treating loading_model
the same as generating; update the status booleans so they distinguish
loading_model: replace the single isInProgress/isGenerating assignment with
separate flags (e.g., isLoadingModel = gen.status === 'loading_model';
isGenerating = gen.status === 'generating'; isFailed = gen.status === 'failed')
and update the row aria-label and any status text usage (references around the
current isInProgress/isGenerating logic and the similar block at the later
420-423 region) to announce "Loading model..." when isLoadingModel is true,
"Generating..." when isGenerating is true, and fallback to the existing labels
otherwise.
In
@app/src/components/ui/select.tsx:
- Around line 18-21: The Select component's className string (the cn call that
builds classes for the select root) removes visible focus styling by keeping
focus:outline-none while dropping the focus ring; restore an accessible focus
indicator by adding back a visible focus style (e.g. add focus:ring-2
focus:ring-ring focus:ring-offset-2) or, if design requires no ring, replace
with an alternative such as focus:border-primary or focus:bg-muted to ensure
keyboard focus is visually apparent; update the class list in the cn(...) for
the select root (the className passed into the Select component) accordingly and
remove focus:outline-none if it prevents the chosen focus style from showing.
In
@backend/services/generation.py:
- Around line 232-242: The take-N label generation is racy: calls to
versions_mod.list_versions followed by versions_mod.create_version can produce
duplicate labels under concurrency; add a database-level UNIQUE constraint on
(generation_id, label) and update versions_mod.create_version to handle
unique-constraint violations by retrying label creation atomically (e.g., catch
the DB IntegrityError/UniqueViolation when inserting a duplicate label, re-fetch
the current count or use an atomic counter/separate per-generation sequence,
regenerate label like "take-{N}" and retry a few times), or implement a SELECT
... FOR UPDATE increment on a counter column on the generation row to derive the
next take number; reference the functions/variables versions_mod.list_versions,
versions_mod.create_version, generation_id, and label when making these changes.
Nitpick comments:
In @app/src/components/Generation/FloatingGenerateBox.tsx:
- Around line 52-56: The effect presets fetch currently only uses the returned
data (effectPresets) so a failed request looks identical to an empty list;
update the useQuery call in FloatingGenerateBox to destructure isLoading and
isError (e.g. const { data: effectPresets, isLoading, isError } = useQuery(...))
and then change the dropdown rendering logic to (1) show a loading state when
isLoading, (2) show a distinct disabled option like "Failed to load presets" or
surface an inline error/toast when isError, and (3) retain the existing "No
presets" / default option only when isError is false and effectPresets is an
empty array; apply the same change to the similar block referenced around the
366-383 area so both places differentiate network errors from an empty result.
In
@backend/services/generation.py:
- Around line 95-119: The code paths for mode ("generate"/"retry"/"regenerate")
assign final_path via _save_generate, _save_retry, or _save_regenerate but lack
a fallback, so an unexpected mode can leave final_path undefined; add an
explicit else branch after the existing elifs that handles invalid mode (e.g.,
log/processLogger.error or raise a ValueError/RuntimeError including the invalid
mode) to fail fast and avoid a NameError later, referencing the local variable
final_path and the mode variable to make the error clear.
- Around line 88-91: The current check "if normalize or mode == 'regenerate'"
forces normalization in regenerate mode even when normalize=False; update the
logic so normalization respects the normalize parameter for all modes (e.g.,
only call normalize_audio(audio) when normalize is True), or if the intent is to
always normalize on regenerate, make that explicit in the function's docstring
by documenting that mode=='regenerate' overrides normalize; reference the
variables and functions involved: mode, normalize, and normalize_audio to locate
and change the condition or add the docstring note.
In
@docs/plans/ADDING_TTS_ENGINES.md:
- Line 9: Replace the unhyphenated phrase "real time sink" with the
compound-adjective form "real-time sink" in the document (update the instance of
"real time sink" found in the ADDING_TTS_ENGINES.md content to "real-time
sink").- Around line 194-214: The markdown fenced code blocks in
docs/plans/ADDING_TTS_ENGINES.md that contain "pip install --no-deps
chatterbox-tts", the block listing "linacodec @
git+https://github.com/user/repo.git" / "Zipvoice @
git+https://github.com/user/repo.git", and the block with "--find-links
https://k2-fsa.github.io/icefall/piper_phonemize.html" are missing language
identifiers; update the first fence to use a shell/bash tag (e.g. `bash`) and the latter two to use a neutral text tag (e.g. `text`) so the markdown linter
MD040 is satisfied and syntax highlighting is stable.

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: defaults
**Review profile**: CHILL
**Plan**: Pro
**Run ID**: `8bb3d2e5-6b6a-46d4-ac19-3fff919e9b74`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 8efcc956067db5657f47ab50ecc041259d8eb308 and 798cd40f059eb87d613e8cbbbb33696013643c73.

</details>

<details>
<summary>📒 Files selected for processing (13)</summary>

* `app/src/components/Generation/FloatingGenerateBox.tsx`
* `app/src/components/History/HistoryTable.tsx`
* `app/src/components/ui/select.tsx`
* `app/src/lib/api/types.ts`
* `app/src/lib/hooks/useGenerationProgress.ts`
* `backend/services/generation.py`
* `backend/utils/hf_offline_patch.py`
* `docs/PR-ACCESSIBILITY.md`
* `docs/plans/ADDING_TTS_ENGINES.md`
* `docs/plans/EXTERNAL_PROVIDERS.md`
* `docs/plans/MLX_AUDIO.md`
* `docs/plans/PR33_CUDA_PROVIDER_REVIEW.md`
* `docs/plans/TTS_PROVIDER_ARCHITECTURE.md`

</details>

<details>
<summary>💤 Files with no reviewable changes (5)</summary>

* docs/PR-ACCESSIBILITY.md
* docs/plans/EXTERNAL_PROVIDERS.md
* docs/plans/MLX_AUDIO.md
* docs/plans/PR33_CUDA_PROVIDER_REVIEW.md
* docs/plans/TTS_PROVIDER_ARCHITECTURE.md

</details>

</details>
```
const isInProgress = gen.status === 'loading_model' || gen.status === 'generating';
const isGenerating = isInProgress;
const isFailed = gen.status === 'failed';
```
Align ARIA status text with loading_model state.
When gen.status === 'loading_model', the visible text says “Loading model...” but the row aria-label still announces generating. This is an accessibility mismatch.
Suggested fix

```diff
-    isGenerating
-      ? `Generating speech for ${gen.profile_name}...`
+    gen.status === 'loading_model'
+      ? `Loading model for ${gen.profile_name}...`
+      : isGenerating
+        ? `Generating speech for ${gen.profile_name}...`
       : isFailed
         ? `Generation failed for ${gen.profile_name}`
```

Also applies to: 420-423
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/components/History/HistoryTable.tsx` around lines 397 - 399, The
aria-label mismatch comes from treating loading_model the same as generating;
update the status booleans so they distinguish loading_model: replace the single
isInProgress/isGenerating assignment with separate flags (e.g., isLoadingModel =
gen.status === 'loading_model'; isGenerating = gen.status === 'generating';
isFailed = gen.status === 'failed') and update the row aria-label and any status
text usage (references around the current isInProgress/isGenerating logic and
the similar block at the later 420-423 region) to announce "Loading model..."
when isLoadingModel is true, "Generating..." when isGenerating is true, and
fallback to the existing labels otherwise.
Critical:
- Remove dead backend.utils.validation PyInstaller hidden import
- Fix story_items table rebuild to preserve track/trim/version columns
- Guard cache migration against same source/destination path
- Fix regeneration audio overwrite (use random uuid suffix per take)

Major:
- Engine selector: validate language on Qwen switch, clear stale modelSize
- Sync language validation regex between profile create and generate (22 langs)
- Guard CUDA download against duplicate concurrent requests
- Only set model_size for engines that support multiple sizes
- Fix 404 swallowed by generic except in history export
- Validate audio_path before FileResponse in export-audio
- Transcription: stream uploads in 1MB chunks, use robust cache check, call complete_download() on Whisper download success
- Set clean version as default when effects chain validation fails
- Return explicit error when Windows port occupied by non-voicebox process
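The 1 MB-chunk upload streaming mentioned in the transcription fix can be sketched as below. This is a standalone illustration, not the project's code: `save_upload` is a hypothetical helper, and `FakeUpload` stands in for FastAPI's `UploadFile` (whose async `read(size)` method it mimics).

```python
import asyncio
import io
import os
import tempfile

CHUNK_SIZE = 1024 * 1024  # 1 MB per read, so large files never sit fully in memory


class FakeUpload:
    """Minimal stand-in for FastAPI's UploadFile (async read)."""

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)

    async def read(self, size: int = -1) -> bytes:
        return self._buf.read(size)


async def save_upload(upload, dest_path: str) -> int:
    """Copy an upload to disk in 1 MB chunks instead of one big read()."""
    written = 0
    with open(dest_path, "wb") as out:
        while chunk := await upload.read(CHUNK_SIZE):
            out.write(chunk)
            written += len(chunk)
    return written


path = os.path.join(tempfile.mkdtemp(), "clip.wav")
total = asyncio.run(save_upload(FakeUpload(b"\x00" * (3 * CHUNK_SIZE + 17)), path))
print(total)  # 3145745 bytes written in four reads
```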
- Use DB COUNT query instead of list length for take-N label to avoid TOCTOU race between list_versions and create_version
- Add focus:bg-muted to SelectTrigger for keyboard focus visibility
Actionable comments posted: 3
♻️ Duplicate comments (3)
backend/routes/generations.py (3)
`227-232`: ⚠️ Potential issue | 🟠 Major

**`stream_speech` also needs the engine model size guard.**

Line 229 unconditionally defaults to `"1.7B"`. This should check `engine_has_model_sizes()` like `generate_speech` does.

🐛 Proposed fix

```diff
+from ..backends import engine_has_model_sizes
+
 engine = data.engine or "qwen"
 tts_model = get_tts_backend_for_engine(engine)
-model_size = data.model_size or "1.7B"
+model_size = (data.model_size or "1.7B") if engine_has_model_sizes(engine) else None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/generations.py` around lines 227 - 232, In stream_speech, mirror generate_speech's model-size guard: instead of unconditionally defaulting model_size to "1.7B", call engine_has_model_sizes(engine) and only set data.model_size or default to "1.7B" when the engine supports sizes; otherwise leave model_size None (or skip caching/loading). Update the stream_speech logic that calls get_tts_backend_for_engine(engine), ensure_model_cached_or_raise(engine, model_size), and load_engine_model(engine, model_size) to pass model_size only when engine_has_model_sizes(engine) is true so you don't attempt to cache/load a non-existent size.
`165-167`: ⚠️ Potential issue | 🟠 Major

**Regenerate has the same model_size defaulting issue.**

Apply the same `engine_has_model_sizes()` guard here as suggested for retry.

🐛 Proposed fix

```diff
+from ..backends import engine_has_model_sizes
+
 enqueue_generation(
     run_generation(
         generation_id=generation_id,
         profile_id=gen.profile_id,
         text=gen.text,
         language=gen.language,
         engine=gen.engine or "qwen",
-        model_size=gen.model_size or "1.7B",
+        model_size=(gen.model_size or "1.7B") if engine_has_model_sizes(gen.engine or "qwen") else None,
         seed=gen.seed,
         instruct=gen.instruct,
         mode="regenerate",
         version_id=version_id,
     )
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/generations.py` around lines 165 - 167, The model_size default is being set unconditionally for regenerate like in the retry path; update the code that builds the regenerate payload (the lines using language=gen.language, engine=gen.engine or "qwen", model_size=gen.model_size or "1.7B") to only include or default model_size when engine_has_model_sizes(engine) returns True—use the same pattern applied in the retry change: compute engine = gen.engine or "qwen", then if engine_has_model_sizes(engine) set model_size = gen.model_size or "1.7B", otherwise omit model_size or set it to None before constructing the regenerate request.
`122-124`: ⚠️ Potential issue | 🟠 Major

**Retry still injects `1.7B` into non-Qwen engines.**

Unlike `generate_speech`, the retry path unconditionally defaults `model_size` to `"1.7B"` without checking `engine_has_model_sizes()`. If a LuxTTS or Chatterbox generation failed, retrying it would pass an invalid model size.

🐛 Proposed fix

```diff
+from ..backends import engine_has_model_sizes
+
 enqueue_generation(
     run_generation(
         generation_id=generation_id,
         profile_id=gen.profile_id,
         text=gen.text,
         language=gen.language,
         engine=gen.engine or "qwen",
-        model_size=gen.model_size or "1.7B",
+        model_size=(gen.model_size or "1.7B") if engine_has_model_sizes(gen.engine or "qwen") else None,
         seed=gen.seed,
         instruct=gen.instruct,
         mode="retry",
     )
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/generations.py` around lines 122 - 124, The retry path unconditionally sets model_size to "1.7B" which is invalid for non-Qwen engines; update the retry payload construction (the place creating the dict with language=gen.language, engine=gen.engine or "qwen", model_size=gen.model_size or "1.7B") to only default model_size to "1.7B" when the engine supports model sizes (use engine_has_model_sizes(gen.engine) or check gen.engine == "qwen"), otherwise leave model_size as None/omitted or use gen.model_size if provided; ensure you reference and call engine_has_model_sizes(gen.engine) when deciding the default.
🧹 Nitpick comments (7)
app/src/components/Generation/EngineModelSelector.tsx (2)
`53-53`: **Remove unsafe double-cast when clearing `modelSize`.**

`modelSize` is already optional in `GenerationFormValues`; casting `undefined` through `unknown` obscures type safety and readability.

Proposed simplification

```diff
-    form.setValue('modelSize', undefined as unknown as '1.7B' | '0.6B');
+    form.setValue('modelSize', undefined);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/Generation/EngineModelSelector.tsx` at line 53, The line using a double-cast to clear modelSize should be replaced with a safe assignment: remove the "as unknown as '1.7B' | '0.6B'" and call form.setValue('modelSize', undefined) (or explicitly type as GenerationFormValues['modelSize'] if needed) in EngineModelSelector.tsx so you rely on the existing optional type of modelSize instead of masking types with an unsafe double-cast.
`25-30`: **Tighten description typing to engine union for exhaustiveness.**

Using `Record<string, string>` allows missing keys silently. Prefer typing descriptions against the engine union so adding/removing engines is compile-time checked.

Type-safe mapping example

```diff
-const ENGINE_DESCRIPTIONS: Record<string, string> = {
+type Engine = NonNullable<GenerationFormValues['engine']>;
+
+const ENGINE_DESCRIPTIONS: Record<Engine, string> = {
   qwen: 'Multi-language, two sizes',
   luxtts: 'Fast, English-focused',
   chatterbox: '23 languages, incl. Hebrew',
   chatterbox_turbo: 'English, [laugh] [cough] tags',
 };
```

Also applies to: 101-103
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/Generation/EngineModelSelector.tsx` around lines 25 - 30, The ENGINE_DESCRIPTIONS map is currently typed as Record<string,string>, which won't catch missing engine keys; change its type to a mapped type over the engine union (e.g. use the union type name used in this file such as EngineId or Engine) so the compiler enforces exhaustiveness (for example: declare ENGINE_DESCRIPTIONS as {[K in EngineId]: string} or Record<EngineId,string> and then ensure all engine union members are present in the object); update the declaration of ENGINE_DESCRIPTIONS accordingly and add any missing entries or remove obsolete ones so the mapping matches the engine union exactly.

backend/routes/models.py (2)
`59-72`: **Chain exceptions for better debugging.**

When re-raising `HTTPException` from a caught exception, use `raise ... from e` to preserve the original traceback. This helps with debugging production issues.

♻️ Proposed fix

```diff
 @router.post("/models/load")
 async def load_model(model_size: str = "1.7B"):
     """Manually load TTS model."""
     from ..services import tts

     try:
         tts_model = tts.get_tts_model()
         await tts_model.load_model_async(model_size)
         return {"message": f"Model {model_size} loaded successfully"}
     except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e

 @router.post("/models/unload")
 async def unload_model():
     """Unload the default Qwen TTS model to free memory."""
     from ..services import tts

     try:
         tts.unload_tts_model()
         return {"message": "Model unloaded successfully"}
     except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/models.py` around lines 59 - 72, The exception handlers in unload_model (and the earlier block that raises HTTPException) re-raise HTTPException without chaining the original exception; update both raise HTTPException(...) statements to use exception chaining (raise HTTPException(status_code=500, detail=str(e)) from e) so the original traceback is preserved when calling tts.unload_tts_model and any other code paths that currently catch Exception and re-raise HTTPException.
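Outside FastAPI, the effect of `from e` is easy to demonstrate with plain exceptions. In this sketch `HTTPError` is a stand-in for `HTTPException` (which needs an app context), and the raised `RuntimeError` is an invented placeholder for whatever really fails inside the handler.

```python
class HTTPError(Exception):
    """Stand-in for FastAPI's HTTPException in this sketch."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code


def unload_model():
    try:
        raise RuntimeError("CUDA context already destroyed")  # hypothetical failure
    except Exception as e:
        # "from e" records the original error on __cause__, so tracebacks
        # print both exceptions instead of just the generic 500.
        raise HTTPError(500, str(e)) from e


try:
    unload_model()
except HTTPError as err:
    caught = err

print(type(caught.__cause__).__name__)  # RuntimeError
```

Without `from e`, the original exception is still attached implicitly as `__context__`, but explicit chaining survives `raise ... from None` suppression elsewhere and signals intent to readers.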
`429-433`: **Accessing private members of progress_manager breaks encapsulation.**

Lines 430-432 directly access `progress_manager._lock` and `progress_manager._progress`. Consider adding a public `remove_progress(model_name)` method to the progress manager instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/models.py` around lines 429 - 433, Replace direct access to progress_manager._lock and progress_manager._progress with a new public method on the ProgressManager (e.g., remove_progress(model_name)) and call that from this route. Implement ProgressManager.remove_progress(self, model_name) to acquire its internal lock and delete the entry if present, returning a boolean if removed; then in the route call progress_manager.remove_progress(request.model_name) and use the boolean instead of manipulating _lock/_progress directly.

backend/routes/transcription.py (1)
`71-72`: **Chain exception for better debugging.**

Use `raise ... from e` to preserve the exception chain.

♻️ Proposed fix

```diff
 except Exception as e:
-    raise HTTPException(status_code=500, detail=str(e))
+    raise HTTPException(status_code=500, detail=str(e)) from e
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/transcription.py` around lines 71 - 72, Replace the bare re-raise of HTTPException with an exception-chaining raise so the original traceback is preserved: in the except Exception as e block where you currently call raise HTTPException(status_code=500, detail=str(e)), change it to raise HTTPException(status_code=500, detail=str(e)) from e so the original exception (e) is chained into the HTTPException.

backend/services/generation.py (1)
`232-248`: **Take label race condition is mitigated by the serial task queue.**

The label assignment (`take-{len(existing) + 1}`) is technically racy, but per `backend/services/task_queue.py`, all generation tasks are processed by a single `_generation_worker()` that handles one coroutine at a time. This serialization prevents concurrent regenerations for the same `generation_id` from colliding.

However, if the queue architecture changes or parallel workers are added in the future, this could produce duplicate labels. Consider adding a brief comment noting this serialization dependency.
📝 Suggested documentation
```diff
 def _save_regenerate(
     *,
     generation_id: str,
     version_id: Optional[str],
     audio,
     sample_rate: int,
     save_audio,
     db,
 ) -> str:
     """Save regeneration output as a new version with auto-label.

     Returns the audio path.
+
+    Note: Label uniqueness relies on task_queue serialization; if parallel
+    workers are ever added, add a DB unique constraint on (generation_id, label).
     """
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/generation.py` around lines 232 - 248, Add a brief comment next to the label assignment in generation.py explaining that the take-{len(existing)+1} computation relies on serialized processing of generation tasks (see _generation_worker in backend/services/task_queue.py) to avoid races; mention that if the task queue is changed to allow parallel workers this labeling could produce duplicates and should be revisited (or replaced with an atomic/DB-backed counter) — place the comment near the label variable alongside the existing calls to versions_mod.list_versions and versions_mod.create_version.

backend/routes/history.py (1)
`60-63`: **Chain exceptions for better debugging.**

Use `raise ... from e` to preserve the exception chain in error handlers.

♻️ Proposed fix

```diff
 try:
     result = await export_import.import_generation_from_zip(content, db)
     return result
 except ValueError as e:
-    raise HTTPException(status_code=400, detail=str(e))
+    raise HTTPException(status_code=400, detail=str(e)) from e
 except Exception as e:
-    raise HTTPException(status_code=500, detail=str(e))
+    raise HTTPException(status_code=500, detail=str(e)) from e
```

Also applies to lines 135-138.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/history.py` around lines 60 - 63, In the except handlers inside backend/routes/history.py (the blocks catching "except ValueError as e" and "except Exception as e" — including the similar handlers around lines 135-138), preserve the original exception chain by re-raising the HTTPException using "raise HTTPException(status_code=..., detail=...) from e" instead of a plain raise; update both the ValueError handler and the generic Exception handler to use "from e" so the original traceback is chained for debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/database/migrations.py`:
- Around line 77-85: The loop that updates start_time_ms can raise TypeError
because duration may be None; in the loop that iterates over result.fetchall()
update the code around current_time_ms and duration to treat a NULL duration
safely (e.g., coerce duration to 0 or use a default like 0.0 before computing
int(duration * 1000)), so replace the direct int(duration * 1000) usage with a
safe conversion (e.g., safe_duration = duration or 0; increment =
int(safe_duration * 1000) + 200) and then use current_time_ms += increment; keep
the existing UPDATE on story_items (id=item_id) and current_story_id logic
intact.
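The NULL-duration fix above reduces to coercing before arithmetic. A standalone sketch of the suggested conversion (`next_start_time_ms` is a hypothetical helper; the 200 ms gap comes from the prompt):

```python
def next_start_time_ms(current_time_ms: int, duration) -> int:
    """Advance the story timeline even when an item's duration is NULL/None."""
    safe_duration = duration or 0  # a NULL duration contributes no audio time
    return current_time_ms + int(safe_duration * 1000) + 200  # 200 ms gap


print(next_start_time_ms(0, 1.5))     # 1700
print(next_start_time_ms(1700, None)) # 1900 -- no TypeError on NULL
```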
In `@backend/routes/models.py`:
- Around line 126-133: The current same-directory check using source.resolve()
== destination.resolve() misses the case where destination is a subdirectory of
source; update the validation after computing source and destination to also
reject destinations that are inside the source (and optionally vice versa) by
checking if destination.resolve() is a child of source.resolve() (use
Path.is_relative_to(destination.resolve(), source.resolve()) when available, or
compare resolved strings with a trailing separator as a fallback) and raise
HTTPException(status_code=400, detail="Destination must not be inside the source
directory") when that condition is true.
In `@tauri/src-tauri/src/main.rs`:
- Around line 15-20: Update the docstring for the find_voicebox_on_port function
to accurately describe what the implementation does: state that it performs PID
lookup via PowerShell and verifies the process with tasklist on Windows, and
that it does NOT call TcpListener::bind (port occupancy should be checked by the
caller, e.g., with TcpStream::connect_timeout). Mention the Windows-only nature
and the replacement of the previous netstat approach for context, but remove the
incorrect claim about binding.
---
Duplicate comments:
In `@backend/routes/generations.py`:
- Around line 227-232: In stream_speech, mirror generate_speech's model-size
guard: instead of unconditionally defaulting model_size to "1.7B", call
engine_has_model_sizes(engine) and only set data.model_size or default to "1.7B"
when the engine supports sizes; otherwise leave model_size None (or skip
caching/loading). Update the stream_speech logic that calls
get_tts_backend_for_engine(engine), ensure_model_cached_or_raise(engine,
model_size), and load_engine_model(engine, model_size) to pass model_size only
when engine_has_model_sizes(engine) is true so you don't attempt to cache/load a
non-existent size.
- Around line 165-167: The model_size default is being set unconditionally for
regenerate like in the retry path; update the code that builds the regenerate
payload (the lines using language=gen.language, engine=gen.engine or "qwen",
model_size=gen.model_size or "1.7B") to only include or default model_size when
engine_has_model_sizes(engine) returns True—use the same pattern applied in the
retry change: compute engine = gen.engine or "qwen", then if
engine_has_model_sizes(engine) set model_size = gen.model_size or "1.7B",
otherwise omit model_size or set it to None before constructing the regenerate
request.
- Around line 122-124: The retry path unconditionally sets model_size to "1.7B"
which is invalid for non-Qwen engines; update the retry payload construction
(the place creating the dict with language=gen.language, engine=gen.engine or
"qwen", model_size=gen.model_size or "1.7B") to only default model_size to
"1.7B" when the engine supports model sizes (use
engine_has_model_sizes(gen.engine) or check gen.engine == "qwen"), otherwise
leave model_size as None/omitted or use gen.model_size if provided; ensure you
reference and call engine_has_model_sizes(gen.engine) when deciding the default.
---
Nitpick comments:
In `@app/src/components/Generation/EngineModelSelector.tsx`:
- Line 53: The line using a double-cast to clear modelSize should be replaced
with a safe assignment: remove the "as unknown as '1.7B' | '0.6B'" and call
form.setValue('modelSize', undefined) (or explicitly type as
GenerationFormValues['modelSize'] if needed) in EngineModelSelector.tsx so you
rely on the existing optional type of modelSize instead of masking types with an
unsafe double-cast.
- Around line 25-30: The ENGINE_DESCRIPTIONS map is currently typed as
Record<string,string>, which won't catch missing engine keys; change its type to
a mapped type over the engine union (e.g. use the union type name used in this
file such as EngineId or Engine) so the compiler enforces exhaustiveness (for
example: declare ENGINE_DESCRIPTIONS as {[K in EngineId]: string} or
Record<EngineId,string> and then ensure all engine union members are present in
the object); update the declaration of ENGINE_DESCRIPTIONS accordingly and add
any missing entries or remove obsolete ones so the mapping matches the engine
union exactly.
In `@backend/routes/history.py`:
- Around line 60-63: In the except handlers inside backend/routes/history.py
(the blocks catching "except ValueError as e" and "except Exception as e" —
including the similar handlers around lines 135-138), preserve the original
exception chain by re-raising the HTTPException using "raise
HTTPException(status_code=..., detail=...) from e" instead of a plain raise;
update both the ValueError handler and the generic Exception handler to use
"from e" so the original traceback is chained for debugging.
In `@backend/routes/models.py`:
- Around line 59-72: The exception handlers in unload_model (and the earlier
block that raises HTTPException) re-raise HTTPException without chaining the
original exception; update both raise HTTPException(...) statements to use
exception chaining (raise HTTPException(status_code=500, detail=str(e)) from e)
so the original traceback is preserved when calling tts.unload_tts_model and any
other code paths that currently catch Exception and re-raise HTTPException.
- Around line 429-433: Replace direct access to progress_manager._lock and
progress_manager._progress with a new public method on the ProgressManager
(e.g., remove_progress(model_name)) and call that from this route. Implement
ProgressManager.remove_progress(self, model_name) to acquire its internal lock
and delete the entry if present, returning a boolean if removed; then in the
route call progress_manager.remove_progress(request.model_name) and use the
boolean instead of manipulating _lock/_progress directly.
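The public-method refactor described in the item above can be sketched in isolation. The `_lock`/`_progress` internals are assumed from the review (a `threading.Lock` guarding a per-model dict); `update_progress` is an invented companion so the sketch never touches privates from outside.

```python
import threading


class ProgressManager:
    """Minimal sketch of the encapsulated API the review asks for."""

    def __init__(self):
        self._lock = threading.Lock()
        self._progress: dict[str, dict] = {}

    def update_progress(self, model_name: str, data: dict) -> None:
        with self._lock:
            self._progress[model_name] = data

    def remove_progress(self, model_name: str) -> bool:
        """Delete the entry under the lock; True if something was removed."""
        with self._lock:
            return self._progress.pop(model_name, None) is not None


pm = ProgressManager()
pm.update_progress("whisper-large", {"percent": 42})
print(pm.remove_progress("whisper-large"))  # True: entry existed
print(pm.remove_progress("whisper-large"))  # False: already gone
```

Routes then call `pm.remove_progress(name)` and branch on the boolean, and the locking discipline lives in exactly one place.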
In `@backend/routes/transcription.py`:
- Around line 71-72: Replace the bare re-raise of HTTPException with an
exception-chaining raise so the original traceback is preserved: in the except
Exception as e block where you currently call raise
HTTPException(status_code=500, detail=str(e)), change it to raise
HTTPException(status_code=500, detail=str(e)) from e so the original exception
(e) is chained into the HTTPException.
In `@backend/services/generation.py`:
- Around line 232-248: Add a brief comment next to the label assignment in
generation.py explaining that the take-{len(existing)+1} computation relies on
serialized processing of generation tasks (see _generation_worker in
backend/services/task_queue.py) to avoid races; mention that if the task queue
is changed to allow parallel workers this labeling could produce duplicates and
should be revisited (or replaced with an atomic/DB-backed counter) — place the
comment near the label variable alongside the existing calls to
versions_mod.list_versions and versions_mod.create_version.
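The take-label comments above mention two mitigations: relying on worker serialization, or a DB unique constraint with retry. A minimal `sqlite3` sketch of the constraint-plus-retry idea (the table layout and this `create_version` helper are hypothetical, standing in for `versions_mod`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE versions (
        generation_id TEXT NOT NULL,
        label TEXT NOT NULL,
        UNIQUE (generation_id, label)
    )"""
)


def create_version(generation_id: str, retries: int = 5) -> str:
    """Derive the next take-N label; retry if a concurrent insert won the race."""
    for _ in range(retries):
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM versions WHERE generation_id = ?",
            (generation_id,),
        ).fetchone()
        label = f"take-{count + 1}"
        try:
            conn.execute(
                "INSERT INTO versions (generation_id, label) VALUES (?, ?)",
                (generation_id, label),
            )
            return label
        except sqlite3.IntegrityError:
            continue  # another writer took this label; recount and retry
    raise RuntimeError("could not allocate a unique take label")


print(create_version("gen-1"))  # take-1
print(create_version("gen-1"))  # take-2
```

The unique constraint turns the TOCTOU window into a detectable `IntegrityError` instead of a silent duplicate, so the retry loop is only exercised under actual contention.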
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bf071068-633d-4ff0-b2c9-f052b205964c
📒 Files selected for processing (11)
- app/src/components/Generation/EngineModelSelector.tsx
- backend/build_binary.py
- backend/database/migrations.py
- backend/models.py
- backend/routes/cuda.py
- backend/routes/generations.py
- backend/routes/history.py
- backend/routes/models.py
- backend/routes/transcription.py
- backend/services/generation.py
- tauri/src-tauri/src/main.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/routes/cuda.py
Replace verbose startup messages with a clean summary:
- App version, Python version, OS/arch
- Database path (fix None display), data directory
- Profile and generation counts
- Backend, GPU, model cache path
- Clean up stale loading_model status on startup
- Remove noisy progress manager log line
Actionable comments posted: 6
🧹 Nitpick comments (4)
backend/backends/mlx_backend.py (1)

58-61: Logger message is misleading when model is already cached.

`_get_model_path` is called by `_is_model_cached`, so logging "Will download MLX model from HuggingFace Hub" occurs even when just checking whether the model is cached. This creates confusing log output. Consider moving this log statement to `_load_model_sync`, where the download actually happens, or rewording it to be accurate for both contexts.

✨ Suggested fix

```diff
     hf_model_id = mlx_model_map[model_size]
-    logger.info("Will download MLX model from HuggingFace Hub: %s", hf_model_id)
     return hf_model_id
```

Then in `_load_model_sync` around line 109:

```diff
-    logger.info("Loading MLX TTS model %s...", model_size)
+    logger.info("Loading MLX TTS model %s from %s...", model_size, model_path)
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/backends/mlx_backend.py` around lines 58-61: the logger.info in _get_model_path currently says "Will download MLX model..." even when _get_model_path is being used by _is_model_cached; change the logging so it only announces an actual download in _load_model_sync (where the download occurs), or make the message neutral in _get_model_path. Concretely, remove or reword the logger.info in _get_model_path (referencing hf_model_id) so it does not claim a download, and add a new logger.info in _load_model_sync that logs "Downloading MLX model from HuggingFace Hub: %s" using the same hf_model_id when the code path actually performs the download.

backend/backends/pytorch_backend.py (1)
322-326: Prefix unused variable with underscore.

The `sr` variable from `load_audio` is unused. Prefix it with `_` to clarify intent and silence the linter warning.

✨ Suggested fix

```diff
 def _transcribe_sync():
     """Run synchronous transcription in thread pool."""
     # Load audio
-    audio, sr = load_audio(audio_path, sample_rate=16000)
+    audio, _sr = load_audio(audio_path, sample_rate=16000)
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/backends/pytorch_backend.py` around lines 322-326: in _transcribe_sync, the tuple returned by load_audio assigns the sample rate to sr but it is unused; update the assignment to capture the unused value as _sr (e.g., audio, _sr = load_audio(...)) to signal intent and silence the linter warning. Modify the assignment in the _transcribe_sync function where load_audio is called.

backend/build_binary.py (2)
313-314: Redundant import statement.

When `restore_cuda` is `True`, the `subprocess` module was already imported at line 280, so this second import is redundant.

♻️ Remove redundant import

```diff
     if restore_cuda:
         print("Restoring CUDA torch...")
-        import subprocess
-        subprocess.run(
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/build_binary.py` around lines 313 - 314, Remove the redundant "import subprocess" inside the conditional that checks restore_cuda; the module is already imported earlier (the first import at line ~280), so delete the second import statement to avoid unreachable/redundant imports and rely on the existing top-level subprocess import when using restore_cuda.
11-17: Logger created but never used.

The `logger` is initialized, but all output throughout the file uses `print()` statements instead. Either use the logger for consistent logging or remove the unused import and variable.

♻️ Option 1: Remove unused logger

```diff
 import PyInstaller.__main__
 import argparse
-import logging
 import os
 import platform
 import sys
 from pathlib import Path

-logger = logging.getLogger(__name__)
-
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/build_binary.py` around lines 11 - 17, The file defines a logger variable (logger) but never uses it and instead uses print() for output; either replace all print(...) calls in this module with appropriate logger methods (e.g., logger.info/debug/warning/error) so logging is consistent and the logger import is used, or remove the logging import and the logger variable and keep print() if stdout-only output is intended; search for the logger symbol and all print(...) occurrences to update them consistently (prefer converting prints to logger.* calls).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app.py`:
- Around line 150-154: The shutdown_event currently calls tts.unload_tts_model()
then transcribe.unload_whisper_model() and if the first throws the second never
runs; change shutdown_event to call each unload in its own try/except so
failures are best-effort and do not abort subsequent cleanup, e.g. wrap
tts.unload_tts_model() in a try/except that logs the exception via logger.error
and then independently call transcribe.unload_whisper_model() inside its own
try/except and log any error similarly; ensure the handler remains async and
continues executing even if an unload raises.
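The best-effort teardown described above can be sketched like this; `tts` and `transcribe` are assumed to be module-like objects exposing the unload functions named in the comment:

```python
import logging

logger = logging.getLogger(__name__)

async def shutdown_event(tts, transcribe):
    """Best-effort teardown sketch: each unload gets its own try/except so a
    failure in one cannot abort the other."""
    try:
        tts.unload_tts_model()
    except Exception:
        logger.error("Failed to unload TTS model", exc_info=True)
    try:
        transcribe.unload_whisper_model()
    except Exception:
        logger.error("Failed to unload Whisper model", exc_info=True)
```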
- Around line 107-123: The DB session obtained via next(get_db()) must be closed
and rolled back on error even if execute() or commit() raises; wrap the work
using the session (db) in a try/except/finally: after calling get_db() (the
next(get_db()) call) perform execute() and commit() in the try, on exception
call db.rollback() and log via logger.warning (preserve the existing message),
and always call db.close() in the finally block so the session is closed on
every path; reference get_db, db.execute, db.commit, db.rollback, db.close, and
logger in the changes.
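The session-lifecycle fix can be sketched as follows; `get_db` is assumed to be the generator dependency used by the routes, and `statement` stands in for the real cleanup SQL:

```python
import logging

logger = logging.getLogger(__name__)

def startup_db_cleanup(get_db, statement):
    """Sketch of the fix: commit on success, rollback on error, and close the
    session on every path."""
    db = next(get_db())
    try:
        db.execute(statement)
        db.commit()
    except Exception:
        db.rollback()
        logger.warning("Startup DB cleanup failed", exc_info=True)
    finally:
        db.close()  # session is closed whether or not the work succeeded
```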
- Around line 16-27: The module-level import of torch (the statement "import
torch") is intentionally below AMD GPU bootstrap environment setup and triggers
Ruff E402; suppress this by adding a per-file Ruff ignore for E402 immediately
above the torch import (i.e., insert "# ruff: noqa: E402" on the line before the
"import torch" statement) or alternatively add an equivalent per-file ignore in
the Ruff config so the required AMD GPU environment variable code can remain
before the torch import.
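The pattern looks like this; `json` stands in for `torch` so the snippet runs anywhere, and the env variable name is illustrative. Either a per-line `# noqa: E402` (shown here) or the file-level `# ruff: noqa: E402` directive the comment suggests will suppress the rule:

```python
# Late-import pattern: the environment bootstrap must execute before the
# heavy import, so the import is annotated for ruff instead of being moved.
import os

# Bootstrap that must happen before the import below (illustrative variable).
os.environ.setdefault("EXAMPLE_GPU_BOOTSTRAP", "1")

import json  # noqa: E402  (in app.py this line is `import torch`)
```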
- Around line 11-12: Don't force HSA_OVERRIDE_GFX_VERSION; make it opt-in by
only setting it when a dedicated enable flag is present. Change the current
logic around HSA_OVERRIDE_GFX_VERSION so it first checks if an opt-in env var
(e.g., ROCM_ENABLE_GFX_OVERRIDE or similar) equals "1" and only then sets
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0" if HSA_OVERRIDE_GFX_VERSION is
not already defined; reference the environment variable name
HSA_OVERRIDE_GFX_VERSION and the new opt-in var (ROCM_ENABLE_GFX_OVERRIDE) in
your change and ensure you do not overwrite an existing
HSA_OVERRIDE_GFX_VERSION.
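A minimal sketch of the opt-in behavior requested above (the `env` parameter exists only to make the helper testable):

```python
import os

def maybe_enable_gfx_override(env=None):
    """HSA_OVERRIDE_GFX_VERSION is only forced when ROCM_ENABLE_GFX_OVERRIDE=1,
    and a value the user has already exported is never overwritten."""
    env = os.environ if env is None else env
    if env.get("ROCM_ENABLE_GFX_OVERRIDE") == "1":
        env.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")
```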
In `@backend/build_binary.py`:
- Around line 335-377: Remove the unused helper function _get_cuda_dll_excludes
from backend/build_binary.py by deleting its entire definition (including its
docstring and internal logic); verify there are no remaining references or
assignments to _get_cuda_dll_excludes elsewhere in the file and clean up any
now-unused imports or variables that were only used by that function to avoid
lint warnings.
In `@backend/services/generation.py`:
- Around line 129-136: The exception handler must rollback the failing DB
session before calling history.update_generation_status so the subsequent
db.query()/db.commit() inside update_generation_status can run; update the
except block to call await bg_db.rollback() (or bg_db.rollback() if sync) or
bg_db.rollback_async() as appropriate to your DB session API, then call
history.update_generation_status(generation_id=generation_id, status="failed",
db=bg_db, error=str(e)); ensure you reference the existing bg_db, generation_id,
and history.update_generation_status symbols and keep rollback awaited/executed
before the update call.
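The ordering fix can be sketched like this; the function and its parameters are illustrative wrappers around the `bg_db` and `history.update_generation_status` symbols named above, assuming a synchronous session (with an async session, `bg_db.rollback()` would be awaited):

```python
def handle_generation_failure(bg_db, generation_id, exc, update_generation_status):
    """Sketch: roll the failed session back first so the query/commit inside
    update_generation_status can run."""
    bg_db.rollback()
    update_generation_status(
        generation_id=generation_id, status="failed", db=bg_db, error=str(exc)
    )
```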
---
Nitpick comments:
In `@backend/backends/mlx_backend.py`:
- Around line 58-61: The logger.info in _get_model_path currently says "Will
download MLX model..." even when _get_model_path is being used by
_is_model_cached; change the logging so it only announces an actual download in
_load_model_sync (where the download occurs) or make the message neutral in
_get_model_path. Concretely, remove or reword the logger.info in _get_model_path
(referencing hf_model_id) so it does not claim a download, and add a new
logger.info in _load_model_sync that logs "Downloading MLX model from
HuggingFace Hub: %s" using the same hf_model_id when the code path actually
performs the download.
In `@backend/backends/pytorch_backend.py`:
- Around line 322-326: In _transcribe_sync, the tuple returned by load_audio
assigns the sample rate to sr but it is unused; update the assignment to capture
the unused value as _sr (e.g., audio, _sr = load_audio(...)) to signal intent
and silence the linter warning—modify the assignment in the _transcribe_sync
function where load_audio is called.
In `@backend/build_binary.py`:
- Around line 313-314: Remove the redundant "import subprocess" inside the
conditional that checks restore_cuda; the module is already imported earlier
(the first import at line ~280), so delete the second import statement to avoid
unreachable/redundant imports and rely on the existing top-level subprocess
import when using restore_cuda.
- Around line 11-17: The file defines a logger variable (logger) but never uses
it and instead uses print() for output; either replace all print(...) calls in
this module with appropriate logger methods (e.g.,
logger.info/debug/warning/error) so logging is consistent and the logger import
is used, or remove the logging import and the logger variable and keep print()
if stdout-only output is intended; search for the logger symbol and all
print(...) occurrences to update them consistently (prefer converting prints to
logger.* calls).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7dfff82b-d0ec-4290-aafe-e2d0396476ad
📒 Files selected for processing (6)
- app/src/components/ui/select.tsx
- backend/app.py
- backend/backends/mlx_backend.py
- backend/backends/pytorch_backend.py
- backend/build_binary.py
- backend/services/generation.py
Summary
Complete backend restructuring from a 2,800-line monolithic `main.py` into a modular, well-documented architecture. No API changes — all 90 endpoints preserved with identical behavior.

What changed
Architecture (Phases 1-4)
- Split `main.py` into 12 domain-specific routers under `routes/`
- Extracted `services/` (channels, effects, history, profiles, stories, versions, export_import, transcribe, tts, generation, task_queue, cuda)
- Split `database.py` into a `database/` package (models, session, migrations, seeds)
- Extracted `backends/base.py` shared utilities (-1,078 lines)
- Unified generation logic in `services/generation.py`
- `main.py` is now a 45-line entry point

Style guide and tooling (Phase 7)
- `backend/STYLE_GUIDE.md` covering formatting, imports, types, docstrings, comments, error handling, async, logging, and naming
- `pyproject.toml` with ruff linter + formatter config
- `just check-python`, `just fix-python`, `just test` commands

Bug fix
- Replaced `netstat.exe` with `TcpStream` + PowerShell `Get-NetTCPConnection` for Windows port detection — fixes "Can't run the software at all" #277 (infinite loading loop on systems with corrupted system DLLs)

Cleanup
- Deleted `studio.py`, `migrate_add_instruct.py`, `utils/validation.py`, `example_usage.py`
- Added `backend/README.md` as an architecture document
- Removed `tauri/src-tauri/gen/Assets.car`

Post-refactor backend structure
Verification
- `cargo check` passes

Closes #277