
Backend refactor: modular architecture, style guide, tooling#285

Merged
jamiepine merged 26 commits into main from backend-refactor on Mar 16, 2026
Conversation


jamiepine (Owner) commented Mar 16, 2026

Summary

Complete backend restructuring from a 2,800-line monolithic main.py into a modular, well-documented architecture. No API changes — all 90 endpoints preserved with identical behavior.

What changed

Architecture (Phases 1-4)

  • Extracted 90 routes from main.py into 12 domain-specific routers under routes/
  • Moved all business logic into services/ (channels, effects, history, profiles, stories, versions, export_import, transcribe, tts, generation, task_queue, cuda)
  • Split database.py into a database/ package (models, session, migrations, seeds)
  • Deduplicated backend implementations via backends/base.py shared utilities (-1,078 lines)
  • Unified three near-identical generation closures into a single services/generation.py
  • main.py is now a 45-line entry point

Style guide and tooling (Phase 7)

  • Added backend/STYLE_GUIDE.md covering formatting, imports, types, docstrings, comments, error handling, async, logging, and naming
  • Added pyproject.toml with ruff linter + formatter config
  • Added just check-python, just fix-python, just test commands
  • Replaced Makefile with justfile as the single task runner

Bug fix

  • Replace netstat.exe with TcpStream + PowerShell Get-NetTCPConnection for Windows port detection — fixes #277 ("Can't run the software at all": infinite loading loop on systems with corrupted system DLLs)

Cleanup

  • Deleted dead code: studio.py, migrate_add_instruct.py, utils/validation.py, example_usage.py
  • Rewrote backend/README.md as an architecture document
  • Gitignored tauri/src-tauri/gen/Assets.car

Post-refactor backend structure

backend/
  app.py              # FastAPI app factory, CORS, lifecycle
  main.py             # Entry point (45 lines)
  config.py           # Data directory paths
  models.py           # Pydantic schemas
  server.py           # Tauri sidecar launcher
  routes/             # 12 domain routers (thin HTTP handlers)
  services/           # Business logic, CRUD, orchestration
  backends/           # TTS/STT engine implementations
  database/           # ORM models, session, migrations, seeds
  utils/              # Audio, effects, caching, progress

Verification

  • All 90 application routes registered and app imports cleanly
  • Rust compiles clean (cargo check passes)
  • Net diff: +5,766 / -6,376 lines (610 lines removed overall)

Closes #277

Summary by CodeRabbit

  • New Features

    • Engine/model selector UI with per-engine language rules and a new "loading_model" status
    • Generation versioning, effects presets (preview/apply), multi-track story editor and export
    • Audio channel management, model lifecycle APIs (download/load/unload), CUDA tooling, transcription, and richer model status reporting
  • Bug Fixes

    • Improved Windows server discovery and port handling
  • Documentation

    • New backend style guide, engine-integration guide, revamped backend README, and Just-based setup/build instructions (Makefile → justfile)

Phase 1 - delete dead code:
- studio.py, migrate_add_instruct.py, utils/validation.py
- duplicate _profile_to_response in main.py, duplicate asyncio import
- pointless _get_profiles_dir/_get_generations_dir wrappers
- duplicate LANGUAGE_CODE_TO_NAME and WHISPER_HF_REPOS constants

Phase 2 - extract backends/base.py with shared utilities:
- is_model_cached() replaces 7 copy-pasted HF cache checks
- get_torch_device() replaces 5 device detection methods
- combine_voice_prompts() replaces 5 identical implementations
- model_load_progress() ctx manager replaces progress boilerplate in all backends
- patch_chatterbox_f32() replaces identical monkey-patches in both chatterbox backends

net -1078 lines across the backend
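The get_torch_device() helper consolidates that fallback chain. As a rough illustration of the selection order (assumed here to be CUDA before MPS before XPU before CPU; the real helper queries torch at runtime, which this sketch deliberately avoids):

```python
def pick_device(cuda: bool = False, mps: bool = False, xpu: bool = False) -> str:
    """Pick the first available accelerator, falling back to CPU.

    Illustrative stand-in for a shared get_torch_device() helper: the
    availability flags are injected here, whereas a real implementation
    would call e.g. torch.cuda.is_available().
    """
    if cuda:
        return "cuda"
    if mps:
        return "mps"
    if xpu:
        return "xpu"
    return "cpu"
```

Centralizing this in backends/base.py is what lets five per-backend copies collapse to one call site each.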
…akefile

- Add backend/STYLE_GUIDE.md covering formatting, imports, types, docstrings,
  comments, error handling, async, logging, and naming conventions
- Add pyproject.toml with ruff linter/formatter config (ERA, FIX, isort, pyupgrade)
- Extract generation service (Phase 3): unified run_generation() replaces three
  duplicated closures, serial queue moved to services/task_queue.py
- Delete Makefile in favor of justfile; update all references
- Add Python lint/format/test commands to justfile (check-python, fix-python, test)
- Install ruff, pytest, pytest-asyncio as dev tools in setup-python
- Update REFACTOR_PLAN.md with Phase 3 and Phase 7 completion
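The serial queue mentioned above (services/task_queue.py) runs generations one at a time. A minimal asyncio sketch of that idea; the class name and structure here are hypothetical, not the project's actual implementation:

```python
import asyncio


class SerialTaskQueue:
    """Run submitted coroutines strictly one at a time, in FIFO order.

    Hypothetical sketch of a serial generation queue; the real
    services/task_queue.py may differ in naming and error handling.
    """

    def __init__(self) -> None:
        self._pending: asyncio.Queue = asyncio.Queue()
        self._worker: asyncio.Task | None = None

    async def submit(self, coro) -> None:
        await self._pending.put(coro)
        # Lazily start a single drain task; never more than one worker.
        if self._worker is None or self._worker.done():
            self._worker = asyncio.create_task(self._drain())

    async def _drain(self) -> None:
        while not self._pending.empty():
            coro = self._pending.get_nowait()
            await coro  # only one generation runs at any moment
```

A single worker task guarantees GPU-bound generations never overlap, while submit() returns immediately so HTTP handlers stay responsive.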
Split the 2,578-line main.py (90 routes) into 12 domain-specific router
modules under routes/. main.py is now a 45-line entry point.

New structure:
- app.py: FastAPI instance, CORS, startup/shutdown, safe_content_disposition
- routes/: health, profiles, channels, generations, history, transcription,
  stories, effects, audio, models, tasks, cuda
- services/cuda.py: moved from cuda_download.py

Also includes Phase 5 database/ package (from parallel agent):
- database/__init__.py re-exports all symbols for backward compat
- database/models.py, session.py, migrations.py, seed.py

All 90 routes verified registered and app imports cleanly.
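The database/migrations.py module runs idempotent migrations at startup. One common pattern for idempotent SQLite column additions (a generic sketch, not the project's actual code) is to inspect the schema before altering it:

```python
import sqlite3


def add_column_if_missing(
    conn: sqlite3.Connection, table: str, column: str, decl: str
) -> bool:
    """Add a column unless it already exists; safe to run on every startup."""
    # PRAGMA table_info rows are (cid, name, type, notnull, default, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False  # already migrated; no-op
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True
```

Because the check and the ALTER are both cheap, the whole migration list can be replayed on every launch without version bookkeeping.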
…ils/

Move 9 business-logic modules from the backend root into services/:
channels, effects, history, profiles, stories, versions, export_import,
transcribe, tts. Move platform_detect.py into utils/.

Backend root now contains only infrastructure (app, main, config, server,
models, build_binary) and docs. All 94 routes verified.
…277)

On Windows, Voicebox shelled out to netstat.exe on startup to check for
existing server processes. On systems with corrupted DLLs, netstat fails
with 0xc0000142, causing an infinite loading loop.

Replace with:
- TcpStream::connect_timeout() for port-in-use checks (pure Rust)
- PowerShell Get-NetTCPConnection for port-to-PID lookup (built-in cmdlet)
- tasklist for process name verification (unchanged)
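The Rust fix probes ports with TcpStream::connect_timeout instead of shelling out. The same technique in Python, for illustration (an analogue of the approach, not the shipped Rust code):

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.25) -> bool:
    """Report whether something is accepting TCP connections on host:port.

    Mirrors the TcpStream::connect_timeout() approach: no external
    process is spawned, so a corrupted netstat.exe can no longer
    hang startup.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```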

Closes #277
…guide

Replace the outdated backend README (473 lines of stale API docs and
pre-refactor file tree) with a concise architecture document covering
module structure, request flow, backend selection, API domain overview,
and development commands.

Delete REFACTOR_PLAN.md -- all phases are complete.

Update STYLE_GUIDE.md to remove refactor plan references and replace
the verbose target layout with the current actual structure.
Compiled Xcode asset catalog gets regenerated every build. No reason
to track it.

coderabbitai bot commented Mar 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Replaces Makefile with justfile, adds a FastAPI app factory and many new routers, splits the database into models/migrations/session/seed, introduces a ModelConfig-backed backend registry and shared backend helpers, adds a serial generation queue and run_generation orchestration, updates frontend engine/model selection, and removes legacy docs/examples.

Changes

Cohort / File(s) Summary
Build & Tooling
Makefile, justfile, CHANGELOG.md, PATCH_NOTES.md, .gitignore, backend/pyproject.toml
Remove Makefile, add justfile and Python tooling (ruff/pytest), update changelog and patch notes, add generated Tauri asset to .gitignore, add backend pyproject config.
FastAPI app & lifecycle
backend/app.py, backend/routes/__init__.py
Add FastAPI factory (create_app()), CORS, startup/shutdown lifecycle, GPU/status helpers, and centralized router registration.
Database package
backend/database/...
backend/database/models.py, backend/database/session.py, backend/database/migrations.py, backend/database/seed.py, backend/database/__init__.py (removed) backend/database.py
Replace monolithic DB module with SQLAlchemy models, session/init, idempotent migrations, seed/backfill utilities, and package re-exports.
Routes — new/changed
backend/routes/* (generations, models, effects, profiles, stories, history, tasks, cuda, audio, health, transcription, channels)
Add comprehensive routers: generation endpoints (SSE/status/stream), model lifecycle/migrate/download/progress, effects/versioning, profiles CRUD/import/export, stories CRUD/export, history import/export, tasks/cache, CUDA management with SSE progress, audio serving, health/filesystem, and transcription.
Backends & shared helpers
backend/backends/__init__.py, backend/backends/base.py, backend/backends/* (chatterbox*, luxtts, mlx, pytorch, mlx_backend, etc.)
Introduce ModelConfig registry, centralized utilities (is_model_cached, get_torch_device, model_load_progress, combine_voice_prompts, patching), and refactor per-engine loading/unloading to use shared helpers.
Generation & task queue services
backend/services/generation.py, backend/services/task_queue.py, backend/services/*
Add run_generation orchestration (generate/retry/regenerate), a serial generation queue, background-task tracking, and update service import paths.
Database migrations & seed
backend/database/migrations.py, backend/database/seed.py
Add idempotent SQLite runtime migrations and post-migration seeding/backfill routines (generation versions, builtin presets).
Frontend components & types
app/src/components/Generation/EngineModelSelector.tsx, FloatingGenerateBox.tsx, GenerationForm.tsx, app/src/lib/api/types.ts, app/src/lib/hooks/*, app/src/components/History/HistoryTable.tsx, app/src/components/Sidebar.tsx
Add EngineModelSelector component, preset-driven generation UI, integrate EngineModelSelector into forms, add loading_model status to API types/hooks, update history messaging and sidebar update status UI.
Packaging & binary build
backend/build_binary.py, tauri/src-tauri/src/main.rs, backend/utils/*, backend/utils/hf_offline_patch.py, backend/utils/hf_progress.py
Refactor PyInstaller arg construction and hidden imports, improve logging, replace netstat with TCP/connect + Windows PID lookup, centralize HF-offline patching and progress reporting.
Removals & cleanup
backend/database.py, backend/example_usage.py, backend/studio.py, backend/migrate_add_instruct.py, backend/utils/validation.py, docs/... (many planning docs removed)
Remove legacy monolithic DB file, example scripts, stubs, migration script, validation utilities, and several large planning documents.
Tests & import fixes
backend/tests/*, backend/services/*
Adjust tests/fixtures and service import paths to the new package layout; minor test formatting and fixture target changes.
Docs & plans
docs/plans/*, added docs/plans/ADDING_TTS_ENGINES.md
Add engine-integration guide, update docs and project-status files, remove outdated planning docs.

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant Client
  participant API as FastAPI
  participant DB
  participant Queue as TaskQueue
  participant Worker as GenerationWorker
  participant Backend as TTSBackend
  participant Store as Storage

  Client->>API: POST /generations (GenerationRequest)
  API->>DB: INSERT Generation (status=loading_model)
  API->>Queue: enqueue_generation(coroutine)
  API-->>Client: 200 GenerationResponse (id)
  Queue->>Worker: deliver coroutine
  Worker->>Backend: ensure model loaded (engine, model_size)
  Backend->>Store: load/cache model assets
  Backend-->>Worker: model ready
  Worker->>Backend: run TTS generation (chunked)
  Worker->>Store: write WAV file(s)
  Worker->>DB: UPDATE Generation status -> generating/completed
  Client->>API: GET /generations/{id}/status (SSE)
  API->>DB: poll updates -> stream status events
```

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇 I hopped through code both new and old,

Routers, models, queues in a neat fold,
Engines gathered, presets tucked tight,
Tasks queued, models loading by night,
From Make to just — a tidy hop of delight!


coderabbitai bot left a comment

Actionable comments posted: 18

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🟡 Minor comments (14)
backend/README.md-22-35 (1)

22-35: ⚠️ Potential issue | 🟡 Minor

Add fence languages to markdown code blocks (Line 22, Line 39, Line 112).

These fences should declare a language (use text) to satisfy markdownlint MD040.

Suggested doc-only fix
-```
+```text
 backend/
   app.py                  # FastAPI app factory, CORS, lifecycle events
   main.py                 # Entry point (imports app, runs uvicorn)
   config.py               # Data directory paths and configuration
   models.py               # Pydantic request/response schemas
   server.py               # Tauri sidecar launcher, parent-pid watchdog
@@
-```
+```text
 HTTP request
   -> routes/        (validate input, parse params)
   -> services/      (business logic, database queries, orchestration)
   -> backends/      (TTS/STT inference)
   -> utils/         (audio processing, effects, caching)
@@
-```
+```text
 {data_dir}/
   voicebox.db             # SQLite database
   profiles/{id}/          # Voice samples per profile
   generations/            # Generated audio files
   cache/                  # Voice prompt cache (memory + disk)
   backends/               # Downloaded CUDA binary (if applicable)

Also applies to: 39-45, 112-119

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/README.md` around lines 22 - 35, Update the three markdown code
fences that contain the directory listing, the HTTP request flow, and the
{data_dir}/ listing so they declare a language; replace the opening triple
backticks (```) with triple backticks plus "text" (```text) for the blocks
starting with "backend/", "HTTP request", and "{data_dir}/" respectively to
satisfy markdownlint MD040; ensure only the fence header is changed and the
block contents remain identical.
backend/README.md-55-55 (1)

55-55: ⚠️ Potential issue | 🟡 Minor

Prevent markdown strong-style parsing on __init__ filename (Line 55).

**backends/__init__.py** can trigger MD050 due to underscores. Wrap the filename in backticks inside bold text.

Suggested doc-only fix
-**backends/__init__.py** -- Protocol definitions (`TTSBackend`, `STTBackend`), model config registry, and factory functions. Adding a new engine means implementing the protocol and registering a config entry.
+**`backends/__init__.py`** -- Protocol definitions (`TTSBackend`, `STTBackend`), model config registry, and factory functions. Adding a new engine means implementing the protocol and registering a config entry.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/README.md` at line 55, Change the bolded filename token to use inline
code so the Markdown linter won't parse underscores as emphasis: replace the
bold text "**backends/__init__.py**" with bolded inline-code
"**`backends/__init__.py`**" in the README (the line describing
backends/__init__.py, where the protocols TTSBackend and STTBackend are
referenced).
tauri/src-tauri/src/main.rs-17-18 (1)

17-18: ⚠️ Potential issue | 🟡 Minor

Update the helper docs to match actual behavior.

Line 17-Line 18 mention TcpListener::bind, but this function only uses PowerShell + tasklist. The comment is now misleading for future maintenance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tauri/src-tauri/src/main.rs` around lines 17 - 18, The doc comment currently
claims the helper uses TcpListener::bind, but the implementation actually
invokes PowerShell and tasklist to find a voicebox process and return its PID;
update the helper's documentation comment to remove the TcpListener::bind
reference and clearly state that it runs a PowerShell tasklist scan (using
tasklist) to locate the voicebox process and returns the PID when found, and
mention any platform constraint (Windows) if applicable so future maintainers
(and the function that follows) see accurate behavior.
CHANGELOG.md-69-79 (1)

69-79: ⚠️ Potential issue | 🟡 Minor

Consolidate entries under a single Unreleased section.

The file now has two ## [Unreleased] headers (Line 8 and Line 62). Please merge these new bullets into one section to avoid ambiguity for readers/tools.

🧩 Suggested cleanup
-## [Unreleased]
-
-### Fixed
-- Audio export failing when Tauri save dialog returns object instead of string path
-- OpenAPI client generator script now documents the local backend port and avoids an unused loop variable warning
-
-### Added
-- **justfile** - Comprehensive development workflow automation with commands for setup, development, building, testing, and code quality checks
-  - Cross-platform support (macOS, Linux, Windows)
-  - Python version detection and compatibility warnings
-  - Self-documenting help system with `just --list`
-
-### Changed
-- **README** - Updated Quick Start with justfile-based setup instructions
-
-### Removed
-- **Makefile** - Replaced by justfile (cross-platform, simpler syntax)
+## [Unreleased]
+### Fixed
+- **Profile Name Validation** - Added proper validation to prevent duplicate profile names ([`#134`](https://github.com/jamiepine/voicebox/issues/134))
+  - Users now receive clear error messages when attempting to create or update profiles with duplicate names
+  - Improved error handling in create and update profile API endpoints
+  - Added comprehensive test suite for duplicate name validation
+- Audio export failing when Tauri save dialog returns object instead of string path
+- OpenAPI client generator script now documents the local backend port and avoids an unused loop variable warning
+
+### Added
+- **justfile** - Comprehensive development workflow automation with commands for setup, development, building, testing, and code quality checks
+  - Cross-platform support (macOS, Linux, Windows)
+  - Python version detection and compatibility warnings
+  - Self-documenting help system with `just --list`
+
+### Changed
+- **README** - Updated Quick Start with justfile-based setup instructions
+
+### Removed
+- **Makefile** - Replaced by justfile (cross-platform, simpler syntax)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@CHANGELOG.md` around lines 69 - 79, There are duplicate "## [Unreleased]"
sections; merge the entries under both headers into a single Unreleased section
by combining the bullets for "**justfile**", the "Changed" entry for
"**README**", and the "Removed" entry for "**Makefile**" into one consolidated
"## [Unreleased]" header, remove the extra duplicate header, and ensure the
combined bullets maintain their original ordering and formatting (including
sub-bullets like "Cross-platform support" and "Python version detection") so
tools/readers see only one Unreleased block.
backend/routes/tasks.py-41-42 (1)

41-42: ⚠️ Potential issue | 🟡 Minor

Preserve exception chain with from e.

Per static analysis, exceptions raised within an except block should chain to preserve the original traceback.

🐛 Proposed fix
     except Exception as e:
-        raise HTTPException(status_code=500, detail=f"Failed to clear cache: {str(e)}")
+        raise HTTPException(status_code=500, detail=f"Failed to clear cache: {e!s}") from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/tasks.py` around lines 41 - 42, In the except block that
catches Exception and raises an HTTPException (the block that currently does:
except Exception as e: raise HTTPException(status_code=500, detail=f"Failed to
clear cache: {str(e)}")), preserve the exception chain by re-raising the
HTTPException with "from e" (i.e., raise HTTPException(status_code=500,
detail=...) from e) so the original traceback is kept; update that raise in the
clear-cache route handler in backend/routes/tasks.py accordingly.
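For context, `raise ... from e` records the original exception on `__cause__`, so the full traceback chain survives. A generic illustration (stand-in names, not the project's code):

```python
class CacheBackendError(Exception):
    pass


class APIError(Exception):
    """Stand-in for HTTPException in this illustration."""


def clear_cache() -> None:
    try:
        raise CacheBackendError("disk cache is locked")
    except CacheBackendError as e:
        # "from e" preserves the original exception on __cause__,
        # so logs show both the HTTP error and its real trigger.
        raise APIError(f"Failed to clear cache: {e}") from e
```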
backend/routes/audio.py-9-9 (1)

9-9: ⚠️ Potential issue | 🟡 Minor

Unused import models.

The models module is imported but never used in this file.

🧹 Proposed fix
-from .. import models
 from ..services import history
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/audio.py` at line 9, The import models at the top of
backend/routes/audio.py is unused; remove the line "from .. import models" (or
otherwise use the imported symbol in functions that require it) so the file no
longer contains an unused import and linting/flake8 warnings are resolved.
backend/services/export_import.py-97-97 (1)

97-97: ⚠️ Potential issue | 🟡 Minor

Unused variable profile_dir.

This variable is assigned but never used in export_profile_to_zip. It appears to be leftover from refactoring.

🧹 Proposed fix
-        profile_dir = config.get_profiles_dir() / profile_id
-
         for sample in samples:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/export_import.py` at line 97, In export_profile_to_zip
remove the unused assignment to profile_dir (the expression
config.get_profiles_dir() / profile_id) since it’s leftover from refactoring;
locate the line assigning profile_dir inside the export_profile_to_zip function
and delete it, or if the path is actually needed use the existing variables/use
the path where required instead of introducing profile_dir.
backend/app.py-106-120 (1)

106-120: ⚠️ Potential issue | 🟡 Minor

Database session not properly closed on all paths.

Using next(get_db()) bypasses the generator's finally block cleanup. If an exception occurs before db.close(), the session may leak. Consider using a context manager pattern.

🛡️ Proposed fix
         try:
             from sqlalchemy import text as sa_text
 
-            db = next(get_db())
-            result = db.execute(
-                sa_text(
-                    "UPDATE generations SET status = 'failed', "
-                    "error = 'Server was shut down during generation' "
-                    "WHERE status = 'generating'"
+            from contextlib import contextmanager
+            
+            @contextmanager
+            def db_session():
+                db = database.SessionLocal()
+                try:
+                    yield db
+                finally:
+                    db.close()
+            
+            with db_session() as db:
+                result = db.execute(
+                    sa_text(
+                        "UPDATE generations SET status = 'failed', "
+                        "error = 'Server was shut down during generation' "
+                        "WHERE status = 'generating'"
+                    )
                 )
-            )
-            if result.rowcount > 0:
-                print(f"Marked {result.rowcount} stale generation(s) as failed")
-            db.commit()
-            db.close()
+                if result.rowcount > 0:
+                    print(f"Marked {result.rowcount} stale generation(s) as failed")
+                db.commit()

Or simpler, just use try/finally with db.close() already in place and ensure exception paths also close:

             db = next(get_db())
-            result = db.execute(
-                sa_text(
-                    "UPDATE generations SET status = 'failed', "
-                    "error = 'Server was shut down during generation' "
-                    "WHERE status = 'generating'"
+            try:
+                result = db.execute(
+                    sa_text(
+                        "UPDATE generations SET status = 'failed', "
+                        "error = 'Server was shut down during generation' "
+                        "WHERE status = 'generating'"
+                    )
                 )
-            )
-            if result.rowcount > 0:
-                print(f"Marked {result.rowcount} stale generation(s) as failed")
-            db.commit()
-            db.close()
+                if result.rowcount > 0:
+                    print(f"Marked {result.rowcount} stale generation(s) as failed")
+                db.commit()
+            finally:
+                db.close()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app.py` around lines 106 - 120, The code calls next(get_db()) which
bypasses the generator's cleanup; fix by capturing the generator (e.g. gen =
get_db()) and then db = next(gen), wrap the DB work in try/finally and in
finally call gen.close() (and/or db.close() if still needed) so the generator's
finally block always runs on exceptions; update the block around get_db(), db,
and commit/close to use this try/finally pattern to guarantee session cleanup.
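The reason `gen.close()` matters: closing a generator raises GeneratorExit inside it, which runs its `finally` block. A stdlib illustration with a stand-in session dependency (hypothetical names):

```python
closed = []


def get_db():
    """Stand-in for a FastAPI-style yield dependency."""
    db = object()  # pretend this is a SQLAlchemy session
    try:
        yield db
    finally:
        closed.append(db)  # the cleanup that bare next(get_db()) skips


def mark_stale_generations() -> None:
    gen = get_db()
    db = next(gen)
    try:
        pass  # ... execute UPDATE, commit ...
    finally:
        gen.close()  # raises GeneratorExit in get_db(), running its finally
```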
backend/database/__init__.py-24-44 (1)

24-44: ⚠️ Potential issue | 🟡 Minor

Keep __all__ sorted to satisfy Ruff.

RUF022 is already firing on Lines 24-44, so this file will stay noisy in ruff check until the export list is ordered consistently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/database/__init__.py` around lines 24 - 44, The __all__ export list
in this module is unsorted and triggers RUF022; reorder the entries in the
__all__ list (the string names such as "AudioChannel", "Base",
"ChannelDeviceMapping", "EffectPreset", "Generation", "GenerationVersion",
"ProfileChannelMapping", "ProfileSample", "Project", "SessionLocal", "_db_path",
"Story", "StoryItem", "VoiceProfile", "engine", "get_db", "init_db") into a
consistent alphabetical order (respecting your project's case-sensitivity rules)
so Ruff stops flagging it; update the existing __all__ declaration (not its
values elsewhere) to use the sorted sequence.
docs/plans/ADDING_TTS_ENGINES.md-194-214 (1)

194-214: ⚠️ Potential issue | 🟡 Minor

Annotate these fenced blocks with bash.

markdownlint MD040 is already flagging the snippets on Lines 194, 206, and 212, so the new docs will stay noisy until those fences declare a language.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/ADDING_TTS_ENGINES.md` around lines 194 - 214, The fenced code
blocks showing shell commands (the block with "pip install --no-deps
chatterbox-tts", the block showing "linacodec @ git+https://...", and the block
with "--find-links https://k2-fsa.github.io/icefall/piper_phonemize.html")
should be annotated as bash to satisfy markdownlint MD040; update each
triple-backtick fence to ```bash so the snippets that include the pip install
line, the git+https URIs, and the --find-links line are explicitly marked as
shell/bash code blocks.
docs/plans/PROJECT_STATUS.md-53-56 (1)

53-56: ⚠️ Potential issue | 🟡 Minor

This architecture table is already stale after the refactor.

The backend/main.py row still describes the pre-refactor 2.8k-line monolith, even though this PR turns it into a small bootstrap. The new hard-coded source line ranges in the neighboring rows will rot just as fast, so I'd update the table and drop exact line numbers while you're here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/PROJECT_STATUS.md` around lines 53 - 56, Update the architecture
table to reflect the refactor: change the `backend/main.py` row to describe it
as a small bootstrap/entrypoint (not a ~2850-line monolith), remove all
hard-coded source line ranges (e.g., the `:32-101`, `:17-29,153-366`, `:382-426`
annotations) and instead reference symbols or files only (for example list
`TTSBackend` Protocol, `ModelConfig` dataclass, and the engine registry in
`backend/backends/__init__.py` without line numbers), and update the TTS factory
row to describe the thread-safe engine registry (double-checked locking)
generically so the table stays accurate after future refactors.
backend/routes/health.py-31-35 (1)

31-35: ⚠️ Potential issue | 🟡 Minor

Use call_later() to ensure the shutdown callback actually runs.

asyncio.create_task(shutdown_async()) creates a task with only a weak reference from the event loop. If this reference is the only one after the handler returns, the task can be garbage-collected before the 0.1-second delay executes, making the /shutdown endpoint a no-op. This is explicitly documented in Python's asyncio docs as a known issue with fire-and-forget tasks.

Suggested fix
-    async def shutdown_async():
-        await asyncio.sleep(0.1)
-        os.kill(os.getpid(), signal.SIGTERM)
-
-    asyncio.create_task(shutdown_async())
+    asyncio.get_running_loop().call_later(0.1, os.kill, os.getpid(), signal.SIGTERM)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/health.py` around lines 31 - 35, The shutdown task may be
garbage-collected before its sleep runs because
asyncio.create_task(shutdown_async()) is a weak reference; replace that
scheduling with the event loop's call_later to guarantee execution: get the
current loop (e.g. via asyncio.get_event_loop() or asyncio.get_running_loop())
and use loop.call_later(0.1, lambda: asyncio.create_task(shutdown_async())), or
alternatively move the os.kill call into a non-async callback and schedule that
with loop.call_later(0.1, shutdown_callback) so the shutdown sequence in
shutdown_async() reliably executes; update references to shutdown_async and the
scheduling call accordingly.
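Why call_later is safer here: the event loop itself holds a strong reference to the scheduled callback until it fires, so nothing can be garbage-collected early. A small self-contained demo (generic, not the endpoint code):

```python
import asyncio


async def main() -> list[str]:
    events: list[str] = []
    loop = asyncio.get_running_loop()
    # The loop keeps a strong reference to this callback until it runs,
    # unlike a bare asyncio.create_task() whose only loop-side reference
    # is weak and can be collected before it executes.
    loop.call_later(0.01, events.append, "fired")
    await asyncio.sleep(0.05)
    return events
```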
backend/routes/profiles.py-179-182 (1)

179-182: ⚠️ Potential issue | 🟡 Minor

Potential None filename causing Path(None).suffix error.

file.filename can be None for some upload scenarios. Unlike line 114 which handles this with file.filename or "", this line does not.

🐛 Proposed fix
-    with tempfile.NamedTemporaryFile(delete=False, suffix=Path(file.filename).suffix) as tmp:
+    with tempfile.NamedTemporaryFile(delete=False, suffix=Path(file.filename or "").suffix or ".tmp") as tmp:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/profiles.py` around lines 179 - 182, The code calls
Path(file.filename).suffix without guarding against file.filename being None;
change the NamedTemporaryFile suffix creation to use a safe fallback (e.g.,
Path(file.filename or "").suffix or a default suffix) so Path never receives
None. Update the tempfile.NamedTemporaryFile(...) call that uses
Path(file.filename).suffix, keep the same tmp_path/tmp variables and ensure
file.read() and tmp.write(content) behavior is unchanged.
backend/backends/base.py-113-120 (1)

113-120: ⚠️ Potential issue | 🟡 Minor

Type inconsistency in DirectML device return: function declares -> str but returns device object.

The function signature explicitly specifies -> str, but torch_directml.device(0) returns a torch.device-like object, while all other branches return string literals ("cuda", "cpu", "xpu", "mps").

Although PyTorch's .to() accepts both strings and device objects at runtime, this violates the return type contract and can confuse static type checkers. Align the return type or convert the DirectML device to a string representation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/base.py` around lines 113 - 120, The DirectML branch
(guarded by allow_directml) currently returns a torch_directml.device object
(via torch_directml.device(0)) while the function signature is -> str; convert
the device object to a string before returning (e.g., use str(...) or another
string representation) so the branch returns the same type as the other branches
("cuda"/"cpu"/"xpu"/"mps") and keep the signature as -> str; update the
torch_directml.device(0) return to return a string and ensure any callers
expecting strings still work.
🧹 Nitpick comments (27)
backend/services/stories.py (3)

5-5: Consider using built-in generics per style guide.

The style guide for Python 3.12+ recommends using built-in generics (list[T], T | None) instead of typing.List and typing.Optional. This can be addressed incrementally as files are touched during the refactor.

♻️ Suggested change
-from typing import List, Optional

No replacement import is needed, since built-in generics require no typing import. Then update function signatures:

  • List[StoryResponse] → list[StoryResponse]
  • Optional[StoryDetailResponse] → StoryDetailResponse | None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/stories.py` at line 5, Replace typing generics with built-in
generics: change the import from typing.List and typing.Optional usage to use
Python 3.12+ built-ins. Update the top-level import that currently references
List/Optional and modify function signatures that return List[StoryResponse] to
return list[StoryResponse] and any occurrences of Optional[StoryDetailResponse]
to StoryDetailResponse | None; search for symbols StoryResponse and
StoryDetailResponse and update their annotations accordingly.

57-61: The getattr usage suggests defensive coding for nullable columns.

Using getattr(item, "version_id", None) and similar patterns implies these fields might not always exist on DBStoryItem. If these are defined columns in the ORM model, direct attribute access would be cleaner and more type-safe. If they're genuinely optional (added in a migration), consider adding type hints to the model.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/stories.py` around lines 57 - 61, The code uses defensive
getattr calls (e.g., getattr(item, "version_id", None), getattr(item,
"trim_start_ms", 0), getattr(item, "trim_end_ms", 0)) when building the story
dict; if these are defined ORM columns on DBStoryItem, replace getattr with
direct attribute access (item.version_id, item.trim_start_ms, item.trim_end_ms)
to be clearer and more type-safe, otherwise annotate DBStoryItem to mark these
fields as Optional[int] (or the correct type) and keep getattr only if model
truly may lack the attribute; update usages in the function that constructs the
story (references: DBStoryItem, version_id, trim_start_ms, trim_end_ms)
accordingly so the code and type hints align.

95-96: datetime.utcnow() is deprecated in Python 3.12.

Consider using datetime.now(timezone.utc) instead (importing timezone from datetime), which is the recommended replacement. This applies to all occurrences in the file.

♻️ Suggested change
+from datetime import datetime, timezone
-from datetime import datetime

# Then replace usages:
-created_at=datetime.utcnow(),
+created_at=datetime.now(timezone.utc),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/stories.py` around lines 95 - 96, Replace all uses of
datetime.utcnow() in this file with timezone-aware UTC timestamps by calling
datetime.now(timezone.utc); specifically update the assignments to
created_at and updated_at (and any other occurrences) and ensure timezone is
imported from datetime so timestamps are timezone-aware and compliant with
Python 3.12.
backend/services/task_queue.py (2)

30-31: Replace traceback.print_exc() with logger.exception() per style guide.

The style guide specifies using logger.exception() inside except blocks as it captures the traceback automatically and integrates with the logging infrastructure.

♻️ Proposed fix
+import logging
+
+logger = logging.getLogger(__name__)
+
 # Keep references to fire-and-forget background tasks to prevent GC
 _background_tasks: set = set()
         try:
             await coro
         except Exception:
-            traceback.print_exc()
+            logger.exception("Generation task failed")
         finally:
             _generation_queue.task_done()

After this change, you can remove the import traceback line.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/task_queue.py` around lines 30 - 31, Replace the bare
traceback.print_exc() call in the except block of backend.services.task_queue
(the except in the code containing traceback.print_exc()) with
logger.exception(...) so the exception and stack trace are logged via the module
logger; update the message to provide context if needed and then remove the
now-unused import traceback from the top of the file. Ensure logger is the
correct logger instance used in task_queue (or import/get it) so
logger.exception is available.

24-33: Consider adding graceful shutdown support for the worker.

The _generation_worker runs an infinite loop with no shutdown mechanism. If the application needs to shut down cleanly (e.g., waiting for in-flight generation to complete), there's no way to signal the worker to stop. This may be acceptable for now but could cause issues with graceful shutdown during development reloads or production deployments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/task_queue.py` around lines 24 - 33, The worker
_generation_worker currently loops forever and needs a shutdown mechanism; add a
module-level asyncio.Event (e.g., _generation_shutdown_event) or a sentinel
object and implement a stop function (e.g., stop_generation_worker) that signals
the event or enqueues the sentinel to wake the loop; change _generation_worker
to await either _generation_queue.get() or _generation_shutdown_event.wait() (or
check for the sentinel after get()), break out of the loop when shutdown is
signaled but ensure in-flight coro is awaited and _generation_queue.task_done()
is called so pending tasks are drained and the worker exits cleanly; update any
startup code to create the event and use stop_generation_worker during shutdown.
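One way to sketch the sentinel approach described above, independent of the real task_queue module (`generation_worker`, `_SHUTDOWN`, and the demo jobs are illustrative, not the actual implementation):

```python
import asyncio

_SHUTDOWN = object()  # sentinel enqueued to tell the worker to exit


async def generation_worker(queue: asyncio.Queue) -> None:
    while True:
        item = await queue.get()
        try:
            if item is _SHUTDOWN:
                return  # finally still runs, so the sentinel is marked done
            await item  # item is a coroutine for one generation job
        except Exception:
            pass  # the real worker would call logger.exception(...) here
        finally:
            queue.task_done()


async def _demo() -> list[int]:
    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(generation_worker(queue))
    done: list[int] = []

    async def job(n: int) -> None:
        done.append(n)

    await queue.put(job(1))
    await queue.put(job(2))
    await queue.put(_SHUTDOWN)  # queued last, so in-flight jobs drain first
    await queue.join()  # all three entries processed
    await worker  # worker returned cleanly
    return done


results = asyncio.run(_demo())
```

Because the sentinel sits behind the pending jobs in the FIFO queue, in-flight generations complete before the worker exits, and `queue.join()` still balances every `get()` with a `task_done()`.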
backend/STYLE_GUIDE.md (1)

372-385: Add language specifier to the code block.

The project layout code block is missing a language specifier, which triggers a markdownlint warning (MD040). Adding a language helps with rendering and accessibility.

📝 Proposed fix
 ## Project Layout

-```
+```text
 backend/
   app.py              # FastAPI app factory, CORS, lifecycle events
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/STYLE_GUIDE.md` around lines 372 - 385, Update the project layout
code fence in the STYLE_GUIDE.md project layout block so the opening triple
backticks include a language specifier (e.g., change from ``` to ```text or
```plaintext) for the backend/ listing code block; locate the project layout
code block (the one starting with "backend/" and the list of files like app.py,
main.py, config.py) and modify its opening fence to include the language
specifier to satisfy markdownlint MD040.
backend/services/channels.py (1)

84-84: Consider using datetime.now(timezone.utc) instead of datetime.utcnow().

datetime.utcnow() is deprecated as of Python 3.12. The recommended approach is:

from datetime import datetime, timezone
# ...
created_at=datetime.now(timezone.utc),

This can be addressed in a follow-up if the project targets Python 3.12+.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/channels.py` at line 84, Replace the use of
datetime.utcnow() for created_at with a timezone-aware timestamp: import
timezone from datetime and set created_at to datetime.now(timezone.utc) in the
code that constructs the channel (look for the created_at=datetime.utcnow()
assignment in backend/services/channels.py, e.g., where the channel or record is
created); update the import to include timezone (from datetime import datetime,
timezone) and use datetime.now(timezone.utc).
backend/tests/test_generation_download.py (1)

196-254: Test Case 2 is defined but never called.

test_generation_with_fresh_download() is not invoked in main(). If this is intentional (to avoid slow model downloads during regular test runs), consider adding a comment or removing the unused code. Alternatively, you could add a CLI flag to optionally run it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_generation_download.py` around lines 196 - 254, The test
function test_generation_with_fresh_download is defined but never called from
main; either invoke it conditionally from main or mark it as intentionally
skipped. Fix by updating main to optionally run
test_generation_with_fresh_download when a flag/env var is set (e.g.,
RUN_SLOW_TESTS or a CLI --run-slow-tests) and ensure the call awaits the
coroutine (await test_generation_with_fresh_download()), or add a clear comment
above the function explaining it is intentionally unused and remove it if not
needed; reference test_generation_with_fresh_download and main to locate where
to add the conditional invocation or the explanatory comment.
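If the gating route is taken, a hedged sketch of the env-var approach (`RUN_SLOW_TESTS` and the stub test bodies are illustrative, not the real test code):

```python
import asyncio
import os

# Opt-in flag; the RUN_SLOW_TESTS name mirrors the suggestion and is illustrative.
RUN_SLOW_TESTS = os.environ.get("RUN_SLOW_TESTS") == "1"

calls: list[str] = []


async def test_generation_with_fresh_download() -> None:
    calls.append("slow")  # stands in for the download-heavy Test Case 2


async def main() -> None:
    calls.append("fast")  # the regular test cases always run
    if RUN_SLOW_TESTS:
        await test_generation_with_fresh_download()


asyncio.run(main())
```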
backend/services/export_import.py (2)

349-349: Redundant import of config.

config is already imported at module level (line 19). This inner import is unnecessary.

♻️ Proposed fix
     from pathlib import Path
     import tempfile
     import shutil
     from datetime import datetime
-    from .. import config
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/export_import.py` at line 349, The function/file contains a
redundant inner import of config (the statement "from .. import config") even
though config is already imported at module scope; remove the duplicate inner
import so only the module-level import remains and ensure any references use the
existing module-level name (config) to avoid shadowing or re-importing.

417-417: Prefer standard import over __import__.

Using __import__('uuid') inline is non-idiomatic. Consider importing uuid at the module level with other standard library imports.

♻️ Proposed fix

Add to imports at top of file:

import uuid

Then update line 417:

-                new_generation_id = str(__import__('uuid').uuid4())
+                new_generation_id = str(uuid.uuid4())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/export_import.py` at line 417, Replace the inline dynamic
import usage in the new_generation_id assignment with a standard module import:
add "import uuid" to the module imports and change the assignment that uses
__import__('uuid').uuid4() to call uuid.uuid4() (keeping the str(...)
conversion) so the code uses the idiomatic, module-level uuid import; update the
statement that sets new_generation_id accordingly.
backend/routes/tasks.py (2)

66-75: Repeated encapsulation violation.

Same issue as lines 24-27: directly accessing private _lock and _progress attributes. A public accessor method like get_progress(model_name) would be cleaner.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/tasks.py` around lines 66 - 75, The code is directly accessing
private attributes progress_manager._lock and progress_manager._progress in two
places; replace these private-access patterns with a public accessor on
ProgressManager (e.g., implement and call
progress_manager.get_progress(model_name)) so callers no longer touch
_lock/_progress; update the two blocks that read pm_data and error to call
progress_manager.get_progress(model_name) (or an equivalent thread-safe method)
which should handle locking and return the progress dict or None.

24-27: Encapsulation violation: accessing private attributes.

Directly manipulating progress_manager._lock, _progress, _last_notify_time, and _last_notify_progress breaks encapsulation. Consider adding a clear() method to the progress manager class instead.

♻️ Suggested approach

Add a public method to ProgressManager:

def clear(self):
    """Clear all progress state."""
    with self._lock:
        self._progress.clear()
        self._last_notify_time.clear()
        self._last_notify_progress.clear()

Then simplify the endpoint:

     task_manager.clear_all()
-
-    with progress_manager._lock:
-        progress_manager._progress.clear()
-        progress_manager._last_notify_time.clear()
-        progress_manager._last_notify_progress.clear()
+    progress_manager.clear()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/tasks.py` around lines 24 - 27, You are directly accessing
private attributes of ProgressManager (progress_manager._lock, _progress,
_last_notify_time, _last_notify_progress); add a public ProgressManager.clear()
method that acquires the internal lock and clears those internal collections,
then replace the direct manipulation in the route with a single call to
progress_manager.clear(); reference the ProgressManager class and the route
where progress_manager is used to locate the change.
backend/database/session.py (2)

72-78: Guard against uninitialized SessionLocal.

If get_db() is called before init_db(), SessionLocal will be None and SessionLocal() will fail with an unhelpful error. Adding a guard improves debuggability.

🛡️ Proposed fix
 def get_db():
     """Yield a database session (FastAPI dependency)."""
+    if SessionLocal is None:
+        raise RuntimeError("Database not initialized. Call init_db() first.")
     db = SessionLocal()
     try:
         yield db
     finally:
         db.close()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/database/session.py` around lines 72 - 78, get_db currently calls
SessionLocal() without checking if SessionLocal has been initialized, causing a
confusing failure if init_db wasn't run; add a guard at the start of get_db to
verify SessionLocal is not None and raise a clear RuntimeError (or similar)
instructing to call init_db(or initialize the DB) when SessionLocal is
uninitialized, so callers get a helpful error instead of a cryptic exception
from SessionLocal().

50-50: Use is_(True) for SQLAlchemy boolean comparison.

Static analysis flags == True, and while SQLAlchemy requires an expression (not a truthy check), using .is_(True) is more explicit and avoids the linter warning.

♻️ Proposed fix
-        default_channel = db.query(AudioChannel).filter(AudioChannel.is_default == True).first()
+        default_channel = db.query(AudioChannel).filter(AudioChannel.is_default.is_(True)).first()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/database/session.py` at line 50, The boolean comparison in the query
uses AudioChannel.is_default == True which triggers static analysis; update the
filter call used when building default_channel (the
db.query(AudioChannel).filter(...).first() expression) to use SQLAlchemy's is_
syntax: replace the equality check with AudioChannel.is_default.is_(True) so the
filter becomes filter(AudioChannel.is_default.is_(True)).
backend/app.py (2)

87-91: Optional: simplify elif chain after return.

Static analysis notes the elif is unnecessary after return. This is purely stylistic but could be cleaner.

♻️ Proposed simplification
         return f"CUDA ({device_name})"
-    elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
+    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
         return "MPS (Apple Silicon)"
-    elif backend_type == "mlx":
+    if backend_type == "mlx":
         return "Metal (Apple Silicon via MLX)"
     return "None (CPU only)"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app.py` around lines 87 - 91, The trailing elifs after an earlier
return are stylistically unnecessary; in the function that determines the
backend (the branch using torch.backends.mps.is_available() and the check
backend_type == "mlx"), replace the `elif` checks following the prior `return`
with plain `if` checks so control flow is clearer—specifically change the `elif
hasattr(torch.backends, "mps") and torch.backends.mps.is_available():` and `elif
backend_type == "mlx":` to `if ...` while keeping the same return values ("MPS
(Apple Silicon)", "Metal (Apple Silicon via MLX)") and the final `return "None
(CPU only)"`.

97-98: Consider migrating to lifespan context manager.

@app.on_event("startup") and @app.on_event("shutdown") are deprecated in FastAPI. The recommended approach is the lifespan context manager. This works for now but should be updated in the future.

Also applies to: 149-150

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app.py` around lines 97 - 98, The code uses the deprecated
`@app.on_event("startup")` / `@app.on_event("shutdown")` decorators (e.g.,
startup_event and the corresponding shutdown handler); replace them with an
async lifespan context manager (e.g., async def lifespan(app): ... yield ...)
that runs the startup logic previously in startup_event before the yield and
the teardown logic previously in the shutdown handler after it, and pass that
lifespan callable into the FastAPI constructor (app = FastAPI(lifespan=lifespan))
so all startup/shutdown activities run inside the lifespan context.
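The lifespan pattern can be sketched without a running server; the handler bodies here are placeholders for the real startup/shutdown logic, and the FastAPI wiring is shown only as a comment:

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []


@asynccontextmanager
async def lifespan(app):  # FastAPI passes the app instance here
    events.append("startup")  # body of the old @app.on_event("startup") handler
    yield
    events.append("shutdown")  # body of the old @app.on_event("shutdown") handler


# With FastAPI installed this wires in as: app = FastAPI(lifespan=lifespan)

async def _demo() -> None:
    async with lifespan(None):  # FastAPI drives this around the server's lifetime
        events.append("serving")


asyncio.run(_demo())
```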
backend/backends/luxtts_backend.py (2)

130-131: Type hints removed from combine_voice_prompts.

The method signature lost its type annotations (audio_paths: List[str], reference_texts: List[str]) compared to the original. Consider preserving them for IDE support and static analysis.

♻️ Proposed fix
-    async def combine_voice_prompts(self, audio_paths, reference_texts):
+    async def combine_voice_prompts(
+        self, audio_paths: list[str], reference_texts: list[str]
+    ) -> tuple[np.ndarray, str]:
         return await _combine_voice_prompts(audio_paths, reference_texts, sample_rate=24000)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/luxtts_backend.py` around lines 130 - 131, The
combine_voice_prompts method lost its parameter type annotations; restore the
original signature to include audio_paths: List[str] and reference_texts:
List[str] (and add an appropriate return type annotation if present previously)
so IDEs and static analysis can infer types; update the async def
combine_voice_prompts declaration that delegates to _combine_voice_prompts to
use those annotations and ensure List is imported from typing if not already.

10-10: Replace typing.Tuple with built-in tuple type hint.

The project requires Python 3.12+, which supports PEP 585 syntax. Use tuple directly instead of importing Tuple from typing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/luxtts_backend.py` at line 10, Remove Tuple from the typing
import and switch any Tuple[...] annotations to the built-in tuple[...] PEP 585
syntax: update the import line "from typing import Optional, Tuple" to only
import Optional (e.g., "from typing import Optional") and replace all uses of
the symbol "Tuple" in type hints with "tuple" (for example change "Tuple[int,
str]" to "tuple[int, str]") within luxtts_backend.py.
backend/database/models.py (1)

7-9: Switch the base import to the SQLAlchemy 2.x API.

sqlalchemy.ext.declarative.declarative_base() is the legacy import path in 2.x. Using sqlalchemy.orm.declarative_base() keeps the new database package off the deprecation path.

Suggested fix
-from sqlalchemy.ext.declarative import declarative_base
+from sqlalchemy.orm import declarative_base
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/database/models.py` around lines 7 - 9, Replace the legacy
declarative_base import with the SQLAlchemy 2.x API: import declarative_base
from sqlalchemy.orm and keep the existing Base = declarative_base() assignment
(i.e., update the import that currently references sqlalchemy.ext.declarative to
sqlalchemy.orm.declarative_base to avoid deprecation warnings for the Base
symbol).
backend/routes/profiles.py (2)

305-306: datetime.utcnow() is deprecated in Python 3.12+.

Consider using datetime.now(timezone.utc) (with timezone imported from datetime) for timezone-aware UTC timestamps.

♻️ Proposed fix
+from datetime import datetime, timezone
 ...
-    profile.updated_at = datetime.utcnow()
+    profile.updated_at = datetime.now(timezone.utc)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/profiles.py` around lines 305 - 306, Replace the deprecated
call to datetime.utcnow() used when updating profile.updated_at with a
timezone-aware UTC timestamp: use datetime.now(timezone.utc) (ensure timezone
is imported from datetime) so profile.updated_at is set to a tz-aware datetime
before calling db.commit(); update any related imports or tests that assume
naive datetimes.

27-32: Redundant exception handlers with identical behavior.

Both ValueError and generic Exception handlers return 400. Either combine them or use different status codes.

♻️ Proposed fix
 async def create_profile(
     data: models.VoiceProfileCreate,
     db: Session = Depends(get_db),
 ):
     """Create a new voice profile."""
     try:
         return await profiles.create_profile(data, db)
     except ValueError as e:
-        raise HTTPException(status_code=400, detail=str(e))
-    except Exception as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise HTTPException(status_code=400, detail=str(e)) from e
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e)) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/profiles.py` around lines 27 - 32, The current try/except
around await profiles.create_profile(data, db) has redundant handlers returning
400 for both ValueError and Exception; simplify by keeping a specific ValueError
handler that raises HTTPException(status_code=400, detail=str(e)) and change the
generic except Exception handler to raise HTTPException(status_code=500,
detail="Internal server error") (or include str(e) if you want the message) so
unexpected errors map to 500; update the except blocks in the try/except that
surrounds profiles.create_profile to reflect this change.
backend/models.py (1)

149-152: TranscriptionRequest language pattern is very restrictive.

The transcription endpoint only accepts en or zh, but Whisper supports many more languages. Consider aligning with the broader language set or removing the pattern restriction to allow any language hint.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/models.py` around lines 149 - 152, The TranscriptionRequest model
currently restricts language to only "en" or "zh" via Field(...,
pattern="^(en|zh)$"); update the TranscriptionRequest.language declaration to
either remove the pattern constraint or expand it to the broader set of
Whisper-supported language codes so callers can pass any valid language hint;
locate the language: Optional[str] field on the TranscriptionRequest class and
modify its Field(...) invocation accordingly, ensuring any validators or docs
consuming that pattern are updated or removed to match the new, less restrictive
behavior.
backend/routes/effects.py (1)

101-104: Add exception chaining for better tracebacks.

When re-raising exceptions as HTTPException, chain them with from e to preserve the original traceback for debugging.

♻️ Proposed fix
     try:
         return effects_mod.create_preset(data, db)
     except ValueError as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise HTTPException(status_code=400, detail=str(e)) from e

Apply the same pattern to lines 122 and 135.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/effects.py` around lines 101 - 104, The except blocks that
catch ValueError and re-raise HTTPException (e.g., the try/except around
effects_mod.create_preset in backend/routes/effects.py) should chain the
original exception by raising HTTPException(... ) from e so the original
traceback is preserved; update the raise statements in that handler and the
analogous handlers (the ones around the other effects_mod calls at the locations
referenced, e.g., the blocks near the other two try/excepts) to use "from e"
when re-raising.
backend/routes/stories.py (1)

29-32: Broad exception handling may mask unexpected errors.

Catching all Exception types and returning 400 could hide unexpected server errors (e.g., database connection issues) that should return 500.

♻️ Proposed fix with exception chaining
 async def create_story(
     data: models.StoryCreate,
     db: Session = Depends(get_db),
 ):
     """Create a new story."""
     try:
         return await stories.create_story(data, db)
+    except ValueError as e:
+        raise HTTPException(status_code=400, detail=str(e)) from e
     except Exception as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/stories.py` around lines 29 - 32, The current broad except
block around stories.create_story(data, db) catches all Exception and returns
HTTP 400; change it to catch only expected, domain-specific exceptions (e.g.,
validation errors or known stories.create_story exceptions) and raise
HTTPException(status_code=400, detail=...) for those, and for anything else
re-raise or convert to HTTPException(status_code=500, detail="Internal server
error") while preserving exception chaining (use "from e") so stack traces are
retained; update the except clause(s) around stories.create_story to reference
the specific exception classes you expect and add a fallback that produces a 500
with exception chaining.
backend/backends/__init__.py (1)

360-372: Accessing private _is_model_cached method on backend instances.

The ensure_model_cached_or_raise function accesses backend._is_model_cached() which is a private method (prefixed with _). This creates tight coupling and could break if backend implementations change.

Consider either making this method part of the public TTSBackend protocol or adding a public wrapper.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/__init__.py` around lines 360 - 372, The code currently
calls the private backend._is_model_cached() inside ensure_model_cached_or_raise
which couples callers to a private API; add a public method on the TTSBackend
protocol (e.g., is_model_cached(self, model_size: Optional[str] = None) -> bool)
or a public wrapper on the backend implementations, implement that method across
concrete backends, and then replace all usages of backend._is_model_cached(...)
in ensure_model_cached_or_raise with the new public backend.is_model_cached(...)
(preserving the model_size argument for the "qwen" branch and none for others)
so callers rely only on the public contract.
backend/backends/base.py (2)

60-65: Consider simplifying loop with all() comprehension.

Per static analysis, the loop can be simplified.

♻️ Simplified version
         if required_files:
-            # Check that every required filename exists somewhere in snapshots
-            for fname in required_files:
-                if not any(snapshots_dir.rglob(fname)):
-                    return False
-            return True
+            return all(any(snapshots_dir.rglob(fname)) for fname in required_files)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/base.py` around lines 60 - 65, The loop that checks every
filename in required_files can be simplified to a single all(...) expression:
replace the for-loop that uses if not any(snapshots_dir.rglob(fname)): return
False followed by return True with a single return
all(any(snapshots_dir.rglob(fname)) for fname in required_files) so it preserves
the same semantics while being concise; look for the block referencing
required_files and snapshots_dir.rglob in the function in base.py and swap the
loop for that all(...) comprehension.

195-221: Context manager enter/exit calls are manually managed outside try block.

The tracker_context.__enter__() on line 196 is called before the try block begins. If an exception occurs between lines 196 and 208 (e.g., in task_manager.start_download), the __exit__ in the finally block will still be called, but the exception won't be properly propagated through the context manager's exception handling.

Consider using a nested with statement or ensuring the entire yield is wrapped properly.

♻️ Safer pattern using nested with
-    tracker_context = tracker.patch_download()
-    tracker_context.__enter__()
-
-    if not is_cached:
-        task_manager.start_download(model_name)
-        progress_manager.update_progress(...)
-
-    try:
-        yield tracker_context
-    except Exception as e:
-        ...
-    finally:
-        tracker_context.__exit__(None, None, None)
+    with tracker.patch_download() as tracker_context:
+        if not is_cached:
+            task_manager.start_download(model_name)
+            progress_manager.update_progress(...)
+
+        try:
+            yield tracker_context
+        except Exception as e:
+            progress_manager.mark_error(model_name, str(e))
+            task_manager.error_download(model_name, str(e))
+            raise
+        else:
+            if not is_cached:
+                progress_manager.mark_complete(model_name)
+                task_manager.complete_download(model_name)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/base.py` around lines 195 - 221, The manual call to
tracker_context.__enter__ is outside the try so exceptions during setup (e.g.,
task_manager.start_download or progress_manager.update_progress) bypass the
context manager's __exit__; replace the manual enter/exit with a proper with
block (use "with tracker.patch_download() as tracker_context:") and move the
try/except/else/finally logic so the yield occurs inside that with-block; ensure
task_manager.start_download and progress_manager.update_progress remain before
the yield but inside the with so any exception triggers tracker_context.__exit__
and the existing error reporting (progress_manager.mark_error /
task_manager.error_download) still runs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 826cea17-3abb-4d17-b132-56fc1514395c

📥 Commits

Reviewing files that changed from the base of the PR and between 82cd4bf and 8efcc95.

⛔ Files ignored due to path filters (2)
  • tauri/src-tauri/Cargo.lock is excluded by !**/*.lock
  • tauri/src-tauri/gen/Assets.car is excluded by !**/gen/**
📒 Files selected for processing (68)
  • .gitignore
  • CHANGELOG.md
  • Makefile
  • PATCH_NOTES.md
  • app/src/components/Generation/EngineModelSelector.tsx
  • app/src/components/Generation/FloatingGenerateBox.tsx
  • app/src/components/Generation/GenerationForm.tsx
  • app/src/components/Sidebar.tsx
  • backend/README.md
  • backend/STYLE_GUIDE.md
  • backend/app.py
  • backend/backends/__init__.py
  • backend/backends/base.py
  • backend/backends/chatterbox_backend.py
  • backend/backends/chatterbox_turbo_backend.py
  • backend/backends/luxtts_backend.py
  • backend/backends/mlx_backend.py
  • backend/backends/pytorch_backend.py
  • backend/build_binary.py
  • backend/database.py
  • backend/database/__init__.py
  • backend/database/migrations.py
  • backend/database/models.py
  • backend/database/seed.py
  • backend/database/session.py
  • backend/example_usage.py
  • backend/main.py
  • backend/migrate_add_instruct.py
  • backend/models.py
  • backend/pyproject.toml
  • backend/routes/__init__.py
  • backend/routes/audio.py
  • backend/routes/channels.py
  • backend/routes/cuda.py
  • backend/routes/effects.py
  • backend/routes/generations.py
  • backend/routes/health.py
  • backend/routes/history.py
  • backend/routes/models.py
  • backend/routes/profiles.py
  • backend/routes/stories.py
  • backend/routes/tasks.py
  • backend/routes/transcription.py
  • backend/services/__init__.py
  • backend/services/channels.py
  • backend/services/cuda.py
  • backend/services/effects.py
  • backend/services/export_import.py
  • backend/services/generation.py
  • backend/services/history.py
  • backend/services/profiles.py
  • backend/services/stories.py
  • backend/services/task_queue.py
  • backend/services/transcribe.py
  • backend/services/tts.py
  • backend/services/versions.py
  • backend/studio.py
  • backend/tests/test_generation_download.py
  • backend/tests/test_profile_duplicate_names.py
  • backend/utils/chunked_tts.py
  • backend/utils/effects.py
  • backend/utils/platform_detect.py
  • backend/utils/validation.py
  • docs/overview/troubleshooting.mdx
  • docs/plans/ADDING_TTS_ENGINES.md
  • docs/plans/PROJECT_STATUS.md
  • justfile
  • tauri/src-tauri/src/main.rs
💤 Files with no reviewable changes (6)
  • backend/database.py
  • backend/migrate_add_instruct.py
  • backend/utils/validation.py
  • Makefile
  • backend/example_usage.py
  • backend/studio.py

Comment on lines +99 to +111
class GenerationVersion(Base):
"""A version of a generation's audio (original, processed, alternate takes)."""

__tablename__ = "generation_versions"

id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
generation_id = Column(String, ForeignKey("generations.id"), nullable=False)
label = Column(String, nullable=False)
audio_path = Column(String, nullable=False)
effects_chain = Column(Text, nullable=True)
source_version_id = Column(String, ForeignKey("generation_versions.id"), nullable=True)
is_default = Column(Boolean, default=False)
created_at = Column(DateTime, default=datetime.utcnow)

⚠️ Potential issue | 🟠 Major

Enforce one default GenerationVersion per generation.

is_default is only a boolean right now, so concurrent regenerate/set-default flows can persist multiple defaults for the same generation_id. Any code that later loads “the default version” becomes nondeterministic once that happens.

Suggested fix
-from sqlalchemy import Column, String, Integer, Float, DateTime, Text, ForeignKey, Boolean
+from sqlalchemy import Column, String, Integer, Float, DateTime, Text, ForeignKey, Boolean, Index, text
@@
 class GenerationVersion(Base):
     """A version of a generation's audio (original, processed, alternate takes)."""
 
     __tablename__ = "generation_versions"
+    __table_args__ = (
+        Index(
+            "uq_generation_versions_default",
+            "generation_id",
+            unique=True,
+            sqlite_where=text("is_default = 1"),
+        ),
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/database/models.py` around lines 99 - 111, Add a DB-enforced
uniqueness so only one GenerationVersion per generation can have
is_default=True: in the GenerationVersion model add a unique partial
index/constraint on generation_id where is_default is true (e.g.,
Index("uq_generation_default", GenerationVersion.generation_id, unique=True,
postgresql_where=(GenerationVersion.is_default==True)) or UniqueConstraint with
postgresql_where) and add a migration that creates this partial index; also
ensure the migration resolves any existing multiple defaults (pick one or clear
extras) so the constraint can be applied cleanly.
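The dedupe-then-index migration hinted at above can be sketched in plain SQLite (column names are assumed from the model shown; "keep the max id" is an arbitrary tie-break — a real migration might prefer the newest `created_at`):

```python
import sqlite3

# Sketch of the migration: clear extra defaults, then add the partial unique index.
MIGRATION = """
UPDATE generation_versions
SET is_default = 0
WHERE is_default = 1
  AND id NOT IN (
    SELECT MAX(id) FROM generation_versions
    WHERE is_default = 1
    GROUP BY generation_id
  );
CREATE UNIQUE INDEX uq_generation_versions_default
  ON generation_versions (generation_id)
  WHERE is_default = 1;
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE generation_versions (id TEXT PRIMARY KEY, generation_id TEXT, is_default INTEGER)"
)
conn.executemany(
    "INSERT INTO generation_versions VALUES (?, ?, ?)",
    [("v1", "g1", 1), ("v2", "g1", 1), ("v3", "g2", 1)],  # g1 has two defaults
)
conn.executescript(MIGRATION)
```

After this runs, any attempt to mark a second version of `g1` as default raises `sqlite3.IntegrityError`, which is the DB-level guarantee the review asks for.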

Backend:
- Generation service reports 'loading_model' status only when model
  is not yet in memory, then 'generating' once inference starts
- Migrate hf_offline_patch.py from print() to logging module
- Update ADDING_TTS_ENGINES.md for post-refactor file paths

Frontend:
- HistoryTable shows 'Loading model...' vs 'Generating...' based on step
- FloatingGenerateBox: replace instruct toggle + inline effects editor
  with an effects preset dropdown (third dropdown after language and engine)
- Instruct UI removed for now (form field preserved for future models)
- Remove focus ring from Select component globally
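The two-phase status reporting in the backend notes above can be sketched as follows (a minimal illustration — `FakeModel`, `StatusLog`, and the method names are stand-ins, not the actual service API):

```python
class FakeModel:
    """Stand-in for a TTS backend that caches loaded model sizes."""

    def __init__(self):
        self.loaded = set()

    def is_loaded(self, size):
        return size in self.loaded

    def load(self, size):
        self.loaded.add(size)


class StatusLog:
    """Stand-in for the progress reporter; records status transitions."""

    def __init__(self):
        self.events = []

    def set_status(self, generation_id, status):
        self.events.append((generation_id, status))


def run_generation(generation_id, model, model_size, progress):
    # Report 'loading_model' only when the model is not yet in memory,
    # then switch to 'generating' once inference actually starts.
    if not model.is_loaded(model_size):
        progress.set_status(generation_id, "loading_model")
        model.load(model_size)
    progress.set_status(generation_id, "generating")
```

A cold run emits both statuses; a warm run emits only `generating` — which is what lets the frontend distinguish "Loading model..." from "Generating...".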
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (5)
docs/plans/ADDING_TTS_ENGINES.md (2)

9-9: Use hyphenation for the compound adjective (real-time).

Line 9 should use “real-time sink” (not “real time sink”) for correct technical prose.

Proposed wording tweak
-Adding an engine touches ~10 files across 4 layers. The backend protocol work is straightforward — the real time sink is dependency hell, upstream library bugs, and PyInstaller bundling.
+Adding an engine touches ~10 files across 4 layers. The backend protocol work is straightforward — the real-time sink is dependency hell, upstream library bugs, and PyInstaller bundling.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/ADDING_TTS_ENGINES.md` at line 9, Replace the unhyphenated phrase
"real time sink" with the compound-adjective form "real-time sink" in the
document (update the instance of "real time sink" found in the
ADDING_TTS_ENGINES.md content to "real-time sink").

194-214: Add language identifiers to fenced code blocks to satisfy MD040.

The fences starting at Line 194, Line 206, and Line 212 are missing a language tag.

Lint-compliant markdown fences
-```
+```bash
 # In justfile (NOT requirements.txt):
 pip install --no-deps chatterbox-tts
 
 # In requirements.txt — list the transitive deps:
 conformer
 diffusers
 omegaconf
 # ... etc

@@
-```
+```text
 linacodec @ git+https://github.com/user/repo.git
 Zipvoice @ git+https://github.com/user/repo.git
@@
-```
+```text
 --find-links https://k2-fsa.github.io/icefall/piper_phonemize.html

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/ADDING_TTS_ENGINES.md` around lines 194 - 214, The markdown fenced
code blocks in docs/plans/ADDING_TTS_ENGINES.md that contain "pip install
--no-deps chatterbox-tts", the block listing "linacodec @
git+https://github.com/user/repo.git" / "Zipvoice @
git+https://github.com/user/repo.git", and the block with "--find-links
https://k2-fsa.github.io/icefall/piper_phonemize.html" are missing language
identifiers; update the first fence to use a shell/bash tag (e.g. `bash`) and
the latter two to use a neutral text tag (e.g. `text`) so the markdown linter
MD040 is satisfied and syntax highlighting is stable.
app/src/components/Generation/FloatingGenerateBox.tsx (1)

52-56: Differentiate preset-loading failure from “no effects” state.

Right now, if preset fetch fails, users only see the default option and can’t tell whether there are truly no presets or the request failed.

Suggested refactor
- const { data: effectPresets } = useQuery({
+ const {
+   data: effectPresets,
+   isLoading: isEffectPresetsLoading,
+   isError: isEffectPresetsError,
+ } = useQuery({
    queryKey: ['effectPresets'],
    queryFn: () => apiClient.listEffectPresets(),
  });

- <Select
+ <Select
    value={selectedPresetId || 'none'}
    onValueChange={(value) =>
      setSelectedPresetId(value === 'none' ? null : value)
    }
+   disabled={isEffectPresetsLoading || isEffectPresetsError}
  >
    <SelectTrigger className="h-8 text-xs bg-card border-border rounded-full hover:bg-background/50 transition-all">
-     <SelectValue placeholder="No effects" />
+     <SelectValue
+       placeholder={
+         isEffectPresetsLoading
+           ? 'Loading effects...'
+           : isEffectPresetsError
+             ? 'Effects unavailable'
+             : 'No effects'
+       }
+     />
    </SelectTrigger>

Also applies to: 366-383

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/Generation/FloatingGenerateBox.tsx` around lines 52 - 56,
The effect presets fetch currently only uses the returned data (effectPresets)
so a failed request looks identical to an empty list; update the useQuery call
in FloatingGenerateBox to destructure isLoading and isError (e.g. const { data:
effectPresets, isLoading, isError } = useQuery(...)) and then change the
dropdown rendering logic to (1) show a loading state when isLoading, (2) show a
distinct disabled option like "Failed to load presets" or surface an inline
error/toast when isError, and (3) retain the existing "No presets" / default
option only when isError is false and effectPresets is an empty array; apply the
same change to the similar block referenced around the 366-383 area so both
places differentiate network errors from an empty result.
backend/services/generation.py (2)

95-119: Add an else clause for defensive completeness.

If an invalid mode somehow bypasses type checking at runtime, `final_path` would be undefined and cause a `NameError` at line 125. Adding an explicit else branch improves robustness.

🛡️ Proposed defensive fix
         elif mode == "regenerate":
             final_path = _save_regenerate(
                 generation_id=generation_id,
                 version_id=version_id,
                 audio=audio,
                 sample_rate=sample_rate,
                 save_audio=save_audio,
                 db=bg_db,
             )
+        else:
+            raise ValueError(f"Unknown generation mode: {mode}")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/generation.py` around lines 95 - 119, The code paths for
mode ("generate"/"retry"/"regenerate") assign final_path via _save_generate,
_save_retry, or _save_regenerate but lack a fallback, so an unexpected mode can
leave final_path undefined; add an explicit else branch after the existing elifs
that handles invalid mode (e.g., log/processLogger.error or raise a
ValueError/RuntimeError including the invalid mode) to fail fast and avoid a
NameError later, referencing the local variable final_path and the mode variable
to make the error clear.

88-91: Clarify: `regenerate` always normalizes regardless of the `normalize` parameter.

The condition `if normalize or mode == "regenerate"` means regenerate mode ignores the caller's `normalize=False`. If this is intentional, consider documenting it in the docstring. If not, the condition should respect the parameter for all modes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/generation.py` around lines 88 - 91, The current check "if
normalize or mode == 'regenerate'" forces normalization in regenerate mode even
when normalize=False; update the logic so normalization respects the normalize
parameter for all modes (e.g., only call normalize_audio(audio) when normalize
is True), or if the intent is to always normalize on regenerate, make that
explicit in the function's docstring by documenting that mode=='regenerate'
overrides normalize; reference the variables and functions involved: mode,
normalize, and normalize_audio to locate and change the condition or add the
docstring note.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @app/src/components/History/HistoryTable.tsx:

  • Around line 397-399: The aria-label mismatch comes from treating loading_model
    the same as generating; update the status booleans so they distinguish
    loading_model: replace the single isInProgress/isGenerating assignment with
    separate flags (e.g., isLoadingModel = gen.status === 'loading_model';
    isGenerating = gen.status === 'generating'; isFailed = gen.status === 'failed')
    and update the row aria-label and any status text usage (references around the
    current isInProgress/isGenerating logic and the similar block at the later
    420-423 region) to announce "Loading model..." when isLoadingModel is true,
    "Generating..." when isGenerating is true, and fallback to the existing labels
    otherwise.

In @app/src/components/ui/select.tsx:

  • Around line 18-21: The Select component's className string (the cn call that
    builds classes for the select root) removes visible focus styling by keeping
    focus:outline-none while dropping the focus ring; restore an accessible focus
    indicator by adding back a visible focus style (e.g. add focus:ring-2
    focus:ring-ring focus:ring-offset-2) or, if design requires no ring, replace
    with an alternative such as focus:border-primary or focus:bg-muted to ensure
    keyboard focus is visually apparent; update the class list in the cn(...) for
    the select root (the className passed into the Select component) accordingly and
    remove focus:outline-none if it prevents the chosen focus style from showing.

In @backend/services/generation.py:

  • Around line 232-242: The take-N label generation is racy: calls to
    versions_mod.list_versions followed by versions_mod.create_version can produce
    duplicate labels under concurrency; add a database-level UNIQUE constraint on
    (generation_id, label) and update versions_mod.create_version to handle
    unique-constraint violations by retrying label creation atomically (e.g., catch
    the DB IntegrityError/UniqueViolation when inserting a duplicate label, re-fetch
    the current count or use an atomic counter/separate per-generation sequence,
    regenerate label like "take-{N}" and retry a few times), or implement a SELECT
    ... FOR UPDATE increment on a counter column on the generation row to derive the
    next take number; reference the functions/variables versions_mod.list_versions,
    versions_mod.create_version, generation_id, and label when making these changes.

Nitpick comments:
In @app/src/components/Generation/FloatingGenerateBox.tsx:

  • Around line 52-56: The effect presets fetch currently only uses the returned
    data (effectPresets) so a failed request looks identical to an empty list;
    update the useQuery call in FloatingGenerateBox to destructure isLoading and
    isError (e.g. const { data: effectPresets, isLoading, isError } = useQuery(...))
    and then change the dropdown rendering logic to (1) show a loading state when
    isLoading, (2) show a distinct disabled option like "Failed to load presets" or
    surface an inline error/toast when isError, and (3) retain the existing "No
    presets" / default option only when isError is false and effectPresets is an
    empty array; apply the same change to the similar block referenced around the
    366-383 area so both places differentiate network errors from an empty result.

In @backend/services/generation.py:

  • Around line 95-119: The code paths for mode ("generate"/"retry"/"regenerate")
    assign final_path via _save_generate, _save_retry, or _save_regenerate but lack
    a fallback, so an unexpected mode can leave final_path undefined; add an
    explicit else branch after the existing elifs that handles invalid mode (e.g.,
    log/processLogger.error or raise a ValueError/RuntimeError including the invalid
    mode) to fail fast and avoid a NameError later, referencing the local variable
    final_path and the mode variable to make the error clear.
  • Around line 88-91: The current check "if normalize or mode == 'regenerate'"
    forces normalization in regenerate mode even when normalize=False; update the
    logic so normalization respects the normalize parameter for all modes (e.g.,
    only call normalize_audio(audio) when normalize is True), or if the intent is to
    always normalize on regenerate, make that explicit in the function's docstring
    by documenting that mode=='regenerate' overrides normalize; reference the
    variables and functions involved: mode, normalize, and normalize_audio to locate
    and change the condition or add the docstring note.

In @docs/plans/ADDING_TTS_ENGINES.md:

  • Line 9: Replace the unhyphenated phrase "real time sink" with the
    compound-adjective form "real-time sink" in the document (update the instance of
    "real time sink" found in the ADDING_TTS_ENGINES.md content to "real-time
    sink").
  • Around line 194-214: The markdown fenced code blocks in
    docs/plans/ADDING_TTS_ENGINES.md that contain "pip install --no-deps
    chatterbox-tts", the block listing "linacodec @
    git+https://github.com/user/repo.git" / "Zipvoice @
    git+https://github.com/user/repo.git", and the block with "--find-links
    https://k2-fsa.github.io/icefall/piper_phonemize.html" are missing language
    identifiers; update the first fence to use a shell/bash tag (e.g. `bash`) and
    the latter two to use a neutral text tag (e.g. `text`) so the markdown linter
    MD040 is satisfied and syntax highlighting is stable.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8bb3d2e5-6b6a-46d4-ac19-3fff919e9b74

📥 Commits

Reviewing files that changed from the base of the PR and between 8efcc956067db5657f47ab50ecc041259d8eb308 and 798cd40f059eb87d613e8cbbbb33696013643c73.

📒 Files selected for processing (13)
  • app/src/components/Generation/FloatingGenerateBox.tsx
  • app/src/components/History/HistoryTable.tsx
  • app/src/components/ui/select.tsx
  • app/src/lib/api/types.ts
  • app/src/lib/hooks/useGenerationProgress.ts
  • backend/services/generation.py
  • backend/utils/hf_offline_patch.py
  • docs/PR-ACCESSIBILITY.md
  • docs/plans/ADDING_TTS_ENGINES.md
  • docs/plans/EXTERNAL_PROVIDERS.md
  • docs/plans/MLX_AUDIO.md
  • docs/plans/PR33_CUDA_PROVIDER_REVIEW.md
  • docs/plans/TTS_PROVIDER_ARCHITECTURE.md
💤 Files with no reviewable changes (5)
  • docs/PR-ACCESSIBILITY.md
  • docs/plans/EXTERNAL_PROVIDERS.md
  • docs/plans/MLX_AUDIO.md
  • docs/plans/PR33_CUDA_PROVIDER_REVIEW.md
  • docs/plans/TTS_PROVIDER_ARCHITECTURE.md

Comment on lines +397 to 399
const isInProgress = gen.status === 'loading_model' || gen.status === 'generating';
const isGenerating = isInProgress;
const isFailed = gen.status === 'failed';

⚠️ Potential issue | 🟡 Minor

Align ARIA status text with loading_model state.

When gen.status === 'loading_model', the visible text says “Loading model...” but the row aria-label still announces generating. This is an accessibility mismatch.

Suggested fix
- isGenerating
-   ? `Generating speech for ${gen.profile_name}...`
+ gen.status === 'loading_model'
+   ? `Loading model for ${gen.profile_name}...`
+   : isGenerating
+     ? `Generating speech for ${gen.profile_name}...`
      : isFailed
        ? `Generation failed for ${gen.profile_name}`

Also applies to: 420-423

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/History/HistoryTable.tsx` around lines 397 - 399, The
aria-label mismatch comes from treating loading_model the same as generating;
update the status booleans so they distinguish loading_model: replace the single
isInProgress/isGenerating assignment with separate flags (e.g., isLoadingModel =
gen.status === 'loading_model'; isGenerating = gen.status === 'generating';
isFailed = gen.status === 'failed') and update the row aria-label and any status
text usage (references around the current isInProgress/isGenerating logic and
the similar block at the later 420-423 region) to announce "Loading model..."
when isLoadingModel is true, "Generating..." when isGenerating is true, and
fallback to the existing labels otherwise.

Critical:
- Remove dead backend.utils.validation PyInstaller hidden import
- Fix story_items table rebuild to preserve track/trim/version columns
- Guard cache migration against same source/destination path
- Fix regeneration audio overwrite (use random uuid suffix per take)

Major:
- Engine selector: validate language on Qwen switch, clear stale modelSize
- Sync language validation regex between profile create and generate (22 langs)
- Guard CUDA download against duplicate concurrent requests
- Only set model_size for engines that support multiple sizes
- Fix 404 swallowed by generic except in history export
- Validate audio_path before FileResponse in export-audio
- Transcription: stream uploads in 1MB chunks, use robust cache check,
  call complete_download() on Whisper download success
- Set clean version as default when effects chain validation fails
- Return explicit error when Windows port occupied by non-voicebox process
- Use DB COUNT query instead of list length for take-N label to avoid
  TOCTOU race between list_versions and create_version
- Add focus:bg-muted to SelectTrigger for keyboard focus visibility
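The COUNT-based take-label fix listed above can be illustrated with plain sqlite3 (the real service presumably goes through SQLAlchemy; table and column names are assumed from the `GenerationVersion` model shown earlier):

```python
import sqlite3

def next_take_label(conn, generation_id):
    # A single COUNT query closes the TOCTOU window that a
    # len(list_versions(...)) approach leaves open between listing
    # existing versions and inserting the new one.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM generation_versions WHERE generation_id = ?",
        (generation_id,),
    ).fetchone()
    return f"take-{count + 1}"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE generation_versions (id TEXT, generation_id TEXT, label TEXT)"
)
conn.execute("INSERT INTO generation_versions VALUES ('v1', 'g1', 'take-1')")
conn.execute("INSERT INTO generation_versions VALUES ('v2', 'g1', 'take-2')")
print(next_take_label(conn, "g1"))  # take-3
```

A generation with no existing versions gets `take-1`, so the same helper covers both first takes and regenerations.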
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (3)
backend/routes/generations.py (3)

227-232: ⚠️ Potential issue | 🟠 Major

stream_speech also needs the engine model size guard.

Line 229 unconditionally defaults to "1.7B". This should check engine_has_model_sizes() like generate_speech does.

🐛 Proposed fix
+    from ..backends import engine_has_model_sizes
+
     engine = data.engine or "qwen"
     tts_model = get_tts_backend_for_engine(engine)
-    model_size = data.model_size or "1.7B"
+    model_size = (data.model_size or "1.7B") if engine_has_model_sizes(engine) else None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/generations.py` around lines 227 - 232, In stream_speech,
mirror generate_speech's model-size guard: instead of unconditionally defaulting
model_size to "1.7B", call engine_has_model_sizes(engine) and only set
data.model_size or default to "1.7B" when the engine supports sizes; otherwise
leave model_size None (or skip caching/loading). Update the stream_speech logic
that calls get_tts_backend_for_engine(engine),
ensure_model_cached_or_raise(engine, model_size), and load_engine_model(engine,
model_size) to pass model_size only when engine_has_model_sizes(engine) is true
so you don't attempt to cache/load a non-existent size.

165-167: ⚠️ Potential issue | 🟠 Major

Regenerate has the same model_size defaulting issue.

Apply the same engine_has_model_sizes() guard here as suggested for retry.

🐛 Proposed fix
+    from ..backends import engine_has_model_sizes
+
     enqueue_generation(
         run_generation(
             generation_id=generation_id,
             profile_id=gen.profile_id,
             text=gen.text,
             language=gen.language,
             engine=gen.engine or "qwen",
-            model_size=gen.model_size or "1.7B",
+            model_size=(gen.model_size or "1.7B") if engine_has_model_sizes(gen.engine or "qwen") else None,
             seed=gen.seed,
             instruct=gen.instruct,
             mode="regenerate",
             version_id=version_id,
         )
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/generations.py` around lines 165 - 167, The model_size default
is being set unconditionally for regenerate like in the retry path; update the
code that builds the regenerate payload (the lines using language=gen.language,
engine=gen.engine or "qwen", model_size=gen.model_size or "1.7B") to only
include or default model_size when engine_has_model_sizes(engine) returns
True—use the same pattern applied in the retry change: compute engine =
gen.engine or "qwen", then if engine_has_model_sizes(engine) set model_size =
gen.model_size or "1.7B", otherwise omit model_size or set it to None before
constructing the regenerate request.

122-124: ⚠️ Potential issue | 🟠 Major

Retry still injects 1.7B into non-Qwen engines.

Unlike generate_speech, the retry path unconditionally defaults model_size to "1.7B" without checking engine_has_model_sizes(). If a LuxTTS or Chatterbox generation failed, retrying it would pass an invalid model size.

🐛 Proposed fix
+    from ..backends import engine_has_model_sizes
+
     enqueue_generation(
         run_generation(
             generation_id=generation_id,
             profile_id=gen.profile_id,
             text=gen.text,
             language=gen.language,
             engine=gen.engine or "qwen",
-            model_size=gen.model_size or "1.7B",
+            model_size=(gen.model_size or "1.7B") if engine_has_model_sizes(gen.engine or "qwen") else None,
             seed=gen.seed,
             instruct=gen.instruct,
             mode="retry",
         )
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/generations.py` around lines 122 - 124, The retry path
unconditionally sets model_size to "1.7B" which is invalid for non-Qwen engines;
update the retry payload construction (the place creating the dict with
language=gen.language, engine=gen.engine or "qwen", model_size=gen.model_size or
"1.7B") to only default model_size to "1.7B" when the engine supports model
sizes (use engine_has_model_sizes(gen.engine) or check gen.engine == "qwen"),
otherwise leave model_size as None/omitted or use gen.model_size if provided;
ensure you reference and call engine_has_model_sizes(gen.engine) when deciding
the default.
🧹 Nitpick comments (7)
app/src/components/Generation/EngineModelSelector.tsx (2)

53-53: Remove unsafe double-cast when clearing modelSize.

modelSize is already optional in GenerationFormValues; casting undefined through unknown obscures type safety and readability.

Proposed simplification
-    form.setValue('modelSize', undefined as unknown as '1.7B' | '0.6B');
+    form.setValue('modelSize', undefined);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/Generation/EngineModelSelector.tsx` at line 53, The line
using a double-cast to clear modelSize should be replaced with a safe
assignment: remove the "as unknown as '1.7B' | '0.6B'" and call
form.setValue('modelSize', undefined) (or explicitly type as
GenerationFormValues['modelSize'] if needed) in EngineModelSelector.tsx so you
rely on the existing optional type of modelSize instead of masking types with an
unsafe double-cast.

25-30: Tighten description typing to engine union for exhaustiveness.

Using Record<string, string> allows missing keys silently. Prefer typing descriptions against the engine union so adding/removing engines is compile-time checked.

Type-safe mapping example
-const ENGINE_DESCRIPTIONS: Record<string, string> = {
+type Engine = NonNullable<GenerationFormValues['engine']>;
+
+const ENGINE_DESCRIPTIONS: Record<Engine, string> = {
   qwen: 'Multi-language, two sizes',
   luxtts: 'Fast, English-focused',
   chatterbox: '23 languages, incl. Hebrew',
   chatterbox_turbo: 'English, [laugh] [cough] tags',
 };

Also applies to: 101-103

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/Generation/EngineModelSelector.tsx` around lines 25 - 30,
The ENGINE_DESCRIPTIONS map is currently typed as Record<string,string>, which
won't catch missing engine keys; change its type to a mapped type over the
engine union (e.g. use the union type name used in this file such as EngineId or
Engine) so the compiler enforces exhaustiveness (for example: declare
ENGINE_DESCRIPTIONS as {[K in EngineId]: string} or Record<EngineId,string> and
then ensure all engine union members are present in the object); update the
declaration of ENGINE_DESCRIPTIONS accordingly and add any missing entries or
remove obsolete ones so the mapping matches the engine union exactly.
backend/routes/models.py (2)

59-72: Chain exceptions for better debugging.

When re-raising HTTPException from a caught exception, use raise ... from e to preserve the original traceback. This helps with debugging production issues.

♻️ Proposed fix
 @router.post("/models/load")
 async def load_model(model_size: str = "1.7B"):
     """Manually load TTS model."""
     from ..services import tts

     try:
         tts_model = tts.get_tts_model()
         await tts_model.load_model_async(model_size)
         return {"message": f"Model {model_size} loaded successfully"}
     except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e


 @router.post("/models/unload")
 async def unload_model():
     """Unload the default Qwen TTS model to free memory."""
     from ..services import tts

     try:
         tts.unload_tts_model()
         return {"message": "Model unloaded successfully"}
     except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/models.py` around lines 59 - 72, The exception handlers in
unload_model (and the earlier block that raises HTTPException) re-raise
HTTPException without chaining the original exception; update both raise
HTTPException(...) statements to use exception chaining (raise
HTTPException(status_code=500, detail=str(e)) from e) so the original traceback
is preserved when calling tts.unload_tts_model and any other code paths that
currently catch Exception and re-raise HTTPException.

429-433: Accessing private members of progress_manager breaks encapsulation.

Lines 430-432 directly access progress_manager._lock and progress_manager._progress. Consider adding a public remove_progress(model_name) method to the progress manager instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/models.py` around lines 429 - 433, Replace direct access to
progress_manager._lock and progress_manager._progress with a new public method
on the ProgressManager (e.g., remove_progress(model_name)) and call that from
this route. Implement ProgressManager.remove_progress(self, model_name) to
acquire its internal lock and delete the entry if present, returning a boolean
if removed; then in the route call
progress_manager.remove_progress(request.model_name) and use the boolean instead
of manipulating _lock/_progress directly.
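The proposed public method can be sketched as follows. This is a minimal illustration, assuming the manager guards a plain dict with a `threading.Lock`; the attribute names `_lock` and `_progress` come from the review comment, and `update_progress` is a hypothetical helper added so the demo never touches private members:

```python
import threading


class ProgressManager:
    """Illustrative progress tracker guarding its state behind a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._progress: dict[str, dict] = {}

    def update_progress(self, model_name: str, info: dict) -> None:
        """Record progress for a model under the lock."""
        with self._lock:
            self._progress[model_name] = info

    def remove_progress(self, model_name: str) -> bool:
        """Delete a model's entry if present; return True when one was removed."""
        with self._lock:
            return self._progress.pop(model_name, None) is not None


# The route can now drop an entry without reaching into private state:
manager = ProgressManager()
manager.update_progress("qwen-1.7B", {"percent": 42})
removed = manager.remove_progress("qwen-1.7B")        # True
removed_again = manager.remove_progress("qwen-1.7B")  # False
```

Returning a boolean lets the route distinguish "cleared" from "nothing to clear" without a second lookup.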
backend/routes/transcription.py (1)

71-72: Chain exception for better debugging.

Use raise ... from e to preserve the exception chain.

♻️ Proposed fix
     except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/transcription.py` around lines 71 - 72, Replace the bare
re-raise of HTTPException with an exception-chaining raise so the original
traceback is preserved: in the except Exception as e block where you currently
call raise HTTPException(status_code=500, detail=str(e)), change it to raise
HTTPException(status_code=500, detail=str(e)) from e so the original exception
(e) is chained into the HTTPException.
backend/services/generation.py (1)

232-248: Take label race condition is mitigated by the serial task queue.

The label assignment (take-{len(existing) + 1}) is technically racy, but per backend/services/task_queue.py, all generation tasks are processed by a single _generation_worker() that handles one coroutine at a time. This serialization prevents concurrent regenerations for the same generation_id from colliding.

However, if the queue architecture changes or parallel workers are added in the future, this could produce duplicate labels. Consider adding a brief comment noting this serialization dependency.

📝 Suggested documentation
 def _save_regenerate(
     *,
     generation_id: str,
     version_id: Optional[str],
     audio,
     sample_rate: int,
     save_audio,
     db,
 ) -> str:
     """Save regeneration output as a new version with auto-label.
 
     Returns the audio path.
+
+    Note: Label uniqueness relies on task_queue serialization; if parallel
+    workers are ever added, add a DB unique constraint on (generation_id, label).
     """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/generation.py` around lines 232 - 248, Add a brief comment
next to the label assignment in generation.py explaining that the
take-{len(existing)+1} computation relies on serialized processing of generation
tasks (see _generation_worker in backend/services/task_queue.py) to avoid races;
mention that if the task queue is changed to allow parallel workers this
labeling could produce duplicates and should be revisited (or replaced with an
atomic/DB-backed counter) — place the comment near the label variable alongside
the existing calls to versions_mod.list_versions and
versions_mod.create_version.
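If parallel workers are ever introduced, the suggested DB unique constraint closes the race at insert time. A stdlib-only sketch with sqlite3 — the table and column names here are illustrative, not the project's actual schema:

```python
import sqlite3

# In-memory schema mirroring a versions table with an auto-label per generation.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE generation_versions (
        id INTEGER PRIMARY KEY,
        generation_id TEXT NOT NULL,
        label TEXT NOT NULL,
        UNIQUE (generation_id, label)
    )
    """
)
conn.execute(
    "INSERT INTO generation_versions (generation_id, label) VALUES (?, ?)",
    ("gen-1", "take-1"),
)
try:
    # A parallel worker computing the same take-{n} label now fails loudly
    # instead of silently creating a duplicate row.
    conn.execute(
        "INSERT INTO generation_versions (generation_id, label) VALUES (?, ?)",
        ("gen-1", "take-1"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

On a constraint violation the worker could recompute the label and retry, turning a silent duplicate into a recoverable error.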
backend/routes/history.py (1)

60-63: Chain exceptions for better debugging.

Use raise ... from e to preserve the exception chain in error handlers.

♻️ Proposed fix
     try:
         result = await export_import.import_generation_from_zip(content, db)
         return result
     except ValueError as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise HTTPException(status_code=400, detail=str(e)) from e
     except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e

Also applies to lines 135-138.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/history.py` around lines 60 - 63, In the except handlers
inside backend/routes/history.py (the blocks catching "except ValueError as e"
and "except Exception as e" — including the similar handlers around lines
135-138), preserve the original exception chain by re-raising the HTTPException
using "raise HTTPException(status_code=..., detail=...) from e" instead of a
plain raise; update both the ValueError handler and the generic Exception
handler to use "from e" so the original traceback is chained for debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/database/migrations.py`:
- Around line 77-85: The loop that updates start_time_ms can raise TypeError
because duration may be None; in the loop that iterates over result.fetchall()
update the code around current_time_ms and duration to treat a NULL duration
safely (e.g., coerce duration to 0 or use a default like 0.0 before computing
int(duration * 1000)), so replace the direct int(duration * 1000) usage with a
safe conversion (e.g., safe_duration = duration or 0; increment =
int(safe_duration * 1000) + 200) and then use current_time_ms += increment; keep
the existing UPDATE on story_items (id=item_id) and current_story_id logic
intact.

In `@backend/routes/models.py`:
- Around line 126-133: The current same-directory check using source.resolve()
== destination.resolve() misses the case where destination is a subdirectory of
source; update the validation after computing source and destination to also
reject destinations that are inside the source (and optionally vice versa) by
checking if destination.resolve() is a child of source.resolve() (use
Path.is_relative_to(destination.resolve(), source.resolve()) when available, or
compare resolved strings with a trailing separator as a fallback) and raise
HTTPException(status_code=400, detail="Destination must not be inside the source
directory") when that condition is true.

In `@tauri/src-tauri/src/main.rs`:
- Around line 15-20: Update the docstring for the find_voicebox_on_port function
to accurately describe what the implementation does: state that it performs PID
lookup via PowerShell and verifies the process with tasklist on Windows, and
that it does NOT call TcpListener::bind (port occupancy should be checked by the
caller, e.g., with TcpStream::connect_timeout). Mention the Windows-only nature
and the replacement of the previous netstat approach for context, but remove the
incorrect claim about binding.
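The destination-inside-source check proposed for `backend/routes/models.py` can be sketched like this; `destination_inside_source` is a hypothetical helper name, and the fallback branch covers interpreters older than Python 3.9 where `Path.is_relative_to` is unavailable:

```python
import os
from pathlib import Path


def destination_inside_source(source: Path, destination: Path) -> bool:
    """True when destination resolves to source itself or a path under it."""
    src = source.resolve()
    dst = destination.resolve()
    try:
        # Path.is_relative_to is available on Python 3.9+ and also
        # returns True when dst == src, matching the same-directory check.
        return dst.is_relative_to(src)
    except AttributeError:
        # Fallback: compare with a trailing separator so /ab is not
        # mistaken for a child of /a.
        return str(dst) == str(src) or str(dst).startswith(str(src) + os.sep)
```

A route would call this after resolving both paths and raise `HTTPException(status_code=400, ...)` when it returns True.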

---

Duplicate comments:
In `@backend/routes/generations.py`:
- Around line 227-232: In stream_speech, mirror generate_speech's model-size
guard: instead of unconditionally defaulting model_size to "1.7B", call
engine_has_model_sizes(engine) and only set data.model_size or default to "1.7B"
when the engine supports sizes; otherwise leave model_size None (or skip
caching/loading). Update the stream_speech logic that calls
get_tts_backend_for_engine(engine), ensure_model_cached_or_raise(engine,
model_size), and load_engine_model(engine, model_size) to pass model_size only
when engine_has_model_sizes(engine) is true so you don't attempt to cache/load a
non-existent size.
- Around line 165-167: The model_size default is being set unconditionally for
regenerate like in the retry path; update the code that builds the regenerate
payload (the lines using language=gen.language, engine=gen.engine or "qwen",
model_size=gen.model_size or "1.7B") to only include or default model_size when
engine_has_model_sizes(engine) returns True—use the same pattern applied in the
retry change: compute engine = gen.engine or "qwen", then if
engine_has_model_sizes(engine) set model_size = gen.model_size or "1.7B",
otherwise omit model_size or set it to None before constructing the regenerate
request.
- Around line 122-124: The retry path unconditionally sets model_size to "1.7B"
which is invalid for non-Qwen engines; update the retry payload construction
(the place creating the dict with language=gen.language, engine=gen.engine or
"qwen", model_size=gen.model_size or "1.7B") to only default model_size to
"1.7B" when the engine supports model sizes (use
engine_has_model_sizes(gen.engine) or check gen.engine == "qwen"), otherwise
leave model_size as None/omitted or use gen.model_size if provided; ensure you
reference and call engine_has_model_sizes(gen.engine) when deciding the default.
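The three comments above apply the same guard. A sketch of the shared pattern — `engine_has_model_sizes` is the helper the review names, but its body and the `ENGINE_MODEL_SIZES` mapping here are assumptions for illustration:

```python
from typing import Optional

# Assumed mapping: only some engines expose selectable model sizes.
ENGINE_MODEL_SIZES = {"qwen": {"0.6B", "1.7B"}}


def engine_has_model_sizes(engine: str) -> bool:
    """True when the engine supports choosing between model sizes."""
    return bool(ENGINE_MODEL_SIZES.get(engine))


def resolve_model_size(engine: Optional[str], requested: Optional[str]) -> Optional[str]:
    """Default model_size only for engines that actually support sizes."""
    eng = engine or "qwen"
    if engine_has_model_sizes(eng):
        return requested or "1.7B"
    # Engines without sizes get None so callers skip caching/loading a size.
    return None
```

The retry, regenerate, and stream paths would all build their payloads through one such helper instead of hardcoding `or "1.7B"`.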

---

Nitpick comments:
In `@app/src/components/Generation/EngineModelSelector.tsx`:
- Line 53: The line using a double-cast to clear modelSize should be replaced
with a safe assignment: remove the "as unknown as '1.7B' | '0.6B'" and call
form.setValue('modelSize', undefined) (or explicitly type as
GenerationFormValues['modelSize'] if needed) in EngineModelSelector.tsx so you
rely on the existing optional type of modelSize instead of masking types with an
unsafe double-cast.
- Around line 25-30: The ENGINE_DESCRIPTIONS map is currently typed as
Record<string,string>, which won't catch missing engine keys; change its type to
a mapped type over the engine union (e.g. use the union type name used in this
file such as EngineId or Engine) so the compiler enforces exhaustiveness (for
example: declare ENGINE_DESCRIPTIONS as {[K in EngineId]: string} or
Record<EngineId,string> and then ensure all engine union members are present in
the object); update the declaration of ENGINE_DESCRIPTIONS accordingly and add
any missing entries or remove obsolete ones so the mapping matches the engine
union exactly.

In `@backend/routes/history.py`:
- Around line 60-63: In the except handlers inside backend/routes/history.py
(the blocks catching "except ValueError as e" and "except Exception as e" —
including the similar handlers around lines 135-138), preserve the original
exception chain by re-raising the HTTPException using "raise
HTTPException(status_code=..., detail=...) from e" instead of a plain raise;
update both the ValueError handler and the generic Exception handler to use
"from e" so the original traceback is chained for debugging.

In `@backend/routes/models.py`:
- Around line 59-72: The exception handlers in unload_model (and the earlier
block that raises HTTPException) re-raise HTTPException without chaining the
original exception; update both raise HTTPException(...) statements to use
exception chaining (raise HTTPException(status_code=500, detail=str(e)) from e)
so the original traceback is preserved when calling tts.unload_tts_model and any
other code paths that currently catch Exception and re-raise HTTPException.
- Around line 429-433: Replace direct access to progress_manager._lock and
progress_manager._progress with a new public method on the ProgressManager
(e.g., remove_progress(model_name)) and call that from this route. Implement
ProgressManager.remove_progress(self, model_name) to acquire its internal lock
and delete the entry if present, returning a boolean if removed; then in the
route call progress_manager.remove_progress(request.model_name) and use the
boolean instead of manipulating _lock/_progress directly.

In `@backend/routes/transcription.py`:
- Around line 71-72: Replace the bare re-raise of HTTPException with an
exception-chaining raise so the original traceback is preserved: in the except
Exception as e block where you currently call raise
HTTPException(status_code=500, detail=str(e)), change it to raise
HTTPException(status_code=500, detail=str(e)) from e so the original exception
(e) is chained into the HTTPException.

In `@backend/services/generation.py`:
- Around line 232-248: Add a brief comment next to the label assignment in
generation.py explaining that the take-{len(existing)+1} computation relies on
serialized processing of generation tasks (see _generation_worker in
backend/services/task_queue.py) to avoid races; mention that if the task queue
is changed to allow parallel workers this labeling could produce duplicates and
should be revisited (or replaced with an atomic/DB-backed counter) — place the
comment near the label variable alongside the existing calls to
versions_mod.list_versions and versions_mod.create_version.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bf071068-633d-4ff0-b2c9-f052b205964c

📥 Commits

Reviewing files that changed from the base of the PR and between 798cd40 and 0d0b62e.

📒 Files selected for processing (11)
  • app/src/components/Generation/EngineModelSelector.tsx
  • backend/build_binary.py
  • backend/database/migrations.py
  • backend/models.py
  • backend/routes/cuda.py
  • backend/routes/generations.py
  • backend/routes/history.py
  • backend/routes/models.py
  • backend/routes/transcription.py
  • backend/services/generation.py
  • tauri/src-tauri/src/main.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/routes/cuda.py

Replace verbose startup messages with a clean summary:
- App version, Python version, OS/arch
- Database path (fix None display), data directory
- Profile and generation counts
- Backend, GPU, model cache path
- Clean up stale loading_model status on startup
- Remove noisy progress manager log line

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (4)
backend/backends/mlx_backend.py (1)

58-61: Logger message is misleading when model is already cached.

_get_model_path is called by _is_model_cached, so logging "Will download MLX model from HuggingFace Hub" occurs even when just checking if the model is cached. This creates confusing log output.

Consider moving this log statement to _load_model_sync where the download actually happens, or rewording to be accurate for both contexts.

✨ Suggested fix
         hf_model_id = mlx_model_map[model_size]
-        logger.info("Will download MLX model from HuggingFace Hub: %s", hf_model_id)

         return hf_model_id

Then in _load_model_sync around line 109:

-                logger.info("Loading MLX TTS model %s...", model_size)
+                logger.info("Loading MLX TTS model %s from %s...", model_size, model_path)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/mlx_backend.py` around lines 58 - 61, The logger.info in
_get_model_path currently says "Will download MLX model..." even when
_get_model_path is being used by _is_model_cached; change the logging so it only
announces an actual download in _load_model_sync (where the download occurs) or
make the message neutral in _get_model_path. Concretely, remove or reword the
logger.info in _get_model_path (referencing hf_model_id) so it does not claim a
download, and add a new logger.info in _load_model_sync that logs "Downloading
MLX model from HuggingFace Hub: %s" using the same hf_model_id when the code
path actually performs the download.
backend/backends/pytorch_backend.py (1)

322-326: Prefix unused variable with underscore.

The sr variable from load_audio is unused. Prefix it with _ to clarify intent and silence the linter warning.

✨ Suggested fix
         def _transcribe_sync():
             """Run synchronous transcription in thread pool."""
             # Load audio
-            audio, sr = load_audio(audio_path, sample_rate=16000)
+            audio, _sr = load_audio(audio_path, sample_rate=16000)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/pytorch_backend.py` around lines 322 - 326, In
_transcribe_sync, the tuple returned by load_audio assigns the sample rate to sr
but it is unused; update the assignment to capture the unused value as _sr
(e.g., audio, _sr = load_audio(...)) to signal intent and silence the linter
warning—modify the assignment in the _transcribe_sync function where load_audio
is called.
backend/build_binary.py (2)

313-314: Redundant import statement.

When restore_cuda is True, the subprocess module has already been imported at line 280, so this second import is redundant.

♻️ Remove redundant import
         if restore_cuda:
             print("Restoring CUDA torch...")
-            import subprocess
-
             subprocess.run(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/build_binary.py` around lines 313 - 314, Remove the redundant "import
subprocess" inside the conditional that checks restore_cuda; the module is
already imported earlier (the first import at line ~280), so delete the second
import statement to avoid unreachable/redundant imports and rely on the existing
top-level subprocess import when using restore_cuda.

11-17: Logger created but never used.

The logger is initialized but all output throughout the file uses print() statements instead. Either use the logger for consistent logging or remove the unused import and variable.

♻️ Option 1: Remove unused logger
 import PyInstaller.__main__
 import argparse
-import logging
 import os
 import platform
 import sys
 from pathlib import Path

-logger = logging.getLogger(__name__)
-
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/build_binary.py` around lines 11 - 17, The file defines a logger
variable (logger) but never uses it and instead uses print() for output; either
replace all print(...) calls in this module with appropriate logger methods
(e.g., logger.info/debug/warning/error) so logging is consistent and the logger
import is used, or remove the logging import and the logger variable and keep
print() if stdout-only output is intended; search for the logger symbol and all
print(...) occurrences to update them consistently (prefer converting prints to
logger.* calls).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/app.py`:
- Around line 150-154: The shutdown_event currently calls tts.unload_tts_model()
then transcribe.unload_whisper_model() and if the first throws the second never
runs; change shutdown_event to call each unload in its own try/except so
failures are best-effort and do not abort subsequent cleanup, e.g. wrap
tts.unload_tts_model() in a try/except that logs the exception via logger.error
and then independently call transcribe.unload_whisper_model() inside its own
try/except and log any error similarly; ensure the handler remains async and
continues executing even if an unload raises.
- Around line 107-123: The DB session obtained via next(get_db()) must be closed
and rolled back on error even if execute() or commit() raises; wrap the work
using the session (db) in a try/except/finally: after calling get_db() (the
next(get_db()) call) perform execute() and commit() in the try, on exception
call db.rollback() and log via logger.warning (preserve the existing message),
and always call db.close() in the finally block so the session is closed on
every path; reference get_db, db.execute, db.commit, db.rollback, db.close, and
logger in the changes.
- Around line 16-27: The module-level import of torch (the statement "import
torch") is intentionally below AMD GPU bootstrap environment setup and triggers
Ruff E402; suppress this by adding a per-file Ruff ignore for E402 immediately
above the torch import (i.e., insert "# ruff: noqa: E402" on the line before the
"import torch" statement) or alternatively add an equivalent per-file ignore in
the Ruff config so the required AMD GPU environment variable code can remain
before the torch import.
- Around line 11-12: Don't force HSA_OVERRIDE_GFX_VERSION; make it opt-in by
only setting it when a dedicated enable flag is present. Change the current
logic around HSA_OVERRIDE_GFX_VERSION so it first checks if an opt-in env var
(e.g., ROCM_ENABLE_GFX_OVERRIDE or similar) equals "1" and only then sets
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0" if HSA_OVERRIDE_GFX_VERSION is
not already defined; reference the environment variable name
HSA_OVERRIDE_GFX_VERSION and the new opt-in var (ROCM_ENABLE_GFX_OVERRIDE) in
your change and ensure you do not overwrite an existing
HSA_OVERRIDE_GFX_VERSION.

In `@backend/build_binary.py`:
- Around line 335-377: Remove the unused helper function _get_cuda_dll_excludes
from backend/build_binary.py by deleting its entire definition (including its
docstring and internal logic); verify there are no remaining references or
assignments to _get_cuda_dll_excludes elsewhere in the file and clean up any
now-unused imports or variables that were only used by that function to avoid
lint warnings.

In `@backend/services/generation.py`:
- Around line 129-136: The exception handler must rollback the failing DB
session before calling history.update_generation_status so the subsequent
db.query()/db.commit() inside update_generation_status can run; update the
except block to call await bg_db.rollback() (or bg_db.rollback() if sync) or
bg_db.rollback_async() as appropriate to your DB session API, then call
history.update_generation_status(generation_id=generation_id, status="failed",
db=bg_db, error=str(e)); ensure you reference the existing bg_db, generation_id,
and history.update_generation_status symbols and keep rollback awaited/executed
before the update call.
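The best-effort shutdown suggested for `backend/app.py` can be sketched as below. The callables stand in for `tts.unload_tts_model` and `transcribe.unload_whisper_model`; this is an illustration of the per-call try/except pattern, not the app's actual lifespan handler:

```python
import logging

logger = logging.getLogger(__name__)


def shutdown_cleanup(unload_tts, unload_whisper) -> None:
    """Run each unload independently so one failure cannot skip the rest."""
    for name, unload in (("TTS", unload_tts), ("Whisper", unload_whisper)):
        try:
            unload()
        except Exception:
            # Best-effort: log and keep going to the next cleanup step.
            logger.exception("Failed to unload %s model during shutdown", name)


# Demonstration: the second unload still runs after the first one raises.
calls = []


def bad_tts():
    calls.append("tts")
    raise RuntimeError("boom")


def good_whisper():
    calls.append("whisper")


shutdown_cleanup(bad_tts, good_whisper)
```

The same shape extends to any number of cleanup steps without nesting try blocks.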

---

Nitpick comments:
In `@backend/backends/mlx_backend.py`:
- Around line 58-61: The logger.info in _get_model_path currently says "Will
download MLX model..." even when _get_model_path is being used by
_is_model_cached; change the logging so it only announces an actual download in
_load_model_sync (where the download occurs) or make the message neutral in
_get_model_path. Concretely, remove or reword the logger.info in _get_model_path
(referencing hf_model_id) so it does not claim a download, and add a new
logger.info in _load_model_sync that logs "Downloading MLX model from
HuggingFace Hub: %s" using the same hf_model_id when the code path actually
performs the download.

In `@backend/backends/pytorch_backend.py`:
- Around line 322-326: In _transcribe_sync, the tuple returned by load_audio
assigns the sample rate to sr but it is unused; update the assignment to capture
the unused value as _sr (e.g., audio, _sr = load_audio(...)) to signal intent
and silence the linter warning—modify the assignment in the _transcribe_sync
function where load_audio is called.

In `@backend/build_binary.py`:
- Around line 313-314: Remove the redundant "import subprocess" inside the
conditional that checks restore_cuda; the module is already imported earlier
(the first import at line ~280), so delete the second import statement to avoid
unreachable/redundant imports and rely on the existing top-level subprocess
import when using restore_cuda.
- Around line 11-17: The file defines a logger variable (logger) but never uses
it and instead uses print() for output; either replace all print(...) calls in
this module with appropriate logger methods (e.g.,
logger.info/debug/warning/error) so logging is consistent and the logger import
is used, or remove the logging import and the logger variable and keep print()
if stdout-only output is intended; search for the logger symbol and all
print(...) occurrences to update them consistently (prefer converting prints to
logger.* calls).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7dfff82b-d0ec-4290-aafe-e2d0396476ad

📥 Commits

Reviewing files that changed from the base of the PR and between 0d0b62e and 944ba22.

📒 Files selected for processing (6)
  • app/src/components/ui/select.tsx
  • backend/app.py
  • backend/backends/mlx_backend.py
  • backend/backends/pytorch_backend.py
  • backend/build_binary.py
  • backend/services/generation.py


Development

Successfully merging this pull request may close these issues.

Can't run the software at all
