@onuryuruten

Description

Moves torchcodec from required dependencies to optional dependencies to improve installation reliability and reduce unnecessary requirements for core TTS functionality.

Problem

Currently, torchcodec is listed as a required dependency, but:

  1. Not needed for core TTS functionality - F5-TTS works perfectly fine without it for text-to-speech tasks
  2. Causes installation failures - Users encounter errors like AttributeError: module 'torchcodec' has no attribute 'decoders' due to incomplete or broken installations, particularly on macOS Intel machines
  3. Platform compatibility issues - torchcodec has varying levels of support across different platforms and architectures
  4. Increases installation complexity - Adds an unnecessary dependency that most users won't utilize

Use Case Analysis

torchcodec is primarily useful for:

  • Video decoding and processing
  • Extracting audio from video files
  • Multimodal workflows

For standard text-to-speech workflows (which represent the majority of F5-TTS use cases), torchcodec is never utilized.

Solution

Move torchcodec to optional dependencies under a new video extra (or include it in a full extra, if preferred).

Changes

# Before
dependencies = [
    ...
    "torchcodec",
    ...
]

# After
dependencies = [
    ...
    # torchcodec moved to optional dependencies
    ...
]

[project.optional-dependencies]
video = [
    "torchcodec",
]

Users who need video processing capabilities can install with:

pip install "f5-tts[video]"
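
Once torchcodec is optional, any code path that touches it has to tolerate its absence. A minimal sketch of such a guard is shown here. The helper name `optional_module` is hypothetical and not part of F5-TTS, and treating a missing `decoders` attribute as a broken install is an assumption based on the AttributeError quoted under Problem:

```python
import importlib
import importlib.util
from types import ModuleType
from typing import Optional


def optional_module(name: str, required_attr: Optional[str] = None) -> Optional[ModuleType]:
    """Return the imported module, or None if it is missing or looks broken.

    Hypothetical helper for illustration; not part of F5-TTS.
    """
    # find_spec returns None for a top-level module that is not installed.
    if importlib.util.find_spec(name) is None:
        return None
    try:
        module = importlib.import_module(name)
    except Exception:
        # A broken native build can fail with more than ImportError.
        return None
    # Treat an install that lacks the expected attribute as unusable.
    if required_attr is not None and not hasattr(module, required_attr):
        return None
    return module


torchcodec = optional_module("torchcodec", required_attr="decoders")
if torchcodec is None:
    print('torchcodec unavailable; install with: pip install "f5-tts[video]"')
```

With a guard like this, standard TTS code paths never import torchcodec eagerly, and a video code path can raise a clear error that names the extra to install.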

Benefits

  • ✅ Reduces installation failures for standard users
  • ✅ Maintains functionality for users who need video processing
  • ✅ Improves cross-platform compatibility
  • ✅ Follows Python packaging best practices (keep required deps minimal)
  • ✅ Faster installation times for most users

Testing

  • Verified F5-TTS runs successfully without torchcodec for standard TTS workflows
  • No breaking changes for users who install with the [video] extra

Migration Path

Users currently relying on torchcodec (if any) can simply install with:

pip install "f5-tts[video]"

Or add it manually:

pip install torchcodec

Alternative Consideration

If the maintainers prefer, torchcodec could be added to an existing full extra instead of creating a new video extra.

Problem: torch.xpu.is_available() can trigger an exception for users running PyTorch 2.0-2.3.

Solution: Add hasattr(torch, 'xpu') check before calling torch.xpu.is_available(). This ensures backward compatibility with PyTorch 2.0+ while still supporting Intel XPU acceleration when available.
Fix device selection logic for XPU availability
The corresponding change was already made in PR 1243.
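
The hasattr check described above can be sketched as follows. This is an illustration rather than the project's actual code: the function name `pick_device` is hypothetical, the cuda, xpu, cpu fallback order is an assumption, and the torch module is passed in as a parameter so the example runs against stand-in objects without PyTorch installed:

```python
from types import SimpleNamespace


def pick_device(torch_mod) -> str:
    """Pick a device string without assuming torch.xpu exists.

    The cuda -> xpu -> cpu order is an assumption for this sketch.
    """
    if torch_mod.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no torch.xpu, so test for the attribute first.
    if hasattr(torch_mod, "xpu") and torch_mod.xpu.is_available():
        return "xpu"
    return "cpu"


# Stand-ins for the torch module, so the sketch runs without PyTorch installed.
no_accel = SimpleNamespace(is_available=lambda: False)
old_torch = SimpleNamespace(cuda=no_accel)  # no xpu attribute at all
new_torch = SimpleNamespace(
    cuda=no_accel, xpu=SimpleNamespace(is_available=lambda: True)
)

print(pick_device(old_torch))  # cpu
print(pick_device(new_torch))  # xpu
```

With PyTorch installed, the call would be `pick_device(torch)`. Because `and` short-circuits, `torch.xpu.is_available()` is only reached when the `xpu` attribute is present.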