Description
Checks
- This template is only for usage issues encountered.
- I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
- I have searched for existing issues, including closed ones, and couldn't find a solution.
- I am using English to submit this issue to facilitate community communication.
Environment Details
- Windows 11 Pro 25H2
- OS Build 26200.7171
- Python 3.12.10 (tags/v3.12.10:0cc8128, Apr 8 2025, 12:21:36) [MSC v.1943 64 bit (AMD64)] on win32
- ffmpeg version 8.0-full_build-www.gyan.dev
Steps to Reproduce
- Create a new venv
- Clone the repo and install it as a pip package
- Run the command "f5-tts_infer-gradio" with no CLI arguments
- Attach reference audio and enter the text to synthesize, then click the Synthesize button. It returns the following error:
Traceback (most recent call last):
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\gradio\queueing.py", line 759, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\gradio\route_utils.py", line 354, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\gradio\blocks.py", line 2191, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\gradio\blocks.py", line 1698, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\anyio\to_thread.py", line 61, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 2525, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 986, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\gradio\utils.py", line 915, in wrapper
response = f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\src\f5_tts\infer\infer_gradio.py", line 290, in basic_tts
audio_out, spectrogram_path, ref_text_out, used_seed = infer(
^^^^^^
File "C:\Users\hsn\Python\F5-TTS\src\f5_tts\infer\infer_gradio.py", line 160, in infer
ref_audio, ref_text = preprocess_ref_audio_text(ref_audio_orig, ref_text, show_info=show_info)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\src\f5_tts\infer\utils_infer.py", line 361, in preprocess_ref_audio_text
ref_text = transcribe(ref_audio)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\src\f5_tts\infer\utils_infer.py", line 176, in transcribe
return asr_pipe(
^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\transformers\pipelines\automatic_speech_recognition.py", line 275, in __call__
return super().__call__(inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\transformers\pipelines\base.py", line 1459, in __call__
return next(
^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\transformers\pipelines\pt_utils.py", line 126, in __next__
item = next(self.iterator)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\transformers\pipelines\pt_utils.py", line 271, in __next__
processed = self.infer(next(self.iterator), **self.params)
^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torch\utils\data\dataloader.py", line 732, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torch\utils\data\dataloader.py", line 788, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torch\utils\data\_utils\fetch.py", line 33, in fetch
data.append(next(self.dataset_iter))
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\transformers\pipelines\pt_utils.py", line 188, in __next__
processed = next(self.subiterator)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\transformers\pipelines\automatic_speech_recognition.py", line 381, in preprocess
import torchcodec
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\__init__.py", line 12, in <module>
from . import decoders, encoders, samplers # noqa
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\decoders\__init__.py", line 7, in <module>
from .._core import AudioStreamMetadata, VideoStreamMetadata
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\_core\__init__.py", line 8, in <module>
from ._metadata import (
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\_core\_metadata.py", line 16, in <module>
from torchcodec._core.ops import (
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\_core\ops.py", line 85, in <module>
ffmpeg_major_version, core_library_path = load_torchcodec_shared_libraries()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\_core\ops.py", line 70, in load_torchcodec_shared_libraries
raise RuntimeError(
RuntimeError: Could not load libtorchcodec. Likely causes:
1. FFmpeg is not properly installed in your environment. We support
versions 4, 5, 6, 7, and 8.
2. The PyTorch version (2.9.1+cpu) is not compatible with
this version of TorchCodec. Refer to the version compatibility
table:
https://github.com/pytorch/torchcodec?tab=readme-ov-file#installing-torchcodec.
3. Another runtime dependency; see exceptions below.
The following exceptions were raised as we tried to load libtorchcodec:
[start of libtorchcodec loading traceback]
FFmpeg version 8: Could not load this library: C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\libtorchcodec_core8.dll
FFmpeg version 7: Could not load this library: C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\libtorchcodec_core7.dll
FFmpeg version 6: Could not load this library: C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\libtorchcodec_core6.dll
FFmpeg version 5: Could not load this library: C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\libtorchcodec_core5.dll
FFmpeg version 4: Could not load this library: C:\Users\hsn\Python\F5-TTS\venv\Lib\site-packages\torchcodec\libtorchcodec_core4.dll
[end of libtorchcodec loading traceback].
✔️ Expected Behavior
The generated audio is returned.
❌ Actual Behavior
Synthesis fails with the RuntimeError shown in the traceback and no audio is produced.
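Diagnostic sketch
The error message lists three likely causes (FFmpeg not found, a PyTorch/TorchCodec version mismatch, or another runtime dependency). The script below is a minimal sketch for gathering the facts relevant to each one from inside the same venv; it is not part of F5-TTS or TorchCodec, and the helper names `installed_version` and `collect_report` are made up for illustration. It rests on one assumption that is not confirmed for this setup: that TorchCodec needs FFmpeg's shared libraries (files such as `avcodec-*.dll` on Windows) and not only `ffmpeg.exe`, so it looks for those DLLs next to the executable found on PATH.

```python
import glob
import importlib.metadata
import os
import shutil
import sys


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


def collect_report(try_import=True):
    """Gather environment facts related to the libtorchcodec loading error."""
    ffmpeg_exe = shutil.which("ffmpeg")
    report = {
        "python": sys.version.split()[0],
        "platform": sys.platform,
        # Versions are read from package metadata, so nothing is imported yet.
        "torch": installed_version("torch"),
        "torchcodec": installed_version("torchcodec"),
        "ffmpeg_on_path": ffmpeg_exe,
        "avcodec_dlls": [],
        "torchcodec_import_error": None,
    }

    # Assumption: shared FFmpeg libraries sit in the same folder as ffmpeg.exe.
    if ffmpeg_exe and sys.platform == "win32":
        pattern = os.path.join(os.path.dirname(ffmpeg_exe), "avcodec-*.dll")
        report["avcodec_dlls"] = sorted(glob.glob(pattern))

    # Optionally reproduce the failing import and keep the first error line.
    if try_import and report["torchcodec"] is not None:
        try:
            import torchcodec  # noqa: F401
        except Exception as exc:
            message = str(exc).splitlines()
            report["torchcodec_import_error"] = (
                message[0] if message else type(exc).__name__
            )

    return report


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

If `ffmpeg_on_path` is set but `avcodec_dlls` is empty, the installed FFmpeg may be a static build without shared libraries, which would match cause 1. If the DLLs are present, the `torch` and `torchcodec` versions can be compared against the compatibility table linked in the error message (cause 2).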