[wip] Decode WAV with Python backend by Dan-Flores · Pull Request #1222 · meta-pytorch/torchcodec

Dan-Flores · 2026-02-04T20:17:28Z

To review:

See the top level changes to _audio_decoder.py.
Read the definition of class WavDecoder, and the associated try_create.
Skim through _parse_wav_chunks; it looks complicated but only handles the WAV file header.
Skim through _samples_from_bytes. This conversion is considerably easier to read than the C++ implementation.

pytorch-bot · 2026-02-04T20:17:32Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/meta-pytorch/torchcodec/1222

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

Lint/Mac jobs intermittently fail

❌ 28 New Failures

As of commit 16472d5 with merge base 377c638:

NEW FAILURES - The following jobs have failed:

Build and test Linux CUDA aarch64 wheels / install-and-test (3.10, 12.6, 4.4.2) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA aarch64 wheels / install-and-test (3.10, 12.6, 5.1.2) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA aarch64 wheels / install-and-test (3.10, 12.6, 6.1.1) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA aarch64 wheels / install-and-test (3.10, 12.6, 7.0.1) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA aarch64 wheels / install-and-test (3.10, 12.6, 8.0) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA wheels and docs / install-and-test (3.10, 12.6, 4.4.2) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA wheels and docs / install-and-test (3.10, 12.6, 6) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA wheels and docs / install-and-test (3.10, 12.6, 7) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA wheels and docs / install-and-test (3.10, 12.6, 8.0) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA wheels and docs / install-and-test (3.10, 13.0, 4.4.2) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA wheels and docs / install-and-test (3.10, 13.0, 6) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA wheels and docs / install-and-test (3.10, 13.0, 7) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux CUDA wheels and docs / install-and-test (3.10, 13.0, 8.0) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux wheel / install-and-test (3.10, 4.4.2) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux wheel / install-and-test (3.10, 5.1.2) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux wheel / install-and-test (3.10, 6.1.1) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux wheel / install-and-test (3.10, 7.0.1) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Linux wheel / install-and-test (3.10, 8.0) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test MacOS wheel / install-and-test (3.10, 4.4.2) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test MacOS wheel / install-and-test (3.10, 5.1.2) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test MacOS wheel / install-and-test (3.10, 6.1.1) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test MacOS wheel / install-and-test (3.10, 7.0.1) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test MacOS wheel / install-and-test (3.10, 8.0) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Windows wheel / install-and-test (3.10, 4.4.2) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Windows wheel / install-and-test (3.10, 6.1.1) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Windows wheel / install-and-test (3.10, 7.0.1) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Build and test Windows wheel / install-and-test (3.10, 8.0) (gh)
test/test_decoders.py::TestDecoder::test_create[file_like_custom-AudioDecoder-asset1]
Lint / mypy (3.12) (gh)
Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Dan-Flores · 2026-02-05T05:20:31Z

src/torchcodec/decoders/_audio_decoder.py

+        # Try fast WAV path
+        self._wav_decoder = WavDecoder.validate_and_init(
+            source, sample_rate, num_channels, stream_index
+        )


self._wav_decoder is only populated if WAV decoding was successful.

Do we have a sense of how expensive this check is? It'll be run for every single input file so it's important that it's quick.

Dan-Flores · 2026-02-05T05:20:32Z

src/torchcodec/decoders/_audio_decoder.py

+        )
+        if self._wav_decoder is not None:
+            self.stream_index = self._wav_decoder.stream_index
+            self.metadata = self._wav_decoder.metadata


Because WavDecoder is a Python class, we can set the AudioStreamMetadata easily.

NicolasHug · 2026-02-05T14:41:26Z

src/torchcodec/decoders/_fast_wav.py

+        elif isinstance(source, (str, Path)):
+            path = Path(source)
+            if path.suffix.lower() == ".wav":
+                try:
+                    with open(path, "rb") as f:
+                        source_bytes = f.read()
+                except OSError:
+                    return None
+        elif isinstance(source, (io.RawIOBase, io.BufferedReader)) or (
+            hasattr(source, "read") and hasattr(source, "seek")
+        ):
+            source_bytes = source.read()
+            # Will reset seek position below if we can't use fast path


So this works, but it also reads / loads / downloads the entire content of a file-like object into memory. We should check how FFmpeg behaves when decoding a very long WAV file, for example. Does it need to load the entire file at once? If not, we might be losing that functionality here, which is something to consider.

Same for the C++ alternative, which I haven't checked yet.

It does not need to. I've updated both implementations to read only the header at first. Thanks for the suggestion!
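
For illustration only, here is a minimal sketch of a header-only check on a file-like object. The function name `looks_like_wav` and the `probe_size` parameter are hypothetical and are not taken from the PR; the sketch assumes the source supports `read`, `seek`, and `tell`, and it restores the original position so a fallback decoder can still read from the start.

```python
import io


def looks_like_wav(source, probe_size=12):
    """Return True if `source` starts with a RIFF/WAVE signature.

    Hypothetical helper: reads at most `probe_size` bytes and then
    restores the original stream position.
    """
    start = source.tell()
    try:
        head = source.read(probe_size)
    finally:
        # Always rewind so another decoder can consume the stream.
        source.seek(start)
    # A RIFF container starts with b"RIFF", a 4-byte size, then the form type.
    return len(head) >= 12 and head[:4] == b"RIFF" and head[8:12] == b"WAVE"


wav_like = io.BytesIO(b"RIFF" + (36).to_bytes(4, "little") + b"WAVE" + b"\x00" * 100)
not_wav = io.BytesIO(b"OggS" + b"\x00" * 20)
wav_result = looks_like_wav(wav_like)
ogg_result = looks_like_wav(not_wav)
```

Because only 12 bytes are read, the cost of the check does not grow with the size of the input.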

Dan-Flores · 2026-02-10T05:04:35Z

src/torchcodec/decoders/_fast_wav.py

+        stream_index: int | None = None,
+    ):
+        """
+        Create a WavDecoder for the given source.


This function handles all input source types, and calls _parse_wav_chunks with a function that can read the detected input type.

If successful, this function initializes a WavDecoder object.

If it fails, it will raise an error. This init is designed to be called by try_create below, which catches the error so we can fall back to the FFmpeg backend.
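
The split between a raising init and a non-raising `try_create` can be sketched roughly as follows. `SketchWavDecoder` and `WavParseError` are hypothetical stand-ins, not the PR's classes, and the validation is reduced to a signature check on in-memory bytes.

```python
class WavParseError(ValueError):
    """Hypothetical error type raised when the source is not usable WAV."""


class SketchWavDecoder:
    def __init__(self, data: bytes):
        # The init raises on anything it cannot handle.
        if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
            raise WavParseError("not a RIFF/WAVE source")
        self.data = data

    @classmethod
    def try_create(cls, data: bytes):
        # The factory converts the error into None so the caller can
        # fall back to another backend.
        try:
            return cls(data)
        except WavParseError:
            return None


good = SketchWavDecoder.try_create(b"RIFF\x24\x00\x00\x00WAVE")
bad = SketchWavDecoder.try_create(b"not a wav file")
```

The caller then only needs an `is not None` check, as in the `_audio_decoder.py` snippet above.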

Dan-Flores · 2026-02-10T05:04:38Z

src/torchcodec/decoders/_fast_wav.py

+    bytes_per_sample = metadata.bits_per_sample // 8
+    num_samples = len(audio_bytes) // bytes_per_sample // metadata.num_channels
+
+    # Convert to tensor based on format


Here various WAV formats are normalized to [-1, 1].
Some formats use the full range of a dtype, so torch.iinfo is used to get the maximum value of that type.
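
As a rough illustration of this kind of normalization, the sketch below converts 16-bit and 8-bit PCM bytes to floats using only the standard library. It is not the PR's code: the function names are hypothetical, the constants are hard-coded rather than taken from `torch.iinfo`, and dividing by 2**15 (instead of the dtype maximum, 32767) is an assumption about the scaling.

```python
import struct


def pcm16_to_floats(audio_bytes: bytes) -> list[float]:
    """Little-endian signed 16-bit PCM -> floats in [-1, 1)."""
    count = len(audio_bytes) // 2
    samples = struct.unpack(f"<{count}h", audio_bytes[: count * 2])
    scale = 32768.0  # assumed scale: 2**15
    return [s / scale for s in samples]


def pcm8_to_floats(audio_bytes: bytes) -> list[float]:
    """Unsigned 8-bit PCM (WAV stores 8-bit samples unsigned) -> [-1, 1)."""
    # Shift the midpoint (128) to zero before scaling.
    return [(b - 128) / 128.0 for b in audio_bytes]


floats16 = pcm16_to_floats(struct.pack("<3h", -32768, 0, 16384))
floats8 = pcm8_to_floats(bytes([0, 128, 255]))
```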

Dan-Flores · 2026-02-10T05:04:40Z

src/torchcodec/decoders/_fast_wav.py

+    audio_format = 0
+    num_channels = 0
+    sample_rate = 0
+    bits_per_sample = 0


Once we find the RIFF and WAVE signatures, we can look for the fmt chunk which contains metadata, and the data chunk which contains the actual samples. We find the data chunk now to store bytes_per_sample and num_samples, which will be needed later for decoding.
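
The chunk walk described above might look roughly like this. It is a simplified sketch, not `_parse_wav_chunks` itself: the function name `parse_wav_header` and the returned dictionary keys are hypothetical, and only the first 16 bytes of the fmt chunk are interpreted.

```python
import io
import struct


def parse_wav_header(f):
    """Walk RIFF chunks until the data chunk; return metadata as a dict."""
    head = f.read(12)
    if len(head) < 12 or head[:4] != b"RIFF" or head[8:12] != b"WAVE":
        raise ValueError("missing RIFF/WAVE signature")
    fmt = None
    while True:
        chunk_header = f.read(8)
        if len(chunk_header) < 8:
            raise ValueError("no data chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", chunk_header)
        if chunk_id == b"data":
            if fmt is None:
                raise ValueError("data chunk before fmt chunk")
            bytes_per_sample = fmt["bits_per_sample"] // 8
            fmt["data_offset"] = f.tell()  # samples start here; not read yet
            fmt["num_samples"] = chunk_size // bytes_per_sample // fmt["num_channels"]
            return fmt
        if chunk_id == b"fmt ":
            body = f.read(chunk_size)
            if len(body) < 16:
                raise ValueError("fmt chunk too short")
            audio_format, channels, rate, _, _, bits = struct.unpack("<HHIIHH", body[:16])
            fmt = {"audio_format": audio_format, "num_channels": channels,
                   "sample_rate": rate, "bits_per_sample": bits}
        else:
            f.seek(chunk_size, io.SEEK_CUR)  # skip chunks we do not need
        if chunk_size % 2:
            f.seek(1, io.SEEK_CUR)  # chunks are padded to an even size


fmt_body = struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)
wav = (b"RIFF" + struct.pack("<I", 0) + b"WAVE"
       + b"fmt " + struct.pack("<I", 16) + fmt_body
       + b"LIST" + struct.pack("<I", 3) + b"abc\x00"
       + b"data" + struct.pack("<I", 8) + b"\x00" * 8)
info = parse_wav_header(io.BytesIO(wav))
```

Note that the sketch stops at the start of the data chunk, so the sample bytes themselves are not loaded while parsing.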

Dan-Flores added 4 commits February 3, 2026 20:22

moved all to fast_wav.py

477f9b2

minor adjustments

91b21db

add comments

9727fe1

remove numpy

aaaa516

meta-cla bot added the CLA Signed label Feb 4, 2026

Dan-Flores commented Feb 5, 2026

NicolasHug reviewed Feb 5, 2026

Dan-Flores added 4 commits February 6, 2026 16:09

dont load entire wav upfront

03f0204

refactor

5807de8

mypy

e60378b

reduce diff, improve test

16472d5

Dan-Flores commented Feb 10, 2026

Dan-Flores wants to merge 8 commits into meta-pytorch:main from Dan-Flores:wav