
chore(registry): publish marketplace registry update #283

Merged
streamer45 merged 2 commits into main from registry/update-24260371932 on Apr 11, 2026

Conversation

@streamkit-bot (Collaborator) commented Apr 10, 2026

Summary

Automated registry metadata update from run 24260371932.

Additionally, adds --latest=false to the gh release create command in the marketplace release workflow. This prevents plugin releases from overriding the "Latest" tag on GitHub Releases, ensuring the latest release remains the main StreamKit release rather than the most recently built plugin.

Review & Testing Checklist for Human

  • Verify the --latest=false flag is correctly placed in the gh release create command in .github/workflows/marketplace-release.yml
  • Next time the marketplace release workflow runs, confirm the resulting GitHub release is not marked as "Latest"

Notes

The change is a single-line addition (--latest=false) to the gh release create arguments in the Create per-plugin releases step. No functional changes to the registry update itself.
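For context, the step in question looks roughly like the following workflow fragment — an illustrative sketch, not the actual file: the step name and flag come from this PR, while the tag variable, release notes, and asset glob are assumptions:

```yaml
# .github/workflows/marketplace-release.yml (illustrative excerpt)
- name: Create per-plugin releases
  run: |
    gh release create "$PLUGIN_TAG" \
      --title "$PLUGIN_TAG" \
      --notes "Automated plugin release" \
      --latest=false \
      dist/*.tar.gz
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

Without `--latest=false`, `gh release create` marks each new release as "Latest", so whichever plugin was built most recently would displace the main StreamKit release.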

Link to Devin session: https://staging.itsdev.in/sessions/3ebfe2724f4b4ac2abfe6044348a30f6
Requested by: @streamer45



Contributor

@staging-devin-ai-integration staging-devin-ai-integration bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.


Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>
@streamer45 streamer45 merged commit cffe692 into main Apr 11, 2026
17 checks passed
@streamer45 streamer45 deleted the registry/update-24260371932 branch April 11, 2026 09:37
staging-devin-ai-integration bot pushed a commit that referenced this pull request Apr 11, 2026
* chore(registry): publish marketplace registry update

* fix(marketplace): prevent plugin releases from becoming latest on GitHub

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

---------

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-authored-by: StreamKit Devin <devin@streamkit.dev>
Co-authored-by: Claudio Costa <cstcld91@gmail.com>
streamer45 added a commit that referenced this pull request Apr 11, 2026
* chore(registry): publish marketplace registry update (#283)

* chore(registry): publish marketplace registry update

* fix(marketplace): prevent plugin releases from becoming latest on GitHub

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

---------

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-authored-by: StreamKit Devin <devin@streamkit.dev>
Co-authored-by: Claudio Costa <cstcld91@gmail.com>

* feat(e2e): add headless pipeline validation tests

Add a Rust-based test framework for validating oneshot pipelines against
a live skit server using ffprobe for output verification. No browser
required.

Architecture:
- datatest-stable discovers .yml files in samples/pipelines/test/
- Each .yml has a companion .toml sidecar with expected output metadata
- Tests POST the pipeline YAML to /api/v1/process, save the response,
  and validate codec, resolution, container format via ffprobe
- HW codec tests (NVENC AV1, Vulkan Video H.264) are skipped gracefully
  when the required node kind is not registered on the server

New files:
- tests/pipeline-validation/          Standalone Rust test crate
- samples/pipelines/test/*.yml        4 short test pipelines (30 frames)
- samples/pipelines/test/*.toml       Expected output metadata sidecars
- justfile: test-pipelines recipe

Usage: just test-pipelines http://localhost:4545
  just test-pipelines http://localhost:4545 vp9   # filter by name

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* refactor(e2e): restructure test pipelines to one-dir-per-test layout

Move from flat files to directory-based test layout:

  samples/pipelines/test/<name>/pipeline.yml
  samples/pipelines/test/<name>/expected.toml

Each test is self-contained in its own directory, making it easier to
add test-specific input media or extra config in the future. The
datatest-stable harness now matches on 'pipeline.yml' recursively.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>
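The one-dir-per-test pairing can be sketched with a small discovery routine — a hypothetical Python stand-in for what the datatest-stable harness does in Rust, using an assumed `discover_tests` helper:

```python
from pathlib import Path
import tempfile

def discover_tests(root: Path) -> dict[str, tuple[Path, Path]]:
    """Map test name -> (pipeline.yml, expected.toml) for each test dir."""
    tests = {}
    for pipeline in sorted(root.rglob("pipeline.yml")):
        sidecar = pipeline.with_name("expected.toml")
        if sidecar.exists():  # only dirs with both files form a test case
            tests[pipeline.parent.name] = (pipeline, sidecar)
    return tests

# Build a throwaway tree matching samples/pipelines/test/<name>/
root = Path(tempfile.mkdtemp())
for name in ("vp9_colorbars", "svt_av1_colorbars"):
    d = root / name
    d.mkdir()
    (d / "pipeline.yml").write_text("nodes: []\n")
    (d / "expected.toml").write_text('video_codec = "vp9"\n')

tests = discover_tests(root)
print(sorted(tests))  # → ['svt_av1_colorbars', 'vp9_colorbars']
```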

* feat(e2e): add SVT-AV1 test pipeline and CI integration

- Add svt_av1_colorbars test pipeline (SW codec, requires svt_av1 feature)
- Add pacer node to VP9 pipeline for consistency with other WebM pipelines
- Add pipeline-validation job to e2e.yml CI workflow — runs SW codec tests
  (VP9, OpenH264, SVT-AV1) against a live skit server with ffprobe validation
- GPU-specific tests (NVENC AV1, Vulkan Video H.264) are skipped in CI
  via the requires_node mechanism

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(ci): fail explicitly when skit server doesn't start

Add HEALTHY flag to health check loop so the pipeline-validation CI job
fails with a clear error instead of proceeding to run tests against a
server that never became healthy.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* feat(e2e): add complete format coverage for pipeline validation tests

Extend the pipeline validation framework to support audio-only tests and
file upload pipelines, then add 8 new test cases covering all core
codecs, muxers, and demuxers:

Audio codec tests:
- opus_roundtrip: Opus encode/decode via Ogg container
- opus_mp4: Opus encode via MP4 container (file mode)
- flac_decode: FLAC decoder (symphonia) → Opus/Ogg
- mp3_decode: MP3 decoder (symphonia) → Opus/Ogg
- wav_decode: WAV demuxer (symphonia) → Opus/Ogg

Video codec/decoder tests:
- rav1e_colorbars: rav1e AV1 encoder → WebM
- vp9_roundtrip: VP9 encode → decode → re-encode roundtrip
- dav1d_roundtrip: SVT-AV1 → dav1d decode → SVT-AV1 re-encode

Framework changes:
- Expected struct now supports audio-only tests (audio_codec,
  sample_rate, channels) and file uploads (input_file)
- run_pipeline() accepts optional input file for multipart upload
- validate_output() validates audio and/or video stream properties
- Test audio fixtures (Ogg/Opus, FLAC, MP3, WAV) in fixtures/

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* chore: add REUSE/SPDX license files for test audio fixtures

Adds CC0-1.0 license companion files for the generated test tone
audio fixtures (ogg, flac, mp3, wav) to satisfy the reuse-compliance
check.

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* feat(ci): add GPU pipeline validation job on self-hosted runner

Adds a 'Pipeline Validation (GPU)' job to the E2E workflow that runs
on the self-hosted GPU runner. This builds skit with gpu, svt_av1, and
dav1d_static features, starts the server, and runs all pipeline
validation tests.

Currently the NVENC AV1 and Vulkan Video H.264 tests will skip
gracefully since those features (nvcodec, vulkan_video) aren't on main
yet. Once PR #279 merges, adding those features to the build command
will enable full HW codec pipeline validation in CI.

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(ci): use alternate port for GPU pipeline validation server

The self-hosted GPU runner has a persistent skit instance on port
4545. Use port 4546 for the pipeline validation server to avoid
'Address already in use' errors.

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(ci): add libssl-dev for GPU pipeline validation runner

The pipeline-validation test crate depends on reqwest which pulls in
openssl-sys. The self-hosted GPU runner needs libssl-dev installed.

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(test): correct multipart field name and audio channel config

- Use 'media' instead of 'file' for the multipart field name to match
  the server's http_input binding convention.
- Set channels: 2 on all opus_encoder nodes since test fixtures are
  stereo, fixing 'Incompatible connection' errors on the GPU runner.

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(test): use mono audio fixtures to match Opus encoder pin type

The Opus encoder node's input pin is hardcoded to accept mono audio
(channels: 1). Regenerate all test fixtures as mono sine waves and
update pipeline configs and expected.toml files accordingly.

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>
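The regenerated fixtures are just short mono tones; a minimal sketch of producing one with the Python standard library (the 440 Hz / 48 kHz / one-second parameters are assumptions, not the project's actual fixture settings):

```python
import math
import struct
import wave

def write_mono_sine(path: str, freq: float = 440.0,
                    rate: int = 48000, seconds: float = 1.0) -> None:
    """Write a 16-bit PCM mono sine wave — channels: 1, to match the
    Opus encoder's hardcoded mono input pin."""
    n = int(rate * seconds)
    samples = (int(32767 * math.sin(2 * math.pi * freq * t / rate))
               for t in range(n))
    with wave.open(path, "wb") as w:
        w.setnchannels(1)   # mono: a stereo fixture would be rejected
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(rate)
        w.writeframes(b"".join(struct.pack("<h", s) for s in samples))

write_mono_sine("tone.wav")
with wave.open("tone.wav", "rb") as w:
    print(w.getnchannels(), w.getframerate())  # → 1 48000
```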

* fix(ci): add cleanup step to kill skit on self-hosted runner

Self-hosted runners persist between runs, so background processes can
accumulate. Add an always-run cleanup step to kill the skit process
after tests complete.

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(test): remove incompatible audio decode pipelines and fix GPU cleanup

Remove flac_decode, mp3_decode, and wav_decode test pipelines: the FLAC
decoder, MP3 decoder, and WAV demuxer all declare channels: 2 in their
static output pins, but the Opus encoder (the only audio encoder) only
accepts channels: 1. This static type mismatch causes pipeline validation
to reject the connection before any audio data flows.

Audio codec coverage is retained via opus_roundtrip (Ogg) and opus_mp4
(MP4) tests which exercise the full Opus encode/decode path.

Also fix GPU CI cleanup: use PID-based kill instead of pkill pattern
matching (the port number is in an env var, not the command line).

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* feat(ci): enable nvcodec + vulkan_video in GPU pipeline validation

Now that the branch is rebased on top of PR #279 (HW video codecs),
enable the nvcodec and vulkan_video features in the GPU CI build so
the nv_av1_colorbars and vulkan_video_h264_colorbars tests actually
run on the self-hosted GPU runner.

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(test): correct doc comment for multipart field name

The doc comment said 'file' but the code uses 'media'.

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

---------

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Signed-off-by: Devin AI <devin@streamkit.dev>
Co-authored-by: streamkit-bot <registry-bot@streamkit.dev>
Co-authored-by: StreamKit Devin <devin@streamkit.dev>
Co-authored-by: Claudio Costa <cstcld91@gmail.com>
streamer45 added a commit that referenced this pull request Apr 14, 2026
…AV1, NVENC/NVDEC AV1) (#279)

* feat(nodes): add HW video codec backends (Vulkan Video H.264, VA-API AV1, NVENC/NVDEC AV1)

Implement hardware-accelerated video encoding and decoding for StreamKit,
targeting Linux with Intel and NVIDIA GPUs (issue #217).

Three backends behind optional feature flags:

  vulkan_video — H.264 encode/decode via Vulkan Video (vk-video v0.3).
    Cross-vendor (Intel ANV, NVIDIA, AMD RADV). Includes lazy encoder
    creation on first frame for resolution detection, NV12/I420 input
    support, and configurable bitrate/framerate/keyframe interval.

  vaapi — AV1 encode/decode via VA-API (cros-codecs v0.0.6).
    Primarily Intel (intel-media-driver), also AMD. Uses GBM surfaces
    for zero-copy VA-API buffer management. Includes stride-aware
    NV12 plane read/write helpers with odd-width correctness.

  nvcodec — AV1 encode/decode via NVENC/NVDEC (shiguredo_nvcodec v2025.2).
    NVIDIA only (RTX 30xx+ decode, RTX 40xx+ AV1 encode). Dynamic CUDA
    loading — no build-time CUDA Toolkit required for the host binary.

All backends share:
- HwAccelMode enum (auto/force_hw/force_cpu) for graceful fallback
- ProcessorNode trait integration with health reporting
- Consistent config structs with serde deny_unknown_fields validation
- Comprehensive unit tests (mock-based, no GPU required)

Closes #217

Signed-off-by: Devin AI <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* ci: run nvcodec tests on GPU runner

The self-hosted GPU runner (skit-demo-eu-gpu) has an NVIDIA GPU but the
CI workflow wasn't exercising the nvcodec feature tests. Add the missing
cargo test invocation so NVENC/NVDEC AV1 tests run alongside the
existing GPU compositor tests.

Signed-off-by: Devin AI <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* ci: install CUDA headers on GPU runner for nvcodec tests

The shiguredo_nvcodec build script requires cuda.h at compile time.
Install nvidia-cuda-toolkit on the self-hosted GPU runner if CUDA
headers aren't already present.

Signed-off-by: Devin AI <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* ci: set CUDA_INCLUDE_PATH for nvcodec build on GPU runner

Ubuntu's nvidia-cuda-toolkit installs cuda.h to /usr/include, but
shiguredo_nvcodec's build script defaults to /usr/local/cuda/include.
Set CUDA_INCLUDE_PATH=/usr/include so the build finds the headers.

Signed-off-by: Devin AI <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* ci: fix nvcodec build on GPU runner (BINDGEN_EXTRA_CLANG_ARGS)

Remove conditional nvidia-cuda-toolkit install (already pre-installed
on the self-hosted runner) and add BINDGEN_EXTRA_CLANG_ARGS to point
bindgen at the LLVM 18 clang builtin includes so stddef.h is found.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* ci: reorder GPU tests so nvcodec runs before engine

The streamkit-engine GPU test binary segfaults (SIGSEGV) during
cleanup after all 25 tests pass — this is a pre-existing issue
likely related to wgpu/Vulkan teardown.  Move the nvcodec node
tests before the engine GPU tests so they are not blocked by
the crash.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): add missing framerate field in nvcodec test

The force_cpu_encoder_rejected test was constructing
NvAv1EncoderConfig with all fields explicitly but missed the
new framerate field added in the review-fix round.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): register HW codec nodes, fix i420_to_nv12 truncation, remove dead code

- Add cfg-gated registration calls for vulkan_video, vaapi, and nvcodec
  nodes in register_video_nodes() — without these, users enabling the
  features would get 'node not found' errors at runtime.
- Fix i420_to_nv12 in vulkan_video.rs to use div_ceil(2) for chroma
  dimensions instead of truncating integer division (h/2, w/2), matching
  the correct implementation in nv_av1.rs.
- Update HwAccelMode::Auto doc comment to accurately reflect that
  HW-only nodes do not implement CPU fallback — Auto and ForceHw
  behave identically; CPU fallback is achieved by selecting a different
  (software) node at the pipeline level.
- Remove dead default_quality() and default_framerate() functions in
  vaapi_av1.rs (unused — the struct uses a manual Default impl).
- Add registration regression tests to nv_av1 and vaapi_av1 modules.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): add encoder flush comment, validate cuda_device, use GBM plane offsets

- vulkan_video.rs: document that vk-video 0.3.0 BytesEncoder has no
  flush() method (unlike BytesDecoder); frame-at-a-time, no B-frames
- nv_av1.rs: reject cuda_device > i32::MAX at construction time
  instead of silently wrapping via 'as i32' cast
- vaapi_av1.rs: use gbm_frame.get_plane_offset() for FrameLayout
  instead of manually computing y_stride * coded_height; also fix
  stride fallback to use coded_width instead of display width

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>
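The cuda_device check amounts to a fallible narrowing conversion. A self-contained sketch — the field name comes from the commit, but the function is hypothetical, not the project's code:

```rust
// Reject device indices that cannot be represented as the i32 the CUDA
// API expects, instead of silently wrapping via an `as i32` cast.
fn validate_cuda_device(cuda_device: u64) -> Result<i32, String> {
    i32::try_from(cuda_device)
        .map_err(|_| format!("cuda_device {cuda_device} exceeds i32::MAX"))
}

fn main() {
    assert_eq!(validate_cuda_device(0), Ok(0));
    // i32::MAX + 1 no longer wraps to a negative index; it is an error.
    assert!(validate_cuda_device(u64::from(i32::MAX as u32) + 1).is_err());
    println!("ok");
}
```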

* fix(skit): forward HW codec feature flags from streamkit-server to streamkit-nodes

Without these forwarding features, `just extra_features="--features vulkan_video" skit`
would silently ignore the feature since streamkit-server didn't know about it.

Adds vulkan_video, vaapi, and nvcodec feature forwarding, matching the
existing pattern for svt_av1 and dav1d.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>
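Feature forwarding here is plain Cargo feature plumbing. An illustrative Cargo.toml fragment for streamkit-server — the crate and feature names are taken from the commit message, the exact file layout is assumed:

```toml
[features]
# Forward HW codec features to the nodes crate, matching the existing
# svt_av1 / dav1d pattern.
vulkan_video = ["streamkit-nodes/vulkan_video"]
vaapi = ["streamkit-nodes/vaapi"]
nvcodec = ["streamkit-nodes/nvcodec"]
```

Without these entries, enabling a feature on the server crate has no effect on the dependency, which is why the flag was silently ignored.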

* docs(samples): add HW video codec sample pipelines

Add oneshot and dynamic (MoQ) sample pipelines for each HW video codec
backend:

- Vulkan Video H.264: video_vulkan_video_h264_colorbars (oneshot + MoQ)
- VA-API AV1: video_vaapi_av1_colorbars (oneshot + MoQ)
- NVENC AV1: video_nv_av1_colorbars (oneshot + MoQ)

Each oneshot pipeline generates SMPTE color bars, HW-encodes, muxes into
a container (MP4 for H.264, WebM for AV1), and outputs via HTTP.

Each dynamic pipeline generates color bars, HW-encodes, and streams via
MoQ for live playback in the browser.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): revert get_plane_offset to computed fallback

get_plane_offset() is private in cros-codecs 0.0.6. Fall back to
computing the UV plane offset from pitch × coded_height, which is
correct for linear NV12 allocations used by VA-API encode surfaces.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* style: format vaapi_av1.rs

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* feat(nodes): add VA-API H.264 encoder and decoder nodes

Add vaapi_h264 module with VaapiH264EncoderNode and VaapiH264DecoderNode
using cros-codecs StatelessEncoder/StatelessDecoder for H.264 via VA-API.

- Encoder: CQP rate control, Main profile, macroblock-aligned coding
- Decoder: stateless H.264 decode with format-change handling
- Reuses shared helpers from vaapi_av1 (GBM/NV12 I/O, device detection)
- Registration: video::vaapi::h264_encoder, video::vaapi::h264_decoder
- Sample pipelines: oneshot MP4 + dynamic MoQ for VA-API H.264

Supported on Intel (Sandy Bridge+), AMD, and NVIDIA (decode only).

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): auto-detect VA-API H.264 encoder entrypoint

Modern Intel GPUs (Gen 9+ / Skylake onwards) only expose the low-power
fixed-function encoder (VAEntrypointEncSliceLP), not the full encoder
(VAEntrypointEncSlice).  Query the driver for supported entrypoints and
auto-select the correct one instead of hardcoding low_power=false.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): bypass GBM for VA-API encoders, use direct VA surfaces

Replace GBM-backed frame allocation with direct VA surface creation
and Image API uploads for both H.264 and AV1 VA-API encoders.

The cros-codecs GBM allocator uses GBM_BO_USE_HW_VIDEO_ENCODER, a flag
that Mesa's iris driver does not support for NV12 on some hardware
(e.g. Intel Tiger Lake with Mesa 23.x), causing 'Error allocating
contiguous buffer' failures.

By using libva Surface<()> handles instead:
- Surfaces are created via vaCreateSurfaces (no GBM needed)
- NV12 data is uploaded via the VA Image API (vaCreateImage + vaPutImage)
- The encoder's import_picture passthrough accepts Surface<()> directly
- Pitches/offsets come from the VA driver's VAImage, not GBM

This also adds two new shared helpers in vaapi_av1.rs:
- open_va_display(): opens VA display without GBM device
- write_nv12_to_va_surface(): uploads NV12/I420 frame data to a VA
  surface using the Image API, returning driver pitches/offsets

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): use ceiling division for chroma dimensions in VA surface upload

write_nv12_to_va_surface used truncating integer division (w / 2, h / 2)
for chroma plane dimensions, which would corrupt chroma data for frames
with odd width or height.  VideoLayout::packed uses (width + 1) / 2 for
chroma dimensions, so the upload function must match.

Changes:
- NV12 path: use (h+1)/2 for uv_h, ((w+1)/2)*2 for chroma row bytes
- I420 path: use (w+1)/2 for uv_w, (h+1)/2 for uv_h

This matches the existing write_nv12_to_mapping (which uses div_ceil)
and i420_to_nv12_buffer in nv_av1.rs.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): remove incorrect .min(w) clamp on NV12 UV row copy

For odd-width frames, chroma_row_bytes (e.g. 642 for w=641) is the
correct number of bytes per UV row in VideoLayout::packed format.
Clamping to .min(w) would drop the last V sample on every UV row.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>
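The odd-width arithmetic from the last two commits can be checked directly — ceiling division yields the chroma sizes VideoLayout::packed expects (a quick numeric check, not project code):

```python
def chroma_dims(w: int, h: int) -> tuple[int, int, int]:
    """NV12 chroma plane: ceil(w/2) x ceil(h/2) UV sample pairs,
    2 bytes per interleaved UV pair per row."""
    uv_w = (w + 1) // 2          # ceiling division, not w // 2
    uv_h = (h + 1) // 2
    chroma_row_bytes = uv_w * 2  # interleaved U and V bytes per row
    return uv_w, uv_h, chroma_row_bytes

# w=641: truncating 641 // 2 = 320 would drop the last chroma column;
# ceiling division keeps it, giving the 642 bytes per UV row the
# commit message cites. Clamping the copy to min(w) = 641 would then
# drop the final V sample on every row.
print(chroma_dims(641, 480))  # → (321, 240, 642)
```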

* style(nodes): fix rustfmt for VA surface UV copy

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* feat(e2e): add headless pipeline validation tests (#285)

* fix(e2e): improve pipeline test diagnostics and GPU CI reliability

- Add file size to ffprobe error messages for easier debugging
- Detect empty response bodies (encoder failed to produce output)
- Capture skit server logs in GPU CI job for post-mortem analysis
- Use --test-threads=1 for GPU tests to avoid NVENC session contention

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): eagerly init Vulkan device to prevent empty output on fast pipelines

The Vulkan Video H.264 encoder lazily initialised the VulkanDevice
inside the blocking encode task on the first frame.  On GPUs where
device creation takes ~500 ms (common on CI runners), short pipelines
such as colorbars (30 frames in ~12 ms) would close the input stream
before the encoder was ready, resulting in zero encoded packets and an
empty HTTP response.

Move device initialisation to a dedicated spawn_blocking call that
completes before the encode loop starts.  The BytesEncoder is still
created lazily on the first frame (to know the resolution), but the
expensive Vulkan instance/adapter/device setup is already done.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): add panic detection and lifecycle tracing to codec forward loop

- Log which select branch fires in codec_forward_loop (drain path)
- Detect and log panics from codec tasks instead of silently swallowing
- Track frames_encoded count in Vulkan encoder task
- Increase GPU CI server log capture from 100 to 500 lines
- Enable debug logging for codec/vulkan modules in GPU CI

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): force first Vulkan Video H.264 frame as IDR keyframe

The MP4 muxer gates all video packets until it sees the first keyframe.
The colorbars source does not set metadata.keyframe, so force_keyframe
defaulted to false for every frame.  Without an explicit IDR request
the Vulkan Video encoder may not mark the first frame as a keyframe,
causing the muxer to skip all 30 packets and produce an empty output.

Also fix clippy lint: collapse identical Ok(())/Err(_) match arms in
codec_forward_loop's codec-task await.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): add diagnostic tracing for pipeline shutdown race condition

Add targeted tracing to identify why BytesOutputNode exits before
receiving data from the MP4 muxer:

- recv_with_cancellation: distinguish cancellation-token vs channel-close
- graph_builder: log when each node task completes (success or error)
- mp4 muxer: log keyframe gate decisions (first keyframe seen vs skip)

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): await codec task before draining to prevent shutdown race

The codec_forward_loop drain phase previously interleaved with the
(potentially slow) blocking encode task.  On fast pipelines the drain
could take 100+ ms while downstream nodes (MP4 muxer, BytesOutputNode)
processed and closed their channels, resulting in zero-byte output.

Restructure the drain so that we:
1. Break out of the select loop when the input task completes.
2. Await the codec (blocking) task to completion — all results are now
   buffered in result_rx.
3. Drain the fully-buffered results in a tight loop, forwarding them
   downstream before any channel can close.

This eliminates the race window between result forwarding and downstream
shutdown.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* style: format codec_utils.rs

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): resolve clippy cognitive_complexity in codec_utils and mp4

- codec_utils: extract finish_codec_task helper to reduce nesting
- mp4: flatten keyframe gate logic to remove nested if/else

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* refactor(nodes): extract accumulate_video_sample to fix mp4 cognitive_complexity

Extract video frame accumulation logic (Annex B → AVCC conversion,
sample entry tracking, duration calculation) into a standalone helper
to bring run_stream_mode under the cognitive_complexity threshold.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* refactor(nodes): extract accumulate_audio_sample to further reduce mp4 complexity

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): abort codec task before awaiting in non-drain path

Fixes a potential deadlock: when the output channel closes (e.g. client
disconnect), the select loop breaks with drain_pending=false.  Without
aborting the codec task first, it may be blocked on blocking_send() with
a full result channel that nobody is draining, causing finish_codec_task
to wait forever.

Identified by Devin Review.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* refactor(nodes): extract check_video_keyframe_gate to further reduce mp4 complexity

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): populate avcC chroma fields for High profile H.264

The shiguredo_mp4 library requires chroma_format, bit_depth_luma_minus8,
and bit_depth_chroma_minus8 fields in the AvccBox for profiles other
than Baseline (66), Main (77), and Extended (88).  HW encoders like
Vulkan Video typically produce High profile (100) H.264, causing
'Missing chroma_format field in avcC box' when the MP4 muxer tries to
create the init segment.

Set 4:2:0 chroma (1) and 8-bit depth (0) for non-Baseline/Main/Extended
profiles, matching the NV12 format used by all HW encoder backends.
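
The profile gate described above can be sketched as follows. This is a minimal illustration, not the StreamKit muxer code; the helper name and field tuple are hypothetical.

```rust
// Hypothetical sketch: profiles other than Baseline (66), Main (77), and
// Extended (88) require explicit chroma/bit-depth fields in the avcC box.
fn needs_chroma_fields(profile_idc: u8) -> bool {
    !matches!(profile_idc, 66 | 77 | 88)
}

fn main() {
    // High profile (100), typical for HW encoders like Vulkan Video
    assert!(needs_chroma_fields(100));
    assert!(!needs_chroma_fields(66));
    // 4:2:0 chroma (1) and 8-bit depth (0), matching NV12 input
    let (chroma_format, bit_depth_luma_minus8, bit_depth_chroma_minus8) = (1u8, 0u8, 0u8);
    println!(
        "avcC: chroma_format={} luma_minus8={} chroma_minus8={}",
        chroma_format, bit_depth_luma_minus8, bit_depth_chroma_minus8
    );
}
```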

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): drain results concurrently with codec task to prevent deadlock

The previous fix (awaiting the codec task before draining) introduced a
deadlock when the codec produces more results than the bounded channel
capacity (32).  The codec task blocks on blocking_send() waiting for
space, but nobody is draining result_rx because we're waiting for the
codec task to finish first.

Fix by using tokio::select! with biased polling: drain results from
result_rx (keeping the channel flowing) while simultaneously awaiting
the codec task.  Once the codec task finishes, result_tx is dropped
and result_rx.recv() returns None, ending the drain loop naturally
with all results forwarded.

This fixes opus_mp4 and opus_roundtrip pipeline validation tests that
were hanging because the OpusEncoder produces ~51 frames (exceeding
the 32-capacity channel).
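
The deadlock shape can be reproduced with std threads and a bounded channel (a sketch only; the real code drains `result_rx` with `tokio::select!` as described above, and the function name here is illustrative):

```rust
// Produce `n` results over a channel of capacity `cap`, draining
// concurrently. Joining the producer *before* draining would deadlock
// whenever n > cap -- exactly the ~51-frame vs 32-capacity case above.
use std::sync::mpsc::sync_channel;
use std::thread;

fn drain_concurrently(n: u32, cap: usize) -> Vec<u32> {
    let (tx, rx) = sync_channel::<u32>(cap);
    let codec = thread::spawn(move || {
        for i in 0..n {
            tx.send(i).unwrap(); // blocks while the channel is full
        }
        // tx dropped here: the drain loop below ends naturally
    });
    let drained: Vec<u32> = rx.iter().collect(); // keep the channel flowing
    codec.join().unwrap(); // safe now; all results already forwarded
    drained
}

fn main() {
    let out = drain_concurrently(51, 32);
    assert_eq!(out.len(), 51);
    println!("forwarded {} results", out.len());
}
```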

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* refactor(nodes): extract drain_codec_results to reduce cognitive complexity

Extract the concurrent drain loop into a separate
drain_codec_results() function to bring codec_forward_loop back under
the clippy cognitive_complexity limit (50).

Also adds Send + Sync bounds to the to_packet closure parameter to
support the extracted async function.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): detect keyframe from VA-API encoder bitstream output

The VA-API AV1 and H.264 encoders were cloning input metadata without
updating the keyframe flag based on actual encoder output.  This caused
downstream consumers (MP4 muxer, RTMP/MoQ transport) to miss keyframes,
particularly encoder-initiated periodic keyframes from the LowDelay
prediction structure.

Add bitstream-level keyframe detection:
- AV1: parse OBU headers to find Frame OBU with frame_type == KEY_FRAME
- H.264: scan Annex B start codes for IDR NAL unit type (5)

Both encode() and flush_encoder() paths now set the keyframe flag from
the actual encoded bitstream rather than blindly cloning input metadata.
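
The H.264 side of that detection amounts to a start-code scan, roughly as sketched below (illustrative only, not the StreamKit implementation):

```rust
// Scan an Annex B bitstream for an IDR slice NAL (nal_unit_type == 5).
// Start codes are 00 00 01 or 00 00 00 01; the NAL type is the low
// 5 bits of the byte that follows.
fn contains_idr_nal(bitstream: &[u8]) -> bool {
    let mut i = 0;
    while i + 3 < bitstream.len() {
        let start = if bitstream[i..].starts_with(&[0, 0, 1]) {
            i + 3
        } else if bitstream[i..].starts_with(&[0, 0, 0, 1]) {
            i + 4
        } else {
            i += 1;
            continue;
        };
        if start < bitstream.len() && bitstream[start] & 0x1F == 5 {
            return true; // IDR slice
        }
        i = start;
    }
    false
}

fn main() {
    // 0x65 = nal_ref_idc 3, nal_unit_type 5 (IDR)
    assert!(contains_idr_nal(&[0, 0, 0, 1, 0x65, 0x88]));
    // 0x41 = nal_unit_type 1 (non-IDR slice)
    assert!(!contains_idr_nal(&[0, 0, 1, 0x41, 0x9A]));
    println!("IDR detection ok");
}
```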

Also fix HwAccelMode serde rename_all from "lowercase" to "snake_case"
so ForceHw serializes as "force_hw" (not "forcehw").

Include unit tests for all keyframe detection functions.

Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): construct VA-API encoder backend directly to satisfy trait bounds

The `CrosVaapiAv1Encoder` and `CrosVaapiH264Encoder` type aliases use
`Surface<()>` to bypass GBM buffer allocation, but `Surface<()>` does
not implement the `VideoFrame` trait required by `new_vaapi()`.

Replace `new_vaapi()` calls with direct `VaapiBackend::new()` +
`new_av1()`/`new_h264()` construction — the same pattern used by
cros-codecs' own tests — which avoids the `V: VideoFrame` constraint
while preserving the GBM-free surface path.

Also removes unused imports (GbmDevice, GbmExternalBufferDescriptor,
ReadMapping, WriteMapping, CrosFourcc, write_nv12_to_mapping) that were
flagged as warnings.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): restore required imports removed in previous commit

Restore GbmDevice, ReadMapping, WriteMapping, and CrosFourcc imports
that are used by decoder and NV12 helper functions in vaapi_av1.rs.
Only GbmExternalBufferDescriptor was truly unused.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): use GbmVideoFrame for VA-API encoders with runtime GBM fallback

Replace Surface<()> type alias with GbmVideoFrame in both VA-API AV1
and H.264 encoders.  This satisfies the VideoFrame trait bound required
by StatelessEncoder::new_vaapi(), fixing the build with --features vaapi.

At construction time, the encoder probes GBM buffer allocation with
GBM_BO_USE_HW_VIDEO_ENCODER.  If the driver does not support that flag
(e.g. Mesa iris on Intel Tiger Lake with Mesa 23.x), it falls back to
GBM_BO_USE_HW_VIDEO_DECODER which is universally supported and still
produces a valid NV12 buffer the encoder can read.

Also removes the now-unused open_va_display() and
write_nv12_to_va_surface() helper functions, and the direct
VaapiEncBackend import that was only needed for the old manual backend
construction path.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): return error on NV12 bounds-check failure

In write_nv12_to_mapping, the row-copy and I420 UV interleave paths
silently skipped rows when bounds checks failed instead of surfacing
the error. This made it impossible to diagnose corrupted frames from
mismatched buffer sizes.

Change all silent skip patterns to return descriptive error messages
with the exact indices and buffer lengths involved.
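
The shape of the fix looks roughly like this (a sketch with an illustrative helper; the real `write_nv12_to_mapping` handles more layouts):

```rust
// Copy one row into a pitched destination, returning a descriptive
// error instead of silently skipping when bounds checks fail.
fn copy_row(
    dst: &mut [u8],
    src: &[u8],
    row: usize,
    pitch: usize,
    width: usize,
) -> Result<(), String> {
    let d0 = row * pitch;
    let s0 = row * width;
    if d0 + width > dst.len() || s0 + width > src.len() {
        return Err(format!(
            "NV12 row {} out of bounds: dst {}..{} (len {}), src {}..{} (len {})",
            row, d0, d0 + width, dst.len(), s0, s0 + width, src.len()
        ));
    }
    dst[d0..d0 + width].copy_from_slice(&src[s0..s0 + width]);
    Ok(())
}

fn main() {
    let src = vec![7u8; 16];
    let mut dst = vec![0u8; 32];
    assert!(copy_row(&mut dst, &src, 0, 8, 8).is_ok());
    // Row 4 starts past the destination: surfaced, not skipped.
    let err = copy_row(&mut dst, &src, 4, 8, 8).unwrap_err();
    assert!(err.contains("out of bounds"));
    println!("bounds check ok");
}
```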

Closes #291

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* test(ci): add VA-API test coverage to GPU runner

- Add libva-dev to the test-gpu system dependencies install step.
- Add cargo test with --features vaapi to the GPU test matrix,
  running VA-API AV1 encode/decode tests on the self-hosted runner.
- Add resolution-padding verification test (issue #292) that encodes
  at 1280x720 (coded 1280x768) and asserts decoded frames match the
  original display resolution, not the coded resolution.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(ci): add libgbm-dev to GPU runner dependencies

The cros-codecs VA-API backend links against libgbm for GBM buffer
management. Without libgbm-dev the vaapi feature tests fail to link.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): skip VA-API encode tests on decode-only drivers

NVIDIA's community nvidia-vaapi-driver only supports VA-API decode,
not encode. The existing vaapi_available() check only verifies that a
VA-API display can be opened, which succeeds on NVIDIA — but the
encoder tests then fail because no encode entrypoints exist.

Add vaapi_av1_encode_available() and vaapi_h264_encode_available()
helpers that probe whether the driver actually supports encoding by
attempting to construct the encoder. Encode tests now skip gracefully
on decode-only drivers instead of failing.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(deps): vendor cros-codecs with GbmUsage::Linear support

Vendor cros-codecs 0.0.6 and add a GbmUsage::Linear variant to
GbmDevice::new_frame().  On drivers where neither
GBM_BO_USE_HW_VIDEO_ENCODER nor GBM_BO_USE_HW_VIDEO_DECODER is
supported for contiguous NV12 allocation (e.g. Mesa iris on Intel
Tiger Lake with Mesa ≤ 23.x), the Linear variant falls back to
GBM_BO_USE_LINEAR which is universally supported.

A [patch.crates-io] entry in the workspace Cargo.toml redirects the
cros-codecs dependency to the vendored copy.  This patch should be
removed once upstream cros-codecs ships the Linear variant.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): add GBM_BO_USE_LINEAR fallback for Tiger Lake VA-API

On Intel Tiger Lake with Mesa ≤ 23.x, both GBM_BO_USE_HW_VIDEO_ENCODER
and GBM_BO_USE_HW_VIDEO_DECODER flags are unsupported for contiguous
NV12 buffer allocation, causing VA-API H.264 and AV1 encoding to fail
with 'Error allocating contiguous buffer'.

Add a three-level GBM usage probe to both encoders:
  1. GBM_BO_USE_HW_VIDEO_ENCODER  (optimal tiling)
  2. GBM_BO_USE_HW_VIDEO_DECODER  (decoder-tiled fallback)
  3. GBM_BO_USE_LINEAR            (universal fallback)

Also update the decoder allocation callbacks to try LINEAR when DECODE
fails, ensuring decode also works on affected drivers.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* chore(reuse): add SPDX annotation for vendored cros-codecs

Cover the vendored cros-codecs directory with BSD-3-Clause (ChromiumOS
Authors) in REUSE.toml so the reuse-compliance-check CI job passes.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* chore(reuse): add BSD-3-Clause license text for vendored cros-codecs

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(deps): add GbmUsage::Separated for per-plane R8 VA-API export

On Mesa iris (Tiger Lake), gbm_bo_create rejects the NV12 fourcc with
every usage flag (HW_VIDEO_ENCODER, HW_VIDEO_DECODER, LINEAR).

Add a GbmUsage::Separated variant that bypasses native NV12 allocation
entirely: each plane is allocated as a separate R8 buffer with LINEAR,
then exported to VA-API via a multi-object VADRMPRIMESurfaceDescriptor
(one DMA-BUF FD per plane).

Changes to the vendored cros-codecs:
- GbmUsage::Separated enum variant
- new_frame(): when usage is Separated, take the per-plane R8 path
  even for formats that are normally contiguous (NV12)
- GbmExternalBufferDescriptor: store Vec<File> + object_indices instead
  of a single File, so multi-BO frames can be exported
- to_native_handle(): handle both single-BO and multi-BO frames,
  creating the correct num_objects / object_index mapping

Changes to the encoder/decoder nodes:
- Four-level GBM probe: Encode → Decode → Linear → Separated
- Decoder alloc callbacks: Decode → Linear → Separated fallback

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(deps): use single flat R8 BO for GbmUsage::Separated

The previous multi-BO approach (one R8 BO per plane) failed on Intel
iHD because vaCreateSurfaces rejected the multi-object
VADRMPRIMESurfaceDescriptor for NV12.

Switch to a single oversized R8/LINEAR buffer that is tall enough to
hold all planes end-to-end (height = coded_height × 3/2 for NV12).
The NV12 plane pitches and offsets are computed manually from the R8
stride and stored in a new SeparatedLayout struct on GbmVideoFrame.

This gives us a single DMA-BUF FD → single-object VA-API import, which
is the same proven path that contiguous NV12 allocations use.
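
The plane arithmetic described above can be sketched as follows (field and function names are illustrative, not the vendored `SeparatedLayout`):

```rust
// NV12 packed into a single linear R8 buffer: the Y plane first, then
// the half-height interleaved UV plane, both sharing the R8 stride.
struct Nv12Layout {
    y_offset: usize,
    y_pitch: usize,
    uv_offset: usize,
    uv_pitch: usize,
    total_height: usize,
}

fn nv12_layout(stride: usize, coded_height: usize) -> Nv12Layout {
    Nv12Layout {
        y_offset: 0,
        y_pitch: stride,
        uv_offset: stride * coded_height,   // UV starts right after Y
        uv_pitch: stride,                   // interleaved UV shares the Y pitch
        total_height: coded_height * 3 / 2, // Y + half-height UV plane
    }
}

fn main() {
    let l = nv12_layout(1280, 768);
    assert_eq!(l.uv_offset, 1280 * 768);
    assert_eq!(l.total_height, 1152);
    println!(
        "Y at {} pitch {}, UV at {} pitch {}, {} rows total",
        l.y_offset, l.y_pitch, l.uv_offset, l.uv_pitch, l.total_height
    );
}
```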

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): pass display resolution to VA-API encoders (fixes #292)

The AV1 encoder was passing only the superblock-aligned coded resolution
to cros-codecs, which set render_width/render_height in the AV1 frame
header to the coded dimensions.  For non-aligned inputs (e.g. 1280×720
→ coded 1280×768), decoders would show 48 pixels of black padding at
the bottom.

Add a display_resolution field to the vendored cros-codecs AV1
EncoderConfig and use it for render_width/render_height in the frame
header predictor.  When display differs from coded dimensions, the AV1
bitstream now signals render_and_frame_size_different=1 so decoders
crop the superblock padding.

For H.264, the SpsBuilder::resolution() method already handles
macroblock alignment and frame_crop offsets automatically, but we were
passing the pre-aligned coded resolution, bypassing the cropping logic.
Now we pass the original display resolution and let SpsBuilder compute
the correct frame_crop offsets.
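
The coded-vs-display arithmetic from the 1280×720 example can be sketched as below (hypothetical helper; AV1 aligns to 64-pixel superblocks):

```rust
// Round a dimension up to the superblock size; when the result differs
// from the display size, the bitstream must signal the display size
// (render_and_frame_size_different) so decoders crop the padding.
fn align_up(v: u32, a: u32) -> u32 {
    (v + a - 1) / a * a
}

fn main() {
    let (display_w, display_h) = (1280u32, 720u32);
    let (coded_w, coded_h) = (align_up(display_w, 64), align_up(display_h, 64));
    assert_eq!((coded_w, coded_h), (1280, 768)); // matches the commit's example
    let render_size_differs = (display_w, display_h) != (coded_w, coded_h);
    assert!(render_size_differs); // 48 rows of padding to crop
    println!(
        "coded {}x{}, display {}x{}",
        coded_w, coded_h, display_w, display_h
    );
}
```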

Closes #292

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* refactor(nodes): use VA-API Image API for encoders, drop GBM encoder path

Replace GBM buffer allocation (GbmVideoFrame + GBM_BO_USE_HW_VIDEO_ENCODER)
with direct VA surface creation + Image API upload (vaCreateImage/vaPutImage)
for both AV1 and H264 VA-API encoders.

This bypasses the GBM NV12 allocation that Mesa's iris driver rejects on
Intel Tiger Lake, eliminating the need for the vendored GbmUsage::Linear
and GbmUsage::Separated workarounds.

Changes:
- Add open_va_display() helper (VA-only, no GBM device needed)
- Add write_nv12_to_va_surface() with bounds-check error handling (#291)
- Encoder type aliases use Surface<()> instead of GbmVideoFrame
- Encoder structs drop gbm/gbm_usage fields
- Encoder::encode() creates VA surfaces and uploads via Image API
- Revert vendored gbm_video_frame.rs to upstream (drop Linear/Separated)
- Simplify decoder alloc callbacks to GbmUsage::Decode only
- Update Cargo.toml vendor comment (now only for display_resolution #292)

Decoders remain GBM-backed (GBM_BO_USE_HW_VIDEO_DECODER works on all
tested hardware including Tiger Lake).

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): restore CrosFourcc and WriteMapping imports for vaapi tests

These types are used by write_nv12_to_mapping (decoder helper) and
nv12_fourcc(), which are still needed even after switching encoders
to the Image API path. The test module's MockWriteMapping also
implements the WriteMapping trait.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): use direct backend construction for VA-API encoders

Replace new_vaapi() with VaapiBackend::new() + new_av1()/new_h264()
construction. Surface<()> does not implement the VideoFrame trait
required by new_vaapi(), so we construct the backend directly — the
same pattern used by cros-codecs' own tests.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(deps): make new_av1/new_h264 public in vendored cros-codecs

These constructors are needed for direct backend construction (bypassing
new_vaapi() which requires VideoFrame trait bounds that Surface<()>
doesn't satisfy).

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* refactor(nodes): minimize cros-codecs vendor to pub fn new_h264 only

- AV1 encoder: standard GbmVideoFrame + new_vaapi() path (no vendor changes)
- H264 encoder: Surface<()> + Image API + new_h264() (bypasses GBM on Tiger Lake)
- Revert all vendor changes except one-word visibility: fn new_h264 -> pub fn new_h264
- Remove VaSurface newtype (infeasible due to Send+Sync constraint)
- Remove display_resolution from vendored AV1 EncoderConfig
- Remove pub on new_av1 (not needed, AV1 uses new_vaapi())
- Update Cargo.toml and REUSE.toml comments to reflect minimal patch

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* feat(nodes): replace cros-codecs H.264 encoder with custom VA-API shim

Replace the cros-codecs StatelessEncoder for H.264 encoding with a custom
VA-API shim (vaapi_h264_enc.rs) that drives cros-libva directly.  This
eliminates the need for vendoring cros-codecs entirely.

The custom encoder:
- Uses the VA-API Image API (vaCreateImage/vaPutImage) to upload NV12
  frames, bypassing GBM buffer allocation which Mesa's iris driver
  rejects for NV12 on some hardware (e.g. Intel Tiger Lake with
  Mesa <= 23.x).
- Implements IPP low-delay prediction (periodic IDR + single-reference
  P frames) with CQP rate control.
- Constructs H.264 parameter buffers (SPS/PPS/slice) directly via
  cros-libva's typed wrappers.
- Auto-detects low-power vs full encoding entrypoint.
- Handles non-MB-aligned resolutions via frame cropping offsets.

The H.264 decoder and AV1 encoder/decoder continue to use cros-codecs
0.0.6 from crates.io (no vendoring, no patches).

Removes:
- vendor/cros-codecs/ directory (~50k lines, 229 files)
- [patch.crates-io] section from workspace Cargo.toml
- REUSE.toml vendor annotation

Closes #291 (bounds-check errors already fixed in prior commits)
Refs #292 (H.264 resolution padding handled by frame cropping)

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* chore(reuse): remove unused BSD-3-Clause license file

The BSD-3-Clause license was only needed for the vendored cros-codecs
directory, which has been removed in the previous commit.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* style: apply rustfmt to vaapi_h264 encoder shim

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): fix H264EncFrameCropOffsets clone and remove unused imports

H264EncFrameCropOffsets in cros-libva 0.0.12 does not derive Clone.
Reconstruct it from field values instead of cloning.

Remove unused imports: GbmDevice, ReadMapping, CrosVideoFrame.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): restore VideoFrame and ReadMapping trait imports for decoder

These traits must be in scope for the decoder to call get_plane_pitch()
and map() on Arc<GbmVideoFrame>.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): generate SPS/PPS NALUs for custom H.264 VA-API encoder

Some VA-API drivers (notably Intel iHD) do not auto-generate SPS/PPS
NAL units in the coded output.  The cros-libva crate does not expose
packed header buffer types (VAEncPackedHeaderParameterBuffer /
VAEncPackedHeaderDataBuffer), so we cannot request them via the VA-API.

Without SPS/PPS in the bitstream, the fMP4 muxer falls back to
placeholder parameter sets (Baseline profile, 4 bytes) that do not
match the actual Main profile stream — causing browsers to reject the
decoded output and disconnect after the first segment.

Fix: on IDR frames, check whether the coded output already contains
SPS (NAL type 7) and PPS (NAL type 8).  If not, generate conformant
SPS/PPS NALUs from the encoder parameters using a minimal exp-Golomb
bitstream writer, and prepend them to the coded data.

Includes unit tests for the BitWriter (bits, ue, se) and for the
bitstream_contains_sps_pps scanner.
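
The core of such a writer is small; a minimal sketch of the ue/se encoding (the real BitWriter lives in the encoder shim and this is only illustrative):

```rust
// Minimal exp-Golomb bitstream writer: ue(v) emits leading zeros, a 1,
// then the remaining bits of v+1; se(v) maps signed values onto ue.
struct BitWriter {
    bytes: Vec<u8>,
    nbits: usize, // bits used in the current byte
}

impl BitWriter {
    fn new() -> Self {
        Self { bytes: Vec::new(), nbits: 0 }
    }
    fn bit(&mut self, b: u32) {
        if self.nbits == 0 {
            self.bytes.push(0);
        }
        *self.bytes.last_mut().unwrap() |= ((b & 1) as u8) << (7 - self.nbits);
        self.nbits = (self.nbits + 1) % 8;
    }
    fn bits(&mut self, v: u32, n: u32) {
        for i in (0..n).rev() {
            self.bit((v >> i) & 1);
        }
    }
    fn ue(&mut self, v: u32) {
        let code = v + 1;
        let len = 32 - code.leading_zeros(); // bit length of code
        self.bits(0, len - 1); // leading zeros
        self.bits(code, len);
    }
    fn se(&mut self, v: i32) {
        // positive k -> 2k-1, non-positive k -> -2k
        let mapped = if v > 0 { (2 * v - 1) as u32 } else { (-2 * v) as u32 };
        self.ue(mapped);
    }
}

fn main() {
    let mut w = BitWriter::new();
    w.ue(0); // "1"
    w.ue(1); // "010"
    w.ue(2); // "011"
    assert_eq!(w.bytes, vec![0b1010_0110]); // 1 010 011 + zero padding
    let mut w2 = BitWriter::new();
    w2.se(-1); // maps to ue(2) = "011"
    assert_eq!(w2.bytes, vec![0b0110_0000]);
    println!("exp-Golomb ok");
}
```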

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): address code review findings for VA-API H.264 encoder

- [P1] Pipeline validation: fail when PIPELINE_REQUIRE_NODES=1 is set
  and a required node is missing (prevents false-green CI runs); also
  panic on unreachable schema endpoint under the same flag. Set the
  env var in both CI pipeline-validation jobs.

- [P3] Fix H.264 CPU fallback messages: decoder now says 'no CPU H.264
  decoder is currently available' (none exists); encoder points to
  video::openh264::encoder (the only software H.264 encoder).

- Fix unsafe aliasing in MockWriteMapping test mock (vaapi_av1.rs):
  replaced RefCell round-tripping with raw-pointer storage matching
  upstream GbmMapping pattern, with proper SAFETY comments.

- Deduplicate I420-to-NV12 conversions: extracted shared
  i420_frame_to_nv12_buffer() into video/mod.rs, removed duplicate
  implementations from nv_av1.rs and vulkan_video.rs.

- Remove dead accessors on VaH264Encoder (display(), width(), height())
  — only coded_width()/coded_height() are used.

- Add debug_assert for NV12 packed-layout assumption in
  write_nv12_to_va_surface (stride == width contract).

- Fix endian-dependent fourcc: replace u32::from_ne_bytes(*b"NV12")
  with nv12_fourcc().into() matching vaapi_av1.rs.

- Fix scratch surface pool: return old reference frame surfaces to the
  pool instead of dropping them.

- Add idr_period documentation comment explaining the hardcoded 1024
  default and how callers can force IDR via force_keyframe.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): use crate::video path for i420_frame_to_nv12_buffer in test modules

super:: inside nv_av1::tests and vulkan_video::tests resolves to
the nv_av1/vulkan_video module, not the video parent module where
i420_frame_to_nv12_buffer lives.  Use the fully qualified crate path.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* refactor(nodes): consolidate VA-API decode loop + Vulkan Video encoder nits

Finding #5: Extract generic vaapi_decode_loop_body<D>() and
vaapi_drain_decoder_events<D>() in vaapi_av1.rs, parameterised on
StatelessVideoDecoder codec type.  Both vaapi_h264_decode_loop and
vaapi_av1_decode_loop now delegate to these shared helpers, removing
~130 lines of near-identical code.  The AV1 decode loop init is
simplified to use the existing open_va_and_gbm() helper.

Finding #8: Add comment block explaining why the Vulkan Video H.264
encoder does not use StandardVideoEncoder / spawn_standard_encode_task
(no flush(), eager device pre-init, different dimension-change model).

Finding #9: Remove redundant init_vulkan_encode_device() call inside
the dimension-change block — the Vulkan device is pre-initialised and
never cleared, so we use it directly instead of cloning through the
init helper.  Also removes the now-unnecessary device re-assignment.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* fix(nodes): restore libva import removed during decode loop consolidation

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

* ci(e2e): drop PIPELINE_REQUIRE_NODES from non-GPU pipeline validation

The non-GPU Pipeline Validation job builds skit without nvcodec or
vulkan_video features, so GPU-only test pipelines (nv_av1_colorbars,
vulkan_video_h264_colorbars) are never available.  With
PIPELINE_REQUIRE_NODES=1 this caused hard failures instead of skips.

The GPU runner (pipeline-validation-gpu) already runs ALL pipeline
tests with PIPELINE_REQUIRE_NODES=1 and all features enabled, so
node registration regressions are still caught.  Without the flag
the non-GPU runner gracefully skips GPU pipelines while still
validating all SW codec pipelines.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

---------

Signed-off-by: Devin AI <devin@streamkit.dev>
Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-authored-by: StreamKit Devin <devin@streamkit.dev>
Co-authored-by: Claudio Costa <cstcld91@gmail.com>
Co-authored-by: staging-devin-ai-integration[bot] <166158716+staging-devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: streamkit-bot <registry-bot@streamkit.dev>