
Feat: Implement GPU Generation Path #2

Merged
gmemmy merged 2 commits into main from feat/gpu-path-generation
Sep 24, 2025

Conversation


@gmemmy gmemmy commented Sep 8, 2025

Summary by CodeRabbit

  • New Features

    • Native VisionRTCView added and exported for embedding video previews on Android and iOS.
    • Selectable capture modes added, including GPU-backed and CPU-backed synthetic video generators and dual-capture handling.
    • Example app updated with controls, preview area, and HUD showing FPS and dropped frames.
  • Documentation

    • README expanded with comprehensive public API reference (functions, view props, track lifecycle, constraints, stats).
  • Chores

    • Types updated to accept a new optional mode option for track creation.


coderabbitai bot commented Sep 8, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

Adds GPU-backed null generators and dual capture paths on Android and iOS, introduces native view managers and a JS VisionRTCView component, extends TrackOptions with a mode field, adds a track registry/bridge on iOS, updates example UI and README docs, and tweaks build/lint configs.

Changes

Cohort / File(s) Summary of Changes
Docs & Example
README.md, example/src/App.tsx
Added comprehensive API documentation to README; updated example UI to include VisionRTCView, controls, HUD, and layout/styling changes.
Android — generators, module & view manager
android/src/main/java/com/visionrtc/GlNullGenerator.kt, android/src/main/java/com/visionrtc/VisionRtcModule.kt, android/src/main/java/com/visionrtc/VisionRtcPackage.kt, android/src/main/java/com/visionrtc/VisionRtcViewManager.kt
New GL null-frame generator (GlNullGenerator); VisionRtcModule now holds dual capturers (cpu/gl), selects mode (null-gpu/null-cpu) when creating tracks, updates pause/resume/constraints/dispose to operate on both capturers, exposes EGL context and track lookup; added Android ViewManager and package registration.
iOS — generators, module, registry, bridge & view manager
ios/NullGPUGenerator.swift, ios/VisionRTCModule.swift, ios/VisionRTCTrackRegistry.swift, ios/VisionRTCRegistryBridge.h, ios/VisionRTCRegistryBridge.mm, ios/VisionRTCViewManager.m
New NullGPUGenerator producing CI/Metal frames; VisionRTCModule updated for mode-based GPU path and per-track GPU generator lifecycle; new VisionRTCTrackRegistry singleton; Objective-C bridge VisionRTCGetTrackFor to retrieve tracks; new iOS view (VisionRTCView) + manager exposing trackId prop and handling attach/detach.
JS/TS — public API & view component
src/NativeVisionRtc.ts, src/types.ts, src/vision-rtc-view.tsx, src/index.ts
Added optional `mode?: 'null-gpu' | 'null-cpu' | 'external'` to track options; added and exported the `VisionRTCView` JS component wrapping the native view.
Build / tooling / configs
lefthook.yml, example/ios/VisionRtcExample.xcodeproj/project.pbxproj
Fixed glob in lefthook lint/types and tightened TS invocation (npx tsc -p ... --noEmit); added Xcode build settings (ENABLE_PREVIEWS = NO, MACH_O_TYPE = mh_execute) in example project configs.
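The new `mode` option threaded through the JS/TS surface can be sketched in TypeScript. Only the `'null-gpu' | 'null-cpu' | 'external'` union is confirmed by the PR; the surrounding option fields and the `parseCaptureMode` helper are illustrative assumptions, not the library's actual code:

```typescript
// Hypothetical sketch of the extended options type; fields other than
// `mode` are taken from the PR summary, not verified against src/types.ts.
type CaptureMode = 'null-gpu' | 'null-cpu' | 'external';

interface TrackOptionsSketch {
  fps?: number;
  resolution?: { width: number; height: number };
  mode?: CaptureMode;
}

// Narrow an untyped string (e.g. from a debug UI picker) into the mode union.
function parseCaptureMode(value: string): CaptureMode | undefined {
  const modes: CaptureMode[] = ['null-gpu', 'null-cpu', 'external'];
  return modes.includes(value as CaptureMode) ? (value as CaptureMode) : undefined;
}
```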

Sequence Diagram(s)

sequenceDiagram
    participant JS as JS App
    participant RN as React Native Bridge
    participant Module as VisionRTCModule (native)
    participant Registry as VisionRTCTrackRegistry / Track Store
    participant Generator as NullGPUGenerator / GlNullGenerator / GradientCapturer
    participant View as VisionRTCView (native)

    JS->>RN: createTrack(options{ mode })
    RN->>Module: createTrack(options)
    alt mode == "null-gpu"
        Module->>Generator: start GPU generator (NullGPUGenerator / GlNullGenerator)
    else mode == "null-cpu"
        Module->>Generator: start CPU capturer (GradientCapturer)
    else mode == "external"
        Module->>Registry: expect external RTCVideoTrack registration
    end
    Module->>Registry: register(trackId, rtcTrack?)
    Module-->>RN: track created (trackId)
    RN-->>JS: trackId

    JS->>RN: mount VisionRTCView(trackId)
    RN->>Module: view attach request(trackId)
    Module->>Registry: track(for: trackId)
    Registry-->>Module: RTCVideoTrack?
    alt track found
        Module->>View: attach RTCVideoTrack as renderer
    else not found
        View-->>Module: no-op / wait
    end

    Generator->>Registry: produce frames -> RTCVideoTrack
    RTCVideoTrack->>View: frame rendered
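The registry/view handshake in the diagram can be mimicked with a tiny in-memory model. This is an illustrative sketch, not the native implementation; the names `TrackRegistry` and `attachView` are invented, and strings stand in for `RTCVideoTrack`:

```typescript
// Toy model of the attach flow above: the module registers a track, and a
// mounting view looks it up by id, no-op'ing if it is not registered yet.
class TrackRegistry {
  private tracks = new Map<string, string>();

  register(trackId: string, track: string): void {
    // Synchronous write, so a lookup right after createTrack resolves succeeds.
    this.tracks.set(trackId, track);
  }

  trackFor(trackId: string): string | undefined {
    return this.tracks.get(trackId);
  }
}

// View attach: find the track for the given id; no-op when missing.
function attachView(registry: TrackRegistry, trackId: string): string {
  const track = registry.trackFor(trackId);
  return track ? `attached:${track}` : 'no-op';
}
```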

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Poem

I nibble code and hum a tune,
Pixels bloom beneath the moon,
CPU and GPU both may play,
Tracks attach and views display.
Hop, hop — new frames begin to run! 🐰✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Coverage is 0.00%, below the required 80.00% threshold. Run @coderabbitai generate docstrings to improve it.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title "Feat: Implement GPU Generation Path" accurately and concisely summarizes the primary change, the addition of a GPU-based frame generation path across platforms, which aligns with the bulk of the code modifications in the PR.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a95e4d and 95c3513.

📒 Files selected for processing (3)
  • ios/VisionRTCRegistryBridge.h (1 hunks)
  • ios/VisionRTCRegistryBridge.mm (1 hunks)
  • src/vision-rtc-view.tsx (1 hunks)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
README.md (4)

44-49: Align function naming and defaults with implementation.

  • Rename parameter for replace to match native/newTrack wording.
  • Mention default track constraints.

Apply this diff:

-  - `createWebRTCTrack(source, opts?)`: Creates a WebRTC video track from a source. You can pass simple options like `fps` and `resolution`.
+  - `createWebRTCTrack(source, opts?)`: Creates a WebRTC video track from a source. You can pass options like `fps` and `resolution`. Defaults to `1280x720@30` if omitted.
-  - `replaceTrack(senderId: string, nextTrackId: string)`: Swaps the video track used by an existing WebRTC sender.
+  - `replaceTrack(senderId: string, newTrackId: string)`: Swaps the video track used by an existing WebRTC sender.

50-51: Document optional encoderQueueDepth in getStats.

Expose the extra field some platforms provide.

-  - `getStats()`: Returns basic runtime stats like `fps` and `droppedFrames` (if supported on the platform).
+  - `getStats()`: Returns basic runtime stats like `fps` and `droppedFrames`; may also include `encoderQueueDepth` on some platforms. Availability is platform-dependent.

53-57: Use StyleProp for style typing.

Matches React Native’s canonical style prop type.

-      - `style?: ViewStyle | ViewStyle[]`
+      - `style?: StyleProp<ViewStyle>`

58-66: Clarify TrackOptions support and defaults; mark GPU mode as experimental.

Current native code applies fps/resolution; the rest may be ignored right now.

   - `TrackOptions`: Options for tracks. Common fields:
     - `fps?: number` (frames per second)
     - `resolution?: { width: number; height: number }`
     - `bitrate?: number`
     - `colorSpace?: 'auto' | 'sRGB' | 'BT.709' | 'BT.2020'`
     - `orientationMode?: 'auto' | 'fixed-0' | 'fixed-90' | 'fixed-180' | 'fixed-270'`
     - `mode?: 'null-gpu' | 'null-cpu' | 'external'`
+  - Notes:
+    - Defaults: `resolution: { width: 1280, height: 720 }`, `fps: 30`.
+    - Platform support today: `fps` and `resolution` are applied on iOS and Android; `bitrate`, `colorSpace`, and `orientationMode` may be ignored.
+    - `mode`: `null-cpu` is implemented; `null-gpu` is experimental/in progress; `external` is for integrating your own pixel source.
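A defaults merge like the one the note above documents (falling back to `1280x720@30`) can be sketched as follows; this only illustrates the documented defaults and is not the library's actual implementation:

```typescript
type Opts = { fps?: number; resolution?: { width: number; height: number } };

// Documented defaults from the README note above: 1280x720 at 30 fps.
const DEFAULTS = { fps: 30, resolution: { width: 1280, height: 720 } };

// Fill any omitted fields with defaults before handing options to native code.
function withDefaults(opts: Opts = {}): Required<Opts> {
  return {
    fps: opts.fps ?? DEFAULTS.fps,
    resolution: opts.resolution ?? DEFAULTS.resolution,
  };
}
```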
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85d66d5 and b6eb036.

📒 Files selected for processing (1)
  • README.md (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
README.md (3)
src/NativeVisionRtc.ts (1)
  • Spec (16-41)
ios/VisionRTCModule.swift (1)
  • VisionRTC (5-273)
src/index.ts (1)
  • createWebRTCTrack (24-29)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-android
🔇 Additional comments (1)
README.md (1)

40-70: Solid API section—clear and useful.

Docs read well and align broadly with the native specs. A few small tighten-ups below.

@gmemmy force-pushed the feat/gpu-path-generation branch from b6eb036 to 3a95e4d on September 24, 2025 at 22:41.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
android/src/main/java/com/visionrtc/VisionRtcModule.kt (1)

221-226: Fix build break in invalidate() cleanup

TrackHandle no longer exposes a capturer, so the cleanup path now fails to compile. Update the shutdown logic to stop both capturer variants instead of calling the removed property.

-        try { handle.capturer.stop() } catch (_: Throwable) {}
+        try { handle.cpuCapturer?.stop() } catch (_: Throwable) {}
+        try { handle.glCapturer?.stop() } catch (_: Throwable) {}
🧹 Nitpick comments (1)
example/src/App.tsx (1)

78-83: Prefer passing null when no track is bound

Once the JS wrapper forwards straight into the native component, handing it '' still triggers a lookup. Passing through null keeps detach semantics explicit.

-        <VisionRTCView trackId={trackId ?? ''} />
+        <VisionRTCView trackId={trackId} />
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6eb036 and 3a95e4d.

⛔ Files ignored due to path filters (2)
  • example/ios/Podfile.lock is excluded by !**/*.lock
  • yarn.lock is excluded by !**/yarn.lock, !**/*.lock
📒 Files selected for processing (18)
  • README.md (1 hunks)
  • android/src/main/java/com/visionrtc/GlNullGenerator.kt (1 hunks)
  • android/src/main/java/com/visionrtc/VisionRtcModule.kt (5 hunks)
  • android/src/main/java/com/visionrtc/VisionRtcPackage.kt (2 hunks)
  • android/src/main/java/com/visionrtc/VisionRtcViewManager.kt (1 hunks)
  • example/ios/VisionRtcExample.xcodeproj/project.pbxproj (2 hunks)
  • example/src/App.tsx (2 hunks)
  • ios/NullGPUGenerator.swift (1 hunks)
  • ios/VisionRTCModule.swift (11 hunks)
  • ios/VisionRTCRegistryBridge.h (1 hunks)
  • ios/VisionRTCRegistryBridge.mm (1 hunks)
  • ios/VisionRTCTrackRegistry.swift (1 hunks)
  • ios/VisionRTCViewManager.m (1 hunks)
  • lefthook.yml (1 hunks)
  • src/NativeVisionRtc.ts (1 hunks)
  • src/index.ts (2 hunks)
  • src/types.ts (1 hunks)
  • src/vision-rtc-view.tsx (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • README.md
🧰 Additional context used
🧬 Code graph analysis (6)
src/vision-rtc-view.tsx (1)
src/index.ts (1)
  • VisionRTCView (19-19)
android/src/main/java/com/visionrtc/GlNullGenerator.kt (1)
android/src/main/java/com/visionrtc/VisionRtcModule.kt (1)
  • tick (286-324)
ios/NullGPUGenerator.swift (2)
android/src/main/java/com/visionrtc/GlNullGenerator.kt (1)
  • tick (59-96)
ios/VisionRTCModule.swift (1)
  • tick (205-297)
ios/VisionRTCModule.swift (2)
ios/VisionRTCTrackRegistry.swift (3)
  • register (11-15)
  • track (23-29)
  • unregister (17-21)
ios/NullGPUGenerator.swift (4)
  • setResolution (54-57)
  • setFps (59-63)
  • start (38-46)
  • stop (50-52)
android/src/main/java/com/visionrtc/VisionRtcPackage.kt (1)
example/android/app/src/main/java/visionrtc/example/MainApplication.kt (3)
  • { (18-21)
  • reactNativeHost (13-38)
  • getPackages (17-21)
example/src/App.tsx (2)
src/index.ts (1)
  • VisionRTCView (19-19)
src/vision-rtc-view.tsx (1)
  • VisionRTCView (3-9)
🪛 GitHub Check: lint
src/vision-rtc-view.tsx

[warning] 5-5:
Inline style: { flex: 1 }

🪛 SwiftLint (0.57.0)
ios/VisionRTCModule.swift

[Warning] 73-73: TODOs should be resolved ((phase3): Support 'external' m...)

(todo)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-ios
  • GitHub Check: build-android
🔇 Additional comments (7)
example/ios/VisionRtcExample.xcodeproj/project.pbxproj (1)

260-262: LGTM

Target-level overrides for SwiftUI previews and Mach-O type make sense for this app target.

Also applies to: 290-292

lefthook.yml (1)

8-9: TS build check matches shipping config

Pointing the hook at tsconfig.build.json with --noEmit keeps the pre-commit type pass aligned with what we actually ship—nice tightening.

src/NativeVisionRtc.ts (1)

14-14: Mode union keeps JS bindings aligned with native

Threading the 'null-gpu' | 'null-cpu' | 'external' union through the TurboModule surface mirrors the native switch paths and prevents accidental mismatches.

ios/VisionRTCRegistryBridge.h (1)

5-5: Bridge signature is well-scoped

Appreciate the explicit nullability on the exported function; it matches the ObjC++ implementation and keeps consumers honest.

ios/VisionRTCRegistryBridge.mm (1)

5-21: Runtime lookup guards look solid

The layered class resolution and selector checks avoid crashing when the registry isn't linked, while still letting us wire into Swift when present. Nicely done.

src/types.ts (1)

9-9: Type surface now reflects the new capture modes

Exposing the mode union here gives consumers full intellisense on the GPU vs CPU vs external paths—looks good.

android/src/main/java/com/visionrtc/VisionRtcPackage.kt (1)

35-37: View manager registration wired correctly

Returning the new VisionRtcViewManager here ensures autolinked packages get the native view without extra manual wiring. Nice touch.

Comment on lines +70 to +95
    val buffer = JavaI420Buffer.allocate(w, h)
    val yPlane: ByteBuffer = buffer.dataY
    val uPlane: ByteBuffer = buffer.dataU
    val vPlane: ByteBuffer = buffer.dataV
    val ts = (now % 5000).toInt()

    for (y in 0 until h) {
      for (x in 0 until w) {
        val yVal = (((x + ts / 10) % w).toFloat() / w * 255f).roundToInt().coerceIn(0, 255)
        yPlane.put(y * buffer.strideY + x, yVal.toByte())
      }
    }
    val chromaW = (w + 1) / 2
    val chromaH = (h + 1) / 2
    for (y in 0 until chromaH) {
      for (x in 0 until chromaW) {
        uPlane.put(y * buffer.strideU + x, 128.toByte())
        vPlane.put(y * buffer.strideV + x, 128.toByte())
      }
    }

    val timestampNs = org.webrtc.TimestampAligner.getRtcTimeNanos()
    val frame = VideoFrame(buffer, 0, timestampNs)
    observer.onFrameCaptured(frame)
    frame.release()
    framesThisSecond += 1

⚠️ Potential issue

Release the I420 buffer after frame dispatch.

JavaI420Buffer.allocate returns a ref-counted buffer. Because we never relinquish our creator reference, every tick leaks a frame-sized buffer and will quickly exhaust memory under sustained capture.

     val frame = VideoFrame(buffer, 0, timestampNs)
     observer.onFrameCaptured(frame)
     frame.release()
+    buffer.release()
🤖 Prompt for AI Agents
In android/src/main/java/com/visionrtc/GlNullGenerator.kt around lines 70 to 95,
the JavaI420Buffer allocated for the frame is not released and thus leaks; after
constructing the VideoFrame and dispatching it to observer (and after calling
frame.release()), call buffer.release() to relinquish the creator reference (do
not remove frame.release()), ensuring the buffer’s refcount is decremented and
preventing the per-tick memory leak.

Comment on lines +28 to +41
  @ReactProp(name = "trackId")
  fun setTrackId(view: SurfaceViewRenderer, trackId: String?) {
    val module = view.context.applicationContext.let { (it as? Context)?.let { _ -> null } }
    val mod = (context.getNativeModule(VisionRTCModule::class.java)) ?: return
    val handle = mod.findTrack(trackId)
    view.release()
    view.init(mod.eglContext(), null)
    handle?.track?.addSink(view)
  }

  override fun onDropViewInstance(view: SurfaceViewRenderer) {
    super.onDropViewInstance(view)
    view.release()
  }

⚠️ Potential issue

Detach previous sinks when retargeting the view

We keep adding the same SurfaceViewRenderer as a sink without ever removing it from the previous track. After a prop change or unmount the old track continues to deliver frames (and leaks the renderer), while the new track attaches on top—leading to double rendering and resource churn. Track the currently attached trackId, remove the sink when switching/tearing down, and bail early when the new id is null/unchanged (you’ll need a WeakHashMap<SurfaceViewRenderer, String> field and to mirror the removal in onDropViewInstance).

-    val handle = mod.findTrack(trackId)
-    view.release()
-    view.init(mod.eglContext(), null)
-    handle?.track?.addSink(view)
+    val previous = attachedTracks[view]
+    if (previous != null && previous != trackId) {
+      mod.findTrack(previous)?.track?.removeSink(view)
+      attachedTracks.remove(view)
+    }
+    if (trackId.isNullOrBlank()) {
+      view.release()
+      return
+    }
+    val handle = mod.findTrack(trackId) ?: return
+    if (previous == trackId) return
+    view.release()
+    view.init(mod.eglContext(), null)
+    handle.track.addSink(view)
+    attachedTracks[view] = trackId
   override fun onDropViewInstance(view: SurfaceViewRenderer) {
     super.onDropViewInstance(view)
-    view.release()
+    attachedTracks.remove(view)?.let { previous ->
+      context.getNativeModule(VisionRTCModule::class.java)
+        ?.findTrack(previous)
+        ?.track
+        ?.removeSink(view)
+    }
+    view.release()
   }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In android/src/main/java/com/visionrtc/VisionRtcViewManager.kt around lines 28
to 41, the current implementation repeatedly adds the same SurfaceViewRenderer
as a sink without removing it from the previous track; add a
WeakHashMap<SurfaceViewRenderer, String> field to store the currently attached
trackId per renderer, then in setTrackId: if the new trackId is null or equals
the stored id, bail early; otherwise lookup the previous id from the map, find
its track (via the module) and remove the renderer as a sink from that old track
before attaching the new one, update the map to the new id; also mirror this
removal in onDropViewInstance by removing the renderer from any attached track
and clearing the map entry to avoid leaks.

Comment on lines +8 to +63
  private let queue = DispatchQueue(label: "com.visionrtc.nullgpu", qos: .userInitiated)
  private var timer: DispatchSourceTimer?
  private var fps: Int
  private var res: Resolution
  private var framesThisSecond: Int = 0
  private var lastSecondTs: CFTimeInterval = CACurrentMediaTime()
  private var droppedFrames: Int = 0

  private var ciContext: CIContext
  private var pixelPool: CVPixelBufferPool?

  private let onFrame: (_ pixelBuffer: CVPixelBuffer, _ timestampNs: Int64) -> Void
  private let onFps: (_ fps: Int, _ dropped: Int) -> Void

  init(width: Int, height: Int, fps: Int,
       onFrame: @escaping (_ pixelBuffer: CVPixelBuffer, _ timestampNs: Int64) -> Void,
       onFps: @escaping (_ fps: Int, _ dropped: Int) -> Void) {
    self.fps = max(1, fps)
    self.res = Resolution(width: max(1, width), height: max(1, height))
    self.onFrame = onFrame
    self.onFps = onFps

    if let device = MTLCreateSystemDefaultDevice() {
      self.ciContext = CIContext(mtlDevice: device)
    } else {
      self.ciContext = CIContext(options: nil)
    }
    self.pixelPool = makePool(width: self.res.width, height: self.res.height)
  }

  func start() {
    stop()
    let periodNs = UInt64(1_000_000_000) / UInt64(max(1, fps))
    let timer = DispatchSource.makeTimerSource(queue: queue)
    timer.schedule(deadline: .now(), repeating: .nanoseconds(Int(periodNs)))
    timer.setEventHandler { [weak self] in self?.tick() }
    self.timer = timer
    timer.resume()
  }

  func pause() { timer?.suspend() }
  func resume() { timer?.resume() }
  func stop() {
    timer?.cancel(); timer = nil
  }

  func setResolution(width: Int, height: Int) {
    res = Resolution(width: max(1, width), height: max(1, height))
    pixelPool = makePool(width: res.width, height: res.height)
  }

  func setFps(_ next: Int) {
    guard next > 0, next != fps else { return }
    fps = next
    start()
  }

⚠️ Potential issue

Balance the DispatchSourceTimer suspend/resume lifecycle.

Canceling a suspended DispatchSourceTimer (e.g., calling stop() or setFps(...) after pause()) triggers a runtime crash. Track the suspension state, avoid duplicate pauses/resumes, and always resume before canceling.

   private let queue = DispatchQueue(label: "com.visionrtc.nullgpu", qos: .userInitiated)
   private var timer: DispatchSourceTimer?
+  private var timerSuspended = false
   private var fps: Int
   private var res: Resolution
@@
   func start() {
     stop()
     let periodNs = UInt64(1_000_000_000) / UInt64(max(1, fps))
     let timer = DispatchSource.makeTimerSource(queue: queue)
     timer.schedule(deadline: .now(), repeating: .nanoseconds(Int(periodNs)))
     timer.setEventHandler { [weak self] in self?.tick() }
     self.timer = timer
+    timerSuspended = false
     timer.resume()
   }
 
-  func pause() { timer?.suspend() }
-  func resume() { timer?.resume() }
-  func stop() {
-    timer?.cancel(); timer = nil
-  }
+  func pause() {
+    guard let timer = timer, !timerSuspended else { return }
+    timer.suspend()
+    timerSuspended = true
+  }
+  func resume() {
+    guard let timer = timer, timerSuspended else { return }
+    timer.resume()
+    timerSuspended = false
+  }
+  func stop() {
+    guard let timer = timer else { return }
+    if timerSuspended {
+      timer.resume()
+      timerSuspended = false
+    }
+    timer.cancel()
+    self.timer = nil
+  }
🤖 Prompt for AI Agents
In ios/NullGPUGenerator.swift around lines 8 to 63, the DispatchSourceTimer
suspend/resume lifecycle is not tracked and canceling a suspended timer can
crash; add a Bool flag (e.g., isTimerSuspended) to track whether the timer is
currently suspended, only call suspend() when the timer exists and is not
already suspended, only call resume() when the timer exists and is suspended,
and in stop() ensure you resume the timer if it's suspended before calling
cancel() and clearing it; also ensure start() resets the suspended flag when
creating/resuming the timer and setFps()/setResolution() that restart the timer
follow the same safe suspend/resume/cancel sequence.
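The balanced suspend/resume bookkeeping described above applies to any timer API with Dispatch-like semantics. This TypeScript sketch models it with an invented `FakeTimer` stub (where canceling while suspended is an error, as on DispatchSourceTimer) and a `SafeTimer` wrapper that tracks suspension state; it is an illustration of the pattern, not the Swift code:

```typescript
// Stub standing in for DispatchSourceTimer: suspend/resume must balance,
// and cancel() while suspended is modeled as a crash (thrown error).
class FakeTimer {
  suspended = false;
  cancelled = false;
  suspend(): void { this.suspended = true; }
  resume(): void { this.suspended = false; }
  cancel(): void {
    if (this.suspended) throw new Error('cancelled while suspended');
    this.cancelled = true;
  }
}

// Wrapper that tracks suspension so pause/resume/stop are always safe to call.
class SafeTimer {
  private timer: FakeTimer | null = null;
  private timerSuspended = false;

  start(): void {
    this.stop(); // restart safely, as setFps() does
    this.timer = new FakeTimer();
    this.timerSuspended = false;
  }
  pause(): void {
    if (!this.timer || this.timerSuspended) return; // no duplicate suspend
    this.timer.suspend();
    this.timerSuspended = true;
  }
  resume(): void {
    if (!this.timer || !this.timerSuspended) return; // no duplicate resume
    this.timer.resume();
    this.timerSuspended = false;
  }
  stop(): void {
    if (!this.timer) return;
    if (this.timerSuspended) { // always resume before canceling
      this.timer.resume();
      this.timerSuspended = false;
    }
    this.timer.cancel();
    this.timer = null;
  }
}
```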

Comment on lines 169 to 172
  @objc(getStats:rejecter:)
  func getStats(resolver: RCTPromiseResolveBlock, rejecter: RCTPromiseRejectBlock) {
    var fpsMax = 0
    stateQueue.sync {
      fpsMax = self.activeTrackIds.compactMap { self.trackStates[$0]?.fps }.max() ?? 0
    }
    resolver(["fps": fpsMax, "droppedFrames": 0])
    resolver(["fps": producedFps, "droppedFrames": droppedFrames])
  }

⚠️ Potential issue

Restore stats reporting for non-GPU tracks

producedFps/droppedFrames are only ever mutated by NullGPUGenerator callbacks. For every track still running on the CPU null capturer, getStats now reports zeros (regression from the previous implementation that derived numbers from the module state). We need to fall back to the legacy computation when no GPU generators are active so CPU-only callers still receive meaningful telemetry.

   @objc(getStats:rejecter:)
   func getStats(resolver: RCTPromiseResolveBlock, rejecter: RCTPromiseRejectBlock) {
-    resolver(["fps": producedFps, "droppedFrames": droppedFrames])
+    var stats: [String: Any] = [:]
+    stateQueue.sync {
+      if !gpuGenerators.isEmpty {
+        stats["fps"] = producedFps
+        stats["droppedFrames"] = droppedFrames
+      } else {
+        let active = activeTrackIds
+        let targetFps = active.compactMap { trackStates[$0]?.fps }.max() ?? 0
+        stats["fps"] = targetFps
+        stats["droppedFrames"] = 0
+      }
+    }
+    resolver(stats)
   }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +320 to +323
    }, onFps: { [weak self] fpsNow, dropped in
      self?.producedFps = fpsNow
      self?.droppedFrames = dropped
    })

⚠️ Potential issue

Synchronize GPU stats updates

NullGPUGenerator invokes onFps on its own queue, while getStats reads those fields on the JS/UI thread. Writing and reading producedFps/droppedFrames without coordination is a classic data race in Swift and can produce torn reads or crashes. Update the closure to hop through stateQueue so both writers and readers share the same synchronization primitive.

-    }, onFps: { [weak self] fpsNow, dropped in
-      self?.producedFps = fpsNow
-      self?.droppedFrames = dropped
-    })
+    }, onFps: { [weak self] fpsNow, dropped in
+      guard let self = self else { return }
+      self.stateQueue.async(flags: .barrier) {
+        self.producedFps = fpsNow
+        self.droppedFrames = dropped
+      }
+    })
🤖 Prompt for AI Agents
In ios/VisionRTCModule.swift around lines 320 to 323, the onFps closure updates
producedFps and droppedFrames on NullGPUGenerator's queue causing a data race
with getStats on the JS/UI thread; instead, dispatch those assignments onto the
module's stateQueue so reads and writes share the same synchronization
primitive—capture self weakly as before and call stateQueue.async { update
producedFps and droppedFrames } to perform the writes.

Comment on lines +11 to +20
  @objc func register(trackId: String, track: RTCVideoTrack) {
    queue.async(flags: .barrier) {
      self.tracks[trackId] = track
    }
  }

  @objc func unregister(trackId: String) {
    queue.async(flags: .barrier) {
      self.tracks.removeValue(forKey: trackId)
    }

⚠️ Potential issue

Fix track registration race

register/unregister dispatch their mutations asynchronously, so a lookup that immediately follows (e.g. when the native view calls VisionRTCGetTrackFor right after createTrack resolves) can still observe nil. When that happens the view never attaches a renderer and the GPU path shows a blank surface. Please make these writes synchronous so the registry is updated before we return.

-    queue.async(flags: .barrier) {
+    queue.sync(flags: .barrier) {
       self.tracks[trackId] = track
     }
-    queue.async(flags: .barrier) {
+    queue.sync(flags: .barrier) {
       self.tracks.removeValue(forKey: trackId)
     }

Suggested change

Before:

    @objc func register(trackId: String, track: RTCVideoTrack) {
      queue.async(flags: .barrier) {
        self.tracks[trackId] = track
      }
    }

    @objc func unregister(trackId: String) {
      queue.async(flags: .barrier) {
        self.tracks.removeValue(forKey: trackId)
      }

After:

    @objc func register(trackId: String, track: RTCVideoTrack) {
      queue.sync(flags: .barrier) {
        self.tracks[trackId] = track
      }
    }

    @objc func unregister(trackId: String) {
      queue.sync(flags: .barrier) {
        self.tracks.removeValue(forKey: trackId)
      }
    }
🤖 Prompt for AI Agents
In ios/VisionRTCTrackRegistry.swift around lines 11 to 20, the
register/unregister methods dispatch mutations asynchronously, which can lead to
a race where a subsequent lookup observes nil. Change both queue.async(flags:
.barrier) calls to queue.sync(flags: .barrier) so the registry map is updated
synchronously before returning, preserving the barrier semantics. Update both
methods accordingly, and ensure callers are not performing these calls on the
same serial queue to avoid deadlock.
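The race is easy to reproduce in miniature. Below is a toy TypeScript model (hypothetical names, not the actual GCD code) where `register` defers its write the way `queue.async` does, while `registerSync` writes before returning the way `queue.sync` would:

```typescript
// Toy model of the registry race. `register` defers its write (like a
// queue.async barrier block), so a lookup issued immediately after it
// returns can still miss the entry; `registerSync` writes before returning,
// like queue.sync, so the entry is visible to any lookup that follows.
class ToyTrackRegistry {
  private tracks = new Map<string, string>();

  register(trackId: string, track: string): void {
    // Deferred write: runs only after the current synchronous code finishes.
    queueMicrotask(() => this.tracks.set(trackId, track));
  }

  registerSync(trackId: string, track: string): void {
    // Synchronous write: applied before this call returns.
    this.tracks.set(trackId, track);
  }

  lookup(trackId: string): string | undefined {
    return this.tracks.get(trackId);
  }
}

const registry = new ToyTrackRegistry();
registry.register('gpu-track', 'trackA');
const missed = registry.lookup('gpu-track'); // undefined: write not applied yet
registry.registerSync('cpu-track', 'trackB');
const found = registry.lookup('cpu-track'); // write landed before return
```

The same shape explains the blank surface: the native view's lookup runs in the window between `register` returning and its deferred write landing.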

Comment on lines +15 to +27
if (self.attachedTrack) {
  [self.attachedTrack removeRenderer:self];
  self.attachedTrack = nil;
}

if (_trackId.length > 0) {
  RTCVideoTrack *track = VisionRTCGetTrackFor(_trackId);
  if (track) {
    self.contentMode = UIViewContentModeScaleAspectFit;
    [track addRenderer:self];
    self.attachedTrack = track;
  }
}

⚠️ Potential issue

Detach the track when the view is torn down

RTCVideoTrack holds onto every renderer you register. If this React Native view ever leaves the hierarchy without trackId being reset, the track continues to retain (and push frames to) the now-orphaned RTCMTLVideoView. Because we also keep a strong reference to the track in attachedTrack, the pair never gets released, leaking both the renderer and the native track and, on some builds, continuing to draw into a zombie view.

Please centralize the “detach” logic and invoke it from both this setter and the teardown lifecycle (removeFromSuperview, dealloc, or didMoveToWindow when window == nil). For example:

@@
-  if (self.attachedTrack) {
-    [self.attachedTrack removeRenderer:self];
-    self.attachedTrack = nil;
-  }
+  [self detachCurrentTrack];
@@
-@end
+}
+
+- (void)removeFromSuperview
+{
+  [self detachCurrentTrack];
+  [super removeFromSuperview];
+}
+
+- (void)dealloc
+{
+  [self detachCurrentTrack];
+}
+
+- (void)detachCurrentTrack
+{
+  if (!self.attachedTrack) { return; }
+  [self.attachedTrack removeRenderer:self];
+  self.attachedTrack = nil;
+}
+@end

This ensures we break the retain cycle and stop WebRTC from dispatching frames into a view that’s already gone.

🤖 Prompt for AI Agents
In ios/VisionRTCViewManager.m around lines 15 to 27, centralize the detach logic
into a helper (e.g., -detachTrack) that checks self.attachedTrack, calls
[self.attachedTrack removeRenderer:self], and sets self.attachedTrack = nil.
Call this helper at the start of the setter that currently detaches and attaches
tracks, so the old renderer is always removed before a new one is attached, and
also call it from the view teardown lifecycle (override removeFromSuperview and
dealloc, or override didMoveToWindow and act when window == nil) to ensure the
renderer is removed and the strong reference cleared.

Comment on lines +19 to +20
export {VisionRTCView};


⚠️ Potential issue

VisionRTCView never reaches the native renderer

Right now src/vision-rtc-view.tsx still returns a plain <View><Text>{trackId}</Text></View>, so exporting it here means callers (and the example) only see the track ID string. None of the native view-manager work lands, so GPU/CPU surfaces can't render. Let's replace the stub with a requireNativeComponent('VisionRTCView') wrapper that forwards trackId (and supports null for detaching); otherwise this API is non-functional.

Apply this diff to wire the native view:

diff --git a/src/vision-rtc-view.tsx b/src/vision-rtc-view.tsx
index 0000000..0000000 100644
--- a/src/vision-rtc-view.tsx
+++ b/src/vision-rtc-view.tsx
@@
-import * as React from 'react';
-import {Text, View} from 'react-native';
-
-export default function VisionRTCView({trackId}: {trackId: string}) {
-  return (
-    <View style={{flex: 1}}>
-      <Text>{trackId}</Text>
-    </View>
-  );
-}
+import {requireNativeComponent} from 'react-native';
+
+type VisionRTCViewProps = {
+  trackId?: string | null;
+};
+
+const NativeVisionRTCView =
+  requireNativeComponent<VisionRTCViewProps>('VisionRTCView');
+
+export default function VisionRTCView({trackId}: VisionRTCViewProps) {
+  return <NativeVisionRTCView trackId={trackId ?? null} />;
+}
🤖 Prompt for AI Agents
In src/index.ts around lines 19-20, exporting the stub VisionRTCView means
callers only get a plain React view instead of the native renderer. Replace the
stub export with a wrapper component built on
requireNativeComponent('VisionRTCView') that forwards the trackId prop
(accepting string | null to allow detaching) and any other relevant props to the
native view, define the prop types/interface to include trackId as optional and
nullable, and export that wrapper so the native view manager receives the prop
and can render GPU/CPU surfaces.

Comment on lines +1 to +9
import {View, Text} from 'react-native';

export default function VisionRTCView({trackId}: {trackId: string}) {
  return (
    <View style={{flex: 1}}>
      <Text>{trackId}</Text>
    </View>
  );
}

⚠️ Potential issue

Critical: JS component never renders the native VisionRTC view

Right now we just render a local <View>/<Text>, so the trackId never reaches the native view manager (iOS VisionRTCViewManager / Android VisionRtcViewManager). That means the new GPU/null capture path can’t display anything — the native side is never instantiated. We need to bridge through requireNativeComponent and pass the props straight down so the track attaches correctly.

Please wire it up along these lines:

-import {View, Text} from 'react-native';
-
-export default function VisionRTCView({trackId}: {trackId: string}) {
-  return (
-    <View style={{flex: 1}}>
-      <Text>{trackId}</Text>
-    </View>
-  );
-}
+import {requireNativeComponent} from 'react-native';
+import type {StyleProp, ViewStyle} from 'react-native';
+
+type VisionRTCViewProps = {
+  trackId: string;
+  style?: StyleProp<ViewStyle>;
+};
+
+const NativeVisionRTCView =
+  requireNativeComponent<VisionRTCViewProps>('VisionRTCView');
+
+export default function VisionRTCView(props: VisionRTCViewProps) {
+  return <NativeVisionRTCView {...props} />;
+}

(Be sure the string literal matches the name exported by the native view manager.)


Suggested change

Before:

    import {View, Text} from 'react-native';

    export default function VisionRTCView({trackId}: {trackId: string}) {
      return (
        <View style={{flex: 1}}>
          <Text>{trackId}</Text>
        </View>
      );
    }

After:

    import {requireNativeComponent} from 'react-native';
    import type {StyleProp, ViewStyle} from 'react-native';

    type VisionRTCViewProps = {
      trackId: string;
      style?: StyleProp<ViewStyle>;
    };

    const NativeVisionRTCView =
      requireNativeComponent<VisionRTCViewProps>('VisionRTCView');

    export default function VisionRTCView(props: VisionRTCViewProps) {
      return <NativeVisionRTCView {...props} />;
    }
🧰 Tools
🪛 GitHub Check: lint

[warning] 5-5:
Inline style: { flex: 1 }

🤖 Prompt for AI Agents
In src/vision-rtc-view.tsx around lines 1 to 9, the React component currently
renders only a local View/Text, so the native VisionRTC view manager never
receives props. Replace the placeholder with a bridged native component via
requireNativeComponent('VisionRTCView') (ensure the string matches the native
manager's exported name), forward the incoming trackId (and any other props like
style) directly to the native component, and return that native component (e.g.,
<VisionRTCView style={{flex: 1}} trackId={trackId} />) so the native view is
instantiated and the track attaches correctly.

@gmemmy gmemmy merged commit e1f4cba into main Sep 24, 2025
3 of 6 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Sep 25, 2025