fix: replace fal face detection with Gemini vision #127
…erlays correctly - resolveImageInstruction: prepend FACE_SWAP_INSTRUCTION to customInstruction when usesFaceGuide is true, so the model uses the face guide for face-swapping - createContentTask: skip passing additionalImageUrls to generateContentImage when template.usesImageOverlay is true (those images are for video overlays only) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
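A minimal sketch of the `createContentTask` rule described in this commit, assuming a template object that exposes a boolean `usesImageOverlay` flag. The helper name `imagesForGeneration`, the `ContentTemplate` type, and the choice to model "skip passing" as an empty list are illustrative assumptions, not the PR's actual code.

```typescript
// Hypothetical shape: only the flag named in the commit message is modeled.
interface ContentTemplate {
  usesImageOverlay: boolean;
}

// Returns the image URLs that should reach image generation.
// Assumption: "skip passing" is modeled here as an empty list.
function imagesForGeneration(
  template: ContentTemplate,
  additionalImageUrls: string[],
): string[] {
  // Overlay templates reserve these images for the video overlay step.
  return template.usesImageOverlay ? [] : additionalImageUrls;
}

const urls = ["https://example.com/cover.png"];
console.log(imagesForGeneration({ usesImageOverlay: true }, urls).length);
console.log(imagesForGeneration({ usesImageOverlay: false }, urls).length);
```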
Reverts the resolveImageInstruction logic change. Instead, adds face-swap instructions directly to the editorial template's customInstruction field. Simpler: no code change needed, just template content. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Florence-2 object detection had false negatives on AI-generated face images. Replaces it with a Gemini 2.5 Flash vision call via the Recoup Chat API, which can reliably determine if an image is a portrait/headshot. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2 issues found across 6 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/content/detectFace.ts">
<violation number="1" location="src/content/detectFace.ts:24">
P2: Add a fetch timeout so an unresponsive upstream API doesn't block this step for minutes. Node.js 20+ supports `signal: AbortSignal.timeout(30_000)`.</violation>
<violation number="2" location="src/content/detectFace.ts:42">
P2: Validate the API response with Zod instead of a type assertion. If the upstream response shape changes, the current `as` cast silently yields `false` with no error signal, making failures hard to diagnose.
```ts
import { z } from "zod";
const DetectFaceResponse = z.object({ text: z.string().optional() });
// …
const json = DetectFaceResponse.parse(await response.json());
```</violation>
</file>
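The second violation asks for runtime validation instead of an `as` cast. As a dependency-free illustration of the same idea, the sketch below uses a hand-written guard standing in for the suggested Zod schema. `parseDetectFaceResponse` is a hypothetical name, and the `{ text?: string }` shape is taken from the suggestion above.

```typescript
// Mirrors the shape in the suggested schema: { text?: string }.
interface DetectFaceResponse {
  text?: string;
}

// Throws on an unexpected shape instead of silently yielding `false`,
// so an upstream contract change surfaces as an error.
function parseDetectFaceResponse(json: unknown): DetectFaceResponse {
  if (typeof json !== "object" || json === null) {
    throw new Error("detectFace: response is not an object");
  }
  const text = (json as Record<string, unknown>).text;
  if (text !== undefined && typeof text !== "string") {
    throw new Error("detectFace: `text` must be a string when present");
  }
  return text === undefined ? {} : { text };
}
```

A caller would pass the result of `await response.json()` through this function before reading `text`.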
src/content/detectFace.ts
}

const recoupApiUrl = process.env.RECOUP_API_URL ?? "https://recoup-api.vercel.app";
const response = await fetch(`${recoupApiUrl}/api/chat/generate`, {
Do not query the recoup API. Make a toolLoopAgent similar to other usages of ai gateway.
061bf7f to 6cff7ad
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 Walkthrough
Sequence Diagram(s)
sequenceDiagram
participant Client as Client\n(`detectFace`)
participant Agent as ToolLoopAgent\n(`createFaceDetectionAgent`)
participant AI as AI Gateway\n(`google/gemini-2.5-flash`)
Client->>Agent: generate(multimodal user message with `image` part = imageUrl)
Agent->>AI: request(prompt + schema)
AI-->>Agent: response (structured output: { hasFace: boolean } or null)
Agent-->>Client: returns { output: { hasFace } } or { output: null }
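On the caller side, the two return shapes in the diagram collapse to a single boolean. A minimal sketch, assuming the agent result carries `output: { hasFace: boolean } | null` as drawn above; the `AgentResult` type and `toHasFace` helper are illustrative names rather than code from this PR.

```typescript
// Assumed result shape, taken from the last two arrows of the diagram.
interface AgentResult {
  output: { hasFace: boolean } | null;
}

// Missing structured output is treated as "no face".
function toHasFace(result: AgentResult): boolean {
  return result.output?.hasFace ?? false;
}

console.log(toHasFace({ output: { hasFace: true } }));
console.log(toHasFace({ output: null }));
```

Defaulting to `false` means a failed or empty classification routes the image as a non-face image instead of raising.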
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed
Replaces direct Recoup Chat API call with a ToolLoopAgent using
Gemini 2.5 Flash via AI Gateway, matching the createClipAnalysisAgent
pattern. Returns structured { hasFace: boolean } via Zod schema.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The model was only seeing the URL as text, not the actual image pixels.
Now passes the image via { type: "image", image: new URL(imageUrl) } so
Gemini can visually analyze the image content for face detection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ToolLoopAgent.generate() accepts messages (ModelMessage[]) not multimodal
prompt arrays. Pass image as { type: "image", image: imageUrl } inside a
user message content array per AI SDK docs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
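To make the message shape concrete, here is a minimal sketch of a user message carrying an image part and a text part. The local `UserMessage` type only approximates the AI SDK's `ModelMessage` for illustration, and `buildFaceDetectionMessages` is a hypothetical helper; the question text is the one quoted in the review diff in this thread.

```typescript
// Local stand-ins for the SDK's content part types (assumption).
type ContentPart =
  | { type: "image"; image: string }
  | { type: "text"; text: string };

interface UserMessage {
  role: "user";
  content: ContentPart[];
}

// Builds one user message whose content array holds the image and the question.
function buildFaceDetectionMessages(imageUrl: string): UserMessage[] {
  return [
    {
      role: "user",
      content: [
        // The image travels as an image part, not as URL text in the prompt.
        { type: "image", image: imageUrl },
        {
          type: "text",
          text: "Does this image contain a human face as the primary subject?",
        },
      ],
    },
  ];
}

const messages = buildFaceDetectionMessages("https://example.com/photo.png");
console.log(messages[0].content.map((part) => part.type));
```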
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/agents/createFaceDetectionAgent.ts (1)
8-13: Make the synthetic-portrait rule explicit.
The regression here was AI-generated headshots being routed as overlays. These instructions never say whether synthetic or AI-generated portraits still count as hasFace, so the model can stay inconsistent on the exact edge case this change is meant to fix.
♻️ Prompt tweak
-Return hasFace: true if the image contains a clear human face as the primary subject.
+Return hasFace: true if the image contains a clear human face as the primary subject, including photorealistic AI-generated headshots or portraits.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/agents/createFaceDetectionAgent.ts` around lines 8 - 13, Update the instructions string used in createFaceDetectionAgent (the variable named `instructions`) to explicitly state how to treat synthetic/AI-generated portraits: clarify that AI-generated or synthetic headshots/portraits should be treated the same as real human portraits and return hasFace: true when they clearly depict a human face as the primary subject, and return hasFace: false for stylized or abstract non-human images; keep the existing categories (headshot, selfie, press photo, portrait) and ensure the new sentence is unambiguous about synthetic/AI-generated imagery counting as faces when they depict clear human faces.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c5acf7ba-acb4-4d23-9e83-28dba3e14010
📒 Files selected for processing (3)
src/agents/createFaceDetectionAgent.ts
src/content/__tests__/detectFace.test.ts
src/content/detectFace.ts
const mockGenerate = vi.fn();
vi.mock("../../agents/createFaceDetectionAgent", () => ({
  createFaceDetectionAgent: () => ({
    generate: mockGenerate,
  }),
}));
🧩 Analysis chain
🌐 Web query:
In Vitest, are vi.mock() factories hoisted such that they cannot safely reference a top-level const from the same module without vi.hoisted()?
💡 Result:
Yes. In Vitest, vi.mock() calls are hoisted to the top of the file (executed before imports), and Vitest explicitly warns that this means the mock factory cannot use variables defined outside the factory—including a top-level const from the same module—because it may not be initialized yet (TDZ / ReferenceError). If you need to share local values with the factory, define them via vi.hoisted() (or use vi.doMock(), which is not hoisted but only affects subsequent imports). [1] [2]
Sources:
[1] Vitest API docs – vi.mock hoisting + “cannot use any variables inside the factory… defined outside the factory” + vi.hoisted guidance: (vitest.dev)
[2] Vitest guide – vi.mock is moved to the top of the file by transformation: (vitest.dev)
Wrap mock dependencies in vi.hoisted() to ensure they initialize before the mock factory runs.
vi.mock() is hoisted during module transformation, so its factory executes before top-level const declarations. When the factory tries to reference mockGenerate, it hasn't been initialized yet, causing a TDZ error. Use vi.hoisted() to hoist variable initialization alongside the mock setup.
Fix with `vi.hoisted()`
-const mockGenerate = vi.fn();
+const { mockGenerate } = vi.hoisted(() => ({
+  mockGenerate: vi.fn(),
+}));
 vi.mock("../../agents/createFaceDetectionAgent", () => ({
   createFaceDetectionAgent: () => ({
     generate: mockGenerate,
   }),
 }));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/content/__tests__/detectFace.test.ts` around lines 7 - 12, The test's
mock references a top-level const (mockGenerate) from the vi.mock factory which
runs before that const is initialized; wrap the mockGenerate initialization with
vi.hoisted so it is hoisted alongside the mock factory. Specifically, replace
the top-level mockGenerate declaration used by the vi.mock for
createFaceDetectionAgent and initialize it via vi.hoisted(() => vi.fn()),
leaving the vi.mock factory returning an object whose createFaceDetectionAgent
-> generate uses that hoisted mockGenerate.
const hasFace = output?.hasFace ?? false;
logStep("Face detection result", false, { imageUrl: imageUrl.slice(0, 80), hasFace });
Redact the image URL before logging it.
Line 21 still logs a raw slice of the source URL. Signed asset URLs often put bearer tokens or query params at the front, and even when they don’t this creates a high-cardinality log field. Log a stable surrogate instead, and reuse it in the failure path too.
🛡️ Safer log payload
 export async function detectFace(imageUrl: string): Promise<boolean> {
+  let imageRefForLog = "[invalid-url]";
   try {
     const agent = createFaceDetectionAgent();
+    const parsedImageUrl = new URL(imageUrl);
+    imageRefForLog = `${parsedImageUrl.origin}${parsedImageUrl.pathname}`;
     const { output } = await agent.generate({
       prompt: [
-        { type: "image", image: new URL(imageUrl) },
+        { type: "image", image: parsedImageUrl },
         { type: "text", text: "Does this image contain a human face as the primary subject?" },
       ],
     });
     const hasFace = output?.hasFace ?? false;
-    logStep("Face detection result", false, { imageUrl: imageUrl.slice(0, 80), hasFace });
+    logStep("Face detection result", false, { imageUrl: imageRefForLog, hasFace });
     return hasFace;
   } catch (err) {
     logStep("Face detection failed, assuming no face", false, {
-      imageUrl: imageUrl.slice(0, 80),
+      imageUrl: imageRefForLog,
       error: err instanceof Error ? err.message : String(err),
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/content/detectFace.ts` around lines 20 - 21, The code logs a raw slice of
imageUrl in logStep; replace that with a redacted stable surrogate (e.g.,
compute a short deterministic hash or strip query/auth params into a safe id)
and use that surrogate variable instead of imageUrl.slice(...); update the calls
around hasFace and any failure path in detectFace.ts (referencing imageUrl,
hasFace, and logStep) so no raw URL is logged anywhere and the same surrogate is
reused for all related log entries.
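The redaction suggested in this review can be isolated in a small helper. The sketch below follows the reviewer's origin-plus-pathname approach; `toLogSafeImageRef` is a hypothetical name, and a path that itself embeds a secret would still be logged, so this is only one possible surrogate.

```typescript
// Returns origin + pathname, dropping the query string, fragment, and any
// userinfo credentials. Unparseable input maps to a fixed placeholder.
function toLogSafeImageRef(imageUrl: string): string {
  try {
    const parsed = new URL(imageUrl);
    return `${parsed.origin}${parsed.pathname}`;
  } catch (_err) {
    return "[invalid-url]";
  }
}

console.log(toLogSafeImageRef("https://cdn.example.com/a/b.png?token=secret"));
console.log(toLogSafeImageRef("not a url"));
```

Computing the surrogate once and reusing it in both the success and failure log calls keeps related entries easy to correlate.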
Shows the model an example face guide (headshot on white background) before asking it to classify the target image. This helps distinguish between actual face guides and playlist covers that happen to show faces. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
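A rough sketch of what such a few-shot message list might look like. Only the third entry is grounded in this thread (the test excerpt checks that `messages[2]` is a user message containing the target image); the assistant turn, all prompt texts, the local types, and the `buildFewShotMessages` name are assumptions made for illustration.

```typescript
// Local stand-ins for the SDK's message types (assumption).
type FewShotPart =
  | { type: "image"; image: string }
  | { type: "text"; text: string };

interface FewShotMessage {
  role: "user" | "assistant";
  content: FewShotPart[];
}

function buildFewShotMessages(exampleUrl: string, targetUrl: string): FewShotMessage[] {
  return [
    // 1) The example face guide is shown first.
    {
      role: "user",
      content: [
        { type: "image", image: exampleUrl },
        { type: "text", text: "This is an example of a face guide." },
      ],
    },
    // 2) Assumed acknowledgement turn that closes the example.
    {
      role: "assistant",
      content: [{ type: "text", text: "Understood. That image is a face guide." }],
    },
    // 3) The actual image to classify.
    {
      role: "user",
      content: [
        { type: "image", image: targetUrl },
        { type: "text", text: "Is this image a face guide like the example?" },
      ],
    },
  ];
}

const fewShot = buildFewShotMessages(
  "https://example.com/face-guide-example.png",
  "https://example.com/photo.png",
);
console.log(fewShot.map((message) => message.role));
```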
1 issue found across 4 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/content/__tests__/detectFace.test.ts">
<violation number="1" location="src/content/__tests__/detectFace.test.ts:60">
P3: Add a `toBeDefined()` assertion on `targetImagePart` before accessing `.image`, matching the pattern used for `exampleImagePart` above. Without it, a missing image part yields an opaque `TypeError` instead of a clear test failure.</violation>
</file>
// Third message: actual image to classify
expect(messages[2].role).toBe("user");
const targetImagePart = messages[2].content.find((p: { type: string }) => p.type === "image");
expect(targetImagePart.image).toBe("https://example.com/photo.png");
P3: Add a toBeDefined() assertion on targetImagePart before accessing .image, matching the pattern used for exampleImagePart above. Without it, a missing image part yields an opaque TypeError instead of a clear test failure.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/content/__tests__/detectFace.test.ts, line 60:
<comment>Add a `toBeDefined()` assertion on `targetImagePart` before accessing `.image`, matching the pattern used for `exampleImagePart` above. Without it, a missing image part yields an opaque `TypeError` instead of a clear test failure.</comment>
<file context>
@@ -18,33 +22,42 @@ describe("detectFace", () => {
+ // Third message: actual image to classify
+ expect(messages[2].role).toBe("user");
+ const targetImagePart = messages[2].content.find((p: { type: string }) => p.type === "image");
+ expect(targetImagePart.image).toBe("https://example.com/photo.png");
});
</file context>
-expect(targetImagePart.image).toBe("https://example.com/photo.png");
+expect(targetImagePart).toBeDefined();
+expect(targetImagePart.image).toBe("https://example.com/photo.png");
Switches from gemini-2.5-flash to gemini-3.1-flash-lite-preview for better vision accuracy in distinguishing face guides from playlist covers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
readFileSync fails in Trigger.dev builds since the file isn't copied to the build output. Use the raw GitHub URL instead so it works in any runtime environment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 issue found across 3 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/agents/createFaceDetectionAgent.ts">
<violation number="1" location="src/agents/createFaceDetectionAgent.ts:30">
P1: Custom agent: **Flag AI Slop and Fabricated Changes**
The PR description claims this change uses **Gemini 2.5 Flash**, but the code actually references `google/gemini-3.1-flash-lite-preview` — a model identifier that doesn't appear anywhere else in the codebase and contradicts the PR's own stated intent. If this model ID is fabricated or hallucinated, the agent will fail at runtime. Please verify the model identifier is valid for your AI gateway and update the PR description to match.</violation>
</file>
 */
export function createFaceDetectionAgent() {
  return new ToolLoopAgent({
    model: "google/gemini-3.1-flash-lite-preview",
P1: Custom agent: Flag AI Slop and Fabricated Changes
The PR description claims this change uses Gemini 2.5 Flash, but the code actually references google/gemini-3.1-flash-lite-preview — a model identifier that doesn't appear anywhere else in the codebase and contradicts the PR's own stated intent. If this model ID is fabricated or hallucinated, the agent will fail at runtime. Please verify the model identifier is valid for your AI gateway and update the PR description to match.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/agents/createFaceDetectionAgent.ts, line 30:
<comment>The PR description claims this change uses **Gemini 2.5 Flash**, but the code actually references `google/gemini-3.1-flash-lite-preview` — a model identifier that doesn't appear anywhere else in the codebase and contradicts the PR's own stated intent. If this model ID is fabricated or hallucinated, the agent will fail at runtime. Please verify the model identifier is valid for your AI gateway and update the PR description to match.</comment>
<file context>
@@ -27,7 +27,7 @@ Return hasFace: false for everything else.`;
export function createFaceDetectionAgent() {
return new ToolLoopAgent({
- model: "google/gemini-2.5-flash",
+ model: "google/gemini-3.1-flash-lite-preview",
instructions,
output: Output.object({ schema: faceDetectionSchema }),
</file context>
-model: "google/gemini-3.1-flash-lite-preview",
+model: "google/gemini-2.5-flash",
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Now loaded from Vercel Blob storage at runtime. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
fal-ai/florence-2-large/object-detection) with Gemini 2.5 Flash vision via the Recoup Chat API
Test plan
🤖 Generated with Claude Code
Summary by cubic
Replaced Florence‑2 with Gemini 3.1 Flash Lite vision via a `ToolLoopAgent` to reliably detect face‑guide portraits and fix overlay routing. Added a few‑shot example and switched the reference image to a Vercel Blob URL so detection works across runtimes.
Bug Fixes
`fal-ai/florence-2-large/object-detection` to `google/gemini-3.1-flash-lite-preview` via a `ToolLoopAgent`, returning `{ hasFace: boolean }`.
`additionalImageUrls` to `generateContentImage` when `usesImageOverlay` is true.
Refactors
Written for commit f3c0a9e. Summary will update on new commits.