feat(image-upload): send images to agent by vinhnxv · Pull Request #554 · slopus/happy

vinhnxv · 2026-02-07T18:33:51Z

Summary

Add the ability to send images (screenshots, UI mockups, photos) from the Happy mobile and desktop app to AI coding agents. This enables users to leverage Claude Code's vision capabilities directly from the app.

Core Feature: Image Upload Pipeline

Flow:

1. User picks images from gallery (native) or file picker/clipboard paste (web)
2. App resizes to max 1024px, converts to JPEG at quality 0.7 (~100-300KB)
3. App uploads via existing RPC writeFile to CLI machine's OS temp dir ($TMPDIR/happy/uploads/{sessionId}/)
4. App sends text message with [image: /path/to/file.jpg] references
5. Agent reads the file using its Read tool to analyze the image

Why this approach: Zero server/protocol changes required. Reuses existing writeFile RPC (already encrypted), and Claude Code's Read tool already supports image files natively.
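
As a rough illustration of the resize and message steps in the flow above, here is a minimal sketch. The helper names `fitWithin` and `buildMessage` are hypothetical and do not come from the PR. It also assumes that "max 1024px" refers to the longest edge, that smaller images are not upscaled, and that references are appended after the user's text, one per line.

```typescript
// Hypothetical helpers for illustration; not the PR's actual code.
interface Size {
  width: number;
  height: number;
}

// Scale a size down so its longest edge is at most maxEdge pixels.
// Assumption: images that already fit are left unchanged (no upscaling).
function fitWithin(size: Size, maxEdge: number = 1024): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxEdge) {
    return { width: size.width, height: size.height };
  }
  const scale = maxEdge / longest;
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}

// Append one "[image: path]" reference per uploaded file to the user's text.
// Assumption: references go after the text, separated by newlines.
function buildMessage(text: string, imagePaths: string[]): string {
  const refs = imagePaths.map((p) => `[image: ${p}]`);
  return [text, ...refs].filter((part) => part.length > 0).join("\n");
}

// A 4032x3024 photo is scaled to 1024x768.
const resized = fitWithin({ width: 4032, height: 3024 });
console.log(resized.width, resized.height);
console.log(buildMessage("What is wrong with this layout?", ["/path/to/file.jpg"]));
```

Because the agent only ever sees a text message, the references can be parsed or ignored by any backend without a protocol change.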

What's New

| Area | Change | Files |
| --- | --- | --- |
| Image picking (native) | Gallery picker via `expo-image-picker`, resize via `expo-image-manipulator` | `imageUpload.ts` |
| Image picking (web) | File picker + clipboard paste, resize via Canvas API | `imageUpload.web.ts`, `MultiTextInput.web.tsx` |
| Shared upload logic | Base64 validation, RPC upload, upload dir caching, path sanitization | `imageUpload.shared.ts` |
| Image upload hook | `useImageUpload`: state management for pending images, pick/paste handlers | `useImageUpload.ts` |
| UI: image button | Action bar button with count badge, disabled at max (5 images) | `AgentInput.tsx` |
| UI: image chips | Horizontal scrollable strip above input with remove button per image | `AgentInput.tsx` |
| Agent integration | System prompt instructs agent to read `[image: path]` references | `systemPrompt.ts` |
| CLI: upload dir RPC | New `getUploadDir` RPC returns OS temp dir path | `registerCommonHandlers.ts` |
| CLI: path security | `validatePath()` extended with `additionalAllowedDirs` for temp upload dir | `pathSecurity.ts` |
| CLI: payload stripping | Strip base64 image data from tool results before socket transport | `apiSession.ts` |
| i18n | 6 new keys across all 11 supported languages | `translations/*.ts` |
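
To make the "payload stripping" row more concrete, the sketch below blanks out base64 image data in tool result content before it would be sent over the socket. The `ContentBlock` shape, the `stripImageData` name, and the placeholder string are assumptions for illustration; the actual types in `apiSession.ts` may differ.

```typescript
// Assumed content block shape; the PR's real types may differ.
type ContentBlock =
  | { type: "text"; text: string }
  | {
      type: "image";
      source: { type: "base64"; media_type: string; data: string };
    };

// Hypothetical placeholder left where the image bytes used to be.
const STRIPPED_PLACEHOLDER = "[base64 image data omitted]";

// Return a copy of the blocks with image payloads replaced by a placeholder.
// Text blocks pass through untouched; the input array is not mutated.
function stripImageData(blocks: ContentBlock[]): ContentBlock[] {
  return blocks.map((block): ContentBlock => {
    if (block.type !== "image") {
      return block;
    }
    return {
      type: "image",
      source: {
        type: "base64",
        media_type: block.source.media_type,
        data: STRIPPED_PLACEHOLDER,
      },
    };
  });
}

const stripped = stripImageData([
  { type: "text", text: "Read /path/to/file.jpg" },
  { type: "image", source: { type: "base64", media_type: "image/jpeg", data: "AAAA" } },
]);
console.log(JSON.stringify(stripped));
```

Keeping the block in place, rather than dropping it, preserves the information that the tool returned an image while keeping the transported payload small.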

Bug Fixes & Improvements

Fix crypto.getRandomValues crash on iOS/Android — Hermes doesn't have Web Crypto API; replaced with getRandomBytes from expo-crypto
Fix base64 encoding stack overflow — Chunked conversion in encodeBase64() for large buffers (>65KB) that previously crashed on web
Fix stale tool states — Reducer Phase 6 force-completes tools stuck in 'running' when agent has already responded with text
Fix ToolHeader overflow — Layout uses flexShrink instead of flexGrow to prevent text clipping on long file paths
Fix RPC timeout — Add 30s timeout to socket RPC calls with descriptive error messages (previously hung indefinitely)
Fix session init race — Create realtime session before SDK metadata extraction
Type safety — Extract ModelMode into const array with runtime validator isValidModelMode(), remove as any casts
Tool result parsing — Support image content blocks in tool results (not just text)
Cleanup — Remove leftover console.log debug statements from MultiTextInput
Tauri — Bump to ~2.9
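
The base64 stack overflow fix in the list above can be illustrated with a chunked conversion. This is a minimal sketch, not the PR's `encodeBase64()` implementation: the chunk size and the use of the global `btoa` are assumptions. The idea is that passing a very large array as arguments to `String.fromCharCode` can exceed the engine's argument limit, so the buffer is converted in bounded slices.

```typescript
// Sketch of chunked base64 encoding; chunk size is an assumed value.
function encodeBase64(bytes: Uint8Array, chunkSize: number = 0x8000): string {
  const parts: string[] = [];
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    // subarray creates a view, so no bytes are copied here.
    const chunk = bytes.subarray(offset, offset + chunkSize);
    // Each call receives at most chunkSize arguments, which stays well
    // below typical engine limits on argument count.
    parts.push(String.fromCharCode.apply(null, chunk as unknown as number[]));
  }
  // Join once at the end instead of growing one string per byte.
  return btoa(parts.join(""));
}

console.log(encodeBase64(new Uint8Array([72, 101, 108, 108, 111]))); // "SGVsbG8="
```

The binary string is assembled first and encoded in a single `btoa` call, so the chunk size does not need to be a multiple of 3.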

Technical Notes

Payload safety: 520KB base64 cap ensures final encrypted payload stays under Socket.io's 1MB limit (double base64 encoding: 520KB → ~693KB on wire)
Platform split: imageUpload.ts (native) / imageUpload.web.ts (web) follows the existing MultiTextInput pattern. Shared logic extracted to imageUpload.shared.ts
No new dependencies: Both expo-image-picker and expo-image-manipulator were already in package.json but unused
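
The arithmetic behind the payload safety note can be checked with a short sketch. Base64 turns every 3 input bytes into 4 output characters, so re-encoding a 520KB base64 string yields roughly 693KB. The constant and function names are illustrative, 1KB is assumed to mean 1024 bytes, and encryption overhead is ignored.

```typescript
const KB = 1024;
const SOCKET_LIMIT_BYTES = 1024 * KB; // the 1MB limit cited in the note
const BASE64_CAP_BYTES = 520 * KB; // cap on the first base64 encoding

// Length of the base64 encoding of byteCount bytes, including padding.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// The second encoding treats the capped base64 string as raw bytes.
const onWireBytes = base64Length(BASE64_CAP_BYTES);

console.log(Math.round(onWireBytes / KB)); // 693
console.log(onWireBytes < SOCKET_LIMIT_BYTES); // true
```

Under these assumptions the cap leaves about 330KB of headroom for encryption overhead and message framing.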

Test plan

chatgpt-codex-connector · 2026-02-07T18:33:57Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

P1 fixes:
- Prevent symlink traversal in pathSecurity via realpathSync
- Serialize pick/paste with AsyncLock to prevent race conditions
- Replace queueMicrotask with 300ms setTimeout for double-tap guard

P2 fixes:
- Clean up session upload temp dir on shutdown
- Use shared encodeBase64 to fix O(n²) string concatenation in web
- Scope upload dir validation per session (cross-session isolation)
- Preserve non-timeout RPC errors instead of swallowing as "timed out"
- Type RPC result instead of any
- Fix isSessionMode type guard parameter type
- Tighten tool result content type to z.enum(['text', 'image'])
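
The interaction between the 30s RPC timeout and the fix that preserves non-timeout errors could look roughly like the sketch below. `withTimeout` and `RpcTimeoutError` are hypothetical names, not taken from the PR. The point is that only an actual timer expiry produces a timeout error, while a rejection from the underlying call is passed through unchanged.

```typescript
// Hypothetical error type used only when the timer fires.
class RpcTimeoutError extends Error {
  constructor(method: string, ms: number) {
    super(`RPC "${method}" timed out after ${ms}ms`);
    this.name = "RpcTimeoutError";
  }
}

// Wrap an in-flight RPC promise with a timeout.
// Whichever settles first wins; later settlements are ignored by the Promise.
function withTimeout<T>(method: string, call: Promise<T>, ms: number = 30000): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new RpcTimeoutError(method, ms)), ms);
    call.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        // Preserve the original error instead of reporting "timed out".
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Usage: a failing call surfaces its own error, not a timeout.
withTimeout("getUploadDir", Promise.reject(new Error("permission denied")), 100).catch(
  (err: Error) => console.log(err.message),
);
```

Clearing the timer on both paths avoids keeping the process alive for the full timeout after the call has already settled.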

vinhnxv force-pushed the feat/send-images-to-agent branch 2 times, most recently from 808a46d to 4efba38 · February 7, 2026 20:06

vinhnxv force-pushed the feat/send-images-to-agent branch from 4efba38 to 230f495 · February 7, 2026 21:10

vinhnxv wants to merge 1 commit into slopus:main from vinhnxv:feat/send-images-to-agent
