feat: AI-powered bin content detection from video/photos #48

@akifbayram

Summary

Add the ability to record a short video clip of a bin's contents and use AI vision to automatically identify and list the items inside.

Motivation

Manually typing out bin contents is tedious, especially for bins with many small items. A short video pan across a bin could automatically populate the items list, saving time and improving accuracy.

Proposed Solution

Extract key frames from a short video clip and send them to an AI vision API for analysis.

  1. Capture — User records a short video clip (3-10s) of a bin's contents, or selects an existing video from their device
  2. Frame extraction — Client-side extraction of N key frames (e.g., 3-5) from the video using canvas/<video> element
  3. Analysis — Send extracted frames to a vision-capable AI API (configurable provider) with a prompt to identify and list visible items
  4. Review & confirm — Display detected items for the user to review, edit, and confirm before saving to the bin
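As a rough illustration of the frame-extraction step, the sketch below picks which moments of the clip to sample. The function name `frameTimestamps` and the midpoint-sampling strategy are assumptions made for this example; the issue only asks for N key frames, not for a specific selection method.

```typescript
// Hypothetical helper (not part of the issue): choose `frameCount` timestamps,
// in seconds, spread evenly across a clip of length `durationSec`.
// Each timestamp is the midpoint of an equal-length segment, so the very first
// and very last instants of the clip are never sampled.
function frameTimestamps(durationSec: number, frameCount: number): number[] {
  if (!Number.isFinite(durationSec) || durationSec <= 0) {
    throw new RangeError("durationSec must be a positive, finite number");
  }
  if (!Number.isInteger(frameCount) || frameCount < 1) {
    throw new RangeError("frameCount must be a positive integer");
  }
  const segment = durationSec / frameCount;
  return Array.from({ length: frameCount }, (_, i) => segment * (i + 0.5));
}
```

For a 10-second clip and 5 frames, `frameTimestamps(10, 5)` returns `[1, 3, 5, 7, 9]`. In the browser, each value could then be assigned to the `<video>` element's `currentTime`, and once the `seeked` event fires the frame can be drawn to a canvas with `drawImage` and exported with `toBlob`.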

This approach is provider-agnostic — any API that supports image input works (OpenAI GPT-4o, Anthropic Claude, Google Gemini, etc.). The provider and API key would be configurable in server settings.

Note: Google Gemini natively accepts video file uploads, which would skip the frame-extraction step entirely. This could be offered as an optimized path when Gemini is the configured provider.
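One way the configurable provider and the optional Gemini video path might fit together is sketched below. Every name here (`AiSettings`, `AnalysisPlan`, `USE_VIDEO_PATH`, `planAnalysis`) is hypothetical, as is the default of 4 frames; the issue does not define a settings schema.

```typescript
// All identifiers are illustrative assumptions, not an existing API.
type ProviderId = "openai" | "anthropic" | "gemini";

interface AiSettings {
  provider: ProviderId;
  apiKey: string;
}

type AnalysisPlan =
  | { kind: "frames"; frameCount: number } // extract frames client-side first
  | { kind: "video" };                     // upload the clip as-is

// Assumption: only Gemini is routed to the direct-video path, following the
// note above. This is a routing choice for the sketch, not a capability table.
const USE_VIDEO_PATH: Record<ProviderId, boolean> = {
  openai: false,
  anthropic: false,
  gemini: true,
};

function planAnalysis(settings: AiSettings, frameCount = 4): AnalysisPlan {
  return USE_VIDEO_PATH[settings.provider]
    ? { kind: "video" }
    : { kind: "frames", frameCount };
}
```

Keeping the decision in one small function means the capture and review screens do not need to know which provider is configured.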

Acceptance Criteria

  • User can record or select a short video clip from the bin detail page
  • Frames are extracted client-side from the video
  • Extracted frames are sent to a configurable AI vision API
  • Detected items are presented for user review before saving
  • AI provider and API key are configurable in settings
  • Works on mobile (primary use case — phone pointed at bin)
  • Graceful fallback to manual entry if AI analysis fails or returns low-confidence results
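The review and fallback criteria could be handled along the lines of the sketch below. It assumes the prompt asks the model for a JSON array of `{ name, confidence }` objects with confidence between 0 and 1, and that the fallback is manual entry; neither is specified in the issue, and the names `DetectedItem`, `ReviewState`, and `toReviewState` are hypothetical.

```typescript
// Hypothetical response shape; the issue does not specify one.
interface DetectedItem {
  name: string;
  confidence: number; // assumed to be in the range 0..1
}

type ReviewState =
  | { status: "review"; items: DetectedItem[] } // show items for confirmation
  | { status: "manual"; reason: string };       // fall back to typing items

function toReviewState(rawJson: string, minConfidence = 0.5): ReviewState {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return { status: "manual", reason: "response was not valid JSON" };
  }
  if (!Array.isArray(parsed)) {
    return { status: "manual", reason: "response was not a list" };
  }
  // Keep only well-formed entries at or above the confidence threshold.
  const items = parsed.filter(
    (x): x is DetectedItem =>
      typeof x === "object" &&
      x !== null &&
      typeof (x as DetectedItem).name === "string" &&
      typeof (x as DetectedItem).confidence === "number" &&
      (x as DetectedItem).confidence >= minConfidence,
  );
  return items.length > 0
    ? { status: "review", items }
    : { status: "manual", reason: "no items met the confidence threshold" };
}
```

Returning a tagged state rather than throwing lets the UI treat a failed or low-confidence analysis as an ordinary outcome, with the items list still editable by hand.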
