vision-model

Here are 17 public repositories matching this topic...

SystemVll / Montscan

🖨️ Automated scanner document processor with AI-powered naming and WebDAV integration. Receives scans via FTP, extracts text using Vision AI, generates intelligent filenames with Ollama AI, and uploads to your cloud storage.

go golang printer ai scanner nextcloud webdav ftp-server vision-model vision-language-model ollama

Updated Apr 4, 2026
Go

SpaceinvaderOne / a-eye

Self-hosted AI photo intelligence tool. Uses local vision models via Ollama to describe, tag, rename, and search your photos. No cloud needed.

selfhosted unraid photo-management vision-model local-ai photo-renaming

Updated Apr 5, 2026
Python

guaardvark / guaardvark

The self-hosted AI workstation. Autonomous screen agents, 3-tier neural routing, parallel agent swarms, video generation, 4K/8K upscaling, RAG, voice interface, 57-tool execution engine — all running locally on your hardware.

Updated Apr 7, 2026
Python

Varun-Patkar / ChromePilot

AI-powered browser automation agent using a dual-LLM architecture. The orchestrator (qwen3-vl-32k) creates execution plans from screenshots, while the executor (llama3.1:8b) translates steps into browser actions using an accessibility tree for reliable element selection. Local, private, powered by Ollama.

javascript chrome-extension markdown streaming web-scraping html-parsing browser-extension web-automation conversational-ai privacy-focused vision-model ai-assistant local-ai ollama qwen3-vl

Updated Dec 13, 2025
JavaScript

i-evi / evMLP

evMLP: An Efficient Event-Driven MLP Architecture for Vision

computer-vision backbone mlp-classifier mlp-networks vision-model

Updated Nov 25, 2025
Python

gabrimatic / eyra

Real-time AI screen analysis from the terminal. Local inference, voice interaction, adaptive model routing.

python macos cli privacy ai computer-vision voice spacy screen-capture vision-model ollama google-gemini local-whisper

Updated Mar 14, 2026
Python

yorha2b-lab / auto-crud-copilot

AI-powered, fully automatic CRUD code generator for front-end projects (React + Antd), built on a large vision model 🚀

react cli crud ai code-generator antd umi copilot low-code vision-model

Updated Apr 9, 2026
JavaScript

ASCII125 / aiyer-object-viewer

Aiyer is an image analysis and interpretation tool that integrates with large language models (LLMs) to process, classify, and extract information from visual data. Its purpose is to standardize outputs, enabling structured, consistent responses that are easy to integrate with other systems.

python library groq vision-model llm ollama

Updated Apr 9, 2026
Python

Shehjad2019 / ai-nutrition-vision

AI Nutrition Vision analyzes food images using OpenAI Vision to detect food items and produce detailed nutrition insights (calories, protein, fat, serving size, etc.) with a clean Streamlit UI.

python machine-learning ai deep-learning openai food-recognition streamlit vision-model calorie-estimation llm gpt-4o image-recognitiion ai-nutrition

Updated Dec 14, 2025
Python

JordanmFrancis / biotracker

iOS app that ingests bloodwork photos from multiple providers and uses AI to extract and trend lab values over time.

swift ios ai health swiftui vision-model

Updated Apr 8, 2026
Swift

IRedDragonICY / resonote

Next-gen AI Optical Music Recognition (OMR) platform. Convert sheet music images into playable ABC notation instantly using Google Gemini 3 Pro Vision. Built with React 19, TypeScript, and Tailwind.

react typescript artificial-intelligence omr sheet-music abc-notation abcjs optical-music-recognition gemini-api vision-model generative-ai google-gemini

Updated Mar 3, 2026
TypeScript

Rakshath66 / ClipFindr

🔍 A CLIP-powered image similarity finder built with Streamlit — upload a query image and find the most visually similar matches from a gallery using deep visual embeddings.

Updated Jul 27, 2025
Python
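
The description above mentions ranking gallery images by similarity of their visual embeddings. A minimal sketch of that ranking step, assuming the embeddings have already been computed and that cosine similarity is the metric (the listing does not say which metric ClipFindr uses); the names `cosine_similarity` and `rank_gallery` are hypothetical:

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery, top_k=3):
    """Return up to top_k (name, score) pairs, most similar first.

    `gallery` maps an image name to its embedding vector. In a real
    pipeline the vectors would come from an image encoder such as CLIP;
    here they are plain lists supplied by the caller (an assumption).
    """
    scored = [(name, cosine_similarity(query, vec)) for name, vec in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

With toy two-dimensional vectors, `rank_gallery([1, 0], {"a": [1, 0], "b": [0, 1], "c": [1, 1]}, top_k=2)` places `"a"` first and `"c"` second.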

mishafyi / hot-dog-or-not

Compare how vision models reason about images — not just their accuracy scores

python machine-learning typescript ai computer-vision nextjs fastapi vision-model openrouter llm-benchmark nvidia-nemotron

Updated Mar 20, 2026
TypeScript

dvdsgdgdg / resonote

🎶 Transform sheet music into interactive digital content with Resonote, leveraging advanced Optical Music Recognition for seamless musical score analysis.

react redux typescript webpack styled-components omr sheet-music abc-notation abcjs optical-music-recognition gemini-api vision-model openai-whisper generative-ai openai-chatgpt google-gemini

Updated Apr 9, 2026
TypeScript

amitrathiesh / screenshot-to-base64-mcp

MCP server that converts images to base64 data URIs for LMStudio compatibility

screenshot base64 mcp vscode base64-encoding vision-model lmstudio mcp-server

Updated Dec 25, 2025
JavaScript
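
For readers unfamiliar with the format named in the description above, a base64 data URI embeds the image bytes directly in a string of the form `data:<mime>;base64,<payload>`. A minimal sketch of that conversion, written in Python for illustration only (the listed repository is JavaScript, and the function name `to_data_uri` is hypothetical, not taken from it):

```python
import base64


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI.

    The caller supplies the MIME type; this sketch does not inspect
    the bytes to detect the image format.
    """
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

In practice the bytes would be read from a screenshot file, for example `to_data_uri(open("shot.png", "rb").read())`, where `shot.png` is a placeholder path.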

gspain89 / crewai-hr-onboarding-cua-agent

🚀 Fully automated HR onboarding with CrewAI agents + vision-based CUA browser automation | MS-FARA 7B + Playwright

multi-agent browser-automation cua vision-model ai-automation crewai computer-use multi-agent-orchestration

Updated Jan 12, 2026
Python

Ashwathama2024 / manual-diagnostic-ai

Offline AI diagnostic assistant for marine/industrial equipment — Upload PDF manuals, ask questions, get engineering-grade answers with citations. Vision-powered diagram understanding. 100% local: Ollama LLM + ChromaDB + Streamlit.

ai predictive-maintenance rag pdf-processing streamlit vision-model marine-engineering chromadb local-llm ollama offline-ai diagnostic-assistant

Updated Apr 7, 2026
Python
