Add multimodal input to the system

> Streamlit chat doesn't support image upload.
> We need to find an alternative for the UI to enable multimodal input.

Enhance our existing LangChain-based knowledge chatbot to accept multimodal input (text and images) in addition to the current text-only input. Users will be able to attach images to their messages instead of being limited to text.
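As a rough sketch of what a multimodal message could look like on the LangChain side, the helper below packs a text prompt and an image into a list of content blocks. The function name `build_multimodal_content` and the `MIME_TYPES` table are hypothetical, and the `image_url` block with a base64 data URL follows the OpenAI-style convention that many LangChain chat model integrations accept; whether it works depends on the model we end up using.

```python
import base64

# Hypothetical lookup table; extend it if more formats are needed.
MIME_TYPES = {
    "png": "image/png",
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "webp": "image/webp",
    "gif": "image/gif",
}


def build_multimodal_content(text: str, image_bytes: bytes, image_format: str) -> list[dict]:
    """Return content blocks holding a text prompt and one inline image."""
    # Raises KeyError for formats missing from MIME_TYPES.
    mime = MIME_TYPES[image_format.lower()]
    # Encode the raw bytes so the image can travel inside a data URL.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return [
        {"type": "text", "text": text},
        {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
    ]
```

The returned list can be passed as `HumanMessage(content=...)` from `langchain_core.messages`, assuming the underlying chat model is vision-capable.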

## Current State
- ✅ Text input processing via LangChain
- ✅ Text-based responses
- ✅ Document upload support
- ❌ Image input support

## Must Have
- Image Input: Support common formats (PNG, JPG, JPEG, WebP, GIF)
- Input Validation: File size limits (images: 10 MB)
- Error Handling: Clear error messages for unsupported formats/sizes
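The three requirements above could be covered by a single validation step that runs before an image reaches the chatbot. The sketch below is one possible shape, not a decided design: `validate_image` and `ImageValidationError` are hypothetical names, the check looks only at the file extension (it does not inspect the file contents), and 10 MB is assumed to mean 10 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Formats and size limit from the Must Have list.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: 10 MB in binary megabytes


class ImageValidationError(ValueError):
    """Raised when an uploaded image fails validation (hypothetical name)."""


def validate_image(filename: str, data: bytes) -> None:
    """Raise ImageValidationError with a user-facing message if the upload is invalid."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        allowed = ", ".join(sorted(e.lstrip(".").upper() for e in ALLOWED_EXTENSIONS))
        raise ImageValidationError(
            f"Unsupported file type '{ext or 'none'}'. Supported formats: {allowed}."
        )
    if not data:
        raise ImageValidationError("The uploaded file is empty.")
    if len(data) > MAX_IMAGE_BYTES:
        size_mb = len(data) / (1024 * 1024)
        raise ImageValidationError(
            f"Image is {size_mb:.1f} MB; the limit is 10 MB."
        )
```

The UI layer can catch `ImageValidationError` and show `str(error)` directly to the user, which keeps the error messages in one place regardless of which UI framework replaces Streamlit chat.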

## Nice to Have
- Image OCR: Extract text from images for processing
- Batch Processing: Multiple file uploads simultaneously
- Progress Indicators: Upload/processing status feedback
