🎯 Meeting → Storyboard

An end-to-end web application that:

Transcribes meeting audio/video using Whisper.cpp.
Summarizes into concise bullet points via Ollama.
Generates visual storyboards using Stable Diffusion.
Displays results in a modern React-based UI.

📌 Features

🎤 Audio/Video Upload or Record via Mic
✍ Automatic Transcription (Whisper.cpp)
📑 Summarization into bullet points (Ollama LLM)
🖼 Image Generation for each bullet point (Stable Diffusion)
💻 Modern Frontend built with React + Vite
⚡ FastAPI Backend for processing

🏗 Architecture

         ┌──────────────┐
         │   Frontend   │ (React + Vite)
         │              │
         └──────┬───────┘
                │ Upload Audio/Video
                ▼
         ┌──────────────┐
         │   Backend    │ (FastAPI)
         │              │
         ├──────┬───────┤
         │      │       │
  Transcription Summarization
   (Whisper.cpp)   (Ollama LLM)
         │              │
         └──────┬───────┘
                ▼
         ┌──────────────┐
         │ Image Gen    │ (Stable Diffusion)
         └──────┬───────┘
                ▼
         ┌──────────────┐
         │  Storyboard  │
         │  Display UI  │
         └──────────────┘

🚀 Setup Instructions

# Clone the repository
git clone https://github.com/kudos07/EchoFrames.git
cd EchoFrames

# -------------------
# Backend Setup
# -------------------
cd backend
pip install -r requirements.txt

# Run the backend
uvicorn main:app --reload &

# -------------------
# Frontend Setup
# -------------------
cd ../frontend
npm install

# Run the frontend
npm run dev

Open the app at: http://localhost:5173

🔍 How It Works

Upload / Record Audio
- Drag & drop audio/video file or record via mic.
Transcription
- Whisper.cpp converts speech to text.
Summarization
- Ollama processes transcript into concise bullet points.
Image Generation
- Stable Diffusion creates visuals for each bullet point.
Storyboard Display
- React frontend shows images + text in a clean layout.

🎨 Customization & Image Quality Tips

To improve the generated images and make them more relevant to your audio content:

1️⃣ Provide Clear, Context-Rich Audio

Use high-quality recordings with minimal background noise.
Ensure each speaker talks clearly and stays on-topic.
Meetings with structured discussions produce more accurate summaries and visuals.

2️⃣ Adjust Image Generation Hyperparameters

In cli_storyboard.py (or wherever image generation is triggered), experiment with:

num_inference_steps – Higher values (e.g., 30–50) improve detail but increase generation time.
guidance_scale – Controls adherence to the prompt.
- Lower (~5) = more creative, less literal.
- Higher (~8–12) = more literal, closer to summary text.
Image size – Default is 512x512; increasing resolution (e.g., 768x768) can improve quality, but requires more GPU memory.

3️⃣ Prompt Engineering for Better Context

You can prepend extra context to the bullet points before sending them to the image generation model.
Example : "A professional corporate meeting scene: " + bullet_point

This improves thematic consistency.

4️⃣ Enable Full Mode vs Preview Mode

Preview Mode: Only top 3 bullet points are processed — faster.
Full Mode: All bullet points processed for a richer storyboard.

💡 Pro Tip: If the generated images feel too generic, try editing the summarization prompt in the backend to produce more vivid, descriptive bullet points.

📦 Tech Stack

Frontend: React, Vite, TailwindCSS
Backend: FastAPI, Python
Speech-to-Text: Whisper.cpp
LLM Summarization: Ollama
Image Generation: Stable Diffusion
Package Manager: npm, pip

📸 Sample Output

📝 License

This project is licensed under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
backend		backend
frontend		frontend
output_shots		output_shots
static/images		static/images
whispercpp/Release		whispercpp/Release
.gitattributes		.gitattributes
.gitignore		.gitignore
README.MD		README.MD
cli_storyboard.py		cli_storyboard.py
file.py		file.py
test_sdxl.py		test_sdxl.py
test_sdxlfast.py		test_sdxlfast.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🎯 Meeting → Storyboard

📌 Features

🏗 Architecture

🚀 Setup Instructions

🔍 How It Works

🎨 Customization & Image Quality Tips

1️⃣ Provide Clear, Context-Rich Audio

2️⃣ Adjust Image Generation Hyperparameters

3️⃣ Prompt Engineering for Better Context

4️⃣ Enable Full Mode vs Preview Mode

📦 Tech Stack

📸 Sample Output

📝 License

About

Uh oh!

Releases

Packages

Languages

kudos07/EchoFrames

Folders and files

Latest commit

History

Repository files navigation

🎯 Meeting → Storyboard

📌 Features

🏗 Architecture

🚀 Setup Instructions

🔍 How It Works

🎨 Customization & Image Quality Tips

1️⃣ Provide Clear, Context-Rich Audio

2️⃣ Adjust Image Generation Hyperparameters

3️⃣ Prompt Engineering for Better Context

4️⃣ Enable Full Mode vs Preview Mode

📦 Tech Stack

📸 Sample Output

📝 License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages