VoxPilot is an AI-powered voice assistant for YouTube video analysis. It allows users to paste YouTube URLs, get AI-generated summaries, and interact with the content using natural voice commands. The application leverages Google Gemini for intelligent content analysis and ElevenLabs for natural text-to-speech responses.
- Paste any YouTube URL to analyze the video content
- Automatically extracts transcripts when available for accurate analysis
- Falls back to metadata-based inference when transcripts are unavailable
- Displays confidence badges (Full or Inferred) based on data source
- Generates key takeaways, abstracts, and structured summaries
- Full voice command support for hands-free operation
- Supported voice commands:
  - "Summarize this video" or "Analyze this" to process a pasted URL
  - "Save" or "Save this video" to bookmark the current video
  - "Delete" to remove a video from your library (with voice confirmation)
  - "Read the summary" to hear the summary spoken aloud
  - "How many videos do I have?" to get a count of saved videos
  - "Switch to dark mode" or "Switch to light mode" for theme control
  - "Play" or "Watch" to open the video on YouTube
- Save analyzed videos to your personal library
- View and manage saved videos in the sidebar
- One-click loading of previously saved summaries
- Voice commands for library management
- Natural text-to-speech powered by ElevenLabs
- Voice feedback for confirmations, summaries, and answers
- Confidence-adjusted voice tone for follow-up question responses
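
The transcript-first behaviour with a metadata fallback, and the Full or Inferred badge that goes with it, could be sketched as follows. This is a minimal illustration rather than VoxPilot's actual code: `VideoData`, `AnalysisSource`, and `pickAnalysisSource` are hypothetical names, and the call to Gemini that would consume the selected text is left out.

```typescript
// Hypothetical types and names, for illustration only.
type Confidence = "Full" | "Inferred";

interface VideoData {
  title: string;
  description: string;
  transcript?: string; // absent or blank when no transcript is available
}

interface AnalysisSource {
  confidence: Confidence; // drives the badge shown next to the summary
  text: string;           // what would be handed to the summarizer
}

function pickAnalysisSource(video: VideoData): AnalysisSource {
  const transcript = video.transcript?.trim();
  if (transcript) {
    // A usable transcript exists, so the analysis rests on the full content.
    return { confidence: "Full", text: transcript };
  }
  // Fallback: only metadata is available, so the result is an inference.
  const metadata = `${video.title}\n\n${video.description}`.trim();
  return { confidence: "Inferred", text: metadata };
}
```

A blank or whitespace-only transcript is treated the same as a missing one, so the badge never claims Full confidence for an empty source.
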
| Technology | Purpose |
|---|---|
| Next.js 14 | React framework with App Router and Server Actions |
| TypeScript | Type-safe development |
| Tailwind CSS | Utility-first styling |
| Supabase | Authentication and database for saved videos |
| Google Gemini | AI-powered video summarization and Q&A |
| ElevenLabs | Natural text-to-speech for voice responses |
| Framer Motion | Smooth animations and transitions |
| Radix UI | Accessible component primitives |
| Web Speech API | Browser-based voice recognition |
Before running VoxPilot locally, you will need:
- Node.js (v18.17 or higher, the minimum supported by Next.js 14)
- pnpm package manager
- Supabase account with a project set up
- Google AI API key for Gemini access
- ElevenLabs API key for text-to-speech
git clone https://github.com/your-username/VoxPilot.git
cd VoxPilot
pnpm install

Copy the example environment file and fill in your API keys:

cp .env.example .env.local

Edit .env.local with your credentials:
# Supabase Configuration
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_supabase_anon_key_here
# Google AI API Key (for Gemini)
GOOGLE_API_KEY=your_google_api_key_here
# ElevenLabs API Key (for Text-to-Speech)
ELEVEN_LABS_API_KEY=your_elevenlabs_api_key_here

Create the following table in your Supabase project:
create table saved_content (
id uuid default gen_random_uuid() primary key,
user_id uuid references auth.users(id) on delete cascade,
url text not null,
video_id text not null,
title text not null,
summary_json jsonb not null,
thumbnail_url text,
created_at timestamp with time zone default now()
);
-- Enable Row Level Security
alter table saved_content enable row level security;
-- Create policy for users to manage their own content
create policy "Users can manage own content" on saved_content
for all using (auth.uid() = user_id);

Start the development server:

pnpm dev

Open http://localhost:3000 in your browser.
VoxPilot/
├── app/
│ ├── actions.ts # Server actions for AI processing
│ ├── dashboard/ # Main dashboard page
│ ├── login/ # Login page
│ ├── signup/ # Signup page
│ ├── globals.css # Global styles
│ ├── layout.tsx # Root layout
│ └── page.tsx # Landing page
├── components/
│ ├── ui/ # Reusable UI components
│ └── icons.tsx # Icon components
├── lib/
│ ├── supabase/ # Supabase client configuration
│ └── utils.ts # Utility functions
├── types/ # TypeScript type definitions
└── middleware.ts # Auth middleware
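
A type under `types/` might mirror the columns of the `saved_content` table from the database setup. The sketch below is an assumption about how that could look, not the project's actual definitions: the `SavedContent` interface and the `isSavedContent` guard are hypothetical names, and `summary_json` is left as `unknown` because its shape is not specified here.

```typescript
// Hypothetical row type mirroring the saved_content columns.
interface SavedContent {
  id: string;                   // uuid primary key
  user_id: string | null;       // uuid; the column has no NOT NULL constraint
  url: string;
  video_id: string;
  title: string;
  summary_json: unknown;        // jsonb; shape depends on the summarizer output
  thumbnail_url: string | null; // optional column
  created_at: string | null;    // timestamptz, a string once serialized as JSON
}

function isStringOrNull(value: unknown): value is string | null {
  return value === null || typeof value === "string";
}

// Runtime check for rows coming back from the database client.
function isSavedContent(row: unknown): row is SavedContent {
  if (typeof row !== "object" || row === null) return false;
  const r = row as Record<string, unknown>;
  const requiredStrings = ["id", "url", "video_id", "title"];
  return (
    requiredStrings.every((key) => typeof r[key] === "string") &&
    r["summary_json"] !== undefined &&
    r["summary_json"] !== null &&
    isStringOrNull(r["user_id"]) &&
    isStringOrNull(r["thumbnail_url"]) &&
    isStringOrNull(r["created_at"])
  );
}
```
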
- Sign up or log in to access the dashboard
- Paste a YouTube URL in the input field
- Click Analyze or say "Summarize this video"
- View the AI-generated summary with key takeaways
- Save videos to your library for later reference
- Use voice commands for hands-free interaction
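
Before a pasted URL can be analyzed, the video ID has to be pulled out of it (the `video_id` column stores it). One possible way to do that is sketched below. The function name `extractVideoId` is hypothetical, the list of recognized URL shapes is not exhaustive, and the 11-character ID check is an assumption based on how YouTube IDs look today.

```typescript
// Hypothetical helper: returns the video ID, or null if the input is not a
// recognized YouTube video URL.
function extractVideoId(input: string): string | null {
  let url: URL;
  try {
    url = new URL(input.trim());
  } catch {
    return null; // not a parseable absolute URL
  }

  // Normalize "www." and "m." prefixes so the host checks stay simple.
  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let candidate: string | null = null;

  if (host === "youtu.be") {
    // Short links: https://youtu.be/<id>
    candidate = url.pathname.split("/")[1] ?? null;
  } else if (host === "youtube.com") {
    if (url.pathname === "/watch") {
      // Standard links: https://www.youtube.com/watch?v=<id>
      candidate = url.searchParams.get("v");
    } else {
      // Embed and Shorts links: /embed/<id> or /shorts/<id>
      const match = url.pathname.match(/^\/(?:embed|shorts)\/([^/]+)/);
      candidate = match?.[1] ?? null;
    }
  }

  // Assumption: IDs are 11 characters drawn from letters, digits, "_" and "-".
  return candidate !== null && /^[\w-]{11}$/.test(candidate) ? candidate : null;
}
```

Returning `null` instead of throwing lets the caller show a plain "not a YouTube URL" message, spoken or on screen, without a try/catch.
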
| Command | Action |
|---|---|
| "Summarize this" / "Analyze this" | Analyze the video URL in the input field |
| "Save" / "Save this" | Save current video to library |
| "Delete" | Delete current video (requires confirmation) |
| "Read the summary" | Read summary aloud |
| "Read the answer" | Read the last Q&A answer aloud |
| "How many videos" | Count saved videos |
| "Dark mode" / "Light mode" | Switch theme |
| "Hello" / "Hi" | Greeting response |
| Ask any question | Get contextual answers about the video |
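
The command table can be read as a mapping from a recognized transcript to an action, with anything unmatched treated as a question about the video. The sketch below shows one way to express that with ordered keyword rules. It is an assumption about the approach, not the application's actual matcher: `VoiceAction`, `RULES`, and `matchVoiceCommand` are hypothetical names, and the input is assumed to be the text returned by the browser's speech recognition.

```typescript
// Hypothetical action names, one per row of the command table.
type VoiceAction =
  | "analyze" | "save" | "delete" | "readSummary" | "readAnswer"
  | "countVideos" | "darkMode" | "lightMode" | "greet" | "question";

// Rules are checked in order; the first match wins.
const RULES: Array<[RegExp, VoiceAction]> = [
  [/\b(summarize|analyze) this\b/, "analyze"],
  [/\bread the summary\b/, "readSummary"],
  [/\bread the answer\b/, "readAnswer"],
  [/\bhow many videos\b/, "countVideos"],
  [/\bdark mode\b/, "darkMode"],
  [/\blight mode\b/, "lightMode"],
  [/^save\b/, "save"],
  [/^delete\b/, "delete"],
  [/^(hello|hi)\b/, "greet"],
];

function matchVoiceCommand(transcript: string): VoiceAction {
  // Lowercase and drop punctuation so "Save this." and "save this" match alike.
  const text = transcript.toLowerCase().replace(/[^\w\s]/g, "").trim();
  for (const [pattern, action] of RULES) {
    if (pattern.test(text)) return action;
  }
  // Anything else is treated as a free-form question about the video.
  return "question";
}
```

Keyword rules like these are easy to follow but can misfire: a question that happens to contain "dark mode" would switch the theme. Anchoring more rules to the start of the utterance, as done for "save" and "delete", is one way to reduce that.
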
- Visit Google AI Studio
- Create or select a project
- Generate an API key
- Add it to GOOGLE_API_KEY in .env.local
- Visit ElevenLabs
- Sign up and go to Profile Settings
- Copy your API key
- Add it to ELEVEN_LABS_API_KEY in .env.local
- Visit Supabase Dashboard
- Create a new project
- Go to Project Settings > API
- Copy the URL and anon key
- Add them to .env.local as NEXT_PUBLIC_SUPABASE_URL and NEXT_PUBLIC_SUPABASE_ANON_KEY
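
Since the app depends on four keys in .env.local, a small startup check can catch a missing or still-placeholder value early. The following is a sketch under stated assumptions: `findMissingEnvVars` is a hypothetical helper, and the placeholder detection only recognizes the sample values from .env.example shown earlier.

```typescript
// The four variables listed in the example environment file.
const REQUIRED_ENV_VARS = [
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "GOOGLE_API_KEY",
  "ELEVEN_LABS_API_KEY",
] as const;

// Hypothetical helper: returns the names of variables that are unset, blank,
// or still hold a placeholder such as "your_google_api_key_here".
function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name]?.trim();
    if (!value) return true; // unset or blank
    // Assumption: placeholders look like the sample values in .env.example.
    return /^your_|your-project/.test(value);
  });
}
```

In a Next.js server context this could be called as `findMissingEnvVars(process.env)`, logging or throwing if the returned list is not empty.
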
This project is licensed under the MIT License.