VoxPilot

VoxPilot is an AI-powered voice assistant for YouTube video analysis. Paste a YouTube URL to get an AI-generated summary, then interact with the content using natural voice commands. The application uses Google Gemini for content analysis and ElevenLabs for text-to-speech responses.

Features

Video Analysis

- Paste any YouTube URL to analyze the video content
- Automatically extracts transcripts when available for accurate analysis
- Falls back to metadata-based inference when transcripts are unavailable
- Displays confidence badges (Full or Inferred) based on data source
- Generates key takeaways, abstracts, and structured summaries
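
The URL handling and confidence badge described above could be sketched roughly as follows. The helper names (`extractVideoId`, `confidenceBadge`) and the set of URL shapes covered are illustrative assumptions, not VoxPilot's actual implementation.

```typescript
// Hypothetical helpers: names and supported URL shapes are assumptions.
type ConfidenceBadge = "Full" | "Inferred";

// Pull the 11-character video ID out of common YouTube URL shapes.
function extractVideoId(url: string): string | null {
  const patterns: RegExp[] = [
    /[?&]v=([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/, // youtube.com/watch?v=ID
    /youtu\.be\/([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/, // youtu.be/ID
    /youtube\.com\/(?:embed|shorts)\/([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/, // embed and shorts
  ];
  for (const pattern of patterns) {
    const id = url.match(pattern)?.[1];
    if (id) return id;
  }
  return null; // not a recognizable YouTube URL
}

// "Full" when a non-empty transcript was found, "Inferred" when only metadata was available.
function confidenceBadge(transcript: string | null): ConfidenceBadge {
  return transcript !== null && transcript.trim().length > 0 ? "Full" : "Inferred";
}
```

Rejecting URLs that yield no ID before calling the AI model avoids spending an API request on input that cannot be analyzed.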

Voice Control

- Full voice command support for hands-free operation
- Supported voice commands:
  - "Summarize this video" or "Analyze this" to process a pasted URL
  - "Save" or "Save this video" to bookmark the current video
  - "Delete" to remove a video from your library (with voice confirmation)
  - "Read the summary" to hear the summary spoken aloud
  - "How many videos do I have?" to get a count of saved videos
  - "Switch to dark mode" or "Switch to light mode" for theme control
  - "Play" or "Watch" to open the video on YouTube
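
The "Delete" command asks for voice confirmation before removing anything. One way to model that is a small two-state flow, sketched here. The state names, accepted confirmation phrases, and reply strings are all assumptions made for illustration.

```typescript
// Hypothetical confirmation flow: states, phrases, and replies are assumptions.
type DeleteState = "idle" | "awaitingConfirmation";

interface DeleteStep {
  state: DeleteState; // state to use for the next utterance
  shouldDelete: boolean; // true only once the user has confirmed
  reply: string; // text that could be spoken back to the user
}

function stepDeleteFlow(state: DeleteState, utterance: string): DeleteStep {
  const heard = utterance.trim().toLowerCase();
  if (state === "idle") {
    if (/\bdelete\b/.test(heard)) {
      return {
        state: "awaitingConfirmation",
        shouldDelete: false,
        reply: "Delete this video? Say yes or no.",
      };
    }
    return { state: "idle", shouldDelete: false, reply: "" };
  }
  // Awaiting confirmation: only an explicit "yes" or "confirm" deletes.
  if (/\b(yes|confirm)\b/.test(heard)) {
    return { state: "idle", shouldDelete: true, reply: "Video deleted." };
  }
  return { state: "idle", shouldDelete: false, reply: "Okay, keeping the video." };
}
```

Treating anything other than an explicit confirmation as a cancel keeps a misheard phrase from deleting a saved video.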

Video Library

- Save analyzed videos to your personal library
- View and manage saved videos in the sidebar
- One-click loading of previously saved summaries
- Voice commands for library management

Audio Responses

- Natural text-to-speech powered by ElevenLabs
- Voice feedback for confirmations, summaries, and answers
- Confidence-adjusted voice tone for follow-up question responses
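
As a rough illustration of the text-to-speech step, the sketch here builds a request for ElevenLabs' HTTP text-to-speech endpoint. The helper name `buildTtsRequest` and the `TtsRequest` shape are assumptions, and the voice ID is whatever voice you pick in your ElevenLabs account. How VoxPilot itself calls the API is not shown in this README.

```typescript
// Sketch only: describes the HTTP request, it does not send it.
interface TtsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildTtsRequest(text: string, voiceId: string, apiKey: string): TtsRequest {
  return {
    // The voice ID is part of the path.
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey, // ElevenLabs authenticates with this header
      "Content-Type": "application/json",
      Accept: "audio/mpeg",
    },
    body: JSON.stringify({ text }),
  };
}
```

On the server, the result could be passed to `fetch(url, { method, headers, body })` and the audio bytes returned to the client. Sending the request from server-side code keeps ELEVEN_LABS_API_KEY out of the browser.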

Tech Stack

Technology	Purpose
Next.js 14	React framework with App Router and Server Actions
TypeScript	Type-safe development
Tailwind CSS	Utility-first styling
Supabase	Authentication and database for saved videos
Google Gemini	AI-powered video summarization and Q&A
ElevenLabs	Natural text-to-speech for voice responses
Framer Motion	Smooth animations and transitions
Radix UI	Accessible component primitives
Web Speech API	Browser-based voice recognition

Prerequisites

Before running VoxPilot locally, you will need:

- Node.js (v18.17 or higher, the minimum supported by Next.js 14)
- pnpm package manager
- Supabase account with a project set up
- Google AI API key for Gemini access
- ElevenLabs API key for text-to-speech

Local Development

1. Clone the Repository

git clone https://github.com/your-username/VoxPilot.git
cd VoxPilot

2. Install Dependencies

pnpm install

3. Configure Environment Variables

Copy the example environment file and fill in your API keys:

cp .env.example .env.local

Edit .env.local with your credentials:

# Supabase Configuration
NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_supabase_anon_key_here

# Google AI API Key (for Gemini)
GOOGLE_API_KEY=your_google_api_key_here

# ElevenLabs API Key (for Text-to-Speech)
ELEVEN_LABS_API_KEY=your_elevenlabs_api_key_here
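
A missing key usually only shows up at the first API call, so a startup check can save some debugging time. This is a minimal sketch: the variable names come from the file above, while the `missingEnvVars` helper is hypothetical and not part of the repository.

```typescript
// Variable names are taken from .env.example; the helper itself is an assumption.
const REQUIRED_ENV_VARS = [
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "GOOGLE_API_KEY",
  "ELEVEN_LABS_API_KEY",
] as const;

// Return the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}
```

Calling `missingEnvVars(process.env)` on the server and throwing when the result is non-empty reports a misconfigured .env.local immediately.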

4. Set Up Supabase Database

Create the following table in your Supabase project:

create table saved_content (
  id uuid default gen_random_uuid() primary key,
  user_id uuid references auth.users(id) on delete cascade,
  url text not null,
  video_id text not null,
  title text not null,
  summary_json jsonb not null,
  thumbnail_url text,
  created_at timestamp with time zone default now()
);

-- Enable Row Level Security
alter table saved_content enable row level security;

-- Create policy for users to manage their own content
create policy "Users can manage own content" on saved_content
  for all using (auth.uid() = user_id);
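
On the application side, a typed payload that mirrors these columns helps keep inserts in line with the schema. The interface names and the `toSavedContentInsert` helper are assumptions for illustration. The `id` and `created_at` columns are omitted because the table fills them in by default.

```typescript
// Mirrors the saved_content columns; helper and input shape are hypothetical.
interface SavedContentInsert {
  user_id: string;
  url: string;
  video_id: string;
  title: string;
  summary_json: unknown; // stored as jsonb
  thumbnail_url: string | null; // nullable column
}

interface AnalyzedVideo {
  url: string;
  videoId: string;
  title: string;
  summary: unknown;
  thumbnailUrl?: string;
}

// Map an analyzed video to the snake_case row shape the table expects.
function toSavedContentInsert(userId: string, video: AnalyzedVideo): SavedContentInsert {
  return {
    user_id: userId,
    url: video.url,
    video_id: video.videoId,
    title: video.title,
    summary_json: video.summary,
    thumbnail_url: video.thumbnailUrl ?? null,
  };
}
```

With a supabase-js client, the payload could be passed to `supabase.from("saved_content").insert(payload)`. The row level security policy then only accepts rows whose `user_id` matches the signed-in user.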

5. Run the Development Server

pnpm dev

Open http://localhost:3000 in your browser.

Project Structure

VoxPilot/
├── app/
│   ├── actions.ts          # Server actions for AI processing
│   ├── dashboard/          # Main dashboard page
│   ├── login/              # Login page
│   ├── signup/             # Signup page
│   ├── globals.css         # Global styles
│   ├── layout.tsx          # Root layout
│   └── page.tsx            # Landing page
├── components/
│   ├── ui/                 # Reusable UI components
│   └── icons.tsx           # Icon components
├── lib/
│   ├── supabase/           # Supabase client configuration
│   └── utils.ts            # Utility functions
├── types/                  # TypeScript type definitions
└── middleware.ts           # Auth middleware

Usage

1. Sign up or log in to access the dashboard
2. Paste a YouTube URL in the input field
3. Click Analyze or say "Summarize this video"
4. View the AI-generated summary with key takeaways
5. Save videos to your library for later reference
6. Use voice commands for hands-free interaction

Voice Commands Reference

Command	Action
"Summarize this" / "Analyze this"	Analyze the video URL in the input field
"Save" / "Save this"	Save current video to library
"Delete"	Delete current video (requires confirmation)
"Read the summary"	Read summary aloud
"Read the answer"	Read the last Q&A answer aloud
"How many videos"	Count saved videos
"Dark mode" / "Light mode"	Switch theme
"Hello" / "Hi"	Greeting response
Ask any question	Get contextual answers about the video
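
The table can be read as a mapping from recognized speech to an intent, with anything unmatched treated as a question about the video. This is a minimal sketch of such a matcher. The intent names and regular expressions are assumptions derived from the table, not the project's actual matching logic.

```typescript
// Hypothetical intent matcher: intent names and patterns are assumptions.
type VoiceIntent =
  | "analyze" | "save" | "delete" | "readSummary" | "readAnswer"
  | "countVideos" | "darkMode" | "lightMode" | "greeting" | "question";

const INTENT_PATTERNS: ReadonlyArray<[RegExp, VoiceIntent]> = [
  [/^(summarize|analyze) this( video)?$/, "analyze"],
  [/^save( this( video)?)?$/, "save"],
  [/^delete$/, "delete"],
  [/^read the summary$/, "readSummary"],
  [/^read the answer$/, "readAnswer"],
  [/^how many videos\b/, "countVideos"],
  [/\bdark mode$/, "darkMode"],
  [/\blight mode$/, "lightMode"],
  [/^(hello|hi)$/, "greeting"],
];

function matchIntent(transcript: string): VoiceIntent {
  // Normalize case and drop trailing punctuation that speech recognition may add.
  const heard = transcript.trim().toLowerCase().replace(/[.!?]+$/, "");
  for (const [pattern, intent] of INTENT_PATTERNS) {
    if (pattern.test(heard)) return intent;
  }
  return "question"; // anything else is treated as a question about the video
}
```

Anchoring the short commands to the whole utterance keeps a question such as "How do I save money?" from being mistaken for the "Save" command.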

API Keys

Google Gemini

1. Visit Google AI Studio
2. Create or select a project
3. Generate an API key
4. Add to GOOGLE_API_KEY in .env.local

ElevenLabs

1. Visit ElevenLabs
2. Sign up and go to Profile Settings
3. Copy your API key
4. Add to ELEVEN_LABS_API_KEY in .env.local

Supabase

1. Visit Supabase Dashboard
2. Create a new project
3. Go to Project Settings > API
4. Copy the project URL and anon key
5. Add them to NEXT_PUBLIC_SUPABASE_URL and NEXT_PUBLIC_SUPABASE_ANON_KEY in .env.local

License

This project is licensed under the MIT License.

Repository: samar-703/VoxPilot
