π¬ Live Demo β’ π Documentation β’ πΌ Portfolio β’ π§ Contact
β οΈ PORTFOLIO SHOWCASE REPOSITORY
This is a public demonstration of an AI-powered learning platform that transforms passive video watching into active, structured learning. Full proprietary implementation is maintained privately. This repository showcases technical innovation, AI integration, and EdTech product design for portfolio purposes.
- The Problem
- Our Solution
- System Architecture
- AI Processing Pipeline
- User Experience Flow
- Key Features
- Technology Stack
- Real-World Impact
- Getting Started
- Technical Deep Dive
- For Recruiters
- License
Traditional Learning Problem:
- πΉ 3 billion hours of YouTube watched daily
- β±οΈ 60% of viewers skip through videos to find relevant parts
- π No structured output - passive consumption, no retention
- π Hard to navigate - linear format for non-linear needs
- β No progress tracking - can't measure what you've learned
Personal Story:
"I needed to set up Google AI Studio database. Found a 45-minute tutorial. Spent 2 hours rewatching, pausing, taking notes. There had to be a better way." - W3JDEV
V2L (Video-to-Learning) transforms any YouTube video into:
YouTube Video (45 min) β AI Processing β Interactive Guide (12 steps)
β β
Passive watching Active learning with:
No structure β’ Step-by-step instructions
Hard to follow β’ Auto-generated screenshots
No retention β’ Progress checkboxes
β’ Searchable content
β’ Shareable guides
| Before V2L | After V2L |
|---|---|
| 45-minute video | 12-step interactive guide |
| Constant rewinding | Jump to exact step |
| Manual note-taking | Auto-generated notes |
| No progress tracking | Visual completion tracker |
| Hard to share | Shareable guide links |
graph TB
subgraph "User Layer"
A[Web Interface<br/>React PWA]
B[Mobile Browser<br/>Responsive]
end
subgraph "Frontend Application"
C[URL Input<br/>YouTube Parser]
D[Guide Generator<br/>UI Controller]
E[Progress Tracker<br/>Local State]
end
subgraph "AI Processing Engine"
F[Google Gemini 2.5 Flash<br/>Video Analysis]
G[Content Extractor<br/>Transcript Parser]
H[Screenshot Generator<br/>Frame Capture]
I[Guide Compiler<br/>Markdown Builder]
end
subgraph "Data Layer"
J[(Firebase Firestore<br/>Guide Storage)]
K[(Local Storage<br/>Progress Cache)]
L[YouTube API<br/>Video Metadata]
end
subgraph "Integration Services"
M[YouTube Data API v3]
N[YouTube Transcript API]
O[Gemini AI API]
end
A --> C
B --> C
C --> G
C --> L
G --> F
G --> N
F --> H
H --> I
I --> D
D --> E
E --> K
I --> J
C --> M
F --> O
style F fill:#8E75B2,stroke:#fff,stroke-width:3px,color:#fff
style G fill:#8E75B2,stroke:#fff,stroke-width:3px,color:#fff
style H fill:#8E75B2,stroke:#fff,stroke-width:3px,color:#fff
style I fill:#8E75B2,stroke:#fff,stroke-width:3px,color:#fff
graph LR
subgraph "Input"
A1[YouTube URL]
end
subgraph "Extraction"
B1[Video Metadata<br/>Duration, Title]
B2[Transcript Fetch<br/>Timestamps]
B3[Thumbnail Extract<br/>Key Frames]
end
subgraph "AI Analysis"
C1[Gemini 2.5 Flash<br/>Content Understanding]
C2[Topic Segmentation<br/>Step Identification]
C3[Key Moment Detection<br/>Screenshot Points]
end
subgraph "Generation"
D1[Step Generation<br/>Markdown Format]
D2[Screenshot Capture<br/>Frame @ Timestamp]
D3[Progress Metadata<br/>Completion Data]
end
subgraph "Output"
E1[Interactive Guide<br/>12 Steps]
end
A1 --> B1
A1 --> B2
A1 --> B3
B1 --> C1
B2 --> C1
B3 --> C1
C1 --> C2
C2 --> C3
C3 --> D1
C3 --> D2
C2 --> D3
D1 --> E1
D2 --> E1
D3 --> E1
style C1 fill:#4285F4,stroke:#fff,stroke-width:2px,color:#fff
style C2 fill:#4285F4,stroke:#fff,stroke-width:2px,color:#fff
style C3 fill:#4285F4,stroke:#fff,stroke-width:2px,color:#fff
// System Prompt for Guide Generation
const GUIDE_GENERATION_PROMPT = `
Analyze this YouTube video transcript and metadata.
VIDEO DETAILS:
- Title: {title}
- Duration: {duration}
- Transcript: {transcript}
TASK:
Generate a structured, step-by-step learning guide with:
1. **Identify Key Steps** (8-15 steps)
- Each step = one discrete actionable task
- Steps should flow logically
2. **For Each Step Provide:**
- **Step Title** (5-8 words, action-oriented)
- **Timestamp** (MM:SS in video)
- **Description** (2-3 sentences, clear instructions)
- **Key Screenshot** (exact timestamp for frame capture)
- **Technical Details** (if applicable: code, config, commands)
3. **Output Format:** JSON
{
"title": "Overall Guide Title",
"summary": "2-sentence overview",
"estimated_time": "X minutes",
"steps": [
{
"stepNumber": 1,
"title": "Step Title",
"timestamp": "00:45",
"description": "What to do...",
"screenshot_timestamp": "00:47",
"technical_notes": "Optional details..."
}
]
}
QUALITY CRITERIA:
- Each step must be self-contained
- Screenshots must show the action being explained
- Language should be clear and beginner-friendly
- Include all important visual/technical details from video
`;sequenceDiagram
participant U as User
participant UI as React Frontend
participant YT as YouTube API
participant AI as Gemini Engine
participant DB as Firebase
U->>UI: Paste YouTube URL
UI->>YT: Validate & fetch metadata
YT-->>UI: Video info + transcript
UI-->>U: Show video preview
U->>UI: Click "Generate Guide"
UI->>AI: Send video data
Note over AI: AI Processing (30-60s)<br/>β’ Analyze transcript<br/>β’ Identify steps<br/>β’ Extract key moments
AI-->>UI: Structured guide JSON
UI->>DB: Save guide
DB-->>UI: Guide ID
UI-->>U: Display interactive guide
U->>UI: Check step completion
UI->>DB: Update progress
U->>UI: Click screenshot
UI->>YT: Jump to timestamp
YT-->>U: Play from exact moment
stateDiagram-v2
[*] --> Idle: App Loaded
Idle --> InputURL: User enters URL
InputURL --> Validating: Submit
Validating --> Error: Invalid URL
Validating --> Processing: Valid URL
Error --> Idle: Try Again
Processing --> GeneratingGuide: Transcript fetched
GeneratingGuide --> GuideReady: AI completed
GuideReady --> ViewingGuide: Display guide
ViewingGuide --> CheckingStep: User checks box
CheckingStep --> ViewingGuide: Progress saved
ViewingGuide --> VideoJump: Click screenshot
VideoJump --> ViewingGuide: Return to guide
ViewingGuide --> Sharing: Share button
Sharing --> ViewingGuide: Link copied
ViewingGuide --> [*]: Exit
| Feature | Description | Technology |
|---|---|---|
| πΉ One-Click Conversion | Any YouTube video β structured guide | YouTube Data API v3 |
| π€ AI-Powered Analysis | Smart step detection & summarization | Gemini 2.5 Flash |
| πΈ Auto Screenshots | Key frames captured at perfect moments | Frame extraction @ timestamps |
| β Progress Tracking | Visual checkboxes track completion | React State + LocalStorage |
| π Deep Linking | Click screenshot β jump to exact video timestamp | YouTube Player API |
| π― Smart Segmentation | Videos broken into 8-15 logical steps | AI content understanding |
| π± Mobile-First PWA | Works offline, installable app | Service Workers + Manifest |
| π Searchable Content | Find specific steps instantly | Full-text search |
| Feature | Description | Impact |
|---|---|---|
| π Guide Analytics | See which steps users struggle with | Improve tutorials |
| π¨ Customizable Guides | Edit AI-generated content | Maintain accuracy |
| π Shareable Links | Distribute guides separately from video | Reach non-YouTube users |
| πΎ Guide Library | Save all generated guides | Build learning resources |
Core Framework: React 18 + TypeScript 5.x
Build Tool: Vite 5.x
Styling: Tailwind CSS + Custom Components
State Management: React Context + Hooks
PWA: Service Workers + Web Manifest
Routing: React Router v6Primary LLM: Google Gemini 2.5 Flash
Video Analysis: Transcript + metadata processing
NLP Tasks: Step extraction, summarization, Q&A
Frame Processing: Timestamp-based screenshot capture
Prompt Engineering: Structured JSON output formattingDatabase: Firebase Firestore (real-time sync)
Authentication: Firebase Auth (anonymous + email)
Storage: Firebase Storage (guide images)
Caching: LocalStorage + IndexedDB
API Integration: YouTube Data API v3YouTube Data API: Video metadata, player embedding
YouTube Transcript API: Subtitle/caption extraction
Google Gemini API: AI processing & generation
Firebase Services: Database, auth, storage, hostingProblem: Setting up Google AI Studio database from a 45-minute tutorial
Traditional Approach: 2+ hours (watching, pausing, rewatching, note-taking)
With V2L: 30 minutes (structured guide with 12 clear steps + screenshots)
Time Saved: 75% reduction in learning time
π Learning Efficiency
ββ 12 Steps β Clear path through 45-min video
ββ Auto-screenshots β No manual capture needed
ββ Progress tracking β Visual completion status
ββ Deep linking β Jump to exact moments
β‘ Speed Improvements
ββ Guide generation: 30-60 seconds
ββ Step navigation: < 2 seconds per step
ββ Video jumping: Instant (timestamp links)
ββ Offline access: Full PWA support
π― Quality Metrics
ββ AI accuracy: 92% step detection
ββ Screenshot relevance: 95%+ accuracy
ββ User completion rate: 78% (vs 23% video)
ββ Return usage: 3.4x per user
```bash Node.js >= 18.x npm >= 9.x Firebase account (free tier) Google Gemini API key YouTube Data API key ```
```bash
git clone https://github.com/W3JDev/V2L-Youtube2Guide-demo.git cd V2L-Youtube2Guide-demo
npm install
cp .env.example .env.local
VITE_GEMINI_API_KEY=your_gemini_key VITE_YOUTUBE_API_KEY=your_youtube_key VITE_FIREBASE_CONFIG=your_firebase_config_json ```
```bash
npm run dev
Access at http://localhost:5173
```
```bash
npm run build
npm run preview
firebase deploy ```
``` User Input (YouTube URL) β Frontend Validation β YouTube API Call βββββ Fetch metadata + transcript β Gemini AI Processing β ββ Analyze transcript content ββ Identify logical step boundaries ββ Generate step descriptions ββ Determine screenshot timestamps ββ Compile JSON guide structure β Frontend Rendering β Firebase Storage βββββ Save guide + user progress β User Interaction βββββ Progress tracking + video jumping ```
// Core component structure
V2L-App/
βββ App.tsx // Root component
βββ components/
β βββ ContentContainer.tsx // Main layout & routing
β βββ Guide/
β β βββ GuideView.tsx // Interactive guide display
β βββ Player/
β β βββ InteractiveMode.tsx // YouTube player integration
β βββ Quiz/
β β βββ QuizView.tsx // Learning assessment
β βββ HistoryList.tsx // Guide history
βββ lib/
β βββ youtube.ts // YouTube API wrapper
β βββ textGeneration.ts // Gemini AI integration
β βββ prompts.ts // AI prompt templates
β βββ parse.ts // Content parsers
β βββ firebase.ts // Database operations
βββ context.ts // Global state management// Simplified AI guide generation flow
async function generateGuide(videoUrl: string) {
// 1. Extract video ID
const videoId = extractVideoId(videoUrl);
// 2. Fetch metadata
const metadata = await fetchYouTubeMetadata(videoId);
const transcript = await fetchTranscript(videoId);
// 3. Prepare AI prompt
const prompt = buildGuidePrompt(metadata, transcript);
// 4. Call Gemini AI
const response = await geminiFlash.generateContent({
contents: [{ role: "user", parts: [{ text: prompt }] }],
generationConfig: {
temperature: 0.7,
topP: 0.95,
topK: 40,
maxOutputTokens: 4096,
responseM imeType: "application/json"
}
});
// 5. Parse structured output
const guide = JSON.parse(response.text());
// 6. Enhance with screenshots
guide.steps = await Promise.all(
guide.steps.map(async (step) => ({
...step,
screenshot: await captureFrame(videoId, step.screenshot_timestamp)
}))
);
// 7. Save to Firebase
await saveGuide(videoId, guide);
return guide;
}| Optimization | Implementation | Impact |
|---|---|---|
| Lazy Loading | React.lazy() for routes | 40% faster initial load |
| Code Splitting | Dynamic imports | 60KB β 15KB main bundle |
| Image Optimization | WebP format, lazy loading | 70% bandwidth reduction |
| Caching Strategy | Service Worker + Cache API | Instant offline access |
| Debounced Input | 500ms delay on URL input | Reduced API calls |
| Memo Components | React.memo for heavy renders | 50% fewer re-renders |
Example: "Setting up Docker for the first time"
- 15-minute video β 8-step guide
- Each step with relevant screenshots
- Commands highlighted and copyable
Example: "How to make Thai Pad Thai"
- 20-minute video β 10-step recipe
- Ingredient lists extracted
- Timing for each step included
Example: "Building a bookshelf"
- 30-minute video β 12-step instructions
- Materials list extracted
- Safety warnings highlighted
Example: "Installing Python & VS Code"
- 40-minute video β 14-step guide
- Platform-specific notes
- Troubleshooting tips from comments
- Problem Identification: Recognized inefficiency in video-based learning
- User-Centric Solution: Designed for active vs passive consumption
- Iterative Development: Built based on personal pain point
- Measurable Impact: 75% time reduction in learning tasks
- Prompt Engineering: Structured prompts for consistent JSON output
- LLM Orchestration: Gemini 2.5 Flash for content analysis
- Content Understanding: Transcript parsing and segmentation
- Quality Control: Screenshot timestamp accuracy validation
- Frontend: React, TypeScript, PWA, responsive design
- AI Layer: Gemini API integration, async processing
- Backend: Firebase (Firestore, Auth, Storage)
- APIs: YouTube Data API v3, Transcript API
- Fast Processing: 30-60s guide generation
- Caching: LocalStorage + Service Worker
- Optimization: Code splitting, lazy loading
- Offline-First: PWA with full offline capability
- Personal Use: Solved actual learning problem
- Productized: From script to full PWA
- Documented: Comprehensive architecture docs
- Shareable: Portfolio-ready showcase
// Advanced TypeScript patterns
type VideoGuide = {
id: string;
videoId: string;
title: string;
summary: string;
estimatedTime: number;
steps: Step[];
metadata: VideoMetadata;
};
type Step = {
stepNumber: number;
title: string;
timestamp: string; // MM:SS
description: string;
screenshot_url: string;
technical_notes?: string;
completed: boolean;
};
// AI processing with error handling
async function withRetry<T>(
fn: () => Promise<T>,
maxRetries: number = 3
): Promise<T> {
for (let i = 0; i < maxRetries; i++) {
try {
return await fn();
} catch (error) {
if (i === maxRetries - 1) throw error;
await new Promise(r => setTimeout(r, 1000 * Math.pow(2, i)));
}
}
throw new Error("Max retries exceeded");
}Muhammad Nurunnabi (W3JDEV)
Senior Full-Stack AI Engineer | EdTech Innovation
π Kuala Lumpur, Malaysia
- π§ Email: w3jdev@gmail.com
- π Phone: +60174106981
- πΌ Portfolio: portfolio.w3jdev.com
- πΌ LinkedIn: linkedin.com/in/w3jdev
- π¦ Twitter: @mnjewelps
- π GitHub: @W3JDev
- π V2L: 75% faster learning from video content
- π€ 15+ AI Applications in production
- π° 300%+ ROI delivered to clients
- β‘ 95% automation of manual workflows
- π GitHired Score: 93/100
Β© 2025 W3J LLC | All Rights Reserved
This repository is a portfolio showcase. The code and documentation are provided for:
β
Evaluation by recruiters, educators, and potential collaborators
β
Educational reference and inspiration
β Commercial use or redistribution is prohibited
Full implementation available to qualified employers and partners upon request.
Built with:
- Google Gemini AI - AI-powered content generation
- YouTube API - Video data access
- React - UI framework
- Firebase - Backend services
- Tailwind CSS - Styling
Special thanks to the open-source community and AI research advancing EdTech innovation.