Fix GitHub Pages deploy failure on PRs; add CI build on every push #1
Pull request overview
This PR introduces a new React + Vite + Tailwind frontend (“Content Extractor Pro”) while keeping the legacy HTML frontend for backwards compatibility, and updates the GitHub Pages deployment workflow to publish the new app at the site root.
Changes:
- Added a new React 19 + Vite 7 + Tailwind v4 TypeScript app under `app/`, with pages for YouTube transcript extraction and URL-to-Markdown conversion.
- Updated `start.sh` to prefer running the React app (dev or build+preview), with a fallback to the legacy `frontend/` static server.
- Updated the GitHub Pages deployment workflow to publish `app/dist` at `/` and legacy tools under `/legacy/`.
Reviewed changes
Copilot reviewed 39 out of 42 changed files in this pull request and generated 13 comments.
Summary per file:
| File | Description |
|---|---|
| start.sh | Prefer React app startup; fallback to legacy static server |
| frontend/index.html | Legacy UI styling + tab scoping fixes + duplicate-id cleanup (partial) |
| app/vite.config.ts | Vite config with dev server port + /api proxy |
| app/tsconfig.node.json | TS config for Vite config build reference |
| app/tsconfig.json | TS project references |
| app/tsconfig.app.json | TS config for React app sources |
| app/src/utils/helpers.ts | Shared helpers: formatting, markdown generation, exports, metadata fetch |
| app/src/utils/api.ts | Backend API wrapper + URL metadata fallback |
| app/src/types.ts | Shared TS types for videos/URLs/settings |
| app/src/store/AppContext.tsx | Global app state + localStorage persistence |
| app/src/pages/UrlSettingsPage.tsx | URL settings UI |
| app/src/pages/UrlHistoryPage.tsx | URL conversion history UI + export actions |
| app/src/pages/SingleVideo.tsx | Single YouTube extraction page |
| app/src/pages/SingleUrl.tsx | Single URL conversion page |
| app/src/pages/SettingsPage.tsx | App settings (API key, processing, backend URL, privacy) |
| app/src/pages/Playlist.tsx | Playlist extraction page |
| app/src/pages/HistoryPage.tsx | YouTube extraction history page |
| app/src/pages/ExtractFromText.tsx | Extract URLs from pasted text + bulk convert |
| app/src/pages/ExportPage.tsx | Export hub for ZIP/CSV/JSON |
| app/src/pages/BulkVideos.tsx | Bulk YouTube extraction page |
| app/src/pages/BulkUrls.tsx | Bulk URL conversion page |
| app/src/main.tsx | React entrypoint |
| app/src/index.css | Tailwind import + base styling |
| app/src/components/VideoCard.tsx | YouTube result card + copy/download actions |
| app/src/components/UI.tsx | Shared UI primitives (badges, toggles, progress, etc.) |
| app/src/components/TranscriptViewer.tsx | Paragraph/raw transcript viewer |
| app/src/components/Sidebar.tsx | Left navigation + backend status indicator |
| app/src/components/Input.tsx | Styled input/textarea components |
| app/src/components/Card.tsx | Card wrappers |
| app/src/components/Button.tsx | Styled/animated button component |
| app/src/assets/react.svg | React asset |
| app/src/App.tsx | App shell + simple internal page routing + health polling |
| app/src/App.css | Default Vite template CSS (still present) |
| app/public/vite.svg | Vite asset |
| app/package.json | App dependencies/scripts |
| app/index.html | Vite HTML entry + font includes |
| app/eslint.config.js | ESLint flat config for TS/React |
| app/README.md | App documentation |
| app/.gitignore | App-specific ignores |
| .github/workflows/deploy.yml | Pages build/publish now uses app/dist + legacy under /legacy/ |
```typescript
  const blob = await zip.generateAsync({ type: 'blob' });
  const a = document.createElement('a');
  a.href = URL.createObjectURL(blob);
  a.download = `url_conversions_${new Date().toISOString().split('T')[0]}.zip`;
  document.body.appendChild(a);
  a.click();
  document.body.removeChild(a);
}
```
`exportUrlsAsZip` creates an object URL via `URL.createObjectURL(blob)` but never revokes it. This can leak memory across repeated exports; store the created URL in a variable, call `URL.revokeObjectURL(...)` after clicking, and ensure the temporary `<a>` element is cleaned up (similar to `exportVideosAsZip`).
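The cleanup could look like the following minimal sketch. `downloadBlob` and the structural `DocLike`/`UrlApiLike` types are hypothetical (introduced here so the DOM interaction can be exercised without a browser), not the PR's actual helper:

```typescript
// Minimal structural types so the sketch does not depend on DOM lib typings.
interface AnchorLike { href: string; download: string; click(): void }
interface DocLike {
  createElement(tag: 'a'): AnchorLike;
  body: { appendChild(a: AnchorLike): void; removeChild(a: AnchorLike): void };
}
interface UrlApiLike { createObjectURL(blob: unknown): string; revokeObjectURL(url: string): void }

// Trigger a download, then clean up both the temporary anchor and the object URL.
function downloadBlob(blob: unknown, filename: string, doc: DocLike, urlApi: UrlApiLike): void {
  const href = urlApi.createObjectURL(blob); // keep a reference so it can be revoked
  const a = doc.createElement('a');
  a.href = href;
  a.download = filename;
  doc.body.appendChild(a);
  a.click();
  doc.body.removeChild(a);      // remove the temporary anchor
  urlApi.revokeObjectURL(href); // release the object URL to avoid leaking memory
}
```

In the browser this would be called as `downloadBlob(blob, name, document, URL)`.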
```typescript
    case '[[title]](url)': return `[[${title}]](${url})`;
    case "[title](url 'description')": return `[${title}](${url} '${description.substring(0, 100)}...')`;
    default: return `[${title}](${url})`;
```
In the tooltip link format (`[title](url 'description')`), the description is inserted without escaping. If the description contains a single quote (`'`) or newlines, the generated Markdown becomes invalid. Consider escaping/normalizing the description (e.g., replace `'` with `\'`, or switch to double quotes plus escaping), and avoid always appending `...` when the description is shorter than 100 characters.
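One possible fix is sketched below; `tooltipLink` is a hypothetical helper, not code from the PR. It normalizes whitespace, truncates only when something was actually cut off, and escapes quotes:

```typescript
// Hypothetical helper: build the tooltip-style Markdown link safely.
function tooltipLink(title: string, url: string, description: string): string {
  const MAX = 100;
  // Collapse newlines and runs of whitespace so the title stays on one line.
  let desc = description.replace(/\s+/g, ' ').trim();
  // Append the ellipsis only when the description was actually truncated.
  if (desc.length > MAX) desc = desc.slice(0, MAX) + '...';
  // Escape backslashes first, then single quotes, so the quoted title stays valid.
  const safe = desc.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
  return `[${title}](${url} '${safe}')`;
}
```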
```yaml
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
```
This workflow file still has an `on: pull_request` trigger at the top (outside this hunk), which conflicts with the PR description stating that `deploy.yml` should no longer run on PRs once `ci.yml` exists. Consider removing the PR trigger from `deploy.yml` and adding the missing `ci.yml` for PR validation (or update the PR description if the intent changed).
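Concretely, the end state described in the PR could look like the following trigger block in `deploy.yml` (a sketch, not the exact file):

```yaml
# deploy.yml — deploy only on pushes to main; PR validation moves to ci.yml
on:
  push:
    branches: [main]

jobs:
  deploy:
    # Extra guard in case another trigger is ever added later
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
```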
```bash
echo "🌐 React App: http://localhost:${REACT_PORT}"
echo "🎬 Legacy Frontend: http://localhost:8000/index.html"
echo "🔧 Backend API: http://localhost:5002"
```
The script always prints a “Legacy Frontend” URL on port 8000, but when the React app path is taken it never starts the legacy `python3 -m http.server 8000` process. This makes the output misleading; consider only printing the legacy URL when the fallback server is actually started (or start the legacy server alongside React).
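A small sketch of the conditional output; the function and variable names here are hypothetical, not taken from `start.sh`:

```bash
# Print the access URLs, advertising the legacy server only when it was started.
print_urls() {
  local react_port="${1:-5173}"
  local legacy_started="${2:-false}"
  echo "🌐 React App: http://localhost:${react_port}"
  if [ "$legacy_started" = "true" ]; then
    echo "🎬 Legacy Frontend: http://localhost:8000/index.html"
  fi
  echo "🔧 Backend API: http://localhost:5002"
}
```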
```typescript
  const [includeDesc, setIncludeDesc] = useState(true);
  const [includeStats, setIncludeStats] = useState(true);

  const handleExtract = async () => {
    if (!url.trim()) { toast.error('Please enter a YouTube URL'); return; }
    setLoading(true);
    setResult(null);
    try {
      const data = await extractTranscript(url.trim());
      setResult(data);
      addVideo(data);
      toast.success(`Extracted ${data.transcript.length} lines!`);
```
`includeDesc` / `includeStats` are exposed in the UI but never affect the request or export output (they are not passed to `extractTranscript` and not used in `generateMarkdown`). This is likely confusing for users; either wire these flags through the API call/markdown generation or remove the controls and state.
```typescript
  const [maxVideos, setMaxVideos] = useState(settings.defaultMaxVideos);
  const [includeDesc, setIncludeDesc] = useState(settings.autoIncludeDescription);
  const [includeStats, setIncludeStats] = useState(settings.autoIncludeStats);
  const [processing, setProcessing] = useState(false);
  const [progress, setProgress] = useState(0);
  const [results, setResults] = useState<BulkResult[]>([]);
  const [stats, setStats] = useState({ processed: 0, successful: 0, failed: 0, duplicates: 0, words: 0 });
  const [filter, setFilter] = useState<'all' | 'success' | 'error' | 'duplicate'>('all');

  const processAll = async () => {
    const lines = urls.split('\n').map(u => u.trim()).filter(u => u && !isPlaylistUrl(u)).slice(0, maxVideos);
    if (!lines.length) { toast.error('No valid video URLs found'); return; }

    setProcessing(true);
    setResults([]);
    setProgress(0);
    const newStats = { processed: 0, successful: 0, failed: 0, duplicates: 0, words: 0 };

    for (let i = 0; i < lines.length; i++) {
      const url = lines[i];
      const videoId = extractVideoId(url);

      if (settings.detectDuplicates && videoId && extractedVideos.some(v => v.videoId === videoId)) {
        newStats.duplicates++;
        setResults(p => [...p, { url, videoId: videoId || undefined, status: 'duplicate', error: 'Already extracted' }]);
      } else {
        try {
          const data = await extractTranscript(url);
          addVideo(data);
          newStats.successful++;
          newStats.words += data.transcript.reduce((n, e) => n + e.text.split(' ').length, 0);
          setResults(p => [...p, { url, videoId: data.videoId, status: 'success', data }]);
```
The Descriptions / Stats checkboxes update `includeDesc` / `includeStats`, but those values are never used when calling `extractTranscript` (only the URL is sent). Either remove these toggles or extend the backend/API call so the options actually affect extraction/metadata.
```typescript
  const [maxUrls, setMaxUrls] = useState(25);
  const [includeDesc, setIncludeDesc] = useState(true);
  const [autoRetry, setAutoRetry] = useState(false);
  const [processing, setProcessing] = useState(false);
  const [progress, setProgress] = useState(0);
  const [results, setResults] = useState<(UrlData & { status: 'success' | 'error' | 'duplicate' })[]>([]);
  const [stats, setStats] = useState({ processed: 0, successful: 0, failed: 0, duplicates: 0 });
  const [filter, setFilter] = useState<FilterType>('all');

  const processAll = async () => {
    const urls = input.split('\n').map(u => u.trim()).filter(u => isValidUrl(u)).slice(0, maxUrls);
    if (!urls.length) { toast.error('No valid URLs found'); return; }

    setProcessing(true);
    setResults([]);
    setProgress(0);
    const s = { processed: 0, successful: 0, failed: 0, duplicates: 0 };

    for (let i = 0; i < urls.length; i++) {
      const url = urls[i];
      if (extractedUrls.some(u => u.url === url)) {
        s.duplicates++;
        setResults(p => [...p, { url, title: url, description: '', domain: new URL(url).hostname, success: false, error: 'Already converted', convertedAt: new Date().toISOString(), status: 'duplicate' }]);
      } else {
        try {
          await new Promise(r => setTimeout(r, urlSettings.requestDelay));
          const data = await extractUrlMetadata(url, includeDesc, true);
          addUrl(data);
          if (data.success) s.successful++;
          else s.failed++;
          setResults(p => [...p, { ...data, status: data.success ? 'success' : 'error' }]);
        } catch {
          s.failed++;
          setResults(p => [...p, { url, title: new URL(url).hostname, description: '', domain: new URL(url).hostname, success: false, error: 'Fetch failed', convertedAt: new Date().toISOString(), status: 'error' }]);
        }
```
`autoRetry` is surfaced as “Auto-retry failures” but is never read in `processAll`, so toggling it has no effect. Either implement the retry behavior (respecting `urlSettings.maxRetries`/`requestTimeout`) or remove the checkbox to avoid misleading UX.
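If the retry path is implemented, a generic wrapper like the sketch below could honor those settings. `withRetries` is a hypothetical helper; the real code would pass `urlSettings.maxRetries` and `urlSettings.requestDelay` and only use it when `autoRetry` is enabled:

```typescript
// Retry an async operation up to `maxRetries` extra attempts, pausing between tries.
async function withRetries<T>(fn: () => Promise<T>, maxRetries: number, delayMs: number): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Wait before the next attempt, but not after the final failure.
      if (attempt < maxRetries) await new Promise(res => setTimeout(res, delayMs));
    }
  }
  throw lastErr; // all attempts failed
}
```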
```tsx
        <Database size={15} className="text-indigo-400" />
        <span className="text-sm font-semibold text-slate-200">Data & Privacy</span>
      </div>
      <p className="text-xs text-slate-500">All data is stored in your browser session. Nothing is sent to external servers except the YouTube/URL you request.</p>
```
The “Nothing is sent to external servers…” statement isn’t accurate, given that `extractUrlMetadata` falls back to `https://api.allorigins.win/get?...` when the backend call fails. Consider updating this copy to mention the AllOrigins fallback (or removing that fallback / making it opt-in) to avoid misleading users about where URLs are sent.
Suggested change:

```diff
-<p className="text-xs text-slate-500">All data is stored in your browser session. Nothing is sent to external servers except the YouTube/URL you request.</p>
+<p className="text-xs text-slate-500">
+  Most data is stored locally in your browser session. URLs you request may be sent to your configured backend, YouTube, or an external metadata proxy service (for example, AllOrigins) to fetch information about them.
+</p>
```
```typescript
export async function extractTranscript(url: string): Promise<VideoData> {
  const res = await fetch(`${BACKEND_BASE}/extract`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url }),
  });
```
`extractTranscript` only posts `{ url }`, but the UI exposes “include description / stats” toggles (e.g., Single Video / Bulk Videos). If those options are meant to affect backend behavior, the API wrapper should accept and forward them; otherwise the UI controls should be removed to avoid a mismatch.
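Forwarding the toggles could be as simple as building the request body from an options object. The sketch below isolates that step; `buildExtractPayload` is hypothetical, and the `includeDescription`/`includeStats` field names are assumptions about what the backend might accept:

```typescript
// Hypothetical options forwarded to the backend; field names are assumptions.
interface ExtractOptions {
  includeDescription?: boolean;
  includeStats?: boolean;
}

// Build the POST body so UI toggles actually reach the /extract endpoint.
function buildExtractPayload(url: string, opts: ExtractOptions = {}): string {
  return JSON.stringify({ url, ...opts });
}
```

The wrapper would then pass this string as the `body` of the existing `fetch` call.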
```typescript
// Public demo API key - override via Settings > YouTube Data API > Custom API Key
const YOUTUBE_API_KEY = import.meta.env.VITE_YOUTUBE_API_KEY || 'AIzaSyAV_j5IsZlkXNtkadQ7HQiocTYysm9kvH0';
const YOUTUBE_API_BASE = 'https://www.googleapis.com/youtube/v3';
```
A real YouTube Data API key is hard-coded as a default (`YOUTUBE_API_KEY`). Even if intended as a demo key, committing API keys to the repo is risky (leakage, abuse, quota exhaustion). Prefer requiring `VITE_YOUTUBE_API_KEY` (or a user-provided key) and omit the fallback key from source control.
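One way to remove the fallback is to resolve the key explicitly and fail fast when none is configured. `resolveApiKey` is a hypothetical helper; in the app it would be called with `import.meta.env.VITE_YOUTUBE_API_KEY` and the user's Settings value:

```typescript
// Hypothetical resolver: prefer a user-supplied key, fall back to the env var,
// and fail loudly instead of shipping a hard-coded default key.
function resolveApiKey(envKey: string | undefined, userKey: string | undefined): string {
  const key = userKey?.trim() || envKey?.trim();
  if (!key) {
    throw new Error('No YouTube API key configured: set VITE_YOUTUBE_API_KEY or add a key in Settings.');
  }
  return key;
}
```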
The `deploy` job was running on `pull_request` events, hitting GitHub Pages environment protection rules that block deployment from `refs/pull/*/merge`. Separately, no workflow ran on pushes to non-`main` branches.

`deploy.yml`:
- Removed the `pull_request` trigger — the deploy workflow now only fires on `push` to `main`
- Added an `if: github.event_name == 'push'` guard on the `deploy` job
- `build` steps install Node 20, run `npm ci` + `npm run build` in `app/`, and assemble `_site/` with the React dist at root and legacy HTML under `/legacy/`

`ci.yml` (new):
- Triggers on `push: branches: ['**']` and `pull_request: branches: ['**']` — every commit on every branch gets a build
- Runs `npm ci` → `npm run lint` → `npm run build` (TypeScript + Vite)
- Uploads `app/dist/` as a downloadable artifact keyed by commit SHA (7-day retention)

Resulting pipeline: pushes to `main` build and deploy to Pages; every other push and every pull request gets a CI build only.
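The `ci.yml` described above might look roughly like this; action versions, step layout, and the artifact name are assumptions, not the committed file:

```yaml
# Sketch of the described ci.yml: build every commit on every branch.
name: CI

on:
  push:
    branches: ['**']
  pull_request:
    branches: ['**']

jobs:
  build:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: app
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm run lint
      - run: npm run build
      # Keep the built app around briefly, keyed by commit SHA.
      - uses: actions/upload-artifact@v4
        with:
          name: dist-${{ github.sha }}
          path: app/dist/
          retention-days: 7
```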