This feature lets users input a YouTube link, play the video, and interact with a synchronized transcript. As the video plays, the current subtitle is highlighted. Clicking on a subtitle triggers a language analysis directly below the selected line.
Frontend Components
- YouTubePlayer: embeds the video and tracks the current timestamp
- TranscriptViewer: renders subtitles; clicking a subtitle triggers analysis
- TranscriptAnalysis: streams progress messages and displays the result of /analyze-stream
- SemanticMatchSidebar: shows the top-k semantically similar subtitles as cards
- ScrollableWordCards: displays Hanja annotations with Pinyin, Korean readings (훈음), and English
Backend Endpoints
- /transcript: downloads and preprocesses .vtt subtitles
- /analyze-stream: performs language analysis
- /search: embeds the subtitle and retrieves top-k semantic matches from Qdrant
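A minimal sketch of the preprocessing step behind /transcript, assuming the .vtt file has already been downloaded (e.g. via yt-dlp); the parser below is a hypothetical stdlib-only helper, not the production implementation:

```python
import re

# Matches WebVTT cue timing lines like "00:00:01.000 --> 00:00:03.000".
CUE_RE = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})")

def to_seconds(ts: str) -> float:
    """Convert 'HH:MM:SS.mmm' to seconds."""
    h, m, s = ts.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)

def parse_vtt(vtt: str) -> list[dict]:
    """Parse WebVTT text into cues: {'start': float, 'end': float, 'text': str}."""
    cues = []
    for block in vtt.strip().split("\n\n"):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            m = CUE_RE.search(line)
            if m:
                text = " ".join(lines[i + 1:]).strip()
                if text:
                    cues.append({
                        "start": to_seconds(m.group(1)),
                        "end": to_seconds(m.group(2)),
                        "text": text,
                    })
                break
    return cues
```

The parsed cues can then be merged and re-split into the sentence-level chunks that the transcript viewer renders.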
Flow
- User provides a YouTube link and presses Load
- Subtitles are parsed and segmented into sentence-level chunks
- The current subtitle is highlighted in sync as the video plays
- Clicking a subtitle:
  - Sends a request to /analyze-stream
    - Streams progress messages inline under the subtitle
    - Displays gloss and Hanja cards on completion
  - Sends a request to /search
    - Embeds the subtitle and retrieves semantic matches from Qdrant
    - Displays related subtitles to the right of the player
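The /search step in the flow above can be sketched as embed-then-rank. The embedder below is a stub (a real deployment would call an embedding model, and Qdrant would perform the ranking server-side); it only illustrates the cosine-similarity top-k retrieval:

```python
import math

def embed(text: str) -> list[float]:
    # Stub embedder: hashes characters into 8 buckets and L2-normalizes.
    # Stands in for the real embedding model used with Qdrant.
    vec = [0.0] * 8
    for i, ch in enumerate(text):
        vec[i % 8] += ord(ch)
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def top_k(query: str, subtitles: list[str], k: int = 3) -> list[str]:
    """Rank other subtitles by cosine similarity to the clicked one."""
    q = embed(query)
    scored = [
        (sum(a * b for a, b in zip(q, embed(s))), s)
        for s in subtitles
        if s != query
    ]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [s for _, s in scored[:k]]
```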
We use Server-Sent Events (SSE) to stream progress messages to the user. This provides continuous feedback instead of an unexplained wait while the backend applies grammar chunking and makes GPT calls.
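A minimal sketch of the SSE framing /analyze-stream could use: each progress message is a `data:` line followed by a blank line, which the browser's EventSource API consumes. The stage names and payload shape here are illustrative, not the actual protocol:

```python
import json
from typing import Iterator

def sse_event(payload: dict) -> str:
    """Frame one JSON payload as a Server-Sent Event."""
    return f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"

def analyze_stream(subtitle: str) -> Iterator[str]:
    # Hypothetical progress stages; the real pipeline emits its own messages.
    yield sse_event({"status": "chunking grammar"})
    yield sse_event({"status": "calling GPT"})
    # The final event would carry the gloss and Hanja annotations.
    yield sse_event({"status": "done", "subtitle": subtitle})
```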
Precomputing all subtitle chunks would:
- Delay transcript rendering
- Clutter the UI with output
- Increase GPT usage and cost
Instead, we only perform analysis when the user clicks on a subtitle.