jdmonaco/ytcapture

ytcapture

Extract video frames and transcripts from YouTube videos into Obsidian-compatible markdown notes.

Why ytcapture?

Watching a lecture, tutorial, or presentation on YouTube? ytcapture turns any video into a searchable, skimmable markdown note with:

  • Embedded frame images at regular intervals so you can see what's on screen
  • Timestamped transcript segments aligned to each frame
  • Obsidian-ready format with YAML frontmatter and ![[wikilink]] embeds
  • Smart deduplication that removes redundant frames (great for slide-based content)

No more scrubbing through hour-long videos to find that one slide. Your notes become a visual index of the entire video.

Requirements

  • Python 3.10+
  • ffmpeg (for frame extraction)
  • yt-dlp (for video/transcript fetching)

On macOS:

brew install ffmpeg yt-dlp
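
Before installing, it can help to confirm that the requirements listed above are present. The following is a minimal sketch, not part of ytcapture: the function name `missing_requirements` and its parameters are hypothetical, and it only checks that the two external tools are on `PATH` and that the interpreter is new enough.

```python
import shutil
import sys
from typing import Callable, Optional

# External tools and minimum Python version taken from the Requirements list.
REQUIRED_TOOLS = ("ffmpeg", "yt-dlp")
MIN_PYTHON = (3, 10)


def missing_requirements(
    which: Callable[[str], Optional[str]] = shutil.which,
    python_version: tuple = sys.version_info[:2],
) -> list:
    """Return human-readable problems; an empty list means everything was found.

    `which` and `python_version` are injectable (hypothetical parameters)
    so the check can be exercised without touching the real system.
    """
    problems = []
    if tuple(python_version) < MIN_PYTHON:
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling `missing_requirements()` with no arguments checks the current machine; an empty list means all three requirements appear to be satisfied.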

Installation

# Clone the repository
git clone https://github.com/jdmonaco/ytcapture.git
cd ytcapture

# Install as a CLI tool with uv (recommended)
uv tool install -e .

# Or install with pip
pip install -e .

Usage

# Basic usage - outputs to current directory
ytcapture "https://www.youtube.com/watch?v=VIDEO_ID"

# Multiple videos at once
ytcapture URL1 URL2 URL3

# Process an entire playlist (auto-expands)
ytcapture "https://www.youtube.com/playlist?list=PLAYLIST_ID"

# On macOS, copy a YouTube video or playlist URL to the clipboard, then run without arguments
ytcapture

# Skip confirmation for large playlists (>10 videos)
ytcapture "https://www.youtube.com/playlist?list=PLAYLIST_ID" -y

# Specify output directory
ytcapture URL -o my-notes/

# Adjust frame interval (default: 15 seconds)
ytcapture URL --interval 30

# Extract more frames with aggressive deduplication
ytcapture URL --interval 5 --dedup-threshold 0.80
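
To drive these commands from a script, one option is to assemble the argument list in Python and hand it to `subprocess`. This is a sketch under assumptions: the helper names `build_ytcapture_command` and `run_ytcapture` are hypothetical, ytcapture is assumed to be installed and on `PATH`, and only flags shown in this README are used.

```python
import subprocess
from typing import Optional, Sequence


def build_ytcapture_command(
    urls: Sequence[str],
    output: Optional[str] = None,
    interval: Optional[int] = None,
    dedup_threshold: Optional[float] = None,
    assume_yes: bool = False,
) -> list:
    """Build an argv list for the ytcapture CLI (hypothetical helper)."""
    cmd = ["ytcapture", *urls]
    if output is not None:
        cmd += ["-o", output]
    if interval is not None:
        cmd += ["--interval", str(interval)]
    if dedup_threshold is not None:
        cmd += ["--dedup-threshold", str(dedup_threshold)]
    if assume_yes:
        cmd.append("-y")  # skip the confirmation prompt for large batches
    return cmd


def run_ytcapture(urls: Sequence[str], **options) -> int:
    """Run ytcapture and return its exit code."""
    return subprocess.run(build_ytcapture_command(urls, **options)).returncode
```

Passing a list rather than a shell string avoids quoting problems with URLs that contain `?` or `&`.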

Output Structure

./
├── images/
│   └── VIDEO_ID/
│       ├── frame-0000.jpg
│       ├── frame-0001.jpg
│       └── ...
├── transcripts/
│   └── raw-transcript-VIDEO_ID.json
└── Video Title (Channel Name) 20241120.md

Assets are organized by video ID to support multiple video captures in the same directory.
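
The layout above can be expressed as a few path helpers, which is handy when post-processing captures. This is a sketch inferred from the tree shown here, not ytcapture's own code: the function names are hypothetical, and the real tool may sanitize titles or number frames differently after deduplication.

```python
from datetime import date
from pathlib import Path


def frame_path(root: str, video_id: str, index: int, frame_format: str = "jpg") -> Path:
    """Path of one frame image: images/VIDEO_ID/frame-NNNN.ext."""
    return Path(root) / "images" / video_id / f"frame-{index:04d}.{frame_format}"


def transcript_path(root: str, video_id: str) -> Path:
    """Path of the raw transcript JSON for a video."""
    return Path(root) / "transcripts" / f"raw-transcript-{video_id}.json"


def note_path(root: str, title: str, channel: str, published: date) -> Path:
    """Path of the markdown note: 'Title (Channel) YYYYMMDD.md'.

    Assumes the title and channel contain no path separators.
    """
    return Path(root) / f"{title} ({channel}) {published:%Y%m%d}.md"
```

Because images and transcripts are keyed by video ID, two captures in the same directory only share the top-level `images/` and `transcripts/` folders.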

Example Output

The generated markdown looks like this:

---
title: Understanding Neural Networks
source: https://www.youtube.com/watch?v=abc123
author:
  - Deep Learning Channel
created: '2024-12-15'
published: '2024-11-20'
description: An introduction to neural networks and deep learning fundamentals...
tags:
  - youtube
---

# Understanding Neural Networks

> An introduction to neural networks and deep learning fundamentals.

## 00:00:00

![[images/abc123/frame-0000.jpg]]

Welcome to this tutorial on neural networks. Today we'll cover the basics.

## 00:00:15

![[images/abc123/frame-0001.jpg]]

Let's start by understanding what a neuron is and how it processes information.
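
Because each section follows the same pattern (a `## HH:MM:SS` heading, an image embed, then transcript text), the note is straightforward to read back programmatically. The parser below is a minimal sketch based only on the example above; `Segment` and `parse_segments` are hypothetical names, and real notes may contain variations this does not handle.

```python
import re
from dataclasses import dataclass
from typing import Optional

HEADING = re.compile(r"^## (\d{2}):(\d{2}):(\d{2})\s*$")
EMBED = re.compile(r"^!\[\[(.+?)\]\]\s*$")


@dataclass
class Segment:
    seconds: int          # offset of the section from the start of the video
    image: Optional[str]  # wikilink target of the embedded frame, if any
    text: str             # transcript text for this section


def parse_segments(markdown: str) -> list:
    """Split a generated note into timestamped segments."""
    segments = []
    current = None
    buffer = []

    def flush():
        if current is not None:
            current.text = " ".join(buffer).strip()
            segments.append(current)

    for line in markdown.splitlines():
        heading = HEADING.match(line)
        if heading:
            flush()
            h, m, s = (int(g) for g in heading.groups())
            current = Segment(seconds=h * 3600 + m * 60 + s, image=None, text="")
            buffer = []
            continue
        if current is None:
            continue  # frontmatter, title, and description come before the first heading
        embed = EMBED.match(line)
        if embed and current.image is None:
            current.image = embed.group(1)
        elif line.strip():
            buffer.append(line.strip())
    flush()
    return segments
```

Each returned segment pairs a time offset with its frame and transcript text, which is enough to build a search index or jump links.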

Options

Option             Default  Description
-o, --output       .        Output directory
--interval         15       Frame extraction interval in seconds
--max-frames       None     Maximum number of frames to extract
--frame-format     jpg      Frame format: jpg or png
--language         en       Transcript language code
--dedup-threshold  0.85     Similarity threshold for removing duplicate frames (0.0-1.0)
--no-dedup         -        Disable frame deduplication
--prefer-manual    -        Only use manual transcripts
--keep-video       -        Keep downloaded video file after frame extraction
-y, --yes          -        Skip confirmation prompt for large batches (>10 videos)
-v, --verbose      -        Verbose output
-h, --help         -        Show help message

Tips

For slide-based presentations

Use a shorter interval with deduplication to catch slide transitions:

ytcapture URL --interval 5 --dedup-threshold 0.90

For fast-moving content

Disable deduplication to keep all frames:

ytcapture URL --interval 10 --no-dedup
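
To build intuition for how a similarity threshold behaves, here is a toy sketch of threshold-based deduplication. It is an illustration only: this README does not say how ytcapture measures similarity, so the average-hash comparison, the "compare against the last kept frame" rule, and all names below are assumptions. Frames are modeled as small grayscale grids of equal size rather than real images.

```python
from typing import Sequence

# A frame here is a tiny grayscale thumbnail: rows of pixel values 0-255.
Frame = Sequence[Sequence[int]]


def average_hash(frame: Frame) -> list:
    """One bit per pixel: True where the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [p > mean for p in pixels]


def similarity(a: Frame, b: Frame) -> float:
    """Fraction of matching hash bits, between 0.0 and 1.0."""
    hash_a, hash_b = average_hash(a), average_hash(b)
    matches = sum(x == y for x, y in zip(hash_a, hash_b))
    return matches / len(hash_a)


def deduplicate(frames: Sequence[Frame], threshold: float = 0.85) -> list:
    """Return indices of frames to keep.

    A frame is dropped when its similarity to the most recently kept
    frame is at or above the threshold.
    """
    kept = []
    for index, frame in enumerate(frames):
        if kept and similarity(frames[kept[-1]], frame) >= threshold:
            continue
        kept.append(index)
    return kept
```

Under this model, lowering the threshold drops more frames and raising it keeps more, which matches the tips above: a higher value such as 0.90 preserves subtle slide changes, while `--no-dedup` keeps everything.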

For long videos

Limit the number of frames to avoid huge output:

ytcapture URL --max-frames 50
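
A quick estimate shows why a cap helps on long videos. The sketch below assumes frames are sampled at multiples of the interval starting at 0 (as in the example output, where the first two sections are 00:00:00 and 00:00:15) and that `--max-frames` simply keeps the first N; the README does not confirm either detail, and the function names are hypothetical.

```python
import math
from typing import Optional


def planned_timestamps(
    duration_s: float,
    interval_s: int = 15,
    max_frames: Optional[int] = None,
) -> list:
    """Sample times in seconds: 0, interval, 2*interval, ... before the video ends."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    stamps = list(range(0, math.ceil(duration_s), interval_s))
    if max_frames is not None:
        stamps = stamps[:max_frames]  # assumption: the cap keeps the earliest frames
    return stamps


def format_timestamp(seconds: int) -> str:
    """Format seconds as HH:MM:SS, matching the section headings in the note."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

With these assumptions, a one-hour video at the default 15-second interval yields 240 candidate frames before deduplication, so a cap of 50 cuts the image count by almost five times.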

Markdown Formatting

If you have mdformat installed, ytcapture will automatically format the output markdown:

pip install mdformat mdformat-gfm mdformat-frontmatter
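
The "format only if installed" behavior can be approximated with an optional import. This is a sketch, not ytcapture's actual code: `format_markdown` is a hypothetical name, and whether the tool calls mdformat's Python API or its command line is not stated here. The plugin names `gfm` and `frontmatter` correspond to the two extension packages in the install command above.

```python
def format_markdown(text: str) -> str:
    """Format markdown with mdformat when available; otherwise return it unchanged."""
    try:
        import mdformat  # optional dependency
    except ImportError:
        return text
    try:
        # Enable the GitHub-flavored markdown and YAML frontmatter plugins.
        return mdformat.text(text, extensions={"gfm", "frontmatter"})
    except Exception:
        # For example, a plugin is missing: fall back to the unformatted text.
        return text
```

Falling back silently keeps mdformat a soft dependency, so a missing package never blocks note generation.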

License

MIT
