ytcapture

Extract video frames and transcripts from YouTube videos into Obsidian-compatible markdown notes.

Why ytcapture?

Watching a lecture, tutorial, or presentation on YouTube? ytcapture turns any video into a searchable, skimmable markdown note with:

Embedded frame images at regular intervals so you can see what's on screen
Timestamped transcript segments aligned to each frame
Obsidian-ready format with YAML frontmatter and ![[wikilink]] embeds
Smart deduplication that removes redundant frames (great for slide-based content)

No more scrubbing through hour-long videos to find that one slide. Your notes become a visual index of the entire video.

Requirements

Python 3.10+
ffmpeg (for frame extraction)
yt-dlp (for video/transcript fetching)

On macOS:

brew install ffmpeg yt-dlp

Installation

# Clone the repository
git clone https://github.com/jdmonaco/ytcapture.git
cd ytcapture

# Install as a CLI tool with uv (recommended)
uv tool install -e .

# Or install with pip
pip install -e .

Usage

# Basic usage - outputs to current directory
ytcapture "https://www.youtube.com/watch?v=VIDEO_ID"

# Multiple videos at once
ytcapture URL1 URL2 URL3

# Process an entire playlist (auto-expands)
ytcapture "https://www.youtube.com/playlist?list=PLAYLIST_ID"

# On macOS, copy a video or playlist URL to the clipboard, then run without arguments
ytcapture

# Skip confirmation for large playlists (>10 videos)
ytcapture "https://www.youtube.com/playlist?list=PLAYLIST_ID" -y

# Specify output directory
ytcapture URL -o my-notes/

# Adjust frame interval (default: 15 seconds)
ytcapture URL --interval 30

# Extract more frames with aggressive deduplication
ytcapture URL --interval 5 --dedup-threshold 0.80
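For batch jobs, the commands above can be driven from a script. The following is a minimal sketch, not part of ytcapture itself: `build_command` and `run_capture` are hypothetical helper names, and the only flags used are ones listed in this README. It assumes the `ytcapture` executable is on your PATH.

```python
import subprocess


def build_command(urls, output=None, interval=None, dedup_threshold=None, yes=False):
    """Assemble an argument list for the ytcapture CLI (hypothetical helper)."""
    cmd = ["ytcapture", *urls]
    if output is not None:
        cmd += ["-o", str(output)]
    if interval is not None:
        cmd += ["--interval", str(interval)]
    if dedup_threshold is not None:
        cmd += ["--dedup-threshold", str(dedup_threshold)]
    if yes:
        cmd.append("-y")
    return cmd


def run_capture(urls, **options):
    """Run ytcapture and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(urls, **options), check=True)
```

Calling `run_capture(["URL1", "URL2"], output="my-notes/", interval=30)` would then be equivalent to running `ytcapture URL1 URL2 -o my-notes/ --interval 30` in a shell.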

Output Structure

./
├── images/
│   └── VIDEO_ID/
│       ├── frame-0000.jpg
│       ├── frame-0001.jpg
│       └── ...
├── transcripts/
│   └── raw-transcript-VIDEO_ID.json
└── Video Title (Channel Name) 20241120.md

Assets are organized by video ID to support multiple video captures in the same directory.
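If you post-process captures with your own scripts, the layout above can be navigated by video ID. This sketch assumes only the directory and file names shown in the tree; `asset_paths` and `list_frames` are illustrative names, not functions provided by ytcapture.

```python
from pathlib import Path


def asset_paths(output_dir, video_id):
    """Return the expected asset locations for one captured video."""
    root = Path(output_dir)
    return {
        "frames_dir": root / "images" / video_id,
        "transcript": root / "transcripts" / f"raw-transcript-{video_id}.json",
    }


def list_frames(output_dir, video_id):
    """List extracted frame files (jpg or png) in filename order."""
    frames_dir = asset_paths(output_dir, video_id)["frames_dir"]
    return sorted(
        p for p in frames_dir.glob("frame-*") if p.suffix in {".jpg", ".png"}
    )
```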

Example Output

The generated markdown looks like this:

---
title: Understanding Neural Networks
source: https://www.youtube.com/watch?v=abc123
author:
  - Deep Learning Channel
created: '2024-12-15'
published: '2024-11-20'
description: An introduction to neural networks and deep learning fundamentals...
tags:
  - youtube
---

# Understanding Neural Networks

> An introduction to neural networks and deep learning fundamentals.

## 00:00:00

![[images/abc123/frame-0000.jpg]]

Welcome to this tutorial on neural networks. Today we'll cover the basics.

## 00:00:15

![[images/abc123/frame-0001.jpg]]

Let's start by understanding what a neuron is and how it processes information.
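The section headings in this example follow from the frame index and the frame interval. Here is a rough sketch of that mapping, assuming frames are numbered from zero, spaced exactly one interval apart, and none have been removed by deduplication. The function names are hypothetical and this is not ytcapture's actual rendering code.

```python
def format_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def section_for_frame(video_id, index, interval=15, frame_format="jpg"):
    """Build the heading and image embed for one frame (assumed layout)."""
    heading = f"## {format_timestamp(index * interval)}"
    embed = f"![[images/{video_id}/frame-{index:04d}.{frame_format}]]"
    return f"{heading}\n\n{embed}\n"
```

With the default 15-second interval, `section_for_frame("abc123", 1)` produces the `## 00:00:15` heading and the `frame-0001.jpg` embed shown in the second section above.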

Options

| Option | Default | Description |
| --- | --- | --- |
| `-o, --output` | `.` | Output directory |
| `--interval` | 15 | Frame extraction interval in seconds |
| `--max-frames` | None | Maximum number of frames to extract |
| `--frame-format` | jpg | Frame format: `jpg` or `png` |
| `--language` | en | Transcript language code |
| `--dedup-threshold` | 0.85 | Similarity threshold for removing duplicate frames (0.0-1.0) |
| `--no-dedup` | - | Disable frame deduplication |
| `--prefer-manual` | - | Only use manual transcripts |
| `--keep-video` | - | Keep downloaded video file after frame extraction |
| `-y, --yes` | - | Skip confirmation prompt for large batches (>10 videos) |
| `-v, --verbose` | - | Verbose output |
| `-h, --help` | - | Show help message |

Tips

For slide-based presentations

Use a shorter interval with deduplication to catch slide transitions:

ytcapture URL --interval 5 --dedup-threshold 0.90
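To build intuition for the threshold, here is a toy sketch of threshold-based deduplication. It is not ytcapture's algorithm: this README does not describe the similarity metric the tool uses, so the sketch assumes a simple one (1 minus the mean absolute pixel difference) and assumes each frame is compared with the last frame kept. Frames are modeled as flat lists of grayscale values from 0 to 255.

```python
def similarity(a, b):
    """Similarity in [0.0, 1.0]; 1.0 means identical pixel values (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        return 1.0
    total = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 - total / (255 * len(a))


def deduplicate(frames, threshold=0.85):
    """Return indices of frames to keep.

    A frame is dropped when its similarity to the last kept frame
    is at or above the threshold.
    """
    kept = []
    for i, frame in enumerate(frames):
        if not kept or similarity(frames[kept[-1]], frame) < threshold:
            kept.append(i)
    return kept
```

Under these assumptions, lowering the threshold treats more frame pairs as duplicates and removes more frames, while raising it toward 1.0 keeps frames that differ only slightly, such as a slide with one added bullet point.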

For fast-moving content

Disable deduplication to keep all frames:

ytcapture URL --interval 10 --no-dedup

For long videos

Limit the number of frames to avoid huge output:

ytcapture URL --max-frames 50
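To choose a value, it can help to estimate how many frames a video would yield before deduplication. This is a back-of-the-envelope sketch, assuming one frame at time zero and one per interval after that, with `--max-frames` acting as a simple cap. The function name is illustrative and the tool's exact counting may differ.

```python
import math


def estimate_frame_count(duration_seconds, interval=15, max_frames=None):
    """Estimate frames extracted before deduplication (assumed sampling model)."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Frames at t = 0, interval, 2 * interval, ... strictly before the end.
    count = math.ceil(duration_seconds / interval)
    if max_frames is not None:
        count = min(count, max_frames)
    return count
```

By this estimate, a one-hour video at the default interval gives about 240 frames, and about 720 at a 5-second interval.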

Markdown Formatting

If mdformat is installed, ytcapture automatically formats the output markdown. To install it along with the GFM and frontmatter plugins:

pip install mdformat mdformat-gfm mdformat-frontmatter

License

MIT
