Automatically transcribe and summarize any audio or video file using whisper.cpp + pi. Works with YouTube videos, podcasts, recordings, meetings, lectures - any audio content.
This is a tool built for macOS.
Reading is faster than watching. For certain types of videos, I find that reading a detailed summary takes less time than watching the video, even at an increased playback speed.
- Local speech-to-text using whisper.cpp with Metal GPU acceleration
- Automatic summarization via pi using RPC mode with live streaming output
- Privacy-first - all transcription runs locally on your Mac
- Simple CLI - one command to get transcript + summary
Install required tools:
# Package managers (one-time setup)
brew install yt-dlp ffmpeg
# pi (AI coding agent used for summarization)
# Follow: https://github.com/mariozechner/pi-coding-agent
# whisper.cpp (build from source)
git clone https://github.com/ggml-org/whisper.cpp.git
cd whisper.cpp
cmake -B build -DWHISPER_METAL=ON
cmake --build build --config Release
# Download a whisper model
bash models/download-ggml-model.sh large-v3-turbo
Set environment variables:
# Add to ~/.zshrc or ~/.bashrc
export WHISPER_CLI=~/path/to/whisper.cpp/build/bin/whisper-cli
export WHISPER_MODEL=~/path/to/whisper.cpp/models/ggml-large-v3-turbo.bin
# Clone this repo
git clone https://github.com/roybotbot/ausum.git
cd ausum
# Install with pip
pip install .
# Or with pipx (recommended)
pipx install .
# YouTube videos
ausum "https://www.youtube.com/watch?v=VIDEO_ID"
# YouTube videos with playlist in URL (only processes the single video)
ausum "https://www.youtube.com/watch?v=VIDEO_ID&list=PLAYLIST_ID"
# Local audio/video files
ausum /path/to/video.mp4
ausum ~/Downloads/podcast.mp3
ausum ./recording.wav
# Override saved directory for a single run
ausum "https://www.youtube.com/watch?v=VIDEO_ID" -d ~/my-transcripts
# Open summary in mdv after creation
ausum "https://www.youtube.com/watch?v=VIDEO_ID" --read
Supported formats: Any audio or video format that ffmpeg can read (mp4, mp3, wav, m4a, webm, mkv, avi, flac, ogg, etc.)
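The playlist handling and the ffmpeg-based format support could look roughly like the sketch below. This is not ausum's actual code: the helper names `is_url`, `single_video_url`, and `ffmpeg_to_wav_cmd` are hypothetical, and the 16 kHz mono WAV target is an assumption based on the input whisper.cpp conventionally works with.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def is_url(source: str) -> bool:
    """Treat http(s) inputs as remote; everything else as a local file path."""
    return urlparse(source).scheme in ("http", "https")


def single_video_url(url: str) -> str:
    """Drop playlist-related query parameters so only one video is processed."""
    parts = urlparse(url)
    query = {
        key: values
        for key, values in parse_qs(parts.query).items()
        if key not in ("list", "index")  # assumed playlist parameters
    }
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


def ffmpeg_to_wav_cmd(src: str, dst: str) -> list:
    """Build an ffmpeg command that converts any readable input to 16 kHz mono WAV."""
    return [
        "ffmpeg", "-y",        # overwrite the output file if it exists
        "-i", src,
        "-ar", "16000",        # sample rate (assumption: what whisper.cpp expects)
        "-ac", "1",            # mono
        "-c:a", "pcm_s16le",   # 16-bit PCM
        dst,
    ]
```

The command list can be passed to `subprocess.run(cmd, check=True)`. yt-dlp also offers a `--no-playlist` option that serves the same purpose as stripping the `list` parameter by hand.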
Output files:
- <title>.txt - Full transcript
- <title>-summary.md - Structured summary
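As a rough illustration of how the two output names and the transcription step fit together, here is a minimal sketch. The function names and the file-name sanitizing rule are assumptions, not ausum's real implementation; the whisper-cli flags used (`-m`, `-f`, `-otxt`, `-of`) are standard whisper.cpp options, and the environment variables are the ones set during installation.

```python
import os
import re
from pathlib import Path


def safe_title(title: str) -> str:
    """Replace characters that are awkward in file names (assumed rule)."""
    cleaned = re.sub(r'[\\/:*?"<>|]+', "-", title).strip()
    return cleaned or "untitled"


def output_paths(title: str, summary_dir: str, transcript_dir: str):
    """Return (transcript_path, summary_path) following the naming shown above."""
    base = safe_title(title)
    transcript = Path(transcript_dir).expanduser() / f"{base}.txt"
    summary = Path(summary_dir).expanduser() / f"{base}-summary.md"
    return transcript, summary


def whisper_cmd(wav_path: Path, transcript_path: Path, env=os.environ) -> list:
    """Build a whisper-cli command that writes the transcript as a .txt file."""
    cli = os.path.expanduser(env["WHISPER_CLI"])
    model = os.path.expanduser(env["WHISPER_MODEL"])
    # -of takes the output path without extension; -otxt appends ".txt" to it.
    out_base = str(transcript_path)[: -len(".txt")]
    return [cli, "-m", model, "-f", str(wav_path), "-otxt", "-of", out_base]
```

A missing `WHISPER_CLI` or `WHISPER_MODEL` raises `KeyError` here, which makes a forgotten `export` easy to spot.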
On your first run, ausum will:
- Ask where summaries should be saved (defaults to ~/Documents if it exists)
- Ask where transcripts should be saved (press Enter to use the same directory as summaries)
- Ask whether to save transcript .txt files at all
- Save preferences to ~/.config/ausum/config.json
Subsequent runs use your saved preferences. You can always override the output directory for a single run with -d.
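The first-run prompts could be sketched like this. It is an illustration only: the prompt wording, the `first_run_setup` name, and the fallback to the home directory when ~/Documents is missing are assumptions. The `ask` parameter exists so the function can be exercised without a terminal.

```python
import json
from pathlib import Path


def first_run_setup(config_path, ask=input, home=None) -> dict:
    """Ask the three setup questions and save the answers as JSON."""
    home = Path(home) if home else Path.home()
    documents = home / "Documents"
    # Assumption: fall back to the home directory if ~/Documents is missing.
    default_dir = documents if documents.is_dir() else home

    summary_dir = ask(f"Where should summaries be saved? [{default_dir}] ").strip()
    summary_dir = summary_dir or str(default_dir)

    # Pressing Enter reuses the summary directory.
    transcript_dir = ask("Where should transcripts be saved? [same] ").strip()
    transcript_dir = transcript_dir or summary_dir

    answer = ask("Save transcript .txt files? [Y/n] ").strip().lower()
    prefs = {
        "summary_dir": summary_dir,
        "transcript_dir": transcript_dir,
        "save_transcript": answer not in ("n", "no"),
    }

    config_path = Path(config_path)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(prefs, indent=2), encoding="utf-8")
    return prefs
```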
Preferences are stored in ~/.config/ausum/config.json. You can edit it directly to change settings without re-running the setup prompt:
{
  "summary_dir": "/path/to/summaries",
  "transcript_dir": "/path/to/transcripts",
  "save_transcript": true
}
- summary_dir - where .md summary files are saved
- transcript_dir - where .txt transcript files are saved (optional; if omitted, uses summary_dir)
- save_transcript - set to false to skip saving the raw transcript
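To make the fallback rules concrete, here is a minimal sketch of how these keys might be resolved at startup. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: that `-d` overrides both directories, and that `save_transcript` defaults to true when the key is absent.

```python
import json
from pathlib import Path
from typing import Optional

CONFIG_PATH = Path.home() / ".config" / "ausum" / "config.json"


def resolve_settings(raw: dict, override_dir: Optional[str] = None) -> dict:
    """Apply the documented fallback: transcript_dir defaults to summary_dir."""
    # Assumption: a -d override replaces both directories for this run.
    summary_dir = override_dir or raw["summary_dir"]
    transcript_dir = override_dir or raw.get("transcript_dir") or summary_dir
    return {
        "summary_dir": Path(summary_dir).expanduser(),
        "transcript_dir": Path(transcript_dir).expanduser(),
        # Assumption: transcripts are saved unless explicitly disabled.
        "save_transcript": bool(raw.get("save_transcript", True)),
    }


def load_settings(path: Path = CONFIG_PATH, override_dir: Optional[str] = None) -> dict:
    """Read config.json and resolve it into ready-to-use paths."""
    with open(path, encoding="utf-8") as fh:
        return resolve_settings(json.load(fh), override_dir)
```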
Summaries follow a structured format:
- Overview — bullet list of high-level concepts and key takeaways
- Detailed Summary — major sections with descriptive headers and detailed bullets
- Next Steps — actionable recommendations for learning more
Each summary includes a source link at the bottom.
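A small sketch of how the structure above could be checked and the source link appended. The `Source:` line format and both helper names are assumptions made for illustration; the section names come from the list above.

```python
SECTIONS = ("Overview", "Detailed Summary", "Next Steps")


def missing_sections(summary_md: str) -> list:
    """Return the expected section names that do not appear in the summary."""
    lowered = summary_md.lower()
    return [name for name in SECTIONS if name.lower() not in lowered]


def with_source_link(summary_md: str, source: str) -> str:
    """Append the source (URL or file path) as the last line of the summary."""
    return f"{summary_md.rstrip()}\n\nSource: {source}\n"
```

A check like `missing_sections` is a cheap way to notice when a generated summary drifted from the expected layout before it is written to disk.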
MIT - See LICENSE file