Forked from zarazhangrui/my-podcast-feed by @zarazhangrui.
An automated 4-stage pipeline (Fetch → Remix → Speak → Publish) that converts RSS newsletter feeds into personalized podcast episodes using AI.
- Pulls articles from configured RSS sources (e.g. Ben's Bites, Latent Space, Technically)
- Generates a conversational podcast script via an LLM (Claude, GPT, or OpenCode Zen) using Jinja2 prompt templates
- Converts the script to audio using Kokoro ONNX TTS with distinct voices per host
- Publishes the MP3 and an updated RSS feed to GitHub Pages so podcast players auto-download new episodes
- Runs on a 3-day GitHub Actions cron schedule or manually via the CLI
Each run of the pipeline executes four stages:
- Fetch — Pulls new articles from configured RSS feeds (e.g., tech newsletters, AI blogs). Filters out previously processed articles using a persistent state file.
- Remix — Sends the fetched articles to an LLM (Anthropic Claude, OpenAI GPT, or OpenCode Zen) with a prompt template that instructs it to write a podcast conversation script. Supports single-host monologue or two-host conversational formats.
- Speak — Converts each line of the script to audio using Kokoro ONNX text-to-speech (offline, no API needed). Each host gets a distinct voice. Audio segments are stitched together with pauses and fade effects into a single MP3.
- Publish — Pushes the MP3 to a GitHub Pages repository, updates the RSS feed XML (feed.xml), and commits the changes. Podcast players subscribed to the feed automatically pick up new episodes.
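The four stages above can be sketched as a minimal orchestrator. This is a hypothetical illustration with stub stages, not the project's actual `run_pipeline.py`; the `from_stage` parameter mirrors the `--from-stage` CLI flag described below.

```python
def run_pipeline(stages, context, from_stage=None):
    """Run each stage in order, passing a shared context dict between them.

    `stages` is an ordered mapping of stage name -> callable. If
    `from_stage` is given, earlier stages are skipped (their outputs are
    assumed to already exist in `context` or on disk).
    """
    started = from_stage is None
    for name, stage in stages.items():
        if name == from_stage:
            started = True
        if not started:
            continue  # skip stages before the requested entry point
        stage(context)
    return context

# Stub stages standing in for fetch/remix/speak/publish:
stages = {
    "fetch":   lambda ctx: ctx.update(articles=["a1", "a2"]),
    "remix":   lambda ctx: ctx.update(script="dialogue covering fetched articles"),
    "speak":   lambda ctx: ctx.update(mp3="episodes/2026-01-01.mp3"),
    "publish": lambda ctx: ctx.update(published=True),
}

result = run_pipeline(stages, {})
```

Re-running with `from_stage="speak"` would reuse a previously saved script and only regenerate audio and publish, which is the same idea as `--from-stage speak` below.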
Anyone who wants to consume their daily newsletters as audio instead of reading them. Particularly useful for:
- Commuters who prefer listening over reading
- People who follow multiple newsletters but lack time to read them all
- Anyone wanting a personalized, AI-generated daily news summary in podcast form
```
scripts/
  run_pipeline.py      — Orchestrator; coordinates all four stages
  fetch.py             — Stage 1: RSS feed fetching and article extraction
  remix.py             — Stage 2: LLM-based podcast script generation
  speak.py             — Stage 3: Text-to-speech audio generation
  publish.py           — Stage 4: GitHub Pages deployment and RSS feed update
  utils.py             — Shared utilities (config, state, logging)
templates/
  prompt_1host.md      — LLM prompt template for single-host format
  prompt_2host.md      — LLM prompt template for two-host format
  feed_template.xml    — Jinja2 template for RSS feed XML
.github/workflows/
  generate-episode.yml — GitHub Actions cron job (runs every 3 days)
episodes/              — Generated MP3 files
cover-art.png          — Podcast cover art
feed.xml               — RSS feed (served by GitHub Pages)
episodes.json          — Episode metadata
state.json             — Pipeline state (last run time, processed article IDs)
index.html             — Minimal landing page for the feed URL
```
- Python 3.12+
- ffmpeg (for audio processing via pydub)
- An OpenCode Zen, Anthropic, or OpenAI API key (for script generation)
- GitHub CLI (`gh`) authenticated (for publishing, if running locally)
Recommended:

```
python3 -m venv .venv
./.venv/bin/pip install -r requirements.txt
```

Or install into your current environment:

```
pip install feedparser anthropic openai kokoro-onnx soundfile pydub PyYAML Jinja2 python-dotenv
```

On macOS:

```
brew install ffmpeg
```

On Ubuntu:

```
sudo apt install ffmpeg
```

The pipeline now supports repo-local config by default. The quickest setup is:

```
cp config.example.yaml config.yaml
```

Then edit `config.yaml` as needed:
```yaml
show_name: "My Daily Podcast"
hosts: 2                # 1 or 2
length_minutes: 10
tone: "casual and conversational"
language: "en"

sources:
  rss:
    - https://www.bensbites.com/feed
    - https://read.technically.dev/feed
    - https://www.latent.space/feed

llm:
  provider: "opencode"              # "anthropic", "openai", or "opencode"
  api_key_env: "OPENCODE_API_KEY"
  model: "claude-sonnet-4-6"        # Any Zen model id supported by responses/messages/chat_completions
  base_url: "https://opencode.ai/zen"
  # api_style: "messages"           # Optional: responses | messages | chat_completions

tts:
  provider: "kokoro"
  host_a_voice: "af_heart"   # female voice — see VOICES.md for full list
  host_b_voice: "am_michael" # male voice — see VOICES.md for full list
  lang_code: "a"             # "a" = American English

publish:
  github_repo: "youruser/your-podcast-feed"
  github_pages_url: "https://youruser.github.io/your-podcast-feed"

retention:
  max_episodes: 30 # Auto-delete episodes beyond this limit
```

Create a repo-local `.env` file:

```
cp .env.example .env
```

Then add your key:

```
OPENCODE_API_KEY=...
```
Legacy ~/.claude/personalized-podcast/config.yaml and .env files are still supported as fallbacks.
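The repo-local-first lookup with a legacy fallback can be sketched as follows. This is an assumed resolution order inferred from the README, not necessarily how the project's `utils.py` implements it.

```python
from pathlib import Path


def resolve_path(candidates):
    """Return the first existing path among candidates, else None."""
    for path in candidates:
        if path.exists():
            return path
    return None


# Assumed lookup order: repo-local config first, then the legacy location.
config_path = resolve_path([
    Path("config.yaml"),
    Path.home() / ".claude" / "personalized-podcast" / "config.yaml",
])
```

The same pattern would apply to `.env`: check the repository root first, then fall back to the legacy directory.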
`provider: "opencode"` auto-routes requests to the right Zen API family for the endpoint styles implemented in this project:

- `gpt-*` models use the Zen `responses` API
- `claude-*` models use the Zen `messages` API
- OpenAI-compatible Zen models like `glm-5`, `kimi-k2.5`, and `minimax-m2.5` use Zen `chat/completions`
If a Zen model uses one of those endpoint families but does not match the built-in prefixes, set `llm.api_style` explicitly to `responses`, `messages`, or `chat_completions`.
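The routing rules above amount to a prefix lookup with an explicit override. A minimal sketch, assuming this prefix table (the project's actual detection logic may differ):

```python
# Assumed prefix-to-endpoint table, mirroring the routing rules above.
PREFIX_STYLES = {
    "gpt-": "responses",
    "claude-": "messages",
    "glm-": "chat_completions",
    "kimi-": "chat_completions",
    "minimax-": "chat_completions",
}


def detect_api_style(model, override=None):
    """Map a Zen model id to an endpoint family.

    `override` corresponds to llm.api_style in config.yaml; when set, it
    wins over prefix detection. Unknown prefixes require an override.
    """
    if override:
        return override
    for prefix, style in PREFIX_STYLES.items():
        if model.startswith(prefix):
            return style
    raise ValueError(f"unknown model prefix for {model!r}; set llm.api_style")
```

For example, `detect_api_style("claude-sonnet-4-6")` resolves to `messages`, while a model id outside the table raises until `llm.api_style` is set.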
Create a GitHub repo, enable GitHub Pages on the main branch, and add a cover-art.png for podcast artwork.
```
./.venv/bin/python scripts/run_pipeline.py
```

Re-run from a specific stage:

```
./.venv/bin/python scripts/run_pipeline.py --from-stage speak   # Re-run TTS only
./.venv/bin/python scripts/run_pipeline.py --from-stage remix   # Re-generate script + audio
./.venv/bin/python scripts/run_pipeline.py --from-stage speak --date 2026-03-14
./.venv/bin/python scripts/run_pipeline.py --skip-publish
```

Run individual stages directly:

```
python scripts/fetch.py    # Test fetching only
python scripts/remix.py    # Generate a script from fetched articles
python scripts/speak.py    # Convert a saved script to audio
python scripts/publish.py  # Publish the latest MP3
```

The included GitHub Actions workflow (`.github/workflows/generate-episode.yml`) runs the pipeline every 3 days at 8am Pacific. It can also be triggered manually from the Actions tab. API keys are stored as GitHub repository secrets (for example `OPENCODE_API_KEY` or `ANTHROPIC_API_KEY`). The Hugging Face model cache is persisted between runs to avoid re-downloading the TTS model each time.
Add the feed URL to any podcast player (Apple Podcasts, Overcast, Pocket Casts, etc.):
https://youruser.github.io/your-podcast-feed/feed.xml
| Key | Description | Default |
|---|---|---|
| `show_name` | Name of the podcast | `"My Daily Digest"` |
| `hosts` | Number of hosts (1 or 2) | `2` |
| `length_minutes` | Target episode length in minutes | `10` |
| `tone` | Writing style for the script | `"casual and conversational"` |
| `sources.rss` | List of RSS feed URLs to pull from | (required) |
| `llm.provider` | LLM provider (`"anthropic"`, `"openai"`, or `"opencode"`) | `"anthropic"` |
| `llm.api_key_env` | Environment variable containing the API key | `"ANTHROPIC_API_KEY"` |
| `llm.model` | Model name or OpenCode Zen model id | `"claude-sonnet-4-6"` |
| `llm.base_url` | Optional API base URL override (used by OpenCode) | `"https://opencode.ai/zen"` |
| `llm.api_style` | Optional OpenCode endpoint override (`responses`, `messages`, `chat_completions`) | auto-detected |
| `tts.host_a_voice` | Kokoro voice name for host A | `"af_heart"` |
| `tts.host_b_voice` | Kokoro voice name for host B | `"am_michael"` |
| `tts.lang_code` | Language code (`"a"` = American English) | `"a"` |
| `publish.github_repo` | GitHub repo for hosting | (required) |
| `publish.github_pages_url` | Base URL for GitHub Pages | (required) |
| `retention.max_episodes` | Max episodes to keep (oldest deleted) | `30` |
The pipeline tracks state in `state.json` for repo-local runs, or `PODCAST_DATA_DIR/state.json` if you override the data directory:

- `last_run` — ISO timestamp of the last successful run (used to filter old articles)
- `processed_ids` — IDs of articles already processed (prevents duplicates; capped at 500)
State is also committed to the repo so GitHub Actions can restore it between runs.
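The deduplication described above can be sketched as two small helpers. This is an illustration of the mechanism (filter against `processed_ids`, then append and cap at 500), with hypothetical function names rather than the project's actual state code.

```python
MAX_PROCESSED_IDS = 500  # cap stated in the README


def filter_new(articles, state):
    """Drop articles whose IDs are already in state['processed_ids']."""
    seen = set(state.get("processed_ids", []))
    return [a for a in articles if a["id"] not in seen]


def record_processed(state, articles):
    """Append newly processed IDs, keeping only the most recent entries."""
    ids = state.get("processed_ids", []) + [a["id"] for a in articles]
    state["processed_ids"] = ids[-MAX_PROCESSED_IDS:]
    return state
```

Capping the list keeps `state.json` small enough to commit on every run while still covering far more articles than a 3-day window can produce.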
Originally created by Zara Zhangrui. Forked and adapted for Codex by TMFNK.