
innovatorved/whisper.api

---
title: whisper.api
emoji: 😶‍🌫️
colorFrom: purple
colorTo: gray
sdk: docker
app_file: Dockerfile
app_port: 7860
---

# Whisper API - Speech to Text Transcription

This open-source project provides a self-hostable API for speech-to-text transcription using a fine-tuned Whisper ASR model. The API lets you convert audio files to text through simple HTTP requests, making it easy to add speech recognition capabilities to your applications.

Key features:

- Fine-tuned Whisper model for accurate speech recognition
- Simple HTTP API for audio file transcription
- User-level access with API keys for managing usage
- Self-hostable code for running your own speech transcription service
- Quantized model optimization for fast and efficient inference
- Asynchronous processing: non-blocking transcription for high availability
- Concurrency control: built-in request queuing to prevent server overload
- Open-source implementation for customization and transparency

This repository contains the code to deploy the API server, along with code for fine-tuning and quantizing models. Check out the documentation to get started!

## Installation

To install the necessary dependencies and set up the Whisper binary, follow these steps:

### 1. System Dependencies

Install ffmpeg for audio processing and build tools (make, cmake, g++) for compiling Whisper.

```shell
# Ubuntu/Debian
sudo apt install ffmpeg git make cmake g++

# macOS
brew install ffmpeg cmake
```
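Before moving on, it can help to confirm that the required tools actually ended up on your PATH; a minimal check:

```shell
# Sanity check: report any required tool missing from PATH
for tool in ffmpeg git make cmake g++; do
  command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
done
```

If nothing is printed, all tools were found.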

### 2. Python Dependencies

Install the required Python packages.

```shell
pip install -r requirements.txt
```

### 3. Set Up the Environment

Copy the example environment file and configure it:

```shell
cp .env.example .env
# Edit .env with your database credentials and settings
# Optional: set MAX_CONCURRENT_TRANSCRIPTIONS (default: 2) in .env to control parallel jobs
```
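As a sketch, the concurrency setting mentioned above looks like this in `.env`; for all other variables, treat your copy of `.env.example` as the source of truth:

```shell
# Limit how many transcription jobs run in parallel (default: 2)
MAX_CONCURRENT_TRANSCRIPTIONS=2
```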

### 4. Set Up Whisper

Run the setup script to clone, build, and configure the Whisper binary.

```shell
chmod +x setup_whisper.sh
./setup_whisper.sh
```

## Running the Project

### Run Locally (without Docker)

To run the project locally (e.g., inside a Conda environment or virtualenv):

```shell
# Ensure your environment is active (e.g., conda activate whisper-api)
uvicorn app.main:app --host 0.0.0.0 --port 7860 --reload
```
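To verify the server came up, you can probe it with curl. The `/docs` path below assumes the app exposes FastAPI's default interactive docs, which may not hold for every deployment; any known endpoint works for this check:

```shell
# Expect 200 if the server is up (the /docs route is an assumption)
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:7860/docs
```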

### Docker (Production)

To run the project using Docker:

```shell
# Build the image
docker build -t whisper-api .

# Run the container (ensure env vars are passed or secrets are used)
# For local testing with a .env file:
docker run --env-file .env -p 7860:7860 whisper-api
```

## Get Your Token

To get your token, use the following command:

```shell
curl -X 'POST' \
  'http://localhost:7860/api/v1/users/get_token' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "email": "example@domain.com",
  "password": "password"
}'
```
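If you want to reuse the token in later requests, and assuming the endpoint returns JSON with a `token` field (the field name is a guess — check the actual response), `jq` can capture it:

```shell
# Hypothetical response shape: {"token": "..."} -- adjust the jq path as needed
TOKEN=$(curl -s -X POST 'http://localhost:7860/api/v1/users/get_token' \
  -H 'Content-Type: application/json' \
  -d '{"email": "example@domain.com", "password": "password"}' \
  | jq -r '.token')
echo "Token: $TOKEN"
```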

## Example to Transcribe a File

To upload a file and transcribe it, use the following command. Note: the token below is a dummy and will not work; please use the token provided by the admin.

Here are the available models:

- tiny.en
- tiny.en.q5
- base.en.q5

```shell
# Modify the token and audioFilePath
curl -X 'POST' \
  'http://localhost:7860/api/v1/transcribe/?model=tiny.en.q5' \
  -H 'accept: application/json' \
  -H 'Authentication: e9b7658aa93342c492fa64153849c68b8md9uBmaqCwKq4VcgkuBD0G54FmsE8JT' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@audioFilePath.wav;type=audio/wav'
```
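Putting the two calls together, here is a sketch of an end-to-end flow; the response field names `.token` and `.text` are assumptions to verify against the real schema:

```shell
#!/bin/sh
# Fetch a token, then transcribe a local WAV file with it.
BASE='http://localhost:7860/api/v1'

TOKEN=$(curl -s -X POST "$BASE/users/get_token" \
  -H 'Content-Type: application/json' \
  -d '{"email": "example@domain.com", "password": "password"}' \
  | jq -r '.token')   # .token is an assumed field name

curl -s -X POST "$BASE/transcribe/?model=tiny.en.q5" \
  -H "Authentication: $TOKEN" \
  -F 'file=@audioFilePath.wav;type=audio/wav' \
  | jq .              # pretty-print whatever the API returns
```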

## License

MIT

## Reference & Credits

## Authors

## 🚀 About Me

Just try to be a developer!

## Support

For support, email vedgupta@protonmail.com
