Voice Typer

A lightweight, open-source voice typing application that converts speech to text using local inference with Whisper models. Hold a keyboard shortcut to record, release it to transcribe, and the text is pasted into whichever application has focus.

Features

Keyboard-Driven: Hold a configurable hotkey to record (default: Cmd+Alt)
Local Inference: Uses whisper.cpp via the PyWhisperCPP wrapper for offline, private speech recognition
Universal Compatibility: Works with any application that accepts pasted text
Simple Interface: No GUI required; hold the hotkey, speak, and release to insert the text
Customizable: Configure your preferred Whisper model and hotkey
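
As a rough illustration of the local-inference step, the sketch below transcribes an audio file with PyWhisperCPP. It assumes the `pywhispercpp` package is installed and that its `Model.transcribe` returns segments carrying a `text` attribute; `transcribe_file` and `join_segments` are hypothetical helper names, not functions from this repository.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Join the text of transcription segments into one string."""
    parts = (segment.text.strip() for segment in segments)
    return " ".join(part for part in parts if part)


def transcribe_file(path: str, model_name: str = "base.en") -> str:
    """Transcribe an audio file locally (hypothetical helper)."""
    # Imported lazily so this module loads even without pywhispercpp installed.
    from pywhispercpp.model import Model

    model = Model(model_name)  # assumption: the model is fetched on first use
    return join_segments(model.transcribe(path))
```

Because everything runs on the local machine, no audio leaves the computer during transcription.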

Installation

Prerequisites

Python 3.8 or higher
A working microphone
Linux, macOS, or Windows

Setup

Clone this repository:

```
git clone https://github.com/lladdy/voice-typer.git
cd voice-typer
```

Create and activate a virtual environment:

```
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```

Install the required dependencies:
```
pip install -r requirements.txt
```

Usage

Run the application:
```
python run.py
```
Press and hold your configured hotkey (default: Cmd+Alt) to start recording.
Speak clearly into your microphone.
Release the hotkey to stop recording and transcribe.
The text will be automatically pasted at your cursor position.
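
The hold-to-record flow above can be pictured as a small state machine. This is a minimal sketch rather than the project's actual code: `HoldToRecord` and its callbacks are hypothetical names, and keys are plain strings instead of events from a real keyboard library.

```python
from typing import Callable, Iterable, Set


class HoldToRecord:
    """Record while every hotkey key is held; stop when one is released."""

    def __init__(self, hotkey: Iterable[str],
                 on_start: Callable[[], None],
                 on_stop: Callable[[], None]) -> None:
        self.hotkey: Set[str] = set(hotkey)
        self.pressed: Set[str] = set()
        self.recording = False
        self.on_start = on_start
        self.on_stop = on_stop

    def press(self, key: str) -> None:
        self.pressed.add(key)
        # Begin only once, when the full combination is down.
        if not self.recording and self.hotkey <= self.pressed:
            self.recording = True
            self.on_start()

    def release(self, key: str) -> None:
        self.pressed.discard(key)
        # Releasing any key of the combination ends the recording.
        if self.recording and key in self.hotkey:
            self.recording = False
            self.on_stop()


events = []
typer = HoldToRecord({"cmd", "alt"},
                     on_start=lambda: events.append("start"),
                     on_stop=lambda: events.append("stop"))
typer.press("cmd")
typer.press("alt")    # combination complete: recording starts
typer.release("alt")  # combination broken: recording stops
typer.release("cmd")
print(events)  # ['start', 'stop']
```

Tracking the set of held keys, rather than reacting to a single key, keeps the recording from starting until both keys are down.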

Configuration

You can customize the application by modifying the VoiceTyperConfig in run.py:

```
config = VoiceTyperConfig(
    recording_hotkey="<cmd>+<alt>",  # Change to your preferred hotkey
    model_name="base.en"             # Change to use different Whisper models
)
```

Available model options:

tiny.en: Fastest but less accurate (English only)
base.en: Good balance of speed and accuracy (English only)
small.en: More accurate but slower (English only)
medium.en: Most accurate but requires more resources (English only)
tiny, base, small, medium: Multilingual versions
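
To catch a mistyped model name before the application starts, the settings could be validated up front. The sketch below is hypothetical: `CheckedConfig` is an illustrative stand-in, not the repository's `VoiceTyperConfig`, and its set of names is copied from the options above.

```python
from dataclasses import dataclass

# Model names taken from the list above.
KNOWN_MODELS = {
    "tiny.en", "base.en", "small.en", "medium.en",
    "tiny", "base", "small", "medium",
}


@dataclass(frozen=True)
class CheckedConfig:
    """Illustrative stand-in for VoiceTyperConfig with basic validation."""
    recording_hotkey: str = "<cmd>+<alt>"
    model_name: str = "base.en"

    def __post_init__(self) -> None:
        # Fail early instead of at model-load time.
        if self.model_name not in KNOWN_MODELS:
            raise ValueError(f"Unknown model: {self.model_name!r}")

    @property
    def english_only(self) -> bool:
        # The ".en" suffix marks the English-only variants.
        return self.model_name.endswith(".en")
```

With this sketch, `CheckedConfig(model_name="small")` is accepted as a multilingual model, while an unlisted name raises `ValueError`.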

Project Structure

run.py: Entry point for the application
voice_typer.py: Core application logic
audio.py: Audio recording and processing
speech_to_text.py: Speech recognition using whisper.cpp
keyboard.py: Keyboard interaction and hotkey handling
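
whisper.cpp works on 16 kHz mono audio as 32-bit floats, so raw microphone samples usually need converting between the recording and recognition steps. The following sketch uses only the standard library; `pcm16_to_float` is a hypothetical helper and may not match what `audio.py` actually does.

```python
import array
import sys
from typing import List

WHISPER_SAMPLE_RATE = 16_000  # sample rate whisper.cpp expects, in Hz


def pcm16_to_float(raw: bytes) -> List[float]:
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian; match host byte order
    return [sample / 32768.0 for sample in samples]
```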

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Whisper by OpenAI
whisper.cpp for efficient local inference
PyWhisperCPP for Python bindings to whisper.cpp
