Windows voice recognition application with real-time transcription.
- Push-to-talk with F2: Hold F2 to record, release to transcribe
- AI Transcription: Uses Qwen3-ASR-1.7B (supports French and 52 languages)
- Auto-paste: Transcribed text is automatically pasted at cursor position
- Custom Dictionary: Define word corrections for technical terms, names, etc.
- Auto-mute: System audio is muted during recording to prevent interference
- History: All transcriptions are saved (7-day retention)
- Visual feedback: Color-coded system tray icon + Windows notifications
- Windows 10/11
- Python 3.10+
- NVIDIA GPU with CUDA (RTX series recommended)
- ~4 GB VRAM available
- Clone the project:
git clone https://github.com/florentcollect/Flototext.git
cd Flototext- Create a virtual environment (recommended):
python -m venv venv
venv\Scripts\activate- Install dependencies:
pip install -r requirements.txt- For GPU support with PyTorch CUDA:
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121- Double-click
install.bat(as administrator)- Configures auto-start with Windows
- Registers the application in Windows Settings
To uninstall: Windows Settings → Apps → Flototext → Uninstall
-
Launch manually (if needed):
- Double-click
start-flototext.bat - Or:
python -m flototext.main
- Double-click
-
The icon appears in the system tray (near the clock)
-
Recording:
- Press and hold F2 to start recording
- Speak (supports 52 languages including French, English, Chinese...)
- Release F2 to stop and transcribe
-
The transcribed text will be automatically pasted at your cursor position
| Color | State |
|---|---|
| Orange | Loading model |
| Green | Ready |
| Red | Recording |
| Yellow | Processing transcription |
| Gray | Error |
Right-click on the icon to access options:
- Copy last transcription: Copy the last transcription to clipboard
- Edit dictionary: Open the custom words dictionary file
- Sounds: Enable/disable audio feedback
- Notifications: Enable/disable Windows notifications
- Mute during recording: Enable/disable system audio muting while recording
- Quit: Close the application
Create custom word corrections for terms that are frequently misrecognized (technical jargon, names, etc.).
The dictionary file is located at data/custom_words.json:
{
"corrections": {
"clode": "Claude",
"anthropique": "Anthropic",
"pie torche": "PyTorch"
}
}- Keys: What the ASR model outputs (lowercase)
- Values: The correct spelling you want
Access via tray menu: Edit dictionary (opens the file in your default JSON editor)
Transcriptions are stored in data/transcriptions.db and kept for 7 days.
View history:
sqlite3 data/transcriptions.db "SELECT * FROM transcriptions ORDER BY created_at DESC LIMIT 10"Flototext/
├── flototext/
│ ├── __init__.py
│ ├── main.py # Entry point
│ ├── config.py # Configuration
│ ├── core/
│ │ ├── hotkey_manager.py # F2 key detection
│ │ ├── audio_recorder.py # Microphone capture
│ │ ├── transcriber.py # Qwen3-ASR model
│ │ ├── text_inserter.py # Clipboard paste
│ │ ├── text_corrector.py # Custom word corrections
│ │ └── audio_muter.py # System audio muting
│ ├── storage/
│ │ ├── database.py # SQLite operations
│ │ └── models.py # Data models
│ └── ui/
│ ├── tray_app.py # System tray icon
│ ├── notifications.py # Toast notifications
│ └── sounds.py # Audio feedback
├── data/
│ ├── transcriptions.db # Database
│ └── custom_words.json # Custom word dictionary
├── assets/
│ └── icon.ico
├── install.bat # Windows installer
├── uninstall.bat # Uninstaller
├── start-flototext.bat # Manual launch
├── requirements.txt
└── README.md
Edit flototext/config.py to customize:
hotkey.trigger_key: Trigger key (default: "f2")audio.sample_rate: Sample rate (default: 16000)model.model_name: ASR model to useui.play_sounds: Sounds enabled by defaultui.show_notifications: Notifications enabled by defaultui.mute_during_recording: Mute system audio while recording (default: True)
- Check CUDA is installed:
python -c "import torch; print(torch.cuda.is_available())" - Make sure you have enough VRAM (~4 GB)
- Check that the default microphone is properly configured in Windows
- Test with another recording application
- Make sure a text field is active (blinking cursor)
- Text is also copied to clipboard (use Ctrl+V manually)
If Flototext saves you time, consider supporting its development:
MIT License