TranslateAI is a powerful real-time speech translation desktop application built using PyQt and Hugging Face models. It enables users to convert spoken words into text and translate them into different languages.
Screenshots: macOS | Windows
Demo video: TranslateAI.mp4
This app allows you to:
- Select your preferred input device, whether it's a microphone or system audio, ensuring flexibility in capturing speech.
- Speak or even play pre-recorded audio files, and the app will process and transcribe the speech into text in real time.
- Enjoy an automatic translation trigger, which activates after about 1 second of silence, making the experience smooth and natural.
- Translate between multiple languages, including Turkish 🇹🇷, English 🇬🇧, Spanish 🇪🇸, and French 🇫🇷, with the possibility of adding more in the future.
Designed with ease of use and efficiency in mind, TranslateAI brings a fluid translation experience for various use cases, whether you're learning a new language, working in multilingual environments, or simply looking for an intuitive speech-to-text tool.
- 🚀 Features
- 📦 Installation
  - macOS
  - Windows
  - Linux
- 🛠️ Usage
- 🧠 How It Works
  - Speech Detection
  - Translation Model
- ⚙️ Configuration
- 🤝 Contributing
- 📝 License
🚀 Features
- 🎙️ Real-time Speech & Audio Translation
- 🔊 Visual Indicator for Sound Intensity
- 🌐 Multi-language Support: Turkish 🇹🇷, English 🇺🇸, Spanish 🇪🇸, French 🇫🇷 (More to come!)
- 🤖 Hugging Face Model Integration (Helsinki-NLP)
- ⚡ Automatic Model Download Based on Selected Language
- 🖥️ User-Friendly PyQt GUI
📦 Installation
git clone https://github.com/furkankarakuz/TranslateAI.git
cd TranslateAI
It’s often best to use a virtual environment:
python -m venv venv
source venv/bin/activate # On Mac/Linux
venv\Scripts\activate # On Windows
Install required Python packages:
pip install -r requirements.txt
Note: If requirements.txt doesn’t exist or you prefer manual installation, make sure to install:
- PyQt5 or PyQt6 (depending on your code)
- pyaudio (or sounddevice/pydub if you adapt the code)
- torch, transformers, datasets (for Hugging Face)
On macOS, you may need to install portaudio via Homebrew before installing pyaudio:
brew install portaudio
Then:
pip install pyaudio
On Windows, you can directly install pyaudio from PyPI:
pip install pyaudio
If you encounter issues, you might need to install the appropriate Microsoft Visual C++ Build Tools.
On Linux (Ubuntu/Debian-based), you may need:
sudo apt-get update
sudo apt-get install portaudio19-dev python3-pyaudio
Then install Python packages as usual:
pip install -r requirements.txt
🛠️ Usage
- Run the Application
  python app.py
- Select Your Audio Device
  - In the top section of the UI, choose your microphone or system audio input (e.g., “MacBook Air Microphone …”).
- Choose Language Pair
  - Select the source and target language. For example, EN-TR means you will speak English and get Turkish translations.
- Start Recording
  - Click Start Record to begin capturing audio.
  - Watch the volume indicator to see if audio is being detected.
  - After you stop speaking for about 1 second, the application will automatically trigger the translation.
- View Translations
  - The translated text will appear in the console.
- Stop Recording
  - Click Stop Record when finished.
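If your microphone does not show up in the device list, it can help to check what PyAudio itself reports. The following is a minimal sketch, not the application's own code: the helper names `input_devices` and `query_pyaudio` are hypothetical, and it assumes the device picker only needs devices that expose at least one input channel.

```python
from typing import Iterable


def input_devices(infos: Iterable[dict]) -> list[tuple[int, str]]:
    """Keep (index, name) for devices that expose at least one input channel."""
    return [
        (int(info["index"]), str(info["name"]))
        for info in infos
        if int(info.get("maxInputChannels", 0)) > 0
    ]


def query_pyaudio() -> list[dict]:
    """Collect device descriptions from PyAudio (needs pyaudio and PortAudio)."""
    import pyaudio  # imported lazily so the filter above works without it

    audio = pyaudio.PyAudio()
    try:
        return [
            audio.get_device_info_by_index(i)
            for i in range(audio.get_device_count())
        ]
    finally:
        audio.terminate()  # always release PortAudio resources
```

Calling `input_devices(query_pyaudio())` on a machine with PyAudio installed should list the indices and names you can expect to see in the UI.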
🧠 How It Works
Speech Detection
- PyAudio captures your microphone input (or chosen device).
- A volume level meter is displayed to help you monitor input levels.
- Once silence (below a certain threshold) is detected for ~1 second, the app processes the captured audio chunk and sends it for transcription & translation.
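The silence-based trigger described above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual implementation: it assumes 16-bit mono PCM, and the sample rate, RMS threshold, and the `SilenceTrigger` class are hypothetical values and names chosen for the example.

```python
import math
import struct
from typing import Optional

SAMPLE_RATE = 16_000     # assumed sample rate, not taken from the project
SILENCE_RMS = 500.0      # assumed loudness threshold for 16-bit samples
SILENCE_SECONDS = 1.0    # the ~1 second delay described above


def rms(chunk: bytes) -> float:
    """Root-mean-square level of 16-bit little-endian mono PCM."""
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", chunk[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


class SilenceTrigger:
    """Buffers audio once speech starts; fires after enough silence."""

    def __init__(self) -> None:
        self.buffer = bytearray()
        self.silent_time = 0.0
        self.heard_speech = False

    def feed(self, chunk: bytes) -> Optional[bytes]:
        """Add one chunk; return the utterance when the silence delay elapses."""
        if rms(chunk) >= SILENCE_RMS:
            self.heard_speech = True
            self.silent_time = 0.0
        else:
            self.silent_time += (len(chunk) // 2) / SAMPLE_RATE
        if not self.heard_speech:
            return None  # ignore silence before anyone has spoken
        self.buffer.extend(chunk)
        if self.silent_time < SILENCE_SECONDS:
            return None
        utterance = bytes(self.buffer)
        self.buffer.clear()
        self.silent_time = 0.0
        self.heard_speech = False
        return utterance
```

In a real capture loop, each buffer read from the PyAudio stream would be passed to `feed`, and any returned utterance would be handed to transcription and translation. The same `rms` value could also drive the volume meter.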
Translation Model
- The application uses Hugging Face Transformers to download and run the appropriate translation model for the chosen language pair.
- Model caching: Once a model is downloaded, it should be reused for subsequent translations to save time.
- The translations are displayed in real-time, providing an instant feedback loop.
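One way to load a model per language pair and reuse it is shown below. Treat it as a sketch: the functions `split_pair`, `model_name`, `load_translator`, and `translate` are hypothetical, and the `Helsinki-NLP/opus-mt-{source}-{target}` pattern is an assumption that holds for many pairs but not all, so check the Hugging Face Hub for the exact model id.

```python
from functools import lru_cache


def split_pair(label: str) -> tuple[str, str]:
    """Turn a UI label such as "EN-TR" into ("en", "tr")."""
    source, target = label.lower().split("-")
    return source, target


def model_name(source: str, target: str) -> str:
    """Build a model id from the common Helsinki-NLP naming pattern (assumed)."""
    return f"Helsinki-NLP/opus-mt-{source}-{target}"


@lru_cache(maxsize=None)
def load_translator(label: str):
    """Create the pipeline once per language pair and reuse it afterwards."""
    from transformers import pipeline  # lazy import: only needed on first use

    return pipeline("translation", model=model_name(*split_pair(label)))


def translate(text: str, label: str) -> str:
    """Translate text with the cached pipeline for the given pair."""
    result = load_translator(label)(text)
    return result[0]["translation_text"]
```

The first call to `translate("Good morning", "EN-ES")` downloads the model into the local Hugging Face cache; later calls for the same pair reuse the in-memory pipeline through `lru_cache`.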
⚙️ Configuration
- Silence Threshold & Delay: The silence delay is currently set to about 1 second. You can modify this value in the code if you want quicker or slower triggers.
- Language Support: To add a new language pair, you need to:
  - Find a Hugging Face translation model that supports that pair.
  - Update the UI to include the new language option.
  - Adjust the code that downloads/loads the model.
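If you adapt the code, keeping these settings in one place makes both changes easier. The sketch below is an assumption about how that could look, not a description of the current code: `TranslatorConfig` and its fields are hypothetical names, and the default model ids are examples from the Hugging Face Hub.

```python
from dataclasses import dataclass, field


def _default_models() -> dict[str, str]:
    # Example pairs only; the project may ship a different set.
    return {
        "EN-ES": "Helsinki-NLP/opus-mt-en-es",
        "EN-FR": "Helsinki-NLP/opus-mt-en-fr",
        "TR-EN": "Helsinki-NLP/opus-mt-tr-en",
    }


@dataclass
class TranslatorConfig:
    """Single source of truth for the trigger delay and language pairs."""

    silence_seconds: float = 1.0
    language_models: dict[str, str] = field(default_factory=_default_models)

    def add_language_pair(self, label: str, model_id: str) -> None:
        """Register a new pair so the UI and the loader both see it."""
        label = label.upper()
        if label in self.language_models:
            raise ValueError(f"{label} is already configured")
        self.language_models[label] = model_id

    def ui_labels(self) -> list[str]:
        """Labels to show in the language-pair dropdown."""
        return sorted(self.language_models)

    def model_for(self, label: str) -> str:
        """Model id to download or load for a selected pair."""
        return self.language_models[label.upper()]
```

With this layout, `config.add_language_pair("EN-DE", "Helsinki-NLP/opus-mt-en-de")` covers the three steps above in one call, and a quicker trigger becomes `TranslatorConfig(silence_seconds=0.5)`.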
🤝 Contributing
Contributions are welcome! Feel free to:
- Fork this repository.
- Create a new branch.
- Commit your changes.
- Open a Pull Request describing the improvements or bug fixes you’ve made.
📝 License
This project is licensed under the Apache License 2.0.

