Navaa transforms your Persian voice notes into polished, full-length podcast episodes using OpenAI's suite of APIs in a simple, automated pipeline.
-
Python 3.9+ for scripting and orchestration
-
OpenAI API
- Whisper for speech-to-text (STT)
- gpt-4o-mini for translation (Persian → English) and script generation
- gpt-4o-mini-tts for text-to-speech (TTS)
-
python-dotenv to manage environment variables securely
-
Logging via Python's
loggingmodule
- Speech-to-Text: Transcribe Persian audio notes into text with Whisper.
- Translation: Translate the Persian transcript into English for clearer LLM understanding.
- Script Generation: Use GPT to draft a conversational Persian podcast script (intro, body, conclusion).
- Text-to-Speech: Synthesize the final Persian script into high-fidelity audio.
git clone https://github.com/YOUR_USERNAME/Navaa.git
cd Navaapython -m venv venv
source venv/bin/activate # macOS/Linux
# or venv\\Scripts\\activate # Windows
pip install -r requirements.txt-
Create a file named
.envin the project root. -
Add your OpenAI key:
OPENAI_API_KEY=your_openai_api_key_here
Place your Persian audio file(s) into the assets/inputs/ directory. For example:
assets/inputs/myvoice.wav
Run the CLI tool with your filename (and optional output directory name):
python main.py --filename myvoice.wav [--outputname podcast_episode1]- --filename (
-f): Name of the input file inassets/inputs/. - --outputname (
-o): (Optional) Custom folder name underassets/outputs/. Defaults to the input file base name.
After successful processing, check:
assets/outputs/{outputname}/
├── process.log # Detailed processing log
├── stt.txt # Whisper transcript
├── translation.txt # English translation
├── script.txt # Generated podcast script (Persian)
└── {outputname}_podcast.wav # Final podcast audio