Podcast Translator Project

This guide explains how to set up and run MediaTranslator, a fully open-source application that runs locally and requires no cloud services.

Prerequisites

Required Software

Java JDK 21 or higher
Maven for dependency management
FFmpeg (for audio extraction and processing)
Ollama (for LLM capabilities) - https://ollama.com
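One way to confirm that the command-line tools above are installed is to look for them on the PATH. The following is a minimal sketch and not part of the project: the class name PrerequisiteCheck and the helper missingTools are hypothetical, and the executable names (java, mvn, ffmpeg, ollama) are assumed defaults that may differ on your system (on Windows, Maven is typically started through mvn.cmd).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the project.
final class PrerequisiteCheck {

    // Returns the tools that are not found as executables in any directory of pathValue.
    static List<String> missingTools(List<String> tools, String pathValue) {
        List<String> missing = new ArrayList<>();
        for (String tool : tools) {
            boolean found = false;
            for (String dir : pathValue.split(File.pathSeparator)) {
                if (dir.isEmpty()) {
                    continue;
                }
                // Also try the ".exe" suffix so the check has a chance on Windows.
                if (Files.isExecutable(Path.of(dir, tool))
                        || Files.isExecutable(Path.of(dir, tool + ".exe"))) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                missing.add(tool);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String path = System.getenv().getOrDefault("PATH", "");
        // Assumed executable names; adjust them to your installation.
        List<String> required = List.of("java", "mvn", "ffmpeg", "ollama");
        System.out.println("Missing tools: " + missingTools(required, path));
    }
}
```

An empty list means every tool was found. A tool reported as missing may still be installed under a different name or outside the PATH.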

Speech-to-Text Option:

OpenAI Whisper locally installed:
- Install whisper.cpp:
  - Clone from https://github.com/ggerganov/whisper.cpp
  - In the directory, run make
  - Download a model, e.g. sh ./models/download-ggml-model.sh large-v3-turbo-q8_0
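As a rough illustration of how the two tools fit together, the sketch below builds the command lines for converting audio with FFmpeg and transcribing it with whisper.cpp. It is not the project's actual code. Assumptions: whisper.cpp is fed 16 kHz, mono, 16-bit WAV input (as its README describes at the time of writing), the whisper binary is called whisper-cli (older builds name it main, and its location depends on how you built it), and the downloaded model is stored as models/ggml-large-v3-turbo-q8_0.bin. The class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical helper, not part of the project.
final class WhisperCommands {

    // FFmpeg command that converts any input to 16 kHz, mono, 16-bit PCM WAV.
    static List<String> ffmpegToWav(String input, String wavOutput) {
        return List.of("ffmpeg", "-y", "-i", input,
                "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", wavOutput);
    }

    // whisper.cpp command: -m model, -f input file, -l spoken language, -otxt writes a .txt transcript.
    static List<String> whisperTranscribe(String whisperBinary, String modelPath,
                                          String wavInput, String language) {
        return List.of(whisperBinary, "-m", modelPath, "-f", wavInput, "-l", language, "-otxt");
    }

    public static void main(String[] args) {
        // File names and binary path below are assumptions for illustration.
        List<String> convert = ffmpegToWav("episode.mp3", "episode.wav");
        List<String> transcribe = whisperTranscribe("./whisper-cli",
                "models/ggml-large-v3-turbo-q8_0.bin", "episode.wav", "en");
        System.out.println(String.join(" ", convert));
        System.out.println(String.join(" ", transcribe));
    }
}
```

Each list can be passed to new ProcessBuilder(command).inheritIO().start() to actually run the step. The sketch only prints the commands so that it works without either tool installed.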

Choose One Text-to-Speech Option:

eSpeak: sudo apt-get install espeak (Linux)
Pico2Wave: sudo apt-get install libttspico-utils (Linux)
Mac's built-in 'say' command (MacOS only)
Python pyttsx3: pip install pyttsx3

Setting up Ollama

Install Ollama from https://ollama.com

Pull a model for translation and text enhancement:

ollama pull mistral    # Basic, fast model
ollama pull llama3     # Alternative option
ollama pull mixtral    # More capable for complex tasks

Ensure the Ollama service is running before using the application
Check the model is available:
```
ollama list
```
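The same check can be done programmatically through Ollama's HTTP API, which lists local models at /api/tags. The sketch below is illustrative only and is not taken from the project: the class OllamaModels and its methods are hypothetical, the base URL assumes the default port 11434, and the regular expression is a shortcut that stands in for a real JSON parser.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the project.
final class OllamaModels {

    // Shortcut: pulls every "name":"..." value out of the response. Use a JSON library in real code.
    private static final Pattern NAME = Pattern.compile("\"name\"\\s*:\\s*\"([^\"]+)\"");

    static List<String> parseModelNames(String tagsJson) {
        List<String> names = new ArrayList<>();
        Matcher m = NAME.matcher(tagsJson);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    // "mistral" matches both "mistral" and tagged names such as "mistral:latest".
    static boolean hasModel(List<String> names, String wanted) {
        return names.stream().anyMatch(n -> n.equals(wanted) || n.startsWith(wanted + ":"));
    }

    static String fetchTags(String baseUrl) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/tags")).GET().build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) {
        String baseUrl = "http://localhost:11434"; // assumed default address
        try {
            List<String> names = parseModelNames(fetchTags(baseUrl));
            System.out.println("Installed models: " + names);
            System.out.println("mistral available: " + hasModel(names, "mistral"));
        } catch (Exception e) {
            System.out.println("Could not reach Ollama at " + baseUrl + ": " + e);
        }
    }
}
```

If the request fails, the Ollama service is most likely not running or is listening on a different address.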

Building and Running the Application

mvn clean package
java -jar target/media-translator-1.0-SNAPSHOT.jar

Usage Instructions

Create a settings file named settings.properties, based on src/main/resources/settings-example.properties
Run the application
The application will process the file and generate the translated audio
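For readers unfamiliar with the .properties format, the sketch below shows how such a file can be read with java.util.Properties. It is not the application's own loading code. The keys target.language and ollama.model are hypothetical placeholders; the real keys are the ones listed in settings-example.properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper, not part of the project.
final class SettingsSketch {

    // Parses key=value text in the standard Java .properties format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of("settings.properties");
        // Falls back to made-up sample content when no settings file exists yet.
        String text = Files.exists(file)
                ? Files.readString(file)
                : "target.language=es\nollama.model=mistral\n";
        Properties settings = parse(text);
        // Hypothetical keys; check settings-example.properties for the real ones.
        System.out.println("target.language = " + settings.getProperty("target.language", "en"));
        System.out.println("ollama.model    = " + settings.getProperty("ollama.model", "mistral"));
    }
}
```

The second argument to getProperty is a default that is returned when the key is absent from the file.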

Troubleshooting

Speech Recognition Issues

Make sure whisper.cpp is properly installed and that a model has been downloaded

Text-to-Speech Issues

The application will try multiple TTS options in order
If using pyttsx3, ensure you have compatible voices installed for your target language
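The "try in order" behaviour can be pictured as picking the first command that is actually installed. The sketch below illustrates that idea under stated assumptions and is not the application's real logic: the candidate order (espeak, pico2wave, say) is invented for the example, pyttsx3 is left out because it is a Python library rather than a standalone command, and the class and method names are hypothetical.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper, not part of the project.
final class TtsFallback {

    // Assumed order for illustration only; the application's real order may differ.
    static final List<String> CANDIDATES = List.of("espeak", "pico2wave", "say");

    // Returns the first candidate for which the availability check succeeds.
    static Optional<String> firstAvailable(List<String> candidates, Predicate<String> isAvailable) {
        return candidates.stream().filter(isAvailable).findFirst();
    }

    // True if an executable with this name exists in some PATH directory.
    static boolean onPath(String command) {
        String path = System.getenv().getOrDefault("PATH", "");
        for (String dir : path.split(File.pathSeparator)) {
            if (!dir.isEmpty() && Files.isExecutable(Path.of(dir, command))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Optional<String> engine = firstAvailable(CANDIDATES, TtsFallback::onPath);
        System.out.println(engine.map(e -> "Would use: " + e)
                .orElse("No TTS command found on PATH"));
    }
}
```

Running it shows which of the listed commands would be selected on your machine, which can help narrow down why no audio is produced.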

Ollama Connection Issues

Check that the Ollama service is running on http://localhost:11434
Verify you have pulled the model you're trying to use

Supported Languages

The application supports translation between any languages that the selected Ollama model can handle. Speech synthesis capabilities vary depending on your chosen TTS system, but commonly supported languages include:

English (en)
Spanish (es)
French (fr)
German (de)
Italian (it)
Portuguese (pt)
Russian (ru)
Chinese (zh)
Japanese (ja)

To add more language support for Text-to-Speech, install additional language packs for your chosen TTS system.
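To show how the language codes above might end up in a translation request, the sketch below builds a JSON body for Ollama's /api/generate endpoint (fields model, prompt, and stream). It is an illustration only: the prompt wording, the class TranslateRequestSketch, and its methods are hypothetical and are not taken from the application, which may phrase its requests differently.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of the project.
final class TranslateRequestSketch {

    // Minimal JSON string escaping so that quotes and newlines in the text stay valid.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"') sb.append("\\\"");
            else if (c == '\\') sb.append("\\\\");
            else if (c == '\n') sb.append("\\n");
            else if (c == '\r') sb.append("\\r");
            else if (c == '\t') sb.append("\\t");
            else if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
            else sb.append(c);
        }
        return sb.toString();
    }

    // Builds a non-streaming /api/generate body; the prompt wording is an assumption.
    static String generateBody(String model, String sourceLang, String targetLang, String text) {
        String prompt = "Translate the following text from " + sourceLang + " to " + targetLang
                + ". Reply with the translation only.\n\n" + text;
        return "{\"model\":\"" + escapeJson(model) + "\",\"prompt\":\"" + escapeJson(prompt)
                + "\",\"stream\":false}";
    }

    static String post(String baseUrl, String body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) throws Exception {
        String body = generateBody("mistral", "en", "es", "Welcome to the show.");
        System.out.println(body);
        // Pass --send to actually call a local Ollama instance on the assumed default port.
        if (args.length > 0 && args[0].equals("--send")) {
            System.out.println(post("http://localhost:11434", body));
        }
    }
}
```

With stream set to false, Ollama returns a single JSON object whose response field holds the generated text. How well a given language pair works depends on the model you pulled.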

foojayio/podcast-translator
