Want to read a book in a foreign language without losing the original context? EPUB Translator transforms any EPUB into a bilingual edition with AI-powered translations displayed side-by-side with the original text.
Whether you're learning a new language, conducting academic research, or simply enjoying foreign literature, you get both versions in one book - preserving all formatting, images, and structure.
We provide an online demo platform where you can try EPUB Translator's bilingual translation capabilities without any installation. Simply upload your EPUB file and get a translated bilingual edition.
```bash
pip install epub-translator
```

Requirements: Python 3.11, 3.12, or 3.13
The easiest way to use EPUB Translator is through OOMOL Studio with a visual interface. Alternatively, use it directly as a Python library:
```python
from epub_translator import LLM, translate, language, SubmitKind

# Initialize LLM with your API credentials
llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)

# Translate EPUB file using language constants
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
)
```

Track progress with a tqdm progress bar:

```python
from tqdm import tqdm

with tqdm(total=100, desc="Translating", unit="%") as pbar:
    state = {"last": 0.0}

    def on_progress(progress: float) -> None:
        # nonlocal is invalid at module level; track state in a dict instead
        pbar.update((progress - state["last"]) * 100)
        state["last"] = progress

    translate(
        source_path="source.epub",
        target_path="translated.epub",
        target_language="English",
        submit=SubmitKind.APPEND_BLOCK,
        llm=llm,
        on_progress=on_progress,
    )
```

Initialize the LLM client for translation:
```python
LLM(
    key: str,                             # API key
    url: str,                             # API endpoint URL
    model: str,                           # Model name (e.g., "gpt-4")
    token_encoding: str,                  # Token encoding (e.g., "o200k_base")
    cache_path: PathLike | None = None,   # Cache directory path
    timeout: float | None = None,         # Request timeout in seconds
    top_p: float | tuple[float, float] | None = None,
    temperature: float | tuple[float, float] | None = None,
    retry_times: int = 5,                 # Number of retries on failure
    retry_interval_seconds: float = 6.0,  # Interval between retries
    log_dir_path: PathLike | None = None, # Log directory path
)
```

Translate an EPUB file:
```python
translate(
    source_path: PathLike | str,        # Source EPUB file path
    target_path: PathLike | str,        # Output EPUB file path
    target_language: str,               # Target language (e.g., "English", "Chinese")
    submit: SubmitKind,                 # How to insert translations (REPLACE, APPEND_TEXT, or APPEND_BLOCK)
    user_prompt: str | None = None,     # Custom translation instructions
    max_retries: int = 5,               # Maximum retries for failed translations
    max_group_tokens: int = 2600,       # Maximum tokens per translation group
    concurrency: int = 1,               # Number of concurrent translation tasks
    llm: LLM | None = None,             # Single LLM instance for both translation and filling
    translation_llm: LLM | None = None, # LLM instance for translation (overrides llm)
    fill_llm: LLM | None = None,        # LLM instance for XML filling (overrides llm)
    on_progress: Callable[[float], None] | None = None,              # Progress callback (0.0-1.0)
    on_fill_failed: Callable[[FillFailedEvent], None] | None = None, # Error callback
)
```

Note: Either llm or both translation_llm and fill_llm must be provided. Using separate LLMs allows for task-specific optimization.
The submit parameter controls how translated content is inserted into the document. Use SubmitKind enum to specify the insertion mode:
```python
from epub_translator import SubmitKind

# Three available modes:
# - SubmitKind.REPLACE: Replace original content with translation (single-language output)
# - SubmitKind.APPEND_TEXT: Append translation as inline text (bilingual output)
# - SubmitKind.APPEND_BLOCK: Append translation as block elements (bilingual output, recommended)
```

Mode Comparison:

- SubmitKind.REPLACE: Creates a single-language translation by replacing original text with translated content. Useful for creating books in the target language only.
- SubmitKind.APPEND_TEXT: Appends translations as inline text immediately after the original content. Both languages appear in the same paragraph, creating a continuous reading flow.
- SubmitKind.APPEND_BLOCK (Recommended): Appends translations as separate block elements (paragraphs) after the original. This creates clear visual separation between languages, making it ideal for side-by-side bilingual reading.
Example:

```python
# For bilingual books (recommended)
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
)

# For single-language translation
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.REPLACE,
    llm=llm,
)
```

EPUB Translator provides predefined language constants for convenience. You can use these constants instead of writing language names as strings:
```python
from epub_translator import language

# Usage example:
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
)

# You can also use custom language strings:
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language="Icelandic",  # For languages not in the constants
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
)
```

Monitor translation errors using the on_fill_failed callback. The system automatically retries failed translations up to max_retries times (default: 5). Most errors are recovered during retries and don't affect the final output.
```python
from epub_translator import FillFailedEvent

def handle_fill_error(event: FillFailedEvent):
    # Only log critical errors that will affect the final EPUB
    if event.over_maximum_retries:
        print(f"Critical error after {event.retried_count} attempts:")
        print(f"  {event.error_message}")
        print("  This error will be present in the final EPUB file!")

translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
    on_fill_failed=handle_fill_error,
)
```

Understanding Error Severity:

The FillFailedEvent contains:

- error_message: str - Description of the error
- retried_count: int - Current retry attempt number (1 to max_retries)
- over_maximum_retries: bool - Whether the error is critical
Error Categories:

- Recoverable errors (over_maximum_retries=False): Errors during retry attempts. The system will continue retrying and may resolve these automatically. Safe to ignore in most cases.
- Critical errors (over_maximum_retries=True): Errors that persist after all retry attempts. These will appear in the final EPUB file and should be investigated.
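A minimal sketch of collecting critical errors for a post-run report, assuming only the FillFailedEvent fields documented above (the collect_critical helper is illustrative, not part of the library):

```python
# Collect only critical fill errors for a post-run summary.
# Assumes FillFailedEvent exposes: error_message, retried_count,
# over_maximum_retries (as documented above).
critical_errors: list[str] = []

def collect_critical(event) -> None:
    if event.over_maximum_retries:
        # These errors persist in the final EPUB and warrant inspection
        critical_errors.append(
            f"after {event.retried_count} retries: {event.error_message}"
        )
    # Recoverable errors (over_maximum_retries=False) are ignored here
```

Pass collect_critical as on_fill_failed; after translate() returns, a non-empty critical_errors list means the output EPUB contains fragments worth inspecting manually.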
Advanced Usage:
For verbose logging during translation debugging:

```python
def handle_fill_error(event: FillFailedEvent):
    if event.over_maximum_retries:
        # Critical: affects final output
        print(f"❌ CRITICAL: {event.error_message}")
    else:
        # Informational: system is retrying
        print(f"⚠️ Retry {event.retried_count}: {event.error_message}")
```

Use separate LLM instances for translation and XML structure filling with different optimization parameters:
```python
# Create two LLM instances with different temperatures
translation_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.8,  # Higher temperature for creative translation
)

fill_llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    temperature=0.3,  # Lower temperature for structure preservation
)

translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    translation_llm=translation_llm,
    fill_llm=fill_llm,
)
```

OpenAI:

```python
llm = LLM(
    key="sk-...",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)
```

Azure OpenAI:

```python
llm = LLM(
    key="your-azure-key",
    url="https://your-resource.openai.azure.com/openai/deployments/your-deployment",
    model="gpt-4",
    token_encoding="o200k_base",
)
```

Any service with an OpenAI-compatible API can be used:

```python
llm = LLM(
    key="your-api-key",
    url="https://your-service.com/v1",
    model="your-model",
    token_encoding="o200k_base",  # Match your model's encoding
)
```

Provide specific translation instructions:
```python
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language="English",
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
    user_prompt="Use formal language and preserve technical terminology",
)
```

Enable caching to resume translation progress after failures:
```python
llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
    cache_path="./translation_cache",  # Translations are cached here
)
```

Speed up translation by processing multiple text segments concurrently. Use the concurrency parameter to control how many translation tasks run in parallel:
```python
translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language="English",
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
    concurrency=4,  # Process 4 segments concurrently
)
```

Performance Tips:

- Start with concurrency=4 and adjust based on your API rate limits and system resources
- Higher concurrency values can significantly reduce translation time for large books
- The translation order is preserved regardless of concurrency settings
- Monitor your API provider's rate limits to avoid throttling
Thread Safety:
When using concurrency > 1, ensure that any custom callback functions (on_progress, on_fill_failed) are thread-safe. Built-in callbacks are thread-safe by default.
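For example, a thread-safe progress callback can be sketched with a lock (the ProgressTracker class below is an illustration, not part of the library):

```python
import threading

class ProgressTracker:
    """Thread-safe progress callback for use with concurrency > 1."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._last = 0.0

    def __call__(self, progress: float) -> None:
        # translate() may invoke this from several worker threads;
        # the lock serializes updates to the shared high-water mark.
        with self._lock:
            if progress > self._last:
                self._last = progress
                print(f"Progress: {progress * 100:.1f}%")
```

Pass an instance as on_progress=ProgressTracker(); stale updates arriving out of order are simply dropped.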
Track token consumption during translation to monitor API costs and usage:
```python
from epub_translator import LLM, translate, language, SubmitKind

llm = LLM(
    key="your-api-key",
    url="https://api.openai.com/v1",
    model="gpt-4",
    token_encoding="o200k_base",
)

translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    llm=llm,
)

# Access token statistics after translation
print(f"Total tokens: {llm.total_tokens}")
print(f"Input tokens: {llm.input_tokens}")
print(f"Input cache tokens: {llm.input_cache_tokens}")
print(f"Output tokens: {llm.output_tokens}")
```

Available Statistics:

- total_tokens - Total number of tokens used (input + output)
- input_tokens - Number of prompt/input tokens
- input_cache_tokens - Number of cached input tokens (when using prompt caching)
- output_tokens - Number of generated/completion tokens
Real-time Monitoring:
You can also monitor token usage in real-time during translation:
```python
from tqdm import tqdm
import time

with tqdm(total=100, desc="Translating", unit="%") as pbar:
    state = {"last": 0.0}
    start_time = time.time()

    def on_progress(progress: float) -> None:
        # nonlocal is invalid at module level; track state in a dict instead
        pbar.update((progress - state["last"]) * 100)
        state["last"] = progress
        # Update token stats in the progress bar
        pbar.set_postfix({
            "tokens": llm.total_tokens,
            "cost_est": f"${llm.total_tokens * 0.00001:.4f}",  # Estimate based on your pricing
        })

    translate(
        source_path="source.epub",
        target_path="translated.epub",
        target_language=language.ENGLISH,
        submit=SubmitKind.APPEND_BLOCK,
        llm=llm,
        on_progress=on_progress,
    )

elapsed = time.time() - start_time
print(f"\nTranslation completed in {elapsed:.1f}s")
print(f"Total tokens used: {llm.total_tokens:,}")
print(f"Average tokens/second: {llm.total_tokens / elapsed:.1f}")
```

Dual-LLM Token Tracking:
When using separate LLMs for translation and filling, each LLM tracks its own statistics:
```python
translation_llm = LLM(key="...", url="...", model="gpt-4", token_encoding="o200k_base")
fill_llm = LLM(key="...", url="...", model="gpt-4", token_encoding="o200k_base")

translate(
    source_path="source.epub",
    target_path="translated.epub",
    target_language=language.ENGLISH,
    submit=SubmitKind.APPEND_BLOCK,
    translation_llm=translation_llm,
    fill_llm=fill_llm,
)

print(f"Translation tokens: {translation_llm.total_tokens}")
print(f"Fill tokens: {fill_llm.total_tokens}")
print(f"Combined total: {translation_llm.total_tokens + fill_llm.total_tokens}")
```

Note: Token statistics are cumulative across all API calls made by the LLM instance. The counts only increase and are thread-safe when using concurrent translation.
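The per-instance counters make cost estimation straightforward. A sketch, assuming prices per million tokens; the rates below are placeholders, so substitute your provider's actual pricing:

```python
def estimate_cost(
    input_tokens: int,
    input_cache_tokens: int,
    output_tokens: int,
    input_price: float = 2.50,    # $ per 1M uncached input tokens (placeholder)
    cached_price: float = 1.25,   # $ per 1M cached input tokens (placeholder)
    output_price: float = 10.00,  # $ per 1M output tokens (placeholder)
) -> float:
    # Cached input tokens are usually billed at a discount, so subtract
    # them from the full-price input count before pricing each bucket.
    uncached = input_tokens - input_cache_tokens
    return (
        uncached * input_price
        + input_cache_tokens * cached_price
        + output_tokens * output_price
    ) / 1_000_000
```

Call it with the statistics above, e.g. estimate_cost(llm.input_tokens, llm.input_cache_tokens, llm.output_tokens), summing over both instances in the dual-LLM setup.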
PDF Craft converts PDF files into EPUB and other formats, with a focus on scanned books. Combine PDF Craft with EPUB Translator to convert and translate scanned PDF books into bilingual EPUB format.
Workflow: Scanned PDF → [PDF Craft] → EPUB → [EPUB Translator] → Bilingual EPUB
For a complete tutorial, watch: Convert scanned PDF books to EPUB format and translate them into bilingual books
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: GitHub Issues
- OOMOL Studio: Open in OOMOL Studio


