AI Subtitle Assistant

中文版本 (Chinese Version)

AI Subtitle Assistant is a command-line tool that uses AI technologies (Whisper and Large Language Models) to generate high-quality subtitles for video and audio files.

Features

  • Multi-format Support: Handles various common video and audio file formats.
  • High-precision Speech Recognition: Uses OpenAI's Whisper model for accurate audio transcription.
  • Intelligent Translation and Correction: Leverages Large Language Models (LLMs) for translation, correcting errors based on context, and identifying proper nouns.
  • Flexible LLM Configuration: Allows users to customize the API base URL, key, and model for their LLM provider.
  • Model Selection: Choose from different LLM models for translation tasks.
  • Concurrent Translation: Processes multiple translation requests concurrently for faster performance.
  • Translation Validation: Verifies that the original text returned by the LLM matches the input text to prevent hallucinations.
  • Weighted Translation Quality: Balances accuracy and fluency using a 1:0.6 importance weighting in the translation prompt.
  • Context Limit Handling: Detects and warns about model context limits that may cause truncated outputs.
  • Debug Mode: Enables detailed output of intermediate JSON data for troubleshooting.
  • Standard Subtitle Output: Generates standard UTF-8 encoded SRT subtitle files.
  • Bilingual Subtitles: Can generate bilingual subtitles for language learning.
  • Embedded Subtitle Extraction: Can detect and extract existing subtitle tracks from video files.
  • Internationalization: Supports multiple languages for the user interface (currently English and Chinese).
  • User-friendly CLI: Provides a clear and easy-to-use command-line interface with subcommands.
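The translation-validation feature above can be illustrated with a small sketch (the function name and data shapes here are hypothetical, not the tool's real internals): the LLM is asked to echo each original line alongside its translation, and any mismatch between the echoed text and the actual input flags a likely hallucination or dropped line.

```python
def validate_translations(inputs, llm_pairs):
    """Compare the original text echoed back by the LLM against the
    actual input lines. A mismatch suggests the model hallucinated,
    altered, or dropped a line. (Illustrative sketch only.)"""
    mismatches = []
    for i, (original, (echoed, _translation)) in enumerate(zip(inputs, llm_pairs)):
        if original.strip() != echoed.strip():
            mismatches.append(i)
    return mismatches

# Example: the model altered the second original line.
inputs = ["Hello there.", "See you tomorrow."]
pairs = [("Hello there.", "你好。"), ("See you later.", "回头见。")]
print(validate_translations(inputs, pairs))  # -> [1]
```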

Installation

Recommended: Install from PyPI

pip install ai-subtitle

Or install from source:

git clone https://github.com/shdancer/ai-subtitle
cd ai-subtitle
python3 -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
pip install .

ffmpeg is required for audio processing.

  • On macOS (using Homebrew):
    brew install ffmpeg
  • On Debian/Ubuntu:
    sudo apt update && sudo apt install ffmpeg
  • On Windows (using Chocolatey):
    choco install ffmpeg

Usage

After installation, you can use the ai-subtitle command.

ai-subtitle <command> [options]

Global Options

  • --language {en,zh}: Sets the display language for the tool. Defaults to your system's language.

Commands

transcribe

Transcribes an audio/video file to an SRT file.

Usage: ai-subtitle transcribe <input_file> [options]

Arguments:

  • input_file: Path to the input video or audio file.

Options:

  • -o, --output: Path to the output SRT file. If not specified, prints to standard output.
  • -m, --model: The Whisper model to use (e.g., tiny, base, small, medium, large). Default is base.
  • --force-transcribe: Force transcription even if embedded subtitles are found.

Example:

# Transcribe a video and save to a file
ai-subtitle transcribe my_video.mp4 -o my_video.srt

# Extract an embedded subtitle instead of transcribing
ai-subtitle transcribe my_movie.mkv -o movie_subs.srt

translate

Translates an existing SRT file into a bilingual SRT file.

Usage: ai-subtitle translate [input_file] [options]

Arguments:

  • input_file: Path to the input SRT file. Reads from standard input if not provided.

Options:

  • -o, --output: Path to the output bilingual SRT file. Prints to standard output if not specified.
  • -t, --target-language: The target language for translation (e.g., "Chinese", "English"). Default is "Chinese".
  • --model: Select the model to use for translation (e.g., "gpt-3.5-turbo", "gpt-4"). Default is "gpt-3.5-turbo".
  • --max-workers: Maximum number of concurrent translation requests. Default is 5.
  • --list-models: List available models from the API and exit.
  • --api-base-url: Custom base URL for the LLM provider.
  • --api-key: Custom API key for the LLM provider.

Example:

# Transcribe and then translate
ai-subtitle transcribe my_video.mp4 | ai-subtitle translate -t "Japanese" -o bilingual.srt
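The behavior behind --max-workers can be sketched with a thread pool. Here `translate_chunk` is a hypothetical stand-in for the real LLM request; the key detail is that results come back in input order, so subtitle timing stays aligned:

```python
from concurrent.futures import ThreadPoolExecutor

def translate_chunk(chunk):
    # Stand-in for the real LLM request: pretend each chunk of
    # subtitle text comes back translated.
    return f"translated:{chunk}"

def translate_all(chunks, max_workers=5):
    # Submit chunks concurrently; pool.map preserves input order,
    # which keeps translations aligned with their timestamps.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(translate_chunk, chunks))

print(translate_all(["chunk-0", "chunk-1", "chunk-2"]))
```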

config

Manages configuration settings for the AI Subtitle Assistant.

Usage: ai-subtitle config [options]

Options:

  • --show-path: Show the configuration file path.
  • --create: Create or update configuration interactively.

Example:

# Show the configuration file path
ai-subtitle config --show-path

# Create or update configuration
ai-subtitle config --create

How It Works

  1. Audio Extraction/Transcription: For the transcribe command, it either extracts existing subtitles or uses ffmpeg to extract audio and Whisper to transcribe it into timed text segments.
  2. Chunking & Translation: For the translate command, it reads an SRT file, chunks the text to fit the LLM's context window, and sends it for translation.
  3. LLM Processing: The text is sent to the configured LLM for translation and refinement. The process includes retries and a progress bar.
  4. SRT Generation: The final processed text is formatted into a standard .srt file, either as a simple transcription or a bilingual subtitle.
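Step 4 can be sketched as follows. Given timed segments, a bilingual SRT block places the translated line directly under the original within each numbered entry (a minimal illustration; the tool's actual internals may differ):

```python
def srt_timestamp(seconds):
    # Format seconds as the SRT "HH:MM:SS,mmm" timestamp.
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_bilingual_srt(segments):
    # segments: list of (start_sec, end_sec, original, translation)
    blocks = []
    for i, (start, end, text, translation) in enumerate(segments, 1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text}\n{translation}"
        )
    return "\n\n".join(blocks) + "\n"

print(to_bilingual_srt([(0.0, 2.5, "Hello.", "你好。")]))
```

The output is standard UTF-8 SRT, so the file plays in any player that supports external subtitles.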

Changelog

v0.1.4

  • Added: Model selection feature for translation with --model option
  • Added: List available models with --list-models option
  • Added: Debug mode for troubleshooting translation issues (enabled via AI_SUBTITLE_DEBUG=1 environment variable)
  • Improved: Translation prompt to better handle different language structures and prevent subtitle misalignment
  • Added: Graceful exit handling for keyboard interrupts (Ctrl+C)
  • Added: Validation to check for missing translations
  • Added: Concurrent translation processing for improved performance
  • Added: Translation validation to verify original text matches input text
  • Improved: Translation quality by adjusting importance weights to better balance accuracy and fluency (1:0.6)
  • Added: --max-workers option to control the number of concurrent translation requests
  • Added: Context limit handling to detect and warn about truncated outputs

v0.1.2

  • Fixed: SRT multi-line content parsing bug, now all lines are preserved and correctly translated
  • Improved: Documentation and README structure
  • Updated: PyPI installation instructions and project metadata

v0.1.1

  • Fixed issue with pipeline operations where debug output was interfering with standard output
  • Improved error handling and messaging
  • Updated documentation with installation instructions from PyPI

v0.1.0

  • Initial release with core functionality
