AI Subtitle Assistant is a command-line tool that uses AI technologies (Whisper and Large Language Models) to generate high-quality subtitles for video and audio files.
- Multi-format Support: Handles various common video and audio file formats.
- High-precision Speech Recognition: Uses OpenAI's Whisper model for accurate audio transcription.
- Intelligent Translation and Correction: Leverages Large Language Models (LLMs) for translation, correcting errors based on context, and identifying proper nouns.
- Flexible LLM Configuration: Allows users to customize the API base URL, key, and model for their LLM provider.
- Model Selection: Choose from different LLM models for translation tasks.
- Concurrent Translation: Processes multiple translation requests concurrently for faster performance.
- Translation Validation: Verifies that the original text returned by the LLM matches the input text to prevent hallucinations.
- Improved Translation Quality: Adjusted importance weights to better balance accuracy and fluency (accuracy:fluency = 1:0.6).
- Context Limit Handling: Detects and warns about model context limits that may cause truncated outputs.
- Debug Mode: Enables detailed output of intermediate JSON data for troubleshooting.
- Standard Subtitle Output: Generates standard UTF-8 encoded SRT subtitle files.
- Bilingual Subtitles: Can generate bilingual subtitles for language learning.
- Embedded Subtitle Extraction: Can detect and extract existing subtitle tracks from video files.
- Internationalization: Supports multiple languages for the user interface (currently English and Chinese).
- User-friendly CLI: Provides a clear and easy-to-use command-line interface with subcommands.
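As a quick illustration of these features, here is a hedged end-to-end sketch (the file names are illustrative; installation and LLM configuration are described below):

```bash
# Transcribe with Whisper, then pipe the SRT into the LLM-based translator for a bilingual file
ai-subtitle transcribe talk.mp4 | ai-subtitle translate -t "Chinese" -o talk.bilingual.srt
```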
Recommended: Install from PyPI
pip install ai-subtitle
Or install from source:
git clone https://github.com/shdancer/ai-subtitle
cd ai-subtitle
python3 -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
pip install .
`ffmpeg` is required for audio processing.
- On macOS (using Homebrew):
brew install ffmpeg
- On Debian/Ubuntu:
sudo apt update && sudo apt install ffmpeg
- On Windows (using Chocolatey):
choco install ffmpeg
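Regardless of platform, it may be worth confirming that ffmpeg is reachable before running the tool:

```bash
# Print the installed ffmpeg version to confirm it is available on PATH
ffmpeg -version
```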
After installation, you can use the ai-subtitle command.
ai-subtitle <command> [options]
--language {en,zh}: Sets the display language for the tool. Defaults to your system's language.
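For instance, to force the Chinese interface for a single run (a sketch; the global flag is assumed here to precede the subcommand, and the file name is illustrative):

```bash
# Run the transcribe subcommand with the interface language set to Chinese
ai-subtitle --language zh transcribe my_video.mp4 -o my_video.srt
```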
Transcribes an audio/video file to an SRT file.
Usage:
ai-subtitle transcribe <input_file> [options]
Arguments:
input_file: Path to the input video or audio file.
Options:
-o, --output: Path to the output SRT file. If not specified, prints to standard output.
-m, --model: The Whisper model to use (e.g., `tiny`, `base`, `small`, `medium`, `large`). Default is `base`.
--force-transcribe: Force transcription even if embedded subtitles are found.
Example:
# Transcribe a video and save to a file
ai-subtitle transcribe my_video.mp4 -o my_video.srt
# Extract an embedded subtitle instead of transcribing
ai-subtitle transcribe my_movie.mkv -o movie_subs.srt
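The options above can also be combined, for example to pick a larger Whisper model and skip any embedded subtitle track (a sketch; the file name and model choice are illustrative):

```bash
# Force a fresh Whisper transcription with the medium model, ignoring embedded subtitles
ai-subtitle transcribe interview.mkv -m medium --force-transcribe -o interview.srt
```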
Translates an existing SRT file into a bilingual SRT file.
Usage:
ai-subtitle translate [input_file] [options]
Arguments:
input_file: Path to the input SRT file. Reads from standard input if not provided.
Options:
-o, --output: Path to the output bilingual SRT file. Prints to standard output if not specified.
-t, --target-language: The target language for translation (e.g., "Chinese", "English"). Default is "Chinese".
--model: Select the model to use for translation (e.g., "gpt-3.5-turbo", "gpt-4"). Default is "gpt-3.5-turbo".
--max-workers: Maximum number of concurrent translation requests. Default is 5.
--list-models: List available models from the API and exit.
--api-base-url: Custom base URL for the LLM provider.
--api-key: Custom API key for the LLM provider.
Example:
# Transcribe and then translate
ai-subtitle transcribe my_video.mp4 | ai-subtitle translate -t "Japanese" -o bilingual.srt
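The model, concurrency, and API options can be combined as sketched below; the endpoint URL, API key, and model name are placeholders for whatever your LLM provider actually exposes:

```bash
# Ask the configured endpoint which models it offers
ai-subtitle translate --list-models --api-base-url https://api.example.com/v1 --api-key sk-placeholder

# Translate an existing SRT with an explicit model and more concurrent requests
ai-subtitle translate my_video.srt -t "English" --model gpt-4 --max-workers 8 -o my_video.bilingual.srt
```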
Manages configuration settings for the AI Subtitle Assistant.
Usage:
ai-subtitle config [options]
Options:
--show-path: Show the configuration file path.
--create: Create or update configuration interactively.
Example:
# Show the configuration file path
ai-subtitle config --show-path
# Create or update configuration
ai-subtitle config --create
- Audio Extraction/Transcription: For the `transcribe` command, it either extracts existing subtitles or uses `ffmpeg` to extract audio and `whisper` to transcribe it into timed text segments.
- Chunking & Translation: For the `translate` command, it reads an SRT file, chunks the text to fit the LLM's context window, and sends it for translation.
- LLM Processing: The text is sent to the configured LLM for translation and refinement. The process includes retries and a progress bar.
- SRT Generation: The final processed text is formatted into a standard `.srt` file, either as a simple transcription or a bilingual subtitle.
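If the LLM step produces truncated or mismatched output, the debug mode mentioned above can be switched on to inspect the intermediate JSON; a minimal sketch, assuming a POSIX shell:

```bash
# Enable detailed debug output for a single translate run
AI_SUBTITLE_DEBUG=1 ai-subtitle translate my_video.srt -t "Chinese" -o my_video.bilingual.srt
```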
- Added: Model selection feature for translation with `--model` option
- Added: List available models with `--list-models` option
- Added: Debug mode for troubleshooting translation issues (enabled via `AI_SUBTITLE_DEBUG=1` environment variable)
- Improved: Translation prompt to better handle different language structures and prevent subtitle misalignment
- Added: Graceful exit handling for keyboard interrupts (Ctrl+C)
- Added: Validation to check for missing translations
- Added: Concurrent translation processing for improved performance
- Added: Translation validation to verify original text matches input text
- Improved: Translation quality by adjusting importance weights to better balance accuracy and fluency (1:0.6)
- Added: `--max-workers` option to control the number of concurrent translation requests
- Added: Context limit handling to detect and warn about truncated outputs
- Fixed: SRT multi-line content parsing bug, now all lines are preserved and correctly translated
- Improved: Documentation and README structure
- Updated: PyPI installation instructions and project metadata
- Fixed issue with pipeline operations where debug output was interfering with standard output
- Improved error handling and messaging
- Updated documentation with installation instructions from PyPI
- Initial release with core functionality