Markdown to ElevenLabs is an open-source project that converts Markdown files into high-quality voiceovers using the ElevenLabs Text-to-Speech API. It is designed for creating natural-sounding audio from written content, such as podcasts and audiobooks.
- Markdown Parsing: Splits Markdown files into sections, intelligently grouping paragraphs and lists.
- Text-to-Speech Conversion: Uses ElevenLabs API to generate realistic voiceovers with customizable voice settings.
- Audio Processing: Combines generated audio sections into a single cohesive file, complete with natural pauses.
- Flexible Operation Modes:
  - Process Markdown only (`--markdown-only`).
  - Generate audio only (`--audio-only`).
  - Combine existing audio files only (`--combine-only`).
- Error Handling and Preprocessing:
  - Cleans and normalizes text for smooth audio generation.
  - Handles special characters (`—`, `…`, etc.) gracefully.
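To illustrate the kind of preprocessing described above, here is a minimal sketch of text normalization. It is an assumption about what "cleaning" might involve, not the project's actual code: the project lists `unidecode` for this step, while the sketch uses only the standard library, and both `REPLACEMENTS` and `normalize_text` are hypothetical names.

```python
import re
import unicodedata

# Hypothetical replacement table: typographic characters mapped to plain
# equivalents that a text-to-speech engine is more likely to read cleanly.
REPLACEMENTS = {
    "\u2014": " - ",   # em dash
    "\u2013": "-",     # en dash
    "\u2026": "...",   # horizontal ellipsis
    "\u2018": "'",     # left single quote
    "\u2019": "'",     # right single quote
    "\u201c": '"',     # left double quote
    "\u201d": '"',     # right double quote
}


def normalize_text(text):
    """Replace typographic characters and collapse runs of whitespace."""
    for original, plain in REPLACEMENTS.items():
        text = text.replace(original, plain)
    # NFKC folds compatibility characters (e.g. ligatures) into plain forms.
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()
```

Calling `normalize_text` on a string containing an ellipsis, a curly apostrophe, and an em dash returns the same sentence with plain ASCII punctuation and single spaces.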
- Python 3.8 or newer
- ElevenLabs API key (generate one from your ElevenLabs account)
- Dependencies:
  - Install `pydub` for audio manipulation.
  - Install `python-dotenv` for environment variable management.
  - Install `unidecode` for text normalization.

Install everything with `pip install -r requirements.txt`. Ensure the following dependencies are listed in your requirements.txt:
- `elevenlabs`
- `pydub`
- `python-dotenv`
- `unidecode`

The CLI command to copy `.env.example` to `.env` is `cp .env.example .env` on Linux or macOS, `copy .env.example .env` in the Windows Command Prompt, or `Copy-Item .env.example .env` in PowerShell.

- Log in to your ElevenLabs account and generate an API key.
- Copy your API key and voice ID to the `.env` file:
  - `ELEVENLABS_API_KEY=your_api_key_here`
  - `ELEVENLABS_VOICE_ID=your_voice_id_here`
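Before running the script, it can help to confirm that both values are actually set. The sketch below is a simplified, standard-library stand-in for what `python-dotenv` does (it handles only plain `KEY=value` lines); `parse_env_text` and `missing_settings` are hypothetical helper names, not part of the project.

```python
REQUIRED = ("ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID")


def parse_env_text(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values):
    """Return the required setting names that are absent or empty."""
    return [name for name in REQUIRED if not values.get(name)]
```

For example, passing the contents of `.env` to `parse_env_text` and the result to `missing_settings` yields an empty list when both the API key and the voice ID are filled in.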
Place your Markdown files in the markdown folder located in the project root. Before running the script, review and edit these files as needed:
- Remove any content you don't want included, such as:
- Code blocks
- Tables of contents
- Superfluous headings or sections
- Ensure the text is structured logically for audio generation.
After editing, the script will process the Markdown files and split them into individual sections for audio conversion.
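As a rough illustration of the splitting step, the sketch below breaks a Markdown document into sections at each heading and keeps the paragraphs and lists under a heading together. It is an assumption about one reasonable approach; the actual logic in `src/split_markdown.py` may differ, and `split_into_sections` is a hypothetical name.

```python
import re

# Matches ATX-style headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def split_into_sections(markdown_text):
    """Return a list of (title, body) tuples, one per heading.

    Text before the first heading is returned with a title of None.
    """
    sections = []
    title, lines = None, []

    def flush():
        # Only keep a section that has a heading or some non-blank text.
        if title is not None or any(line.strip() for line in lines):
            sections.append((title, "\n".join(lines).strip()))

    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    flush()
    return sections
```

Each returned body could then be written to its own file under `output/markdown/` and sent for audio generation separately.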
`python main.py [options]`

| Option | Description |
|---|---|
| `--reset` | Deletes previous output files and starts fresh. |
| `--audio-only` | Skips Markdown processing and generates audio for existing Markdown sections. |
| `--markdown-only` | Processes Markdown files only, without generating audio. |
| `--combine-only` | Combines existing audio files into a single cohesive file. |
| `--voice-id` | Specify a voice ID (overrides .env). |
| `--api-key` | Specify an API key (overrides .env). |
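The options above could be wired up with `argparse` roughly as follows. This is a sketch, not the project's `main.py`: treating the three `--*-only` modes as mutually exclusive is an assumption, and `build_parser` and `resolve_settings` are hypothetical names. It also shows how a command-line value might take precedence over the one from `.env`.

```python
import argparse
import os


def build_parser():
    """Build a parser exposing the documented command-line options."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--reset", action="store_true",
                        help="delete previous output files and start fresh")
    # Assumption: only one of the three modes makes sense per run.
    modes = parser.add_mutually_exclusive_group()
    modes.add_argument("--audio-only", action="store_true")
    modes.add_argument("--markdown-only", action="store_true")
    modes.add_argument("--combine-only", action="store_true")
    parser.add_argument("--voice-id", default=None)
    parser.add_argument("--api-key", default=None)
    return parser


def resolve_settings(args, env=os.environ):
    """Prefer command-line values, falling back to environment variables."""
    return {
        "api_key": args.api_key or env.get("ELEVENLABS_API_KEY"),
        "voice_id": args.voice_id or env.get("ELEVENLABS_VOICE_ID"),
    }
```

With this layout, `--voice-id` given on the command line wins over `ELEVENLABS_VOICE_ID`, while the API key is still read from the environment when `--api-key` is omitted.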
- `python main.py --reset`
- `python main.py --markdown-only`
- `python main.py --audio-only`
- `python main.py --combine-only`

markdown-to-elevenlabs/
├── main.py # Main entry point for the program
├── src/
│ ├── split_markdown.py # Splits Markdown files into sections
│ ├── build_output.py # Handles audio generation and combination
├── markdown/ # Input Markdown files
├── output/
│ ├── markdown/ # Processed Markdown sections
│ ├── audio/ # Generated audio files
├── .env # Environment variables
├── requirements.txt # Python dependencies
├── README.md # Project documentation
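To give a feel for the combination step that turns the files in `output/audio/` into one track with pauses, here is a minimal sketch. It deliberately swaps in the standard-library `wave` module instead of `pydub` (which the project uses), so it only handles uncompressed WAV files with matching formats; `combine_wav_files` and the half-second default pause are assumptions for illustration.

```python
import wave


def combine_wav_files(input_paths, output_path, pause_seconds=0.5):
    """Concatenate WAV files, inserting silence between consecutive files."""
    params = None
    chunks = []
    for path in input_paths:
        with wave.open(str(path), "rb") as reader:
            current = (reader.getnchannels(), reader.getsampwidth(),
                       reader.getframerate())
            if params is None:
                params = current
            elif current != params:
                raise ValueError(f"{path} does not match the first file's format")
            chunks.append(reader.readframes(reader.getnframes()))
    if params is None:
        raise ValueError("no input files given")

    channels, sample_width, frame_rate = params
    # Zero bytes are silence for signed PCM (16-bit or wider); 8-bit WAV
    # is unsigned and would need a different fill value.
    pause_frames = int(frame_rate * pause_seconds)
    silence = b"\x00" * (pause_frames * channels * sample_width)

    with wave.open(str(output_path), "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(sample_width)
        writer.setframerate(frame_rate)
        writer.writeframes(silence.join(chunks))
```

Joining the chunks with a block of silence places a pause only between sections, not before the first or after the last one.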
Contributions are welcome! To get started:
- Fork the repository.
- Create a new branch for your feature or bugfix.
- Submit a pull request with a detailed explanation.
This project is licensed under the MIT License.
- ElevenLabs for their industry-leading Text-to-Speech API.
- pydub for seamless audio processing.
- Open-source contributors for making projects like this possible!