A Windows CLI application that converts PDF documents to Markdown using AI-powered transcription.
pdf2md renders each page of a PDF as an image, sends each image to OpenAI's GPT-5 vision API for transcription, and combines the results into a single Markdown file.
- Windows 10 or later (64-bit)
- Internet connection (for OpenAI API calls)
- Valid OpenAI API key with GPT-5 access
Download pdf2md.exe from the GitHub Releases page. No installation is required; the executable is self-contained.
Place the executable in a directory on your PATH, or run it directly from any location.
pdf2md [options] <input.pdf> [output.md]
| Argument | Description |
|---|---|
| `input.pdf` | Path to the PDF file to convert (required) |
| `output.md` | Path for the output Markdown file (optional, defaults to `<input>.md`) |
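
The default output name can be illustrated with a short Python sketch. It assumes that `<input>.md` means the input path with its `.pdf` extension replaced by `.md`, which is what the `pdf2md report.pdf` example suggests. The function name `default_output_path` is hypothetical and is not part of pdf2md.

```python
from pathlib import PureWindowsPath


def default_output_path(input_pdf: str) -> str:
    """Return the input path with its extension swapped for .md.

    Assumption: pdf2md replaces the extension rather than appending to it.
    """
    return str(PureWindowsPath(input_pdf).with_suffix(".md"))


print(default_output_path("report.pdf"))            # report.md
print(default_output_path(r"docs\large-doc.pdf"))   # docs\large-doc.md
```

`PureWindowsPath` is used so the sketch handles backslash-separated paths the same way on any platform.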
| Option | Default | Description |
|---|---|---|
| `--temp-dir <path>` | `.` (current directory) | Directory for temporary page images |
| `--workers <n>` | `1` | Number of parallel transcription workers |
# Convert report.pdf to report.md
pdf2md report.pdf
# Specify output file location
pdf2md report.pdf notes/output.md
# Transcribe 4 pages in parallel for faster processing
pdf2md --workers 4 large-doc.pdf
# Use a specific directory for temporary files
pdf2md --temp-dir /tmp report.pdf

Set the OPENAI_API_KEY environment variable before running pdf2md:
# Windows Command Prompt
set OPENAI_API_KEY=sk-...
# Windows PowerShell
$env:OPENAI_API_KEY = "sk-..."
# Git Bash / WSL
export OPENAI_API_KEY=sk-...

The API key is not accepted as a command-line argument to avoid accidental exposure in shell history.
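
For readers wrapping pdf2md in their own scripts, the environment-only lookup can be sketched as follows. This is an illustration, not pdf2md's actual code: `load_api_key` is a hypothetical helper, and only the variable name and the error text come from this README.

```python
import os


def load_api_key(env=os.environ) -> str:
    """Read the key from the environment only, never from argv."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Same wording as the error listed in the table of error messages.
        raise SystemExit("OPENAI_API_KEY environment variable not set")
    return key
```

Passing the mapping as a parameter keeps the lookup easy to exercise without touching the real environment.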
- Rendering: Each page of the PDF is rendered to a JPEG image using PDFium at 150 DPI.
- Transcription: Each page image is sent to OpenAI's GPT-5 vision API with a prompt to transcribe the content to Markdown.
- Combination: The Markdown outputs from all pages are joined with horizontal rule separators (`---`) into a single file.
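
The combination step can be sketched in a few lines of Python. The exact separator whitespace pdf2md emits is not documented here, so the blank lines around `---` are an assumption, and `combine_pages` is a hypothetical name.

```python
def combine_pages(pages: list[str]) -> str:
    """Join per-page Markdown with horizontal rules, in page order."""
    # Blank lines around the rule matter: in Markdown, a line of dashes
    # directly under text is read as a heading underline, not a rule.
    return "\n\n---\n\n".join(page.strip() for page in pages) + "\n"


print(combine_pages(["# Page one", "Text of page two."]))
```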
Temporary image files are created during processing and automatically cleaned up when the conversion completes or if an error occurs.
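
One common way to get this clean-up-on-success-or-failure behaviour is a scoped temporary directory. The sketch below shows the pattern under stated assumptions: `with_page_images`, the `pdf2md-` prefix, and the `page-0001.jpg` naming are all hypothetical, and empty files stand in for rendered JPEGs.

```python
import tempfile
from pathlib import Path


def with_page_images(temp_dir: str, page_count: int, work):
    """Create placeholder page images, run `work` on them, then clean up.

    The scratch directory is removed when the block exits, whether
    `work` returns normally or raises.
    """
    with tempfile.TemporaryDirectory(dir=temp_dir, prefix="pdf2md-") as scratch:
        paths = []
        for number in range(1, page_count + 1):
            image = Path(scratch) / f"page-{number:04d}.jpg"
            image.write_bytes(b"")  # placeholder for a rendered page
            paths.append(image)
        return work(paths)
```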
Use the --workers option to transcribe multiple pages simultaneously. This can significantly speed up conversion of large documents. Each worker processes one page at a time, and results are combined in the correct page order regardless of completion order.
Note: Higher worker counts increase API request concurrency. Ensure your OpenAI account rate limits can accommodate the number of workers you specify.
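
The ordering guarantee described above can be sketched with a thread pool. This is not pdf2md's implementation. `transcribe_all` and `fake_transcribe` are hypothetical, and the fake transcriber sleeps longer for earlier pages so that later pages finish first.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def transcribe_all(page_images, transcribe_page, workers=1):
    """Transcribe pages concurrently and return results in page order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, not completion order.
        return list(pool.map(transcribe_page, page_images))


def fake_transcribe(page_number):
    """Stand-in for an API call; earlier pages take longer to finish."""
    time.sleep(0.01 * (5 - page_number))
    return f"page {page_number}"


print(transcribe_all([1, 2, 3, 4], fake_transcribe, workers=4))
# ['page 1', 'page 2', 'page 3', 'page 4']
```

With `workers=4`, up to four requests are in flight at once, which is why the worker count should stay within the account's rate limits.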
| Error | Meaning |
|---|---|
| `File not found: <path>` | The input PDF file does not exist |
| `File does not appear to be a PDF: <path>` | The input file does not have a .pdf extension (warning only) |
| `Cannot write to: <path>` | The output path is not writable |
| `Temp directory not found: <path>` | The specified temp directory does not exist |
| `Cannot write to temp directory: <path>` | The temp directory is not writable |
| `OPENAI_API_KEY environment variable not set` | The API key is missing |
| `OpenAI authentication failed` | The API key is invalid |
| `Rate limited` | Too many API requests; the application will retry automatically |
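
This README does not say how the automatic retry after `Rate limited` is scheduled. A common approach is exponential backoff with jitter, sketched below as an assumption rather than a description of pdf2md. `RateLimited`, `call_with_retry`, and the attempt and delay values are all hypothetical.

```python
import random
import time


class RateLimited(Exception):
    """Stand-in for an HTTP 429 response from the API."""


def call_with_retry(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `request`, backing off exponentially while it is rate limited."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Wait 1x, 2x, 4x, ... the base delay, plus random jitter so
            # parallel workers do not all retry at the same instant.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

The `sleep` parameter exists so the delay can be replaced in tests. In normal use the default `time.sleep` applies.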
This project is released into the public domain under the Unlicense.