PDF to Markdown Converter


This is a Python-based command-line tool for converting PDF documents into well-formatted Markdown files. It uses the PyMuPDF (fitz) library to render each PDF page as an image, runs OCR (Optical Character Recognition) on those images through the Google Gemini or OpenRouter API, and then consolidates the recognized, structured text into a single Markdown file.
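The three stages described above (render pages, OCR each image, merge the text) can be sketched roughly as follows. This is an illustration under assumptions, not the project's source: the function names `render_pages` and `pages_to_markdown`, the `ocr_page` callable, and the 200 DPI setting are hypothetical. Only the PyMuPDF calls (`fitz.open`, `Page.get_pixmap`, `Pixmap.tobytes`) are real library APIs.

```python
from typing import Callable, Iterable, Iterator


def render_pages(pdf_path: str, dpi: int = 200) -> Iterator[bytes]:
    """Yield each page of the PDF as PNG bytes (requires PyMuPDF)."""
    import fitz  # PyMuPDF; imported lazily so the merge step works without it

    with fitz.open(pdf_path) as doc:
        for page in doc:
            yield page.get_pixmap(dpi=dpi).tobytes("png")


def pages_to_markdown(images: Iterable[bytes], ocr_page: Callable[[bytes], str]) -> str:
    """Run OCR on each page image and join the non-empty results with blank lines."""
    parts = [ocr_page(image).strip() for image in images]
    return "\n\n".join(part for part in parts if part) + "\n"
```

With some OCR function `my_ocr` (hypothetical) that sends one image to the API and returns Markdown, the whole conversion would be `pages_to_markdown(render_pages("document.pdf"), my_ocr)`.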

✨ Features

High-precision OCR: Utilizes Gemini models for high-quality text recognition.
Multiple API Support: Configurable to use either Google Gemini or OpenRouter API through environment variables.
Robust Error Handling: Includes retry mechanisms and clear error logging.
Easy to Use: Simple command-line interface with clear options.
Automation-friendly: Easy integration into scripts or automated workflows.
Code Quality: Over 90% unit test coverage.
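The README does not say how the retry mechanism is implemented. A common pattern for unreliable API calls is retrying with exponential backoff, sketched below. The helper name `with_retries`, the logger name, the attempt count, and the delays are all assumptions made for illustration.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("pdf2md.retry")  # hypothetical logger name


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` until it succeeds, waiting base_delay * 2**n between failures."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as exc:  # a real client would catch narrower error types
            if attempt == attempts:
                logger.error("giving up after %d attempts: %s", attempts, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("attempt %d failed (%s); retrying in %.1fs", attempt, exc, delay)
            sleep(delay)
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to unit test, since a test can record the delays instead of actually waiting.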

⚙️ Installation

This project uses Poetry for dependency management.

Clone the repository (if obtained via git):

git clone https://github.com/JokerQianwei/PDF2Markwon-Gemini.git
cd ./PDF2Markwon-Gemini

Install dependencies: Make sure you have Poetry installed, then run:
```
poetry install
```

🔑 API Key Configuration

Before using this tool, you need to configure API keys. The tool searches for environment variables in the following priority order:

OPENROUTER_API_KEY: If this variable is set, OpenRouter API will be used.
GOOGLE_API_KEY: If OpenRouter key is not set, Google Gemini API will be used.
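The priority order above can be expressed as a small lookup. This is a minimal sketch, not the tool's actual code: the function name `select_provider`, the provider labels, and the choice to treat an empty value as unset are assumptions.

```python
import os
from typing import Mapping, Tuple


def select_provider(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (provider, api_key), preferring OpenRouter over Google."""
    candidates = (
        ("openrouter", "OPENROUTER_API_KEY"),  # checked first
        ("google", "GOOGLE_API_KEY"),          # fallback
    )
    for provider, variable in candidates:
        key = env.get(variable, "").strip()
        if key:  # assumption: an empty value counts as "not set"
            return provider, key
    raise RuntimeError("Set OPENROUTER_API_KEY or GOOGLE_API_KEY before running.")
```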

You can set these variables in one of two ways:

1. (Recommended) Using `.env` File

Create a file named `.env` in the project's root directory and define your API keys in it. This is the recommended approach because it keeps the keys out of your shell's global environment.

# .env file content
OPENROUTER_API_KEY="sk-or-v1-..."
# or
# GOOGLE_API_KEY="..."

The tool will automatically load this file on startup.
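The README does not say which loader the tool uses. To show what loading a `.env` file amounts to, here is a standard-library-only sketch. The function name `load_env_file` and the rule that existing environment variables win unless `override=True` are assumptions, not the tool's documented behavior.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env", override: bool = False) -> dict:
    """Parse KEY=VALUE lines from `path` and copy them into os.environ."""
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded  # a missing file is not an error
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("\"'")
        loaded[key] = value
        if override or key not in os.environ:
            os.environ[key] = value
    return loaded
```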

2. Setting Environment Variables

Set your keys as environment variables. For example, add the following lines to your .zshrc or .bashrc file:

# Using OpenRouter 
export OPENROUTER_API_KEY="sk-or-v1-..."

# or using Google
export GOOGLE_API_KEY="..."

If neither key is set, the program will not run.

🚀 Usage

Run the tool through `poetry run`.

Basic Usage

The simplest usage is to provide a path to an input PDF file. The program will automatically generate a .md file with the same name in the same directory.

poetry run pdf2md --input-path /path/to/your/document.pdf

Specifying Output Path

You can use the -o or --output-path option to specify the path for the output Markdown file.

poetry run pdf2md -i /path/to/your/document.pdf -o /path/to/your/output.md
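The two path rules (default to the input name with a `.md` extension, or use the explicit output path) can be sketched with `pathlib`. The function name `resolve_output_path` is hypothetical, and the tool's real implementation may differ.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(input_path: str, output_path: Optional[str] = None) -> Path:
    """Use the explicit output path if given, else swap the input's suffix for .md."""
    if output_path:
        return Path(output_path)
    return Path(input_path).with_suffix(".md")
```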

Specifying OCR Model

You can use the -m or --model option to specify the model for OCR. If not provided, OpenRouter defaults to google/gemini-3.1-flash-lite-preview, while the direct Google Gemini client defaults to gemini-2.5-flash.

# Using another model supported by OpenRouter
poetry run pdf2md -i document.pdf -m "google/gemini-2.5-pro-preview"
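The model fallback described above can be pictured as a per-provider lookup. The default model names are the ones stated in this README. The dictionary, the provider labels, and the function name are hypothetical.

```python
from typing import Optional

# Defaults as stated in this README; the structure itself is an assumption.
DEFAULT_MODELS = {
    "openrouter": "google/gemini-3.1-flash-lite-preview",
    "google": "gemini-2.5-flash",
}


def resolve_model(provider: str, model: Optional[str] = None) -> str:
    """Return the model passed with -m/--model, else the provider's default."""
    return model or DEFAULT_MODELS[provider]
```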

Verbose Logging

If you encounter issues during conversion, you can enable verbose logging mode (--verbose or -v), which will print more detailed debugging information.

poetry run pdf2md -i document.pdf -v
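One plausible way to wire the options shown in this section to a logging level is sketched below with `argparse`. The flag names come from this README. The use of `argparse`, the helper names, and `INFO` as the non-verbose level are assumptions, since the README does not name the CLI framework or the default level.

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the options documented in this README."""
    parser = argparse.ArgumentParser(prog="pdf2md")
    parser.add_argument("-i", "--input-path", required=True)
    parser.add_argument("-o", "--output-path")
    parser.add_argument("-m", "--model")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


def log_level(verbose: bool) -> int:
    """Map the verbose flag to a logging level (DEBUG when verbose)."""
    return logging.DEBUG if verbose else logging.INFO
```

A program would then call `logging.basicConfig(level=log_level(args.verbose))` once, right after parsing the arguments.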

Help Information

View all available commands and options:

poetry run pdf2md --help

Examples

The PDF folder in the project contains sample PDF files that you can use directly to test the conversion functionality. For example, to convert HNeRV.pdf:

poetry run pdf2md -i PDF/HNeRV.pdf

The converted Markdown file will be saved as PDF/HNeRV.md by default.
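Because the tool is a plain command-line program, it can be driven from a script. The sketch below converts every PDF in a folder by calling the command shown above once per file. The helper names are hypothetical, and the returned paths assume the default output location described in this README.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(pdf: Path) -> List[str]:
    """Build the argument list for converting one PDF."""
    return ["poetry", "run", "pdf2md", "-i", str(pdf)]


def convert_folder(folder: str) -> List[Path]:
    """Convert each *.pdf in `folder`; return the expected default output paths."""
    converted = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        subprocess.run(build_command(pdf), check=True)  # raises if pdf2md fails
        converted.append(pdf.with_suffix(".md"))
    return converted
```

For example, `convert_folder("PDF")` would process the sample files one at a time and stop at the first failed conversion.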

