KindleClippingsToJEX is a robust, professional-grade tool designed to process your Kindle highlights (My Clippings.txt) and convert them into a Joplin Export File (.jex). It preserves all critical metadata (author, book, date, page/location) and intelligently organizes them into a clean notebook structure ready for direct import into Joplin.
Whether you are a casual reader or a power user, this tool ensures your Kindle notes are never lost and always accessible in your favorite note-taking app.
- Features
- Download
- Installation
- Configuration
- Usage (GUI & CLI)
- Project Structure
- Future Improvements
- Troubleshooting
- Acknowledgements
- Contributing
- License
- JEX Native Export: Generates standard
.jexfiles (tarball of markdown files + JSON metadata) that import flawlessly into Joplin, preserving creation dates and official titles. - CSV Export: Optionally export your curated highlights to a standardized CSV format, perfect for importing into Excel, Notion, or other database tools. Uses UTF-8-SIG encoding for maximum compatibility with Microsoft Excel.
- Markdown Export (Obsidian): Generates a .zip archive containing individual
.mdfiles for each note. Each file includes YAML frontmatter (title, author, tags, date), making it fully compatible with Obsidian, Logseq, and other PKM tools. - JSON Export: Export raw data in JSON format for developers, backups, or custom processing scripts.
- Enhanced Metadata Extraction: Intelligently extracts author names, book titles, locations, and page numbers. It even handles page numbers with zero-padding (e.g.,
[0042]) to ensure proper lexical sorting.- Geo-tagging Support: Optionally add location data (lat/long/altitude) to your imported notes via
config.json. Joplin uses this to display your notes on a map (via OpenStreetMap).
- Geo-tagging Support: Optionally add location data (lat/long/altitude) to your imported notes via
- Smart Synchronization System: Say goodbye to duplicates!
- Deterministic IDs: Generates stable, content-based IDs (SHA-256) for every note, notebook, and tag. This means you can re-export your Kindle file 100 times, and Joplin will intelligently update your existing notes instead of creating messy duplicates.
- Smart Tagging: Converts your Kindle notes into Joplin tags. Supports splitting multiple tags by comma, semicolon, or period (e.g., "productivity, psychology").
- Smart Deduplication: Intelligent algorithm that detects and merges:
- Overlapping highlights: Keeps the longest/most complete version of a correction.
- Edited Notes: Keeps only the latest version of a note at a specific location.
- Accidental Highlights: Flags fragments (< 75 chars) if they start with lowercase or lack punctuation.
- Smart Association: Advanced logic to link notes to highlights even when Kindle places the note at the end of a long passage. Instead of exact matching, it uses sophisticated range coverage (Highlight Start β€ Note Location β€ Highlight End) to ensure your comments always find their parent text.
- Advanced Text Hygiene: Automatically polishes your highlights to professional standards:
- Title Polishing: Automatically cleans up book titles by removing clutter like " (Kindle Edition)", "(Spanish Edition)", "[eBook]", etc.
- Unicode (NFC) Normalization: Ensures cross-platform compatibility (Windows/Mac/Linux) for special characters, preventing issues with Obsidian/Joplin search and linking.
- Typesetting Cleanup: Fixes common Kindle issues like double spaces, spaces before punctuation (e.g.,
Hello , world), and capitalization errors. - De-hyphenation: Fixes broken words from PDF line breaks (e.g.,
word-\n suffixβwordsuffix). - Invisible Character Removal: Strips byte-order marks (BOM) and zero-width spaces that often corrupt text searches.
- Robustness & Transparency:
- Detailed Error Reporting: Intelligently skips corrupted blocks and provides a visual report with snippets of exactly what was skipped, ensuring data integrity without silent failures.
- Comprehensive Logging: Maintains a full audit trail (info, debug, errors) of the process in
app.logfor troubleshooting. - Strict Typing: Built with Python Dataclasses to prevent data corruption.
- Multi-language Support: Automatic language detection from the clippings file, with manual override support. Supported languages: English, Spanish, French, German, Italian, and Portuguese.
The project features a completely redesigned, modern "Zen" interface focused on simplicity, focus, and efficiency.
- Instant Auto-Load: Automatically detects and loads your configured clippings file on startup. If not found, a friendly "Empty State" guides you.
- Kindle Auto-Detection: Scans connected USB drives (Windows, macOS, Linux) for a Kindle device and offers to import directly.
- Drag & Drop: Simply drag your
My Clippings.txtfile onto the window to load it instantly. - Smart Cleanup: A "β»οΈ Clean" button appears automatically if duplicates or redundant highlights are detected. One click cleans up your file.
- π Light / Dark Mode: Toggle between light and dark themes with one click. Your preference is saved in
config.json. - Live Stats Dashboard: The header updates in real-time to show statistics like Highlights, Books, Authors, Tags, Avg/Book, and Days Span.
- Clean Data Table: A clutter-free table view focusing on what matters:
- Date: Sortable by timestamp.
- Book: Filterable title.
- Author: Filterable author name.
- Content: Smart preview that expands on double-click.
- Tags: Editable tags column (comma-separated).
- Smart Search: Real-time filtering by text. Type "Harry" and instantly see only related notes. The export function respects this filter ("What You See Is What You Get").
- Bidirectional Editing:
- Edit text directly in the table cells.
- OR use the spacious Bottom Editor Pane for long texts.
- Changes are synced instantly between views.
- Power Selection & Context Menu:
- Multi-select: Use
Shift+ClickorCtrl+Clickto select multiple rows. - Right-Click Menu:
- Export Selected: Create a mini .jex file containing ONLY the selected notes.
- Duplicate: Clone a row (useful for splitting one long highlight into two distinct notes).
- Delete: Remove unwanted highlights before exporting.
- Multi-select: Use
For power users and automation scripts, the CLI offers a headless experience.
- Batch Processing: Process huge files in seconds without opening the window.
- Configurable: Fully controlled via arguments or
config.json.
Don't want to mess with Python or code?
Simply download the latest executable for your system from the Releases Page.
- Windows:
KindleToJEX-Windows.exe - Mac:
KindleToJEX-MacOS - Linux:
KindleToJEX-Linux.bin
-
Clone the repository:
git clone https://github.com/KindleTools/KindleClippingsToJEX.git cd KindleClippingsToJEX -
Create a virtual environment (Recommended):
python -m venv .venv # Windows .\.venv\Scripts\activate # Linux/Mac source .venv/bin/activate
-
Install dependencies:
# For users (minimal install): pip install -e . # For contributors (includes linting, testing, and build tools): pip install -e .[dev]
-
Build the Executable (Optional): Simply double-click
build_exe.baton Windows to generate a standalone.exein thedist/folder.Note for Mac/Linux Users: To build efficiently for all platforms without needing multiple computers, this project includes a GitHub Actions workflow. Simply create a "Release" in your GitHub repository, and it will automatically build and attach
.exe(Windows),.bin(Linux), and Mac executables.
The application uses a config/config.json file for default settings. You can copy config/config.sample.json to get started.
{
"creator": "Your Name",
"notebook_title": "Kindle Imports",
"input_file": "path/to/My Clippings.txt",
"output_file": "import_clippings",
"language": "auto",
"theme": "light",
"location": [0.0, 0.0, 0]
}| Key | Description |
|---|---|
creator |
Your name (used as author metadata in exported notes). |
notebook_title |
Root notebook name in Joplin for the import. |
input_file |
Default path to your My Clippings.txt file. |
output_file |
Base name for the exported file (extension added automatically). |
language |
Parsing language: auto (recommended), en, es, fr, de, it, or pt. |
theme |
GUI theme: light or dark. |
location |
Geo-tagging as [latitude, longitude, altitude]. Joplin displays this on a map via OpenStreetMap. Set to [0, 0, 0] to disable. |
Run the main application entry point:
python main.pyWorkflow:
- Load: The app opens your default clippings file. If missing, click "Open File".
- Curate:
- Use the Search Bar to filter by book title or content.
- Use Ctrl+Click to select specific notes.
- Right-click and select "Delete Row(s)" to remove irrelevant highlights.
- Edit the Content or Tags columns to fix typos or add context.
- Export:
- Export Visible: Click the main "Export Notes" button. You can choose between JEX (for Joplin), CSV (for Excel), Markdown ZIP (for Obsidian), or JSON.
- Export Selection: Select specific rows, Right-Click > "Export Selected" to save just those notes (JEX, CSV, ZIP, or JSON).
- Import:
- Joplin: Go to File > Import > JEX - Joplin Export File.
- Obsidian: Unzip the downloaded file and drag the folder into your Obsidian Vault.
- Excel: Open the generated CSV file directly.
Run the CLI script:
python cli.pyOptions:
--input,-i: Path to source file (default: from config).--output,-o: Output filename (default: from config).--lang,-l: Force language parsing (e.g.,en).--notebook,-n: Root notebook title for the export (default: "Kindle Imports").--creator,-c: Author name metadata for the notes (default: "System").--format,-f: Output format:jex,csv,md, orjson.- Note: The CLI automatically applies Smart Deduplication unless
--no-cleanis used. --no-clean: Disable the smart deduplication and accidental highlight cleaning.
Example:
python cli.py --input "data/old_clippings.txt" --no-cleanThe project follows a modular hexagon-like architecture to separate UI, Business Logic, and Data Parsing. See ARCHITECTURE.md for design decisions and rationale.
βββ .github/ # CI/CD Workflows (GitHub Actions)
βββ config/ # Configuration Settings
βββ docs/ # Architecture & Design Documentation
βββ domain/ # Data Models (DDD Dataclasses)
βββ exporters/ # Export Strategies (JEX, CSV, MD, JSON)
βββ parsers/ # Kindle Parsing Logic (Multi-language)
βββ resources/ # Static Assets (Icons, Styles, Language Patterns)
βββ services/ # Business Logic & Orchestration
βββ tests/ # Unit & Integration Tests
βββ ui/ # GUI Layer (PyQt5)
βββ utils/ # Shared Utilities & Logging
βββ build_exe.bat # Windows Build Script
βββ cli.py # CLI Entry Point
βββ main.py # GUI Entry Point
βββ pyproject.toml # Project Definition
To set up the development environment:
# Install all dependencies (app + dev tools: ruff, pytest, mypy, coverage, pyinstaller)
pip install -e .[dev]Run quality checks:
# Type Checking
mypy .
# Linting & Formatting
ruff check .
ruff format .Run the test suite:
# Run all tests
pytest
# With coverage report
coverage run -m pytest
coverage report -mThe following are the next planned features. See the full Roadmap for the complete multi-phase development plan.
- π Direct Joplin Sync: Send notes directly to the running Joplin app via its local API (Port 41184), skipping the file import step.
- ποΈ SQLite Backend: Persistent database for highlights with edit history and undo/redo.
- π Reading Timeline: A visual heatmap or bar chart to visualize your reading habits over time.
- π Book Cover Enrichment: Automatic cover art via Open Library API for a visual "bookshelf" experience.
- π§ AI Auto-Tagging: Zero-Shot classification for automatic highlight categorization.
The app doesn't detect my Kindle automatically.
Ensure your Kindle is connected via USB and recognized as a drive by your computer. The app looks for
documents/My Clippings.txtin standard drive letters.
My highlights have the wrong encoding (weird characters).
The app tries to auto-detect UTF-8 or UTF-8-SIG. If you see weird characters, open
config/config.jsonand try changing your system language settings or saving your clippings file as UTF-8.
I see "No module named ui" errors when running from source.
Make sure you are running the script from the root directory:
python main.py, not from inside a subdirectory.
Special thanks to the open-source libraries that make this possible:
- PyQt5 for the beautiful GUI framework.
- dateparser for the magical multi-language date parsing.
- Pillow for image processing (icon conversion for cross-platform builds).
- Joplin for providing an excellent note-taking ecosystem.
Contributions are welcome! Please check CONTRIBUTING.md for guidelines on code style (PEP 8) and pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.


