This project provides a Python-based parser for Kindle highlights and notes. It processes the "My Clippings.txt" file exported from Kindle devices and organizes the highlights and notes into a structured JSON format.
- Parses Kindle highlights and notes from "My Clippings.txt"
- Organizes data by book, including highlights and notes
- Matches notes to their corresponding highlights when possible
- Tracks processing progress to avoid duplicate entries
- Provides error handling and reporting for problematic entries
- Python 3.7+
- dataclasses-json library
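To make the parsing and per-book grouping described above more concrete, here is a minimal sketch using only the standard library. It is not the code in `kindle_parser/parser.py`: the names `Clipping`, `parse_clippings`, and `group_by_book`, the output layout, and the sample text are all assumptions made for illustration. It relies on the usual "My Clippings.txt" layout, where each entry is a title line, a metadata line, a blank line, and the text, followed by a `==========` separator.

```python
import json
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

SEPARATOR = "=========="  # Kindle writes this line after every clipping

# Matches metadata such as
# "- Your Highlight on page 5 | Location 100-102 | Added on ..."
META_RE = re.compile(
    r"Your (?P<kind>Highlight|Note|Bookmark)\b.*?"
    r"[Ll]ocation (?P<start>\d+)(?:-(?P<end>\d+))?"
)


@dataclass
class Clipping:  # hypothetical record type, not the project's own
    book: str
    kind: str
    start: int
    end: int
    text: str


def parse_clippings(raw: str) -> List[Clipping]:
    """Split the raw file on the separator and parse each entry."""
    clippings = []
    for block in raw.split(SEPARATOR):
        lines = [ln.strip() for ln in block.strip().splitlines()]
        if len(lines) < 2:
            continue  # empty trailing block or truncated entry
        meta = META_RE.search(lines[1])
        if meta is None:
            continue  # unrecognized metadata line; a real parser would report it
        start = int(meta.group("start"))
        end = int(meta.group("end") or start)  # notes carry a single location
        text = "\n".join(ln for ln in lines[2:] if ln)
        title = lines[0].lstrip("\ufeff")  # drop a byte-order mark if present
        clippings.append(
            Clipping(title, meta.group("kind").lower(), start, end, text)
        )
    return clippings


def group_by_book(clippings: List[Clipping]) -> Dict[str, dict]:
    """Collect highlights and notes under their book title (bookmarks are skipped)."""
    books = defaultdict(lambda: {"highlights": [], "notes": []})
    for c in clippings:
        if c.kind in ("highlight", "note"):
            books[c.book][c.kind + "s"].append(
                {"location": [c.start, c.end], "text": c.text}
            )
    return dict(books)


# Made-up sample input, for illustration only
SAMPLE = (
    "Dune (Frank Herbert)\n"
    "- Your Highlight on page 12 | Location 180-181 | Added on Monday, 1 January 2024 10:00:00\n"
    "\n"
    "Fear is the mind-killer.\n"
    "==========\n"
    "Dune (Frank Herbert)\n"
    "- Your Note on page 12 | Location 181 | Added on Monday, 1 January 2024 10:01:00\n"
    "\n"
    "Litany against fear\n"
    "==========\n"
)

print(json.dumps(group_by_book(parse_clippings(SAMPLE)), indent=2))
```

Entries whose metadata line does not match the pattern are silently skipped here; the project's own error reporting may behave differently.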
- Clone the repository:
      git clone https://github.com/yourusername/kindle-parser.git
      cd kindle-parser
- Create and activate a virtual environment:
      python -m venv venv
      source venv/bin/activate  # On Windows, use venv\Scripts\activate
- Install the required dependencies:
      pip install -r requirements.txt
- Place your "My Clippings.txt" file in the project root directory.
- Run the parser:
      python -m kindle_parser.parser
- The parsed data will be saved in output.json, and processing information will be stored in processed_entries.json.
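As a rough illustration of how tracking processed entries can prevent duplicates across runs, the sketch below fingerprints each raw entry and remembers the fingerprints in a JSON file. The on-disk format (a plain JSON list of SHA-256 hex digests) and the helper names are assumptions; the actual structure of processed_entries.json written by the parser may differ.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterable, List, Set


def entry_fingerprint(entry: str) -> str:
    """Stable identifier for one raw clipping (assumed scheme: SHA-256 of the text)."""
    return hashlib.sha256(entry.strip().encode("utf-8")).hexdigest()


def load_processed(path: Path) -> Set[str]:
    """Read previously seen fingerprints; an absent file means nothing was processed."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_processed(path: Path, seen: Set[str]) -> None:
    """Persist the fingerprints as a sorted JSON list."""
    path.write_text(json.dumps(sorted(seen), indent=2), encoding="utf-8")


def select_new_entries(entries: Iterable[str], seen: Set[str]) -> List[str]:
    """Return entries not seen before, recording their fingerprints in `seen`."""
    fresh = []
    for entry in entries:
        fingerprint = entry_fingerprint(entry)
        if fingerprint not in seen:
            seen.add(fingerprint)
            fresh.append(entry)
    return fresh
```

A run would then call `load_processed`, pass the raw entries through `select_new_entries`, parse only the returned ones, and finish with `save_processed`. Hashing the whole entry means an edited note counts as a new entry, which may or may not be the behavior you want.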
- kindle_parser/parser.py: Contains the main parsing logic
- tests/test_parser.py: Contains unit tests for the parser
- requirements.txt: Lists the project dependencies
- output.json: The parsed highlights and notes in JSON format
- processed_entries.json: Tracks which entries have been processed
Contributions are welcome! Please feel free to submit a Pull Request.
- Verify that notes are getting attached to the correct highlights
- Add chapter indexing to notes and highlights
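One possible starting point for the first item is a small test around the matching rule itself. The sketch below assumes a note belongs to the highlight whose location range contains the note's location, preferring the narrowest range when several overlap. Both that rule and the `match_note` helper are hypothetical and may not reflect how the parser currently pairs notes with highlights.

```python
from typing import List, Optional, Tuple

# (start_location, end_location, text); a simplified stand-in for a parsed highlight
Highlight = Tuple[int, int, str]


def match_note(note_location: int, highlights: List[Highlight]) -> Optional[Highlight]:
    """Return the narrowest highlight whose range contains note_location, if any."""
    candidates = [h for h in highlights if h[0] <= note_location <= h[1]]
    if not candidates:
        return None  # orphan note: no highlight covers this location
    return min(candidates, key=lambda h: h[1] - h[0])


def test_match_note() -> None:
    highlights = [(100, 110, "outer passage"), (105, 106, "inner passage")]
    # A note inside both ranges attaches to the tighter highlight
    assert match_note(106, highlights) == (105, 106, "inner passage")
    # A note covered only by the wider range attaches to it
    assert match_note(101, highlights) == (100, 110, "outer passage")
    # A note outside every range stays unmatched
    assert match_note(200, highlights) is None


test_match_note()
```

Written in this style, the check could sit alongside the existing tests in tests/test_parser.py once it is adapted to the parser's real data types.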
This project is licensed under the MIT License - see the LICENSE file for details.