Home
Welcome to the sakura-sumi wiki!
What is this?
Sakura Sumi converts your code files into compressed PDFs that you can upload to AI models like Google Gemini. This lets you analyze entire codebases that would normally be too large to fit in a single prompt.
3-Step Setup:
1. Install Python (if you don't have it):
   - Download from python.org
   - Make sure to check "Add Python to PATH" during installation
2. Get the code:
git clone https://github.com/yourusername/ocr-compression.git
cd ocr-compression
3. Set up and run:
Create a virtual environment (keeps dependencies organized):
python3 -m venv venv # If python3 is not found (common on Windows), use: python -m venv venv
Activate it:
source venv/bin/activate # On macOS/Linux
# OR: venv\Scripts\activate # On Windows
Install required packages:
pip install -r requirements.txt
Compress your codebase (replace with your actual path):
python scripts/compress.py "/path/to/your/codebase" -v
That's it! Your PDFs will be in {your_codebase}_ocr_ready/
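If you would rather drive the compression step from Python, the command above can be wrapped in a small script. This is a minimal sketch, not part of Sakura Sumi itself: `build_compress_command` and `list_pdfs` are hypothetical helper names, and the only CLI usage assumed is the `scripts/compress.py PATH -v` call shown above. Where exactly the `_ocr_ready` directory is created is also an assumption, so pass whatever path the tool reports.

```python
import sys
from pathlib import Path


def build_compress_command(codebase: str, verbose: bool = True) -> list[str]:
    """Build the documented call: python scripts/compress.py PATH [-v].

    sys.executable is used so the command runs with the interpreter of the
    currently active virtual environment.
    """
    cmd = [sys.executable, "scripts/compress.py", codebase]
    if verbose:
        cmd.append("-v")
    return cmd


def list_pdfs(output_dir: Path) -> list[Path]:
    """Return every PDF under output_dir, sorted by path.

    Returns an empty list if the directory does not exist, which usually
    means the compression step has not run yet or wrote somewhere else.
    """
    if not output_dir.is_dir():
        return []
    return sorted(output_dir.rglob("*.pdf"))


# Example usage (run from the repository root, with the venv active):
#
#   import subprocess
#   subprocess.run(build_compress_command("/path/to/your/codebase"), check=True)
#   # ASSUMPTION: adjust this to the actual {your_codebase}_ocr_ready/ location.
#   for pdf in list_pdfs(Path("/path/to/your/codebase_ocr_ready")):
#       print(pdf, pdf.stat().st_size, "bytes")
```

Listing the PDFs with their sizes is a quick way to confirm the run produced output before you upload anything.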
The web portal provides a user-friendly interface - perfect if you're not comfortable with command-line tools.
Start the web server:
python scripts/run_web.py
Then open http://localhost:5001 in your browser.
Features:
- Beautiful sakura (cherry blossom) themed interface
- Point-and-click file selection
- Real-time progress tracking
- Token estimation before compression
- Job history and results management
- No command-line knowledge required
- For Advanced Users: The web portal exposes all CLI features through a GUI, including parallel processing, resume capability, and OCR compression modes.
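If the browser cannot reach the web portal, it helps to check whether anything is listening on the port at all. The following is a small sketch using only the Python standard library. `port_is_open` is a hypothetical helper name, and the host and port are taken from the http://localhost:5001 address above. It only tests that a TCP connection succeeds, not that the portal itself is healthy.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 5001, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Example usage, after starting the server in another terminal:
#
#   if port_is_open():
#       print("Web portal is reachable at http://localhost:5001")
#   else:
#       print("Nothing is listening on port 5001; is run_web.py still running?")
```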