This project automates the process of extracting, solving, and explaining LeetCode-style coding problems from desktop screenshots using OCR and OpenAI GPT models.
- Detects open desktop windows and takes a screenshot of a selected one
- Stores screenshots in a dedicated `screen_shots/` folder
- Uses Tesseract OCR to extract text from the screenshots (see the sketch after this list)
- Saves the parsed text into the `parsed_texts/` folder
- Sends the extracted text to OpenAI GPT-4o via the API to:
  - Identify any LeetCode-style problem
  - Explain the problem in natural language
  - Provide a step-by-step solution with code
- Saves the GPT-generated solution in Markdown format under `explained_solns/`
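The capture-and-OCR step boils down to something like the sketch below. The window-handling libraries (`pygetwindow`, `pyautogui`) and the file names are assumptions for illustration; only the folder names come from the project layout:

```python
import os

import pygetwindow as gw   # assumed choice for listing/selecting desktop windows
import pyautogui           # assumed choice for taking the screenshot
import pytesseract
from PIL import Image

os.makedirs("screen_shots", exist_ok=True)
os.makedirs("parsed_texts", exist_ok=True)

# List open windows and pick one (the selection logic here is hypothetical).
titles = [t for t in gw.getAllTitles() if t.strip()]
window = gw.getWindowsWithTitle(titles[0])[0]

# Capture just that window's region and save it under screen_shots/.
shot = pyautogui.screenshot(
    region=(window.left, window.top, window.width, window.height)
)
shot_path = os.path.join("screen_shots", "capture.png")
shot.save(shot_path)

# Run Tesseract OCR on the screenshot and persist the raw text under parsed_texts/.
text = pytesseract.image_to_string(Image.open(shot_path))
with open(os.path.join("parsed_texts", "capture.txt"), "w", encoding="utf-8") as f:
    f.write(text)
```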
Imagine you have a coding challenge open on your screen, and you want:
- To extract the question text from a screenshot
- To get an immediate explanation and solution from GPT
This project automates that entire flow.
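Concretely, the GPT step looks roughly like the sketch below, using the official `openai` Python client; the prompt wording and file names are illustrative rather than the notebook's exact code:

```python
import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()        # reads OPENAI_API_KEY from the .env file
client = OpenAI()    # the client picks the key up from the environment

# Read the OCR output produced in the previous step.
with open("parsed_texts/capture.txt", encoding="utf-8") as f:
    problem_text = f.read()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "Identify any LeetCode-style problem in the text, explain it "
                "in natural language, and give a step-by-step solution with code."
            ),
        },
        {"role": "user", "content": problem_text},
    ],
)

# Save the explanation as Markdown under explained_solns/.
os.makedirs("explained_solns", exist_ok=True)
with open("explained_solns/capture.md", "w", encoding="utf-8") as f:
    f.write(response.choices[0].message.content)
```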
Project layout:

```
.
├── screen_shots/       # Screenshots of captured windows
├── parsed_texts/       # Raw OCR text extracted from screenshots
├── explained_solns/    # Markdown solutions from OpenAI GPT
├── .env                # Your OpenAI API key (OPENAI_API_KEY="sk-...")
├── notebook.ipynb      # The Jupyter Notebook to run everything
├── requirements.txt    # Dependency list
└── README.md           # You're here :)
```
Make sure the following tools and Python packages are installed:
- Tesseract OCR
- Python 3.8+
- Install the Python dependencies:

```bash
pip install -r requirements.txt
```
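If the Tesseract binary is not on your `PATH` (common on Windows), you may also need to tell `pytesseract` where it lives; the path below is only an example, not the project's configuration:

```python
import pytesseract

# Example Windows install location; adjust to your own Tesseract path.
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Quick sanity check that pytesseract can reach the binary.
print(pytesseract.get_tesseract_version())
```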