Prototype that counts LEGO parts from instruction PDFs by reading the callout boxes for each step. The thresholds are hardcoded for the exact instructions I wanted to parse. This project is not meant to be a useful piece of software; it is just intended to show how far LLM-assisted development can get you in less than a day of work. Personally, I found it quite impressive. It also goes to show how far traditional computer vision techniques can still get you when applied to a very specific problem, with no access to large datasets or compute.
- Python 3.12+
- uv (recommended)
- OCR uses EasyOCR (downloads models on first run)
Install deps:

    uv sync

Process a page range (1-based, inclusive):

    uv run python ./main.py --start 1 --end 20 --out out ./merged.pdf

Outputs:

- `out/results.json` (detections, clusters, counts)
- `out/debug/` (rendered pages, callouts, part crops)
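As a rough illustration of how per-step callout quantities could roll up into the final counts, here is a minimal sketch. The record shape (`cluster_id` and `qty` keys) and the function name `total_counts` are assumptions made for this example; the actual layout of `results.json` may differ.

```python
from collections import Counter
from typing import Iterable, Mapping


def total_counts(detections: Iterable[Mapping]) -> dict:
    """Sum the OCR'd quantity of every detection per cluster.

    Each detection is assumed (hypothetically) to carry the id of the
    cluster its part crop was assigned to and the quantity read from
    the callout, e.g. 2 for a "2x" label.
    """
    counts: Counter = Counter()
    for det in detections:
        counts[det["cluster_id"]] += int(det["qty"])
    return dict(counts)


# Hypothetical detections from three callouts across two steps.
example = [
    {"cluster_id": "c0", "qty": 2},
    {"cluster_id": "c1", "qty": 1},
    {"cluster_id": "c0", "qty": 3},
]
totals = total_counts(example)
```

Keeping the aggregation separate from detection and clustering means the counts can be recomputed cheaply after clusters are merged or corrected by hand.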
flowchart LR
A([Input PDF]) --> B[Render pages]
B --> C{Try rotations}
C --> D[Detect callouts]
subgraph Extract[Extract]
D --> E[Extract parts]
E --> F[OCR qty]
E --> G[Normalize parts]
end
subgraph Cluster[Cluster]
G --> H[Compute features]
H --> I["Cluster (composite distance)"]
end
subgraph Output[Output]
I --> J([Clusters + results.json])
end
classDef io fill:#0b2239,stroke:#0b2239,color:#ffffff;
classDef process fill:#e7eef5,stroke:#6b7a89,color:#16212b;
classDef decision fill:#f6e6b6,stroke:#9d7b20,color:#4a3600;
classDef group fill:#ffffff,stroke:#c3ccd6,color:#16212b;
class A,J io;
class B,D,E,F,G,H,I process;
class C decision;
class Extract,Cluster,Output group;
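The "Try rotations" step can be sketched roughly as follows: run the quantity OCR at each candidate rotation and keep the angle whose output looks most like callout quantities. This is only an illustration under assumptions. The `read_at` callback, the `2x`-style quantity pattern, and the confidence-sum score are hypothetical and not taken from the project code.

```python
import re
from typing import Callable, Iterable, Sequence, Tuple

# Assumed quantity label format: one to three digits followed by "x", e.g. "2x".
QTY_PATTERN = re.compile(r"^\d{1,3}x$", re.IGNORECASE)

OcrResult = Tuple[str, float]  # (recognized text, confidence in [0, 1])


def readability_score(results: Iterable[OcrResult]) -> float:
    """Sum the confidence of every OCR result that parses as a quantity."""
    return sum(conf for text, conf in results if QTY_PATTERN.match(text.strip()))


def pick_rotation(
    read_at: Callable[[int], Iterable[OcrResult]],
    rotations: Sequence[int] = (0, 90, 180, 270),
) -> int:
    """Return the rotation (in degrees) with the highest readability score.

    `read_at(angle)` is a hypothetical callback that rotates the page,
    runs OCR on the detected callouts, and returns the raw results.
    Ties go to the earliest angle in `rotations`.
    """
    return max(rotations, key=lambda angle: readability_score(read_at(angle)))
```

Scoring on parsed quantities rather than on raw OCR confidence avoids favoring an orientation where the OCR confidently reads noise.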
Dashboard:

    uv run streamlit run streamlit_app.py -- --out out

Layout:

- `instruction_matcher/cli.py`: entrypoint
- `instruction_matcher/pipeline.py`: main pipeline
- `instruction_matcher/callout.py`: callout detection + part extraction
- `instruction_matcher/ocr.py`: fast qty OCR (template-based)
- `instruction_matcher/clustering.py`: global clustering
- `streamlit_app.py`: dashboard
- Orientation is chosen by scoring readability of callout quantities.
- Clustering uses color hist + ORB + pHash on 512×512 normalized crops.
- Part crops saved in `out/debug` are already normalized.
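To make the composite distance from the clustering note more concrete, here is a minimal pure-Python sketch that combines the three signals into one number in [0, 1]. The weights, the Hamming threshold for ORB matches, and the `PartFeatures` container are assumptions chosen for illustration; the project's actual feature extraction and weighting may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PartFeatures:
    """Hypothetical per-crop features, precomputed elsewhere."""
    color_hist: Sequence[float]  # normalized color histogram, sums to 1.0
    phash: int                   # 64-bit perceptual hash
    orb: Sequence[int]           # ORB descriptors, each packed into a 256-bit int


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integers."""
    return bin(a ^ b).count("1")


def hist_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """1 minus histogram intersection: 0 for identical, 1 for disjoint."""
    return 1.0 - sum(min(x, y) for x, y in zip(p, q))


def _unmatched(a: Sequence[int], b: Sequence[int], max_bits: int) -> float:
    """Fraction of descriptors in `a` with no close match in `b`."""
    hits = sum(1 for d in a if min(hamming(d, e) for e in b) <= max_bits)
    return 1.0 - hits / len(a)


def orb_distance(a: Sequence[int], b: Sequence[int], max_bits: int = 64) -> float:
    """Symmetric unmatched fraction; 1.0 when either side has no descriptors."""
    if not a or not b:
        return 1.0
    return 0.5 * (_unmatched(a, b, max_bits) + _unmatched(b, a, max_bits))


def composite_distance(a: PartFeatures, b: PartFeatures,
                       weights=(0.4, 0.4, 0.2)) -> float:
    """Weighted sum of histogram, ORB, and pHash distances (assumed weights)."""
    w_hist, w_orb, w_hash = weights
    return (w_hist * hist_distance(a.color_hist, b.color_hist)
            + w_orb * orb_distance(a.orb, b.orb)
            + w_hash * hamming(a.phash, b.phash) / 64)
```

Normalizing each term to [0, 1] before weighting keeps any single signal from dominating, so a single clustering threshold can be tuned on the combined value.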
cProfile:

    uv run python -m cProfile -o out/profile.prof ./main.py --start 1 --end 20 --out out ./merged.pdf

Memory profile (py-spy):

    py-spy record --mem --format flamegraph -o out/profile_mem.svg -- uv run python ./main.py --start 1 --end 20 --out out ./merged.pdf