Quickly compare document layouts to filter, categorize, or flag files before expensive text extraction—perfect for structured reports

samsaara/layout-sim

Layout Similarity Detection

Setup

  • Install pixi
  • Clone this repo
  • For Linux: pixi install -e linux
  • For Mac: pixi install -e mac

Prerequisites on Mac

To install the GPU-accelerated libraries on a Mac with an Apple Silicon processor, you need the following:

  • gcc and zlib installed (via Xcode, the developer tools, Homebrew, etc.)
    • If gcc is not found, you may be prompted to install the developer tools or something similar.
    • If you installed zlib via Homebrew, make sure the library is on your path. If it is not, run ZLIB_ROOT=$(brew --prefix zlib) pixi install -e mac

(I had limited access to a Mac mini M1 and successfully tested the install, via a notebook, after following the steps above.)
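
The prerequisite checks above can be sketched as a small Python helper. This is only an illustration and not part of the repository: `find_zlib_root` and `check_prerequisites` are hypothetical names, and the only external commands assumed are `gcc` and `brew --prefix zlib`, as mentioned in the list.

```python
import os
import shutil
import subprocess


def find_zlib_root():
    """Return the path printed by `brew --prefix zlib`.

    Returns None if brew is not on PATH or the command fails.
    """
    if shutil.which("brew") is None:
        return None
    result = subprocess.run(
        ["brew", "--prefix", "zlib"], capture_output=True, text=True
    )
    return result.stdout.strip() if result.returncode == 0 else None


def check_prerequisites(zlib_root=None):
    """Report whether gcc is on PATH and build an environment with ZLIB_ROOT set.

    `zlib_root` can be passed explicitly; otherwise Homebrew is queried.
    """
    if zlib_root is None:
        zlib_root = find_zlib_root()
    env = dict(os.environ)
    if zlib_root:
        env["ZLIB_ROOT"] = zlib_root
    return {
        "gcc_found": shutil.which("gcc") is not None,
        "zlib_root": zlib_root,
        "env": env,
    }
```

The returned `env` could then be passed to `subprocess.run(["pixi", "install", "-e", "mac"], env=report["env"])`, which mirrors the `ZLIB_ROOT=... pixi install -e mac` command above.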

Run

The code for each task can be run in two ways: (a) from the CLI, or (b) through a Gradio app.

For Gradio:

PYTHONPATH='.' pixi run -e [linux|mac] gradio app.py

To run from the CLI, show the help text, or tweak parameters:

PYTHONPATH='.' pixi run -e [linux|mac] python src/main.py [--help]
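
If you would rather not pick between `linux` and `mac` by hand, a small launcher can pick the environment from the current OS. The sketch below is an assumption-laden illustration, not code from this repo: `pixi_env_name` and `build_command` are hypothetical helpers that only assemble the two commands shown above.

```python
import os
import platform
from typing import Optional


def pixi_env_name(system: Optional[str] = None) -> str:
    """Map an OS name to the pixi environment name used in this README."""
    system = system or platform.system()
    if system == "Darwin":
        return "mac"
    if system == "Linux":
        return "linux"
    raise RuntimeError(f"No pixi environment documented for {system!r}")


def build_command(mode="cli", extra_args=(), system=None):
    """Return (argv, env) for launching the Gradio app or the CLI.

    mode is "gradio" or "cli"; extra_args are appended to the CLI call,
    for example ["--help"].
    """
    if mode == "gradio":
        target = ["gradio", "app.py"]
    elif mode == "cli":
        target = ["python", "src/main.py", *extra_args]
    else:
        raise ValueError(f"Unknown mode: {mode!r}")
    argv = ["pixi", "run", "-e", pixi_env_name(system), *target]
    # Same effect as prefixing the shell command with PYTHONPATH='.'
    env = {**os.environ, "PYTHONPATH": "."}
    return argv, env
```

To actually launch, pass the result to `subprocess.run(argv, env=env, check=True)` from the repository root.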

Debug

Logs are written to logs.log in real time for debugging and monitoring.
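
To peek at the most recent entries without opening the whole file, something like the following could work. It is a minimal sketch that assumes logs.log is a plain-text file in the current directory; the log line format is not specified here, so the hypothetical `last_log_lines` helper treats each line as opaque text.

```python
from collections import deque
from pathlib import Path


def last_log_lines(path="logs.log", n=20):
    """Return the last n lines of the log file, or [] if it does not exist yet."""
    log_path = Path(path)
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the final n lines while streaming the file
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]
```

For example, `print("\n".join(last_log_lines(n=5)))` prints the five most recent lines.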
