GitHub - nulib-labs/Impulse

Impulse is an easy-to-deploy, all-in-one solution for extremely high-throughput, embarrassingly parallel data extraction, built for digitization, indexing, and distribution through IIIF manifests.

Running on Northwestern University's Quest supercomputing cluster, we observed throughput of hundreds of pages per hour while limiting ourselves to 50 concurrent jobs. With a higher job limit, this could easily reach thousands of pages per hour.
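To illustrate what "embarrassingly parallel" with a job cap looks like, here is a minimal local sketch. It is an analogy only: it uses a thread pool rather than cluster jobs, and `extract_page`, `run_all`, and `MAX_CONCURRENT_JOBS` are hypothetical names that do not come from the Impulse code base.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

# Assumption: mirrors the 50-job cap mentioned above; not an Impulse setting.
MAX_CONCURRENT_JOBS = 50


def extract_page(page_id: int) -> str:
    """Hypothetical stand-in for the per-page extraction work."""
    return f"page-{page_id}: done"


def run_all(page_ids: List[int], max_jobs: int = MAX_CONCURRENT_JOBS) -> List[str]:
    # Each page is independent of the others, so pages can be mapped
    # across workers without any coordination between tasks.
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(extract_page, page_ids))


if __name__ == "__main__":
    for line in run_all([1, 2, 3]):
        print(line)
```

Because no page depends on another, raising `max_jobs` is the only change needed to use more workers, which is the property that lets throughput grow with the job limit.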

The code is easy to extend to new models and paradigms: just edit the auxiliary.surya_on_image function!
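As a rough sketch of this kind of extension point, the example below passes the per-image function in as a parameter so that a different model can be swapped in. The signature shown (image bytes in, a dict of fields out) is an assumption made for illustration; the real auxiliary.surya_on_image may take and return something different, and `placeholder_extractor` and `process_pages` are hypothetical names.

```python
from typing import Callable, Dict, List

# Assumed shape of a per-image extraction function: raw image bytes in,
# a dict of extracted fields out. The real function may differ.
Extractor = Callable[[bytes], Dict[str, str]]


def placeholder_extractor(image_bytes: bytes) -> Dict[str, str]:
    """Hypothetical stand-in for a model call; a real one would run OCR here."""
    return {"text": f"<{len(image_bytes)} bytes processed>"}


def process_pages(pages: List[bytes], extract: Extractor) -> List[Dict[str, str]]:
    # The pipeline only depends on the Extractor shape, so changing the
    # model means supplying a different function with the same shape.
    return [extract(page) for page in pages]


if __name__ == "__main__":
    print(process_pages([b"abc", b""], placeholder_extractor))
```

Keeping the model behind a single function boundary like this is what makes editing one function sufficient to change the extraction behavior.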

This project would not be possible without the work of other amazing open-source projects.

Folders and files (90 commits):

ad-hoc
docs
experiments
templates
.gitignore
.python-version
README.md
auxiliary.py
flex.png
impulse.py
logo.png
main.py
my_pads.py
pyproject.toml
requirements.txt
run_ner_alone.py
uv.lock
