Release v0.6.1: Integrated PDF processing workflow with extralit-hf-space and incremental Dataset building with imports · Extralit/extralit

This release delivers major upgrades to document processing and import workflows, and exposes additional dataset-building functionality in the UI. Highlights include OCRmyPDF-powered PDF processing via Redis jobs, a workspace selector in the breadcrumb, and incremental import with dataset mapping.

What's Changed

[FEAT] integrate OCRmyPDF and on document upload in Redis Queue jobs by @priyankeshh and @JonnyTran in #115
[FIX] Import Files Flow by @JonnyTran in #120
[FEAT] Workspace Pinia Store and Dataset Breadcrumb Selector in AppHeader by @JonnyTran in #121
[FIX] Import File Parsing and Matching Flow and Refactoring by @JonnyTran in #122
[FIX] DocumentAPI to query by params and return multiple documents & fix PDF file fetching by @JonnyTran in #123
[FEAT] minio presigned url for pdf by @JonnyTran in #124
[FIX] Import Analysis and Batch Refactoring, File Matching algorithm, Document Panel by @JonnyTran in #130
[FIX] Consolidating linting configuration by @JonnyTran in #133
[FEAT] Document workflows with rq jobs by @JonnyTran in #136
[FEAT] Import dataset mapping by @JonnyTran in #140

Contributors

Many thanks to @priyankeshh for the work on the https://github.com/Extralit/extralit-hf-space repo for PyMuPDF integration.
Welcome @Mr-Youssef-Sherif!

Full Changelog: v0.6.0...v0.6.1
