ALucek / seb-ocr

vLLM Processing for Unstructured Historical Documents

ocr large-language-model vision-language-model ocr-pipeline

Updated Jun 22, 2025
Python

jcaperella29 / Document_cleaning_CLI

🧠 AI-powered pipeline for cleaning scanned documents. Removes noise, enhances text, auto-tunes model weights, and returns OCR-optimized PDFs via CLI or cloud API.

python ocr computer-vision deep-learning rest-api image-processing scanned-documents batch-processing denoising cli-tool document-processing pytesseract image-enhancement fastapi cloud-run document-ai auto-tune ocr-pipelines ocr-pipeline

Updated Mar 15, 2025
MATLAB

bitsandbrains / ocr-pdf-text-extraction-service

Serverless OCR & PDF Text Extraction microservice for Personal AI Factory v1. Built with TypeScript and Vercel Serverless Functions, using pdf-parse and node-fetch for high-performance parsing of machine-readable PDFs. Extracts clean text from text-based PDFs and exposes an HTTP API that returns structured JSON for downstream n8n workflows.

rest-api data-preprocessing workflow-automation pdf-parsing event-driven-architecture edge-computing pdf-text-extraction document-intelligence vercel-functions document-processing-pipeline typescript-backend ocr-pipeline n8n-integration saas-infrastructure serverless-microservice nodejs-runtime

Updated Jan 10, 2026
TypeScript

anshwysmcbel2710 / ocr-pdf-text-extraction-service

Serverless OCR & PDF Text Extraction microservice for Personal AI Factory v1. Built with TypeScript and Vercel Serverless Functions, using pdf-parse and node-fetch for high-performance parsing of machine-readable PDFs. Extracts clean text from text-based PDFs and exposes an HTTP API that returns structured JSON for downstream n8n workflows.

Updated Jan 4, 2026
TypeScript

Not-Buddy / HackerXAPI

High-performance Rust API with AI, multi-format docs, Gemini integration, security, CLI.

scalability async-programming production-ready parallel-processing document-processing gemini-api pdf-processing vector-database ai-ml-integration ocr-pipeline batch-operations llm-intelligence tokio-runtime smart-context-filtering chunking-strategy enterprise-secuirty prompt-injection-sanitization

Updated Jan 3, 2026
Rust

ocr-pipeline

Here are 5 public repositories matching this topic...
