vLLM Processing for Unstructured Historical Documents
Updated Jun 22, 2025 - Python
🧠 AI-powered pipeline for cleaning scanned documents. Removes noise, enhances text, auto-tunes model weights, and returns OCR-optimized PDFs via CLI or cloud API.
Serverless OCR & PDF Text Extraction microservice for Personal AI Factory v1. Built with TypeScript and Vercel Serverless Functions, using pdf-parse and node-fetch for high-performance parsing of machine-readable PDFs. Extracts clean text from text-based PDFs and exposes an HTTP API that returns structured JSON for downstream n8n workflows.
High-performance Rust API with AI, multi-format docs, Gemini integration, security, CLI.