3DCF / doc2dataset: token-efficient document layer with NumGuard numeric integrity and multi-framework exports for RAG & fine-tuning.
nlp rust cli machine-learning ocr evaluation dataset-generation data-pipeline document-processing fine-tuning rag document-understanding llm 3dcf doc2dataset numguard
Updated Dec 7, 2025 - Rust