Status: Open
Labels: Stellar Wave, backend, good first issue
Description:
Build a NestJS module that handles document upload, OCR processing, metadata extraction, and text analysis for land ownership documents.
Requirements:
- DocumentProcessingModule with service, controller, and entities
- Support for PDF and image uploads (PDF.js, Tesseract.js)
- Extract text content using OCR
- Parse metadata (dates, names, parcel IDs, coordinates)
- Store extracted data in PostgreSQL
- Queue-based processing for large documents (Bull)
- Progress tracking for async operations
- Error handling and retry logic
- File validation and sanitization
- S3 or local storage integration
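To make the metadata-parsing requirement more concrete, here is a minimal sketch of extracting dates, parcel IDs, and coordinates from OCR output. The issue does not specify any formats, so everything here is an assumption for illustration: the `ExtractedMetadata` shape, the `extractMetadata` name, ISO-style dates, a parcel ID pattern like `AB-1234-56`, and decimal `lat, lon` pairs. Name extraction is left out because it usually needs more than a regular expression.

```typescript
// Hypothetical result shape; field names are assumptions, not from the issue.
interface ExtractedMetadata {
  dates: string[];
  parcelIds: string[];
  coordinates: { lat: number; lon: number }[];
}

// Assumed input formats (adjust to the real documents):
//   dates        ISO calendar dates, e.g. 2021-03-15
//   parcel IDs   "Parcel", optional "ID"/"No.", then an ID such as AB-1234-56
//   coordinates  decimal "lat, lon" pairs, e.g. 6.5244, 3.3792
function extractMetadata(text: string): ExtractedMetadata {
  // A global match returns every matched substring, or null when none match.
  const dates: string[] = text.match(/\b\d{4}-\d{2}-\d{2}\b/g) || [];

  const parcelIds: string[] = [];
  const parcelRe = /\bParcel(?:\s+(?:ID|No\.?))?[:\s]+([A-Z]{2}-\d{4}-\d{2})\b/gi;
  let m: RegExpExecArray | null;
  while ((m = parcelRe.exec(text)) !== null) {
    // Normalize case, since OCR output may vary.
    parcelIds.push(m[1].toUpperCase());
  }

  const coordinates: { lat: number; lon: number }[] = [];
  const coordRe = /(-?\d{1,3}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)/g;
  while ((m = coordRe.exec(text)) !== null) {
    const lat = parseFloat(m[1]);
    const lon = parseFloat(m[2]);
    // Keep only pairs that fall inside valid latitude/longitude ranges.
    if (Math.abs(lat) <= 90 && Math.abs(lon) <= 180) {
      coordinates.push({ lat: lat, lon: lon });
    }
  }

  return { dates: dates, parcelIds: parcelIds, coordinates: coordinates };
}

// Example usage with made-up OCR text.
const sample = "Deed dated 2021-03-15. Parcel ID: ab-1234-56 located at 6.5244, 3.3792.";
const metadata = extractMetadata(sample);
console.log(JSON.stringify(metadata));
```

In the module itself, a function like this would likely run inside the queue worker after OCR completes, with its result persisted to PostgreSQL. Keeping it a pure function makes it straightforward to unit test against sample documents.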
Acceptance Criteria:
- Module is self-contained and importable
- Supports multiple file formats
- OCR accuracy is acceptable (>90%)
- Handles files up to 10MB
- Processing status is trackable
- Comprehensive error handling
- Unit and integration tests included
- API documentation (Swagger)
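The file validation and size criteria could be sketched as a small pure function that runs before a document is queued. This is only an illustration under stated assumptions: the `UploadCandidate` and `ValidationResult` shapes, the function names, and the MIME allow-list are hypothetical, and "10MB" is interpreted here as 10 * 1024 * 1024 bytes.

```typescript
// 10MB limit from the acceptance criteria (assumed to mean 10 MiB).
const MAX_BYTES = 10 * 1024 * 1024;

// Assumed allow-list; the issue only says "PDF and image uploads".
const ALLOWED_MIME_TYPES = ["application/pdf", "image/png", "image/jpeg", "image/tiff"];

// Hypothetical input and result shapes for illustration.
interface UploadCandidate {
  originalName: string;
  mimeType: string;
  sizeBytes: number;
}

interface ValidationResult {
  ok: boolean;
  errors: string[];
  safeName: string;
}

function sanitizeFileName(name: string): string {
  // Drop directory components so a name like "../../x.pdf" cannot escape the upload folder.
  const base = name.split(/[\\/]/).pop() || "";
  // Replace anything outside a conservative character set, then strip leading dots.
  const cleaned = base.replace(/[^A-Za-z0-9._-]/g, "_").replace(/^\.+/, "");
  return cleaned.length > 0 ? cleaned.slice(0, 255) : "upload";
}

function validateUpload(file: UploadCandidate): ValidationResult {
  const errors: string[] = [];
  // The declared MIME type comes from the client and can be spoofed;
  // a fuller check would also inspect the file's leading bytes.
  if (ALLOWED_MIME_TYPES.indexOf(file.mimeType) === -1) {
    errors.push("Unsupported file type: " + file.mimeType);
  }
  if (file.sizeBytes <= 0) {
    errors.push("File is empty");
  } else if (file.sizeBytes > MAX_BYTES) {
    errors.push("File exceeds the 10MB limit");
  }
  return {
    ok: errors.length === 0,
    errors: errors,
    safeName: sanitizeFileName(file.originalName),
  };
}

// Example usage with a made-up upload.
const result = validateUpload({
  originalName: "../../deeds/plot 7.pdf",
  mimeType: "application/pdf",
  sizeBytes: 2048,
});
console.log(result.ok, result.safeName);
```

Collecting all errors instead of failing on the first one lets the controller return a single response that lists every problem with the upload.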
Tech Stack: NestJS, TypeORM, PostgreSQL, Bull, Tesseract.js,