four-elo-dms-export - Change History
--ids option: Export specific documents by comma-separated object IDs
getDocumentsById() in DatabaseReader: Filter documents by array of object IDs
Auto-resume capability: Tracks exported IDs in exported_ids.txt and automatically skips already-exported documents
loadExportedIds() in ExportOrganizer: Load previously exported IDs
isExported() in ExportOrganizer: Check if document already exported
markExported() in ExportOrganizer: Mark document as exported (saves immediately)
getExportedCount() in ExportOrganizer: Get count of already exported documents
Export progress is now automatically saved after each document
Re-running export on same output directory automatically resumes from where it left off
Documents that failed previously can be retried by running export again
Re-export documents that had conversion errors
Export specific documents for testing
Selective re-processing without full database scan
Resume interrupted exports without starting over
Safe to Ctrl+C and restart - no duplicate work
exported_ids.txt stored in output directory root (one ID per line)
File uses append mode (FILE_APPEND) for optimal performance - no JSON encoding/decoding overhead
Already-exported documents are silently skipped (counted in "skipped")
Works with both full export and --ids selective export
Lazy loading: IDs loaded from file only on first access
Fast writes: Single append operation instead of reading, parsing, modifying, encoding, and writing entire file
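The append-only tracking described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual ExportOrganizer code: the class name `ExportedIdTracker` is hypothetical, and only the method names (`isExported()`, `markExported()`, `getExportedCount()`) and the one-ID-per-line file format are taken from this changelog.

```php
<?php
// Hypothetical sketch: the method names follow the changelog, the bodies are assumptions.
final class ExportedIdTracker
{
    /** @var array<int, true>|null Null until the file has been read (lazy loading). */
    private ?array $ids = null;

    public function __construct(private string $file)
    {
    }

    /** Read the ID file once, on first access. */
    private function load(): void
    {
        if ($this->ids !== null) {
            return; // already loaded
        }
        $this->ids = [];
        if (!is_file($this->file)) {
            return; // nothing exported yet
        }
        $lines = file($this->file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
        foreach ($lines ?: [] as $line) {
            $this->ids[(int) $line] = true;
        }
    }

    public function isExported(int $id): bool
    {
        $this->load();
        return isset($this->ids[$id]);
    }

    /** Record the ID in memory and append a single line to the file. */
    public function markExported(int $id): void
    {
        $this->load();
        if (isset($this->ids[$id])) {
            return; // avoid duplicate lines
        }
        $this->ids[$id] = true;
        file_put_contents($this->file, $id . PHP_EOL, FILE_APPEND | LOCK_EX);
    }

    public function getExportedCount(): int
    {
        $this->load();
        return count($this->ids);
    }
}
```

Because each call appends one line, an interrupted run loses at most the document that was being processed when it stopped.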
# Export specific documents
./bin/elo-export export /path/to/DMS.MDB /path/to/Archivdata --output=/path/to/export --ids=100,200,300
# Resume after interruption (auto-skips already exported)
./bin/elo-export export /path/to/DMS.MDB /path/to/Archivdata --output=/path/to/export
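Folder calculation corrected: Changed from objdoc >> 10 to (objdoc >> 10) << 2. The folder number now advances in steps of 4, one step per 1024 documents; it equals objdoc >> 8 rounded down to a multiple of 4, not objdoc >> 8 itself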
Root object excluded: Root object (objtype 9999) now properly excluded from documents
Multi-page PDF support: Use writeImages() instead of writeImage() for proper multi-page TIFF conversion
Unified file handling: New addFile() method handles both conversion and direct copy
Direct PDF generation: ImageConverter writes directly to destination, no temporary files
Full path logging: Log entries now include complete file paths instead of basenames for better traceability
isSupportedFormat() in ImageConverter: Check if file format supports conversion (tif, tiff, jpg, jpeg, png, gif)
addFile() in ExportOrganizer: Unified method for adding files (auto-converts or copies based on format)
generateUniqueFilename() in ExportOrganizer: Handles duplicate filename resolution
Support for non-image files: Files with unsupported formats are copied as-is
Temporary PDF file handling (no longer needed)
Separate conversion and copy logic in ExportCommand (unified in ExportOrganizer)
Multi-page TIFF files now correctly converted to multi-page PDFs
Folder path calculation for physical file locations (now (objdoc >> 10) << 2 instead of objdoc >> 10)
Root object (objtype 9999) no longer processed as document
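Folder calculation: (objdoc >> 10) << 2 = shift right 10, then shift left 2. The result is objdoc >> 8 with its two lowest bits cleared, so each folder still covers 1024 documents and folder numbers are multiples of 4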
ImageConverter uses writeImages($path, true) for multi-page support
ExportOrganizer checks format and either converts to PDF or copies original
Cleaner API: addFile(sourcePath, relativePath) handles everything
No temporary files in /tmp or project directories
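The folder arithmetic above can be checked with a short sketch. The function names `folderNumber()` and `physicalPath()` are hypothetical, and the `UP` prefix with 6 hex digits plus the 8-digit hex filename are assumptions carried over from the path example in an earlier entry of this history; the real buildFilePath() may differ.

```php
<?php
// Hypothetical helper: folder number as given in this entry.
function folderNumber(int $objdoc): int
{
    // Shift right 10 (groups of 1024), then left 2 (multiply by 4).
    return ($objdoc >> 10) << 2;
}

// Hypothetical helper: the naming scheme is an assumption, see the lead-in.
function physicalPath(string $archiveRoot, int $objdoc, string $extension = 'TIF'): string
{
    $folder = sprintf('UP%06X', folderNumber($objdoc)); // UP + 6 hex digits
    $file = sprintf('%08X.%s', $objdoc, $extension);    // 8 hex digits + extension
    return $archiveRoot . '/' . $folder . '/' . $file;
}

// objdoc = 3101: both expressions agree.
var_dump(folderNumber(3101) === (3101 >> 8)); // bool(true), both are 12

// objdoc = 3400: they differ, because the low two bits are cleared.
var_dump(folderNumber(3400)); // int(12)
var_dump(3400 >> 8);          // int(13)
```

Under these assumptions, `physicalPath('Archivdata/DMS_1', 3101)` yields `Archivdata/DMS_1/UP00000C/00000C1D.TIF`.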
Complete DatabaseReader refactor: Simplified to object-based API with single-load caching
Data model: Changed from associative arrays to stdClass objects for cleaner property access
Caching strategy: Load all objects once via getObjects(), cache in memory, filter on demand
Helper properties: Added isFolder, isDocument, isDeleted flags to each object for easy filtering
Method signatures: Simplified to work with stdClass objects instead of mixed types
Path creation: New dedicated methods createDocumentPath() and createFolderPath()
Null safety: Improved with null coalescing operator (??) throughout
Early returns: Added optimization for root/invalid folders in createFolderPath()
Explicit imports: Using explicit InvalidArgumentException and RuntimeException imports
getObjects(): Loads and caches all database objects as stdClass with helper flags
getDocuments(): Returns filtered array of document objects (non-deleted files)
getFolders(): Returns filtered array of folder objects (non-deleted folders)
createDocumentPath(): Generates full export path from document object
createFolderPath(): Recursively builds folder hierarchy path
buildFilePath(): Creates physical file path from document object
Generator/streaming approach (replaced with cached approach)
buildFilePathFromObjdoc() (replaced with buildFilePath())
objdocToHexFilename() (renamed to objdocidToHexFilename())
Two-pass processing complexity
ext-pdo dependency from composer.json (no longer needed with mdb-json)
ODBC/DSN configuration requirements
--dsn command option (no longer applicable)
Single mdb-json execution loads all objects into memory
Objects stored as stdClass (not arrays) using json_decode($line, false)
Objects indexed by objid for O(1) lookups
Helper flags computed once during load: isFolder, isDocument, isDeleted
Null coalescing operator (??) used for: objstatus, objparent, objshort, objdoc
Early return in createFolderPath() for $folderId <= 1 (root/invalid)
Folder path recursion uses cached object lookup for performance
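The load-once, index-by-objid approach described above might look roughly like the sketch below. The free functions `loadObjects()` and a two-argument `createFolderPath()` are hypothetical stand-ins for the DatabaseReader methods, and the flag rules (folders are objtype < 255, documents are objtype > 254 except the root type 9999, deleted means objstatus other than 0) are assumptions pieced together from other entries in this history.

```php
<?php
/**
 * Hypothetical sketch: parse one-JSON-object-per-line output into stdClass
 * objects indexed by objid, computing the helper flags once during load.
 *
 * @param iterable<string> $lines
 * @return array<int, stdClass>
 */
function loadObjects(iterable $lines): array
{
    $objects = [];
    foreach ($lines as $line) {
        $obj = json_decode(trim($line), false); // false => stdClass, not array
        if (!$obj instanceof stdClass || !isset($obj->objid)) {
            continue; // skip blank or malformed lines
        }
        $type = (int) ($obj->objtype ?? 0);
        // Flag rules below are assumptions, see the lead-in.
        $obj->isDeleted = (int) ($obj->objstatus ?? 0) !== 0;
        $obj->isFolder = $type < 255;
        $obj->isDocument = $type > 254 && $type !== 9999;
        $objects[(int) $obj->objid] = $obj; // index by objid for O(1) lookup
    }
    return $objects;
}

/**
 * Hypothetical sketch: build "Parent/Child" by walking objparent upwards.
 *
 * @param array<int, stdClass> $objects
 */
function createFolderPath(array $objects, int $folderId): string
{
    if ($folderId <= 1 || !isset($objects[$folderId])) {
        return ''; // early return for root/invalid folders
    }
    $folder = $objects[$folderId];
    $parent = createFolderPath($objects, (int) ($folder->objparent ?? 0));
    $name = (string) ($folder->objshort ?? '');
    return $parent === '' ? $name : $parent . '/' . $name;
}
```

The sketch does not guard against cyclic objparent references; a depth limit would be a sensible addition if the data cannot be trusted.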
Only system dependency: mdbtools package (provides mdb-json)
No PHP PDO/ODBC extensions required
Much simpler and more maintainable codebase
Database access method: Switched from PDO ODBC to mdb-json for reliable and maintained database access
Folder calculation corrected: Fixed file path calculation from "first 6 hex chars" to proper objdoc >> 10 formula
Line-by-line streaming: Database output now streamed line-by-line using popen() and fgets()
Type flexibility: buildFilePathFromObjdoc() now accepts both string and int parameters
Status filtering: Changed from objstatus != 1 to objstatus = 0 for active records
MDBTools ODBC unreliability issues (SQL parsing errors, segmentation faults)
Incorrect folder path calculation for physical file locations
Type errors when passing objdoc values between methods
Uses mdb-json {database} objekte which outputs one JSON object per line
Folder calculation: folder = (objdoc >> 10) converted to 6-char hex with UP prefix
Example: objdoc=3101 → 3101>>10=3 → UP000003 → Archivdata/DMS_1/UP000003/00000C1D.TIF
Two-pass processing: First pass collects folders, second pass streams files
All JSON decoding uses associative arrays (not objects) for consistent access
Database access completely independent of ODBC drivers
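As an illustration of the popen()/fgets() streaming used in this version, here is a minimal sketch. The function names `readJsonLines()` and `streamObjekte()` are hypothetical; the `mdb-json {database} objekte` invocation and the associative-array decoding come from the notes above.

```php
<?php
/**
 * Hypothetical sketch: yield one decoded row per line from an open stream.
 *
 * @param resource $handle
 * @return Generator<int, array<string, mixed>>
 */
function readJsonLines($handle): Generator
{
    while (($line = fgets($handle)) !== false) {
        $line = trim($line);
        if ($line === '') {
            continue; // ignore blank lines
        }
        $row = json_decode($line, true); // true => associative array
        if (is_array($row)) {
            yield $row;
        }
    }
}

/**
 * Hypothetical sketch: stream the objekte table through mdb-json.
 * Requires the mdbtools package to be installed.
 */
function streamObjekte(string $databasePath): Generator
{
    $handle = popen('mdb-json ' . escapeshellarg($databasePath) . ' objekte', 'r');
    if ($handle === false) {
        throw new RuntimeException('Could not start mdb-json');
    }
    try {
        yield from readJsonLines($handle);
    } finally {
        pclose($handle); // always release the process handle
    }
}
```

Only one line is held in memory at a time, which is what made a two-pass design necessary here: folders had to be collected before files could be placed.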
Streaming architecture: Converted getDocuments() to use PHP generator for memory-efficient processing
Partial database reading: Separated folder loading from file loading to reduce memory footprint
Optimized queries: Added WHERE clauses to filter at database level
Logs relocated: Moved all logs to var/log relative to tool (not in export path)
Clean export structure: Export folder now contains only documents/ hierarchy, no metadata files
Metadata export functionality (CSV and JSON reports)
processedDocuments tracking (no longer needed)
Undefined $filename variable in logging
getDocuments() now yields documents one at a time instead of loading all into memory
Added getDocumentCount() for efficient progress bar counting
Folder hierarchy built first from SELECT * FROM objekte WHERE objtype < 255
Files streamed from SELECT * FROM objekte WHERE objtype > 254 AND (objstatus IS NULL OR objstatus != 1)
Log path: {project_root}/var/log/
Export path: {user_specified}/documents/
Corrected folder hierarchy export logic
Files now placed in proper ELO folder structure
Filenames now use sanitized objshort values
Fixed MDBTools ODBC text field issues using mdb-export
Folder paths built from folders (objtype < 255) using objparent traversal
Files (objtype > 254) use objparent to determine folder placement
objkeys loaded via mdb-export CLI tool instead of PDO (workaround for ODBC issues)
Enhanced DBSCHEMA.md with comprehensive workflow documentation
Export structure: Export/documents/<folder_path>/<sanitized_objshort>.pdf
Folder lookup: Recursive objparent traversal for hierarchy
Filename sanitization: Unified rules for folders and files
File lookup: ELO_FNAME from objkeys via mdb-export
Initial project structure following 4 Bytes standards
Composer configuration with Symfony Console framework
Service architecture for modular export processing
Support for PHP Imagick-based image to PDF conversion
PDO ODBC integration for ELO MDB database access
Comprehensive logging system
Project documentation (CLAUDE.md, HISTORY.md, DBSCHEMA.md)
Package: four-bytes/four-elo-dms-export
Namespace: Four\Elo
PHP Version: 8.1+
Key Dependencies: symfony/console ^7.0, ext-imagick, ext-pdo
License: MIT
Repository: Public under four-bytes GitHub organization
Command pattern via Symfony Console
Service layer: DatabaseReader, ImageConverter, ExportOrganizer, Logger
Nextcloud-ready output structure with metadata export