FileForest is not yet a file organiser.
It is, at this stage, barely a file observer. However, in the interest of documenting and logging the process of building something while still fresh from learning how to build it, this is being committed and versioned, even if it is tongue-in-cheek. May I forever be red-faced with embarrassment at these blunder years.
This project is the earliest experimental prototype of a visual file‑organisation system designed to:
- analyse
- classify
- repair
- and eventually reorganise
the messy outputs of data recovery, archive consolidation, and decades of digital entropy.
Right now, FileForest consists of a single script that performs a single process: it walks through a folder and extracts meaning from the files it encounters.
This primordial build includes:
- a folder-walker
- ExifTool-powered metadata extraction
- SHA‑1 hashing for duplicate detection
- a CSV ledger that records every scrap of metadata it can find
- a strict “do no harm” policy: it touches nothing, renames nothing, moves nothing
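A minimal sketch of how those pieces could fit together, under stated assumptions: the function names `iter_files`, `sha1_of` and `exiftool_metadata` are hypothetical and are not taken from scanner.py. It only ever opens files for reading, in keeping with the “do no harm” policy, and it assumes the `exiftool` executable is on the PATH (returning an empty dict when it is not).

```python
import hashlib
import json
import shutil
import subprocess
from pathlib import Path


def iter_files(root):
    """Yield every regular file under root, in a stable order."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            yield path


def sha1_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:  # read-only: the file is never modified
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exiftool_metadata(path):
    """Return ExifTool's tags for one file as a dict, or {} if unavailable."""
    if shutil.which("exiftool") is None:
        return {}  # assumption: a missing ExifTool is not treated as fatal
    result = subprocess.run(
        ["exiftool", "-json", str(path)],
        capture_output=True, text=True, check=False,
    )
    try:
        records = json.loads(result.stdout)  # -json prints a list, one dict per file
    except json.JSONDecodeError:
        return {}
    return records[0] if records else {}
```

Calling `sha1_of(p)` and `exiftool_metadata(p)` for each `p` in `iter_files("src")` would give the raw material for one ledger row per file.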
To try it out:
1. Install ExifTool.
2. Place some harmless sample files into the “src” folder. Feel free to change the file extensions to test functionality.
3. Run: python3 scanner.py
4. Enter the folder to scan.
5. Point the scanner to the “src” folder.
6. Open the generated CSV in the spreadsheet program of your choice.
You will see:
- filename
- MIME type
- discovered metadata
- file size
- SHA‑1 hash
- inferred extension (if safe)
- placeholders for future sorting decisions
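As an illustration of how such a ledger might be written, here is a small sketch. The column names in `LEDGER_COLUMNS`, the helper names, and the choice to store the discovered metadata as a single JSON string are all assumptions made for the example; the actual CSV produced by scanner.py may be laid out differently.

```python
import csv
import json

# Hypothetical column names, mirroring the fields listed above.
LEDGER_COLUMNS = [
    "filename", "mime_type", "size_bytes", "sha1",
    "inferred_extension", "metadata_json", "decision",
]


def ledger_row(filename, size_bytes, sha1, metadata):
    """Build one ledger row from a file's basic facts and its metadata dict."""
    return {
        "filename": filename,
        "mime_type": metadata.get("MIMEType", ""),  # MIMEType is the ExifTool tag name
        "size_bytes": size_bytes,
        "sha1": sha1,
        "inferred_extension": "",  # left blank for a later inference step
        "metadata_json": json.dumps(metadata, sort_keys=True),
        "decision": "",  # placeholder for future sorting decisions
    }


def write_ledger(rows, out_path):
    """Write all rows to a CSV file with a fixed header order."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LEDGER_COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Keeping the header in one fixed list means every row has the same columns, even when a file yields no metadata at all.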
Files remain untouched. This version does not:
- Rename files
- Modify metadata
- Move or reorganise anything
- Attempt format repair
This is effectively the 0.0.0.0.0.0.0.0.0.0.1 seed milestone — the primordial soup from which a real FileForest will evolve.
Future versions will introduce:
- Duplicate analysis
- Inferred extension reconciliation
- Sorting strategies (by type, date, project category, etc.)
- File recovery helpers for mixed-format PhotoRec outputs
- A full GUI layer for visual file management
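Duplicate analysis could plausibly start from the SHA‑1 values already recorded in the ledger. The sketch below is one possible shape for it, not something the current build does; `duplicate_groups` and the `filename` and `sha1` keys are hypothetical names chosen for the example.

```python
from collections import defaultdict


def duplicate_groups(rows):
    """Group filenames by SHA-1 and keep only hashes seen more than once."""
    by_hash = defaultdict(list)
    for row in rows:
        by_hash[row["sha1"]].append(row["filename"])
    return {sha1: names for sha1, names in by_hash.items() if len(names) > 1}


rows = [
    {"filename": "f0001.jpg", "sha1": "h1"},
    {"filename": "f0002.jpg", "sha1": "h2"},
    {"filename": "f0003.jpg", "sha1": "h1"},
]
print(duplicate_groups(rows))  # {'h1': ['f0001.jpg', 'f0003.jpg']}
```

A matching hash is strong evidence that two recovered files are identical, and a byte-for-byte comparison could confirm it before any later version acts on the result.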
Files recovered by PhotoRec are often:
- arbitrarily named
- given the wrong extension, or none at all
- stripped of timestamps
- placed in chaotic folder structures
FileForest aims to bring them back into order methodically:
- Stage 1 (this version): Crawl → analyse → extract → record → stop.
- Stage 2: Controlled renaming, extension correction, metadata restoration.
- Stage 3: Sorting (by type, date clusters, project category, etc.)
- Stage 4: Repair helpers and deeper heuristics.
- Stage 5: A GUI representing files as a navigable forest.
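Stage 2's extension correction could begin as a proposal-only step that never renames anything. The sketch below assumes the MIME type has already been detected (for example by ExifTool) and uses Python's standard `mimetypes` table to suggest an extension. The function name `suggest_extension` and the rule for what counts as "safe" are assumptions for illustration.

```python
import mimetypes
from pathlib import Path


def suggest_extension(filename, mime_type):
    """Return a proposed extension, or None when no safe proposal exists."""
    # Treat an unknown or generic binary type as "no opinion".
    if not mime_type or mime_type == "application/octet-stream":
        return None
    known = mimetypes.guess_all_extensions(mime_type)
    if not known:
        return None
    # If the current extension already fits the MIME type, propose nothing.
    if Path(filename).suffix.lower() in known:
        return None
    return mimetypes.guess_extension(mime_type)


print(suggest_extension("f0001234", "image/png"))      # .png
print(suggest_extension("holiday.png", "image/png"))   # None
```

The result is only a suggestion to record in the ledger; any actual renaming would belong to a later, explicitly controlled step.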
This is the seed.
- 0.0.0.1 — first cognition: scan + metadata → CSV
- 0.0.1 — structured metadata extraction
- 0.1.0 — full ledger with column guarantees
- 0.2.0 — safe rename/repair engine
- 0.3.0 — sorting behaviours
- 0.5.0 — partial-file format heuristics
- 1.0.0 — FileForest GUI emerges
Every forest starts with a seed.