Directory crawler + metadata extractor designed to tame chaotic PhotoRec output. Very much alive, but only barely.

HanaSolo4230/File-Forest

FileForest v0.0.0.1 — Ordovician Edition

FileForest is not yet a file organiser.
It is, at this stage, barely a file observer. However, in the interest of documenting the process of building something while still fresh out of learning how to, this is being committed and versioned, even if it is tongue-in-cheek. May I forever be red-faced in embarrassment at these blunder years.

This project is the earliest experimental prototype of a visual file‑organisation system designed to:

  • analyse
  • classify
  • repair
  • and eventually reorganise

the messy outputs of data recovery, archive consolidation, and decades of digital entropy.

Right now, FileForest consists of a single script that performs a single process: it walks through a folder and extracts meaning from the files it encounters.
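As a rough illustration of the walking step, here is a minimal sketch of a read-only folder walker. The function name `walk_files` is hypothetical and is not taken from scanner.py; the actual script may walk the tree differently.

```python
from pathlib import Path
from typing import Iterator


def walk_files(root: str) -> Iterator[Path]:
    """Yield every regular file under root in a stable, sorted order.

    Read-only: nothing is opened for writing, renamed, or moved.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise NotADirectoryError(f"not a folder: {root}")
    for path in sorted(root_path.rglob("*")):
        # Skip directories and symlinks so each real file is visited once.
        if path.is_file() and not path.is_symlink():
            yield path
```

Calling `list(walk_files("src"))` would return the files under the sample folder without touching any of them.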

What This Version Does

This primordial build includes:

  • a folder-walker
  • ExifTool-powered metadata extraction
  • SHA‑1 hashing for duplicate detection
  • a CSV ledger that records every scrap of metadata it can find
  • a strict “do no harm” policy: it touches nothing, renames nothing, moves nothing
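The hashing and metadata pieces listed above could look something like the sketch below. It assumes ExifTool is on the PATH and is called with its `-json` option, which prints a JSON array holding one object per file. The helper names (`sha1_of`, `parse_exiftool_json`, `read_metadata`) are illustrative and may not match scanner.py.

```python
import hashlib
import json
import subprocess
from pathlib import Path


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large recovered files need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:  # read-only: the file is never modified
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_exiftool_json(stdout: str) -> dict:
    """Return the first record of ExifTool's JSON output, or {} if unusable."""
    if not stdout.strip():
        return {}
    try:
        records = json.loads(stdout)
    except ValueError:
        return {}
    if isinstance(records, list) and records and isinstance(records[0], dict):
        return records[0]
    return {}


def read_metadata(path: Path) -> dict:
    """Ask ExifTool for every tag it can find on one file."""
    try:
        result = subprocess.run(
            ["exiftool", "-json", str(path)],
            capture_output=True,
            text=True,
        )
    except OSError:
        # Assumption: a missing ExifTool is treated as "no metadata".
        return {}
    return parse_exiftool_json(result.stdout)
```

Two files with the same `sha1_of` value have identical contents for practical purposes, which is what makes the hash usable for duplicate detection even when PhotoRec has given the files unrelated names.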

How to Use This Version

  1. Install ExifTool

  2. Place some harmless sample files into the “src” folder. Feel free to change the file extensions to test functionality.

  3. Run: python3 scanner.py

  4. When prompted for the folder to scan, point the scanner to the “src” folder.

  5. Open the generated CSV in the spreadsheet program of your choice.

You will see:

  • filename
  • MIME type
  • discovered metadata
  • file size
  • SHA‑1 hash
  • inferred extension (if safe)
  • placeholders for future sorting decisions

Files remain untouched.
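To make the ledger columns concrete, here is a hedged sketch of how one row might be assembled and written. The column names, the `SAFE_EXTENSIONS` table, and the function names are all assumptions made for illustration; the real CSV headers and the rule for what counts as a "safe" inference may differ. `MIMEType` is the tag name ExifTool uses in its JSON output.

```python
import csv
import json
from pathlib import Path

# Hypothetical column names; the generated CSV may use different headers.
LEDGER_COLUMNS = [
    "filename", "mime_type", "metadata", "size_bytes",
    "sha1", "inferred_extension", "decision",
]

# Deliberately tiny, hand-picked table: MIME type -> accepted suffixes,
# with the preferred suffix first. Anything not listed is left alone.
SAFE_EXTENSIONS = {
    "image/jpeg": (".jpg", ".jpeg"),
    "image/png": (".png",),
    "application/pdf": (".pdf",),
}


def infer_extension(mime_type: str, current_suffix: str) -> str:
    """Suggest an extension only when the MIME type is in the safe table
    and the file does not already carry an accepted suffix."""
    accepted = SAFE_EXTENSIONS.get(mime_type, ())
    if accepted and current_suffix.lower() not in accepted:
        return accepted[0]
    return ""


def ledger_row(path: Path, metadata: dict, sha1: str) -> dict:
    """Build one CSV row; the file itself is only stat-ed, never changed."""
    mime_type = str(metadata.get("MIMEType", ""))
    return {
        "filename": path.name,
        "mime_type": mime_type,
        "metadata": json.dumps(metadata, sort_keys=True),
        "size_bytes": path.stat().st_size,
        "sha1": sha1,
        "inferred_extension": infer_extension(mime_type, path.suffix),
        "decision": "",  # placeholder for a future sorting decision
    }


def write_ledger(rows, csv_path: str) -> None:
    """Write all rows with a fixed header so every column is always present."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LEDGER_COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

The inferred extension is only a suggestion recorded in the ledger. Nothing is renamed, which keeps this consistent with the "do no harm" policy.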

What this version does NOT do:

  • Rename files
  • Modify metadata
  • Move or reorganise anything
  • Attempt format repair

This is effectively the 0.0.0.0.0.0.0.0.0.0.1 seed milestone — the primordial soup from which a real FileForest will evolve.

Future versions will introduce:

  • Duplicate analysis
  • Inferred extension reconciliation
  • Sorting strategies (by type, date, project category, etc.)
  • File recovery helpers for mixed-format PhotoRec outputs
  • A full GUI layer for visual file management
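Duplicate analysis is not implemented yet, but since the ledger already records a SHA‑1 per file, one plausible starting point is to group ledger rows by hash. The sketch below assumes each row is a dict with hypothetical `sha1` and `filename` keys; it is an illustration of the idea, not the planned design.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_duplicates(rows: Iterable[dict]) -> Dict[str, List[str]]:
    """Map each SHA-1 that occurs more than once to the filenames sharing it."""
    by_hash = defaultdict(list)
    for row in rows:
        if row.get("sha1"):  # ignore rows with a missing or empty hash
            by_hash[row["sha1"]].append(row["filename"])
    return {sha1: names for sha1, names in by_hash.items() if len(names) > 1}
```

Rows read back from the CSV with `csv.DictReader` could be passed in directly, as long as the column names match the keys used here.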


Purpose

Files recovered by PhotoRec are often:

  • arbitrarily named
  • given the wrong extension or none at all
  • stripped of timestamps
  • placed in chaotic folder structures

FileForest aims to bring them back into order methodically:

  1. Stage 1 (this version):
    Crawl → analyse → extract → record → stop.

  2. Stage 2:
    Controlled renaming, extension correction, metadata restoration.

  3. Stage 3:
    Sorting (by type, date clusters, project category, etc.)

  4. Stage 4:
    Repair helpers and deeper heuristics.

  5. Stage 5:
    A GUI representing files as a navigable forest.

This is the seed.


Roadmap (Abbreviated)

  • 0.0.0.1 — first cognition: scan + metadata → CSV
  • 0.0.1 — structured metadata extraction
  • 0.1.0 — full ledger with column guarantees
  • 0.2.0 — safe rename/repair engine
  • 0.3.0 — sorting behaviours
  • 0.5.0 — partial-file format heuristics
  • 1.0.0 — FileForest GUI emerges

Every forest starts with a seed.
