dedupe is a simple tool written in C that detects identical files by their SHA-256 hash and hardlinks all duplicates to the oldest copy of that file, based on modification time.
It is fast because it only hashes a file when at least one other file has the same size. It also does not cross mount points.
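
The size pre-filter and the mount point check can be sketched roughly as follows. This is only an illustration of the idea, not code from dedupe itself: `struct candidate`, `mark_hash_candidates` and `same_filesystem` are hypothetical names, and comparing `st_dev` values is an assumed (though common) way to notice that a path lives on another filesystem.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical record for one scanned file; not dedupe's real structure. */
struct candidate {
    const char *path;
    off_t size;
    bool needs_hash;
};

/* qsort comparator: order candidates by ascending size. */
static int by_size(const void *a, const void *b)
{
    const struct candidate *x = a;
    const struct candidate *y = b;
    return (x->size > y->size) - (x->size < y->size);
}

/*
 * Sort the candidates by size and flag only those that share their size
 * with at least one other file. Files with a unique size can never be
 * duplicates, so they are never hashed. Returns the number flagged.
 */
size_t mark_hash_candidates(struct candidate *files, size_t n)
{
    size_t flagged = 0;

    qsort(files, n, sizeof *files, by_size);
    for (size_t i = 0; i < n; i++) {
        bool same_prev = i > 0 && files[i - 1].size == files[i].size;
        bool same_next = i + 1 < n && files[i + 1].size == files[i].size;
        files[i].needs_hash = same_prev || same_next;
        if (files[i].needs_hash)
            flagged++;
    }
    return flagged;
}

/*
 * Assumed mount point check: an entry is on the same filesystem as the
 * scan root when both report the same device ID.
 */
bool same_filesystem(const struct stat *root, const struct stat *entry)
{
    return root->st_dev == entry->st_dev;
}
```

With four files of sizes 10, 20, 10 and 30 bytes, only the two 10-byte files would be flagged for hashing in this sketch.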
dedupe builds on Linux and FreeBSD.
- On Linux, it depends on `openssl` and `talloc`. Simply run `make`.
- On FreeBSD, it depends on `talloc`, and requires `gmake` to build.
Simply pass a list of directories to scan as arguments.
There are a few options that can be passed as well:
- `-v` or `--verbose` prints colorful progress as it scans files and as it finds duplicates.
- `-n` or `--dry-run` does not make any modifications.
- `-i` or `--interactive` asks what to do with each duplicate found.
- `-e` or `--exclude` excludes files or directories whose names match the pattern.
- `-x` or `--use-xattrs` uses extended attributes to cache the computed file hashes.
There are a few more options; run `dedupe -h` for more information.